New manpage format #1921

jnavila · 2024-11-17T21:54:48Z

Changes

This PR changes the way the AsciiDoc source of the manpages is processed, by adding the "synopsis" paragraph style and reworking the backtick format.

Context

The style change has been pushed to master and will be applied to git-clone and git-init in the next version.

dscho · 2024-11-18T18:28:43Z

I just triggered a pair of workflow runs to update the manual pages and to update the translated manual pages, fetched the result and rendered it locally. Here are two examples:

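[Table with columns "language", "before", "after" and rows "English", "French"; the before/after cells held screenshots that are not preserved.]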

Personally, I cannot spot any difference, apart from the version number (because this here PR branch is based on v2.46.2 while the updated manual pages include v2.47.0) and the incorrect =<regexp> on the "before" side of the French version (fixed on the "after" side).

Even looking at the HTML of the synopses (taking the French version, so that there is a known difference), I only see this:

diff --git a/before b/after
index 1a87d1348..6185fb72b 100644
--- a/before
+++ b/after
@@ -1,5 +1,5 @@
 <pre class="content"><em>git config list</em> [&lt;option-de-fichier&gt;] [&lt;option-d-affichage&gt;] [--includes]
-<em>git config get</em> [&lt;option-de-fichier&gt;] [&lt;option-d-affichage&gt;] [--includes] [--all] [--regexp=&lt;regexp&gt;] [--value=&lt;valeur&gt;] [--fixed-value] [--default=&lt;default&gt;] &lt;nom&gt;
+<em>git config get</em> [&lt;option-de-fichier&gt;] [&lt;option-d-affichage&gt;] [--includes] [--all] [--regexp] [--value=&lt;valeur&gt;] [--fixed-value] [--default=&lt;default&gt;] &lt;nom&gt;
 <em>git config set</em> [&lt;option-de-fichier&gt;] [--type=&lt;type&gt;] [--all] [--value=&lt;valeur&gt;] [--fixed-value] &lt;nom&gt; &lt;valeur&gt;
 <em>git config unset</em> [&lt;option-de-fichier&gt;] [--all] [--value=&lt;valeur&gt;] [--fixed-value] &lt;nom&gt; &lt;valeur&gt;
 <em>git config rename-section</em> [&lt;option-de-fichier&gt;] &lt;ancien-name&gt; &lt;nouveau-name&gt;

@jnavila what am I missing?

jnavila · 2024-11-18T21:40:44Z

The manpage of git-config has not been converted yet.
I pushed a branch "test-refactor" on git-html-l10n, where I hand-edited fr/git-add.txt.

After importing, here is the result:

I'm not satisfied with the styles, particularly when dealing with inline formats:

You can test it yourself locally and tell me what you think.

The new style makes the code spans lighter and more integrated into the text. The new style also makes the code spans more readable and less intrusive.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>

This commit adds an upcoming manpage format to the AsciiDoc backend. The new format changes are:

* The synopsis is now a section with a dedicated style. This "synopsis" style automatically formats the keywords as monospaced and <placeholders> as italic.
* The backticks are now used to format synopsis-like syntax in inline elements.

All the manpages are processed with this format. It may upset the formatting of older manpages, making it inconsistent across a page, but this will be a mild side effect, as it was not really consistent before.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>

jnavila · 2024-12-22T18:02:12Z

@dscho I updated the CSS, so it is ready for review.

To1ne

@jnavila I've added a few questions. Thanks for this contribution.

To1ne · 2024-12-26T09:26:48Z

script/asciidoctor-extensions.rb

+
+      def process parent, reader, attrs
+        outlines = reader.lines.map do |l|
+          l.gsub(/(\.\.\.?)([^\]$.])/, '`\1`\2')


I think a line of comment wouldn't hurt with these regexes. Maybe best with an example:

Suggested change

-l.gsub(/(\.\.\.?)([^\]$.])/, '`\1`\2')
+l.gsub(/(\.\.\.?)([^\]$.])/, '`\1`\2') # wrap ellipsis in backticks: ...something => `...`something

I think the intended use is for [...<more>]? Should we include the [ and ] in the regex?

jnavila · 2024-12-31

This line tries to distinguish the three dots in different contexts, where they have different meanings and require different formatting.

First, there is the form <commit1>...<commit2>, used when describing a range of commits, where the three dots are a "keyword" understood by git and must be formatted as code.

Then there are the forms used in the grammar to express repetition, such as " ...", optionally inside square brackets as in "[...]", which usually appear at the end of the command line. These three dots must not be formatted as code, but left as is.

This line matches the former case and forces the corresponding format. I'll add a comment along the same lines as yours.
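A minimal sketch of the behavior described above, applying the same regex in a standalone helper. The name `mark_range_dots` is hypothetical and not part of the PR; the sample inputs are illustrative only.

```ruby
# Hypothetical helper wrapping the first gsub from the diff under review.
def mark_range_dots(line)
  # Two or three dots followed by any character other than "]", "$" or "."
  # are treated as a Git keyword and wrapped in backticks.
  line.gsub(/(\.\.\.?)([^\]$.])/, '`\1`\2')
end

puts mark_range_dots('<commit1>...<commit2>') # range notation: dots get wrapped
puts mark_range_dots('<commit1>..<commit2>')  # two-dot form: also wrapped
puts mark_range_dots('[<pathspec>...]')       # repetition before "]": unchanged
puts mark_range_dots('<file>...')             # repetition at end of line: unchanged
```

The last two inputs are left alone because the regex requires a following character, and that character may not be "]" or another dot.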

To1ne · 2024-12-26T09:30:19Z

script/asciidoctor-extensions.rb

+      def process parent, reader, attrs
+        outlines = reader.lines.map do |l|
+          l.gsub(/(\.\.\.?)([^\]$.])/, '`\1`\2')
+           .gsub(%r{([\[\] |()>]|^)([-a-zA-Z0-9:+=~@,/_^\$]+)}, '\1{empty}`\2`{empty}')


To be honest, I don't know what this one is for.

jnavila · 2024-12-31

This line matches all the words that are neither placeholders nor grammar signs, and formats them as code. These words (in the general sense here) are keywords (option names, enum strings, two- or three-dot notation, etc.).
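To make the effect concrete, here is a small sketch that runs only this second regex on a sample synopsis line. The helper name `mark_keywords` and the sample line are assumptions for illustration. `{empty}` is a built-in AsciiDoc attribute that expands to nothing; it is presumably there so the backtick markup is still recognized when it touches neighbouring characters.

```ruby
# Hypothetical helper wrapping the second gsub from the diff under review.
def mark_keywords(line)
  # Group 1: a grammar sign, a space, or the start of the line.
  # Group 2: a run of keyword characters, wrapped in backticks in the output.
  line.gsub(%r{([\[\] |()>]|^)([-a-zA-Z0-9:+=~@,/_^\$]+)}, '\1{empty}`\2`{empty}')
end

puts mark_keywords('git add [--verbose] <pathspec>')
# "git", "add" and "--verbose" are wrapped; "pathspec" is skipped because
# the character before it is "<", which is not an accepted prefix.
```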

To1ne · 2024-12-26T09:43:11Z

script/asciidoctor-extensions.rb

+        outlines = reader.lines.map do |l|
+          l.gsub(/(\.\.\.?)([^\]$.])/, '`\1`\2')
+           .gsub(%r{([\[\] |()>]|^)([-a-zA-Z0-9:+=~@,/_^\$]+)}, '\1{empty}`\2`{empty}')
+           .gsub(/(<([[:word:]]|[-0-9.])+>)/, '__\\1__')


I had to dig deep to find what [[:word:]] does, but it seems to be a Ruby non-POSIX bracket expression: https://docs.ruby-lang.org/en/master/Regexp.html#class-Regexp-label-POSIX+Bracket+Expressions. Personally I'm not a fan, what's the advantage over \w?

Also why are the inner brackets round brackets?

I just wonder if we can simplify to:

Suggested change

-.gsub(/(<([[:word:]]|[-0-9.])+>)/, '__\\1__')
+.gsub(/(<[^>]+>)/, '__\\1__')

And one more question, why the double backslash in the replacement string?

jnavila · 2024-12-31

The \w class only matches ASCII, but here we are going to process internationalized texts (because placeholders are translated), and this processing requires the special form with double brackets. I'm not an expert in Ruby regexes; this is the form I have found to work well with the translations.

As for using a more generic regex (treating everything between angle brackets as the placeholder's name): placeholder names are not supposed to contain spaces, which is what lets the stricter regex avoid matching something like:

$ git foo < in-file > out-file
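The two points above can be checked with a short sketch. The constant names, the `italicize` helper and the sample placeholder `<dépôt>` are assumptions for illustration; only the first and third regexes appear in the review. It also shows that, in a single-quoted Ruby string, `'\\1'` and `'\1'` are the same two characters, so the double backslash is harmless.

```ruby
STRICT = /(<([[:word:]]|[-0-9.])+>)/ # regex from the PR
ASCII  = /(<(\w|[-0-9.])+>)/         # hypothetical \w variant
LOOSE  = /(<[^>]+>)/                 # reviewer's suggested simplification

# Hypothetical helper: wrap every match in AsciiDoc italic markers.
def italicize(line, regex)
  line.gsub(regex, '__\\1__')
end

puts italicize('<dépôt>', STRICT) # translated placeholder is recognized
puts italicize('<dépôt>', ASCII)  # \w does not match "é" or "ô": unchanged
puts italicize('$ git foo < in-file > out-file', STRICT) # redirection untouched
puts italicize('$ git foo < in-file > out-file', LOOSE)  # redirection mistaken for a placeholder
puts '\\1' == '\1' # both spellings denote a backslash followed by "1"
```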

To1ne · 2024-12-27T19:23:00Z

script/asciidoctor-extensions.rb

+        if node.type == :monospaced
+          node.text.gsub(/(\.\.\.?)([^\]$.])/, '<code>\1</code>\2')
+              .gsub(%r{([\[\s|()>.]|^|\]|&gt;)(\.?([-a-zA-Z0-9:+=~@,/_^\$]+\.{0,2})+)}, '\1<code>\2</code>')
+              .gsub(/(&lt;([[:word:]]|[-0-9.])+&gt;)/, '<em>\1</em>')


So we more or less need to repeat the regexes here?

jnavila · 2024-12-31

That's unfortunate, but although the two sets of regexes are very similar, this one processes the text after some pre-processing steps, and the transformations need to be in their final form (with tags and escaped characters).

I evaluated whether they could be factored out, but that makes the code messier than it already is.
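A side-by-side sketch of the two chains quoted in this review, to show why they differ. One emits AsciiDoc markup from raw source text, the other emits HTML from text whose angle brackets are already escaped. The function names `format_asciidoc` and `format_html` and the sample input are hypothetical; the regexes are copied from the diffs above.

```ruby
# Hypothetical wrapper for the block-level chain (raw AsciiDoc in, AsciiDoc out).
def format_asciidoc(line)
  line.gsub(/(\.\.\.?)([^\]$.])/, '`\1`\2')
      .gsub(%r{([\[\] |()>]|^)([-a-zA-Z0-9:+=~@,/_^\$]+)}, '\1{empty}`\2`{empty}')
      .gsub(/(<([[:word:]]|[-0-9.])+>)/, '__\\1__')
end

# Hypothetical wrapper for the inline chain (escaped text in, HTML out).
def format_html(text)
  text.gsub(/(\.\.\.?)([^\]$.])/, '<code>\1</code>\2')
      .gsub(%r{([\[\s|()>.]|^|\]|&gt;)(\.?([-a-zA-Z0-9:+=~@,/_^\$]+\.{0,2})+)}, '\1<code>\2</code>')
      .gsub(/(&lt;([[:word:]]|[-0-9.])+&gt;)/, '<em>\1</em>')
end

puts format_asciidoc('--depth=<n>')   # keyword in backticks, placeholder in __...__
puts format_html('--depth=&lt;n&gt;') # keyword in <code>, placeholder in <em>
```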

jnavila changed the base branch from gh-pages to main November 17, 2024 21:57

dscho had a problem deploying to github-pages November 18, 2024 17:45 — with GitHub Actions Failure

dscho changed the base branch from main to gh-pages November 18, 2024 19:01

jnavila marked this pull request as draft November 18, 2024 21:36

jnavila added 2 commits November 30, 2024 17:12

stylesheet: remove background and border from code spans

d9070e6


jnavila force-pushed the new_manpage_format branch from 3cb6e5e to 9bd765a Compare November 30, 2024 16:16

jnavila marked this pull request as ready for review November 30, 2024 16:19

To1ne reviewed Dec 30, 2024


jnavila replied to the four review comments Dec 31, 2024
