Man pages: check for corrupt tables

Every so often we hear reports of a corrupt man page table,
where columns are misaligned in nonsensical ways. The
traditional symptom looks like:

   |----------------|--------------|
   | option name    |              |
   |----------------|--------------|
   |                | description  |
   |----------------|--------------|

Cause: one of the tools in the man page generation chain,
maybe 'man' itself, has an undocumented length limit on
table cells, _and_ an undocumented page width limit as well.
If you exceed either limit, you get corrupt man pages.
Silently.

This adds a horrible test for those. And I mean horrible:

  - unreadable
  - unmaintainable
  - unreliable (heuristic, no guarantees)
  - slows down 'make docs' (less than a second, but still)

I've tested by adding long '| sdf sdf | dsf |' rows to
a few man pages, and it triggers. That's the only good
thing I can say about it.
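
For anyone who wants to poke at the heuristic outside of 'make',
here is a minimal sketch of the same check as a standalone shell
function. The function name has_corrupt_table and the fixture rows
are hypothetical and not part of this commit. The regex is the one
from the Makefile rule, except that each '─' is wrapped in a
(?:...) group; that is a tweak made here on the assumption that
grep might run in a non-UTF-8 locale, where a bare '─+' would
repeat only the last byte of the character. It assumes GNU grep
built with -P (PCRE) support.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): reads rendered man page
# text on stdin and succeeds if it sees an empty right-hand cell,
# a border row, then an empty left-hand cell.
has_corrupt_table() {
    # -P  PCRE syntax, needed for \s and \n
    # -a  treat input as text even if it looks binary
    # -z  NUL-separated records, so the whole page is one record
    #     and \n in the pattern can match real newlines
    # -q  quiet; only the exit status matters
    grep -Pazq '│\s+│\n\s+├(?:─)+┼(?:─)+┤\n\s+│\s+│'
}

# Made-up fixture imitating the corrupt layout described above.
corrupt_rows=$(printf '%s\n' \
    '   │ option │        │' \
    '   ├────────┼────────┤' \
    '   │        │ descr  │')

# Made-up fixture imitating a healthy table.
healthy_rows=$(printf '%s\n' \
    '   │ option │ descr  │' \
    '   ├────────┼────────┤' \
    '   │ other  │ text   │')

if printf '%s\n' "$corrupt_rows" | has_corrupt_table; then
    echo "corrupt fixture: flagged"
fi
if ! printf '%s\n' "$healthy_rows" | has_corrupt_table; then
    echo "healthy fixture: not flagged"
fi
```

Against a real page, the equivalent would be to pipe 'man -l' on a
generated file under docs/build/man/ into has_corrupt_table. Like
the Makefile rule, this remains a heuristic with no guarantees.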

Other approaches I tried:
  - man -l -Tascii | grep non-ascii-art
  - man -l ... 2>&1 | grep "table wider than"
  - perusing the generated .1/.5 pages, seeing if my eye
    can detect something different about too-long cells
  - same, using 'tbl'
  - checking for too-long cells in the source document

...and more that I've forgotten. This was the only way
that produced reliable errors. If you have a better way,
please oh please submit it.

Signed-off-by: Ed Santiago <santiago@redhat.com>
Author: Ed Santiago
Date: 2023-07-18 14:33:14 -06:00
Parent: 7791ffd215
Commit: 11ffea313b

@@ -483,6 +483,17 @@ $(MANPAGES): %: %.md .install.md2man docdir
# like '[cgroups(7)](https://.....)' -> just 'cgroups(7)';
# 4. Remove HTML-ish stuff like '<sup>..</sup>' and '<a>..</a>'
# 5. Replace "\" (backslash) at EOL with two spaces (no idea why)
# Then two sanity checks:
# 1. test for "included file options/blahblah"; this indicates a failure
# in the markdown-preprocess tool; and
# 2. run 'man -l' against the generated man page, and check for tables
# with an empty right-hand column followed by an empty left-hand
# column on the next line. (Technically, on the next-next line,
# because the next line must be table borders). This is a horrible
# unmaintainable rats-nest of duplication, obscure grep options, and
# ASCII art. I (esm) believe the cost of releasing corrupt man pages
# is higher than the cost of carrying this kludge.
#
@$(SED) -e 's/\((podman[^)]*\.md\(#.*\)\?)\)//g' \
-e 's/\[\(podman[^]]*\)\]/\1/g' \
-e 's/\[\([^]]*\)](http[^)]\+)/\1/g' \
@@ -492,6 +503,9 @@ $(MANPAGES): %: %.md .install.md2man docdir
@if grep 'included file options/' docs/build/man/*; then \
echo "FATAL: man pages must not contain ^^^^"; exit 1; \
fi
@if man -l $(subst source/markdown,build/man,$@) | grep -Pazoq '│\s+│\n\s+├─+┼─+┤\n\s+│\s+│'; then \
echo "FATAL: $< has a too-long table column; use 'man -l $(subst source/markdown,build/man,$@)' and look for empty table cells."; exit 1; \
fi
.PHONY: docdir
docdir: