PolyglotMan(1)PolyglotMan(1)NAME
PolyglotMan, rman - reverse compile man pages from format-
ted form to a number of source formats
SYNOPSISrman [ options ] [ file ]
DESCRIPTION
PolyglotMan takes man pages from most of the popular fla-
vors of UNIX and transforms them into any of a number of
text source formats. PolyglotMan was formerly known as
RosettaMan. The name of the binary is still called rman ,
for scripts that depend on that name; mnemonically, just
think "reverse man". Previously PolyglotMan required
pages to be formatted by nroff prior to its processing.
With version 3.0, it prefers [tn]roff source and usually
produces results that are better yet. And source process-
ing is the only way to translate tables. Source format
translation is not as mature as formatted, however, so try
formatted translation as a backup.
In parsing [tn]roff source, one could implement an arbi-
trarily large subset of [tn]roff, which I did not and will
not do, so the results can be off. I did implement a sig-
nificant subset of those use in man pages, however,
including tbl (but not eqn), if tests, and general macro
definitions, so usually the results look great. If they
don't, format the page with nroff before sending it to
PolyglotMan. If PolyglotMan doesn't recognize a key macro
used by a large class of pages, however, e-mail me the
source and a uuencoded nroff-formatted page and I'll see
what I can do. When running PolyglotMan with man page
source that includes or redirects to other [tn]roff source
using the .so (source or inclusion) macro, you should be
in the parent directory of the page, since pages are writ-
ten with this assumption. For example, if you are trans-
lating /usr/man/man1/ls.1, first cd into /usr/man.
PolyglotMan accepts man pages from: SunOS, Sun Solaris,
Hewlett-Packard HP-UX, AT&T System V, OSF/1 aka Digital
UNIX, DEC Ultrix, SGI IRIX, Linux, FreeBSD, SCO. Source
processing works for: SunOS, Sun Solaris, Hewlett-Packard
HP-UX, AT&T System V, OSF/1 aka Digital UNIX, DEC Ultrix.
It can produce printable ASCII-only (control characters
stripped), section headers-only, Tk, TkMan, [tn]roff (tra-
ditional man page source), SGML, HTML, MIME, LaTeX,
LaTeX2e, RTF, Perl 5 POD. A modular architecture permits
easy addition of additional output formats.
The latest version of PolyglotMan is always available from
ftp://ftp.cs.berkeley.edu/ucb/peo-
ple/phelps/tcltk/rman.tar.Z .
1
PolyglotMan(1)PolyglotMan(1)OPTIONS
The following options should not be used with any others
and exit PolyglotMan without processing any input.
-h|--help Show list of command line options and exit.
-v|--version Show version number and exit.
You should specify the filter first, as this sets a number
of parameters, and then specify other options.
-f|--filter
<ASCII|roff|TkMan|Tk|Sec-
tions|HTML|SGML|MIME|LaTeX|LaTeX2e|RTF|POD>
Set the output filter. Defaults to ASCII.
-S|--source PolyglotMan tries to automatically deter-
mine whether its input is source or format-
ted; use this option to declare source
input.
-F|--format|--formatted
PolyglotMan tries to automatically deter-
mine whether its input is source or format-
ted; use this option to declare formatted
input.
-l|--title printf-string
In HTML mode this sets the <TITLE> of the
man pages, given the same parameters as -r
.
-r|--reference|--manref printf-string
In HTML and SGML modes this sets the URL
form by which to retrieve other man pages.
The string can use two supplied parameters:
the man page name and its section. (See the
Examples section.) If the string is null
(as if set from a shell by "-r ''"), `-' or
`off', then man page references will not be
HREFs, just set in italics. If your printf
supports XPG3 positions specifier, this can
be quite flexible.
-V|--volumes <colon-separated list>
Set the list of valid volumes to check
against when looking for cross-references
to other man pages. Defaults to
1:2:3:4:5:6:7:8:9:o:l:n:p (volume names can
be multicharacter). If an non-whitespace
string in the page is immediately followed
by a left parenthesis, then one of the
valid volumes, and ends with optional other
characters and then a right
2
PolyglotMan(1)PolyglotMan(1)
parenthesis--then that string is reported
as a reference to another manual page. If
this -V string starts with an equals sign,
then no optional characters are allowed
between the match to the list of valids and
the right parenthesis. (This option is
needed for SCO UNIX.)
The following options apply only when formatted pages are
given as input. They do not apply or are always handled
correctly with the source.
-b|--subsections
Try to recognize subsection titles in addi-
tion to section titles. This can cause
problems on some UNIX flavors.
-K|--nobreak Indicate manual pages don't have page
breaks, so don't look for footers and head-
ers around them. (Older nroff -man macros
always put in page breaks, but lately some
vendors have realized that printout are
made through troff, whereas nroff -man is
used to format pages for reading on screen,
and so have eliminated page breaks.) Poly-
glotMan usually gets this right even with-
out this flag.
-k|--keep Keep headers and footers, as a canonical
report at the end of the page. changeleft
Move changebars, such as those found in the
Tcl/Tk manual pages, to the left. -->
notaggressive Disable aggressive man page
parsing. Aggressive manual, which is on by
default, page parsing elides headers and
footers, identifies sections and more. -->
-n|--name name Set name of man page (used in roff format).
If the filename is given in the form " name
. section ", the name and section are auto-
matically determined. If the page is being
parsed from [tn]roff source and it has a
.TH line, this information is extracted
from that line.
-p|--paragraph paragraph mode toggle. The filter deter-
mines whether lines should be linebroken as
they were by nroff, or whether lines should
be flowed together into paragraphs. Mainly
for internal use.
-s|section # Set volume (aka section) number of man page
(used in roff format). tables Turn on
aggressive table parsing. -->
3
PolyglotMan(1)PolyglotMan(1)
-t|--tabstops #
For those macros sets that use tabs in
place of spaces where possible in order to
reduce the number of characters used, set
tabstops every # columns. Defaults to 8.
NOTES ON FILTER TYPES
ROFF
Some flavors of UNIX ship man page without [tn]roff
source, making one's laser printer little more than a
laser-powered daisy wheel. This filer tries to intuit the
original [tn]roff directives, which can then be recompiled
by [tn]roff.
TkMan
TkMan, a hypertext man page browser, uses PolyglotMan to
show man pages without the (usually) useless headers and
footers on each pages. It also collects section and
(optionally) subsection heads for direct access from a
pulldown menu. TkMan and Tcl/Tk, the toolkit in which it's
written, are available via anonymous ftp from
ftp://ftp.smli.com/pub/tcl/
Tk
This option outputs the text in a series of Tcl lists con-
sisting of text-tags pairs, where tag names roughly corre-
spond to HTML. This output can be inserted into a Tk text
widget by doing an eval <textwidget> insert end <text> .
This format should be relatively easily parsible by other
programs that want both the text and the tags. Also see
ASCII.
ASCII
When printed on a line printer, man pages try to produce
special text effects by overstriking characters with them-
selves (to produce bold) and underscores (underlining).
Other text processing software, such as text editors,
searchers, and indexers, must counteract this. The ASCII
filter strips away this formatting. Piping nroff output
through col -b also strips away this formatting, but it
leaves behind unsightly page headers and footers. Also see
Tk.
Sections
Dumps section and (optionally) subsection titles. This
might be useful for another program that processes man
pages.
HTML
With a simple extention to an HTTP server for Mosaic or
other World Wide Web browser, PolyglotMan can produce
high quality HTML on the fly. Several such extensions and
pointers to several others are included in PolyglotMan 's
contrib directory.
4
PolyglotMan(1)PolyglotMan(1)
SGML
This is appoaching the Docbook DTD, but I'm hoping that
someone that someone with a real interest in this will
polish the tags generated. Try it to see how close the
tags are now.
MIME
MIME (Multipurpose Internet Mail Extensions) as defined by
RFC 1563, good for consumption by MIME-aware e-mailers or
as Emacs (>=19.29) enriched documents.
LaTeX and LaTeX2e
Why not?
RTF
Use output on Mac or NeXT or whatever. Maybe take random
man pages and integrate with NeXT's documentation system
better. Maybe NeXT has own man page macros that do this.
PostScript and FrameMaker
To produce PostScript, use groff or psroff . To produce
FrameMaker MIF, use FrameMaker's builtin filter. In both
cases you need [tn]roff source, so if you only have a
formatted version of the manual page, use PolyglotMan 's
roff filter first.
EXAMPLES
To convert the formatted man page named ls.1 back into
[tn]roff source form:
rman-f roff /usr/local/man/cat1/ls.1 >
/usr/local/man/man1/ls.1
Long man pages are often compressed to conserve space
(compression is especially effective on formatted man
pages as many of the characters are spaces). As it is a
long man page, it probably has subsections, which we try
to separate out (some macro sets don't distinguish subsec-
tions well enough for PolyglotMan to detect them). Let's
convert this to LaTeX format:
pcat /usr/catman/a_man/cat1/automount.z | rman-b -n auto-
mount -s 1 -f latex > automount.man
Alternatively, man 1 automount | rman-b -n automount -s 1
-f latex > automount.man
For HTML/Mosaic users, PolyglotMan can, without modifica-
tion of the source code, produce HTML links that point to
other HTML man pages either pregenerated or generated on
the fly. First let's assume pregenerated HTML versions of
man pages stored in /usr/man/html . Generate these one-
by-one with the following form:
rman-f html -r 'http:/usr/man/html/%s.%s.html'
5
PolyglotMan(1)PolyglotMan(1)
/usr/man/cat1/ls.1 > /usr/man/html/ls.1.html
If you've extended your HTML client to generate HTML on
the fly you should use something like:
rman-f html -r 'http:~/bin/man2html?%s:%s'
/usr/man/cat1/ls.1
when generating HTML.
BUGS/INCOMPATIBILITIES
PolyglotMan is not perfect in all cases, but it usually
does a good job, and in any case reduces the problem of
converting man pages to light editing.
Tables in formatted pages, especially H-P's, aren't han-
dled very well. Be sure to pass in source for the page to
recognize tables.
The man pager woman applies its own idea of formatting
for man pages, which can confuse PolyglotMan . Bypass
woman by passing the formatted manual page text directly
into PolyglotMan .
The [tn]roff output format uses fB to turn on boldface. If
your macro set requires .B, you'll have to a postprocess
the PolyglotMan output.
SEE ALSOtkman(1) , xman(1) , man(1) , man(7) or man(5) depending
on your flavor of UNIX
AUTHOR
PolyglotMan
by Thomas A. Phelps ( phelps@ACM.org )
developed at the
University of California, Berkeley
Computer Science Division
Manual page last updated on 2001/02/16 22:49:38
6