O'Reilly & Associates, Inc.
Checklist for Selection of Document Production Tools
[Note: this document discusses issues concerning the development
of documentation for computer programs and does not explicitly treat
the subject of conservation documentation. Nevertheless, some of it
should be of interest to those involved in developing automated
systems for handling conservation treatment documentation.
Walter Henry, CoOL maintainer]
Article 1831 of comp.text.sgml:
Newsgroups: comp.text.sgml
From: lark@ruby.ora.com (Lar Kaufman)
Subject: Checklist for Selection of Document Production Tools
Keywords: SGML document production checklist
Organization: O'Reilly & Associates, Inc.
Date: Mon, 8 Mar 1993 22:29:16 GMT
Lines: 696
This checklist provides assessment criteria and asks questions
that you should answer in order to determine the usability of a
proposed documentation package for creating structured
documentation. The categories in this checklist necessarily have
some overlap; a specific item is placed under the heading or
subheading that was deemed most relevant to that item. This
checklist is intended to be suitable for judging both integrated
document production tools and combinations of disparate
documentation tools.
This checklist was developed from information provided by
editorial, design, and production staff of O'Reilly and Associates,
Inc. It has been adapted for public distribution.
Disclaimers
This document is made available for informational purposes by
O'Reilly and Associates, Inc. in the hope that it will benefit
software developers and users of documentation development software.
It may be freely redistributed and reused. Alterations should not
be attributed to O'Reilly and Associates, Inc.
This document was developed for a specific workplace, and factors
were weighted according to the requirements of that workplace. While
specific weighting factors have been omitted from this checklist,
the criteria are biased to accommodate our existing documentation and
plans for future documentation, software, and equipment, and this
bias is reflected in verbiage. You should tune this checklist to
reflect your own needs.
This document is not intended to imply that any particular
product or service is superior or inferior to any other. Any
implicit or explicit selection of one technology over another is
simply a reflection of the needs of one company based on historical
use and anticipated needs. Only technology known or believed to be
available to the author was considered; other current and future
technologies may not have received due consideration. (This
document was prepared in 1992.) No representation of fitness for
use is made or implied, and O'Reilly and Associates, Inc. assumes no
liability for any use made of any part of this document.
Some copyrighted products are mentioned; all copyrights are the
properties of their respective owners.
1 SGML Issues
Structured documentation is a critical concept in meeting our
documentation needs. No other method of document preparation
provides the needed document portability and adaptability of source
documents to diverse presentation media. We decided that only the
Standard Generalized Markup Language (SGML, ISO 8879) provides the
necessary, appropriate, and fundamental standard framework for
creating portable structured documentation. SGML compliance and
performance are thus critical factors in judging the suitability of
tools for use. SGML, as an ISO standard, also provides a suitable
standard for document tagging to permit the coordinated use of
document preparation tools from different sources.
1.1 Parser Issues: How does the parser handle SGML
documents?
This section concerns specific SGML implementation features that
are critical or useful for our purposes in document development and
publication.
- Must import and export documents in pure SGML format (filter
to/from)
- Must read an arbitrary SGML document type description (DTD), and
allow DTD customization by "hand-tuning"
- Should be able to read and write an ASCII form FOSI style sheet
- Must retain data integrity of attributes, and should parse them
- Must support public entity sets and be able to expand entities
- Should support CONREF
- Should support linking
- Very useful to support the HyTime hypermedia extensions to
SGML
- What SGML functionality is not supported?
- Are there any restrictions on the SGML declaration? (Not limited
to Reference Concrete Syntax, for instance)
- How does the document instance identify the DTD it uses?
- How is the DTD parsed? Can the declaration and definitions exist
in the same file or be separated?
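As background to the question of how a document instance identifies
its DTD: an SGML instance normally opens with a document type
declaration that names the document element and gives a PUBLIC
and/or SYSTEM identifier for the DTD. The Python sketch below pulls
those pieces out of the start of a file. The sample public
identifier, the file name book.dtd, and the function name are
invented for illustration, and the regular expression handles only
the simple quoted forms shown in its comment; it is not a substitute
for a real SGML parser.

```python
import re

# Handles only simple declarations such as
#   <!DOCTYPE book PUBLIC "-//Example//DTD Book//EN" "book.dtd">
#   <!DOCTYPE book SYSTEM "book.dtd">
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<name>[A-Za-z][\w.-]*)'
    r'(?:\s+PUBLIC\s+(?P<q1>["\'])(?P<public>.*?)(?P=q1))?'
    r'(?:\s+SYSTEM)?'
    r'(?:\s+(?P<q2>["\'])(?P<system>.*?)(?P=q2))?',
    re.IGNORECASE | re.DOTALL,
)


def find_doctype(text):
    """Return (document element, public id, system id), or None."""
    m = DOCTYPE_RE.search(text)
    if m is None:
        return None
    return m.group("name"), m.group("public"), m.group("system")


# Hypothetical document instance; the identifiers are made up.
sample = (
    '<!DOCTYPE book PUBLIC "-//Example//DTD Book//EN" "book.dtd">\n'
    '<book>...</book>'
)
print(find_doctype(sample))
```

A tool that answers this checklist question well would also report
whether the declaration carries an internal declaration subset,
which this sketch ignores.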
1.2 Interface Issues: How does the user view and interact with
the product?
This area concerns user productivity issues in SGML use.
- SGML validation mechanism should detect and notify the user when
an invalid construction is found, but permit work to continue with
invalid constructions in order to fix errors interactively (the
validation mechanism should not choke on unresolved IDREFs, for
example)
- Should allow SGML element tags and typesetter formatting tags to
be viewed or concealed during edit at user's preference (index tags,
for example)
- Should be able to generate IDs for SGML elements automatically,
and also accept manual ID insertion
- Should distinguish SGML empty tags from tag pairs
- Should enforce an SGML element catalog referencing a DTD,
preventing insertion of alien structures or elements that are not
defined in the DTD.
- Useful to display the actual structure of a document and the
theoretical structure defined by element encapsulation
- Useful to allow document to control layout by structural
elements (independent of placement of elements in text)
- How is creation of new structures and elements facilitated and
controlled? Do you create new elements by altering a DTD, or map
analogous elements to a DTD element?
- Very useful to provide stream WYSIWYG display without arbitrary
page-oriented layout
- Useful to supply facing-pages WYSIWYG view online of a hardcopy
output format
- Should support keyboard function mapping for element tagging
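The first item in this list asks that validation report problems
such as unresolved IDREFs without stopping work. One way to picture
that behavior is a checker that collects warnings and returns them
rather than raising an error. The Python sketch below assumes
attributes literally named "id" and "idref" with double-quoted
values, and the element names in the sample are invented; in real
SGML the DTD decides which attributes are of type ID and IDREF, and
values may be unquoted.

```python
import re

# Assumption: ID and IDREF attributes are spelled "id" and "idref"
# and their values are double-quoted.
ATTR_RE = re.compile(r'\b(idref|id)\s*=\s*"([^"]*)"', re.IGNORECASE)


def check_idrefs(text):
    """Return a list of warnings; never raises on invalid input."""
    seen, refs, warnings = set(), [], []
    for m in ATTR_RE.finditer(text):
        kind, value = m.group(1).lower(), m.group(2)
        if kind == "id":
            if value in seen:
                warnings.append(f"duplicate ID: {value}")
            seen.add(value)
        else:
            refs.append(value)
    # Resolve references only after the whole text has been scanned,
    # so that forward references are not reported as errors.
    for value in refs:
        if value not in seen:
            warnings.append(f"unresolved IDREF: {value}")
    return warnings


# Hypothetical draft with one reference whose target is not written yet.
draft = (
    '<sect id="intro"><p>See <xref idref="intro"> and '
    '<xref idref="install">.</p></sect>'
)
for warning in check_idrefs(draft):
    print("warning:", warning)
```

Because the function only returns a list, an editor built this way
could display the warnings and let the user keep typing.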
1.3 Typesetting tool
While there are other suitable document description languages,
our existing hardware commits us to mapping SGML tagged documents to
PostScript and X Window output formatting.
- Must support standard Adobe PostScript fonts
- Must allow new PostScript fonts to be installed by user
- Should provide X Window fonts that, where appropriate, are
analogous to PostScript hardcopy output, for convenient online
development and testing of documents intended for printing using
PostScript.
- Should support the ability to replace one style sheet with
another for the same SGML DTD, either to update the current style
sheet for a book, or to use a different style sheet for a particular
task, such as editing the index.
2 General Editing
We work in a production environment where editing is done using a
variety of tools by users ranging from naive to very sophisticated.
The editing capabilities must reflect this diverse user base.
- Must support symbolic cross-referencing
- Must support editable WYSIWYG (or near-WYSIWYG) full-page
display as at least one view of the document
- Should support non-WYSIWYG mode for continuous text editing with
visible tagging; visible tagging should also show endtags
- Should allow editing of files in ASCII form
- Should detect missing hierarchical structures (skipped head
level, for example) even if permitted by the document style
- Should support keyboard function mapping for block mark/move,
cursor placement
- Should support left- and right-adjust tabs, and decimal tabs.
How is tab setting and insertion handled?
- Useful to support custom editing macro assignment to keystrokes
- Useful to map keystroke macros (preferably emacs-style) to
commonly used functions
- Should allow multiple files to be opened, and allow cut and
paste of text and graphics blocks across documents
- Must support robust multi-level indexing:
- Should support arbitrary index entry labeling.
- Useful to support index style control. Can indexing handle page
ranges, highlighting, suppressed page numbers, leader characters,
custom indentation, multi-column output?
- Useful to support multiple index creation. For example, indexing
by subject, author, and title
- Should support indexing across multiple files. Can multi-volume
indexes be built? Is the indexing mechanism dynamic, or are lookup
tables built? Can lookup tables (if they are used) be edited?
- Should support case-sensitive indexing (where specified) at
index generation
- Can the indexing facility identify common indexing errors, such
as large blocks of unindexed text, or entries indexed under both
singular and plural word forms?
- Must generate tables of contents, with style control (leader
characters, etc.)
- Should generate, for tables of contents, lists of figures and
tables
- Should generate, for tables of contents, lists of examples and
code
- Must support page numbering control: style and reset of page
number, section-page numbering, numbering of blank pages, TOC pages,
index pages, etc.
- Must support cross-referencing and should allow
cross-referencing across files
- Should support cross-reference by page number and by heading
- Should support location and identification of bad
cross-reference entries
- Should allow block-oriented move, copy, and deletion of SGML
structures by automatically selecting all subsidiary elements and
structures
- Should update the prepared document (automatic re-sourcing) when
a sourced graphic is modified or replaced
- Must allow localized hyphenation control. Can hyphenation
mechanism limit the "ladder" effect by limiting number of
consecutive hyphenations, suppress hyphenation of next-to-last line
in a paragraph or at a page break, etc.?
- Should allow general hyphenation rules to be adjusted
- Should support automatic renumbering of numbered text elements
(chapters, sections, etc.) when the document is restructured. Can
you apply arbitrary character size and formatting control on
automatically generated number elements and tags?
- Should support keyboard function mapping for tagging
- Should support interactive spelling checking and correction,
with revisable wordlist. Can you create local and special-purpose
spelling dictionaries?
- Should support spelling checking of referenced documents called
from within the current document
- Useful to have thesaurus. Can we create local thesaurus entries?
- Useful to have automatic glossary creation and acronym
definition
- Should have powerful search and replace capabilities
- Should support search and replace across included and open files
- Should support searching on phrases, and on SGML and formatting
tags
- Should support case-sensitive and insensitive search and replace
- Should provide extensive pattern matching for search; the user
should be able, for example, to examine acronym usage by searching
for strings of uppercase letters.
- Can you set bounds on the search and replace facility, by marked
range or by SGML element contexts? Very useful to provide bounds
control
- Can you use the replace facility to change SGML tags as well as
text? Can you search on text and replace a tag, or the reverse? Very
useful to provide contextual tagging and text editing
- Useful to have interactive grammar checking, including locating
repeated words in the text stream, reporting of passive voice,
suggested substitutions for overused words/phrases, etc.
- Useful to have word count, and word usage/grammar statistics
reporting facility. Can you get a character count, too?
- Useful to find multiple instances of identical sentences or
paragraphs in a document
- Useful to copy text blocks with column control as well as by row
- Useful to have copyreading/proofing markup mode for direct
support of online editing and revision
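The pattern-matching item in the list above gives the example of
examining acronym usage by searching for strings of uppercase
letters. A minimal Python sketch of that one search follows. The
choice of "two or more uppercase ASCII letters standing alone as a
word" is our assumption about what counts as an acronym, and the
sample sentence is invented.

```python
import re
from collections import Counter

# Assumption: an acronym is a whole word of two or more uppercase
# ASCII letters. Mixed-case names such as "PostScript" are skipped.
ACRONYM_RE = re.compile(r"\b[A-Z]{2,}\b")


def acronym_usage(text):
    """Count how often each uppercase-only word occurs in text."""
    return Counter(ACRONYM_RE.findall(text))


sample = "SGML is an ISO standard. A DTD describes an SGML document type."
for acronym, count in sorted(acronym_usage(sample).items()):
    print(acronym, count)
```

A product that meets this item would let the user type such a
pattern directly into its search dialog, ideally bounded by SGML
element context as the following items ask.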
3 Formatting: Graphics, Composition, Layout and Style
Structured documentation rightly encourages document preparation
with the focus on document organization and structure -- without
focusing on output appearance. However, it is necessary to have good
control of formatting of graphics, tables and figures, document
layout, fonts, and many other style issues in order to have
effective document production tools.
3.1 Tables, Figures, Equations and Examples
Table-making, figure-handling, and the imbedding of equations, code
fragments, and examples in documents are very important parts of
technical book production. These features should be accommodated by
separate mechanisms. At least it should be possible to distinguish
between these different classes of visual aid.
- Must consistently and appropriately handle construction of large
and complex tables.
- What are the limits on rows and columns in tables, and what
table variables are under your control?
- Should be able to control table break points
- Can you float tables and figures, and force two-page tables to
opposing pages? Can you span a table across opposing pages?
- Can you "reverse" tables, switching rows to columns, columns to
rows?
- What control of margins, widow/orphan control, etc. is provided
in tables?
- Should provide some kind of arbitrary text block control in a
row or column
- Useful to provide separate table headings for both rows and
columns
- Very useful to provide for imbedding structures and graphics
inside tables.
- What table highlighting controls are provided (color, shading,
etc.)?
- Must allow external figures to be referenced or support
inclusion; both capabilities should be provided
- Must be able to size and control white space around graphics and
graphics captions
- Should generate separate lists of tables and figures
- Should generate lists of code or examples separately from
figures
- Useful to generate lists of equations
- Useful to generate lists of procedures
- Should support footnotes and tablenotes in tables
- Can you imbed various list types and graphics?
- Can you index and cross-reference table entries?
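The question above about "reversing" a table, switching rows to
columns and columns to rows, describes a transposition. For a simple
rectangular table it can be sketched in a few lines of Python. The
table contents are invented for illustration, and the sketch
deliberately ignores spanned cells, headings, and notes, which a
real product would have to carry across.

```python
def transpose(rows):
    """Switch rows to columns.

    Assumes a rectangular table: every row has the same number of
    cells. With ragged rows, zip() silently drops the extra cells.
    """
    return [list(column) for column in zip(*rows)]


# Hypothetical table; the entries do not describe any real product.
table = [
    ["Format", "Import", "Export"],
    ["troff", "yes", "no"],
    ["RTF", "yes", "yes"],
]
for row in transpose(table):
    print(" | ".join(row))
```

Transposing twice returns the original table, which makes a handy
check when evaluating whether a tool's "reverse" operation loses
information.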
3.2 Graphics
Our graphics production is primarily accomplished by importing
graphics produced using external tools -- often on a Macintosh
platform. Since our document production environment is X
Windows-based, that work environment is critical. Output to
PostScript is essential, as that provides our typesetting support.
However, output to X Windows is also considered very useful; we must
accomplish this output form using some tool or another; currently we
convert from PostScript to X Windows in a separate step.
- Must support all handled graphics using SGML-compliant
mechanisms
- Must support Encapsulated PostScript (EPS) graphics formats
- Should support X bitmaps created by xwd and X draw/paint/icon
tools or convert them to a form that is supported
- Should support Tag Image File Format (TIFF) rev. 5.0
- Should handle Computer Graphics Metafile (CGM) format per the
U.S. Federal Information Processing Standard Publication 128
- Should handle Graphics Interchange Format (GIF)
- Very useful to interchange QuarkXPress and Aldus FreeHand
native graphic formats
- Should have an integrated graphics editor. How powerful is the
integrated graphics composition facility, and what does it compare
with? Can you render drawings, "paint", incorporate CAD
illustrations, etc.? Can you create icons and custom characters?
- Very useful to edit graphics in full page, formatted mode
- Very useful to handle and convert (troff) pic graphics
- Should handle PICT graphic format
- Useful to handle Initial Graphics Exchange Specification (IGES)
graphics per ANSI Y14.26M
- Useful to handle any other graphic formats. What additional
formats can the product edit? handle as input? handle as output?
- Should support automatic or manual bordering/boxing of graphics
and resizing of bounds and image
- Should handle bitmaps larger than page representation (for bleed
and for printing masked portions of the whole)
- Useful to be able to output ASCII and omit graphics
- Should allow arbitrary placement of graphics on the page
- Useful if text can flow around placed graphics
3.3 Composition, Layout and Style
We are very particular about style issues, and value fine control
of style and layout in our production tools. We consider the
distinctive appearance of our books to be a sales asset, so we must
be able to tune the output to the appearance we need.
- Composition, layout and style control must be completely
separable from SGML document tagging
- Should support full graphics bleed (to/beyond edge of page, and
across gutter for two-page spread) with float to opposing pages
- Must support bleed tabs. How are bleed tabs positioned?
- Must support revisable arbitrary external style sheets including
stylesheet substitution, and allow a single style sheet to be
applied to a set of files
- Should allow arbitrary formatting markup to be applied to an
SGML document
- Should allow header and body elements to orient to left- or
right-alignment based on recto or verso page orientation
- Useful for heading capitalization rules to be automatically
applied
- Should support forced recto chapter formatting, automatic verso
page generation (blank page with headers, footers, and page numbers
as appropriate)
- Must provide arbitrary control of page and column breaks, and
should provide automatic page break style control using rules
(enforce widow and orphan control policy, etc.).
- Should support multi-column output
- Useful to have multi-column support for unequal column widths
and synchronous baselines (vertical column balancing)
- Long quotes should be handled as distinct structures
- Very useful to support sidebars or separately formatted, linked
text flows (besides footnotes, endnotes)
- What mechanisms allow structures such as icons, admonitions, and
blurbs to be placed outside the normal text margins?
- At least numbered, bullet, variable, and ballot lists should be
supported, as well as context-sensitive sublists. What distinct
kinds of lists are handled?
- Must format ASCII text output for minimal support of
character-based output devices
- Must support footnotes and footnote style control. How are
footnotes maintained with their points of origins? What controls can
be set on splitting footnotes across pages and displaying a footnote
on another page than the point of reference?
- Footnote text should not be separable from the footnote number
or marker on a page; useful if marker is repeated and continuation
indicated, when footnote is split across pages
- Very useful to support bibliographic-style endnotes output at
the end of each chapter
3.4 Font Control
We need the usual PostScript font support, and we expect all the
tools we examine to provide this. The greatest issues are questions
about how font control is accomplished.
- Should support pair kerning and arbitrary kerning of output,
with fine control of whitespace
- Should preserve font change information and format-specific
information on selection, even for structured documentation
- Can we use arbitrary characters from other character sets, or
insert a graphic as a character?
- How are fonts coded? added?
- How are fonts mapped/condensed/altered?
- How sophisticated is highlighting control? Can you do small caps
and all caps under user size control? Can you use
inverse-highlighted characters, boxed text, apply spot colors and
grey scale to characters and background? How is
superscript/subscript controlled?
- What method is used for font substitution when a selected
font is not provided for display or printing? For example, would
text that should appear as italic in a particular typeface appear
as oblique if the Courier typeface were applied?
- Can supertext graphics be performed, such as placement of text
on arbitrary vectors, "tapering" point size, sliding color or
shading of text stream?
4 Document Conversion
This is a critical area for publications. We must cope with
documents in many formats from a variety of sources.
- Must import plain ASCII text and SGML-tagged ASCII text, and
export it, optionally with preserved SGML tagging
- Should allow converted/imported text to be inserted into a
formatted document directly
- Should import popular file formats; the more the better -- with
these forms most valuable: troff, FrameMaker, RTF, DCA, TeX (LaTeX),
Word (Mac, PC, and UNIX). What file formats can be imported?
- What capabilities are offered for returning imported documents
to a form the original author can work with, in order to facilitate
revision by that author after conversion to SGML?
- Very useful to handle "custom" PostScript formats (e.g.
FrameMaker's) so that proprietary PostScript forms can be stored and
manipulated as standard (probably Encapsulated) PostScript.
- Useful to export popular file formats. What file formats can be
exported?
- Can other common document conversion tools be used with your
package?
5 File Storage and Management
Document control must meet professional expectations. We have
unusually complex needs for version and revision/edition
control.
- Must store in SGML output form; should store in a tagged,
editable ASCII format that optionally retains formatting information
- Must permit RCS source control, or provide a backup alternative
- Must support RCS file checkout in UNIX or other file locking
mechanism
- Should support comparison of different editions of files, with
reporting on differences
- Should allow multiple-version document control (with history) as
well as revision level control. Instances of revision mark usage
should be controllable by the user locally as well as globally.
- Useful if revision mechanism can differentiate between text that
is moved and text that is added or deleted, and mark accordingly
- Useful if comments can be optionally retained or stripped from
source
- Useful if comments can be optionally hidden or formatted and
displayed
- Should provide notification mechanism to owner of checked out
file when someone wants to edit it
- Should be editable in ASCII form; formatting instructions and
tags should not fragment word strings such that grep, sed, ispell,
and other UNIX tools that search for specific strings would be
thwarted
- Useful to have file/directory backup/copy mechanism for specific
versions/editions of document
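The item above about ASCII editability asks that formatting
instructions and tags not fragment word strings in a way that
defeats grep, sed, ispell, and similar tools. The Python sketch
below shows the failure it has in mind: a literal search over the
stored file misses a word that a tag has split, while a search that
first strips tags finds it. The element name "emph", the sample
sentence, and the tag pattern are our assumptions; the pattern is a
rough approximation and not an SGML parser.

```python
import re

# Rough approximation of a tag: "<", anything but ">", then ">".
TAG_RE = re.compile(r"<[^>]+>")


def plain_search(source, word):
    """Literal search over the stored file, as grep would see it."""
    return word in source


def tag_aware_search(source, word):
    """Search the text after removing anything that looks like a tag."""
    return word in TAG_RE.sub("", source)


# Hypothetical stored form in which a tag splits the word "Reformat".
stored = "Select <emph>Re</emph>format from the menu."
print(plain_search(stored, "Reformat"))
print(tag_aware_search(stored, "Reformat"))
```

A storage format that keeps tags at word boundaries wherever
possible lets the plain search succeed and keeps ordinary UNIX text
tools useful.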
6 Output
Our output requirements reflect the hardware we use for
production, as well as our interests in online documentation.
- Must output PostScript
- Must output formatted ASCII
- Must output to printer or to file
- Must have PostScript or comparable online preview directly
equivalent to printed output
- Useful to output CCITT Group III facsimile (Group IV not
currently useful to us)
- Useful to output X bitmap form
- Useful to output "formatted" ASCII with highlighting control for
use with printcap and termcap via a UNIX browser (or pg, more, or
less)
- Useful to provide batch printing control and output device
selection on network (including fax)
- Useful to provide an output path directly to email
(sendmail-compatible)
7 Platform/Environment Considerations
This is dictated by the hardware and operating system we use for
production.
- Must run under SunOS, X Windows
- Should support Motif and OpenLook window managers
- Useful to run under Mac, PC, NextStep environments
- What other platforms does the product run on?
- Are there other tools/products designed to work with or extend
the functionality of the product?
8 General Production and Support
We are used to having complete control over all areas of
production including "hacking" our troff-based production tools. In
order for us to be comfortable moving away from internally supported
tools, we require some assurances that our support needs can be met
in a timely manner, and that the tools we use will be developed to
meet current and future needs.
- Should have responsive Customer Support
- Should have active user input into product improvement effort
- Should have searchable, comprehensive online documentation
- Should provide error logging system
- Should provide customizable default environment
- Should provide support for a custom user environment
- Useful to have context-sensitive hyperhelp, annotatable help
- Useful to have email response channel and mailing list or
newsgroup for licensees to discuss problems/solutions
- Who else uses this product? Can you give references to customers
with a comparable use/work environment to ours?
-lar
Lar Kaufman lark@ora.com
Production Tools Specialist
O'Reilly and Associates, Inc. voice: 1-617-354-5800
90 Sherman Street fax: 1-617-661-1116
Cambridge, MA, 02140 U.S.A.