TechWhirl (TECHWR-L) is a resource for technical writing and technical communications professionals of all experience levels and in all industries to share their experiences and acquire information.
1. Define "content" as it applies to element naming conventions. Define it
in such a way that a clear and unambiguous distinction can be made between
a name that conveys content and one that describes a document object such
as a paragraph, a text range within a paragraph, a graphic, a table, or a
list.
Define it also in such a way that a clear and unambiguous distinction can
be made between a name that conveys nothing but content and one that
conveys format. Your description shall not include any cant, SGML purist
insider jargon, or other escape mechanisms that seek to avoid the many
contradictions involved in making a workable DTD that adheres to the SGML
purist rule that content must be separated from format.
2. An element named Para has many parents, but context alone cannot
determine how the element should be formatted. If defining the formatting
parameters in attributes is forbidden, how do you solve this problem so
that a style sheet of some sort can produce the correct formatting? If your
answer is to use Processing Instructions, explain how that is a better
solution than using attributes for the same purpose.
3. If the name Para is forbidden because it describes a document object
rather than content, would you change its name to P because that name is
"formatting neutral", even though everyone who uses the DTD is supposed to
know that P means Paragraph? If that would be your solution, what content
information does the name P convey which makes it superior to Para?
4. A List element has the content model (Item, Item+). It is used to
produce four types of lists: bulleted, arabic-numbered, alpha-numbered, and
indented text with no prefix. An attribute named "Type" has a name token
group with the permitted values 1, 2, 3, and 4, where each numeric value
specifies one of the four list types. In order for authors to properly
create such lists, each value must be permanently associated with a
particular type. And any style sheet must format the lists according to the
attribute-specified type. Replacement of the numeric values with names such
as bulleted, arabic, alpha, and indented would eliminate the need for
authors to memorize the meaning of each numeric value. Would you refuse to
make such a change on the grounds that it would introduce forbidden
formatting information into the DTD? If so, please explain why the numeric
values are nothing more than a figleaf to conceal the fact that the Type
attribute is specifying formatting information, no matter what kinds of
values are used.
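To make the point of question 4 concrete, here is a minimal Python sketch of
the lookup a style sheet processor would have to perform. The table names, the
prefix descriptions, and the assignment of 1 through 4 to particular list
types are all assumptions made up for illustration. Whichever set of values
the DTD permits, the formatter ends up making the same four formatting
decisions.

```python
# Hypothetical style-sheet lookup tables. The assignment of 1-4 to list types
# and the prefix descriptions are assumptions for illustration only.
NUMERIC_TYPES = {"1": "bullet", "2": "arabic numeral", "3": "letter", "4": "no prefix"}
NAMED_TYPES = {
    "bulleted": "bullet",
    "arabic": "arabic numeral",
    "alpha": "letter",
    "indented": "no prefix",
}

def item_prefix(type_value, table):
    """Return the prefix style a formatter applies for a given Type value."""
    return table[type_value]

# Either encoding leads the formatter to the same four formatting decisions.
print(item_prefix("2", NUMERIC_TYPES) == item_prefix("arabic", NAMED_TYPES))  # True
```

The only practical difference between the two tables is that an author has to
memorize the first one.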
5. Suppose that the content model of the Item element in the List element
above is:
(#PCDATA, List?)
which would optionally allow another List to be nested under an item. But
suppose further that such nesting is not allowed under the Indented text
list type, and that, for the bulleted list type, only lists of type
Bulleted may be nested. To make this possible, the content model
for the List element would have to be changed to (Bulleted | Arabic | Alpha
| Indented) so that there could be a separate content model for each list
type. Also, the Type attribute would be removed from the List element,
since it is no longer needed. Now, with this change, the content model for
the Bulleted element would be:
((Item, Bulleted?), (Item, Bulleted?)+)
whereas the content model for the Arabic element would be:
((Item, (Arabic | Bulleted | Alpha)?), (Item, (Arabic | Bulleted | Alpha)?)+)
Now, you have element names (Arabic, Alpha, Bulleted, Indented) which are
clearly conveying formatting information. What would you do? Would you
change the element names for these list types to Type1, Type2, Type3, Type4
so as to once again conceal with a figleaf the fact that these elements
are describing the forbidden formatting information?
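The nesting rules in question 5 can be sketched as a small checker. This is
only an illustration in Python, using the element names from the revised List
content model (Bulleted, Arabic, Alpha, Indented). The table ALLOWED_NESTED and
the function is_valid_list are hypothetical names, and the rule for Alpha is an
assumption, since no content model for Alpha is given above.

```python
import xml.etree.ElementTree as ET

# Which list types may be nested after an Item of each list type.
ALLOWED_NESTED = {
    "Bulleted": {"Bulleted"},
    "Arabic": {"Arabic", "Bulleted", "Alpha"},
    "Alpha": {"Arabic", "Bulleted", "Alpha"},  # assumption: same as Arabic
    "Indented": set(),                          # no nesting allowed
}

def is_valid_list(elem):
    """Check one list element: at least two Items, each optionally
    followed by a single nested list of a permitted type."""
    allowed = ALLOWED_NESTED.get(elem.tag)
    if allowed is None:
        return False
    items = 0
    previous_was_item = False
    for child in elem:
        if child.tag == "Item":
            items += 1
            previous_was_item = True
        elif child.tag in allowed and previous_was_item:
            if not is_valid_list(child):
                return False
            previous_was_item = False
        else:
            return False
    return items >= 2

doc = ET.fromstring(
    "<Bulleted><Item>a</Item>"
    "<Bulleted><Item>x</Item><Item>y</Item></Bulleted>"
    "<Item>b</Item></Bulleted>"
)
print(is_valid_list(doc))  # True
```

Note that the checker has to dispatch on the element name, which is exactly the
formatting-laden name the purist rule objects to.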
6. Until the CALS table model came along, there was no viable way to
describe how to build and display an SGML table. This apparently was
because SGML purists could not bear the thought that any workable solution
would inevitably introduce formatting into the DTD. The element names in
the CALS table model describe document objects, not content, and most of
the attributes for each element in the model describe how to format the
table. The acceptance of the CALS table model is almost universal. How do
you explain this exception, and what makes it different from the many
needed exceptions which you reject?
7. The requirements specification for developing a DTD identifies certain
situations where four equally important facets (A, B, C, and D) of content
are present, which can appear singly or in any combination. Thus the
following facet combinations can occur: A, B, C, D, AB, AC, AD, BC, BD, CD,
ABC, ABD, ACD, BCD, ABCD. Would you create and name an element for each
possible combination,
or would you create a single element with attributes to describe each
facet, where the default for each attribute is no value, or would you do
something else?
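For a sense of scale in question 7, the following Python sketch counts how
many element names the first option would need if the four facets really can
appear in any combination. The function name facet_combinations is
hypothetical; the facet letters are the ones used above.

```python
from itertools import combinations

FACETS = ["A", "B", "C", "D"]

def facet_combinations(facets):
    """Return every non-empty combination of facets, shortest first."""
    return [
        "".join(combo)
        for size in range(1, len(facets) + 1)
        for combo in combinations(facets, size)
    ]

combos = facet_combinations(FACETS)
print(len(combos))  # 15
print(combos)
```

One element per combination therefore means 15 element names for four facets,
and the count roughly doubles with each facet added, whereas the attribute
approach needs only one optional attribute per facet.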
8. To further elaborate on my statement arguing the need for multiple facets
to describe information content, consider the new XML-based Resource
Description Framework (RDF), whose purpose is (among others) to
facilitate database search and retrieval of information. RDF description
patterns are applicable to individual nodes or elements within documents as
well as to whole documents. Each RDF includes a Universal Resource Identifier
that uniquely specifies where the resource is located (e.g., within a
database, a file, or an element whose ID attribute specifies an absolute or
relative XPointer location term).
RDFs can be created independently, or they can be embedded in the structure
of the document, or both. There is no reason that I can think of why this could
not be incorporated into SGML documents as well as in XML ones. It is
possible to define many different description patterns, some more elaborate
than others. If RDF offers a much better and more comprehensive way to
describe information content at any level of structure, do you believe it
might moderate the SGML purists' insistence that element names must always
describe content? If not, why not?
9. The SGML purists' claim is that hardcoding formatting attribute values
into the data is wrong, and that the application should be responsible for
rendering it so that the data can be used with different media. But XML
defines a new style sheet standard, XSL. Using middleware, it should be
possible to extract XML data from a database, and build a customized style
sheet on the fly to fit the requirements of the user (human or non-human)
who initiated the database query. If style sheets become dynamically
generated, doesn't that make the purists' concern irrelevant? Why not
hardcode the formatting for the most demanding formatting requirement
(e.g., high-quality printed books), and let the middleware either ignore
formatting attributes or modify how they are used, depending on the media
and the end user?
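The middleware idea in question 9 might look something like the following
Python sketch. It is an assumption-laden illustration, not a description of
any real XSL tool: the function prepare_for_medium, the set FORMATTING_ATTRS,
and the attribute names Type and Indent are all hypothetical. Formatting
attributes are kept for print and simply ignored for every other medium.

```python
import xml.etree.ElementTree as ET

# Hypothetical names of attributes that carry formatting information.
FORMATTING_ATTRS = {"Type", "Indent"}

def prepare_for_medium(xml_text, medium):
    """Return the document as a string, dropping formatting attributes
    unless the target medium is print."""
    root = ET.fromstring(xml_text)
    if medium != "print":
        for elem in root.iter():
            for name in FORMATTING_ATTRS & set(elem.attrib):
                del elem.attrib[name]
    return ET.tostring(root, encoding="unicode")

source = '<List Type="bulleted" id="l1"><Item>a</Item><Item>b</Item></List>'
print(prepare_for_medium(source, "print"))
print(prepare_for_medium(source, "speech"))
```

A fuller version would rewrite the attributes for each medium rather than
drop them, but the principle is the same: the hardcoded values are a
starting point that the middleware is free to override.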
10. Why do most of the commonly used DTDs (J2008, ISO 12083, DocBook,
MIL-M-38784, HTML, and even the ATA DTD) violate with wild abandon the
SGML purists' view of how a DTD should be built? Is it because the people
who developed them just don't get it right, or is it because,
pragmatically, the reductionistic viewpoint of the purists is simply
impractical in the real world?
====================
| Nullius in Verba |
====================
Dan Emory, Dan Emory & Associates
FrameMaker/FrameMaker+SGML Document Design & Database Publishing
Voice/Fax: 949-722-8971 E-Mail: danemory -at- primenet -dot- com
10044 Adams Ave. #208, Huntington Beach, CA 92646
---Subscribe to the "Free Framers" list by sending a message to
majordomo -at- omsys -dot- com with "subscribe framers" (no quotes) in the body.