Subject: Vanilla HTML from Word
From: Jean Weber <jean -at- wrevenge -dot- com -dot- au>
To: "TECHWR-L" <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Thu, 18 Jan 2001 11:33:20 +1000
I should not write things from memory, because when I did a small test I
discovered that it's even easier than I said in my last note to get vanilla
HTML from Word, in situations where you do NOT want to carry over any
formatting from Word to HTML.
BTW I use Word 97. I haven't tried this with Word 2000. I read somewhere
recently (probably in Woody's Office Watch) that there is an update or
patch or something to Office 2000 that stops all the XML junk going into
HTML documents, but that's unlikely to be enough to keep out all the other
junk.
The secret to keeping junk out of Word-HTML conversions is to not "convert"
but to create a file based on the HTML.DOT template, which is supplied with
Office 97 (at least it was in my copy). The easiest way to do this is to
choose File > New, which opens the New dialog box that lists all the
templates. On the Web Pages tab, choose Blank Web Page.
Make sure you use ONLY the H1, H2, H3 etc styles for headings (NOT Heading
1, Heading 2, etc) and Normal for everything else. For a list, first type
the list items, then select all of them and click either the Number or
Bullet button on the toolbar. For bold, italic, or underlined words or
phrases, select the word or phrase and click the relevant toolbar button.
Do NO manual font/character formatting except bold, italic, or underline.
Do NO manual layout formatting at all. (I tried some simple tables and they
worked fine; I haven't tested this method with complex tables yet.)
When you save the document, it is saved as an HTM file, not a DOC file.
Open the HTM file in Notepad and you'll see it's clean (except for some
META junk at the top of the file, which is easily stripped out).
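
If you have more than a few files, stripping the META lines by hand gets
tedious. Here is a minimal sketch in Python of one way to do it, assuming
the junk consists only of ordinary <META ...> tags. The sample text is
made up for illustration and is not actual Word 97 output, and the name
strip_meta is hypothetical.

```python
import re

# Matches one whole <meta ...> tag (any letter case), plus trailing whitespace.
# Assumption: no attribute value contains a literal ">" character.
META_TAG = re.compile(r"<meta\b[^>]*>\s*", re.IGNORECASE)


def strip_meta(html: str) -> str:
    """Return the HTML text with every <meta ...> tag removed."""
    return META_TAG.sub("", html)


# Hypothetical sample standing in for a saved HTM file.
sample = (
    "<HTML>\n<HEAD>\n"
    '<META NAME="Generator" CONTENT="Example">\n'
    "<TITLE>Test</TITLE>\n</HEAD>\n"
    "<BODY>\n<H1>Heading</H1>\n<P>Text</P>\n</BODY>\n</HTML>\n"
)
cleaned = strip_meta(sample)
```

To use it on a real file, read the file's text, pass it to strip_meta,
and write the result to a new file so the original stays untouched.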
Converting from an existing document also works, but it's often more fiddly
and takes extra steps. The conversion attaches the HTML.DOT template, but I
find that I usually then have to apply the correct styles (H1 in place of
Heading 1 etc) and make sure everything else is tagged with the Normal
paragraph style. Sometimes I have to fiddle with the Normal style (and
sometimes the heading styles) to make sure they have no font or other
attributes that are different from the Default Paragraph Style.
IF you create a minimalist template in Word and IF you can convince your
contributors to use ONLY certain styles and NO layout fiddles, the
conversion will be remarkably clean, needing minimal if any tidying-up
before being used by your website developers, and it will NOT contain
Microsoft-specific code.
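
Before handing files to the website developers, a quick automated check
can confirm that contributors stuck to the rules. The following Python
sketch flags any tag outside a small whitelist and any style or class
attribute. The whitelist is my own assumption based on the elements
mentioned above (headings, paragraphs, lists, bold/italic/underline,
simple tables); adjust it to whatever your template allows. The names
ALLOWED, TagAudit and audit are hypothetical.

```python
from html.parser import HTMLParser

# Assumed whitelist for illustration only; not a definitive list.
ALLOWED = {
    "html", "head", "title", "body",
    "h1", "h2", "h3", "p", "br",
    "ul", "ol", "li", "b", "i", "u",
    "table", "tr", "td", "th",
}


class TagAudit(HTMLParser):
    """Collects a message for every tag or attribute that looks like junk."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag not in ALLOWED:
            self.problems.append(f"unexpected tag <{tag}>")
        for name, _value in attrs:
            if name in ("style", "class"):
                self.problems.append(f"<{tag}> carries a {name} attribute")


def audit(html: str) -> list:
    """Return a list of problem descriptions; an empty list means clean."""
    parser = TagAudit()
    parser.feed(html)
    parser.close()
    return parser.problems
```

For example, audit('<H1>Title</H1><P>Text</P>') returns an empty list,
while a fragment such as '<p class="MsoNormal">x<o:p></o:p></p>' is
reported for both the class attribute and the o:p tag.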