Eliminating .txt file hard returns -Reply 2

Subject: Eliminating .txt file hard returns -Reply 2
From: Peter Collins <peter -dot- collins -at- BIGFOOT -dot- COM>
Date: Thu, 22 Oct 1998 17:32:07 +1000

Correction to "This can be done in Word by doing a find and replace on two
consecutive Paragraph Marks and replacing them with one paragraph mark.
Repeat process until there are no longer any hits. This means that there
are no longer any places in the document where two consecutive hard returns
exist."
This may push consecutive words together and be very hard to unpick.
Text files with a ".txt" extension usually have one hard return per line
and several between paragraphs. Opening the file as TXT in Word usually,
but not always, sorts most of them out. The following sequence assumes that
this has not been completely successful and that one or more of the
following may hold: each line ends with a paragraph mark, which may or may
not be preceded or followed by a <space>; contextual paragraphs are
separated by at least two paragraph marks, which are either adjacent or
separated only by spaces; there are other cases that you may have to
treat, such as the indented paragraph simulated by strings of leading
spaces or tabs on each consecutive line. Similar cases, such as bullet
lists (or asterisks in their stead), will need a tactic similar to that
given in the list below.
My experience is that some selection of the above characteristics
generally holds true of such texts. The steps are as follows, using ^p for
a paragraph mark and <space> for a single space:
1. Visually scan the document for 'indented' 'paragraphs'. Place some
unique text (such as "#indented#" or "#bulleted#" as the case may be) at
the start of each to allow you to find them later and apply a suitable
style to the same end. I know of no reliable way to do this step
automatically, given all the other perversions that may also be in the
document.
2. If there is a "^p" on every line and more than one between paragraphs,
replace all "^p" with "<space>", once only, to collapse line-separated
paragraphs into word-separated text. If you omit the space you risk
concatenating adjacent words across the line boundary if the original text
generator did not place redundant spaces at the start or end of each line.
Do this only once, so as to leave at least one paragraph mark between
adjacent paragraphs.
3. Replace all "<space><space>" with "<space>" repeatedly until none are
found. You now have a document almost in standard form without simulated
indentation, but with possible perversions between contextual paragraphs.
4. Repeatedly replace all "^p<space>^p" with "^p" until none are found, to
collapse all ^p sets down into adjacent strings of ^p.
5. Repeatedly replace all "^p^p" with "^p" until none are found.
6. Search for "#indented#", set the style to an indented one, and search
again until the end of the document. After this is done your indent marker
is no longer required.
7. Replace all "#indented#" with "" (nothing at all in the replace field),
thus removing the obsolete markers.
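For anyone who would rather script the clean-up than run it by hand in
Word, the intent of steps 2 to 5 can be sketched in Python. This is only an
illustration under assumptions, not the Word procedure itself: the function
name unwrap_hard_returns is made up for the example, "\n" stands in for ^p,
and the sketch splits the text at paragraph breaks first and only then
turns the remaining returns into spaces, so the breaks between paragraphs
survive whatever order the replacements run in.

```python
import re


def unwrap_hard_returns(text: str) -> str:
    """Join hard-wrapped lines into paragraphs, one return per paragraph.

    Hypothetical helper for illustration; "\n" plays the role of ^p.
    """
    # Normalise line endings so every hard return is a single "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # Steps 4 and 5: two or more returns, adjacent or separated only by
    # spaces or tabs, count as one paragraph break.
    paragraphs = re.split(r"\n(?:[ \t]*\n)+", text)

    cleaned = []
    for para in paragraphs:
        # Step 2: a remaining return becomes a space (never nothing), so
        # words on adjacent lines are not pushed together.
        para = para.replace("\n", " ")
        # Step 3: collapse runs of spaces or tabs into a single space.
        para = re.sub(r"[ \t]+", " ", para).strip()
        if para:
            cleaned.append(para)

    # One return between paragraphs, none inside them.
    return "\n".join(cleaned)
```

Note that leading spaces are thrown away along with everything else, which
is why any simulated indentation has to be marked (step 1) before this is
run.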
Using these sorts of stratagems you should be able to work out how to
deal with asterisk bullets (hint - "^p*") and the like, but do ask me if
you have such problems and need further help with them.
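As a rough illustration of the "^p*" hint, here is a Python sketch that
tags asterisk bullets and indented paragraphs in the raw text before any
returns are removed. It is a heuristic under assumptions, not a reliable
replacement for scanning the document by eye: the function name
mark_special_paragraphs is invented for the example, a bullet is assumed to
be an asterisk followed by a space or tab at the start of a line, and a
paragraph counts as indented only if every one of its lines starts with a
space or tab. Lines that merely happen to begin with an asterisk will be
tagged wrongly.

```python
import re

# Assumed bullet shape: optional leading blanks, "*", then a space or tab.
BULLET = re.compile(r"[ \t]*\*[ \t]+")


def mark_special_paragraphs(text: str) -> str:
    """Tag bullets and indented paragraphs with "#bulleted#"/"#indented#".

    Hypothetical helper for illustration; run it on the raw text first.
    """
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # The "^p*" case: a return followed by a bullet starts a new paragraph,
    # so give it a blank line of its own.
    text = re.sub(r"\n(?=[ \t]*\*[ \t])", "\n\n", text)

    marked = []
    for block in re.split(r"\n(?:[ \t]*\n)+", text):
        lines = [ln for ln in block.split("\n") if ln.strip()]
        if not lines:
            continue
        bullet = BULLET.match(lines[0])
        if bullet:
            # Drop the asterisk and leave a marker in its place.
            lines[0] = "#bulleted#" + lines[0][bullet.end():]
        elif all(ln[0] in " \t" for ln in lines):
            # Every line is indented: treat as a simulated indent.
            lines[0] = "#indented#" + lines[0].lstrip()
        marked.append("\n".join(lines))

    # Keep a blank line between paragraphs for the later clean-up steps.
    return "\n\n".join(marked)
```

The markers can then be found and replaced with real styles once the hard
returns have been dealt with, as in steps 6 and 7.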
regards and good luck
P
========================================================
Peter Collins, VIVID Management Pty Ltd,
26 Bradleys Head Road, MOSMAN 2088, Australia
+61 2 9968 3308, fax +61 2 9968 3026, mobile +61 (0)18 419 571
Management Consultants and Technical Writers
email: peter -dot- collins -at- bigfoot -dot- com ICQ#: 10981283
web pages: http://www.angelfire.com/pe/pcollins/
========================================================
