Re: character string order

Subject: Re: character string order
From: Sandy Harris <sharris -at- dkl -dot- com>
To: TECHWR-L <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Wed, 12 Jan 2000 11:56:11 -0500

Geoff Lane wrote:
>
> > -----Original Message-----
> > From: Benzi Schreiber
> >
> > I have a bunch of character strings that are sorted as follows:
> >
> > adam < armageddon < bob < beryl
> >
> > The programmers are calling this "lexicografical order", but
> > the closest
> > I've found is "lexicografic". Does the word "lexicografical" exist?
> > Does either of these words describe the order I'm using?
> ---
>
> "Lexicographical" exists in the Oxford English Dictionary.

British spelling, and certainly how I (Canadian) would spell it.
Are the forms with 'f' standard American or just errors?
(I'm mildly horrified either way :-)

> It is an adjective related to "compiling a dictionary" -- my
> interpretation is that it means "in dictionary order".

Mine, too. I'd use the simpler term "lexicographic order", but would
understand either.

> Your example is not in this order (it should be adam,
> armageddon, beryl, bob).
>
> If the strings only contain alpha characters and the sort is not
> case-sensitive, I'd describe the sorting as 'alphabetical'. Otherwise I'd
> use the method that the program does the sorting (for example, "in ASCII
> order" or, "in EBCDIC order") and give a simple example if necessary.

I agree there.
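
To make the difference between "alphabetical" and "ASCII order" concrete, here is a minimal Python sketch. The word list is made up for illustration (the original example, with capitals added), and it assumes plain ASCII letters only.

```python
words = ["bob", "Beryl", "adam", "Armageddon"]

# Plain character-code (ASCII) order: every uppercase letter sorts
# before any lowercase letter.
ascii_order = sorted(words)

# Case-insensitive "alphabetical" order: compare on a case-folded key.
alphabetical = sorted(words, key=str.casefold)

print(ascii_order)   # ['Armageddon', 'Beryl', 'adam', 'bob']
print(alphabetical)  # ['adam', 'Armageddon', 'Beryl', 'bob']
```

A one-line example like the second list is probably all a reader needs to see which order the program uses.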

I think "lexicographic order" implies some attempt to use rules beyond
that, though I'm not sure exactly what those should be. Some possible
rules would sort:

"22" among the 't' words, as if it were spelled out "twenty-two"
"22" after "3", although a straight character sort puts it before
"St. Louis" as if it were spelled out "Saint L..."
"St. Louis" as if it were "StLouis", ignoring non-alphabetic characters

Dictionaries don't use a simple ASCII sort. If you say you're sorting
in "lexicographic order", I'm going to assume that you aren't either,
and that your sort procedure implements some set of rules akin to
those above.
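
Two of the rules above can be sketched as Python sort keys. This is only an illustration under stated assumptions, not what any dictionary or program actually does: the function names, the one-entry abbreviation table and the sample data are all hypothetical.

```python
import re

# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {"st.": "saint"}

def numeric_key(s):
    """Compare digit runs by numeric value, so "3" sorts before "22"."""
    key = []
    # re.split with a capturing group alternates text, digits, text, ...
    for i, part in enumerate(re.split(r"([0-9]+)", s)):
        if not part:
            continue
        if i % 2 == 1:
            key.append((0, int(part), ""))       # digit run: compare as a number
        else:
            key.append((1, 0, part.casefold()))  # text run: compare as text
    return key

def expanded_key(s):
    """Expand known abbreviations, then keep only letters and digits."""
    words = [ABBREVIATIONS.get(w.casefold(), w) for w in s.split()]
    return re.sub(r"[^a-z0-9]", "", "".join(words).casefold())

numbers = ["22", "3", "10"]
places = ["Stanton", "St. Louis", "Salem", "Samson"]

by_character = sorted(numbers)                      # ['10', '22', '3']
by_value = sorted(numbers, key=numeric_key)         # ['3', '10', '22']
plain_places = sorted(places)                       # ['Salem', 'Samson', 'St. Louis', 'Stanton']
expanded_places = sorted(places, key=expanded_key)  # ['St. Louis', 'Salem', 'Samson', 'Stanton']
```

With the expanded key, "St. Louis" files under "saintlouis" and so moves ahead of "Salem". Whichever rules a program applies, the documentation should name them rather than rely on the word "lexicographic" alone.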

The Unix sort utility has a -d (dictionary order) option. A look at
its manual and some experimentation with the utility might be useful.
I suspect, though, that it is only a partial solution, implementing
some convenient subset rather than everything a lexicographer might
want.
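
As I understand the manual, -d makes only blanks and alphanumeric characters significant in comparisons. A rough Python approximation of that one rule follows. It is a sketch only: it compares by character code and ignores locale, which the real utility does not, and the function name and sample names are made up.

```python
def dictionary_key(s):
    """Keep only letters, digits, spaces and tabs; drop all other characters."""
    return "".join(ch for ch in s if ch.isalnum() or ch in " \t")

names = ["St. Louis", "St-Cloud", "St Albans"]

plain = sorted(names)
dictionary = sorted(names, key=dictionary_key)

print(plain)       # ['St Albans', 'St-Cloud', 'St. Louis']
print(dictionary)  # ['St Albans', 'St. Louis', 'St-Cloud']
```

Note that this does nothing about numbers or abbreviations, which is the sort of gap I would expect in the utility as well.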

There was a classic paper on computerized sorting by such rules. I
don't recall if it was lexicographic rules or the different set used
in phone books. I thought it was by Knuth, but checking his home page:

http://www-cs-staff.stanford.edu/~knuth/index.html

I cannot find that paper. If anyone does, please let me know.



