Dear Colleagues,
I have now tentatively completed the utility one needs
to exploit the new ASCII-Cyrillic transcription of
Russian -- whose novelty compared with the Library
of Congress transcription is 100% fidelity. Indeed
ASCII-Cyrillic is entirely "lossless" for arbitrary
8-bit text files, a transcription feature that has
perhaps never been contemplated. I hope the readability
and typability of the Library of Congress transcription
are essentially preserved.
The Library of Congress transcription increases the bulk
of simple Russian text by only about 5%, and ASCII-Cyrillic
does slightly better. After gzip compression the bulk
is within 1% of the 8-bit gzipped original.
Do not hesitate to confront me with bugs; I expect
there will be quite a few. Indeed the "losslessness"
requires a good deal of the sort of "escape" trickery
one sees in Unix shell command language, and TeX is
not the best language for handling that. Nevertheless
TeX is perhaps the most widely implemented language
and also a favorite of those who will benefit from
lossless transcription.
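To illustrate the kind of "escape" trickery meant here, the
following Python sketch shows one way a transcription can stay
lossless for arbitrary 8-bit input. It is a toy under stated
assumptions and not the actual ASCII-Cyrillic scheme: the
five-letter table, the KOI8-R input encoding, the "!" escape
character and the "!#HH" hex form are all hypothetical choices
made for this example only.

```python
# Toy illustration only: the table, the KOI8-R input encoding and the
# '!' escape are hypothetical, not the real ASCII-Cyrillic conventions.
PAIRS = [("а", "a"), ("б", "b"), ("в", "v"), ("г", "g"), ("д", "d")]
TO_ASCII = {cyr.encode("koi8_r")[0]: lat for cyr, lat in PAIRS}
FROM_ASCII = {lat: byte for byte, lat in TO_ASCII.items()}
ESC = "!"


def encode(data: bytes) -> str:
    """Transcribe arbitrary 8-bit data into pure ASCII text."""
    out = []
    for b in data:
        if b in TO_ASCII:
            # Cyrillic letter with a readable ASCII stand-in.
            out.append(TO_ASCII[b])
        elif b >= 0x80:
            # Any other 8-bit byte: escaped two-digit hex code.
            out.append(f"{ESC}#{b:02X}")
        elif chr(b) == ESC or chr(b) in FROM_ASCII:
            # ASCII character that would otherwise be misread on decoding.
            out.append(ESC + chr(b))
        else:
            # Remaining ASCII passes through unchanged.
            out.append(chr(b))
    return "".join(out)


def decode(text: str) -> bytes:
    """Invert encode(); assumes text was produced by encode()."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != ESC:
            out.append(FROM_ASCII.get(ch, ord(ch)))
            i += 1
        elif text[i + 1] == "#":
            out.append(int(text[i + 2:i + 4], 16))
            i += 4
        else:
            out.append(ord(text[i + 1]))
            i += 2
    return bytes(out)
```

In this toy, the two KOI8-R bytes for "да" are transcribed as "da",
while a literal ASCII "da" in the input becomes "!d!a", so the two
can never be confused when decoding. Every one of the 256 byte values
survives the round trip, which is the property that "lossless" names.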
Here is the provisional URL:
http://topo.math.u-psud.fr/~lcs/ASCII-Cyrillic/
My warm thanks go to all of you who have already
contributed helpful comments.
Best wishes
Laurent Siebenmann