Date: Tue, 29 Aug 2000 08:23:16 +0100 (WET DST)
From: Laurent Siebenmann <sieben@cristal.math.u-psud.fr>
Message-Id: <200008290723.IAA20883@stats.math.u-psud.fr>
To: CyrTeX-en@vsu.ru, sieben@cristal.math.u-psud.fr
Subject: Re: ASCII-Cyrillic


Hi Vit Rudovitch!

Thanks for posting Valery Alexeev's russian.el, an impressive
compilation of existing encodings/representations for
Russian.

Considering only the readable/typable ASCII encodings, I notice:

(defconst russian-encoding-naive
'("a" "b" "v" "g" "d" "e" "e" "zh" "z"
"i" "j" "k" "l" "m" "n" "o" "p"
"r" "s" "t" "u" "f" "h" "c" "ch"
"sh" "sch" "'" "y" "'" "e" "yu" "ya"
"A" "B" "V" "G" "D" "E" "E" "Zh" "Z"
"I" "J" "K" "L" "M" "N" "O" "P"
"R" "S" "T" "U" "F" "H" "C" "Ch"
"Sh" "Sch" "'" "Y" "'" "E" "Yu" "Ya"))

(defconst russian-encoding-libcon
'("a" "b" "v" "g" "d" "e" "e" "zh" "z"
"i" "j" "k" "l" "m" "n" "o" "p"
"r" "s" "t" "u" "f" "x" "ts" "ch"
"sh" "shch" "\"" "y" "'" "e" "ju" "ja"
"A" "B" "V" "G" "D" "E" "E" "ZH" "Z"
"I" "J" "K" "L" "M" "N" "O" "P"
"R" "S" "T" "U" "F" "X" "TS" "CH"
"SH" "SHCH" "\"" "Y" "'" "E" "JU" "JA"))

(defconst russian-encoding-tex
'("{a}" "b" "v" "g" "d" "e" "\\\"e" "{zh}" "z"
"i" "{\\u\\i}" "k" "l" "m" "n" "o" "p"
"r" "{s}" "{t}" "u" "f" "{h}" "{ts}" "{ch}"
"{sh}" "{sch}" "{\\cdprime}" "{y}" "{\\cprime}" "\\'e" "{yu}" "{ya}"
"{A}" "B" "V" "G" "D" "E" "\\\"E" "{ZH}" "Z"
"I" "{\\u\\I}" "K" "L" "M" "N" "O" "P"
"R" "{S}" "{T}" "U" "F" "{H}" "{TS}" "{CH}"
"{SH}" "{SCH}" "{\\Cdprime}" "{Y}" "{\\Cprime}" "\\'E" "{YU}" "{YA}"))

Of course I was more or less aware of these.
One can view my tentative encoding of Russian letters
as a modification/simplification of them:

"zh"                   ===> "'z"
"e"="\\\"e"            ===> "'o"
"h"                    ===> "x"
"c"="ts"="{ts}"        ===> "'t"
"ch"="{ch}"            ===> "c"
"sh"="{sh}"            ===> "w"
"sch"="shch"="{sch}"   ===> "'w"
"'"="\""="{\\cdprime}" ===> "q"
"'"="{\\cprime}"       ===> "h"
"e"="\\'e"             ===> "'e"
"yu"="ju"="{yu}"       ===> "'u"
"ya"="ja"="{ya}"       ===> "'a"

-- and similarly for capital letters. Basically, I have
shifted from the "ligature" paradigm to the "accent"
paradigm.
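
To make the "accent" paradigm concrete, here is a minimal Python
sketch (not part of russian.el or of any released tool). It covers
lowercase letters only and ignores the Latin/TeX admixture. The
names ACCENT, PLAIN, encode and decode are hypothetical. Two
assumptions: the single-letter codes are taken from the positions
on which the three quoted tables agree, and "ya" is given the code
"'a", since it must differ from the "'u" of "yu" for decoding to be
unambiguous.

```python
# Codes that differ from the ligature tables, as listed in the message.
ACCENT = {
    "ж": "'z", "ё": "'o", "х": "x", "ц": "'t", "ч": "c", "ш": "w",
    "щ": "'w", "ъ": "q", "ь": "h", "э": "'e", "ю": "'u",
    "я": "'a",  # assumption: must be distinct from "'u"
}

# Single-letter codes on which the three quoted tables agree (assumption).
PLAIN = dict(zip("абвгдезийклмнопрстуфы", "abvgdezijklmnoprstufy"))

ENCODE = {**PLAIN, **ACCENT}
DECODE = {code: letter for letter, code in ENCODE.items()}


def encode(text):
    """Replace each known Cyrillic letter by its ASCII code."""
    return "".join(ENCODE.get(ch, ch) for ch in text)


def decode(code):
    """Invert encode(): an apostrophe announces a two-character code."""
    out, i = [], 0
    while i < len(code):
        n = 2 if code[i] == "'" and code[i:i + 2] in DECODE else 1
        token = code[i:i + n]
        out.append(DECODE.get(token, token))  # unknown characters pass through
        i += n
    return "".join(out)
```

Since every code is either one letter or an apostrophe plus one
letter, the decoder never has to guess where a code ends: for
instance encode("схема") gives "sxema" and encode("ещё") gives
"e'w'o", and both decode back to the original words.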

The encoding "russian-encoding-tex" is essentially due to
Barbara Beeton of the AMS.  This version has enough braces
added to prevent the dangerous ambiguities inherent in
ligature typing.  But readability and typability suffer.
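
A small Python sketch of that ambiguity, using the lowercase half of
"russian-encoding-naive" as quoted above. The names CYR, NAIVE, TEX,
encode_naive and encode_tex are hypothetical, and TEX holds only the
few entries of "russian-encoding-tex" needed for the example.

```python
# The 33 lowercase Russian letters in alphabetical order.
CYR = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"

# Lowercase half of russian-encoding-naive, in the same order.
NAIVE = ["a", "b", "v", "g", "d", "e", "e", "zh", "z",
         "i", "j", "k", "l", "m", "n", "o", "p",
         "r", "s", "t", "u", "f", "h", "c", "ch",
         "sh", "sch", "'", "y", "'", "e", "yu", "ya"]

NAIVE_MAP = dict(zip(CYR, NAIVE))

# A few entries of russian-encoding-tex, enough for the example.
TEX = {"с": "{s}", "х": "{h}", "ш": "{sh}", "е": "e", "м": "m", "а": "{a}"}


def encode_naive(word):
    """Ligature typing without braces."""
    return "".join(NAIVE_MAP[ch] for ch in word)


def encode_tex(word):
    """Ligature typing with the protective braces."""
    return "".join(TEX[ch] for ch in word)


# "s" + "h" collides with the ligature "sh".
collision = encode_naive("схема") == encode_naive("шема")

# The braces keep the two words apart, at some cost in readability.
protected = encode_tex("схема") != encode_tex("шема")
```

Here both words come out as "shema" in the naive encoding, so no
decoder can recover the original, whereas the braced forms
"{s}{h}em{a}" and "{sh}em{a}" stay distinct.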

In my tentative encoding, a key feature is that bits of
Latin/English and of TeX are tolerated, in the good sense that
one can read and type them on the one hand, and also decode to
8-bit with 100% fidelity on the other. Thus:

\begin{document}  becomes simply  \begin{!document}

without ambiguity.

Are there other efforts towards 100% precise ASCII
representation of Russian allowing a realistic admixture of
Latin and TeX?

              Cheers

                   Laurent Siebenmann