Mailing List CyrTeX-en@vsu.ru Message #246
From: Vladimir Volovich <vvv@vsu.ru>
Subject: Re: Accented Cyrillic Vocals in Unicode
Date: Wed, 18 May 2011 01:32:08 +0400
To: Cyrillic TeX Users Group <CyrTeX-en@vsu.ru>
Greetings,

Wow, this list sees a revival after about 5 years of silence. :-)

"PT" == Plamen Tanovski writes:

 PT> The problem is, that there are no accented cyrillic vocals in
 PT> unicode except for i and e with grave. Perhaps I don't need to
 PT> mention, that accented vocals are urgently needed for homonyms and
 PT> in textbooks.  So, I think, accented vocals should finally go into
 PT> unicode. But I was told, that unicode doesn't provide slots for
 PT> letters anymore, which can be made by the combination of other
 PT> letters.

I see that this question was already discussed on the Unicode mailing
list: there is a thread "Cyrillic - accented/acuted vowels" on

http://unicode.org/mail-arch/unicode-ml/y2005-m05/thread.html#2
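
The claim about which precomposed letters exist can be checked mechanically. Below is a minimal sketch in Python, using only the standard unicodedata module; the vowel list and the helper name composes are my own choices for illustration. It asks whether NFC normalization folds a Cyrillic vowel plus a combining accent into a single code point.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT
GRAVE = "\u0300"  # COMBINING GRAVE ACCENT

def composes(base, mark):
    """Return True if NFC folds base+mark into one precomposed code point."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1

# Russian vowels, excluding ё (which already carries a diaeresis).
vowels = "аеиоуыэюя"

acute_ok = [v for v in vowels if composes(v, ACUTE)]
grave_ok = [v for v in vowels if composes(v, GRAVE)]

print("precomposed with acute:", acute_ok)
print("precomposed with grave:", grave_ok)
```

Only е and и with grave (U+0450 and U+045D) should compose; every acute-accented vowel stays as a base letter followed by a combining mark.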

 PT> But using combining diacritics in word processors is a big pain. In
 PT> TeX putting accent is much easier, but it looks not always good,
 PT> and the hyphenation doesn't work anymore.

As far as I can see, current engines such as XeTeX and LuaTeX support
OpenType fonts, which can contain precomposed Cyrillic accented vowels.
Moreover, hyphenation seems to work fine in the presence of accented
letters, as shown by the attached example, which should be processed
with "xelatex".

For this example, you need to install the Doulos SIL font in your
~/.fonts folder. Of course, any other font that has the precomposed
accented vowels would do as well.

In this example I see both sentences properly hyphenated at all possible
points. The only oddity I see is that in the first example, one of the
words is hyphenated as два-дцать, but in the other two examples (with
accented vowels), the hyphenation point is different: два́д-цать.

I did not look into how hyphenation "works" in the presence of accented
vowels. Maybe it is by luck that we see the correct places being
selected?  If so, it should be possible to solve it on the engine level
(especially in LuaTeX), by treating accented letters as their
non-accented counterparts for the purposes of hyphenation.
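
To make the idea concrete, here is a rough sketch in Python, not TeX or Lua, so it only illustrates the logic and says nothing about how XeTeX or LuaTeX work internally. The stress marks are removed, break points are looked up for the bare word, and the positions are mapped back onto the accented word. The function toy_breaks is a hypothetical stand-in for a real pattern-based hyphenator, and all other names are mine as well.

```python
STRESS_MARKS = {"\u0300", "\u0301"}  # combining grave and acute

def strip_stress(word):
    """Return the word without stress marks, plus for every remaining
    character its offset in the original (accented) word."""
    bare, offsets = [], []
    for i, ch in enumerate(word):
        if ch not in STRESS_MARKS:
            bare.append(ch)
            offsets.append(i)
    return "".join(bare), offsets

def break_points(word, find_breaks):
    """Find breaks on the unaccented word, then map them back.
    A break at p means 'break before character p'."""
    bare, offsets = strip_stress(word)
    return [offsets[p] for p in find_breaks(bare)]

def show(word, points):
    """Insert hyphens at the given break points."""
    pieces, start = [], 0
    for p in points:
        pieces.append(word[start:p])
        start = p
    pieces.append(word[start:])
    return "-".join(pieces)

def toy_breaks(bare):
    # Hypothetical lookup table instead of real hyphenation patterns.
    return {"двадцать": [3]}.get(bare, [])

plain = "двадцать"
accented = "два\u0301дцать"
print(show(plain, break_points(plain, toy_breaks)))
print(show(accented, break_points(accented, toy_breaks)))
```

With this approach both spellings break at the same place, and the combining accent stays attached to its vowel. Only the two stress marks are stripped, so letters such as й and ё are left untouched.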

 PT> The T2* encodings don't have accented vocals as well, so I think we
 PT> also need such encoding in TeX.

I feel that the 8-bit encodings are a thing of the past...

 PT> So, is that true, that accented cyr. vocals cannot be taken in
 PT> unicode and how do you think about making a proposal to the unicode
 PT> consortium?

Best wishes,
Vladimir