A SURVEY OF THE USE OF MODERN CYRILLIC SCRIPT
including the complete required repertoire of graphic characters

J. W. van Wingen
VERSION 3.0, 1999-03-09

Foreword

This report started as a Netherlands contribution to the development of ISO 10646, UmoCS. It was extended and partially rewritten in the course of participating in the work of CEN PT004 (version 1), and made into Annex 6 of its report. Version 2 is again an independent document, with an extended bibliography and other improvements. The present version (3) is an adaptation of the previous one, made to agree with ISO/IEC 10646-1:1993. Changes are few, and no attempt was made to update the text to the 1999 situation.

0 Introduction

A survey is presented of all languages making use of Cyrillic script today, their spelling history (as far as needed to put things into proper perspective), and the letter forms required for correct spelling. A specification is given of those coded characters that correspond with the elements of the script, with a method for identifying a character uniquely when referencing it. A list is attached giving for every letter the languages that have included it in their alphabet.

Developments after 1989 have thoroughly changed the political situation. To what extent the way of writing the languages in use will be influenced cannot be predicted. To avoid what can only be called pure speculation, this report describes the situation as of 1989-01-01. Possible reforms are discussed in chapter 5. The states are referred to by their names as current at the same date.

1 The script

Cyrillic script was introduced by St. Cyril around 900, inspired by the Greek script. In Russia the letter forms were more or less adapted to those of Latin, a change forced by Peter the Great.
The set of Cyrillic letters is not as uniform in extent as the Latin set, in particular, several letters were added in Serbia, deleting some required for other Slavic languages, and as late as 1917 four letters were removed from the Russian alphabet. In the USSR Cyrillic script was introduced after 1937 for writing almost all non-slavic languages, adding several new forms. Cyrillic script is written from left to right, and has a definite alphabetic order for the letters, (with small deviations for the individual languages). This order is different from that resulting from a Latin alphabet transliterated into Cyrillic, but shares some properties with the order for the Greek alphabet. 2 Political and linguistic situation Up to 1989 Modern Cyrillic script was used for writing the languages of the following states: Yugoslavia, Bulgaria, Mongolia, USSR Primary focus in this report is on the situation in the USSR. That in the other countries is described only summarily. 2.1 Yugoslavia Yugoslavia consisted of the republics of Slovenia, Croatia, Bosnia- Hercegovina, Serbia, Montenegro and Macedonia. The Slavic languages written with Cyrillic script are Serbo-croatian, Macedonian and Ruthenian, which is spoken by a group of 20000 people only. It is closely related to Ukrainian. Serbo-croatian is written with either Latin or Cyrillic script. No intentions for making changes in orthography have been recorded. 2.2 Bulgaria Bulgarian is the language used. A spelling reform occurred after 1945. Recently, discussions on the subject started again. 2.3 Mongolia Mongolian was traditionally written with its own script, running top-down. Since 1941 Cyrillic script is used. Return to the old practices may be proposed now, but this will not add to convenience of writing, because the old script is written top-down, and represents modern pronunciation not very well. 2.4 USSR The USSR was organised into Union republics, reflecting more or less linguistic differences. 
Smaller areas where a separate language regime was required got the status of Autonomous Republics, still smaller ones that of Autonomous Regions (being part of a larger unit). (Every USSR citizen is assumed to belong to a "nationality"; this does not always coincide with the language he actually speaks.)

Table 1 The Union republics of the USSR as of 1989 (with script used)

  RSFSR (Russian Federation)   Cyrillic
  Ukrainia                     Cyrillic
  Byelorussia                  Cyrillic
  Estonia                      Latin
  Latvia                       Latin
  Lithuania                    Latin
  Moldavia                     Cyrillic
  Georgia                      Georgian
  Armenia                      Armenian
  Azerbaidzhan                 Cyrillic
  Turkmenia                    Cyrillic
  Kazakhstan                   Cyrillic
  Uzbekistan                   Cyrillic
  Kirgizia                     Cyrillic
  Tadzhikistan                 Cyrillic

The large language islands are mostly found in the RSFSR, where they constitute 14 autonomous republics. From a linguistic point of view, the following groups of languages may be distinguished (details in Table A of the Annex):

Table 2 Main groups of languages
Source: USSR Census, 1970, 1989. (numbers of speakers in 1000s)

                                        1970     1989
   1 Slavic                           170300   187773
   2 Romanic                            2560     3070
   3 Iranian                            2611     4763
   4 Kaukasian (Northwestern group)      511      674
   5 Kaukasian (Northeastern group)     1831     2740
   6 Finno-Ugric                        2519     2202
   7 Samoyed                              26       29
   8 Turkic                            30341    46386
   9 Mongolian                           416      520
  10 Tunguso-Mandchuric                   27       22
  11 Sino-Tibetian                        36       66
  12 Palaeo-Asiatic                       20       18

It is clear that next to the Slavic group (Russian, Byelorussian and Ukrainian) the Turkic group is the most important. It is also growing fast: from 30341 thousand speakers in 1970 to 46386 thousand in 1989, an increase of about one half.

3 Written and unwritten languages

The languages spoken in the states covered are of great variety, both in nature and in number of speakers. Whilst the more important ones, that is those with the largest cultural extent, have been written for centuries, the others were not written at all until missionaries attempted to create a script suitable for printing translations of religious material, using not only Cyrillic but also Arabic script in areas where Islam predominated.
Many of these scripts were individual enterprises; the development towards officially written languages was completed only in the Soviet period. This development did not take a straight course. The idea, after 1917, was to provide a separate letter for every phoneme, of which the number was often large (75 for Abkhasian). In the first period Latin script was chosen, extended with newly invented forms, sometimes taken from the International Phonetic Alphabet. This approach did not prove to be very practical, so after some years simplifications were introduced.

Table 3 Languages using Latin script (first phase)

  401 Abkhasian   1924-1929
  404 Kabarda     1923-1928
  803 Azeri       1922-1928
  816 Yakut       1920-1929

After these experiments, Latin script was introduced on a large scale for the non-slavic languages (excepting Georgian and Armenian). But even then 26 letters were not considered sufficient. Variants carrying a kind of diacritic were introduced, which extended the alphabet with 6-8 (Turkic), or even 16-18 (Kaukasian) additional forms. Because of the typographical problems involved with this way of writing, all these languages changed to Cyrillic script in 1937-1940.

Table 4 Languages using Latin script (second phase)
(Where Arabic script was used in the past an "A" is added.)

   302 Tadzhik       1928-1940  A
   303 Osetic        1923-1937
   401 Abkhaz        1929-1937
   402 Abazin        1932-1937
   403 Adygei        1926-1937  A
   404 Kabarda       1928-1937  A
   501 Chechen       1925-1937  A
   502 Ingush        1923-1937  A
   503 Avar          1928-1937  A
   504 Dargva        1928-1937  A
   505 Lezghin       1928-1937  A
   506 Tabasaran     1931-1937
   507 Lak           1928-1937  A
   610 Mansi         1931-1937
   621 Khanti        1931-1937
   701 Nenets        1931-1937
   702 Selkup        1931-1940
   802 Turkmen       1927-1940  A
   803 Azeri         1928-1940  A
   804 Tatar         1928-1937  A
   805 Bashkir       1927-1940  A
   806 Kumyk         1927-1937  A
   807 Kara-kalpak   1928-1940  A
   808 Nogai         1928-1937  A
   809 Kazakh        1928-1940  A
   810 Kirghiz       1927-1940  A
   811 Altaic        1929-1937
   812 Karachai      1924-1937  A
   813 Balkar        1924-1937  A
   814 Uzbek         1927-1940  A
   815 Uigur         1928-19 ?  A
   816 Yakut         1929-1937
   817 Tuvinian      1932-19 ?
   818 Khakass       1931-1937
   901 Buryat        1929-1938
   902 Kalmyk        1930-1938
  1001 Evenki        1931-1937
  1002 Even          1931-1937
  1003 Nanay         1931-1937
  1101 Dungan        1928-19 ?
  1201 Chukcha       1931-1937
  1203 Nivkh         1931-1940
  1204 Eskimo        19 ?-1937

With the conversion to Cyrillic script a different approach was chosen. Phonemes not obviously corresponding to existing letters were assigned digraphs, trigraphs and even tetragraphs in the orthography. Thus the Kaukasian languages (except Abkhasian), despite having a rich variety of phonemes, can be written with little more than basic Cyrillic. With other languages the effort was less coordinated, with the result that the same sound is written with a different letter from one language area to the next. Many new letters for consonants were created by adding a little hook to an existing one; this hook is, however, not considered a diacritic. Even where a breve or diaeresis is applied, it is thought of as part of the letter, not as a separable element.

After the introduction of a Cyrillic writing system the situation stabilised, as is shown in the publications referenced. These form the basis of this report. Even today not all languages spoken are being written, but all those used in the USSR by more than 20000 people are. (The smallest registered language community is that of Livonian, spoken by only 99 people as their mother language, and by another 30 as their second.) Languages spoken in the Baltic countries always retained their Latin script, as Georgian and Armenian retained scripts of their own.

For each of these languages a text example is given in the book by Gilyarevskij and Grivnin (G&G), together with its alphabet.
The book by Musaev explains the use of each individual letter in each language, and points out where a phoneme in common is represented by a different letter, or the same letter applied to different use and pronunciation. Information of this kind may serve to design a coding system with alternative renderings for the same character in separate areas. Since publication of these books the number of up to yet still unwritten languages has been further reduced, but because most of these are only spoken by no more than a few 1000's of people they are being ignored in this report. As far as could be verified within the time limits allowed for this survey, no further change of spelling occurred in any of the countries after 1948, except in minor details, like with Azeri in 1958. Thus, for the USSR, the tables in the book by Musaev (1965) can still be taken as a basis. Where there was any doubt, or a difference between both books, the most recent edition of the lexicon for each language, if available, was consulted. The rules for spelling are everywhere strict, (also with the other countries mentioned), but the nature of the document giving those rules could not always be identified with certainty. It may be a lexicon published under state supervision, or of that of an Academy of Sciences, or the letter of a Law. Inasfar these documents have been identified, they are included in the Bibliography. 4 The complete repertoire Based on the available sources, a list of all the letters and other characters, required for Modern Cyrillic, was prepared, that was also suitable for coding in an International Standard. This was carefully checked with the list in the Comments with the Vote of the USSR Member Body (GOST) on the first DIS 10646, being the most authoritative source (GOST has been disbanded since). 4.1 Naming and identification The ISO rules for identification and referencing of characters require that each of them be given a unique full name. 
This one is included in every standard as part of its "repertoire", the list of characters for which a coding is specified. These names are formed to a definite syntax (described in Annex K of ISO/IEC 10646-1). In many practical situations a shorter form is being preferred. Several have been proposed. That used in this report is derived from the IBM system, based in essence on that from ISO/IEC 6937:1994. It had to be suitably extended to cover the whole repertoire of the ISO/IEC 10646-1:1993. Table B of the Annex presents the list of characters, grouped to the type of these short identifiers. 4.2 Proper justification Only those letters have been included in the repertoire for which the available sources, like the GOST list, provide a justification. Thus Table C of the Annex presents a list of characters, where for each of these the language is indicated that makes use of it, by way of a number code (specified in Table A of the Annex). It is ordered according to the codes assigned to each letter in ISO/IEC 10646-1. 4.3 Distribution of letters To get an insight in the extent Modern Cyrillic letters are used in various languages, Table D of the Annex presents for each of these languages those letters that are required. To enable comparisons and to find out which ones are also used in other languages, the number codes of these are supplied. In this way it is easy to see which letters are unique to a certain language, and which are also used by others. 4.4 Possible selections for a supplementary set The supplementary set for the Videotex system, such as is being proposed as a result of the work of the CEN Project Team PT004, and to be used with the Cyrillic primary set of graphic characters, is based on the requirements for Russia, Ukrainia, Byelorussia, Bulgaria, Serbia and Macedonia, due to the express wish of the European Commission. 
Should another alternative be considered, covering non-slavic languages from the USSR, then the problem arises which letters should be selected. It is obvious that there is room for only a few. When only the letters required for Serbia and Macedonia are removed, not more than 9 + 9 letters (capital and small) can be accommodated. If columns 02 and 03 could be used for letters, up to 16 + 16 additional positions become available. The selection should thus be made carefully, based on sound economic considerations. Criteria could be:

1. Number of people using the language to be covered (see Table A)
2. Number of additional letters required
3. Expected computer use with the language instead of Russian
4. Available communication infrastructure and computer literacy

Whilst criteria 1 and 2 can easily be assessed from the available sources, criterion 3 presents a problem. First, nationalistic sentiments may direct future developments (see also chapter 5). Second, data about literacy in non-russian languages are hard to obtain. Table E of the Annex, derived from the 1989 USSR Census, shows the use of national languages in relation to Russian. The preliminary conclusions are:

a. Where many people are bilingual, Russian may continue to be used for computing (just like English in Europe).
b. Turkic languages may constitute a major and important market, in particular Uzbek and Kazakh (the situation in Azerbaidzhan is very uncertain). They require several extra letters. Conversion to Latin script may be proposed and discussed in these countries, but may eventually prove cost-prohibitive.
c. A compromise that covers all the letters of an area such as Central Asia is not easy to find, because Kazakh alone already requires 9 letters.

In Table G of the Annex some exercises are presented, combining letters for several languages into one supplementary set.
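The capacity argument of 4.4 can be made concrete with a small sketch. It is only an illustration under stated assumptions: the letter sets below are taken from general knowledge of the three alphabets, not from the tables of this report, and the names EXTRA_LETTERS, required_letters and fits are hypothetical. Whether a letter such as 0456 or 045E is already present in the Slavic set is not modelled.

```python
# Capacity per case (9 capital + 9 small, or 16 + 16 with columns 02 and 03),
# as stated in section 4.4.
CAPACITY_BASIC = 9
CAPACITY_EXTENDED = 16

# ASSUMPTION: additional (non-Russian) small letters per language, written as
# ISO/IEC 10646 code points. These sets are illustrative, not copied from the Annex.
EXTRA_LETTERS = {
    # schwa, ghe with stroke, ka with descender, en with descender, barred o,
    # straight u with stroke, straight u, shha, Byelorussian-Ukrainian i
    "Kazakh": "\u04d9\u0493\u049b\u04a3\u04e9\u04b1\u04af\u04bb\u0456",
    # short u, ka with descender, ghe with stroke, ha with descender
    "Uzbek": "\u045e\u049b\u0493\u04b3",
    # en with descender, barred o, straight u
    "Kirghiz": "\u04a3\u04e9\u04af",
}


def required_letters(languages):
    """Return the sorted union of extra letters needed by the given languages."""
    needed = set()
    for language in languages:
        needed |= set(EXTRA_LETTERS[language])
    return sorted(needed)


def fits(languages, capacity):
    """True if one supplementary set of `capacity` positions covers all languages."""
    return len(required_letters(languages)) <= capacity


if __name__ == "__main__":
    for combo in (["Kazakh"], ["Kazakh", "Uzbek"], ["Kazakh", "Uzbek", "Kirghiz"]):
        count = len(required_letters(combo))
        print(combo, count, fits(combo, CAPACITY_BASIC), fits(combo, CAPACITY_EXTENDED))
```

Under these assumptions Kazakh alone fills the 9 available positions, so any combination with a second language depends on the extended 16-position layout, which is the difficulty named in conclusion c.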
5 Future changes in spelling and writing system Now that several former Union Republics of the USSR have become independent states, spelling reforms are being discussed. It is very difficult to obtain factual information about the results. In Tadzhikistan some people advocate return to Arabic script. It seems that Moldavia has converted to Latin script. Because the spelling of Rumanian could be adopted this would not cause particular problems. Several republics having a language from the Turkic group are considering the same thing, but only Azerbaidzhan has made some progress. It should be noticed that there are too few letters with Latin script for these languages, thus some additional ones from Cyrillic may be kept. Even for the Slavic languages the situation is not certain. Ukrainian has reintroduced their own letter g. The Soviet regime removed i, yat', fita and izhitsa from Russian in 1917. Now that so many old habits are being reinstated, these letters may gather their own band of supporters. The only conclusion to be drawn is that no character repertoire can be considered stable in the successor states of the USSR. 6 Sources The following basic sources were used: Gilyarevskij, R. S. and Grivnin, V. S., Language Identification Guide, Moscow, 1970 (Russian editions: 2, 1961, 3, 1964) Musaev, K. M., Alfavity yazikov narodov SSSR, Moskva, 1965. Comrie, Bernard, The languages of the Soviet Union, Cambridge, 1981. ISO/IEC 10646-1:1993, Universal Multiple-Octet Coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane. USSR Member Body (GOST) comments on DIS 10646 (in ISO/IEC JTC1/SC2/WG2 N 708). IBM Corporate Specification, Graphic Character Identification System, C-H 3-3220-055, 1988-01. Vestnik Statistiki 1990:10. p. 69-71 (1989 USSR Census results with respect to languages), Moskva, "Finansy i Statistika". 7 Acknowledgements Several persons provided significant help in collecting material and contributing invaluable advice. 
In the first place I thank Dr. Peter Hendriks, of Slavic Studies at RUL (Leiden University), for his long-standing support, only recently discontinued under pressure of petty Faculty bureaucrats. The many rare books in the private library of Dr. A. Nauta, as well as his expertise in Turkic-Altaic languages, were a great help, without which producing this report would not have been possible. The staff of the Documentation Centre for Eastern European Law of RUL, and in particular Dr. G. P. van den Berg, assisted me very much in tracing information in Soviet publications. Finally, the constructive criticism of the CEN PT004 members, Claude Mahy, Borka Jerman-Blazic, J. Friemelt and I. Sebestyen, led to considerable improvements in the readability of this report.

8 Other material

To this report are attached a Bibliography and an Annex.

BIBLIOGRAPHY/SOURCES:

In English:

Edward Allworth. Nationalities of the Soviet East: Publications and Writing Systems, Columbia University Press, 1971, ISBN 231-63274-9.
John Clews. Language Automation Worldwide: The Development of Character Set Standards. British Library R&D Reports 5962, Harrogate: Sesame Computer Projects, 1988.
Bernard Comrie. The Languages of the Soviet Union. Cambridge Language Series, Cambridge University Press, 1981.
Florian Coulmas. The Writing Systems of the World. Oxford, B. Blackwell, 1989.
Carl Faulmann. Das Buch der Schrift, Wien, 1880.
R.S. Gilyarevsky and V.S. Grivnin. Language Identification Guide, Nauka, Central Department of Oriental Literature, Moscow, 1970.
Kenneth Katzner. The Languages of the World. London, Routledge and Kegan Paul, 1986.
Akira Nakanishi. Writing Systems of the World. Charles E. Tuttle, 1980.
George Frederick von Ostermann. Manual of Foreign Languages for the Use of Librarians, Bibliographers, Research Workers, Editors, Translators, and Printers. 4th ed. New York, Contrag Book Co., 1952.
National Language Support, Reference Manual, Vol.
2, SE09-8002, IBM National Language Technical Center, North York, Ont. , Canada. ALA-LC Romanization Tables: Transliteration Schemes for Non-Roman Scripts. Washington D.C., Library of Congress Cataloging Distribution Services, 1991. In Russian: Vladislav Mitrofanovich Andryushchenko. Kontseptsiya i Arkhitektura Mashinnogo Fonda Russkogo Yazyka, Moscow, Nauka, 1989. Prilozhenie 1: Rasshirennyi Kirillicheskii Alfavit-2. Magomet Izmailovich Isaev. Sto Tridtstat' Ravnopravnykh: O Yazykakh Narodov SSSR. Moscow, Nauka, 1970. Izuchenie Russkogo Yazyka i Istochnikovedenie. Moscow, Nauka, 1969. Katalog shriftov izdatel'stva akademii nauk SSSR, (Catalog of typefaces), Moscow 1962. Kanesbai M Musaev. Alfavity yazykov narodov SSSR, Moscow, Nauka, 1965. Kanesbai M Musaev. Orfografii tyurkskikh literaturnikh, Moscow, Nauka, 1973. Kanesbai M Musaev. O put sovershenstvovaniya alfavitov, Moscow, Nauka, 1982. Kanesbai M Musaev. Leksikologiya tyurkskikh yazikov, Moscow, Nauka, 1984. Kanesbai M Musaev. Razvitie terminologii na yazikakh SSSR, Moscow, Nauka, 1987. M. V. Shul'mejester. Knizhno-Zhurnal'naya verstka, (Typesetting books and journals). Moscow, Kniga, 1989. B. A. Starostin. Transkriptsiya sobstvennykh imen, Kniga, Moscow 1965 Yazyki narodov SSSR. Moscow, Nauka, 1966--1968. Separate Languages: Khukut Solomonovich Bgazhba, Iz istorii Pis'mennosti v Abkhazii, Tbilisi, 1967. Nickolas Poppe. Introduction to Altaic Linguistics. Wiesbaden, 1985. C. H. Andrusyshen. Ukrainian-English Dictionary, Saskatoon, 1955. Assya Humesky. Modern Ukrainian. Canadian Institute of Ukrainian Studies, Edmonton, Toronto, 1988. Pravila Izdaniya Slavyano-Moldavskikh i Moldavskikh Gramot XV-XVII vv., Shtiintsa, Kishinev, 1975. Uzbek Sovet. `Entsiklopediyasi. Toshkent, 1977ff. 
ISO standards and related documents on Cyrillic characters: International Organization for Standardization / International Electrotechnical Commission, Joint Technical Committee 1, ISO/IEC 8859-5:1988, Information Processing - 8-bit Single-Byte Coded Graphic Character Sets - Part 5: Latin/Cyrillic Alphabet. ISO/IEC 10646-1:1993, Universal Multiple-Octet Coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane. ISO 5427:1984, Extension of the Cyrillic alphabet coded character set for bibliographic information interchange. ISO 6861:1991, Cyrillic alphabet coded character sets for Slavonic languages for bibliographic information interchange. ISO 10754:1996, Extension of the Cyrillic alphabet coded character set for non-Slavic languages for bibliographic information interchange. ISO 9:1995,Transliteration of Cyrillic characters into Latin characters - Slavic and non-Slavic languages. ISO/IEC JTC1/SC2 N 2112, Summary of voting on DP 10646.2, attachment 12, (USSR comments accompanying the negative vote). ISO/IEC JTC1/SC2/WG2 N708, Summary of voting on DIS 10646, (USSR comments accompanying the positive vote). Lexicons (Turkic): Azersko-Russkij Slovar', M T Tagiev, Baku 1986. Bashkirsko-Russkij Slovar', ---- Ufa 1969. Chuvashsko-Russkij Slovar', M I Skvortsov, Moskva 1982. Gagauzsko-Russkij-Moldovskij Slovar', N A Baskakov, Moskva 1973. Karachaevo-Balkarsko-Russkij Slovar', E R Tenishev, Moskva 1989. Karakalpaksko-Russkij Slovar', N A Baskakov, Moskva 1958. Kazakhsko-Russkij Slovar', G Musabaev, Alma-Ata 1954. Khakassko-Russkij Slovar', N A Baskakov, Moskva 1953. Kirgizsko-Russkij Slovar', K K Yudakhin, Moskva 1965. Kumyksko-Russkij Slovar', Z Z Bammatov, Moskva 1969. Nogaisko-Russkij Slovar', N A Baskakov, Moskva 1963. Tatarsko-Russkij Slovar', ---- Moskva 1966. Turkmensko-Russkij Slovar', N A Baskakov, Moskva 1968. Tuvinsko-Russkij Slovar', E R Tenishev, Moskva 1968. Uzbeksko-Russkij Slovar', A K Borovkov, Moskva 1959. 
Uigursko-Russkij Slovar', E N Nadzhil, Moskva 1968. Yakutsko-Russkij Slovar', P A Spentsov, Moskva 1972. Russko-Altaiskij Slovar', N A Baskakov, Moskva 1964. Russko-Azer'bajdzhanskij Slovar', A A Orudzhev, I II III Baku 1971, 75, 78 Russko-Bashkirskij Slovar', K Z Akhmerov, Moskva 1964. Russko-Chuvashskij Slovar', N A Andreev, I P Petrov Moskva 1971. Russko-Karachaevo-Balkarskij Slovar', Kh I Suyunchev, Moskva 1965. Russko-Karakalpakskij Slovar', N A Baskakov, Moskva 1967. Russko-Kazakhskij Slovar', N T Sauranbajev, Moskva 1954. Russko-Khakasskij Slovar', N A Baskakov, Moskva 1961. Russko-Kirgizskij Slovar', K K Yudakhin, Moskva 1957. Russko-Kumykskij Slovar', Z Z Bammatov, Moskva 1960. Russko-Nogaiskij Slovar', N A Baskakov, Moskva 1956. Russko-Tatarskij Slovar', ---- Kazan 1971. Russko-Turkmenskij Slovar', N A Baskakov, Moskva 1956. Russko-Tuvinskij Slovar', D A Mongush, Moskva 1980. Russko-Uzbekskij Slovar', P Abdurakhmanov, Moskva 1954. Russko-Uigurskij Slovar', T P Rakhimov, Moskva 1956. Russko-Yakutskij Slovar', M B Lazov, Moskva 1968. Lexicons (Finno-Ugric): Komi Dictionary and Grammar, Nikolai Rogov, Marijsko-Russkij Slovar', ---- Moskva 1956. Russko-Marijskij Slovar', I S Galkin, Moskva 1966. Lexicons (Kaukasian): Kabardino-Russkij Slovar', B M Kardanov, Moskva 1957. Lexicons (Languages of the North): Evensko-Russkij Slovar', G M Vasilevich, Moskva 1958. Koryaksko-Russkij Slovar', I S Vdovin, Leningrad 1960. Nanay-Russkij Slovar', V A Avrorin, Moskva 1980. Russko-Evenskij Slovar', V I Tsintsius, Moskva 1952. Russko-Nanayskij Slovar', V A Avrorin, Leningrad 1959. Russko-Nivkhskij Slovar', V N Sabel'ev, Ch M Taksami, Moskva 1965. Lexicons (Other): Buryatsko-Russkij Slovar', K M Cheremisov, Moskva 1973. Russko-Buryatskij Slovar', Ts B Tsydendambaev, Moskva 1954. Kalmyksko-Russkij Slovar', B D Muniev, Moskva 1977. Russko-Kalmykskij Slovar', B B Basangov, Moskva 1963. Moldavskij Yuridicheskij Slovar', Kishinev 1970. 
Tadzhiksko-Russkij Slovar', M B Rakhimi, L V Uspenskij, Moskva 1954.

Lexicons (non-cyrillic written languages):

Estonsko-Russkij Slovar', J Tamm, Tallinn 1954.
Latyshsko-Russkij Slovar', Riga 1979.
Gruzino-Russkij Slovar', A G Shalidze Tbilisi 1984.
Armyansko-Russkij Slovar', E G Galstyan Erevan 1984.

ANNEX

The following tables, results of the research, are presented:

A List of all languages written with modern cyrillic characters, with numbers assigned to each for reference.
B List of short identifiers, needed for identification or referencing, for all modern cyrillic characters, with their use in languages.
C Code table and name list of all modern cyrillic characters, in the order of their hexadecimal code, as specified in ISO/IEC 10646-1: 1993, with their use in languages indicated.
D List of all modern cyrillic characters, with their use in languages, presented in the order of Table B (small letters only).
E List of the modern cyrillic characters used in each language separate (small letters only).
F Tables showing use of national language.
G Some possible code tables for non-slavic language use.

TABLE A                                                    VERSION 2.1
LANGUAGES IN THE WORLD USING MODERN CYRILLIC SCRIPT         1993-02-21
(coded with numbers for reference)
J. W. van Wingen

0 Languages outside the USSR

  070 Serbian
  080 Macedonian
  090 Bulgarian
  050 Mongolian

USSR (former), related languages grouped together.
Source: USSR Census, 1970, 1989.
Speakers: numbers of speakers in 1000s.
Not own (%): % of nationals not speaking their own language.

                                        Speakers          Not own (%)
                                        1970     1989     1970    1989
   1 Slavic                           170300   187773
 101 Russian                          128811   144886       -       -
 102 Byelorussian                       7291     7120     19.45   29.09
 103 Ukrainian                         34906    35820     14.35   18.93
   2 Romanic                            2560     3070
 201 Moldavian                          2560     3070      5.00    8.41
   3 Iranian                            2611     4763
 301 Kurd                                 78      123     12.43   19.10
 302 Tadzhik                            2100     4120      1.49    2.28
 303 Osetic                              433      520     11.36   13.01
   4 Kaukasian (Northwestern group)      511      674
 401 Abkhaz                               80       98      4.09    6.51
 402 Abazin                               24       31      4.08    6.59
 403 Adygei                               96      118      3.53    5.27
 404 Kabarda                             274      380      1.95    2.80
  "  Cherkess                             37       47      7.96    9.62
   5 Kaukasian (Northeastern group)     1831     2740
 501 Chechen                             605      939      1.31    1.89
 502 Ingush                              153      230      2.62    3.05
 503 Avar                                385      584      2.84    2.84
 504 Dargva                              227      356      1.57    2.51
 505 Lezghin                             304      427      6.10    8.45
 506 Tabasaran                            55       94      1.11    4.08
 507 Lak                                  82      110      4.44    6.43
   6 Finno-Ugric                        2519     2202
 610 Mansi                                 4        3
 621 Khanti (Vakh)          /             15       14
 622 Khanti (Kazym)         |              "        "
 623 Khanti (Shuryshkar)    |              "        "
 624 Khanti (Surgut)        \              "        "
 631 Komi                                266      243     17.26   29.61
 632 Komi-Permian                        132      106     14.19   29.94
 640 Udmurt                              580      520     17.39   30.35
 651 Mari (plains)          /            550      542      8.82   19.00
 652 Mari (mountains)       \              "        "
 661 Mordvin-Erzya          /            982      774     22.15   32.94
 662 Mordvin-Moksha         \              "        "
   7 Samoyed                              26       29
 701 Nenets                               24       27
 702 Selkup                                2        2
   8 Turkic                            30341    46386
 801 Chuvash                            1470     1408     13.11   23.56
 802 Turkmen                            1510     2689      1.10    1.48
 803 Azeri                              4300     6614      1.79    2.31
 804 Tatar                              5290     5532     10.81   16.80
 805 Bashkir                             620     1048     33.82   27.70
 806 Kumyk                               186      275      1.58    2.60
 807 Kara-kalpak                         228      399      3.39    5.89
 808 Nogai                                46       68     10.22   10.10
 809 Kazakh                             5190     7890      1.96    3.01
 810 Kirghiz                            1430     2474      1.22    2.17
 811 Altaic                               49       60     12.81   15.66
 812 Karachai                            110      151      1.88    3.17
 813 Balkar                               58       80      2.83    6.37
 814 Uzbek                              9070    16417      1.35    1.68
 815 Uigur                               153      227     11.52   13.43
 816 Yakut                               296      358      5.30    6.17
 817 Tuvinian                            138      204      1.28    1.50
 818 Khakass                              56       61     16.32   23.92
 819 Gagauzi                             147      173      6.41   12.55
 820 Crimean Tatar (not in G&G)                   252       -      7.43
 821 Dolgan (in Yakut)                     5        6
   9 Mongolian                           416      520
 901 Buryat                              290      364      7.39   13.70
 902 Kalmyk                              126      156      8.32   10.30
  10 Tunguso-Mandchuric                   27       22
1001 Evenki                               13        9
1002 Even                                  7        8
1003 Nanay                                 7        5
  11 Sino-Tibetian                        36       66
1101 Dungan                               36       66      5.69    5.23
  12 Palaeo-Asiatic                       20       18
1201 Chukcha                              11       11
1202 Koryak                                6        5
1203 Nivkh                                 2        1
1204 Eskimo                                1        1

TABLE B                                                    VERSION 3.0
SYSTEM OF SHORT IDENTIFIERS FOR THE                         1999-03-09
COMPLETE REPERTOIRE OF GRAPHIC CHARACTERS
REQUIRED FOR MODERN CYRILLIC SCRIPT
J. W. van Wingen

The short identifier (SID) precedes the name which identifies the letter according to ISO conventions. The source for the repertoire is ISO/IEC 10646-1:1993, referenced by "hex" value. Those letters not justified by the GOST list were not removed, but marked with *. Small letters are specified first, the corresponding capital letters are at the end of the list.

Short identifiers (SID) are formed according to the rules of the system specified in ISO/IEC 6937:1994, Annex B for Latin script. They consist of two letters and two digits. These presented here are derived from the IBM system, that extends it to other alphabetic scripts. For Cyrillic the first letter is K. The conventions for the rest are based on the appearance a letter would have in Latin transliteration, even where no visible diacritic occurs. It is obvious that inconsistencies were unavoidable.
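The structure of a short identifier can be illustrated with a minimal sketch. It assumes only what the text and the listings state: two letters followed by two digits, with K as the first letter for Cyrillic. The rule that odd numbers denote small letters and the next even number the corresponding capital is read off the small-letter and capital-letter legends of Table B (for example KA01 and KA02) and should be treated as an assumption. The names ShortId, parse_sid and capital_counterpart are hypothetical.

```python
import re
from typing import NamedTuple


class ShortId(NamedTuple):
    script: str       # first letter, "K" for Cyrillic in this report
    base: str         # second letter, the Latin-transliteration base
    number: int       # two-digit code for diacritic or special form
    is_capital: bool  # ASSUMPTION: even numbers are capitals, odd are small


_SID_PATTERN = re.compile(r"^([A-Z])([A-Z])([0-9]{2})$")


def parse_sid(sid: str) -> ShortId:
    """Split a short identifier such as 'KZ45' into its parts."""
    match = _SID_PATTERN.match(sid)
    if match is None:
        raise ValueError(f"not a short identifier: {sid!r}")
    script, base, digits = match.groups()
    number = int(digits)
    return ShortId(script, base, number, number % 2 == 0)


def capital_counterpart(sid: str) -> str:
    """Return the SID of the capital letter belonging to a small-letter SID."""
    parsed = parse_sid(sid)
    if parsed.is_capital:
        return sid
    if parsed.number >= 99:
        # e.g. SA99 (PALOCHKA), listed without a separate capital form
        raise ValueError(f"{sid!r} has no capital counterpart under this rule")
    return f"{parsed.script}{parsed.base}{parsed.number + 1:02d}"


if __name__ == "__main__":
    print(parse_sid("KZ45"))            # small ZE WITH DESCENDER in Table B
    print(capital_counterpart("KA01"))  # KA02, CYRILLIC CAPITAL LETTER A
```

The parser deliberately does not require the first letter to be K, because Table B itself contains one identifier (SA99) that starts with a different letter.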
SMALL LETTERS

11 ACUTE        21 CARON         31 MACRON   41 CEDILLA            51 LIGATURE
13 GRAVE        23 BREVE         33          43 OGONEK
15 CIRCUMFLEX   25 DOUBLE ACUTE  35          45 DESCENDER
17 DIAERESIS    27 RING ABOVE    37          47 CARON & DESCENDER
19 TILDE        29 DOT ABOVE     39          49 HOOK

61 STROKE       71 & DIAERESIS   81 & ACUTE  91
63 SPECIAL      73               83 & ACUTE  93 CIRCUMFLEX & MACRON
65 SPECIAL      75               85 & ACUTE
67 SPECIAL      77 & DIAERESIS   87 & ACUTE
69 SPECIAL      79               89 & ACUTE

hex  SID  Name
-----------------------------------------------------------------------
0430 KA01 CYRILLIC SMALL LETTER A
0431 KB01 CYRILLIC SMALL LETTER BE
0432 KV01 CYRILLIC SMALL LETTER VE
0433 KG01 CYRILLIC SMALL LETTER GHE
0434 KD01 CYRILLIC SMALL LETTER DE
0435 KE01 CYRILLIC SMALL LETTER IE
0437 KZ01 CYRILLIC SMALL LETTER ZE
0438 KI01 CYRILLIC SMALL LETTER I
043A KK01 CYRILLIC SMALL LETTER KA
043B KL01 CYRILLIC SMALL LETTER EL
043C KM01 CYRILLIC SMALL LETTER EM
043D KN01 CYRILLIC SMALL LETTER EN
043E KO01 CYRILLIC SMALL LETTER O
043F KP01 CYRILLIC SMALL LETTER PE
0440 KR01 CYRILLIC SMALL LETTER ER
0441 KS01 CYRILLIC SMALL LETTER ES
0442 KT01 CYRILLIC SMALL LETTER TE
0443 KU01 CYRILLIC SMALL LETTER U
0444 KF01 CYRILLIC SMALL LETTER EF
0445 KH01 CYRILLIC SMALL LETTER HA
0446 KC01 CYRILLIC SMALL LETTER TSE
044B KY01 CYRILLIC SMALL LETTER YERU
0458 KJ01 CYRILLIC SMALL LETTER JE
---- KQ01 CYRILLIC SMALL LETTER KU
---- KW01 CYRILLIC SMALL LETTER WE
-----------------------------------------------------------------------
0439 KJ11 CYRILLIC SMALL LETTER SHORT I
044C KX11 CYRILLIC SMALL LETTER SOFT SIGN
0453 KG11 CYRILLIC SMALL LETTER GJE
0456 KI11 CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
045A KN11 CYRILLIC SMALL LETTER NJE
045B KC11 CYRILLIC SMALL LETTER TSHE
045C KK11 CYRILLIC SMALL LETTER KJE
-----------------------------------------------------------------------
044D KE13 CYRILLIC SMALL LETTER E
-----------------------------------------------------------------------
044E KU15 CYRILLIC SMALL LETTER YU
044F KA15 CYRILLIC SMALL LETTER YA
0454 KE15 CYRILLIC SMALL LETTER UKRAINIAN IE
0455 KZ15 CYRILLIC SMALL LETTER DZE
0449 KS15 CYRILLIC SMALL LETTER SHCHA
-----------------------------------------------------------------------
0451 KE17 CYRILLIC SMALL LETTER IO
0457 KI17 CYRILLIC SMALL LETTER YI
04D3 KA17 CYRILLIC SMALL LETTER A WITH DIAERESIS
04DF KZ17 CYRILLIC SMALL LETTER ZE WITH DIAERESIS
04E7 KO17 CYRILLIC SMALL LETTER O WITH DIAERESIS
04F1 KU17 CYRILLIC SMALL LETTER U WITH DIAERESIS
04F9 KY17 CYRILLIC SMALL LETTER YERU WITH DIAERESIS
-----------------------------------------------------------------------
0436 KZ21 CYRILLIC SMALL LETTER ZHE
0447 KC21 CYRILLIC SMALL LETTER CHE
0448 KS21 CYRILLIC SMALL LETTER SHA
044A KU21 CYRILLIC SMALL LETTER HARD SIGN
045F KG21 CYRILLIC SMALL LETTER DZHE
-----------------------------------------------------------------------
04D1 KA23 CYRILLIC SMALL LETTER A WITH BREVE
04D7 KE23 CYRILLIC SMALL LETTER IE WITH BREVE
045E KU23 CYRILLIC SMALL LETTER SHORT U
04C2 KZ23 CYRILLIC SMALL LETTER ZHE WITH BREVE
-----------------------------------------------------------------------
04F3 KU25 CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
-----------------------------------------------------------------------
04E3 KI31 CYRILLIC SMALL LETTER I WITH MACRON
04EF KU31 CYRILLIC SMALL LETTER U WITH MACRON
-----------------------------------------------------------------------
0459 KL41 CYRILLIC SMALL LETTER LJE
-----------------------------------------------------------------------
04CC KC43 *CYRILLIC SMALL LETTER KHAKASSIAN CHE
-----------------------------------------------------------------------
0499 KZ45 CYRILLIC SMALL LETTER ZE WITH DESCENDER
049B KK45 CYRILLIC SMALL LETTER KA WITH DESCENDER
04A3 KN45 CYRILLIC SMALL LETTER EN WITH DESCENDER
04AB KS45 CYRILLIC SMALL LETTER ES WITH DESCENDER
04AD KT45 CYRILLIC SMALL LETTER TE WITH DESCENDER
04B3 KH45 CYRILLIC SMALL LETTER HA WITH DESCENDER
-----------------------------------------------------------------------
0497 KZ47 CYRILLIC SMALL LETTER ZHE WITH DESCENDER
04B7 KC47 CYRILLIC SMALL LETTER CHE WITH DESCENDER
-----------------------------------------------------------------------
04C4 KK49 *CYRILLIC SMALL LETTER KA WITH HOOK
04C8 KN49 *CYRILLIC SMALL LETTER EN WITH HOOK
-----------------------------------------------------------------------
04D5 KA51 CYRILLIC SMALL LIGATURE A IE
04A5 KN51 CYRILLIC SMALL LIGATURE EN GHE
04B5 KT51 CYRILLIC SMALL LIGATURE TE TSE
-----------------------------------------------------------------------
0452 KD61 CYRILLIC SMALL LETTER DJE
0493 KG61 CYRILLIC SMALL LETTER GHE WITH STROKE
049F KK61 CYRILLIC SMALL LETTER KA WITH STROKE
-----------------------------------------------------------------------
04BB KH63 CYRILLIC SMALL LETTER SHHA
04E1 KZ63 CYRILLIC SMALL LETTER ABKHASIAN DZE
04BD KC63 CYRILLIC SMALL LETTER ABKHASIAN CHE
04A9 KW63 CYRILLIC SMALL LETTER ABKHASIAN KHA
-----------------------------------------------------------------------
0495 KG65 CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
04A7 KP65 CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
049D KK65 CYRILLIC SMALL LETTER KA WITH VERTICAL STROKE
04B9 KC65 CYRILLIC SMALL LETTER CHE WITH VERTICAL STROKE
-----------------------------------------------------------------------
0491 KG67 CYRILLIC SMALL LETTER GHE WITH UPTURN
04A1 KK67 CYRILLIC SMALL LETTER BASHKIR KA
04D9 KE67 CYRILLIC SMALL LETTER SCHWA
04E9 KO67 CYRILLIC SMALL LETTER BARRED O
04AF KU67 CYRILLIC SMALL LETTER STRAIGHT U
-----------------------------------------------------------------------
04B1 KU69 CYRILLIC SMALL LETTER STRAIGHT U WITH STROKE
04BF KC69 CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
-----------------------------------------------------------------------
04E5 KI71 CYRILLIC SMALL LETTER I WITH DIAERESIS
04DD KZ71 CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
04F5 KC71 CYRILLIC SMALL LETTER CHE WITH DIAERESIS
-----------------------------------------------------------------------
04DB KE77 CYRILLIC SMALL LETTER SCHWA WITH DIAERESIS
04EB KO77 CYRILLIC SMALL LETTER BARRED O WITH DIAERESIS
-----------------------------------------------------------------------
     KA81 *CYRILLIC SMALL LETTER A WITH ACUTE
     KE81 *CYRILLIC SMALL LETTER IE WITH ACUTE
     KI81 *CYRILLIC SMALL LETTER I WITH ACUTE
     KO81 *CYRILLIC SMALL LETTER O WITH ACUTE
     KU81 *CYRILLIC SMALL LETTER U WITH ACUTE
     KY81 *CYRILLIC SMALL LETTER YERY WITH ACUTE
-----------------------------------------------------------------------
     KD83 *CYRILLIC SMALL LETTER DJE WITH ACUTE
     KE83 *CYRILLIC SMALL LETTER SCHWA WITH ACUTE
     KO83 *CYRILLIC SMALL LETTER BARRED O WITH ACUTE
     KU83 *CYRILLIC SMALL LETTER STRAIGHT U WITH ACUTE
-----------------------------------------------------------------------
     KU85 *CYRILLIC SMALL LETTER YU WITH ACUTE
     KA85 *CYRILLIC SMALL LETTER YA WITH ACUTE
     KE85 *CYRILLIC SMALL LETTER UKRAINIAN IE WITH ACUTE
-----------------------------------------------------------------------
     KE87 *CYRILLIC SMALL LETTER IE WITH DIAERESIS AND ACUTE
     KO87 *CYRILLIC SMALL LETTER O WITH DIAERESIS AND ACUTE
-----------------------------------------------------------------------
     KU89 *CYRILLIC SMALL LETTER STRAIGHT U WITH STROKE AND ACUTE
-----------------------------------------------------------------------
     KU93 *CYRILLIC SMALL LETTER YU WITH MACRON
     KA93 *CYRILLIC SMALL LETTER YA WITH MACRON
-----------------------------------------------------------------------
04C0 SA99 CYRILLIC LETTER PALOCHKA
-----------------------------------------------------------------------

CAPITAL LETTERS

12 ACUTE        22 CARON         32 MACRON   42 CEDILLA            52 LIGATURE
14 GRAVE        24 BREVE         34          44 OGONEK
16 CIRCUMFLEX   26 DOUBLE ACUTE  36          46 DESCENDER
18 DIAERESIS    28 RING ABOVE    38          48 CARON & DESCENDER
20 TILDE        30 DOT ABOVE     40          50 HOOK

62 STROKE       72 & DIAERESIS   82 & ACUTE  92
64 SPECIAL      74               84 & ACUTE  94 CIRCUMFLEX & MACRON
66 SPECIAL      76               86 & ACUTE
68 SPECIAL      78 & DIAERESIS   88 & ACUTE
70 SPECIAL      80               90 & ACUTE

hex  SID  Name
-----------------------------------------------------------------------
0410 KA02 CYRILLIC CAPITAL LETTER A
0411
KB02 CYRILLIC CAPITAL LETTER BE 0412 KV02 CYRILLIC CAPITAL LETTER VE 0413 KG02 CYRILLIC CAPITAL LETTER GHE 0414 KD02 CYRILLIC CAPITAL LETTER DE 0415 KE02 CYRILLIC CAPITAL LETTER IE 0417 KZ02 CYRILLIC CAPITAL LETTER ZE 0418 KI02 CYRILLIC CAPITAL LETTER I 041A KK02 CYRILLIC CAPITAL LETTER KA 041B KL02 CYRILLIC CAPITAL LETTER EL 041C KM02 CYRILLIC CAPITAL LETTER EM 041D KN02 CYRILLIC CAPITAL LETTER EN 041E KO02 CYRILLIC CAPITAL LETTER O 041F KP02 CYRILLIC CAPITAL LETTER PE 0420 KR02 CYRILLIC CAPITAL LETTER ER 0421 KS02 CYRILLIC CAPITAL LETTER ES 0422 KT02 CYRILLIC CAPITAL LETTER TE 0423 KU02 CYRILLIC CAPITAL LETTER U 0424 KF02 CYRILLIC CAPITAL LETTER EF 0425 KH02 CYRILLIC CAPITAL LETTER HA 0426 KC02 CYRILLIC CAPITAL LETTER TSE 042B KY02 CYRILLIC CAPITAL LETTER YERU 0408 KJ02 CYRILLIC CAPITAL LETTER JE ---- KQ02 CYRILLIC CAPITAL LETTER KU ---- KW02 CYRILLIC CAPITAL LETTER WE ----------------------------------------------------------------------- 0419 KJ12 CYRILLIC CAPITAL LETTER SHORT I 042C KX12 CYRILLIC CAPITAL LETTER SOFT SIGN 0403 KG12 CYRILLIC CAPITAL LETTER GJE 0406 KI12 CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I 040A KN12 CYRILLIC CAPITAL LETTER NJE 040B KC12 CYRILLIC CAPITAL LETTER TSHE 040C KK12 CYRILLIC CAPITAL LETTER KJE ----------------------------------------------------------------------- 042D KE14 CYRILLIC CAPITAL LETTER E ----------------------------------------------------------------------- 042E KU16 CYRILLIC CAPITAL LETTER YU 042F KA16 CYRILLIC CAPITAL LETTER YA 0404 KE16 CYRILLIC CAPITAL LETTER UKRAINIAN IE 0405 KZ16 CYRILLIC CAPITAL LETTER DZE 0429 KS16 CYRILLIC CAPITAL LETTER SHCHA ----------------------------------------------------------------------- 0401 KE18 CYRILLIC CAPITAL LETTER IO 0407 KI18 CYRILLIC CAPITAL LETTER YI 04D2 KA18 CYRILLIC CAPITAL LETTER A WITH DIAERESIS 04DE KZ18 CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS 04E6 KO18 CYRILLIC CAPITAL LETTER O WITH DIAERESIS 04F0 KU18 CYRILLIC CAPITAL LETTER U WITH DIAERESIS 04F8 KY18 
CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS ----------------------------------------------------------------------- 0416 KZ22 CYRILLIC CAPITAL LETTER ZHE 0427 KC22 CYRILLIC CAPITAL LETTER CHE 0428 KS22 CYRILLIC CAPITAL LETTER SHA 042A KU22 CYRILLIC CAPITAL LETTER HARD SIGN 040F KG22 CYRILLIC CAPITAL LETTER DZHE ----------------------------------------------------------------------- 04D0 KA24 CYRILLIC CAPITAL LETTER A WITH BREVE 04D6 KE24 CYRILLIC CAPITAL LETTER IE WITH BREVE 040E KU24 CYRILLIC CAPITAL LETTER SHORT U 04C2 KZ24 CYRILLIC CAPITAL LETTER ZHE WITH BREVE ----------------------------------------------------------------------- 04F2 KU26 CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE ----------------------------------------------------------------------- 04E2 KI32 CYRILLIC CAPITAL LETTER I WITH MACRON 04EE KU32 CYRILLIC CAPITAL LETTER U WITH MACRON ----------------------------------------------------------------------- 0409 KL42 CYRILLIC CAPITAL LETTER LJE ----------------------------------------------------------------------- 04CB KC44 *CYRILLIC CAPITAL LETTER KHAKASSIAN CHE ----------------------------------------------------------------------- 0498 KZ46 CYRILLIC CAPITAL LETTER ZE WITH DESCENDER 049A KK46 CYRILLIC CAPITAL LETTER KA WITH DESCENDER 04A2 KN46 CYRILLIC CAPITAL LETTER EN WITH DESCENDER 04AA KS46 CYRILLIC CAPITAL LETTER ES WITH DESCENDER 04AC KT46 CYRILLIC CAPITAL LETTER TE WITH DESCENDER 04B2 KH46 CYRILLIC CAPITAL LETTER HA WITH DESCENDER ----------------------------------------------------------------------- 0496 KZ48 CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER 04B6 KC48 CYRILLIC CAPITAL LETTER CHE WITH DESCENDER ----------------------------------------------------------------------- 04C3 KK50 *CYRILLIC CAPITAL LETTER KA WITH HOOK 04C7 KN50 *CYRILLIC CAPITAL LETTER EN WITH HOOK ----------------------------------------------------------------------- 04D4 KA52 CYRILLIC CAPITAL LIGATURE A IE 04A4 KN52 CYRILLIC CAPITAL LIGATURE EN GHE 04B4 KT52 
CYRILLIC CAPITAL LIGATURE TE TSE ----------------------------------------------------------------------- 0402 KD62 CYRILLIC CAPITAL LETTER DJE 0492 KG62 CYRILLIC CAPITAL LETTER GHE WITH STROKE 049E KK62 CYRILLIC CAPITAL LETTER KA WITH STROKE ----------------------------------------------------------------------- 04BA KH64 CYRILLIC CAPITAL LETTER SHHA 04E0 KZ64 CYRILLIC CAPITAL LETTER ABKHASIAN DZE 04BC KC64 CYRILLIC CAPITAL LETTER ABKHASIAN CHE 04A8 KW64 CYRILLIC CAPITAL LETTER ABKHASIAN KHA ----------------------------------------------------------------------- 0494 KG66 CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK 04A6 KP66 CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK 049C KK66 CYRILLIC CAPITAL LETTER KA WITH VERTICAL STROKE 04B8 KC66 CYRILLIC CAPITAL LETTER CHE WITH VERTICAL STROKE ----------------------------------------------------------------------- 0490 KG68 CYRILLIC CAPITAL LETTER GHE WITH UPTURN 04A0 KK68 CYRILLIC CAPITAL LETTER BASHKIR KA 04D8 KE68 CYRILLIC CAPITAL LETTER SCHWA 04E8 KO68 CYRILLIC CAPITAL LETTER BARRED O 04AE KU68 CYRILLIC CAPITAL LETTER STRAIGHT U ----------------------------------------------------------------------- 04B0 KU70 CYRILLIC CAPITAL LETTER STRAIGHT U WITH STROKE 04BE KC70 CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER ----------------------------------------------------------------------- 04E4 KI72 CYRILLIC CAPITAL LETTER I WITH DIAERESIS 04DC KZ72 CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS 04F4 KC72 CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS ----------------------------------------------------------------------- 04DA KE78 CYRILLIC CAPITAL LETTER SCHWA WITH DIAERESIS 04EA KO78 CYRILLIC CAPITAL LETTER BARRED O WITH DIAERESIS ----------------------------------------------------------------------- KA82 *CYRILLIC CAPITAL LETTER A WITH ACUTE KE82 *CYRILLIC CAPITAL LETTER IE WITH ACUTE KI82 *CYRILLIC CAPITAL LETTER I WITH ACUTE KO82 *CYRILLIC CAPITAL LETTER O WITH ACUTE KU82 *CYRILLIC CAPITAL LETTER U WITH ACUTE KY82 *CYRILLIC 
CAPITAL LETTER YERY WITH ACUTE ----------------------------------------------------------------------- KD84 *CYRILLIC CAPITAL LETTER DJE WITH ACUTE KE84 *CYRILLIC CAPITAL LETTER SCHWA WITH ACUTE KO84 *CYRILLIC CAPITAL LETTER BARRED O WITH ACUTE KU84 *CYRILLIC CAPITAL LETTER STRAIGHT U WITH ACUTE ----------------------------------------------------------------------- KU86 *CYRILLIC CAPITAL LETTER YU WITH ACUTE KA86 *CYRILLIC CAPITAL LETTER YA WITH ACUTE KE86 *CYRILLIC CAPITAL LETTER UKRAINIAN IE WITH ACUTE ----------------------------------------------------------------------- KE88 *CYRILLIC CAPITAL LETTER IE WITH DIAERESIS AND ACUTE KO88 *CYRILLIC CAPITAL LETTER O WITH DIAERESIS AND ACUTE ----------------------------------------------------------------------- KU90 *CYRILLIC CAPITAL LETTER STRAIGHT U WITH STROKE AND ACUTE ----------------------------------------------------------------------- KU94 *CYRILLIC CAPITAL LETTER YU WITH MACRON KA94 *CYRILLIC CAPITAL LETTER YA WITH MACRON ----------------------------------------------------------------------- TABLE C VERSION 3.0 COMPLETE REPERTOIRE OF GRAPHIC CHARACTERS 1999-03-09 REQUIRED FOR MODERN CYRILLIC SCRIPT J. W. van Wingen WITH INDICATION OF LANGUAGE USE The list is taken from ISO/IEC 10646-1:1993, and presented in the order given there, indicated by the hexadecimal value in the first column. Added at the end are a number of characters occurring in the first DIS, but since deleted. For easy reference a short identifier (SID) is specified in the second column. It is derived from the IBM system, and described more in detail in Table B. The name which identifies the letter according to ISO conventions is given in the third column, with some shortening from lack of space, (CCL is short for CYRILLIC CAPITAL LETTER, CSL is short for CYRILLIC SMALL LETTER). The languages in which a character is used are indicated by codes in the form of a number in the fourth column. 
These language codes are explained in a separate list, Table A. The information is based on the books by Gilyarevskiy and Grivnin, and by K. M. Musaev.

NOTE: According to the GOST list, the letters marked here with ? do not exist in any language. Neither do KK49/50, KN49/50 and KC43/44, which are identical (other than perhaps in local typography) to KK45/46, KN45/46 and KC47/48. Nevertheless, ISO/IEC 10646-1:1993 includes these as separate letters, and that decision is followed here. KD83/84 is likewise found nowhere. In accordance with the 1989 lexicon for the single language Karachai-Balkar, KU81/82 is replaced by KU23/24. All these are marked with a * in all lists.

The letters KU and WE (KQ01, KQ02, KW01, KW02), given at the end of the list, were omitted, quite arbitrarily, from the repertoire of ISO/IEC 10646-1:1993. They were a requirement of the USSR National Body, as stated in its comment on the first DIS 10646. These comments were circulated in SC2/WG2 only (N 708), in deviation from ISO usage.

For the Khanty language (spoken by fewer than 14000 people) four variants or dialects are distinguished, each with an alphabet of its own (621, 622, 623, 624). Because sources are sometimes contradictory, only the number 620 is indicated with characters required for any Khanty variant.
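The language lists in the fourth column are written compactly. In an entry such as 301,620,31,2,40,51,52,811,18,19 a number of one or two digits appears to stand for the preceding code with its last digits replaced (631, 632, 640 and so on), as comparison with the fully written lists on the right pages of Table E suggests. The following Python sketch encodes that reading. The function name expand_codes is hypothetical, and the rule is inferred from the tables; it is not stated in the text.

```python
def expand_codes(field: str) -> list[str]:
    """Expand an abbreviated language-code list as found in Tables C and D."""
    full: list[str] = []
    for token in field.split(","):
        token = token.strip()
        if len(token) < 3 and full:
            # Assumed rule: a short token replaces the trailing digits
            # of the previously expanded code (620 followed by 31 gives 631).
            token = full[-1][: -len(token)] + token
        full.append(token)
    return full


# Entry for O WITH DIAERESIS (KO17/KO18) as printed in Table C.
print(expand_codes("301,620,31,2,40,51,52,811,18,19"))
```

Codes of three or four digits are taken as written, so the sketch leaves entries such as 000 or 1101 unchanged.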
hex   SID   name (ISO/IEC 10646-1:1993)       languages using it
-----------------------------------------------------------------------
0400        (not used)
0401  KE18  CCL IO                            101
0402  KD62  CCL DJE                           070
0403  KG12  CCL GJE                           080
0404  KE16  CCL UKRAINIAN IE                  103
0405  KZ16  CCL DZE                           080
0406  KI12  CCL BYELORUSSIAN-UKRAINIAN I      102,103,631,632,809,18
0407  KI18  CCL YI                            103
0408  KJ02  CCL JE                            070,080,803,811
0409  KL42  CCL LJE                           070,080
040A  KN12  CCL NJE                           070,080
040B  KC12  CCL TSHE                          070
040C  KK12  CCL KJE                           080
040D        (not used)
040E  KU24  CCL SHORT U                       102,812,813,814,1101
040F  KG22  CCL DZHE                          070,080,401
-----------------------------------------------------------------------
0410  KA02  CCL A                             000
0411  KB02  CCL BE                            000
0412  KV02  CCL VE                            000
0413  KG02  CCL GHE                           000
0414  KD02  CCL DE                            000
0415  KE02  CCL IE                            000
0416  KZ22  CCL ZHE                           000
0417  KZ02  CCL ZE                            000
0418  KI02  CCL I                             000
0419  KJ12  CCL SHORT I                       090,101
041A  KK02  CCL KA                            000
041B  KL02  CCL EL                            000
041C  KM02  CCL EM                            000
041D  KN02  CCL EN                            000
041E  KO02  CCL O                             000
041F  KP02  CCL PE                            000
-----------------------------------------------------------------------
0420  KR02  CCL ER                            000
0421  KS02  CCL ES                            000
0422  KT02  CCL TE                            000
0423  KU02  CCL U                             000
0424  KF02  CCL EF                            000
0425  KH02  CCL HA                            000
0426  KC02  CCL TSE                           000
0427  KC22  CCL CHE                           000
0428  KS22  CCL SHA                           000
0429  KS16  CCL SHCHA                         090,101
042A  KU22  CCL HARD SIGN                     090,101
042B  KY02  CCL YERU                          101
042C  KX12  CCL SOFT SIGN                     090,101
042D  KE14  CCL E                             101
042E  KU16  CCL YU                            090,101
042F  KA16  CCL YA                            090,101
-----------------------------------------------------------------------
0430  KA01  CSL A                             000
0431  KB01  CSL BE                            000
0432  KV01  CSL VE                            000
0433  KG01  CSL GHE                           000
0434  KD01  CSL DE                            000
0435  KE01  CSL IE                            000
0436  KZ21  CSL ZHE                           000
0437  KZ01  CSL ZE                            000
0438  KI01  CSL I                             000
0439  KJ11  CSL SHORT I                       090,101
043A  KK01  CSL KA                            000
043B  KL01  CSL EL                            000
043C  KM01  CSL EM                            000
043D  KN01  CSL EN                            000
043E  KO01  CSL O                             000
043F  KP01  CSL PE                            000
-----------------------------------------------------------------------
0440  KR01  CSL ER                            000
0441  KS01  CSL ES                            000
0442  KT01  CSL TE                            000
0443  KU01  CSL U                             000
0444  KF01  CSL EF                            000
0445  KH01  CSL HA                            000
0446  KC01  CSL TSE                           000
0447  KC21  CSL CHE                           000
0448  KS21  CSL SHA                           000
0449  KS15  CSL SHCHA                         090,101
044A  KU21  CSL HARD SIGN                     090,101
044B  KY01  CSL YERU                          101
044C  KX11  CSL SOFT SIGN                     090,101
044D  KE13  CSL E                             101
044E  KU15  CSL YU                            090,101
044F  KA15  CSL YA                            090,101
-----------------------------------------------------------------------
0450        (not used)
0451  KE17  CSL IO                            101
0452  KD61  CSL DJE                           070
0453  KG11  CSL GJE                           080
0454  KE15  CSL UKRAINIAN IE                  103
0455  KZ15  CSL DZE                           080
0456  KI11  CSL BYELORUSSIAN-UKRAINIAN I      102,103,631,632,809,18
0457  KI17  CSL YI                            103
0458  KJ01  CSL JE                            070,080,803,811
0459  KL41  CSL LJE                           070,080
045A  KN11  CSL NJE                           070,080
045B  KC11  CSL TSHE                          070
045C  KK11  CSL KJE                           080
045D        (not used)
045E  KU23  CSL SHORT U                       102,812,813,814,1101
045F  KG21  CSL DZHE                          070,080,401
-----------------------------------------------------------------------
0490  KG68  CCL GHE WITH UPTURN               103
0491  KG67  CSL GHE WITH UPTURN               103
0492  KG62  CCL GHE WITH STROKE               302,803,5,7,9,14,15,18
0493  KG61  CSL GHE WITH STROKE               302,803,5,7,9,14,15,18
0494  KG66  CCL GHE WITH MIDDLE HOOK          401,816
0495  KG65  CSL GHE WITH MIDDLE HOOK          401,816
0496  KZ48  CCL ZHE WITH DESCENDER            802,4,15,902,1101
0497  KZ47  CSL ZHE WITH DESCENDER            802,4,15,902,1101
0498  KZ46  CCL ZE WITH DESCENDER             805
0499  KZ45  CSL ZE WITH DESCENDER             805
049A  KK46  CCL KA WITH DESCENDER             302,401,807,9,14,15
049B  KK45  CSL KA WITH DESCENDER             302,401,807,9,14,15
049C  KK66  CCL KA WITH VERTICAL STROKE       803
049D  KK65  CSL KA WITH VERTICAL STROKE       803
049E  KK62  CCL KA WITH STROKE                401
049F  KK61  CSL KA WITH STROKE                401
-----------------------------------------------------------------------
04A0  KK68  CCL BASHKIR KA                    805
04A1  KK67  CSL BASHKIR KA                    805
04A2  KN46  CCL EN WITH DESCENDER             610,20,802,4,5,9,10,15,17,18,902,1101
04A3  KN45  CSL EN WITH DESCENDER             610,20,802,4,5,9,10,15,17,18,902,1101
04A4  KN52  CCL LIGATURE EN GHE               651,652,811,16
04A5  KN51  CSL LIGATURE EN GHE               651,652,811,16
04A6  KP66  CCL PE WITH MIDDLE HOOK           401
04A7  KP65  CSL PE WITH MIDDLE HOOK           401
04A8  KW64  CCL ABKHASIAN KHA                 401
04A9  KW63  CSL ABKHASIAN KHA                 401
04AA  KS46  CCL ES WITH DESCENDER             801,5
04AB  KS45  CSL ES WITH DESCENDER             801,5
04AC  KT46  CCL TE WITH DESCENDER             401
04AD  KT45  CSL TE WITH DESCENDER             401
04AE  KU68  CCL STRAIGHT U                    802,3,4,5,7,9,10,15,16,17,901,1101
04AF  KU67  CSL STRAIGHT U                    802,3,4,5,7,9,10,15,16,17,901,1101
-----------------------------------------------------------------------
04B0  KU70  CCL STRAIGHT U WITH STROKE        809
04B1  KU69  CSL STRAIGHT U WITH STROKE        809
04B2  KH46  CCL HA WITH DESCENDER             302,401,807,14
04B3  KH45  CSL HA WITH DESCENDER             302,401,807,14
04B4  KT52  CCL LIGATURE TE TSE               401
04B5  KT51  CSL LIGATURE TE TSE               401
04B6  KC48  CCL CHE WITH DESCENDER            302,401
04B7  KC47  CSL CHE WITH DESCENDER            302,401
04B8  KC66  CCL CHE WITH VERTICAL STROKE      803
04B9  KC65  CSL CHE WITH VERTICAL STROKE      803
04BA  KH64  CCL SHHA                          301,803,4,5,9,15,16,901,902
04BB  KH63  CSL SHHA                          301,803,4,5,9,15,16,901,902
04BC  KC64  CCL ABKHASIAN CHE                 401
04BD  KC63  CSL ABKHASIAN CHE                 401
04BE  KC70  CCL ABKHASIAN CHE WITH DESCENDER  401
04BF  KC69  CSL ABKHASIAN CHE WITH DESCENDER  401
-----------------------------------------------------------------------
04C0  SA99  CYRILLIC LETTER PALOCHKA          402,3,4,501,2,3,4,5,6,7
04C1  KZ24  CCL ZHE WITH BREVE                201
04C2  KZ23  CSL ZHE WITH BREVE                201
04C3  KK50  *CCL KA WITH HOOK                 621,1201
04C4  KK49  *CSL KA WITH HOOK                 621,1201
04C5        (not used)
04C6        (not used)
04C7  KN50  *CCL EN WITH HOOK                 621,1201
04C8  KN49  *CSL EN WITH HOOK                 621,1201
04C9        (not used)
04CA        (not used)
04CB  KC44  *CCL KHAKASSIAN CHE               818
04CC  KC43  *CSL KHAKASSIAN CHE               818
04CD        (not used)
04CE        (not used)
04CF        (not used)
-----------------------------------------------------------------------
04D0  KA24  CCL A WITH BREVE                  801
04D1  KA23  CSL A WITH BREVE                  801
04D2  KA18  CCL A WITH DIAERESIS              620,651,52,819,902
04D3  KA17  CSL A WITH DIAERESIS              620,651,52,819,902
04D4  KA52  CCL LIGATURE A IE                 303
04D5  KA51  CSL LIGATURE A IE                 303
04D6  KE24  CCL IE WITH BREVE                 801
04D7  KE23  CSL IE WITH BREVE                 801
04D8  KE68  CCL SCHWA                         301,401,620,802,3,4,5,9,15,902,1101
04D9  KE67  CSL SCHWA                         301,401,620,802,3,4,5,9,15,902,1101
04DA  KE78  CCL SCHWA WITH DIAERESIS          620
04DB  KE77  CSL SCHWA WITH DIAERESIS          620
04DC  KZ72  CCL ZHE WITH DIAERESIS            640
04DD  KZ71  CSL ZHE WITH DIAERESIS            640
04DE  KZ18  CCL ZE WITH DIAERESIS             640
04DF  KZ17  CSL ZE WITH DIAERESIS             640
-----------------------------------------------------------------------
04E0  KZ64  CCL ABKHASIAN DZE                 401
04E1  KZ63  CSL ABKHASIAN DZE                 401
04E2  KI32  CCL I WITH MACRON                 302
04E3  KI31  CSL I WITH MACRON                 302
04E4  KI72  CCL I WITH DIAERESIS              640
04E5  KI71  CSL I WITH DIAERESIS              640
04E6  KO18  CCL O WITH DIAERESIS              301,620,31,2,40,51,52,811,18,19
04E7  KO17  CSL O WITH DIAERESIS              301,620,31,2,40,51,52,811,18,19
04E8  KO68  CCL BARRED O                      620,802,3,4,5,7,9,10,15,16,17,901,2
04E9  KO67  CSL BARRED O                      620,802,3,4,5,7,9,10,15,16,17,901,2
04EA  KO78  CCL BARRED O WITH DIAERESIS       620
04EB  KO77  CSL BARRED O WITH DIAERESIS       620
04EE  KU32  CCL U WITH MACRON                 302
04EF  KU31  CSL U WITH MACRON                 302
-----------------------------------------------------------------------
04F0  KU18  CCL U WITH DIAERESIS              620,51,52,811,18,19,902
04F1  KU17  CSL U WITH DIAERESIS              620,51,52,811,18,19,902
04F2  KU26  CCL U WITH DOUBLE ACUTE           801
04F3  KU25  CSL U WITH DOUBLE ACUTE           801
04F4  KC72  CCL CHE WITH DIAERESIS            640
04F5  KC71  CSL CHE WITH DIAERESIS            640
04F8  KY18  CCL YERU WITH DIAERESIS           652
04F9  KY17  CSL YERU WITH DIAERESIS           652
----  KQ01  CSL KU                            301
----  KQ02  CCL KU                            301
----  KW01  CSL WE                            301
----  KW02  CCL WE                            301
-----------------------------------------------------------------------
      KA81  *CSL A WITH ACUTE                       ?
      KA82  *CCL A WITH ACUTE                       ?
      KD84  *CCL DJE WITH ACUTE                     ?
      KD83  *CSL DJE WITH ACUTE                     ?
      KE81  *CSL IE WITH ACUTE                      ?
      KE82  *CCL IE WITH ACUTE                      ?
      KE87  *CSL IE WITH DIAERESIS AND ACUTE        ?
      KE88  *CCL IE WITH DIAERESIS AND ACUTE        ?
      KE85  *CSL UKRAINIAN IE WITH ACUTE            ?
      KE86  *CCL UKRAINIAN IE WITH ACUTE            ?
      KE83  *CSL SCHWA WITH ACUTE                   ?
      KE84  *CCL SCHWA WITH ACUTE                   ?
      KI81  *CSL I WITH ACUTE                       ?
      KI82  *CCL I WITH ACUTE                       ?
      KO81  *CSL O WITH ACUTE                       ?
      KO82  *CCL O WITH ACUTE                       ?
      KO87  *CSL O WITH DIAERESIS AND ACUTE         ?
      KO88  *CCL O WITH DIAERESIS AND ACUTE         ?
      KO83  *CSL BARRED O WITH ACUTE                ?
      KO84  *CCL BARRED O WITH ACUTE                ?
      KU82  *CCL U WITH ACUTE                       ?
      KU81  *CSL U WITH ACUTE                       ?
      KU83  *CSL STRAIGHT U WITH ACUTE              ?
      KU84  *CCL STRAIGHT U WITH ACUTE              ?
      KU89  *CSL STRAIGHT U WITH STROKE AND ACUTE   ?
      KU90  *CCL STRAIGHT U WITH STROKE AND ACUTE   ?
      KY81  *CSL YERY WITH ACUTE                    ?
      KY82  *CCL YERY WITH ACUTE                    ?
      KU85  *CSL YU WITH ACUTE                      ?
      KU86  *CCL YU WITH ACUTE                      ?
      KU93  *CSL YU WITH MACRON                     ?
      KU94  *CCL YU WITH MACRON                     ?
      KA85  *CSL YA WITH ACUTE                      ?
      KA86  *CCL YA WITH ACUTE                      ?
      KA93  *CSL YA WITH MACRON                     ?
      KA94  *CCL YA WITH MACRON                     ?

CODE TABLE FROM ISO/IEC 10646-1

TABLE D                                              VERSION 3.0
COMPLETE REPERTOIRE OF GRAPHIC CHARACTERS             1999-03-09
REQUIRED FOR MODERN CYRILLIC SCRIPT             J. W. van Wingen

This list presents the information in Table C in a different order. The names follow ISO conventions (CSL is short for CYRILLIC SMALL LETTER; the corresponding capital letters are omitted from the list). The repertoire is the same as in Tables B and C, but adjusted in accordance with the GOST list and language lexicons.
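Because Table D lists small letters only, the capital counterpart of an entry has to be derived by the reader. In Tables B and C the capital SID is the small SID plus one (KZ45 and KZ46, for example), and the odd number selects a class from the SMALL LETTERS legend of Table B. The Python sketch below encodes that reading. The names SMALL_CLASSES, capital_sid and decode_sid are hypothetical, class 01 is an assumption because the legend does not name the unmarked class, and only the single-word and named entries of the legend are included.

```python
# Classes from the SMALL LETTERS legend of Table B (odd numbers).
# Entry 1 is an assumption: the legend does not name the unmarked class.
SMALL_CLASSES = {
    1: "(no mark)",
    11: "ACUTE", 13: "GRAVE", 15: "CIRCUMFLEX", 17: "DIAERESIS", 19: "TILDE",
    21: "CARON", 23: "BREVE", 25: "DOUBLE ACUTE", 27: "RING ABOVE",
    29: "DOT ABOVE", 31: "MACRON", 41: "CEDILLA", 43: "OGONEK",
    45: "DESCENDER", 47: "CARON & DESCENDER", 49: "HOOK", 51: "LIGATURE",
    61: "STROKE", 63: "SPECIAL", 65: "SPECIAL", 67: "SPECIAL", 69: "SPECIAL",
}


def capital_sid(small: str) -> str:
    """Return the capital SID for a small-letter SID (number plus one)."""
    if small == "SA99":
        raise ValueError("palochka (SA99) has no capital equivalent")
    number = int(small[2:])
    if number % 2 == 0:
        raise ValueError(f"{small} is not a small-letter SID")
    return f"{small[:2]}{number + 1:02d}"


def decode_sid(sid: str) -> tuple[str, str, str]:
    """Split a SID into base letter, case and legend class."""
    number = int(sid[2:])
    is_capital = number % 2 == 0
    small_number = number - 1 if is_capital else number
    case = "capital" if is_capital else "small"
    return sid[1], case, SMALL_CLASSES.get(small_number, "unnamed")


print(capital_sid("KZ45"), decode_sid("KZ46"))
```

The class names are those of the identifier scheme, so a letter such as ZHE (KZ21) is filed under CARON although its ISO name mentions no diacritic.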
hex   SID   Name (ISO/IEC 10646-1)            languages using it
-----------------------------------------------------------------------
0430  KA01  CSL A                             000
0431  KB01  CSL BE                            000
0432  KV01  CSL VE                            000
0433  KG01  CSL GHE                           000
0434  KD01  CSL DE                            000
0435  KE01  CSL IE                            000
0437  KZ01  CSL ZE                            000
0438  KI01  CSL I                             000
043A  KK01  CSL KA                            000
043B  KL01  CSL EL                            000
043C  KM01  CSL EM                            000
043D  KN01  CSL EN                            000
043E  KO01  CSL O                             000
043F  KP01  CSL PE                            000
0440  KR01  CSL ER                            000
0441  KS01  CSL ES                            000
0442  KT01  CSL TE                            000
0443  KU01  CSL U                             000
0444  KF01  CSL EF                            000
0445  KH01  CSL HA                            000
0446  KC01  CSL TSE                           000
044B  KY01  CSL YERU                          101
0458  KJ01  CSL JE                            070,080,620,803,811
----  KQ01  CSL KU                            301
----  KW01  CSL WE                            301
-----------------------------------------------------------------------
0439  KJ11  CSL SHORT I                       090,101
044C  KX11  CSL SOFT SIGN                     090,101
0453  KG11  CSL GJE                           080
0456  KI11  CSL BYELORUSSIAN-UKRAINIAN I      102,103,631,632,809,18
045A  KN11  CSL NJE                           070,080
045B  KC11  CSL TSHE                          070
045C  KK11  CSL KJE                           080
-----------------------------------------------------------------------
044D  KE13  CSL E                             101
-----------------------------------------------------------------------
044E  KU15  CSL YU                            090,101
044F  KA15  CSL YA                            090,101
0454  KE15  CSL UKRAINIAN IE                  103
0455  KZ15  CSL DZE                           080
0449  KS15  CSL SHCHA                         090,101
-----------------------------------------------------------------------
0451  KE17  CSL IO                            101
0457  KI17  CSL YI                            103
04D3  KA17  CSL A WITH DIAERESIS              620,51,52,819,902
04DF  KZ17  CSL ZE WITH DIAERESIS             640
04E7  KO17  CSL O WITH DIAERESIS              301,620,31,2,40,51,52,811,18,19
04F1  KU17  CSL U WITH DIAERESIS              620,51,52,811,18,19,902
04F9  KY17  CSL YERU WITH DIAERESIS           652
-----------------------------------------------------------------------
0436  KZ21  CSL ZHE                           000
0447  KC21  CSL CHE                           000
0448  KS21  CSL SHA                           000
044A  KU21  CSL HARD SIGN                     090,101
045F  KG21  CSL DZHE                          070,080,401
-----------------------------------------------------------------------
04D1  KA23  CSL A WITH BREVE                  801
04D7  KE23  CSL IE WITH BREVE                 801
045E  KU23  CSL SHORT U                       102,812,14,1101
04C2  KZ23  CSL ZHE WITH BREVE                201
-----------------------------------------------------------------------
04F3  KU25  CSL U WITH DOUBLE ACUTE           801
-----------------------------------------------------------------------
04E3  KI31  CSL I WITH MACRON                 302
04EF  KU31  CSL U WITH MACRON                 302
-----------------------------------------------------------------------
0459  KL41  CSL LJE                           070,080
-----------------------------------------------------------------------
0499  KZ45  CSL ZE WITH DESCENDER             805
049B  KK45  CSL KA WITH DESCENDER             302,401,620,807,9,14,15
04A3  KN45  CSL EN WITH DESCENDER             610,20,802,4,5,9,10,15,17,18,902,1101
04AB  KS45  CSL ES WITH DESCENDER             801,5
04AD  KT45  CSL TE WITH DESCENDER             401
04B3  KH45  CSL HA WITH DESCENDER             302,401,807,14
-----------------------------------------------------------------------
0497  KZ47  CSL ZHE WITH DESCENDER            802,4,15,902,1101
04B7  KC47  CSL CHE WITH DESCENDER            302,401,818
-----------------------------------------------------------------------
04D5  KA51  CSL LIGATURE A IE                 303
04A5  KN51  CSL LIGATURE EN GHE               651,52,811,16
04B5  KT51  CSL LIGATURE TE TSE               401
-----------------------------------------------------------------------
0452  KD61  CSL DJE                           070
0493  KG61  CSL GHE WITH STROKE               302,803,5,7,9,14,15,18
049F  KK61  CSL KA WITH STROKE                401
-----------------------------------------------------------------------
04BB  KH63  CSL SHHA                          301,803,4,5,9,15,16,901,902
04E1  KZ63  CSL ABKHASIAN DZE                 401
04BD  KC63  CSL ABKHASIAN CHE                 401
04A9  KW63  CSL ABKHASIAN KHA                 401
-----------------------------------------------------------------------
0495  KG65  CSL GHE WITH MIDDLE HOOK          401,816
04A7  KP65  CSL PE WITH MIDDLE HOOK           401
049D  KK65  CSL KA WITH VERTICAL STROKE       803
04B9  KC65  CSL CHE WITH VERTICAL STROKE      803
-----------------------------------------------------------------------
0491  KG67  CSL GHE WITH UPTURN               103
04A1  KK67  CSL BASHKIR KA                    805
04D9  KE67  CSL SCHWA                         301,401,620,802,3,4,5,9,15,902,1101
04E9  KO67  CSL BARRED O                      620,802,3,4,5,7,9,10,15,16,17,901,2
04AF  KU67  CSL STRAIGHT U                    802,3,4,5,7,9,10,15,16,17,901,1101
-----------------------------------------------------------------------
04B1  KU69  CSL STRAIGHT U WITH STROKE        809
04BF  KC69  CSL ABKHASIAN CHE WITH DESCENDER  401
-----------------------------------------------------------------------
04E5  KI71  CSL I WITH DIAERESIS              640
04DD  KZ71  CSL ZHE WITH DIAERESIS            640
04F5  KC71  CSL CHE WITH DIAERESIS            640
-----------------------------------------------------------------------
04DB  KE77  CSL SCHWA WITH DIAERESIS          620
04EB  KO77  CSL BARRED O WITH DIAERESIS       620
-----------------------------------------------------------------------
04C0  SA99  CYRILLIC LETTER PALOCHKA          402,3,4,501,2,3,4,5,6,7

TABLE E                                              VERSION 3.0
COMPLETE REPERTOIRE OF GRAPHIC CHARACTERS             1999-03-09
REQUIRED FOR EACH LANGUAGE                      J. W. van Wingen
USING MODERN CYRILLIC SCRIPT

This list contains all letters required for languages written with Cyrillic script according to modern usage (as of 1989-01-01). It is in accordance with the USSR comments on DIS 10646 (GOST list). Letters are indicated with their full name on the left page, and on both pages with the short identifier (SID) explained in the list in Table B. The names of the letters are those specified in ISO/IEC 10646-1:1993. On the right pages the number codes of the languages using a letter are indicated, as specified in Table A. The order of presentation follows the language codes, grouping together all the letters required for each language. Thus most letters occur more than once. The letters common to all languages written with Cyrillic script (000) are given first, it being assumed that even if a certain letter is not required for the language proper, it is needed for borrowed words and personal names. For every letter it is indicated by which other languages it is required, but 000 is not repeated every time, and 101 is not repeated with the other languages of the USSR.
Letters that are unique to a certain language can easily be found this way, because the numbers of all languages using a letter are always given. Only small letters are included, each having a capital equivalent, with the exception of "palochka" (SA99). The hexadecimal and decimal value of each character position in ISO/IEC 10646-1:1993 is indicated for easy reference to Table C.

hex   dec  SID   name (ISO/IEC 10646-1)
--------------------------------------- Common Cyrillic repertoire --
0430  048  KA01  CYRILLIC SMALL LETTER A
0431  049  KB01  CYRILLIC SMALL LETTER BE
0432  050  KV01  CYRILLIC SMALL LETTER VE
0433  051  KG01  CYRILLIC SMALL LETTER GHE
0434  052  KD01  CYRILLIC SMALL LETTER DE
0435  053  KE01  CYRILLIC SMALL LETTER IE
0437  055  KZ01  CYRILLIC SMALL LETTER ZE
0438  056  KI01  CYRILLIC SMALL LETTER I
043A  058  KK01  CYRILLIC SMALL LETTER KA
043B  059  KL01  CYRILLIC SMALL LETTER EL
043C  060  KM01  CYRILLIC SMALL LETTER EM
043D  061  KN01  CYRILLIC SMALL LETTER EN
043E  062  KO01  CYRILLIC SMALL LETTER O
043F  063  KP01  CYRILLIC SMALL LETTER PE
0440  064  KR01  CYRILLIC SMALL LETTER ER
0441  065  KS01  CYRILLIC SMALL LETTER ES
0442  066  KT01  CYRILLIC SMALL LETTER TE
0443  067  KU01  CYRILLIC SMALL LETTER U
0444  068  KF01  CYRILLIC SMALL LETTER EF
0445  069  KH01  CYRILLIC SMALL LETTER HA
0446  070  KC01  CYRILLIC SMALL LETTER TSE
0436  054  KZ21  CYRILLIC SMALL LETTER ZHE
0447  071  KC21  CYRILLIC SMALL LETTER CHE
0448  072  KS21  CYRILLIC SMALL LETTER SHA

SID   Used in language
--------------------------------------- Common Cyrillic repertoire --
KA01  000
KB01  000
KV01  000
KG01  000
KD01  000
KE01  000
KZ01  000
KI01  000
KK01  000
KL01  000
KM01  000
KN01  000
KO01  000
KP01  000
KR01  000
KS01  000
KT01  000
KU01  000
KF01  000
KH01  000
KC01  000
KZ21  000
KC21  000
KS21  000

------------------------------------------------- Serbian - 070 -----
0458  088  KJ01  CYRILLIC SMALL LETTER JE
045A  090  KN11  CYRILLIC SMALL LETTER NJE
045B  091  KC11  CYRILLIC SMALL LETTER TSHE (Serbocroatian)
045F  095  KG21  CYRILLIC SMALL LETTER DZHE
0459  089  KL41  CYRILLIC SMALL LETTER LJE
0452  082  KD61  CYRILLIC SMALL LETTER DJE (Serbocroatian)
---------------------------------------------- Macedonian - 080 -----
0458  088  KJ01  CYRILLIC SMALL LETTER JE
0453  083  KG11  CYRILLIC SMALL LETTER GJE (Macedonian)
045A  090  KN11  CYRILLIC SMALL LETTER NJE
045C  092  KK11  CYRILLIC SMALL LETTER KJE (Macedonian)
0455  085  KZ15  CYRILLIC SMALL LETTER DZE (Macedonian)
045F  095  KG21  CYRILLIC SMALL LETTER DZHE
0459  089  KL41  CYRILLIC SMALL LETTER LJE
----------------------------------------------- Bulgarian - 090 -----
0439  057  KJ11  CYRILLIC SMALL LETTER SHORT I
044C  076  KX11  CYRILLIC SMALL LETTER SOFT SIGN
0449  073  KS15  CYRILLIC SMALL LETTER SHCHA
044E  078  KU15  CYRILLIC SMALL LETTER YU
044F  079  KA15  CYRILLIC SMALL LETTER YA
044A  074  KU21  CYRILLIC SMALL LETTER HARD SIGN
------------------------------------------------- Russian - 101 -----
044B  075  KY01  CYRILLIC SMALL LETTER YERU
0439  057  KJ11  CYRILLIC SMALL LETTER SHORT I
044C  076  KX11  CYRILLIC SMALL LETTER SOFT SIGN
044D  077  KE13  CYRILLIC SMALL LETTER E
0449  073  KS15  CYRILLIC SMALL LETTER SHCHA
044E  078  KU15  CYRILLIC SMALL LETTER YU
044F  079  KA15  CYRILLIC SMALL LETTER YA
0451  081  KE17  CYRILLIC SMALL LETTER IO
044A  074  KU21  CYRILLIC SMALL LETTER HARD SIGN
-------------------------------------------- Byelorussian - 102 -----
0456  086  KI11  CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
045E  094  KU23  CYRILLIC SMALL LETTER SHORT U (Byelorussian)
----------------------------------------------- Ukrainian - 103 -----
0456  086  KI11  CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
0454  084  KE15  CYRILLIC SMALL LETTER UKRAINIAN IE
0457  087  KI17  CYRILLIC SMALL LETTER YI (Ukrainian)
0491  145  KG67  CYRILLIC SMALL LETTER GHE WITH UPTURN
----------------------------------------------- Moldavian - 201 -----
04C2  194  KZ23  CYRILLIC SMALL LETTER ZHE WITH BREVE
------------------------------------------------- Kurdish - 301 -----
removed    KQ01  CYRILLIC SMALL LETTER KURDISH KU
removed    KW01  CYRILLIC SMALL LETTER KURDISH VE
04E7  231  KO17  CYRILLIC SMALL LETTER O WITH DIAERESIS
04BB  187  KH63  CYRILLIC SMALL LETTER SHHA
04D9  217  KE67  CYRILLIC SMALL LETTER SCHWA
------------------------------------------------- Tadzhik - 302 -----
04E3  227  KI31  CYRILLIC SMALL LETTER I WITH MACRON
04EF  239  KU31  CYRILLIC SMALL LETTER U WITH MACRON
049B  155  KK45  CYRILLIC SMALL LETTER KA WITH DESCENDER
04B3  179  KH45  CYRILLIC SMALL LETTER HA WITH DESCENDER
04B7  183  KC47  CYRILLIC SMALL LETTER CHE WITH DESCENDER
0493  147  KG61  CYRILLIC SMALL LETTER GHE WITH STROKE
-------------------------------------------------- Osetic - 303 -----
04D5  213  KA51  CYRILLIC SMALL LIGATURE A IE

------------------------------------------------- Serbian - 070 -----
KJ01  070,080,620,803,811
KN11  070,080
KC11  070
KG21  070,080,401
KL41  070,080
KD61  070
---------------------------------------------- Macedonian - 080 -----
KJ01  070,080,620,803,811
KG11  080
KN11  070,080
KK11  080
KZ15  080
KG21  070,080,401
KL41  070,080
----------------------------------------------- Bulgarian - 090 -----
KJ11  090,101
KX11  090,101
KS15  090,101
KU15  090,101
KA15  090,101
KU21  090,101
------------------------------------------------- Russian - 101 -----
KY01  101
KJ11  090,101
KX11  090,101
KE13  101
KS15  090,101
KU15  090,101
KA15  090,101
KE17  101
KU21  090,101
-------------------------------------------- Byelorussian - 102 -----
KI11  102,103,631,632,809,818
KU23  102,812,814,1101
----------------------------------------------- Ukrainian - 103 -----
KI11  102,103,631,632,809,818
KE15  103
KI17  103
KG67  103
----------------------------------------------- Moldavian - 201 -----
KZ23  201
------------------------------------------------- Kurdish - 301 -----
KQ01  301
KW01  301
KO17  301,620,631,632,640,651,652,811,818,819
KH63  301,803,804,805,809,815,816,901,902
KE67  301,401,620,802,803,804,805,809,815,902,1101
------------------------------------------------- Tadzhik - 302 -----
KI31  302
KU31  302
KK45  302,401,807,809,814,815
KH45  302,401,807,814
KC47  302,401,818
KG61  302,803,805,807,809,814,815,818
-------------------------------------------------- Osetic - 303 -----
KA51  303

-------------------------------------------------- Abkhaz - 401 -----
045F  095  KG21  CYRILLIC SMALL LETTER DZHE
04B3  179  KH45  CYRILLIC SMALL LETTER HA WITH DESCENDER
049B  155  KK45  CYRILLIC SMALL LETTER KA WITH DESCENDER
04AD  173  KT45  CYRILLIC SMALL LETTER TE WITH DESCENDER
04B7  183  KC47  CYRILLIC SMALL LETTER CHE WITH DESCENDER
04B5  181  KT51  CYRILLIC SMALL LIGATURE TE TSE (Abkhasian)
049F  159  KK61  CYRILLIC SMALL LETTER KA WITH STROKE
04E1  225  KZ63  CYRILLIC SMALL LETTER ABKHASIAN DZE
04BD  189  KC63  CYRILLIC SMALL LETTER ABKHASIAN CHE
04A9  169  KW63  CYRILLIC SMALL LETTER ABKHASIAN KHA
0495  149  KG65  CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
04A7  167  KP65  CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK (Abkhasian)
04D9  217  KE67  CYRILLIC SMALL LETTER SCHWA
04BF  191  KC69  CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
----------------------------------------------- Kaukasian - 402/507--
04C0  192  SA99  CYRILLIC LETTER PALOCHKA
--------------------------------------------------- Mansi - 610 -----
04A3  163  KN45  CYRILLIC SMALL LETTER EN WITH DESCENDER
-------------------------------------------------- Khanti - 620 -----
0458  088  KJ01  CYRILLIC SMALL LETTER JE
04D3  211  KA17  CYRILLIC SMALL LETTER A WITH DIAERESIS
04E7  231  KO17  CYRILLIC SMALL LETTER O WITH DIAERESIS
04F1  241  KU17  CYRILLIC SMALL LETTER U WITH DIAERESIS
049B  155  KK45  CYRILLIC SMALL LETTER KA WITH DESCENDER
04A3  163  KN45  CYRILLIC SMALL LETTER EN WITH DESCENDER
04D9  217  KE67  CYRILLIC SMALL LETTER SCHWA
04E9  233  KO67  CYRILLIC SMALL LETTER BARRED O
04DB  219  KE77  CYRILLIC SMALL LETTER SCHWA WITH DIAERESIS
04EB  235  KO77  CYRILLIC SMALL LETTER BARRED O WITH DIAERESIS
------------------------------------------------ Komi ----- 631 -----
0456  086  KI11  CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
04E7  231  KO17  CYRILLIC SMALL LETTER O WITH DIAERESIS
------------------------------------------------ Komi (P) - 632 -----
0456  086  KI11  CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
04E7  231  KO17  CYRILLIC SMALL LETTER O WITH DIAERESIS
-------------------------------------------------- Udmurt - 640 -----
04E7  231  KO17  CYRILLIC SMALL LETTER O WITH DIAERESIS
04DF  223  KZ17  CYRILLIC SMALL LETTER ZE WITH DIAERESIS
04E5  229  KI71  CYRILLIC SMALL LETTER I WITH DIAERESIS
04DD  221  KZ71  CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
04F5  245  KC71  CYRILLIC SMALL LETTER CHE WITH DIAERESIS
------------------------------------------------ Mari (L) - 651 -----
04D3  211  KA17  CYRILLIC SMALL LETTER A WITH DIAERESIS
04E7  231  KO17  CYRILLIC SMALL LETTER O WITH DIAERESIS
04F1  241  KU17  CYRILLIC SMALL LETTER U WITH DIAERESIS
04A5  165  KN51  CYRILLIC SMALL LIGATURE EN GHE
------------------------------------------------ Mari (H) - 652 -----
04D3  211  KA17  CYRILLIC SMALL LETTER A WITH DIAERESIS
04E7  231  KO17  CYRILLIC SMALL LETTER O WITH DIAERESIS
04F1  241  KU17  CYRILLIC SMALL LETTER U WITH DIAERESIS
04F9  249  KY17  CYRILLIC SMALL LETTER YERU WITH DIAERESIS
04A5  165  KN51  CYRILLIC SMALL LIGATURE EN GHE

-------------------------------------------------- Abkhaz - 401 -----
KG21  070,080,401
KK45  302,401,807,809,814,815
KH45  302,401,807,814
KT45  401
KC47  302,401,818
KT51  401
KK61  401
KC63  401
KZ63  401
KW63  401
KG65  401,816
KP65  401
KE67  301,401,620,802,803,804,805,809,815,902,1101
KC69  401
-----------------------------------------------
Kaukasian - 402/507-- SA99 402,403,404,501,502,503,504,505,506,507 --------------------------------------------------- Mansi - 610 ----- KN45 620,802,804,805,809,810,815,817,818,902,1101 -------------------------------------------------- Khanti - 620 ----- KJ01 070,080,620,803,811 KA17 620,651,652,819,902 KO17 301,620,631,632,640,651,652,811,818,819 KU17 620,651,652,811,818,819,902 KK45 302,401,807,809,814,815 KN45 620,802,804,805,809,810,815,817,818,902,1101 KE67 301,401,620,802,803,804,805,809,815,902,1101 KO67 620,802,803,804,805,807,809,810,815,816,817,901,902 KE77 620 KO77 620 ------------------------------------------------ Komi ----- 631 ----- KI11 102,103,631,632,809,818 KO17 301,620,631,632,640,651,652,811,818,819 ------------------------------------------------ Komi (P) - 632 ----- KI11 102,103,631,632,809,818 KO17 301,620,631,632,640,651,652,811,818,819 -------------------------------------------------- Udmurt - 640 ----- KO17 301,620,631,632,640,651,652,811,818,819 KZ17 640 KI71 640 KZ71 640 KC71 640 ------------------------------------------------ Mari (L) - 651 ----- KA17 620,651,652,819,902 KO17 301,620,631,632,640,651,652,811,818,819 KU17 620,651,652,811,818,819,902 KN51 651,652,811,816 ------------------------------------------------ Mari (H) - 652 ----- KA17 620,651,652,819,902 KO17 301,620,631,632,640,651,652,811,818,819 KU17 620,651,652,811,818,819,902 KY17 652 KN51 651,652,811,816 ------------------------------------------------- Chuvash - 801 ----- 04D1 209 KA23 CYRILLIC SMALL LETTER A WITH BREVE 04D7 215 KE23 CYRILLIC SMALL LETTER IE WITH BREVE 04F3 243 KU25 CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE 04AB 171 KS45 CYRILLIC SMALL LETTER ES WITH DESCENDER ------------------------------------------------- Turkmen - 802 ----- 04A3 163 KN45 CYRILLIC SMALL LETTER EN WITH DESCENDER 0497 151 KZ47 CYRILLIC SMALL LETTER ZHE WITH DESCENDER 04D9 217 KE67 CYRILLIC SMALL LETTER SCHWA 04E9 233 KO67 CYRILLIC SMALL LETTER BARRED O 04AF 175 KU67 CYRILLIC SMALL 
LETTER STRAIGHT U --------------------------------------------------- Azeri - 803 ----- 0458 088 KJ01 CYRILLIC SMALL LETTER JE 0493 147 KG61 CYRILLIC SMALL LETTER GHE WITH STROKE 04BB 187 KH63 CYRILLIC SMALL LETTER SHHA 049D 157 KK65 CYRILLIC SMALL LETTER KA WITH VERTICAL STROKE 04B9 185 KC65 CYRILLIC SMALL LETTER CHE WITH VERTICAL STROKE 04D9 217 KE67 CYRILLIC SMALL LETTER SCHWA 04E9 233 KO67 CYRILLIC SMALL LETTER BARRED O 04AF 175 KU67 CYRILLIC SMALL LETTER STRAIGHT U --------------------------------------------------- Tatar - 804 ----- 04A3 163 KN45 CYRILLIC SMALL LETTER EN WITH DESCENDER 0497 151 KZ47 CYRILLIC SMALL LETTER ZHE WITH DESCENDER 04BB 187 KH63 CYRILLIC SMALL LETTER SHHA 04D9 217 KE67 CYRILLIC SMALL LETTER SCHWA 04E9 233 KO67 CYRILLIC SMALL LETTER BARRED O 04AF 175 KU67 CYRILLIC SMALL LETTER STRAIGHT U ------------------------------------------------- Bashkir - 805 ----- 0499 153 KZ45 CYRILLIC SMALL LETTER ZE WITH DESCENDER 04A3 163 KN45 CYRILLIC SMALL LETTER EN WITH DESCENDER 04AB 171 KS45 CYRILLIC SMALL LETTER ES WITH DESCENDER 0493 147 KG61 CYRILLIC SMALL LETTER GHE WITH STROKE 04BB 187 KH63 CYRILLIC SMALL LETTER SHHA 04A1 161 KK67 CYRILLIC SMALL LETTER BASHKIR KA 04D9 217 KE67 CYRILLIC SMALL LETTER SCHWA 04E9 233 KO67 CYRILLIC SMALL LETTER BARRED O 04AF 175 KU67 CYRILLIC SMALL LETTER STRAIGHT U --------------------------------------------- Kara-kalpak - 807 ----- 049B 155 KK45 CYRILLIC SMALL LETTER KA WITH DESCENDER 04B3 179 KH45 CYRILLIC SMALL LETTER HA WITH DESCENDER 0493 147 KG61 CYRILLIC SMALL LETTER GHE WITH STROKE 04E9 233 KO67 CYRILLIC SMALL LETTER BARRED O 04AF 175 KU67 CYRILLIC SMALL LETTER STRAIGHT U -------------------------------------------------- Kazakh - 809 ----- 0456 086 KI11 CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I 049B 155 KK45 CYRILLIC SMALL LETTER KA WITH DESCENDER 04A3 163 KN45 CYRILLIC SMALL LETTER EN WITH DESCENDER 0493 147 KG61 CYRILLIC SMALL LETTER GHE WITH STROKE 04BB 187 KH63 CYRILLIC SMALL LETTER SHHA 04D9 217 
KE67 CYRILLIC SMALL LETTER SCHWA 04E9 233 KO67 CYRILLIC SMALL LETTER BARRED O 04AF 175 KU67 CYRILLIC SMALL LETTER STRAIGHT U 04B1 177 KU69 CYRILLIC SMALL LETTER STRAIGHT U WITH STROKE ------------------------------------------------- Kirghiz - 810 ----- 04A3 163 KN45 CYRILLIC SMALL LETTER EN WITH DESCENDER 04E9 233 KO67 CYRILLIC SMALL LETTER BARRED O 04AF 175 KU67 CYRILLIC SMALL LETTER STRAIGHT U -------------------------------------------------- Altaic - 811 ----- 0458 088 KJ01 CYRILLIC SMALL LETTER JE 04E7 231 KO17 CYRILLIC SMALL LETTER O WITH DIAERESIS 04F1 241 KU17 CYRILLIC SMALL LETTER U WITH DIAERESIS 04A5 165 KN51 CYRILLIC SMALL LIGATURE EN GHE ------------------------------------------------- Chuvash - 801 ----- KA23 801 KE23 801 KU25 801 KS45 801,805 ------------------------------------------------- Turkmen - 802 ----- KN45 620,802,804,805,809,810,815,817,818,902,1101 KZ47 802,804,815,902,1101 KE67 301,401,620,802,803,804,805,809,815,902,1101 KO67 620,802,803,804,805,807,809,810,815,816,817,901,902 KU67 802,803,804,805,807,809,810,815,816,817,901,1101 --------------------------------------------------- Azeri - 803 ----- KJ01 070,080,620,803,811 KG61 302,803,805,807,809,814,815,818 KH63 301,803,804,805,809,815,816,901,902 KK65 803 KC65 803 KE67 301,401,620,802,803,804,805,809,815,902,1101 KO67 620,802,803,804,805,807,809,810,815,816,817,901,902 KU67 802,803,804,805,807,809,810,815,816,817,901,1101 --------------------------------------------------- Tatar - 804 ----- KN45 620,802,804,805,809,810,815,817,818,902,1101 KZ47 802,804,815,902,1101 KH63 301,803,804,805,809,815,816,901,902 KE67 301,401,620,802,803,804,805,809,815,902,1101 KO67 620,802,803,804,805,807,809,810,815,816,817,901,902 KU67 802,803,804,805,807,809,810,815,816,817,901,1101 ------------------------------------------------- Bashkir - 805 ----- KZ45 805 KN45 620,802,804,805,809,810,815,817,818,902,1101 KS45 801,805 KG61 302,803,805,807,809,814,815,818 KH63 301,803,804,805,809,815,816,901,902 
KK67 805
KE67 301,401,620,802,803,804,805,809,815,902,1101
KO67 620,802,803,804,805,807,809,810,815,816,817,901,902
KU67 802,803,804,805,807,809,810,815,816,817,901,1101
--------------------------------------------- Kara-kalpak - 807 -----
KK45 302,401,807,809,814,815
KH45 302,401,807,814
KG61 302,803,805,807,809,814,815,818
KO67 620,802,803,804,805,807,809,810,815,816,817,901,902
KU67 802,803,804,805,807,809,810,815,816,817,901,1101
-------------------------------------------------- Kazakh - 809 -----
KI11 102,103,631,632,809,818
KK45 302,401,807,809,814,815
KN45 620,802,804,805,809,810,815,817,818,902,1101
KG61 302,803,805,807,809,814,815,818
KH63 301,803,804,805,809,815,816,901,902
KE67 301,401,620,802,803,804,805,809,815,902,1101
KO67 620,802,803,804,805,807,809,810,815,816,817,901,902
KU67 802,803,804,805,807,809,810,815,816,817,901,1101
KU69 809
------------------------------------------------- Kirghiz - 810 -----
KN45 620,802,804,805,809,810,815,817,818,902,1101
KO67 620,802,803,804,805,807,809,810,815,816,817,901,902
KU67 802,803,804,805,807,809,810,815,816,817,901,1101
-------------------------------------------------- Altaic - 811 -----
KJ01 070,080,620,803,811
KO17 301,620,631,632,640,651,652,811,818,819
KU17 620,651,652,811,818,819,902
KN51 651,652,811,816
----------------------------------------- Karachai-Balkar - 812 -----
045E 094 KU23 CYRILLIC SMALL LETTER SHORT U (Byelorussian)
--------------------------------------------------- Uzbek - 814 -----
045E 094 KU23 CYRILLIC SMALL LETTER SHORT U (Byelorussian)
049B 155 KK45 CYRILLIC SMALL LETTER KA WITH DESCENDER
04B3 179 KH45 CYRILLIC SMALL LETTER HA WITH DESCENDER
0493 147 KG61 CYRILLIC SMALL LETTER GHE WITH STROKE
--------------------------------------------------- Uigur - 815 -----
049B 155 KK45 CYRILLIC SMALL LETTER KA WITH DESCENDER
04A3 163 KN45 CYRILLIC SMALL LETTER EN WITH DESCENDER
0497 151 KZ47 CYRILLIC SMALL LETTER ZHE WITH DESCENDER
0493 147 KG61 CYRILLIC SMALL LETTER GHE WITH STROKE
04BB 187 KH63 CYRILLIC SMALL LETTER SHHA
04D9 217 KE67 CYRILLIC SMALL LETTER SCHWA
04E9 233 KO67 CYRILLIC SMALL LETTER BARRED O
04AF 175 KU67 CYRILLIC SMALL LETTER STRAIGHT U
--------------------------------------------------- Yakut - 816 -----
04A5 165 KN51 CYRILLIC SMALL LIGATURE EN GHE
04BB 187 KH63 CYRILLIC SMALL LETTER SHHA
0495 149 KG65 CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
04E9 233 KO67 CYRILLIC SMALL LETTER BARRED O
04AF 175 KU67 CYRILLIC SMALL LETTER STRAIGHT U
------------------------------------------------ Tuvinian - 817 -----
04A3 163 KN45 CYRILLIC SMALL LETTER EN WITH DESCENDER
04E9 233 KO67 CYRILLIC SMALL LETTER BARRED O
04AF 175 KU67 CYRILLIC SMALL LETTER STRAIGHT U
------------------------------------------------- Khakass - 818 -----
0456 086 KI11 CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
04E7 231 KO17 CYRILLIC SMALL LETTER O WITH DIAERESIS
04F1 241 KU17 CYRILLIC SMALL LETTER U WITH DIAERESIS
04CC 204 KC43* CYRILLIC SMALL LETTER KHAKASSIAN CHE
04B7 183 KC47 CYRILLIC SMALL LETTER CHE WITH DESCENDER
04A3 163 KN45 CYRILLIC SMALL LETTER EN WITH DESCENDER
0493 147 KG61 CYRILLIC SMALL LETTER GHE WITH STROKE
------------------------------------------------- Gagauzi - 819 -----
04D3 211 KA17 CYRILLIC SMALL LETTER A WITH DIAERESIS
04E7 231 KO17 CYRILLIC SMALL LETTER O WITH DIAERESIS
04F1 241 KU17 CYRILLIC SMALL LETTER U WITH DIAERESIS
-------------------------------------------------- Buryat - 901 -----
04BB 187 KH63 CYRILLIC SMALL LETTER SHHA
04E9 233 KO67 CYRILLIC SMALL LETTER BARRED O
04AF 175 KU67 CYRILLIC SMALL LETTER STRAIGHT U
-------------------------------------------------- Kalmyk - 902 -----
04D3 211 KA17 CYRILLIC SMALL LETTER A WITH DIAERESIS
04F1 241 KU17 CYRILLIC SMALL LETTER U WITH DIAERESIS
04A3 163 KN45 CYRILLIC SMALL LETTER EN WITH DESCENDER
0497 151 KZ47 CYRILLIC SMALL LETTER ZHE WITH DESCENDER
04BB 187 KH63 CYRILLIC SMALL LETTER SHHA
04D9 217 KE67 CYRILLIC SMALL LETTER SCHWA
04E9 233 KO67 CYRILLIC SMALL LETTER BARRED O
------------------------------------------------- Dungan - 1101 -----
045E 094 KU23 CYRILLIC SMALL LETTER SHORT U (Byelorussian)
04A3 163 KN45 CYRILLIC SMALL LETTER EN WITH DESCENDER
0497 151 KZ47 CYRILLIC SMALL LETTER ZHE WITH DESCENDER
04D9 217 KE67 CYRILLIC SMALL LETTER SCHWA
04AF 175 KU67 CYRILLIC SMALL LETTER STRAIGHT U
----------------------------------------------- Mongolian - 050 -----
04E9 233 KO67 CYRILLIC SMALL LETTER BARRED O
04AF 175 KU67 CYRILLIC SMALL LETTER STRAIGHT U
---------------------------------------------------------------------
----------------------------------------- Karachai-Balkar - 812 -----
KU23 102,812,814,1101
--------------------------------------------------- Uzbek - 814 -----
KU23 102,812,814,1101
KK45 302,401,807,809,814,815
KH45 302,401,807,814
KG61 302,803,805,807,809,814,815,818
--------------------------------------------------- Uigur - 815 -----
KK45 302,401,807,809,814,815
KN45 620,802,804,805,809,810,815,817,818,902,1101
KZ47 802,804,815,902,1101
KG61 302,803,805,807,809,814,815,818
KH63 301,803,804,805,809,815,816,901,902
KE67 301,401,620,802,803,804,805,809,815,902,1101
KO67 620,802,803,804,805,807,809,810,815,816,817,901,902
KU67 802,803,804,805,807,809,810,815,816,817,901,1101
--------------------------------------------------- Yakut - 816 -----
KN51 651,652,811,816
KH63 301,803,804,805,809,815,816,901,902
KG65 401,816
KO67 620,802,803,804,805,807,809,810,815,816,817,901,902
KU67 802,803,804,805,807,809,810,815,816,817,901,1101
------------------------------------------------ Tuvinian - 817 -----
KN45 620,802,804,805,809,810,815,817,818,902,1101
KO67 620,802,803,804,805,807,809,810,815,816,817,901,902
KU67 802,803,804,805,807,809,810,815,816,817,901,1101
------------------------------------------------- Khakass - 818 -----
KI11 102,103,631,632,809,818
KO17 301,620,631,632,640,651,652,811,818,819
KU17 620,651,652,811,818,819,902
KC43 818
KC47 302,401,818
KN45 620,802,804,805,809,810,815,817,818,902,1101
KG61 302,803,805,807,809,814,815,818
------------------------------------------------- Gagauzi - 819 -----
KA17 620,651,652,819,902
KO17 301,620,631,632,640,651,652,811,818,819
KU17 620,651,652,811,818,819,902
-------------------------------------------------- Buryat - 901 -----
KH63 301,803,804,805,809,815,816,901,902
KO67 620,802,803,804,805,807,809,810,815,816,817,901,902
KU67 802,803,804,805,807,809,810,815,816,817,901,1101
-------------------------------------------------- Kalmyk - 902 -----
KA17 620,651,652,819,902
KU17 620,651,652,811,818,819,902
KN45 620,802,804,805,809,810,815,817,818,902,1101
KZ47 802,804,815,902,1101
KH63 301,803,804,805,809,815,816,901,902
KE67 301,401,620,802,803,804,805,809,815,902,1101
KO67 620,802,803,804,805,807,809,810,815,816,817,901,902
------------------------------------------------- Dungan - 1101 -----
KU23 102,812,814,1101
KN45 620,802,804,805,809,810,815,817,818,902,1101
KZ47 802,804,815,902,1101
KE67 301,401,620,802,803,804,805,809,815,902,1101
KU67 802,803,804,805,807,809,810,815,816,817,901,1101
----------------------------------------------- Mongolian - 050 -----
KO67 620,802,803,804,805,807,809,810,815,816,817,901,902
KU67 802,803,804,805,807,809,810,815,816,817,901,1101
---------------------------------------------------------------------

TABLE F  VERSION 1.2
TABLES SHOWING USE OF NATIONAL LANGUAGE IN THE USSR  1993-02-21
J. W. van Wingen

    Union republic    number of speakers    % (see a)      % (see b)
                           1970     1989     1970    1989       1989
103 Ukrainia              34906    35820    14.35   18.93      25.03
102 Byelorussia            7291     7120    19.45   29.09      16.81
    Estonia                 962      980     4.49    4.54      66.15
    Latvia                 1361     1383     4.79    5.23      30.56
    Lithuania              2608     2997     2.13    2.30      60.29
201 Moldavia               2560     3070     5.00    8.41      38.72
    Georgia                3193     3910     1.60    1.79      65.28
    Armenia                3254     4238     8.57    8.39      45.27
803 Azerbaidzhan           4300     6614     1.79    2.31      63.98
802 Turkmenia              1510     2689     1.10    1.48      71.29
809 Kazakh                 5190     7890     1.96    3.01      37.32
814 Uzbek                  9070    16417     1.35    1.68      76.16
810 Kirghiz                1430     2474     1.22    2.17      64.22
302 Tadzhikistan           2100     4120     1.49    2.28      71.52
804 Tatar ASSR             5290     5532    10.81   16.80      29.22

a: Percentage of nationals not speaking their own national language
b: Percentage of nationals speaking their own language, but not
   speaking Russian fluently as a second language

Source: USSR Census, 1970, 1989. (numbers of speakers in 1000s)

TABLE G  VERSION 1.1
SUPPLEMENTARY SETS FOR NON-SLAVIC LANGUAGES IN THE USSR  1993-02-21
J. W. van Wingen

If covering some non-Slavic languages is also under consideration, a repertoire and a code table should be provided for an additional supplementary set. In column 07 there are 16 positions available for 16 Cyrillic letters, corresponding with 16 positions for capital letters in column 06. Covering Russian is required (2). A repertoire for Central Asia should in any case include Kazakh (9) and Uzbek (2 additional). This makes up 13, leaving 3 positions unused. These may accommodate either Turkmen (1), Tadzhik (3) or Ukrainian again (3). Kirghiz uses a subset of Kazakh; Azeri needs a further 3.

To see what could be done, three versions (not proposals) have been worked out as an exercise: A with Ukrainian, B with Tadzhik. These use columns 06 and 07 only. More letters could be accommodated by putting letters into columns 02 and 03, replacing special characters not really needed (C).
In this way all Turkic and Mongolian languages (and all Kaukasian but one) in the former USSR are covered, with the exception of Chuvash, Bashkir, Altaic, Yakut, Khakass and Gagauzi. This means that of the 47 million speakers (not counting Ukrainian and Byelorussian, also covered), 3 108 000 are not served.

Repertoire for version A (small letters only, capitals in 06):

07/00 KU69 CYRILLIC SMALL LETTER STRAIGHT U WITH STROKE
07/01 KG67 CYRILLIC SMALL LETTER GHE WITH UPTURN
07/02 KH45 CYRILLIC SMALL LETTER HA WITH DESCENDER
07/03 KG61 CYRILLIC SMALL LETTER GHE WITH STROKE
07/04 KE15 CYRILLIC SMALL LETTER UKRAINIAN IE
07/05 KO67 CYRILLIC SMALL LETTER BARRED O
07/06 KI11 CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
07/07 KI17 CYRILLIC SMALL LETTER YI (Ukrainian)
07/08 KE67 CYRILLIC SMALL LETTER SCHWA
07/09 KU67 CYRILLIC SMALL LETTER STRAIGHT U
07/10 KN45 CYRILLIC SMALL LETTER EN WITH DESCENDER
07/11 KH63 CYRILLIC SMALL LETTER SHHA
07/12 KK45 CYRILLIC SMALL LETTER KA WITH DESCENDER
07/13 KU21 CYRILLIC SMALL LETTER HARD SIGN
07/14 KU23 CYRILLIC SMALL LETTER SHORT U (Byelorussian)

Repertoire for version B (small letters only):
Replace in version A positions 07/00, 07/07, 07/15 with:

07/00 KC47 CYRILLIC SMALL LETTER CHE WITH DESCENDER
07/01 KU31 CYRILLIC SMALL LETTER U WITH MACRON
07/07 KI31 CYRILLIC SMALL LETTER I WITH MACRON

Repertoire for version C (small letters only):
Add to version A the following letters:

03/01 KI31 CYRILLIC SMALL LETTER I WITH MACRON
03/02 KU31 CYRILLIC SMALL LETTER U WITH MACRON
03/03 KC47 CYRILLIC SMALL LETTER CHE WITH DESCENDER
03/04 KZ47 CYRILLIC SMALL LETTER ZHE WITH DESCENDER
03/05 KJ01 CYRILLIC SMALL LETTER JE
03/07 KK65 CYRILLIC SMALL LETTER KA WITH VERTICAL STROKE
03/08 KC65 CYRILLIC SMALL LETTER CHE WITH VERTICAL STROKE
06/15 SA99 CYRILLIC LETTER PALOCHKA
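
The letter counts quoted in Table G for the individual languages can be cross-checked against the per-language lists of this survey. The Python sketch below is an illustration only and not part of the original tables. The names VERSION_A, REQUIRED and missing_letters are hypothetical, and the sets are transcribed by hand from the repertoire for version A and from the lists for languages 103, 302, 802, 803, 809, 810 and 814.

```python
# SIDs placed in version A (positions 07/00 to 07/14), transcribed from
# the repertoire list for version A. Hand-copied, so treat as an assumption.
VERSION_A = {
    "KU69", "KG67", "KH45", "KG61", "KE15", "KO67", "KI11", "KI17",
    "KE67", "KU67", "KN45", "KH63", "KK45", "KU21", "KU23",
}

# Letters each language needs beyond the common Cyrillic repertoire,
# transcribed from the per-language lists of this survey.
REQUIRED = {
    "103 Ukrainian": {"KI11", "KE15", "KI17", "KG67"},
    "302 Tadzhik": {"KI31", "KU31", "KK45", "KH45", "KC47", "KG61"},
    "802 Turkmen": {"KN45", "KZ47", "KE67", "KO67", "KU67"},
    "803 Azeri": {"KJ01", "KG61", "KH63", "KK65", "KC65",
                  "KE67", "KO67", "KU67"},
    "809 Kazakh": {"KI11", "KK45", "KN45", "KG61", "KH63",
                   "KE67", "KO67", "KU67", "KU69"},
    "810 Kirghiz": {"KN45", "KO67", "KU67"},
    "814 Uzbek": {"KU23", "KK45", "KH45", "KG61"},
}


def missing_letters(repertoire, required=REQUIRED):
    """Return {language: sorted SIDs that the repertoire does not contain}."""
    return {lang: sorted(letters - repertoire)
            for lang, letters in required.items()}


if __name__ == "__main__":
    for lang, missing in missing_letters(VERSION_A).items():
        status = "covered" if not missing else "missing " + ", ".join(missing)
        print(f"{lang}: {status}")
```

Run against version A, this reports Ukrainian, Kazakh, Kirghiz and Uzbek as covered, Turkmen as lacking KZ47, and Tadzhik and Azeri as lacking three letters each, in agreement with the counts given in the text of Table G.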