<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE supplementalData SYSTEM "../../common/dtd/ldmlSupplemental.dtd">
<!--
Copyright © 1991-2015 Unicode, Inc.
CLDR data files are interpreted according to the LDML specification (http://unicode.org/reports/tr35/)
For terms of use, see http://www.unicode.org/copyright.html
-->
<supplementalData>
	<version number="$Revision: 12347 $"/>
	<transforms>
		<transform source="sat_Olck" target="sat_FONIPA" direction="forward" alias="sat-fonipa-t-sat-olck">
			<tRule><![CDATA[
# Santali (Ol Chiki) → Santali (International Phonetic Alphabet)


# Output
# ------
# m mː n nː ɳ ɳː ɲ ɲː ŋ ŋː
# p pʰ pʼ b bʰ t tʰ tʼ d dʰ ʈ ʈʰ ɖ ɖʰ c cʰ cʼ k kʰ kʼ ɡ ʔ
# s sː h
# d͡ʒ
# ɽ r
# l lː
# w wː w̃ w̃ː
#
# i iː ĩ ĩː u uː ũ ũː
# e eː ẽ ẽː ə əː ə̃ ə̃ː o oː õ õː
# ɛ ɛː ɛ̃ ɛ̃ː ɔ ɔː ɔ̃ ɔ̃ː
# a aː ã ãː


# References
# ----------
# [1] Michael Everson: Final proposal to encode the Ol Chiki script
#     in the UCS.  ISO/IEC JTC1/SC2/WG2 Working Group Document N2984R,
#     September 21, 2005.  http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2984.pdf
#
# [2] George L. Campbell: Compendium of the World's Languages.
#     Volume 2: Ladakhi to Zuni. ISBN 0-415-20297-3.  Taylor & Francis, 2000.
#     Pages 1454 to 1458.


# Notes
# -----
# According to [1] (page 3), U+1C7D AHAD ᱽ can only follow the four
# ejective consonants ᱵ /pʼ/, ᱡ /cʼ/, ᱫ /tʼ/, and ᱜ /kʼ/; these become
# ᱵᱽ /b/, ᱡᱽ /d͡ʒ/, ᱫᱽ /d/, and ᱜᱽ /ɡ/.  In online texts, however,
# we have occasionally encountered ᱽ following non-ejective plosives,
# for example after ᱯ /p/.  These may be typos.  Our rules try to be
# resilient and handle ᱯᱽ as /b/.
#
# According to [1] (page 2), U+1C7C PHAARKAA ᱼ follows the four “glottal”
# consonants ᱵ /pʼ/, ᱡ /cʼ/, ᱫ /tʼ/, and ᱜ /kʼ/ (these are actually
# ejective, not glottal).  In online texts, however, we have frequently
# encountered ᱼ following non-ejective consonants.

$inword = [[:L:][:M:]];

# Some online texts use a decomposed form of U+1C7A MU-GAAHLAA TTUDDAG.
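# Worked example (hand-traced from the rules in this file, not taken from
# [1] or [2]): ᱚᱹᱸ and ᱚᱸᱹ both become ᱚᱺ in this pass, which the vowel
# rules further down map to ɔ̃.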
ᱹᱸ → ᱺ ;
ᱸᱹ → ᱺ ;
::null();

# To simplify the rules below, enforce a uniform ordering of marks.
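# Worked example (hand-traced from the rules in this file): ᱟᱻᱹ is
# reordered to ᱟᱹᱻ, so the single rule ᱟᱹᱻ → əː further down covers both
# spellings.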
ᱻᱹ → ᱹᱻ ;
ᱻᱸ → ᱸᱻ ;
ᱻᱺ → ᱺᱻ ;
ᱼᱹ → ᱹᱼ ;
ᱼᱸ → ᱸᱼ ;
ᱼᱺ → ᱺᱼ ;
::null();

# Some online texts use U+1C7C PHAARKAA instead of U+1C7B RELAA for indicating
# long phonemes, presumably because the graphemes look similar in some fonts.
# Since phaarkaa is used for voicing ejectives and plosives (which cannot
# be lengthened), we rewrite phaarkaa to relaa after vowels and sonorants.
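# Worked example (hand-traced from the rules in this file): in ᱟᱼ the
# phaarkaa follows a vowel, so ᱟᱼ becomes ᱟᱻ, which is later mapped to aː.
# In ᱜᱼ the phaarkaa follows a plosive and is kept; ᱜᱼ is later mapped to kʼ.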
[ᱚᱟᱤᱩᱮᱳᱶᱢᱝᱞᱱ] [ᱹᱸᱺ]* {ᱼ} → ᱻ ;
::null();

ᱚᱹᱻ → ɔː ;
ᱚᱹ → ɔ ;
ᱚᱸᱻ → ɔ̃ː ;
ᱚᱸ → ɔ̃ ;
ᱚᱺᱻ → ɔ̃ː ;
ᱚᱺ → ɔ̃ ;
ᱚᱻ → ɔː ;
ᱚ → ɔ ;

ᱛᱼ → t ;
ᱛᱷ → tʰ ;
ᱛᱽ → d ;
$inword {ᱛ} → d ;
ᱛ → t ;

ᱜᱼ → kʼ ;
ᱜᱷ → kʰ ;
ᱜᱽ → ɡ ;
$inword {ᱜ} → ɡ ;
ᱜ → kʼ ;

ᱝᱻ → ŋː ;
ᱝ → ŋ ;

ᱞᱻ → lː ;
ᱞ → l ;

ᱟᱹᱻ → əː ;
ᱟᱹ → ə ;
ᱟᱸᱻ → ãː ;
ᱟᱸ → ã ;
ᱟᱺᱻ → ə̃ː ;
ᱟᱺ → ə̃ ;
ᱟᱻ → aː ;
ᱟ → a ;

ᱠᱼ → k ;
ᱠᱷ → kʰ ;
ᱠᱽ → ɡ ;
ᱠ → k ;

ᱡᱼ → cʼ ;
ᱡᱷ → cʰ ;
ᱡᱽ → d͡ʒ ;
$inword {ᱡ} → d͡ʒ ;
ᱡ → cʼ ;

ᱢᱻ → mː ;
ᱢ → m ;

# According to [1], ᱣ is sometimes /v/ and sometimes /w/.
# TODO: Find out if there is a rule for this.
ᱣᱸ → w̃ ;
ᱣ → w ;

ᱤᱹᱻ → iː ;
ᱤᱹ → i ;
ᱤᱸᱻ → ĩː ;
ᱤᱸ → ĩ ;
ᱤᱺᱻ → ĩː ;
ᱤᱺ → ĩ ;
ᱤᱻ → iː ;
ᱤ → i ;

ᱥᱻ → sː ;
ᱥ → s ;

# According to [1], ᱦ is sometimes /h/ and sometimes /ʔ/.
# TODO: Find out if there is a rule for this.
ᱦ → h ;

ᱧᱻ → ɲː ;
ᱧ → ɲ ;

ᱨᱻ → r ;
ᱨ → r ;

ᱩᱹᱻ → uː ;
ᱩᱹ → u ;
ᱩᱸᱻ → ũː ;
ᱩᱸ → ũ ;
ᱩᱺᱻ → ũː ;
ᱩᱺ → ũ ;
ᱩᱻ → uː ;
ᱩ → u ;

ᱪᱼ → c ;
ᱪᱷ → cʰ ;
ᱪᱽ → d͡ʒ ;
ᱪ → c ;

ᱫᱼ → tʼ ;
ᱫᱷ → tʰ ;
ᱫᱽ → d ;
$inword {ᱫ} → d ;
ᱫ → tʼ ;

ᱬᱻ → ɳː ;
ᱬ → ɳ ;

# TODO: ᱵᱷᱭᱨᱚᱵ → bʰhrɔb seems unlikely; would be good to verify.
ᱭ → h ;

ᱮᱹᱻ → ɛː ;
ᱮᱹ → ɛ ;
ᱮᱺᱻ → ɛ̃ː ;
ᱮᱺ → ɛ̃ ;
ᱮᱸᱻ → ẽː ;
ᱮᱸ → ẽ ;
ᱮᱻ → eː ;
ᱮ → e ;

ᱯᱼ → p ;
ᱯᱷ → pʰ ;
ᱯᱽ → b ;
ᱯ → p ;

ᱰᱷ → ɖʰ ;
ᱰ → ɖ ;

ᱱᱻ → nː ;
ᱱ → n ;

ᱲᱻ → ɽ ;
ᱲ → ɽ ;

ᱳᱸᱻ → õː ;
ᱳᱸ → õ ;
ᱳᱻ → oː ;
ᱳ → o ;

ᱴᱼ → ʈ ;
ᱴᱷ → ʈʰ ;
ᱴᱽ → ɖ ;
ᱴ → ʈ ;

ᱵᱼ → pʼ ;
ᱵᱷ → bʰ ;
ᱵᱽ → b ;
$inword {ᱵ} → b ;
ᱵ → pʼ ;

ᱶᱻ → w̃ː ;
ᱶ → w̃ ;

			]]></tRule>
		</transform>
	</transforms>
</supplementalData>