<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE supplementalData SYSTEM "../../common/dtd/ldmlSupplemental.dtd">
<!--
Copyright © 1991-2015 Unicode, Inc.
CLDR data files are interpreted according to the LDML specification (http://unicode.org/reports/tr35/)
For terms of use, see http://www.unicode.org/copyright.html
-->
<supplementalData>
	<version number="$Revision: 12347 $"/>
	<transforms>
		<transform source="sat_Olck" target="sat_FONIPA" direction="forward" alias="sat-fonipa-t-sat-olck">
			<tRule><![CDATA[
# Santali (Ol Chiki) → Santali (International Phonetic Alphabet)


# Output
# ------
# m mː n nː ɳ ɳː ɲ ɲː ŋ ŋː
# p pʰ pʼ b bʰ t tʰ tʼ d dʰ ʈ ʈʰ ɖ ɖʰ c cʰ cʼ k kʰ kʼ ɡ ʔ
# s sː h
# d͡ʒ
# ɽ r
# l lː
# w wː w̃ w̃ː
#
# i iː ĩ ĩː u uː ũ ũː
# e eː ẽ ẽː ə əː ə̃ ə̃ː o oː õ õː
# ɛ ɛː ɛ̃ ɛ̃ː ɔ ɔː ɔ̃ ɔ̃ː
# a aː ã ãː


# References
# ----------
# [1] Michael Everson: Final proposal to encode the Ol Chiki script
# in the UCS. ISO/IEC JTC1/SC2/WG2 Working Group Document N2984R,
# September 21, 2005. http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2984.pdf
#
# [2] George L. Campbell: Compendium of the World's Languages.
# Volume 2: Ladakhi to Zuni. ISBN 0-415-20297-3. Taylor & Francis, 2000.
# Pages 1454 to 1458.


# Notes
# -----
# According to [1] (page 3), ᱽ can only follow the four ejective
# consonants ᱵ /pʼ/, ᱡ /cʼ/, ᱫ /tʼ/, and ᱜ /kʼ/; these become
# ᱵᱽ /b/, ᱫᱽ /d/, ᱡᱽ /d͡ʒ/, and ᱜᱽ /ɡ/. In online texts, however,
# we have occasionally encountered ᱽ following non-ejective plosives,
# for example after ᱯ /p/. These might possibly be typos. Our rules
# try to be resilient and handle ᱯᱽ as /b/.
#
# According to [1] (page 2), U+1C7C PHAARKAA follows the four “glottal”
# consonants ᱵ /pʼ/, ᱡ /cʼ/, ᱫ /tʼ/, and ᱜ /kʼ/ (these are actually
# ejective, not glottal). In online texts, however, we have frequently
# encountered ᱼ following non-ejective consonants.

$inword = [[:L:][:M:]];

# Some online texts use a decomposed form of U+1C7A MU-GAAHLAA TTUDDAG.
ᱹᱸ → ᱺ ;
ᱸᱹ → ᱺ ;
::null();

# To simplify the rules below, enforce a uniform ordering of marks.
ᱻᱹ → ᱹᱻ ;
ᱻᱸ → ᱸᱻ ;
ᱻᱺ → ᱺᱻ ;
ᱼᱹ → ᱹᱼ ;
ᱼᱸ → ᱸᱼ ;
ᱼᱺ → ᱺᱼ ;
::null();

# Some online texts use U+1C7C PHAARKAA instead of U+1C7B RELAA for indicating
# long phonemes, presumably because the graphemes look similar in some fonts.
# Since phaarkaa is used for voicing ejectives and plosives (which cannot
# be lengthened), we rewrite phaarkaa to relaa.
[ᱚᱟᱤᱩᱮᱳᱶᱢᱝᱞᱱ] [ᱹᱸᱺ]* {ᱼ} → ᱻ ;
::null();

ᱚᱹᱻ → ɔː ;
ᱚᱹ → ɔ ;
ᱚᱸᱻ → ɔ̃ː ;
ᱚᱸ → ɔ̃ ;
ᱚᱺᱻ → ɔ̃ː ;
ᱚᱺ → ɔ̃ ;
ᱚᱻ → ɔː ;
ᱚ → ɔ ;

ᱛᱼ → t ;
ᱛᱷ → tʰ ;
ᱛᱽ → d ;
$inword {ᱛ} → d ;
ᱛ → t ;

ᱜᱼ → kʼ ;
ᱜᱷ → kʰ ;
ᱜᱽ → ɡ ;
$inword {ᱜ} → ɡ ;
ᱜ → kʼ ;

ᱝᱻ → ŋː ;
ᱝ → ŋ ;

ᱞᱻ → lː ;
ᱞ → l ;

ᱟᱹᱻ → əː ;
ᱟᱹ → ə ;
ᱟᱸᱻ → ãː ;
ᱟᱸ → ã ;
ᱟᱺᱻ → ə̃ː ;
ᱟᱺ → ə̃ ;
ᱟᱻ → aː ;
ᱟ → a ;

ᱠᱼ → k ;
ᱠᱷ → kʰ ;
ᱠᱽ → ɡ ;
ᱠ → k ;

ᱡᱼ → cʼ ;
ᱡᱷ → cʰ ;
ᱡᱽ → d͡ʒ ;
$inword {ᱡ} → d͡ʒ ;
ᱡ → cʼ ;

ᱢᱻ → mː ;
ᱢ → m ;

# According to [1], ᱣ is sometimes /v/ and sometimes /w/.
# TODO: Find out if there is a rule for this.
ᱣᱸ → w̃ ;
ᱣ → w ;

ᱤᱹᱻ → iː ;
ᱤᱹ → i ;
ᱤᱸᱻ → ĩː ;
ᱤᱸ → ĩ ;
ᱤᱺᱻ → ĩː ;
ᱤᱺ → ĩ ;
ᱤᱻ → iː ;
ᱤ → i ;

ᱥᱻ → sː ;
ᱥ → s ;

# According to [1], ᱦ is sometimes /h/ and sometimes /ʔ/.
# TODO: Find out if there is a rule for this.
ᱦ → h ;

ᱧᱻ → ɲː ;
ᱧ → ɲ ;

ᱨᱻ → r ;
ᱨ → r ;

ᱩᱹᱻ → uː ;
ᱩᱹ → u ;
ᱩᱸᱻ → ũː ;
ᱩᱸ → ũ ;
ᱩᱺᱻ → ũː ;
ᱩᱺ → ũ ;
ᱩᱻ → uː ;
ᱩ → u ;

ᱪᱼ → c ;
ᱪᱷ → cʰ ;
ᱪᱽ → d͡ʒ ;
ᱪ → c ;

ᱫᱼ → tʼ ;
ᱫᱷ → tʰ ;
ᱫᱽ → d ;
$inword {ᱫ} → d ;
ᱫ → tʼ ;

ᱬᱻ → ɳː ;
ᱬ → ɳ ;

# TODO: ᱵᱷᱭᱨᱚᱵ → bʰhrɔb seems unlikely; would be good to verify.
ᱭ → h ;

ᱮᱹᱻ → ɛː ;
ᱮᱹ → ɛ ;
ᱮᱺᱻ → ɛ̃ː ;
ᱮᱺ → ɛ̃ ;
ᱮᱸᱻ → ẽː ;
ᱮᱸ → ẽ ;
ᱮᱻ → eː ;
ᱮ → e ;

ᱯᱼ → p ;
ᱯᱷ → pʰ ;
ᱯᱽ → b ;
ᱯ → p ;

ᱰᱷ → ɖʰ ;
ᱰ → ɖ ;

ᱱᱻ → nː ;
ᱱ → n ;

ᱲᱻ → ɽ ;
ᱲ → ɽ ;

ᱳᱸᱻ → õː ;
ᱳᱸ → õ ;
ᱳᱻ → oː ;
ᱳ → o ;

ᱴᱼ → ʈ ;
ᱴᱷ → ʈʰ ;
ᱴᱽ → ɖ ;
ᱴ → ʈ ;

ᱵᱼ → pʼ ;
ᱵᱷ → bʰ ;
ᱵᱽ → b ;
$inword {ᱵ} → b ;
ᱵ → pʼ ;

ᱶᱻ → w̃ː ;
ᱶ → w̃ ;

			]]></tRule>
		</transform>
	</transforms>
</supplementalData>