Orthographic Transparency
Orthographic transparency measures how predictable the relationship is between spelling and pronunciation. Linguists use two standard metrics: feedforward consistency (can you predict the pronunciation from the spelling?) and feedback consistency (can you predict the spelling from the pronunciation?). English scores poorly on both. Ingglish was designed to score perfectly on feedforward and near-perfectly on feedback.
The Standard Metrics
These metrics follow the framework established by Ziegler, Stone & Jacobs (1997) and refined by Siegelman & Kearns (2019).
Feedforward Consistency (Reading: Spelling -> Sound)
Given a grapheme, how many possible pronunciations does it have?
- Consistency ratio = frequency(dominant pronunciation) / frequency(all pronunciations)
- A ratio of 1.0 means the grapheme always makes the same sound
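The ratio can be sketched in a few lines of TypeScript. The function name `consistencyRatio` and the counts below are invented for illustration; they are not measured data and not part of any Ingglish tooling.

```typescript
// Counts of how often one grapheme is pronounced each way.
type PronunciationCounts = Record<string, number>;

// Consistency ratio = frequency(dominant pronunciation) / frequency(all pronunciations).
function consistencyRatio(counts: PronunciationCounts): number {
  const values = Object.keys(counts).map((key) => counts[key]);
  const total = values.reduce((sum, n) => sum + n, 0);
  if (total === 0) {
    throw new Error("no observations for this grapheme");
  }
  const dominant = values.reduce((max, n) => (n > max ? n : max), 0);
  return dominant / total;
}

// Made-up counts, for illustration only.
const alwaysSame: PronunciationCounts = { K: 120 };
const split: PronunciationCounts = { IY: 70, EH: 20, EY: 10 };

console.log(consistencyRatio(alwaysSame)); // 1
console.log(consistencyRatio(split)); // 0.7
```

A system-level score would then be some average of these per-grapheme ratios, typically weighted by how often each grapheme occurs.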
| System | Feedforward Consistency | Notes |
|---|---|---|
| Finnish | ~1.00 | Nearly perfect 1:1 grapheme-phoneme mapping |
| Italian | ~0.98 | Few exceptions (e.g., "c" before e/i) |
| German | ~0.90 | Mostly regular with some context rules |
| French | ~0.85 | Complex but rule-governed (nasal vowels, silent endings) |
| Ingglish | 1.00 | Every grapheme always makes the same sound |
| English | ~0.70 | Highly inconsistent ("ough" has 6+ pronunciations) |
Ingglish achieves perfect feedforward consistency by design: each of its 39 graphemes maps to exactly one phoneme. There are no exceptions, no context rules, and no silent letters.
Feedback Consistency (Spelling: Sound -> Spelling)
Given a phoneme, how many possible spellings does it have?
| System | Feedback Consistency | Notes |
|---|---|---|
| Finnish | ~0.99 | Nearly perfect in both directions |
| Italian | ~0.90 | Some phonemes have multiple spellings |
| German | ~0.75 | Several phonemes can be spelled multiple ways |
| Ingglish | 0.92 | 3 minor ambiguities (see below) |
| French | ~0.55 | Many phonemes have multiple spellings (/o/ = o, au, eau, ...) |
| English | ~0.50 | Extremely inconsistent (/iː/ = ee, ea, e, ie, ei, ey, ...) |
Ingglish Grapheme Inventory
Ingglish uses 39 graphemes (15 vowels + 24 consonants) built from the 26 standard Latin letters. No diacritics, no new symbols. C, Q, and X are retired. See Phoneme Mapping for the full table.
The Three Feedback Ambiguities
Ingglish has exactly three graphemes that can represent more than one phoneme in the reverse direction:
1. "a" -> AE or AH (schwa)
The grapheme "a" represents both the TRAP vowel (/æ/, as in "cat") and the unstressed schwa (/ə/, as in "about"). This follows from a stress-conditioned split of AH: stressed AH maps to "u" (cut, but), while unstressed AH0 maps to "a" (about, again). Since AE also maps to "a", the reverse direction is ambiguous.
This is the most linguistically defensible compromise. Schwa is the most common English vowel and fills most unstressed syllables. Which vowel is meant is largely predictable from context: in an unstressed syllable "a" is schwa; in a stressed syllable it is TRAP.
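The stress rule can be sketched as follows, assuming CMUdict-style ARPABET vowels that end in a stress digit (0 = unstressed, 1 = primary, 2 = secondary). Both `spellVowel` and `readA` are hypothetical helpers. Treating secondary stress (AH2) like primary stress is an assumption of this sketch, not something stated above.

```typescript
// Hypothetical helper: spell an ARPABET vowel that carries a stress digit.
// Covers only the two vowels discussed in this section.
function spellVowel(arpabet: string): string {
  const stressDigit = arpabet.charAt(arpabet.length - 1);
  const base = arpabet.slice(0, -1);
  if (base === "AE") {
    return "a"; // TRAP vowel, as in "cat"
  }
  if (base === "AH") {
    // Unstressed AH0 (schwa) shares "a"; stressed AH gets "u".
    // Assumption: AH2 is treated as stressed.
    return stressDigit === "0" ? "a" : "u";
  }
  throw new Error("vowel not covered by this sketch: " + arpabet);
}

// Reverse direction: the heuristic for recovering the phoneme behind "a".
function readA(syllableIsStressed: boolean): "AE" | "AH0" {
  return syllableIsStressed ? "AE" : "AH0";
}

console.log(spellVowel("AH1")); // u  (as in "cut")
console.log(spellVowel("AH0")); // a  (as in "about")
console.log(spellVowel("AE1")); // a  (as in "cat")
console.log(readA(false)); // AH0
```

The reverse lookup needs information that the spelling alone does not carry (syllable stress), which is why this case counts as an ambiguity.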
2. "er" -> ER or EH+R
The grapheme sequence "er" could be the single r-colored vowel ER (/ɝ/, as in "bird") or the sequence EH+R (/ɛr/, as in the rare "welfare" split). In practice this ambiguity is negligible: the ER interpretation is correct in almost all cases.
3. "sh" -> SH or S+HH
The digraph "sh" could be the fricative SH (/ʃ/, as in "ship") or the sequence S+HH (/s/ + /h/, as in "mishap" if parsed morphologically). This ambiguity is extremely rare in practice.
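The "er" and "sh" cases are segmentation ambiguities: the same letters can be cut into graphemes in two ways. The sketch below uses greedy longest-match segmentation, one plausible reading strategy and not necessarily what Ingglish tooling does. `GRAPHEMES` is a hypothetical subset of the inventory and `segment` is an illustrative name.

```typescript
// Hypothetical subset of the grapheme inventory (the full inventory has 39 entries).
const GRAPHEMES: string[] = ["sh", "er", "s", "h", "m", "i", "a", "p", "e", "r"];

// Greedy longest-match segmentation: at each position, take the longest
// grapheme that matches the upcoming letters.
function segment(word: string): string[] {
  const byLength = GRAPHEMES.slice().sort((x, y) => y.length - x.length);
  const out: string[] = [];
  let pos = 0;
  while (pos < word.length) {
    let matched = "";
    for (const g of byLength) {
      if (word.slice(pos, pos + g.length) === g) {
        matched = g;
        break;
      }
    }
    if (matched === "") {
      throw new Error("no grapheme matches at position " + pos);
    }
    out.push(matched);
    pos += matched.length;
  }
  return out;
}

console.log(segment("ship").join(" ")); // sh i p
console.log(segment("mishap").join(" ")); // m i sh a p
```

A greedy reader always picks the digraph, so "mishap" comes out with SH where S+HH was intended. Because such sequences are rare, the mistake is rare too.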
How English Compares
For context, English has over 1,100 grapheme-phoneme correspondences for its ~40 phonemes. Some examples of English's feedback inconsistency:
| Phoneme | English spellings | Count |
|---|---|---|
| /iː/ | ee, ea, e, ie, ei, ey, e_e, i, eo, ae, oe, ... | 11+ |
| /ʃ/ | sh, ti, ci, si, ssi, ch, s, ce, sci, xi | 10+ |
| /k/ | c, k, ck, ch, cc, que, q, x (in "fox") | 8+ |
| /uː/ | oo, u, ue, ew, ou, o, ui, u_e, ough, wo | 10+ |
Ingglish reduces each of these to exactly one spelling.
Entropy Analysis
Shannon entropy quantifies the uncertainty in a mapping. An entropy of 0 means no uncertainty (perfectly predictable); higher values mean more ambiguity.
| Direction | Ingglish Entropy | English Entropy |
|---|---|---|
| Feedforward (reading) | 0.00 bits | ~1.5-2.5 bits per grapheme |
| Feedback (spelling) | ~0.05 bits (3 minor ambiguities) | ~2.0-3.0 bits per phoneme |
The near-zero feedback entropy in Ingglish means that knowing the pronunciation almost completely determines the spelling, with only the "a"/AE-vs-AH ambiguity contributing meaningful uncertainty.
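For reference, the entropy of one mapping (for example, the distribution of phonemes behind a single grapheme) can be computed as below. This is a minimal sketch: `entropyBits` is an illustrative name, and the inputs are toy distributions, not the counts behind the table above.

```typescript
// Shannon entropy, in bits, of a discrete distribution given as raw counts:
// H = -sum(p * log2(p)) over outcomes with p > 0.
function entropyBits(counts: number[]): number {
  const total = counts.reduce((sum, n) => sum + n, 0);
  if (total <= 0) {
    throw new Error("counts must sum to a positive number");
  }
  let h = 0;
  for (const n of counts) {
    if (n > 0) {
      const p = n / total;
      h -= p * (Math.log(p) / Math.LN2); // log base 2 via natural log
    }
  }
  return h;
}

// Toy distributions, for illustration only.
console.log(entropyBits([100])); // one outcome: no uncertainty
console.log(entropyBits([50, 50])); // two equally likely outcomes: about 1 bit
console.log(entropyBits([95, 5])); // one dominant outcome: well under 1 bit
```

A per-system figure like those in the table would average these per-grapheme (or per-phoneme) entropies, usually weighted by frequency.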
Comparison with Other Spelling Reforms
| System | Feedforward | Feedback | Script | Notes |
|---|---|---|---|---|
| Ingglish | 1.00 | 0.92 | Latin (26 letters) | Digraphs for extra sounds |
| Shavian | 1.00 | 1.00 | New (48 letters) | Perfect but requires learning new alphabet |
| Deseret | 1.00 | 1.00 | New (38 letters) | Perfect but requires learning new alphabet |
| IPA | 1.00 | 1.00 | Extended Latin + new symbols | Perfect but not designed for everyday use |
| SoundSpel | ~0.95 | ~0.85 | Latin | Some remaining ambiguities |
| Cut Spelling | ~0.80 | ~0.75 | Latin | Removes letters but keeps irregularities |
| Traditional English | ~0.70 | ~0.50 | Latin | The baseline |
Among the systems compared here, Ingglish is the only one restricted to the 26 standard Latin letters that achieves perfect feedforward consistency. The three feedback ambiguities are the cost of representing 39 phonemes with 26 standard letters.
Methodology
All metrics are computed over the CMU Pronouncing Dictionary (126,000+ unique words) using the SUBTLEX-US corpus for frequency weighting. Feedforward consistency is verified automatically: the translateSync() function is deterministic and produces the same output for any given phoneme sequence. Feedback consistency is measured by counting reverse-direction ambiguities in the INGGLISH_TO_ARPABET_MAP.
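Counting reverse-direction ambiguities might look like the sketch below. The table `ARPABET_TO_GRAPHEME` is a hypothetical fragment inferred from the examples in this document, with `AH1` standing in for stressed AH. The real INGGLISH_TO_ARPABET_MAP is not reproduced here, and the helper names are invented.

```typescript
// Hypothetical fragment of a phoneme -> grapheme table (not the real map).
const ARPABET_TO_GRAPHEME: Record<string, string> = {
  AE: "a", AH0: "a", AH1: "u",
  ER: "er", EH: "e", R: "r",
  SH: "sh", S: "s", HH: "h",
};

// Graphemes produced by more than one phoneme (many-to-one collisions).
function collisions(map: Record<string, string>): Record<string, string[]> {
  const byGrapheme: Record<string, string[]> = {};
  Object.keys(map).forEach((phoneme) => {
    const g = map[phoneme];
    if (!byGrapheme[g]) {
      byGrapheme[g] = [];
    }
    byGrapheme[g].push(phoneme);
  });
  const result: Record<string, string[]> = {};
  Object.keys(byGrapheme).forEach((g) => {
    if (byGrapheme[g].length > 1) {
      result[g] = byGrapheme[g];
    }
  });
  return result;
}

// Multi-letter graphemes that can also be read as two shorter graphemes in a row.
function splittable(map: Record<string, string>): string[] {
  const graphemes = Object.keys(map).map((phoneme) => map[phoneme]);
  return graphemes.filter((g, i) => {
    if (graphemes.indexOf(g) !== i) {
      return false; // skip duplicates
    }
    for (let cut = 1; cut < g.length; cut++) {
      const head = g.slice(0, cut);
      const tail = g.slice(cut);
      if (graphemes.indexOf(head) !== -1 && graphemes.indexOf(tail) !== -1) {
        return true;
      }
    }
    return false;
  });
}

console.log(collisions(ARPABET_TO_GRAPHEME)); // { a: [ 'AE', 'AH0' ] }
console.log(splittable(ARPABET_TO_GRAPHEME)); // [ 'er', 'sh' ]
```

On this fragment the two checks together find one collision ("a") and two splittable sequences ("er", "sh"), matching the three ambiguities described earlier. Running the same checks over the full map would be needed to confirm the count.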