ARPABET (also written ARPAbet) is an ASCII encoding of the phoneme set of English.
The Advanced Research Projects Agency (ARPA) created ARPABET (sometimes written ARPAbet) as part of their Speech Understanding Research initiative in the 1970s.
It uses unique ASCII letter sequences to represent phonemes and allophones in General American English.
Two systems were developed: one represented each segment with a single character (alternating upper- and lower-case letters), and the other represented each segment with one or two characters (case-insensitive).
The latter became far more widely used.
ARPABET has been used in several speech synthesizers, including Computalker for the S-100 system, SAM for the Commodore 64, SAY for the Amiga, TextAssist for the PC, and Speakeasy from Intelligent Artefacts, which used the Votrax SC-01 speech synthesiser IC.
The CMU Pronouncing Dictionary also uses it.
In the TIMIT corpus, an updated version of ARPABET is employed.
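As a rough illustration of the one-or-two-letter, case-insensitive system, the sketch below parses a single pronunciation line written in the style of the CMU Pronouncing Dictionary. It assumes the usual layout of that dictionary (a word followed by space-separated symbols, with an optional stress digit 0, 1, or 2 attached to vowels); the function name `parse_entry` and the choice to normalise to lower case are hypothetical and only for this example.

```python
import re

# Assumed token shape: letters, optionally followed by a stress digit (0-2).
_PHONE_RE = re.compile(r"^([A-Za-z]+)([0-2])?$")


def parse_entry(line):
    """Split a dictionary-style line into (word, [(symbol, stress), ...])."""
    word, *phones = line.split()
    parsed = []
    for token in phones:
        match = _PHONE_RE.match(token)
        if match is None:
            raise ValueError(f"not an ARPABET-style token: {token!r}")
        symbol, stress = match.groups()
        # The two-letter system is case-insensitive, so normalise to lower case.
        parsed.append((symbol.lower(), int(stress) if stress else None))
    return word, parsed


word, phones = parse_entry("HELLO  HH AH0 L OW1")
print(word, phones)
```

Symbols without a digit, such as consonants, come back with a stress of `None`, which keeps the stress marking separate from the symbol itself.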
The ARPABET: one of many possible short versions
ARPABET
Vowels | | Consonants | | Less Used Phones/Allophones | |
Symbol | Example | Symbol | Example | Symbol | Example |
iy | beat | b | bad | dx | butter |
ih | bit | p | pad | el | bottle |
eh | bet | d | dad | em | bottom |
ae | bat | t | toy | nx | (flapped) n |
aa | cot | g | gag | en | button |
ax | the | k | kick | eng | Washington |
ah | butt | bcl | (b closure) | ux | |
uw | boot | pcl | (p closure) | el | bottle |
uh | book | dcl | (d closure) | q | (glottal stop) |
aw | about | tcl | (t closure) | ix | roses |
er | bird | gcl | (g closure) | epi | (epenthetic closure) |
axr | diner | kcl | (k closure) | sil | silence |
ey | bait | dh | they | pau | silence |
ay | bite | th | thief | | |
oy | boy | v | very | | |
ow | boat | f | fief | | |
ao | bought | z | zoo | | |
| | s | see | | |
| | ch | church | | |
| | m | mom | | |
| | n | non | | |
| | ng | sing | | |
| | w | wet | | |
| | y | yet | | |
| | hh | hay | | |
| | r | red | | |
| | l | led | | |
| | zh | measure | | |
| | sh | shoe | | |
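
The vowel and consonant columns of the table above can be treated as a simple lookup. The following is a minimal sketch, not a complete ARPABET implementation: it copies only the vowel and plain consonant symbols listed here (closure symbols and the less used phones are left out), and the names `VOWELS`, `CONSONANTS`, and `classify` are hypothetical.

```python
# Symbol -> example word, copied from the vowel and consonant columns above.
VOWELS = {
    "iy": "beat", "ih": "bit", "eh": "bet", "ae": "bat", "aa": "cot",
    "ax": "the", "ah": "butt", "uw": "boot", "uh": "book", "aw": "about",
    "er": "bird", "axr": "diner", "ey": "bait", "ay": "bite", "oy": "boy",
    "ow": "boat", "ao": "bought",
}
CONSONANTS = {
    "b": "bad", "p": "pad", "d": "dad", "t": "toy", "g": "gag", "k": "kick",
    "dh": "they", "th": "thief", "v": "very", "f": "fief", "z": "zoo",
    "s": "see", "ch": "church", "m": "mom", "n": "non", "ng": "sing",
    "w": "wet", "y": "yet", "hh": "hay", "r": "red", "l": "led",
    "zh": "measure", "sh": "shoe",
}


def classify(transcription):
    """Label each whitespace-separated symbol as vowel, consonant, or unknown."""
    labels = []
    # Lower-casing first reflects the case-insensitive two-letter system.
    for symbol in transcription.lower().split():
        if symbol in VOWELS:
            labels.append((symbol, "vowel"))
        elif symbol in CONSONANTS:
            labels.append((symbol, "consonant"))
        else:
            labels.append((symbol, "unknown"))
    return labels


print(classify("B AE T"))
```

Because symbols are separated by whitespace, two-letter symbols such as `th` or `ng` never need to be disambiguated from sequences of one-letter symbols.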