American vowels

Vowel definition

Let’s first define what a vowel is. Vowels are often described as sounds in which the air stream moves up from the lungs and through the vocal tract very smoothly. In other words, there’s nothing blocking or constricting the vocal tract. By contrast, consonants are sounds that have obstruction somewhere in the production. This way of defining vowels and consonants, i.e., the absence or existence of obstruction in the vocal tract, is not precise since sounds like /l/, /r/, /w/, and /j/ are also produced without obstruction, but they are classified as consonants. 

Another way to define a vowel is to say that a vowel is a necessary ingredient of a syllable. The syllable is a feature of every spoken language, since it is viewed as the phonological building block of words. It is like a beat in the rhythm of the word. Every spoken word is made up of one or more syllables. For instance, ‘look’ is a one-syllable word since there is just one beat, made by ‘oo.’ And ‘looking’ is a two-syllable word since there are two beats, made by ‘oo’ and ‘i.’ In other words, a syllable is a unit of organization for a sequence of speech sounds. Each language has its own rules about what kinds of syllables are allowed and what kinds are not. For instance, Korean and English differ in the number of consonant sounds and vowel sounds allowed in a syllable. However, the general structure of a syllable is considered to be constant across languages, as its core is a vowel sound. The vowel in a syllable is called the nucleus. The nucleus is the most sonorous part of a syllable. Consonant sounds can come before or after the nucleus. The consonants that come before the nucleus are called the onset, and the consonants that come after the nucleus are called the coda. A syllable can stand without an onset or a coda, but not without a nucleus, the vowel.
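The onset, nucleus, and coda structure described above can be sketched in Python. This is only a minimal illustration, assuming a syllable is given as a list of phoneme symbols that contains exactly one vowel. The function name split_syllable and the small vowel set are hypothetical, chosen for this example.

```python
# Hypothetical vowel set for illustration only; not a complete inventory.
VOWELS = {"i", "ɪ", "ɛ", "æ", "u", "ʊ", "ɔ", "ɑ", "ə", "ʌ"}


def split_syllable(phonemes):
    """Split one syllable (a list of phoneme symbols) into onset, nucleus, coda.

    Assumes the syllable contains exactly one vowel from VOWELS.
    """
    nucleus_positions = [i for i, p in enumerate(phonemes) if p in VOWELS]
    if len(nucleus_positions) != 1:
        raise ValueError("expected exactly one vowel nucleus")
    n = nucleus_positions[0]
    # Everything before the vowel is the onset, everything after is the coda.
    return phonemes[:n], phonemes[n], phonemes[n + 1:]


# 'look' /lʊk/: onset /l/, nucleus /ʊ/, coda /k/
print(split_syllable(["l", "ʊ", "k"]))
```

A syllable with no onset and no coda, such as a lone /ɪ/, still splits cleanly, while a sequence with no vowel raises an error. That mirrors the claim that the nucleus is the one part a syllable cannot do without.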

The idea that a syllable cannot stand without a vowel suggests that a vowel can form its own syllable. So we could define a vowel as the sound that can form its own syllable. This definition allows us to eliminate /w/ and /j/, since they cannot form a syllable, unlike their counterparts, /u/ and /i/. This explains why /w/ and /j/ are called semi-vowels, and categorized under consonants. This definition, however, is not perfect since some consonants, for example, /l/, /m/, and /n/, can also stand as syllables in certain situations. When they do, they are called syllabic consonants. These words contain syllabic consonants: rhythm, button, and funnel.

Defining what exactly a vowel sound is now seems harder than it initially appeared to be due to exceptions to the rule. Some sounds are clearly vowel sounds and some are clearly consonant sounds. But when it comes to the boundary cases, as exemplified in English with semi-vowels and syllabic consonants, the idea of a vowel becomes blurry. If two languages have different conceptions of the well-formed syllable, like Korean vs English, their understanding of the vowel will differ. For instance, in English, /ju/ is a combination of a semivowel and a vowel, but in Korean, which has no semivowel type consonants, a sound similar to /ju/ is viewed as a combination of two vowel sounds. So it seems that ultimately whether a sound is a consonant or a vowel can be determined within its own language system. 

American elementary school classification 

To learn the pronunciation of vowel sounds, we want to know the number of vowel sounds. American elementary school teachers teach their students that each vowel letter has long and short sounds. Since there are 5 vowel letters (A, E, I, O and U), there will be 10 total vowel sounds, like short A and long A, short E and long E, and so on. However, “long” and “short” do not mean that the sounds are identical except for length. Depending on the surrounding sounds, ‘short’ vowels can be as long as ‘long’ vowels. 

In fact, when the teachers say long and short, they mean something entirely different. According to them, when a vowel letter says its name, then that vowel is making a ‘long’ sound. So, in the case of the letter ‘A’, words like ‘ape’ and ‘cake’ have the long ‘A’ sound, /eɪ/. By contrast, the short ‘A’ sound is /æ/, as in ‘apple’ and ‘cat.’ In the case of ‘E’, the short E sound is /ɛ/ as in ‘egg’ and ‘elephant,’ and the long E sound is /i/ as in ‘eat’ and ‘eagle.’ In the case of ‘I’, the short I sound is /ɪ/ as in ‘igloo’, and the long I sound is /aɪ/ as in ‘ice.’ In the case of ‘O’, the short sound is /ɑ/ as in ‘octopus’, and the long sound is /oʊ/ as in ‘oatmeal.’ And in the case of ‘U’, the short sound is /ʌ/ as in ‘umbrella’, and the long sound is /ju/ as in ‘unicorn.’ 

The American elementary school classification of the vowel sounds is inadequate for a few reasons. First, the list does not cover all vowel sounds that can be found in standard American English. There are far more vowel sounds than the 10 long and short vowel sounds. These are some vowel phoneme sounds that are missing in the list: /u, ʊ, ɔ, ɔɪ, aʊ/. Secondly, the list includes a vowel sound /ju/ that is not a pure vowel phoneme. The so-called “long U” sound is a composite of /j/ and /u/. Third, in English, each vowel letter can be pronounced in more than the two different ways mentioned above. For example, the letter ‘A’ can be pronounced as /æ/ as in ‘hat’, /eɪ/ as in ‘hate’, and /ɑ/ as in ‘car’. Since there is no one-to-one correspondence between letters and sounds in English, we cannot use vowel letters to indicate vowel sounds. 
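The gap between the school classification and the actual sound inventory can be checked with a small set comparison. The sketch below is illustrative only. It assumes the ten long and short sounds given above and the 15 vowel phonemes that this text arrives at in its final section, and the variable names are hypothetical.

```python
# Long and short sounds taught for each vowel letter (from the text above).
SCHOOL_SOUNDS = {
    "A": {"short": "æ", "long": "eɪ"},
    "E": {"short": "ɛ", "long": "i"},
    "I": {"short": "ɪ", "long": "aɪ"},
    "O": {"short": "ɑ", "long": "oʊ"},
    "U": {"short": "ʌ", "long": "ju"},
}

# The 15 vowel phonemes this text settles on: 10 monophthongs, 5 diphthongs.
INVENTORY = {
    "i", "ɪ", "ɛ", "æ", "u", "ʊ", "ɔ", "ɑ", "ə", "ʌ",
    "eɪ", "aɪ", "ɔɪ", "oʊ", "aʊ",
}

# Collect every sound the school classification teaches.
taught = {sound for pair in SCHOOL_SOUNDS.values() for sound in pair.values()}

missing = INVENTORY - taught  # phonemes the school list never mentions
extra = taught - INVENTORY    # taught "sounds" that are not single vowel phonemes

print(sorted(missing))
print(sorted(extra))
```

Under these assumptions, the missing set contains the five sounds named in the paragraph above plus the schwa /ə/, and the only extra item is /ju/, the composite of /j/ and /u/.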

The fact that there is no one-to-one correspondence between letters and sounds in English implies that English is not a phonetic language. In a phonetic language, all the letters have fixed pronunciations, and even if they change, they do so according to some phonologically justifiable rules. Korean, for instance, is a phonetic language, as its writing system was invented and promulgated by the government in 1446 to teach its people to speak with the correct sounds. Prior to the invention of the Korean alphabet, Koreans relied on Chinese characters. By contrast, the English writing system developed in a haphazard way. Due to Britain’s history of foreign invasions and long-lasting occupations, English vocabulary and spelling draw on a mix of Germanic, Latin, and French sources, among others. Because of this, many English words cannot simply be sounded out. Some letters are silent for no apparent reason, and some sounds appear even though the word has no corresponding letter. For example, ‘colonel’ is pronounced /kər.nəl/, with /r/, even though there is no R in the spelling of the word. 

International Phonetic Alphabet (IPA)

To pronounce English words correctly, we should follow the practice of linguists: the use of the International Phonetic Alphabet (IPA). The key to the IPA is that there is a one-to-one correspondence between the phonetic characters and the sounds. The first version of the IPA was created in 1888, and it has been revised several times since. 

Rhotic R

In light of the phonetic representation of American English, the symbolization of the rhotic R deserves attention. The standard American pronunciation is rhotic. ‘Rhotic’ means the letter R is pronounced wherever there is an R in the spelling. The word ‘rhotic’ comes from the Greek letter ‘rho,’ as Greek letters are often used for phonetic representation. Standard British English, as spoken by Londoners, is non-rhotic. British speakers pronounce the letter R only when it comes before the vowel of its syllable, as in rice, ramen, and bright. If R occurs after a vowel in a syllable, it is not pronounced. Interestingly, British speakers sometimes use the R sound to connect two vowel sounds, as in ‘the idea-r-of-it.’ This use of /r/ is called the intrusive R, and it is viewed as non-standard pronunciation in America. 

A vowel affected by the rhotic R is commonly called an r-colored or r-controlled vowel. American English has many r-colored vowels: ER, AR, UR, OR, AIR, EAR, IRE, AUR. The vowels can be found in these words:

ER (/ər/) as in butter, color, stir, occur

AR (/ɑr/) as in car, star 

UR (/ʊər/) as in sure, tour, pure

OR (/ɔr/) as in orange, chore, order

AIR (/eər/) as in air, stair, bear, care, chair 

EAR (/ɪər/) as in beer, year, steer

IRE (/aɪər/) as in fire, tire 

AUR (/aʊər/) as in sour, hour, flower

These vowels are called r-colored because the R after a vowel in a syllable changes the quality of the vowel sound. For example, consider the words ‘lodge’ and ‘large,’ both of which have the /ɑ/ sound. Their vowel quality is nonetheless different. To say ‘lodge,’ the tongue stays flat. But to say ‘large,’ the tip of the tongue has to be turned backward. That is, to pronounce the ‘r’ sound in ‘large,’ the vowel sound has to be modified. American linguists have tried to represent this fact in the IPA. Examples are the added hook, as in ɚ, ɑ˞, ɔ˞, and the superscript turned ‘r’ (‘ɹ’), as in əʴ, ɑʴ, ɔʴ. In this course, however, we do not use these special notations. We use the same vowel phoneme notation for both r-colored vowels and vowels without the R sound, since we are learning only American English, where the difference in vowel quality is predictable. Thus there is no problem using the same symbol in both cases.

American vowels vs British vowels

To determine the number of vowel phoneme sounds, we should first notice that standard American English has fewer vowel phoneme sounds than British English. 

For example, the British open back rounded vowel /ɒ/ does not exist in American English. In Britain, words like ‘hot,’ ‘stop,’ and ‘cloth’ are all pronounced with the /ɒ/ sound. In America, they are all pronounced with the /ɑ/ sound. This phenomenon is called the father-bother merger since in British English, ‘father’ and ‘bother’ have different vowel sounds: /fɑːðə/ and /bɒðə/. But in the US, both are pronounced with the /ɑ/ sound: /fɑːðər/ and /bɑːðər/. 

Pushing the merger further, in standard American English, many words that used to be pronounced with /ɔ/ are pronounced with /ɑ/. That is, in standard American English, the difference between /ɔ/ and /ɑ/ is almost non-existent as the vowel /ɔ/ is merging with /ɑ/. This phenomenon is called the cot-caught merger. 
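The effect of the two mergers can be sketched as a simple substitution over phoneme symbols. This is an illustrative simplification, not a full model of American pronunciation. It assumes words are given as lists of phoneme symbols with length marks omitted, that diphthongs such as /ɔɪ/ are single list items, and that /ɔ/ before /r/ is left untouched. The function name americanize is hypothetical.

```python
def americanize(phonemes, cot_caught=True):
    """Apply the father-bother and (optionally) cot-caught mergers.

    Simplifying assumptions: length marks are omitted, diphthongs such as
    "ɔɪ" are single list items, and /ɔ/ directly before /r/ is kept.
    """
    result = []
    for i, p in enumerate(phonemes):
        next_p = phonemes[i + 1] if i + 1 < len(phonemes) else None
        if p == "ɒ":
            p = "ɑ"  # father-bother merger: /ɒ/ becomes /ɑ/
        elif p == "ɔ" and cot_caught and next_p != "r":
            p = "ɑ"  # cot-caught merger: /ɔ/ becomes /ɑ/
        result.append(p)
    return result


cot = americanize(["k", "ɒ", "t"])
caught = americanize(["k", "ɔ", "t"])
print(cot, caught, cot == caught)
```

With both mergers applied, ‘cot’ and ‘caught’ come out as the same sequence, which is where the name of the merger comes from. Passing cot_caught=False models a speaker who keeps the two vowels apart.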

There is also a vowel shift in American English. The most salient one is the shift from /ɑː/ to /æ/. In Britain, words like ‘aunt,’ ‘bath,’ ‘laugh,’ ‘class,’ ‘chance,’ and ‘ask’ are all spoken with the /ɑː/ sound. In America, however, they are all pronounced with the /æ/ sound.

In American English, vowel length is viewed as a prosodic feature, not a phonemic feature. That is, the length of a vowel sound is not viewed as an intrinsic feature of vowel phoneme sounds. Rather, how long we should hold the vowel sound is determined by such factors as whether the stress lies on the vowel or not and whether it is followed by a voiced sound or not. If the vowel is stressed or if it is followed by a voiced sound, the vowel sound tends to be held longer. 

15 vowel phonemes of standard American English

A phoneme is a unit of sound that distinguishes one word from another in a particular language. So if we use the wrong phonemes while speaking, we’ll be misunderstood. Naturally, to pronounce words correctly, we need to learn the phonemes of English. Thanks to the merger phenomena of standard American English, there are just 10 distinct pure vowel phonemes in American English. They are 

/i/, /ɪ/, /ɛ/, /æ/, /u/, /ʊ/, /ɔ/, /ɑ/, /ə/ and /ʌ/. 

Regarding /ə/ and /ʌ/, they differ in stress rather than in sound. That is, the schwa is used for the unstressed /ʌ/ sound. So one could argue that there are just 9 pure vowel phoneme sounds, but in this book we separate them since schwa is a special vowel used for de-emphasis. Also, while /ɔ/ is disappearing due to the cot-caught merger, we list it as an independent phoneme. While /ɔ/ does not occur as a pure sound, it does occur as an r-colored vowel or as part of a diphthong (two vowel sounds). 


The 10 vowel phonemes (/i/, /ɪ/, /ɛ/, /æ/, /u/, /ʊ/, /ɔ/, /ɑ/, /ə/ and /ʌ/) are called pure vowels or monophthongs (one vowel sound). When we produce monophthongs, the tongue stays in the same position even if we prolong the sound. For example, when we say /æ/ as in ‘bad,’ the tongue position and the quality of the vowel stay constant throughout the production, even if we continue to say the vowel for a long time. 


But English also has vowels that contain two sounds in a single phoneme. For example, when we say /eɪ/ as in ‘day,’ our tongue moves just a bit, from the position of /e/ to the position of /ɪ/. /e/ is similar to /ɛ/, except that the tongue is a little higher for /e/. Vowels of this type are called diphthongs as they have two perceived auditory qualities. Diphthongs, despite having two vowel sounds within, are single phonemes. That is, they make one syllable because the vowel sounds in diphthongs are unsegmentable. To produce diphthongs, we need to glide continuously from one sound to another sound. We don’t say /de.ɪ/, but /deɪ/. Knowing this fact is especially important when counting the number of syllables of words. 
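The point about counting syllables can be made concrete with a short sketch. It assumes a word is written as a plain IPA string using this text’s symbols, treats the five diphthongs /eɪ/, /aɪ/, /ɔɪ/, /oʊ/, and /aʊ/ as single units, and ignores syllabic consonants. The function name count_syllables is hypothetical.

```python
MONOPHTHONGS = {"i", "ɪ", "ɛ", "æ", "u", "ʊ", "ɔ", "ɑ", "ə", "ʌ"}
DIPHTHONGS = {"eɪ", "aɪ", "ɔɪ", "oʊ", "aʊ"}


def count_syllables(transcription):
    """Count syllables as the number of vowel phonemes in an IPA string.

    A diphthong is matched first and counted once, so /deɪ/ has one
    syllable rather than two. Syllabic consonants are not handled.
    """
    count = 0
    i = 0
    while i < len(transcription):
        if transcription[i:i + 2] in DIPHTHONGS:
            count += 1
            i += 2  # skip both symbols of the diphthong
        elif transcription[i] in MONOPHTHONGS:
            count += 1
            i += 1
        else:
            i += 1  # consonant or other symbol
    return count


for word in ["deɪ", "lʊkɪŋ", "aɪdiə"]:
    print(word, count_syllables(word))
```

Checking for a diphthong before checking for a single vowel is the key design choice here. Without it, the two symbols of a diphthong could be counted as two separate nuclei.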

British English has many diphthongs and even triphthongs. Triphthongs contain three vowel sounds in a phoneme, like /aʊə/. These are examples of British diphthongs: 

/ɪə/ as in deer, beer

/eə/ as in fair, care

/ʊə/ as in poor, tour

/eɪ/ as in they, play

/aɪ/ as in idea, light 

/ɔɪ/ as in boy, join

/əʊ/ (American /oʊ/) as in show, go

/aʊ/ as in sound, how

American English has fewer diphthongs due to its rhotic nature. That is, the first three sounds in the list above, which end in /ə/, are not viewed as diphthongs in America, since their schwa arises from the letter R, which is pronounced /ər/. In other words, the first three occur only when the letter R follows. With this in mind, we can say that there are 5 diphthong sounds in American English: /eɪ/, /aɪ/, /ɔɪ/, /oʊ/ and /aʊ/. So, when we consider monophthongs and diphthongs together, we can say that there are 15 vowel phonemes in standard American English.
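The arithmetic behind the figure of 15 can be written out as a short sketch. It assumes the ten monophthongs and the eight diphthongs listed in this text, with the American symbol /oʊ/ used for the vowel of ‘show,’ and it filters out the diphthongs that end in a schwa. The variable names are hypothetical.

```python
# The 10 pure vowels listed in this text.
MONOPHTHONGS = ["i", "ɪ", "ɛ", "æ", "u", "ʊ", "ɔ", "ɑ", "ə", "ʌ"]

# The 8 diphthongs listed in this text, using /oʊ/ for the vowel of 'show'.
LISTED_DIPHTHONGS = ["ɪə", "eə", "ʊə", "eɪ", "aɪ", "ɔɪ", "oʊ", "aʊ"]

# In rhotic American English, the schwa-final ones are treated as vowel + /r/.
american_diphthongs = [d for d in LISTED_DIPHTHONGS if not d.endswith("ə")]

inventory = MONOPHTHONGS + american_diphthongs

print(american_diphthongs)
print(len(MONOPHTHONGS), len(american_diphthongs), len(inventory))
```

The filter removes exactly the three sounds that occur only before the letter R, leaving five diphthongs and a total of 10 + 5 = 15 vowel phonemes.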