In spoken language, a phoneme is a basic, theoretical unit of sound that can distinguish words (i.e. changing one phoneme in a word can produce another word). A phoneme may stand for a whole category of phonetically similar or phonologically related sounds (the relationship is not always phonetically obvious, which is one of the problems with this conceptual scheme).

Depending on the language and the alphabet used, a phoneme may be written consistently with one letter; however, there are many exceptions to this rule — see Writing systems below.

Two words that are differentiated by one phoneme, such as "cat" and "rat", are known as a minimal pair.
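The minimal-pair test can be sketched in a few lines of Python. This is only an illustration: the function name `is_minimal_pair` and the simplified transcriptions (for example plain "r" for the English r-sound) are assumptions made for this example, not part of any standard tool.

```python
def is_minimal_pair(a, b):
    """Return True if two phoneme sequences differ in exactly one position."""
    if len(a) != len(b):
        return False
    # Count the positions where the two sequences disagree.
    differences = sum(1 for x, y in zip(a, b) if x != y)
    return differences == 1


# Simplified, assumed transcriptions of three English words.
cat = ["k", "æ", "t"]
rat = ["r", "æ", "t"]
cap = ["k", "æ", "p"]

print(is_minimal_pair(cat, rat))  # differ only in the first phoneme
print(is_minimal_pair(rat, cap))  # differ in two positions
```

Words are compared as lists of phonemes rather than as spellings, since spelling and phonemes do not line up one-to-one in English.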

The exact number of phonemes in English depends on the speaker and the method of determining phoneme vs. allophone, but estimates typically range from 40 to 45, which is above average across all languages. Pirahã has only 10, while !Xóõ has 141.

When representing phonemes in linguistic writing, it is common to use 'slash' markers as quotes around the symbol that stands for the sound. For example, the phoneme for the initial consonant sound in the word "phoneme" would be written as /f/. In other words, the graphemes are <ph>, but this digraph represents one sound /f/. Allophones, real speech variants of a phoneme, are often denoted in linguistics by the use of diacritical or other marks added to the phoneme symbols and then placed in square brackets [ ] to differentiate them from the phoneme in slant brackets / /. The conventions of orthography are then kept separate from both phonemes and allophones by the use of the markers < > to enclose the spelling.

The symbols of the International Phonetic Alphabet (IPA) and extended sets adapted to a particular language are often used by linguists to write phonemes, with the principle being one symbol equals one categorical sound. Due to problems displaying some symbols in the early days of the Internet, a hack called SAMPA was developed to represent IPA symbols in plain text. Now, in 2004, any modern browser can display IPA, and we use this system in this article.
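As a rough illustration of how a plain-text scheme like SAMPA stands in for IPA, the sketch below converts a small subset of SAMPA symbols to their IPA equivalents. The table covers only a handful of single-character symbols and the name `sampa_to_ipa` is made up for this example; full SAMPA also has multi-character symbols that this sketch does not handle.

```python
# Partial SAMPA-to-IPA table (single-character symbols only).
SAMPA_TO_IPA = {
    "S": "ʃ",  # as in "ship"
    "Z": "ʒ",
    "T": "θ",  # as in "thin"
    "D": "ð",  # as in "this"
    "N": "ŋ",  # as in "sing"
    "I": "ɪ",  # as in "pin"
    "U": "ʊ",
    "@": "ə",
    "{": "æ",  # as in "cat"
    "?": "ʔ",  # glottal stop
}

_TABLE = str.maketrans(SAMPA_TO_IPA)


def sampa_to_ipa(text):
    """Replace each known SAMPA character; leave all others unchanged."""
    return text.translate(_TABLE)


print(sampa_to_ipa("pIn"))   # pɪn
print(sampa_to_ipa("tSIp"))  # tʃɪp
```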

Examples of phonemes in the English language would include sounds from the set of English consonants, like /p/ and /b/. These two are most often written consistently with one letter for each sound. However, phonemes might not be so apparent in written English, such as when they are typically represented with combined letters, called digraphs, like <sh> (pronounced /ʃ/) or <ch> (pronounced /tʃ/).

Phonology, or more specifically, phonemics, is the study of the system of phonemes of a language, although some conceptualize phonology as encompassing far more than sound segments. Thus phonology can be used as a more general term subsuming phonemics.

What may be an allophone (a sound variant belonging to the same phoneme category) in one language may be a phoneme itself in another language. In English, for example, /p/ has aspirated and non-aspirated allophones: aspirated [pʰ] in "pin" /pɪn/, but non-aspirated [p] in "spin" /spɪn/. However, in some languages (e.g., Ancient Greek), aspirated /pʰ/ was a phoneme distinct from both unaspirated /p/ and /b/. As another example, Japanese does not distinguish /r/ from /l/: it has only one /r/ phoneme, although that phoneme has allophones that sound more like an /l/ or /d/ to English speakers. The sounds /z/ and /s/ are distinct phonemes in English, but allophones in Spanish. /dʒ/ (as in <Jill>) and /ʒ/ (as in <…sure>, <…ge>) are phonemes in English, but allophones in Italian.
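The English aspiration example can be written as a toy rule that maps a phonemic transcription to its allophones. The rule here is a deliberate simplification assumed for illustration (it aspirates /p/ only when it is the first segment of the word), and the function name `realize` is hypothetical.

```python
def realize(phonemes):
    """Map a list of phonemes to allophones using one toy aspiration rule.

    Simplifying assumption: /p/ surfaces as aspirated [pʰ] only when it is
    the first segment of the word; everywhere else it stays plain [p].
    """
    allophones = []
    for index, phoneme in enumerate(phonemes):
        if phoneme == "p" and index == 0:
            allophones.append("pʰ")
        else:
            allophones.append(phoneme)
    return allophones


print(realize(["p", "ɪ", "n"]))       # "pin": the /p/ is aspirated
print(realize(["s", "p", "ɪ", "n"]))  # "spin": the /p/ is not aspirated
```

Both outputs correspond to the same phoneme /p/ in English; a language in which the two sounds contrast would need two separate input symbols instead of a rule.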

A sound that is a single phoneme in one language may be a phoneme cluster in another. For instance, /buts/ means leg-covering footwear in English and consists of four phonemes /b u t s/; but in Hebrew it means a kind of cloth and consists of only three phonemes /b u ts/.
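One way to picture this is to segment the same sound string against two different phoneme inventories. The sketch below uses greedy longest-match segmentation; the inventories are tiny toy sets assumed for this example (real inventories are much larger), and `segment` is a hypothetical helper.

```python
def segment(transcription, inventory):
    """Split a transcription into phonemes, preferring the longest match."""
    longest = max(len(phoneme) for phoneme in inventory)
    result = []
    i = 0
    while i < len(transcription):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(longest, len(transcription) - i), 0, -1):
            chunk = transcription[i:i + size]
            if chunk in inventory:
                result.append(chunk)
                i += size
                break
        else:
            raise ValueError(f"no phoneme matches at position {i}")
    return result


# Toy inventories: only the symbols needed for this one example.
ENGLISH = {"b", "u", "t", "s"}
HEBREW = {"b", "u", "t", "s", "ts"}

print(segment("buts", ENGLISH))  # ['b', 'u', 't', 's']
print(segment("buts", HEBREW))   # ['b', 'u', 'ts']
```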

The phoneme is a structuralist abstraction that was later adapted to and formally psychologized in generative linguistics (after Chomsky and Halle). Rather than a basic mental unit of language, however, it may well be a perceptual artifact of alphabetic literacy (see the terms Phonemic awareness and Phonological awareness). If not that, it may be an epiphenomenal aspect to listening removed from face-to-face encounters, that is, text-like listening. Cf. Phone and Feature.

Table of contents

1 Phonological extremes

2 Writing systems

3 See also

Phonological extremes

Of all the sounds that a human vocal tract can create, languages vary considerably in how many they treat as distinctive phonemes. Some dialects of Abkhaz have only 2 phonemic vowels, and many Native American languages have 3, while Punjabi has over 25. Rotokas (spoken in Papua New Guinea) has only 6 consonants, while !Xũ (spoken in southern Africa, in the vicinity of the Kalahari desert) has over 100. The total number of phonemes in a language varies from as few as 11 in Rotokas and 13 in Hawaiian to as many as 141 in !Xũ. These may range from familiar sounds like [t], [s] or [m] to very unusual ones produced in extraordinary ways (see: Click consonant, phonation, airstream mechanism).

English itself uses a rather large set of 13 vowels, though its 27 consonants are close to average. This differs from the lay count based on the Latin alphabet, which has 21 consonants and five vowels (although sometimes y and w are included as vowels).

The most common vowel system consists of five vowels: /i/, /e/, /a/, /o/, /u/.
The most common consonants are /p/, /t/, /k/. Not all languages have these; the Hawaiian language lacks /t/, and the Mohawk language lacks /p/, but all known languages have at least two of the three. If one of the three is missing, the language will have /ʔ/ (glottal stop).
Possibly the rarest sound is the one represented by the "ř" letter (called "r háček") (found in the name Dvořák) in the Czech language; it appears to be unique to the language.

Only the Dyirbal language of Australia and some dialects of Norwegian use six (primary or contrastive) places of articulation; all other languages use fewer. The possible places of articulation include bilabial, labiodental, dental, alveolar, alveopalatal, palatal, retroflex, velar, uvular, pharyngeal, and glottal.

Writing systems

Languages where a given symbol represents only one phoneme and every phoneme is represented only by one symbol are known by the layman as "phonetic languages", which might be better described as "phonemically-written".
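The one-to-one condition described above can be checked mechanically. The following sketch tests whether a list of (symbol, phoneme) pairs forms a one-to-one mapping in both directions; the function name and the sample pairs are illustrative assumptions, not data about any real orthography.

```python
def is_phonemic_orthography(pairs):
    """Return True if every symbol maps to exactly one phoneme and
    every phoneme is written with exactly one symbol."""
    symbol_to_phoneme = {}
    phoneme_to_symbol = {}
    for symbol, phoneme in pairs:
        # setdefault stores the first pairing seen and returns it afterwards.
        if symbol_to_phoneme.setdefault(symbol, phoneme) != phoneme:
            return False  # one symbol, two phonemes
        if phoneme_to_symbol.setdefault(phoneme, symbol) != symbol:
            return False  # one phoneme, two symbols
    return True


# Toy examples: a perfectly regular system, and one in which <c> and <k>
# both spell /k/ while <c> also spells /s/.
regular = [("p", "p"), ("b", "b"), ("a", "a")]
irregular = [("c", "k"), ("k", "k"), ("c", "s")]

print(is_phonemic_orthography(regular))    # True
print(is_phonemic_orthography(irregular))  # False
```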

However, the split between phonemically-written and non-phonemically-written languages is usually exaggerated. All languages are in fact written with conventional signs that represent meaning and are inspired to some degree by pronunciation. This is true at both ends of the scale: Chinese characters are first and foremost symbols of meaning, but they also carry some minimal phonetic information. At the other extreme, languages such as Serbian have systems that represent the educated spoken language perfectly, but the same system is valid for all accents within the language, and is therefore conventional.

All other languages fall somewhere between these extremes. English is often given as an example of an "unphonetic" language; in reality, however, its system is nowhere near as purely conventional as Chinese writing. English spelling conveys vast amounts of phonetic information and follows relatively consistent, if complex, rules. Spanish is often given as an example of a "phonetic" language; however, it has numerous imperfections, including silent letters. It is, at least, possible to know the correct pronunciation of any written Spanish word.

See also