Speech acquisition

Speech acquisition data include the ages at which typically developing children acquire consonants, consonant clusters, vowels, and tones, as well as many other aspects of speech.

Summary data are included below.

Summary of consonant acquisition

Free journal articles (open access)

McLeod, S. & Crowe, K. (2018). Children’s consonant acquisition in 27 languages: A cross-linguistic review. American Journal of Speech-Language Pathology, 27, 1546-1571. https://doi.org/10.1044/2018_AJSLP-17-0100

This article received the annual American Journal of Speech-Language Pathology Editor’s Award.


Crowe, K., & McLeod, S. (2020). Children's English consonant acquisition in the United States: A review. American Journal of Speech-Language Pathology. https://doi.org/10.1044/2020_AJSLP-19-00168


Learning English Consonants

Across the world

15 studies of 7,369 children (McLeod & Crowe, 2018)



United States

15 studies of 18,907 children (Crowe & McLeod, 2020)





Summary of 250 cross-linguistic studies of speech acquisition

Cross-linguistic trends in children's speech acquisition are summarized below. The summary is based on McLeod (2010) and relies on information from McLeod (2007). Additional information is available in Goldstein and McLeod (2012) and Zhu Hua and Dodd (2006).

Intelligibility

Definition: The estimated amount of speech that is intelligible to a particular listener

Main findings:

  • 2-year-olds are intelligible at least 50% of the time (more often to their parents)
  • 4- and 5-year-olds' speech is intelligible most of the time, even to strangers

Main languages in which this aspect has been studied: English, Finnish, and Portuguese

Age of acquisition of consonants

Definition: Age at which most children (90% or 75%) can pronounce a consonant like an adult

Main finding: Wide diversity of reported ages (>2;6 years) even for languages sharing similar consonants

Main languages: Almost all of the 24 languages
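The 90% and 75% criteria in the definition above can yield different ages of acquisition from the same data. The following is a minimal sketch of that effect, not a method taken from the studies summarized here. The function names, the rule of taking the youngest age group that meets the criterion, and the proportions for /s/ are all hypothetical and serve only as illustration. Ages use the years;months notation (e.g., 2;6 = 2 years 6 months).

```python
def parse_age(age):
    """Convert 'years;months' notation (e.g., '2;6') to total months."""
    years, months = age.split(";")
    return int(years) * 12 + int(months)


def format_age(total_months):
    """Convert total months back to 'years;months' notation."""
    return f"{total_months // 12};{total_months % 12}"


def age_of_acquisition(proportion_correct_by_age, criterion=0.90):
    """Return the youngest age group in which the proportion of children
    producing the consonant correctly meets the criterion, or None.

    Assumption: the first age group that reaches the criterion counts;
    individual studies may apply stricter rules.
    """
    for age in sorted(proportion_correct_by_age, key=parse_age):
        if proportion_correct_by_age[age] >= criterion:
            return age
    return None


# Hypothetical proportions of children producing /s/ correctly, by age group
s_data = {"3;0": 0.62, "3;6": 0.78, "4;0": 0.86, "4;6": 0.93}

age_75 = age_of_acquisition(s_data, criterion=0.75)
age_90 = age_of_acquisition(s_data, criterion=0.90)
gap = format_age(parse_age(age_90) - parse_age(age_75))

print(f"75% criterion: {age_75}")
print(f"90% criterion: {age_90}")
print(f"Difference: {gap}")
```

With these made-up figures, the 75% criterion is met at 3;6 and the 90% criterion at 4;6, a difference of 1;0 that arises from the choice of criterion alone.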

Vowels

Definition: Paradigmatic acquisition = Discrete vowels (e.g., in monosyllabic words) vs. Syntagmatic acquisition = Vowels in context (e.g., stressed and unstressed vowels in polysyllables)

Main findings: Paradigmatic acquisition = approximately 3 years old; Syntagmatic acquisition = approximately 7 to 9 years old (in English)

Main languages: Very few languages (mostly English)

Percent consonants correct

Definition: Percentage of consonants produced correctly; sometimes reported as percent consonants in error

Main findings:

  • 2-year-olds produce consonants correctly at least 70% of the time
  • 5-year-olds produce consonants correctly at least 90% of the time

Main languages: Most (including English, Finnish, German, Hungarian, Putonghua, and Welsh)
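The percentages above can be made concrete with a small calculation. The sketch below assumes the simplest reading of the measure: each target consonant in a sample is scored as correct or incorrect, and the percentage correct is the number correct divided by the number attempted. The function name and the ten-consonant sample are hypothetical; published scoring conventions include further rules that are not modelled here.

```python
def percent_consonants_correct(targets, productions):
    """Return the percentage of target consonants produced correctly.

    `targets` and `productions` are aligned lists; a deleted consonant is
    represented by None in `productions`.
    """
    if len(targets) != len(productions):
        raise ValueError("targets and productions must be aligned one-to-one")
    if not targets:
        raise ValueError("at least one target consonant is required")
    correct = sum(1 for t, p in zip(targets, productions) if t == p)
    return 100.0 * correct / len(targets)


# Hypothetical sample: /s/ and /k/ produced as [t], and one /l/ deleted
targets = ["s", "t", "k", "m", "n", "p", "b", "d", "s", "l"]
productions = ["t", "t", "t", "m", "n", "p", "b", "d", "s", None]

pcc = percent_consonants_correct(targets, productions)
print(f"Percent consonants correct: {pcc:.0f}%")
print(f"Percent consonants in error: {100 - pcc:.0f}%")
```

In this made-up sample, 7 of the 10 target consonants match, giving 70% correct and 30% in error, which shows how the two ways of reporting the measure relate.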

Common mismatches

Definition: Sounds children typically produce before they achieve the adult target

Main findings: Although there are some similarities, common mismatches differ between languages. For example, common mismatches for /s/ include:

  • /s/ produced as a plosive consonant, e.g., [t] in many languages (e.g., English, Dutch, Finnish, Hungarian, and Portuguese)
  • /s/ produced as a lateralized fricative, e.g., [ɮ] in Greek
  • /s/ produced as a palatal consonant, e.g., palatalization in Japanese and [ʃ] in Israeli Hebrew

Main languages: A few languages, including English, Greek, Japanese, Hungarian, and Dutch

Phonological patterns

Definition: Patterns that occur in children's speech

Main findings:

Systemic simplifications

  • Backing (e.g., Lebanese Arabic, Greek, Japanese, Norwegian, Putonghua, Thai, Vietnamese)
  • Fronting (e.g., Jordanian Arabic, Lebanese Arabic, Cantonese, English, German, Greek, Israeli Hebrew, Japanese, Korean, Maltese, Norwegian, Portuguese, Putonghua, Thai, Turkish, Welsh)
  • Gliding/Liquid deviation (e.g., Lebanese Arabic, Dutch, English, French, Korean, Maltese, Portuguese, Putonghua, Turkish, Welsh)
  • Stopping (e.g., Lebanese Arabic, Cantonese, Dutch, English, German, Greek, Israeli Hebrew, Japanese, Korean, Maltese, Norwegian, Portuguese, Putonghua, Thai, Turkish, Welsh)
  • Devoicing (e.g., Jordanian Arabic, Lebanese Arabic, Dutch, German, Hungarian, Israeli Hebrew, Maltese, Norwegian)
  • Voicing (e.g., English, German, Norwegian, Turkish, Welsh)

Structural simplifications

  • Assimilation/Consonant harmony (e.g., Cantonese, Dutch, English, French, Greek, Maltese, Norwegian, Portuguese, Putonghua, Turkish, Welsh)
  • Cluster reduction (e.g., Dutch, English, French, Greek, Israeli Hebrew, Maltese, Spanish, Thai, Turkish, Welsh)
  • Initial consonant deletion (e.g., Finnish, Spanish, Maltese, Thai)
  • Final consonant deletion (e.g., Jordanian Arabic, Cantonese, Dutch, English, German, Greek, Israeli Hebrew, Korean, Maltese, Portuguese, Putonghua, Spanish, Thai, Turkish, Welsh)
  • Reduplication (e.g., Dutch, English, Greek, Korean, Turkish, Welsh)
  • (Weak) syllable deletion (e.g., Jordanian Arabic, Dutch, English, Finnish, French, German, Israeli Hebrew, Japanese, Maltese, Norwegian, Portuguese, Spanish, Turkish, Welsh)

Phonetic inventory

Definition: Sounds produced regardless of the adult target

Main findings: Vowels, nasals, and plosives appear to be the earliest sounds to be produced by children. Children produce more sounds and greater articulatory variation as they grow older.

For example, the phonetic inventories of American English 1-year-olds include nasals, voiced plosives, and a glide; those of Jordanian Arabic 1- to 2-year-olds include plosives, fricatives, nasals, a lateral, and approximants; and those of Maltese 2-year-olds include nasals, plosives, a fricative, and approximants.

Main languages: Many languages (including Arabic, Cantonese, English, Finnish, Maltese).

Syllable structure

Definition: Syllable shapes produced regardless of target

Main findings: CV is a universal syllable shape (Locke, 1983) and is the earliest syllable structure to emerge. The next syllable shapes to emerge are CVC (e.g., English, Israeli Hebrew, Maltese, Spanish), V (e.g., Korean), and VC (e.g., Israeli Hebrew, Spanish)

Main languages: Only a few studies
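The syllable shapes named above (CV, CVC, V, VC) are skeletons in which each consonant is written as C and each vowel as V. The sketch below shows one way to derive such a skeleton from a list of transcribed segments. The function name and the five-vowel set are hypothetical simplifications; an actual analysis would use the vowel inventory of the language being studied.

```python
# Simplified, hypothetical vowel set used only for this illustration
VOWELS = {"a", "e", "i", "o", "u"}


def syllable_shape(segments):
    """Map a list of transcribed segments to a C/V skeleton."""
    return "".join("V" if seg in VOWELS else "C" for seg in segments)


# Made-up syllables covering the four shapes mentioned in the text
for syllable in (["m", "a"], ["b", "a", "t"], ["a"], ["u", "p"]):
    print("".join(syllable), "->", syllable_shape(syllable))
```

Run on these made-up syllables, the sketch labels "ma" as CV, "bat" as CVC, "a" as V, and "up" as VC.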

Stress

Definition: Strong and weak emphasis on different syllables

Main findings: Acquisition of stress is language-dependent. It is acquired very early in some languages (e.g., Israeli Hebrew) and later in others (e.g., Dutch and English)

Main languages: Few studies

Intonation

Definition: Melody of speech

Main findings: Language-specific intonation patterns begin to emerge between 1 and 2 years of age (e.g., English and Hungarian) but are not fully acquired until 5;0 (English). Perception continues to develop until 10 to 11 years (Wells, Peppé, & Goulandris, 2004)

Main languages: Few studies

Tones

Definition: Some languages use tones to differentiate lexical meaning

Main findings: Tone acquisition was achieved by 2-year-olds (Cantonese and Putonghua)

Main languages: Cantonese and Putonghua


  • Zhu Hua & Dodd, B. (2006). Phonological development and disorders in children: A multilingual perspective. Clevedon, UK: Multilingual Matters.
  • Goldstein, B. A., & McLeod, S. (2012). Typical and atypical multilingual speech acquisition. In S. McLeod & B. A. Goldstein (Eds.), Multilingual aspects of speech sound disorders in children (pp. 84-100). Bristol, UK: Multilingual Matters.
  • McLeod, S. (Ed.). (2007). The international guide to speech acquisition. Clifton Park, NY: Thomson Delmar Learning.
  • McLeod, S. (2010). Laying the foundations for multilingual acquisition: An international overview of speech acquisition. In M. Cruz-Ferreira (Ed.), Multilingual norms (pp. 53-71). Frankfurt: Peter Lang Publishing.

Suggested citation for this page

McLeod, S. (2012). Summary of 250 cross-linguistic studies of speech acquisition. Bathurst, NSW, Australia: Charles Sturt University. Retrieved month day, year from http://www.csu.edu.au/research/multilingual-speech/speech-acquisition