Synonyms

First language acquisition; Language development; Primary language acquisition

Definition

Process of acquiring the sound system, structures, rules, and symbolic representations of a language and their meanings.

Introduction

How typically developing children universally acquire their primary language(s) with ease, before or without any formal schooling, is the foundational question of research in language acquisition. There have been two general strands of research: nativist and nonnativist. The nativist approach proposes that an innate module gives humans a priori knowledge about the nature of language, specifically syntax, and results in the universal acquisition of language among all typically developing children (Chomsky 1957, 1975). The second strand assumes that there is no innate mechanism specifically dedicated to language, only innate social and cognitive abilities that have evolved for other purposes. There has been no clear winner, as research from other disciplines examining infant and child abilities, parent/adult–child interaction, and neurobiology has forced researchers to think and rethink whether a faculty dedicated to language evolved, or whether general biological structures and cognitive abilities have been exapted so that languages are universally acquired.

What Evolved?

Nativism

Chomsky (1957, 1975) raised what is perhaps the central question of language acquisition research. How is it that a child is able to correctly generalize the rules of a language from a finite sample of sentences heard in an unfiltered context, and then apply those rules in such a way that the child can correctly produce an infinite number of sentences? In other words, how does one account for the Poverty of Stimulus, the observation that children acquire and know more about language structure than could possibly be gained from the “noisy” ambient language they are exposed to, and with little correction? Chomsky proposed that humans have evolved a species-specific innate capacity, a language faculty, encoded in our biology. This faculty is Universal Grammar (UG), and it accounts for the uniformly successful acquisition of grammar by typically developing children despite the Poverty of Stimulus. On this view, knowledge about one’s language(s) is triggered through exposure to the ambient language and subsequently produced. As the search for universals among the world’s 7000 languages continued, the arguments made in support of a universal grammar became less and less probable, and the technical theories of syntax have had to change over time (Dabrowski 2016; Horgan 2016). Furthermore, while there has been some promising support for a language faculty, researchers have yet to find a plausible genetic or neurobiological substrate (Lenneberg 1967; Lai et al. 2001).

Behaviorism

Prior to nativism, in his book Verbal Behavior, Skinner (1957) suggested that we learn language by making and reinforcing associations through the process of conditioning. In this view, language is learned piecemeal using the same innate mechanisms used for learning other behaviors and gaining knowledge. New or previously heard instances of language are generalized and inducted into our existing language repertoires. The behaviorist view was considered insufficient, as it was believed that young children experience very few one-on-one language teaching interactions and that much learning occurs without direct instruction. It seemed implausible that children had the ability to cut through the “noise” of the ambient language to make associations and generalizations. Over time, however, research on frequency effects and priming has suggested that a Skinnerian view of language acquisition may be more reasonable than previously considered.

Frequency Effects

Frequent exposure to language elements strengthens associations. There is a growing body of research demonstrating that infants are highly sensitive to the frequency of events in their experiences, including the features of a language, and can scaffold the understanding, learning, and usage of the elements of a language with minimal instruction (Hasher and Chromiak 1977; Attneave 1953; Shapiro 1969; Underwood 1971). In fact, frequency effects are ubiquitous in first language acquisition (Ambridge et al. 2015). Chang (2006) synthesized studies showing how frequency effects can account for the acquisition of constructions: exposure to different linguistic forms in the natural environment affects and adjusts a learner’s usage over time, allowing the learner to generalize a word’s syntactic roles and associate it with other words in the same category (Boyland 2001; Bybee and Hopper 2001; Hallan 2001). Effects have been found in children as young as 8 months old, who can (1) segment words from fluent speech, (2) store and access information about the sound patterns of words that occur frequently in fluent speech, and (3) retain words even when there is no contextual support from the surrounding environment (Jusczyk and Hohne 1997). Two-year-old children can learn word meanings across situations and contexts and get better at this skill as they get older (4 years old) (Akhtar 1999). Even when they are not directly addressed, children are able, given frequent exposure, to learn words simply by overhearing (Oshima-Takane 1988; Oshima-Takane et al. 1996; Oshima-Takane 1999; Akhtar et al. 2001; Akhtar 2005; Floor and Akhtar 2006; Skolnick and Fernald 2003). All things being equal, frequency effects in children have been observed at the word level (nouns, adjectives, verbs, function words) (Ambridge et al. 2015 p. 244): higher frequency words are learned before infrequent words and are associated with better comprehension and lower rates of error (Schwartz and Terrell 1983).
Frequency effects have also been suggested for word strings such as routine four-word sequences (e.g., a cup of tea, a cup of milk) in child-directed speech (Bannard and Matthews 2008). However, effects at the level of sentence construction are still unclear.
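
As a rough illustration of what the frequency of a four-word sequence means in practice, the sketch below counts every four-word string in a handful of utterances. The utterances are invented for this example and are not drawn from any corpus or from the studies cited above, and the function name ngram_counts is hypothetical.

```python
from collections import Counter


def ngram_counts(utterances, n=4):
    """Count every contiguous n-word sequence across a list of utterances."""
    counts = Counter()
    for utterance in utterances:
        words = utterance.lower().split()
        # Slide a window of n words across the utterance.
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


# Invented child-directed utterances (an assumption, for illustration only).
utterances = [
    "do you want a cup of tea",
    "here is a cup of tea",
    "mummy has a cup of tea",
    "would you like a cup of milk",
]

counts = ngram_counts(utterances)
print(counts["a cup of tea"], counts["a cup of milk"])
```

In this toy sample, one sequence recurs more often than the other, which is the kind of contrast a frequency-based account relies on.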

Priming

While frequency effects are related to the facilitative effects of exposure for making associations, priming refers to the facilitative effects of an encounter with a stimulus on subsequent processing of the same or a related stimulus. The effects may be a change in speed, bias, or accuracy of the processing of a stimulus. Each time we attend to a stimulus (e.g., a picture, a face, a word, or a phrase), a lasting representation is created that aids in later identification of the same or a similar stimulus. Priming can occur after as little as one encounter, has long-lasting effects, and has been applied to language acquisition as well. Church and Fisher (1998) compared long-term auditory priming between English-speaking 2- and 3-year-olds and college students and found that all of the age groups performed similarly. Each participant was more likely to identify words that were heard previously, regardless of the obvious discrepancies in previous lexical knowledge. Similar studies showed the same priming effects in preschoolers when stimuli were novel non-English words (e.g., yeeg, lell) (Fisher et al. 2001).

Priming may have a central role in the acquisition of syntactic structure as well, though experiments with young children have been limited. Some studies have used the passive form as the target construction as it is rare in children’s spontaneous speech (Huttenlocher et al. 2002). One of the earliest studies with 4–5-year-olds (Whitehurst et al. 1974) indicated that children who heard passive constructions were more likely to produce and comprehend the passive form than children who did not. Syntactic priming effects of transitive and dative constructions have also been observed (Huttenlocher et al. 2004; Shimpi 2007). Non-lexical priming studies are significant because input to young children is often heard without being repeated. Such studies demonstrate, in experimental conditions, that young children are sensitive to and may be conditioned to syntactic form just through hearing, and that form may also be reflected in their subsequent speech.

Social Constructivism

While nativism gained momentum, the influence of pragmatic-based philosophers (Wittgenstein 1953; Astin 1960), emphasizing meaning and the role of language acquisition in achieving social ends, did not disappear. The Vygotskian perspective on language as a cultural tool and the fundamental role of social and interpersonal interaction in development advanced among social constructivists (Vygotsky 1978). Among the more notable, Bruner (1957, 1960) rejected the nativist view and proposed that meaningful language acquisition occurs in the context of parent/adult–child interaction with the support of a Language Acquisition Support System (LASS). His research dramatically shifted the field by videotaping the context of naturally occurring language, as opposed to nativists, who examined well-formed idealized sentences to uncover the underlying rules, or Universal Grammar, that allow children to generate grammatical sentences. Such research indicated that infants are endowed with far more abilities that facilitate communicative development than previously thought (Gopnik 2009). For social constructivists, what is a priori are abilities that can be deployed to achieve social ends, including a disposition to attend to and discriminate the sound system, grammatical constraints, facial expressions, and gestures that communicate intentions. Bruner highlighted the significance of intersubjectivity, or the ability to share and relate experiences with another. In understanding the intentions and meaning of the other, children also learn to use language appropriately (Bruner 1983).

Social constructivists continued to lay the groundwork for research on the capabilities of neonates and prelinguistic infants, which demonstrates that they are born with abilities to interact and attune to interlocutors. They also continued to promote the study of adult–child dyads and the role of adults or experts in facilitating social interaction, through which valuable cultural knowledge, including language, is learned.

Discriminating Abilities

Infants have been found to make socially related distinctions and preferences. Hours after birth, infants distinguish between rotten and pleasurable odors (Steiner 1979), and 6-day-olds know the smell of their mother’s milk (MacFarlane 1975). The capacity to sense touch begins very early in human ontogeny, as early as a preterm in the third month (Barlow and Mollon 1982; Ellis and Ellingson 1972). With regard to the sense of hearing, studies show that 6-month-olds discriminate features of tempo, rhythm, melody, and key in the structures of both song and instrumental sound (Trehub et al. 1993). Human speech is also salient to a neonate among the array of auditory stimuli in its environment (Cairns and Butterfield 1975; Eisenberg 1975). Neonates have been found to discriminate between direct and averted gaze, as well as between face-like and nonface-like stimuli, and to show a preference within each pair (Maurer and Barrera 1981; Maurer 1985; Farroni 2002, 2004). They differentiate between human and nonhuman sounds (Eisenberg 1975) and between prosodically different languages (Mehler and Christophe 1995). Moreover, infants are able to decompose the stream of speech in their environment when they are as young as 4 days old (Bijeljac-Babic et al. 1991; Kuhl 2004; Ramus 2002; Saffran 2001; Saffran et al. 1996; Saffran and Thiessen 2003; Vihman 2014). Thus, humans are born with remarkable abilities to sense and make distinctions in their social world, including linguistic input.

Facial Expressions

Newborns also seem to be born with an inventory of distinctive expressions and vocalizations that can be interpreted as communicative and can lead to interpersonal communication. Facial gestalts resembling facial expressions relating to laughter and crying are observed as early as 32.5 weeks in the womb (Reissland et al. 2011). Facial expressions of interest, joy, disgust, surprise, and distress have also been observed in young infants (Izard 1978). Trevarthen (1979) argued that the posturing of the head during many forms of facial expression seems to be systematically related to particular facial expressions. He suggested that all the patterns of body expression are present in infants at birth, and that the gesticulations of infants (e.g., hand-waving, index-finger pointing, and fingertip-clasping movements near the face when they are vocalizing) are clearly not imitations of adult partners. Though not linguistic per se, facial expressions can encourage social interaction with interlocutors, creating spaces in which infants are exposed to linguistic input and learn to associate expressions with meaning.

Prelinguistic Gestures

Though it is unclear whether the phylogenetic or ontogenetic origins of language are vocal or gestural (Hirsh-Pasek and Golinkoff 1996; Karmiloff and Karmiloff-Smith 2001; McNeill 2000; Tomasello 2008), research demonstrates a relationship between prelinguistic vocalizations and gestures, and language development. For example, hand banging is a predictor of the onset of canonical babbling (Bates and Dick 2002). Gesture not only provides a way for children to refer to objects (e.g., pointing), but on average, children gesture toward an object 3 months before they produce the word for that object (Iverson and Goldin-Meadow 2005). Two-word combinations are also preceded by gesture-plus-word combinations (e.g., point at bird while saying nap; point at bird while saying bird). Others have shown that different types of gestures predict different types of word production (Kuvac-Kraljevic et al. 2013). Beat and iconic gestures appear when utterances become longer than two words (Mayberry and Nicoladis 2000). Therefore, some researchers propose that language and gesture are part of a single integrated system (Goldin-Meadow and Wagner Alibali 2013).

Prelinguistic Vocalizations

Babbling was once considered to simply be a reflection of an infant’s motor development and sensitivity to and use of the sounds of their native language. Now prelinguistic vocalizations are considered to be a critical part of social gating and may have a significant role in language acquisition (Jusczyk and Hohne 1997; Kuhl 2007; Oller et al. 2010). The consonants used in babbling (CVCV), for example, are good predictors of the consonants used by infants in their first words (Vihman et al. 1985). Babbling is also a predictor of the appearance of pointing and a child’s first words (McGillion et al. 2017).

Prelinguistic vocalizations or utterances are also deployed effectively by infants to achieve various communicative ends, including requesting and obtaining food, getting the names of objects, regulating the behavior of others, maintaining contact with others, expressing their feelings, and playing with language (Halliday 1975). Infants also modify their vocalizations to be more speech-like or more mature when they receive feedback from their interlocutors (Warlaumont et al. 2014; Albert et al. 2017). In response, mothers prolong the interactions and increase the use of simplified speech, which often contains information about objects in the infant’s social context. Mothers also respond more often when infant vocalizations are directed at objects than when they are undirected.

Infant-Directed Speech (IDS)

Infants also have a clear preference for IDS over adult-directed speech (ADS) (Cooper et al. 1997; Cooper and Aslin 1990; Werker and McLeod 1989; Pegg et al. 1992; Soderstrom 2007). Exaggerated, simplified, and adjusted speech directed at infants has long been assumed to facilitate language learning, as it enhances distinctions between the various aspects of their native language (Kemler Nelson et al. 1989). IDS allows infants to easily identify and categorize vowels and consonants (Cristia 2011; McMurray et al. 2013). Infants have been found to remember and learn words better with IDS when compared to ADS (Singh et al. 2009; Ma et al. 2011). IDS exposure is also directly related to vocabulary growth (Weisleder and Fernald 2013).

Though the characteristics or prosodic features may vary from culture to culture, IDS is a species-typical trait (Ferguson 1978; Bryant and Barrett 2007; Falk 2009). Russian, American, Swedish, Mandarin, Hakka, Japanese, Thai, French, German, Hebrew, Quiche, and Lebanese-speaking women use IDS with infants (Pye 1986; Fernald et al. 1989; Kuhl et al. 1997; Kitamura et al. 2002; Fernald and Morikawa 1993; Liu et al. 2003; Farran et al. 2017). Men also modify and heighten the pitch of their speech (Broesch and Bryant 2017; Rondal 1980; Fernald et al. 1989; Gergely et al. 2017). Deaf signers make modifications as well when they sign to infants (Erting et al. 1990; Masataka 1992; Nonaka 2004).

Though IDS may be species-typical, and “may be phylogenetically widespread and evolutionarily ancient” (see Heyes 2016, p. 284, for discussion), it is not a species-specific trait. IDS is also directed toward nonhumans, and the preference for its characteristics is also exhibited among nonhuman animals (Gergely et al. 2017; McConnell 1991). Thus, some have argued that IDS and other adult and infant abilities that facilitate learning (e.g., gaze, motionese, imitation) reflect a natural pedagogy, or a predisposition of experts to ostensively manifest their knowledge to novices to ensure the transmission of cultural knowledge, including language (Csibra and Gergely 2006; Csibra and Gergely 2009). On this account, the predisposition was selected for in hominid evolution when tool-making practices emerged and confronted learners with cognitively opaque behaviors to acquire. Heyes (2016), however, argues that such behaviors were adapted not for teaching or learning, but for the promotion of social bonding or of attention to other conspecifics. Regardless, both views hold that such aspects of social interaction are not specifically for language acquisition, though research clearly suggests that they can be facilitative in the process.

Contingency

In addition to IDS, caregivers may use gaze, eye contact, eyebrow raising, and pointing as social cues to index relevant information, which infants recognize as signals that the caregiver has communicative intention (Brugger et al. 2007; Csibra 2010). From birth, we are already participating in and anticipating dialogic practices in multiple ways. When adults use both verbal and deictic communicative signals, infants assume that the signals refer to the same thing. If there is a discrepancy, for example, in a condition when the pointing does not index anything, infants notice and expect that something should be in the direction of the pointed area (Gliga and Csibra 2009). Trevarthen (1979), who observed that infants move their mouth, hands, and eyes in a turn-taking format with an adult, claimed that humans are born with a foundation for interpersonal communication. Bateson (1979) similarly looked at infants 49–105 days old and also noticed mothers and infants “in a pattern of more or less alternating, non-overlapping vocalization, the mother speaking brief sentences and the infant responding with coos and murmurs, together producing a brief joint performance similar to conversation” (p. 65). Bateson termed this collaboration protoconversation. Jaffe et al. (2001) also showed that 4-month-old infants are highly proficient in vocal turn-taking. If an interlocutor’s actions are not contingent on an infant’s behavior or vocalization, infants show sensitivity to the lack of contingency with a loss of positive affect and attention (Murray and Trevarthen 1985).

Thus, decades of social constructivist research have shown that human beings are born with a remarkable sensitivity to stimuli in the social world, an ability to discriminate and parse the social input, an inherent expectation of and use for communicative signals, and an ability to appreciate the communicative significance of social cues, facial expressions, and gestures, all of which are crucial to achieving and maintaining social interactions.

Functionalism

From social pragmatists and social constructivists, a functional or usage-based perspective of language acquisition emerged, one that would also suggest that structure is not innate but emerges from interaction and use. For proponents of this view, what evolved, and what is universal, is not an innate language faculty but social cognitive skills, which again are not dedicated to or specifically for language. Tomasello (1999, 2003), for example, focused on two cognitive skills. The first skill, intention reading, is the ability of children to discern the goals and intentions of their interlocutors through joint attention (Tomasello and Farrar 1986). It involves the ability to share and follow the attention of other persons, and to actively direct the attention of others by pointing, showing, and using nonlinguistic gestures. In doing so, children learn the use and intentions of speakers and formulate scripts and routines of how to use linguistic conventions to achieve social ends. For example, pointing produces joint attentional frameworks with interlocutors, in which infants have been observed to use pointing for declarative, imperative, and interrogative motives (Bates 1979; Carpenter et al. 1998; Begus et al. 2014). The second skill, pattern finding, includes the ability to perform statistically based distributional analyses on various kinds of perceptual and behavioral sequences and to form categories of similar objects and events (Tomasello 2003 citing Gomez and Gerken 1999; Marcus et al. 1999; Rakison and Oakes 2003; Saffran et al. 1996). As children try to comprehend and acquire the language in their social environment to achieve their social goals, intention reading allows them to theorize the function and meaning of an utterance, word, or sentence, while pattern finding gives them the ability to isolate and extract parts of the language they hear and allows them to see commonalities in use and function across different utterances.
These two universal sociocognitive abilities allow children to construct the grammar, meaning, and function of the language(s) being used around them to achieve social ends, along with other social abilities mentioned earlier.
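
One simple kind of statistically based distributional analysis can be sketched computationally. The example below computes transitional probabilities between adjacent syllables in a continuous stream and posits a word boundary wherever the probability dips. It is only an illustration under stated assumptions: the three nonsense words, their ordering, the 0.75 threshold, and the function names are all invented here, and nothing in it is a claim about how infants actually carry out the computation.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Return P(next syllable | current syllable) for each adjacent pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor.
    first_counts = Counter(syllables[:-1])
    return {
        pair: count / first_counts[pair[0]]
        for pair, count in pair_counts.items()
    }


def segment(syllables, threshold=0.75):
    """Split the stream wherever the transitional probability is below threshold."""
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tp[(prev, nxt)] < threshold:
            # A low-probability transition is treated as a word boundary.
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words


# Hypothetical trisyllabic nonsense words and ordering (assumptions).
lexicon = {"A": ["bi", "da", "ku"], "B": ["pa", "do", "ti"], "C": ["go", "la", "bu"]}
order = "ABCACBABCBACBCA"
stream = [syllable for word in order for syllable in lexicon[word]]

tp = transitional_probabilities(stream)
print(tp[("bi", "da")], tp[("ku", "pa")])
print(segment(stream))
```

In this toy stream, syllable pairs inside a word always co-occur, while pairs that straddle a word boundary are less predictable, so the dips in probability line up with the boundaries between the invented words.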

Socioemotional Approach

For nonnativists, the picture that emerges thus far is that human beings have evolved social and cognitive abilities (e.g., conditioning, frequency effects, priming, pattern finding, and joint attention) that facilitate attention to social agents, social bonding, and interaction, creating spaces and social contexts in which cultural learning, including language, can occur. A socioemotional approach proposes a perspective on language acquisition based on evolutionary biology, neurobiology, and complexity theory. Language is viewed as a cultural artifact that emerges as a complex adaptive system (Prigogine and Stengers 1984; Waldrop 1992). The ubiquity of primary language acquisition is the product of an innate drive to seek out, attune to, and affiliate with conspecifics: an interactional instinct or a social instinct (Lee et al. 2009; Everett 2012; Joaquin and Schumann 2013; Joaquin 2013). This instinct ensures that the language, formed and shaped through interaction, is passed down to succeeding generations. However, the ontogenesis of language behavior (i.e., its acquisition) does not occur without its evolved biology, which supports attentional and motivational processes that entrain children on the behavior of conspecifics, or without its evolved cultural practices for socialization (Rogoff 2003; Rogoff et al. 2007; Schieffelin 1990).

Evidence for the instinct is found in neonates’ and infants’ behaviors that manifest an underlying motivation to engage in interactional activity. Infants show agency in the initiation of interactional activity. They are not passive participants or recipients of information in interaction. They actively initiate social engagement through the use of inborn abilities (cooing, vocalizations, gestures, smiling, etc.). For example, infants 3–54 hours old initiate interaction by deploying a previously imitated gesture (Nagy 2006; Nagy and Molnar 2004). Turn-taking interactions are often initiated by infants, who also anticipate the end of their interlocutor’s turn, resulting in high levels of latched turns (Gratier et al. 2015). Infants can also initiate learning. An infant can choose which object they would like information about by pointing. If pointing is followed by information provided by the adult regarding that object, learning is more likely to occur than for objects that the infant did not point to or show interest in (Begus et al. 2014). In interaction, infant utterances are longer in duration and have shorter response latencies if the infant initiated the conversation than if the adult initiated it (Ko et al. 2015).

Infants also communicate pleasure and distress when desired interaction occurs or fails. This is observed in still-face studies in which a mother is instructed to display an expressionless “still-face” in front of her child (Tronick et al. 1979; Brazelton and Cramer 1990). For example, when the child sees his mother, he typically “greets” her with a smile. The mother responds with a still-face, and he becomes still and then warily looks away. He then looks toward his mother again, and finally he withdraws from the interaction. In another study (Murray and Trevarthen 1985), the infant showed signs of distress, such as grimacing, increased handling of clothes, touching the face, sucking the fingers, and frowning, within a few seconds of seeing his mother’s “still-face.” Efforts to communicate or reinstate interaction were also intensified at first: mouthing and tonguing postures were maintained and accompanied by active gesturing. Eventually, the infant withdrew and averted his gaze downward from the mother’s face. Thus, infants take into account the visible emotional stances of their interlocutor and use that understanding to deploy their next action.

Moreover, infants demonstrate a bias for interaction with conspecifics. They make a distinction between human and nonhuman acts, and they prefer animate entities to inanimate ones. One- and two-month-olds respond to physical objects (e.g., a dangling toy) differently than to a person (Trevarthen 1974). Infants look at, listen to, or touch objects. They also seek and interact with physical objects as sources of interest and as potentially graspable, chewable, and kickable, while human beings are perceived as social and are communicated with through expressive movements that are distinct from those used with objects. Similarly, other researchers showed that 2-month-olds smile, vocalize, and alternate their gazes with an adult, but if presented with an object that moves and sounds contingently, infants will engage in intense arm activity while staring at the object (Legerstee et al. 1987). Neonatal imitation also may be specific to people, such that inanimate objects that attempt to elicit imitative responses from infants fail (Legerstee 1991).

In addition, infants understand the interactional import of communicative and social cues (e.g., gaze, pitch changes) and more generally are attuned to adults’ responses and actions (see Lee et al. 2011; Joaquin 2013 for a review).

Collectively, these behaviors are manifestations of an interactional instinct and are motivated by biological processes. Endogenous opiates, neuropeptides (e.g., oxytocin and vasopressin), and dopamine make interaction with caregivers inherently rewarding and serve as a hardwired motivational mechanism (Depue and Morrone-Strupinsky 2005; Thompson et al. 2004; Popik and Van Ree 1992; Blass and Shah 1994). Neuropeptides are also a part of a system that appraises and marks interactions and the contexts in which they occur as pleasurable or not. Thus, as the child grows, the rewarding aspects of the attachment bond become part of the child’s memory, which leads to the approach and development of affiliative bonds with caregivers and ensures cultural learning, socialization, and language acquisition through domain general cognitive abilities (Mates and Joaquin 2013). A similar biology guides the behaviors of caregivers, who, through IDS, serve to facilitate and increase affiliation and attachment between the infant and caregiver (Storey et al. 2000; Weismann et al. 2012).

Socialization of language use, or pragmatics, may also be subserved, in part, by the prefrontal cortex (PFC), known to be associated with monitoring, learning, and memory of the reward value of reinforcers, and the evaluation of punishers (Schumann 1999; Kringelbach 2005a, 2005b). Because it is involved in the evaluation of reward, it is important for decision-making (Rolls 1998). Its connections to the amygdala and cingulate cortex also involve the PFC in emotional processing, particularly of face and voice expressions (verbal and nonverbal emotional cues), as it receives inputs from the auditory and visual areas (Baird et al. 2006; Hornak et al. 1996; Mah et al. 2004). Furthermore, it is also linked to an ability to infer the likely thoughts or intentions, beliefs, and desires of others (Heatherton et al. 2006; Kelley et al. 2002). As these brain regions subserve such social and cognitive abilities, they have also evolved to subserve language pragmatics, which a child learns as they are enculturated into society. Indeed, there are many anthropological, sociological, and linguistic accounts of how children are socialized to use language. Moreover, persons with damage to the PFC, or with frontotemporal dementia, a neurodegenerative disease affecting the PFC, demonstrate social and pragmatic anomalies (Brickner 1936; Mates et al. 2010; Joaquin 2010a, 2010b).

Conclusion

Language acquisition is the process by which humans acquire and subsequently produce their primary language(s) for social ends. It involves acquiring the sound system, structures, rules, and symbolic representations of a language, and then applying, producing, and combining those aspects to suit one’s objectives in social contexts. The issue in language acquisition is: What has evolved for children to come to know so much about language if the input they are exposed to as children is limited, full of imperfections, and often not accompanied by correction? There are two evolutionary stories. An economical view proposes that human beings have evolved a language-specific faculty in our biology, which accounts for the ubiquity of language acquisition by all typically developing children. However, since its proposal, the language module has not been biologically substantiated, and it is still up for debate. What has been substantiated is a vast array of domain general abilities that behaviorist, social constructivist, functionalist, and socioemotional researchers have suggested are significant for cultural learning and language acquisition. Discriminating abilities, prelinguistic behaviors, frequency effects, priming, pattern finding, and joint attention, combined with infant-directed speech and socialization practices and a biologically driven motivation to attune to, affiliate with, and bond with conspecifics, make language input more accessible than previously thought. Any explanation of language acquisition must also consider these evolved abilities, processes, and biological structures.

Cross-References