La adquisición de las consonantes oclusivas en español por hablantes de chino mandarín: un estudio transversal / The acquisition of stop consonants in Spanish by Mandarin Chinese speakers: a cross-sectional study
Abstract
Both Mandarin Chinese and Spanish have a two-way contrast among stop consonants, but the basis of that contrast differs fundamentally between the two languages. Spanish distinguishes stops by voicing, whereas Mandarin lacks voiced stops and instead contrasts aspirated and unaspirated voiceless stops. This difference presents challenges for Mandarin speakers learning Spanish. The few empirical studies that have investigated the acquisition of Spanish stop consonants by Mandarin speakers (e.g., Chen, 2007; Gong & Cooke, 2017; Zhang, 2022) have focused on learners with up to four years of experience in Spanish. These studies reveal a tendency for voiceless and voiced stops to neutralize, with learners often perceiving both types of consonants as voiced and producing them as voiceless, even at advanced levels. However, research has yet to examine more advanced stages of acquisition, particularly in an immersion environment, or how perception and production develop under the influence of social and linguistic factors. To fill this gap, this dissertation examines the acquisition of Spanish stop consonants (/p, t, k, b, d, g/) by 80 Mandarin Chinese-speaking learners living in Spain across six proficiency levels, from absolute beginners to highly proficient learners. The primary focus is on how these learners produce and perceive the non-native voicing contrast and on the factors that influence this acquisition process. The study is divided into two main parts: production and perception. In the production study, 16,458 stop consonant tokens were collected from three speech tasks; Voice Onset Time (VOT) was measured for each token, and a difference score was computed for each stop consonant pair.
The data were analyzed using linear mixed-effects regression with a stepwise procedure, incorporating place of articulation, voicing, following vowel, word type, proficiency level, task type, L2 use, and time spent abroad as independent variables. The results revealed that learners' VOT values were significantly influenced by their proficiency levels. At the beginner and intermediate stages (A1, A2, B1), both voiced and voiceless stops were predominantly produced as voiceless, with no clear differentiation in VOT. Only at the B2 level did learners begin to produce a systematic contrast between voiced and voiceless stops, with the most advanced learners (C2) approaching native-like production. However, although production generally became more accurate as proficiency increased, considerable variability was observed within each level. This variability was particularly prominent at the B2 and C1 levels, where two distinct learner profiles emerged: one group consistently differentiated voiced and voiceless stops, while the other continued to neutralize the voicing contrast. Voicing and place of articulation significantly influenced VOT across all proficiency levels, with voiceless stops displaying higher VOT values, especially for velar stops. Task type also played a role, with higher VOT values in passage reading than in isolated word reading. Furthermore, increased use of Spanish correlated with decreased VOT values, particularly at the intermediate and low-advanced levels (A2, B1, B2). In the perception study, participants completed a forced-choice identification task involving 165 stimuli with VOT values ranging from -60 ms to 20 ms, yielding a total of 14,850 responses. Sensitivity to the voicing contrast was measured using d-prime scores, and a generalized linear mixed model was applied to estimate the probability of a voiced ('ba', 'da', 'ga') response for each level, participant, and place of articulation.
The random-effects estimates of the model were used to determine the 50% crossover point (CO) for each participant and place of articulation. This perceptual boundary represents the VOT value at which listeners are equally likely to perceive a sound as voiced or voiceless. The width of the perceptual boundary was calculated to assess how consistently each listener identified the VOT stimuli across the five blocks. As with the production data, a generalized linear mixed model was fitted by proficiency level to examine the impact of the independent variables on the estimated response probabilities. Results revealed that sensitivity to the voicing contrast improved non-linearly with increased proficiency. Beginner (A1) and intermediate learners (A2, B1) tended to perceive stimuli as voiced across the entire VOT continuum, showing no perceptual threshold within the examined range. As proficiency increased, perceptual thresholds shifted closer to native values, although even the most advanced learners (C2) did not reach native-level consistency. VOT duration and place of articulation were significant predictors of response probability, and the use of Spanish significantly impacted perception at the A1, A2, B2, and C2 levels. The final part of the study examined the relationship between the production and perception modalities using Spearman correlations between difference scores and the d' discrimination index. Evidence of alignment between the two modalities was found only at the C1 level, where participants who accurately identified consonants with negative VOT as voiced and those with positive VOT as voiceless also produced these consonants with systematically differentiated VOT values. However, this relationship became less consistent at the C2 level, where the two modalities dissociated again. These findings align with recent research supporting an asymmetry between perception and production.
The relationship between the two modalities varies depending on L2 proficiency, becoming more aligned as L2 experience increases. These results suggest a possible asynchronous relationship, in which improvements in perception at one point in time may predict improvements in production at a subsequent stage. This dissertation has several implications. Theoretically, the study underscores the significance of perceived cross-linguistic similarity between L1 and L2 sounds and highlights the need to examine learners with diverse L1 backgrounds. The acquisition of the Spanish voicing contrast by Mandarin speakers follows a distinct chronology compared to that of learners from other L1 backgrounds, with delayed acquisition of voicing features and overgeneralization of prevoicing to voiceless consonants at advanced stages. Importantly, even when two contrastive L2 sounds are classified as a single L1 category (single-category assimilation), the acquisition of the L2 contrast remains possible, a finding that contrasts with prior research in domestic learning environments. Furthermore, the acquisition of voiced stops is not uniform across places of articulation, suggesting that existing L2 speech models should incorporate predictions based on articulatory constraints. The study also demonstrates that the effect of input in L2 acquisition is modulated by both L1 and proficiency level, with significant input effects observed only at advanced stages of learning. Methodologically, the integration of multiple measurements proved crucial for capturing the variability in VOT perception among learners.
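For readers unfamiliar with the two perception measures named in the abstract, the following is a minimal sketch of how a d-prime score and a 50% crossover point are commonly computed. It is not the dissertation's analysis code. The function names, the definition of a "hit" as a voiced response to a negative-VOT stimulus, the 0.5 cell correction, and the example coefficients are all assumptions made for illustration.

```python
from math import exp
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumption of this sketch: a "hit" is a voiced response to a
    negative-VOT stimulus and a "false alarm" is a voiced response to a
    positive-VOT stimulus. Adding 0.5 to each cell (a log-linear
    correction) keeps rates of exactly 0 or 1 from giving infinite
    z-scores; the source does not say which correction, if any, was used.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


def p_voiced(vot_ms, intercept, slope):
    """Probability of a voiced response under a simple logistic curve."""
    return 1 / (1 + exp(-(intercept + slope * vot_ms)))


def crossover_vot(intercept, slope):
    """VOT (ms) at which the logistic curve predicts P(voiced) = 0.5.

    At the crossover the linear predictor is zero, so
    intercept + slope * vot = 0 and vot = -intercept / slope.
    """
    if slope == 0:
        raise ValueError("a flat response curve has no crossover point")
    return -intercept / slope


if __name__ == "__main__":
    # Hypothetical listener: counts and coefficients are invented.
    print(d_prime(hits=90, misses=10, false_alarms=10, correct_rejections=90))
    print(crossover_vot(intercept=-1.0, slope=-0.1))
```

In a mixed model, the per-participant intercept and slope would be the fixed effects plus that participant's random-effect deviations. A listener who labels the whole continuum as voiced, as described for the A1 to B1 learners, has a curve that never reaches 0.5 inside the tested range, so the computed crossover falls outside -60 ms to 20 ms.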
Description
University of Minnesota Ph.D. dissertation. 2025. Major: Hispanic Linguistics. Advisor: Mandy Menke. 1 computer file (PDF); xx, 425 pages.
Suggested Citation
Bravo Diaz, Celia. (2025). La adquisición de las consonantes oclusivas en español por hablantes de chino mandarín: un estudio transversal / The acquisition of stop consonants in Spanish by Mandarin Chinese speakers: a cross-sectional study. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/275940.
