Elicitation Experiments in Language Acquisition
Sonja Eisenbeiss (University of Essex)

Overview
I: Elicitation and other Types of Production Data
II: Semi-Structured Elicitation and Stimulus Materials
III: Production Experiments
See Eisenbeiss (2009)

Part I: Elicitation and other Types of Production Data
• naturalistic data
• semi-structured elicitation
• production experiments
• transcription and data analysis

Naturalistic Data
• other term: spontaneous speech data
• recording of ongoing communicative events (free play, dinner table conversation, …)

Advantages
• age-independent
• no special task demands, thus high ecological validity
• frequency information available
• input analysis possible
• analysable for different phenomena

Problems
• low comparability
• underestimation of productivity due to recurrent situations which require similar linguistic encoding
• lack of data for low-frequency phenomena (morphemes, constructions, …)

NPs in German Child Language (Eisenbeiss 1994, 2003)
Values in parentheses refer to files with elicitation.

child                      AND    ANN      CAR    HAN      LEO            MAT      SVE            total
age                        2;1    2;4-2;9  3;6    2;0-2;8  1;11-2;11      2;3-3;6  2;9-3;3        1;11-3;6
files (with elicitation)   1      6        1      8        15 (all)       18       15 (10)        64 (20)
utterances                 1,450  1,977    1,795  1,399    4,383 (4,383)  1,978    3,811 (2,814)  16,793 (7,197)

(Clahsen 1982, Wagner 1985, Clahsen et al. 1990)

NPs in Spontaneous Speech

noun phrase with context for   number   % of utterances   correlation with mean length of utterance
article                        2,646    28                0.489; sign.
adjective + article            249      3                 0.274; n.s.
possessive 's                  19       <1                -0.097; n.s.

Problems for Naturalistic Corpus Studies on NPs
• Some NP types are comparatively frequent and become more frequent with increasing utterance length (MLU).
• BUT: Some NP types (e.g.
those with contexts for adjective + article or possessive 's):
  • are rare
  • occur in some files, but not in others
  • do not become more frequent over time (though children's utterances get longer)

Semi-Structured Elicitation
• Encouraging speech production in a naturalistic (often game-like) setting.
• e.g. eliciting complete sentences with the verb to give in an "animal feeding game": participants have to feed toy animals and explain which food items they would like to give to which animals (Eisenbeiss 1994)
• often used as a supplement to naturalistic data or experiments

Advantages
• appropriate for very young learners (< 2)
• high comparability
• no underestimation of productivity due to recurrent situations which require similar linguistic encoding
• no underestimation of productivity due to high task demands
• data for low-frequency phenomena (morphemes, constructions, …)
• analysable for different phenomena

NPs with Adjective + Article (Eisenbeiss 2003)
For instance:
• picture-matching game: putting picture cards on a board with pictures (red balloon, blue balloon, ...)
  What is this? This is ... (NOM)
  What goes onto this picture here? (NOM)
  What do you want to put here? (ACC)

[Figure: NPs with Adjective + Article. y-axis: % of analysable utterances (0 to 25); x-axis: MLU (1.0 to 4.5); one series per child (and, ann, car, han, leo, mat, sve); black symbols: files with elicitation.]

NPs with possessive -s (Eisenbeiss 2003)
For instance:
• possession-matching game: assigning possessions (depicted on cards) to people (depicted on board)
  Whose bicycle is this? This is ...
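The figures in this part plot, for each recording, the percentage of analysable utterances that contain a context for the target NP type against the MLU of that recording. The sketch below shows one way such a percentage and its correlation with MLU could be computed. The counts are invented for illustration and are not taken from the corpus, the helper names are hypothetical, and no significance test is performed.

```python
from math import sqrt

# Invented per-file counts, for illustration only:
# (MLU of the file, analysable utterances, utterances with a context for the target NP type)
files = [
    (1.5, 200, 30),
    (2.0, 250, 50),
    (2.5, 300, 75),
    (3.0, 320, 96),
]


def percent_with_context(analysable, with_context):
    """Share of analysable utterances that contain a context for the target NP type."""
    return 100.0 * with_context / analysable


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


mlus = [mlu for mlu, _, _ in files]
percents = [percent_with_context(a, c) for _, a, c in files]
r = pearson(mlus, percents)

print(percents)
print(round(r, 3))
```

With real data, one tuple per transcript file would be entered, and a coefficient near zero would correspond to the "n.s." pattern reported for the rare NP types.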
[Figure: NPs with possessive -s. y-axis: % of analysable utterances (0 to 6); x-axis: MLU (1.0 to 4.5); one series per child (and, ann, car, han, leo, mat, sve); black symbols: files with elicitation.]

Problems
• age-dependent (usually at least 1;6 years)
• no frequency information available
• no input analysis possible

Experiments
• Systematic control of variables (properties of participants and stimulus materials)
• Standardized procedures
• Limited range of response options

Advantages
• high comparability
• data for low-frequency phenomena (morphemes, constructions, …)
• no underestimation of productivity due to recurrent situations which require similar linguistic encoding

Problems
• age-dependent (usually at least 3 years)
• underestimation of productivity due to comparatively high task demands
• no frequency information available
• no input analysis possible
• analysis restricted to target phenomenon

Transcription and Data Analysis
• Naturalistic data and semi-structured elicitation games require transcription.
• The most common format is the CHAT format, developed for the largest child language data repository: CHILDES (http://childes.psy.cmu.edu/).
• Digital CHAT files can be searched using the CLAN tools provided by CHILDES.
• All three data types require "error" analysis, i.e. classifying deviations from the target.

Transcription: CHAT-Format
• Transcripts are written in a text editor and stored as unformatted ASCII files (text only or plain text).
• All lines are ended by a carriage return (ENTER).
• Every transcript must begin with the line @Begin and end with the line @End.
• Between @Begin and @End:
  – headers with information about the transcript (obligatory: @Participants)
  – main tier for transcription
  – dependent tiers for further annotations

CHAT-Format: Basic Structure
  @Begin
  @Participants
  [other headers]
  *JOE: [spoken material]
  %mor: [morpho-syntactic coding]
  *INT: [spoken material]
  %mor: [morpho-syntactic coding]
  @End

CHAT-Format: Headers
• three letters followed by a colon and a tab
• obligatory: @Participants, on the second line of the transcript; e.g.:
  @Participants: JOE Joe child, INT Interviewer
• optional; e.g.:
  – @Birth of Learner: …
  – @Age of Learner: …
  – @Date: …
  – @Language: …
  – @Transcriber: …

CHAT-Format: Main Tiers
• what was actually said, one utterance per tier
• introduced by "*", the three-letter code for the participant, a colon and a tab; e.g.:
  *JOE: the boy put the leash on the cat.
• orthographic transcription in lower-case Latin letters, except for proper nouns (e.g. John) and "I"
• numbers spelled out (ten, not 10)
• normalisation of phonetically deviant forms (phonetic information about forms can be presented on a %pho dependent tier)

Main Tiers: Markers
• unfilled pauses: #
• filled pauses: eh@fp
• interruption: +/.
• self-interruption: +//.
• repetition without correction: [/]
• repetition with correction: [//]
• unintelligible speech: xxx
• material coded on the phonological tier: yyy
• doubtful material: [?] or [=? text]
• omitted parts of words: ()
• to refer to more than one word: < >

CHAT: Dependent Tiers
further annotations, e.g.
• %mor [morphosyntactic coding]
• %pho [phonological coding]
• %syn [syntactic coding]
• %err [errors]
• %com [comments]
• %spa [speech acts]

CLAN: Windows
• the commands window, where you specify the folders, files, and commands you want to use
• the CLAN output window, where you will see the results of your searches. If you have not specified an output file, your results will be displayed in this window.
If you have saved your outputs into a file (as you will be asked to do for this exercise), you will not be able to see them in the output window, but the name and location of the output file will be displayed there.

CLAN: Command Window

CLAN: Steps
• specify your working and lib directory, where the files you will be working with are stored
• specify your output directory, where any output files will be stored
• select a command
• select one or more transcription files for analysis
• optionally use some so-called switches to modify the commands

CLAN: Core Commands
• FREQ: provides type and token frequency information
• COMBO: finds utterances matching a given set of criteria
• MLU: calculates the MLU (mean length of utterance)

CLAN: Useful Switches
• +f  saves output to file. For each transcription that you have chosen to analyse, an output file will be generated. By default, this output file will have the name of the transcription file and an extension that shows which command was used to create the output (e.g. frq, mlu or cmb).
• +s  searches for a string in a file.
• +t  restricts the search to a particular tier, e.g. the tier of a particular speaker.
• +u  treats all files together.
• +o  orders FREQ lists according to token frequency.
• +w  -w1 and +w1 provide one preceding/following line, -w2 and +w2 provide two preceding/following lines, etc.

CLAN: Search Strings
• ^   immediately followed by
• +   inclusive OR
• !   logical NOT
• *   "joker" (wildcard)
• ""  strings including blanks, etc. should be put in quotes

Error Analysis
• % suppliance in obligatory contexts (e.g. % -ed in past tense contexts) vs. % correct use of a particular form (e.g. % correct use of all present tense forms)
• errors of omission (e.g. two mouse) vs. errors of commission (e.g. two mouses)
• overregularisations (e.g. sing-singed) vs.
overirregularisations (e.g. bite-bote)

Part II: Semi-Structured Elicitation and Stimulus Materials
• Interactional Setting
• Target Type
• The Puzzle Task
• Stimulus Type
• Transcription and Analysis

Interactional Setting
• Director/matcher (or "confederate description"): A "director" describes a scene, object, etc., and a "matcher", who cannot see this scene or object, has to recreate it. E.g.: The matcher has to build a toy house identical to the one created by the director, who is hidden behind a screen.
• Speaker/Listener: A speaker provides information for someone who does not have access to this information. E.g.: The speaker retells a story (s)he heard or read while the listener was not in the room.
• Co-Players: All participants are involved in a game and provide each other with information to co-ordinate their actions. E.g.: The players are involved in a construction or puzzle game where not everyone has access to all pieces.

Target Type
• broad-spectrum (generally encouraging participants to speak)
• form-focused: the use of a particular form or construction
• meaning-focused: the linguistic encoding of a particular function or meaning (which can be encoded in different ways)

Broad-Spectrum
• Frog story: a picture book without words used to elicit narratives (Berman/Slobin 1994)
• Bag task: a bag for blocks and animals of different sizes and colours.
The bag has pockets that match the animals in colour and have coloured buttons, ties, etc.; children frequently refer to colours, sizes and locations when they ask other players to help them hide or find animals in the pockets (Eisenbeiss 2009b)

Form-focused
• Picture-matching game: aimed at the elicitation of noun phrases with adjectives in different case contexts (Eisenbeiss 1994), see Part I
• Possession-matching games: aimed at the elicitation of adnominal possessive constructions (Eisenbeiss 1994)

Meaning-focused
• "circle of dirt": a picture book without words used to elicit descriptions of part-whole relationships and actions affecting (body) parts (Eisenbeiss and McGregor 1999)
• "cut-and-break": video stimulus created for cross-linguistic studies of "separation and material destruction" events (Bohnemeyer, Bowerman and Brown 2001)

The Puzzle Task (Eisenbeiss 2009b)
• a task with co-players: the child describes contrasting pictures on a puzzle board, the adult finds the matching pieces, and the child puts them into the correct cut-out
• exchangeable pictures and puzzle pieces
• can be used to elicit particular forms or to elicit the linguistic encoding of particular meanings

Eisenbeiss/Matsuo (2005) Eisenbeiss et al.
(2009)
• German: 39 recordings (picture descriptions and asking for pieces); 1286 utterances with/without V; 21 children (3;7-6;6)
• Japanese: 67 recordings (asking for pieces); 421 utterances with V; 16 children (2;11-6;5)

Elicitation Material: give, bite, wash, put on

German: The Use of Verbs (%)

                 GAME
Verb     GIVE   BITE   WASH   PUT
GIVE     60     0      <1     <1
BITE     0      62     0      0
WASH     0      0      70     0
PUT      0      0      0      48
other    25     16     14     27
no V     14     22     15     25

Error Types
• More contexts for errors, but no qualitative differences in error types observed so far
• For instance, PPs instead of IOs:
  • naturalistic (Carsten 3;6):
    für'n papa sollste aber den schenken
    for the daddy shall PART this-one give
  • elicited (Jannik 6;4):
    da gibt das baby fuer das schaf ehm den gras.
    there gives the baby for the sheep ehm the grass

[Figure: Overt Arguments in Japanese (%). y-axis: 0 to 100; bars for subject and object; series: Jun (3;8), naturalistic data; Puzzle (≤ 3;8); Puzzle (all).]

Stimulus Type
• pictures
• photos
• computer animations
• videos
• toys
• real objects

Pictures
• better for descriptions of static objects/properties than for event descriptions and verb elicitation
• require knowledge of artistic conventions (e.g. lines for movement, shading, etc.)
• can be used for "unrealistic" events (e.g. animals in different colours or positions)
• comparatively easy to create with clip art and standard software
• comparatively easy to modify for minimal variations (e.g. colour)

Photos
• better for descriptions of static objects/properties than for event descriptions and verb elicitation
• do not require knowledge of artistic conventions and are comparatively "natural"
• problematic for "unrealistic" events (e.g. animals in different colours or positions)
• comparatively easy to create and manipulate with standard photo equipment and Photoshop or similar software
• cannot be as easily modified for minimal variations (e.g.
colour) as pictures, but this is possible in principle

Computer Animations
• better for descriptions of dynamic events and verb elicitation than for descriptions of static objects/properties
• not very naturalistic, in particular when it comes to natural movements of people and animals
• can be used for "unrealistic" events (e.g. funny movements of animals)
• difficult and time-consuming to create
• good control for minimal variations (e.g. direction or manner of motion)

Videos
• better for descriptions of dynamic events and verb elicitation than for descriptions of static objects/properties
• comparatively natural
• cannot easily be used for "unrealistic" events
• comparatively simple to create with standard digital video equipment and editing software (Adobe etc.)
• cannot be as easily modified for minimal variations (e.g. direction or manner of motion) as computer animations, because "actors" tend to introduce unwanted modifications

Toys
• e.g. stuffed animals, cars, blocks
• appropriate for descriptions of dynamic events and verb elicitation as well as for descriptions of static objects/properties
• not completely naturalistic, in particular when it comes to natural movements of people and animals
• often very culture-dependent
• can be used for "unrealistic" events (e.g. funny movements of animals)
• usually easily obtainable
• object properties can be fairly well controlled, but for dynamic events, actions of toy "actors" tend to introduce unwanted modifications

Real Objects
• e.g. tools, household items like pots, dishes
• appropriate for descriptions of dynamic events and verb elicitation as well as for descriptions of static objects/properties
• very naturalistic
• basic objects are often less culture-dependent than toys
• can be used for realistic and "unrealistic" events (e.g. funny movements of pots)
• usually easily obtainable
• object properties are a bit harder to control than for toys, and modifications might reduce naturalness (e.g.
atypical colours for household items)
• for dynamic events, manipulations of the objects tend to introduce unwanted modifications

Part III: Production Experiments
• elicited imitation
• elicited production
• speeded production
• syntactic priming
• input/feedback manipulations
• eye-tracking and speech production

Elicited Imitation
Participants are asked to imitate spoken sentences. Thus, it is clear what learners should say; and when stimuli are sufficiently long and complex, participants cannot simply memorise them as a whole, but have to employ their own grammar to recreate them. A comparison of the target utterance and the learner's actual production can therefore shed light on the grammatical knowledge that learners employ to express a given meaning.

Elicited Imitation: Pro & Con
Advantages:
• easy to carry out
• clear target for comparisons
Problems:
• good performance might be due to simple memorisation
• errors might be due to a lack of vocabulary knowledge, memory limitations, etc.

Elicited Production
Participants receive a prompt to produce a form (e.g. This is a door. These are two …?). Alternatively, learners can be instructed to turn a given sentence into a question, a negated sentence, etc. (e.g. I'll say something and then you say the opposite).

Elicited Production: Pro & Con
Advantages:
• independence of memorised models
Problems:
• requires reliable prompts
• unclear influence of participants' earlier experience, unless novel words are used (This is a wug. These are two …?, Berko 1958)
• performance errors due to task difficulties (especially with novel words)

Berko (1958)

Speeded Production
The frequency of stimulus items is manipulated to study storage and computation in learners' developing mental lexicon: If a morphologically complex form such as walk-ed is stored as a whole, then high-frequency forms should have stronger memory traces, due to additional exposure. Hence, they should be retrieved and produced faster than low-frequency forms.
In contrast, if morphologically complex forms are computed from stems and affixes, production latencies should only be affected by the frequency of these components (e.g. walk and -ed), not by the frequency of the complex form (e.g. walk-ed).

Speeded Production: Pro & Con
Advantages:
• can provide information about storage and computation
Problems:
• requires highly reliable prompts
• requires frequency information for the words in the variety the learner is exposed to (e.g. child-directed German, …)
• requires reaction-time equipment

Clahsen et al. (2004)
Elicitation of German participles with a computer game involving an alien: overgeneralisations of regular participle inflection and a frequency effect for irregulars only, which suggests that irregular word forms are stored as wholes, while regularly inflected word forms are computed.

Syntactic Priming
Speakers tend to repeat syntactic structure across otherwise unrelated utterances (Branigan 2007). For instance, speakers are more likely to use a passive after hearing or producing a passive sentence than after an active sentence. If learners show this effect, this indicates that they have acquired a grammatical representation that can be activated by priming.

Syntactic Priming: Variants
• presentation of primes:
  • between groups
  • in blocks
  • alternating
• required participant response to primes:
  • listening only
  • listening and repeating
• lexical overlap between prime and target:
  • no lexical overlap
  • same verb or head noun

Syntactic Priming: Pro & Con
Advantages:
• provides insights into representations
Problems:
• requires pairs of alternative structures that speakers could use (e.g. active/passive)

Input/Feedback Manipulation
The effect of different types of input (e.g. models) or feedback (explicit corrections, reformulations, etc.) on speakers' (elicited) production is studied.
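As a rough illustration of how the syntactic priming effect described above could be quantified, the following sketch compares the proportion of passive target descriptions after passive primes with the proportion after active primes. The trial list is invented for illustration, the function names are hypothetical, and a real study would add an inferential test and control for lexical overlap between prime and target.

```python
from collections import Counter

# Invented toy trials: (structure of the prime, structure the participant produced).
trials = [
    ("passive", "passive"),
    ("passive", "active"),
    ("passive", "passive"),
    ("passive", "active"),
    ("active", "active"),
    ("active", "active"),
    ("active", "passive"),
    ("active", "active"),
]


def passive_rate(trials, prime):
    """Share of target descriptions realised as passives after the given prime type."""
    produced = [target for p, target in trials if p == prime]
    return Counter(produced)["passive"] / len(produced)


def priming_effect(trials):
    """Difference in passive rate after passive primes vs. after active primes."""
    return passive_rate(trials, "passive") - passive_rate(trials, "active")


print(passive_rate(trials, "passive"))
print(passive_rate(trials, "active"))
print(priming_effect(trials))
```

A positive difference is the pattern expected if the passive representation can be activated by the prime; a difference near zero would give no evidence of priming.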
Input/Feedback Manipulation: Pro & Con
Advantages:
• can provide information about the role of input/feedback
Problems:
• dependency of corrective feedback on the occurrence of errors

Eye-Tracking and Speech Production
Eye movements are monitored while participants describe pictures and videos. This allows us to investigate the amount of visual information required to start speaking and the planning processes involved in speech production (pre-viewing scenes, aligning speech and vision, etc.).

Dobel et al. (2009)

Eye-Tracking and Speech Production: Pro & Con
Advantages:
• can provide information about the role of visual information and speech planning processes
Problems:
• requires specialised equipment
• requires good knowledge of visual processing

Conclusion
Converging Evidence

References I
Berko, J. 1958. The child's learning of English morphology. Word 14, 150-177.
Berman, R. A. & Slobin, D. I. 1994. Relating Events in Narrative: A Crosslinguistic Developmental Study. Hillsdale, NJ: Lawrence Erlbaum.
Bohnemeyer, J., Brown, P. & Bowerman, M. 2001. Cut and Break Clips. In Levinson, S.C. & Enfield, N.J. (eds), 'Manual' for the field season 2001, 90-96. Nijmegen: Max Planck Institute for Psycholinguistics.
Branigan, H. 2007. Syntactic Priming. Language and Linguistics Compass 1, 1-16.
Clahsen, H. 1982. Spracherwerb in der Kindheit. Eine Untersuchung zur Entwicklung der Syntax bei Kleinkindern. Tübingen: Narr.

References II
Clahsen, H., Hadler, M. & Weyerts, H. 2004. Speeded production of inflected words in children and adults. Journal of Child Language 31, 683-712.
Clahsen, H., Vainikka, A. & Young-Scholten, M. 1990. Lernbarkeitstheorie und Lexikalisches Lernen. Eine kurze Darstellung des LEXLERN-Projekts. Linguistische Berichte 130, 466-477.
Dobel, C., Glanemann, R., Kreysa, H., Zwitserlood, P. & Eisenbeiss, S. 2009. Visual encoding of meaningful and meaningless events. Submitted.

References III
Eisenbeiss, S. 1994.
Elizitation von Nominalphrasen und Kasusmarkierungen. In: Sonja Eisenbeiß, Susanne Bartke, Helga Weyerts & Harald Clahsen (eds), Elizitationsverfahren in der Spracherwerbsforschung: Nominalphrasen, Kasus, Plural, Partizipien (Arbeiten des Sonderforschungsbereichs 282, Nr. 57). Düsseldorf: Heinrich-Heine-Universität, 1-38.
Eisenbeiss, S. 2003. Merkmalsgesteuerter Grammatikerwerb: Eine Untersuchung zum Erwerb der Struktur und Flexion von Nominalphrasen. Dissertation, University of Duesseldorf. http://diss.ub.uni-duesseldorf.de/ebib/diss/file?dissid=1185
Eisenbeiss, S. 2009a. Production Methods. Ms. University of Essex.
Eisenbeiss, S. 2009b. Semi-Structured Elicitation Tasks. Ms. University of Essex (to appear in Essex Research Reports in Linguistics: http://www.essex.ac.uk/linguistics/errl/).

References IV
Eisenbeiss, S. & Matsuo, A. 2003. External and Internal Possession - A Comparative Study of German and Japanese Child Language. Paper presented at the 28th Annual Boston University Conference on Language Development, Boston University.
Wagner, K. R. 1985. How much do children say in a day? Journal of Child Language 12, 475-487.
Eisenbeiss, S. & Matsuo, A. 2005. Eliciting Language Production Data from Young Children. Presentation at the Xth International Congress for the Study of Child Language, Berlin, Germany.
Eisenbeiss, S., Matsuo, A. & Sonnenstuhl, I. 2009. Learning to Encode Possession. Submitted.
Eisenbeiss, S. & McGregor, W.B. 1999. The circle of dirt. Ms. Max-Planck-Institute for Psycholinguistics, Nijmegen.

Further Readings I
Breakwell, G.M., Hammond, S. & Fife-Schaw, C. (eds) 2003. Research Methods in Psychology. London: Sage Publications.
Field, A. & Hole, G. 2003. How to Design and Report Experiments. Sage Publications.
McDaniel, D., McKee, C. & Smith Cairns, H. (eds) 1996. Methods for Assessing Children's Syntax. Cambridge, MA: MIT Press, 3-22.
Menn, L. & Bernstein Ratner, N. (eds) 2000. Methods for studying language production.
Mahwah, N.J.: Lawrence Erlbaum Associates.

Further Readings II
Sekerina, I.A., Fernandez, E.M. & Clahsen, H. (eds) 2008. Developmental Psycholinguistics: On-Line Methods in Children's Language Processing. Amsterdam: Benjamins.
Wei, Li & Moyer, M. (eds) 2008. The Blackwell Guide to Research Methods in Bilingualism and Multilingualism. London: Blackwell.
Wray, A. & Bloomer, A. 2006. Projects in Linguistics. A Practical Guide to Researching Language. London: Arnold.