Corpora

This website hosts children's speech corpora for studying Catalan acquisition, Spanish acquisition, and bilingual Catalan-Spanish acquisition.

 

Serra-Solé Corpus

Database of five monolingual Catalan-speaking children, one monolingual Spanish-speaking child, and four bilingual Catalan-Spanish-speaking children. Recorded sessions from 1;0 to 2;9 are phonetically transcribed using the Phon application. The whole corpus includes sessions from 1;0 to 4;0 and is available in CHILDES.

 

Llinàs-Ojea Corpus

Database of two monolingual Spanish-speaking children and one monolingual Catalan-speaking child. Recorded sessions from 0;11 to 2;04 are phonetically transcribed using the Phon application. The whole corpus includes sessions from 0;11 to 3;02 and is available in CHILDES.
 

 

López-Ornat Corpus

 

Database of one monolingual Spanish-speaking child. Recorded sessions from 1;07 to 2;04 are phonetically transcribed using the Phon application. The whole corpus includes sessions from 1;07 to 3;10 and is available in CHILDES.

 

Esteve-Prieto Corpus

 

Database of audiovisual recordings of four monolingual Catalan-speaking children from 6 to 32 months of age, recorded weekly or fortnightly. Sessions from 7 to 11 months include orthographic and pragmatic transcriptions made with Phon.

 

 

How to search the corpora

The corpora are available as downloadable files for two different programs:
  • PhonProject file: includes recorded sessions that have been segmented, annotated, and phonetically transcribed. Read more about Phon.
  • CLAN file: includes sessions that have been orthographically transcribed. All files are available in both .pdf and .cha formats. Read more about CLAN.
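
For readers who want to inspect a downloaded .cha file without opening CLAN, the following is a minimal Python sketch. It relies only on the general CHAT layout, in which header lines begin with "@", speaker lines begin with "*", dependent tiers begin with "%", and continuation lines begin with a tab. The function name read_utterances and the sample transcript are hypothetical and are not taken from these corpora; real files also carry annotation codes that this sketch does not remove.

```python
def read_utterances(cha_text):
    """Return a list of (speaker, utterance) pairs from CHAT-formatted text.

    Illustrative helper only: it keeps main-tier lines ("*SPK:") and joins
    tab-indented continuation lines, ignoring headers and dependent tiers.
    """
    utterances = []
    in_main_tier = False
    for line in cha_text.splitlines():
        if line.startswith("*") and ":" in line:
            # Main tier: "*CHI:<tab>utterance text"
            speaker, _, text = line[1:].partition(":")
            utterances.append((speaker.strip(), text.strip()))
            in_main_tier = True
        elif line.startswith("\t") and in_main_tier:
            # Continuation of the previous main-tier line
            speaker, text = utterances[-1]
            utterances[-1] = (speaker, (text + " " + line.strip()).strip())
        else:
            # Header ("@"), dependent tier ("%"), or anything else
            in_main_tier = False
    return utterances


# Invented sample in CHAT layout (not real corpus data).
SAMPLE = "\n".join([
    "@Begin",
    "@Participants:\tCHI Target_Child, MOT Mother",
    "*MOT:\tqué es esto ?",
    "*CHI:\tagua .",
    "%com:\tinvented comment line",
    "*CHI:\tmás",
    "\tagua .",
    "@End",
])

for speaker, text in read_utterances(SAMPLE):
    print(speaker, text)
```

To use it on a downloaded transcript, read the file as UTF-8 text and pass its contents to read_utterances. For phonetic transcriptions, the Phon application remains the appropriate tool.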

 

The team members of the Phon-Cat and Phon-Esp projects would appreciate receiving a copy or a summary of any work that uses data available in these corpora.