Telisma Records French Children's Speech Database as Part of the Neologos Project
Children are not mini-adults. For speech experts, they represent a distinct and very complex category of speaker in their own right.
Typically, children's voices show higher frequencies and different spectral characteristics. Moreover, the range of values for most acoustic parameters is much larger for children than for adults, a seven-year-old's voice being obviously quite different from a fifteen-year-old's.
But that's not all: children's prosody is different too. To take a few examples, we know that on average, the speaking rate of children is slower than that of adults. Children also tend to overshoot when interacting with machines. And, being more spontaneous, they produce larger amounts of extraneous speech.
Neologos is a speech database project for the French language, resulting from a collaboration between French universities and industry and supported by the French Ministry for Research. The first part of this project consists of creating a 1000-speaker telephone database of children's voices, following the SpeechDat guidelines with some adaptations for child speakers.
Telisma has already begun data collection. Speech is being recorded directly over the fixed telephone network in real-life situations. The database is evenly split between boys and girls and balanced across twelve French linguistic regions. The recorded material consists of 37 linguistic items that are either read, repeated, or given as spontaneous answers to specific questions.
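To make the balancing constraint concrete: 1000 speakers do not divide evenly into twelve regions times two genders, so some cells must hold one speaker more than others. The following is only an illustrative sketch of one way such recruitment quotas could be computed. The function name, the placeholder region labels and the allocation rule are assumptions for illustration and do not describe how Telisma actually allocates speakers.

```python
from itertools import product


def allocate_quotas(total_speakers, regions, genders=("girl", "boy")):
    """Split total_speakers as evenly as possible across (region, gender) cells.

    Hypothetical helper: the real Neologos recruitment procedure is not
    described in the article.
    """
    cells = list(product(regions, genders))
    base, remainder = divmod(total_speakers, len(cells))
    # The first `remainder` cells each take one extra speaker so totals add up.
    return {
        cell: base + (1 if index < remainder else 0)
        for index, cell in enumerate(cells)
    }


# Placeholder labels: the article does not name the twelve linguistic regions.
regions = [f"region_{n:02d}" for n in range(1, 13)]
quotas = allocate_quotas(1000, regions)

girls = sum(count for (_, gender), count in quotas.items() if gender == "girl")
boys = sum(count for (_, gender), count in quotas.items() if gender == "boy")
print(f"girls={girls}, boys={boys}, cells={len(quotas)}")
```

Under these assumptions every cell receives 41 or 42 speakers. Because the extra speakers are handed out region by region, covering both genders of a region before moving on, the overall split between girls and boys stays exactly even.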
The children's database will be used by Telisma to expand existing grammars and possibly create a new language library specifically dedicated to children.