Sublexical phonotactic regularities in language have a major impact on language development, as well as on speech processing and production throughout the entire lifespan. To understand the impact of phonotactic regularities on speech and language functions at the behavioral and neural levels, it is essential to have access to oral language corpora to study these complex phenomena in different languages. Yet, probably because of their complexity, oral language corpora remain less common than written language corpora. This article presents the first corpus and database of spoken Quebec French syllables and phones: SyllabO+. This corpus contains phonetic transcriptions of over 300,000 syllables (over 690,000 phones) extracted from recordings of 184 healthy adult native Quebec French speakers, ranging in age from 20 to 97 years. To ensure the representativeness of the corpus, these recordings were made in both formal and familiar communication contexts. Phonotactic distributional statistics (e.g., syllable and co-occurrence frequencies, percentages, percentile ranks, transition probabilities, and pointwise mutual information) were computed from the corpus. An open-access online application to search the database was developed, and is available at www.speechneurolab.ca/syllabo . In this article, we present a brief overview of the corpus, as well as the syllable and phone databases, and we discuss their practical applications in various fields of research, including cognitive neuroscience, psycholinguistics, neurolinguistics, experimental psychology, phonetics, and phonology. Nonacademic practical applications are also discussed, including uses in speech-language pathology.
Keywords: Corpus; Distributional statistics; Oral language; Phones; Phonotactic regularities; Syllable; Transition probabilities.