The spoken corpus of the English of Hong Kong and Mainland Chinese learners provides 82 sets of high-quality recordings. Phonological annotations cover the passage readings of 40 speakers for segmental features and five selected sentences for suprasegmental features.
The spoken English corpus of Chinese and Non-Chinese learners in Hong Kong is an expanded and redeveloped version of the previous spoken corpus. Speech data were elicited from Hong Kong speakers, Mainland Chinese speakers from eight different dialect backgrounds, and non-Chinese speakers.
Our spoken corpora contain audio data with phonological annotations that focus on three areas of segmental features (vowels, consonants and syllable structures) and four areas of suprasegmental features (lexical stress, pause, linking and intonation).
The corpora have the following characteristics:
1. They provide high-quality recordings that are ideally suited for phonetic and acoustic analysis by researchers around the world.
2. They offer recordings and phonological annotations that are easily accessible and immediately available to all learners, teachers and researchers, both within and outside EdUHK.
3. They provide platforms on which learners can access and rate the corpus data, so that they can discover linguistic features on their own and engage more actively in their learning.
4. They describe the distinctive linguistic features of English produced by Hong Kong, Mainland Chinese and non-Chinese university students.