A corpus of Japanese vowel formant patterns

P. Mokhtari and K. Tanaka

Bulletin of The Electrotechnical Laboratory (ETL), Vol.64, Special Issue, 57-66, Nov 2000.


ABSTRACT This paper describes a dataset of formant patterns measured in the steady-states of recorded Japanese vowels. Five adult, male, native speakers of Japanese were selected from the "ETL-WD-I and II" balanced word dataset; and for each of the five vowels /i, e, a, o, u/, 22 different words were selected on the basis of consistently finding the lengthiest and most steady-state vocalic nuclei. A semi-supervised method based on linear-prediction (LP) analysis of the speech waveform was then used to carefully extract the first four formants in five consecutive frames of each vocalic nucleus, thereby yielding a total of 2750 patterns of formant frequencies {F1, F2, F3, F4} and formant bandwidths {B1, B2, B3, B4}. These formant patterns are offered in electronic form, with the aim of contributing to the small but growing body of publicly available formant data.
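As a rough illustration of the LP-based formant estimation the abstract refers to, the sketch below runs a generic, textbook autocorrelation-method LP analysis on a single frame and converts the roots of the prediction polynomial into formant frequencies and bandwidths. It is not the authors' semi-supervised procedure. The function names, the LP order, the Hamming window and the sampling rate in the usage note are all assumptions made for illustration.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Autocorrelation-method LP: return [1, a1, ..., ap] of A(z)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))

def lp_formants(frame, fs, order=10, window=True):
    """Estimate candidate formant frequencies and bandwidths (Hz) for one frame.

    Hypothetical helper: the order and the windowing choice are assumptions,
    not values taken from the paper.
    """
    frame = np.asarray(frame, dtype=float)
    if window:
        frame = frame * np.hamming(len(frame))
    a = lp_coefficients(frame, order)
    roots = np.roots(a)                    # poles of the all-pole model 1/A(z)
    roots = roots[np.imag(roots) > 0]      # keep one pole of each conjugate pair
    freqs = np.angle(roots) * fs / (2 * np.pi)
    bws = -np.log(np.abs(roots)) * fs / np.pi
    idx = np.argsort(freqs)
    return freqs[idx], bws[idx]
```

Calling `lp_formants(frame, fs=10000, order=10)` on a vowel frame returns candidate frequencies and bandwidths sorted by frequency. In practice some poles model spectral tilt rather than formants, which is one reason a supervised step of the kind the abstract mentions is useful when assigning F1 to F4.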


DOWNLOADS The URL given in the paper is no longer valid, but the paper, the data, and the documentation are available here:

The paper

The formant data

The documentation



Copyright ©Parham Mokhtari 2000-2019 Updated: 27 February 2018