A corpus of Japanese vowel formant patterns

P. Mokhtari and K. Tanaka

Bulletin of The Electrotechnical Laboratory (ETL), Vol.64, Special Issue, 57-66, Nov 2000.

ABSTRACT

This paper describes a dataset of formant patterns measured in the steady-state portions of recorded Japanese vowels. Five adult male native speakers of Japanese were selected from the "ETL-WD-I and II" balanced word dataset, and for each of the five vowels /i, e, a, o, u/, 22 different words were chosen because they consistently yielded the longest and steadiest vocalic nuclei. A semi-supervised method based on linear-prediction (LP) analysis of the speech waveform was then used to extract the first four formants in five consecutive frames of each vocalic nucleus, yielding a total of 2750 patterns (5 speakers × 5 vowels × 22 words × 5 frames) of formant frequencies {F1, F2, F3, F4} and formant bandwidths {B1, B2, B3, B4}. These formant patterns are offered in electronic form, with the aim of contributing to the small but growing body of publicly available formant data.
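The total of 2750 patterns follows from the counts given in the abstract. The short Python sketch below enumerates one index tuple per pattern to make that arithmetic explicit. The constant names and the integer labels for speakers, words, and frames are placeholders chosen for illustration; they are assumptions and do not reflect the identifiers or file layout used in the distributed data.

```python
from itertools import product

# Counts taken from the abstract. All names and index labels here are
# illustrative placeholders, not the identifiers used in the data files.
N_SPEAKERS = 5
VOWELS = ("i", "e", "a", "o", "u")
WORDS_PER_VOWEL = 22
FRAMES_PER_NUCLEUS = 5
PARAMETERS = ("F1", "F2", "F3", "F4", "B1", "B2", "B3", "B4")

# One (speaker, vowel, word, frame) tuple per formant pattern.
patterns = list(
    product(
        range(N_SPEAKERS),
        VOWELS,
        range(WORDS_PER_VOWEL),
        range(FRAMES_PER_NUCLEUS),
    )
)

n_patterns = len(patterns)                 # 5 * 5 * 22 * 5
n_values = n_patterns * len(PARAMETERS)    # 8 measured values per pattern

print(n_patterns)  # 2750
print(n_values)    # 22000
```

Each pattern carries four formant frequencies and four bandwidths, so the corpus as described holds 22000 individual measurements.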

DOWNLOADS

The URL given in the paper is no longer valid, but the paper, the data, and the documentation are all available here:

The paper

The formant data

The documentation

Updated: 27 February 2018
