1. Setting up
In order to use AutoVOT you'll need the following installed in addition to the source code provided here:
- If you're using Python version 2.6 or earlier, you will need to install the argparse module (which is included by default in Python 2.7), e.g. by running
  easy_install argparse
  on the command line.
- If you're using Mac OS X you'll need to download GCC, as it isn't installed by default. You can either:
- Install Xcode, then install Command Line Tools using the Components tab of the Downloads preferences panel.
- Download the Command Line Tools for Xcode as a stand-alone package.
You will need a registered Apple ID to download either package.
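The two prerequisites above (argparse and GCC) can be checked before going further. The following is a minimal sketch, not part of AutoVOT: the function name `check_prerequisites` is hypothetical, and the use of `shutil.which` assumes the check itself is run with Python 3.3 or later.

```python
import shutil


def check_prerequisites(compiler="gcc"):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    # argparse ships with Python 2.7 and later; older versions need it installed.
    try:
        import argparse  # noqa: F401
    except ImportError:
        problems.append("argparse module not found (try: easy_install argparse)")
    # shutil.which returns None when the executable is not on the PATH.
    if shutil.which(compiler) is None:
        problems.append(compiler + " not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("MISSING: " + problem)
```

If the script prints nothing, both checks passed on this machine.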
Files included in this version:
- AutoVOT scripts: autovot/ contains all scripts necessary for the user to extract features, train, and decode VOT measurements.
- Tutorial example data:
  - experiments/data/tutorialExample/ contains the .wav and .TextGrid files used for training and testing, as well as makeConfigFiles.sh, a helper script used to generate file lists.
    - Note: This data contains short utterances with one VOT window per file. Future versions will contain examples with longer files and more instances of VOT per file.
    - The TextGrids contain 3 tiers, one of which will be used by autovot. The tiers are phones, words, and vot. The vot tier contains manually aligned VOT intervals that are labeled "vot".
- Example classifiers: experiments/models/ contains three pre-trained classifiers that the user may use if they do not wish to provide their own training data. All example classifiers were used in Sonderegger & Keshet (2012) and correspond to the Big Brother and PGWords datasets in that paper:
  - Big Brother: the bb_jasa.classifier files are trained on conversational British speech. Word-initial voiceless stops were included in training. This classifier is best to use if working with conversational speech.
  - PGWords: nattalia_jasa.classifier is trained on single-word productions from lab speech: L1 American English and L2 English/L1 Portuguese bilinguals. Word-initial voiceless stops were included in training. This classifier is best to use if working with lab speech.
  - Note: For best performance the authors recommend hand-labeling a small subset of VOTs (~100 tokens) from your own data and training new classifiers (see information on training below). Experiments suggesting this works better than using a classifier pre-trained on another dataset are given in Sonderegger & Keshet (2012).
Important: Input TextGrids will be overwritten. If you wish to access your original files, be sure to back them up elsewhere.
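Because input TextGrids are overwritten, it can help to copy them somewhere safe first. The sketch below is one way to do that and is not part of AutoVOT; the function name `backup_textgrids` and both directory arguments are hypothetical, and the backup directory is assumed to lie outside the source directory.

```python
import shutil
from pathlib import Path


def backup_textgrids(source_dir, backup_dir):
    """Copy every .TextGrid file under source_dir into backup_dir.

    Relative paths are preserved. backup_dir should be outside source_dir
    so that repeated runs do not back up earlier backups.
    """
    source = Path(source_dir)
    backup = Path(backup_dir)
    copied = []
    # sorted() collects the full file list before any copying starts.
    for textgrid in sorted(source.rglob("*.TextGrid")):
        target = backup / textgrid.relative_to(source)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(textgrid, target)  # copy2 also preserves timestamps
        copied.append(target)
    return copied
```

For example, `backup_textgrids("experiments/data/tutorialExample", "textgrid_backup")` would copy the tutorial TextGrids into a `textgrid_backup` folder (a name chosen here only for illustration).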
- Wav files sampled at 16 kHz, mono
  - You can convert wav files using a utility such as SoX, as follows:
    $ sox input.wav -c 1 -r 16000 output.wav
- TextGrids saved as text files with the .TextGrid extension
- TextGrids for training must contain a tier with hand-measured VOT intervals. These intervals must have a common text label, such as "vot".
- TextGrids for testing must contain a tier with window intervals indicating the range of times where the algorithm should look for the VOT onset. These intervals must also have a common label, such as "window". For best performance the window intervals should:
  - contain no more than one stop consonant
  - contain about 50 msec before the beginning of the burst, or
  - if only force-aligned segments are available (each corresponding to an entire stop), contain about 30 msec before the beginning of the segment.
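The audio requirement above (16 kHz, mono) can be verified before running anything else. This is a minimal sketch using Python's standard `wave` module, not an AutoVOT script; the function name `check_wav_format` is hypothetical, and the files are assumed to be uncompressed PCM wav files, which is what `wave` can read.

```python
import wave


def check_wav_format(path, expected_rate=16000, expected_channels=1):
    """Return (ok, sample_rate, channels) for an uncompressed PCM .wav file."""
    with wave.open(str(path), "rb") as wav_file:
        rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
    ok = rate == expected_rate and channels == expected_channels
    return ok, rate, channels
```

Any file for which the first element is `False` is a candidate for conversion, for instance with the SoX command shown earlier.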
The experiments folder contains subdirectories that will be used to store files generated by the scripts, in addition to data to be used during the working tutorial. (See example data & experiment folders.)
- experiments/config/: Currently empty. This is where lists of file names will be stored.
- experiments/models/: Currently contains example classifiers. This is also where your own classifiers will eventually be stored.
- experiments/tmp_dir/: Currently empty. This is where extracted features will be stored in Mode 2.
- experiments/data/tutorialExample/: This contains TextGrids and wav files for training and testing during the tutorial.