The ivector extraction process in Kaldi

1. Generate wav.scp, spk2utt, and utt2spk
./local/data_prep.sh /home/yixin/kaldi/egs/clarinet/data/clarinet_audio/wav /home/yixin/kaldi/egs/clarinet/data/clarinet_audio/transcript




#   Making spk2utt files
utils/utt2spk_to_spk2utt.pl data/train/utt2spk > data/train/spk2utt
utils/utt2spk_to_spk2utt.pl data/test/utt2spk > data/test/spk2utt
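
For reference, utt2spk holds one "<utterance-id> <speaker-id>" pair per line, and spk2utt is the inverse mapping: one line per speaker followed by that speaker's utterance IDs. The sketch below mimics the conversion with awk on a made-up example so the two formats can be compared side by side. The IDs, the temp directory, and the function name are hypothetical; for real data the Kaldi script above is the one to use.

```shell
#!/bin/bash
# Hypothetical toy data; the real files live in data/train and data/test.
tmpdir=$(mktemp -d)
cat > "$tmpdir/utt2spk" <<'EOF'
spk1_utt1 spk1
spk1_utt2 spk1
spk2_utt1 spk2
EOF

# Group utterance IDs by speaker: "<speaker-id> <utt-id> <utt-id> ..."
utt2spk_to_spk2utt() {
  awk '{
         if (!($2 in utts)) { order[++n] = $2; utts[$2] = $1 }  # first utterance of a new speaker
         else               { utts[$2] = utts[$2] " " $1 }      # append to a known speaker
       }
       END { for (i = 1; i <= n; i++) print order[i], utts[order[i]] }' "$1"
}

utt2spk_to_spk2utt "$tmpdir/utt2spk" > "$tmpdir/spk2utt"
cat "$tmpdir/spk2utt"
```

With the toy input, the first output line lists both spk1 utterances and the second lists the single spk2 utterance.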

2. Unify the sampling rate to 16 kHz
Go to the folder that contains the wav files and run ./sampling.sh:
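#!/bin/bash
# Resample every wav file in the current folder to 16 kHz, 16-bit, in place.
for x in ./*.wav
do
    [ -e "$x" ] || continue      # skip if no wav files matched
    b=${x##*/}
    # Replace the original only if sox succeeded
    sox "$b" -r 16000 -b 16 "tmp_$b" && mv -f "tmp_$b" "$b"
done
echo "Converted to 16kHz"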
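
To confirm the conversion worked, the sample rate of each file can be read back with soxi -r, which ships with SoX. This is a minimal sketch, assuming it is run in the same wav folder; the helper name is_16k is made up for illustration.

```shell
#!/bin/bash
# is_16k is a hypothetical helper: succeeds only if the given rate is 16000 Hz.
is_16k() {
  [ "$1" = "16000" ]
}

# Report every wav file in the current folder whose rate is not 16 kHz.
# Does nothing when soxi is not installed or no wav files are present.
if command -v soxi >/dev/null 2>&1; then
  for x in ./*.wav; do
    [ -e "$x" ] || continue
    rate=$(soxi -r "$x")
    is_16k "$rate" || echo "$x: $rate Hz (expected 16000)"
  done
fi
```

No output means every file it checked is already at 16 kHz.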

3. Compute MFCCs
# Making feats.scp files
mfccdir=mfcc
# Uncomment and modify arguments in scripts below if you have any problems with data sorting
utils/validate_data_dir.sh data/train     # script for checking prepared data - here: for data/train directory
utils/fix_data_dir.sh data/train          # tool for data proper sorting if needed - here: for 

for x in train test; do
  utils/fix_data_dir.sh data/$x
  steps/make_mfcc.sh --cmd "$train_cmd" --nj 1 data/$x exp/make_mfcc/$x $mfccdir
  sid/compute_vad_decision.sh --nj 1 --cmd "$train_cmd" data/$x exp/make_mfcc/$x $mfccdir
  
done
  • Getting the MFCC features as a txt file
    Read Kaldi's binary ark feature file and convert it to a text (txt) file:
    ~/kaldi/src/featbin/copy-feats --binary=false ark:raw_mfcc_train.1.ark ark,t:1.txt
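
In the text output, each utterance starts with a line of the form "<utt-id>  [", followed by one row of values per frame, and the last row ends with " ]". As a quick sanity check, the sketch below prints the number of frames and the feature dimension per utterance. The function name and the toy input are hypothetical; the real file would be the 1.txt written above.

```shell
#!/bin/bash
# Print "<utt-id> <num-frames> <feature-dim>" for each matrix in a text-format ark.
# Assumes the usual layout: "<utt-id>  [" on one line, then one row per frame,
# with the last row ending in " ]".
ark_txt_shape() {
  awk '
    $NF == "[" { utt = $1; frames = 0; next }        # header line opens a matrix
    {
      frames++
      if ($NF == "]") { print utt, frames, NF - 1 }  # last row: do not count the bracket
    }' "$1"
}

# Hypothetical toy input standing in for 1.txt (3 values per row to keep it short).
sample=$(mktemp)
cat > "$sample" <<'EOF'
utt1  [
  1.0 2.0 3.0
  4.0 5.0 6.0 ]
utt2  [
  7.0 8.0 9.0 ]
EOF

ark_txt_shape "$sample"
```

On real data, running ark_txt_shape 1.txt should show the same feature dimension for every utterance, with frame counts that grow with the length of the audio.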
4. Compute ivectors
# Train the diagonal-covariance UBM (1024 Gaussians)
sid/train_diag_ubm.sh --nj 1 --cmd "$train_cmd" --num-threads 1 \
  data/train 1024 exp/diag_ubm_1024

# Train the full-covariance UBM, initialized from the diagonal UBM
sid/train_full_ubm.sh --nj 1 --cmd "$train_cmd" data/train \
  exp/diag_ubm_1024 exp/full_ubm_1024

# Train the ivector extractor
sid/train_ivector_extractor.sh --cmd "$train_cmd --mem 10G" \
  --num-iters 5 exp/full_ubm_1024/final.ubm data/train \
  exp/extractor_1024

# Extract ivectors for the training data
sid/extract_ivectors.sh --cmd "$train_cmd" --nj 1 \
  exp/extractor_1024 data/train exp/ivector_train_1024
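
The extracted ivectors can be dumped to text in the same way as the MFCCs, using copy-vector instead of copy-feats. In text form, each vector sits on one line as "<utt-id>  [ v1 v2 ... ]". The sketch below checks the dimension of each vector in such a dump. The file name ivector.scp inside exp/ivector_train_1024 is an assumption about the output layout, and the function name and toy input are hypothetical.

```shell
#!/bin/bash
# Assumed way to produce the text dump (not run here; the scp name is an assumption):
#   copy-vector scp:exp/ivector_train_1024/ivector.scp ark,t:ivectors.txt

# Print "<utt-id> <dimension>" for each vector in a text-format vector archive.
# Assumes one vector per line: "<utt-id>  [ v1 v2 ... ]".
ivector_dims() {
  awk '$2 == "[" && $NF == "]" { print $1, NF - 3 }' "$1"  # fields: id, "[", values, "]"
}

# Hypothetical toy input standing in for ivectors.txt.
vec_sample=$(mktemp)
cat > "$vec_sample" <<'EOF'
utt1  [ 0.5 -1.2 3.3 ]
utt2  [ 1.0 0.0 -2.5 ]
EOF

ivector_dims "$vec_sample"
```

Every utterance should report the same dimension, since all vectors come from the same extractor.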
