Pipe-related format error


I am trying to run the sre08 recipe. A weird format error was found when running scripts with pipelines.

For example, with this definition:

feats="ark,s,cs:add-deltas scp:$sdata/JOB/feats.scp ark:- | apply-cmvn-sliding --norm-vars=false --center=true --cmn-window=300 ark:- ark:- | select-voiced-frames ark:- scp,s,cs:$sdata/JOB/vad.scp ark:- | subsample-feats --n=$subsample ark:- ark:- |"

A command like the following always results in a format error:

$cmd JOB=1:$nj_full $dir/log/gselect.JOB.log \
  gmm-gselect --n=$num_gselect $dir/final.dubm "$feats" ark:- \| \
  fgmm-global-gselect-to-post --min-post=$min_post $dir/final.ubm "$feats" \
    ark,s,cs:- "ark:|gzip -c >$dir/post.JOB.gz" || exit 1;

Some error messages:

ERROR (gmm-gselect:Read(): kaldi-matrix.cc:1344) Failed to read matrix from stream. : Expected "[", got "archive" File position at start is -1, currently -1
ERROR (select-voiced-frames:Read():kaldi-matrix.cc:1344): Failed to read matrix from stream. : Expected "[", got "archive" File position at start is -1, currently -1

...

After some debugging, we found that the data fed into the pipes did not carry the binary header '\0'B, so it was treated as text. But it did not contain "[" either: the content begins immediately after the key (the index name).
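To make the symptom concrete, here is a minimal shell sketch of the decision described above: after the key, a reader looks for the binary marker (a NUL byte followed by "B"), and otherwise expects text that starts with "[". The function name classify_entry and the 64-byte look-ahead are hypothetical choices made for illustration; Kaldi's actual reader is written in C++ and does considerably more than this.

```shell
# Hypothetical helper (not part of Kaldi): classify the payload that follows
# the key of the first archive entry read from stdin.
classify_entry() {
  # Hex-dump the first 64 bytes as space-separated pairs so that the NUL
  # byte of a binary header survives shell string handling.
  set -- $(head -c 64 | od -An -v -tx1)
  hex=" $* "
  case "$hex" in
    *" 20 "*) rest=${hex#* 20 } ;;   # drop the key and the space after it
    *) echo "invalid"; return ;;     # no key/payload separator found
  esac
  # Binary entries start with NUL (00) followed by "B" (42).
  case "$rest" in
    "00 42 "*) echo "binary"; return ;;
  esac
  # Text entries may have extra spaces or newlines before the "[" (5b).
  while :; do
    case "$rest" in
      "20 "*|"0a "*) rest=${rest#?? } ;;
      *) break ;;
    esac
  done
  case "$rest" in
    "5b "*) echo "text" ;;
    *) echo "invalid" ;;             # e.g. the stray word "archive"
  esac
}

printf 'utt1 \000BFM ' | classify_entry       # prints: binary
printf 'utt1  [\n 1 2 ]\n' | classify_entry   # prints: text
printf 'utt1 archive' | classify_entry        # prints: invalid
```

The third call mimics the situation in the error messages: neither the binary marker nor "[" follows the key, so the entry cannot be read in either mode.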

I tried splitting the piped scripts into separate commands, and those generate results without any problem.

That is, replacing the original definition of $feats with separate commands like these works fine:

$cmd JOB=1:$nj_full $dir/log/add-deltas.JOB.log \
  add-deltas scp:$sdata/JOB/feats.scp ark:$sdata/JOB/feats_deltas.ark

$cmd JOB=1:$nj_full $dir/log/apply-cmvn-sliding.JOB.log \
  apply-cmvn-sliding --norm-vars=false --center=true --cmn-window=300 \
  ark:$sdata/JOB/feats_deltas.ark ark:$sdata/JOB/feats_deltas_cmvn.ark

$cmd JOB=1:$nj_full $dir/log/select-voiced-frames.JOB.log \
  select-voiced-frames ark:$sdata/JOB/feats_deltas_cmvn.ark scp,s,cs:$sdata/JOB/vad.scp \
  ark:$sdata/JOB/feats_deltas_cmvn_vad.ark

$cmd JOB=1:$nj_full $dir/log/subsample-feats.JOB.log \
  subsample-feats --n=$subsample ark:$sdata/JOB/feats_deltas_cmvn_vad.ark \
  ark:$sdata/JOB/feats_deltas_cmvn_vad_$subsample.ark

feats="ark,s,cs:$sdata/JOB/feats_deltas_cmvn_vad_$subsample.ark"

But the frequent I/O may not only slow things down but also take up a lot of disk space. Could someone help with this weird format issue?

Thanks a lot.

OK - make sure you have not changed the code (do "svn status | grep -v '?'"), and run "make test". It's acting like it's reading the string "archive" from a stream, but that string is never printed by Kaldi.
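The first check suggested above can be wrapped in a small shell function. This is only a sketch: the name working_copy_state is hypothetical, it assumes the Kaldi checkout is an svn working copy, and it anchors the pattern as '^?' so that only lines for unversioned files (which "svn status" marks with a leading "?") are ignored. It does not replace running "make test".

```shell
# Hypothetical helper: reads "svn status" output on stdin and reports
# whether any versioned file shows a local change.
working_copy_state() {
  # Drop lines for unversioned files, then test whether anything is left.
  if grep -v '^?' | grep -q .; then
    echo "modified"
  else
    echo "clean"
  fi
}

# Usage, from the top of the checkout:
#   svn status | working_copy_state
```

Reading from stdin keeps the function independent of svn itself, so it can be tried on saved status output as well.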

 

Now, with a recently svn-synced copy of Kaldi, "svn status | grep -v '?'" prints nothing, "make test" reports SUCCESS for every test, and everything seems to work properly.

Is it possible that I somehow corrupted the source files, or was there some other issue? The previous copy did NOT pass "make test".


Yes, possibly either you changed the code, or you had checked out a bad version with a bug that I was not previously aware of and that was later fixed. Anyway, it doesn't matter. Please don't follow up on this thread, as I don't want to generate too much traffic on the list.

