A strange problem when using sherpa-onnx #760

AHPUymhd · 2024-04-12T14:06:56Z

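In my setup, 你好军哥 is converted into n ǐ h ǎo j ūn g ē @你好军哥, but an error seems to occur at the line keyword_spotter = sherpa_onnx.KeywordSpotter, so the script cannot run. What is going wrong here? I look forward to your reply. My code is as follows: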

AHPUymhd · 2024-04-12T14:11:29Z

@pkufool @csukuangfj Could you please help me with this question?

AHPUymhd · 2024-04-12T14:13:46Z

The model I downloaded consists of files in this format. Is there a problem with that? I am on Windows.

AHPUymhd · 2024-04-12T14:15:32Z

csukuangfj · 2024-04-12T14:15:43Z

Please paste the code directly as text.

AHPUymhd · 2024-04-12T14:17:31Z

#!/usr/bin/env python3

"""
This file demonstrates how to use sherpa-onnx Python API to do keyword spotting
from wave file(s).

Please refer to
https://k2-fsa.github.io/sherpa/onnx/kws/pretrained_models/index.html
to download pre-trained models.
"""
import argparse
import time
import wave
from pathlib import Path
from typing import List, Tuple

import numpy as np
import sherpa_onnx

sound_files = "C:/Users/X/Desktop/sherpa-onnx-master/python-api-examples/action_done.wav"
tokens = "C:/Users/X/Documents/Tencent Files/1848795229/FileRecv/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt"
encoder = "C:/Users/X/Documents/Tencent Files/1848795229/FileRecv/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.onnx"
decoder = "C:/Users/X/Documents/Tencent Files/1848795229/FileRecv/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx"
joiner = "C:/Users/X/Documents/Tencent Files/1848795229/FileRecv/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.onnx"
key_file = "C:/Users/X/Documents/Tencent Files/1848795229/FileRecv/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/keywords_raw.txt"

def get_args():
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )

    parser.add_argument(
        "--tokens",
        type=str,
        default=tokens,
        help="Path to tokens.txt",
    )

    parser.add_argument(
        "--encoder",
        default=encoder,
        type=str,
        help="Path to the transducer encoder model",
    )

    parser.add_argument(
        "--decoder",
        default=decoder,
        type=str,
        help="Path to the transducer decoder model",
    )

    parser.add_argument(
        "--joiner",
        default=joiner,
        type=str,
        help="Path to the transducer joiner model",
    )

    parser.add_argument(
        "--num-threads",
        type=int,
        default=1,
        help="Number of threads for neural network computation",
    )

    parser.add_argument(
        "--provider",
        type=str,
        default="cpu",
        help="Valid values: cpu, cuda, coreml",
    )

    parser.add_argument(
        "--max-active-paths",
        type=int,
        default=4,
        help="""
        It specifies number of active paths to keep during decoding.
        """,
    )

    parser.add_argument(
        "--num-trailing-blanks",
        type=int,
        default=1,
        help="""The number of trailing blanks a keyword should be followed. Setting
        to a larger value (e.g. 8) when your keywords has overlapping tokens
        between each other.
        """,
    )

    parser.add_argument(
        "--keywords-file",
        type=str,
        default=key_file,
        help="""
        The file containing keywords, one words/phrases per line, and for each
        phrase the bpe/cjkchar/pinyin are separated by a space. For example:

        ▁HE LL O ▁WORLD
        x iǎo ài t óng x ué
        """,
    )

    parser.add_argument(
        "--keywords-score",
        type=float,
        default=1.0,
        help="""
        The boosting score of each token for keywords. The larger the easier to
        survive beam search.
        """,
    )

    parser.add_argument(
        "--keywords-threshold",
        type=float,
        default=0.25,
        help="""
        The trigger threshold (i.e. probability) of the keyword. The larger the
        harder to trigger.
        """,
    )

    parser.add_argument(
        "--sound_files",
        type=str,
        nargs="+",
        # nargs="+" yields a list, so the default must be a list as well;
        # a bare string would later be iterated character by character.
        default=[sound_files],
        help="The input sound file(s) to decode. Each file must be of WAVE "
        "format with a single channel, and each sample has 16-bit, "
        "i.e., int16_t. "
        "The sample rate of the file can be arbitrary and does not need to "
        "be 16 kHz",
    )

    return parser.parse_args()

def assert_file_exists(filename: str):
    assert Path(filename).is_file(), (
        f"{filename} does not exist!\n"
        "Please refer to "
        "https://k2-fsa.github.io/sherpa/onnx/kws/pretrained_models/index.html to download it"
    )

def read_wave(wave_filename: str) -> Tuple[np.ndarray, int]:
    """Read a mono 16-bit wave file.

    Returns the samples as float32 scaled to [-1, 1) and the sample rate.
    """
    with wave.open(wave_filename) as f:
        assert f.getnchannels() == 1, f.getnchannels()
        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes
        num_samples = f.getnframes()
        samples = f.readframes(num_samples)
        samples_int16 = np.frombuffer(samples, dtype=np.int16)
        samples_float32 = samples_int16.astype(np.float32)

        samples_float32 = samples_float32 / 32768
        return samples_float32, f.getframerate()

def main():
    args = get_args()
    assert_file_exists(args.tokens)
    assert_file_exists(args.encoder)
    assert_file_exists(args.decoder)
    assert_file_exists(args.joiner)

    assert Path(
        args.keywords_file
    ).is_file(), (
        f"keywords_file : {args.keywords_file} not exist, please provide a valid path."
    )

    keyword_spotter = sherpa_onnx.KeywordSpotter(
        tokens=args.tokens,
        encoder=args.encoder,
        decoder=args.decoder,
        joiner=args.joiner,
        num_threads=args.num_threads,
        max_active_paths=args.max_active_paths,
        keywords_file=args.keywords_file,
        keywords_score=args.keywords_score,
        keywords_threshold=args.keywords_threshold,
        num_trailing_blanks=args.num_trailing_blanks,
        provider=args.provider,
    )

    print("Started!")
    start_time = time.time()

    # Create one stream per input file and feed it the whole waveform.
    streams = []
    total_duration = 0
    for wave_filename in args.sound_files:
        assert_file_exists(wave_filename)
        samples, sample_rate = read_wave(wave_filename)
        duration = len(samples) / sample_rate
        total_duration += duration

        s = keyword_spotter.create_stream()

        s.accept_waveform(sample_rate, samples)

        # Append 0.66 s of silence after the audio.
        tail_paddings = np.zeros(int(0.66 * sample_rate), dtype=np.float32)
        s.accept_waveform(sample_rate, tail_paddings)

        s.input_finished()

        streams.append(s)

    # Decode until no stream has frames left, collecting detected keywords.
    results = [""] * len(streams)
    while True:
        ready_list = []
        for i, s in enumerate(streams):
            if keyword_spotter.is_ready(s):
                ready_list.append(s)
            r = keyword_spotter.get_result(s)
            if r:
                results[i] += f"{r}/"
                print(f"{r} is detected.")
        if len(ready_list) == 0:
            break
        keyword_spotter.decode_streams(ready_list)
    end_time = time.time()
    print("Done!")

    for wave_filename, result in zip(args.sound_files, results):
        print(f"{wave_filename}\n{result}")
        print("-" * 10)

    elapsed_seconds = end_time - start_time
    rtf = elapsed_seconds / total_duration
    print(f"num_threads: {args.num_threads}")
    print(f"Wave duration: {total_duration:.3f} s")
    print(f"Elapsed time: {elapsed_seconds:.3f} s")
    print(
        f"Real time factor (RTF): {elapsed_seconds:.3f}/{total_duration:.3f} = {rtf:.3f}"
    )

if __name__ == "__main__":
    main()
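
One detail of the script above that is easy to miss: `--sound_files` uses `nargs="+"`, so the loop over `args.sound_files` expects a list. If the default is a plain string, argparse leaves it unwrapped and the loop walks over it one character at a time. This is a separate matter from the `KeywordSpotter` error reported in this issue. The following is a minimal sketch of that behavior; `build_parser` and the file names are hypothetical and used only for illustration.

```python
import argparse


def build_parser(default):
    """Build a parser with a single nargs="+" option (illustrative helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--sound_files", type=str, nargs="+", default=default)
    return parser


wav = "action_done.wav"  # hypothetical file name

# A string default is kept as a string, not wrapped into a list.
as_string = build_parser(wav).parse_args([]).sound_files

# A list default keeps the type consistent with command-line input.
as_list = build_parser([wav]).parse_args([]).sound_files

# Values given on the command line always arrive as a list.
from_cli = build_parser(wav).parse_args(
    ["--sound_files", "a.wav", "b.wav"]
).sound_files

print(type(as_string).__name__, list(as_string)[:3])
print(type(as_list).__name__, as_list)
print(type(from_cli).__name__, from_cli)
```

Wrapping the default in a list keeps the `for wave_filename in args.sound_files` loop correct whether or not the option is passed on the command line.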

csukuangfj · 2024-04-12T14:33:49Z

key_file = "C:/Users/X/Documents/Tencent Files/1848795229/FileRecv/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/keywords_raw.txt"

What does this file contain? Please send it over.

csukuangfj · 2024-04-12T14:36:48Z

Shouldn't you be passing keywords.txt rather than keywords_raw.txt?

AHPUymhd · 2024-04-12T14:38:19Z

Shouldn't you be passing keywords.txt rather than keywords_raw.txt?

I get the same error either way.

AHPUymhd · 2024-04-12T14:40:11Z

Here it is.

Shouldn't you be passing keywords.txt rather than keywords_raw.txt?

csukuangfj · 2024-04-12T14:49:41Z

Shouldn't you be passing keywords.txt rather than keywords_raw.txt?

I get the same error either way.

That should not happen. Please paste the error log you get when using keywords.txt.

csukuangfj · 2024-04-12T14:50:17Z

Here it is.

Shouldn't you be passing keywords.txt rather than keywords_raw.txt?

This is the wrong file. Do not pass it to the Python script. Use keywords.txt, not keywords_raw.txt. I hope that is clear now?
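
To make the distinction concrete: the script expects a keywords file whose lines are already split into model tokens (for example n ǐ h ǎo j ūn g ē @你好军哥), whereas a raw file holds the untokenized phrase. Below is a rough sketch of a pre-flight check that flags keyword lines containing symbols missing from tokens.txt. It assumes that each line of tokens.txt has the form `<token> <id>`, and that fields starting with `:`, `#`, or `@` in the keywords file are annotations rather than tokens. The helper names and token ids are hypothetical, and this is not part of the sherpa-onnx API.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Set


def load_symbols(tokens_path: Path) -> Set[str]:
    """Collect the first field of every non-empty line in tokens.txt."""
    symbols = set()
    for line in tokens_path.read_text(encoding="utf-8").splitlines():
        fields = line.split()
        if fields:
            symbols.add(fields[0])
    return symbols


def find_unknown_tokens(keywords_path: Path, symbols: Set[str]) -> Dict[int, List[str]]:
    """Map 1-based line numbers to the fields that are not known tokens."""
    problems = {}
    lines = keywords_path.read_text(encoding="utf-8").splitlines()
    for lineno, line in enumerate(lines, start=1):
        unknown = []
        for field in line.split():
            if field.startswith("@"):
                break  # assumed: the rest of the line is the display phrase
            if field.startswith((":", "#")):
                continue  # assumed: per-keyword score / threshold annotations
            if field not in symbols:
                unknown.append(field)
        if unknown:
            problems[lineno] = unknown
    return problems


# Demo with made-up token ids: line 1 is tokenized, line 2 is a raw phrase.
with tempfile.TemporaryDirectory() as tmp:
    tokens_path = Path(tmp) / "tokens.txt"
    tokens_path.write_text(
        "<blk> 0\nn 1\nǐ 2\nh 3\nǎo 4\nj 5\nūn 6\ng 7\nē 8\n", encoding="utf-8"
    )
    keywords_path = Path(tmp) / "keywords.txt"
    keywords_path.write_text(
        "n ǐ h ǎo j ūn g ē @你好军哥\n你好军哥\n", encoding="utf-8"
    )
    problems = find_unknown_tokens(keywords_path, load_symbols(tokens_path))

print(problems)
```

Running a check like this before constructing the keyword spotter points at the offending line directly, which is easier to act on than a failure inside the constructor.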

AHPUymhd · 2024-04-12T14:57:07Z

After changing it, I get this error:

AHPUymhd · 2024-04-12T14:57:46Z

Is it because the model was not downloaded correctly?

csukuangfj · 2024-04-12T15:08:39Z

Look, now the file name is wrong. It is no longer the earlier error. Can you sort out a basic mistake like this yourself?
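
The error log itself is not preserved in this thread, so the exact wrong name is unknown. As a general aid for this kind of problem, here is a minimal sketch that checks a set of paths and, for each missing file, lists what the parent directory actually contains, so a misspelled name stands out. `report_missing` and the misspelled name in the demo are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def report_missing(paths: Dict[str, str]) -> List[str]:
    """Return one message per path that does not point to an existing file."""
    messages = []
    for name, raw in paths.items():
        path = Path(raw)
        if path.is_file():
            continue
        if path.parent.is_dir():
            # Show the sibling files so a typo in the name is easy to spot.
            nearby = sorted(p.name for p in path.parent.iterdir())
            messages.append(
                f"{name}: '{path.name}' not found; directory contains {nearby}"
            )
        else:
            messages.append(f"{name}: directory '{path.parent}' does not exist")
    return messages


# Demo: the directory holds keywords.txt, but the script asks for a misspelled name.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "tokens.txt").write_text("", encoding="utf-8")
    (Path(tmp) / "keywords.txt").write_text("", encoding="utf-8")
    messages = report_missing(
        {
            "tokens": str(Path(tmp) / "tokens.txt"),
            "keywords_file": str(Path(tmp) / "keywords.text"),  # hypothetical typo
        }
    )

for message in messages:
    print(message)
```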

AHPUymhd · 2024-04-13T01:32:44Z

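Look, now the file name is wrong. It is no longer the earlier error. Can you sort out a basic mistake like this yourself?

OK, thank you very much.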

csukuangfj · 2024-04-13T04:12:42Z

Has it been solved?

AHPUymhd · 2024-04-16T11:36:22Z

Has it been solved?

Sorry for the late reply. It has been solved. Thank you very much for your help.

csukuangfj closed this as completed Apr 16, 2024

diyism mentioned this issue May 25, 2024

[Need help] How to realize Syllable-level Voice Recognition with sherpa-onnx Open Vocabulary Keyword Spotting #920

Open
