Hello! I read the infer.py file and, as I understand it, it first divides the complete audio into chunks and feeds each chunk into the model. At the end, it stacks all the chunk outputs to build the RTTM file:
```python
out_chunks.append(ys[0].data)  # per-chunk diarization posteriors
# ...
# Zero-pad each chunk to max_n_speakers columns, then stack along time.
out_chunks = [np.insert(o, o.shape[1], np.zeros((max_n_speakers - o.shape[1], o.shape[0])), axis=1) for o in out_chunks]
outdata = np.vstack(out_chunks)
```
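To make sure I'm reading the stacking step right, here is a minimal self-contained sketch of what I understand it to do; the chunk shapes below are invented for illustration:

```python
import numpy as np

# Dummy per-chunk outputs: EDA can emit a different number of speaker
# columns per chunk (the frame counts and speaker counts are made up).
out_chunks = [np.random.rand(500, 2), np.random.rand(500, 3), np.random.rand(500, 1)]

max_n_speakers = max(o.shape[1] for o in out_chunks)

# Pad every chunk with all-zero speaker columns up to max_n_speakers,
# then concatenate all chunks along the time axis.
out_chunks = [np.insert(o, o.shape[1], np.zeros((max_n_speakers - o.shape[1], o.shape[0])), axis=1)
              for o in out_chunks]
outdata = np.vstack(out_chunks)
print(outdata.shape)  # (1500, 3)
```

If that reading is right, column k of outdata just means "the k-th attractor of whichever chunk we are in", which leads to my question below.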
I'm a little confused about how you make sure the speaker order of each chunk is consistent for the EDA model. Because the attractors in EDA are generated dynamically from each chunk, the same speaker may land in a different column from chunk to chunk, or disappear entirely in another chunk of the same audio.
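To make the concern concrete, I would have expected some alignment step between neighboring chunks before stacking, roughly like the sketch below. This is not code from infer.py; `align_to_previous` and the `overlap` assumption are hypothetical, and the sketch assumes neighboring chunks share the same speaker count for simplicity:

```python
import itertools
import numpy as np

def align_to_previous(prev_tail, cur_head):
    """Hypothetical helper: pick the column permutation of the current chunk
    that best agrees with the previous chunk over an overlap region.
    Both arguments are (overlap_frames, n_speakers) posteriors; assumes the
    two chunks have the same number of speaker columns."""
    n = cur_head.shape[1]
    best_perm, best_score = list(range(n)), -np.inf
    for perm in itertools.permutations(range(n)):
        score = np.sum(prev_tail * cur_head[:, list(perm)])  # frame-wise agreement
        if score > best_score:
            best_perm, best_score = list(perm), score
    return best_perm

# Usage sketch (assumes chunks were extracted with `overlap` shared frames,
# which infer.py's chunking does not appear to do):
# aligned = [out_chunks[0]]
# for prev, cur in zip(out_chunks, out_chunks[1:]):
#     perm = align_to_previous(prev[-overlap:], cur[:overlap])
#     aligned.append(cur[:, perm])
```

Is something along these lines happening somewhere that I missed, or is the inter-chunk permutation simply left unresolved?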