Support non-identical file names between .wav and .eaf, and recognise media offsets #215

Draft
wants to merge 11 commits into
base: master

mattchrlw
Contributor

Resolves #191, #193.

This implementation does not let the user choose between matching the audio file by the name of the corresponding .eaf file and taking the name from RELATIVE_MEDIA_URL. It defaults to the former behaviour and falls back to the latter.
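
A minimal sketch of that lookup order, not the code in this PR. The function names and the `uploaded` set of file names are hypothetical, and the sketch assumes the .eaf file stores its media reference in a MEDIA_DESCRIPTOR element with a RELATIVE_MEDIA_URL attribute:

```python
import xml.etree.ElementTree as ET
from pathlib import Path, PurePosixPath
from typing import List, Optional, Set


def relative_media_names(eaf_xml: str) -> List[str]:
    """Return the bare file names found in RELATIVE_MEDIA_URL attributes."""
    root = ET.fromstring(eaf_xml)
    names = []
    for descriptor in root.iter("MEDIA_DESCRIPTOR"):
        url = descriptor.get("RELATIVE_MEDIA_URL")
        if url:
            # "./abui-audio-1.wav" -> "abui-audio-1.wav"
            names.append(PurePosixPath(url).name)
    return names


def resolve_audio_name(eaf_name: str, eaf_xml: str, uploaded: Set[str]) -> Optional[str]:
    """Prefer a .wav with the same stem as the .eaf, else use RELATIVE_MEDIA_URL."""
    same_stem = Path(eaf_name).stem + ".wav"
    if same_stem in uploaded:
        return same_stem
    # Fallback: first referenced media file that was actually uploaded.
    for name in relative_media_names(eaf_xml):
        if name in uploaded:
            return name
    return None
```

Returning None when neither lookup succeeds leaves it to the caller to decide how to report the unmatched .eaf file.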

This implementation also ignores MEDIA_URL, as it is difficult to wrestle it (e.g. "file:///Users/bbb/Desktop/abui/abui-audio-1.wav") into a format that the rest of the application can handle easily. In other words, it assumes that RELATIVE_MEDIA_URL is well formed.

This also fixes any "IndexError: list index out of range" errors raised at line = wer_lines[0] that may have been happening before, although please double-check that they are actually fixed.

Offsets are read from the .eaf file and converted directly with int().
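
A rough sketch of reading an offset this way. It assumes the offset lives in the TIME_ORIGIN attribute of MEDIA_DESCRIPTOR, which this PR description does not state explicitly; `media_offsets` is a hypothetical name, not a function from this branch:

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath
from typing import Dict


def media_offsets(eaf_xml: str) -> Dict[str, int]:
    """Map each referenced media file name to its integer offset."""
    root = ET.fromstring(eaf_xml)
    offsets = {}
    for descriptor in root.iter("MEDIA_DESCRIPTOR"):
        url = descriptor.get("RELATIVE_MEDIA_URL")
        if not url:
            continue
        # Assumption: TIME_ORIGIN holds the offset; a missing attribute means 0.
        raw = descriptor.get("TIME_ORIGIN", "0")
        # int() raises ValueError on a malformed value, as a direct conversion would.
        offsets[PurePosixPath(url).name] = int(raw)
    return offsets
```

A direct int() conversion means a non-numeric value surfaces as a ValueError instead of being silently treated as zero.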

Contributor

@nicklambourne nicklambourne left a comment

I think a big part of the original tickets was implementing a UI feature that would highlight, in particular, when an audio file is uploaded without its corresponding .eaf file, or vice versa. The easiest way I can envisage to accomplish this is to align the audio files horizontally with their transcriptions in the UI, which would make it obvious when a pair is missing either component (you could also highlight rows with a missing file, or something to that effect). The unfortunate downside is that you'll have to replicate the verification on the front end. This might help: https://www.npmjs.com/package/elan-parser
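
The pairing check described above could look roughly like the following. It is written in Python only to illustrate the logic; a front-end version would need the same idea in the UI's own language. `find_unpaired` is a hypothetical helper, and it pairs files by stem only, so it does not cover names taken from RELATIVE_MEDIA_URL:

```python
from pathlib import Path
from typing import Dict, Iterable, List


def find_unpaired(filenames: Iterable[str]) -> Dict[str, List[str]]:
    """Report stems that have audio but no .eaf, or an .eaf but no audio."""
    names = list(filenames)  # allow generators to be scanned twice
    audio = {Path(n).stem for n in names if Path(n).suffix.lower() == ".wav"}
    eaf = {Path(n).stem for n in names if Path(n).suffix.lower() == ".eaf"}
    return {
        "missing_eaf": sorted(audio - eaf),    # audio uploaded without a transcription
        "missing_audio": sorted(eaf - audio),  # transcription uploaded without audio
    }
```

The two lists map directly onto the rows a UI would highlight as incomplete.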

@mattchrlw
Contributor Author

Ah okay, this might be a bit more work to do on the uploading side of things, as it won't just be a file drop anymore. But I can look into it 👌

@mattchrlw mattchrlw marked this pull request as draft April 17, 2021 12:52