Releases: ReadAlongs/SoundSwallower
0.2.2: Acoustic model and acoustic fixes
This release adds compatibility with all (I hope) of the publicly released CMUSphinx models from https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/.
The decode_file method has been fixed to work properly when the sample rate of the file is different from the default.
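To illustrate the kind of check involved, here is a minimal sketch that reads the sample rate declared in a WAV header using only the Python standard library. It is not SoundSwallower's implementation; the helper names `wav_sample_rate` and `make_silent_wav` are made up for this example.

```python
import io
import wave


def wav_sample_rate(data: bytes) -> int:
    """Return the sample rate (in Hz) declared in a WAV file's header."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return wav.getframerate()


def make_silent_wav(sample_rate: int, n_frames: int = 100) -> bytes:
    """Build a small mono 16-bit WAV of silence, purely for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


# A file recorded at 44100 Hz reports that rate, whatever a decoder's default is.
rate = wav_sample_rate(make_silent_wav(44100))
```

A caller could compare a value like `rate` against the decoder's configured rate before decoding, and reconfigure or resample when they differ.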
There are also a number of JavaScript updates, in particular, a major change to switch to float32 input at 44.1kHz by default, as this is the only format provided directly by the WebAudio API.
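For readers more used to 16-bit PCM, the sketch below shows the conventional conversion from int16 samples to float32 samples in the nominal range [-1.0, 1.0], which is the range WebAudio buffers use. It is written in Python for illustration only; the function name `int16_to_float32` is hypothetical and this is not code from SoundSwallower.

```python
from array import array


def int16_to_float32(samples: array) -> array:
    """Scale signed 16-bit samples to float32 values in [-1.0, 1.0)."""
    if samples.typecode != "h":
        raise ValueError("expected an array of signed 16-bit samples ('h')")
    # Dividing by 32768 maps -32768 to -1.0 and 32767 to just under 1.0.
    return array("f", (s / 32768.0 for s in samples))


pcm = array("h", [0, 16384, -32768])
floats = int16_to_float32(pcm)
```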
0.2: OMG JavaScript!!!1!!1!
There's now a JavaScript API worthy of your attention. The demo code has been moved to https://github.com/dhdaines/soundswallower-demo, and has been updated to be all "modern" for whatever that's worth. No, it's worth something, because asynchronous JavaScript is genuinely cool. Also of note, JavaScript defaults to float32 input because that's What the Web Wants.
A major bug in the Python and C API, which broke long utterances such as those used in ReadAlongs, was fixed. The documentation has also been greatly improved; go see it at https://soundswallower.readthedocs.io/
Also the default acoustic model is now the narrow-band one, so as to be more robust to whatever crappy audio you throw at it, and we also include a French model (which more or less works in Québec, sorry).
0.1.5: Lots o' Docs
This release mainly adds a lot of documentation, which you can see at https://soundswallower.readthedocs.io/en/latest/
Also, some convenient APIs have been added to get configuration parameter descriptions, and to decode an entire file in one shot.
The next release will be 0.2, which will introduce a new JavaScript API, and probably continue to remove unused code on the C side.
Fix Embarrassing Problems
Actually, just one embarrassing problem - the byte-swapping code was totally confused and only worked by pure luck. And actually it didn't work in JavaScript, which is how I found the problem. Which brings me to the main purpose of this release: JavaScript!
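For anyone wondering what byte swapping amounts to, here is a small generic sketch in Python. It is not the C code that was fixed in this release; the function names are made up for illustration. It reverses the byte order of 16-bit and 32-bit unsigned values, and cross-checks the 16-bit case against the standard `struct` module.

```python
import struct


def swap16(value: int) -> int:
    """Swap the two bytes of an unsigned 16-bit value."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


def swap32(value: int) -> int:
    """Reverse the four bytes of an unsigned 32-bit value."""
    return (
        ((value & 0xFF) << 24)
        | ((value & 0xFF00) << 8)
        | ((value >> 8) & 0xFF00)
        | ((value >> 24) & 0xFF)
    )


def swap16_via_struct(value: int) -> int:
    """Same as swap16, done by packing little-endian and unpacking big-endian."""
    return struct.unpack(">H", struct.pack("<H", value))[0]
```

Swapping twice should always give back the original value, which makes for a cheap sanity test that does not depend on luck.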
Yes, compiling to JavaScript and running on both Node.js and "the browser" (or at least a few of them) is now totally functional. The API is very much subject to change, because I didn't write it and I don't really like it. Nonetheless, I am eternally grateful to the author of PocketSphinx.JS who came up with the idea of doing this in the first place, and who wrote the current demo. A future release, maybe the next one, will have a rewritten and "modernized" API and hopefully a more interesting demo as well.
On the Python side, the command-line interface has all its functionality complete. The major improvement under the hood is the ability to iterate over Config objects, which also allows us to write the configuration to a JSON file. This is also an API which I don't like, and unfortunately, I did write this one. It is now less annoying than it used to be, though. I suggest no longer prefixing Config options with dashes, and if possible just passing options to the Decoder constructor.
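As a rough sketch of the idea, the snippet below uses a plain dict as a stand-in for a configuration object (it does not use the real Config class, whose interface may differ), strips any leading dashes from option names, and serializes the result to JSON. The helper names and the option values shown are illustrative assumptions.

```python
import json


def normalize_options(options: dict) -> dict:
    """Return a copy of the options with leading dashes removed from names."""
    return {name.lstrip("-"): value for name, value in options.items()}


def options_to_json(options: dict) -> str:
    """Serialize dash-free options to a JSON string with stable key order."""
    return json.dumps(normalize_options(options), sort_keys=True)


# Old style (dashed) and new style (undashed) names can be mixed on input.
example = {"-hmm": "model/en-us", "samprate": 16000}
text = options_to_json(example)
```

Iterating over the options as name/value pairs is what makes the JSON dump a one-liner here, which is presumably the appeal of being able to iterate over the real thing too.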
A few other minor bugs were also fixed.
Smaller and Simpler
This release switches the Python API to use Cython instead of SWIG, making the code somewhat more readable (but not actually smaller). In addition, the Config class has been improved in various ways, to allow a more Pythonic interface. There is some magic involved. It will likely change again in the future to remove this magic at the C level.
JavaScript builds compile but have not been tested. Testing them will be the focus of the next release, probably.
A proper release
The Python module build now works properly from both source and binary, and this is tested with the force alignment test in ReadAlong-Studio.
Initial pre-release
Very preliminary release of SoundSwallower, which may or may not work! The API is subject to change but is basically the same as PocketSphinx/SphinxBase just in the 'soundswallower' namespace (for both C header files and Python modules).
The Windows wheel was built with Visual Studio 2017 and should work with Python 3.7 there... the Linux wheel is built on Ubuntu 19.10, so I am unsure if it will work for, well, anybody. Yay, Linux.