Reading speed & errors #10
There is basically a little ring buffer in between the USB thread and the caller of readStream. So -4 is SOAPY_SDR_OVERFLOW, which means the ring buffer filled up because it didn't get read fast enough.

The optimal chunk size per readStream is going to be sdr.getStreamMTU() for the stream, since that's a complete USB frame. Anything smaller will lead to fragmentation. Anything larger is OK, but it will just return the MTU size. Note that readStream() returns the number of samples read, which isn't necessarily going to be nb (could be smaller). So it's possible to just allocate a large buffer and always pass the number of samples left to read. In this case the readStream is always as big as possible: the MTU for this implementation.

Since this is python, perhaps the problem is just too much context switching going into and out of python. It might be worth a try to modify readStream() to fill the entire remaining buffer until timeout by calling acquireReadBuffer() and converting in a loop. Then you could do one giant read with no loop required in python: https://github.com/pothosware/SoapyAirspy/blob/master/Streaming.cpp#L208 I'm suggesting that as an experiment, but if that turns out to be the problem, this loop is something we can add to the python wrapper for all devices.
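A minimal sketch of the "large buffer, MTU-sized reads" pattern described above, for illustration only. The helper name `fill_buffer` and the default timeout are assumptions, not part of SoapySDR. `sdr` is expected to behave like a `SoapySDR.Device`, whose `readStream()` returns a result object with a `ret` field, and `buff` is expected to be something whose slices are views (a `numpy.complex64` array for CF32), so every call writes directly into the big buffer.

```python
SOAPY_SDR_TIMEOUT = -1    # readStream() return codes used below
SOAPY_SDR_OVERFLOW = -4


def fill_buffer(sdr, stream, buff, mtu, timeout_us=500000):
    """Fill buff with repeated readStream() calls of at most mtu samples.

    Returns (samples_read, overflow_count). Hypothetical helper, not a
    SoapySDR API.
    """
    filled = 0
    overflows = 0
    while filled < len(buff):
        # Never ask for more than what is left in the big buffer.
        want = min(mtu, len(buff) - filled)
        view = buff[filled:filled + want]
        result = sdr.readStream(stream, [view], want, timeoutUs=timeout_us)
        if result.ret > 0:
            filled += result.ret      # may be fewer samples than requested
        elif result.ret == SOAPY_SDR_OVERFLOW:
            overflows += 1            # samples were dropped; keep reading
        elif result.ret in (0, SOAPY_SDR_TIMEOUT):
            break                     # nothing arrived in time; stop early
        else:
            raise RuntimeError("readStream failed with code %d" % result.ret)
    return filled, overflows
```

With real hardware this would be called with `mtu = sdr.getStreamMTU(stream)` after `activateStream()`. Counting overflows instead of aborting keeps the capture going, at the cost of a gap in the samples.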
Activating or deactivating the stream basically starts or stops the USB thread internally. I would leave the stream active as long as the application is interested in continuous streaming.
@guruofquality big thanks for this. I tweaked the code some more, switched to the MTU size and now things seem a bit better. However I'm still getting way too many overflows (at 2.5 msps) to be useful. The process is literally doing nothing but pulling from the airspy and pushing into a multiprocessing queue (dropping if full). The fact that I also occasionally get the overflows (also at 2.5) on my desktop machine makes me suspect it's the python context switching. I also tried my rtlsdr again at 2 msps using pyrtlsdr and that seemed fine on both machines. I don't quite fully get what you mean but will have a look at the C code and see if I can get something going. Upstream changes would be cool though :p
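For readers unfamiliar with the "dropping if full" pattern mentioned above, a small sketch follows. The function name `offer` is made up for illustration. It relies on `put_nowait()` raising `queue.Full`, which is the behaviour of both `queue.Queue` and `multiprocessing.Queue` when a bounded queue has no free slot.

```python
import queue


def offer(q, item):
    """Put item on q without blocking; return False if it was dropped."""
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        # The consumer is behind: drop this batch rather than stall the
        # reader loop, which would only turn into overflows upstream.
        return False
```

The return value makes it easy to count dropped batches separately from -4 overflows, so the two causes of missing data are not confused.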
@dgorissen Since the overflow detection is happening between the receive thread and the caller, I'm positive that the issue is related to the SoapyAirspy implementation and/or python, but it shouldn't have anything to do with airspy itself. I have this streaming_work branch where I did a few things, and I'm wondering if you can verify some things:
@guruofquality this is great! Unfortunately I'm traveling and won't be able to test until later this week but I definitely will and feed back here (also happy to test your INT / data format change). As a small note on (2), this is what I tried the very first time. Allocate a big buffer, ~4M samples, and do one call of readStream(). On my i5 3 GHz that would just hang. Then I switched to the loop I posted above and experimented with various buffer sizes. After your reply I switched to getStreamMTU() as the buffer size (1024 in my case) but I have not tried again with giving a bigger buffer. Will do.
@guruofquality ok, tested this. Removed the pkgs installed via apt, cloned/compiled/installed from source. Verified everything worked as before with the master branch, which it did (regular -4 errors). MTU size is returned as 2048. Switched to the streaming branch and reran the same code. I see ~11-13 calls to readStream (as part of filling up my big buffer) and then it just hangs, nothing else seems to happen. No CPU usage or anything. Tried CS16 as a data format, seemed to work but same behavior. Verified that switching to and from master all gives consistent behavior. Thoughts? Edit: note this is on an ARM (Odroid XU4)
@dgorissen Sounds like I broke it on the stream branch. I didn't have an airspy to test on, but these are nearly identical changes that worked on the rtl codebase. Locking up though: is the readStream call never returning after timeout? There is a condition variable with a timeout here, so I wouldn't expect a lockup: https://github.com/pothosware/SoapyAirspy/blob/streaming_work/Streaming.cpp#L300 ... It's that, or maybe your loop is just seeing a timeout or zero return value and continuing to run the loop. This could mean that I messed up the buffer accounting, or the conversion between bytes vs samples, something like that. The only thing I can suggest is printing in acquireReadBuffer, releaseReadBuffer, and readStream to see what's going on. I'm happy to look at a bunch of prints to see what's going on. Sorry for the trouble!
Thanks Josh. I added some printf's to readStream. See output here:
This is where it hangs, nothing else happens after that (this line in the code). My python client code (fluff removed):
Samples per scan is: 4194304
That's really strange, it's not even fetching more buffers, it's still converting chunks from the first acquisition. I can only think of some kind of memory boundary issue.
Thanks. Updated the code and switched to CF32 but still the same problem. Just faster :) I will try to narrow it down to a short bit of python code and try to replicate with rtlsdr.
A bit more progress. I used the following self contained test script:
Installed/compiled the rtlsdr module for soapy from source and ran the above script. Works fine. Tried exactly the same script on the master branch of soapy airspy. Works fine. Switched to the streaming_work branch:
Mm, that's interesting, a segfault. Running in gdb gives:
Same if I use CS16 / numpy.int16. Hope this helps.
@dgorissen I think I found it, _currentBuff was float* not char*, so the pointer math was bad. Try again with the latest branch. I think it will work now.
Almost :)
I can't make sense of this. The only reason I can think of for the malloc error in the _rx_callback is that the buffer is resized to match the transfer size in ::rx_callback:
Thanks again. A really odd one yes. Output below. Any other tools I can use to help you figure this out let me know. I will also have a think. Worth asking the airspy folk?
It just dies in the memcpy. The copy is larger than the previous one because of the MTU fix. Is there any reason to suspect that the numpy buffer is smaller than bytesPerSample=4 * 65536? I believe that the source buffer is correctly returnedElems * bytesPerSample in size.
Found it... At some point during my testing I had switched from SOAPY_SDR_CF32/numpy.complex64 to SOAPY_SDR_CS16/numpy.int16 and was seeing the same segfaults, so I thought it did not matter. But that turned out to be wrong. After your latest fixes, switching back to CF32/complex64 worked as expected... sorry for that :( Thanks for your patience, and at least some other bugs were squashed in the process. So to finally continue our original thread: my original code now runs with your streaming_work branch. The -4 errors are fewer than before but I still get at least one batch of 65536 samples failing for every capture period (~1.5 seconds). Any further suggestions?
Cool, glad it's working. Sounds like it can be merged since this branch offers a few improvements.
I forget the syntax, but since there are no complex int types, you simply need to use a 2 dimensional array of numpy.int16, so that each sample is an array of two int16 elements.
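To make the CS16 layout concrete, here is a sketch using only the standard library. Each CS16 sample is two signed 16-bit integers stored as I then Q (4 bytes per sample), so a buffer of n samples holds 2*n int16 values; with numpy this corresponds to an array of shape (n, 2) and dtype numpy.int16. The function name and the 32768 scale factor are assumptions chosen for illustration.

```python
from array import array


def cs16_to_complex(raw, full_scale=32768.0):
    """Convert interleaved int16 values [I0, Q0, I1, Q1, ...] to complex."""
    if len(raw) % 2:
        raise ValueError("CS16 data must hold an even number of int16 values")
    return [complex(raw[i] / full_scale, raw[i + 1] / full_scale)
            for i in range(0, len(raw), 2)]


# Two samples: (I=16384, Q=-16384) and (I=0, Q=32767).
raw = array("h", [16384, -16384, 0, 32767])
samples = cs16_to_complex(raw)
print(len(samples), samples[0])
```

The point relevant to the segfault discussed above is the size: a buffer sized for n plain int16 values is only half as large as n CS16 samples require.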
I didn't realize that this was an ARM device before. I don't ultimately know what rate can be sustained. On the python end, you can save the extra numpy operation by reading directly into the numpy buffer:
This deserves another open issue, but on the C end, libairspy has a USB ring buffer (0) filled by libusb, which it (1) copies/converts (plus some DSP) into an internal ring buffer which is passed to the caller. In SoapyAirspy, we (2) copy this into another ring buffer, and then readStream once again (3) copies it into the user's provided memory. This is basically insane; FWIW, gr-osmosdr does this exact implementation for several of the USB based devices, and it's usually low sample rate and not a big deal for x86 machines. I really wish all of these USB driver libraries directly exposed the libusb ring buffer handles somehow, since SoapySDR has an API exactly for this.

In any case, what I wanted to experiment with was skipping 2 and 3 by letting SoapyAirspy::acquireReadBuffer() actually wait for the libusb/airspy rx_callback, and then block it with a condition, acquiring access to the buffer for readStream. And then SoapyAirspy::releaseReadBuffer() would allow the rx_callback to unblock and return. It's sort of like having a ring buffer of size 1 and relying on the underlying airspy library and libusb to handle any buffering or overflow. I'm very curious if something like this helps the current issue on the odroid, or is a better or simpler way with fewer copies to work with async/callback based libraries like this.

Feel free to try this suggestion without me; otherwise, when I get some time I will try it out on soapy rtl and airspy. Until then, use the numpy suggestion (good to save an extra copy), and it looks like I will merge this stream work into master.
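The handshake idea can be sketched in Python with a condition variable. This illustrates the concept only and is not the SoapyAirspy C++ code; the class and method names are hypothetical. It assumes a single producer thread that calls `rx_callback()` and a single reader that always pairs `acquire()` with `release()`.

```python
import threading


class HandshakeBuffer:
    """Hand one producer buffer at a time to a reader, without copying."""

    def __init__(self):
        self._cond = threading.Condition()
        self._buf = None      # buffer currently offered by the callback
        self._held = False    # True while the reader holds the buffer

    def rx_callback(self, buf):
        """Runs on the producer (USB) thread; blocks until the reader is done."""
        with self._cond:
            self._buf = buf
            self._cond.notify_all()
            # Stay inside the callback until release() clears the buffer,
            # so the producer cannot reuse the memory in the meantime.
            while self._buf is not None:
                self._cond.wait()

    def acquire(self, timeout=None):
        """Wait for a buffer; return it, or None on timeout."""
        with self._cond:
            ready = self._cond.wait_for(
                lambda: self._buf is not None and not self._held, timeout)
            if not ready:
                return None
            self._held = True
            return self._buf

    def release(self):
        """Let the blocked callback return."""
        with self._cond:
            self._buf = None
            self._held = False
            self._cond.notify_all()
```

Because the callback stays blocked while the reader works, any buffering and overflow handling is left to the layers below, which is the trade-off described above. A real implementation would also need a timeout or shutdown path so the callback cannot block forever when no reader shows up.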
Thanks Josh. Made the suggested changes. Note that this shows a less powerful odroid (the C2) working with the airspy at 10 msps. Hence I would think my XU4 should have no issue with 2.5 msps. I also tried my code with a more powerful Intel Atom x5. That is better than the XU4 but also drops occasionally. I also recall the dropping happening with my desktop core i5 on the master branch. However, I can't test at the moment as the HD has died. Will resurrect though. Also, just to give you a sense of what I'm aiming for: I'm working on v2.0 of this.
Did you manage to have a poke at this? After some thought I figured it might be easier if I just write a very thin python wrapper around the airspy host tools directly, avoiding all overhead.
It might be the case that the overflows simply aren't being reported in this case. Since the buffers are a multiple of the FFT size this might not be noticed. You would have to check the terminal for "O" prints in this case. Sort of a related question, but are the occasional drops/overflows actually harmful in this particular use case?
The interesting/worthwhile thing would be to have the airspy async callback call a registered python function. Wrapping the async callback like that is as close as you can get to the library, as opposed to stashing the buffer for a streaming API to pick it up. No idea if that's a significant difference. So I tested this on RTL and made the same changes to Airspy, you can try it out on this branch. It compiles but no idea if it works, it's just a near identical change: https://github.com/pothosware/SoapyAirspy/tree/streaming_callback_handshake I'm adding a rate testing option to SoapySDRUtil. I personally wanted this to test out streaming changes, but I think it would also be a good data point in this case as well (when I get it finished).
On this branch, there is a rate test, looks like this for RTL: https://github.com/pothosware/SoapySDR/tree/rate_test
Got back to testing. With current master and slightly updated hardware I can now read without overflow ('O's) it seems, so that's good. The odroid still gets overflows. Though in principle they aren't problematic, when other processing processes kick in it does become worse and I'd rather avoid them. However after some thought I decided to move away from the odroid for this and other reasons. The streaming_callback_handshake branch consistently fails for me though with a -2 error. How best to debug this? The streaming rate functionality in SoapySDRUtil is useful, thanks. I get consistently around 9.9 Bps for a 2.5e6 sample rate. I do notice that my output does not show the Overflows column you have in yours?
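As a sanity check on the numbers in this thread, the expected byte rate and the time covered by a batch follow directly from the sample rate. The sketch below is plain arithmetic. The bytes-per-sample table reflects the SoapySDR format definitions (complex samples, I plus Q); which format the rate test used is not stated here, so treating it as CS16 is an assumption.

```python
# Bytes per complex sample for a few SoapySDR stream formats.
BYTES_PER_SAMPLE = {"CS8": 2, "CS16": 4, "CF32": 8}


def throughput_bytes_per_s(sample_rate, fmt):
    """Byte rate the host must sustain for a given format."""
    return sample_rate * BYTES_PER_SAMPLE[fmt]


def duration_s(num_samples, sample_rate):
    """Time span covered by num_samples at sample_rate."""
    return num_samples / sample_rate


rate = 2.5e6
print(throughput_bytes_per_s(rate, "CS16") / 1e6)   # MB/s for CS16
print(throughput_bytes_per_s(rate, "CF32") / 1e6)   # MB/s for CF32
print(duration_s(65536, rate) * 1e3)                # ms per 65536-sample batch
print(duration_s(4194304, rate))                    # s per 4194304-sample scan
```

At 2.5 Msps a CS16 stream is 10 MB/s, which would match the reported figure if it is in megabytes per second (an assumption). One lost batch of 65536 samples is a gap of about 26 ms, and a 4194304-sample scan spans about 1.68 s.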
It was again something that I tested on rtl and tried to copy over. -2 is the stream error code (SOAPY_SDR_STREAM_ERROR), so probably a bad state. There's only one place that returns that error code: https://github.com/pothosware/SoapyAirspy/blob/streaming_callback_handshake/Streaming.cpp#L275 Which would imply that the number of bytes wasn't reset to 0. activateStream() should handle this, as well as all calls to releaseReadBuffer(), which is called by readStream(). So that's something to print; also check if your code is making the call to activateStream(). Just a guess...
I may have changed it since then to only print overflows when the count is > 0. If the util doesn't get overflows that may say something about the overhead of python vs C/C++.
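Since both -4 and -2 come up in this thread, a small lookup table can make logs easier to read. The values below are the error codes defined by SoapySDR; the `describe` helper itself is hypothetical and only for illustration.

```python
# Negative readStream() return values as defined by SoapySDR.
SOAPY_ERRORS = {
    -1: "SOAPY_SDR_TIMEOUT",
    -2: "SOAPY_SDR_STREAM_ERROR",
    -3: "SOAPY_SDR_CORRUPTION",
    -4: "SOAPY_SDR_OVERFLOW",
    -5: "SOAPY_SDR_NOT_SUPPORTED",
    -6: "SOAPY_SDR_TIME_ERROR",
    -7: "SOAPY_SDR_UNDERFLOW",
}


def describe(ret):
    """Turn a readStream() return value into a readable string."""
    if ret >= 0:
        return "%d samples read" % ret
    return SOAPY_ERRORS.get(ret, "unknown error %d" % ret)


print(describe(-4))
print(describe(-2))
```

Logging the name rather than the bare number makes it obvious whether a failed read was a dropped batch (overflow) or a bad stream state.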
First of all thanks for the effort on this, it's been very useful for me so far. I have run into a performance issue I didn't think would be a problem.
I'm using this through the Soapy python bindings. I have a specific use case where I need to capture ~2.5 seconds of samples at a time in a continuous loop at a fixed centre frequency. I'm running an airspy R2 operating at 2.5 msps.
I found out quickly that capturing ~2.5 seconds in one go causes errors or hangs, even on a relatively recent core i5. So I redid the code so I read the ~2.5 seconds in chunks.
That then makes the computer happy, but occasionally it still prints an 'O' to standard output which, from looking at the SoapyAirspy code, means the client cannot keep up. I'm not sure what exactly happens internally at that point, but it's occasional enough not to bother me too much (can I avoid it though?).
However, I am now running the same code on a less powerful machine. Now it very frequently fails to read a chunk, returning error code -4. It should, however, be able to keep up. I have done similar tests with (py)rtlsdr.
So hence this issue. Am I approaching this in the correct way (see code below)? Is there an optimal buffer/chunk size I should be using? Should I activate/deactivate the stream on every 2.5s instead of only once at the beginning/end?
Code (stripped to its essentials) below. It runs in its own dedicated process.