Fix posterior and estimator integer overflow bugs on Windows #259
Hello, I made the two code changes in posterior.py, however, I still get the exact same error. I'm not sure how to run the "test" using the files within the test folder...
Thanks,

cellbender:remove-background: Computing target noise counts per gene for MCKP estimator
@ps2504 it looks like you have cloned the GitHub repository to your local machine, so all you have to do to run the relevant tests would be (in the folder where the cellbender code is located)
If the tests pass, you'd see an output like this
But based on what you said about getting the same error, I would expect the test to fail. I will try to add a Windows configuration to the automatic tests that run on GitHub.
Hello again,
Okay, I was able to run the test per your instructions. INTERESTINGLY, the test looks like there are no issues, but I went back and reran the code and got the same error.

(cellbender) C:\Users\SuP1\Downloads\CellBender>pytest -k test_save_and_load cellbender/remove_background/tests/test_posterior.py
cellbender\remove_background\tests\test_posterior.py .... [100%]
===================== 4 passed, 72 deselected in 0.18s =====================
The last error in the traceback was in posterior.py, but the one before it is in estimation.py. I have no idea, but could that have something to do with it? Thank you so much for looking into it.
I pasted the error into ChatGPT and this is what it returned... I hope this might be useful?

The error message you provided indicates that there is an issue with the input data, specifically with the indices provided in the m_inds array. It seems like the values in the m_inds array are set to -2147483648, which is the minimum value for a 32-bit signed integer (int32). This value is often used as a sentinel value or a placeholder in various programming contexts. Here are a few things you can check and consider to diagnose and address this issue:

* Index Range: The error message suggests that the values in m_inds are out of range. Check if the indices you're using are within the valid range for the data structure or function you're using. Make sure that the indices are non-negative and within the appropriate range for the specific operation you're performing.
* Data Type: Ensure that the data type of the m_inds array is correct. If the data type is incorrectly specified or if there's a type mismatch, it can lead to unexpected behavior or errors. For example, using a data type that is too small to represent the indices can result in overflow or wrapping of values.
* Initialization: If the m_inds array is being initialized with default values, such as -2147483648, it might indicate an issue with the way the array is being created or populated. Make sure that the array is being initialized correctly and that the values are being assigned as intended.
* Data Integrity: Check the source of the data you're using to populate the m_inds array. It's possible that there's a bug or a data preprocessing issue that's causing incorrect values to be assigned to the array.
* Library or Framework Specifics: If you're using a specific library or framework for numerical computations, consult its documentation to understand the expected input formats and data ranges. Different libraries might have different requirements for input data.
* Data Exploration: Before passing the data to the function that's causing the error, print out the contents of the m_inds array to see if the values are indeed what you expect them to be. This can help you identify any inconsistencies or unexpected values.

Without more context about the specific code you're working with and the function that's causing the error, it's a bit challenging to provide a more precise solution. However, I hope these general suggestions help you identify and resolve the issue you're facing.
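For anyone wanting to try the "Data Type" and "Data Exploration" suggestions concretely, here is a minimal sketch of a diagnostic. The helper name describe_indices and the sample array are made up for illustration and are not part of CellBender; the idea is only to report the dtype and to count negative values and int32-minimum values in an index array before it is used.

```python
import numpy as np


def describe_indices(m_inds):
    """Hypothetical diagnostic (not CellBender code): summarize an index array."""
    m_inds = np.asarray(m_inds)
    int32_min = np.iinfo(np.int32).min  # -2147483648
    return {
        'dtype': str(m_inds.dtype),
        'min': int(m_inds.min()),
        'max': int(m_inds.max()),
        # Valid indices should never be negative.
        'n_negative': int((m_inds < 0).sum()),
        # Count entries equal to the int32 minimum seen in the error message.
        'n_int32_min': int((m_inds == int32_min).sum()),
    }


# Made-up example array containing the suspicious value from the error.
report = describe_indices(np.array([-2147483648, 7, 42], dtype=np.int32))
print(report)
```

A nonzero n_negative or n_int32_min, together with an int32 dtype, would point toward an integer width problem rather than bad input data.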
Fixes Windows bug #250
np.uint64, this may be necessary on Windows
Hi. Just wanted to update that the most recent changes you made (Make dtype of m explicit in apply_function_dense_chunks) seem to have done the trick. Thanks a bunch!
@ps2504 wonderful! Great news!
offset_dict = {k + greater_than_max_int32: v for k, v in log_prob_coo['offsets'].items()}

# this is just a shim
converter = IndexConverter(total_n_cells=2,
This is a very cool library for "mock"ing classes and methods for unit tests:
https://docs.python.org/3/library/unittest.mock.html
(just for future reference)
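As a rough illustration of how unittest.mock could replace a hand-written shim in a test like this one, here is a minimal sketch. The method name get_m_indices, its keyword arguments, and the function largest_index are assumptions made for the example, not a statement about the real IndexConverter interface.

```python
from unittest.mock import MagicMock

# Stand-in for a converter object. The method name and return value below
# are hypothetical and chosen only to exceed the int32 range.
converter = MagicMock()
converter.get_m_indices.return_value = [5_000_000_000]


def largest_index(conv, cell_inds, gene_inds):
    """Hypothetical code under test: asks the converter for flat indices."""
    return max(conv.get_m_indices(cell_inds=cell_inds, gene_inds=gene_inds))


result = largest_index(converter, [0], [1])

# The mock records how it was called, so the test can check the interaction.
converter.get_m_indices.assert_called_once_with(cell_inds=[0], gene_inds=[1])
print(result)
```

The trade-off is that a mock only returns whatever it is told to, so it does not exercise any real index arithmetic; a shim built from the real class still does.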
* Add WDL input to set number of retries. (#247)
* Move hash computation so that it is recomputed on retry, and now-invalid checkpoint is not loaded. (#258)
* Bug fix for WDL using MTX input (#246)
* Memory-efficient posterior generation (#263)
* Fix posterior and estimator integer overflow bugs on Windows (#259)
* Move from setup.py to pyproject.toml (#240)
* Fix bugs with report generation across platforms (#302)
---------
Co-authored-by: kshakir <github@kshakir.org>
Co-authored-by: alecw <alecw@users.noreply.github.com>
The data structure used to store the full posterior is a [m, c] matrix, where m is n_barcodes times n_genes. So m can be absolutely massive.
When m > maximum possible int32:
Including explicit casts to uint64 seems to solve the problem on Windows (pending input on #252).
I have included a test which I believe might fail on Windows without this fix, though I do not have a Windows setup to test on yet.
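A minimal sketch of the failure mode and of the uint64 cast, for illustration only. The helper flat_index and the layout m = barcode_index * n_genes + gene_index are assumptions made for this example and are not taken from the CellBender code. Forcing int32 here imitates what can happen on Windows with NumPy versions before 2.0, where the default integer type is 32 bits wide.

```python
import numpy as np

INT32_MAX = np.iinfo(np.int32).max  # 2147483647


def flat_index(barcode_inds, gene_inds, n_genes, dtype):
    """Hypothetical helper: combine (barcode, gene) pairs into one index m."""
    b = np.asarray(barcode_inds).astype(dtype)
    g = np.asarray(gene_inds).astype(dtype)
    # The arithmetic is carried out in `dtype`; NumPy integer arrays wrap
    # around silently on overflow instead of raising an error.
    return b * dtype(n_genes) + g


n_genes = 50_000
barcodes = np.array([0, 1, 49_999])
genes = np.array([0, 2, 49_999])

m_int32 = flat_index(barcodes, genes, n_genes, np.int32)    # last entry wraps
m_uint64 = flat_index(barcodes, genes, n_genes, np.uint64)  # explicit cast

print(m_int32.dtype, m_int32)
print(m_uint64.dtype, m_uint64)
print(int(m_uint64[-1]) > INT32_MAX)
```

With int32 the largest index comes out negative, while the uint64 version keeps the intended value, which is the effect the explicit casts in this PR are after.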