unumpy array creation from string representation #289

Open
wants to merge 8 commits into
base: master

Conversation

jnahlers

uncertainties has the very useful function ufloat_fromstr, which creates ufloat objects from various string representations. For symmetry, I thought it would be good to also include a uarray_fromstr function that turns the string representation of an array of ufloats into a uarray. In particular, it re-creates uarray objects from the result of str(my_uarray) (as independent, uncorrelated variables, of course).

>>> import numpy as np
>>> from uncertainties import ufloat, unumpy as unp
>>> my_uarray = np.array([ufloat(2,1), ufloat(1.1,0.1)])
>>> my_str_rep = str(my_uarray)
>>> my_str_rep
'[2.0+/-1.0 1.1+/-0.1]'
>>> my_new_uarray = unp.uarray_fromstr(my_str_rep)
>>> unp.nominal_values(my_uarray) == unp.nominal_values(my_new_uarray)
array([ True,  True])
>>> unp.std_devs(my_uarray) == unp.std_devs(my_new_uarray)
array([ True,  True])

Implementing this has a few issues. Firstly, the string representations of numpy arrays use spaces to separate values. Therefore, string representations of ufloats that contain spaces (as supported by ufloat_fromstr) cannot (straightforwardly) be supported by uarray_fromstr.
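As a minimal illustration of the ambiguity (not code from this PR), the sketch below splits two array strings on whitespace. The spaced form `2.0 +/- 1.0` is only an assumed example of a ufloat representation that contains spaces, and `split_elements` is a hypothetical helper.

```python
# Compact form, as produced by str() on the array in the example above.
compact = "[2.0+/-1.0 1.1+/-0.1]"
# Assumed spaced form: each ufloat representation itself contains spaces.
spaced = "[2.0 +/- 1.0 1.1 +/- 0.1]"


def split_elements(text):
    """Hypothetical helper: drop the outer brackets and split on whitespace."""
    return text.strip("[]").split()


compact_tokens = split_elements(compact)
spaced_tokens = split_elements(spaced)

print(compact_tokens)  # ['2.0+/-1.0', '1.1+/-0.1']
print(spaced_tokens)   # ['2.0', '+/-', '1.0', '1.1', '+/-', '0.1']
```

In the compact case every token is one element. In the spaced case the element boundaries are lost, so a parser would have to guess how to regroup the six tokens.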

uarrays, just like numpy arrays, can be multi-dimensional. In this case, the string representation will contain nested square brackets and newline characters. I have implemented uarray_fromstr to support this using a slightly hacky workaround that relies on ast.literal_eval. While this is much safer than eval, there are still some important security considerations.
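One way such an ast.literal_eval workaround could look is sketched below. This is an assumption about the general approach, not the PR's actual implementation: `parse_uarray_str`, `_to_pair`, and `_convert` are hypothetical names, only the plain `nominal+/-std_dev` form is handled, and the result is nested lists of float pairs so the sketch runs with the standard library alone. In real use each token would presumably be passed to ufloat_fromstr instead.

```python
import ast
import re


def _to_pair(token):
    """Split 'nominal+/-std_dev' into a pair of floats (plain form only)."""
    nominal, _, std_dev = token.partition("+/-")
    return float(nominal), float(std_dev)


def _convert(node):
    """Recursively convert nested lists of tokens into nested lists of pairs."""
    if isinstance(node, list):
        return [_convert(child) for child in node]
    return _to_pair(node)


def parse_uarray_str(text):
    """Hypothetical sketch: parse a numpy-style array string of ufloats."""
    # Quote every run of characters that is neither whitespace nor a bracket
    # and follow it with a comma: 2.0+/-1.0 becomes '2.0+/-1.0',
    quoted = re.sub(r"[^\s\[\]]+", lambda m: repr(m.group(0)) + ",", text.strip())
    # Separate sibling rows with commas as well: ] becomes ],
    quoted = quoted.replace("]", "],")
    # The trailing comma turns the whole expression into a 1-tuple; unwrap it.
    (nested,) = ast.literal_eval(quoted)
    return _convert(nested)


flat = parse_uarray_str("[2.0+/-1.0 1.1+/-0.1]")
grid = parse_uarray_str("[[1.0+/-0.1 2.0+/-0.2]\n [3.0+/-0.3 4.0+/-0.4]]")
print(flat)  # [(2.0, 1.0), (1.1, 0.1)]
print(grid)  # [[(1.0, 0.1), (2.0, 0.2)], [(3.0, 0.3), (4.0, 0.4)]]
```

Because the input is rewritten so that it contains only string literals, lists, and commas, ast.literal_eval never evaluates arbitrary expressions. It can still be made to consume excessive memory or hit recursion limits on very large or very deeply nested input, so untrusted strings would need a size check first.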

  • Executed pre-commit run --all-files with no errors
  • The change is fully covered by automated unit tests
  • Documented in docs/ as appropriate
  • Added an entry to the CHANGES file

@newville
Member

@jnahlers Thanks. I have to admit that I am not an enormous fan of either ufloat_fromstr or of numpy arrays of ufloat objects (as opposed to having an object that contained an array of floats and an array of uncertainties).

I think the challenges and messy bits of ufloat_fromstr are compounded for arrays.

But also: Is there a problem that this is solving? Like, who has string representations of arrays of ufloats that are stored in a way that needs to be parsed? That is, as opposed to a file with tables with columns for nominal values and standard errors?
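For comparison, the column-based storage mentioned here could look like the following sketch. The column names `nominal` and `std_dev` and the in-memory buffer are assumptions chosen for illustration; only the standard library is used.

```python
import csv
import io

# Example values matching the array in the PR description.
nominal = [2.0, 1.1]
std_dev = [1.0, 0.1]

# Write one row per element, with separate columns for value and uncertainty.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["nominal", "std_dev"])
writer.writerows(zip(nominal, std_dev))

# Read the table back; no ufloat-specific string parsing is required.
buffer.seek(0)
rows = list(csv.DictReader(buffer))
read_nominal = [float(row["nominal"]) for row in rows]
read_std_dev = [float(row["std_dev"]) for row in rows]

print(read_nominal)  # [2.0, 1.1]
print(read_std_dev)  # [1.0, 0.1]
```

The two lists can then be handed to unumpy.uarray(read_nominal, read_std_dev) to rebuild an array of independent variables.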

@jnahlers
Author

> I have to admit that I am not an enormous fan of either ufloat_fromstr or of numpy arrays of ufloat objects (as opposed to having an object that contained an array of floats and an array of uncertainties).

The beauty of the unumpy module (and indeed, the uncertainties package as a whole) is that it enables easy uncertainty propagation (including remembering correlations) through lengthy calculations.

> I think the challenges and messy bits of ufloat_fromstr are compounded for arrays.

Agreed! Certainly ufloat_fromstr is messy, and I am very grateful for the work that was put into implementing it. I think I found a fairly clean and straightforward extension to arrays, but of course that could quickly change if issues are found and things get more complicated.

> But also: Is there a problem that this is solving? Like, who has string representations of arrays of ufloats that are stored in a way that needs to be parsed? That is, as opposed to a file with tables with columns for nominal values and standard errors?

Anyone who has to deal with unumpy arrays that were dumped to text. I encountered it in the context of a large pandas dataframe that included unumpy arrays and was saved using the dataframe method to_csv(). Of course, this is far from ideal, but I think it's inevitable that people will dump objects to text, so a parser (which is essentially what this is) can be quite convenient.

@andrewgsavage
Contributor

It may be more appropriate for your problem to add writing/parsing support to uncertainties-pandas
https://github.com/andrewgsavage/uncertainties-pandas/blob/main/notebooks/tutorial.ipynb

@jnahlers
Author

> It may be more appropriate for your problem to add writing/parsing support to uncertainties-pandas https://github.com/andrewgsavage/uncertainties-pandas/blob/main/notebooks/tutorial.ipynb

Interesting library! I am not working with pandas myself, I am just parsing a text output from pandas.

@andrewgsavage
Contributor

andrewgsavage commented Jan 30, 2025 via email


3 participants