WIP pandas #672
>>> from pint.pandas_interface import PintArray
>>> import pandas as pd
>>> df = pd.DataFrame({"address": PintArray([1, 2, 3])})
>>> df
           address
0  1 dimensionless
1  2 dimensionless
2  3 dimensionless
>>> df.dtypes
address    Pint
dtype: object
>>> df['address']
0    1 dimensionless
1    2 dimensionless
2    3 dimensionless
Name: address, dtype: Pint
>>> df['address'].values.data
<Quantity([1 2 3], 'dimensionless')>

but unfortunately

>>> df['address'].values
<pint.pandas_interface.PintArray object at 0x117517f28>
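The `<... object at 0x...>` output in the last call is Python's default `object.__repr__`, which is what you get when a class defines no `__repr__` of its own. A minimal sketch of one possible remedy follows, using hypothetical stand-in classes (`FakeQuantity`, `BareArray`, `DescriptiveArray`) rather than pint's or pandas' real ones, and assuming the wrapper keeps its quantity in a `data` attribute as in the session above:

```python
class FakeQuantity:
    """Hypothetical stand-in for a pint Quantity holding several magnitudes."""

    def __init__(self, magnitudes, units):
        self.magnitudes = list(magnitudes)
        self.units = units

    def __repr__(self):
        values = " ".join(str(m) for m in self.magnitudes)
        return f"<Quantity([{values}], '{self.units}')>"


class BareArray:
    """Wrapper with no __repr__: Python falls back to object.__repr__."""

    def __init__(self, data):
        self.data = data


class DescriptiveArray(BareArray):
    """Same wrapper, but the repr delegates to the wrapped data."""

    def __repr__(self):
        return f"{type(self).__name__}({self.data!r})"


q = FakeQuantity([1, 2, 3], "dimensionless")
bare_repr = repr(BareArray(q))                # default "<... object at 0x...>" form
descriptive_repr = repr(DescriptiveArray(q))  # shows the wrapped quantity
print(bare_repr)
print(descriptive_repr)
```

Whether the actual array class in this PR should delegate its repr this way is a design choice for the maintainers; the sketch only illustrates where the unhelpful output comes from.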
Initial commits
Most things seem to work well. Here are my tests:
import pandas as pd
import pint
import numpy as np
pd.__version__
pint.__version__
ureg=pint.UnitRegistry()
Q_ = ureg.Quantity
b=Q_([1,2,2,3],"m")
c=pint.QuantityArray._from_sequence([item for item in b])
c
[item for item in b]
c.data
# [1 2 2 3] meter
d=c*2
d.data
# [2 4 4 6] meter
df=pd.DataFrame({"a":c,"b":c})
df
df.a.dtype
df.a.values
s=df.a*df.b
s
#this rightly shouldn't work
j=df.a**df.b
e=df.a.values + (ureg.cm * [5,5,5,5 ])
e.data
# [1.05 2.05 2.05 3.05] meter
type(c.data)
df.a
#why is this different to above?!
df=pd.DataFrame(h)
df
type(df[0].values)
#At least cyberpandas has the same issue
from cyberpandas import IPArray
df=pd.DataFrame(IPArray(['192.168.1.1', '192.168.1.10']))
df[0].dtype
df = pd.DataFrame({"address": IPArray(['192.168.1.1', '192.168.1.10'])})
df.address.dtype
df=pd.DataFrame({"a":pint.QuantityArray(Q_([1,2,2,3],"m")),"b":pint.QuantityArray(Q_([5,12,52,53],"m"))})
# df['c']=df.a*df.b
df
#switching the order
# all(not ju.is_na or ju.block.is_extension for ju in join_units) and
# to
# all(not ju.block.is_extension for ju in join_units or ju.is_na ) and
# fixes this one
pd.concat([df,df], axis=0)
h=c._concat_same_type([c,c])
df=pd.DataFrame({"a":h})
df
#changing that to right fixes that
df.a==h
Was unnecessarily doing np.array(<Quantity>)
This should mean that all the tests pass again. Given pint's focus on avoiding dependencies, we shouldn't import the pandas interface by default (as it depends on pandas). If users want it, they'll have to do `from pint.pandas_array import QuantityArray` which is a little longer than `from pint import QuantityArray` but I think that's ok as it's a specialised usage.
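The pattern described here, keeping the core package importable without pandas and leaving the pandas-dependent piece in a submodule that users import explicitly, can be illustrated with a generic optional-dependency guard. This is a sketch using only the standard library; `optional_import` is a hypothetical helper name, not something that exists in pint:

```python
import importlib
import importlib.util


def optional_import(module_name):
    """Return the named top-level module if it is installed, else None.

    Intended for top-level names such as "pandas"; the lookup does not
    import the module unless it is actually found.
    """
    if importlib.util.find_spec(module_name) is None:
        return None
    return importlib.import_module(module_name)


# A package's __init__ can stay free of the heavy dependency entirely;
# only code that needs it performs this check.
pd = optional_import("pandas")
HAS_PANDAS = pd is not None
```

With a guard like this, tests that need pandas can be skipped when `HAS_PANDAS` is false, while the rest of the suite runs on a minimal install.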
Remove pandas_array import from __init__.py
…perations
Add check to ensure array sizes of RHS and LHS match.
Prevented TypeErrors when performing an operation with a single-value quantity.
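As a rough illustration of the two behaviours this commit message names (rejecting operands of different length, and accepting a single value on either side), here is a pure-Python sketch. The helper names `_is_sized` and `elementwise` are hypothetical and plain lists stand in for quantity arrays; this is not the code from the PR:

```python
import operator


def _is_sized(x):
    """True for operands with a length (lists, tuples), False for scalars."""
    return hasattr(x, "__len__")


def elementwise(op, lhs, rhs):
    """Apply a binary operator element by element.

    Two sized operands must have equal length; a scalar on either side
    is applied against every element of the other operand.
    """
    if _is_sized(lhs) and _is_sized(rhs):
        if len(lhs) != len(rhs):
            raise ValueError(f"Lengths must match: {len(lhs)} != {len(rhs)}")
        return [op(a, b) for a, b in zip(lhs, rhs)]
    if _is_sized(lhs):
        return [op(a, rhs) for a in lhs]
    if _is_sized(rhs):
        return [op(lhs, b) for b in rhs]
    return op(lhs, rhs)


print(elementwise(operator.add, [1, 2], [3, 4]))
print(elementwise(operator.pow, [1, 2, 3], 2))
```

Raising on a length mismatch up front avoids silently truncating to the shorter operand, which is what `zip` alone would do.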
Fixes the great DimensionalityError that occurs when you pow with a single-value quantity:
DimensionalityError: Cannot convert from 'dimensionless' to 'dimensionless'
I have a few comments:
Removing docs as I haven't worked out how to set them up properly to pass tests. DF accessors added to ease going between QAs and numerical arrays.
Tidy up tests
Add example and tidy up source a tiny bit
Get tests passing again
@andrewgsavage probably best close this?
ya
Initial commits