-
Notifications
You must be signed in to change notification settings - Fork 610
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding new backend for MapD #1419
Conversation
Added mapd backend initial files.
Improved mapd client and compiler; Added initial documentation.
README updated; Initial changes to use execute method.
Improving ibis.mapd client and compiler
Added Math, trigonometric and geometric operations
Resolves ibis ibis-project#1418 and resolves ibis-project#893
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @xmnlab! This is solid progress. Let's do a few more review cycles before we merge this in and try to clean up a bit of the duplication. Overall, though, this is pretty close.
ibis/expr/api.py
Outdated
@@ -388,6 +389,8 @@ def row_number(): | |||
|
|||
e = ops.E().to_expr() | |||
|
|||
pi = ops.Pi().to_expr().name('pi') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would leave this unnamed for now. No reason to make this choice for users.
ibis/expr/api.py
Outdated
acos = _unary_op('acos', ops.Acos) | ||
asin = _unary_op('asin', ops.Asin) | ||
atan = _unary_op('atan', ops.Atan) | ||
atan2 = _generic_op('atan2', ops.Atan2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a binary operation right? There should be something like _binary_op
function lying around here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right. I think this is the right function: _binop_expr
ibis/expr/operations.py
Outdated
@@ -516,6 +521,53 @@ class Log10(Logarithm): | |||
"""Logarithm base 10""" | |||
|
|||
|
|||
# TRIGONOMETRIC OPERATIONS | |||
|
|||
class TrigonometryUnary(UnaryOp): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you change the name from TrigonometryUnary
to TrigonometricUnary
, and do the same for TrigonometryBinary
.
ibis/expr/operations.py
Outdated
@@ -2183,6 +2235,14 @@ def output_type(self): | |||
return partial(ir.FloatingScalar, dtype=dt.float64) | |||
|
|||
|
|||
class Pi(Constant): | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you could saying something here like The constant pi.
ibis/mapd/client.py
Outdated
elif GPUDataFrame is not None and isinstance( | ||
self.cursor, GPUDataFrame | ||
): | ||
result = self.cursor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're executing the same code (assigning self.cursor
to result
) in the case that the cursor is a pandas DataFrame or a GPUDataFrame, can you remove the last two elif
s? Is there a case where self.cursor is not None
and it's not either a pandas DataFrame or a GPUDataFrame?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right! thanks!
ibis/mapd/compiler.py
Outdated
# compile the argument | ||
compiled_arg = translator.translate(arg) | ||
|
||
return 'CHAR_LENGTH(%s)' % compiled_arg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Format strings.
ibis/mapd/tests/test_compiler.py
Outdated
import ibis.expr.datatypes as dt | ||
|
||
|
||
def test_timestamp_accepts_date_literals(alltypes): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a BigQuery specific test. Do you really need it?
ibis/mapd/udf.py
Outdated
@@ -0,0 +1,5 @@ | |||
""" | |||
User Defined Function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't add this module unless there's support for this in MapD.
ibis/mapd/tests/test_client.py
Outdated
assert result == expected | ||
|
||
''' | ||
def test_simple_aggregate_execute(alltypes, df): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many of these tests are BigQuery specific. Can you remove them from here?
The preferred alternative is to add appropriate tests in ibis/tests/all/test_*.py
. To do that, you'll also need to add a MapD
class in ibis/tests/all/backends.py
. There are many examples in that file to get you started.
@@ -0,0 +1,798 @@ | |||
from six import StringIO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like a lot of the functions and objecs in here are duplicated from either the impala or bigquery backends. Can you see if you can reuse some of their functions so we have only what's needed for MapD?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I will take a look :) thanks!
ci/requirements-docs-3.6.yml
Outdated
@@ -22,6 +22,7 @@ dependencies: | |||
- plumbum | |||
- psycopg2 | |||
- pyarrow>=0.6.0 | |||
- pymapd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this require a version constraint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right! it is better to pin the version here. thanks!
ibis/__init__.py
Outdated
@@ -71,6 +72,12 @@ | |||
# pip install ibis-framework[bigquery] | |||
import ibis.bigquery.api as bigquery | |||
|
|||
with suppress(ImportError): | |||
# pip install ibis-framework[mapd] | |||
if sys.version_info[0] < 3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use sys.version_info.major
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok! thanks
ibis/__init__.py
Outdated
with suppress(ImportError): | ||
# pip install ibis-framework[mapd] | ||
if sys.version_info[0] < 3: | ||
raise ImportError('ibis.mapd is not allowed it for Python 2.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo here, should read ibis.mapd is not allowed for Python 2
or The mapd backend is not supported under Python 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
Corrections from the revision
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @cpcloud !
I've push new changes
ibis/expr/operations.py
Outdated
dtype = self.left.type().largest | ||
else: | ||
dtype = dt.float64 | ||
return dtype.scalar_type() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. Sorry, my mistake. I was a little bit confused. I will changed that.
how = Arg(rlz.isin({'sample', 'pop'}), default=None) | ||
where = Arg(rlz.boolean, default=None) | ||
|
||
def output_type(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok! thanks!
ibis/expr/operations.py
Outdated
|
||
class Distance(ValueOp): | ||
""" | ||
Calculates distance in meters between two WGS-84 positions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can do that. We just cannot test it because mapd doesn't have a Euclidean operation.
I am just not sure if common Euclidean distance function works with lat lon parameters .. so maybe would be better to remove this function and create a issue to follow this discussion.
Changed left and right params to column numeric type
@cpcloud thanks a lot for reviewing this PR. this is a compilation of the main fixes here:
if these points are ok for you, I think it is ready for a new review. I just changed the left and right parameters from correlation to column numeric. Again, thank you so much for your attention. |
there is a conflict now .. I will rebase here now. |
Merged from master
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xmnlab After you address this round of comments I wil approve and merge! Thanks for the effort!!
ibis/mapd/client.py
Outdated
self, name, password=None, is_super=None, insert_access=None | ||
): | ||
""" | ||
Create a new MapD database |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This docstring looks wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right. thanks!
statement = ddl.DropDatabase(name) | ||
self._execute(statement) | ||
|
||
def create_user(self, name, password, is_super=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a new API. @xmnlab can you create a follow up issue to add this API to the clients that have support for such functionality?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I will do it.
) | ||
self._execute(statement) | ||
|
||
def drop_user(self, name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I will do it! thanks!
ibis/tests/all/test_temporal.py
Outdated
@@ -41,9 +41,11 @@ def test_timestamp_extract(backend, alltypes, df, attr): | |||
@pytest.mark.parametrize('unit', [ | |||
'Y', 'M', 'D', | |||
param('W', marks=pytest.mark.xfail), | |||
'h', 'm', 's', 'ms', 'us', 'ns' | |||
'h', 'm', 's', 'ms', 'us', | |||
param('ns', marks=pytest.mark.xfail) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this pass now that you added the skipif_backend
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, for some reason when it try to test ns
unit .. raise an error that the pytest breaks ...
I tried to use pytest mark for ns
and it didn't work .. I needed to skip that for mapd.
for another tests here .. I could remove skipif_backend
and just put some xfail for some units and works good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to have both xfail
on 'ns'
and the skipif_backend('MapD')
decorator? Shouldn't it be enough to just skip this on MapD altogether?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right! sorry, I will remove this now.
ibis/tests/all/conftest.py
Outdated
pytest.param(Impala, marks=pytest.mark.impala) | ||
] | ||
|
||
if sys.version_info.major == 3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be > 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok thanks!
), | ||
param( | ||
lambda t: t.double_col.cov(t.float_col), | ||
91.67005567565313, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ultimately we will want to change these to use a pandas call or numpy call, so that we don't have to depend on hard coded values.
This is fine for now though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks!
@@ -0,0 +1,422 @@ | |||
from ibis.sql.compiler import DDL, DML |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have tests for the classes in this file? If not, please add them in a follow-up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No ... I had some problems adding DDL tests because it was breaking the mapd database container. I will create now a issue for that.
@@ -1752,6 +1814,34 @@ def _string_like(self, patterns): | |||
) | |||
|
|||
|
|||
def _string_ilike(self, patterns): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this API have a test in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changes done. I will commit my changes.
), | ||
param( | ||
lambda t: t.double_col.cov(t.float_col), | ||
91.67005567565313, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks!
ibis/tests/all/conftest.py
Outdated
pytest.param(Impala, marks=pytest.mark.impala) | ||
] | ||
|
||
if sys.version_info.major == 3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok thanks!
@@ -0,0 +1,422 @@ | |||
from ibis.sql.compiler import DDL, DML |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No ... I had some problems adding DDL tests because it was breaking the mapd database container. I will create now a issue for that.
statement = ddl.DropDatabase(name) | ||
self._execute(statement) | ||
|
||
def create_user(self, name, password, is_super=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I will do it.
ibis/mapd/client.py
Outdated
self, name, password=None, is_super=None, insert_access=None | ||
): | ||
""" | ||
Create a new MapD database |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right. thanks!
) | ||
self._execute(statement) | ||
|
||
def drop_user(self, name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I will do it! thanks!
ibis/tests/all/test_temporal.py
Outdated
@@ -41,9 +41,11 @@ def test_timestamp_extract(backend, alltypes, df, attr): | |||
@pytest.mark.parametrize('unit', [ | |||
'Y', 'M', 'D', | |||
param('W', marks=pytest.mark.xfail), | |||
'h', 'm', 's', 'ms', 'us', 'ns' | |||
'h', 'm', 's', 'ms', 'us', | |||
param('ns', marks=pytest.mark.xfail) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, for some reason when it try to test ns
unit .. raise an error that the pytest breaks ...
I tried to use pytest mark for ns
and it didn't work .. I needed to skip that for mapd.
for another tests here .. I could remove skipif_backend
and just put some xfail for some units and works good.
Changes from the review
Removed xfail mark for ns unit
Merging on green! |
@cpcloud thank you so much for your attention and support! |
nice! bombs away! |
Also resolves #1418 and resolves #893