Basic parametrised query generation (WIP) #201

grigi · 2019-01-07T21:35:33Z

This PR adds parametrized query building by adding a Parameter(':1') term that will be the parameter placeholder.
Due to placeholders in SQL for parametrized queries being so varied (even between drivers or setup for the same dialect) that I decided to leave implementation as explicit as possible.
Whatever is indicated as the string is included verbatim into the SQL.

An example of this madness:
PostgreSQL → $number OR %s + :name (depending on driver)
MySQL → %s
SQLite → ?
Vertica → :name
Oracle → :number + :name
MSSQL → %(name)s OR :name + :number (depending on driver)

This PR is a work-in-progress:

Added Python3.7 and PyPy3.5 to Travis builds.
Added some tests of when to NOT parametrize.
Added some parameter tests.
Implemented a Parameter() term, and added to root __init__.py for convenience
Removed __str__() where the parent has the same variant implemented.
Update documentation

Fixes #113, potential workaround for some of #3

coveralls · 2019-01-07T21:37:26Z

Coverage increased (+0.07%) to 98.088% when pulling 6ce8072 on grigi:parametrized_query into 38f1d6d on kayak:master.

twheys · 2019-01-08T09:08:41Z

Hi, thanks for the PR. This feature definitely a big ask from a lot of people, so appreciate you contributing it. So far the change looks good, let me know if you need anything in order to check off the remaining boxes.

grigi · 2019-01-09T10:45:33Z

I did some more digging on the 110 tests that fail due to kwargs['params'] not existing is all lower level unit tests.
Which would mean either rewriting all 110 tests or leaving the test-specific hack in:
https://github.com/kayak/pypika/pull/201/files#diff-425db19c78eea4c6ae89f8b70c621969R244

I started changing some tests, (I estimate ~500loc diff just to change each of these tests) and found that negation (from test_negation.py) will now generate -? which I'm not sure is valid SQL.
Another concern is that doing str() on anything that is not a QueryBuilder may be an illegal operation now, and it is used everywhere in the tests.

Please advise what you think is the best course of action re this?
I feel removing __str__() from anything but QueryBuilder is the safe thing to do, but could break backwards compatibility. But leaving a condition that should only be triggerd by tests is not right either. This issue could also be outside the scope of this PR and should instead be considered a deprecation issue?

grigi · 2019-01-09T11:45:04Z

Other than API breakage, removing __str__() from all terms is only 132 changes, but in there it generates a default param of quote_char='"'. Which begs the question, can we not also just add params and then do string interpolation? But then what is the difference between leaving the hack in as is?

twheys · 2019-01-09T13:41:18Z

I think it would be really helpful if you would start with some basic tests covering the syntax you would like to use for parameters so it is clear how the pypika query should be built up and the how the SQL generated by it should look.

Using How to use variables in SQL statement in Python? as a reference, it appears that different db vendors use different markers for parameters, so should aim for something that gives the user control over that.

I'm not sure why you would need a second get_sql function to generate these queries. To support paramterized queries, you basically need a query element that serializes as the marker in the query like ?, %s, or :1. Perhaps, instead of mixing this logic in with ValueWrapper, it would make sense to define it as it's own class that can be used by the end user directly in expressions.

I think a syntax like the following would work real well and be easy to implement without needing to change so much of the internals:

customers = Table('customers')
q = Query.from_(customers)\
    .select(customers.id, 
            customers.fname, 
            customers.lname, 
            customers.phone)\
    .where(customers.age >= Parameter('?'))

would generate SQL like

SELECT id,fname,lname,phone FROM customers WHERE age>=?

Then the user can decide which marker to use for each variable, can freely reuse them, and use them directly with the python db API.

cursor.execute(q, 18)

grigi · 2019-01-09T15:22:34Z

Yes, having a Parameter() function would allow us to differentiate between something static and something parametrized.

The question is, right now (whilst I was writing tests) it seems that evaluation happens in the order one does calls, but will this always be the case?
If there isn't a guarantee, one would need an option to identify the order of the parameters.
In that case we would need a way to determine its order, e.g. by naming the parameters or some other way of identifying order.

grigi · 2019-01-10T11:22:03Z

So far I found:
PostgreSQL → $number OR %s + :name (depending on driver)
MySQL → %s
SQLite → ?
Vertica → :name
Oracle → :number + :name
MSSQL → %(name)s OR :name + :number (depending on driver)

That is a lot of variance as to how the parameters get defined.
I don't think one can realistically expect that PyPika can generate a valid value, especially when the only option is named parameters...

Maybe your suggestion of just expecting a string value with the right value is the most sane one?

twheys · 2019-01-10T16:09:58Z

I would hesitate to try and generate which parameter is which because you ultimately can never know how the user will want to use it. For example, what if they want to reference the same parameter in two places within the query. Giving them control over which reference is used for each parameter will 1) force them to use the right one for their database platform and 2) give them full control over how the parameters are used in the query.

To answer your question, I had it in mind to refactor the way the SQL is generated at some point since that is by a long shot the most complex part of this library, but I don't think that would change the order of how constituents of the query are serialized. You can probably safely assume, for example, that if multiple WHERE-clauses are used, they will be serialized in the SQL in the exact same order.

grigi · 2019-01-11T10:08:24Z

I was thinking the same. I had to do some research to get data, and two things make it a bad idea to generate inside PyPika:

different syntax per driver/driver-config (for the same dialect)
named parameters

So I think a very simple Parameter('&2') is actually the best solution.
I'll redo the PR, and it will be a lot less code. Maybe default it to '?'

grigi · 2019-01-14T07:43:29Z

@twheys Please review.

twheys · 2019-01-15T14:43:26Z

Looks great. Approved, merged, and published. Appreciate you added py37 as well. Thanks 👍

imbolc · 2020-05-14T12:15:11Z

I see the point of having explicit Parameter term. But it's still so annoying, that I ended up mostly avoiding it with something like:

q = Query.from_(table).select("*").where(table.id == "?")
sql = str(q).replace("'?'", "?")

Maybe we add some kind of hook to the end of get_sql, which will trigger a globally defined function? So instead of adding replace to each query, it would be just:

pipika.get_sql_patch = lambda s: s.replace(...)

This way people themselves could decide how to parse placeholders for their particular driver, with regex or something. What do you think?

mvanderlee · 2021-04-08T18:05:48Z

The current implementation does not work for us. We have a complex where clause builder that would need to be completely rewritten to support the current parameterization.
The first commit that modified the ValueWrapper was on the right track in my opinion. It just needs some minor changes to still be able to support different parameter formats.

We solved our issue by patching the ValueWrapper as follows:

from typing import Any, List, Optional
from pypika.terms import ValueWrapper
from pypika.utils import format_alias_sql

class ParameterizedValueWrapper(ValueWrapper):
    
    def get_sql(self, quote_char: Optional[str] = None, secondary_quote_char: str = "'", param_wrapper: ParameterWrapper = None, **kwargs: Any) -> str:
        if param_wrapper is None:
            sql = self.get_value_sql(quote_char=quote_char, secondary_quote_char=secondary_quote_char, **kwargs)
            return format_alias_sql(sql, self.alias, quote_char=quote_char, **kwargs)
        else:
            value_sql = self.get_value_sql(quote_char=quote_char, **kwargs)
            param_sql = param_wrapper.get_sql(**kwargs)
            param_wrapper.update_parameters(param_key=param_sql, param_value=value_sql, **kwargs)

            return format_alias_sql(param_sql, self.alias, quote_char=quote_char, **kwargs)


class ParameterWrapper:
  
    def update_parameters(self, param_key: Any, param_value: Any, **kwargs):
        pass

    def get_sql(self, **kwargs):
        pass


class QParameterWrapper(ParameterWrapper):
    def __init__(self, parameters: List[Any]):
        self.parameters = parameters

    def update_parameters(self, param_key: Any, param_value: Any, **kwargs):
        self.parameters.append(param_value)

    def get_sql(self, **kwargs):
        return '?'


class NamedParameterWrapper(ParameterWrapper):
    def __init__(self, parameters: Dict[str, Any]):
        self.parameters = parameters

    def update_parameters(self, param_key: Any, param_value: Any, **kwargs):
        self.parameters[param_key[1:]] = param_value

    def get_sql(self, **kwargs):
        return f':param{len(self.parameters) + 1}'

Usage:

pypika.terms.ValueWrapper = ParameterizedValueWrapper
q = pypika.Query.from_('test').select('*').where((pypika.Field('name') == 'foobar') & (pypika.Field('foo') != 'bar'))

parameters = []
sql = q.get_sql(param_wrapper=QParameterWrapper(parameters))
print(sql)
print(parameters)

parameters = {}
sql = q.get_sql(param_wrapper=NamedParameterWrapper(parameters))
print(sql)
print(parameters)

Output:

SELECT * FROM "test" WHERE "name"=? AND "foo"<>?
['foobar', 'bar']
SELECT * FROM "test" WHERE "name"=:param1 AND "foo"<>:param2
{'param1': 'foobar', 'param2': 'bar'}

Basic parametrised query generation

47e959f

grigi requested review from twheys and a team as code owners January 7, 2019 21:35

grigi mentioned this pull request Jan 7, 2019

Change to all-parametrised queries tortoise/tortoise-orm#81

Closed

9 tasks

grigi added 3 commits January 8, 2019 09:24

Travis test on py3.7

b1e2b3c

Try travis again for py3.7

6c17fde

Add pypy to Travis and update tox.ini to be closer to travis config

6b763d1

twheys self-assigned this Jan 8, 2019

grigi added 3 commits January 13, 2019 21:30

Use Parameter() instead

da80e39

Minor cleanup

1ba73ee

Added documentation

6ce8072

twheys approved these changes Jan 15, 2019

View reviewed changes

twheys merged commit ac939d9 into kayak:master Jan 15, 2019

twheys added the 0.21.0 label Jan 15, 2019

mvanderlee mentioned this pull request Feb 22, 2024

Improved parameter support #793

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Basic parametrised query generation (WIP) #201

Basic parametrised query generation (WIP) #201

Uh oh!

grigi commented Jan 7, 2019 •

edited

Loading

Uh oh!

coveralls commented Jan 7, 2019 •

edited

Loading

Uh oh!

twheys commented Jan 8, 2019

Uh oh!

grigi commented Jan 9, 2019

Uh oh!

grigi commented Jan 9, 2019

Uh oh!

twheys commented Jan 9, 2019

Uh oh!

grigi commented Jan 9, 2019

Uh oh!

grigi commented Jan 10, 2019

Uh oh!

twheys commented Jan 10, 2019

Uh oh!

grigi commented Jan 11, 2019

Uh oh!

grigi commented Jan 14, 2019

Uh oh!

twheys commented Jan 15, 2019

Uh oh!

imbolc commented May 14, 2020

Uh oh!

mvanderlee commented Apr 8, 2021 •

edited

Loading

Uh oh!

Uh oh!

Basic parametrised query generation (WIP) #201

Basic parametrised query generation (WIP) #201

Uh oh!

Conversation

grigi commented Jan 7, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coveralls commented Jan 7, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

twheys commented Jan 8, 2019

Uh oh!

grigi commented Jan 9, 2019

Uh oh!

grigi commented Jan 9, 2019

Uh oh!

twheys commented Jan 9, 2019

Uh oh!

grigi commented Jan 9, 2019

Uh oh!

grigi commented Jan 10, 2019

Uh oh!

twheys commented Jan 10, 2019

Uh oh!

grigi commented Jan 11, 2019

Uh oh!

grigi commented Jan 14, 2019

Uh oh!

twheys commented Jan 15, 2019

Uh oh!

imbolc commented May 14, 2020

Uh oh!

mvanderlee commented Apr 8, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

grigi commented Jan 7, 2019 •

edited

Loading

coveralls commented Jan 7, 2019 •

edited

Loading

mvanderlee commented Apr 8, 2021 •

edited

Loading