Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support prepared statements in PEcAn.DB::db.query #2317

Merged
merged 5 commits into from
Mar 25, 2019

Conversation

ashiklom
Copy link
Member

(NOTE: This PR is a superset of #2316. Merge/review that one first).

Description

Support prepared statements in db.query (if using the RPostgres backend).

Motivation and Context

Prepared statements provide a way to efficiently pass data into SQL queries without the risk of SQL injection attacks. Instead of doing this:

db.query(paste0(
  "SELECT * FROM table WHERE mycol = ", somevalue,
  " AND othercol = ", othervalue
), con = con)

...we can now do this:

db.query(
  "SELECT * FROM table WHERE mycol = $1 AND othercol = $2",
  values = list(somevalue, othervalue),
  con = con
)

Besides preventing SQL injections, prepared statements also ensure that the input and target types are compatible.

Prepared statements provide an efficient way to operate on multiple values at once. For example, the following will return all the models whose revision is either "git", "46", or "unk":

db.query(
  "SELECT * FROM models WHERE revision = $1",
  values = list(c("git", "46", "unk")),
  con = con
)

...and here is an example of inserting multiple values of a given trait for a given species:

db.query(
  "INSERT INTO traits (specie_id, variable_id, mean, n) VALUES ($1, $2, $3)",
  values = list(938, 396, c(1.7, 3.9, 4.5), 1),
  con = con
)

Prepared statements have been on our wishlist for a while (#395).

Review Time Estimate

  • Immediately
  • Within one week
  • When possible

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • I have updated the CHANGELOG.md.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@Chris-Schnaufer
Copy link
Contributor

Please correct text to “parameterized” statements and not prepared statements. Prepared statements are a different thing: https://www.postgresql.org/docs/9.3/sql-prepare.html

@ashiklom
Copy link
Member Author

@Chris-Schnaufer Wikipedia and the documentation for DBI::dbBind both have them as synonyms. Also, searching "parameterized statement postgres" brings me to the documentation for PREPARE. So I think they are interchangeable.

@ashiklom ashiklom force-pushed the prepared_statement branch from 5f58388 to e5fb356 Compare March 15, 2019 16:31
@ashiklom ashiklom mentioned this pull request Mar 15, 2019
12 tasks
Copy link
Member

@robkooper robkooper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to deprecate RPostreSQL in favor of RPostgres?

@ashiklom
Copy link
Member Author

Would it make sense to deprecate RPostreSQL in favor of RPostgres?

I wouldn't do it until we give everyone a chance to run some tests. The fact that RPostgres uses integer64 columns can introduce some really insidious bugs when those get implicitly converted to integer or double. I fixed a few of them in #2318, but odds are there are more in functionality outside my basic workflow tests.

@ashiklom
Copy link
Member Author

ashiklom commented Mar 15, 2019

One thing I've thought about that could ease the transition is to add pseudo-support for parameterized statements in RPostgreSQL that relies on unsafe string substitution. Basically, the idea would be to loop over each element in values and incrementally gsub instances of $1, $2, ... in the query string. That would allow us to start switching everything over to prepared statements while still (unsafely) supporting RPostgreSQL. If we don't worry too much about making it bulletproof (which we shouldn't -- it's easy to throw a loud warning about unsafe substitutions), that shouldn't be too hard to implement.

@robkooper
Copy link
Member

@ashiklom almost sounds like a summer of code project :)

@dlebauer dlebauer requested a review from kimberlyh66 March 15, 2019 21:00
@tonygardella tonygardella merged commit 56b2d4c into PecanProject:develop Mar 25, 2019
@ashiklom ashiklom deleted the prepared_statement branch May 8, 2019 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants