-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: Fix and enhance SQL injection prevention #40
Conversation
5e131f8
to
e01221b
Compare
0bbedc7
to
7b8451f
Compare
390df54
to
c70ec05
Compare
fc6fb74
to
5c8ec3a
Compare
Codecov ReportBase: 91.36% // Head: 92.40% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## develop #40 +/- ##
===========================================
+ Coverage 91.36% 92.40% +1.03%
===========================================
Files 199 221 +22
Lines 2630 3040 +410
Branches 280 354 +74
===========================================
+ Hits 2403 2809 +406
+ Misses 177 169 -8
- Partials 50 62 +12
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
5b52324
to
d5f772e
Compare
c70ec05
to
5730b0b
Compare
- passing context to filter runner - add santizier filter builder and runner - add TemplateInput class to control parameterize logics - update req runner to inject parameterizers
…ention - remove duplicated type definitions. - hide data source in executer. - remove useless exports. - add some comments. - move some logic into functions.
c0f8d3f
to
5ff11a6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides some questions ( I do not figure out the part...), others LGTM, Awsome 👍
/** The index (starts from 1) of parameters, it's useful to generate parameter id like $1, $2 ...etc. */ | ||
parameterIndex: number; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also consider the non-index parameters, like big query seems only support ?
placeholder and naming placeholder e.g: @name
, @url
... and so on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can simply ignore this input argument and return the identifier whatever you want:
public async prepare() {
return `@${someRandom()}`
}
I can't give you some meaningful parameter name because it might not be an input parameter, for example:
{% for val in someArray %}
{{ val }} --- I don't know the name of the value, only know the order (index)
{% endfor %}
So for Big Query, one alternative is generating naming identifiers by the index too:
public async prepare({parameterIndex}) {
return `@var${parameterIndex}`
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @oscar60310 for replying my question and explain it!
bindParams: BindParameters; | ||
pagination?: Pagination; | ||
} | ||
|
||
export type PrepareParameter = { (param: RequestParameter): Promise<string> }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we rename to PrepareParameterFunc
or PrepareParameterAsyncFunc
to represent explicitly ?
Or it looks like a type object, so that when seeing the prepare: PrepareParameter
in IExecutor
interface, not easy to attend it until discovering not add the await
at dataSource.prepare
under executor prepare method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll update it to PrepareParameterFunc
, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much !
function (this: any, value: any, ...args) { | ||
// use classic function to receive context | ||
extension.__transform(this, value, ...args); | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we change this part, not figure out what the influence is, could you introduce it ? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nunjucks binding "context" to "this" (js this) https://github.com/mozilla/nunjucks/blob/master/nunjucks/src/compiler.js#L499
I'd like to pass context to transform functions, but we bound extension object to __transform function, it makes this
become extension
instead of the context provided by upstream.
extension.__transform.bind(extension),
So I used a classic function here to receive this
first then pass to ___transform function as an argument.
// the value is real param data | ||
[identifier: string]: string; | ||
}; | ||
export type BindParameters = Map<string, string>; | ||
|
||
export type IdentifierParameters = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also remove it, because seems it will not use to after the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @oscar60310's help!
import { RAW_FILTER_NAME, SANITIZER_NAME } from './constants'; | ||
|
||
@VulcanInternalExtension() | ||
export class SanitizerBuilder extends FilterBuilder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you introduce what is the SanitizerBuilder
responsible for, why we need to add sanitizer, and put the symbol sanitize
? Not figure out the workflow and the domain context for sanitize
and not see the description introduce it, thanks so much !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This builder automatically adds the filter 'sanilizer` after the "lookup" like nodes, e.g. LookupVal, FunctionCall ...etc. (so do other filter builders). In order to do sql injection prevention.
{{ context.params.id }} -> {{ context.params.id | sanitizer }}
I've add these comments for the class too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @oscar60310 for replying my question
} | ||
|
||
if (this.isNodeNeedToBeSanitize(child)) { | ||
if (!parentHasOutputNode && !(node instanceof nunjucks.nodes.Output)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean we only add sanitize when the node is the output or the parent has an output node? But why do we add the sanitize after the output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean we only add sanitize when the node is the output or the parent has an output node?
Yes.
This is an ordinary AST tree of template {{ context.params.id }}
:
When we traversal the AST from root:
- We first meet the output node
- And we'll find its child is a LookupVal
- We wrap the lookupVal node in filter
sanitizer
. (via the replace function){{ context.params.id | sanitizer }}
This is what this builder mainly does.
But why do we add the sanitize after the output?
Output nodes mean to "render" strings, in our case, they also mean to generate the sql to execute, these values should be sanitized (parameterized) before executing, so we add the filter before the output result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @oscar60310 for introducing detail so that I could understand it 😃 !!
query = query | ||
.split(/\r?\n/) | ||
.filter((line) => line.trim().length > 0) | ||
.join('\n'); | ||
// Get bind real parameters and pass to data query builder for data source used. | ||
const binds = (context.ctx || {})['_paramBinds'] || {}; | ||
const binds = parameterizer.getBinding(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We call parameterizer.getBinding()
, but the binds has contains the idToValueMapping data which SanitizerRunner
ran await input.parameterize(parameterizer)
?
I'm really confused the each filter's execute order and when calling transform
and calling run
~ " ~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We call parameterizer.getBinding(), but the binds has contains the idToValueMapping data which SanitizerRunner ran await input.parameterize(parameterizer) ?
Yes! The magic happened because of the shared context. Please have a look at the image below.
I'm really confused the each filter's execute order and when calling transform and calling run ~ " ~
We're using DFS to compile and execute the nodes, so we run req tag first, then filter.
// parameterizer from parent, we should set it back after rendered our context. | ||
const parentParameterizer = context.lookup(PARAMETERIZER_VAR_NAME); | ||
context.setVariable(PARAMETERIZER_VAR_NAME, parameterizer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't understand why we set the parameterizer back to our context when parameterizer from parent, could you tell more information? Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the parent and children shared the same context, if the children don't set it back, their parent will lost its parameterize because it will be overriden.
// Parameterizer should be set by req tag runner | ||
const parameterizer = context.lookup<Parameterizer>(PARAMETERIZER_VAR_NAME); | ||
if (!parameterizer) throw new Error(`No parameterizer found`); | ||
return await input.parameterize(parameterizer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious the parameter only set by req tag runner, the req tag seems to the sample you show on description:
{% req user %}
select id from users where userName = {{ context.params.name }}
{% endreq %}
but I didn't see where to handle the normal {{ ... }}
case, e.g:
select * from user where name = {{ context.params.name }}
Or does the req tag runner also include the case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or does the req tag runner also include the case?
Yes, we'll wrap the template with a builder if there is no main builder.
https://github.com/Canner/vulcan-sql/blob/develop/packages/core/src/lib/template-engine/built-in-extensions/query-builder/reqTagBuilder.ts#L146-L158
Hi @kokokuo , I've drawn a diagram to describe what's happened with our filters and tags. |
Hi @kokokuo all issues have been fixed. |
Thanks for drawing the detailed steps picture, and explaining to me the flow, it's really helpful for me, thanks so much 😃 |
What's fixed
After Feature: SQL Injection Prevention (Flow Simulation) #23, we replace the input parameters with parameterized values like $1, $2 ...etc. But we need the raw value with template logic, for example:
The above template will always render the where query because
context.params.age
is "$1". This PR fixed this issue, it replaces values only when needed.We sent useless binding with sub-queries. For example:
We sent
select id from users where userName = $1
with binding ['someUserName', 'someGroupName'] at the first query, which might cause driver failure (binding length is not equal to query usage). This PR sends binding with only required parameters.Enhancement
Parameterize all the values including sub queries' values, user attributes...
Provide a filter
raw
to force us to render the raw values.