Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Censor fixes #291

Merged
merged 2 commits into from
Dec 6, 2018
Merged

Censor fixes #291

merged 2 commits into from
Dec 6, 2018

Conversation

storojs72
Copy link
Contributor

  1. Move common table matching logic from whitelist/blacklist handlers;
  2. Fix parsing error logic
  3. Some refactoring

sqlStripped = strings.TrimSuffix(sqlStripped, ";")

stmt, err := sqlparser.Parse(sqlStripped)
outputStmt, _ := sqlparser.Parse(sqlStripped)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for what second parsing the same query?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I need initial query (without redacted values) in parsed form outside

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then better to parse second time after error check because in worst case we will parse two times with errors without need

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

atLeastOneTableNameMatch := false
allTableNamesMatch := false

switch parsedQuery.(type) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid casts inside case statements we can cast once using switch statement and reduce lines of code below

switch query := parsedQuery.(type) {
  case *sqlparser.Select:
      // query here casted to *sqlparser.Select
      //query.From
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

break
case *sqlparser.Insert:
insertQuery := parsedQuery.(*sqlparser.Insert)
if setOfTables[insertQuery.Table.Name.String()] {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it works for case .... FROM table1 as tbl ? Is Table.Name.String() will not return tbl instead table1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is INSERT statement. Only one table is possible

oneTableMatch = true
if allTablesMatchInternal {
counter++
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else { 
  break
}

we don't need to continue if we have one false for alltablesMatchInternal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To use it both whitelist and blacklist we should have information if AT LEAST ONE table is in list (blacklist blocks table if it's true, allows - otherwise) and if ALL tables are in list (whitelist allows table if it's true and blocks otherwise)

Copy link
Collaborator

@Lagovas Lagovas Nov 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but if we have oneTableMatch == true and meet first allTablesMatchInternal == false then we can stop iteration and return true, false because all allTablesMatchInternal will not true anyway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

oneTableMatch := false
allTablesMatch := false

switch table.(type) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can simplify this function and don't add specific functions for each type of table. Each case will be < 10 lines of code and here we work with specific types. On my opinion better to check all cases here instead in separate functions (checkAliasedTable, checkJoinedTable, checkParenTable)
switch tbl := table.(type) {
then we can reduce code and use recursion

case *sqlparser.ParenTableExpr:
		oneTableMatch, allTablesMatch = checkParenTable(tbl.Exprs, setOfTables)

handler.logger.WithField(logging.FieldKeyEventCode, logging.EventCodeErrorCensorQueryIsNotAllowed).WithError(ErrAccessToForbiddenTableBlacklist).Errorln("Query has been blocked by blacklist [tables]")
return false, ErrAccessToForbiddenTableBlacklist
}
atLeastOneTableInBlacklist, _ := common.CheckTableNamesMatch(parsedQuery, handler.tables)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

}
handler.queries[normalizedQuery] = true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about store query as is without normalization? to support unparsable queries

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I missed handler.queries[query] = true
I think we should store one variant: normalized or as is. We don't need to store both IMHO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should save both versions (normalized and raw). If parsing error occurred, we add/remove only raw version from list in all cases

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for what we need store both versions? how we use it? we can add/remove only normalized or raw query. anyway, we try to parse in both cases and check with logical OR both versions (in queryignore handler). so we can reduce overhead of memory and store one version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In previous version we tried to parse and store normilized. If parse error occurred, we save as is. I agree that we have to save only normalized or raw. It seems that raw type is better - no parse, faster work

Copy link
Contributor Author

@storojs72 storojs72 Nov 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually I decided to use normalized queries CheckQuery and normalize them in Add/Remove Queries functions. So "SELECT * FROM STUDENTS;" will match "select * from students" but if parse error occurred in Add/Remove, then censor will throw it while loading

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we can parse then we operate with parsed queries. and when we will try to add/remove parsable query then it will correctly added/removed. and when we cant parse then we add as is in raw form and add/remove will fail with parsing too and correctly added/removed using raw form of query

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shouldn't allow add to black/white list unparsed queries - we check normalized queries in CheckQuery - not raw, and normalized query means that it's parsed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's true for patterns or logic which require correct parsing, but not simple comparison as is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently every input query is parsed twice (one statement used for hiding values and second is processed) independently of further handling logic. When we parse while add/remove, it's done only once when loading. So even if we use unparsed queries in checking queries, It will not increase performance, but may cause some annoying typo errors, like add to whitelist "select * from x" but it will not block "select * from x;" because strings are not actually equal.

@@ -197,14 +197,107 @@ func NormalizeAndRedactSQLQuery(sql string) (normalizedQuery string, redactedQue

stmt, err := sqlparser.Parse(sqlStripped)
if err != nil {
return "", "", err
return "", "", nil, ErrQuerySyntaxError
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we log an original error to see it in logs what actually happens and where was a mistake? sometimes it was need for tests and debugging

@Lagovas Lagovas merged commit d067151 into cossacklabs:master Dec 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants