-
Notifications
You must be signed in to change notification settings - Fork 16
Conversation
scripts/mssql_discover.py
Outdated
@@ -0,0 +1,70 @@ | |||
import sqlalchemy | |||
|
|||
MASTER_MSSQL_URL = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentionally blank, so that anyone using this script can add in creds before runtime
scripts/mssql_discover.py
Outdated
db_name = db_name[0] | ||
try: | ||
columns = engine.execute( | ||
f"SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE, IS_NULLABLE FROM {db_name}.INFORMATION_SCHEMA.COLUMNS;" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can’t think of any way this this would be set outside this function, but if it was it seems vulnerable to SQL injection.
I don’t see a way to get to it from the outside so I’m probably being over paranoid.
scripts/mssql_discover.py
Outdated
flagged_datatypes.add(data_type) | ||
flagged_columns.append(f"{table}.{column}: {data_type}") | ||
|
||
print(f"{len(set(all_columns))} columns found") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want a count of all columns why use a set? This would be all unique columns rather than all columns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the de-dupe the list, since all columns should be unique
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They might not be unique across tables. What if there is table1.unsupported_field and table2.unsupported_field? Then won't the count be 1, but really there are 2 from two different tables?
scripts/mssql_discover.py
Outdated
flagged_columns.append(f"{table}.{column}: {data_type}") | ||
|
||
print(f"{len(set(all_columns))} columns found") | ||
print(f"{len(set(flagged_columns))} columns flagged") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, why a set? Should this say unique columns?
scripts/mssql_discover.py
Outdated
f"SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE, IS_NULLABLE FROM {db_name}.INFORMATION_SCHEMA.COLUMNS;" | ||
).all() | ||
except Exception: | ||
# print(f"Access to {db_name}'s tables denied.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this commented out line left intentionally?
scripts/mssql_discover.py
Outdated
|
||
|
||
# NB. These are connection secrets, never ever commit these | ||
USER = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add an option to read these from environment variables to make it less likely to commit them.
@seanpreston your changes look good. A bunch of tests are failing, but they all look like frontend tests. I don't think they are relevant to your PR? |
Those failures are left over from the Github Actions downtime |
* adds script to discover mssql datastore compatibility * make prints consistent * updates changelog * add URL template * add warning * store columns correctly * move uncomitted secrets to another file to add to .gitignore * remove empty secrets file, add to gitignore * add comment explaining lack of secrets
Purpose
This PR adds a script to explore the schemas of databases within a MSSQL master instance.
Checklist
CHANGELOG.md
fileCHANGELOG.md
file is being appended toUnreleased
section in an appropriate category. Add a new category from the list at the top of the file if the needed one isn't already there.Run Unsafe PR Checks
label has been applied, and checks have passed, if this PR touches any external servicesTicket
Fixes #557