SQL: Stronger read-only mode #11
Merged
2 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,126 @@ | ||
| import dataclasses | ||
| import logging | ||
| import typing as t | ||
|
|
||
| import sqlparse | ||
| from sqlparse.tokens import Keyword | ||
|
|
||
| from cratedb_mcp.settings import PERMIT_ALL_STATEMENTS | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
|
|
||
| def sql_is_permitted(expression: str) -> bool: | ||
| """ | ||
| Validate the SQL expression, permitting only read queries by default. | ||
|
|
||
| When the `CRATEDB_MCP_PERMIT_ALL_STATEMENTS` environment variable is set, | ||
| allow all types of statements. This is **not** recommended. | ||
|
|
||
| FIXME: Revisit implementation, it might be too naive or weak. | ||
| Issue: https://github.com/crate/cratedb-mcp/issues/10 | ||
| Question: Does SQLAlchemy provide a solid read-only mode, or any other library? | ||
| """ | ||
| is_dql = SqlStatementClassifier(expression=expression, permit_all=PERMIT_ALL_STATEMENTS).is_dql | ||
| if is_dql: | ||
| logger.info("Permitted SQL expression: %s...", expression and expression[:50]) | ||
| else: | ||
| logger.warning("Denied SQL expression: %s...", expression and expression[:50]) | ||
| return is_dql | ||
|
|
||
|
|
||
| @dataclasses.dataclass | ||
| class SqlStatementClassifier: | ||
| """ | ||
| Helper to classify an SQL statement. | ||
|
|
||
| Most importantly, provide the `is_dql` property, which is | ||
| true only for read-only SQL SELECT statements. | ||
| """ | ||
| expression: str | ||
| permit_all: bool = False | ||
|
|
||
| _parsed_sqlparse: t.Any = dataclasses.field(init=False, default=None) | ||
|
|
||
| def __post_init__(self) -> None: | ||
| if self.expression is None: | ||
| self.expression = "" | ||
| if self.expression: | ||
| self.expression = self.expression.strip() | ||
|
|
||
| def parse_sqlparse(self) -> t.Tuple[sqlparse.sql.Statement, ...]: | ||
| """ | ||
| Parse the expression using the traditional `sqlparse` library, caching the result. | ||
| """ | ||
| if self._parsed_sqlparse is None: | ||
| self._parsed_sqlparse = sqlparse.parse(self.expression) | ||
| return self._parsed_sqlparse | ||
|
|
||
| @property | ||
| def is_dql(self) -> bool: | ||
| """ | ||
| Whether the statement is a DQL statement, i.e. one that performs only read operations. | ||
| """ | ||
|
|
||
| if not self.expression: | ||
| return False | ||
|
|
||
| if self.permit_all: | ||
| return True | ||
|
|
||
| # Check if the expression is valid and if it's a DQL/SELECT statement, | ||
| # also trying to consider `SELECT ... INTO ...` and evasive | ||
| # `SELECT * FROM users; \uff1b DROP TABLE users` statements. | ||
| return self.is_select and not self.is_camouflage | ||
|
|
||
| @property | ||
| def is_select(self) -> bool: | ||
| """ | ||
| Whether the expression is an SQL SELECT statement. | ||
| """ | ||
| return self.operation == 'SELECT' | ||
|
|
||
| @property | ||
| def operation(self) -> str: | ||
| """ | ||
| The SQL operation: SELECT, INSERT, UPDATE, DELETE, CREATE, etc. | ||
| """ | ||
| parsed = self.parse_sqlparse() | ||
| return parsed[0].get_type().upper() | ||
|
|
||
| @property | ||
| def is_camouflage(self) -> bool: | ||
| """ | ||
| Innocent-looking `SELECT` statements can evade filters. | ||
| """ | ||
| return self.is_select_into or self.is_evasive | ||
|
|
||
| @property | ||
| def is_select_into(self) -> bool: | ||
| """ | ||
| Use traditional `sqlparse` for catching `SELECT ... INTO ...` statements. | ||
| Examples: | ||
| SELECT * INTO foobar FROM bazqux | ||
| SELECT * FROM bazqux INTO foobar | ||
| """ | ||
| # Flatten all tokens (including nested ones) and match on type+value. | ||
| statement = self.parse_sqlparse()[0] | ||
| return any( | ||
| token.ttype is Keyword and token.value.upper() == "INTO" | ||
| for token in statement.flatten() | ||
| ) | ||
|
||
|
|
||
| @property | ||
| def is_evasive(self) -> bool: | ||
| """ | ||
| Use traditional `sqlparse` for catching evasive SQL statements. | ||
|
|
||
| A practice picked up from CodeRabbit was to reject multiple statements | ||
| to prevent potential SQL injections. Is it a viable suggestion? | ||
|
|
||
| Examples: | ||
|
|
||
| SELECT * FROM users; \uff1b DROP TABLE users | ||
| """ | ||
| parsed = self.parse_sqlparse() | ||
| return len(parsed) > 1 | ||
|
||
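
The order of checks in `is_dql` matters: an empty expression is rejected before the permit-all switch is consulted. The following is a minimal, standard-library-only sketch of that gating order, not the module's actual code. The `env_flag` helper, the `TRUTHY` set, and the `is_permitted` function are hypothetical names; how `cratedb_mcp.settings` really parses `CRATEDB_MCP_PERMIT_ALL_STATEMENTS` is not shown in this diff and is assumed here.

```python
import os

# Assumption: which spellings count as "enabled". The real settings module may differ.
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean flag from the environment (hypothetical helper)."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in TRUTHY


def is_permitted(expression, permit_all: bool, is_select: bool, is_camouflage: bool) -> bool:
    """Mirror the order of checks in `is_dql`, with the classification passed in."""
    # 1. Empty, whitespace-only, or None input is always rejected.
    if not expression or not expression.strip():
        return False
    # 2. The permit-all switch short-circuits any classification.
    if permit_all:
        return True
    # 3. Otherwise, only plain SELECT statements without camouflage pass.
    return is_select and not is_camouflage


if __name__ == "__main__":
    permit_all = env_flag("CRATEDB_MCP_PERMIT_ALL_STATEMENTS")
    print(is_permitted("SELECT 42", permit_all, is_select=True, is_camouflage=False))
```

Passing the classification results in as arguments keeps the sketch independent of `sqlparse`; in the module above, they come from the `is_select` and `is_camouflage` properties.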
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| # CrateDB MCP backlog | ||
|
|
||
| ## Iteration +1 | ||
| - Docs: HTTP caching | ||
| - Docs: Load documentation index from custom outline file | ||
| - SQL: Extract `SqlFilter` or `SqlGateway` functionality to `cratedb-sqlparse` | ||
|
|
||
| ## Done | ||
| - SQL: Stronger read-only mode |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,107 @@ | ||
| import cratedb_mcp | ||
| from cratedb_mcp.util.sql import sql_is_permitted | ||
|
|
||
|
|
||
| def test_sql_select_permitted(): | ||
| """Regular SQL SELECT statements are permitted""" | ||
| assert sql_is_permitted("SELECT 42") is True | ||
| assert sql_is_permitted(" SELECT 42") is True | ||
| assert sql_is_permitted("select 42") is True | ||
|
|
||
|
|
||
| def test_sql_select_rejected(): | ||
| """Bogus SQL SELECT statements are rejected""" | ||
| assert sql_is_permitted(r"--\; select 42") is False | ||
|
|
||
|
|
||
| def test_sql_insert_allowed(mocker): | ||
| """When explicitly allowed, permit any kind of statement""" | ||
| mocker.patch.object(cratedb_mcp.util.sql, "PERMIT_ALL_STATEMENTS", True) | ||
| assert sql_is_permitted("INSERT INTO foobar") is True | ||
|
|
||
|
|
||
| def test_sql_select_multiple_rejected(): | ||
| """Multiple SQL statements are rejected""" | ||
| assert sql_is_permitted("SELECT 42; SELECT 42;") is False | ||
|
|
||
|
|
||
| def test_sql_create_rejected(): | ||
| """DDL statements are rejected""" | ||
| assert sql_is_permitted("CREATE TABLE foobar AS SELECT 42") is False | ||
|
|
||
|
|
||
| def test_sql_insert_rejected(): | ||
| """DML statements are rejected""" | ||
| assert sql_is_permitted("INSERT INTO foobar") is False | ||
|
|
||
|
|
||
| def test_sql_select_into_rejected(): | ||
| """SELECT+DML statements are rejected""" | ||
| assert sql_is_permitted("SELECT * INTO foobar FROM bazqux") is False | ||
| assert sql_is_permitted("SELECT * FROM bazqux INTO foobar") is False | ||
| assert sql_is_permitted("WITH FOO AS (SELECT * FROM (SELECT * FROM bazqux INTO foobar))") is False | ||
|
|
||
|
|
||
| def test_sql_select_into_permitted(): | ||
| """Forbidden keywords do not contribute to classification when used as labels""" | ||
| assert sql_is_permitted('SELECT * FROM "into"') is True | ||
| assert sql_is_permitted('SELECT * FROM "INTO"') is True | ||
|
|
||
|
|
||
| def test_sql_empty_rejected(): | ||
| """Empty statements are rejected""" | ||
| assert sql_is_permitted("") is False | ||
|
|
||
|
|
||
| def test_sql_almost_empty_rejected(): | ||
| """Quasi-empty statements are rejected""" | ||
| assert sql_is_permitted(" ") is False | ||
|
|
||
|
|
||
| def test_sql_none_rejected(): | ||
| """Void statements are rejected""" | ||
| assert sql_is_permitted(None) is False | ||
|
|
||
|
|
||
| def test_sql_multiple_statements_rejected(): | ||
| """Multiple statements mixing SELECT and DML are rejected""" | ||
| assert sql_is_permitted("SELECT 42; INSERT INTO foo VALUES (1)") is False | ||
|
|
||
|
|
||
| def test_sql_with_comments_rejected(): | ||
| """DML statements interleaved with comments are rejected""" | ||
| assert sql_is_permitted( | ||
| "/* Sneaky comment */ INSERT /* another comment */ INTO foo VALUES (1)") is False | ||
|
|
||
|
|
||
| def test_sql_update_rejected(): | ||
| """UPDATE statements are rejected""" | ||
| assert sql_is_permitted("UPDATE foobar SET column = 'value'") is False | ||
|
|
||
|
|
||
| def test_sql_delete_rejected(): | ||
| """DELETE statements are rejected""" | ||
| assert sql_is_permitted("DELETE FROM foobar") is False | ||
|
|
||
|
|
||
| def test_sql_truncate_rejected(): | ||
| """TRUNCATE statements are rejected""" | ||
| assert sql_is_permitted("TRUNCATE TABLE foobar") is False | ||
|
|
||
|
|
||
| def test_sql_drop_rejected(): | ||
| """DROP statements are rejected""" | ||
| assert sql_is_permitted("DROP TABLE foobar") is False | ||
|
|
||
|
|
||
| def test_sql_alter_rejected(): | ||
| """ALTER statements are rejected""" | ||
| assert sql_is_permitted("ALTER TABLE foobar ADD COLUMN newcol INTEGER") is False | ||
|
|
||
|
|
||
| def test_sql_case_manipulation_rejected(): | ||
| """Statements with case manipulation to hide intent are rejected""" | ||
| assert sql_is_permitted("SeLeCt * FrOm users; DrOp TaBlE users") is False | ||
|
|
||
|
|
||
| def test_sql_unicode_evasion_rejected(): | ||
| """Statements with unicode characters to evade filters are rejected""" | ||
| assert sql_is_permitted("SELECT * FROM users; \uFF1B DROP TABLE users") is False |
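
The last test uses U+FF1B, the fullwidth semicolon. The sketch below uses only the standard library to show why such a character is worth testing: under NFKC normalization it becomes an ordinary `;`, so any layer that normalizes text could turn a look-alike into a real statement separator. The `has_lookalike_semicolon` helper is a hypothetical illustration and is not part of this pull request, which rejects the input because it contains more than one statement.

```python
import unicodedata

FULLWIDTH_SEMICOLON = "\uFF1B"


def has_lookalike_semicolon(sql: str) -> bool:
    """Return True if NFKC normalization would introduce additional `;` characters."""
    normalized = unicodedata.normalize("NFKC", sql)
    return normalized.count(";") > sql.count(";")


if __name__ == "__main__":
    # Prints: FULLWIDTH SEMICOLON
    print(unicodedata.name(FULLWIDTH_SEMICOLON))
    # The look-alike is detected, while a plain statement is not flagged.
    print(has_lookalike_semicolon("SELECT * FROM users; \uFF1B DROP TABLE users"))
    print(has_lookalike_semicolon("SELECT 42;"))
```

A check like this would complement, not replace, the statement-count test in `is_evasive`.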