-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for infinite recursion due to secrets_masker #35048
Fix for infinite recursion due to secrets_masker #35048
Conversation
This feels a bit too fragile to me; it is much too easy for new code to introduce a logging call that hits the errornous code path again. Is it possible to fix this in the log rendering logic instead of relying on the call site to hold this correctly? |
I think the also problem here with inheritance of builtin type from collections import UserString
class MyStringClass(UserString):
pass |
airflow/utils/log/secrets_masker.py
Outdated
"Error was: %s: %s", | ||
item, | ||
repr(item), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I think it is one of possible way to solve the issue. In the current implementation we pass into arguments which unable to masked and the error happen again, as result recursion happen.
Maybe we even would like to build warning message outside of the logger:
except Exception as exc:
warning_msg = (
r"Unable to redact {item!r}, please report this via <https://github.com/apache/airflow/issues>. "
"Error was: {type(exc).__name__}: {exc}",
)
log.warning(warning_msg)
return item
Another option: use logger without mask_secrets
filter here
@uranusjr WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Taragolis Just curious, should we print the item
in logs at all, It kinda defeats the purpose of mask_secrets
filters right? maybe instead we can just print the class name? or just enough information to debug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pre-rendering a log message is against best practice since it would not work well for external logging services such as Sentry. There is a linter rule for this: https://docs.astral.sh/ruff/rules/logging-f-string/
airflow/utils/db_cleanup.py
Outdated
logger.debug("rows moved; purging from %s", source_table.name) | ||
logger.debug("rows moved; purging from %s", str(source_table.name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure that we want to include this fix in this PR? Because it seems a bit hacky here, and just hide actual problem
I get your concern. I would still leave in the fix the Regarding actually fixing this particular case, where @@ -259,11 +259,11 @@ class SecretsMasker(logging.Filter):
elif isinstance(item, str):
if self.replacer:
...
- return self.replacer.sub("***", item)
+ return self.replacer.sub("***", str(item))
return item
elif isinstance(item, (tuple, set)):
... |
Here's the class that caused this issue in the wild: https://github.com/sqlalchemy/sqlalchemy/blob/rel_1_4_49/lib/sqlalchemy/sql/elements.py#L5142 |
Rebased on |
18d1593
to
20db9a6
Compare
We can get into trouble for types that cannot be initiated with re2's `type(obj)()` call. The `secrets_masker` thus fails, which triggers a warning log, which also fails because we pass the object to the logger, which is then masked again, and so forth. We can break the recursion by emitting a log without trying to redact the value again (this ensures no new bug will cause a stack overflow). This issue has occured previously: apache#19816 (comment) Additionally, we fix this particular bug by ensuring whatever re2 receives is a simple `str`. I noticed this issue while working with a DAG that calls Airflow's DB cleanup function. Example DAG: ``` from datetime import datetime from airflow import DAG from airflow.models import Variable from airflow.operators.python import PythonOperator class MyStringClass(str): def __init__(self, required_arg): pass def fail(task_instance): # make sure the `SecretsMasker` has a replacer Variable.set(key="secret", value="secret_value") Variable.get("secret") # trigger the infinite recursion task_instance.log.info("%s", MyStringClass("secret_value")) with DAG( dag_id="secrets_masker_recursion", start_date=datetime(2023, 9, 26), ): PythonOperator(task_id="fail", python_callable=fail) ```
20db9a6
to
fdd568a
Compare
It's been a while! 😄 Rebased on the latest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. @Taragolis @uranusjr ?
* Fix for infinite recursion due to secrets_masker We can get into trouble for types that cannot be initiated with re2's `type(obj)()` call. The `secrets_masker` thus fails, which triggers a warning log, which also fails because we pass the object to the logger, which is then masked again, and so forth. We can break the recursion by emitting a log without trying to redact the value again (this ensures no new bug will cause a stack overflow). This issue has occured previously: #19816 (comment) Additionally, we fix this particular bug by ensuring whatever re2 receives is a simple `str`. I noticed this issue while working with a DAG that calls Airflow's DB cleanup function. Example DAG: ``` from datetime import datetime from airflow import DAG from airflow.models import Variable from airflow.operators.python import PythonOperator class MyStringClass(str): def __init__(self, required_arg): pass def fail(task_instance): # make sure the `SecretsMasker` has a replacer Variable.set(key="secret", value="secret_value") Variable.get("secret") # trigger the infinite recursion task_instance.log.info("%s", MyStringClass("secret_value")) with DAG( dag_id="secrets_masker_recursion", start_date=datetime(2023, 9, 26), ): PythonOperator(task_id="fail", python_callable=fail) ``` * Improve error message --------- Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
* Fix for infinite recursion due to secrets_masker We can get into trouble for types that cannot be initiated with re2's `type(obj)()` call. The `secrets_masker` thus fails, which triggers a warning log, which also fails because we pass the object to the logger, which is then masked again, and so forth. We can break the recursion by emitting a log without trying to redact the value again (this ensures no new bug will cause a stack overflow). This issue has occured previously: #19816 (comment) Additionally, we fix this particular bug by ensuring whatever re2 receives is a simple `str`. I noticed this issue while working with a DAG that calls Airflow's DB cleanup function. Example DAG: ``` from datetime import datetime from airflow import DAG from airflow.models import Variable from airflow.operators.python import PythonOperator class MyStringClass(str): def __init__(self, required_arg): pass def fail(task_instance): # make sure the `SecretsMasker` has a replacer Variable.set(key="secret", value="secret_value") Variable.get("secret") # trigger the infinite recursion task_instance.log.info("%s", MyStringClass("secret_value")) with DAG( dag_id="secrets_masker_recursion", start_date=datetime(2023, 9, 26), ): PythonOperator(task_id="fail", python_callable=fail) ``` * Improve error message --------- Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
Related: #19816 (comment)
See also: #20039
Broke with: #33520
We can get into trouble for text types that cannot be initiated with re2's
type(obj)()
call. Thesecrets_masker
thus fails, which triggers a warning log, which also fails because we pass the object to the logger, which is then masked again, and so forth.We can break the recursion by printing out the
repr
of the object in question in the warning log. This issue has occured previously: #19816 (comment)I noticed this issue while working with a DAG that calls Airflow's DB cleanup function, thus I'm fixing the logging call in
db_cleanup
to log a stringified name instead of thequoted_name
object.I thought about fixing the
SecretsMasker
call to the reducer, i.e. only passing objects that re2 can handle (i.e. having a parameterless constructor), but decided against it. Checking for this would probably not be cheap and considering this scenario is quite unlikely, I think it's better to simply fail and emit a warning log (without the infinite recursion :)).Testing
I added a basic test that fails next time someone refactors the warning log. I also used the following DAG for a more comprehensive end-to-end test:
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.