-
-
Notifications
You must be signed in to change notification settings - Fork 760
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Result of fix
changes on subsequent runs (fix is non-deterministic)
#2134
Comments
@juhoautio Which dialect is this query for, I get a parsing error on your query at line 46 if I just use ANSI (default)? You should see it if you run Not sure if relevant to the actual issue but would be good to clear this up incase it it causing issues |
Actually yes, that repeating whitespace issue is because that line doesn't parse. Rules can't properly handle unparseable sections since they won't follow the same logical structure as a properly formed parse tree. @barrywhart I notice that since we only show the fixable lints when running |
This query is primarily for AWS Athena but also works on Presto. Neither of those seem to be supported dialects. But as I wrote about my expectation:
Is this a valid assumption? |
for the "line is too long" lints that get raised twice they are from these two lines: Minimum reproducible example: SELECT
abs(round(a.metricx-b.metricx)) as col_c_rel_diff,
abs((round(a.metricx-b.metricx)/a.metricx)*100) as metric_x_rel_diff
FROM foo_bar_report a
LEFT JOIN xxx_yyy_report b
ON a.event_date = b.event_date; Seems like there's multiple fixes (namely L006 and L031) being applied to these two segments that push the line length over the threshold for L016, which doesn't get picked up to the subsequent run. After the first fix run it looks like this, which should understandably trigger L016. SELECT
abs(round(foo_bar_report.metricx - xxx_yyy_report.metricx)) AS col_c_rel_diff,
abs((round(foo_bar_report.metricx - xxx_yyy_report.metricx) / foo_bar_report.metricx) * 100) AS metric_x_rel_diff
FROM foo_bar_report
LEFT JOIN xxx_yyy_report
ON foo_bar_report.event_date = xxx_yyy_report.event_date; L006 and L031 are working exactly as they should, so the main question is why does L016 not just get applied at the next iteration of the fix loop? @barrywhart thoughts? My initial guess would be the that L006 and L031 are maybe applying their fixes at weird levels of the parse tree which stops L016 from working properly. That would explain why it then works on a subsequent |
@juhoautio well dialects are important since the structure of the sql varies depending on them. ANSI doesn't know how to interpret that specific line, so might as well be a random line of python in there. As explained here we should probably not be trying to apply fixes to a section we can't parse. FYI on complexity of parsing SQL in SQLFluff vs Python in Black. Python has one dialect (different versions but the syntax is mostly the same) and Python progamatically provides its own AST. SQL on the other hand has a lot of different dialects (postgres, tsql, bigquery, etc) each with their own interpretation of how to structure a query and custom grammar. Additionally the various SQL dialects do not provide their AST so part of this project also requires building out the dialects. |
Sure, I understand that by specifying a dialect it will be possible to do more. But my comment was about how to deal with parsing issues. And as you said:
That's exactly what I was expecting :) |
Cool makes sense, I think it would be a sensible approach and hopefully not too tough to implement 👍 As for the line length being highlighted on the second run, I'm pretty sure it's due to an issue (#1304 and #1668 N.B. not the same lint error but related conceptually) with fixes messing up the parse tree. In theory L016 would be picked up in subsequent fix loops and so fixed in a single Let's give @barrywhart some time to read and give a second opinion 😄 |
A general question inspired by black: if black's auto-formatting produces a different result on the 2nd run, black gives up and fails with an error, leaving the problematic file untouched. Is sqlfluff supposed to also work like that? If not, why? In this case it doesn't seem so. I'm talking about an isolated check on top of everything that would be done by SQLFluff itself so that user doesn't need to do it. |
I can understand the desire to have that, and it makes sense in many ways. However I have had issues in the past where two runs the settled and so it would be a shame to not have got the benefit of those fixes because it didn’t settled after just one run. But the downside of that is if it never settles (like in your case) then it’s pointless continuing to change. The problem is finding those cases and following the Black way of doing this may be a bit naive for us for now. I think ultimately we need some changes to rules for cases like this. A few rules can conflict with each other, and some rules so only a partial fix (e.g. a syntax change), expecting the whitespace rules to format it correctly later on a subsequent run. That’s the ultimate root cause of this but for good reason (we don’t want to have to embed the whitespace rules in amongst ever other rule). We’ve talked for a bit now about having a more generic whitespace fixer that lives out side the rules and could be called by each rule but have not done the work yet. IIRC SQLFluff internally calls 10 iterations for each run. So that should be enough. Maybe short term we could increase that to 20 (to catch a few more cases like I suggested in my first paragraph) and then bomb out with no change if not settled by run 20? Not sure how difficult that would be to do though as not looked at that code previously. |
2 things:
I don't believe Black remembers what it did in a previous run of @tunetheweb The two issues in this query are summarised in comments here and here Suggest that this is a topic better suited for maintainers meeting. (to keep this issue focussed I've removed the hive discussion) |
(Thanks a lot for the comments!)
Yes. So it can be left to the user to choose what to do. In the usual case I could imagine doing this:
Eventually if the issue is fixed the manual disabling rules can be removed from the file. In general it's very problematic if sqlfluff doesn't settle already on the first execution of
|
I don't believe either that black remembers what previous execution of executing If there's reason to doubt that, please let me know and I'll try to find a more fresh source to back that. |
@juhoautio doesn’t the fact the CI changes it raise the issue to the user in a more obvious way rather ? So they can manually fix the SQL and/or exempt the unstable lines (we do support this btw). Some might argue that is better than just not applying any fixes. |
@juhoautio as mentioned in the original slack response that is exactly what SQLFluff does. If the mentioned parse tree issue was fixed then this would start working exactly as both you, me, black, and sqlfluff would expect.
|
Ok, in my opinion the issue should be raised already when User should be able to know already in local dev environment that the committed code passes all checks before pushing to remote. First of all that allows avoiding some redundant steps of waiting, debugging, fixing, pushing, communicating.. If an issue is raised only at CI it causes slower turnaround, it may distract other developers without a good reason ("reviewer: this PR looks good otherwise but the build failed!"), and debugging the issue in general is not as nicely isolated because initially you don't know if there is something wrong or different in the CI setup vs. the developer's local environment and you may need to dig into the CI logs to see what exactly failed, as opposed to seeing the error in local dev console (this is rather minor though – you will see it as soon as you run edit: @jpy-git with that, sounds perfect. Sorry, I misunderstood the scope of "parse tree issue". I thought that fixing that would not yet globally guarantee determinism :) |
Regarding what to do if |
Agreed. An in general it does. It should be exceptional times when it does not do that, which implies one of the following:
And what I mean by making it more obvious is that something has broken. Silently rolling back the change could "hide" the error. I'd certainly like this non-deterministic nature to happen less (ideally not at all), but we've lots to do here in this project and all volunteers so need to prioritise. Back to your original question from Slack:
The answer IMHO is: Yes it has that general goal but it is not 100% guaranteed. Maybe it should be as you suggest but that's not the case at the moment.
Absolutely agree. And as I say this should very much be an exception not the norm. But CI is there to catch the unexpected exceptions (otherwise if anything can be, and is, tested locally there is no need to run CI at all!). You seem to have a particular problem with this but are also falling under items 2 and 3 above. IMHO when moving to using SQLFluff you should only do so if i) the dialect is supported (or another dialect ca be proxied with good enough support) and ii) you've run your entire code through SQLFluff to clean it up and then made one commit with that. Expecting that to happen in a pre-commit for the first time is asking for trouble IMHO. Then in future, and especially over time, SQLFluff pre-commit should act as a safe guard for small misses rather than large-scale rewrites of the SQL.
Potentially we should do this, I agree. However as long as this remains exceptional (and I'm the one repeating myself here now😁), I don't think it's the highest priority we have. I would much prefer the changes @jpy-git has suggest that should make this less likely, even if it doesn't prevent this ever happening again (by either raising the error as you suggest, or by rolling back the change as per the black model). If your option IS that important to you then happy to review any PR you can make. |
I just tested your SQL and it keeps some fixes, AND raises an error, AND ALSO exits with a non-zero status code (indicating an error):
So does SQLFluff already behave like you would want it to? |
I haven't read the entire discussion above, but want to mention there may be an easy partial fix. Each lint rule can specify |
Oh, yes, indeed it does! Sorry! I totally missed the non-zero exit code before. Just wanted to quickly respond to that. I'll try to do this next:
P.S. If it's worth mentioning, about "raising an error" I was confused by this being the last line printed, more or less indicating a successful outcome:
|
@barrywhart will add this to L048 👍 |
I tried it and doesn’t help here 😞 I think there’s something wrong with our parse error detection. See #2118 as another example. It used to work so not sure why it’s not working now. @juhoautio that might scupper your next step. |
Adding these ignore markers on the problematic line fixed the fundamental issue for me: event_date <= current_date - INTERVAL '2' day --noqa: PRS,L048 Disabling
I think that was the actual rule that caused Now the output is not changing any more on further execution of So from my perspective the "big picture" looks totally ok – thanks a lot for all the support! In this case I had to run |
Thanks @juhoautio, some good discussion and definitely some things for us to take away and investigate from this 👍 |
Sorry, but I need to still come back to the expectation of raising en error (exit code != 0) if deterministic formatting is not guaranteed. That is not always the case currently. Based on #2134 (comment) I expect that after the parse tree bug fix it will be guaranteed and that sounds great :) But I want to also say that that maybe it's an acceptable tradeoff that in some rare cases formatting may change on a subsequent run even though the previous run didn't raise an error, if adding stronger guarantees would basically mean that Below are the steps to show once more that currently it's possible for formatting changes to happen on the 2nd run even though previous $ cat minimal.sql
SELECT
abs(round(a.metricx-b.metricx)) as col_c_rel_diff,
abs((round(a.metricx-b.metricx)/a.metricx)*100) as metric_x_rel_diff
FROM foo_bar_report a
LEFT JOIN xxx_yyy_report b
ON a.event_date = b.event_date; 1st execution:
$ cat minimal.sql
SELECT
abs(round(foo_bar_report.metricx - xxx_yyy_report.metricx)) AS col_c_rel_diff,
abs((round(foo_bar_report.metricx - xxx_yyy_report.metricx) / foo_bar_report.metricx) * 100) AS metric_x_rel_diff
FROM foo_bar_report
LEFT JOIN xxx_yyy_report
ON foo_bar_report.event_date = xxx_yyy_report.event_date; So, it didn't raise an error. 2nd execution:
$ cat minimal.sql
SELECT
abs(
round(foo_bar_report.metricx - xxx_yyy_report.metricx)
) AS col_c_rel_diff,
abs(
(
round(
foo_bar_report.metricx - xxx_yyy_report.metricx
) / foo_bar_report.metricx
) * 100
) AS metric_x_rel_diff
FROM foo_bar_report
LEFT JOIN xxx_yyy_report
ON foo_bar_report.event_date = xxx_yyy_report.event_date; Changed, again (no error raised this time either, but this time that's expected). |
This is the parse tree issue mentioned several times, if you run the sqlfluff command a second time it has no concept of what has previously been done to the query, just reparses and tries to fix. No amount of extra iterations in the first fix run will change the fact that some rules aren't giving a correct parse tree. Because of the differing parse trees both commands see themselves as being correct. Let's leave this before we start discussing in circles. |
Suffice it to say, there are known issues, and we'll address them as we have the time. It's a volunteer project, still quite new. We want perfection, but it'll take time. 💪 |
It's been set to Thanks for the info about how the flag works -- I didn't know it only applies to specific unparseable areas of the code |
@barrywhart @alanmcruickshank I've raised a PR #2161 to set It still wouldn't fix the situation I mentioned here but to fix that I would actually suggest a stricter approach to handling parsing/templating errors when applying fixes.
Given that SQLFluff is being used in production environments now, I would propose we take the same strict approach. If we detect a parse error or a templating error anywhere in the file we should not apply any fixes as it is impossible for us to guarantee that a rule will not corrupt the code when there is unparseable somewhere in it. (I'm fine with allowing an attempt at linting the code using the current logic since it's not destructive if you prefer, although would personally prefer a similar strict approach to what I've suggested to fix) Doing this would remove a lot of these nonsense issues we get where people mis-report lint/fix issues that are actually just due to not parsing. And also we shouldn't be allowing inline parsing/templating errors as ignoring them doesn't suddenly mean there's not a serious issue with the parse tree for rule evaluation. Most importantly we don't want to corrupting people's code (if they run fix it's not like they'll get any warning that their code is unparsable and could get corrupted). Actually implementing the above should be simple but would be good to get a consensus agreement from us? |
Think those set a pretty strong precedent for adopting the strict approach for both |
I'm not as convinced @jpy-git 😁 Especially for I just worry that unparsable sections are far too common for SQLFluff at the moment and this would exclude a lot of lintable (and potentially fixable) errors in files because of one line of syntax that we haven't added yet. I'd be willing to bet all those examples don't get daily issues on unparsable, like we do. But on We should also really look at #2118 to allow unparsable sections to be ignored. Or how would that work in this stricter world? Would they be unlintable and unfixable too? Cause that would be a real restriction and cut people's SQL out of the project completely until releases are fixed? |
I'm okay allowing fixes to unparsable code, but would prefer that the default behavior would not do that. I'd rather people see and complain about dialect issues than to have their SQL corrupted. In other words, "fail safe" is the default behavior. My rationale is that the more popular the project becomes, the larger the percentage of less technical users becomes, e.g. data scientists, analysts, mathematicians. If a new, less technical user has an initial bad experience, they're more likely to abandon using SQLFluff, and it hurts the project's reputation out in the larger world. |
As a user, +1 to @barrywhart 's sentiment This tool is excellent and it will save me much more time if it just explodes when it can't parse 2 out of 20 files rather than send me backtracking through my commit history to find the place where it made an unsafe change, especially since it could leave me with valid but wrong SQL. I can easily address failures I know about by hand, and shunt the time-saved into fixing the parsing issue. Especially since part of the allure of tools like dbt (whose templating is what motivated me to seek out something like sqlfluff) is that less experienced devs who "only know SQL" can contribute, it has to be safety first for me. btw while so many maintainers are in one place, thank you all! |
@barrywhart So just to be clear, so we can make this actionable:
@tunetheweb I think we have pretty good coverage of dialects for a lot of common statements (don't forget we're likely biased on thinking there's bad coverage since we see all the submitted issues, a lot of the issues that get submitted aren't for code that the majority of analysts use commonly 😄).
inline templating/parsing ignores would be removed in the strict world, we shouldn't be doing anything with this code. Just because someone ignores a parsing/templating error doesn't mean that there aren't serious issues with the parse tree that can and will lead to code corruption. Nonsense begets nonsense. |
Coming back to the point about how black ensures "deterministic changes or nothing" (#2134 (comment)), I happened to come across the source. It's pretty readable in the source code itself: As an example of hitting that in real life, here's one resolved issue: psf/black#1629. So, even though black already has a deterministic way of knowing the exact AST it can sometimes fail to settle regarding the formatting on the first pass and there is explicit protection against that on the top level. I presume sqlfluff could suffer the same in some corner cases even if there is nothing wrong with the parsing. Or then sqlfluff is doing something better than black on the formatting layer :) Based on the previous comments I understood that But as a user I do think it's really nice to have that top level guarantee of being deterministic – but maybe it would cost just a bit too much in case of sqlfluff which is already on the slow side if executed against any substantial amount of sql. But it would be interesting to hear how expensive you think it would be to add that level of protection: would it mean 2x runtime or something less. |
So it looks like black gets everything in one pass. So a if a second pass produces a different result then it's an error. Nice and simple. SQLFluff takes a slightly different approach (and not saying this is the right or wrong one btw), as discussed above, where certain rules deliberately do not do the full fix (e.g. just slot a new word in without worrying about whitespace, as another rule will fix that for it). So some rules depend on other rules, often in a subsequent pass. This keeps the logic of each rule simpler, but at the cost of not being able to do it in one pass - which is both costly in execution times, and also makes this particular ask here more tricky. I think there is also a question is to whether we want this to be deterministic or not. I mean it seems obvious that yes you would, but as I say I’ve had instances where running it two (or even three) times fixed more and more errors until it finally settled. Yes that was not ideal, and a bit confusing and painful, but when first onboarding SQLFluff I much preferred that, than bombing out with no changes because it wasn’t deterministic after one run. The OP also changed their mind and decided they’d rather keep the fixes than throw them away, which appears to be the black model. So it’s not quite as clear cut as it initially seems. Now currently, because of above design, we already run a max of 10 iterations per execution. Should that be 20 (or 30? or even 40?) to capture the above scenarios? That would make it considerably slower, particularly for the non-deterministic queries which would go all the way to the end. So doing one more run (to 11 iterations) and not persisting that one might not add much run time, and might allow bombing out at the end if that shows its non-deterministic but to what avail? If it doesn’t work with 10 then why will it with 11? And some (like me! And also the original OP) would prefer to get the fixes it can and then investigate why others are not settling. Others may prefer getting nothing and just bombing out. Not to mention we have on occasion seen some bugs rather mean the parse tree gets messed up and future iterations in the current execution won’t fix this, but a fresh execution will. Not ideal and shouldn’t happen but another reason why subsequent executions can ultimately lead to a deterministic answer, even if not in that first execution of 10 runs. So, I don’t think we have a clear answer here of what the ask is? But keen to hear what all your interpretations of the ask are and maybe there is more consensus than I realise. I get that being deterministic is the ask, but to understand what that means need to understand what to do in the edge cases where 1) it’s not deterministic at all and 2) where it is but requires several runs to get there: when do we persist the changes we can make, and when do we give up trying further? |
Just want to clarify that I wasn't necessarily saying that. I don't think I'm educated enough to form an actual opinion, at least at the current stage of SQLFluff. Maybe it depends on the level of coverage for dialects and so on. But sure, in an ideal situation eventually I would warmly welcome fully deterministic behavior :)
Right, so with just 10% additional cost to the execution time, SQLFluff could quite easily do the same that black does: if the formatting still changes between rounds 10 & 11, it could raise an error that a deterministic result cannot be reached. User could then act upon that to fix it way or another (or mark the file ignored) to not hit the issue later on CI. Again, I'm not saying, at least strongly, that it should be done, but just pointing out it as an opportunity. Even if |
Yes.
I'm okay without this. I was mainly making the point that, if this capability is valuable, we could make it optional rather than removing it completely. At this time, I'm +1 for removing it completely but open to discussion.
Same, I think. |
About the "10 vs 11" discussion, that value is configurable, and you can make it as high as you want. We've had a few rules which were dumb, like they'd move some code one line at a time, and if it needed to move 20 lines, that would require 20 passes. I found and updated one such rule. There may be others. I would say that as a long-term goal, we try and avoid creating rules like that, and fix them when we find them. In other words, we allow combinations of rules to require multiple passes, but any single fix should be completed in one pass. We don't have a way to enforce that for now, so let's call that a soft, long-term goal. |
@jpy-git: Much earlier in the discussion, you asked about the behavior of L006, L031, and L016 on a particular query. Should we create a separate issue for that? This discussion bounced around a lot, and I couldn't quite follow it all without 🤯. |
@jpy-git: Would it be reasonable to rename this issue as "'sqlfluff fix' should not make changes if the file has parsing or templating errors"? |
The discussion on this issue covered a lot of areas, but it seems we settled on the above, |
Thank you for the fix of not formatting at all in case of parsing errors! That helps a lot. However, the original issue still exists: result of I didn't spot an answer to this question by @jpy-git yet:
That question sounds to me like it is not expected for sqlfluff to not settle in this case. Could there be even some fundamental issue that is causing some cases like this to be missed? Or should this particular issue at least be fixed somehow? I would still like to see at least this issue fixed if possible, even if sqlfluff won't have general guarantee of being deterministic. My hope would still be that the format is final or then there is no change at all and sqlfluff fails instead. But I understood based on this discussion that project is not going to have that as a strict goal. |
Added a suggestion here that might make less far less of an issue: #1304 (comment) |
@juhoautio: Feel free to create new issues for the specific questions you raised. As maintainers, we look at the discussion on issues and make a decision what to do, then close the issue. When the discussion meanders, we can't track all the threads of discussion and keep the issue open until all of them are addressed. It's better to have one issue per issue. 🙂 You didn't do anything wrong, but this is how we work. We are a team of part-time volunteers, figuring out the right way for SQLFluff to work one day, one issue at a time. |
Expected Behaviour
Running
sqlfluff fix my_query.sql -f
modifies query format – but running it again on the same file should not make any further changes.Observed Behaviour
Summary: The result changes considerably also on the 2nd run. On further runs the result keeps changing with a minor whitespace change.
The behavior is demonstrated in this branch: https://github.com/juhoautio/sqlfluff/commits/non_deterministic_fix_bug
Console output for each command is included below.
1st run of
sqlfluff fix
Resulting diff here: juhoautio@bc60c57
Many formatting fixes, as expected. The result looks good!
2nd run of
sqlfluff fix
Resulting diff here: juhoautio@148c91d
The formatting still changes greatly. I'm not sure if this is for better or worse, but that's a matter of taste. The problem is that formatting changes at all.
3rd run of
sqlfluff fix
Resulting diff here: juhoautio@befff64
Now formatting doesn't change much but it still does: redundant whitespace is added.
4th run of
sqlfluff fix
Resulting diff here: juhoautio@d9bcbf6
Adds yet another redundant space character. I suspect it would keep adding it until infinity if sqlfluff fix is run again.
Steps to Reproduce
Get my_query.sql from here:
juhoautio@2b99acd
(This query is an obfuscated version of a query that executes successfully in AWS Athena)
Run
python -m sqlfluff fix my_query.sql -f
and check the file for changes after each execution.Dialect
Not specified. Dialects are not important here: the expectation here is that sqlfluff can do generic autoformatting on any sql file and the result shouldn't change after the initial run.
Version
To make sure that these bugs haven't been fixed recently but are not in a release yet, I ran this with local dev version of sqlfluff.
The version I used was the latest commit in main branch: 7fd452d
Not using dbt.
Configuration
N/A
The text was updated successfully, but these errors were encountered: