Records are dropped when merging view parquet files. #1313

Open
bashir2 opened this issue Feb 26, 2025 · 1 comment · May be fixed by #1314
Assignees
Labels
bug Something isn't working P0:immediate An issue to be handled ASAP

Comments

Collaborator

bashir2 commented Feb 26, 2025

During the incremental step of the pipeline, the merger process has a bug that drops some records when merging incremental and old records. I discovered this while debugging e2e flakiness after fixing #1295 in #1312. To be specific, this number 108 is incorrect: there should be at least 2*106+1 records (i.e., twice the first full-run count here).
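
As a rough illustration of the arithmetic above (this is not the pipeline's actual e2e check), the lower bound can be written as a small sanity test. The function names `min_expected_after_incremental` and `is_merge_count_plausible` are hypothetical; 106 and 108 are the counts quoted in this comment.

```python
def min_expected_after_incremental(first_full_run_count: int) -> int:
    """Lower bound from the comment above: twice the first full-run count, plus one."""
    return 2 * first_full_run_count + 1


def is_merge_count_plausible(observed: int, first_full_run_count: int) -> bool:
    """Return True when the merged view holds at least the expected number of records."""
    return observed >= min_expected_after_incremental(first_full_run_count)


if __name__ == "__main__":
    # Counts quoted in the issue: 106 after the first full run, 108 after the merge.
    print(min_expected_after_incremental(106))   # 213
    print(is_merge_count_plausible(108, 106))    # False
```

With the numbers from this issue, the bound is 213, so an observed count of 108 is well short of it.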

@bashir2 bashir2 added bug Something isn't working P0:immediate An issue to be handled ASAP labels Feb 26, 2025
@bashir2 bashir2 self-assigned this Feb 26, 2025
Collaborator Author

bashir2 commented Feb 26, 2025

For an example of an e2e run that failed because of this bug, see this run; note the discrepancy between 109 and 108 patient_flat records (both of which are wrong).

Correction: After debugging further and fixing this in #1314, I think the root cause of the particular e2e failure linked above is not this bug. The numbers 108 and 109 are both wrong and are now fixed, but the merging bug seems to be deterministic, so it cannot explain that particular discrepancy.

@bashir2 bashir2 linked a pull request Feb 26, 2025 that will close this issue
7 tasks

1 participant