Avoid Sync message overflow #3389

Merged (1 commit), Apr 20, 2023

dbutenhof

PBENCH-1120

A SQL error was observed in deployment: `pbench-index` logged an error on the `INDEX` sync object because a tarball was somehow not present, and the message string generated by `indexing_tarballs.py` exceeded the `VARCHAR(255)` column specification.

This doesn't attempt to address the root problem. It addresses the symptom, overflowing the operation table's message column, so that in the future errors are at least recorded properly.

This reworks some of the `indexing_tarballs.py` messages to avoid redundancy (e.g., naming the dataset or tarball isn't necessary, as the records are linked to the `Dataset`), and also removes the limit on the message column as a precaution.
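
As a rough illustration of why naming the dataset and tarball in the message can overflow a 255-character column, here is a minimal sketch. The message wording, the path, the dataset name, and both helper functions are hypothetical and are not taken from `indexing_tarballs.py`; only the 255-character limit comes from the description above.

```python
# Hypothetical illustration; these are NOT the real indexing_tarballs.py messages.
MESSAGE_LIMIT = 255  # the old VARCHAR(255) limit on the message column


def verbose_message(dataset_name: str, tarball_path: str, error: str) -> str:
    """Old style (assumed): repeats the dataset name and tarball path."""
    return (
        f"Unable to index dataset {dataset_name!r} "
        f"from tarball {tarball_path!r}: {error}"
    )


def concise_message(error: str) -> str:
    """New style (assumed): the record is already linked to the Dataset,
    so only the error itself is recorded."""
    return f"Unable to index: {error}"


# Made-up values chosen to resemble a long archive path.
name = "uperf_rhel8_tuned_throughput_2023.04.18T12.00.00"
path = "/srv/pbench/archive/fs-version-001/controller.example.com/" + name + ".tar.xz"
error = f"[Errno 2] No such file or directory: '{path}'"

long_msg = verbose_message(name, path, error)
short_msg = concise_message(error)

print(len(long_msg), len(long_msg) > MESSAGE_LIMIT)    # verbose form exceeds the limit
print(len(short_msg), len(short_msg) > MESSAGE_LIMIT)  # concise form fits
```

Trimming the message lowers the odds of an overflow but cannot rule it out, which is presumably why the column limit is removed as well.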

(NOTE: it also adds some unit test cases, although these are more documentation than "real tests", as sqlite3, unlike PostgreSQL, doesn't enforce column length limits.)
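
The sqlite3 behavior mentioned in the note can be seen with a small standalone sketch using Python's standard `sqlite3` module. The table and column names here are illustrative only and are not the project's actual schema.

```python
import sqlite3

# In-memory database; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE operations (id INTEGER PRIMARY KEY, message VARCHAR(255))"
)

long_message = "x" * 300  # deliberately longer than the declared limit
conn.execute("INSERT INTO operations (message) VALUES (?)", (long_message,))

# SQLite stores the full string: the declared length is not enforced.
(stored,) = conn.execute("SELECT message FROM operations").fetchone()
print(len(stored))
conn.close()
```

PostgreSQL would reject the same insert into a `VARCHAR(255)` column with a "value too long" error, so a unit test running against sqlite3 cannot reproduce the original failure.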

Resolves #3366

siddardh-ra
siddardh-ra previously approved these changes Apr 19, 2023

@siddardh-ra siddardh-ra left a comment

👍

@riya-17

riya-17 commented Apr 20, 2023

Hey @dbutenhof, just to understand this fix: we have created a separate column for the messages which doesn't have any limit, right? Unrelated to this change, where were the messages stored earlier?

@dbutenhof

> Hey @dbutenhof, just to understand this fix: we have created a separate column for the messages which doesn't have any limit, right? Unrelated to this change, where were the messages stored earlier?

All this PR does is expand the existing column, and "tweak" some of the messages to avoid unnecessarily long values. The column isn't added, or moved, and none of the Sync logic has changed.

When I added the Sync mechanism to fix the race conditions I introduced when I got rid of the filesystem state links, I ended up with the operations table, which atomically tracks the status of the various operations we can perform on datasets to ensure that the universe happens once and only once, in order. More or less on a whim, and partly to help with debugging, I added a message column that could be set to record unusual status. I didn't think at the time that we might end up generating long messages, which are now causing problems in the production environment. This tries to bring that back under control.
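
For readers unfamiliar with the idea, an operations table of this kind can be sketched roughly as follows. This is not the pbench schema or Sync API: the column names, the state names (`READY`, `WORKING`, `FAILED`), and the `claim` and `fail` helpers are all assumptions made for illustration, and it uses the standard `sqlite3` module rather than the project's own database layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE operations (
        dataset_id INTEGER NOT NULL,
        name       TEXT NOT NULL,   -- hypothetical operation name, e.g. 'INDEX'
        state      TEXT NOT NULL,   -- hypothetical states: READY, WORKING, FAILED
        message    TEXT,            -- unbounded, optional diagnostic text
        UNIQUE (dataset_id, name)
    )
    """
)


def claim(conn: sqlite3.Connection, dataset_id: int, name: str) -> bool:
    """Move an operation from READY to WORKING in a single UPDATE.

    Only one caller can match the READY row, so only one caller "wins".
    """
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE operations SET state = 'WORKING', message = NULL "
            "WHERE dataset_id = ? AND name = ? AND state = 'READY'",
            (dataset_id, name),
        )
    return cur.rowcount == 1


def fail(conn: sqlite3.Connection, dataset_id: int, name: str, message: str) -> None:
    """Record a failure along with a diagnostic message."""
    with conn:
        conn.execute(
            "UPDATE operations SET state = 'FAILED', message = ? "
            "WHERE dataset_id = ? AND name = ?",
            (message, dataset_id, name),
        )


conn.execute("INSERT INTO operations VALUES (1, 'INDEX', 'READY', NULL)")
first = claim(conn, 1, "INDEX")   # first caller claims the operation
second = claim(conn, 1, "INDEX")  # second caller finds it already taken
fail(conn, 1, "INDEX", "tarball not found")
row = conn.execute(
    "SELECT state, message FROM operations WHERE dataset_id = 1"
).fetchone()
print(first, second, row)
```

Declaring the message column as unbounded text means a long diagnostic can no longer cause the status update itself to fail.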


@webbnh webbnh left a comment

👍

lib/pbench/test/unit/server/test_sync.py (resolved)
@dbutenhof dbutenhof merged commit d1977e2 into distributed-system-analysis:main Apr 20, 2023
@dbutenhof dbutenhof deleted the opmsg branch April 20, 2023 18:01
dbutenhof added a commit to dbutenhof/pbench that referenced this pull request Apr 20, 2023
dbutenhof added a commit that referenced this pull request Apr 24, 2023
* Avoid Sync message overflow (#3389)
4 participants