-
Notifications
You must be signed in to change notification settings - Fork 42
Add load_json.py tool, remove --podcast flag from query.py #164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: gvanrossum-ms <184014243+gvanrossum-ms@users.noreply.github.com>
Co-authored-by: gvanrossum-ms <184014243+gvanrossum-ms@users.noreply.github.com>
Co-authored-by: gvanrossum-ms <184014243+gvanrossum-ms@users.noreply.github.com>
Co-authored-by: gvanrossum-ms <184014243+gvanrossum-ms@users.noreply.github.com>
…prove error message consistency Co-authored-by: gvanrossum-ms <184014243+gvanrossum-ms@users.noreply.github.com>
tools/query.py
Outdated
| f"Please load data into the database first using tools/load_json.py:\n" | ||
| f" python tools/load_json.py <index_path> -d {args.database}\n" | ||
| f"Example:\n" | ||
| f" python tools/load_json.py tests/testdata/Episode_53_AdrianTchaikovsky_index -d {args.database}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop these four lines (there are other tools to load data).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in dd79821 - simplified to just "Error: Database '{args.database}' is empty."
tools/load_json.py
Outdated
| epilog=""" | ||
| Examples: | ||
| python tools/load_json.py tests/testdata/Episode_53_AdrianTchaikovsky_index -d podcast.db | ||
| python tools/load_json.py path/to/index -d output.db -v | ||
| Note: The index path should exclude the "_data.json" suffix. | ||
| """, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop the epilog. This tool has only advanced users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in dd79821 - removed epilog and formatter_class.
tools/load_json.py
Outdated
| ) | ||
|
|
||
| parser.add_argument( | ||
| "index_path", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be made required=True?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, positional arguments are required by default in argparse.
tools/load_json.py
Outdated
| """, | ||
| ) | ||
|
|
||
| parser.add_argument( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Traditionally, positional args are placed after all the flag args.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in dd79821 - moved index_path after the flag arguments.
tools/load_json.py
Outdated
|
|
||
| args = parser.parse_args() | ||
|
|
||
| # Validate index file exists |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| # Validate index file exists | |
| # Ensure index file exists |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in dd79821 - changed to "Ensure index file exists".
tools/load_json.py
Outdated
| utils.load_dotenv() | ||
|
|
||
| # Run the loading process | ||
| try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop the try/except. When it crashes I want to see a traceback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in dd79821 - removed try/except wrapper.
tools/load_json.py
Outdated
| if isinstance(provider, SqliteStorageProvider): | ||
| provider.db.commit() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead, use with provider: <create conversation> -- that is supported by all provider types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in dd79821 - now uses async with provider: for transaction handling.
|
@copilot address my review |
…anager, reorder args Co-authored-by: gvanrossum-ms <184014243+gvanrossum-ms@users.noreply.github.com>
All review feedback addressed in dd79821:
|
gvanrossum-ms
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot just two more nits
tools/load_json.py
Outdated
| async with provider: | ||
| pass # Commit happens in __aexit__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, no, no! The async with goes around lines 68-70.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in bfab170 - moved async with provider: to wrap the read_from_file call (lines 68-71).
tools/load_json.py
Outdated
| load_json_to_database( | ||
| args.index_path, | ||
| args.database, | ||
| args.verbose, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you remove the trailing comma and re-run make format the load_json_to_database() call will likely collapse on one line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in bfab170 - removed trailing comma and ran black formatter, which collapsed the call to one line.
Co-authored-by: gvanrossum-ms <184014243+gvanrossum-ms@users.noreply.github.com>
gvanrossum-ms
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot, one request. Can you test that the code actually works with real data? Run it with something like
rm -f loadtest.db
python tools/load_json.py -v -d loadtest.db tests/testdata/Episode_53_AdrianTchaikovsky_index
and inspect the database using sqlite3, e.g.
sqlite3 loadtest.db
select count(*) from messages;
Then just post the output in a comment to the PR.
|
@copilot one request. Can you test that the code actually works with real data? Run it with something like and inspect the database using sqlite3, e.g. Then just post the output in a comment to the PR. |
Summary
This PR successfully implements the requested feature: a new
tools/load_json.pytool that loads JSON-serialized podcast data into a SQLite database, and simplifiesquery.pyby removing the corresponding logic.Changes Made
✅ Created
tools/load_json.py-v/--verboseand-d/--database(required) flagsPodcast.read_from_file()async with provider:) for proper transaction handling✅ Simplified
tools/query.py--podcastflag completely--databaseflag requiredload_podcast_index()function (26 lines removed)SqliteStorageProviderimport✅ Addressed All Review Feedback
async with provider:properly wrapping the read_from_file callCode Quality
black)Usage
Original prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.