Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move code samples to master branch #3271

Merged
merged 14 commits into from
Jun 13, 2024
Merged

Conversation

andy-stark-redis
Copy link
Contributor

@andy-stark-redis andy-stark-redis commented Jun 7, 2024

Pull Request check-list

Please make sure to review and check all of these items:

  • Do tests and lints pass with this change?
  • Do the CI tests pass with this change (enable it first in your forked repo and wait for the github action build to finish)?
  • Is the new or changed code fully tested?
  • Was the change added to CHANGES file?

NOTE: these things are not required to open a PR and can be done
afterwards / while the PR is open.

Description of change

The code samples for the main documentation pages are kept in this repo but they were previously in a separate branch (emb-examples). The plan is to move them to the master branch and run tests on them so they get checked frequently.

@andy-stark-redis andy-stark-redis self-assigned this Jun 7, 2024
@andy-stark-redis andy-stark-redis requested a review from gerzse June 7, 2024 11:05
@andy-stark-redis andy-stark-redis marked this pull request as ready for review June 7, 2024 11:06
Comment on lines +30 to +33
URL = ("https://raw.githubusercontent.com/bsbodden/redis_vss_getting_started"
"/main/data/bikes.json"
)
response = requests.get(URL, timeout=10)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this data should be included here instead of being pulled from another repo that can change at any moment. That'd drop the requests dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akx Thanks for the review - very much appreciated! I did wonder myself if it would be better to add the JSON to the code. However, I think the tutorial gets the data from a URL to illustrate the kind of task the user might want to do in real life. Also, the file on the Github repo was deliberately created for this tutorial, so it shouldn't change unless we want it to. I'd say we should leave it like this for now but I'll find out if we've got any feedback about this page and see if the users have any problems with the HTTP request.

Comment on lines 288 to 299
# Optional: convert the table to Markdown using Pandas
queries_table = pd.DataFrame(results_list)
queries_table.sort_values(
by=["query", "score"], ascending=[True, False], inplace=True
)
queries_table["query"] = queries_table.groupby("query")["query"].transform(
lambda x: [x.iloc[0]] + [""] * (len(x) - 1)
)
queries_table["description"] = queries_table["description"].apply(
lambda x: (x[:497] + "...") if len(x) > 500 else x
)
queries_table.to_markdown(index=False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is "optional" and nothing is actually even done with the markdown return value, this should probably be removed so pandas and tabulate aren't required to run the tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akx Again, I think it's trying to show the user a real world example. The code sample doesn't use the Markdown explicitly but the output Markdown table is actually shown in the tutorial page. I think the doc text on the page could probably use some improvement but that's a separate PR on the docs repo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to_markdown() returns the table as a string, so this would just do a lot of work to dataframe-ify things, group them, transform them... and then throw the result away. I really don't think it's a great idea to spend CI CPU cycles (and contribute to our planet burning in a minor way) for that.

I assume these are somehow extracted from something like a notebook session, where the value of the last expression might actually be shown?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've added a return value to the function and then the main code prints that out when the function is called. The doc page shows the Markdown table as rendered, but it looks like it only shows the first query currently (this is an issue with the doc page, not the code sample - I'll fix that separately).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, I'm not following. Why bother with that either? Is there something asserting on the output?

If there is, if the idea of these being here (and e.g. being run as part of CI?) is to test that "typical" Redis-utilizing code works, then it would break if pandas or tabulate decide to change their (human-oriented) output.

If there isn't, I don't see a point in doing the work to human-format data that no one will care about.

(On that note, running sentence-transformers in CI also sounds like a bad idea, since it will have to download the transformers models, then run embedding computation on CPU, ...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might be misunderstanding what this code is for :-) The code samples in this folder are included in the docs (eg, https://redis.io/docs/latest/develop/get-started/vector-database/). Their purpose is to explain to the user how to do stuff with Redis and to supply bits of code for copy/paste. The reason for putting them in the client repo as tests is to check that the sample code works against the current version of the client, not to act as real world example tests for the client.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akx , these tests were living in a separate branch until now, this is a first step to put them as close as possible to the code they are documenting. Right now there is no CI in this repo for them, I guess we can add something in other PRs. The purpose so far is educational, and we want to decrease maintenance effort. I'll merge this as it is.

@gerzse gerzse merged commit 733f800 into redis:master Jun 13, 2024
46 checks passed
@gerzse gerzse added the maintenance Maintenance (CI, Releases, etc) label Jun 14, 2024
@gerzse gerzse changed the title DOC-3801 move code samples to master branch Move code samples to master branch Jun 20, 2024
agnesnatasya pushed a commit to agnesnatasya/redis-py that referenced this pull request Jul 20, 2024
Move the code samples used for documentation into the master branch,
so they sit next to the code they document.
vladvildanov pushed a commit that referenced this pull request Sep 27, 2024
Move the code samples used for documentation into the master branch,
so they sit next to the code they document.
vladvildanov pushed a commit that referenced this pull request Sep 27, 2024
Move the code samples used for documentation into the master branch,
so they sit next to the code they document.
vladvildanov pushed a commit that referenced this pull request Sep 27, 2024
Move the code samples used for documentation into the master branch,
so they sit next to the code they document.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Maintenance (CI, Releases, etc)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants