Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set both "page" and "page_size" Params to PagedSearchIterable #1144

Closed
ashok2ashok opened this issue May 21, 2021 · 2 comments
Closed

Set both "page" and "page_size" Params to PagedSearchIterable #1144

ashok2ashok opened this issue May 21, 2021 · 2 comments

Comments

@ashok2ashok
Copy link

Describe the bug
While trying to limit filename search results to a specific number - say 1, the library ends up iterating over all the pages until every result is collated. I was trying to limit my search result to only one page with a certain page_size. Not looking for the library to parse all possible pages.

To Reproduce
Steps to reproduce the behavior:

  1. Perform a Search with a single filename, language and with a page size of 1. Sample given below:
    PagedSearchIterable<GHContent> searchResultsIter = github.searchContent().filename(fileName).language(language).list().withPageSize(1);

I followed the fix to this: #221 to come up with above code. There is no way to specify page param though.

  1. Put a debug point on the below line (146 in v1.128) in GitHubPageIterator.java in fetch() method
    assert nextResponse.body() != null;
  2. Verify the url param in the method at line 142
    URL url = nextRequest.url();
  3. Resume Program to capture the looped page sizes. On every subsequent loop, the page is incremented automatically.

Expected behavior
To get the behavior of below GET URL through a way to specify page query param as well:
https://api.github.com/search/code?q=filename:ThreadSafeLazyLoadedIvoryTowerTest language:Java&per_page=1&page=1

Desktop (please complete the following information):

  • OS: MacOS Big Sur 11.2.3, Mac mini (M1, 2020)
  • Browser: Safari
  • Version: 1.128

Additional context
I didn't find a way to limit my search results using both page_size and page query params. I tried to cheat by using the q param but Github doesn't not accept the page params in the query-field.
The first search API call doesn't not include page param.
Screenshot 2021-05-21 at 3 04 38 PM

The second auto-search API call added page param and increments it everytime until all results are fetched.
Screenshot 2021-05-21 at 3 05 00 PM

@ashok2ashok
Copy link
Author

ashok2ashok commented May 21, 2021

I now understand how silly my question was. I didn't notice the Iterable.

Fair warning to folks who are in the same boat - the issue will be masked and you'll be waiting forever if you use similar code to mine as shown below to read the results. searchResultsIter.forEach will continue calling the GH-API until the entire results are fetched instead of just the requested page_size. Same thing with searchResultsIter.toList()

           PagedSearchIterable<GHContent> searchResultsIter = github.searchContent().filename(fileName).language(language).list().withPageSize(1);
            List<GHContent> searchResultsList = new ArrayList<>();
            searchResultsIter.forEach(searchResultsList::add)

Since each call will exactly fetch the requested page_size, the iterator can be used to fetch the exact number of results you need - hopefully a multiple of page_size. To request records individually - use GHContent ghContent = ghContentPagedIterator.next() and to request entire page with the requested page_size use List<GHContent> ghContentList = ghContentPagedIterator.nextPage();

           PagedSearchIterable<GHContent> searchResultsIter = github.searchContent().filename(fileName).language(language).list().withPageSize(1);
            PagedIterator<GHContent> ghContentPagedIterator = searchResultsIter.iterator();
            if(ghContentPagedIterator.hasNext()){
                List<GHContent> ghContentList = ghContentPagedIterator.nextPage();
                //... Do you thing....
            }

Please note that PagedSearchIterable.getTotalCount() will make an additional call to the configured search - so beware :)

@kohsuke - Please close this issue as invalid or probably add this to the documentation :)
And can't thank you enough for this.

@kohsuke
Copy link
Collaborator

kohsuke commented May 21, 2021

Iterable.forEach() naturally walks over the entire iterable and my feeling is that nobody should expect otherwise.

Thanks for a very thorough bug report, though.

@kohsuke kohsuke closed this as completed May 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants