Add extra if in pseudocode fetch all docs #2691

lbercken · 2024-09-03T07:48:00Z

Description

"You can repeat this process until you’ve fetched as many docs as you want, or until the nextCursorMark returned matches the cursorMark you’ve already specified — indicating that there are no more results."

The pseudo code misses the first part.

Solution

Add an extra if for checking whether the number of documents in the result is less than the specified rows. The extra if also prevents an unnecessary extra request.

It could also be shortened, by replacing the two ifs by:

$done = count($results[response][docs]) < $r || $params[cursorMark] == $results[nextCursorMark]

I don't know whether that's good for readability.

In addition, this could be transformed into a do-while statement.
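The combined condition above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Reference Guide's code: `search` is a hypothetical callable that sends the parameters to Solr and returns the parsed response as a dict, and `fetch_all` is a name invented here. Python has no do-while, so the loop breaks at the bottom instead.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for "send these params to Solr, return the parsed JSON".
Search = Callable[[Dict[str, object]], Dict[str, object]]


def fetch_all(search: Search, query: str, rows: int) -> List[dict]:
    """Collect every matching doc, stopping on a short page or an unchanged cursor."""
    params = {"q": query, "sort": "id asc", "rows": rows, "cursorMark": "*"}
    docs: List[dict] = []
    while True:  # do-while equivalent: always issue at least one request
        results = search(params)
        page = results["response"]["docs"]
        docs.extend(page)
        # The two ifs from the proposal, merged into a single condition.
        done = (len(page) < rows
                or params["cursorMark"] == results["nextCursorMark"])
        if done:
            break
        params["cursorMark"] = results["nextCursorMark"]
    return docs
```

With five matching docs and rows set to 2, this sketch stops after the third request, because the third page holds only one doc. Without the short-page check a fourth request would be needed to see the unchanged cursor.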

Tests

None. This is a documentation change.

Checklist

Please review the following and check all that apply:

I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
I have created a Jira issue and added the issue ID to my pull request title.
I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
I have developed this patch against the main branch.
I have run ./gradlew check.
I have added tests for my changes.
I have added documentation for the Reference Guide

Not all of the above is checked, because it is a documentation change.

epugh

I think pseudocode by its nature can be lengthier if that helps clarity!

epugh · 2024-09-04T15:33:16Z

Thank you!

hossman · 2024-09-04T17:38:38Z

Hmmmm.... IMO this change does not seem like an overall improvement to the doc.

First off, the premise of the PR seems to conflate two different scenarios...

"You can repeat this process until you’ve fetched as many docs as you want, or until the nextCursorMark returned matches the cursorMark you’ve already specified — indicating that there are no more results."
The pseudo code misses the first part.

There is a difference between the "until you’ve fetched as many docs as you want" scenario (i.e., "I as a client want to fetch docs from Solr until I have enough to satisfy my purposes"; there is a separate example for that later in the doc) and the "Solr ran out of documents to return" scenario being demonstrated here.

Adding a "did Solr return the full number of rows requested?" check to the code doesn't do anything to demonstrate the first scenario.
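For contrast, the first scenario (the client stops once it has enough docs) could look roughly like the Python sketch below. It is not the example from the doc: `search` is a hypothetical callable that sends the parameters to Solr and returns the parsed response, and `fetch_up_to` and `wanted` are names invented for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for "send these params to Solr, return the parsed JSON".
Search = Callable[[Dict[str, object]], Dict[str, object]]


def fetch_up_to(search: Search, query: str, rows: int, wanted: int) -> List[dict]:
    """Stop as soon as the client has `wanted` docs, or Solr has no more to give."""
    params = {"q": query, "sort": "id asc", "rows": rows, "cursorMark": "*"}
    docs: List[dict] = []
    while len(docs) < wanted:  # client-side limit: "as many docs as you want"
        results = search(params)
        docs.extend(results["response"]["docs"])
        if params["cursorMark"] == results["nextCursorMark"]:
            break  # Solr ran out of documents before the client had enough
        params["cursorMark"] = results["nextCursorMark"]
    return docs[:wanted]  # trim any overshoot from the last page
```

Here the stopping rule belongs to the client (`wanted`), which is independent of whether the last page came back short.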

Second:
The original pseudocode here explicitly did NOT check for "did Solr return the full number of rows requested?" because doing so is only a viable way to optimize away the last request in static indexes. In a "Tailing a Cursor" type of scenario you absolutely should not check the number of docs returned. That use case is introduced later in the document, and the pseudocode in both sections was designed to be structurally the same, except for the addition of a loop-forever/sleep wrapped around it.

If we're going to change this pseudocode block to do an explicit rows check, then the example should probably note it's an optimization, and the other pseudocode example in the later section on "tailing a cursor" should have a note that you can't use that optimization in that situation.
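A rough Python sketch of the structure described above: one inner loop shared by both use cases, with the rows check as an opt-in optimization that the tailing wrapper never enables. All names here are hypothetical (`search`, `drain_cursor`, `tail_cursor`, `short_page_optimization`), and `polls` is an assumption added only so the otherwise endless loop terminates in a demonstration.

```python
import time
from typing import Callable, Dict

# Hypothetical stand-in for "send these params to Solr, return the parsed JSON".
Search = Callable[[Dict[str, object]], Dict[str, object]]
Handler = Callable[[dict], None]


def drain_cursor(search: Search, params: Dict[str, object], handle: Handler,
                 short_page_optimization: bool = False) -> str:
    """Page forward from params["cursorMark"]; return the last cursorMark reached."""
    while True:
        results = search(params)
        page = results["response"]["docs"]
        for doc in page:
            handle(doc)
        next_mark = results["nextCursorMark"]
        if next_mark == params["cursorMark"]:
            return next_mark  # unchanged cursor: nothing more right now
        params["cursorMark"] = next_mark
        # Optional shortcut, intended for static indexes only.
        if short_page_optimization and len(page) < params["rows"]:
            return next_mark


def tail_cursor(search: Search, params: Dict[str, object], handle: Handler,
                polls: int, sleep_seconds: float = 0.0) -> None:
    """Same inner loop wrapped in loop/sleep; never uses the short-page shortcut."""
    for _ in range(polls):  # a real tailing client would loop forever
        drain_cursor(search, params, handle)
        time.sleep(sleep_seconds)
```

Keeping the shortcut behind a flag leaves the two loops structurally identical, which is the property the comment above asks to preserve.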

Third:
This PR only changed the pseudocode example, even though the very next line after the example says...

Using SolrJ, this pseudocode would be:

...but now the pseudocode and SolrJ code no longer match.

Likewise, the sequential "curl" example (following the SolrJ example) no longer matches either, because it still does a final request looking for an unchanged nextCursorMark (even though it notes the previous request returned fewer docs than the rows param).

epugh · 2024-09-04T17:45:32Z

I knew I was going to get in trouble with pseudo code...! I'll roll this back.
One thought: does the pseudocode even help convey things? I sort of wondered if it was useful at all... Thoughts?

epugh · 2024-09-04T17:47:37Z

And we can fix the q=:&rows=5&start=0&sort=name asc, id asc&cursorMark=* where the q param looks badly formatted.

hossman · 2024-09-04T17:50:47Z

I knew I was going to get in trouble with pseudo code...!

:) ... you just gotta remember to review the whole doc holistically.

One thought: does the pseudocode even help convey things? I sort of wondered if it was useful at all... Thoughts?

I don't have a strong opinion ... The idea once upon a time was to avoid the implication to new users that they had to use SolrJ (or python, or whatever) to achieve certain goals.

If you want to rip out the pseudocode I'm fine with that, but it means rewriting all the other examples to use SolrJ (and I don't think this is the only page in the ref guide with pseudocode?).

This reverts commit 1ce59fa.

epugh · 2024-09-07T12:22:00Z

@lbercken I reverted the commit, but would love to try again if you can take @hossman's feedback into account... I thought I would be able to just "reopen" this PR, but that doesn't appear to be how it works!

github-actions bot added the documentation Improvements or additions to documentation label Sep 3, 2024

epugh approved these changes Sep 4, 2024

epugh self-assigned this Sep 4, 2024

epugh merged commit 1ce59fa into apache:main Sep 4, 2024
3 checks passed

epugh added a commit that referenced this pull request Sep 7, 2024

Revert "Add extra if in pseudocode fetch all docs (#2691)"

7923292

This reverts commit 1ce59fa.

epugh added a commit that referenced this pull request Sep 7, 2024

Revert "Add extra if in pseudocode fetch all docs (#2691)"

feaca6f

This reverts commit 1ce59fa.
