Fix bounds for embedded lists in QP #5202

benjaminpkane · 2024-12-02T16:10:32Z

What changes are proposed in this pull request?

Fixes bounds computation for embedded lists in the QP sidebar. `!= None` is not a valid filtering approach for multikey searches.

How is this patch tested? If it is not, please explain why.

Added lightning test coverage

What areas of FiftyOne does this PR affect?

App: FiftyOne application changes
Build: Build and test infrastructure changes
Core: Core fiftyone Python library changes
Documentation: FiftyOne documentation changes
Other

Summary by CodeRabbit

New Features
- Enhanced numeral formatting to improve error handling for "NaN" outputs.
- Updated RangeSlider to provide more informative feedback when no bounds are available.
Bug Fixes
- Improved handling of special float values (inf, nan) in query responses.
- Enhanced error handling and control flow in view loading and filtering processes.
Tests
- Updated test cases to reflect changes in data handling and expected outcomes for various data types.

coderabbitai · 2024-12-02T16:10:39Z

📥 Commits

Reviewing files that changed from the base of the PR and between f085b74 and 249f48e.

Walkthrough

The changes in this pull request focus on enhancing error handling and feedback mechanisms within the numeral formatting and filtering components of the application. Modifications include updates to the getFormatter function to handle "NaN" outputs more robustly, improvements to the RangeSlider component for better contextual messaging, and adjustments to the nonfinitesText selectorFamily for more informative data representation. Additionally, the Lightning server's query handling logic has been refined, along with corresponding updates to unit tests to ensure alignment with the new data structures and expected outcomes.

Changes

| File | Change Summary |
| --- | --- |
| `app/packages/core/src/components/Common/utils.tsx` | Modified `getFormatter` function to enhance error handling for numeral formatting, specifically addressing "NaN" outputs. |
| `app/packages/core/src/components/Filters/NumericFieldFilter/RangeSlider.tsx` | Introduced `nonfinitesText` state variable for improved conditional rendering in the `RangeSlider` component, enhancing user feedback when no bounds are available. |
| `app/packages/core/src/components/Filters/NumericFieldFilter/state.ts` | Renamed `hasNonfinites` selectorFamily to `nonfinitesText`, altering its logic to return a comma-separated string of non-finite keys. |
| `fiftyone/server/lightning.py` | Enhanced query handling in the Lightning server, including updates to the `_do_async_query` and `_parse_result` functions for improved clarity and functionality. |
| `tests/unittests/lightning_tests.py` | Updated test cases to reflect changes in data handling for floating-point special values and modified GraphQL query structures accordingly. |

Possibly related PRs

fix modal not rendering properly in qp mode #5076: The modifications in the NumericFieldFilter component regarding the visibility of the button based on the modal prop may relate to the changes in error handling and display logic in the getFormatter function.
Coinciding filter results for QP sidebar #5121: This PR enhances the filtering mechanism in the sidebar, which relates to the changes in the getFormatter function that improve error handling for numeral formatting.
Fix lightning numeric bounds #5125: The modifications in this PR address numeric bounds calculations, which are closely related to the numeral formatting logic in the main PR's getFormatter function.
QP sidebar filters to active slice for group datasets #5177: This PR introduces enhancements for handling group datasets, which may involve similar logic to the numeral formatting changes in the main PR, particularly in how data is processed and displayed.

Suggested labels

enhancement, app, server, documentation

Suggested reviewers

minhtuev
CamronStaley

🐇 In the meadow where numbers play,
A formatter found a better way.
With "NaN" no longer causing fright,
It brings clarity, shining bright.
The sliders now tell tales so true,
With every bound, a clearer view! 🌼

coderabbitai

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (5)

app/packages/core/src/components/Filters/NumericFieldFilter/state.ts (1)
42-49: Refactor to eliminate code duplication

The mapping and filtering logic is duplicated. Consider extracting it into a helper function for better maintainability.
+ const getNonfiniteKeys = (get, params) =>
+   FLOAT_NONFINITES.map((key) => [
+     key,
+     get(nonfiniteAtom({ key, ...params })),
+   ])
+     .filter(([_, v]) => v)
+     .map(([k]) => k);

  export const nonfinitesText = selectorFamily({
    key: "nonfinitesText",
    get:
      (params: { path: string; modal: boolean }) =>
      ({ get }) => {
-       const result = FLOAT_NONFINITES.map((key) => [
-         key,
-         get(nonfiniteAtom({ key, ...params })),
-       ])
-         .filter(([_, v]) => v)
-         .map(([k]) => k);
+       const result = getNonfiniteKeys(get, params);
        return result.length ? result.join(", ") : null;
      },
  });
app/packages/core/src/components/Common/utils.tsx (1)
71-78: Simplify format selection logic and improve NaN handling

The format selection logic could be simplified, and the NaN handling could be more explicit.
+ const getNumberFormat = (fieldType: string, bounds: [number, number]) => {
+   if ([INT_FIELD, FRAME_NUMBER_FIELD, FRAME_SUPPORT_FIELD].includes(fieldType)) {
+     return "0a";
+   }
+   return bounds[1] - bounds[0] < 0.1 ? "0.0000a" : "0.00a";
+ };

- const str = numeral(v).format(
-   [INT_FIELD, FRAME_NUMBER_FIELD, FRAME_SUPPORT_FIELD].includes(fieldType)
-     ? "0a"
-     : bounds[1] - bounds[0] < 0.1
-     ? "0.0000a"
-     : "0.00a"
- );
+ const format = getNumberFormat(fieldType, bounds);
+ const str = numeral(v).format(format);

- return str === "NaN" ? v.toString() : str;
+ // Explicitly handle NaN case
+ if (str === "NaN" || Number.isNaN(v)) {
+   return v.toString();
+ }
+ return str;
app/packages/core/src/components/Filters/NumericFieldFilter/RangeSlider.tsx (1)
38-38: Consider using a more flexible text formatting approach

The text concatenation could be made more flexible and maintainable.
  const nonfinitesText = useRecoilValue(state.nonfinitesText({ path, modal }));

+ const getBoxText = (nonfinitesText: string | null) => {
+   if (!nonfinitesText) {
+     return "No results";
+   }
+   return `${nonfinitesText} present`;
+ };

  if (!hasBounds) {
    return (
-     <Box text={nonfinitesText ? `${nonfinitesText} present` : "No results"} />
+     <Box text={getBoxText(nonfinitesText)} />
    );
  }
Also applies to: 41-43
fiftyone/server/lightning.py (2)
326-328: Consider optimizing query performance.

The current implementation adds sort before match, which could impact performance as MongoDB would sort the entire collection before filtering. Consider reordering to filter first:
-            query.insert(0, {"$match": {k: v}})
-            query.insert(0, {"$sort": {k: 1}})
+            query.insert(0, {"$sort": {k: 1}})
+            query.insert(0, {"$match": {k: v}})
509-515: LGTM: Enhanced result parsing with finite check.

The changes improve result parsing by properly handling non-finite float values. Consider adding debug logging for when non-finite values are encountered to help with troubleshooting.
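
As a rough illustration of the finite check described for lines 509-515, the sketch below maps non-finite floats to `None`, which is the behavior the updated test assertions appear to expect. `finite_or_none` is a hypothetical helper written for this example; it is not the actual `_parse_result` code.

```python
import math
from typing import Optional, Union

Number = Union[int, float]


def finite_or_none(value: Optional[Number]) -> Optional[Number]:
    """Return the value unchanged when it is finite, otherwise None.

    Hypothetical helper: assumes non-finite floats (inf, -inf, nan)
    should be reported as missing bounds rather than as numbers.
    """
    if value is None:
        return None
    if isinstance(value, float) and not math.isfinite(value):
        return None
    return value


raw = [0.25, float("inf"), float("-inf"), float("nan"), None, 3]
cleaned = [finite_or_none(v) for v in raw]
print(cleaned)
```

The non-finite values would then be surfaced separately (for example through the `nonfinitesText` selector on the app side) instead of distorting the min/max bounds.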

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 7287351 and 750694b.

📒 Files selected for processing (5)

app/packages/core/src/components/Common/utils.tsx (1 hunks)
app/packages/core/src/components/Filters/NumericFieldFilter/RangeSlider.tsx (1 hunks)
app/packages/core/src/components/Filters/NumericFieldFilter/state.ts (1 hunks)
fiftyone/server/lightning.py (4 hunks)
tests/unittests/lightning_tests.py (27 hunks)

🧰 Additional context used

📓 Path-based instructions (3)

app/packages/core/src/components/Common/utils.tsx (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

app/packages/core/src/components/Filters/NumericFieldFilter/RangeSlider.tsx (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

app/packages/core/src/components/Filters/NumericFieldFilter/state.ts (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

🔇 Additional comments (5)

fiftyone/server/lightning.py (2)

12-12: LGTM: Required import for float handling.

The math module import is necessary for checking finite float values.

429-432: LGTM: Improved float handling logic.

The changes correctly handle float values by:

Adding float-specific pipeline steps when needed
Ensuring existence check is consistently applied

tests/unittests/lightning_tests.py (3)

538-538: LGTM: Comprehensive test coverage for float handling.

The test assertions have been properly updated to verify the new behavior of returning None for non-finite float values in various scenarios.

Also applies to: 572-572, 591-591, 609-609, 635-635, 644-644, 654-654, 663-663, 672-672, 681-681, 734-734, 753-753, 771-771, 797-797, 806-806, 816-816, 825-825, 834-834, 843-843, 932-932, 951-951, 969-969

1063-1067: LGTM: Added test cases for group dataset queries.

New test cases properly verify the handling of classifications with confidence values, including negative values.

Also applies to: 1076-1080

1092-1096: LGTM: Comprehensive group dataset test coverage.

The test cases thoroughly verify:

Float field handling in classifications
Proper slicing of group datasets
Correct min/max computations

Also applies to: 1114-1120, 1128-1132, 1146-1152, 1160-1164

app/packages/core/src/components/Filters/NumericFieldFilter/state.ts

coderabbitai

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (2)

app/packages/core/src/components/Filters/NumericFieldFilter/state.ts (1)
28-33: Add type safety improvements

The selector could benefit from better type safety:

Missing return type annotation

No type checking for the data structure

Consider adding these type improvements:
 export const nonfinitesText = selectorFamily({
   key: "nonfinitesText",
   get:
     (params: { path: string; modal: boolean }) =>
-    ({ get }) => {
+    ({ get }): string | null => {
+      const data: Record<string, boolean> = get(nonfiniteData({ ...params, extended: false }));
fiftyone/server/lightning.py (1)
326-328: Consider performance impact of added sorting operation

While the addition of sorting ensures consistent ordering, consider the performance impact when dealing with large datasets.

Consider adding a conditional sort only when necessary:
 for k, v in filter.items():
     query.insert(0, {"$match": {k: v}})
-    query.insert(0, {"$sort": {k: 1}})
+    if needs_sorting(k):  # Add helper function to determine if sorting is needed
+        query.insert(0, {"$sort": {k: 1}})
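
Both notes on lines 326-328 depend on how repeated `query.insert(0, ...)` calls order the stages: whichever stage is inserted last ends up first in the pipeline. The sketch below demonstrates only that list behavior. `build_pipeline` and the trailing `$limit` stage are assumptions made for illustration, not code from `lightning.py`.

```python
from typing import Any, Dict, List


def build_pipeline(filters: Dict[str, Any], match_first: bool) -> List[dict]:
    """Prepend $match/$sort stages the way the reviewed loop does."""
    # Hypothetical trailing stage standing in for the rest of the query
    query: List[dict] = [{"$limit": 1}]
    for k, v in filters.items():
        if match_first:
            # $match is inserted last, so it lands at index 0
            query.insert(0, {"$sort": {k: 1}})
            query.insert(0, {"$match": {k: v}})
        else:
            # $sort is inserted last, so it lands at index 0
            query.insert(0, {"$match": {k: v}})
            query.insert(0, {"$sort": {k: 1}})
    return query


def stage_names(pipeline: List[dict]) -> List[str]:
    """Return the operator name of each stage, in order."""
    return [next(iter(stage)) for stage in pipeline]


sort_then_match = stage_names(build_pipeline({"label": "cat"}, match_first=False))
match_then_sort = stage_names(build_pipeline({"label": "cat"}, match_first=True))
print(sort_then_match)  # ['$sort', '$match', '$limit']
print(match_then_sort)  # ['$match', '$sort', '$limit']
```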

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 750694b and f085b74.

📒 Files selected for processing (3)

app/packages/core/src/components/Filters/NumericFieldFilter/state.ts (2 hunks)
fiftyone/server/lightning.py (5 hunks)
fiftyone/server/view.py (2 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

app/packages/core/src/components/Filters/NumericFieldFilter/state.ts (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

🔇 Additional comments (7)

fiftyone/server/lightning.py (4)

429-430: LGTM: Proper handling of non-finite float values

The addition of special float handling in the pipeline is correct and aligns with the PR objective.

432-432: LGTM: Added existence check

The addition of the existence check in the pipeline is a good defensive programming practice.

509-515: LGTM: Proper handling of non-finite float values

The addition of math.isfinite() check ensures proper handling of special float values (inf, -inf, nan).

143-147: LGTM: Improved filter initialization logic

The conditional initialization of filters based on group field and slice presence is a good improvement.

Let's verify the usage of group_field across the codebase:

✅ Verification successful

Based on the search results, I can see that the group_field is a core feature of FiftyOne datasets that allows organizing samples into groups. The filter initialization in the code under review is correctly using dataset.group_field to filter samples based on the group slice when present.

LGTM: Filter initialization correctly handles group field filtering

The conditional filter initialization using dataset.group_field and input.slice is properly implemented to filter samples by their group slice when applicable. This is consistent with FiftyOne's group field functionality as evidenced by extensive usage across the codebase.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for group_field usage to ensure consistent handling
rg "group_field" -A 3

Length of output: 66275

fiftyone/server/view.py (3)

617-619: LGTM: Added defensive range check

The addition of the range check before unpacking prevents potential KeyError exceptions.

644-647: LGTM: Proper datetime range handling

The addition of range check and proper timestamp conversion ensures robust datetime handling.

649-651: LGTM: Consistent range handling for numeric values

The addition of range check maintains consistency with other type handlers and prevents potential errors.
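
The `view.py` hunks are not shown in this thread, so the following is only a sketch of the kind of defensive range check the three notes above describe. It assumes the filter arguments arrive as a mapping with an optional `"range"` key; `range_expr` and the returned expression shape are hypothetical.

```python
from typing import Any, Mapping, Optional


def range_expr(path: str, args: Mapping[str, Any]) -> Optional[dict]:
    """Build a range filter only when a range was actually supplied."""
    range_ = args.get("range")  # args["range"] would raise KeyError if absent
    if not range_:
        return None
    lower, upper = range_
    return {path: {"$gte": lower, "$lte": upper}}


print(range_expr("confidence", {}))
print(range_expr("confidence", {"range": [0.1, 0.9]}))
```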

coderabbitai · 2024-12-02T21:20:00Z

app/packages/core/src/components/Filters/NumericFieldFilter/state.ts

+      const result = Object.fromEntries(
+        Object.entries(data).filter(([k, v]) => k !== "none" && Boolean(v))
      );
+      return result.length ? result.join(", ") : null;


⚠️ Potential issue

Fix incorrect Object manipulation logic

The current implementation has several issues:

Object.fromEntries() creates an object, but then tries to use array methods (length, join)

The filtering logic could be more efficient

Apply this fix:

- const result = Object.fromEntries(
-   Object.entries(data).filter(([k, v]) => k !== "none" && Boolean(v))
- );
- return result.length ? result.join(", ") : null;
+ const filtered = Object.entries(data)
+   .filter(([k, v]) => k !== "none" && Boolean(v))
+   .map(([k]) => k);
+ return filtered.length ? filtered.join(", ") : null;

app/packages/core/src/components/Filters/NumericFieldFilter/state.ts

kaixi-wang · 2024-12-02T22:02:06Z

fiftyone/server/lightning.py

@@ -317,7 +317,7 @@ async def _do_async_query(
    filter: t.Optional[t.Mapping[str, str]],
 ):
    if isinstance(query, DistinctQuery):
-        if query.has_list and not query.filters:
+        if query.has_list:
            return await _do_distinct_query(collection, query, filter)


not sure about mongo, but distinct scans in most dbs are really expensive and slow... have you confirmed that relying more on distinct improves performance?

benjaminpkane

A distinct scan with high cardinality is not good, yes. I am not aware of an aggregation pipeline that can return the first N sorted results using a multikey index, though. But we should investigate more.
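
To make the trade-off in this thread concrete, here is a pure-Python sketch of what bounds over the distinct values of an embedded list field amount to. It is not MongoDB code and not the `_do_distinct_query` implementation. The document shape and the `embedded_list_bounds` helper are assumptions for illustration.

```python
import math
from typing import Any, Iterable, Mapping, Optional, Tuple


def embedded_list_bounds(
    docs: Iterable[Mapping[str, Any]], list_field: str, value_field: str
) -> Tuple[Optional[float], Optional[float]]:
    """Return (min, max) over the finite distinct values in an embedded list."""
    distinct = set()
    for doc in docs:
        # A missing or null list contributes no values
        for item in doc.get(list_field) or []:
            value = item.get(value_field)
            if isinstance(value, (int, float)) and math.isfinite(value):
                distinct.add(value)
    if not distinct:
        return None, None
    return min(distinct), max(distinct)


docs = [
    {"detections": [{"confidence": 0.9}, {"confidence": -0.1}]},
    {"detections": [{"confidence": float("nan")}, {}]},
    {"detections": None},
    {},
]
bounds = embedded_list_bounds(docs, "detections", "confidence")
print(bounds)  # (-0.1, 0.9)
```

The set of distinct values grows with the field's cardinality, which is the cost concern raised above.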

minhtuev · 2024-12-02T22:10:26Z

app/packages/core/src/components/Common/utils.tsx

        [INT_FIELD, FRAME_NUMBER_FIELD, FRAME_SUPPORT_FIELD].includes(fieldType)
          ? "0a"
          : bounds[1] - bounds[0] < 0.1
          ? "0.0000a"
          : "0.00a"
      );
+      return str === "NaN" ? v.toString() : str;


if str is not a number (null?), we just show it as null?

benjaminpkane

The formatter (numeral().format) returns a "NaN" string when it can't meaningfully format the value at the given precision ("0.00a"). Punting on a better solution for now; this at least shows the value instead of NaN.

minhtuev

Small comment, otherwise LGTM

benjaminpkane added 3 commits December 2, 2024 10:20

remove ne matching and extra sorting

f7bc02e

use exists

aeb9433

add to embedded list bounds tests

edc4ddc

benjaminpkane changed the title ~~Do not~~ Fix bounds for embedded lists in QP Dec 2, 2024

benjaminpkane self-assigned this Dec 2, 2024

benjaminpkane added 4 commits December 2, 2024 13:02

skip nonfinites sort for lists

46ab13f

handle NaN formatting

c36ed04

debugging

0f9b3ca

nonfinites tweaks

750694b

benjaminpkane requested review from minhtuev and CamronStaley December 2, 2024 20:03

benjaminpkane marked this pull request as ready for review December 2, 2024 20:03

coderabbitai bot reviewed Dec 2, 2024

app/packages/core/src/components/Filters/NumericFieldFilter/state.ts (outdated)

selector fixes

f478784

CamronStaley previously approved these changes Dec 2, 2024

always use distinct for embedded lists

f085b74

benjaminpkane dismissed CamronStaley’s stale review via f085b74 December 2, 2024 21:17

coderabbitai bot reviewed Dec 2, 2024

fix selector

249f48e

kaixi-wang reviewed Dec 2, 2024

minhtuev reviewed Dec 2, 2024

minhtuev approved these changes Dec 2, 2024

benjaminpkane merged commit bf1d29d into release/v1.1.0 Dec 2, 2024
13 checks passed

benjaminpkane deleted the bugfix/ne-none branch December 2, 2024 22:19
