Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] fix prevewing public dataset errors #190

Merged
merged 1 commit into from
Sep 7, 2024
Merged

Conversation

HuaizhengZhang
Copy link
Contributor

@HuaizhengZhang HuaizhengZhang commented Sep 7, 2024

User description

Closes #188


PR Type

enhancement, bug fix


Description

  • Clarified the conditions under which the preview_csv_data function should be used for public datasets in the advisor documentation.
  • Changed the default mode in the CLI from 'general' to 'baseline' to better align with the intended functionality.
  • Updated the error message in the CLI to reflect the new mode options, ensuring users are informed of valid modes.

Changes walkthrough 📝

Relevant files
Documentation
advisor.py
Clarify dataset preview conditions in advisor documentation

mle/agents/advisor.py

  • Clarified the usage of preview_csv_data for public datasets.
  • Updated documentation to specify conditions for previewing datasets.
  • +1/-1     
    Enhancement
    cli.py
    Update CLI default mode and error message                               

    mle/cli.py

  • Changed default mode from 'general' to 'baseline'.
  • Updated error message to reflect new mode options.
  • +3/-3     

    💡 PR-Agent usage:
    Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

    @dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. bug Something isn't working labels Sep 7, 2024
    Copy link

    github-actions bot commented Sep 7, 2024

    PR Reviewer Guide 🔍

    ⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Key issues to review

    Ambiguity in Documentation
    The added documentation in line 58 is ambiguous. It's unclear whether the function preview_csv_data should or should not be used for public datasets. Consider rephrasing for clarity.

    Error Handling
    The error message in line 59 only specifies 'baseline' and 'report' as valid modes. Ensure this aligns with all supported modes in the CLI.

    Copy link

    github-actions bot commented Sep 7, 2024

    PR Code Suggestions ✨

    CategorySuggestion                                                                                                                                    Score
    Possible bug
    Add conditional logic to ensure preview_csv_data is only called for local datasets

    The conditional logic for determining whether to preview the CSV file based on the
    dataset type (public or local) is unclear and potentially missing. It's important to
    implement a clear conditional check to ensure that preview_csv_data is called only
    for local datasets and not for public datasets.

    mle/agents/advisor.py [58]

    -the CSV file or not if the dataset is a public dataset.
    +if not is_public_dataset:
    +    preview_csv_data(dataset_path)
     
    Suggestion importance[1-10]: 8

    Why: The suggestion addresses a potential bug by clarifying when to call preview_csv_data, which is crucial for correct functionality when handling different dataset types.

    8
    Maintainability
    Refactor mode handling to use a dictionary for scalability and maintainability

    Consider handling additional modes in the start function or providing a more generic
    error handling mechanism. This would make the function more flexible and
    maintainable, especially if more modes are to be added in the future.

    mle/cli.py [52-59]

    -if mode == 'baseline':
    -    # Baseline mode
    -    return workflow.baseline(os.getcwd(), model)
    -elif mode == 'report':
    -    # Report mode
    -    return ctx.invoke(report, model=model, visualize=False)
    -else:
    -    raise ValueError("Invalid mode. Supported modes: 'baseline', 'report'.")
    +mode_functions = {
    +    'baseline': lambda: workflow.baseline(os.getcwd(), model),
    +    'report': lambda: ctx.invoke(report, model=model, visualize=False)
    +}
    +try:
    +    return mode_functions[mode]()
    +except KeyError:
    +    raise ValueError(f"Invalid mode. Supported modes: {', '.join(mode_functions.keys())}.")
     
    Suggestion importance[1-10]: 7

    Why: The suggestion improves maintainability by using a dictionary for mode handling, making the code more scalable and easier to extend with additional modes.

    7
    Possible issue
    Verify and potentially revert the default mode to 'general' if the change was unintended

    The default value for the mode argument in the start command has been changed from
    'general' to 'baseline'. Ensure that this change is intentional and correctly
    documented, as it alters the default behavior of the command.

    mle/cli.py [43]

    -@click.argument('mode', default='baseline')
    +@click.argument('mode', default='general')  # if reverting to previous default
     
    Suggestion importance[1-10]: 5

    Why: The suggestion is valid as it prompts verification of an intentional change in default behavior, which is important for maintaining expected functionality.

    5

    @HuaizhengZhang HuaizhengZhang requested review from zlfben and huangyz0918 and removed request for huangyz0918 and zlfben September 7, 2024 18:34
    @dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Sep 7, 2024
    @HuaizhengZhang HuaizhengZhang merged commit c0c82f3 into main Sep 7, 2024
    4 checks passed
    @HuaizhengZhang HuaizhengZhang deleted the hz/fix-public-data branch September 7, 2024 18:49
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    Bug fix bug Something isn't working enhancement New feature or request lgtm This PR has been approved by a maintainer Review effort [1-5]: 2 size:XS This PR changes 0-9 lines, ignoring generated files.
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    after dataset guess, we can not continue the process
    2 participants