Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump tqdm from 4.65.0 to 4.66.3 in /inference #2

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

dependabot[bot]
Copy link

@dependabot dependabot bot commented on behalf of github Sep 30, 2024

Bumps tqdm from 4.65.0 to 4.66.3.

Release notes

Sourced from tqdm's releases.

tqdm v4.66.3 stable

tqdm v4.66.2 stable

  • pandas: add DataFrame.progress_map (#1549)
  • notebook: fix HTML padding (#1506)
  • keras: fix resuming training when verbose>=2 (#1508)
  • fix format_num negative fractions missing leading zero (#1548)
  • fix Python 3.12 DeprecationWarning on import (#1519)
  • linting: use f-strings (#1549)
  • update tests (#1549)
  • CI: bump actions (#1549)

tqdm v4.66.1 stable

  • fix utils.envwrap types (#1493 <- #1491, #1320 <- #966, #1319)
    • e.g. cloudwatch & kubernetes workaround: export TQDM_POSITION=-1
  • drop mentions of unsupported Python versions

tqdm v4.66.0 stable

  • environment variables to override defaults (TQDM_*) (#1491 <- #1061, #950 <- #614, #1318, #619, #612, #370)
    • e.g. in CI jobs, export TQDM_MININTERVAL=5 to avoid log spam
    • add tests & docs for tqdm.utils.envwrap
  • fix & update CLI completion
  • fix & update API docs
  • minor code tidy: replace os.path => pathlib.Path
  • fix docs image hosting
  • release with CI bot account again (cli/cli#6680)

tqdm v4.65.2 stable

  • exclude examples from distributed wheel (#1492)

tqdm v4.65.1 stable

  • migrate setup.{cfg,py} => pyproject.toml (#1490)
    • fix asv benchmarks
    • update docs
  • fix snap build (#1490)
  • fix & update tests (#1490)
    • fix flaky notebook tests
    • bump pre-commit
    • bump workflow actions
Commits

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    You can disable automated security fix PRs for this repo from the Security Alerts page.

dyang415 and others added 11 commits September 10, 2024 23:15
BFCL V3 release. Introducing new multi-turn dataset and state-based
evaluation metric for category: `multi_turn_base`,
`multi_turn_miss_func`, `multi_turn_miss_param`,
`multi_turn_long_context`, `multi_turn_composite`; a significant leap
towards multi-turn, and multi-step function calling (tool usage)
benchmarking.

BFCL V3 is a critical advancement in evaluating how Large Language
Models (LLMs) interact with diverse scenarios through invoking right
functions. Multi-turn function calling allows models to engage in a
back-and-forth interaction with users, making it possible for LLMs to
navigate through the complex tasks by asking clarifying questions. In
contrast to multi-turn `(user t0, assistant t1, user t2, assistant t3,
..)`, multi-step is where the LLM can break the response down into
multiple steps `(user t0, assistant t1, assistant t2,..)`. This new
paradigm mimics real-world agentic behaviors where AI assistants might
have to plan execution paths, request and extract critical information,
and handle sequential function invokes to complete a task.

To read more about the composition and construction of this live
dataset, please refer to our
[blog](https://gorilla.cs.berkeley.edu/blogs/13_bfcl_v3_multi_turn.html).

---------

**Also in this PR**:

1. Switch to use vllm serve for OSS model inference
2. Switch to Vertex AI Python SDK for Gemini models inference
3. Split out ast_checker and executable_checker for readability
4. Several outdated or deprecated models will be excluded from the
leaderboard and replaced with their updated successors to improve the
leaderboard's overall maintainability.

---------

Co-authored-by: Fanjia Yan <fanjiayan@berkeley.edu>
Co-authored-by: Charlie Cheng-Jie Ji <charliechengjieji@berkeley.edu>
Co-authored-by: Jason Huang <jasonhuang1103@berkeley.edu>
Co-authored-by: Vishnu Suresh <vishnusuresh@berkeley.edu>
Co-authored-by: Yixin Huang <yixinhuang1@berkeley.edu>
Co-authored-by: Xiaowen Yu <yxw2002@berkeley.edu>
Last time when I contributed the `raft_local.py` in directory named
`raft` there was some unnecessary were there, which I removed in this
pull request. It will not confuse the developers when they read the
file.
This PR separate out the change log from the READMD.md to make it more
readable. Some setup instructions have also been updated.

---------

Co-authored-by: Devansh Amin <devanshamin97@gmail.com>
…ishirPatil#656)

There are some dataset format issues for the single turn entries. The
code wraps the question field in an additional unnecessary list.

Fix ShishirPatil#651
…hishirPatil#660)

In the parse_nested_value function, added a check to determine whether
we are dealing with another function call or if its a regular
dictionary. Previous version of the code incorrectly assumed that this
was always a function call and did not consider the case where the
function argument is a dictionary.

Fix ShishirPatil#652

---------

Co-authored-by: Huanzhi (Hans) Mao <huanzhimao@gmail.com>
Added handler for:
phi-3-mini-4k-instruct
phi-3-mini-128k-instruct
phi-3-small-8k-instruct
phi-3-small-128k-instruct
phi-3-medium-4,-instruct
phi-3-medium-128k-instruct
phi-3.5-mini-instruct

|Rank|Model |Model Link |Organization|License |AST Summary|Simple
AST|Multiple AST|Parallel AST|Parallel Multiple AST|Irrelevance
Detection|Relevance Detection|

|----|---------------------------------|---------------------------------------------------------|------------|------------|-----------|----------|------------|------------|---------------------|---------------------|-------------------|
|1 |Phi-3-small-8k-instruct (Prompt)
|https://huggingface.co/microsoft/Phi-3-small-8k-instruct |Microsoft
|MIT |66.39% |59.70% |64.20% |76.75% |64.92% |47.06% |87.80% |
|2 |Phi-3-medium-4k-instruct
(Prompt)|https://huggingface.co/microsoft/Phi-3-medium-4k-instruct|Microsoft
|MIT |62.10% |66.67% |67.40% |62.00% |52.33% |46.79% |78.05% |
|3 |Phi-3-mini-4k-instruct (Prompt)
|https://huggingface.co/microsoft/Phi-3-mini-4k-instruct |Microsoft |MIT
|66.63% |70.76% |75.67% |69.75% |50.33% |20.25% |75.61% |
|4 |Phi-3.5-mini-instruct (Prompt)
|https://huggingface.co/microsoft/Phi-3.5-mini-instruct |Microsoft |MIT
|55.13% |64.22% |66.12% |52.00% |38.17% |64.93% |70.73% |
|5 |Phi-3-mini-128k-instruct
(Prompt)|https://huggingface.co/microsoft/Phi-3-mini-128k-instruct|Microsoft
|MIT |51.49% |67.60% |72.50% |41.12% |24.75% |44.07% |85.37% |

---------

Co-authored-by: Huanzhi (Hans) Mao <huanzhimao@gmail.com>
Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.65.0 to 4.66.3.
- [Release notes](https://github.com/tqdm/tqdm/releases)
- [Commits](tqdm/tqdm@v4.65.0...v4.66.3)

---
updated-dependencies:
- dependency-name: tqdm
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot added dependencies Pull requests that update a dependency file python Pull requests that update Python code labels Sep 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file python Pull requests that update Python code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants