fix: filter out empty chunks in SentenceSplitter #1517

Open
wants to merge 1 commit into
base: main

Conversation

parhammmm
Contributor

No description provided.

changeset-bot bot commented Nov 21, 2024

🦋 Changeset detected

Latest commit: 764f0fb

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 29 packages
Name Type
@llamaindex/core Patch
@llamaindex/unit-test Patch
@llamaindex/doc Patch
@llamaindex/cloud Patch
@llamaindex/community Patch
llamaindex Patch
@llamaindex/node-parser Patch
@llamaindex/readers Patch
@llamaindex/anthropic Patch
@llamaindex/clip Patch
@llamaindex/deepinfra Patch
@llamaindex/huggingface Patch
@llamaindex/ollama Patch
@llamaindex/openai Patch
@llamaindex/portkey-ai Patch
@llamaindex/replicate Patch
@llamaindex/llama-parse-browser-test Patch
docs Patch
@llamaindex/cloudflare-worker-agent-test Patch
@llamaindex/next-agent-test Patch
@llamaindex/nextjs-edge-runtime-test Patch
@llamaindex/next-node-runtime-test Patch
@llamaindex/waku-query-engine-test Patch
@llamaindex/autotool Patch
@llamaindex/experimental Patch
@llamaindex/autotool-01-node-example Patch
@llamaindex/autotool-02-next-example Patch
@llamaindex/groq Patch
@llamaindex/vllm Patch

vercel bot commented Nov 21, 2024

The latest updates on your projects.

Name Status Preview Comments Updated (UTC)
legacy-llama-index-ts-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Nov 21, 2024 0:49am

vercel bot commented Nov 21, 2024

@parhammmm is attempting to deploy a commit to the LlamaIndex Team on Vercel.

A member of the Team first needs to authorize it.

pkg-pr-new bot commented Nov 21, 2024

@llamaindex/autotool

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/autotool@1517

@llamaindex/cloud

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/cloud@1517

@llamaindex/community

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/community@1517

@llamaindex/core

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/core@1517

@llamaindex/env

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/env@1517

@llamaindex/experimental

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/experimental@1517

llamaindex

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/llamaindex@1517

@llamaindex/node-parser

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/node-parser@1517

@llamaindex/readers

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/readers@1517

@llamaindex/wasm-tools

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/wasm-tools@1517

@llamaindex/workflow

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/workflow@1517

@llamaindex/anthropic

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/anthropic@1517

@llamaindex/clip

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/clip@1517

@llamaindex/deepinfra

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/deepinfra@1517

@llamaindex/groq

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/groq@1517

@llamaindex/ollama

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/ollama@1517

@llamaindex/huggingface

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/huggingface@1517

@llamaindex/openai

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/openai@1517

@llamaindex/portkey-ai

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/portkey-ai@1517

@llamaindex/replicate

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/replicate@1517

@llamaindex/vllm

pnpm add https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/vllm@1517

commit: 764f0fb

Member

@himself65 himself65 left a comment

Could you please add a test for this?

Member

@himself65 himself65 left a comment

I couldn't find a case where the splits come out empty. Could you please add some tests?

@parhammmm
Contributor Author

@himself65 this is the case where it happened: https://gist.github.com/parhammmm/c69e37420c92cb85db788454f2a9fe27

It's a bit of an edge case, and I found a way to work around it. I'll need to find some time to figure out LlamaIndex's test suite, and then I'll add a test.
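
For reference, the kind of regression check requested above could look roughly like the sketch below. It is a self-contained illustration and not the change in this PR: `filterEmptyChunks` and `naiveParagraphSplit` are hypothetical names, and the toy splitter only stands in for `SentenceSplitter` to show how consecutive separators can yield empty strings.

```typescript
// Hypothetical helper (not the actual diff in this PR): drop chunks that are
// empty or contain only whitespace.
function filterEmptyChunks(chunks: string[]): string[] {
  return chunks.filter((chunk) => chunk.trim().length > 0);
}

// Toy stand-in for a text splitter (an assumption for illustration only).
// Splitting on a blank-line separator produces empty strings whenever the
// input has consecutive or trailing separators.
function naiveParagraphSplit(text: string): string[] {
  return text.split("\n\n");
}

// Consecutive and trailing separators produce empty entries.
const rawChunks = naiveParagraphSplit("a\n\n\n\nb\n\n");

// After filtering, only the chunks with real content remain.
const cleanChunks = filterEmptyChunks(rawChunks);
console.log(rawChunks.length, cleanChunks);
```

A real test would call `SentenceSplitter` with the input from the gist and assert that none of the returned chunks is empty.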


2 participants