[Bug] [Knowledge] The embedding data is not deleted from the runtime directory "pilot/data" when the knowledge space is deleted #2182

Closed
4 of 15 tasks
toralee opened this issue Dec 6, 2024 · 6 comments
Labels
bug Something isn't working Waiting for reply

Comments

@toralee
Contributor

toralee commented Dec 6, 2024

Search before asking

  • I had searched in the issues and found no similar issues.

Operating system information

Linux

Python version information

3.10

DB-GPT version

latest release

Related scenes

  • Chat Data
  • Chat Excel
  • Chat DB
  • Chat Knowledge
  • Model Management
  • Dashboard
  • Plugins

Installation Information

Device information

GPU 0, 24GB memory

Models information

text2vec-large-chinese
QWen2.5-14B-Instruct

What happened

Deleting a knowledge space, whether from the Web UI or with the dbgpt CLI, does not delete the corresponding pilot data. This causes a failure when a space with the same name is re-created.

What you expected to happen

When a space is deleted, the corresponding pilot data should also be deleted from the filesystem. If it is not, a further consequence may be that the leftover data reduces the available disk space.

How to reproduce

Create a knowledge space with a name such as 'testknowlegeSpace', delete that space, and then re-create a space with the same name 'testknowlegeSpace'. On opening the details of the re-created space, the content fails to load.

Additional context

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
@toralee toralee added bug Something isn't working Waiting for reply labels Dec 6, 2024
@Aries-ckt
Collaborator

What's your dbgpt version, and what vector type do you use?

@toralee
Contributor Author

toralee commented Dec 6, 2024

What's your dbgpt version, and what vector type do you use?

Latest version: 0.6.2. I tried the CLI command dbgpt knowledge load and used the default vector type, 'Chroma'.

@Aries-ckt
Collaborator

Is it like this every time? Is it the same when creating a new knowledge space and deleting it?

@toralee
Contributor Author

toralee commented Dec 7, 2024

Is it like this every time? Is it the same when creating a new knowledge space and deleting it?

Yes, it happens every time. When a new knowledge space is created and then deleted, the vector store ([spacename].vectordb) is deleted successfully, but the upload directory named after the space is left behind.

I tried to resolve this in route.post('/knowledge/space/delete') by calling rmtree on the directory, which should be OK.
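For illustration, the rmtree approach could look roughly like the sketch below. This is not DB-GPT's actual code: the function name remove_space_data and the data_root parameter are hypothetical, and it assumes the uploaded files live in a directory named after the space directly under the data root (for example pilot/data).

```python
import shutil
from pathlib import Path


def remove_space_data(data_root: str, space_name: str) -> bool:
    """Delete the leftover upload directory of a knowledge space.

    Hypothetical helper: assumes the layout <data_root>/<space_name>.
    Returns True if a directory was removed, False if none existed.
    """
    root = Path(data_root).resolve()
    target = (root / space_name).resolve()

    # Refuse names that resolve to the data root itself or escape it
    # (e.g. "" or "../other"), so a bad name cannot delete unrelated files.
    if target == root or root not in target.parents:
        raise ValueError(f"unsafe space name: {space_name!r}")

    if not target.is_dir():
        return False

    shutil.rmtree(target)
    return True
```

A delete handler could call something like remove_space_data("pilot/data", space_name) after the vector store has been removed. Returning False instead of raising when the directory is already gone keeps a repeated delete harmless.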

In addition, within the same space, the following steps cause an error: first create (load) the space and a doc file, then delete the doc file, then re-load the doc file into the space. The error is 'ERROR document embedding, ******* Collection 6309672b-8b70-4e35-a087-e90ae1f4b59b does not exist.' Maybe that is reasonable, if re-loading the same doc file right after deleting it from the space is forbidden.

@Aries-ckt
Collaborator

That's weird. I see you have submitted a pull request with your solution for this issue; we will check it. Thanks for the contribution.

@fangyinc
Collaborator

Closed in #2185
