Use compression when saving knowledge graphs #1916

Open
lryan599 opened this issue Feb 11, 2025 · 1 comment
Labels
enhancement New feature or request module-testsetgen Module testset generation

Comments

@lryan599
Contributor

Describe the Feature
I noticed that ragas dumps nodes and relationships as-is when saving the knowledge graph. Each relationship holds all the information of the nodes it connects, which creates a great deal of redundancy, especially in the summary_embedding field. So when saving relationships, wouldn't it be better to save only the IDs of the related nodes instead?

Also, there are compression methods that can be applied when saving embeddings. For instance, some RAG pipelines store embeddings as base64-encoded strings.
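To make the base64 idea concrete, here is a minimal sketch using only the standard library. It is not ragas code: the helper names `encode_embedding` and `decode_embedding` are hypothetical, and it assumes that storing embeddings as float32 is acceptable, which loses some precision compared with Python's 64-bit floats.

```python
import base64
import json
import math
import struct


def encode_embedding(values: list[float]) -> str:
    """Pack floats as little-endian float32 and return a base64 string."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return base64.b64encode(raw).decode("ascii")


def decode_embedding(text: str) -> list[float]:
    """Reverse encode_embedding: base64 string back to a list of floats."""
    raw = base64.b64decode(text)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", raw))


# Compare the JSON size of a plain float list with the base64 form.
embedding = [0.123456789 * i for i in range(1536)]
as_json_list = json.dumps(embedding)
as_base64 = json.dumps(encode_embedding(embedding))
restored = decode_embedding(json.loads(as_base64))

print(len(as_json_list), len(as_base64))
```

Each float32 costs 4 bytes, or a little over 5 characters after base64, whereas a float written out as decimal text in JSON usually takes well over 10 characters. Note that base64 by itself is an encoding rather than compression; the saving comes from storing the binary representation instead of decimal text.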

Why is the feature important for you?
Generating test sets takes a long time, so it is necessary to save the intermediate knowledge graphs. If something goes wrong, they can then be loaded directly instead of starting all over again.

@lryan599 lryan599 added the enhancement New feature or request label Feb 11, 2025
@sahusiddharth sahusiddharth added the module-testsetgen Module testset generation label Feb 11, 2025
@lryan599
Contributor Author

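It would be enough to add this serializer to class Relationship:

    from pydantic import BaseModel, field_serializer

    class Relationship(BaseModel):
        # ... existing fields stay unchanged; Node is defined in the same module ...

        @field_serializer("source", "target")
        def serialize_node(self, node: Node):
            # Emit only the node's ID instead of the full nested node.
            return node.id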
