@sjyangkevin sjyangkevin commented May 18, 2025

Motivation

This PR adds a batch_create_links method to the Weaviate hook. With it, users can create links between objects in batch through cross-references.

Close #42568

Test

Added the following two unit tests:

  • Ensure add_reference is called the right number of times.
  • Ensure the retry mechanism works.

In addition, a DAG was created that connects to Weaviate and links existing objects in two different collections.
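The two unit-test checks listed above can be sketched with unittest.mock. This is only an illustration, not the tests added in this PR: create_links is a hypothetical stand-in for the code under test, and its retry policy (a fixed number of attempts on ConnectionError) is an assumption, since the hook's actual retry configuration is not shown in this description.

```python
from unittest.mock import MagicMock


def create_links(add_reference, links, max_attempts=3):
    """Hypothetical stand-in: call add_reference once per link, retrying on ConnectionError."""
    for from_uuid, from_property, to_uuid in links:
        for attempt in range(1, max_attempts + 1):
            try:
                add_reference(from_uuid=from_uuid, from_property=from_property, to=to_uuid)
                break  # this link succeeded, move on to the next one
            except ConnectionError:
                if attempt == max_attempts:
                    raise  # out of attempts, surface the error


# Placeholder identifiers, not real Weaviate UUIDs.
LINKS = [("uuid-a", "hasCategory", "uuid-x"), ("uuid-b", "hasCategory", "uuid-y")]


def test_add_reference_called_once_per_link():
    add_reference = MagicMock()
    create_links(add_reference, LINKS)
    assert add_reference.call_count == len(LINKS)


def test_retry_after_transient_failure():
    # The first call fails once; the retry and the second link succeed.
    add_reference = MagicMock(side_effect=[ConnectionError("transient"), None, None])
    create_links(add_reference, LINKS)
    assert add_reference.call_count == len(LINKS) + 1
```

Mocking the client call keeps both checks independent of a running Weaviate instance; the DAG below covers the end-to-end path.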

Testing DAG code

from airflow.decorators import task


@task()
def batch_create_links():
    # Task-local imports.
    from airflow.providers.weaviate.hooks.weaviate import WeaviateHook
    from weaviate.classes.query import QueryReference
    import pandas as pd

    weaviate_hook = WeaviateHook()

    # One row per link: source object, target object, and the reference
    # property on the source object.
    weaviate_hook.batch_create_links(
        collection_name="JeopardyQuestion",
        data=pd.DataFrame.from_dict(
            {
                "from_uuid": ["5b118d91-7006-444f-a028-08372f1e7356", "00d38b07-ebd7-484a-a75c-2fa37ecfb29d"],
                "to_uuid": ["b801e25e-6c86-4ad6-a5c9-d14994375a71", "b0c80fc4-cc87-42a6-9324-f2e663d67266"],
                "from_property": ["hasCategory", "hasCategory"],
            }
        ),
        from_property_col="from_property",
        from_uuid_col="from_uuid",
        to_uuid_col="to_uuid",
    )

    collection = weaviate_hook.get_collection("JeopardyQuestion")

    # Fetch the objects together with their "hasCategory" references
    # to verify that the links exist.
    response = collection.query.fetch_objects(
        return_references=QueryReference(
            link_on="hasCategory",
            return_properties=["category"],
        )
    )
    print(len(response.objects))
    print(response)
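
The DAG above passes one row per link. As a rough sketch of how such a table could map onto the Weaviate v4 Python client's collection batch API, assuming the client's batch.dynamic() context manager and add_reference(from_uuid, from_property, to): this is not the hook's actual implementation, iter_links and add_links are hypothetical names, and a plain dict of columns stands in for the DataFrame.

```python
from typing import Any, Dict, Iterator, List, Tuple


def iter_links(
    columns: Dict[str, List[str]],
    from_property_col: str = "from_property",
    from_uuid_col: str = "from_uuid",
    to_uuid_col: str = "to_uuid",
) -> Iterator[Tuple[str, str, str]]:
    """Yield one (from_uuid, from_property, to_uuid) tuple per row."""
    from_uuids = columns[from_uuid_col]
    from_properties = columns[from_property_col]
    to_uuids = columns[to_uuid_col]
    if not (len(from_uuids) == len(from_properties) == len(to_uuids)):
        raise ValueError("all link columns must have the same length")
    yield from zip(from_uuids, from_properties, to_uuids)


def add_links(collection: Any, columns: Dict[str, List[str]]) -> int:
    """Queue one cross-reference per row and return how many were queued.

    `collection` is assumed to behave like a Weaviate v4 collection object.
    """
    queued = 0
    with collection.batch.dynamic() as batch:
        for from_uuid, from_property, to_uuid in iter_links(columns):
            batch.add_reference(
                from_uuid=from_uuid,
                from_property=from_property,
                to=to_uuid,
            )
            queued += 1
    return queued
```

Separating the row iteration from the batch calls keeps the column handling testable without a Weaviate connection.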

Test Output

Before adding the links, the references property in the QueryReturn is an empty dictionary.
[Screenshot from 2025-05-18 09-01-48]

@sjyangkevin changed the title from "Issues/42568/weaviate batch create links" to "Add batch_create_links method in weaviate hook" on May 18, 2025
@sjyangkevin force-pushed the issues/42568/weaviate-batch-create-links branch from 22d2de9 to 787029a on May 18, 2025 15:03
@sjyangkevin force-pushed the issues/42568/weaviate-batch-create-links branch from 787029a to 777677d on May 18, 2025 19:03
@sjyangkevin requested a review from shahar1 on May 18, 2025 19:03
@potiuk merged commit 8f48a3b into apache:main on May 19, 2025
64 checks passed
dadonnelly316 pushed a commit to dadonnelly316/airflow that referenced this pull request May 26, 2025
* add batch_create_links method

* redefine batch_create_links in weaviate hook

* add logging for failed reference

* refactor batch create links

* add tests and run precommit

* use self.log.error instead of print
sanederchik pushed a commit to sanederchik/airflow that referenced this pull request Jun 7, 2025