Enhance PostgresType to support array types (serialization, not deserialization) #2017

Merged

ExtReMLapin
Contributor

@ExtReMLapin ExtReMLapin commented Feb 27, 2025

Fixes an issue that caused arrays to be returned to the client as strings, because array types were not implemented in the Postgres types.

Serialization is implemented, deserialization is NOT implemented.
This is the first code I've written in Java, so I'm all ears and open to fixing things in this PR.

Fixed my NPM install, so the pre-commit check is fixed as well.
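
For readers unfamiliar with what "serialization" means here: in PostgreSQL's text format, an array value travels as a brace-delimited literal such as `{0.25,1.5}`. The following is a minimal Python sketch of that rendering for flat arrays. It is not the Java code from this PR; the function name `to_pg_array_literal` is hypothetical, and nested arrays and special float values are deliberately left out.

```python
from typing import Any, Sequence


def to_pg_array_literal(values: Sequence[Any]) -> str:
    """Render a flat sequence as a PostgreSQL text-format array literal."""

    def render(item: Any) -> str:
        if item is None:
            return "NULL"
        # bool must be checked before numbers: bool is a subclass of int.
        if isinstance(item, bool):
            return "t" if item else "f"
        if isinstance(item, (int, float)):
            return repr(item)
        # Quote everything else; escape backslashes and double quotes.
        text = str(item).replace("\\", "\\\\").replace('"', '\\"')
        return f'"{text}"'

    return "{" + ",".join(render(v) for v in values) + "}"
```

For example, `to_pg_array_literal([0.25, 1.5])` returns `{0.25,1.5}`, and an empty list renders as `{}`.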

@ExtReMLapin
Contributor Author

Not going to lie, I only tested it on a long array of 1024 floats; I didn't have the data to test with ints, bools, etc.

}
} else {
// Default to text array for empty lists
valueType = PostgresType.ARRAY_TEXT;
Contributor Author

Maybe do something else, not sure
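
To make the trade-off in this review comment concrete, here is a small Python sketch of the inference idea: pick the array type from the first non-null element, and fall back to a text array when the list is empty. This is an illustration, not the PR's Java code. Only `ARRAY_TEXT` appears in the diff; the other type names are hypothetical, and the numbers are the standard PostgreSQL array type OIDs from `pg_type`.

```python
from typing import Any, Sequence, Tuple

# (name, PostgreSQL array OID). Only ARRAY_TEXT is named in the diff;
# the other names are hypothetical placeholders for this sketch.
ARRAY_TEXT = ("ARRAY_TEXT", 1009)        # text[]
ARRAY_TYPES = {
    bool: ("ARRAY_BOOLEAN", 1000),       # bool[]
    int: ("ARRAY_LONG", 1016),           # int8[]
    float: ("ARRAY_DOUBLE", 1022),       # float8[]
    str: ARRAY_TEXT,
}


def infer_array_type(values: Sequence[Any]) -> Tuple[str, int]:
    """Choose an array type from the first non-null element."""
    for item in values:
        if item is not None:
            # Exact type lookup, so True/False never match the int entry.
            return ARRAY_TYPES.get(type(item), ARRAY_TEXT)
    # Empty (or all-null) list: nothing to inspect, default to text[].
    return ARRAY_TEXT
```

The cost of the fallback is that an empty list is always advertised as `text[]`, whatever the property's declared type is. One alternative, if the schema type is available at that point, would be to use it instead of inspecting the elements.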

@lvca lvca requested a review from robfrank February 27, 2025 20:38
@lvca lvca added the enhancement New feature or request label Feb 27, 2025
@lvca lvca added this to the 25.3.1 milestone Feb 27, 2025
@robfrank
Collaborator

Can you give me an example of the data to store in the DB?
Some SQL commands to create the data would help, so I can add relevant tests.

@ExtReMLapin
Contributor Author

It's the data from this thread: #2005

https://limewire.com/d/10245dd2-fd50-4d33-865e-3a3969026047#_n2Dys-Z8KKH032BHBltTagwu_DpoRev8_PKBWQLKiA

Python code:

(Just install the package with pip install psycopg, not the binary package.)

import time

import psycopg

# Connect through the Postgres wire protocol endpoint.
with psycopg.connect(
    user="root",
    password="rootroot",
    host="localhost",
    port="5432",
    dbname="ORANO_DOC",
    sslmode="disable",
) as connection:
    connection.autocommit = True
    _time = time.time()
    with connection.cursor() as cursor:
        # Return each embedding vector together with the RID of its target.
        cursor.execute("""MATCH {type: EMBEDDING, as: embb}-->{ as: target}
RETURN embb.vector, target.asRID()""")
        results = cursor.fetchall()
        print(results[0])

    # Total time for the query and fetch.
    print('time', time.time() - _time)
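
A quick way to tell whether the fix is in effect on the client side is to check the type of the first column instead of eyeballing the printed row. The helper below is a hypothetical sketch (the name `is_float_vector` is not from the thread); it assumes the driver decodes a float array column into a Python list of floats, and that the unfixed server hands back a plain string.

```python
from typing import Any


def is_float_vector(value: Any) -> bool:
    """True if value is a list whose items are all Python floats."""
    return isinstance(value, list) and all(isinstance(x, float) for x in value)
```

With the script above, `is_float_vector(results[0][0])` should be true once arrays are serialized as arrays, and false when the vector arrives as a string.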

@ExtReMLapin
Contributor Author

ExtReMLapin commented Feb 28, 2025

As for the data type, I ran tests with ARRAY_OF_FLOATS.

@ExtReMLapin
Contributor Author

> Not going to lie, I only tested it on a long array of 1024 floats; I didn't have the data to test with ints, bools, etc.

> As for the data type, I ran tests with ARRAY_OF_FLOATS.

Now covered by tests: #2039

Collaborator

@robfrank robfrank left a comment

LGTM

@robfrank robfrank merged commit 444bbc7 into ArcadeData:main Mar 4, 2025
2 of 3 checks passed