feat: Vector Search #2006

Merged
merged 70 commits into from
Apr 2, 2024
Changes from 1 commit
Commits
70 commits
36a50cb
Implement vector field type
wu-hui Sep 15, 2023
415c94e
Merge branch 'main' into wuandy/VectorType
wu-hui Sep 15, 2023
3551acb
test
wu-hui Sep 19, 2023
248af30
Merge branch 'main' into wuandy/VectorType
wu-hui Nov 27, 2023
6042c4d
Added extra logs and vector type test
wu-hui Nov 29, 2023
764800b
Add vector watch test
wu-hui Dec 5, 2023
93ae1d7
Add vectorvalue to dts.
wu-hui Jan 17, 2024
a3d57e3
get d.ts right.
wu-hui Jan 10, 2024
68cafe1
VectorQuery and QueryUtil setup
wu-hui Jan 10, 2024
f8f1ecb
Stage proto
wu-hui Jan 11, 2024
9f973a6
Actual serialization
wu-hui Jan 12, 2024
57a3f25
Update proto
wu-hui Jan 15, 2024
e15721d
More integration tests
wu-hui Jan 15, 2024
3898181
fix some tests. Also what to do with mismatch dimentions
wu-hui Jan 15, 2024
069a3e4
Add unit tests and minor fixes
wu-hui Jan 16, 2024
4b8b624
Add comments and delete stream()
wu-hui Jan 17, 2024
cff8b5b
Fix rebase.
wu-hui Jan 17, 2024
2f1b964
Address Mark's comments.
wu-hui Jan 23, 2024
9ebd42a
Merge branch 'main' into wuandy/VectorType
wu-hui Jan 23, 2024
1132ff3
get d.ts right.
wu-hui Jan 10, 2024
ddec250
VectorQuery and QueryUtil setup
wu-hui Jan 10, 2024
86ce16e
Stage proto
wu-hui Jan 11, 2024
df0b01a
Actual serialization
wu-hui Jan 12, 2024
1755a4e
Update proto
wu-hui Jan 15, 2024
8da86bc
More integration tests
wu-hui Jan 15, 2024
b8a043c
fix some tests. Also what to do with mismatch dimentions
wu-hui Jan 15, 2024
629ed38
Add unit tests and minor fixes
wu-hui Jan 16, 2024
5ca985f
Add comments and delete stream()
wu-hui Jan 17, 2024
66cab1f
Fix rebase.
wu-hui Jan 17, 2024
30f9977
Address comments
wu-hui Jan 23, 2024
2ae6cfb
Merge remote-tracking branch 'origin/wuandy/FindNearestImpl' into wua…
wu-hui Jan 23, 2024
a6af2da
Address comments
wu-hui Jan 23, 2024
1943803
No retry with cursor for vector for now.
wu-hui Jan 29, 2024
e535fd1
Fix error message test
wu-hui Jan 31, 2024
0af118f
Fix broken array equality
wu-hui Jan 31, 2024
41ab580
Implementing IndexTestHelper and updating findNearest tests to use th…
MarkDuckworth Feb 29, 2024
bcf7f45
Add dot product support.
MarkDuckworth Mar 4, 2024
625e8d7
Merge branch 'main' of github.com:googleapis/nodejs-firestore into ma…
MarkDuckworth Mar 4, 2024
b512348
Adding missing file
MarkDuckworth Mar 4, 2024
0a9ab60
Lint
MarkDuckworth Mar 4, 2024
6b52042
Updating API reference docs.
MarkDuckworth Mar 5, 2024
afc4f99
Fixing API reference docs for TSDoc output on CGC.
MarkDuckworth Mar 5, 2024
3d0267f
Fix API reference docs.
MarkDuckworth Mar 6, 2024
c3a337e
Supporting Vector field order onSnapshot.
MarkDuckworth Mar 11, 2024
f0c6228
Add missing map-type file.
MarkDuckworth Mar 11, 2024
8436c1e
Cleanup and linting.
MarkDuckworth Mar 11, 2024
2fcf70a
Supporting Proto3 JSON with vector.
MarkDuckworth Mar 12, 2024
adb01c1
Additional test for vector in a map.
MarkDuckworth Mar 13, 2024
d916197
Merge branch 'main' of github.com:googleapis/nodejs-firestore into ma…
MarkDuckworth Mar 14, 2024
d3254cd
Update deps.
MarkDuckworth Mar 14, 2024
f0ed9bc
Add DOT_PRODUCT to d.ts
MarkDuckworth Mar 14, 2024
4791b1b
Merge branch 'main' of github.com:googleapis/nodejs-firestore into ma…
MarkDuckworth Mar 15, 2024
f2986f8
Merge branch 'main' of github.com:googleapis/nodejs-firestore into ma…
MarkDuckworth Mar 15, 2024
3a9e2ab
Merge branch 'main' of github.com:googleapis/nodejs-firestore into ma…
MarkDuckworth Mar 15, 2024
0d1b967
Staging protos
MarkDuckworth Mar 15, 2024
9d598a1
Types update for admin protos.
MarkDuckworth Mar 15, 2024
4513c24
License header fixes.
MarkDuckworth Mar 19, 2024
35303af
Fix docs CI for jsdoc build.
MarkDuckworth Mar 19, 2024
a40366a
Updated api-report
MarkDuckworth Mar 21, 2024
6e3796d
More integration test cases.
MarkDuckworth Mar 21, 2024
e5a3d9c
Clarifying supported range on options.limit
MarkDuckworth Mar 21, 2024
3840563
Merge branch 'main' into markduckworth/findnearest
MarkDuckworth Mar 25, 2024
970a3f1
Revert changes from staging protos that were not applied in stable.
MarkDuckworth Mar 25, 2024
b98c3e8
Improving test comments/name for PR feedback.
MarkDuckworth Mar 25, 2024
563ef55
Merge Query Profile to Find Nearest
MarkDuckworth Mar 26, 2024
ee3f1d6
Merge branch 'main' of github.com:googleapis/nodejs-firestore into ma…
MarkDuckworth Mar 27, 2024
ccabafd
Revert rename of internal toProto method name. It is marked internal,…
MarkDuckworth Mar 27, 2024
43a5cd1
Update api-report.md
MarkDuckworth Mar 27, 2024
a0f103e
Re-adding two internal methods to the Query class to minimize the ris…
MarkDuckworth Mar 27, 2024
bb56524
Merge branch 'main' of github.com:googleapis/nodejs-firestore into ma…
MarkDuckworth Mar 28, 2024
102 changes: 62 additions & 40 deletions dev/src/convert.ts
@@ -20,6 +20,7 @@ import {ApiMapValue, ProtobufJsValue} from './types';
import {validateObject} from './validate';

import api = google.firestore.v1;
import {RESERVED_MAP_KEY, RESERVED_MAP_KEY_VECTOR_VALUE} from "./map-type";

/*!
* @module firestore/convert
@@ -112,53 +113,74 @@ function bytesFromJson(bytesValue: string | Uint8Array): Uint8Array {
* @return The string value for 'valueType'.
*/
export function detectValueType(proto: ProtobufJsValue): string {
let valueType: string | undefined;

if (proto.valueType) {
return proto.valueType;
valueType = proto.valueType;
}
else {

const detectedValues: string[] = [];
const detectedValues: string[] = [];

if (proto.stringValue !== undefined) {
detectedValues.push('stringValue');
}
if (proto.booleanValue !== undefined) {
detectedValues.push('booleanValue');
}
if (proto.integerValue !== undefined) {
detectedValues.push('integerValue');
}
if (proto.doubleValue !== undefined) {
detectedValues.push('doubleValue');
}
if (proto.timestampValue !== undefined) {
detectedValues.push('timestampValue');
}
if (proto.referenceValue !== undefined) {
detectedValues.push('referenceValue');
}
if (proto.arrayValue !== undefined) {
detectedValues.push('arrayValue');
}
if (proto.nullValue !== undefined) {
detectedValues.push('nullValue');
}
if (proto.mapValue !== undefined) {
detectedValues.push('mapValue');
}
if (proto.geoPointValue !== undefined) {
detectedValues.push('geoPointValue');
}
if (proto.bytesValue !== undefined) {
detectedValues.push('bytesValue');
if (proto.stringValue !== undefined) {
detectedValues.push('stringValue');
}
if (proto.booleanValue !== undefined) {
detectedValues.push('booleanValue');
}
if (proto.integerValue !== undefined) {
detectedValues.push('integerValue');
}
if (proto.doubleValue !== undefined) {
detectedValues.push('doubleValue');
}
if (proto.timestampValue !== undefined) {
detectedValues.push('timestampValue');
}
if (proto.referenceValue !== undefined) {
detectedValues.push('referenceValue');
}
if (proto.arrayValue !== undefined) {
detectedValues.push('arrayValue');
}
if (proto.nullValue !== undefined) {
detectedValues.push('nullValue');
}
if (proto.mapValue !== undefined) {
detectedValues.push('mapValue');
}
if (proto.geoPointValue !== undefined) {
detectedValues.push('geoPointValue');
}
if (proto.bytesValue !== undefined) {
detectedValues.push('bytesValue');
}

if (detectedValues.length !== 1) {
throw new Error(
`Unable to infer type value from '${JSON.stringify(proto)}'.`
);
}

valueType = detectedValues[0];
}

if (detectedValues.length !== 1) {
throw new Error(
`Unable to infer type value from '${JSON.stringify(proto)}'.`
);
// Special handling of mapValues used to represent other data types
if (valueType === "mapValue") {
const fields = proto.mapValue?.fields;
if (fields) {
const props = Object.keys(fields);
if (
props.indexOf(RESERVED_MAP_KEY) !== -1 &&
detectValueType(fields[RESERVED_MAP_KEY]) === "stringValue" &&
fields[RESERVED_MAP_KEY].stringValue === RESERVED_MAP_KEY_VECTOR_VALUE
) {
valueType = "vectorValue";
}
}
}

return detectedValues[0];
return valueType;
}

/**
@@ -199,7 +221,7 @@ export function valueFromJson(fieldValue: api.IValue): api.IValue {
},
};
}
case 'mapValue': {
case 'mapValue':{
const mapValue: ApiMapValue = {};
const fields = fieldValue.mapValue!.fields;
if (fields) {
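For readers following the `detectValueType` change, here is a minimal sketch of the vector check in isolation. It assumes a simplified `SketchValue` type in place of `ProtobufJsValue`, and the helper name `detectMapKind` is hypothetical. The two constants use the values shown in the lines this PR removes from `serializer.ts`; the contents of the new `map-type.ts` are not shown in this diff.

```typescript
// Hypothetical, simplified stand-in for the proto value type used in the diff.
interface SketchValue {
  stringValue?: string;
  doubleValue?: number;
  arrayValue?: {values?: SketchValue[]};
  mapValue?: {fields?: {[key: string]: SketchValue}};
}

// Values taken from the constants removed from serializer.ts in this PR.
const RESERVED_MAP_KEY = '__type__';
const RESERVED_MAP_KEY_VECTOR_VALUE = '__vector__';

// Returns 'vectorValue' when a map carries the reserved type marker,
// and 'mapValue' for every other map (including maps without fields).
function detectMapKind(value: SketchValue): 'vectorValue' | 'mapValue' {
  const marker = value.mapValue?.fields?.[RESERVED_MAP_KEY];
  if (
    marker !== undefined &&
    marker.stringValue === RESERVED_MAP_KEY_VECTOR_VALUE
  ) {
    return 'vectorValue';
  }
  return 'mapValue';
}

// Usage: a map tagged with the reserved key is reported as a vector.
const taggedMap: SketchValue = {
  mapValue: {
    fields: {
      [RESERVED_MAP_KEY]: {stringValue: RESERVED_MAP_KEY_VECTOR_VALUE},
      value: {arrayValue: {values: [{doubleValue: 1}, {doubleValue: 2}]}},
    },
  },
};
const plainMap: SketchValue = {
  mapValue: {fields: {hello: {stringValue: 'world'}}},
};
console.log(detectMapKind(taggedMap), detectMapKind(plainMap));
```

Doing the check once in `detectValueType` means that both the serializer and the comparator can switch on `'vectorValue'` instead of each re-inspecting the map.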
31 changes: 28 additions & 3 deletions dev/src/order.ts
@@ -34,7 +34,8 @@ enum TypeOrder {
REF = 6,
GEO_POINT = 7,
ARRAY = 8,
OBJECT = 9,
VECTOR = 9,
OBJECT = 10,
}

/*!
@@ -67,6 +68,8 @@ function typeOrder(val: api.IValue): TypeOrder {
return TypeOrder.REF;
case 'mapValue':
return TypeOrder.OBJECT;
case 'vectorValue':
return TypeOrder.VECTOR;
default:
throw new Error('Unexpected value type: ' + valueType);
}
@@ -225,6 +228,23 @@ function compareObjects(left: ApiMapValue, right: ApiMapValue): number {
return primitiveComparator(leftKeys.length, rightKeys.length);
}

/*!
* @private
* @internal
*/
function compareVectors(left: ApiMapValue, right: ApiMapValue): number {
// A vector is stored as a map, but only the array under its "value" key is compared.
const leftArray = left?.["value"]?.arrayValue?.values ?? [];
const rightArray = right?.["value"]?.arrayValue?.values ?? [];

const lengthCompare = primitiveComparator(leftArray.length, rightArray.length);
if (lengthCompare !== 0) {
return lengthCompare;
}

return compareArrays(leftArray, rightArray);
}

/*!
* @private
* @internal
@@ -264,8 +284,13 @@ export function compare(left: api.IValue, right: api.IValue): number {
);
case TypeOrder.OBJECT:
return compareObjects(
left.mapValue!.fields || {},
right.mapValue!.fields || {}
left.mapValue!.fields || {},
right.mapValue!.fields || {}
);
case TypeOrder.VECTOR:
return compareVectors(
left.mapValue!.fields || {},
right.mapValue!.fields || {}
);
default:
throw new Error(`Encountered unknown type order: ${leftType}`);
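The ordering rule in `compareVectors` (shorter vectors first, then element by element) can be sketched on plain number arrays. This is a simplified illustration, not the library code: the real function works on proto values and delegates to `compareArrays`. The names `compareNumbers` and `compareNumberVectors` are hypothetical, and NaN handling is ignored here.

```typescript
// Three-way comparison of two numbers (NaN is not handled in this sketch).
function compareNumbers(a: number, b: number): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Orders vectors by dimension first; vectors of equal dimension are then
// compared element by element.
function compareNumberVectors(left: number[], right: number[]): number {
  const lengthCompare = compareNumbers(left.length, right.length);
  if (lengthCompare !== 0) {
    return lengthCompare;
  }
  for (let i = 0; i < left.length; i++) {
    const elementCompare = compareNumbers(left[i], right[i]);
    if (elementCompare !== 0) {
      return elementCompare;
    }
  }
  return 0;
}

// The vectors written by the 'orders vector field correctly' system test.
const testVectors: number[][] = [
  [1, 2, 3],
  [1, 2],
  [2, 2],
  [100, 2, 3, 4, 5],
  [1, 2, 3, 4, 5],
  [1, 2, 100, 4, 5],
  [1, 2, 3, 4],
];

const sortedVectors = [...testVectors].sort(compareNumberVectors);
console.log(JSON.stringify(sortedVectors));
// prints [[1,2],[2,2],[1,2,3],[1,2,3,4],[1,2,3,4,5],[1,2,100,4,5],[100,2,3,4,5]]
```

Because `VECTOR = 9` sits below `OBJECT = 10` in `TypeOrder`, all of these vectors sort before the two plain maps that the test also writes.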
30 changes: 11 additions & 19 deletions dev/src/serializer.ts
@@ -18,17 +18,18 @@ import {DocumentData} from '@google-cloud/firestore';

import * as proto from '../protos/firestore_v1_proto_api';

import {detectValueType} from './convert';
import {DeleteTransform, FieldTransform, VectorValue} from './field-value';
import {GeoPoint} from './geo-point';
import {DocumentReference, Firestore} from './index';
import {FieldPath, QualifiedResourcePath} from './path';
import {Timestamp} from './timestamp';
import {ApiMapValue, ValidationOptions} from './types';
import {ApiMapValue, ProtobufJsValue, ValidationOptions} from './types';
import {isEmpty, isObject, isPlainObject} from './util';
import {customObjectMessage, invalidArgumentMessage} from './validate';

import api = proto.google.firestore.v1;
import {detectValueType} from "./convert";
import {RESERVED_MAP_KEY, RESERVED_MAP_KEY_VECTOR_VALUE, VECTOR_MAP_VECTORS_KEY} from "./map-type";

/**
* The maximum depth of a Firestore object.
@@ -38,10 +39,6 @@ import api = proto.google.firestore.v1;
*/
const MAX_DEPTH = 20;

const RESERVED_MAP_KEY = '__type__';
const RESERVED_MAP_KEY_VECTOR_VALUE = '__vector__';
const VECTOR_MAP_VECTORS_KEY = 'value';

/**
* An interface for Firestore types that can be serialized to Protobuf.
*
@@ -298,24 +295,19 @@ export class Serializer {
case 'mapValue': {
const fields = proto.mapValue!.fields;
if (fields) {
const props = Object.keys(fields);
if (
props.indexOf(RESERVED_MAP_KEY) !== -1 &&
this.decodeValue(fields[RESERVED_MAP_KEY]) ===
RESERVED_MAP_KEY_VECTOR_VALUE
) {
return VectorValue._fromProto(fields[VECTOR_MAP_VECTORS_KEY]);
} else {
const obj: DocumentData = {};
for (const prop of Object.keys(fields)) {
obj[prop] = this.decodeValue(fields[prop]);
}
return obj;
const obj: DocumentData = {};
for (const prop of Object.keys(fields)) {
obj[prop] = this.decodeValue(fields[prop]);
}
return obj;
} else {
return {};
}
}
case 'vectorValue': {
const fields = proto.mapValue!.fields!;
return VectorValue._fromProto(fields[VECTOR_MAP_VECTORS_KEY]);
}
case 'geoPointValue': {
return GeoPoint.fromProto(proto.geoPointValue!);
}
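As a rough illustration of the new `'vectorValue'` branch in `decodeValue`, the sketch below reads the array stored under the `value` key and returns plain numbers. The `SketchValue` type and the `decodeVector` helper are hypothetical stand-ins. The real code returns a `VectorValue` via `VectorValue._fromProto`, whose body is not shown in this diff, and the error thrown for non-double elements is an assumption of this sketch.

```typescript
// Hypothetical, simplified stand-in for the proto value type used in the diff.
interface SketchValue {
  stringValue?: string;
  doubleValue?: number;
  arrayValue?: {values?: SketchValue[]};
  mapValue?: {fields?: {[key: string]: SketchValue}};
}

// Value taken from the constant removed from serializer.ts in this PR.
const VECTOR_MAP_VECTORS_KEY = 'value';

// Extracts the numeric components of a vector stored as a tagged map.
// A map without the 'value' key decodes to an empty vector in this sketch.
function decodeVector(value: SketchValue): number[] {
  const elements =
    value.mapValue?.fields?.[VECTOR_MAP_VECTORS_KEY]?.arrayValue?.values ?? [];
  return elements.map(element => {
    if (element.doubleValue === undefined) {
      // Assumption: this sketch rejects non-double elements outright.
      throw new Error('Vector elements must be doubles.');
    }
    return element.doubleValue;
  });
}

// Usage: the wire shape of a three-dimensional vector.
const storedVector: SketchValue = {
  mapValue: {
    fields: {
      __type__: {stringValue: '__vector__'},
      value: {
        arrayValue: {
          values: [{doubleValue: 1}, {doubleValue: 2}, {doubleValue: 3}],
        },
      },
    },
  },
};
console.log(decodeVector(storedVector)); // prints [ 1, 2, 3 ]
```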
38 changes: 38 additions & 0 deletions dev/system-test/firestore.ts
@@ -3005,6 +3005,44 @@ describe('Query class', () => {

unsubscribe();
});

it('orders vector field correctly', async () => {
await randomCol.add({"embedding": {"HELLO": "WORLD"}});
await randomCol.add({"embedding": {"hello": "world"}});
await randomCol.add({"embedding": FieldValue.vector([1, 2, 3])});
await randomCol.add({"embedding": FieldValue.vector([1, 2])});
await randomCol.add({"embedding": FieldValue.vector([2, 2])});
await randomCol.add({"embedding": FieldValue.vector([100, 2, 3, 4, 5])});
await randomCol.add({"embedding": FieldValue.vector([1, 2, 3, 4, 5])});
await randomCol.add({"embedding": FieldValue.vector([1, 2, 100, 4, 5])});
await randomCol.add({"embedding": FieldValue.vector([1, 2, 3, 4])});

const orderedQuery = randomCol.orderBy("embedding");

const unsubscribe = orderedQuery.onSnapshot(
snapshot => {
currentDeferred.resolve(snapshot);
},
err => {
currentDeferred.reject!(err);
}
);

const watchSnapshot = await waitForSnapshot();
unsubscribe();
const getSnapshot = await orderedQuery.get();

console.log("---- watch snapshot ----")
watchSnapshot.docs.forEach(ds => console.log(ds.get("embedding")));

console.log("---- get snapshot ----")
getSnapshot.docs.forEach(ds => console.log(ds.get("embedding")));

snapshotsEqual(watchSnapshot, {
docs: getSnapshot.docs,
docChanges: getSnapshot.docChanges()
});
});
});

(process.env.FIRESTORE_EMULATOR_HOST === undefined