-
Notifications
You must be signed in to change notification settings - Fork 150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[feat] ppl #2094
base: main
Are you sure you want to change the base?
[feat] ppl #2094
Conversation
it('arrayContainsAll works', async () => { | ||
const results = await randomCol | ||
.pipeline() | ||
.where(Field.of('tags').arrayContainsAll('adventure', 'magic')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish this read .where(field('tags').arrayContains...
or even .where(Field.named('tags').arrayContains...
. As I read through the code, I continue to read it as "where field of something" but the meaning is "where field something".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for Java Andy added an function overloads that covers the "common" usecase -- which is the case that the first parameter is a string specifying the name of the field. Is that possible to do for TS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it adds confusion when reading one line
.where(arrayContains('tags', 'comedy'))
. The affected field is the argument.
And then reading another line
Field.of('tags').arrayContainsAll('adventure', 'magic')
. The affected field is this
.
I'm assuming you're only switching around for testing completeness, but also I'm just making sure we need to support both patterns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Field.of is a Java convention, i am not opposed to make it Field.named() or something else. I think we need more inputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just starting review. Will resume next week, but providing some thoughts now.
dev/system-test/pipeline.ts
Outdated
const results = await randomCol | ||
.pipeline() | ||
.select( | ||
Field.of('tags').arrayConcat('newTag1', 'newTag2').as('modifiedTags') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if arrayConcat(['newTag1', 'newTag2'])
would be better since it is the concatenation of multiple arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
dev/system-test/pipeline.ts
Outdated
.select( | ||
arrayFilter(Field.of('tags'), arrayElement().eq('comedy')).as( | ||
'filteredTags' | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if we can have the array filter syntax mirror the query filter syntax.
Field.of('tags').arrayFilter().where(eq(arrayElement(), 'comedy')))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be written as:
Field.of('tags')
.arrayFilter(arrayElement().eq('comedy'))
.as('filteredTags')
which is not bad?
dev/system-test/pipeline.ts
Outdated
arrayTransform( | ||
Field.of('tags'), | ||
arrayElement().strConcat('transformed') | ||
).as('transformedTags') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, except mirror query select syntax.
Field.of('tags').arrayTransform().select(arrayElement().strConcat('transformed'))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, it looks decent after i rewrite it with fluent builder.
it('arrayContainsAll works', async () => { | ||
const results = await randomCol | ||
.pipeline() | ||
.where(Field.of('tags').arrayContainsAll('adventure', 'magic')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it adds confusion when reading one line
.where(arrayContains('tags', 'comedy'))
. The affected field is the argument.
And then reading another line
Field.of('tags').arrayContainsAll('adventure', 'magic')
. The affected field is this
.
I'm assuming you're only switching around for testing completeness, but also I'm just making sure we need to support both patterns?
it('testRegexMatches', async () => { | ||
const results = await randomCol | ||
.pipeline() | ||
.where(regexMatch('title', '.*(?i)(the|of).*')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand the difference between regexContains
and regexMatch
. Perhaps we should drop one of these. Most authors using regex should be able to write an expressions that match beginning and end of line.
regexMatch('title', '(?i)(the|of)')
vs regexMatch('title', '^.*(?i)(the|of).*$')
.
Or perhaps drop regex as a prefix.
where(contains('title', 'the'))
where(contains('title', Regex.of('the|of'))
const results = await randomCol | ||
.pipeline() | ||
.select( | ||
Field.of('awards').mapGet('hugo').as('hugoAward'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also consider Field.of('awards.hugo').as('hugoAward')
, or is there some special behavior to mapGet
?
Does mapGet()
support dot notation?
Field.of('awards').mapGet('hugo.year').as('hugoAwardYear')
vs
Field.of('awards').mapGet('hugo').mapGet('year').as('hugoAwardYear')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. It looks like mapGet does not support nested fields (always returns null).
One benefit is maybe mapGet supports field with special chars better (., /, etc).
dev/system-test/pipeline.ts
Outdated
const results = await randomCol | ||
.pipeline() | ||
.select( | ||
cosineDistance(Constant.ofVector(sourceVector), targetVector).as( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider just Constant.vector(sourceVector)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
), | ||
euclideanDistance(Constant.ofVector(sourceVector), targetVector).as( | ||
'euclideanDistance' | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pattern will support KNN. If eventually there is ANN support we may need a different pattern. Do we need to consider that now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand.
it('testNestedFields', async () => { | ||
const results = await randomCol | ||
.pipeline() | ||
.where(eq('awards.hugo', true)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. Does this make mapGet
redundant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
} else if (selectable instanceof ExprWithAlias) { | ||
const expr = selectable as ExprWithAlias<Expr>; | ||
result.set(expr.alias, expr.expr); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Selectable
is an interface, should we make the interface contain the common methods you need here:
interface Selectable {
// @internal
toExprMap(): Map<string, Expr>;
}
dev/src/pipeline.ts
Outdated
return new Pipeline(this.db, copy); | ||
} | ||
|
||
distinct(...groups: (string | Selectable)[]): Pipeline { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making distinct
and select
(and possibly other functions) consistent in their API surface.
Either make the API
select(...fields: (Selectable | string)[]): Pipeline
distinct(...groups: (Selectable | string)[]): Pipeline
Or:
select(...fields: string[]): Pipeline
select(...fields: Selectable[]): Pipeline
distinct(...groups: string[]): Pipeline
distinct(...groups: Selectable[]): Pipeline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
aggregate(...accumulators: AccumulatorTarget[]): Pipeline; | ||
aggregate(options: { | ||
accumulators: AccumulatorTarget[]; | ||
groups?: (string | Selectable)[]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment. Maybe this is pushing us to change select
to select(...fields: (Selectable | string)[]): Pipeline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah.
dev/src/pipeline.ts
Outdated
field: string | Field, | ||
vector: FirebaseFirestore.VectorValue | number[], | ||
options: FindNearestOptions | ||
): Pipeline; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to update this to the single options param design as used in the Query API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is on my todo.
dev/src/pipeline.ts
Outdated
sort(options: { | ||
orderings: Ordering[]; | ||
density?: 'unspecified' | 'required'; | ||
truncation?: 'unspecified' | 'disabled'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between 'unspecified'
and undefined
(or not passing a value)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are the same. It does not matter anymore though, the backend decided to remove these options.
dev/src/stage.ts
Outdated
} | ||
} | ||
|
||
export class GenerateStage implements Stage { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be GenericStage
to match the API method Pipeline.genericStage()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
dev/src/expression.ts
Outdated
|
||
arrayConcat(...values: Expr[]): ArrayConcat; | ||
arrayConcat(...values: any[]): ArrayConcat; | ||
arrayConcat(...values: any[]): ArrayConcat { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making values a standard array parameter, not rest parameters.
dev/src/expression.ts
Outdated
dotProductDistance(other: Expr): DotProductDistance; | ||
dotProductDistance(other: VectorValue): DotProductDistance; | ||
dotProductDistance(other: number[]): DotProductDistance; | ||
dotProductDistance(other: Expr | VectorValue | number[]): DotProductDistance { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be more accurate to just call this this and the class dotProduct
. Drop "Distance"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
static of(value: Array<any>): Constant; | ||
static of(value: Map<string, any>): Constant; | ||
static of(value: VectorValue): Constant; | ||
static of(value: api.IValue): Constant; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to make this more extensible, you could also add an interface
interface FirestoreProtoSerializable {
toProto(): api.IValue;
}
and have a static of(value: FirestoreProtoSerializable): Constant
dev/src/expression.ts
Outdated
} | ||
} | ||
|
||
class DotProductDistance extends Function { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same. Just "DotProduct" is probably more accurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done with reviews for now
return new Fields([Field.of(name), ...others.map(Field.of)]); | ||
} | ||
|
||
static ofAll(): Fields { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will remove.
filterable = true as const; | ||
} | ||
|
||
class ArrayContainsAny extends Function implements FilterCondition { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is FilterCondition always a function that returns a boolean? If so, would it be useful to make a more generic BooleanFunction interface in case there is a future use case for boolean functions outside of filtering?
export function add(left: string, right: Expr): Add; | ||
export function add(left: string, right: any): Add; | ||
export function add(left: Expr | string, right: Expr | any): Add { | ||
const normalizedLeft = typeof left === 'string' ? Field.of(left) : left; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great candidate for a helper Field.fromArgument(left)
. Similar to FieldPath.fromArgument(stringOrFieldPath)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
export function add(left: string, right: any): Add; | ||
export function add(left: Expr | string, right: Expr | any): Add { | ||
const normalizedLeft = typeof left === 'string' ? Field.of(left) : left; | ||
const normalizedRight = right instanceof Expr ? right : Constant.of(right); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expr.fromExpressionOrConstant(right)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
dev/src/expression.ts
Outdated
export function inAny(element: Expr, others: Expr[]): In; | ||
export function inAny(element: Expr, others: any[]): In; | ||
export function inAny(element: string, others: Expr[]): In; // Added overload | ||
export function inAny(element: string, others: any[]): In; // Added overload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this behavior is different from the classic in()
supporting array, should we also change Expr.in(...)
to name Expr.inAny(...)
. Or if this is only supporting in
arrays, would it make more sense to name it inArray(...)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will rename Expr.in to inAny.
dev/src/expression.ts
Outdated
export function notInAny(element: Expr, others: Expr[]): Not; | ||
export function notInAny(element: Expr, others: any[]): Not; | ||
export function notInAny(element: string, others: Expr[]): Not; // Added overload | ||
export function notInAny(element: string, others: any[]): Not; // Added overload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback! I addressed some of those, and maybe we should discuss for the rest.
dev/src/pipeline.ts
Outdated
return new Pipeline(this.db, copy); | ||
} | ||
|
||
distinct(...groups: (string | Selectable)[]): Pipeline { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
aggregate(...accumulators: AccumulatorTarget[]): Pipeline; | ||
aggregate(options: { | ||
accumulators: AccumulatorTarget[]; | ||
groups?: (string | Selectable)[]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah.
dev/src/pipeline.ts
Outdated
field: string | Field, | ||
vector: FirebaseFirestore.VectorValue | number[], | ||
options: FindNearestOptions | ||
): Pipeline; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is on my todo.
dev/src/pipeline.ts
Outdated
sort(options: { | ||
orderings: Ordering[]; | ||
density?: 'unspecified' | 'required'; | ||
truncation?: 'unspecified' | 'disabled'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are the same. It does not matter anymore though, the backend decided to remove these options.
orderings: Ordering[]; | ||
density?: 'unspecified' | 'required'; | ||
truncation?: 'unspecified' | 'disabled'; | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With density and truncation removed, it's probably an overkill now.
dev/system-test/pipeline.ts
Outdated
arrayTransform( | ||
Field.of('tags'), | ||
arrayElement().strConcat('transformed') | ||
).as('transformedTags') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, it looks decent after i rewrite it with fluent builder.
dev/system-test/pipeline.ts
Outdated
const results = await randomCol | ||
.pipeline() | ||
.select( | ||
cosineDistance(Constant.ofVector(sourceVector), targetVector).as( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
), | ||
euclideanDistance(Constant.ofVector(sourceVector), targetVector).as( | ||
'euclideanDistance' | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand.
const results = await randomCol | ||
.pipeline() | ||
.select( | ||
Field.of('awards').mapGet('hugo').as('hugoAward'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. It looks like mapGet does not support nested fields (always returns null).
One benefit is maybe mapGet supports field with special chars better (., /, etc).
it('testNestedFields', async () => { | ||
const results = await randomCol | ||
.pipeline() | ||
.where(eq('awards.hugo', true)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
0f4066c
to
3aa8edd
Compare
… in expression.ts
*/ | ||
static of(name: string): Field; | ||
static of(path: firestore.FieldPath): Field; | ||
static of(nameOrPath: string | firestore.FieldPath): Field; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This overload is redundant. static of(nameOrPath: string | firestore.FieldPath): Field;
Either remove this or keep it and remove the two above.
* @param value The Firestore proto value. | ||
* @return A new `Constant` instance. | ||
*/ | ||
static of(value: api.IValue): Constant; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be redeclared below on line 2083
return this.value; | ||
} | ||
|
||
return serializer.encodeValue(this.value)!; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
serializer.endcodeValue(undefined)
will return null
, which is a permitted but unhandled case. Do we want to treat undefined as null?
No description provided.