Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CUMULUS-3694: Update granules List endpoints to query postgres - filter by field value #3656

Merged
merged 41 commits into from
May 13, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
41 commits
Select commit Hold shift + click to select a range
0772a4e
CUMULUS-3692:Granule list endpoint for basic postgres query
jennyhliu Apr 24, 2024
dfdf68f
refactor
jennyhliu Apr 26, 2024
4efe990
refactor
jennyhliu Apr 26, 2024
38f588d
Merge branch 'master' into jl/CUMULUS-3692
jennyhliu Apr 26, 2024
5cc833c
typing
jennyhliu Apr 29, 2024
883e65f
add changelog entry
jennyhliu Apr 29, 2024
cd20bd4
skip search_after
jennyhliu Apr 29, 2024
08d635a
skip searchafter unit tests
jennyhliu Apr 29, 2024
020ccb6
Merge branch 'master' into jl/CUMULUS-3692
jennyhliu Apr 30, 2024
e3926c7
add granule list test
jennyhliu Apr 30, 2024
935641d
rename
jennyhliu May 1, 2024
8d971e9
Merge branch 'master' into jl/CUMULUS-3692
jennyhliu May 2, 2024
c1e724a
refactor
jennyhliu May 3, 2024
c31a9d9
build query parameters
jennyhliu May 4, 2024
421a2dd
update comment
jennyhliu May 6, 2024
87b3cdf
add field-mapping
jennyhliu May 6, 2024
9942fac
update jsdoc
jennyhliu May 6, 2024
39922be
use type over interface,add log
jennyhliu May 7, 2024
fb9c51f
update test description
jennyhliu May 7, 2024
0f36350
Merge branch 'jl/CUMULUS-3692' into jl/CUMULUS-3694
jennyhliu May 8, 2024
a655b7d
Merge branch 'jl/CUMULUS-3692' into jl/CUMULUS-3694
jennyhliu May 8, 2024
facd22f
build term/terms
jennyhliu May 8, 2024
8eeb5e2
buildDbQueryParameters
jennyhliu May 9, 2024
ae0e085
add unit test no terms search
jennyhliu May 10, 2024
a68a67a
add doc
jennyhliu May 10, 2024
73504c4
rename
jennyhliu May 10, 2024
e571757
add unit test
jennyhliu May 10, 2024
2610483
add fields test
jennyhliu May 10, 2024
31c1ec3
add more unit tests
jennyhliu May 10, 2024
c36a7b4
Merge branch 'feature/es-phase-1' into jl/CUMULUS-3694
jennyhliu May 10, 2024
a1a58f7
support error.Error search
jennyhliu May 10, 2024
7ce65a7
Merge branch 'jl/CUMULUS-3694' of https://github.com/nasa/cumulus int…
jennyhliu May 10, 2024
cef467b
fix lint
jennyhliu May 10, 2024
e6339db
rename functions
jennyhliu May 13, 2024
e899aca
ignore files
jennyhliu May 13, 2024
b66a580
add convert query unit tests
jennyhliu May 13, 2024
ee593d1
add all types
jennyhliu May 13, 2024
c8ae699
add unit test for fieldmapping types fix timestamp
jennyhliu May 13, 2024
b4e56b0
update timestamp test
jennyhliu May 13, 2024
d522b24
add multiple term field test
jennyhliu May 13, 2024
470f354
ignore execution in granule list record
jennyhliu May 13, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 7 additions & 2 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,13 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
### Replace ElasticSearch Phase 1

- **CUMULUS-3692**
- Update granules List endpoints to query postgres for basic queries

- Added `@cumulus/db/src/search` `BaseSearch` and `GranuleSearch` classes to
support basic queries for granules
- Updated granules List endpoint to query postgres for basic queries
- **CUMULUS-3694**
- Added functionality to `@cumulus/db/src/search` to support term queries
- Updated `BaseSearch` and `GranuleSearch` classes to support term queries for granules
- Updated granules List endpoint to search postgres

### Migration Notes

Expand Down
1 change: 1 addition & 0 deletions example/spec/helpers/granuleUtils.js
Original file line number Diff line number Diff line change
Expand Up @@ -234,6 +234,7 @@ const waitForGranuleRecordUpdatedInList = async (stackName, granule, additionalQ
'beginningDateTime',
'endingDateTime',
'error',
'execution', // TODO remove after CUMULUS-3698
'files', // TODO -2714 this should be removed
'lastUpdateDateTime',
'productionDateTime',
Expand Down
3 changes: 2 additions & 1 deletion example/spec/parallel/testAPI/granuleSpec.js
Original file line number Diff line number Diff line change
Expand Up @@ -183,7 +183,8 @@ describe('The Granules API', () => {
});

const searchedGranule = JSON.parse(searchResults.body).results[0];
expect(searchedGranule).toEqual(jasmine.objectContaining(randomGranuleRecord));
// TODO CUMULUS-3698 includes files
expect(searchedGranule).toEqual(jasmine.objectContaining(omit(randomGranuleRecord, 'files')));
});

it('can modify the granule via API.', async () => {
Expand Down
23 changes: 3 additions & 20 deletions packages/api/endpoints/granules.js
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,6 @@ const {
recordNotFoundString,
multipleRecordFoundString,
} = require('@cumulus/es-client/search');
const ESSearchAfter = require('@cumulus/es-client/esSearchAfter');

const { deleteGranuleAndFiles } = require('../src/lib/granule-delete');
const { zodParser } = require('../src/zod-utils');
Expand Down Expand Up @@ -105,25 +104,9 @@ async function list(req, res) {
log.trace(`list query ${JSON.stringify(req.query)}`);
const { getRecoveryStatus, ...queryStringParameters } = req.query;

let es;
if (queryStringParameters.searchContext) {
es = new ESSearchAfter(
{ queryStringParameters },
'granule',
process.env.ES_INDEX
);
} else {
es = new Search({ queryStringParameters }, 'granule', process.env.ES_INDEX);
}
let result;
// TODO the condition should be removed after we support all the query parameters
if (Object.keys(queryStringParameters).filter((item) => !['limit', 'page', 'sort_key'].includes(item)).length === 0) {
log.debug('list perform db search');
const dbSearch = new GranuleSearch({ queryStringParameters });
result = await dbSearch.query();
} else {
result = await es.query();
}
const dbSearch = new GranuleSearch({ queryStringParameters });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

const result = await dbSearch.query();

if (getRecoveryStatus === 'true') {
return res.send(await addOrcaRecoveryStatus(result));
}
Expand Down
50 changes: 45 additions & 5 deletions packages/api/tests/endpoints/test-granules.js
Original file line number Diff line number Diff line change
Expand Up @@ -288,6 +288,7 @@ test.beforeEach(async (t) => {
const granuleId1 = t.context.createGranuleId();
const granuleId2 = t.context.createGranuleId();
const granuleId3 = t.context.createGranuleId();
const timestamp = new Date();

// create fake Postgres granule records
t.context.fakePGGranules = [
Expand All @@ -299,29 +300,33 @@ test.beforeEach(async (t) => {
cmr_link:
'https://cmr.uat.earthdata.nasa.gov/search/granules.json?concept_id=A123456789-TEST_A',
duration: 47.125,
timestamp: new Date(Date.now()),
timestamp,
updated_at: timestamp,
}),
fakeGranuleRecordFactory({
granule_id: granuleId2,
status: 'failed',
collection_cumulus_id: t.context.collectionCumulusId,
duration: 52.235,
timestamp: new Date(Date.now()),
timestamp,
updated_at: timestamp,
}),
fakeGranuleRecordFactory({
granule_id: granuleId3,
status: 'failed',
collection_cumulus_id: t.context.collectionCumulusId,
duration: 52.235,
timestamp: new Date(Date.now()),
timestamp,
updated_at: timestamp,
}),
// granule with same granule_id as above but different collection_cumulus_id
fakeGranuleRecordFactory({
granule_id: granuleId3,
status: 'failed',
collection_cumulus_id: t.context.collectionCumulusId2,
duration: 52.235,
timestamp: new Date(Date.now()),
timestamp,
updated_at: timestamp,
}),
];

Expand Down Expand Up @@ -456,7 +461,7 @@ test.serial('default lists and paginates correctly from querying database', asyn
const { meta, results } = response.body;
t.is(results.length, 4);
t.is(meta.stack, process.env.stackName);
t.is(meta.table, 'granule');
t.is(meta.table, 'granules');
t.is(meta.count, 4);
results.forEach((r) => {
t.true(granuleIds.includes(r.granuleId));
Expand Down Expand Up @@ -487,6 +492,41 @@ test.serial('default lists and paginates correctly from querying database', asyn
t.not(results[0].granuleId, newResults[0].granuleId);
});

test.serial('LIST endpoint returns search result correctly', async (t) => {
const granuleIds = t.context.fakePGGranules.map((i) => i.granule_id);
const searchParams = new URLSearchParams({
granuleId: granuleIds[3],
});
const response = await request(app)
.get(`/granules?limit=1&page=2&${searchParams}`)
.set('Accept', 'application/json')
.set('Authorization', `Bearer ${jwtAuthToken}`)
.expect(200);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand this test, when you add granuleId: granuleIds[3], to the SearchParams, why is the meta count 2? shouldn't 1 granuleid = 1 distinct granule, maybe I'm missing something, I get the t.is(results.length, 1); because 1 granule is returned

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 2 granules with same granule Id but different collections in the setup

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ohhhhhh ok, I didn't know it could work like that, thanks 👍

const { meta, results } = response.body;
t.is(meta.count, 2);
t.is(results.length, 1);
t.true([granuleIds[2], granuleIds[3]].includes(results[0].granuleId));

const newSearchParams = new URLSearchParams({
collectionId: t.context.collectionId,
status: 'failed',
duration: 52.235,
timestamp: t.context.fakePGGranules[0].timestamp.getTime(),
});
const newResponse = await request(app)
.get(`/granules?${newSearchParams}`)
.set('Accept', 'application/json')
.set('Authorization', `Bearer ${jwtAuthToken}`)
.expect(200);

const { meta: newMeta, results: newResults } = newResponse.body;
t.is(newMeta.count, 2);
t.is(newResults.length, 2);
const newResultIds = newResults.map((g) => g.granuleId);
t.deepEqual([granuleIds[1], granuleIds[2]].sort(), newResultIds.sort());
});

test.serial('CUMULUS-911 GET without pathParameters and without an Authorization header returns an Authorization Missing response', async (t) => {
const response = await request(app)
.get('/granules')
Expand Down
89 changes: 70 additions & 19 deletions packages/db/src/search/BaseSearch.ts
Original file line number Diff line number Diff line change
@@ -1,8 +1,11 @@
import { Knex } from 'knex';
import Logger from '@cumulus/logger';
import { getKnexClient } from '../connection';

import { BaseRecord } from '../types/base';
import { getKnexClient } from '../connection';
import { TableNames } from '../tables';
import { DbQueryParameters, QueryEvent, QueryStringParameters } from '../types/search';
import { convertQueryStringToDbQueryParameters } from './queries';

const log = new Logger({ sender: '@cumulus/db/BaseSearch' });

Expand All @@ -15,32 +18,35 @@ export type Meta = {
count?: number,
};

const typeToTable: { [key: string]: string } = {
asyncOperation: TableNames.asyncOperations,
collection: TableNames.collections,
execution: TableNames.executions,
granule: TableNames.granules,
pdr: TableNames.pdrs,
provider: TableNames.providers,
rule: TableNames.rules,
};

/**
* Class to build and execute db search query
*/
class BaseSearch {
readonly type?: string;
readonly type: string;
readonly queryStringParameters: QueryStringParameters;
// parsed from queryStringParameters for query build
dbQueryParameters: DbQueryParameters = {};

constructor(event: QueryEvent, type?: string) {
constructor(event: QueryEvent, type: string) {
this.type = type;
this.queryStringParameters = event?.queryStringParameters ?? {};
this.dbQueryParameters.page = Number.parseInt(
(this.queryStringParameters.page) ?? '1',
10
);
this.dbQueryParameters.limit = Number.parseInt(
(this.queryStringParameters.limit) ?? '10',
10
this.dbQueryParameters = convertQueryStringToDbQueryParameters(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

this.type, this.queryStringParameters
);
this.dbQueryParameters.offset = (this.dbQueryParameters.page - 1)
* this.dbQueryParameters.limit;
}

/**
* build the search query
* Build the search query
*
* @param knex - DB client
* @returns queries for getting count and search result
Expand All @@ -51,27 +57,32 @@ class BaseSearch {
searchQuery: Knex.QueryBuilder,
} {
const { countQuery, searchQuery } = this.buildBasicQuery(knex);
if (this.dbQueryParameters.limit) searchQuery.limit(this.dbQueryParameters.limit);
if (this.dbQueryParameters.offset) searchQuery.offset(this.dbQueryParameters.offset);
this.buildTermQuery({ countQuery, searchQuery });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so for here, you need to specify 2 different types of queries in case the query might be a term one or an infix one? I don't quite understand how that works, or do both queries (count and search) need to be executed and parsed for the api response

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike es, the search query doesn't return count, and only returns limited number of records, so we need a separate query to get the total count which satisfy the search parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A search can have infix, term and other supported query types (sort, range etc.)

Copy link
Contributor

@Nnaga1 Nnaga1 May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea I understood this after looking at your GranuleSearch class, it was kinda hard to understand how you did it based on my initial understanding of the stats endpoint (since stats mainly does count, aggregation, avg, not actually returning records), I understand now the meta thing needs a count and then the actual body is the granules records actually returned 👍

this.buildInfixPrefixQuery({ countQuery, searchQuery });

const { limit, offset } = this.dbQueryParameters;
if (limit) searchQuery.limit(limit);
if (offset) searchQuery.offset(offset);

log.debug(`_buildSearch returns countQuery: ${countQuery.toSQL().sql}, searchQuery: ${searchQuery.toSQL().sql}`);
return { countQuery, searchQuery };
}

/**
* metadata template for query result
* Get metadata template for query result
*
* @returns metadata template
*/
private _metaTemplate(): Meta {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 nice, this is something I can use

return {
name: 'cumulus-api',
stack: process.env.stackName,
table: this.type,
table: this.type && typeToTable[this.type],
};
}

/**
* build basic query
* Build basic query
*
* @param knex - DB client
* @throws - function is not implemented
Expand All @@ -84,6 +95,46 @@ class BaseSearch {
throw new Error('buildBasicQuery is not implemented');
}

/**
* Build queries for infix and prefix
*
* @param params
* @param params.countQuery - query builder for getting count
* @param params.searchQuery - query builder for search
* @param [params.dbQueryParameters] - db query parameters
*/
protected buildInfixPrefixQuery(params: {
countQuery: Knex.QueryBuilder,
searchQuery: Knex.QueryBuilder,
dbQueryParameters?: DbQueryParameters,
}) {
log.debug(`buildInfixPrefixQuery is not implemented ${Object.keys(params)}`);
throw new Error('buildInfixPrefixQuery is not implemented');
}

/**
* Build queries for term fields
*
* @param params
* @param params.countQuery - query builder for getting count
* @param params.searchQuery - query builder for search
* @param [params.dbQueryParameters] - db query parameters
*/
protected buildTermQuery(params: {
countQuery: Knex.QueryBuilder,
searchQuery: Knex.QueryBuilder,
dbQueryParameters?: DbQueryParameters,
}) {
const table = typeToTable[this.type];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this is for what we were talking about before, where the query can search by ANY field in the table and query by it based on an exact param (i.e. published = true)

const { countQuery, searchQuery, dbQueryParameters } = params;
const { term = {} } = dbQueryParameters || this.dbQueryParameters;

Object.entries(term).forEach(([name, value]) => {
countQuery.where(`${table}.${name}`, value);
searchQuery.where(`${table}.${name}`, value);
});
}

/**
* Translate postgres records to api records
*
Expand All @@ -96,7 +147,7 @@ class BaseSearch {
}

/**
* build and execute search query
* Build and execute search query
*
* @param testKnex - knex for testing
* @returns search result
Expand Down
Loading