10502 dxox #5679

Draft
wants to merge 134 commits into
base: staging
e8420a8
10391 - starting work to refactor work items
codyseibert Sep 24, 2024
16f12f6
10391 - refactoring the qc inbox to use postgres instead of elasticse…
codyseibert Sep 24, 2024
5c74ea9
10391 - refactoring a few more methods
codyseibert Sep 24, 2024
f59ff36
10391: sort prop fields in seed data
jtdevos Sep 24, 2024
c2eb549
Merge branch '10391-work-item-refactor' of https://github.com/flexion…
jtdevos Sep 24, 2024
e000047
10491 - refactoring more work items
codyseibert Sep 26, 2024
51b188d
Merge branch '10391-dynamo-migrations-kysely-aurora-rds-tokens' of gi…
pixiwyn Sep 27, 2024
50887b3
10491: Refactor out duplicate case details
pixiwyn Sep 27, 2024
15f86d2
10491: join with case
pixiwyn Sep 27, 2024
3879a0d
10491: covert remaining dynamodb functions
pixiwyn Sep 27, 2024
1159aa5
10491: fix lint errors
pixiwyn Sep 30, 2024
34676fc
Merge branch '10391-dynamo-migrations-kysely-aurora-rds-tokens' of gi…
pixiwyn Sep 30, 2024
0f98fef
Merge branch '10391-dynamo-migrations-kysely-aurora-rds-tokens' of gi…
pixiwyn Oct 2, 2024
b4f8c99
10491 - trying to fix work items
codyseibert Oct 2, 2024
13b715a
10491: add mocks + use in api test
pixiwyn Oct 2, 2024
49fea39
10491: fix workitems mock import name
pixiwyn Oct 2, 2024
23ae202
10491: fix mocks path
pixiwyn Oct 2, 2024
4b3c9cf
10491 - fetching work items when doing queryFull
codyseibert Oct 3, 2024
d630e0f
Merge branch '10391-dynamo-migrations-kysely-aurora-rds-tokens' of gi…
pixiwyn Oct 4, 2024
604212f
10491: refactor to use prefix
pixiwyn Oct 4, 2024
1a060ac
10491: fix shared tests
pixiwyn Oct 4, 2024
b88f8c9
10491: fix linting errors
pixiwyn Oct 4, 2024
4e1e5a7
10491: update broken mock
pixiwyn Oct 4, 2024
e6368ca
10491: fix mocks
pixiwyn Oct 4, 2024
69a625e
10491: remove unnecessary tests + add mock values
pixiwyn Oct 8, 2024
0df7f78
10491: fix shared tests
pixiwyn Oct 8, 2024
a1a39c9
10491: update association if workItems and add mock
pixiwyn Oct 8, 2024
abc2f20
10491: add additional mocks
pixiwyn Oct 8, 2024
508e7ad
10491: add default returnValue
pixiwyn Oct 8, 2024
ec2e952
10491: fix shared tests
pixiwyn Oct 8, 2024
0ba8e15
10491: fix api tests
pixiwyn Oct 8, 2024
058cd40
Merge branch 'staging' of github.com:ustaxcourt/ef-cms into 10491-wor…
pixiwyn Oct 8, 2024
ac16041
10491: fix more tests
pixiwyn Oct 8, 2024
5ed20d2
fixing some integration tests
codyseibert Oct 8, 2024
31eab83
Merge branch '10491-work-item-refactor' of github.com:flexion/ef-cms …
codyseibert Oct 8, 2024
aa5651f
10491: fix sentWorkItemsExpireAfterNDays.test.ts
pixiwyn Oct 8, 2024
c16b3b4
Merge branch '10491-work-item-refactor' of github.com:flexion/ef-cms …
pixiwyn Oct 8, 2024
2685218
10491 - fixing integration test
codyseibert Oct 9, 2024
b5919db
10491: fix api tests
pixiwyn Oct 9, 2024
3575038
10491 - fixing more integration tests
codyseibert Oct 9, 2024
1ec0f19
Merge branch '10491-work-item-refactor' of github.com:flexion/ef-cms …
codyseibert Oct 9, 2024
4495d6d
10491: fix api tests
pixiwyn Oct 9, 2024
6f92426
Merge branch '10491-work-item-refactor' of github.com:flexion/ef-cms …
pixiwyn Oct 9, 2024
663d6ae
10491: fix api tests
pixiwyn Oct 9, 2024
0208f12
10491 - fixing more integration tests
codyseibert Oct 9, 2024
0e8fb91
10491 - fixing more integration tests
codyseibert Oct 9, 2024
9bf112d
10491: fix remaining tests (hopefully?)
pixiwyn Oct 10, 2024
bc09c44
10491: fix last test
pixiwyn Oct 11, 2024
2747950
10492: resolve conflicts
pixiwyn Nov 6, 2024
f0252e3
10491: update spell check
pixiwyn Nov 6, 2024
41c118c
10491-work-item-refactor: fix failing tests and type errors
Mwindo Nov 6, 2024
49a723a
10491-dxox: add delete script for work items; todo, delete script for…
Mwindo Nov 20, 2024
8c7b54d
10491-dxox: add delete scripts for outboxes
Mwindo Nov 20, 2024
9d8996c
10502-dxox: initial work getting migrations, database-types, and some…
Mwindo Nov 20, 2024
4a4ad09
10502-dxox: begin seeding data
Mwindo Nov 20, 2024
934586d
10502-dxox: small type fix, update TODO
Mwindo Nov 20, 2024
4adfb79
10502-dxox: create a separate table for case status history, small re…
Mwindo Nov 21, 2024
7022063
10502-dxox: beginning of case factory, removing upsertCase (we should…
Mwindo Nov 21, 2024
b754469
10502-dxox: WIP moving more seed data over, trying to organize to avo…
Mwindo Nov 21, 2024
f6e227a
10502-dxox: move all cases out of efcms-local and into seed files; to…
Mwindo Nov 22, 2024
a69febd
WIP: cleaning up case seed files
Mwindo Nov 22, 2024
6c19834
10502-dxox: new seed files are clean, but need to check petitioners f…
Mwindo Nov 22, 2024
82544ff
10502-dxox: all seed data loaded into postgres and app 'runs'
Mwindo Nov 22, 2024
9ac86cf
10502-dxox: use constants in case seed data
Mwindo Nov 25, 2024
ab8a7da
Merge remote-tracking branch 'ustc/staging' into 10502-dxox
Mwindo Nov 25, 2024
105a658
10502-dxox: WIP implement CaseFactory; todo: now clean up getCaseInte…
Mwindo Nov 25, 2024
10ac605
10502: WIP, continuing with CaseFactory, and nixing ill-conceived Use…
Mwindo Nov 25, 2024
e5afe64
10502-dxox: add public case interactor stuff back in
Mwindo Nov 26, 2024
26dfc08
10502-dxox: clean up getCaseByDocketNumber
Mwindo Nov 26, 2024
db5e4a7
10502-dxox: update todo
Mwindo Nov 26, 2024
14872c1
10502-dxox: fix bug with dates (not serializing correctly in getCaseB…
Mwindo Dec 2, 2024
50a20e1
10502-dxox: make CaseFactory a class and related changes
Mwindo Dec 2, 2024
c848ba1
10502-dxox: fix some types in files I was looking through
Mwindo Dec 3, 2024
b387924
10502-dxox: converting some dynamo persistence functions to postgres
Mwindo Dec 3, 2024
1a260a0
10502-dxox: convert a few more persistence methods and fix createCase…
Mwindo Dec 3, 2024
75b90de
10502-dxox: slightly more efficient in-code join in getCaseByDocketNu…
Mwindo Dec 4, 2024
25d1d18
10502-dxox: update updateCase, and sync mocks.jest.ts with current pe…
Mwindo Dec 4, 2024
3c56b84
10502-dxox: remove duplicative getCountOfConsolidatedCases
Mwindo Dec 4, 2024
370dd6c
10502-dxox: move more persistence functions to postgres
Mwindo Dec 4, 2024
d3aab80
10502-dxox: convert remaining case persistence functions (that do not…
Mwindo Dec 4, 2024
6bc03c2
10502-dxox: fix updateCase and seal/unseal case
Mwindo Dec 4, 2024
75da8fa
10502-dxox: fix consolidated cases in getCaseByDocketNumber, add a ma…
Mwindo Dec 4, 2024
82de900
10504-dxox: experiment to prevent creating lots of db connections
Mwindo Dec 4, 2024
c9d5262
10502-dxox: WIP: getting statistics and case status updates to work; …
Mwindo Dec 5, 2024
00d949b
10502-dxox: WIP: getting case statuses and petitioners to persist pro…
Mwindo Dec 5, 2024
a0014b3
10502-dxox: forgot to delete old txt file
Mwindo Dec 5, 2024
b1cee6b
10502-dxox: WIP: fix a few bugs and start work on Statistic penalties
Mwindo Dec 5, 2024
8aa6bfc
10502-dxox: fix case penalties in getCaseByDocketNumber
Mwindo Dec 5, 2024
60ddde8
10502-dxox: WIP: fixing penalty validation error
Mwindo Dec 9, 2024
d1311fc
10502-dxox: fix case statistic issues--missing certain fields, and ba…
Mwindo Dec 9, 2024
e62a9b7
10502-dxox: forgot to update seed data with missing determinationDefi…
Mwindo Dec 9, 2024
8f4253b
10502-dxox: fix period-based statistics
Mwindo Dec 9, 2024
43480cf
10502-dxox: use string instead of number in database-types and let po…
Mwindo Dec 10, 2024
89b11ca
10502-dxox: WIP, moving custom case report from open search to postgres
Mwindo Dec 18, 2024
f20dd31
10502-dxox: WIP: custom case report into postgres
Mwindo Dec 18, 2024
02145d9
10502-dxox: custom case report seems to be working, pending test fixes
Mwindo Dec 19, 2024
550e459
10502-dxox: move getCasedColdCountByJudge from elasticsearch to postg…
Mwindo Dec 19, 2024
2e287a4
Merge remote-tracking branch 'ustc/staging' into 10502-dxox
Mwindo Dec 19, 2024
184f5e2
10502-dxox: WIP caseAdvancedSearch
Mwindo Dec 19, 2024
d9f229c
10502-dxox: fix missing fields in caseAdvancedSearch
Mwindo Dec 19, 2024
bdf27ce
10502-dxox: WIP: start moving getSuggestedCalendarCases
Mwindo Dec 19, 2024
e814ef6
10502-dxox: WIP: moving more stuff from open search to postgres; also…
Mwindo Dec 20, 2024
0b9f6b9
10502: delete old tests
pixiwyn Dec 30, 2024
e77903a
10502-dxox: trying to improve case advanced search
Mwindo Jan 1, 2025
55c6bc4
10502-dxox: caseAdvancedSearch proof of concept
Mwindo Jan 1, 2025
b62ee4a
10502-dxox: update/refactor casePublicSearch and clean up code pertai…
Mwindo Jan 1, 2025
9ccfdf7
10502-dxox: WIP, try translating cold case report to postgres
Mwindo Jan 2, 2025
aa2ee7b
10502-dxox: confirmed same cold cases locally, now need to sort them
Mwindo Jan 2, 2025
da91532
10502-dxox: finish converting cold cases report to postgres, but sort…
Mwindo Jan 2, 2025
cb3d85f
10502-dxox: WIP with getDocketNumbersByStatusAndByJudge
Mwindo Jan 2, 2025
eeb69d1
10502: wip testing get cases by email total + getDocketNumbersByStatu…
Mwindo Jan 2, 2025
9a95a05
10502-dxox: a little housekeeping before testing postgres reports dat…
Mwindo Jan 6, 2025
fa1022c
10502-dxox: fix error in case inventory report
Mwindo Jan 6, 2025
7a6a49b
10502-dxox: move fetchPendingItems from OpenSearch to Postgres
Mwindo Jan 6, 2025
c5d3c34
10502-dxox: fix fetchPendingItems.ts
Mwindo Jan 6, 2025
ab1fa6f
10502-dxox: update TODO, and remove unused const in searchClauses.ts
Mwindo Jan 6, 2025
0ae8c75
Merge branch 'staging' of github.com:ustaxcourt/ef-cms into 10502-dxox
pixiwyn Jan 7, 2025
87ec528
10502: update getAllPendingMotionDocketEntriesForJudge to use postgre…
pixiwyn Jan 7, 2025
9fd3551
10502-dxox: translate getReadyForTrialCases into postgres; and remove…
pixiwyn Jan 7, 2025
79886ff
10502-dxox: fix getCaseByDocketNumber being called incorrectly in two…
Mwindo Jan 8, 2025
117958e
10502-dxox: tentatively fix bug in which petitions clerk cannot creat…
Mwindo Jan 9, 2025
df40195
10502-dxox: fix broken rename
Mwindo Jan 9, 2025
7eb3f23
10502-dxox: fix getReadyForTrialCases, fix international petitioners …
Mwindo Jan 9, 2025
9f256ba
10502-dxox: add case statistics on creating a case, and remove statis…
Mwindo Jan 9, 2025
b43a416
10502-dxox: fix import error and bad data in seed data
Mwindo Jan 10, 2025
70566ca
10502: wip advancedDocumentSearch
pixiwyn Jan 10, 2025
5742a7f
Merge branch '10502-dxox' of github.com:flexion/ef-cms into 10502-dxox
pixiwyn Jan 10, 2025
7ee2fed
10502-dxox: fix getCustomCaseReport for csv export
Mwindo Jan 11, 2025
94c5b08
10502-dxox: wip: advancedDocumentSearch without null word_similarity
Mwindo Jan 11, 2025
7c332b0
Merge remote-tracking branch 'ustc/staging' into 10502-dxox
Mwindo Jan 11, 2025
adcddd4
10502-dxox: fix message sorting error (both message and case have cre…
Mwindo Jan 11, 2025
944631a
10502-dxox: wip: fix advancedDocumentSearch for external users
Mwindo Jan 11, 2025
f1853c1
10502-dxox: fix bad seed data for case 150-12, and remove a few conso…
Mwindo Jan 11, 2025
4c9fb7a
10502-dxox: fixing some type errors
Mwindo Jan 11, 2025
54 changes: 54 additions & 0 deletions __TODO-10502.md
@@ -0,0 +1,54 @@
# TODO

- Get rid of case-mapping (pair):
  - getReadyForTrialCases (confirm finished in a deployed environment, and how to handle locking)
  - advancedDocumentSearch
  - Case title, petitioner advanced syntax
- Can we remove formatWorkItemResult and formatMessageResult? What about formatDocketEntryResult?

- If we use the trigram search, run this on the db first: CREATE EXTENSION IF NOT EXISTS pg_trgm;
- Some TODOs are marked with "10502 TODO", so search for this (without the quotes) as needed. (solo, meetup as needed)
- Make sure that report data for case-mapping reports (above) is correct, and make sure indices are added for efficiency as needed for all case-based reports (pair)
- Make sure getCasesMetadataByUserId (previously called getCasesForUser) is working. It relies on processPractitionerMappingEntries streaming things into PractitionerOnCaseTable (pair)
- Fix all of the broken tests (solo)
- Do we need to implement Case entity locking? E.g., mimic withLocking. If we have removed locks, confirm with the court that this is OK. (pair)
- Figure out if there are fields we can get rid of on database types to make things simpler (solo)
- Remove console.logs
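
For anyone evaluating the trigram search item above, here is a rough sketch of what pg_trgm's similarity() computes, written in plain TypeScript so it can be run without a database. It is an approximation under stated assumptions (ASCII-only input, and the helper names `trigrams` and `similarity` are ours), not the extension's actual implementation. It also illustrates similarity() rather than word_similarity(), which scores against the best-matching part of the longer string.

```typescript
// Approximation of pg_trgm's similarity() for ASCII input only.
function trigrams(text: string): Set<string> {
  const result = new Set<string>();
  // pg_trgm lowercases input and treats non-alphanumeric characters as word breaks.
  const words = text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter(word => word.length > 0);
  words.forEach(word => {
    // Each word is padded with two leading spaces and one trailing space.
    const padded = `  ${word} `;
    for (let i = 0; i + 3 <= padded.length; i++) {
      result.add(padded.slice(i, i + 3));
    }
  });
  return result;
}

// Shared trigrams divided by the total number of distinct trigrams.
function similarity(a: string, b: string): number {
  const left = trigrams(a);
  const right = trigrams(b);
  if (left.size === 0 && right.size === 0) return 0;
  let shared = 0;
  left.forEach(t => {
    if (right.has(t)) shared++;
  });
  return shared / (left.size + right.size - shared);
}
```

Because short strings produce few trigrams, a single typo in a short petitioner name moves the score a lot, which is worth keeping in mind when choosing a threshold.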

---

# Things to test:

- Be consistent with Case and RawCase (maybe? not a big deal) and with returning undefined vs. throwing a not-found error in persistence methods
- Places where we createCase/upsertCases sometimes also need to create/upsert rows in related tables (petitionerOnCase, etc.) that used to be child objects on Case dynamo records.
- We removed duplicative getCountOfConsolidatedCases. Make sure that everything works as expected--Chris H mentioned that consolidated cases have had several refactors, so there might be a reason this existed after all.
- Are there any unhappy paths in which latency-induced data discrepancies can occur between Postgres and Open Search?

---

# General notes

Current summary:
• All case (and directly related, i.e., stored on the case) seed data has been moved.
• After talking with Jim L., it became apparent that petitioners on a case are almost entirely distinct from users. We may want an optional foreign key to a (forthcoming) user table, but the user table should really only be login/role info for petitioners. Their contact info on any particular case can be different from their "user" contact info or the contact info on a separate case.
• The app runs with the fake data, although case is still missing some stuff.

Continue to use CaseFactory, typing it correctly. I think the pattern should be:
• We force developers to construct via CaseFactory, e.g., CaseFactory.getCase() (generic), CaseFactory.getFullCase() (Case), CaseFactory.getPublicCase() (PublicCase), etc.
• PublicCase, Case, etc. entities should be constructed (at least for business logic--we currently use Case as both a representation of our database as well as a business object) from _create static helpers.

So: CaseFactory.getFullCase(data, user) [which will check that the user actually has permissions for a Case and throw an error otherwise] -> cleans up data -> calls Case._create(data)

This design is meant to solve a couple of issues:
1) Make sure we are always getting a "safe" subset of data for a user. We have been relying on getCaseInteractor, random formatters, etc. to do this, which is dangerous. We should enforce one way of cleaning up a case regardless of which interactor, formatter, etc. asks for it--hence the factory.
2) There is one place that determines the permissions of what is and is not seen--the factory.
3) Ideally, we could do the data obfuscation/clean up in each constructor. However, that requires passing user into each constructor, which makes the class harder to reason about, and it means the type-independent obfuscation logic ends up duplicated across constructors. That is why I have chosen to put this shared logic in the factory itself.
4) Ultimately, this allows a developer to know that they are constructing the right type, getting a narrow type when they know what they want (e.g., a PublicCase from .getPublicCase) or a broad type when they don't (.getCase).
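
A minimal sketch of the pattern described above, to make the flow concrete. Everything here is a hypothetical stand-in: the `User` and `RawCase` shapes, the role names, the `internalNote` field, and the permission rule are assumptions for illustration, and the real entities in this PR are much larger. Only the overall shape (factory checks permissions, cleans up data, then calls a `_create` static helper) comes from the notes.

```typescript
// Hypothetical shapes and rules; not the entities used in this PR.
type Role = 'docketclerk' | 'petitioner' | 'public';
interface User { userId: string; role: Role }
interface RawCase {
  docketNumber: string;
  caseCaption: string;
  isSealed: boolean;
  petitionerIds: string[];
  internalNote?: string;
}

class UnauthorizedError extends Error {}

class PublicCase {
  private constructor(
    readonly docketNumber: string,
    readonly caseCaption: string,
  ) {}
  // Intended to be called by the factory only.
  static _create(data: RawCase): PublicCase {
    return new PublicCase(data.docketNumber, data.isSealed ? 'Sealed Case' : data.caseCaption);
  }
}

class Case {
  private constructor(readonly data: RawCase) {}
  // Intended to be called by the factory only.
  static _create(data: RawCase): Case {
    return new Case({ ...data });
  }
}

class CaseFactory {
  // The single place that decides who may see the full case (assumed rule).
  private static canViewFullCase(data: RawCase, user: User): boolean {
    if (user.role === 'docketclerk') return true;
    return user.role === 'petitioner' && data.petitionerIds.indexOf(user.userId) !== -1;
  }

  // Narrow type: throws when the user lacks permission.
  static getFullCase(data: RawCase, user: User): Case {
    if (!CaseFactory.canViewFullCase(data, user)) {
      throw new UnauthorizedError(`No access to case ${data.docketNumber}`);
    }
    // Shared clean-up lives in the factory rather than in each constructor.
    const cleaned = user.role === 'docketclerk' ? data : { ...data, internalNote: undefined };
    return Case._create(cleaned);
  }

  // Narrow type: always safe to return.
  static getPublicCase(data: RawCase): PublicCase {
    return PublicCase._create(data);
  }

  // Broad type: callers that do not know what they will get.
  static getCase(data: RawCase, user: User): Case | PublicCase {
    return CaseFactory.canViewFullCase(data, user)
      ? CaseFactory.getFullCase(data, user)
      : CaseFactory.getPublicCase(data);
  }
}
```

Private constructors are one way to nudge callers toward the factory. They do not stop a caller from invoking `_create` directly, so the underscore prefix is still only a convention.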

- Update as of 20 December 2024:
- The most difficult part of this work so far has been figuring out how to get case data into open search.
- The scalable, reliable, "good" approaches -- streaming data from serverless Aurora to Open Search via, e.g., Kinesis -- are higher latency.
- The faster approaches -- "directly" writing to Open Search or Kinesis in one way or another -- are less scalable and reliable.
- We have more options if we switch to provisioned Aurora, but that opens a different can of worms.
- Moreover, since our eventual goal is to move as many queries from Open Search into Postgres anyway, much of the above work will be temporary.
- For those reasons, the current goal is to move all case-index Open Search queries to Postgres. Then we don't have to worry about the above issues, and the work we are doing will not need to be undone.
62 changes: 62 additions & 0 deletions __aurora-streams.md
@@ -0,0 +1,62 @@
# Goal: Stream AWS Aurora Postgres updates to Open Search

## Constraints:
1) A low-latency, real-time solution. The user should be able to create/update/etc. X and see the updates as soon as possible afterwards.

## Option 1: Replicate (as far as possible) our existing setup, which is the following:

DynamoDB --> DynamoDB Stream --> Lambda --> Open Search

We would need to replace DynamoDB Stream with some other Change Data Capture (CDC) and streaming tool. Two options I have seen:

1) Persistence function --> Aurora --> Database Migration Service (DMS) --> Kinesis --> Lambda --> Open Search
2) Persistence function --> Aurora --> Database Activity Streams --> Kinesis --> Lambda --> Open Search

Both DMS and Database Activity Streams can be used to capture changes. DMS is more agnostic about source and destination databases, but it incurs an additional cost. Database Activity Streams are free, but they are narrower in scope (used for AWS RDS). Kinesis is a new cost regardless of our choice between 1 and 2.

Pros:
1) This continues our current pattern: DB updates stay synced with Open Search while the implementation stays decoupled from it.
2) More scalable and event-driven: Kinesis offers buffering to smooth out traffic spikes.

Cons:
1) Added cost of Kinesis and possibly DMS
2) More moving parts: now we have to configure streams and Kinesis, and format the data for Open Search
3) Kinesis delivers at least once, so consumers can see duplicate messages, and ordering is only preserved within a shard.
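
One common way to cope with duplicate or out-of-order messages is to make the consumer idempotent by comparing a per-record version before applying a change. The sketch below shows the idea with an in-memory map standing in for the search index. The `ChangeEvent` shape, the `version` field, and the `SearchIndexProjection` class are all hypothetical; `version` stands in for any monotonically increasing per-row marker (a row version column, an updated-at timestamp, a log sequence number). Open Search's external versioning on index requests is built around the same idea, though whether it fits our documents would need checking.

```typescript
// Hypothetical change event carried on the stream.
interface ChangeEvent {
  docketNumber: string;
  version: number; // assumed to increase with every write to the row
  status: string;
}

// In-memory stand-in for the search index.
class SearchIndexProjection {
  private readonly docs = new Map<string, ChangeEvent>();

  // Returns true if the event was applied, false if it was a duplicate or stale.
  apply(event: ChangeEvent): boolean {
    const current = this.docs.get(event.docketNumber);
    if (current !== undefined && current.version >= event.version) {
      return false; // already have this version or a newer one
    }
    this.docs.set(event.docketNumber, event);
    return true;
  }

  get(docketNumber: string): ChangeEvent | undefined {
    return this.docs.get(docketNumber);
  }
}
```

With this check in place, replaying a batch or receiving an older update after a newer one leaves the index unchanged, at the cost of needing a trustworthy version on every row.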

Basic argument for this approach: "This keeps us consistent with our current pattern, is more robust, and is more easily scalable. Plus, we might be able to use this event-driven pipeline for other things in the future."

Issues I have run into:
1) Database Activity Streams are only available for provisioned (not serverless) Aurora. This means we would have to move off serverless Aurora to use them.
2) I see conflicting information, but Database Migration Service latency seems like it might not be acceptable for some of our use cases. (Potential delays of minutes.)

## Option 2a: "Directly" index in our persistence functions to stream to Open Search on successful Postgres updates

Rather than streaming database changes at all, we modify our code so that any changes that need to be indexed 1) wait for a successful write to postgres and then 2) kick off an async task (storing a message in a queue, for instance) to index the update. Something like:

Persistence function --> (Contingent upon successful update to Aurora) [--> Queue -->] Lambda --> Open Search

Pros:
1) Lower cost, probably
2) Faster, maybe?
-- With a queue, probably not. SQS, for instance, is poll-based, not push-based, which introduces latency.
-- Without a queue--viz., directly invoking a lambda--probably faster.
3) Relatively simple infrastructure

Cons:
1) This breaks our existing pattern and puts the onus on us to update Open Search. Implementation between DB and Open Search is now more coupled.
2) Without a queue, we would need to handle retry logic. Although maybe this retry logic is as simple as "on failure, send to queue," and then we have a separate lambda polling for failed indexing.
3) Less scalable. If we had 1000s of writes per second, we could reach lambda concurrency limits.
4) Less of a paper trail.
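
A minimal sketch of the flow in Option 2a, including the "on failure, send to queue" retry idea from the cons above. The three dependencies are injected as plain functions so that nothing here assumes a particular AWS SDK call; `persistAndIndex`, `IndexJob`, and the dependency names are hypothetical. For simplicity the sketch awaits the indexing call instead of firing it off in the background.

```typescript
// Hypothetical payload describing what needs to be indexed.
interface IndexJob {
  docketNumber: string;
  body: Record<string, unknown>;
}

// Injected side effects; real implementations would wrap the db client,
// the Open Search client (or a lambda invoke), and a queue client.
interface IndexingDeps {
  writeToPostgres: (job: IndexJob) => Promise<void>;
  indexInOpenSearch: (job: IndexJob) => Promise<void>;
  sendToRetryQueue: (job: IndexJob) => Promise<void>;
}

async function persistAndIndex(
  job: IndexJob,
  deps: IndexingDeps,
): Promise<'indexed' | 'queued'> {
  // Step 1: the Postgres write must succeed. If it throws, nothing is indexed
  // and the error propagates to the caller.
  await deps.writeToPostgres(job);

  // Step 2: index directly; on failure, hand the job to a queue that a
  // separate consumer can retry from.
  try {
    await deps.indexInOpenSearch(job);
    return 'indexed';
  } catch {
    await deps.sendToRetryQueue(job);
    return 'queued';
  }
}
```

If sendToRetryQueue itself fails, the write has already been committed and the index is stale with no record of it, which is the "less of a paper trail" concern in concrete form.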

Basic argument for this approach: "Look, we don't need to index all that much in open search anyway. Why set up all this infrastructure for streaming when we can just index what we need directly? Plus this incentivizes us to move as little into Open Search as necessary."

## Option 2b: "Directly" invoke a lambda function or Kinesis stream in a postgres trigger

This is like 2a, except that rather than triggering in code, we trigger in postgres itself via pg_notify. An overview: we enable logical replication,

## Option 3: Use a plugin or third-party tool like Debezium as a CDC

Cons:
1) It seems like most of these tools work better with a provisioned Aurora instance

## Option 4? Wrap getDbWriter in something that then invokes the change