Data loss issue #1214

Closed
mahalakshme opened this issue Dec 4, 2023 · 1 comment

mahalakshme commented Dec 4, 2023

https://avni.freshdesk.com/a/tickets/3265

mahalakshme converted this from a draft issue Dec 4, 2023
mahalakshme moved this from In Analysis to In Progress in Avni Product Dec 4, 2023
mahalakshme changed the title from "Callers are not able to see some of the records which they were previously able to view" to "Data loss issue" Dec 4, 2023

himeshr commented Dec 4, 2023

On analysing the DB entry for the individual with uuid "69f77ba9-177b-4fc1-a949-c7b5d9e8e5ac", we found that the details differ between the PROD DB snapshots of 29Nov2023 and 04Dec2023.

PROD 04Dec2023

[2023-12-04 17:36:23] Connected
openchs.public> select * from individual where id = 1439370
[2023-12-04 17:36:23] 1 row retrieved starting from 1 in 421 ms (execution: 111 ms, fetching: 310 ms)
[
{
"id": 1439370,
"uuid": "69f77ba9-177b-4fc1-a949-c7b5d9e8e5ac",
"address_id": 373580,
"observations": "{"10cc236f-d5ec-4b41-821a-589e5843e199": 110007, "1d61eb6b-94e4-42c6-bb1b-b3f363b9ba3a": "B-479 SANGAM PARK PRATAP BAGH MALKA GANJ MALKA GANJ Delhi 110007 India", "54c89951-af35-4057-8d29-a688be4926b5": "3aa7b24f-b641-4349-8b26-15e2ace24120", "68cff1ab-85fe-46d3-9ce2-951cd4b92129": "3bbbfd19-7c9f-4a91-9849-b0f59764d6e0", "d2678174-e061-4993-878e-68ad653fd306": "08929063913"}",
"version": 0,
"date_of_birth": "1989-01-01",
"date_of_birth_verified": false,
"gender_id": 716,
"registration_date": "2020-08-16",
"organisation_id": 291,
"first_name": "SARTAAJ",
"last_name": "",
"is_voided": false,
"audit_id": 12713189,
"facility_id": null,
"registration_location": null,
"subject_type_id": 816,
"legacy_id": "527",
"created_by_id": 8735,
"last_modified_by_id": 8735,
"created_date_time": "2023-10-31 19:14:30.468 +00:00",
"last_modified_date_time": "2023-11-29 10:16:17.823 +00:00",
"sync_concept_1_value": null,
"sync_concept_2_value": null,
"profile_picture": null,
"middle_name": null,
"manual_update_history": null
}
]

PROD 29Nov2023

[
{
"id": 1439370,
"uuid": "69f77ba9-177b-4fc1-a949-c7b5d9e8e5ac",
"address_id": 373583,
"observations": "{"10cc236f-d5ec-4b41-821a-589e5843e199": 110046, "1d61eb6b-94e4-42c6-bb1b-b3f363b9ba3a": "RZ-63 GALI NO. 11 MADAN PURI WEST SAGARPUR Delhi 110046 India", "54c89951-af35-4057-8d29-a688be4926b5": "3aa7b24f-b641-4349-8b26-15e2ace24120", "68cff1ab-85fe-46d3-9ce2-951cd4b92129": "3bbbfd19-7c9f-4a91-9849-b0f59764d6e0", "d2678174-e061-4993-878e-68ad653fd306": "08512890285"}",
"version": 0,
"date_of_birth": "1980-12-02",
"date_of_birth_verified": false,
"gender_id": 715,
"registration_date": "2020-08-13",
"organisation_id": 291,
"first_name": "BHARAT",
"last_name": "BHUSHAN",
"is_voided": false,
"audit_id": 12713189,
"facility_id": null,
"registration_location": null,
"subject_type_id": 816,
"legacy_id": "527",
"created_by_id": 8735,
"last_modified_by_id": 8735,
"created_date_time": "2023-10-31 19:14:30.468 +00:00",
"last_modified_date_time": "2023-11-20 10:54:37.280 +00:00",
"sync_concept_1_value": null,
"sync_concept_2_value": null,
"profile_picture": null,
"middle_name": null,
"manual_update_history": null
}
]
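
To see exactly which columns changed between the two snapshots, the rows can be compared field by field. The following is a minimal Python sketch and was not part of the original analysis; `diff_rows` is a hypothetical helper, and the dictionaries carry only a subset of the columns shown above (`observations` is left out for brevity).

```python
def diff_rows(old, new):
    """Return {column: (old_value, new_value)} for columns whose values differ."""
    changed = {}
    for column in sorted(set(old) | set(new)):
        if old.get(column) != new.get(column):
            changed[column] = (old.get(column), new.get(column))
    return changed


# Subset of the columns copied from the two snapshots above.
snapshot_29nov = {
    "id": 1439370,
    "uuid": "69f77ba9-177b-4fc1-a949-c7b5d9e8e5ac",
    "address_id": 373583,
    "date_of_birth": "1980-12-02",
    "gender_id": 715,
    "registration_date": "2020-08-13",
    "first_name": "BHARAT",
    "last_name": "BHUSHAN",
    "audit_id": 12713189,
    "legacy_id": "527",
    "created_date_time": "2023-10-31 19:14:30.468 +00:00",
    "last_modified_date_time": "2023-11-20 10:54:37.280 +00:00",
}
snapshot_04dec = {
    "id": 1439370,
    "uuid": "69f77ba9-177b-4fc1-a949-c7b5d9e8e5ac",
    "address_id": 373580,
    "date_of_birth": "1989-01-01",
    "gender_id": 716,
    "registration_date": "2020-08-16",
    "first_name": "SARTAAJ",
    "last_name": "",
    "audit_id": 12713189,
    "legacy_id": "527",
    "created_date_time": "2023-10-31 19:14:30.468 +00:00",
    "last_modified_date_time": "2023-11-29 10:16:17.823 +00:00",
}

changed = diff_rows(snapshot_29nov, snapshot_04dec)
for column, (before, after) in changed.items():
    print(f"{column}: {before!r} -> {after!r}")
```

In this subset, the identifying columns (`id`, `uuid`, `legacy_id`, `audit_id`, `created_date_time`) are unchanged while the personal details and `last_modified_date_time` differ, which is consistent with the same row being updated in place rather than deleted and re-created.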

We queried the DB to find the number of individuals created by the user:

openchs.public> select count(*) from individual where created_by_id = 8735
[2023-12-04 17:44:18] 1 row retrieved starting from 1 in 8 s 203 ms (execution: 8 s 180 ms, fetching: 23 ms)
2000

We needed to determine which APK version the client used to sync the data, so we queried the sync_telemetry table for the user:

openchs.public> select * from sync_telemetry where user_id = 8735 order by last_modified_date_time desc
[2023-12-04 17:38:01] 13 rows retrieved starting from 1 in 82 ms (execution: 60 ms, fetching: 22 ms)

There are only 13 rows for the user, for the dates 03Oct2023 and 01Nov2023, which indicates that the bulk of the data uploaded by the user came not through the APK but through other means.

We then looked for BulkImport Subjects files in S3 for Power Org and found a large number of files dated from 31Oct2023 to 30Nov2023.

Therefore, the data must have been introduced through the BulkImport Subjects route.
On picking up the first files for the dates 31Oct2023 and 29Nov2023, we observed that the id column held the same values, running from 1 to 50, for different Individuals.
File names: 42353d84-8905-457b-86f6-7c8ce69b54b0-sheet1.csv (31Oct2023) and d92fa628-7481-47aa-8073-b1454960fc92-TransformedData_-_Sheet_1.csv (29Nov2023)
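
A check of this kind could be run on the CSV files before upload. The sketch below is only an illustration in Python: `ids_shared_across_files` is a hypothetical helper, the column header `id` is an assumption about the file layout, and the sample rows are made up because the real files are not reproduced in this issue.

```python
import csv
import io
from collections import defaultdict


def ids_shared_across_files(files, id_column="id"):
    """Map each id value to the files it appears in, keeping only ids found in 2+ files."""
    seen = defaultdict(set)
    for name, text in files.items():
        for row in csv.DictReader(io.StringIO(text)):
            value = (row.get(id_column) or "").strip()
            if value:  # rows without an id cannot collide on id
                seen[value].add(name)
    return {value: sorted(names) for value, names in seen.items() if len(names) > 1}


# Made-up sample data standing in for two separate uploads.
files = {
    "upload_a.csv": "id,first_name\n1,ALPHA\n2,BETA\n",
    "upload_b.csv": "id,first_name\n1,GAMMA\n3,DELTA\n",
}
collisions = ids_shared_across_files(files)
print(collisions)
```

Any id reported here appears in more than one file, so a later upload would target a record that an earlier upload already created.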

This is a major concern, because Avni's BulkSubject Import overwrites existing Subjects that match the id column value, when one is specified.
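
The effect of keyed overwriting can be shown with a toy model. This is a sketch of the general upsert-by-id idea only, not Avni's actual import code; `bulk_import` and the in-memory `store` are hypothetical, and the values are borrowed from the snapshots above purely for illustration.

```python
def bulk_import(store, rows, id_column="id"):
    """Toy upsert: a row whose id already exists in the store replaces the stored row."""
    overwritten = []
    for row in rows:
        key = row[id_column]
        if key in store:
            overwritten.append(key)  # remember which ids were replaced
        store[key] = row
    return overwritten


store = {}
first = bulk_import(store, [{"id": "527", "first_name": "BHARAT"}])
second = bulk_import(store, [{"id": "527", "first_name": "SARTAAJ"}])
print(len(store), store["527"]["first_name"], second)
```

After the second call the store still holds a single record for that id, now carrying the later upload's details, and the earlier details are no longer retrievable from the store.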

This behaviour is also clearly documented in the Avni Readme documentation.

Conclusion

Since Power admin users uploaded BulkSubject create files with repeated id values, the existing entries with those ids were overwritten, as per design.

This is a user error that resulted in data being overwritten, and it needs no further intervention from the Avni Product or support team.

himeshr moved this from In Progress to Done in Avni Product Dec 4, 2023