
Old data not visible for some users #3738

Closed
amruth-movano opened this issue Jun 4, 2024 · 19 comments
Labels
bug Something isn't working datastore Issues related to the DataStore category

Comments

@amruth-movano

amruth-movano commented Jun 4, 2024

Describe the bug

Hi Team,
Some of our users are not able to see their data that is older than a week.

For some users, fetching a whole week of data at once returns results, but fetching data for a single past day returns nothing.

Below is the schema for one of our tables. I have also attached the files showing how we configure Amplify. Can you please take a look and let us know why Amplify is behaving this way?
amplify-issue.zip

extension SleepSession2 {
  // MARK: - CodingKeys 
   enum CodingKeys: String, ModelKey {
    case id
    case user_id
    case sleep_start_time
    case sleep_end_time
    case sleep_data
    case avg_heart_rate
    case avg_temperature
    case avg_breathing_rate
    case avg_sp_o2
    case avg_hrv
    case health_score
    case data_point_score
    case sleep_jobid
    case ring_serial_no
    case sleep_window_start
    case sleep_window_end
    case owner
    case createdAt
    case updatedAt
  }
  
  static let keys = CodingKeys.self
  //  MARK: - ModelSchema 
  
  static let schema = defineSchema { model in
    let sleepSession2 = SleepSession2.keys
    
    model.authRules = [
      rule(allow: .owner, ownerField: "owner", identityClaim: "cognito:username", provider: .userPools, operations: [.create, .update, .delete, .read])
    ]
    
    model.listPluralName = "SleepSession2s"
    model.syncPluralName = "SleepSession2s"
    
    model.attributes(
      .index(fields: ["user_id", "sleep_window_end"], name: nil),
      .primaryKey(fields: [sleepSession2.user_id, sleepSession2.sleep_window_end])
    )
    
    model.fields(
      .field(sleepSession2.id, is: .required, ofType: .string),
      .field(sleepSession2.user_id, is: .required, ofType: .string),
      .field(sleepSession2.sleep_start_time, is: .required, ofType: .dateTime),
      .field(sleepSession2.sleep_end_time, is: .required, ofType: .dateTime),
      .field(sleepSession2.sleep_data, is: .optional, ofType: .embeddedCollection(of: Int.self)),
      .field(sleepSession2.avg_heart_rate, is: .optional, ofType: .double),
      .field(sleepSession2.avg_temperature, is: .optional, ofType: .double),
      .field(sleepSession2.avg_breathing_rate, is: .optional, ofType: .double),
      .field(sleepSession2.avg_sp_o2, is: .optional, ofType: .double),
      .field(sleepSession2.avg_hrv, is: .optional, ofType: .double),
      .field(sleepSession2.health_score, is: .optional, ofType: .double),
      .field(sleepSession2.data_point_score, is: .optional, ofType: .string),
      .field(sleepSession2.sleep_jobid, is: .optional, ofType: .string),
      .field(sleepSession2.ring_serial_no, is: .optional, ofType: .string),
      .field(sleepSession2.sleep_window_start, is: .required, ofType: .dateTime),
      .field(sleepSession2.sleep_window_end, is: .required, ofType: .dateTime),
      .field(sleepSession2.owner, is: .optional, ofType: .string),
      .field(sleepSession2.createdAt, is: .optional, isReadOnly: true, ofType: .dateTime),
      .field(sleepSession2.updatedAt, is: .optional, isReadOnly: true, ofType: .dateTime)
    )
  }
}

Steps To Reproduce

It is reproducible for some users all the time.
1. Log in as an affected user.
2. Fetch data for older days.
3. DataStore returns no data, even though the data is present in DynamoDB.

Expected behavior

Amplify DataStore should return the data if it is present in DynamoDB.

Amplify Framework Version

2.33.6

Amplify Categories

DataStore

Dependency manager

Swift PM

Swift version

5.9.2

CLI version

12.11.1

Xcode version

15.0

Relevant log output

No response

Is this a regression?

Yes

Regression additional context

No response

Platforms

iOS

OS Version

iOS 17.0

Device

iPhone 13

Specific to simulators

No response

Additional context

No response

@amruth-movano
Author

Please see attached verbose log file -

https://drive.google.com/file/d/12c5dvjpmoHZIYtNEkdGePfmvpd5SDwV4/view?usp=sharing

@5d 5d added the datastore Issues related to the DataStore category label Jun 4, 2024
@lawmicha
Contributor

lawmicha commented Jun 5, 2024

From the attached amplify-issue.zip, I can see the sync expression being used

let syncExpressions = [
    DataStoreSyncExpression.syncExpression(UserProfile2.schema, where: {
        UserProfile2.keys.user_id.eq(userId)
    }),
    DataStoreSyncExpression.syncExpression(ExerciseSession2.schema, where: {
        ExerciseSession2.keys.user_id.eq(userId)
    }),
    DataStoreSyncExpression.syncExpression(TimedData.schema, where: {
        if let dateLimit = dateLimit {
            TimedData.keys.user_id.eq(userId).and(TimedData.keys.time_stamp.gt(dateLimit.temporalDateTime))
        } else {
            TimedData.keys.user_id.eq(userId)
        }
    }),
    DataStoreSyncExpression.syncExpression(MenstrualLog.schema, where: {
        if let dateLimit = dateLimit {
            MenstrualLog.keys.user_id.eq(userId).and(MenstrualLog.keys.time_stamp.gt(dateLimit.temporalDateTime))
        } else {
            MenstrualLog.keys.user_id.eq(userId)
        }
    }),
    DataStoreSyncExpression.syncExpression(MobileToken2.schema, where: {
        MobileToken2.keys.user_id.eq(userId)
    }),
    DataStoreSyncExpression.syncExpression(SleepSession2.schema, where: {
        if let dateLimit = dateLimit {
            SleepSession2.keys.user_id.eq(userId).and(SleepSession2.keys.sleep_window_end.gt(dateLimit.temporalDateTime))
        } else {
            SleepSession2.keys.user_id.eq(userId)
        }
    }),
    DataStoreSyncExpression.syncExpression(SleepStage.schema, where: {
        if let dateLimit = dateLimit {
            SleepStage.keys.user_id.eq(userId).and(SleepStage.keys.data_timestamp.gt(dateLimit.temporalDateTime))
        } else {
            SleepStage.keys.user_id.eq(userId)
        }
    }),
    DataStoreSyncExpression.syncExpression(SpotCheck.schema, where: {
        if let dateLimit = dateLimit {
            SpotCheck.keys.user_id.eq(userId).and(SpotCheck.keys.data_timestamp.gt(dateLimit.temporalDateTime))
        } else {
            SpotCheck.keys.user_id.eq(userId)
        }
    }),
    DataStoreSyncExpression.syncExpression(UserDailyMedians2.schema, where: {
        if let dateLimit = dateLimit {
            UserDailyMedians2.keys.user_id.eq(userId).and(UserDailyMedians2.keys.time_stamp.gt(dateLimit.temporalDateTime))
        } else {
            UserDailyMedians2.keys.user_id.eq(userId)
        }
    }),
    DataStoreSyncExpression.syncExpression(UserGoals.schema, where: {
        UserGoals.keys.user_id.eq(userId)
    }),
    DataStoreSyncExpression.syncExpression(UserPreference.schema, where: {
        UserPreference.keys.user_id.eq(userId)
    }),
    DataStoreSyncExpression.syncExpression(DeviceTelemetry.schema, where: {
        if dateLimit != nil {
            DeviceTelemetry.keys.user_id.eq(userId).and(DeviceTelemetry.keys.time_stamp.ge((Date() - .hours(2)).temporalDateTime))
        } else {
            DeviceTelemetry.keys.user_id.eq(userId)
        }
    })
]

try Amplify.add(plugin: AWSDataStorePlugin(
    modelRegistration: DataStoreModels(),
    configuration: .custom(errorHandler: onError, syncMaxRecords: maxLimit, syncExpressions: syncExpressions)
))

The data to be synchronized is filtered down to the userId, or, when dateLimit is non-nil, to both the userId and the dateLimit, which controls whether a week or just the last day of the particular user's data is synced.

DataStore performs a sync[Model] query to retrieve the data, applies the syncExpression to the response client-side, reconciles the results to the local database, and then reconciles incoming subscription events on an ongoing basis. The subscription itself is established with the syncExpression, so that filter is applied server-side.

Synchronization would be drastically improved if the syncExpression were also applied server-side on the sync query call. This is pending prioritization of PR #3550. One challenge in releasing the server-side filter is supporting the edge case of customers whose AppSync backends do not support server-side filtering (that feature launched after the client-side filter implementation was put in place).

The problems you see may be solved once (a) your AppSync backend is provisioned with the filter parameter on the sync query operations, meaning it supports server-side filtering, and (b) you take the latest version of Amplify after we ship the library changes that pass the syncExpression as the filter parameter, since the expression would then no longer be evaluated locally against the response data from the sync request.

There could also be a bug in the client-side evaluation such that, given a dateLimit and the response model, it incorrectly discards the model.

The verbose log file is over 100 MB. Could you narrow the logs down to just the output from calling DataStore.start() on a fresh install?

Also, could you share a code snippet showing how and where dateLimit is set? With that information we could try to reproduce this.

@lawmicha lawmicha added bug Something isn't working pending-community-response Issue is pending response from the issue requestor labels Jun 5, 2024
@lawmicha
Contributor

lawmicha commented Jun 5, 2024

Can you give us the full code of the extensions used by this line inside startAndPopulateDataStore?

.now.advanced(by: -7, byAdding: .day).startOfDay

@github-actions github-actions bot removed the pending-community-response Issue is pending response from the issue requestor label Jun 5, 2024
@lawmicha lawmicha added the pending-community-response Issue is pending response from the issue requestor label Jun 5, 2024
@amruth-movano
Author

@lawmicha this is the code used by extension -

public func advanced(
    by value: Int,
    byAdding unit: Calendar.Component,
    calendar: Calendar = .current
) -> Date {
    calendar.date(byAdding: unit, value: value, to: self)!
}
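For reference, a minimal self-contained sketch of how the `.now.advanced(by: -7, byAdding: .day).startOfDay` expression could be assembled; the `startOfDay` helper shown here is an assumption, since only `advanced(by:byAdding:)` was included in the snippet:

```swift
import Foundation

public extension Date {
    /// From the snippet above: offsets the date by `value` units of `unit`.
    func advanced(
        by value: Int,
        byAdding unit: Calendar.Component,
        calendar: Calendar = .current
    ) -> Date {
        calendar.date(byAdding: unit, value: value, to: self)!
    }

    /// Assumed helper: midnight of the receiver in the current calendar.
    var startOfDay: Date {
        Calendar.current.startOfDay(for: self)
    }
}

// The sync window lower bound used in startAndPopulateDataStore:
// midnight seven days ago.
let dateLimit = Date.now.advanced(by: -7, byAdding: .day).startOfDay
```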

@github-actions github-actions bot removed the pending-community-response Issue is pending response from the issue requestor label Jun 5, 2024
@amruth-movano
Author

@lawmicha dateLimit is set at login. See the SessionManager class attached in the zip file; inside the signIn function you will find a call to startAndPopulateDataStore with the dateLimit.

@5d
Member

5d commented Jun 5, 2024

@amruth-movano ,

The verbose log you provided is too extensive for us to investigate. Could you please narrow it down to the specific case where fetching one week of data succeeds, but fetching a single day's data does not?

Additionally, in the verbose log, there should be generated GraphQL queries. Could you verify whether the single-day query, which is supposed to retrieve the data, is actually fetching the data in the AWS AppSync console?

@amruth-movano
Author

Hi @5d
I am attaching four files. I have data on 22nd, 23rd, and 24th May in the backend; it shows up in the weekly view but not on the daily page.
Also, after logging out and logging in again, I face the same issues in DataStore.
Can you please check and let us know the issue?

verbose.zip

@lawmicha
Contributor

lawmicha commented Jun 5, 2024

Hi @amruth-movano, I checked the code and I misspoke earlier: the library is passing the syncExpression as the filter input for the sync query. It applies the syncExpression client-side only on the subscription events. As @5d mentioned, can you add Amplify.Logging.logLevel = .verbose and launch the app? Set dateLimit to weekly and initiate a DataStore.start(). The logs will then contain the GraphQL queries. Repeat this with daily, and you should be able to find two sync query requests with different filter inputs.
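Capturing those logs might look roughly like this (a sketch against the Amplify Swift v2 API; the configuration details from the attached setup are elided):

```swift
import Amplify

// Enable verbose logging before configuring Amplify so the sync queries,
// including the generated GraphQL documents and filter inputs, are printed.
Amplify.Logging.logLevel = .verbose

// ... Amplify.add(plugin:) / Amplify.configure() as in the attached setup ...

// Kick off synchronization explicitly; the sync query requests
// then show up in the console output.
try await Amplify.DataStore.start()
```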

Can you provide us with the logs that show these GraphQL queries? You should also be able to replay them in the AppSync console to see the response that will be reconciled to the local database.

@amruth-movano
Author

Hi @lawmicha
I ran the app with dateLimit set to a week; the verbose logs are attached. I am new to Amplify, so can you please explain in detail if there is any issue in our logs?

Archive.zip

@lawmicha
Contributor

lawmicha commented Jun 6, 2024

The queries we're looking for can be seen in login_with_7day.txt.

{
  "variables" : {
    "lastSync" : 1717642654175,
    "filter" : {
      "and" : [
        {
          "user_id" : {
            "eq" : "f6dd38ae-3d14-4455-8f23-ab12ad4fa179"
          }
        },
        {
          "time_stamp" : {
            "gt" : "2024-05-29T18:30:00.000Z"
          }
        }
      ]
    },
    "limit" : 1000
  },
  "query" : "query SyncTimedData($filter: ModelTimedDataFilterInput, $lastSync: AWSTimestamp, $limit: Int) {\n  syncTimedData(filter: $filter, lastSync: $lastSync, limit: $limit) {\n    items {\n      user_id\n      time_stamp\n      breath_rate\n      calories\n      createdAt\n      data_timezone\n      distance_covered\n      heart_rate\n      hrv\n      owner\n      ring_serial_no\n      skin_temperature\n      spo2\n      step_count\n      updatedAt\n      __typename\n      _version\n      _deleted\n      _lastChangedAt\n    }\n    nextToken\n    startedAt\n  }\n}"
}

The query contains the filter with the correct condition.

For the other files, 23may_day.txt and 24may_day.txt, you'll have to debug the errors you are getting:

Error fetching sleep data. Error: noSleepSessionsFound(2024-05-23 18:29:59 +0000)
Error happened while loading sleep details. Error: The operation couldn’t be completed. (DataStore.SleepSessionError error 0.)

How are you making this call, and what else can you get from the returned error object, such as the underlying error?

@amruth-movano
Author

amruth-movano commented Jun 6, 2024

@lawmicha Found the issue: for some users, the data had not been uploaded. It was in the local DataStore cache but never got synced. When a user logs out, we first try to synchronize DataStore and then the data gets cleared, so for this user the data was uploaded for the other tables but not for this one.

Can you help us upload data to DynamoDB safely and quickly? We have a lot of data to upload on a daily basis.

Currently we are doing it the way shown below, but sometimes it takes 2-3 days for the data to be saved in the DB:

extension DataStoreBaseBehavior {
    @discardableResult
    func save<M: Model>(_ models: [M], where predicate: QueryPredicate?) async throws -> [M] {
        let results = try await withThrowingTaskGroup(of: M.self) { group in
            for model in models {
                group.addTask {
                    try await save(model, where: predicate)
                }
            }

            return try await group.reduce(into: []) { array, model in
                array.append(model)
            }
        }

        return results
    }
}

@lawmicha
Contributor

lawmicha commented Jun 7, 2024

Clearing the local database with DataStore.clear() may result in data loss if the data has not been synchronized to AppSync. What are some reasons you are clearing instead of calling DataStore.stop()? If you need DataStore to use an updated syncExpression, you can restart DataStore by calling stop then start, instead of clear then start.

If you are clearing because a different user is signing into the app, you could clear not on sign-out, but on sign-in when the user differs from the previous one.
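A sketch of that suggestion (`lastSignedInUserId` is hypothetical app state, not an Amplify API):

```swift
import Amplify

// On sign-out: stop syncing but keep the local database, so any
// pending mutations can still be uploaded on the next start.
func handleSignOut() async throws {
    try await Amplify.DataStore.stop()
}

// On sign-in: clear only when a different user signs in, so a
// returning user's not-yet-synced data is not discarded.
func handleSignIn(userId: String, lastSignedInUserId: String?) async throws {
    if let previous = lastSignedInUserId, previous != userId {
        try await Amplify.DataStore.clear()
    }
    try await Amplify.DataStore.start()
}
```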

@amruth-movano
Author

Thanks for the review of our code @lawmicha

Could you also help us find a better approach for uploading multiple rows to DataStore? Currently we are adding them one at a time.

In our app, a single user can produce up to 11,520 rows per day in one table, depending on the use case. Can you guide us on the best way to upload data to DynamoDB so that it gets uploaded as soon as possible?

@lawmicha
Contributor

lawmicha commented Jun 7, 2024

Hi @amruth-movano, backends with AppSync conflict resolution enabled require DataStore to perform the synchronization logic. Conflicts are handled per model, which is why we cannot perform batch uploads. However, DataStore does optimize how many model instances have to be synced: for example, if you save and then update a model (two calls to DataStore.save with a model that has the same identifier) before it has synced, DataStore merges the two into one create mutation with the final updated fields.

If you're producing 11,520 rows of data, make sure to set syncMaxRecords to the amount of data per model you expect to pull down into the local database. It should be the maximum of the expected values across your models.

What does your data model look like, and can you store the information in a different way to reduce the number of rows created?
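For example, syncMaxRecords is set through the plugin configuration, as in the setup shared earlier (the value below is illustrative; DataStoreModels, onError, and syncExpressions come from that snippet):

```swift
import Amplify
import AWSDataStorePlugin

// syncMaxRecords caps how many records the base sync pulls down per model
// (the default is 10,000); size it to the largest per-model count you
// expect to need in the local database.
try Amplify.add(plugin: AWSDataStorePlugin(
    modelRegistration: DataStoreModels(),
    configuration: .custom(
        errorHandler: onError,
        syncMaxRecords: 20_000,   // illustrative value
        syncExpressions: syncExpressions
    )
))
```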

@amruth-movano
Author

amruth-movano commented Jun 10, 2024

@lawmicha
We store user health data, and we need to keep all of it so we can present it in graphs and charts, so we cannot reduce the number of rows. We already have the syncMaxRecords key in our code.

So the overall conclusion is:

We won't be able to do batch operations; we have to save records one by one, right?

Also, below is the config from our code:

{
  "version": 1,
  "serviceConfiguration": {
    "apiName": "*****",
    "serviceName": "AppSync",
    "defaultAuthType": {
      "mode": "AMAZON_COGNITO_USER_POOLS",
      "cognitoUserPoolId": "authappUserPool"
    },
    "conflictResolution": {
      "defaultResolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" },
      "perModelResolutionStrategy": [
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "UserPreference" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "DeviceTelemetry" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" },
        { "resolutionStrategy": { "type": "OPTIMISTIC_CONCURRENCY" }, "entityName": "*************" }
      ]
    },
    "additionalAuthTypes": [
      {
        "mode": "API_KEY",
        "expirationTime": 7,
        "apiKeyExpirationDate": "2023-03-13T06:58:17.489Z",
        "keyDescription": "Added *********** Support"
      }
    ]
  }
}

@lawmicha
Contributor

We won't be able to do the batch operations. We have do it one by one only, right?

If you provision your AppSync service with the @model directive in your data modeling schema, then yes: the generated GraphQL operations are create/update/delete mutations, which do not support batching. If you want to bypass DataStore's local persistence, you can send the mutations directly to AppSync. Keep in mind that create mutations are fairly straightforward, but for updates and deletes you will need to include the version in the GraphQL request, which usually comes from querying the model first. Versioning metadata is added to each model when AppSync's conflict resolution strategy is enabled.
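As a hedged sketch, a create sent directly through the API category (bypassing DataStore's local persistence) could look like the following; for updates and deletes against a conflict-resolution-enabled backend, the request would additionally need the model's current `_version`:

```swift
import Amplify

// Create: new records do not need a version, so this is the simple case.
func createDirectly(_ session: SleepSession2) async throws {
    let response = try await Amplify.API.mutate(request: .create(session))
    switch response {
    case .success(let created):
        print("Created SleepSession2 \(created.id)")
    case .failure(let graphQLError):
        throw graphQLError
    }
}
```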

@amruth-movano
Author

@lawmicha
OK, thanks for the info. I think we can close this issue for now.

@lawmicha
Contributor

Feel free to open another issue if you have more questions.


This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.
