Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CR94][Sync] DCHECK in SyncSchedulerImpl::ScheduleLocalRefreshRequest #17938

Closed
AlexeyBarabash opened this issue Sep 8, 2021 · 6 comments · Fixed by brave/brave-core#12184
Closed
Assignees
Labels

Comments

@AlexeyBarabash
Copy link
Contributor

Description

From the brave/brave-core#9558 (comment):

3237513:3237542:0908/153018.831534:FATAL:sync_scheduler_impl.cc(323)] Check failed: !types.Empty(). 
#0 0x7fe48df836c9 base::debug::CollectStackTrace()
#1 0x7fe48de7e723 base::debug::StackTrace::StackTrace()
#2 0x7fe48de9ff54 logging::LogMessage::~LogMessage()
#3 0x7fe48dea099e logging::LogMessage::~LogMessage()
#4 0x5606819953ea syncer::SyncSchedulerImpl::ScheduleLocalRefreshRequest()
#5 0x5606819a961d syncer::SyncManagerImpl::RefreshTypes()
#6 0x560681e6aed2 syncer::SyncEngineBackend::DoStartSyncing()
#7 0x56068128f79e base::internal::Invoker<>::RunOnce()
#8 0x7fe48df15e2d base::TaskAnnotator::RunTask()
#9 0x7fe48df4ee67 base::internal::TaskTracker::RunBlockShutdown()
#10 0x7fe48df4e675 base::internal::TaskTracker::RunTask()
#11 0x7fe48df9d767 base::internal::TaskTrackerPosix::RunTask()
#12 0x7fe48df4dfe4 base::internal::TaskTracker::RunAndPopNextTask()
#13 0x7fe48df59c79 base::internal::WorkerThread::RunWorker()
#14 0x7fe48df5986d base::internal::WorkerThread::RunPooledWorker()
#15 0x7fe48df9e28e base::(anonymous namespace)::ThreadFunc()
#16 0x7fe47d19f609 start_thread
#17 0x7fe47c933293 clone
Task trace:
#0 0x560681e65e72 syncer::SyncEngineImpl::StartSyncingWithServer()
#1 0x560681e6b6d2 syncer::SyncEngineBackend::DoFinishConfigureDataTypes()
#2 0x560681e6675f syncer::SyncEngineImpl::ConfigureDataTypes()
#3 0x560681e6b6d2 syncer::SyncEngineBackend::DoFinishConfigureDataTypes()
#4 0x560681e6675f syncer::SyncEngineImpl::ConfigureDataTypes()
Task trace buffer limit hit, update PendingTask::kTaskBacktraceLength to increase.

Steps to Reproduce

  1. Take debug build
  2. Create chain on deviceA
  3. Connect deviceB to the chain with the passphrase

Actual result:

  1. Crash happens on failed DCHECK

Expected result:

  1. No crash

Reproduces how often:

Easily reproduced

Desktop Brave version:

Brave 1.31.24 Chromium: 94.0.4606.31 (Developer Build) (64-bit)
Revision 18bcff0168997e62fbbcd38ac5176f659b327761-refs/branch-heads/4606@{#584}
OS Linux
@AlexeyBarabash AlexeyBarabash added OS/Android Fixes related to Android browser functionality OS/Desktop labels Sep 8, 2021
@AlexeyBarabash AlexeyBarabash self-assigned this Sep 8, 2021
@AlexeyBarabash
Copy link
Contributor Author

With commented out DCHECK the issue does not happen.
On Chromium 93 the DCHECK is at the same place, but it is not triggered.

@AlexeyBarabash
Copy link
Contributor Author

cc @jsecretan , @darkdh

@AlexeyBarabash
Copy link
Contributor Author

Something weird happened. According to the stack, SyncEngineBackend::DoStartSyncing has invoked SyncManagerImpl::RefreshTypes, but this is impossible:

void SyncEngineBackend::DoStartSyncing(base::Time last_poll_time) {
  DCHECK_CALLED_ON_VALID_SEQUENCE(sequence_checker_);
  sync_manager_->StartSyncingNormally(last_poll_time);
}

I will try to re-build without ccache

@AlexeyBarabash
Copy link
Contributor Author

Yes, with clean and ccache-less build there are no any issues. This was an accidental false alarm.

@AlexeyBarabash
Copy link
Contributor Author

I can see this error again on master with merged cr94, even with proper build:

[3035677:3035709:0915/200914.904498:FATAL:sync_scheduler_impl.cc(323)] Check failed: !types.Empty(). 
#0 0x7f9653bc96c9 base::debug::CollectStackTrace()
#1 0x7f9653ac4723 base::debug::StackTrace::StackTrace()
#2 0x7f9653ae5f54 logging::LogMessage::~LogMessage()
#3 0x7f9653ae699e logging::LogMessage::~LogMessage()
#4 0x56505835ecca syncer::SyncSchedulerImpl::ScheduleLocalRefreshRequest()
#5 0x565058372efd syncer::SyncManagerImpl::RefreshTypes()
#6 0x5650588347c2 syncer::SyncEngineBackend::DoStartSyncing()
#7 0x565057c57dee base::internal::Invoker<>::RunOnce()
#8 0x7f9653b5be2d base::TaskAnnotator::RunTask()
#9 0x7f9653b94e67 base::internal::TaskTracker::RunBlockShutdown()
#10 0x7f9653b94675 base::internal::TaskTracker::RunTask()
#11 0x7f9653be3767 base::internal::TaskTrackerPosix::RunTask()
#12 0x7f9653b93fe4 base::internal::TaskTracker::RunAndPopNextTask()
#13 0x7f9653b9fc79 base::internal::WorkerThread::RunWorker()
#14 0x7f9653b9f86d base::internal::WorkerThread::RunPooledWorker()
#15 0x7f9653be428e base::(anonymous namespace)::ThreadFunc()
#16 0x7f9642de1609 start_thread
#17 0x7f9642575293 clone
Task trace:
#0 0x56505882f762 syncer::SyncEngineImpl::StartSyncingWithServer()
#1 0x565058834fc2 syncer::SyncEngineBackend::DoFinishConfigureDataTypes()
#2 0x56505883004f syncer::SyncEngineImpl::ConfigureDataTypes()
#3 0x565058834fc2 syncer::SyncEngineBackend::DoFinishConfigureDataTypes()
#4 0x56505883004f syncer::SyncEngineImpl::ConfigureDataTypes()
Task trace buffer limit hit, update PendingTask::kTaskBacktraceLength to increase.

Received signal 6
#0 0x7f9653bc96c9 base::debug::CollectStackTrace()
#1 0x7f9653ac4723 base::debug::StackTrace::StackTrace()
#2 0x7f9653bc9151 base::debug::(anonymous namespace)::StackDumpSignalHandler()
#3 0x7f9642ded3c0 (/usr/lib/x86_64-linux-gnu/libpthread-2.31.so+0x153bf)
#4 0x7f964249918b gsignal
#5 0x7f9642478859 abort
#6 0x7f9653bc82d5 base::debug::BreakDebugger()
#7 0x7f9653ae63a1 logging::LogMessage::~LogMessage()
#8 0x7f9653ae699e logging::LogMessage::~LogMessage()
#9 0x56505835ecca syncer::SyncSchedulerImpl::ScheduleLocalRefreshRequest()
#10 0x565058372efd syncer::SyncManagerImpl::RefreshTypes()
#11 0x5650588347c2 syncer::SyncEngineBackend::DoStartSyncing()
#12 0x565057c57dee base::internal::Invoker<>::RunOnce()
#13 0x7f9653b5be2d base::TaskAnnotator::RunTask()
#14 0x7f9653b94e67 base::internal::TaskTracker::RunBlockShutdown()
#15 0x7f9653b94675 base::internal::TaskTracker::RunTask()
#16 0x7f9653be3767 base::internal::TaskTrackerPosix::RunTask()
#17 0x7f9653b93fe4 base::internal::TaskTracker::RunAndPopNextTask()
#18 0x7f9653b9fc79 base::internal::WorkerThread::RunWorker()
#19 0x7f9653b9f86d base::internal::WorkerThread::RunPooledWorker()
#20 0x7f9653be428e base::(anonymous namespace)::ThreadFunc()
#21 0x7f9642de1609 start_thread
#22 0x7f9642575293 clone
  r8: 0000000000000000  r9: 00007f962fc5e5e0 r10: 0000000000000008 r11: 0000000000000246
 r12: 00002f016597fc80 r13: 0000000000000065 r14: aaaaaaaaaaaaaaaa r15: 00007f962fc5f060
  di: 0000000000000002  si: 00007f962fc5e5e0  bp: 00007f962fc5e830  bx: 00007f962fc61700
  dx: 0000000000000000  ax: 0000000000000000  cx: 00007f964249918b  sp: 00007f962fc5e5e0
  ip: 00007f964249918b efl: 0000000000000246 cgf: 002b000000000033 erf: 0000000000000000
 trp: 0000000000000000 msk: 0000000000000000 cr2: 0000000000000000

Looks like caused by https://chromium-review.googlesource.com/c/chromium/src/+/3055357/3/components/sync/engine/sync_manager_impl.cc

void SyncManagerImpl::RefreshTypes(ModelTypeSet types) {
  DCHECK_CALLED_ON_VALID_SEQUENCE(sequence_checker_);

  const ModelTypeSet types_to_refresh =
      Intersection(types, model_type_registry_->GetConnectedTypes());

  if (!types.Empty()) {
    scheduler_->ScheduleLocalRefreshRequest(types_to_refresh);
  }
}
...
void SyncSchedulerImpl::ScheduleLocalRefreshRequest(ModelTypeSet types) {
  DCHECK_CALLED_ON_VALID_SEQUENCE(sequence_checker_);
  DCHECK(!types.Empty());
...

Which doesn't have much sense and looks like a typo.
Posting issue to Chromium bugtracker is in progress.

@szilardszaloki
Copy link
Contributor

szilardszaloki commented Feb 8, 2022

The above issue is easily reproducible with just disabling the default enabled type (Bookmarks).
So STR would be:

  1. new profile
  2. turn on Sync
  3. disable Bookmarks

The problem is that in BraveSyncManagerImpl::StartSyncingNormally(), we pass a hardcoded ModelTypeSet (ModelTypeSet(BOOKMARKS)) to SyncManagerImpl::RefreshTypes(), therefore types is always Bookmarks. model_type_registry_->GetConnectedTypes() depends on what we have enabled, but by default it's Bookmarks, Device Info, User Consents, Security Events, Sharing Message, Encryption Keys. Since the intersection is Bookmarks, we schedule a nudge with a non-empty ModelTypeSet, so the DCHECK in SyncSchedulerImpl::ScheduleLocalRefreshRequest() passes. If we disable Bookmarks, the intersection becomes empty, hence we end up with the DCHECK failing.

There's two things that we have to fix here:

  1. the Empty() check in SyncManagerImpl::RefreshTypes() needs to be done on the intersection (actually I went ahead and fixed this in Chromium), which, although resolves the DCHECK issue, would introduce another one:
    1.1. new profile
    1.2. turn on Sync
    1.3. disable Bookmarks
    1.4. enable Passwords
    1.5. restart browser
    1.6. search for polling delay in logs ([17096:14408:0201/171159.419:VERBOSE1:sync_scheduler_impl.cc(568)] Default: Updating polling delay to 479 minutes.)
    The above leaves the polling delay at ~480 minutes (unless you have a pending nudge job going on already, but that's unlikely, since you've just restarted the browser).
  2. the root cause of the above is in BraveSyncManagerImpl::StartSyncingNormally() and can be fixed like so:
void BraveSyncManagerImpl::StartSyncingNormally(base::Time last_poll_time) {
  SyncManagerImpl::StartSyncingNormally(last_poll_time);
  RefreshTypes(ModelTypeSet::All());
}

Currently (that is, with !types.Empty() in SyncManagerImpl::RefreshTypes()), it resolves the DCHECK issue, since the intersection can never become empty. On the other hand, it will also prevent the polling delay issue once the fix (!types.Empty() ==> !types_to_refresh.empty()) reaches the version of Chromium we are on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants