opt(restore): Sort the buffer before spinning the writeToDisk goroutine #7984

Merged
ahsanbarkati merged 1 commit into master from ahsan/fix-restore-bottleneck
Aug 12, 2021

@ahsanbarkati ahsanbarkati commented Aug 12, 2021

Sort the buffer beforehand instead of sorting it in the goroutine that
writes the buffer to disk. The writeToDisk goroutines are throttled, so making
each one more expensive causes other goroutines to block.

This change significantly improves the restore map phase.
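
For readers unfamiliar with the pattern, here is a minimal sketch of the idea, not the actual Dgraph code. It assumes a throttle built from a buffered channel; `flushAll`, `maxWriters`, and this `writeToDisk` stand-in are hypothetical names for illustration. The sort runs in the calling goroutine before a throttle slot is taken, so a slot is held only for the write itself.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// writeToDisk is a hypothetical stand-in for the real writer. It does no I/O
// and only reports whether the buffer it received was already sorted.
func writeToDisk(buf []string) bool {
	return sort.StringsAreSorted(buf)
}

// flushAll sorts each buffer in the calling goroutine and then hands it to a
// throttled writer goroutine. At most maxWriters writers run at once
// (maxWriters is an assumed parameter, not taken from the PR).
func flushAll(buffers [][]string, maxWriters int) []bool {
	throttle := make(chan struct{}, maxWriters)
	results := make([]bool, len(buffers))
	var wg sync.WaitGroup

	for i, buf := range buffers {
		// Sort before acquiring a throttle slot, so the slot is not
		// occupied while the CPU-heavy sort runs.
		sort.Strings(buf)

		throttle <- struct{}{} // blocks while maxWriters writers are active
		wg.Add(1)
		go func(i int, buf []string) {
			defer wg.Done()
			defer func() { <-throttle }() // release the slot when done
			results[i] = writeToDisk(buf)
		}(i, buf)
	}

	wg.Wait()
	return results
}

func main() {
	buffers := [][]string{{"c", "a", "b"}, {"z", "y"}, {}}
	fmt.Println(flushAll(buffers, 2))
}
```

In this sketch every writer receives an already sorted buffer, so the time a goroutine spends holding a throttle slot depends only on the write.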


@manishrjain manishrjain left a comment

:lgtm: Awesome change!

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @ahsanbarkati)

@ahsanbarkati ahsanbarkati merged commit 1966245 into master Aug 12, 2021
@ahsanbarkati ahsanbarkati deleted the ahsan/fix-restore-bottleneck branch August 12, 2021 20:27
danielmai pushed a commit that referenced this pull request Aug 24, 2021
…ne (#7984)

Sort the buffer beforehand instead of sorting it in the goroutine used for
writing the buffer to disk. The writeToDisk goroutines are throttled and
making it expensive causes other goroutines to block.

This change significantly improves the restore map phase.

(cherry picked from commit 1966245)
danielmai added a commit that referenced this pull request Aug 24, 2021
…ne (#7984) (#7996)

(cherry picked from commit 1966245)

Co-authored-by: Ahsan Barkati <ahsan@dgraph.io>
mangalaman93 pushed a commit that referenced this pull request Jan 4, 2023
This commit is a major rewrite of backup and online restore
code. It used to use KVLoader in badger. Now it instead uses
StreamWriter that is much faster for writes.

cherry-pick PR #7753

following commits are cherry-picked (in reverse order):
 * opt(restore): Sort the buffer before spinning the writeToDisk goroutine (#7984) (#7996)
 * fix(backup): Fix full backup request (#7932) (#7933)
 * fix: fixing graphql schema update when the data is restored +
        skipping /probe/graphql from audit (#7925)
 * fix(restore): return nil if there is error (#7899)
 * Don't ban namespace in export_backup
 * reset the kv.StreamId before sending to stream writer (#7833) (#7837)
 * fix(restore): Bump uid and namespace after restore (#7790) (#7800)
 * fix(ee): GetKeys should return an error (#7713) (#7797)
 * fix(backup): Free the UidPack after use (#7786)
 * fix(export-backup): Fix double free in export backup (#7780) (#7783)
 * fix(lsbackup): Fix profiler in lsBackup (#7729)
 * Bring back "perf(Backup): Improve backup performance (#7601)"
 * Opt(Backup): Make backups faster (#7680)
 * Fix s3 backup copy (#7669)
 * [BREAKING] Opt(Restore): Optimize Restore's new map-reduce based design (#7666)
 * Perf(restore): Implement map-reduce based restore (#7664)
 * feat(backup): Merge backup refactoring
 * Revert "perf(Backup): Improve backup performance (#7601)"
mangalaman93 pushed a commit that referenced this pull request Jan 6, 2023
mangalaman93 pushed a commit that referenced this pull request Jan 17, 2023
mangalaman93 pushed a commit that referenced this pull request Jan 17, 2023
mangalaman93 pushed a commit that referenced this pull request Jan 17, 2023
mangalaman93 pushed a commit that referenced this pull request Jan 18, 2023
all-seeing-code pushed a commit that referenced this pull request Jan 23, 2023
all-seeing-code pushed a commit that referenced this pull request Jan 23, 2023