Consider retrieving object summaries individually #10

Open
martinklepsch opened this issue Dec 7, 2015 · 4 comments
Comments

@martinklepsch
Member

martinklepsch commented Dec 7, 2015

Currently we download object summaries instead of getting each object's data with an individual request.

  1. This is faster when we need sync information for a lot of objects.
  2. As the number of objects in the bucket grows (into the thousands), this gets slower.
  3. If the number of objects to sync is small, retrieving their data individually might be faster.
  4. Retrieving objects individually will not be enough when pruning the bucket, since pruning needs to know about every key in the bucket.

Getting each object's data individually, as mentioned in 3, would also allow diffing and syncing of metadata.
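To make points 1 to 3 a bit more concrete, here is a rough request-count model. It is only a sketch: the function names are made up for illustration, it counts requests rather than wall-clock time, and it assumes that listing has to page through the whole bucket. The one hard fact it relies on is that an S3 list response returns at most 1000 keys per request.

```python
import math

# S3 list responses return at most 1000 keys per request.
LIST_PAGE_SIZE = 1000


def list_requests(bucket_size: int) -> int:
    """Requests needed to page through summaries for the whole bucket."""
    return max(1, math.ceil(bucket_size / LIST_PAGE_SIZE))


def individual_requests(objects_to_sync: int) -> int:
    """One request per object we actually want to sync."""
    return objects_to_sync


def fewer_requests(bucket_size: int, objects_to_sync: int) -> str:
    """Name the approach that issues fewer requests (ties go to summaries)."""
    if individual_requests(objects_to_sync) < list_requests(bucket_size):
        return "individual"
    return "summaries"
```

Under this model, summaries win whenever the bucket fits in a few list pages, and individual requests only start to look attractive when a handful of objects is synced against a very large bucket. Real latency per request will shift that boundary.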

There is no clear right way in this case. I see the following options:

  • Add some logic that decides which approach to use
  • Add an option that allows users to decide how to get bucket information

I think adding logic is opaque and might confuse users, so I'm thinking the latter option is best.
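A minimal sketch of what the user-facing option could look like, assuming a plain dict as a stand-in for the bucket. All names here (`remote_info`, `strategy`, the two strategy functions) are hypothetical and not part of the project; the point is only that the choice is explicit and visible to the user.

```python
from typing import Callable, Dict, Iterable

# Hypothetical stand-in for a bucket: key -> {"etag": ..., "metadata": {...}}
Bucket = Dict[str, dict]


def via_summaries(bucket: Bucket, keys: Iterable[str]) -> Dict[str, dict]:
    """Describe every object in the bucket; summaries carry no metadata."""
    return {k: {"etag": v["etag"]} for k, v in bucket.items()}


def via_individual(bucket: Bucket, keys: Iterable[str]) -> Dict[str, dict]:
    """Describe only the requested keys, including their metadata."""
    return {k: dict(bucket[k]) for k in keys if k in bucket}


STRATEGIES: Dict[str, Callable[[Bucket, Iterable[str]], Dict[str, dict]]] = {
    "summaries": via_summaries,
    "individual": via_individual,
}


def remote_info(bucket: Bucket, keys: Iterable[str],
                strategy: str = "summaries") -> Dict[str, dict]:
    """Fetch remote information using the strategy the user asked for."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return STRATEGIES[strategy](bucket, keys)
```

Note that the `"individual"` result never contains keys we did not ask about, which is why it cannot drive pruning on its own, while the `"summaries"` result lacks metadata.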

@podviaznikov any opinion to offer?

/via #9

@podviaznikov
Contributor

I wonder what the time difference would be for, say, 100 objects?
You can send individual requests in parallel, right?
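For illustration, sending the individual requests in parallel could look roughly like this. It is a sketch, not the project's code: `fetch_object_data` is a hypothetical placeholder that just sleeps to mimic request latency, and the worker count is an arbitrary assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def fetch_object_data(key: str, latency: float = 0.05) -> Dict[str, str]:
    """Hypothetical stand-in for one per-object request; sleeps to mimic latency."""
    time.sleep(latency)
    return {"key": key}


def fetch_all(keys: List[str], max_workers: int = 16,
              latency: float = 0.05) -> List[Dict[str, str]]:
    """Issue the per-object requests concurrently, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input keys.
        return list(pool.map(lambda k: fetch_object_data(k, latency), keys))
```

Threads suit this case because the work is dominated by waiting on the network. How much it helps in practice depends on how many concurrent requests the client and the service tolerate.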

@martinklepsch
Member Author

Will need to check that.

@martinklepsch
Member Author

Timings for 400 objects without any parallel processing:

  • 65s for getting data individually per object
  • 1s for getting object summaries
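A back-of-the-envelope extrapolation from these two numbers, as a sketch only. It assumes the per-object cost is constant, that the summaries request stays at about 1s, and that parallel workers scale perfectly, none of which has been measured here.

```python
import math

# Figures reported above for 400 objects, no parallelism.
OBJECTS = 400
INDIVIDUAL_TOTAL_S = 65.0
SUMMARIES_TOTAL_S = 1.0

# Average cost of one individual request, in seconds.
per_object_s = INDIVIDUAL_TOTAL_S / OBJECTS

# Number of objects at which individual requests cost as much as one listing.
break_even = SUMMARIES_TOTAL_S / per_object_s


def estimated_individual_time(n: int, workers: int = 1) -> float:
    """Idealized estimate: requests run in batches of `workers` with no overhead."""
    return math.ceil(n / workers) * per_object_s


print(f"per object: {per_object_s:.4f}s, break-even: {break_even:.1f} objects")
print(f"100 objects, sequential: {estimated_individual_time(100):.2f}s")
print(f"400 objects, 16 workers: {estimated_individual_time(400, 16):.2f}s")
```

On these assumptions, individual requests only beat the summaries approach for roughly half a dozen objects or fewer, and even an idealized 16 workers would not close the gap for 400 objects.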

@martinklepsch
Member Author

I did some very basic work towards this in the individual-diff branch. I'll release a 0.1.0 without it now and then we can cut a release with this later when it's done.
