Sudden FAILURE for every download #202
Hi. I'm surprised that there is no additional logging for this. The log file in the root folder is always at debug log level. Is there any additional output in there?
If not, I'll push a version with some additional debugging to help you diagnose.
Thanks for the very quick response. There is no additional output in stdout, gphotos.log, or gphotos.trace. stdout and gphotos.log match, and gphotos.trace only has the result of the single search it runs at the beginning (mediaItems:search responds with 200); there is no further output during these FAILURE downloads. Here is the full command and output:
OK, I'll drop some extra debugging in this evening. I'll let you know when it is pushed.
I added
It looks like they do rate limiting on IP addresses as well:
Because any other IP address works fine:
Perhaps I shouldn't have run this at 128 threads at 870 Mbit/s for nearly 3 hours. :) Judging by the code base, there is no code that specifically handles 429 responses. It would be a good idea to throttle and slow down, to keep Google from placing more severe limits on the IP address:
Your first raise causes find_bad_items() to be invoked, and that call needs to be removed since it was there to handle an earlier issue in the Google API that has since been fixed. My error handling in this bit of the code is rather poor, and your second fix helps.

I have never seen a quota failure like this before. I've downloaded my entire 100,000-item library multiple times to the same PC. I'm at work right now but will do a little more digging this evening. Oh wait, I just saw your last bit about threads and Mbit/s - yeah, I have not tested at anywhere near that rate!

I will put some better error handling in the code, at least so that it reports the issue correctly. I could add the suggested throttling too, but I would not be able to real-world test it! I originally added the --max-threads option to help people throttle their bandwidth and had not anticipated your use case. Good to know that gphotos-sync (almost) held it together under such load. :-)
I have a library of 2.9 TB, I just wanted to speed things up :D

For a quick fix, just sleep all HTTP requests for 30 seconds whenever you hit a 429 and double the sleep duration until you get a 200 again. At that point, you can reset it to 30 seconds again. To real-world test it, just lower the quota on your credentials.

rclone has implemented this pretty well. Here are the HTTP codes they apply this kind of throttling to: https://github.com/rclone/rclone/blob/29b4f211ab95b42b7544103984f025eab3281b2a/backend/googlephotos/googlephotos.go#L206
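For illustration, something along these lines would do it (a minimal sketch of the idea only, not code from gphotos-sync; the helper name and timings are made up):

```python
import time

import requests


def get_with_backoff(url, initial_delay=30, max_retries=6):
    """Retry a GET on HTTP 429, doubling the sleep each time (exponential backoff)."""
    delay = initial_delay
    for _ in range(max_retries):
        response = requests.get(url)
        if response.status_code != 429:
            # Any non-429 answer (success or a different error) ends the backoff.
            return response
        time.sleep(delay)  # wait before retrying
        delay *= 2         # 30 s, 60 s, 120 s, ...
    raise RuntimeError(f"still rate limited after {max_retries} retries: {url}")
```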
That is a serious library.
Do you have any interest in contributing the fix? You are the only user I know of that could actually test it!
OK, thanks for the excellent diagnosis and useful info. I'll take a look at this sometime this week and get you to try it out.
Just found out one more quirk: no matter which IP address I use, the failures continue. I'm gonna wait 12-24 hours and try again. :)
Did it come back to life after a day?
It did. It kept giving failures for roughly 12 hours afterwards, but I think it reset at midnight Pacific time (which is also when the API quotas reset), after which there were no more failures. :)
Cool.
I believe the reason for the long blacklisting is that the initial 429 errors were ignored. That's why Google recommends "exponential backoff", where the sleeps get longer and longer until the 429s stop.
OK, that makes sense. I'll add the backoff soon.
The latest push of the code in the branch now has the backoff switched on. Not much credit is due to me, since it turns out I just needed to enable a feature that is already in the Python urllib3 library.

I'm not sure if you are interested in gphotos-sync anymore now that you are aware of the video transcoding issue. But if you are, please can you try a little stress test?

Thanks, giles.
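For reference, enabling that kind of retry behaviour with urllib3 generally looks like the sketch below (the exact Retry settings used in the branch may differ; the numbers here are illustrative):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry 429 (and transient 5xx) responses with exponential backoff between attempts.
retries = Retry(
    total=5,
    backoff_factor=5,  # urllib3 roughly doubles the sleep on each retry
    status_forcelist=[429, 500, 502, 503, 504],
    respect_retry_after_header=True,  # honour Retry-After if the server sends it
)

session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retries))

# Any request made through this session now backs off automatically on 429.
```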
As soon as my bandwidth cap has reset for this month, I'll give it another go and let you know. Feel free to send me a reminder if I haven't got back to you by the end of next week. Enjoy your weekend! :)
@ozupey please can you try out that stress test if you have time? |
@gilesknap Did I do anything wrong? It just instantly dies now:
So you have tripped the 429 on the API call rather than on the Base URL. I have not implemented the 429 handling for the API calls (but probably should, and it is easy to do). I implemented it in the download code, which uses the Base URL quota (this is the one you bumped into, I believe).

Now the other problem is that the quotas you can adjust on the API console are on a daily basis. There is also an underlying rate quota which does not appear to be configurable. I don't really think the request backoff can deal with hitting the daily quota; it is intended to deal with the rate quotas. Apparently these are 10/s per IP address and 100/s per user - at least for the APIs; maybe the Base URL rate quota is higher. If you hit the daily quota and try to back off, you would need to do so for several hours until the day rolls over.

Having written this all down, I have convinced myself that you probably never hit the rate quota, but did hit the daily quota. I assert that in this case aborting with a sensible error is the best we can do. Perhaps you could verify that you are happy with that by switching 'All Requests' to a high number and leaving 'BaseUrl requests' at 1. Thanks
Closing this since I think we have done what we can. Reopen if you have any more to add. Thanks for an interesting issue!
Just so you know, I just hit the 429s again on the latest version installed via pip(env). This time I stuck to all the default arguments, including
Am I correct in saying that once you hit this, you have to wait until the next day? If that is correct, then the only thing we can do is cap the maximum download rate to match Google's limits.
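If a hard cap does turn out to be the answer, a very simple client-side limiter could look like this (a sketch only, not something gphotos-sync currently does; the 10 requests/second figure is just the per-IP rate quota mentioned above and would need checking against the Base URL limits):

```python
import threading
import time


class RateLimiter:
    """Allow at most max_per_second calls per second, shared across download threads."""

    def __init__(self, max_per_second):
        self.min_interval = 1.0 / max_per_second
        self.lock = threading.Lock()
        self.last_call = 0.0

    def wait(self):
        # Serialise callers so that consecutive requests are spaced out.
        with self.lock:
            sleep_for = self.last_call + self.min_interval - time.monotonic()
            if sleep_for > 0:
                time.sleep(sleep_for)
            self.last_call = time.monotonic()


limiter = RateLimiter(max_per_second=10)  # hypothetical cap
# limiter.wait() would be called before each download request
```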
I'm not sure how long it will take until it resets again. As mentioned before, Google expects "exponential backoff". This means that each time you hit a 429, you should wait longer and longer before you make the next HTTP request:
Looking at the logs, gphotos-sync just skipped the photo instantly and moved on to the next one without any delay whatsoever. This makes the block longer and longer and is the opposite of what Google recommends:
So, ideally, as soon as you hit a 429, wait 30 seconds and try again. If that fails, wait 60 seconds and try again. If that fails, wait 120 seconds and try again. As soon as the 429s vanish, you can reset the delay.
I don't think you'll see any retries in the logs, since I'm letting the underlying HTTP library handle the 429 errors.

It is my understanding that the Google API call quotas have a calls-per-second quota as well as a calls-per-day cap, so 429 handling should work for API calls. However, I believe that the download URLs only have a daily quota, and therefore once that quota is exceeded you are out of luck.

I will confess that I have not traced the code or the network to check that the library is correctly handling 429s, and I don't have the bandwidth to do a genuine test. However, I might be able to do some unit testing using https://httpbin.org/status/429 or by generating my own 429 responses.
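A rough sketch of such a unit test is below (it hits the real httpbin endpoint, so it needs network access; the test name and Retry settings are illustrative, not taken from the project's test suite):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def test_429_responses_are_retried_then_surfaced():
    # Small backoff_factor so the test runs quickly.
    retries = Retry(total=2, backoff_factor=0.1, status_forcelist=[429])
    session = requests.Session()
    session.mount("https://", HTTPAdapter(max_retries=retries))

    # httpbin always answers 429 here, so the retries are exhausted and
    # urllib3's failure is surfaced as a requests RetryError.
    try:
        session.get("https://httpbin.org/status/429")
        assert False, "expected the retries to be exhausted"
    except requests.exceptions.RetryError:
        pass
```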
First of all, thanks for this great project. :)
After downloading 75,450 photos and videos successfully, every attempt is now returning an instant FAILURE. The quotas on the Google Console only show about 30% usage, but to rule this out, I created a new client ID with no quota usage at all, and it keeps happening.
Unfortunately, even at trace log level, nothing gets appended to gphotos.trace for these download attempts. If I search for these files on photos.google.com I can see and download them perfectly fine. I've also made sure the partition still has free space (3.7 TB) and that I can write files to it without any problem.
Running version: 2.14.0, database schema version 5.7.
Any tips on how to debug this?