Remove retriving bookmarks function #1160

Merged 1 commit on Sep 23, 2022

Conversation

YukihoAA
Contributor

@YukihoAA YukihoAA commented Sep 10, 2022

For #1159.

This removes the separate retrieval of bookmark information to prevent bans.

It works fine in tags mode, but I don't know why the code fetches bookmarks separately before downloading...

@Nandaka
Owner

Nandaka commented Sep 10, 2022

It is there to skip a post when a bookmark limit is defined and you download by tags. If you remove this code, all images will be downloaded even when you set a limit.

If you want to skip this bookmark data retrieval, then just set the limit to 0/None from the console input.

https://github.com/Nandaka/PixivUtil2/blob/master/PixivUtil2.py#L366
https://github.com/Nandaka/PixivUtil2/blob/master/PixivUtil2.py#L371
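
The filtering rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from PixivUtil2.py: the function name `should_skip` and the strict `<` comparison are assumptions made for the example.

```python
from typing import Optional


def should_skip(bookmark_count: int, bookmark_limit: Optional[int]) -> bool:
    """Return True when a post falls below the configured bookmark limit.

    Hypothetical helper: a limit of 0 or None means "no filtering",
    so nothing is skipped and no bookmark data would be needed.
    """
    if not bookmark_limit:  # covers both 0 and None
        return False
    # Assumption: posts with fewer bookmarks than the limit are skipped.
    return bookmark_count < bookmark_limit
```

With a limit of 0 or None the check short-circuits, which is why setting the limit that way avoids the bookmark lookup entirely.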

@Nandaka Nandaka closed this Sep 10, 2022
@YukihoAA
Contributor Author

YukihoAA commented Sep 11, 2022

@Nandaka

Bookmark skipping and grouping work fine in image mode because there is a skipping function in PixivImageHandler.py (L174).

The reason I had to remove this function is that retrieving bookmark data causes an IP ban from pixiv. We already get the bookmark data during image processing, so there is no need to fetch it beforehand.

It actually makes downloads faster, because it skips fetching bookmark data for posts that are already in the database.
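
The ordering argued for here (check the database first, and only then load the post and look at its bookmark count) could be sketched like this. All names are hypothetical and the fetch function is passed in as a stand-in; this is not the PixivImageHandler code.

```python
from typing import Callable, Iterable, List, Set


def process_posts(
    post_ids: Iterable[int],
    known_ids: Set[int],
    fetch_bookmark_count: Callable[[int], int],
    bookmark_limit: int,
) -> List[int]:
    """Return the ids that would be downloaded (illustrative only)."""
    to_download = []
    for post_id in post_ids:
        # Posts already in the database are skipped without any request.
        if post_id in known_ids:
            continue
        # Only now is the post loaded, which also yields its bookmark count.
        count = fetch_bookmark_count(post_id)
        # Assumption: a limit of 0 disables filtering; lower counts are skipped.
        if bookmark_limit and count < bookmark_limit:
            continue
        to_download.append(post_id)
    return to_download
```

In this arrangement a post that is already known never triggers a bookmark request, which is the saving described above.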

@Nandaka
Owner

Nandaka commented Sep 11, 2022

But if the image doesn't exist in the DB, it still needs to get the bookmark data. Only registered premium users have that data in the JSON response, I think?

# only premium support server-side filtering for bookmark count

FYI, the bookmark data in the server response is empty,

and it is only available when you load the page, which I think is more expensive to run? You still need to get the page anyway if you want to filter by bookmark count.

Maybe you can add logic to skip this if you are a premium user? e.g. if use_bookmark_data and not self._isPremium:

if use_bookmark_data:
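
As a rough sketch of the suggested condition, isolated from the surrounding class: the function name and the `is_premium` parameter are stand-ins for illustration (the thread refers to `self._isPremium`), and the premise that premium accounts get server-side filtering is taken from the comment quoted above.

```python
def needs_separate_bookmark_fetch(use_bookmark_data: bool, is_premium: bool) -> bool:
    """Decide whether bookmark counts must be fetched in a separate pass.

    Assumption from the discussion: premium accounts can filter by
    bookmark count on the server, so the extra pass is only needed
    for non-premium accounts that have a bookmark limit set.
    """
    return use_bookmark_data and not is_premium
```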

@YukihoAA
Contributor Author

YukihoAA commented Sep 11, 2022

I'm not using premium right now, and I am using the bookmark-count skip feature.

PixivUtil fetches bookmark data for users who are not on premium.

This code was meant to allow bookmark-count skipping without getting IP banned (issue #1159).
With version 20220825, if around 120 images are skipped in one run, pixiv gives me an IP ban.

I tried adding a 2-second wait to every iteration of the bookmark-retrieval loop using wait(result, self.__config),
but it did not work properly and took too much time (because images that are going to be downloaded end up waiting twice) :(

So my idea was to just use the functions in PixivImageHandler to add some delay and prevent the ban.

So... this code will use more CPU time but less wall-clock time spent in wait().

This is not a perfect solution, so take it just as a reference.
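
The "double wait" point can be illustrated with a little arithmetic. This is a simplified model, not a measurement: it assumes one wait per post in the bookmark loop plus one more wait for each downloaded post in the first case, and a single wait per post otherwise. The function and parameter names are made up for the example.

```python
def total_wait_seconds(
    total_posts: int,
    downloaded_posts: int,
    delay: float,
    wait_in_bookmark_loop: bool,
) -> float:
    """Estimate the time spent purely in waits under a simplified model."""
    if wait_in_bookmark_loop:
        # Every post waits once in the bookmark loop, and every
        # downloaded post waits a second time when it is processed.
        waits = total_posts + downloaded_posts
    else:
        # Assumption: each post waits only once, during image processing.
        waits = total_posts
    return waits * delay


# Example: 100 posts, 80 of them downloaded, 2-second delay.
with_extra_wait = total_wait_seconds(100, 80, 2.0, True)
single_wait = total_wait_seconds(100, 80, 2.0, False)
```

The gap between the two figures grows with the share of posts that are actually downloaded.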

@Nandaka
Owner

Nandaka commented Sep 11, 2022

Can you confirm after running it for some time? After the change, it will always fetch the image page and then skip the post if the bookmark count is lower, which in turn will currently cause 429 errors because it requests too many image pages too often.

@YukihoAA
Contributor Author

YukihoAA commented Sep 11, 2022

With v20220825: I got banned after 120~180 images were skipped by bookmark retrieval, with a 5 sec delay.

After the edit: it worked fine (~2 hr running) with a 5 sec delay.

It was also fine with a 4 sec delay (tested for 20 min).

I ran the same tag and options that I used before, so almost all images were skipped.

A 2 sec delay does not work in either case.

@YukihoAA
Contributor Author

3 sec seems fine too (1 hr test) - @Nandaka

@Nandaka Nandaka reopened this Sep 23, 2022
@Nandaka Nandaka merged commit 3927a42 into Nandaka:master Sep 23, 2022
@Nandaka
Owner

Nandaka commented Sep 23, 2022

Sorry for the late reply, I just got back. Anyway, it is merged now.

Nandaka added a commit that referenced this pull request Sep 24, 2022
Nandaka added a commit that referenced this pull request Sep 24, 2022