FANBOX download enhancement #680

Merged
merged 23 commits into from
Apr 28, 2020

bluerthanever
Contributor

@bluerthanever bluerthanever commented Apr 27, 2020

  1. Some minor formatting changes made by PyCharm
  2. Moved the code that downloads FANBOX cover images from processFanboxArtist to processFanboxImages, because I don't think cover images should be downloaded if the post is not accessible
  3. Post information (mainly just artist_id, post_id, title, feeRequired, workDate, updatedDate, type) is written to the database as the first step inside processFanboxImages. The post's row is then read back so the newly retrieved updated date can be compared with the one stored in the database. The post is not processed if it is restricted or if a download history exists.
  4. After everything is done (images downloaded and image info written to file), the updatedDate in the database is updated
  5. Added an entry for downloading FANBOX posts by post ID, and some related logic in the option parser

* Modification and implementation for downloading FANBOX post

1. Imported `FanboxPost`
2. Moved the code that obtains post JSON into a single method, `fanboxGetPost`
3. If `member_id` is passed when calling `fanboxGetPost`, the JSON object is returned
4. If no `member_id` is passed when calling `fanboxGetPost`, the member ID and name are retrieved from the post JSON, a new `FanboxArtist` instance is created via `object().__new__(FanboxArtist)`, and a new `FanboxPost` instance is created and returned.
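
For anyone unfamiliar with the `object().__new__(FanboxArtist)` call in item 4: it allocates an instance without running `__init__`, so the attributes can be filled in from the post JSON afterwards. A minimal sketch, where the stand-in `FanboxArtist` class, the helper `artist_from_post_json`, and the JSON keys are all assumptions for illustration and not the project's real code:

```python
class FanboxArtist:
    """Stand-in class; the real constructor in the project may differ."""

    def __init__(self, page):
        # Pretend the normal constructor insists on a JSON page to parse.
        raise RuntimeError("constructor needs a JSON page")


def artist_from_post_json(post_json):
    # object.__new__ allocates the instance and skips __init__ entirely.
    artist = object.__new__(FanboxArtist)
    # Hypothetical keys, chosen only for this example.
    artist.artistId = int(post_json["user"]["userId"])
    artist.artistName = post_json["user"]["name"]
    return artist


artist = artist_from_post_json({"user": {"userId": "123", "name": "someone"}})
print(type(artist).__name__, artist.artistId, artist.artistName)
```

The trade-off is that any invariants `__init__` would normally set up are not established, so every attribute the rest of the code reads has to be assigned by hand.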

* Fanbox post records enhancement

1. New table `fanbox_master_post`
2. Logic to create/drop the table
3. Method to export FANBOX post information to CSV files, `exportFanboxPostList`
4. Methods to insert, update, select, and delete post info (CRUD?)
5. Entries in `menu` and `main`

* Some properties and methods changes

1. Added `updatedDate`, `updatedDateDatetime`, `feeRequired` properties and logic to set them in `parsePost` and `parseBody`
2. Deleted the old `updatedDatetime` that was not used anywhere
3. Extracted the code that prints FANBOX post information into a new method, `printPost`

* Fanbox post download enhancement

1. Extracted some code into `FanboxPost.printPost`
2. Moved the code that downloads FANBOX cover images from `processFanboxArtist` to `processFanboxImages`, because I don't think cover images should be downloaded if the post is not accessible
3. Post information (mainly just artist_id, post_id, title, feeRequired, workDate, updatedDate, type) is written to the database as the first step inside `processFanboxImages`. The post's row is then read back so the newly retrieved updated date can be compared with the one stored in the database. The post is not processed if it is restricted or if a download history exists.
4. After everything is done (images downloaded and image info written to file), the `updatedDate` in the database is updated
5. Added an entry for downloading FANBOX posts by post ID, and some related logic in the option parser
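
The record-then-compare flow in items 3 and 4 could look roughly like the sketch below. It uses an in-memory SQLite database; the table name `fanbox_master_post` comes from this PR, but the column names, the helper names `should_download` and `mark_downloaded`, and the choice to leave the stored date empty until step 4 are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fanbox_master_post ("
    "post_id INTEGER PRIMARY KEY, member_id INTEGER, title TEXT, "
    "fee_required INTEGER, updated_date TEXT)"
)


def should_download(conn, post, is_restricted):
    """Return True when the post needs to be downloaded (again)."""
    # Item 3: record the post first; a new row has no updated_date yet.
    conn.execute(
        "INSERT OR IGNORE INTO fanbox_master_post "
        "(post_id, member_id, title, fee_required) VALUES (?, ?, ?, ?)",
        (post["post_id"], post["member_id"], post["title"], post["fee_required"]),
    )
    if is_restricted:
        return False
    row = conn.execute(
        "SELECT updated_date FROM fanbox_master_post WHERE post_id = ?",
        (post["post_id"],),
    ).fetchone()
    # Skip when the stored date matches the one the API just returned.
    return row[0] != post["updated_date"]


def mark_downloaded(conn, post):
    # Item 4: only after all files are written, store the new updated_date.
    conn.execute(
        "UPDATE fanbox_master_post SET updated_date = ? WHERE post_id = ?",
        (post["updated_date"], post["post_id"]),
    )


post = {"post_id": 1, "member_id": 10, "title": "t", "fee_required": 0,
        "updated_date": "2020-04-27T00:00:00+09:00"}
first = should_download(conn, post, is_restricted=False)
mark_downloaded(conn, post)
second = should_download(conn, post, is_restricted=False)
print(first, second)
```

Writing the date only at the very end means an interrupted download is retried on the next run, because the stored date still differs from the one the API reports.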
@bluerthanever
Contributor Author

Hope I didn't miss anything....

@bluerthanever
Contributor Author

...I am confused now about which I should be using, following or followed...

I wanted to change the apis, I mean I think it's the best, even though the old ones are still good to use.... But some classes might need to be changed a lot... and I am like tired now and don't wanna think about it anymore.... Changed some simple ones though.

@bluerthanever
Contributor Author

Hmm... I found a problem: to use the new URL (https://api.fanbox.cc/...), a new cookie is needed..... urr.

@bluerthanever
Contributor Author

I will just add a new item in config, and users would need to copy and paste the new FANBOXSESSID into config in order to use it.... how is that....

@bluerthanever
Contributor Author

Okay.... I guess I am almost done.... I will submit the changes now..

* Big changes..... whew

1. Deleted class `Fanbox`, which was not of much use I guess....
2. Added property `creatorId` to class `FanboxArtist`, and `SUPPORTED` and `FOLLOWED` as constants(?) used to decide which API to call for the artist list (supported or followed)
3. Changed the `init` method of `FanboxArtist` to just set instance property values instead of parsing JSON
4. Made the `parsePosts` method return a list of posts instead of saving the list to a property, so there is no need to create a new `FanboxArtist` instance for every page.
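
A rough sketch of the reshaped class described in items 2 to 4. The names `FanboxArtist`, `SUPPORTED`, `FOLLOWED`, `creatorId` and `parsePosts` come from this PR; the constant values, constructor arguments, and JSON shape are assumptions made up for the example.

```python
class FanboxArtist:
    # Class-level constants select which artist list to request.
    SUPPORTED = 0
    FOLLOWED = 1

    def __init__(self, artist_id, artist_name, creator_id):
        # Only store values; no JSON parsing happens here.
        self.artistId = artist_id
        self.artistName = artist_name
        self.creatorId = creator_id

    def parsePosts(self, page_json):
        # Return this page's posts instead of storing them on the instance,
        # so one artist object can be reused for every page.
        return [item["id"] for item in page_json["body"]["items"]]


artist = FanboxArtist(123, "someone", "some-creator")
all_posts = []
for page in ({"body": {"items": [{"id": "1"}, {"id": "2"}]}},
             {"body": {"items": [{"id": "3"}]}}):
    all_posts.extend(artist.parsePosts(page))
print(all_posts)
```

Because `parsePosts` no longer mutates the artist, the caller decides how to accumulate pages, which is what removes the need for one instance per page.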

* Some big changes

1. Renamed `menu_fanbox_download_supported_artist` to `menu_fanbox_download_from_artist`, with an extra parameter that decides whether to get artists from the supported list or the followed list; pass `FanboxArtist.SUPPORTED` or `FanboxArtist.FOLLOWED`
2. The new `menu_fanbox_download_from_artist` method obtains a list of `FanboxArtist` instances, then updates their `ArtistToken` using some new methods added in `PixivBrowser`
3. The `FanboxArtist` instances are passed one by one to `processFanboxArtist`, which now takes a `FanboxArtist` as a parameter instead of the integer id it took before
4. Inside the new `processFanboxArtist`, posts are obtained from modified methods in `PixivBrowser`
5. Renamed `menu_fanbox_download_by_artist_id` to `menu_fanbox_download_by_artist_or_creator_id`, and allowed it to use the new `creatorId` to get posts

* Big changes

1. Removed the `Fanbox` import
2. For method `_loadCookie`, added a `domain` parameter and logic to decide which cookie it will add
3. In method `loginUsingCookie`, added the `domain` parameter when calling `_loadCookie`
4. Added new method `fanboxLoginUsingCookie`, mainly to check login status when getting the list of followed or supported artists
5. Renamed method `fanboxGetSupportedUsers` to `fanboxGetUsers`, with a new parameter `via` that decides what kind of artist list to get. It now returns a list of `FanboxArtist` instances instead of integers
6. Merged the previously added `fanboxGetFollowedUsers` into `fanboxGetUsers`
7. Added method `fanboxUpdateArtistToken` to update `ArtistToken` and `ArtistName` of `FanboxArtist` instances
8. Added method `fanboxGetArtistById` to get a `FanboxArtist` instance from a creator/user id, to support the website transfer
9. Used the new APIs in some other FANBOX methods
10. Some minor formatting changes.
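
To illustrate item 2, here is a hedged sketch of a domain-aware cookie loader built on the standard library's `http.cookiejar`. The function name `load_cookie` and the rule for picking the cookie name are assumptions; FANBOXSESSID is the cookie named later in this thread, and PHPSESSID is assumed here as the main-site session cookie.

```python
from http.cookiejar import Cookie, CookieJar


def load_cookie(jar, cookie_value, domain):
    """Add a session cookie for the given domain to the jar."""
    # Assumed rule: fanbox.cc gets FANBOXSESSID, everything else PHPSESSID.
    name = "FANBOXSESSID" if "fanbox.cc" in domain else "PHPSESSID"
    cookie = Cookie(
        version=0, name=name, value=cookie_value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )
    jar.set_cookie(cookie)
    return cookie


jar = CookieJar()
load_cookie(jar, "main-site-session", ".pixiv.net")
load_cookie(jar, "fanbox-session", ".fanbox.cc")
print(sorted((c.domain, c.name) for c in jar))
```

Keeping both cookies in one jar lets a single browser object talk to both sites, with the jar choosing the right cookie per request domain.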

* New option added

Added new option `cookieFanbox`

* Added code to check FANBOX login status in main()

It also loads the `cookieFanbox` option.
No errors are raised for now, though.
@bluerthanever
Contributor Author

Changes required... that I can think of...

  1. Users need to add the new cookieFanbox option, whose value is the FANBOXSESSID cookie set when visiting fanbox.cc.
  2. This cookie cannot be obtained by code... or at least that is not implemented currently

@bluerthanever
Contributor Author

The FANBOXSESSID seems to be changing constantly....hmmm

@Nandaka
Owner

Nandaka commented Apr 28, 2020

Any changes to the unit tests? Please ensure the existing unit tests don't break, else I need you to update them (as I don't have fanbox access).

@bluerthanever
Contributor Author

hmmm... I hate unit test....
I ran it and downloaded stuff...haha. I will take a rest, and take a look at unit tests later

@Nandaka
Owner

Nandaka commented Apr 28, 2020

yeah, you changed the structure and it fails to import the class
image

@bluerthanever
Contributor Author

Yeah. Haha. Later tonight.

@Nandaka
Owner

Nandaka commented Apr 28, 2020

tried to fix the unit test, but it looks like the source JSON also changed?
I need the unit tests to pass, else I cannot commit back as I have enabled Travis CI
image

@bluerthanever
Contributor Author

Yeah... I noticed some minor changes, like some new tokens added. But mostly the same...
So I will just fix the syntax to make it pass now...

What is that interface though...?

@Nandaka
Owner

Nandaka commented Apr 28, 2020

That one is from Travis CI, so if the build passes then I can merge the pull request, like below:
image

@Nandaka Nandaka merged commit 3b026ae into Nandaka:master Apr 28, 2020
@Nandaka
Owner

Nandaka commented Apr 28, 2020

merged 😄

@bluerthanever
Contributor Author

Hmm.. Didn't notice that before... cuz it was always green before...
Actually I think I can try to log in with account and password...? Do you think that's necessary? I just captured some packets today.... was thinking about how to obtain the fanbox cookie then....

@Nandaka
Owner

Nandaka commented Apr 28, 2020

account and password

currently not possible due to captcha blocking the authentication form, right?

@bluerthanever
Contributor Author

Actually I am not sure that's like impossible.... cuz I don't see any captcha in the authentication form on the page, and if it's hidden I think it could be bypassed or simulated with execjs if it's in JS, or with similar logic if not...?

@bluerthanever
Contributor Author

bluerthanever commented Apr 28, 2020

well I managed to do login with twitter though...
anyway I don't think this should be prioritized.... and I might need further studies on that....

@Nandaka
Owner

Nandaka commented Apr 28, 2020

hmm, I'm not sure, I think it depends on your login behaviour and your IP address? I also don't use twitter or facebook anymore.

@bluerthanever
Contributor Author

I'm not sure either...
the important thing is that it's usable, then comes convenience.
I will look into this if I am like really bored or something. haha

@AgentThirteen

Thanks for all your work so far bluerthanever.
I was about to write an issue for this but it looks like it was an intended enhancement.

Moved codes to download FANBOX cover images from processFanboxArtist to processFanboxImages, because I don't think cover images should be downloaded if the post is not accessible

FANBOX cover images can be used if you pause then resume support to see if anything has changed as well as keep track of deleted posts. Any intent to bring them back as an option if the coding isn't too much of a hassle? It doesn't really slow down the process all that much (at least not until #686 - I will try to keep that one issue informed as I was short on time but should be able to test with a VPN or proxy at some point).

Users who have new creator post notifications set can have the cover sent to their email address but that is a rather buggy thumbnail and not a default setting if I am not mistaken.

@bluerthanever
Contributor Author

bluerthanever commented May 2, 2020

@AgentThirteen
Well, it was not about speed, but more about file management when I moved the code, because sometimes I download from a creator I am not supporting who has both free and paid posts, and there would be a whole lot of single cover images.
And in this enhancement, I record in the database the updatedDate of each post as retrieved from the FANBOX API, so if anything has changed, I think the date changes too, and when the post is hit again it will be downloaded again.

Unless you would want those cover images intentionally?

And I am curious on how you keep track of deleted posts with cover images, do you mind sharing the idea? Not saying no yet, but curious about how different users use it.

@AgentThirteen

Thank you for the quick reply.

Oh I see. That certainly makes sense if you have a lot of followed creators considering how many have a FANBOX by now. I kind of missed the updatedDate enhancement, sorry, this is good to know for the pause-and-resume usage.

As for keeping track of deleted posts with cover images, simply checking based on notifications and locally updated date as you stated as I don't use the API myself and I am not really a coding guru let alone an amateur, haha.

@bluerthanever
Contributor Author

As for keeping track of deleted posts with cover images, simply checking based on notifications and locally updated date as you stated as I don't use the API myself and I am not really a coding guru let alone an amateur, haha.

Sorry I don't quite get this. Do you mean you keep all notification emails of an artist's new post and compare to find out which posts are deleted? Because I don't remember that FANBOX would notify users if an artist deletes their posts.

But whichever way, if you still need to download the cover image, please don't hesitate to let me know.

@AgentThirteen

Yes, that's exactly what I meant. I actually keep all mail sent as updated/new post notifications as I have pretty much unlimited email storage space and it's easy to parse regardless of the amount received. A thumbnail suddenly going missing indicates that the post is gone (the same thing will happen in bell notifications but that kind of history isn't fully checkable as far as I could tell) but there are surely easier ways to tell I am not aware of.

But whichever way, if you still need to download the cover image, please don't hesitate to let me know.

That'd be much appreciated if that's possible when you have time, thanks a lot.

Not sure how many users need this though, so probably as an optional feature if you feel like that's a lot of clogging?

@bluerthanever
Contributor Author

bluerthanever commented May 3, 2020

@Nandaka Hey, Nandaka-san. I would like to add the following enhancements, what do you think?

  1. File formats for FANBOX posts
     • 3 separate formats for cover images, images inside posts, and post info
  2. Options to write HTML for posts that are not article type
     • an option named whenTextsLongerThan or something, to write posts into HTML when they contain at least 1 image/file and text longer than that number
     • a new HTML template for such posts (that are not articles), or modify the current one so it could be used for both
  3. An option to decide whether to download cover images when posts are restricted
  4. An option to decide whether to check the database download history when deciding whether to download a post
  5. Move FANBOX options together under a FANBOX section
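
To make the whenTextsLongerThan idea concrete, a tiny sketch of the rule as worded above. The function name `non_article_needs_html` is hypothetical, and measuring text length in characters is an assumption.

```python
def non_article_needs_html(text, attachment_count, when_texts_longer_than):
    """Proposed rule for non-article posts, as described in the list above."""
    # At least one image/file, and text longer than the configured threshold.
    return attachment_count >= 1 and len(text) > when_texts_longer_than


print(non_article_needs_html("a" * 500, 2, 200))
print(non_article_needs_html("short note", 2, 200))
print(non_article_needs_html("a" * 500, 0, 200))
```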

@Nandaka
Owner

Nandaka commented May 3, 2020

  1. You mean the filename formats? Currently it already uses filenameFormat for the cover, filenameMangaFormat for the images inside the post, and filenameInfoFormat for the post info. Any reason to create separate filename formats?
  2. Should be ok.
  3. Should be ok.
  4. Is it something like this? Only download the post if updatedDate is different?
if config.checkFanboxPostHistory:
   result = db.getFanboxPostHistory(post_id)
   if result is not None:
      post_data = br.getFanboxPost(post_id)
      if result.updatedDate == post_data.updatedDate:
         return False # skip the post
   return True # download the post
  5. Sure, just ensure the config upgrade can be handled. I think these are the existing configs for Fanbox?
    • writeHtml
    • useAbsolutePathsInHtml

@bluerthanever
Contributor Author

bluerthanever commented May 3, 2020

  1. Well, it seems some users are confused, so this would make it easier to understand. It is also for users who want FANBOX posts saved with different filename formats but are not willing to use a separate config. Haha, for the lazy ones?
  4. Yup, something like that.
  5. Hmm, now that you've mentioned it, I just realized that I am moving options between sections, which the current structure does not support yet. I will find a way to deal with that.

@Nandaka
Owner

Nandaka commented May 3, 2020

well, ok. For the section-move upgrade, no need to think about a way to do it automatically; just ensure all the previous configs are read, and only then throw the error message.

On the previous structure, I'll just move the reading code to the last line after I update the section name (treat it like new config).

@bluerthanever
Contributor Author

Well currently, if a config is moved to another section, it will be reset... so not technically ensuring all previous configs are read... So I need to think of a solution I guess.
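
One possible way around the reset, sketched with the standard `configparser` module: look in the new section first and fall back to the old one. The helper `read_moved_option` and the section names `Settings` and `FANBOX` are assumptions for illustration; `writeHtml` is one of the options mentioned above.

```python
import configparser


def read_moved_option(config, option, new_section, old_section, default=None):
    """Read an option from its new section, falling back to the old one."""
    if config.has_option(new_section, option):
        return config.get(new_section, option)
    if config.has_option(old_section, option):
        # The user's existing value survives the move instead of being reset.
        return config.get(old_section, option)
    return default


config = configparser.ConfigParser()
config.optionxform = str  # keep option names case-sensitive
config.read_string("[Settings]\nwriteHtml = True\n")
value = read_moved_option(config, "writeHtml", "FANBOX", "Settings", "False")
print(value)
```

When the config is next written out, the value can be saved under the new section only, so the fallback is needed for just one upgrade cycle.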

35122 pushed a commit to 35122/PixivUtil2 that referenced this pull request Oct 30, 2020
* Modification and implementation for downloading FANBOX post

1. Imported `FanboxPost`
2. Moved codes to obtain post jsons to a single method `fanboxGetPost`
3. If `member_id` is passed when calling `fanboxGetPost`, json object is returned
4. If no `member_id` is passed when calling `fanboxGetPost`, the member ID and name are retrieved from the post JSON, a new `FanboxArtist` instance is created via `object().__new__(FanboxArtist)`, and a new `FanboxPost` instance is created and returned.

* Fanbox post records enhancement 

1. New table `fanbox_master_post`
2. Logic to create/drop table
3. Method to export FANBOX post information to CSV files, `exportFanboxPostList`
4. Methods to insert, update, select, and delete post info (CRUD?)
5. Entries in `menu` and `main`

* Some properties and methods changes

1. Added `updatedDate`, `updatedDateDatetime`, `feeRequired` properties and logic to set them in `parsePost` and `parseBody`
2. Deleted the old `updatedDatetime` that was not used anywhere
3. Extracted the code that prints FANBOX post information into a new method, `printPost`

* Fanbox post download enhancement

1. Extracted some code into `FanboxPost.printPost`
2. Moved the code that downloads FANBOX cover images from `processFanboxArtist` to `processFanboxImages`, because I don't think cover images should be downloaded if the post is not accessible
3. Post information (mainly just artist_id, post_id, title, feeRequired, workDate, updatedDate, type) is written to the database as the first step inside `processFanboxImages`. The post's row is then read back so the newly retrieved updated date can be compared with the one stored in the database. The post is not processed if it is restricted or if a download history exists.
4. After everything is done (images downloaded and image info written to file), the `updatedDate` in the database is updated
5. Added an entry for downloading FANBOX posts by post ID, and some related logic in the option parser

* Fixed some indent errors

* Some logical correction to `insertPost`

* Moved code position

Made the code that parses `feeRequired` run earlier

* Added sep for `exportFanboxPosts`

Allow users to choose between "," and "\t"
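
A minimal sketch of what a separator-aware export could look like, using the standard `csv` module. The function name `export_posts` and the column names are assumptions; only the two separators named in this commit are accepted.

```python
import csv
import io


def export_posts(rows, out, sep=","):
    # Only the two separators mentioned in the commit are accepted.
    if sep not in (",", "\t"):
        raise ValueError("sep must be ',' or a tab")
    writer = csv.writer(out, delimiter=sep, lineterminator="\n")
    writer.writerow(["member_id", "post_id", "title"])
    writer.writerows(rows)


buf = io.StringIO()
export_posts([(10, 1, "hello, world")], buf, sep="\t")
print(buf.getvalue())
```

Tab separation is handy for post titles that contain commas, since those fields then need no quoting.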

* Added method to get followed artists

* Added method for downloading from following artists

* Forgot about menu

* Typo. crap....

I should be sleeping now.

* Big changes..... whew

1. Deleted class `Fanbox`, which was not of much use I guess....
2. Added property `creatorId` to class `FanboxArtist`, and `SUPPORTED` and `FOLLOWED` as constants(?) used to decide which API to call for the artist list (supported or followed)
3. Changed the `init` method of `FanboxArtist` to just set instance property values instead of parsing JSON
4. Made the `parsePosts` method return a list of posts instead of saving the list to a property, so there is no need to create a new `FanboxArtist` instance for every page.

* Some big changes

1. Renamed `menu_fanbox_download_supported_artist` to `menu_fanbox_download_from_artist`, with an extra parameter that decides whether to get artists from the supported list or the followed list; pass `FanboxArtist.SUPPORTED` or `FanboxArtist.FOLLOWED`
2. The new `menu_fanbox_download_from_artist` method obtains a list of `FanboxArtist` instances, then updates their `ArtistToken` using some new methods added in `PixivBrowser`
3. The `FanboxArtist` instances are passed one by one to `processFanboxArtist`, which now takes a `FanboxArtist` as a parameter instead of the integer id it took before
4. Inside the new `processFanboxArtist`, posts are obtained from modified methods in `PixivBrowser`
5. Renamed `menu_fanbox_download_by_artist_id` to `menu_fanbox_download_by_artist_or_creator_id`, and allowed it to use the new `creatorId` to get posts

* Big changes

1. Removed the `Fanbox` import
2. For method `_loadCookie`, added a `domain` parameter and logic to decide which cookie it will add
3. In method `loginUsingCookie`, added the `domain` parameter when calling `_loadCookie`
4. Added new method `fanboxLoginUsingCookie`, mainly to check login status when getting the list of followed or supported artists
5. Renamed method `fanboxGetSupportedUsers` to `fanboxGetUsers`, with a new parameter `via` that decides what kind of artist list to get. It now returns a list of `FanboxArtist` instances instead of integers
6. Merged the previously added `fanboxGetFollowedUsers` into `fanboxGetUsers`
7. Added method `fanboxUpdateArtistToken` to update `ArtistToken` and `ArtistName` of `FanboxArtist` instances
8. Added method `fanboxGetArtistById` to get a `FanboxArtist` instance from a creator/user id, to support the website transfer
9. Used the new APIs in some other FANBOX methods
10. Some minor formatting changes.

* New option added

Added new option `cookieFanbox`

* Added code to check FANBOX login status in main()

It also loads the `cookieFanbox` option.
No errors are raised for now, though.

* Minor change to enable creatorId passed as args

* Put request inside try/except inside `fanboxLoginUsingCookie`

In case an error happens here and causes the whole util to fail

* Some print and indent correction?

* Changes to fit changes to classes in order to pass unit test

* Added `creatorId` for the unit test