merge #5626: [facebook] add support (#470, #2612)
* [facebook] add initial support

* renamed extractors & subcategories

* better stability, modularity & naming

* added single photo extractor, warnings & retries

* more metadata + extract author followups

* renamed "album" mentions to "set" for consistency

* cookies are now only used when necessary

also added author followups for singular images

* removed f-strings

* added way to continue extraction from where it left off

also fixed some bugs

* fixed wrong subcategory bug

* added individual video extraction

* extract audio + added ytdl option

* updated setextract regex

* added option to disable start warning

the extractor should be ready :)

* fixed description metadata bug

* removed cookie "safeguard" + fixed for private profiles

I have removed the cookie "safeguard" (not using cookies until they are necessary) because I've concluded that it does more harm than good. There is no way to detect whether the extractor has skipped private images that could otherwise have been extracted, and the safeguard offers little to no advantage.

* fixed a few bugs regarding profile parsing

* a few bugfixes

Fixed some metadata attributes that were not decoded correctly for non-Latin languages, or were not shown at all.
Also improved a few patterns.

* retrigger checks

* Final cleanups

-Added tests
-Fixed video extractor giving incorrect URLs
-Removed start warning
-Listed supported site correctly

* fixed regex

* trigger checks

* fixed livestream playback extraction + bugfixes

I've chosen to remove the "reactions", "comments" and "views" attributes because they require additional maintenance, even though nobody is likely to use them to order files.
I've also removed the "title" and "caption" video attributes because they are inconsistent across different videos.
Feel free to share your thoughts.

* fixed regex

* fixed filename fallback

* fixed retrying when a photo url is not found

* fixed end line

* post url fix + better naming

* fix posts

* fixed tests

* added profile.php url

* made most of the requested changes

* flake

* archive: false

* removed unnecessary url extract

* [facebook] update

- more 'Sec-Fetch-…' headers
- simplify 'text.nameext_from_url()' calls
- replace 'sorted(…)[-1]' with 'max(…)'
- fix '_interval_429' usage
- use replacement fields in logging messages
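
The 'sorted(…)[-1]' to 'max(…)' change can be illustrated with a minimal sketch. The candidate list below is hypothetical and not taken from the extractor; it only shows that both expressions pick the same largest item, while 'max()' does so in a single pass without building a sorted copy.

```python
# Hypothetical (width, url) candidates; not the extractor's real data.
candidates = [(720, "b.jpg"), (1080, "c.jpg"), (480, "a.jpg")]

# Before: sort everything, then take the last element (O(n log n)).
largest_sorted = sorted(candidates)[-1]

# After: a single pass over the items (O(n)).
largest_max = max(candidates)

print(largest_sorted == largest_max)
```

One caveat when a 'key' function is used: on ties, 'max()' returns the first maximal item, whereas 'sorted(…)[-1]' returns the last one.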

* [facebook] update URL patterns

get rid of '.*' and '.*?'

* added a few remaining tests

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
zWolfrost and mikf authored Nov 26, 2024
1 parent d1ad97a commit e9370b7
Showing 6 changed files with 612 additions and 0 deletions.
25 changes: 25 additions & 0 deletions docs/configuration.rst
@@ -2376,6 +2376,31 @@ Description
for example ``tags_artist`` or ``tags_character``.


extractor.facebook.author-followups
-----------------------------------
Type
``bool``
Default
``false``
Description
Extract comments that include photo attachments made by the author of the post.


extractor.facebook.videos
-------------------------
Type
* ``bool``
* ``string``
Default
``true``
Description
Control video download behavior.

* ``true``: Extract and download video & audio separately.
* ``"ytdl"``: Let |ytdl| handle video extraction and download, and merge video & audio streams.
* ``false``: Ignore videos.


extractor.fanbox.comments
-------------------------
Type
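
As a rough sketch, the two options documented in this hunk might be set in a user configuration as follows. It assumes gallery-dl's usual JSON layout, in which a dotted option name such as `extractor.facebook.videos` maps to nested objects; the values chosen here are illustrative, not recommendations.

```python
import json

# Illustrative values only. Per the documentation above, the defaults are
# author-followups = false and videos = true.
config = {
    "extractor": {
        "facebook": {
            # Also extract the post author's comments with photo attachments.
            "author-followups": True,
            # Let ytdl handle videos and merge the video & audio streams.
            "videos": "ytdl",
        }
    }
}

# Serialize to the JSON text that would go into the configuration file.
config_text = json.dumps(config, indent=4)
print(config_text)
```

Setting `"videos"` to `false` instead would skip videos entirely, as described in the option's documentation.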
6 changes: 6 additions & 0 deletions docs/supportedsites.md
@@ -247,6 +247,12 @@ Consider all listed sites to potentially be NSFW.
<td>Favorites, Galleries, Search Results</td>
<td>Supported</td>
</tr>
<tr>
<td>Facebook</td>
<td>https://www.facebook.com/</td>
<td>Photos, Profiles, Sets, Videos</td>
<td><a href="https://github.com/mikf/gallery-dl#cookies">Cookies</a></td>
</tr>
<tr>
<td>Fanleaks</td>
<td>https://fanleaks.club/</td>
1 change: 1 addition & 0 deletions gallery_dl/extractor/__init__.py
@@ -50,6 +50,7 @@
"erome",
"everia",
"exhentai",
"facebook",
"fanbox",
"fanleaks",
"fantia",