[lightroom] add Lightroom gallery extractor #2263
directory_fmt = ("{category}", "{user}", "{title}")
filename_fmt = "{num:>04}_{id}.{extension}"
archive_fmt = "{id}"
pattern = r"(?:https?://)?lightroom\.adobe\.com/shares/([0-9a-f]+)"
It wouldn't hurt to add (?:www\.)? just in case.
The www subdomain doesn't exist, so such a URL wouldn't work anyway.
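For illustration, here is a minimal sketch of how the pattern under review behaves with and without a www. prefix. The share ID is made up for the example, `share_id` is a hypothetical helper, and `re.match` is only a stand-in, since how gallery-dl applies `pattern` internally is not shown in this thread.

```python
import re

# Pattern copied from the diff under review.
PATTERN = re.compile(r"(?:https?://)?lightroom\.adobe\.com/shares/([0-9a-f]+)")


def share_id(url):
    """Return the hex share ID if the URL matches the pattern, else None."""
    match = PATTERN.match(url)
    return match.group(1) if match else None


# Hypothetical share ID, for illustration only.
print(share_id("https://lightroom.adobe.com/shares/0123abcd"))      # 0123abcd
print(share_id("lightroom.adobe.com/shares/0123abcd"))              # 0123abcd
print(share_id("https://www.lightroom.adobe.com/shares/0123abcd"))  # None
```

As written, the pattern rejects the www. form, which matches the point that no such host exists.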
Is there any way to browse this site, well, normally?
@Hrxn Not that I know of. It's for people sharing their Lightroom galleries, not a general-purpose image host like imgur. So, you need to have the URL of a gallery to be able to see it, and there's no search or "explore" feature.
gallery_dl/extractor/lightroom.py
Outdated
@@ -0,0 +1,105 @@
# -*- coding: utf-8 -*-

# Copyright 2018-2022 Mike Fährmann
I'm not the author or copyright holder.
Put your own name and current year there, or just delete this line.
gallery_dl/extractor/lightroom.py
Outdated
response = self.request(url)
# skip 1st line as it's a JS loop
data_idx = response.text.index("\n") + 1
data = json.loads(response.text[data_idx:])
You should never access response.text more than once.
Each access does some heavy computation internally, and the result isn't cached.
Store it in its own variable and use that.
Suggested change:
- response = self.request(url)
- # skip 1st line as it's a JS loop
- data_idx = response.text.index("\n") + 1
- data = json.loads(response.text[data_idx:])
+ page = self.request(url).text
+ # skip 1st line as it's a JS loop
+ data = json.loads(page[page.index("\n") + 1:])
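A self-contained sketch of the same idea, with the body held in a single variable and sliced once. The sample payload is invented for the example (the real first line and JSON structure may differ), and `parse_page` is a hypothetical helper name, not part of the extractor.

```python
import json


def parse_page(page):
    """Drop the first line of the body and parse the remainder as JSON."""
    # str.index raises ValueError if the body contains no newline at all.
    return json.loads(page[page.index("\n") + 1:])


# Invented payload: a JS loop on line 1, JSON afterwards.
sample = 'while (1) {}\n{"base": "https://example.invalid/", "resources": []}'
data = parse_page(sample)
print(data["base"])  # https://example.invalid/
```

In the extractor, `page` would be assigned once from the response's text, so the decoding work happens a single time.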
gallery_dl/extractor/lightroom.py
Outdated
data_idx = response.text.index("\n") + 1
data = json.loads(response.text[data_idx:])

next_url = data.get("links", {}).get("next", {}).get("href", None)
I'd rather use a try-except block than create dozens of dicts every time.
If you move this after the for loop, you can return immediately in the except branch; otherwise, set next_url to None there.
Suggested change:
- next_url = data.get("links", {}).get("next", {}).get("href", None)
+ try:
+     next_url = data["links"]["next"]["href"]
+ except KeyError:
+     next_url = None
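A small runnable sketch of the try-except lookup, wrapped in a hypothetical `next_href` helper so it can be tried in isolation. The sample dicts and the href value are invented for the example.

```python
def next_href(data):
    """Return data["links"]["next"]["href"], or None if any key is missing."""
    try:
        return data["links"]["next"]["href"]
    except KeyError:
        return None


# Invented sample data, for illustration only.
print(next_href({"links": {"next": {"href": "albums/page2"}}}))  # albums/page2
print(next_href({"links": {}}))                                  # None
print(next_href({}))                                             # None
```

Unlike the chained `.get()` version, this allocates no throwaway dicts on the common path. Note that it only catches KeyError: if an intermediate value were None rather than a dict, a TypeError would propagate.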
next_url = data.get("links", {}).get("next", {}).get("href", None)

base_url = data["base"]
This overrides the base_url value set in line 73 before it has been used even once.
Not sure if that's a problem, just something I noticed.