This repository has been archived by the owner on Mar 9, 2021. It is now read-only.

Tumbler Hidden crawls missing multiple images #274

Closed
jas0320 opened this issue Sep 25, 2018 · 1 comment

jas0320 commented Sep 25, 2018

Not all images are downloaded when crawling a hidden blog. The issue seems to occur only when an image appears clickable (the cursor changes, as if a click would open a pop-up sub-window) but no pop-up window actually opens. The example blog below is a mix of NSFW and just cool pics, so I apologize in advance. This blog has multiple examples: https://justalittlegin.tumblr.com/
This is one of the recently posted pics that does not download:
https://78.media.tumblr.com/a962067c00c4f1c1b73a953551990444/tumblr_pfiwi0h8Wz1tcssjy_540.jpg
This may have started around the same time the raw image issue surfaced.
On the same blog, if the image is actually clickable, it does download.

johanneszab added a commit that referenced this issue Sep 27, 2018
- Uses the content of the trail of each post for the hidden Tumblr blog post inline photo and video detection instead of changing fields depending on the post's type.
- Code formatting.
johanneszab self-assigned this Sep 27, 2018

johanneszab (Owner) commented Sep 27, 2018

Thanks. It should be better now. The crawler now takes the text from the content of each post's "trail" instead of from fields of the post itself. The problem is that those fields vary with the post type. For a regular (text) post, the text seems to be in a body field. For a picture post, the text is in the caption field, and this seems to differ, sometimes inconsistently, between posts. You then end up having to concatenate and null-check several different fields.
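As a rough illustration of that change, here is a minimal Python sketch. It is not the project's actual code. It assumes a post is a plain dictionary whose `trail` entries carry their HTML in a `content` key, and the regular expression and all function names are hypothetical, chosen only for this example.

```python
import re

# Hypothetical pattern for inline Tumblr media URLs (an assumption for this
# sketch; the real detection in the project may look different).
INLINE_IMAGE_RE = re.compile(
    r"https?://[\w.-]*media\.tumblr\.com/[\w/.-]+?\.(?:jpe?g|png|gif)",
    re.IGNORECASE,
)


def trail_text(post):
    """Join the content of every trail entry, regardless of the post type."""
    return "".join(entry.get("content") or "" for entry in post.get("trail") or [])


def per_type_text(post):
    """Older approach: concatenate type-dependent fields, null-checking each."""
    return "".join(post.get(key) or "" for key in ("body", "caption"))


def inline_image_urls(post):
    """Return inline image URLs found in a post, without duplicates."""
    # Prefer the trail; falling back to body/caption is an assumption here.
    text = trail_text(post) or per_type_text(post)
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(INLINE_IMAGE_RE.findall(text)))
```

Reading from the trail means the same lookup works for text and photo posts alike, so no per-type branching is needed.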

I'll close this. You can reopen it at any time if you still have posts that don't work. Then we can still concatenate all related fields and try it that way.

Thanks for the detailed description and for providing examples!
