fiction.live - certain story URLs are not recognized #541

Closed
muchtea opened this issue Sep 7, 2020 · 5 comments · Fixed by #542

Comments

@muchtea
Contributor

muchtea commented Sep 7, 2020

Hello and thank you @HazelSh for adding support for the site.
I just came across issue kemayo/leech#31, filed against a different fiction.live downloader.
Some stories have 'Sci-fi' instead of 'stories' in the URL, and some quest ids start with a '-' and are 20 characters long. The current regex covers neither case.

Example stories:
https://fiction.live/Sci-fi/Endless-Journey/-JTOp94O1JnJQ10y557N
https://fiction.live/stories/The-Traveler/-JEdsoZFs11asUSB-SEQ

I think this should work, but it's worth double-checking:
https?://fiction\.live/(Sci-fi|stories|anonkun)/[^/]*/([a-zA-Z0-9]{17}|[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}|[a-zA-Z0-9\-]{20})/?(home)?
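One way to double-check the suggested pattern is to run it against the two example URLs above. This is only a small check harness, not FanFicFare's adapter code; `SUGGESTED` and `story_id` are names made up for illustration.

```python
import re

# The suggested pattern, split across lines for readability (same regex).
SUGGESTED = re.compile(
    r"https?://fiction\.live/(Sci-fi|stories|anonkun)/[^/]*/"
    r"([a-zA-Z0-9]{17}"
    r"|[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}"
    r"|[a-zA-Z0-9\-]{20})"
    r"/?(home)?"
)


def story_id(url):
    """Return the id captured by SUGGESTED, or None if the URL is not recognized."""
    m = SUGGESTED.match(url)
    return m.group(2) if m else None


examples = [
    "https://fiction.live/Sci-fi/Endless-Journey/-JTOp94O1JnJQ10y557N",
    "https://fiction.live/stories/The-Traveler/-JEdsoZFs11asUSB-SEQ",
]
for url in examples:
    print(url, "->", story_id(url))
```

Both example ids are 20 characters long and start with '-', so they fall through to the last alternative. Since the pattern has no end anchor, `match` also accepts URLs with extra trailing text; `fullmatch` would be the stricter check.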

@HazelSh

HazelSh commented Sep 7, 2020

I swear, every time I start thinking this site is somewhat reasonable, something else turns up. I'll sort this. Thanks for the suggested regex, I may use it.

@HazelSh

HazelSh commented Sep 8, 2020

As I expected as soon as I saw Sci-fi, it's worse than that.
https://fiction.live/Writing/Side-Quest-Suggestion-Quest/gXKah64TFtZs2fwEd

Stories that are old, but not old enough to use the UUID-style id, have their first tag (or genre?) in the URL. I have no idea where to get an exhaustive list of possible options for these.

I'm gonna write a script to scrape story urls. A brief look at fiction.live/stories doesn't show where the urls are coming from, so this may take a bit.

@JimmXinu
Owner

JimmXinu commented Sep 8, 2020

Unless you're trying to collect metadata from it, does it matter what's in the URL other than the story ID?
https?://fiction\.live/[^/]*/[^/]*/([a-zA-Z0-9]+)
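To see what this looser pattern does with the URLs quoted in this thread, here is a small sketch. `LOOSE`, `LOOSE_WITH_DASH`, and `extract_id` are hypothetical names, and the second pattern is an assumed variant that nobody in the thread proposed. One thing the check shows: `[a-zA-Z0-9]+` excludes '-', so as written the pattern does not match ids that start with a dash. Adding '-' to the character class is one possible adjustment.

```python
import re

# The pattern from the comment above: only the id segment is constrained.
LOOSE = re.compile(r"https?://fiction\.live/[^/]*/[^/]*/([a-zA-Z0-9]+)")

# Assumed variant (not from the thread): also allow '-' inside the id.
LOOSE_WITH_DASH = re.compile(r"https?://fiction\.live/[^/]*/[^/]*/([a-zA-Z0-9-]+)")


def extract_id(pattern, url):
    """Return the first capture group of pattern for url, or None on no match."""
    m = pattern.match(url)
    return m.group(1) if m else None


urls = [
    "https://fiction.live/Sci-fi/Endless-Journey/-JTOp94O1JnJQ10y557N",
    "https://fiction.live/stories/The-Traveler/-JEdsoZFs11asUSB-SEQ",
    "https://fiction.live/Writing/Side-Quest-Suggestion-Quest/gXKah64TFtZs2fwEd",
]
for url in urls:
    print(extract_id(LOOSE, url), extract_id(LOOSE_WITH_DASH, url))
```

Both patterns accept any two leading path segments, so neither one can tell a story page apart from other three-segment URLs on the site.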

HazelSh pushed a commit to HazelSh/FanFicFare that referenced this issue Sep 8, 2020
@HazelSh

HazelSh commented Sep 8, 2020

Yeah, I suppose it doesn't matter. Still, that's too permissive -- I need to rule out URLs that aren't a story frontpage. I think I've got it now.

I might do that script to scrape URLs anyway, just to see if there's any more surprises waiting.

HazelSh pushed a commit to HazelSh/FanFicFare that referenced this issue Sep 8, 2020
JimmXinu pushed a commit that referenced this issue Sep 8, 2020
@JimmXinu
Owner

JimmXinu commented Sep 8, 2020

New Test version posted in the usual place.

narethdeer pushed a commit to narethdeer/FanFicFare that referenced this issue Sep 26, 2020
3 participants