DMM Scraper unable to find random movies that have code in XXX-00X format #138

Closed
szmere opened this issue Oct 17, 2020 · 3 comments · Fixed by #139
@szmere

szmere commented Oct 17, 2020

Expected Behavior

For the movies below, I expected to get the description from the DMM scraper. The movies exist on DMM. Interestingly, IPX-004, IPX-005, and IPX-006 all retrieve the description successfully, so I'm not sure what the exact pattern is.
IPX-001
IPX-002
IPX-003
SSNI-004
ADN-001
ADN-002
ANGR-008

Current Behavior

Getting errors like the one below for the movies listed above:
[IPX-002] [Get-DmmUrl] not matched on DMM

Steps to Reproduce (for bugs)

Run a simple scrape with the file names listed above.

Your Environment

  • Module version used: 2.1.3
  • Operating System and PowerShell version: Windows 10, PowerShell 7
@jvlflame jvlflame self-assigned this Oct 17, 2020
@jvlflame jvlflame added the bug Something isn't working label Oct 17, 2020
@jvlflame
Collaborator

Thanks for catching this.

This is actually really strange: the link for the video page isn't included in the page's HTML when scraping it.
It seems to happen only for the cases you mention, movie IDs whose number starts with 00.

@szmere
Author

szmere commented Oct 17, 2020

Dang, you respond fast, man. I agree, I didn't get any results from the WebRequest scraper. I think this is probably an issue with WebRequest, unfortunately. I guess I will have to get the description content manually for these few cases.

# Search DMM for IPX-001 and keep only the links that point to video pages
$webRequest = Invoke-WebRequest -Uri "https://www.dmm.co.jp/search/?redirect=1&enc=UTF-8&category=&searchstr=ipx00001"
$searchResults = ($webRequest.Links.href | Where-Object { $_ -like '*digital/videoa/*' })
$searchResults
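
The search string in the request above looks like the movie ID lowercased, with the hyphen dropped and the number zero-padded to five digits (IPX-001 becomes ipx00001). Here is a minimal sketch of that conversion, assuming the same padding rule holds for the other IDs in this issue. `ConvertTo-DmmSearchString` is a hypothetical helper name used only for illustration, not a function from the module.

```powershell
# Hypothetical helper: turns an ID like 'IPX-001' into a search string like 'ipx00001'.
# Assumption: the search string is the lowercased prefix plus the number padded to 5 digits.
function ConvertTo-DmmSearchString {
    param(
        [Parameter(Mandatory)]
        [string]$Id
    )

    # Split 'IPX-001' into the letter prefix and the numeric part.
    if ($Id -match '^(?<prefix>[A-Za-z]+)-(?<number>\d+)$') {
        $prefix = $Matches['prefix'].ToLowerInvariant()
        $number = $Matches['number'].PadLeft(5, '0')
        return "$prefix$number"
    }

    throw "Unrecognized ID format: $Id"
}

ConvertTo-DmmSearchString -Id 'IPX-001'   # ipx00001
```

This makes it easy to try the same request for each failing ID without editing the URL by hand.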

@jvlflame
Collaborator

I already committed a fix to my dev branch.

Fortunately the ContentIDs on R18 and DMM are identical, so I just plugged in the R18 URL scraper for instances where the ID matches XXX-00X.
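
For readers following along, the routing described above could look roughly like the sketch below. The function name `Test-DmmFallbackId`, the regex, and the 'R18'/'DMM' labels are assumptions made for illustration and are not taken from the actual commit.

```powershell
# Hypothetical check: does the ID look like XXX-00X (e.g. IPX-001)?
# Assumption: 'XXX-00X' means a letter prefix, a hyphen, then '00' and one digit.
function Test-DmmFallbackId {
    param(
        [Parameter(Mandatory)]
        [string]$Id
    )

    return $Id -match '^[A-Za-z]+-00\d$'
}

# Decide which URL scraper would be used for each ID.
foreach ($movieId in 'IPX-002', 'IPX-014', 'ANGR-008') {
    $scraper = if (Test-DmmFallbackId -Id $movieId) { 'R18' } else { 'DMM' }
    "$movieId -> $scraper"
}
```

Under this assumed pattern, every ID ending in 00X goes to the R18 URL scraper, including ones like IPX-004 that DMM does return on its own.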

The fix will be in the next release.

@jvlflame jvlflame mentioned this issue Oct 19, 2020
@jvlflame jvlflame linked a pull request Oct 19, 2020 that will close this issue