
JAVLibrary scraper errors (Cloudflare) #127

Closed
org0ne opened this issue Oct 7, 2020 · 7 comments · Fixed by #132
Labels
bug Something isn't working enhancement New feature or request


org0ne commented Oct 7, 2020

Current Behavior

Javinizer emits the error below on every scrape operation, whether the scrape succeeds or not.

Steps to Reproduce (for bugs)

Invoke Javinizer on a folder of videos

Your Environment

  • Module version used: 2.1.2

Write-Error: /home/user/.local/share/powershell/Modules/Javinizer/2.1.2/Private/Invoke-Parallel.ps1:532 Line | 532 | Get-RunspaceData | ~~~~~~~~~~~~~~~~ | [SSNI-881] [Get-JavlibraryUrl] Error occured on [GET] on URL [http://www.javlibrary.com/en/vl_searchbyid.php?keyword=SSNI-881]: | Just a moment... html, body {width: 100%; height: 100%; margin: 0; padding: 0;} body {background-color: #ffffff; color: | #000000; font-family:-apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", Roboto, Oxygen, Ubuntu, "Helvetica Neue",Arial, | sans-serif; font-size: 16px; line-height: 1.7em;-webkit-font-smoothing: antialiased;} h1 { text-align: center; font-weight:700; | margin: 16px 0; font-size: 32px; color:#000000; line-height: 1.25;} p {font-size: 20px; font-weight: 400; margin: 8px 0;} p, | .attribution, {text-align: center;} #spinner {margin: 0 auto 30px auto; display: block;} .attribution {margin-top: 32px;} | @keyframes fader { 0% {opacity: 0.2;} 50% {opacity: 1.0;} 100% {opacity: 0.2;} } @-webkit-keyframes fader { 0% {opacity: 0.2;} | 50% {opacity: 1.0;} 100% {opacity: 0.2;} } #cf-bubbles > .bubbles { animation: fader 1.6s infinite;} #cf-bubbles > | .bubbles:nth-child(2) { animation-delay: .2s;} #cf-bubbles > .bubbles:nth-child(3) { animation-delay: .4s;} .bubbles { | background-color: #f58220; width:20px; height: 20px; margin:2px; border-radius:100%; display:inline-block; } a { color: #2c7cb0; | text-decoration: none; -moz-transition: color 0.15s ease; -o-transition: color 0.15s ease; -webkit-transition: color 0.15s ease; | transition: color 0.15s ease; } a:hover{color: #f4a15d} .attribution{font-size: 16px; line-height: 1.5;} .ray_id{display: | block; margin-top: 8px;} #cf-wrapper #challenge-form { padding-top:25px; padding-bottom:25px; } #cf-hcaptcha-container { | text-align:center;} #cf-hcaptcha-container iframe { display: inline-block;} // | Please turn JavaScript on and reload the page. table Checking your browser before accessing | javlibrary.com. 
Please enable Cookies and reload the page. This process is automatic. Your browser will redirect | to your requested content shortly. Please allow up to 5 seconds… --> | DDoS protection by Cloudflare Ray ID: 5dea11c57d30057d

jvlflame self-assigned this Oct 7, 2020
jvlflame added the "bug" label Oct 7, 2020
jvlflame (Collaborator) commented Oct 7, 2020

Looks like JAVLibrary re-enabled Cloudflare protection on the site. I'll look again for a proper solution. In the meantime, the temporary workaround is to change the javlibrary.baseurl setting to http:\/\/www.g46e.com

jvlflame changed the title from "Write-Error: /home/user/.local/share/powershell/Modules/Javinizer/2.1.2/Private/Invoke-Parallel.ps1:539" to "JAVLibrary scraper errors (Cloudflare)" Oct 7, 2020
jvlflame pinned this issue Oct 7, 2020
zuko7177 (Contributor) commented Oct 8, 2020

Here's a workaround that JAVMovieScraper uses:

  1. Go to JavLibrary in a browser such as Firefox and pass the Cloudflare check.
  2. Export the cookie using a browser plugin (I use cookies.txt in Firefox).
  3. Have Javinizer use that cookie when scraping JavLibrary. The user agent has to match the browser's. I use this site to get the user agent: https://www.whatismybrowser.com/detect/what-is-my-user-agent

The session should be good for an hour, which gives plenty of time for scraping.

So two settings would be added to jvSettings.json for JavLibrary scraping:

  1. location of cookie file
  2. user agent

It's not the most elegant solution, but it works.
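
For anyone who wants to try the idea by hand before it lands in Javinizer, here is a minimal sketch of the cookie-plus-user-agent approach in PowerShell. Everything in it is an assumption for illustration: the function name New-BrowserCookieSession is hypothetical and not part of Javinizer, the cookie names __cfduid and cf_clearance are the ones Cloudflare was setting in the browser at the time, and the cookie domain is a guess that may need adjusting.

```powershell
# Hypothetical helper (not part of Javinizer): builds a web session carrying
# the Cloudflare cookies copied from a browser that already passed the check.
function New-BrowserCookieSession {
    param(
        [Parameter(Mandatory)] [string] $Cfduid,
        [Parameter(Mandatory)] [string] $Cfclearance,
        [Parameter(Mandatory)] [string] $UserAgent,
        [string] $Domain = '.javlibrary.com'   # assumed cookie domain
    )

    $session = New-Object Microsoft.PowerShell.Commands.WebRequestSession

    # The user agent must match the browser the cookies were exported from.
    $session.UserAgent = $UserAgent

    # Assumed cookie names, as seen in the browser's cookie store.
    $cookies = [ordered]@{
        '__cfduid'     = $Cfduid
        'cf_clearance' = $Cfclearance
    }
    foreach ($name in $cookies.Keys) {
        $cookie = [System.Net.Cookie]::new($name, $cookies[$name], '/', $Domain)
        $session.Cookies.Add($cookie)
    }

    return $session
}
```

The session could then be passed to a request with something like `Invoke-WebRequest -Uri $url -WebSession $session`, where `$url` is the JavLibrary page to fetch. Once the clearance cookie expires, the values have to be exported from the browser again.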

jvlflame (Collaborator) commented Oct 8, 2020

@zuko7177 Thanks for that info.
I'll add that functionality just so we do have an option to get the original javlibrary scraper back.

jvlflame added the "enhancement" label Oct 10, 2020
jvlflame (Collaborator) commented Oct 10, 2020

Added a few methods to manually specify the Javlibrary websession.
Any time a new websession is specified, it is saved to your settings file and reused until Javinizer detects that the websession has expired, at which point it will prompt you for new cookies.

# 1. Input websession values during Javinizer runtime
Javinizer -Path 'C:\JAV\Unsorted' -DestinationPath 'C:\JAV\Sorted'

cmdlet Get-CfSession at command pipeline position 1
Supply values for the following parameters:
Cfduid: dec857987e88528669d3bda527c813213213718
Cfclearance: 812db5df4cfffb1ec11b2b5cff38d6323b842a9d-1602366718-0-1zc9220a8fz43c94e72z7b83132-150
UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36

# 2. Create the websession object first and specify it as a parameter
$session = Get-CfSession -Cfduid 'dec857987e88528669d3bda527c8119731602366742' `
-Cfclearance '812db5df4cfffb1ec11b2b5cff38d6323b842a9d-1602366718-0-1zc9220a8fz43c94e72z7b83181-150' `
-UserAgent 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36'

Javinizer -Path 'C:\JAV\Unsorted' -DestinationPath 'C:\JAV\Sorted' -CfSession $session

# 3. Manually populate the settings file with websession values
Javinizer -OpenSettings

"javlibrary.browser.useragent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36",
"javlibrary.cookie.cfduid": "dec857987e88528669d3bda527c8119731602366718",
"javlibrary.cookie.cfclearance": "812db5df4cfffb1ec11b2b5cff38d6323b842a9d-1602366718-0-1zc9220a8fz43c94e72z7b83181-150",

This also resolves -SetOwned functionality for setting owned movies on Javlibrary.
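
As a rough illustration of how an expired websession might be noticed, the sketch below checks a response body for text from the Cloudflare challenge page quoted in the original report. This is not Javinizer's actual detection logic; the function name Test-CloudflareChallenge is hypothetical, and the marker strings are simply copied from the error output at the top of this issue.

```powershell
# Hypothetical helper (not Javinizer's actual check): returns $true when a
# response body looks like the Cloudflare challenge page from the report.
function Test-CloudflareChallenge {
    param(
        [Parameter(Mandatory)] [AllowEmptyString()] [string] $Html
    )

    # Marker strings taken from the challenge page in the original error.
    $markers = @(
        'Checking your browser before accessing',
        'DDoS protection by Cloudflare'
    )

    foreach ($marker in $markers) {
        if ($Html.Contains($marker)) { return $true }
    }
    return $false
}
```

A scraper could call this on each response and ask for fresh cookies when it returns $true. The match is case-sensitive and tied to the wording of that one challenge page, so it would need updating if Cloudflare changes the text.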

ozawaslave commented

Tested on macOS 11.0 (Big Sur Beta) with PowerShell 7. It works using the above method.

jvlflame (Collaborator) commented

@ozawaslave Thanks for the confirmation.

zuko7177 (Contributor) commented

Working for me on Windows as well. Thanks.
