Proxy Rotation #778

Open
wants to merge 243 commits into
base: alpha
Changes from 2 commits
620549c
new pull request because I suck
hwamil May 14, 2021
7636050
moved assigning next item to the bottom of while loop since AmazonMon…
hwamil May 14, 2021
a0263dd
improved BadProxyCollector
hwamil May 15, 2021
88d7aa5
made bad_proxies.json parameters more clear
hwamil May 15, 2021
7a287b2
proxy collector now deletes an unbanned proxy if it has been good for…
hwamil May 15, 2021
e5326e5
Fixed a huge mistake in item assignment thanks to mbbush
hwamil May 15, 2021
acd430e
changed ItemsHandler assign_next_item method name to pop, although it…
hwamil May 15, 2021
7e27c82
more methods for collector. gotta figure out how to save to file ONCE…
hwamil May 15, 2021
a68b3bf
added 'issaver' kwarg to AmazonMonitor. After all sessions are create…
hwamil May 15, 2021
6d24ffc
BadProxyCollector is now never instantiated
hwamil May 15, 2021
1e12313
BadProxyCollector delay for purging cooled down proxies from json cha…
hwamil May 15, 2021
a8060d7
new proxies.template_json
hwamil May 15, 2021
6eda06e
fixed error in proxies template
hwamil May 15, 2021
75da6cc
added return line in except block at BPC load so that it doesn't loop…
hwamil May 15, 2021
cdec75b
previous commit was a mistake. it is supposed to loop lmao
hwamil May 15, 2021
c25a4f3
if no proxies and sleeping sessions for a minute if 503 response
hwamil May 16, 2021
25cbc73
implemented no proxy mode. uses single session using TCPConnector and…
hwamil May 16, 2021
61569a9
changed 503 time.sleep to await asyncio.sleep
hwamil May 16, 2021
92eed16
503 log change for no proxy mode
hwamil May 16, 2021
3c2d936
Saving user agents to reuse when generating headers for new instances…
hwamil May 16, 2021
ee0f14e
feature to save user_agents for corresponding proxy urls so that sess…
hwamil May 16, 2021
a47a0d0
merged headers branch, adding user_agents save feature
hwamil May 16, 2021
31148af
badproxycollector timer bug fix
hwamil May 16, 2021
a511e02
BPC fixed attribute error
hwamil May 16, 2021
b9f68b4
experimental feature stagger
hwamil May 17, 2021
265248c
add init_sleep parameter using idx in monitor initialization and use …
hwamil May 17, 2021
3183587
fixed unexpected kwarg
hwamil May 17, 2021
ce40261
stagger using len of item_list instead of n of proxies
hwamil May 17, 2021
15beaea
realized that cycling len of item list defeats the purpose. using ran…
hwamil May 17, 2021
92ef86e
random.choice may generate too many repeating numbers. going back to …
hwamil May 17, 2021
1e86ef8
threw all previous methods away and just used proxies // len(item_lis…
hwamil May 17, 2021
f1ef100
added init_sleep attr to fail_recreate method
hwamil May 17, 2021
253a6ec
abandoned previous method and placed staggering algo right into the I…
hwamil May 17, 2021
ba043d0
trying another method of staggering
hwamil May 17, 2021
cdf2b28
fixed unpack error caused by remnants of previous method
hwamil May 17, 2021
7269253
fixed mistake of updating access time when returning true on itemshan…
hwamil May 17, 2021
0ed906c
changed method name
hwamil May 17, 2021
1368e33
changed next item flow by accident
hwamil May 17, 2021
183ac89
fixed BPC bug
hwamil May 17, 2021
3f458ca
fixed BPC bug 2
hwamil May 17, 2021
40561fc
increased bad proxy wait from 1 min to 5 min
hwamil May 17, 2021
04c1f35
Removed issaver flag from Monitoring and moved it into queue. Also re…
hwamil May 17, 2021
13ba193
simplified BPC so that it only shows currently bad proxies with last …
hwamil May 18, 2021
75e9345
blackd
hwamil May 18, 2021
52b8817
Realized BPC being called could break program when user has no proxies
hwamil May 18, 2021
2012bb8
cleaned up unused parameter checkout_task from AZNHandler and AZNMHan…
hwamil May 18, 2021
d91ec1d
simplified BPC further to just be a list so that it's more readable a…
hwamil May 18, 2021
e5a5901
increased sleep time from 0.1 to 0.5 for last_access_check for item s…
hwamil May 18, 2021
f055c30
pipenv install uvloop for fasteer asyncio
hwamil May 18, 2021
f3fbb4a
changed staggering back to 1s because... why not?
hwamil May 18, 2021
0044216
cleaning bloat
hwamil May 18, 2021
6b826a5
small things
hwamil May 18, 2021
dfb7b9f
moved IH item_ids declarance into create_items_pool block.
hwamil May 18, 2021
9746abd
Update misc.py
hwamil May 19, 2021
1648ec3
WIP
hwamil May 20, 2021
bae1305
uvloop now a click option as Windows is not supported
hwamil May 20, 2021
b31f1a0
migrated tools from misc to monitoring
hwamil May 20, 2021
340f17e
fixed merge conflict
hwamil May 20, 2021
9d695cb
tracking misc.py
hwamil May 20, 2021
80f0a4f
fixed missing bad_proxies path var and changed item delay to be impli…
hwamil May 20, 2021
ad5395a
apparently you can't import asyncio.sleep
hwamil May 20, 2021
06f62a1
changed it so that task_delay changes with the change in the number o…
hwamil May 20, 2021
765018f
small fix for bpc
hwamil May 20, 2021
0ba6320
renamed bpcs
hwamil May 20, 2021
7377d5f
log for task_delay
hwamil May 20, 2021
a10bdc8
fixed error in f string
hwamil May 20, 2021
5cbb9de
bit more info in bpc log
hwamil May 20, 2021
b4c5c08
changed method name for json_url
hwamil May 20, 2021
323fc50
merge conflict resolved
hwamil May 20, 2021
2436bd8
delay / (good_proxies/asins) if len(good_proxies) > len(asins)
hwamil May 20, 2021
d092bb5
trying different flow
hwamil May 20, 2021
c7e4f31
time.sleep severely impacts performance. Shouldn't be a surprise :/
hwamil May 20, 2021
fb4fd61
took out first item condition
hwamil May 20, 2021
c707989
small changes to check_last_access
hwamil May 20, 2021
db8ed35
commit before merging rotation
hwamil May 20, 2021
e5a67b6
conflict resolved
hwamil May 20, 2021
b40eb98
use same headers when recreating. fake_user_agent causing too many op…
hwamil May 20, 2021
d102ad5
reduced proxies.json layer
hwamil May 20, 2021
94c1cb8
proxies.template_json
hwamil May 20, 2021
5c2171f
queue.put to put_nowait so that it doesn't get blocked
hwamil May 20, 2021
96aee13
had to await last_access_check for it to work properly
hwamil May 20, 2021
01f19e6
changed item stagger to a task stagger implementation where it checks…
hwamil May 20, 2021
d8a4442
changed rest_time format from rounding to two decimal places to no ro…
hwamil May 20, 2021
ebadea7
changed formatting one more time to milliseconds with rounding
hwamil May 20, 2021
9eb3dca
set last_task before sleeping
hwamil May 20, 2021
d9c1569
working build
hwamil May 20, 2021
f7f49bd
tested and working with multiple groups of proxies
hwamil May 20, 2021
ac7c311
time.sleep to async.sleep. realized even if it's a very small amount …
hwamil May 20, 2021
3df8685
got rid of bpc.save since you can find it in log and it slows down mo…
hwamil May 20, 2021
8a9fedb
reducing bpc even further. may just get rid of it altogether. seems u…
hwamil May 20, 2021
fda822d
preload proxy group to get rid of ramp up
hwamil May 20, 2021
485af97
Revert "preload proxy group to get rid of ramp up"
hwamil May 20, 2021
891ed0b
failed to preload proxies and reverted. little hacky but it'll do for…
hwamil May 20, 2021
1bfbcf4
switched to time.sleep again. see what happens
hwamil May 20, 2021
494c812
just going back to asyncio.sleep cuz I don't know any better
hwamil May 20, 2021
9319305
threw out dynamic staggering and just implemented initial staggering …
hwamil May 20, 2021
a3485d2
change non-active groups sleep to delay so it doesn't hang the progra…
hwamil May 20, 2021
16cf946
got rid of unnecessary redundancies
hwamil May 20, 2021
021a0b3
.
hwamil May 20, 2021
5a6d7dd
..
hwamil May 20, 2021
7f0c314
fixed broken timer
hwamil May 20, 2021
bf1704b
I'm too tired to know what I did
hwamil May 20, 2021
2898e71
Merge branch 'rotation_v2.1' into atc_json
hwamil May 21, 2021
a6bf8fb
merged rotation
hwamil May 21, 2021
4c2d7a9
moved uvloop to dev packages so normal users on windows don't have to…
hwamil May 21, 2021
0fe6968
almost implemented
hwamil May 21, 2021
f1df53f
clunky but working
hwamil May 21, 2021
f7e19b7
Merge branch 'atc_json' into rotation_v2.1
hwamil May 21, 2021
16d8177
tiny touch up
hwamil May 21, 2021
ef611b5
timers in the blocks so we're not hitting super fast
hwamil May 21, 2021
779438d
.
hwamil May 21, 2021
537733a
conflict resolved
hwamil May 21, 2021
1004703
checks and balances
hwamil May 21, 2021
709769a
neatify logs
hwamil May 21, 2021
b4c7b17
placed another check so that if json request gets 503'd it moves onto…
hwamil May 21, 2021
bed40d6
blackd
hwamil May 21, 2021
98c8b51
omg this is clunky as hell
hwamil May 21, 2021
d1e33c3
TypeError bandaid in validate_session
hwamil May 21, 2021
d9e93b9
big bandaid for nonetype error
hwamil May 21, 2021
9db43be
tree is not None
hwamil May 21, 2021
0e38e37
proxy next to dict to see confirm change
hwamil May 21, 2021
0957b32
lol coffeebeans freaked me out
hwamil May 21, 2021
1b5fb14
changed test product to a card so sessions start empty lmao
hwamil May 21, 2021
fe22a2e
stopping validation every time the loop starts
hwamil May 21, 2021
dc91dd8
queue.put before save_html, duh
hwamil May 21, 2021
78abf2a
Update amazon_monitoring.py
hwamil May 21, 2021
e470643
clearing json_dict before get to confirm we are getting valid json pa…
hwamil May 22, 2021
16c9a4a
fixed turbo_ini params. confirms it initiates correctly. received emp…
hwamil May 22, 2021
6696154
more logssss cuz we don't have enoughhhh
hwamil May 22, 2021
f9c9eee
remember me checkbox fix taken from calebchongc
hwamil May 22, 2021
2c8002e
fixed confusing debug log that made it appear that it runs ajax on ev…
hwamil May 22, 2021
ef4e7e6
just making logs look pretty.
hwamil May 22, 2021
fbc3809
turbo_checkout missing domain param added
hwamil May 22, 2021
e1e143c
domain -> self.amazon_domain
hwamil May 22, 2021
4dbbeeb
bring me more logs!
hwamil May 22, 2021
7090f5d
isOK check added
hwamil May 22, 2021
52585c0
isOK check added2
hwamil May 22, 2021
fda3247
unsplitted turbo init branching for json and ajax methods
hwamil May 22, 2021
7f776ac
little clean up
hwamil May 22, 2021
5134846
using context manager (async with session.get(url) as r) to validate …
hwamil May 22, 2021
40dd1ad
more cleaning
hwamil May 22, 2021
5f4897f
get revalidated if CSRF Error
hwamil May 22, 2021
30e0543
continue after failing validation
hwamil May 22, 2021
ff565b4
ItemsHandler now adds back items that's been removed after turboing i…
hwamil May 22, 2021
1dbc92e
offerid.template_json
hwamil May 22, 2021
4c33718
ItemHandler timer increased from 10 min to 60 min
hwamil May 22, 2021
538faea
asyncio InvalidStateError bandaid exception catching and timer reset …
hwamil May 22, 2021
efc91ba
clear out removed items list on ItemsHandler.refresh
hwamil May 22, 2021
a6521ef
I'm stupid
hwamil May 22, 2021
a547e79
one less line for the same result. bliss
hwamil May 22, 2021
bb0c46a
Figured out that ValueError: not in list was happening because multip…
hwamil May 23, 2021
f59e886
Merge branch 'alpha' of github.com:Hari-Nagarajan/fairgame into rotat…
hwamil May 23, 2021
aa0beda
cleaned off bpc
hwamil May 23, 2021
a795d85
ValueError exception handling modified
hwamil May 23, 2021
d8d206b
moved queue.put() and save_html into try block instead of outside of …
hwamil May 23, 2021
fd66e60
forgot the f in f-string
hwamil May 23, 2021
933d240
log for checking whether turbo init is getting SellerDetail or offerid
hwamil May 23, 2021
1d121a4
trying to catch StopIteration
hwamil May 23, 2021
11aa281
exception StopIteration catching at next_item method
hwamil May 23, 2021
1000e05
stop logging ValueError
hwamil May 23, 2021
2693d7f
random delay
hwamil May 23, 2021
ce94705
asyncio.TimeoutError exception catching
hwamil May 23, 2021
d747451
log for timeouterror
hwamil May 23, 2021
c18d5ad
lmao infinite delay by mistake. fixed
hwamil May 23, 2021
30b9bee
some sleeping cushions in validation process
hwamil May 23, 2021
0ec1fed
changed randint range to 0,4
hwamil May 23, 2021
58dc858
all exception catching try/except block at cli.py just as a hotfix
hwamil May 23, 2021
89045ab
some changes that I can't remember
hwamil May 24, 2021
b392ea4
fail_recreate seems to cause the crashes (for unknown reasons to me s…
hwamil May 24, 2021
7d858f2
I can't count
hwamil May 24, 2021
fa0298e
getting rid of staggering since now there is random delay
hwamil May 24, 2021
9db4e1c
Reset fail_counter after cooldown
hwamil May 24, 2021
d3a4275
Now using fake_headers module to generate random headers for sessions
hwamil May 24, 2021
f71bd22
dependencies for fake_headers
hwamil May 24, 2021
bcf1149
max fail from 10 to 5
hwamil May 24, 2021
8781e49
current_group_proxies dust cleaned
hwamil May 24, 2021
84c1d34
changed loglevel for ajax so we can see non-offerid items in aioconfig
hwamil May 24, 2021
ca09532
you know i love logs
hwamil May 24, 2021
cd0861f
Trying to catch the reason for 'can't extract image from plain/text' …
hwamil May 24, 2021
fe13f56
amazoncaptcha.exceptions.ContentTypeError was the culprit. Why now? W…
hwamil May 24, 2021
1895e6e
catching typeerror
hwamil May 25, 2021
df3c81d
Linux specific Errno 32, for python it becomes IOError. Putting entir…
hwamil May 25, 2021
039099b
merged hari/alpha with TheTabKey's commit
hwamil May 25, 2021
f5bf60e
proccesspoolexecutor added instead of trying to priorityqueue
hwamil May 26, 2021
ece3a80
merged hari/alpha: --use-proxies flag
hwamil May 26, 2021
ee5ea35
more resolving with main alpha branch and adding offerid flag
hwamil May 26, 2021
8627fec
turned on debug logs
hwamil May 26, 2021
dbb72cb
merged processpoolexecutor method
hwamil May 26, 2021
0665916
blackd
hwamil May 26, 2021
5069651
trying multiprocessing on captcha solving as well
hwamil May 26, 2021
04b943e
increased proxy cooldown time from 1 hr to 6 hrs to lessen chance of …
hwamil May 26, 2021
e09878e
took captcha solving off of multiprocessing.
hwamil May 26, 2021
d6ca7ed
added back init stagger to potentially alleviate IO congestion... pre…
hwamil May 26, 2021
8812eca
changed so that asyncio.gather(monitors) executes first without being…
hwamil May 26, 2021
487b1d3
went back to the old way of submitting to process pool as it made mor…
hwamil May 26, 2021
4f4e527
misc
hwamil May 26, 2021
b477cae
misc2
hwamil May 26, 2021
1718264
misc3
hwamil May 26, 2021
f007999
trying out asyncio's own run_in_executor for parallel computing. capt…
hwamil May 27, 2021
45a8b85
deleted concurrent.futures import line. one thing to note is that run…
hwamil May 27, 2021
56408ef
limit captcha solving workers to 2 so that other two can run checkout…
hwamil May 27, 2021
b53241c
just putting run_in_executor on everything I can since I'm getting 40…
hwamil May 30, 2021
ed4e1ea
take get_qualified_seller off executor
hwamil Jun 4, 2021
1cf50e0
made small change to source code for amazoncaptcha so copied the libr…
hwamil Jun 4, 2021
f7796a4
placed sleep in while loop in validate_session
hwamil Jun 4, 2021
3235aa9
using fake-headers since mobile headers ain't doing shit
hwamil Jun 4, 2021
080bc92
experimenting with captcha solves and whether it can get past bot det…
hwamil Jun 4, 2021
f2030b9
instead of sleeping an hour and resuming now 10 fails will make a pro…
hwamil Jun 4, 2021
9996c8b
updating log statements
hwamil Jun 4, 2021
ec03aee
dem logs
hwamil Jun 4, 2021
45f0a7e
checks status along with tree so we don't unnecessarily call get_sellers
hwamil Jun 4, 2021
8682896
dem logss
hwamil Jun 4, 2021
58f852b
less noise in the logs
hwamil Jun 4, 2021
363bd6e
randomize sleep times so program not bombarded all at once when coold…
hwamil Jun 5, 2021
e3d6019
50 tries seem okay but with a lot of proxies getting captchas it real…
hwamil Jun 5, 2021
ce0dbf2
add to bad proxies list when failing validation
hwamil Jun 5, 2021
3caa800
changed fail sleep time between 5-10 minutes instead of 30-60 minutes
hwamil Jun 5, 2021
7786572
added captchaaio
hwamil Jun 5, 2021
04d342e
undoing multiprocessing on checkout_worker since some users complaine…
hwamil Jun 6, 2021
a0a39ba
await gather
hwamil Jun 6, 2021
c378ade
modify logs
hwamil Jun 6, 2021
226c0cf
got rid of cooldown for 503 since json method gets 200 while ajax get…
hwamil Jun 6, 2021
05b6943
remove proxy from badproxies list if it becomes validated
hwamil Jun 6, 2021
b14e86f
clean up
hwamil Jun 6, 2021
151d7c1
trying another way to run checkout_worker in parallel
hwamil Jun 6, 2021
b111eef
apparently if you don't set process executor for run_in_executor it d…
hwamil Jun 6, 2021
a304f68
captcha max_workers to half cpu_count
hwamil Jun 6, 2021
96d442d
captcha max try back to 25
hwamil Jun 6, 2021
956006e
infinite (1000) captcha tries
hwamil Jun 7, 2021
0900119
changes I can't remember.
hwamil Jun 8, 2021
d2a7566
Merge branch 'alpha' of github.com:Hari-Nagarajan/fairgame into rotat…
hwamil Jun 8, 2021
574ef28
merged alpha
hwamil Jun 8, 2021
ea3975f
no hard-coded domain
hwamil Jun 8, 2021
0a17616
Merge branch 'alpha' of github.com:Hari-Nagarajan/fairgame into rotat…
hwamil Jun 8, 2021
12e1419
fix lack of scheme -Dakk
hwamil Jun 8, 2021
6ebd2d5
merged alpha
hwamil Jun 8, 2021
526d462
I'm getting way too many captchas. gotta filter out the good ones
hwamil Jun 8, 2021
8b5f765
tiny fix for good_proxies.json len
hwamil Jun 8, 2021
1c0343e
more proxies list changes
hwamil Jun 8, 2021
0a7143b
.
hwamil Jun 8, 2021
ead4438
pass monitoring session into captcha solver so that it doesn't use ho…
hwamil Jun 8, 2021
26d6127
log and html save changes
hwamil Jun 11, 2021
4434e3b
.
hwamil Jun 28, 2021
133 changes: 83 additions & 50 deletions stores/amazon_monitoring.py
@@ -28,6 +28,8 @@
from fake_useragent import UserAgent
from amazoncaptcha import AmazonCaptcha

from itertools import cycle

from urllib.parse import urlparse

import re
@@ -66,22 +68,6 @@
policy = asyncio.WindowsSelectorEventLoopPolicy()
asyncio.set_event_loop_policy(policy)

# PDP_URL = "https://smile.amazon.com/gp/product/"
# AMAZON_DOMAIN = "www.amazon.com.au"
# AMAZON_DOMAIN = "www.amazon.com.br"
# AMAZON_DOMAIN = "www.amazon.ca"
# NOT SUPPORTED AMAZON_DOMAIN = "www.amazon.cn"
# AMAZON_DOMAIN = "www.amazon.fr"
# AMAZON_DOMAIN = "www.amazon.de"
# NOT SUPPORTED AMAZON_DOMAIN = "www.amazon.in"
# AMAZON_DOMAIN = "www.amazon.it"
# AMAZON_DOMAIN = "www.amazon.co.jp"
# AMAZON_DOMAIN = "www.amazon.com.mx"
# AMAZON_DOMAIN = "www.amazon.nl"
# AMAZON_DOMAIN = "www.amazon.es"
# AMAZON_DOMAIN = "www.amazon.co.uk"
# AMAZON_DOMAIN = "www.amazon.com"
# AMAZON_DOMAIN = "www.amazon.se"

AMAZON_URLS = {
"BASE_URL": "https://{domain}/",
@@ -95,6 +81,7 @@

PDP_PATH = "/dp/"
REALTIME_INVENTORY_PATH = "gp/aod/ajax?asin="
BAD_PROXIES_PATH = "html_saves/bad_proxies.json"

CONFIG_FILE_PATH = "config/amazon_requests_config.json"
PROXY_FILE_PATH = "config/proxies.json"
@@ -111,6 +98,33 @@
amazon_config = {}


class ItemsHandler:
@classmethod
def create_items_pool(cls, item_list):
cls.items = cycle(item_list)

@classmethod
def assign_next_item(cls):
return next(cls.items)

class BadProxyCollector:
@classmethod
def __init__(cls):
if os.path.exists(BAD_PROXIES_PATH):
with open(BAD_PROXIES_PATH) as f:
cls.collection = json.load(f)
else:
cls.collection = {}

@classmethod
def collect(cls, status, connector):
if status == 503:
proxy = str(connector.proxy_url)
cls.collection.update({proxy : "null"})
with open(BAD_PROXIES_PATH, "w") as f:
json.dump(cls.collection, f, indent=4)


class AmazonMonitoringHandler(BaseStoreHandler):
http_client = False
http_20_client = False
@@ -123,8 +137,7 @@ def __init__(
delay: float,
amazon_config,
tasks=1,
checkshipping=False,
) -> None:
checkshipping=False,) -> None:
log.debug("Initializing AmazonMonitoringHandler")
super().__init__()

@@ -133,62 +146,46 @@ def __init__(
self.notification_handler = notification_handler
self.check_shipping = checkshipping
self.item_list: typing.List[FGItem] = item_list
self.stock_checks = 0
self.start_time = int(time.time())
self.amazon_config = amazon_config
ua = UserAgent()

self.proxies = get_proxies(path=PROXY_FILE_PATH)
ItemsHandler.create_items_pool(self.item_list)

# Initialize the Session we'll use for stock checking
log.debug("Initializing Monitoring Sessions")
self.sessions_list: Optional[List[AmazonMonitor]] = []
for idx in range(len(item_list)):
connector = None
if self.proxies and idx < len(self.proxies):
connector = ProxyConnector.from_url(self.proxies[idx]["https"])
for idx in range(len(self.proxies)):
Reviewer

This won't be suitable for merging unless it also supports the "default" configuration of no proxies.

Reviewer

But since you're cycling each AmazonMonitor through all the items, that shouldn't be hard to add: if there are no proxies, just instantiate a single AmazonMonitor that uses the default connector.

This PR changes the meaning of the delay parameter from a per-item delay to a per-connection/proxy delay. That still seems totally workable, though, as long as people running the code are aware of it.
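
A minimal sketch of that fallback, assuming the proxy list is a plain list of URL strings. `MonitorSpec` and `build_monitor_specs` are hypothetical names, not part of this PR; real code would build an `AmazonMonitor` from each spec, passing a proxy connector when `proxy_url` is set and leaving the connector unset otherwise so aiohttp falls back to its default.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MonitorSpec:
    """Hypothetical description of one monitoring session (not in the PR)."""

    proxy_url: Optional[str]  # None means: no proxy, use the default connector
    delay: float  # seconds between checks made by this session


def build_monitor_specs(proxies: List[str], delay: float) -> List[MonitorSpec]:
    """Return one spec per proxy, or a single proxy-less spec if there are none."""
    if not proxies:
        return [MonitorSpec(proxy_url=None, delay=delay)]
    return [MonitorSpec(proxy_url=url, delay=delay) for url in proxies]


# Example proxy URLs are placeholders for illustration only.
no_proxy = build_monitor_specs([], delay=5.0)
with_proxies = build_monitor_specs(
    ["socks5://10.0.0.1:1080", "socks5://10.0.0.2:1080"], delay=5.0
)
```

Either way the caller gets a non-empty list, so the loop that creates sessions doesn't need a separate no-proxy branch.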

Author

Yeah I wrote it so that the delay parameter still works as intended - it's still the delay between checks per session/proxy, which should keep them from getting dogg'd (503). I kinda figured if you're not using proxies alpha is kinda useless; might as well use the master branch.
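
To make the per-session vs. per-item distinction concrete, here's a rough back-of-the-envelope sketch. It assumes the shared item cycle hands items out evenly and ignores request latency; `approx_item_interval` is a hypothetical helper, not something in this PR.

```python
def approx_item_interval(delay: float, n_items: int, n_sessions: int) -> float:
    """Approximate seconds between two checks of the same item.

    Each session makes one request every `delay` seconds, so together the
    sessions make n_sessions / delay requests per second, spread over n_items.
    """
    if n_items < 1 or n_sessions < 1:
        raise ValueError("need at least one item and one session")
    return delay * n_items / n_sessions


# 4 items rotated over 2 proxied sessions with a 5 s per-session delay.
two_proxies = approx_item_interval(delay=5.0, n_items=4, n_sessions=2)

# Same 4 items with a single proxy-less session.
single_session = approx_item_interval(delay=5.0, n_items=4, n_sessions=1)
```

Under these assumptions each proxy is still hit once per `delay`, but how often a given item gets checked depends on the item-to-session ratio.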

Reviewer

I disagree pretty strongly that alpha is useless without proxies. I was getting softbanned using master because it was doing stock checks while logged in. Alpha uses anonymous sessions for stock checking, which is a huge improvement. I successfully bought what I wanted (granted, an AMD CPU, not a GPU) using alpha with no proxies.

Author

I've implemented a default mode with no proxies, though it looks ugly rn. Gonna try to clean it up once I figure out how.

connector = ProxyConnector.from_url(self.proxies[idx])
self.sessions_list.append(
AmazonMonitor(
headers=HEADERS,
item=item_list[idx],
amazon_config=self.amazon_config,
connector=connector,
delay=delay,
)
)
self.sessions_list[idx].headers.update({"user-agent": ua.random})


# class Offers(NamedTuple):
# asin: str
# offerlistingid: str
# merchantid: str
# price: float
# timestamp: float
# __slots__ = ()
#
# def __str__(self):
# return f"ASIN: {self.asin}; offerListingId: {self.offerlistingid}; merchantId: {self.merchantid}; price: {self.price}"


class AmazonMonitor(aiohttp.ClientSession):
def __init__(
self,
item: FGItem,
amazon_config: Dict,
delay: float,
*args,
**kwargs,
):
super(self.__class__, self).__init__(*args, **kwargs)
self.item = item
self.item = self.next_item()
self.check_count = 1
self.amazon_config = amazon_config
self.domain = urlparse(self.item.furl.url).netloc

self.delay = delay
if item.purchase_delay > 0:
if self.item.purchase_delay > 0:
self.delay = 20
self.block_purchase_until = time.time() + item.purchase_delay
self.block_purchase_until = time.time() + self.item.purchase_delay
log.debug("Initializing Monitoring Task")

def assign_config(self, azn_config):
@@ -197,14 +194,13 @@ def assign_config(self, azn_config):
def assign_delay(self, delay: float = 5):
self.delay = delay

def assign_item(self, item: FGItem):
self.item = item
def next_item(self):
return ItemsHandler.assign_next_item()

def fail_recreate(self):
# Something wrong, start a new task then kill this one
log.debug("Max consecutive request fails reached. Restarting session")
session = AmazonMonitor(
item=self.item,
amazon_config=self.amazon_config,
delay=self.delay,
connector=self.connector,
@@ -218,14 +214,17 @@ async def stock_check(self, queue: asyncio.Queue, future: asyncio.Future):
# Do first response outside of while loop, so we can continue on captcha checks
# and return to start of while loop with that response. Requires the next response
# to be grabbed at end of while loop
log.debug(f"Monitoring Task Started for {self.item.id}")

# log.debug(f"Monitoring Task Started for {self.item.id}")
collector = BadProxyCollector()

fail_counter = 0 # Count sequential get fails
delay = self.delay
end_time = time.time() + delay
status, response_text = await self.aio_get(url=self.item.furl.url)

save_html_response("stock-check", status, response_text)
# save_html_response("stock-check", status, response_text)
collector.collect(status, self.connector)

# do this after each request
fail_counter = check_fail(status=status, fail_counter=fail_counter)
@@ -234,10 +233,10 @@ async def stock_check(self, queue: asyncio.Queue, future: asyncio.Future):
future.set_result(session)
return

check_count = 1

# Loop will only exit if a qualified seller is returned.
while True:
log.debug(f"{self.item.id} Stock Check Count: {check_count}")
log.debug(f"{self.item.id} : {self.connector.proxy_url} : Stock Check Count = {self.check_count}")
tree = check_response(response_text)
if tree is not None:
if captcha_element := has_captcha(tree):
@@ -287,15 +286,17 @@ async def stock_check(self, queue: asyncio.Queue, future: asyncio.Future):
await wait_timer(end_time)
end_time = time.time() + delay
status, response_text = await self.aio_get(url=self.item.furl.url)
save_html_response("stock-check", status, response_text)
# save_html_response("stock-check", status, response_text)
collector.collect(status, self.connector)
# do this after each request
fail_counter = check_fail(status=status, fail_counter=fail_counter)
if fail_counter == -1:
session = self.fail_recreate()
future.set_result(session)
return

check_count += 1
self.check_count += 1
self.next_item()

async def aio_get(self, url):
text = None
@@ -526,6 +527,7 @@ def get_proxies(path=PROXY_FILE_PATH):

return proxies


# def verify(self):
# log.debug("Verifying item list...")
# items_to_purge = []
@@ -644,3 +646,34 @@ def get_proxies(path=PROXY_FILE_PATH):
# pickle.dump(item_cache, open(item_cache_file, "wb"))
#
# return True


# class Offers(NamedTuple):
# asin: str
# offerlistingid: str
# merchantid: str
# price: float
# timestamp: float
# __slots__ = ()
#
# def __str__(self):
# return f"ASIN: {self.asin}; offerListingId: {self.offerlistingid}; merchantId: {self.merchantid}; price: {self.price}"


# PDP_URL = "https://smile.amazon.com/gp/product/"
# AMAZON_DOMAIN = "www.amazon.com.au"
# AMAZON_DOMAIN = "www.amazon.com.br"
# AMAZON_DOMAIN = "www.amazon.ca"
# NOT SUPPORTED AMAZON_DOMAIN = "www.amazon.cn"
# AMAZON_DOMAIN = "www.amazon.fr"
# AMAZON_DOMAIN = "www.amazon.de"
# NOT SUPPORTED AMAZON_DOMAIN = "www.amazon.in"
# AMAZON_DOMAIN = "www.amazon.it"
# AMAZON_DOMAIN = "www.amazon.co.jp"
# AMAZON_DOMAIN = "www.amazon.com.mx"
# AMAZON_DOMAIN = "www.amazon.nl"
# AMAZON_DOMAIN = "www.amazon.es"
# AMAZON_DOMAIN = "www.amazon.co.uk"
# AMAZON_DOMAIN = "www.amazon.com"
# AMAZON_DOMAIN = "www.amazon.se"
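
For reference, the shared item pool that `ItemsHandler` builds on `itertools.cycle` can be sketched in isolation. This is an illustration under assumptions, not the PR's code: `ItemPool` and `Session` are hypothetical stand-ins, and items are plain strings rather than `FGItem` objects. In this sketch the value returned by the pool is assigned back to the session on every rotation.

```python
from itertools import cycle
from typing import Iterator, List


class ItemPool:
    """Hypothetical stand-in for ItemsHandler: one endless, shared item cycle."""

    def __init__(self, items: List[str]) -> None:
        if not items:
            raise ValueError("item list must not be empty")
        self._items: Iterator[str] = cycle(items)

    def next_item(self) -> str:
        return next(self._items)


class Session:
    """Hypothetical stand-in for a monitoring session that rotates items."""

    def __init__(self, pool: ItemPool) -> None:
        self.pool = pool
        self.item = pool.next_item()

    def rotate(self) -> str:
        # Store the new item; calling next_item() alone would not change self.item.
        self.item = self.pool.next_item()
        return self.item


pool = ItemPool(["A", "B", "C"])
s1, s2 = Session(pool), Session(pool)
first_items = (s1.item, s2.item)
rotated = (s1.rotate(), s2.rotate())
```

Because both sessions draw from the same cycle, no two consecutive draws return the same item unless the list has a single entry.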