- About
- Getting Started
- Installing, Testing, Building
- Running locally
- Deployment
- Continuous Integration
- Built With
- License
- Acknowledgments
A bot that automatically sends emails in response to new ads posted under any xe.gr search url you choose.
After a few minutes of configuration, it can be deployed and will start sending your specified emails to every new ad that gets posted in the xe.gr search url you select.
With a little programming, you can also modify the XeGrAdSiteCrawler class and make it support other advertisement sites too. Feel free to fork.
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
You need a machine with Python >= 3.6 and a Bash-compatible shell (e.g. zsh) installed.
$ python3.6 -V
Python 3.6.9
$ echo $SHELL
/usr/bin/zsh
You will also need to set up the following:
- Gmail: An application-specific password for your Google account. Reference 1, Reference 2
- Dropbox: An API key for your Dropbox account. Reference 1, Reference 2
- MySQL: If you don't have a database, you can create a free one on Amazon RDS. Reference 1, Reference 2
To run main.py or the tests, you need to set the following environment variables on your system:
DROPBOX_API_KEY=<VALUE>
MYSQL_HOST=<VALUE>
MYSQL_USERNAME=<VALUE>
MYSQL_PASSWORD=<VALUE>
MYSQL_DB_NAME=<VALUE>
EMAIL_ADDRESS=<VALUE>
GMAIL_API_KEY=<VALUE>
CHECK_INTERVAL=<VALUE>
CRAWL_INTERVAL=<VALUE>
TEST_MODE=<VALUE>
LOOKUP_URL=<VALUE>
- LOOKUP_URL (str): The url that matches your desired search results. You can copy it straight from your browser.
- CHECK_INTERVAL (int): The seconds to wait before each check (for new ads).
- CRAWL_INTERVAL (int): The seconds to wait before each crawl (for the discovery of sublinks).
- TEST_MODE (bool): If enabled, every email will be sent to you instead of the discovered email addresses.
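The types listed above imply a conversion step, since environment variables always arrive as strings. A minimal sketch of that conversion follows; the function names and the accepted boolean spellings are assumptions made for illustration, not the project's actual code.

```python
import os


def parse_bool(value: str) -> bool:
    """Interpret common truthy spellings (assumed set, not the project's)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=os.environ) -> dict:
    """Read the four crawl-related variables and convert them to their types."""
    return {
        "lookup_url": env["LOOKUP_URL"],                # str, used as-is
        "check_interval": int(env["CHECK_INTERVAL"]),   # seconds between checks
        "crawl_interval": int(env["CRAWL_INTERVAL"]),   # seconds between crawls
        "test_mode": parse_bool(env.get("TEST_MODE", "false")),
    }
```

Passing a plain dict instead of os.environ makes the conversion easy to try out without touching the real environment.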
Before starting, you should modify the emails that are going to be sent, the stop-words, etc.
- stop_words.txt: A list of words that you don't want to be present in the ads that the bot sends emails to.
- application_to_send_subject.txt: The subject of the email that is going to be sent to new ads.
- application_to_send_body.html: The html body of the email that is going to be sent to new ads.
- inform_success_subject.txt: The subject of the email that is going to be sent to you when the bot successfully sends an email.
- inform_success_body.html: The html body of the email that is going to be sent to you when the bot successfully sends an email. Make sure to use the {link} and {email} vars in order to include them in the email.
- inform_should_call.txt: The subject of the email that is going to be sent to you when the bot couldn't find an email address in a new ad and manual action is required.
- inform_should_call_body.html: The html body of the email that is going to be sent to you when the bot couldn't find an email address in a new ad and manual action is required. Make sure to use the {link} var in order to include it in the email.
- Attachments: Add any attachments you want to be included in the Ad Email and define their names in xegr_jobs.yml
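To make the roles of stop_words.txt and the {link} and {email} vars concrete, here is a small sketch. Both helper names are hypothetical, the stop-word check is a plain case-insensitive substring match, and the use of str.format is an assumption; the bot's real logic may differ.

```python
def contains_stop_word(ad_text: str, stop_words: list) -> bool:
    """Return True if any non-empty stop word occurs in the ad text."""
    text = ad_text.lower()
    return any(word.lower() in text for word in stop_words if word.strip())


def render_body(template: str, **values) -> str:
    """Fill placeholders such as {link} and {email} in an email template."""
    return template.format(**values)


# Example: an ad that mentions "sales" would be skipped
skip = contains_stop_word("Senior Sales Manager", ["sales", "call center"])
body = render_body("<p>Applied to {link} via {email}</p>",
                   link="https://www.xe.gr/ad/1", email="hr@example.com")
```

If templates are rendered with str.format like this, any literal braces in the HTML (for example inline CSS) would have to be doubled.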
All the installation steps are handled by the Makefile.
If you don't want to go through the setup steps one by one, you can perform the whole installation and run the tests with a single command:
$ make install server=local
If you executed the previous command, you can skip to the Running locally section.
$ make help
-----------------------------------------------------------------------------------------------------------
DISPLAYING HELP
-----------------------------------------------------------------------------------------------------------
make delete_venv
Delete the current venv
make create_venv
Create a new venv for the specified python version
make requirements
Upgrade pip and install the requirements
make run_tests
Run all the tests from the specified folder
make setup
Call setup.py install
make clean_pyc
Clean all the pyc files
make clean_build
Clean all the build folders
make clean
Call delete_venv clean_pyc clean_build
make install
Call clean create_venv requirements run_tests setup
make help
Display this message
-----------------------------------------------------------------------------------------------------------
$ make clean server=local
make delete_venv
make[1]: Entering directory '/home/drkostas/Projects/AutoApplyBot'
Deleting venv..
rm -rf venv
make[1]: Leaving directory '/home/drkostas/Projects/AutoApplyBot'
make clean_pyc
make[1]: Entering directory '/home/drkostas/Projects/AutoApplyBot'
Cleaning pyc files..
find . -name '*.pyc' -delete
find . -name '*.pyo' -delete
find . -name '*~' -delete
make[1]: Leaving directory '/home/drkostas/Projects/AutoApplyBot'
make clean_build
make[1]: Entering directory '/home/drkostas/Projects/AutoApplyBot'
Cleaning build directories..
rm --force --recursive build/
rm --force --recursive dist/
rm --force --recursive *.egg-info
make[1]: Leaving directory '/home/drkostas/Projects/AutoApplyBot'
$ make create_venv server=local
Creating venv..
python3.6 -m venv ./venv
$ make requirements server=local
Upgrading pip..
venv/bin/pip install --upgrade pip wheel setuptools
Collecting pip
.................
The tests are located in the tests folder. To run all of them, execute the following command:
$ make run_tests server=local
source venv/bin/activate && \
.................
To build the project locally using the setup.py command, execute the following command:
$ make setup server=local
venv/bin/python setup.py install '--local'
running install
.................
To run the code now, you only need to adjust the yml file if necessary and then run either main.py or the created console script.
There is an already configured yml file, xegr_jobs.yml, with the following structure:
tag: production
lookup_url: !ENV ${LOOKUP_URL}
check_interval: !ENV ${CHECK_INTERVAL}
crawl_interval: !ENV ${CRAWL_INTERVAL}
test_mode: !ENV ${TEST_MODE}
cloudstore:
  - config:
      api_key: !ENV ${DROPBOX_API_KEY}
      local_files_folder: data
      attachments_names:
        - cv.pdf
        - cover_letter.pdf
      update_attachments: true
      update_stop_words: true
      update_application_to_send_email: true
      update_inform_success_email: true
      update_inform_should_call_email: true
    type: dropbox
datastore:
  - config:
      hostname: !ENV ${MYSQL_HOST}
      username: !ENV ${MYSQL_USERNAME}
      password: !ENV ${MYSQL_PASSWORD}
      db_name: !ENV ${MYSQL_DB_NAME}
      port: 3306
    type: mysql
email_app:
  - config:
      email_address: !ENV ${EMAIL_ADDRESS}
      api_key: !ENV ${GMAIL_API_KEY}
    type: gmail
The !ENV flag indicates that an environment variable follows.
You can change the values and environment variable names as you wish.
If a yaml variable name is changed, added or deleted, the corresponding changes should be reflected in the Configuration class and in yml_schema.json too.
You can also modify each class's default options.
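The substitution that an !ENV value implies can be illustrated without any YAML library. The sketch below only shows the placeholder-replacement step using the standard library; the function name is hypothetical, and how the Configuration class actually hooks this into YAML loading is not shown here.

```python
import os
import re

# Matches placeholders of the form ${VARIABLE_NAME}
_ENV_PATTERN = re.compile(r"\$\{(\w+)\}")


def resolve_env(value: str, env=os.environ) -> str:
    """Replace every ${NAME} in value with the matching environment variable."""
    def _replace(match):
        name = match.group(1)
        if name not in env:
            # Fail loudly instead of silently sending an empty credential
            raise KeyError(f"Environment variable {name} is not set")
        return env[name]

    return _ENV_PATTERN.sub(_replace, value)


# Example with an explicit mapping instead of the real environment
host = resolve_env("${MYSQL_HOST}", {"MYSQL_HOST": "db.example.com"})
```

Raising on a missing variable is a design choice of this sketch; it surfaces a forgotten export at startup rather than at the first database or email call.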
First, make sure you are in the created virtual environment:
$ source venv/bin/activate
(venv)
OneDrive/Projects/auto_apply_bot dev
$ which python
/home/drkostas/Projects/auto_apply_bot/venv/bin/python
(venv)
If this is the first time you are running the code, you may need to execute these two steps:
- To create the required table in the database, run:
$ python main.py -m create_table -c confs/conf.yml -l logs/output.log
- To upload the files that are going to be used to Dropbox (after modifying them appropriately), run:
$ python main.py -m upload_files -c confs/conf.yml -l logs/output.log
Now, to run the code you can either call main.py directly or use the auto_apply_bot console script.
$ python main.py --help
usage: main.py -m
{crawl_and_send,list_emails,remove_email,upload_files,create_table}
-c CONFIG_FILE [-l LOG] [--email-id EMAIL_ID] [-d] [-h]
A bot that automatically sends emails to new ads posted in the specified xe.gr
search page.
required arguments:
-m {crawl_and_send,list_emails,remove_email,upload_files,create_table}, --run-mode {crawl_and_send,list_emails,remove_email,upload_files,create_table}
-c CONFIG_FILE, --config-file CONFIG_FILE
The configuration yml file
-l LOG, --log LOG Name of the output log file
Optional Arguments:
--email-id EMAIL_ID The id of the email you want to be deleted
-d, --debug Enables the debug log messages
-h, --help Show this help message and exit
# Or
$ auto_apply_bot --help
usage: auto_apply_bot -m
{crawl_and_send,list_emails,remove_email,upload_files,create_table}
-c CONFIG_FILE [-l LOG] [--email-id EMAIL_ID] [-d] [-h]
A bot that automatically sends emails to new ads posted in the specified xe.gr
search page.
required arguments:
-m {crawl_and_send,list_emails,remove_email,upload_files,create_table}, --run-mode {crawl_and_send,list_emails,remove_email,upload_files,create_table}
-c CONFIG_FILE, --config-file CONFIG_FILE
The configuration yml file
-l LOG, --log LOG Name of the output log file
Optional Arguments:
--email-id EMAIL_ID The id of the email you want to be deleted
-d, --debug Enables the debug log messages
-h, --help Show this help message and exit
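For readers who want to script around the bot, the help output above maps onto an argparse definition roughly like the following. This is a reconstruction from the help text only; the actual parser in main.py may be organized differently.

```python
import argparse

RUN_MODES = ["crawl_and_send", "list_emails", "remove_email",
             "upload_files", "create_table"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the options shown in the help output."""
    parser = argparse.ArgumentParser(
        prog="auto_apply_bot",
        description="A bot that automatically sends emails to new ads "
                    "posted in the specified xe.gr search page.",
        add_help=False,  # -h is re-added below inside the optional group
    )
    required = parser.add_argument_group("required arguments")
    required.add_argument("-m", "--run-mode", choices=RUN_MODES, required=True)
    required.add_argument("-c", "--config-file", required=True,
                          help="The configuration yml file")
    required.add_argument("-l", "--log", help="Name of the output log file")

    optional = parser.add_argument_group("Optional Arguments")
    optional.add_argument("--email-id",
                          help="The id of the email you want to be deleted")
    optional.add_argument("-d", "--debug", action="store_true",
                          help="Enables the debug log messages")
    optional.add_argument("-h", "--help", action="help",
                          help="Show this help message and exit")
    return parser


args = build_parser().parse_args(["-m", "create_table", "-c", "confs/conf.yml"])
```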
If you notice that no ads are being discovered, fine-tune the crawl_interval and anchor_class_name values, which affect the XeGrAdSiteCrawler class.
- The crawl_interval defines the time between each crawl and should be increased if the bot is being flagged as a bot (well..). You can change this from the yaml file.
- The anchor_class_name is the CSS class value that characterizes all the search result anchors (<a .. class=...>). If you think it is wrong, you can change this from the yaml file too.
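To check whether a given anchor_class_name still matches the search results, it can help to see how such a class filter works. The sketch below uses only the standard library's html.parser; the class name "result-link" and the helper names are made up for illustration, and XeGrAdSiteCrawler may extract links in a different way.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect href values of <a> tags that carry a given CSS class."""

    def __init__(self, anchor_class_name: str):
        super().__init__()
        self.anchor_class_name = anchor_class_name
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # The class attribute may hold several space-separated names
        classes = (attributes.get("class") or "").split()
        href = attributes.get("href")
        if href and self.anchor_class_name in classes:
            self.links.append(href)


def find_ad_links(html: str, anchor_class_name: str) -> list:
    """Return the hrefs of all anchors whose class list contains the name."""
    collector = AnchorCollector(anchor_class_name)
    collector.feed(html)
    return collector.links


sample = ('<a class="result-link big" href="/ad/1">Ad</a>'
          '<a class="other" href="/about">About</a>')
links = find_ad_links(sample, "result-link")
```

If a saved copy of the search page yields an empty list with the configured class name, the site markup has probably changed and the value needs updating.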
The project is deployed to Heroku. For more information, you can check the setup guide.
Make sure you check the defined Procfile (reference) and that you set the environment variables listed above (reference).
Continuous integration runs on CircleCI. For more information, you can check the setup guide.
Again, you should set the environment variables listed above (reference); for any modifications, edit the CircleCI config.
- Dropbox Python API - Used for the Cloudstore Class
- Gmail Sender - Used for the EmailApp Class
- Heroku - The deployment environment
- CircleCI - Continuous Integration service
This project is licensed under the GNU License - see the LICENSE file for details.
- Thanks to PurpleBooth for the README template