A bot that automatically sends emails to new ads posted in any desired xe.gr search url.

drkostas/JobApplicationBot

Auto Apply Bot

About

A bot that automatically sends emails to new ads posted in any desired xe.gr search url.

After a few minutes of configuration, it can be deployed and will start sending your specified emails to every new ad posted in the xe.gr search url you select.

With a little programming, you can also modify the XeGrAdSiteCrawler class and make it support other advertisement sites too. Feel free to fork.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

You need a machine with Python 3.6 or newer and any Bash-based shell (e.g. zsh) installed.

$ python3.6 -V
Python 3.6.9

$ echo $SHELL
/usr/bin/zsh

You will also need to set up the following:

Set the required environment variables

To run main.py or the tests, you will need to set the following environment variables in your system:

DROPBOX_API_KEY=<VALUE>
MYSQL_HOST=<VALUE>
MYSQL_USERNAME=<VALUE>
MYSQL_PASSWORD=<VALUE>
MYSQL_DB_NAME=<VALUE>
EMAIL_ADDRESS=<VALUE>
GMAIL_API_KEY=<VALUE>
CHECK_INTERVAL=<VALUE>
CRAWL_INTERVAL=<VALUE>
TEST_MODE=<VALUE>
LOOKUP_URL=<VALUE>

  • LOOKUP_URL (str): The url that matches your desired search results. You can copy it straight from your browser.
  • CHECK_INTERVAL (int): The seconds to wait before each check (for new ads).
  • CRAWL_INTERVAL (int): The seconds to wait before each crawl (to discover sublinks).
  • TEST_MODE (bool): If enabled, every email will be sent to you instead of the discovered email addresses.
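
As an illustration of the types listed above, here is a minimal sketch of reading these four settings in Python. The function names `parse_bool` and `load_settings` and the accepted spellings of a true value are assumptions made for this example, not the project's actual code.

```python
import os


def parse_bool(raw: str) -> bool:
    """Treat a few common spellings as True; anything else is False (assumed convention)."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=os.environ) -> dict:
    """Read the documented settings and convert each one to its documented type."""
    return {
        "lookup_url": env["LOOKUP_URL"],               # str
        "check_interval": int(env["CHECK_INTERVAL"]),  # seconds between checks
        "crawl_interval": int(env["CRAWL_INTERVAL"]),  # seconds between crawls
        "test_mode": parse_bool(env.get("TEST_MODE", "false")),
    }
```

Passing a plain dictionary instead of `os.environ` makes it easy to try the conversion without touching your real environment.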

Modify the files in the data folder

Before starting, you should modify the emails that are going to be sent, the stop-words, etc.

  • stop_words.txt: A list of words that you don't want to be present in the ads that the bot sends emails to.
  • application_to_send_subject.txt: The subject of the email that is going to be sent to new ads.
  • application_to_send_body.html: The html body of the email that is going to be sent to new ads.
  • inform_success_subject.txt: The subject of the email that is going to be sent to you when the bot successfully sends an email.
  • inform_success_body.html: The html body of the email that is going to be sent to you when the bot successfully sends an email. Make sure to use the {link} and {email} vars in order to include them in the email.
  • inform_should_call.txt: The subject of the email that is going to be sent to you when the bot couldn't find an email address in a new ad and manual action is required.
  • inform_should_call_body.html: The html body of the email that is going to be sent to you when the bot couldn't find an email address in a new ad and manual action is required. Make sure to use the {link} var in order to include it in the email.
  • Attachments: Add any attachments you want to be included in the application email and define their names in xegr_jobs.yml
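
To make the role of the stop-words and of the {link} and {email} vars more concrete, here is a small sketch. It assumes case-insensitive substring matching for stop-words and Python's `str.format` for the placeholders; both are assumptions for illustration, and the helper names are hypothetical rather than taken from the bot's code.

```python
def contains_stop_word(ad_text: str, stop_words: list) -> bool:
    """Return True if any non-empty stop-word appears in the ad text (case-insensitive)."""
    text = ad_text.lower()
    words = [w.strip().lower() for w in stop_words if w.strip()]
    return any(word in text for word in words)


def render_inform_success(template: str, link: str, email: str) -> str:
    """Fill the {link} and {email} placeholders of an inform_success body."""
    return template.format(link=link, email=email)
```

If the real implementation also relies on `str.format`, any literal braces in the html (for example inline CSS) would have to be doubled as `{{` and `}}`.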

Installing, Testing, Building

All the installation steps are handled by the Makefile.

If you would rather not go through the setup steps one by one, you can complete the installation and run the tests with a single command:

$ make install server=local

If you executed the previous command, you can skip through to the Running locally section.

Check the available make commands

$ make help

-----------------------------------------------------------------------------------------------------------
                                              DISPLAYING HELP                                              
-----------------------------------------------------------------------------------------------------------
make delete_venv
       Delete the current venv
make create_venv
       Create a new venv for the specified python version
make requirements
       Upgrade pip and install the requirements
make run_tests
       Run all the tests from the specified folder
make setup
       Call setup.py install
make clean_pyc
       Clean all the pyc files
make clean_build
       Clean all the build folders
make clean
       Call delete_venv clean_pyc clean_build
make install
       Call clean create_venv requirements run_tests setup
make help
       Display this message
-----------------------------------------------------------------------------------------------------------

Clean any previous builds

$ make clean server=local
make delete_venv
make[1]: Entering directory '/home/drkostas/Projects/AutoApplyBot'
Deleting venv..
rm -rf venv
make[1]: Leaving directory '/home/drkostas/Projects/AutoApplyBot'
make clean_pyc
make[1]: Entering directory '/home/drkostas/Projects/AutoApplyBot'
Cleaning pyc files..
find . -name '*.pyc' -delete
find . -name '*.pyo' -delete
find . -name '*~' -delete
make[1]: Leaving directory '/home/drkostas/Projects/AutoApplyBot'
make clean_build
make[1]: Entering directory '/home/drkostas/Projects/AutoApplyBot'
Cleaning build directories..
rm --force --recursive build/
rm --force --recursive dist/
rm --force --recursive *.egg-info
make[1]: Leaving directory '/home/drkostas/Projects/AutoApplyBot'

Create a new venv and install the requirements

$ make create_venv server=local
Creating venv..
python3.6 -m venv ./venv

$ make requirements server=local
Upgrading pip..
venv/bin/pip install --upgrade pip wheel setuptools
Collecting pip
.................

Run the tests

The tests are located in the tests folder. To run all of them, execute the following command:

$ make run_tests server=local
source venv/bin/activate && \
.................

Build the project locally

To build the project locally using the setup.py command, execute the following command:

$ make setup server=local
venv/bin/python setup.py install '--local'
running install
.................

Running the code locally

To run the code, edit the yml file if needed and then run either main.py or the installed console script.

Modifying the Configuration

There is an already configured yml file under xegr_jobs.yml with the following structure:

tag: production
lookup_url: !ENV ${LOOKUP_URL}
check_interval: !ENV ${CHECK_INTERVAL}
crawl_interval: !ENV ${CRAWL_INTERVAL}
test_mode: !ENV ${TEST_MODE}
cloudstore:
  - config:
      api_key: !ENV ${DROPBOX_API_KEY}
      local_files_folder: data
      attachments_names:
        - cv.pdf
        - cover_letter.pdf
      update_attachments: true
      update_stop_words: true
      update_application_to_send_email: true
      update_inform_success_email: true
      update_inform_should_call_email: true
    type: dropbox
datastore:
  - config:
      hostname: !ENV ${MYSQL_HOST}
      username: !ENV ${MYSQL_USERNAME}
      password: !ENV ${MYSQL_PASSWORD}
      db_name: !ENV ${MYSQL_DB_NAME}
      port: 3306
    type: mysql
email_app:
  - config:
      email_address: !ENV ${EMAIL_ADDRESS}
      api_key: !ENV ${GMAIL_API_KEY}
    type: gmail

The !ENV flag indicates that an environment variable follows. You can change the values and the environment variable names as you wish. If a yaml variable name is changed, added, or deleted, the corresponding changes should be reflected in the Configuration class and in yml_schema.json too.
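
The substitution behind an `!ENV ${NAME}` value can be sketched with the standard library alone. This is not the project's Configuration class or its YAML loader hook; `resolve_env` is a hypothetical helper that only shows how a `${NAME}` reference could be replaced by the matching environment variable.

```python
import os
import re

# Matches ${NAME} references such as ${MYSQL_HOST}
_ENV_PATTERN = re.compile(r"\$\{([^}]+)\}")


def resolve_env(value: str, env=os.environ) -> str:
    """Replace every ${NAME} in value with env[NAME]; fail loudly if NAME is unset."""
    def replace(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _ENV_PATTERN.sub(replace, value)
```

Raising on a missing variable, instead of silently leaving the placeholder in place, is a design choice of this sketch: it surfaces a forgotten export before the bot tries to connect anywhere.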

You can also modify each class's default options.

Execution Options

First, make sure you are in the created virtual environment:

$ source venv/bin/activate

$ which python
/home/drkostas/Projects/auto_apply_bot/venv/bin/python

If this is the first time you are running the code, you may need to execute the following two steps:

  • To create the required table in the Database run:

    $ python main.py -m create_table -c confs/conf.yml -l logs/output.log

  • To upload the files that are going to be used to Dropbox (after modifying them appropriately) run:

    $ python main.py -m upload_files -c confs/conf.yml -l logs/output.log

To run the code, call either main.py directly or the auto_apply_bot console script.

$ python main.py --help
usage: main.py -m
               {crawl_and_send,list_emails,remove_email,upload_files,create_table}
               -c CONFIG_FILE [-l LOG] [--email-id EMAIL_ID] [-d] [-h]

A bot that automatically sends emails to new ads posted in the specified xe.gr
search page.

required arguments:
  -m {crawl_and_send,list_emails,remove_email,upload_files,create_table}, --run-mode {crawl_and_send,list_emails,remove_email,upload_files,create_table}
  -c CONFIG_FILE, --config-file CONFIG_FILE
                        The configuration yml file
  -l LOG, --log LOG     Name of the output log file

Optional Arguments:
  --email-id EMAIL_ID   The id of the email you want to be deleted
  -d, --debug           Enables the debug log messages
  -h, --help            Show this help message and exit


# Or

$ auto_apply_bot --help
usage: auto_apply_bot -m
               {crawl_and_send,list_emails,remove_email,upload_files,create_table}
               -c CONFIG_FILE [-l LOG] [--email-id EMAIL_ID] [-d] [-h]

A bot that automatically sends emails to new ads posted in the specified xe.gr
search page.

required arguments:
  -m {crawl_and_send,list_emails,remove_email,upload_files,create_table}, --run-mode {crawl_and_send,list_emails,remove_email,upload_files,create_table}
  -c CONFIG_FILE, --config-file CONFIG_FILE
                        The configuration yml file
  -l LOG, --log LOG     Name of the output log file

Optional Arguments:
  --email-id EMAIL_ID   The id of the email you want to be deleted
  -d, --debug           Enables the debug log messages
  -h, --help            Show this help message and exit
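
For readers who want to script around the bot, the interface shown in the help output can be mirrored with argparse. This is a sketch reconstructed from the help text above, not the project's own parser; `build_parser` is a hypothetical name, and `-l` is treated as optional because the usage line shows it in brackets.

```python
import argparse

RUN_MODES = ["crawl_and_send", "list_emails", "remove_email",
             "upload_files", "create_table"]


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags listed in the help output above."""
    parser = argparse.ArgumentParser(
        prog="auto_apply_bot",
        description="A bot that automatically sends emails to new ads "
                    "posted in the specified xe.gr search page.",
    )
    parser.add_argument("-m", "--run-mode", choices=RUN_MODES, required=True)
    parser.add_argument("-c", "--config-file", required=True,
                        help="The configuration yml file")
    parser.add_argument("-l", "--log", help="Name of the output log file")
    parser.add_argument("--email-id",
                        help="The id of the email you want to be deleted")
    parser.add_argument("-d", "--debug", action="store_true",
                        help="Enables the debug log messages")
    return parser
```

Calling `build_parser().parse_args(["-m", "list_emails", "-c", "confs/conf.yml"])` returns a namespace whose `run_mode` is `"list_emails"`; judging by its help text, `--email-id` is presumably meant to accompany the `remove_email` mode.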

If no ads are being discovered, fine-tune the crawl_interval and anchor_class_name values, which affect the XeGrAdSiteCrawler class.

  • The crawl_interval defines the time between each crawl and should be increased if the bot is being flagged as a bot (well..). You can change it in the yaml file.

  • The anchor_class_name is the CSS class value shared by all the search result anchors (<a .. class=). If you think it is wrong, you can change it in the yaml file too.
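
To check whether a given anchor_class_name actually matches the search results, a quick standalone test can help. The sketch below uses only Python's built-in `html.parser` and is not the XeGrAdSiteCrawler implementation; `AnchorCollector`, `find_ad_links`, and the class name `result` in the usage note are hypothetical.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the href of every <a> tag that carries the given CSS class."""

    def __init__(self, anchor_class_name: str):
        super().__init__()
        self.anchor_class_name = anchor_class_name
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # An element may carry several space-separated classes
        classes = (attr_map.get("class") or "").split()
        href = attr_map.get("href")
        if href and self.anchor_class_name in classes:
            self.links.append(href)


def find_ad_links(html: str, anchor_class_name: str) -> list:
    """Return the hrefs of all anchors whose class list contains anchor_class_name."""
    collector = AnchorCollector(anchor_class_name)
    collector.feed(html)
    return collector.links
```

Save the search page from your browser, pass its html to `find_ad_links(html, "result")` with the class you suspect, and an empty list suggests that the class name is wrong or that the site's markup has changed.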

Deployment

The project is deployed to Heroku. For more information, check the setup guide.

Make sure you check the defined Procfile (reference) and set the environment variables listed above (reference).

Continuous Integration

CircleCI is used for continuous integration. For more information, check the setup guide.

Again, you should set the environment variables listed above (reference). For any modifications, edit the circleci config.

Built With

License

This project is licensed under the GNU License - see the LICENSE file for details.

Acknowledgments