A RESTful API for a housing website, built with a secured Elastic Stack and MongoDB.
- Housing data crawler
- RESTful API with MongoDB for querying and creating house data
- RESTful API with token authorization
- API docs built with Aglio
- Elasticsearch cluster storage
- House data visualized in Kibana
- Filebeat log dashboard for MongoDB
- APM dashboard for the API application
- Metricbeat dashboard for MongoDB
- python ^3.7.3
- docker ^18.09.2
- docker-compose ^1.17.1
- Tesseract OCR ^4.x
a. Install pipenv environment
make init
pipenv shell
b. Activate MongoDB
cd docker-manifests
docker-compose -f docker-mongo.yml up -d
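Once the container is up, a quick connectivity check can confirm MongoDB is reachable (a minimal sketch; the host, port, and lack of credentials are assumptions and should match your docker-mongo.yml):

```python
# Quick MongoDB connectivity check (hypothetical defaults; adjust to your compose file).
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017", serverSelectionTimeoutMS=3000)
print(client.admin.command("ping"))  # {'ok': 1.0} means the server is up
```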
c. Activate Elasticsearch, Kibana
- add .env to environment/
- set up the Elasticsearch account on first use
cd docker-manifests
docker-compose -f create-certs.yml run --rm create_certs
docker-compose -f docker-elastic.yml up -d
docker exec es01 /bin/bash -c "bin/elasticsearch-setup-passwords auto --batch --url https://es01:9200"
docker-compose -f docker-elastic.yml down
- set ELASTICSEARCH_PASSWORD in .env
- copy es01.crt & kib01.crt to your computer and set up the certificates
docker cp es01:/usr/share/elasticsearch/config/certificates/es01/es01.crt ./
docker cp es01:/usr/share/elasticsearch/config/certificates/kib01/kib01.crt ./
- restart Elasticsearch & Kibana
docker-compose -f docker-elastic.yml up -d
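After the restart, you can confirm that TLS and authentication work from Python (a sketch assuming elasticsearch-py 7.x; the localhost URL and the path to the copied es01.crt are assumptions):

```python
# Verify the secured Elasticsearch node (hypothetical paths and credentials).
import os
from elasticsearch import Elasticsearch

es = Elasticsearch(
    ["https://localhost:9200"],
    http_auth=("elastic", os.environ["ELASTICSEARCH_PASSWORD"]),
    ca_certs="./es01.crt",  # the certificate copied in the previous step
)
print(es.ping())  # True if TLS and auth are configured correctly
```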
d. Activate Filebeat, APM, Metricbeat
- start filebeat, apm, metricbeat
cd docker-manifests
docker-compose -f docker-beats.yml up -d
- setup filebeat
docker exec -it filebeat01 bash
filebeat setup
filebeat -e
- setup apm
docker exec -it apm01 bash
apm-server -e
- setup metricbeat
docker exec -it metricbeat01 bash
metricbeat setup
metricbeat -e
e. Execute Crawler
Parse house data for Taipei (city_id=1) and export it to CSV:
cd services
export PYTHONPATH=$PYTHONPATH:$PWD
python crawler/main.py \
--urls_file data/urls.csv \
--data_file data/temp_info.csv \
--city_id 1 \
--url_start 0 \
--url_end 250
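For reference, --url_start and --url_end slice the URL list so crawls can run in resumable batches. Below is a minimal sketch of how such a CLI might be wired; the flag names come from the command above, but the internals are assumptions, not the actual crawler code:

```python
# Hypothetical sketch of the crawler entry point's argument handling.
import argparse

parser = argparse.ArgumentParser(description="Crawl house listings")
parser.add_argument("--urls_file", required=True, help="CSV of listing URLs to crawl")
parser.add_argument("--data_file", required=True, help="CSV path for exported house data")
parser.add_argument("--city_id", type=int, default=1, help="city code, e.g. 1 for Taipei")
parser.add_argument("--url_start", type=int, default=0, help="first URL index (inclusive)")
parser.add_argument("--url_end", type=int, default=250, help="last URL index (exclusive)")
args = parser.parse_args()
# The crawler would then read the URL list, process indices url_start..url_end,
# and append the scraped rows to data_file.
```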
f. Import Data to Database
- Import house data to MongoDB (from csv)
python api/loader/csv_to_mongo.py --file data/temp_info.csv
- Import house data to Elasticsearch (from csv)
python api/loader/csv_to_es.py --file data/temp_info.csv
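The loaders stream rows from the CSV into the target store. A minimal sketch of the MongoDB path is shown below; the database name, collection name, and connection details are assumptions, not the actual loader code:

```python
# Hypothetical CSV-to-MongoDB loader (names are illustrative).
import csv
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
collection = client["house"]["houses"]  # assumed db/collection names

with open("data/temp_info.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))
if rows:
    collection.insert_many(rows)  # one document per CSV row
```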
g. Start RESTful API
- run locally
python api/app.py
- run with Docker
docker-compose -f docker-house.yml up -d
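Once the API is up, you can exercise the token-authorized endpoints. Here is a sketch with Python requests; the /houses route, the header scheme, and the token flow are assumptions, so check the generated API docs for the real contract:

```python
# Hypothetical client call against the running API.
import requests

token = "<your-api-token>"  # issued for the default account; see init_account.py
resp = requests.get(
    "http://localhost:5000/houses",                # assumed endpoint
    headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    params={"city_id": 1},
)
print(resp.status_code, resp.json())
```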
Build API Docker Image
docker build . -f Dockerfile.api -t house-api-img --no-cache
Start WSGI in Container
docker run --env-file .env --name house-api -d -p 5000:5000 house-api-img
Build Crawler Docker Image
docker build . -f Dockerfile.crawler -t house-crawler-img --no-cache
Start Crawler in Container
docker run --env-file .env --name house-crawler house-crawler-img
Install dev dependencies
pipenv install --dev
Insert Fake Data to MongoDB
cd services
export PYTHONPATH=$PYTHONPATH:$PWD
python api/loader/csv_to_mongo.py --file api/tests/api/fake_houses.csv
Create default API account
python api/loader/init_account.py
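For reference, an account-initialization script typically seeds a default user with a hashed password. Below is a minimal sketch; the collection, field names, and credentials are assumptions, not the actual script:

```python
# Hypothetical default-account seeding (illustrative only).
from pymongo import MongoClient
from werkzeug.security import generate_password_hash

client = MongoClient("mongodb://localhost:27017")
accounts = client["house"]["accounts"]  # assumed db/collection names
accounts.update_one(
    {"username": "admin"},
    {"$set": {"password": generate_password_hash("change-me")}},
    upsert=True,  # idempotent: safe to run more than once
)
```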
Execute Tests
make tests
Create API docs (using Aglio)
aglio --theme-variables streak -i docs/api.apib --theme-template triple -o docs/api.html
Run linter
make lint
Access the MongoDB shell
docker exec -it mongodb bash
mongo -u <user_name>
use <db>
- Check CHANGELOG.md
- Po-Chun, Lu