This package provides an API to perform simple operations on a dictionary of words related to anagrams. It also includes a module that manages the anagrams in such a way that it can be expanded upon and used in other projects.
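At its core, this kind of module can be built around the standard trick of keying words on their sorted letters: two words are anagrams exactly when those keys match. A minimal Python sketch of that idea (illustrative only; the names below are not anagramizer's actual interface):

from collections import defaultdict

def anagram_key(word):
    # Two words are anagrams iff their lowercased, sorted letters match.
    return "".join(sorted(word.lower()))

corpus = defaultdict(set)
for w in ["read", "dear", "dare"]:
    corpus[anagram_key(w)].add(w)

def anagrams_of(word):
    # All corpus words sharing the key, excluding the word itself.
    return sorted(corpus[anagram_key(word)] - {word})

print(anagrams_of("read"))  # ['dare', 'dear']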
- Flask - The web framework
- pytest - Testing tool
- pytest-cov - Code coverage tool
- Docker - Containerization
- gunicorn - WSGI server for production
- make - Build scripting
There are two methods to start the service:
Requirements:
- docker (only tested on docker-ce 18.06.1-ce)
The simplest method is to run make, which will build and start a container called anagramizer. It listens on 0.0.0.0:3000.
You can also run make build to only build the container, then start it with the configuration of your choice.
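For example, to publish the service on port 3000 (the image tag anagramizer is an assumption here; use whatever tag your build produces):
docker run -d -p 3000:3000 --name anagramizer anagramizer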
Requirements:
- python (only tested with python 3.6)
- virtualenv (recommended)
Setting up a virtualenv:
virtualenv --python=python3.6 .
source bin/activate
pip install -r requirements.txt
Then, for production, run it with gunicorn:
gunicorn --bind 0.0.0.0:3000 app
Add -D to enable daemon mode.
Run in debug mode: FLASK_ENV=development flask run
An endpoint to get a general summary of the entire corpus.
Example:
curl -i http://localhost:3000/
HTTP/1.1 200 OK
{
"stats": {
"latest": true,
"max": 4,
"mean": 4,
"median": 4,
"min": 4,
"num_words": 3,
"top_anagrams": []
},
"words": [
"dare",
"dear",
"read"
]
}
An endpoint to add a list of words to the corpus.
Body:
{
"words": [] # list of words to add to the corpus.
}
Example:
curl -i -X POST -d '{ "words": ["read", "dear", "dare"] }' http://localhost:3000/words.json
HTTP/1.1 201 Created
Delete all the words in the corpus.
Example:
curl -i -X DELETE http://localhost:3000/words.json
HTTP/1.1 204 No Content
Get all the anagrams in the dictionary for the specified word.
Example:
curl -i http://localhost:3000/anagrams/read.json?limit=1
- word (string): word to find anagrams for
- limit (int): limits the number of returned anagrams to this value
- include_proper (bool): whether to include proper nouns in returned anagrams
HTTP/1.1 200 OK
{
"anagrams": [
"dare"
]
}
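The same request can also be made from Python with the third-party requests library; a small sketch using the endpoint and parameters documented above:

import requests

# Fetch anagrams of "read", limiting the result to one word.
resp = requests.get("http://localhost:3000/anagrams/read.json", params={"limit": 1})
resp.raise_for_status()
print(resp.json()["anagrams"])  # e.g. ["dare"]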
Delete the word from the corpus.
Example:
curl -i -X DELETE http://localhost:3000/anagrams/read.json
- word (string): word to remove from the corpus
HTTP/1.1 204 No Content
Delete a word and all its anagrams from the corpus.
Example:
curl -i -X DELETE http://localhost:3000/anagrams/read/delall
- word (string): word to remove from the corpus, along with all of its anagrams
HTTP/1.1 204 No Content
Get all anagram groups with size greater than or equal to the given size.
Example:
curl -i http://localhost:3000/anagrams/more/3
- size (int): minimum size of anagram sets to return
HTTP/1.1 200 OK
{
"anagrams": [
[
"dare",
"dear",
"read"
]
]
}
Test whether a list of words are anagrams of each other.
Body:
{
"words": [] # list of words to check
}
Example:
curl -i -X POST -d '{ "words": ["read", "dear", "dare"] }' http://localhost:3000/words.json
HTTP/1.1 200 OK
{
"anagrams": true
}
An endpoint to get statistics about the corpus.
Example:
curl -i http://localhost:3000/anagrams/stats
HTTP/1.1 200 OK
{
"stats": {
"latest": true, # whether this stats are out of date with current corpus
"max": 4, # Max word length in corpus
"mean": 4, # Mean word length in corpus
"median": 4, # Median word length in corpus
"min": 4, # Min word Length in corpus
"num_words": 3 # Total number of words in corpus
}
}
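For reference, the reported figures are ordinary summary statistics over word lengths. A minimal sketch of how they can be derived (not necessarily how anagramizer computes them internally):

import statistics

words = ["dare", "dear", "read"]
lengths = [len(w) for w in words]
stats = {
    "min": min(lengths),
    "max": max(lengths),
    "mean": statistics.mean(lengths),
    "median": statistics.median(lengths),
    "num_words": len(words),
}
print(stats)  # {'min': 4, 'max': 4, 'mean': 4, 'median': 4, 'num_words': 3}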
Tests are run using pytest. For the most basic test run, execute pytest at the project root.
For a full coverage report, run:
pytest --cov-report term-missing --cov anagramizer --cov app
As I was developing this I had several thoughts which I will note here:
- Testing the API itself was slightly more difficult and felt like integration testing as opposed to unit testing, especially when trying to build up the initial state needed to exercise some of the endpoints under different conditions.
- Versioning the API was something I struggled with. I could not come up with a decent way that wouldn't cause large amounts of spaghetti code in the future.
- I debated using some sort of SQLite back end for a data store, since it is built into Python, but decided against it because it seemed like overkill. Instead, I simply store the corpus to a file that can be either uncompressed or compressed with gzip.
- I had to find a way to capture the service exiting in order to be sure the corpus is actually saved to file and preserved across restarts. Luckily, Python has a built-in atexit module that registers a function to be called when the program exits (see the sketch after this list). It currently does not catch the more abrupt signals.
- Performance did not seem to be poor when using the provided dictionary; however, more thorough testing of larger data sets would be needed to ensure scalability.
- I found it somewhat difficult to document the endpoints in a consistent way that people would understand.
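To illustrate the atexit approach mentioned above, here is a minimal sketch (the file name and save format are hypothetical, not necessarily what anagramizer uses):

import atexit
import gzip
import json

corpus = {"words": ["read", "dear", "dare"]}  # hypothetical in-memory corpus

def save_corpus():
    # Runs on normal interpreter exit, but not on abrupt
    # signals such as SIGKILL.
    with gzip.open("corpus.json.gz", "wt") as f:
        json.dump(corpus, f)

atexit.register(save_corpus)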
- Versioning for the API
- More parameterization in the API to allow more configuration
- Better testing of the API
- Testing of more versions of python
- Package anagramizer so it can be pip installed
- More robust build scripts
- Integrate testing and coverage output into CI/CD systems
- More robust back-end data store to handle concurrent requests, maybe NoSQL due to the simple data structure
- More thorough error handling for requests that do not follow the specified API docs