pip install datto
datto is a Python package of data tools for data analysis and data science work.
You can find the documentation here.
Some examples of what you can do:
- Remove links from some text
- Extract body of an email only (no greeting or signature)
- Easily load/save data from S3
- Run SQL from Python
- Explore data - check for mistyped data, find correlated data
- Assign a given user to an experimental condition
- Create an HTML dropdown from a DataFrame
- Find the most common phrases by a category
- Classify free text responses into any number of meaningful groups (e.g. find survey themes)
- Make a simple Python logger with default options
- Take some data and test a bunch of machine learning models on it
For detailed examples of how you can use it, check out this Jupyter notebook.
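To give a rough idea of what one of the tasks listed above involves, here is a minimal standard-library sketch of assigning a user to an experimental condition. It does not use datto's API: the function name `assign_condition`, its parameters, and the `salt` value are hypothetical and chosen only for illustration. Refer to the documentation for datto's actual interface.

```python
import hashlib


def assign_condition(user_id, conditions, salt="experiment-1"):
    """Deterministically map a user ID to one of the given conditions.

    Hypothetical helper for illustration only; not part of datto.
    """
    # Hash the salted ID so the same user always gets the same condition.
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    # Turn the hex digest into an index into the list of conditions.
    bucket = int(digest, 16) % len(conditions)
    return conditions[bucket]


conditions = ["control", "treatment"]
print(assign_condition("user-42", conditions))
```

Hashing the ID rather than drawing a random number means the assignment is repeatable without storing it, and changing the salt reshuffles users for a new experiment.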
Create a virtualenv (specify the version of Python you want):
pyenv virtualenv 3.6 datto
Activate virtualenv:
pyenv activate datto
Install dependencies (specified in pyproject.toml file) in virtualenv:
poetry install
To add a new dependency with Poetry, run:
poetry add PACKAGE_NAME
Run tests:
Run the following to make sure all tests pass:
make test
Submitting a change:
Create a PR with your desired change(s), and request review from the code owner!