Installation

pip install datto

Overview

datto is a Python package of data tools for data analysis and data science work.

You can find the documentation here.

Some examples of what you can do:

  • Remove links from some text
  • Extract body of an email only (no greeting or signature)
  • Easily load/save data from S3
  • Run SQL from Python
  • Explore data - check for mistyped data, find correlated data
  • Assign a given user to an experimental condition
  • Create an HTML dropdown from a DataFrame
  • Find the most common phrases by a category
  • Classify free text responses into any number of meaningful groups (e.g. find survey themes)
  • Make a simple Python logger with default options
  • Take some data and test a bunch of machine learning models on it

For detailed examples of how you can use it, check out this Jupyter notebook.
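As a rough illustration of two tasks from the list above (removing links from text and assigning a user to an experimental condition), here is a minimal standard-library sketch. It does not use datto's API: `remove_links`, `assign_condition`, the `salt` parameter, and the URL pattern are all hypothetical names chosen for this example, so check the documentation for the functions datto actually provides.

```python
import hashlib
import re
from typing import Sequence

# Hypothetical helpers for illustration only; these are NOT datto functions.
# Assumption: a "link" is anything starting with http://, https://, or www.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")


def remove_links(text: str) -> str:
    """Strip URLs from text and collapse the leftover whitespace."""
    without_links = URL_PATTERN.sub("", text)
    return " ".join(without_links.split())


def assign_condition(
    user_id: str, conditions: Sequence[str], salt: str = "experiment-1"
) -> str:
    """Deterministically map a user to one condition by hashing the salted id."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return conditions[int(digest, 16) % len(conditions)]


print(remove_links("Read the docs at https://example.com/docs before you start"))
print(assign_condition("user_42", ["control", "treatment"]))
```

Hashing the user id rather than drawing a random number means the same user always lands in the same condition, and changing the salt reshuffles assignments for a new experiment.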

Contributing

Create virtualenv (specify version of Python you want):

pyenv virtualenv 3.6 datto

Activate virtualenv:

pyenv activate datto

Install dependencies (specified in pyproject.toml file) in virtualenv:

poetry install

To add a new dependency with Poetry, run:

poetry add PACKAGE_NAME

Run tests:

Run the following to make sure all tests pass:

make test

Submitting a change:

Create a PR with your desired change(s), and request review from the code owner!