Differential Language Analysis ToolKit

DLATK is an end-to-end human text analysis package, specifically suited for social media and social scientific applications. It is written in Python 3 and developed by the World Well-Being Project at the University of Pennsylvania and Stony Brook University.

It contains:

feature extraction
part-of-speech tagging
correlation
prediction and classification
mediation
dimensionality reduction and clustering
wordcloud visualization
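
To make "feature extraction" and "correlation" concrete, here is a minimal sketch in plain Python of the general idea: turn each text into relative word frequencies, then correlate each word's frequency with a numeric outcome. This is not DLATK's API. The function names (relative_frequencies, pearson, correlate_words) are hypothetical and chosen only for illustration, and whitespace tokenization is a simplifying assumption.

```python
import math
from collections import Counter


def relative_frequencies(text):
    """Return each lowercase whitespace token's share of the text (hypothetical helper)."""
    tokens = text.lower().split()
    total = len(tokens)
    if total == 0:
        return {}
    return {tok: n / total for tok, n in Counter(tokens).items()}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlate_words(texts, outcomes):
    """Correlate every word's relative frequency with the outcome, one value per text."""
    feats = [relative_frequencies(t) for t in texts]
    vocab = set().union(*feats)  # union of all words seen in any text
    return {w: pearson([f.get(w, 0.0) for f in feats], outcomes) for w in vocab}
```

Calling correlate_words(["happy happy day", "sad day", "happy sad"], [3, 1, 2]) returns a dictionary with one correlation per word; in that toy data "happy" comes out positive and "sad" negative. A real analysis would also need significance testing and multiple-comparison correction, which this sketch omits.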

DLATK can utilize:

Mallet for creating LDA topics
Stanford Parser
CMU's TweetNLP
pandas DataFrame output

Installation

DLATK can be installed in any of four ways: conda, pip, GitHub, or Docker.

New to installing Python packages?

We recommend following the full installation instructions.

1. conda

conda install -c wwbp dlatk

2. pip

pip install dlatk

3. GitHub

git clone https://github.com/dlatk/dlatk.git
cd dlatk
python setup.py install

4. Docker

See the detailed Docker installation instructions in the documentation.

docker run --name mysql_v5 --env MYSQL_ROOT_PASSWORD=my-secret-pw --detach mysql:5.5
docker run -it --rm --name dlatk_docker --link mysql_v5:mysql dlatk/dlatk bash
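
After a conda, pip, or GitHub install, one quick way to confirm that the package is visible to your Python environment is a check like the sketch below. It uses only the standard library (Python 3.8+). It assumes the import name and the distribution name are both dlatk, which matches the install commands above but is not otherwise verified here; check_install is a hypothetical helper, not part of DLATK.

```python
import importlib.util
from importlib import metadata


def check_install(package="dlatk"):
    """Report whether `package` can be found and, if known, its installed version.

    Assumes the import name and the distribution name are the same string.
    """
    importable = importlib.util.find_spec(package) is not None
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Not installed as a distribution under this name (or a stdlib module).
        version = None
    return {"importable": importable, "version": version}


if __name__ == "__main__":
    print(check_install())
```

If "importable" is False, the install most likely went into a different environment than the one running this script.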

Dependencies

See the full installation instructions for recommended and optional dependencies.

Documentation

The documentation for the latest release is at dlatk.wwbp.org.

Citation

If you use DLATK in your work, please cite the following paper:

@InProceedings{DLATKemnlp2017,
  author =  "Schwartz, H. Andrew
    and Giorgi, Salvatore
    and Sap, Maarten
    and Crutchley, Patrick
    and Eichstaedt, Johannes
    and Ungar, Lyle",
  title =   "DLATK: Differential Language Analysis ToolKit",
  booktitle =   "Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
  year =  "2017",
  publisher =   "Association for Computational Linguistics",
  pages =   "55--60",
  location =  "Copenhagen, Denmark",
  url =   "http://aclweb.org/anthology/D17-2010"
}

License

Licensed under the GNU General Public License v3 (GPLv3).

Background

DLATK is developed by the World Well-Being Project, based at the University of Pennsylvania and Stony Brook University.
