All notable changes to the CD4Py tool will be documented in this file. The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- A parallel tokenizer for Python source code files.
- A library module for pre-processing tokenized files, computing TF-IDF vectors, finding k-nearest neighbors (k-NN), and identifying duplicate files.
- A command-line interface for detection of duplicate files in Python projects.
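
To illustrate the pipeline these entries describe (tokenized files, TF-IDF weighting, nearest-neighbor search, duplicate identification), here is a minimal standard-library sketch. It is not CD4Py's actual API: the function names `tfidf_vectors`, `cosine`, and `find_duplicates`, the smoothed IDF formula, the brute-force neighbor search, and the `threshold=0.95` default are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each tokenized document to a sparse {token: tf-idf weight} dict."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(token for doc in docs for token in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Smoothed IDF (an assumption) keeps weights positive
            # even for tokens that occur in every document.
            token: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + df[token])) + 1)
            for token, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(token, 0.0) for token, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_duplicates(tokenized_files, k=1, threshold=0.95):
    """Return sorted pairs of file names whose similarity meets the threshold.

    For each file, only its k most similar neighbors are considered
    (brute-force search; hypothetical helper, not CD4Py's implementation).
    """
    names = list(tokenized_files)
    vectors = tfidf_vectors([tokenized_files[name] for name in names])
    pairs = set()
    for i, name in enumerate(names):
        neighbors = sorted(
            ((cosine(vectors[i], vectors[j]), names[j])
             for j in range(len(names)) if j != i),
            reverse=True,
        )
        for similarity, other in neighbors[:k]:
            if similarity >= threshold:
                pairs.add(tuple(sorted((name, other))))
    return sorted(pairs)


# Hypothetical token lists standing in for the tokenizer's output.
files = {
    "a.py": ["def", "f", "(", ")", ":", "return", "1"],
    "b.py": ["def", "f", "(", ")", ":", "return", "1"],
    "c.py": ["import", "os", "print", "(", "os", ")"],
}
duplicates = find_duplicates(files)
```

The brute-force search compares every pair of files, which is quadratic in the number of files; a real tool working on large projects would likely use an approximate or indexed nearest-neighbor search instead.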