Commit

Merge pull request #5 from aviiciii/4-shift-txt-to-csv
shifted output to csv
aviiciii authored Jun 27, 2023
2 parents a6aba0b + a75bdee commit a2e65d9
Showing 13 changed files with 16,892 additions and 16,381 deletions.
7 changes: 7 additions & 0 deletions README.md
@@ -2,5 +2,12 @@

dataset used: https://www.kaggle.com/datasets/praveengovi/tamil-language-corpus-for-nlp

## process flow

1. pre_process.py
2. program.py
3. clean_output.py
4. top.py
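
The order above can be sketched as a small runner. This is only an illustration: the README lists the script names and their order, but how each script is invoked (arguments, working directory) is an assumption, and `build_commands` and `run_pipeline` are hypothetical helpers that are not part of the repository.

```python
import subprocess
import sys

# Script order taken from the process flow above.
PIPELINE = ["pre_process.py", "program.py", "clean_output.py", "top.py"]


def build_commands(scripts=PIPELINE, python=sys.executable):
    """Return one command (argument list) per script, in pipeline order."""
    return [[python, script] for script in scripts]


def run_pipeline(scripts=PIPELINE):
    """Run each script in order; stop at the first one that fails.

    Assumes the scripts take no arguments and are run from the
    repository root (not confirmed by the README).
    """
    for cmd in build_commands(scripts):
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` from the repository root would run the four steps back to back; `check=True` raises `subprocess.CalledProcessError` if a step exits with a non-zero status, so later steps never run on incomplete output.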

## changes
filtered out about 10,000 non-Tamil words and removed trailing punctuation from about 25,000 words
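
A minimal sketch of what this kind of cleanup could look like, not the repository's actual code. It assumes that a "Tamil word" means every character falls in the Unicode Tamil block (U+0B80 to U+0BFF) and that "trailing punctuation" means ASCII punctuation; `is_tamil`, `strip_trailing_punctuation`, and `clean_words` are hypothetical names.

```python
import re
import string

# Assumption: a word counts as Tamil only if every character is in
# the Unicode Tamil block (U+0B80 to U+0BFF).
TAMIL_ONLY = re.compile(r"[\u0B80-\u0BFF]+")


def strip_trailing_punctuation(word):
    """Remove ASCII punctuation from the end of a word."""
    return word.rstrip(string.punctuation)


def is_tamil(word):
    """True if the whole word consists of Tamil-block characters."""
    return TAMIL_ONLY.fullmatch(word) is not None


def clean_words(words):
    """Strip trailing punctuation, then keep only Tamil words."""
    cleaned = []
    for word in words:
        word = strip_trailing_punctuation(word.strip())
        if word and is_tamil(word):
            cleaned.append(word)
    return cleaned
```

For example, `clean_words(["தமிழ்.", "hello", "நன்றி!!"])` keeps the two Tamil words with their punctuation removed and drops `"hello"`. Stripping punctuation before the Tamil check matters here: a word such as `"தமிழ்."` would otherwise be rejected because of the trailing period.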
100 changes: 0 additions & 100 deletions output/top_100.txt

This file was deleted.
