The project is split into main (Sources root), resources (Resources root), test (Test root) and "report and paper", with Driver.java as the main class.
- DualPivotQuickSort.java, which uses pinyin4j for preprocessing and performs dual-pivot quicksort on the Chinese array
- MSDRadixSort.java, which uses the pinyin4j library to convert Chinese text to pinyin and then sorts the pinyin array while applying the same reordering to the original Chinese text array
- TimSort.java, which sorts the Chinese array after conversion to pinyin
- LSRadixSort.java, which sorts the pinyin array after conversion
- MSDRadixSortWithCutoff.java, an enhancement of MSDRadixSort that cuts off to insertion sort for small subarrays
- PureHuskySort.java, which sorts the pinyin array in dual-pivot quicksort fashion but compares and sorts using hashed long encodings of the strings
- Benchmark utilities (Benchmark, Benchmark_Timer, BenchmarkTarget, Timer) to measure the running times of the sort algorithms (Credits: Info6205 assignments repo)
- SortUtils.java, which houses functions for comparison and swapping, stubs for unit tests, a file reader and a Chinese-to-pinyin converter
- Logger utilities to log and format the text displayed for benchmarks
- The Chinese text files containing the data sets to sort (courtesy of Prof. Robin Hillyard), reproduced for the benchmark sizes of 250k, 500k, 1M, 2M and 4M
- log4j.properties file for the configuration of the logger
- chineseExample.txt for unit tests
- sortedArraySamples, which contains a 1500-word portion of shuffledChinese.txt in sorted order, as produced by our implementations of the MSDRadix, LSDRadix, DPQuick, Tim, PureHusky and MSDRadix-with-cutoff sort algorithms (with the respective suffixes added)
- Tests for all the sort implementations (including a partition test for DualPivotQuickSort)
- Tests for the benchmark utility (Credits: Info6205 assignments repo)
- Final_project_Report.pdf which contains the report of our findings in this project
- PSA_Final_Paper_LiteratureSurvey.pdf, which contains the literature survey of the three papers read by the team, along with the work we have done in relation to those papers
- pinyin4j
- ini4j
- log4j
- junit
- Prof. Robin Hillyard and contributors of the Info6205 assignments & huskysort repositories (https://github.com/rchillyard/INFO6205 ; https://github.com/rchillyard/The-repository-formerly-known-as)
- pinyin4j (http://pinyin4j.sourceforge.net/)
- log4j (https://logging.apache.org/log4j/2.x/)
- ini4j (http://ini4j.sourceforge.net/)
- The Stack Overflow and GeeksforGeeks communities
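
To illustrate the idea behind MSDRadixSort.java and MSDRadixSortWithCutoff.java as described above (sort the pinyin array and reorder the original Chinese array along with it, cutting off to insertion sort for small subarrays), here is a minimal sketch. It is not the repository's code: the class name `PairedMsdSketch`, the `CUTOFF` value and the method names are hypothetical, the pinyin keys are assumed to be ASCII strings with tone digits, and the pinyin4j conversion step is left out so that the example needs only the standard library.

```java
import java.util.Arrays;

// Hypothetical sketch; names and constants are illustrative, not taken from the repository.
class PairedMsdSketch {
    private static final int RADIX = 256; // assumption: keys contain only ASCII characters
    private static final int CUTOFF = 4;  // assumption: small subarrays use insertion sort

    // Sorts keys (pinyin) in place and applies the same moves to values (Chinese text).
    static void sort(String[] keys, String[] values) {
        int n = keys.length;
        sort(keys, values, new String[n], new String[n], 0, n - 1, 0);
    }

    // Character at position d, or -1 when the key has no character there.
    private static int charAt(String s, int d) {
        return d < s.length() ? s.charAt(d) : -1;
    }

    private static void sort(String[] keys, String[] values,
                             String[] auxK, String[] auxV, int lo, int hi, int d) {
        if (hi <= lo + CUTOFF) {               // small (or empty) range: insertion sort
            insertion(keys, values, lo, hi, d);
            return;
        }
        int[] count = new int[RADIX + 2];
        for (int i = lo; i <= hi; i++) count[charAt(keys[i], d) + 2]++;    // frequencies
        for (int r = 0; r < RADIX + 1; r++) count[r + 1] += count[r];      // start indices
        for (int i = lo; i <= hi; i++) {                                   // stable distribution
            int pos = count[charAt(keys[i], d) + 1]++;
            auxK[pos] = keys[i];
            auxV[pos] = values[i];
        }
        for (int i = lo; i <= hi; i++) {                                   // copy back
            keys[i] = auxK[i - lo];
            values[i] = auxV[i - lo];
        }
        for (int r = 0; r < RADIX; r++)        // recurse on each character bucket
            sort(keys, values, auxK, auxV, lo + count[r], lo + count[r + 1] - 1, d + 1);
    }

    // Insertion sort on keys[lo..hi], comparing from character d onward.
    private static void insertion(String[] keys, String[] values, int lo, int hi, int d) {
        for (int i = lo + 1; i <= hi; i++)
            for (int j = i; j > lo && less(keys[j], keys[j - 1], d); j--) {
                swap(keys, j, j - 1);
                swap(values, j, j - 1);
            }
    }

    private static boolean less(String a, String b, int d) {
        return a.substring(d).compareTo(b.substring(d)) < 0;
    }

    private static void swap(String[] a, int i, int j) {
        String t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        // Hand-written pinyin keys stand in for the output of a converter.
        String[] pinyin = {"liu2", "wang2", "li3", "zhang1", "chen2", "zhao4", "an1", "bai2"};
        String[] chinese = {"刘", "王", "李", "张", "陈", "赵", "安", "白"};
        sort(pinyin, chinese);
        System.out.println(Arrays.toString(pinyin));
        System.out.println(Arrays.toString(chinese));
    }
}
```

Both the distribution pass and the insertion sort are stable, so entries with identical pinyin keys keep their original relative order. Moving the Chinese strings in lock-step with their keys avoids a second lookup pass after sorting, at the cost of a second auxiliary array.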