Homework counting words in Shakespeare's plays using Spark

tsteffek/Spark-Scala-Example


Top 30 words in all of Shakespeare's work (case insensitive):

Split at "\W+" (runs of non-word characters: anything other than letters, digits, and underscore)

1: (the,27378)
2: (and,26084)
3: (i,22538)
4: (to,19771)
5: (of,17481)
6: (a,14725)
7: (you,13826)
8: (my,12489)
9: (that,11318)
10: (in,11112)
11: (is,9319)
12: (d,8960)
13: (not,8512)
14: (with,7791)
15: (me,7777)
16: (it,7725)
17: (s,7721)
18: (for,7655)
19: (be,6897)
20: (his,6859)
21: (he,6679)
22: (your,6657)
23: (this,6608)
24: (but,6277)
25: (have,5902)
26: (as,5749)
27: (thou,5549)
28: (him,5205)
29: (so,5058)
30: (will,5008)
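
The ranking above can be reproduced in spirit with a small word-count routine. The following is a minimal sketch in plain Scala collections rather than Spark, so that it runs without a cluster. The object `WordCountSketch`, the helper `topWords`, the sample sentence, the dropping of empty tokens, and the alphabetical tie-breaking are all assumptions made for illustration, not the repository's actual code.

```scala
object WordCountSketch {
  // Hypothetical helper: lower-case the text, split it with the given regex,
  // drop empty tokens, and return the n most frequent words with their counts.
  def topWords(text: String, splitRegex: String, n: Int): Seq[(String, Int)] =
    text.toLowerCase
      .split(splitRegex)
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (word, occurrences) => (word, occurrences.length) }
      .toSeq
      .sortBy { case (word, count) => (-count, word) } // highest count first, ties alphabetical
      .take(n)

  def main(args: Array[String]): Unit = {
    // Sample input chosen for illustration; it is not the data set used above.
    val sample = "To be, or not to be, that is the question"
    topWords(sample, "\\W+", 3).zipWithIndex.foreach {
      case (pair, i) => println(s"${i + 1}: $pair")
    }
    // Prints:
    // 1: (be,2)
    // 2: (to,2)
    // 3: (is,1)
  }
}
```

In a Spark job the same shape would typically be expressed with `flatMap`, `map` and `reduceByKey` on an RDD; the collections version is used here only to keep the example self-contained.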

Split at "[^\w']+" (runs of non-word characters, except the apostrophe ')

1: (the,27336)
2: (and,26071)
3: (i,20296)
4: (to,19637)
5: (of,17473)
6: (a,14545)
7: (you,13611)
8: (my,12473)
9: (in,10989)
10: (that,10904)
11: (is,9131)
12: (not,8505)
13: (with,7774)
14: (me,7771)
15: (it,7685)
16: (for,7565)
17: (be,6856)
18: (his,6853)
19: (your,6648)
20: (this,6587)
21: (but,6268)
22: (he,6252)
23: (have,5881)
24: (as,5736)
25: (thou,5479)
26: (him,5204)
27: (so,5045)
28: (will,4969)
29: (what,4456)
30: (thy,4032)

Split at "\s+" (runs of whitespace)

1: (the,27267)
2: (and,25340)
3: (i,19540)
4: (to,18656)
5: (of,17301)
6: (a,14365)
7: (my,12455)
8: (in,10660)
9: (you,10597)
10: (that,10256)
11: (is,8681)
12: (with,7706)
13: (not,7416)
14: (for,7297)
15: (his,6749)
16: (your,6644)
17: (be,6467)
18: (he,5884)
19: (but,5881)
20: (this,5859)
21: (it,5858)
22: (have,5675)
23: (as,5658)
24: (thou,5138)
25: (me,4851)
26: (will,4502)
27: (thy,4028)
28: (so,3870)
29: (what,3834)
30: (by,3614)

Split at " +" (runs of spaces only)

1: (the,26480)
2: (and,24866)
3: (i,19163)
4: (to,18353)
5: (of,16961)
6: (a,13878)
7: (my,12235)
8: (in,10429)
9: (you,10216)
10: (that,10051)
11: (is,8486)
12: (with,7546)
13: (not,7205)
14: (for,7161)
15: (his,6594)
16: (your,6473)
17: (be,6238)
18: (but,5773)
19: (he,5770)
20: (this,5741)
21: (it,5688)
22: (as,5552)
23: (have,5533)
24: (\n,5125)
25: (thou,5046)
26: (me,4527)
27: (will,4397)
28: (thy,3957)
29: (what,3786)
30: (so,3764)
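
To see why the four rankings differ, it can help to run the four patterns over one short string. The sketch below is an illustration only: the object `SplitComparison`, the helper `tokens`, and the sample text are hypothetical and not taken from the repository or the data set.

```scala
object SplitComparison {
  // The four split patterns from the headings above, as Scala string literals.
  val patterns: Seq[String] = Seq("\\W+", "[^\\w']+", "\\s+", " +")

  // Lower-case the input and split it with the given pattern.
  def tokens(text: String, pattern: String): List[String] =
    text.toLowerCase.split(pattern).toList

  def main(args: Array[String]): Unit = {
    // Hypothetical sample with apostrophes and a newline surrounded by spaces.
    val sample = "He's banish'd, \n and I weep."
    for (p <- patterns) {
      // Show newlines as a literal \n so they stay visible in the output.
      val shown = tokens(sample, p).map(_.replace("\n", "\\n"))
      val joined = shown.mkString("[", " | ", "]")
      println(s"$p -> $joined")
    }
    // Prints:
    // \W+ -> [he | s | banish | d | and | i | weep]
    // [^\w']+ -> [he's | banish'd | and | i | weep]
    // \s+ -> [he's | banish'd, | and | i | weep.]
    //  + -> [he's | banish'd, | \n | and | i | weep.]
  }
}
```

With "\W+" the apostrophe acts as a separator, which is a likely explanation for the `d` and `s` entries in the first list; they drop out of the top 30 once the apostrophe is kept. With " +" a newline that sits between spaces survives as a token of its own, which would be consistent with the `\n` entry in the last list, and punctuation stays attached to words under both whitespace-based patterns.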
