Popular N-Grams of Shakespeare’s Complete Works using Hash Tables and Hashing Algorithms


My teammates and I used hash tables and hash functions to parse Shakespeare’s complete works and extract the most popular n-grams. We defined an n-gram as any run of “n” consecutive words uninterrupted by punctuation. For example, “i pray you” is the most popular 3-gram, appearing 249 times across Shakespeare’s complete works.
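
The definition above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the function name `extract_ngrams` is hypothetical, and the rule that anything other than a letter, a straight apostrophe, or whitespace counts as punctuation is an assumption.

```python
import re


def extract_ngrams(text, n):
    """Return every run of n consecutive words not interrupted by punctuation."""
    ngrams = []
    # Assumption: anything other than a letter, apostrophe, or whitespace
    # counts as punctuation and ends the current run of words.
    for segment in re.split(r"[^a-z'\s]+", text.lower()):
        words = segment.split()
        # Slide a window of n words across the segment.
        for i in range(len(words) - n + 1):
            ngrams.append(" ".join(words[i:i + n]))
    return ngrams


print(extract_ngrams("I pray you, sir, I pray you go.", 3))
# ['i pray you', 'i pray you', 'pray you go']
```

Note that “you sir i” is never produced, because the commas end each run of words.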

Because each n-gram is an exact string, it makes a natural key for a hash map. All of the algorithms and data structures in this program are therefore built on hash tables. The following lists show Shakespeare’s “Top Ten” 1-grams, 2-grams, 3-grams, and 4-grams.

There are 26982 unique 1-grams.

The 10 most common ones are:
the  [26851]
and  [24077]
i  [20535]
to  [18561]
of  [16013]
you  [13856]
a  [13840]
my  [12282]
that  [10761]
in  [10537]

There are 273439 unique 2-grams.

The 10 most common ones are:
i am  [1858]
my lord  [1685]
i have  [1628]
i will  [1582]
in the  [1578]
to the  [1517]
of the  [1378]
it is  [1079]
to be  [968]
that i  [910]

There are 520199 unique 3-grams.

The 10 most common ones are:
i pray you  [249]
i will not  [214]
i know not  [162]
i do not  [160]
i am a  [141]
i am not  [139]
my good lord  [132]
and i will  [129]
i would not  [126]
this is the  [122]

There are 546088 unique 4-grams.

The 10 most common ones are:
with all my heart  [47]
i know not what  [39]
give me your hand  [34]
i do beseech you  [33]
give me thy hand  [31]
i do not know  [29]
i would not have  [26]
ay my good lord  [25]
what is the matter  [25]
give me leave to  [24]
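
Counts like the ones listed above can be produced by using each n-gram string as a hash table key and its number of occurrences as the value. The sketch below is a minimal separate-chaining hash table in Python, written for illustration only. The class name `NGramCounter`, the polynomial hash with multiplier 31, the fixed bucket count, and the alphabetical tie-breaking are all assumptions rather than details taken from the project.

```python
class NGramCounter:
    """A small separate-chaining hash table mapping n-gram strings to counts."""

    def __init__(self, num_buckets=1024):
        # Assumption: a fixed number of buckets, with no resizing.
        self.buckets = [[] for _ in range(num_buckets)]
        self.size = 0  # number of unique n-grams stored

    def _hash(self, key):
        # Polynomial hash over the characters (an illustrative choice).
        h = 0
        for ch in key:
            h = (h * 31 + ord(ch)) % len(self.buckets)
        return h

    def add(self, key):
        """Increment the count for key, inserting it if it is new."""
        bucket = self.buckets[self._hash(key)]
        for entry in bucket:
            if entry[0] == key:
                entry[1] += 1
                return
        bucket.append([key, 1])
        self.size += 1

    def top(self, k):
        """Return the k most common (n-gram, count) pairs."""
        entries = [entry for bucket in self.buckets for entry in bucket]
        # Sort by count, highest first; ties are broken alphabetically.
        entries.sort(key=lambda entry: (-entry[1], entry[0]))
        return [(key, count) for key, count in entries[:k]]


counter = NGramCounter()
for gram in ["i pray you", "i will not", "i pray you",
             "i know not", "i pray you", "i will not"]:
    counter.add(gram)

print(f"There are {counter.size} unique 3-grams.")
for gram, count in counter.top(2):
    print(f"{gram}  [{count}]")
```

In everyday Python the built-in `dict` (itself a hash table) or `collections.Counter` would do the same job. The hand-rolled version is shown only to make the hashing step visible.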

GitHub Repository


