WordCount Exercise
# We will perform a word count with Spark on the text of A Christmas Carol.
# If you are running the notebook on your local computer instead
# of EC2, you can download the text from https://www.gutenberg.org/files/46/46-h/46-h.htm
# Look at the first 10 lines
!head -10 christmas.txt
# Create a textfile RDD
christmas = sc.textFile("christmas.txt")
# Simple actions on the RDD
print(christmas.count())   # number of lines in the file
print(christmas.first())   # the first line
print(christmas.take(10))  # the first 10 lines, as a list
Question: Using Spark, how do I get the 20 most common lowercased words, excluding stopwords?
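One possible approach, offered as a minimal sketch rather than the exercise's official solution: lowercase and tokenize each line, drop stopwords, count with `reduceByKey`, and take the top 20 with `takeOrdered`. The names `STOPWORDS`, `tokenize`, and `top_words` are hypothetical helpers introduced here. The stopword set is a small hand-picked sample, so a fuller list (for example from NLTK) would be an assumption to swap in. The sketch also assumes the `christmas` RDD created earlier.

```python
import re

# Assumption: a small illustrative stopword list, not a complete one.
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "had",
    "he", "his", "i", "in", "is", "it", "not", "of", "on", "that", "the",
    "to", "was", "with", "you",
}

# Matches runs of letters, optionally followed by an apostrophe and more
# letters, so "don't" stays a single token.
WORD_RE = re.compile(r"[a-z]+(?:['’][a-z]+)?")

def tokenize(line):
    """Lowercase a line, split it into words, and drop stopwords."""
    return [w for w in WORD_RE.findall(line.lower()) if w not in STOPWORDS]

def top_words(lines, n=20):
    """Return the n most frequent (word, count) pairs from an RDD of lines."""
    counts = (lines.flatMap(tokenize)                 # one element per word
                   .map(lambda w: (w, 1))             # pair each word with 1
                   .reduceByKey(lambda a, b: a + b))  # sum the 1s per word
    # Negating the count makes takeOrdered return the largest counts first.
    return counts.takeOrdered(n, key=lambda kv: -kv[1])
```

In the notebook, calling `top_words(christmas)` should return a list of up to 20 `(word, count)` tuples. The exact words depend on the stopword list used.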