Sentiment Analysis

Mining Twitter for Sentiment Analysis Using RStudio

The race for the American League West is too close to call. The two teams to watch are the Rangers and the Oakland A’s, who meet one more time in the regular season. By the end of this month, we will know which team has won the division title and will go on to compete for the league title.

Hundreds of thousands of baseball fans have expressed their emotions using 140-character tweets. Wouldn’t it be nice to have a tool to capture and analyze these data in a meaningful way using a very simple algorithm?

Tutorials on how to use R to mine Twitter for sentiment analysis

Tutorial 1 - Building a corpus from Twitter data

This video shows you how to access the Twitter API from RStudio and how to create three data sets for analysis. The data sets contain tweets tagged with #Rangers, #Athletics (the official hashtag of the Oakland A’s), and #MLB (Major League Baseball).

Tutorial 2 - Scoring tweets

This tutorial video shows you how to define a scoring function and use it to score the tweets. It also shows you how to load the positive and negative word lists used in the scoring process.
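
As a rough illustration of the idea, the sketch below scores each tweet as the number of positive words minus the number of negative words. It is a minimal base R sketch, not the script used in the video: the function name score_tweets, the sample tweets, and the tiny word lists are all made up for illustration. With the Opinion Lexicon files in your working folder, the lists could instead be read with scan(), as noted in the comments.

```r
# Hypothetical helper (not taken from the tutorial script):
# score = number of positive words - number of negative words
score_tweets <- function(tweets, pos_words, neg_words) {
  vapply(tweets, function(tweet) {
    # Replace punctuation, control characters and digits with spaces
    clean <- gsub("[[:punct:]]|[[:cntrl:]]|[[:digit:]]", " ", tweet)
    clean <- tolower(clean)
    # Split into words on runs of whitespace
    words <- unlist(strsplit(clean, "\\s+"))
    sum(words %in% pos_words) - sum(words %in% neg_words)
  }, integer(1), USE.NAMES = FALSE)
}

# Toy word lists, for illustration only. The real lists could be loaded with
#   pos_words <- scan("positive-words.txt", what = "character", comment.char = ";")
#   neg_words <- scan("negative-words.txt", what = "character", comment.char = ";")
pos_words <- c("great", "win", "love")
neg_words <- c("terrible", "loss", "awful")

# Made-up example tweets
tweets <- c("What a great win for the #Rangers!",
            "Terrible loss, awful pitching tonight.",
            "Game starts at 7.")

scores <- score_tweets(tweets, pos_words, neg_words)
print(scores)
```

A score above zero suggests a mostly positive tweet, a score below zero a mostly negative one, and zero means no listed words were found or they cancelled out.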

Tutorial 3 - Visual representation of the data

This tutorial video shows you how to plot the sentiment scores.
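
One simple way to visualize the scores, sketched here with base R graphics, is a grouped bar chart of how many tweets received each score for each team. The data frame below is made up purely for illustration; real scores would come from the scoring step, and the plots in the video may look different.

```r
# Made-up scores for illustration only; real values would come from scoring the tweets
scores <- data.frame(
  team  = rep(c("Rangers", "Athletics"), each = 6),
  score = c(2, 1, 0, -1, 3, 0, 1, 0, -2, 2, 1, -1)
)

# Count tweets per team (rows) and per score value (columns)
counts <- table(scores$team, scores$score)

# Grouped bar chart: one bar per team at each score value
barplot(counts,
        beside = TRUE,
        legend.text = rownames(counts),
        xlab = "Sentiment score",
        ylab = "Number of tweets",
        main = "Distribution of sentiment scores by team")
```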

Tutorial 4 - Classifying tweets by emotion and polarity

Opinions expressed in social networks can be a valuable resource for mining user sentiment. With an automated knowledge discovery technique, online opinions can be categorized into emotions such as “joy”, “sadness”, “anger”, “surprise”, “fear”, and “disgust”. They can also be categorized by emotional polarity as “positive”, “neutral”, or “negative”. The technique has been used in a wide range of applications, such as tracking political opinions, classifying consumer product reviews, predicting stock market movements, and tracking trends on discussion boards.
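
To make the two kinds of labels concrete, here is a deliberately simple base R sketch. It assigns polarity from the sign of a sentiment score and assigns an emotion by looking words up in a small lexicon. This is a keyword-lookup stand-in, not necessarily the classifier used in the video; the function names and the toy lexicon are hypothetical.

```r
# Hypothetical toy lexicon mapping words to emotions (illustration only)
emotion_lexicon <- c(happy = "joy", thrilled = "joy", sad = "sadness",
                     furious = "anger", scared = "fear",
                     gross = "disgust", wow = "surprise")

# Polarity from the sign of a sentiment score
classify_polarity_simple <- function(score) {
  ifelse(score > 0, "positive", ifelse(score < 0, "negative", "neutral"))
}

# Emotion = the emotion matched by the most lexicon words in the tweet,
# or NA when no word in the tweet appears in the lexicon
classify_emotion_simple <- function(tweet, lexicon) {
  clean <- tolower(gsub("[^[:alpha:]]+", " ", tweet))
  words <- unlist(strsplit(clean, "\\s+"))
  hits <- lexicon[names(lexicon) %in% words]
  if (length(hits) == 0) return(NA_character_)
  names(sort(table(hits), decreasing = TRUE))[1]
}

polarity <- classify_polarity_simple(c(2, 0, -3))
emotion  <- classify_emotion_simple("So happy and thrilled, wow!", emotion_lexicon)
print(polarity)
print(emotion)
```

A trained classifier would normally replace the lookup step, but the output has the same shape: one emotion label and one polarity label per tweet.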

References

Breen, J. O. (2012). Mining Twitter for airline consumer sentiment. Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications, 133.

Hu, M. & Liu, B. (2004). Mining and Summarizing Customer Reviews. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004), Aug 22-25, 2004, Seattle, Washington, USA.

  • Download the complete script file for the tutorials
  • Download R. For more information, please visit www.r-project.org/
  • Download RStudio
  • Download the Opinion Lexicon by Hu and Liu (2004): the positive word list and the negative word list. Please save these files in your working folder. For more information, please visit www.cs.uic.edu/~liub/FBS/sentiment-analysis.html#lexicon
  • Download cacert.pem. This file has to be in your working folder when you access the Twitter API from R

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Attachment            Size
positive-words.txt    20.15 KB
negative-words.txt    45.21 KB
cacert.pem            249.25 KB
Baseballtweets2.R     8.25 KB