FORMAL REFERENCE:
Mishne, Gilad. “Experiments with Mood Classification in Blog Posts.” 2005. Proceedings of ACM SIGIR 2005 Workshop on Stylistic Analysis of Text. Presented at Style2005 - 1st Workshop on Stylistic Analysis of Text for Information. CiteSeerX (accessed February 21, 2011).
RELEVANT SECTIONS: All
SUMMARY:
The author performs an interesting analysis of blog posts, hoping to find systematic indicators of mood as self-reported by “mood” tags associated with the posts.
Posts tagged with the 40 most frequently used moods were classified using word frequency counts, post length, punctuation, and several linguistic algorithms. A program then tried to determine the mood of each post from these features, and overall the computer guessed the mood correctly around 55% of the time. Humans were then asked to perform the same task, and their accuracy was 63%: higher than the algorithm, though reasonably close to it.
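To make these kinds of features concrete, here is a minimal sketch in Python of how a single post might be reduced to word counts, length, and punctuation counts. This is my own illustration, not the paper's code: the function name extract_features and the feature names are hypothetical, and the paper's actual feature set and classifier are not reproduced here.

```python
import re
from collections import Counter


def extract_features(post: str) -> dict:
    """Reduce one blog post to a dictionary of simple surface features.

    Hypothetical helper for illustration only; not the paper's method.
    """
    # Lowercase the text and pull out runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", post.lower())

    features = {
        "length_in_words": len(words),
        "exclamation_marks": post.count("!"),
        "question_marks": post.count("?"),
    }

    # Word frequency counts: one feature per distinct word in the post.
    for word, count in Counter(words).items():
        features["word:" + word] = count

    return features


features = extract_features("I passed my exam! I am so happy!")
print(features)
```

A dictionary like this, built for every post and paired with the post's self-reported mood tag, is the sort of input a classifier could then be trained on.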
ASSESSMENT:
The author has been frequently cited in other publications. Both the conference at which this paper was delivered and the organization behind it are professional and scholarly, and the analysis methods were extremely thorough. The sample size for the computer analysis was over 600,000 blog posts, though the human assessment may have been done by only one person; it’s hard to tell. Still, the paper is useful and I do not question its soundness.
The study is imperfect in that many people choose their mood tags on LiveJournal arbitrarily, so the tags are not always a reliable record of the writer's actual mood.
REFLECTION:
We can only program computers to look for certain traits and patterns, and this study suggests that certain structures and linguistic patterns can be used to determine mood with an accuracy approaching a human’s interpretation.
Because punctuation was a large part of the algorithm, the results suggest that patterns of punctuation affect how mood is interpreted. Since exclamation marks are generally considered the most emotional punctuation, the argument is even stronger for them.
KEYWORDS AND LABELS:
general punctuation, mood, research