Hi Friends,
You have probably read plenty of movie reviews written by critics all over the internet. No matter how sincere a critic is, a human reviewer always tends to bring some bias into the review. It is very rare to get the 'actual' public opinion from a single reviewer!
With that in mind, I made a small practical attempt to capture the 'public pulse': what people are buzzing about on Twitter.
As we all know, Dhoom 3 is set to release in a couple of weeks, so I pulled tweets tagged #Dhoom3 from the last 20 days (using systematic sampling) and performed some text mining on them.
Here are the findings:
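For readers curious about the sampling step, here is a minimal Python sketch of systematic sampling, meaning you keep every k-th item from an ordered list. The original collection was done with Java and twitter4j, so this is only an illustration of the idea; the function name, the step size and the placeholder tweets are all assumptions made up for the example.

```python
def systematic_sample(items, step, start=0):
    """Keep every `step`-th item, beginning at index `start`."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return items[start::step]


# Placeholder data (invented for illustration, not real tweets).
tweets = [f"tweet {i}" for i in range(10)]

# Keep every 3rd tweet, starting from the first one.
sample = systematic_sample(tweets, step=3)
```

In practice the step would be chosen from the total number of tweets and the sample size you want.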
Let's delve into the common buzzwords around #Dhoom3. Here's a word cloud of what the public is tweeting about the movie. (A bigger word means the word is used more frequently in the tweets.)
Next, I looked for associations between words. Here is a list of the words that most often appear alongside the frequent terms:
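A word cloud is driven by plain word counts, so the step behind it can be sketched roughly as below. This is a Python stand-in rather than the R code used for the post; the stop word list and the sample tweets are invented for illustration.

```python
import re
from collections import Counter

# A tiny stop word list, assumed here only for the example.
STOPWORDS = {"the", "is", "a", "of", "to", "in", "and", "for", "on"}


def word_frequencies(tweets):
    """Count how often each non-stop word appears across all tweets."""
    counts = Counter()
    for text in tweets:
        # Lowercase, then keep runs of letters/digits (drops '#' and punctuation).
        words = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


# Made-up tweets, not taken from the real data set.
sample_tweets = [
    "Malang song is out! #Dhoom3",
    "Loving the Malang promo #dhoom3",
    "Katrina sizzles in Kamli #Dhoom3",
]
freqs = word_frequencies(sample_tweets)
```

The resulting counts are what a word cloud tool would map to font sizes.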
Malang -> associated frequently with (dum, ishq, song, soexcited)
[Looks like fans are indeed excited about this song!!]
Aamir -> associated frequently with (khan, 5crore, underwent, medication, nobody)
[Ohh.. Looks like the song budget and his medication are trending with Aamir]
Katrina -> associated frequently with (kaif, kaif’s, acrobats, gain, loss, training)
[Nothing unusual :( ]
Kamli -> associated frequently with (katrinakaif, showbiz, sizzles, promo)
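The post does not say exactly how these associations were computed. One common approach, assumed here, is to mark for each tweet whether a word occurs in it and then measure the correlation between those occurrence patterns. The Python sketch below shows that idea on invented data; the function names and the 0.5 threshold are illustrative choices, not the settings used for the results above.

```python
import math


def term_vector(term, docs):
    """1 if the term occurs in a tweet, else 0, for every tweet."""
    return [1 if term in doc else 0 for doc in docs]


def correlation(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant vector has no defined correlation
    return cov / math.sqrt(vx * vy)


def associations(term, docs, min_corr=0.5):
    """Words whose occurrence correlates with `term` at or above `min_corr`."""
    vocab = set().union(*docs) - {term}
    target = term_vector(term, docs)
    scores = {w: correlation(target, term_vector(w, docs)) for w in vocab}
    return {w: s for w, s in scores.items() if s >= min_corr}


# Each tweet reduced to its set of words (made-up data).
docs = [
    {"malang", "song", "dum"},
    {"malang", "song"},
    {"katrina", "kamli"},
    {"aamir", "khan"},
]
assoc = associations("malang", docs)
```

Here "song" and "dum" come out as associated with "malang", because they only show up in tweets that also mention it.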
Finally, an attempt to group related words into distinct buckets produced the following clusters of words that people commonly tweeted together!!
[Top 8]
cluster 1: kaif katrina dhoom3 dhoom kamli
cluster 2: aamirkhan dhoom3 dhoom movie days
cluster 3: song dhoom3 malang promo dhoom
cluster 4: aamir dhoom3 dhoom khan indeals
cluster 5: dhoom3 hot looks getting better
cluster 6: 20th big cant december dhoom3
cluster 7: malang dhoom3 dum ishq song
cluster 8: dhoom3 malang back dhoom bollywood
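The clustering method is not named in this post. As a rough illustration of how word buckets like these can be produced, here is a small k-means sketch in Python: each tweet becomes a 0/1 vector over the vocabulary, tweets are grouped around cluster centers, and the highest-weighted words of each center are reported. k-means itself, the deterministic starting centers and the toy data are all assumptions made for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain k-means; the first k vectors serve as starting centers."""
    centers = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest center.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centers[c]))
            for v in vectors
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def top_terms(center, vocab, n):
    """The n vocabulary words with the largest weight in a center."""
    order = sorted(range(len(vocab)), key=lambda i: -center[i])
    return [vocab[i] for i in order[:n]]


# Toy vocabulary and tweets (invented for illustration).
vocab = ["katrina", "kamli", "malang", "song"]
tweet_words = [
    {"katrina", "kamli"},
    {"malang", "song"},
    {"katrina"},
    {"malang", "song"},
    {"kamli", "katrina"},
]
vectors = [[1 if w in t else 0 for w in vocab] for t in tweet_words]

labels, centers = kmeans(vectors, k=2)
clusters = [top_terms(c, vocab, 2) for c in centers]
```

With real tweets, the starting centers are usually picked at random and the number of clusters is tuned by trial.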
What next? Wait for the actual public sentiment once the movie is released!
[And for the geeks out there, the science behind the scenes: the R statistical language, Java and the twitter4j API, data mining, text mining, analytics]