Central Park is the happiest spot in New York City, according to research that classified more than six hundred thousand tweets to map people's mood by time and location.
Researchers at the New England Complex Systems Institute (NECSI) used Twitter data to generate a sentiment map of New York City that provides a time-sensitive and geographically specific analysis of public mood. The data revealed that public mood is generally highest in public parks and lowest at transportation hubs.
Over the course of two weeks in April 2012, the research team, supervised by Professor Yaneer Bar-Yam, Founding President of NECSI, collected 603,954 tweets via the Twitter API, restricted to those tagged with geocoordinates in the immediate New York metropolitan area.
Using tweets that contained emoticons, the researchers built two classifiers for positive and negative tweets. That is, the presence of an emoticon was used to label a tweet as positive or negative, and these labeled tweets were then used to train classifiers on the tweet text. The classifiers could then classify tweets that didn't have emoticons.
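The emoticon-as-label idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the emoticon sets below are hypothetical stand-ins, and a simple naive Bayes model is swapped in because the article does not say which classifier the researchers used.

```python
import math
import re
from collections import Counter

# Hypothetical emoticon sets, chosen for illustration only.
POSITIVE = {":)", ":-)", ":D"}
NEGATIVE = {":(", ":-("}


def emoticon_label(text):
    """Return 'pos' or 'neg' from emoticons, or None if absent or mixed."""
    has_pos = any(e in text for e in POSITIVE)
    has_neg = any(e in text for e in NEGATIVE)
    if has_pos == has_neg:
        return None
    return "pos" if has_pos else "neg"


def words(text):
    """Strip emoticons, then split into lowercase word tokens."""
    for e in POSITIVE | NEGATIVE:
        text = text.replace(e, " ")
    return re.findall(r"[a-z']+", text.lower())


def train(tweets):
    """Count words per class, using emoticons as the only labels."""
    counts = {"pos": Counter(), "neg": Counter()}
    docs = Counter()
    for tweet in tweets:
        label = emoticon_label(tweet)
        if label is None:
            continue  # unlabeled tweets are skipped during training
        docs[label] += 1
        counts[label].update(words(tweet))
    return counts, docs


def score(text, model):
    """Naive Bayes log-odds; above 0 leans positive, below 0 negative."""
    counts, docs = model
    vocab = set(counts["pos"]) | set(counts["neg"])
    total = {c: sum(counts[c].values()) for c in counts}
    result = math.log((docs["pos"] + 1) / (docs["neg"] + 1))
    for w in words(text):
        if w not in vocab:
            continue  # words never seen in training carry no evidence
        p = (counts["pos"][w] + 1) / (total["pos"] + len(vocab))
        n = (counts["neg"][w] + 1) / (total["neg"] + len(vocab))
        result += math.log(p / n)
    return result


model = train([
    "great day in the park :)",
    "love the sunshine :D",
    "stuck on the train again :(",
    "delayed and tired :-(",
])
```

Once trained, `score("love the park", model)` can be applied to a tweet with no emoticon at all, which is the point of the approach: the emoticons supply the training labels, while the words carry the signal at prediction time.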
Then, for each tweet in the full set, URLs and usernames were removed, the text was tokenized and assigned a sentiment score based on the classifiers. Combining the sentiment ratings with geotags resulted in a public sentiment map for the New York City metropolitan area in which cyan represents the most positive sentiment and magenta the most negative. White represents areas with insufficient tweet density for analysis.
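A rough sketch of the cleaning and mapping steps follows, again in Python. The regular expressions, the grid cell size `cell_deg`, and the `min_tweets` threshold are all assumptions made for illustration; the article does not give the values or the spatial method the researchers used.

```python
import math
import re
from collections import defaultdict

URL_RE = re.compile(r"https?://\S+")
USER_RE = re.compile(r"@\w+")


def clean_and_tokenize(text):
    """Remove URLs and @usernames, then split into lowercase tokens."""
    text = URL_RE.sub(" ", text)
    text = USER_RE.sub(" ", text)
    return re.findall(r"[a-z0-9']+", text.lower())


def grid_sentiment(scored, cell_deg=0.01, min_tweets=2):
    """Average sentiment per latitude/longitude grid cell.

    scored: iterable of (lat, lon, score) tuples.
    Cells with fewer than min_tweets tweets are left out, which mirrors
    the white "insufficient density" areas on the map.
    """
    cells = defaultdict(list)
    for lat, lon, s in scored:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append(s)
    return {k: sum(v) / len(v) for k, v in cells.items() if len(v) >= min_tweets}


tokens = clean_and_tokenize("@bob check http://t.co/abc Central Park!")
grid = grid_sentiment([
    (40.7812, -73.9665, 1.0),   # two tweets that fall in the same cell
    (40.7815, -73.9661, 0.5),
    (40.7505, -73.9934, -1.0),  # a lone tweet, dropped as too sparse
])
```

Each surviving cell's mean score could then be mapped onto a color scale such as the cyan-to-magenta one described above.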
Spatial analysis of the tweets shows that sentiment progressively improves with proximity to Times Square.
Periodic patterns of sentiment were also revealed, with fluctuations on both a daily and a weekly scale: more positive tweets are posted on weekends than on weekdays, with a daily peak in sentiment around midnight and a low point between 9:00 a.m. and noon.
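Daily and weekly patterns of this kind come from grouping scored tweets by time. A minimal Python sketch, assuming each tweet has already been given a sentiment score and a timestamp in local New York time (the function names and sample values are illustrative, not taken from the report):

```python
from collections import defaultdict
from datetime import datetime


def mean_by(scored, key):
    """Group (timestamp, score) pairs by key(timestamp) and average each group."""
    groups = defaultdict(list)
    for ts, s in scored:
        groups[key(ts)].append(s)
    return {k: sum(v) / len(v) for k, v in groups.items()}


def hourly(scored):
    """Mean sentiment for each hour of the day (0 to 23)."""
    return mean_by(scored, lambda ts: ts.hour)


def weekend_vs_weekday(scored):
    """Mean sentiment for weekends (Saturday, Sunday) versus weekdays."""
    return mean_by(scored, lambda ts: "weekend" if ts.weekday() >= 5 else "weekday")


# Made-up scores for illustration only.
sample = [
    (datetime(2012, 4, 14, 0, 15), 0.8),   # Saturday, just after midnight
    (datetime(2012, 4, 16, 10, 0), -0.4),  # Monday morning
    (datetime(2012, 4, 16, 23, 30), 0.2),  # Monday, late evening
]
by_hour = hourly(sample)
by_day_type = weekend_vs_weekday(sample)
```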
Thanks to geotagging, the researchers were able to locate specific areas of extreme sentiment - apart from parks and transportation hubs, they included cemeteries, medical centers, a jail, and a sewage facility. In the part of the map that shows Manhattan, Central Park (A1) and Highland Park (A9) stand out as positive; Penn Station (B4) and Brooklyn Bridge (B7) are negative, as is Rikers Island (D1), New York City's main jail complex. The report also notes:
"One area with markedly negative sentiment is Maspeth Creek in Brooklyn (E1). While its geographic features are unremarkable, this area is one of the most polluted urban water bodies in the country."
The report goes into graphic detail about this site, enough that you are likely to imagine the smell of sludge and untreated sewage.
The report concludes with a comment on the advantages of this data mining exercise:
"Our method of public mood analysis has several strengths. By utilizing Twitter's abundance of geotagged data, we can obtain spatial information that is both wide-ranging and fine-grained. The brevity of tweets allows for rapid processing and classification, while their frequency produces a time-sensitive picture of public sentiment."
This is a clever methodology and one that produces results that fit in with common sense - parks are positive places and sewage is sad.