Last active
August 29, 2015 14:14
Simple R wrapper function for the Sentiment140 API
# Wrapper function for the Sentiment140 API
# An API for a maximum entropy model trained on ~1.5M tweets
# The server will time out if the job takes > 60 seconds,
# so if the tweet count is relatively high, the function
# will split the data into chunks of 2500 (fairly arbitrary choice)
# http://help.sentiment140.com/api
Sentiment140 <- function(sentences){
  # Load required packages
  library(plyr)
  library(httr)
  library(jsonlite)

  # Empty data.frame for storing results
  results <- data.frame(text = character(),
                        polarity = numeric(),
                        stringsAsFactors = FALSE)

  # API url
  URL <- "http://www.sentiment140.com/api/bulkClassifyJson"
  chunk.size <- 2500

  if(length(sentences) > chunk.size){
    # used to prevent API timeout (60 seconds)
    # by splitting data into smaller chunks
    grps <- ceiling((1:length(sentences))/chunk.size)
  } else grps <- rep(1, length(sentences))

  # custom json conversion
  for(i in unique(grps)){
    chunk <- sentences[which(grps == i)]
    json.text <- paste('{"text": "', chunk, '"}', sep="")
    json.text <- paste(json.text, collapse=",")
    json <- paste('{"data": [', json.text, "]}", sep="")

    rsp <- POST(URL, body=json, encode="json")
    rsp.cont <- content(rsp)
    data <- rsp.cont$data
    df <- do.call("rbind", data)
    results <- rbind(results, df)
  }
  results
}
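The chunking step can be tried on its own, without calling the API. The following is a minimal sketch of the same `ceiling(index / chunk.size)` grouping used in the function; `chunk_indices` and the sample `sentences` vector are hypothetical names introduced only for illustration.

```r
# Hypothetical helper (not part of the gist): assign each element
# of a vector of length n to a chunk number, chunk.size per chunk.
chunk_indices <- function(n, chunk.size = 2500) {
  ceiling(seq_len(n) / chunk.size)
}

# Made-up input, just to show the grouping
sentences <- paste("tweet", 1:6000)
grps <- chunk_indices(length(sentences))

# split() returns one character vector per chunk, in the original order
chunks <- split(sentences, grps)
lengths(chunks)  # three chunks: 2500, 2500 and 1000 elements
```

Each element of `chunks` would then correspond to one request in the loop above.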
# Example usage
my.thoughts <- c("I love this job",
                 "This job could be better",
                 "I cannot stand that one tv show",
                 "This book I am reading is incredibly enlightening",
                 "This book I am reading is incredibly boring")

scores <- Sentiment140(my.thoughts)
scores

# Outputs a data.frame as such
#                                                text polarity meta
# 1                                   I love this job        4   en
# 2                          This job could be better        2   en
# 3                   I cannot stand that one tv show        2   en
# 4 This book I am reading is incredibly enlightening        2   en
# 5       This book I am reading is incredibly boring        0   en
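The Sentiment140 API reports polarity as 0 (negative), 2 (neutral), or 4 (positive). A small sketch of turning those codes into readable labels follows; `label_polarity` is a hypothetical helper, and the data frame is built by hand from three rows of the output above so the snippet runs without the API. It assumes `polarity` is a plain numeric column, which may require an `as.numeric(unlist(...))` step on the real result.

```r
# Hypothetical helper (not part of the gist): map Sentiment140
# polarity codes to labels. Codes outside 0/2/4 become NA.
label_polarity <- function(polarity) {
  factor(polarity,
         levels = c(0, 2, 4),
         labels = c("negative", "neutral", "positive"))
}

# Hand-built stand-in for part of the output shown above
scores <- data.frame(text = c("I love this job",
                              "This job could be better",
                              "This book I am reading is incredibly boring"),
                     polarity = c(4, 2, 0),
                     stringsAsFactors = FALSE)

scores$sentiment <- label_polarity(scores$polarity)
scores
```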
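Note: this unfortunately breaks fairly easily when the input contains characters that conflict with the JSON accepted by the API (for example double quotes or backslashes, since the text is pasted into the request body unescaped). I haven't figured this out completely.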
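One possible way around the JSON problem is to let `jsonlite` build the request body instead of pasting strings together, so that quotes, backslashes, and control characters are escaped. This is only a sketch and has not been verified against the live API; `build_body` is a hypothetical helper name.

```r
library(jsonlite)

# Hypothetical helper (not part of the gist): build the
# {"data": [{"text": ...}, ...]} body with proper JSON escaping.
build_body <- function(sentences) {
  payload <- list(data = data.frame(text = sentences,
                                    stringsAsFactors = FALSE))
  # toJSON writes a data.frame as an array of row objects
  as.character(toJSON(payload))
}

tricky <- c('She said "great job"', "back\\slash and\ttab")
body <- build_body(tricky)
cat(body, "\n")

# Round trip: parsing the body gives back the original strings
identical(fromJSON(body)$data$text, tricky)
```

The resulting string could be passed as `body` to `POST` in place of the hand-built `json` value. Note that `toJSON` omits `NA` entries in a data frame by default, so missing values should be filtered out beforehand.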