Sentiment Analysis in R

Folks,

In this post we will run a sentiment analysis of tweets about Trump & Clinton using R!

We will use Microsoft Cognitive Services (Text Analytics API) in R to calculate sentiment scores for the tweets!


Step 1) Twitter Data Extraction

Extract tweets mentioning Trump & Clinton using the twitteR package.

library(twitteR)
setup_twitter_oauth(Consumer_API_Key, Consumer_API_Secret, Access_Token, Access_Token_Secret)

# Search for up to 200 English tweets mentioning each candidate
clinton_tweets = searchTwitter("Hillary Clinton+@HillaryClinton", n=200, lang="en")
trump_tweets = searchTwitter("Donald Trump+@realDonaldTrump", n=200, lang="en")

# Convert each list of tweets to a data frame and keep only the tweet text
trump_tweets_df = do.call("rbind", lapply(trump_tweets, as.data.frame))
trump_tweets_df = subset(trump_tweets_df, select = c(text))

clinton_tweets_df = do.call("rbind", lapply(clinton_tweets, as.data.frame))
clinton_tweets_df = subset(clinton_tweets_df, select = c(text))

If you are new to the twitteR package, please visit this blog to learn how to set up a Twitter application & OAuth in R.

Step 2) Cleaning of Tweets

Clean both data frames: trump_tweets_df & clinton_tweets_df.

Below is some sample code for cleaning text in R.

# Remove retweet markers, mentions and links first, while "@" and "/" are still present.
clinton_tweets_df$text = gsub("(RT|via)((?:\\b\\W*@\\w+)+)", " ", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("@\\w+", " ", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("http\\S+", " ", clinton_tweets_df$text)
# Remove punctuation, control characters and digits. POSIX classes need double brackets:
# "[:blank:]" on its own matches the literal characters ":", "b", "l", "a", "n" and "k".
clinton_tweets_df$text = gsub("[[:punct:]]", "", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("[[:cntrl:]]", " ", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("[[:digit:]]", "", clinton_tweets_df$text)
# Collapse runs of whitespace into a single space and trim both ends.
clinton_tweets_df$text = gsub("[[:space:]]+", " ", clinton_tweets_df$text)
clinton_tweets_df$text = trimws(clinton_tweets_df$text)

# Removing duplicate tweets (repeat the same steps for trump_tweets_df)
clinton_tweets_df = clinton_tweets_df[!duplicated(clinton_tweets_df$text), , drop = FALSE]
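
Since both data frames need the same treatment, the cleaning steps can be wrapped in one function and applied to each text column. The sketch below is only an illustration: clean_tweet_text is a hypothetical helper name (it is not part of twitteR), and the sample tweet is made up.

```r
# Hypothetical helper: applies the cleaning steps to a character vector of tweets.
clean_tweet_text <- function(x) {
  # Strip retweet markers, mentions and links before punctuation is removed
  x <- gsub("(RT|via)((?:\\b\\W*@\\w+)+)", " ", x, perl = TRUE)
  x <- gsub("@\\w+", " ", x)
  x <- gsub("http\\S+", " ", x)
  # POSIX classes must sit inside a bracket expression: [[:punct:]], not [:punct:]
  x <- gsub("[[:punct:]]", "", x)
  x <- gsub("[[:cntrl:]]", " ", x)
  x <- gsub("[[:digit:]]", "", x)
  # Collapse repeated whitespace and trim both ends
  x <- gsub("[[:space:]]+", " ", x)
  trimws(x)
}

# Made-up example tweet
sample_tweet <- "RT @user: Check this https://t.co/abc123 now!! 100%"
cleaned <- clean_tweet_text(sample_tweet)
print(cleaned)
```

With a helper like this, each data frame could be cleaned with a single call, for example clinton_tweets_df$text = clean_tweet_text(clinton_tweets_df$text).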

Here is a snapshot of the cleaned data frames trump_tweets_df & clinton_tweets_df.

data.png

Step 3) Calculate Sentimental Scores

We will use Microsoft Cognitive Services (Text Analytics API) in R to calculate sentiment scores for the tweets. If you are new to Microsoft Cognitive Services, please visit this blog.

Calculating sentiment scores for trump_tweets_df:

library(jsonlite)
library(httr)

# Creating the request body for Text Analytics API
trump_tweets_df["language"] = "en"
trump_tweets_df["id"] = seq.int(nrow(trump_tweets_df))
request_body_trump = trump_tweets_df[c(2,3,1)]

# Converting tweets dataframe into JSON
request_body_json_trump = toJSON(list(documents = request_body_trump))

# Calling text analytics API
result_trump = POST("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment",
body = request_body_json_trump,
add_headers(.headers = c("Content-Type"="application/json","Ocp-Apim-Subscription-Key"="my_Subscription-Key")))

Output = content(result_trump)

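# Each returned document holds a score and an id, so reshape into two columns
# instead of hard-coding the number of tweets.
score_output_trump = data.frame(matrix(unlist(Output$documents), ncol=2, byrow=TRUE))
# X1 is the score (0 to 1); rescale it to 0-10
score_output_trump$X1 = as.numeric(as.character(score_output_trump$X1)) * 10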
score_output_trump["Candidate"] = "Trump"
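
Pulling the scores out by name rather than by position keeps the reshaping independent of how many tweets survived cleaning. The sketch below assumes the parsed response is a list with a documents element whose entries each carry score and id; parse_sentiment_response is a hypothetical helper, and the mock response uses made-up values rather than real API output.

```r
# Hypothetical helper: turns the parsed API response into a data frame.
parse_sentiment_response <- function(parsed, candidate) {
  docs <- parsed$documents
  data.frame(
    id = vapply(docs, function(d) as.character(d$id), character(1)),
    # The API score lies between 0 and 1; rescale it to 0-10
    score = vapply(docs, function(d) as.numeric(d$score), numeric(1)) * 10,
    Candidate = candidate,
    stringsAsFactors = FALSE
  )
}

# Mock of what content(result) might return (assumed shape, made-up values)
mock_output <- list(
  documents = list(
    list(score = 0.71, id = "1"),
    list(score = 0.12, id = "2")
  ),
  errors = list()
)

scores <- parse_sentiment_response(mock_output, "Trump")
print(scores)
```

Note that this sketch names its columns id and score, whereas the code in this post works with the default names X2 and X1.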

Calculating sentiment scores for clinton_tweets_df:


# Creating the request body for Text Analytics API
clinton_tweets_df["language"] = "en"
clinton_tweets_df["id"] = seq.int(nrow(clinton_tweets_df))
request_body_clinton = clinton_tweets_df[c(2,3,1)]

# Converting tweets dataframe into JSON
request_body_json_clinton = toJSON(list(documents = request_body_clinton))

result_clinton = POST("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment",
body = request_body_json_clinton,
add_headers(.headers = c("Content-Type"="application/json","Ocp-Apim-Subscription-Key"="my-Subscription-Key")))

Output_clinton = content(result_clinton)

# Each returned document holds a score and an id, so reshape into two columns
# instead of hard-coding the number of tweets.
score_output_clinton = data.frame(matrix(unlist(Output_clinton$documents), ncol=2, byrow=TRUE))
# X1 is the score (0 to 1); rescale it to 0-10
score_output_clinton$X1 = as.numeric(as.character(score_output_clinton$X1)) * 10

score_output_clinton["Candidate"] = "Clinton"

Here is a snapshot of the sentiment scores for trump_tweets_df & clinton_tweets_df.

Here X1 is the sentiment score & X2 is the ID of the tweet in the data frame.

Scores close to 10 indicate positive sentiment, while scores close to 0 indicate negative sentiment.

scores.png

Snapshot –

sample.png

Step 4) Sentiment Analysis Output

Boxplot of the sentiment scores.


final_score = rbind(score_output_clinton,score_output_trump)

library(ggplot2)

cols = c("#7CAE00", "#00BFC4")
names(cols) = c("Clinton", "Trump")

# Boxplot of scores per candidate, with the individual tweets overlaid as jittered points
ggplot(final_score, aes(x = Candidate, y = X1)) +
geom_boxplot(aes(fill = Candidate)) +
scale_fill_manual(values = cols) +
geom_jitter(colour = "gray40",
position = position_jitter(width = 0.5), alpha = 0.3)

Box Plot –

Here you can see that the median score for Trump (7.1) is higher than the median score for Clinton (6.7).

score.png

Scores close to 10 indicate positive sentiment, while scores close to 0 indicate negative sentiment.

Summary of Sentiment Scores –

mean.png
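
A per-candidate summary like the one in the snapshot can be computed with base R. The sketch below uses a small made-up final_score data frame with the same X1 and Candidate columns; the numbers are for illustration only and are not the results reported in this post.

```r
# Made-up scores on the 0-10 scale, for illustration only
final_score <- data.frame(
  X1 = c(2, 6, 7, 3, 7, 8),
  Candidate = c("Clinton", "Clinton", "Clinton", "Trump", "Trump", "Trump"),
  stringsAsFactors = FALSE
)

# Median score per candidate
medians <- tapply(final_score$X1, final_score$Candidate, median)
print(medians)

# Minimum, quartiles, mean and maximum per candidate
print(tapply(final_score$X1, final_score$Candidate, summary))
```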


Thanks!

Happy Learning! Your feedback would be appreciated!

Face API in R – Microsoft Cognitive Services

Folks,

In this blog we will explore the Face API in R (Face API – Microsoft Cognitive Services).

This API detects human faces in an image and returns face locations, landmarks, and other attributes such as age, gender, smile & glasses.

Click here & register for the free subscription of Microsoft Cognitive Services (Face API).

Here is my free subscription: 30,000 free transactions per month, limited to 20 per minute.

1.JPG

After registering, please copy the Subscription Key; it provides access to this API.


Face – Detect API

Request URL:

https://api.projectoxford.ai/face/v1.0/detect[?returnFaceId][&returnFaceLandmarks][&returnFaceAttributes]

Request Parameters:

Parameter 1: returnFaceId (Optional)
Type: Boolean (default value is true)
Description: Returns a unique faceId for each human face detected in the image.

Parameter 2: returnFaceLandmarks (Optional)
Type: Boolean (default value is false)
Description: Returns the face landmarks of each human face detected in the image.

Parameter 3: returnFaceAttributes (Optional)
Type: String (comma separated)
Description: Returns the requested face attributes, including age, gender, headPose, smile, facialHair & glasses.
Input Example: "returnFaceAttributes=age,gender"
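
The three optional parameters combine into a query string on the request URL. Here is a minimal sketch of assembling it in base R; build_detect_url and its argument names are hypothetical, chosen only for this illustration.

```r
# Hypothetical helper: builds the Face - Detect request URL from its optional parameters.
build_detect_url <- function(return_face_id = TRUE,
                             return_face_landmarks = FALSE,
                             face_attributes = character(0),
                             base_url = "https://api.projectoxford.ai/face/v1.0/detect") {
  # Booleans are written as lowercase true/false in the query string
  query <- c(
    paste0("returnFaceId=", tolower(as.character(return_face_id))),
    paste0("returnFaceLandmarks=", tolower(as.character(return_face_landmarks)))
  )
  # Attributes are optional and sent as one comma separated value
  if (length(face_attributes) > 0) {
    query <- c(query,
               paste0("returnFaceAttributes=", paste(face_attributes, collapse = ",")))
  }
  paste0(base_url, "?", paste(query, collapse = "&"))
}

face_api_url <- build_detect_url(return_face_landmarks = TRUE,
                                 face_attributes = c("age", "gender"))
print(face_api_url)
```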

Request Headers:

Content-Type (optional): Media type of the body sent to the API, either application/json or application/octet-stream.
Ocp-Apim-Subscription-Key: Subscription key which provides access to this API.

Request Body: JSON or Binary Data

{
    "url":"Url of the image"
}

R Commands & Output:-

R Packages Required: httr

library(httr)

# Below is the URL having returnFaceLandmarks = true and returnFaceAttributes = age,gender,headPose,smile,facialHair,glasses

face_api_url = "https://api.projectoxford.ai/face/v1.0/detect?returnFaceLandmarks=true&returnFaceAttributes=age,gender,headPose,smile,facialHair,glasses"

# Below is the image we are going to upload

body_image = upload_file("C:\\Users\\lenovo\\Documents\\image.jpeg")

# Below is the POST method (request headers are added using add_headers)

result = POST(face_api_url,
              body = body_image,
              add_headers(.headers = c("Content-Type"="application/octet-stream",
                                       "Ocp-Apim-Subscription-Key"="Subscription Key")))

API_Output = content(result)

# Converting the output into an R data frame (this flattening is simplest to read when the image contains one face)

Output_Face_Attributes = as.data.frame(API_Output)
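
When an image contains several faces, one row per face is usually easier to work with. The sketch below assumes each element of the parsed response has faceId, faceRectangle and faceAttributes fields; faces_to_df is a hypothetical helper and the mock response is made up, not real API output.

```r
# Hypothetical helper: builds a data frame with one row per detected face.
faces_to_df <- function(faces) {
  rows <- lapply(faces, function(f) {
    data.frame(
      faceId = f$faceId,
      top    = f$faceRectangle$top,
      left   = f$faceRectangle$left,
      width  = f$faceRectangle$width,
      height = f$faceRectangle$height,
      age    = f$faceAttributes$age,
      gender = f$faceAttributes$gender,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Mock of what content(result) might return for two faces (assumed shape, made-up values)
mock_faces <- list(
  list(faceId = "face-1",
       faceRectangle = list(top = 10, left = 20, width = 50, height = 50),
       faceAttributes = list(age = 31.5, gender = "female")),
  list(faceId = "face-2",
       faceRectangle = list(top = 15, left = 120, width = 48, height = 48),
       faceAttributes = list(age = 42, gender = "male"))
)

face_df <- faces_to_df(mock_faces)
print(face_df)
```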

Input:-

input

Input R Code Snapshot: 

2.png

Output:-

View(Output_Face_Attributes)

Here you can check the attributes of the image.

3.png

For more details & other Image and Face APIs, please visit this Link.


Thanks!

Happy Learning! Your feedback would be appreciated!

Microsoft Cognitive Services (Text Analytics API) in R

Folks,

In this blog we will explore Microsoft Cognitive Services (Text Analytics API) in R!

This API can detect sentiment, key phrases, topics, and language from your text.

Click here & Register for the free subscription of Microsoft Cognitive Services (Text Analytics).

Here is my free subscription: 5,000 free transactions per month.

free-subs

After registering, please copy the Subscription Key; it provides access to this API.
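
Rather than pasting the key into every script, it can be read from an environment variable. This is a minimal sketch: the variable name TEXT_ANALYTICS_KEY and the helper get_subscription_key are assumptions made for this example, not something the API requires.

```r
# Hypothetical helper: reads the subscription key from an environment variable.
get_subscription_key <- function(var = "TEXT_ANALYTICS_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set")
  }
  key
}

# Demo only: in practice the variable would be set outside the script (e.g. in ~/.Renviron)
Sys.setenv(TEXT_ANALYTICS_KEY = "my_subscription_key")
subscription_key <- get_subscription_key()
print(nchar(subscription_key))
```

The returned value could then be used in place of the literal key in the "Ocp-Apim-Subscription-Key" header.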


Detect Sentiments 

Request URL:

https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment

Request Headers:

Content-Type (optional): Media type of the body sent to the API.
Ocp-Apim-Subscription-Key: Subscription key which provides access to this API.

Request Body

{
  "documents": [
    {
      "language": "string",
      "id": "string",
      "text": "string"
    }
  ]
}
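
The request body above maps onto a data frame with one row per document. Here is a minimal sketch of that mapping, assuming the jsonlite package is installed; the two sample texts are made up, and the round trip through fromJSON is only there to check the structure.

```r
library(jsonlite)

# One row per document, with the three fields the request body expects
docs <- data.frame(
  language = c("en", "en"),
  id = c("1", "2"),
  text = c("I am happy", "I am sad"),
  stringsAsFactors = FALSE
)

# Wrapping the data frame in a list named "documents" yields an array of
# objects under the "documents" key
body_json <- toJSON(list(documents = docs), auto_unbox = TRUE)
print(body_json)

# Parse the JSON back to confirm the structure
round_trip <- fromJSON(body_json)
print(names(round_trip$documents))
```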

R Commands & Output:

R Packages required: httr & jsonlite.

library(httr)
library(jsonlite)

# Request body for the API: the text with id 1 is negative, the text with id 2 is positive

request_body <- data.frame(
language = c("en","en"),
id = c("1","2"),
text = c("This is wasted! I'm angry","This is awesome! Good Job Team! appreciated"),
stringsAsFactors = FALSE
)

# Converting the Request body(Dataframe) to Request body(JSON)

request_body_json <- toJSON(list(documents = request_body), auto_unbox = TRUE)

# Below we are calling the API (request headers are added using add_headers)

result <- POST("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment",
body = request_body_json,
add_headers(.headers = c("Content-Type"="application/json","Ocp-Apim-Subscription-Key"="my_subscription_key")))
Output <- content(result)

# Show Output
Output

Output Score:

id - "1" score - 0.2324503
id - "2" score - 0.9998128

Scores close to 1 indicate positive sentiment, while scores close to 0 indicate negative sentiment.

score.png

Detect Language

Request URL:

https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/languages[?numberOfLanguagesToDetect]

Request Parameters:

numberOfLanguagesToDetect - (Optional) Number of languages to detect. Set to 1 by default.

Request Headers:

Content-Type (optional): Media type of the body sent to the API.
Ocp-Apim-Subscription-Key: Subscription key which provides access to this API.

Request Body:

{
  "documents": [
    {
      "id": "string",
      "text": "string"
    }
  ]
}

R Commands & Output:

R Packages required: httr & jsonlite.

# Below is the Request body for the API
request_body <- data.frame(
id = "1",
text = "भारतीय धर्म में निर्वाण मुक्ति है",
stringsAsFactors = FALSE
)

# Converting the Request body(Dataframe) to Request body(JSON)
request_body_json <- toJSON(list(documents = request_body), auto_unbox = TRUE)

# Below we are calling API (Adding Request headers using add_headers)
# Here parameter numberOfLanguagesToDetect=1

result <- POST("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/languages?numberOfLanguagesToDetect=1",
body = request_body_json ,
add_headers(.headers = c("Content-Type"="application/json","Ocp-Apim-Subscription-Key"="my_subscription_key"))
)
Output <- content(result)
Output

Output Detected Language:

name - "Hindi"
iso6391Name - "hi"
score - 1

A score close to 1 indicates near-100% certainty that the identified language is correct.

You can set numberOfLanguagesToDetect & text as per your requirement. The API can also detect multiple languages; see the example below, where numberOfLanguagesToDetect = 2.

# Below is the Request body for the API
request_body <- data.frame(
id = "1",
text = "Nirvana is most commonly associated with Buddhism भारतीय धर्म में निर्वाण मुक्ति है",
stringsAsFactors = FALSE
)

request_body_json <- toJSON(list(documents = request_body), auto_unbox = TRUE)

result <- POST("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/languages?numberOfLanguagesToDetect=2",
body = request_body_json ,
add_headers(.headers = c("Content-Type"="application/json","Ocp-Apim-Subscription-Key"="my_subscription_key"))
)
Output <- content(result)
Output

3.png
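
With more than one detected language, a flat table is easier to read than the nested list. The sketch below assumes each document in the parsed response carries an id and a detectedLanguages list with the name, iso6391Name and score fields shown earlier; parse_language_response is a hypothetical helper and the mock response is made up, not real API output.

```r
# Hypothetical helper: one row per detected language per document.
parse_language_response <- function(parsed) {
  rows <- lapply(parsed$documents, function(doc) {
    langs <- doc$detectedLanguages
    data.frame(
      id = rep(as.character(doc$id), length(langs)),
      name = vapply(langs, function(l) l$name, character(1)),
      iso6391Name = vapply(langs, function(l) l$iso6391Name, character(1)),
      score = vapply(langs, function(l) as.numeric(l$score), numeric(1)),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Mock of what content(result) might return (assumed shape, made-up values)
mock_output <- list(
  documents = list(
    list(id = "1",
         detectedLanguages = list(
           list(name = "English", iso6391Name = "en", score = 1),
           list(name = "Hindi", iso6391Name = "hi", score = 1)
         ))
  ),
  errors = list()
)

languages <- parse_language_response(mock_output)
print(languages)
```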

For more details & APIs, please visit this Link.

You can also use this R Package for Microsoft Cognitive Services (Text Analytics API)


Thanks!

Happy Learning! Your feedback would be appreciated!