Extracting Twitter Trends using R Script


Folks,

In this blog we will learn how to extract Twitter trends using the twitteR package in R, and then save the data into SQL Server using an ODBC connection!

R Packages required: 

 

  • install.packages("twitteR"): Provides an interface to the Twitter web API. For twitteR app & OAuth access setup – visit this blog.
  • install.packages("RODBC"): For ODBC connectivity. For ODBC connection setup for a SQL Server/Oracle database – visit this blog.

Let’s get started!

The twitteR package has a getTrends function that can be used to extract Twitter trends based on an input parameter (woeid).

A WOEID (Where On Earth IDentifier) is a unique 32-bit reference identifier, originally defined by GeoPlanet and now assigned by Yahoo!, that identifies any feature on Earth.[Source: Wikipedia]

Below is the WOEID for India. You can use this link for a WOEID lookup.

woeid.png
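If you prefer to look the WOEID up from R itself, the twitteR package also provides an availableTrendLocations() function (it needs the OAuth setup shown later in the script). A hedged sketch:

library(twitteR)

#-- Sketch: listing trend locations from R (run after the OAuth setup below)
locations <- availableTrendLocations()
subset(locations, country == "India")   # returns name, country and woeid columns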

Here is the Script for extracting the twitter trend & saving data into SQL Server for further analysis.

 
#------------------------------------------------------------------------#
# R Script Name: GetTrends.R
# R Script Description: This script collects the Twitter trend data 
# from the Twitter API and dumps it into a database.
#------------------------------------------------------------------------#
#-- Importing Required library
library("twitteR", lib.loc="~/R/win-library/3.3")
library("RODBC", lib.loc="~/R/win-library/3.3")

#-- Provide woeid from internet as per your requirement
#-- Below woeid is for INDIA
woeid <-23424848

#-- Fetching Access Keys for Twitter API, which is present in local file 
OAuth_Location <- "C:\\Users\\lenovo\\Desktop\\OAuth.csv"
twitter_OAuth <- read.csv(file=OAuth_Location, header=TRUE, sep=",",colClasses = "character")

  #-- Calling twitteR OAuth function
  setup_twitter_oauth(twitter_OAuth$Consumer_API_Key, twitter_OAuth$Consumer_API_Secret, 
                      twitter_OAuth$Access_Token, twitter_OAuth$Access_Token_Secret)

  #-- Extracting Trends using getTrends Function
  current_trends  <-  getTrends(woeid) 
  current_trends["trend_date"]  <-  Sys.Date()

  #-- Opening ODBC Connection and Saving Trends in Database
  my_conn  <-  odbcConnect("RSQL", uid="***", pwd="***")
  sqlSave(my_conn, current_trends, tablename = "t_current_trends", append = TRUE)

  #-- Closing/Removing unwanted values and data
  #remove(list=c("current_trends","woeid","OAuth_Location","twitter_OAuth"))
  close(my_conn)
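Once the data is loaded, you can read it back in a later session with RODBC's sqlQuery; for example (a small sketch using the same "RSQL" DSN and table name as above):

#-- Optional check: count the rows loaded into the trends table
check_conn <- odbcConnect("RSQL", uid="***", pwd="***")
sqlQuery(check_conn, "SELECT COUNT(*) FROM t_current_trends")
close(check_conn)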

 

Output:- IPL Cricket fever in INDIA, that’s why #KKRvSRH is trending on top today  🙂

trend.png

You can also schedule this R script using the Windows Task Scheduler so that you get your trends at a scheduled time automatically. For scheduling an R script, visit this blog.


Thanks!

Happy Learning! Your feedback would be appreciated!

 

Stock Data Analysis – Shiny App

Folks,

This Shiny app shows you historical stock data & charts using the R quantmod getSymbols function.

Enter a valid stock symbol in the text box to extract the historical data & chart. Adjust the look-back period (last months) using the slider & the number of output rows to show using the numeric input.

Symbol examples – NSE, ^BSESN, RELIANCE.NS, TCS.NS, etc.

Check out the R Shiny App

Code: https://github.com/shobhit-singh/StockData
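For reference, here is a minimal sketch of the core idea behind such an app (the real app linked above has more inputs – the slider and numeric input mentioned earlier; the default symbol and layout here are just assumptions):

library(shiny)
library(quantmod)

ui <- fluidPage(
  textInput("symbol", "Stock Symbol", value = "TCS.NS"),
  plotOutput("chart")
)

server <- function(input, output) {
  output$chart <- renderPlot({
    # Fetch the series for the entered symbol and draw a quantmod chart
    stock <- getSymbols(input$symbol, src = "yahoo", auto.assign = FALSE)
    chartSeries(stock, name = input$symbol)
  })
}

shinyApp(ui, server)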

Shiny Screenshot.JPG

Output data is extracted using the Yahoo Finance source.

In the extracted output data you can see different columns holding the Open, High, Low, Close, Volume & Adjusted stock prices.

High refers to the highest price the stock touched that day, Low refers to the lowest price the stock traded at that day, Close refers to the closing price of that particular stock, and Volume refers to the number of shares traded that day.

quantmod is an R package designed to assist the quantitative trader in the development, testing, and deployment of statistically based trading models. It has many features, so check out its link: quantmod.

Thanks!

Happy Learning! Your feedback would be appreciated!

 

Stock Market Analysis Using R


Folks,

In this blog we will learn how to extract & analyze the Stock Market data using R!

Using the quantmod package, we will first extract the stock data and then create some charts for analysis.

Quantmod – “Quantitative Financial Modeling and Trading Framework for R”!

quantmod is an R package designed to assist the quantitative trader in the development, testing, and deployment of statistically based trading models. It has many features, so check out its link.

Check out the quantmod getSymbols R Shiny app – Link


R Packages Required:-

install.packages("quantmod")

Extracting Stock Market Data–

The getSymbols function loads and manages data from multiple sources.

getSymbols("SYMBOL", src = "SOURCE", from = "YYYY-MM-DD", to = "YYYY-MM-DD")

Some src methods are: yahoo, google, oanda etc.

In this blog we will first extract Bombay Stock Exchange data using the Yahoo Finance source. Bombay Stock Exchange index symbol – ^BSESN.

1) Analyze One Month Data of Bombay Stock Exchange- 

library(quantmod)

getSymbols("^BSESN",src="yahoo" , from ="2016-10-23", to = Sys.Date())

View(BSESN)

Here is the BSESN (xts object) output data. You can see different columns holding the Open, High, Low, Close, Volume and Adjusted stock prices.

High refers to the highest price the stock touched that day, Low refers to the lowest price the stock traded at that day, Close refers to the closing price of that particular stock, and Volume refers to the number of shares traded that day.

1.png
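If you want to work with the series numerically rather than just view it, quantmod also provides helpers such as Cl() to pull the closing price column and dailyReturn() to compute day-over-day returns. A small sketch:

closing_prices <- Cl(BSESN)     # the BSESN.Close column
returns <- dailyReturn(BSESN)   # daily returns computed from the series
head(returns)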

Output Charts:- 

chart_Series(BSESN)

As you can see in the chart below, there was a huge dip after 8 Nov 2016, possibly due to demonetisation in India.

3.JPG

2) Analyze One Year Data of Bombay Stock Exchange- 

getSymbols("^BSESN",src="yahoo" , from ="2015-10-23", to = Sys.Date())

chart_Series(BSESN,type = "candlesticks")

Output Chart:-

4.JPG

3) Complete Data of Bombay Stock Exchange –

It will provide you with all data after 2007.

getSymbols("^BSESN",src="yahoo")
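As an example of the charting extras, you could overlay a simple moving average on the full series fetched above (a hedged sketch; the 50-day window is only an illustrative choice):

chartSeries(BSESN, type = "candlesticks", theme = chartTheme("white"))
addSMA(n = 50)   # overlay a 50-day simple moving average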

Quantmod has some other features. For more details, please visit this Link.


Thanks!

Happy Learning! Your feedback would be appreciated!

Interactive Maps in R: Leaflet


Folks,

Leaflet is one of the most popular open-source JavaScript libraries for interactive maps.

There are many ways to visualize latitude and longitude data on a map using R, such as the ggmap or RgoogleMaps packages, but these packages generate only static map images. Leaflet allows users to zoom in and out in a very interactive way.

In this blog we will learn how to create an interactive map using Leaflet in R, and how to map and style latitude/longitude data using R & the leaflet package!

Basics of Leaflet in R – Leaflet Basics Blog


R Packages Required:  

install.packages("leaflet")

1) Creating a map using leaflet –

R Code – 

Code 0.JPG

Here leaflet() initializes the Leaflet map widget & addTiles() adds the default OpenStreetMap tiles. OpenStreetMap is a free, open-source service for creating a freely editable map of the world.
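Since the code above is shown as an image, here is a minimal sketch of what that step looks like:

library(leaflet)

my_map <- leaflet() %>%   # initialize the map widget
  addTiles()              # add the default OpenStreetMap tiles
my_map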

Output –

RStudio Viewer (snapshot) – it allows users to zoom in and out in a very interactive way.

out.JPG

I have published this output on RPubs. Click on the link below to see the interactive output!

Output Link – rpubs.com/BdataEnthusiast/InteractiveMap

2) Creating a map with a single marker –

Suppose a user wants to mark the "India Gate" coordinates, 28.61293° N, 77.229564° E, on the map.

R Code – 

code-1

Here addMarkers adds markers to the map, e.g. geo coordinates, a popup, a link, etc.
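A hedged sketch of the single-marker map described above, using the India Gate coordinates:

library(leaflet)

leaflet() %>%
  addTiles() %>%
  addMarkers(lng = 77.229564, lat = 28.61293, popup = "India Gate")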

Output –

I have published this output on RPubs. Click on the link below to see the interactive output!

Link – rpubs.com/BdataEnthusiast/InteractiveMap01

Output GIF –

ezgif.com-crop.gif

3) Creating a map with multiple markers –

Suppose a user wants to mark multiple coordinates on the map.

E.g. – Below is an R dataframe (india_smart_cities) containing the latitude & longitude of 30 proposed smart cities in India.

data.jpg

R Code – 

code multiple.JPG
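A hedged sketch of the multiple-marker map, assuming india_smart_cities has columns named city, latitude and longitude (the actual column names in the dataframe above may differ):

library(leaflet)

leaflet(data = india_smart_cities) %>%
  addTiles() %>%
  addMarkers(lng = ~longitude, lat = ~latitude, popup = ~city)   # column names are assumptions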

Output – Click on the link below to see the interactive output!

Link – rpubs.com/BdataEnthusiast/IndiaSmartCities


If you want to share your interactive map outside of the RStudio environment, just click on the Save as Web Page option under Export. It will generate an HTML file.



 

Check out this awesome Leaflet R Shiny app – Link. This basic R Shiny app allows you to locate your geographic coordinates on an interactive Leaflet map.

Check out the Leaflet R Shiny App here shinyapps.io/LeafletShinyR/

leafletshinyr

For more details of R leaflet package, please visit this Link.


Thanks!

Happy Learning! Your feedback would be appreciated!

Sentiment Analysis in R


Folks,

In this blog we will do sentiment analysis of Trump & Clinton tweets using R!

We will use Microsoft Cognitive Services (Text Analytics API) in R to calculate sentiment scores for the tweets!


Step 1) Twitter Data Extraction

Extract tweets of Trump & Clinton using twitteR Package.

library(twitteR)
setup_twitter_oauth(Consumer_API_Key, Consumer_API_Secret, Access_Token, Access_Token_Secret)

clinton_tweets = searchTwitter("Hillary Clinton+@HillaryClinton", n=200, lang="en")
trump_tweets = searchTwitter("Donald Trump+@realDonaldTrump", n=200, lang="en")

trump_tweets_df = do.call("rbind", lapply(trump_tweets, as.data.frame))
trump_tweets_df = subset(trump_tweets_df, select = c(text))

clinton_tweets_df = do.call("rbind", lapply(clinton_tweets, as.data.frame))
clinton_tweets_df = subset(clinton_tweets_df, select = c(text))

If you are new to the twitteR package, please visit this blog to learn how to set up a Twitter application & OAuth in R.

Step 2) Cleaning of Tweets

Cleaning both dataframes – trump_tweets_df & clinton_tweets_df.

Below is the just sample code for cleaning text in R.

# Removing punctuation, control characters, digits, retweet tags, mentions, links and extra spaces.
clinton_tweets_df$text = gsub("[[:punct:]]", "", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("[[:cntrl:]]", "", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("[[:digit:]]", "", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("(RT|via)((?:\\b\\W*@\\w+)+)", " ",  clinton_tweets_df$text)
clinton_tweets_df$text = gsub("@\\w+", "", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("http\\w+", "", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("[[:blank:]]+", " ", clinton_tweets_df$text)

# Removing duplicate tweets
clinton_tweets_df["DuplicateFlag"] = duplicated(clinton_tweets_df$text)
clinton_tweets_df = subset(clinton_tweets_df, clinton_tweets_df$DuplicateFlag=="FALSE")
clinton_tweets_df = subset(clinton_tweets_df, select = -c(DuplicateFlag))

Here is a snapshot of the cleaned dataframes trump_tweets_df & clinton_tweets_df.

data.png

Step 3) Calculate Sentiment Scores

We will use Microsoft Cognitive Services (Text Analytics API) in R to calculate sentiment scores for the tweets. If you are new to Microsoft Cognitive Services, please visit this blog.

Calculating sentiment scores for trump_tweets_df –

library(jsonlite)
library(httr)

# Creating the request body for Text Analytics API
trump_tweets_df["language"] = "en"
trump_tweets_df["id"] = seq.int(nrow(trump_tweets_df))
request_body_trump = trump_tweets_df[c(2,3,1)]

# Converting tweets dataframe into JSON
request_body_json_trump = toJSON(list(documents = request_body_trump))

# Calling text analytics API
result_trump = POST("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment",
body = request_body_json_trump,
add_headers(.headers = c("Content-Type"="application/json","Ocp-Apim-Subscription-Key"="my_Subscription-Key")))

Output = content(result_trump)

score_output_trump = data.frame(matrix(unlist(Output), nrow=100, byrow=T))
score_output_trump$X1 =  as.numeric(as.character(score_output_trump$X1))
score_output_trump$X1 = as.numeric(as.character(score_output_trump$X1)) *10
score_output_trump["Candidate"] = "Trump"
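The matrix(unlist(...)) step above assumes exactly 100 scored documents. As a hedged alternative, the parsed response can be flattened without that assumption (Output$documents is a list whose elements carry score and id fields in the v2.0 response):

score_output_trump = do.call(rbind, lapply(Output$documents, function(d) {
  data.frame(X1 = as.numeric(d$score) * 10,   # scale the 0-1 score to 0-10
             X2 = d$id,
             stringsAsFactors = FALSE)
}))
score_output_trump["Candidate"] = "Trump"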

Calculating sentiment scores for clinton_tweets_df –


# Creating the request body for Text Analytics API
clinton_tweets_df["language"] = "en"
clinton_tweets_df["id"] = seq.int(nrow(clinton_tweets_df))
request_body_clinton = clinton_tweets_df[c(2,3,1)]

# Converting tweets dataframe into JSON
request_body_json_clinton = toJSON(list(documents = request_body_clinton))

result_clinton = POST("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment",
body = request_body_json_clinton,
add_headers(.headers = c("Content-Type"="application/json","Ocp-Apim-Subscription-Key"="my-Subscription-Key")))

Output_clinton = content(result_clinton)

score_output_clinton = data.frame(matrix(unlist(Output_clinton), nrow=100, byrow=T))
score_output_clinton$X1 =  as.numeric(as.character(score_output_clinton$X1))
score_output_clinton$X1 = as.numeric(as.character(score_output_clinton$X1)) *10

score_output_clinton["Candidate"] = "Clinton"

Here is a snapshot of the sentiment scores for trump_tweets_df & clinton_tweets_df.

Here X1 is the sentiment score and X2 is the ID of the tweet in the dataframe.

Scores close to 10 indicate positive sentiment, while scores close to 0 indicate negative sentiment.

scores.png

Snapshot –

sample.png

Step 4) Sentiment Analysis Output

Boxplot of the sentiment scores.


final_score = rbind(score_output_clinton,score_output_trump)

library(ggplot2)

cols = c("#7CAE00", "#00BFC4")
names(cols) = c("Clinton", "Trump")

# boxplot
ggplot(final_score, aes(x = Candidate, y = X1, group = Candidate)) +
geom_boxplot(aes(fill = Candidate)) +
scale_fill_manual(values = cols) +
geom_jitter(colour = "gray40",
position = position_jitter(width = 0.5), alpha = 0.3)

Box Plot –

Here you can see that the Trump median (7.1) > the Hillary median (6.7).

score.png

Scores close to 10 indicate positive sentiment, while scores close to 0 indicate negative sentiment.

Summary of Sentiment Scores –

mean.png
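One way to reproduce such a summary directly from the combined scores (a small sketch):

tapply(final_score$X1, final_score$Candidate, summary)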


Thanks!

Happy Learning! Your feedback would be appreciated!

Face API in R – Microsoft Cognitive Services


Folks,

In this blog we will explore the Face API in R (Face API – Microsoft Cognitive Services).

This API can detect human faces in an image and return face locations, landmarks, and other important attributes such as age, gender, smile & glasses.

Click here & Register for the free subscription of Microsoft Cognitive Services (Face API).

Here is my free subscription. Free 30,000 transactions per month. 20 per Minute.

1.JPG

After registering please copy the Subscription Key. This is the Subscription Key which provides access to this API.


Face – Detect API

Request URL:

https://api.projectoxford.ai/face/v1.0/detect[?returnFaceId][&returnFaceLandmarks][&returnFaceAttributes]

Request Parameters:

Parameter 1: returnFaceId (Optional)
Type: Boolean (default value is true)
Description: Returns unique faceIds of the detected human faces in the image.

Parameter 2: returnFaceLandmarks (Optional)
Type: Boolean (default value is false)
Description: Returns face landmarks of the detected human faces in the image.

Parameter 3: returnFaceAttributes (Optional)
Type: String (comma separated)
Description: Returns face attributes including age, gender, headPose, smile, facialHair & glasses.
Input Example: "returnFaceAttributes=age,gender"

Request Headers:

Content-Type (optional): Media type of the body sent to the API.
application/json or application/octet-stream
Ocp-Apim-Subscription-Key: Subscription key which provides access to this API.

Request Body: JSON or Binary Data

{
    "url":"Url of the image"
}
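For reference, sending that JSON body from R with httr might look like the hedged sketch below (the image URL and subscription key are placeholders); the binary-upload variant actually used in this blog follows.

library(httr)

result = POST("https://api.projectoxford.ai/face/v1.0/detect?returnFaceAttributes=age,gender",
              body = '{"url":"http://example.com/photo.jpg"}',     # placeholder image URL
              add_headers(.headers = c("Content-Type"="application/json",
                                       "Ocp-Apim-Subscription-Key"="Subscription Key")))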

R Commands & Output:-

R Packages Required: httr

library(httr)

# Below is the URL having returnFaceLandmarks = true and returnFaceAttributes = age,gender,headPose,smile,facialHair,glasses

face_api_url = "https://api.projectoxford.ai/face/v1.0/detect?returnFaceLandmarks=true&returnFaceAttributes=age,gender,headPose,smile,facialHair,glasses"

# Below is the image we are going to upload

body_image = upload_file("C:\\Users\\lenovo\\Documents\\image.jpeg")

# Below is the POST method (adding request headers using add_headers)

result = POST(face_api_url,
              body = body_image,
              add_headers(.headers = c("Content-Type"="application/octet-stream",
                                       "Ocp-Apim-Subscription-Key"="Subscription Key")))

API_Output = content(result)

# Converting output into an R dataframe

Output_Face_Attributes = as.data.frame(API_Output)
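Before flattening, you can also pull individual attributes straight out of the parsed list; a sketch for the first detected face (field names follow the Face API v1.0 response format):

API_Output[[1]]$faceAttributes$age      # estimated age of the first face
API_Output[[1]]$faceAttributes$gender   # estimated gender of the first face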

Input:-

input

Input R Code Snapshot: 

2.png

Output:-

View(Output_Face_Attributes)

Here you can check the attributes of the image.

3.png

For more details & other Image and Face API’s, please visit this Link.


Thanks!

Happy Learning! Your feedback would be appreciated!

Microsoft Cognitive Services (Text Analytics API) in R


Folks,

In this blog we will explore Microsoft Cognitive Services (Text Analytics API) in R!

This API can detect sentiment, key phrases, topics, and language from your text.

Click here & Register for the free subscription of Microsoft Cognitive Services (Text Analytics).

Here is my free subscription. Free 5,000 transactions per month.

free-subs

After registering please copy the Subscription Key. This is the Subscription Key which provides access to this API.


Detect Sentiments 

Request URL:

https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment

Request Headers:

Content-Type (optional): Media type of the body sent to the API.
Ocp-Apim-Subscription-Key: Subscription key which provides access to this API.

Request Body

{
  "documents": [
    {
      "language": "string",
      "id": "string",
      "text": "string"
    }
  ]
}

R Commands & Output:

R Packages required: httr & jsonlite.

# Below is the Request body for the API having text id 1 = Negative sentiments, id 2 = Positive sentiments

request_body <- data.frame(
language = c("en","en"),
id = c("1","2"),
text = c("This is wasted! I'm angry","This is awesome! Good Job Team! appreciated")
)

# Converting the Request body(Dataframe) to Request body(JSON)

request_body_json <- toJSON(list(documents = request_body), auto_unbox = TRUE)

# Below we are calling API (Adding Request headers using add_headers)

result <- POST("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment",
body = request_body_json,
add_headers(.headers = c("Content-Type"="application/json","Ocp-Apim-Subscription-Key"="my_subscription_key")))
Output <- content(result)

# Show Output
Output

Output scores:
id "1" – score 0.2324503
id "2" – score 0.9998128

Scores close to 1 indicate positive sentiment, while scores close to 0 indicate negative sentiment.
score.png
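If you prefer a data frame over the raw list, the parsed response can be collected like this (a hedged sketch; Output$documents holds id and score fields in the v2.0 response):

scores <- data.frame(
  id    = sapply(Output$documents, function(d) d$id),
  score = sapply(Output$documents, function(d) d$score),
  stringsAsFactors = FALSE
)
scores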

Detect Language

Request URL:

https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/languages[?numberOfLanguagesToDetect]

Request Parameters:

numberOfLanguagesToDetect - (Optional) Number of languages to detect. Set to 1 by default.

Request Headers:

Content-Type (optional): Media type of the body sent to the API.
Ocp-Apim-Subscription-Key: Subscription key which provides access to this API.

Request Body:

{
  "documents": [
    {
      "id": "string",
      "text": "string"
    }
  ]
}

R Commands & Output:

R Packages required: httr & jsonlite.

# Below is the Request body for the API
request_body <- data.frame(
id = "1",
text = "भारतीय धर्म में निर्वाण मुक्ति है",
stringsAsFactors = FALSE
)

# Converting the Request body(Dataframe) to Request body(JSON)
request_body_json <- toJSON(list(documents = request_body), auto_unbox = TRUE)

# Below we are calling API (Adding Request headers using add_headers)
# Here parameter numberOfLanguagesToDetect=1

result <- POST("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/languages?numberOfLanguagesToDetect=1",
body = request_body_json ,
add_headers(.headers = c("Content-Type"="application/json","Ocp-Apim-Subscription-Key"="my_subscription_key"))
)
Output <- content(result)
Output
Detected language output:
name – "Hindi"
iso6391Name – "hi"
score – 1

A score close to 1 indicates high confidence that the identified language is correct.

You can set numberOfLanguagesToDetect & text as per your requirement. The API can also detect multiple languages; see the example below where numberOfLanguagesToDetect = 2.

# Below is the Request body for the API
request_body <- data.frame(
id = "1",
text = "Nirvana is most commonly associated with Buddhism भारतीय धर्म में निर्वाण मुक्ति है",
stringsAsFactors = FALSE
)

request_body_json <- toJSON(list(documents = request_body), auto_unbox = TRUE)

result <- POST("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/languages?numberOfLanguagesToDetect=2",
body = request_body_json ,
add_headers(.headers = c("Content-Type"="application/json","Ocp-Apim-Subscription-Key"="my_subscription_key"))
)
Output <- content(result)
Output
3.png
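The key phrases detection mentioned at the start of this blog follows the same pattern; a hedged sketch with a small documents-style body (endpoint path as documented for v2.0):

kp_body <- toJSON(list(documents = data.frame(
  language = "en", id = "1",
  text = "R makes it easy to call the Text Analytics API from a script")),
  auto_unbox = TRUE)

result_kp <- POST("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases",
  body = kp_body,
  add_headers(.headers = c("Content-Type"="application/json",
                           "Ocp-Apim-Subscription-Key"="my_subscription_key")))
content(result_kp)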

For more details & API’s, please visit this Link.

You can also use this R Package for Microsoft Cognitive Services (Text Analytics API)


Thanks!

Happy Learning! Your feedback would be appreciated!

Scheduling R Script using Windows Task Scheduler


Folks,

In this blog we will learn how to schedule R Script using Windows Task Scheduler!


Suppose this is the R script which we want to schedule.

Output of this script: it creates a log file named "R_Scripts_Logs_<Time stamp>.TXT" containing the text "Script successfully invoked by scheduler at <Time stamp>".

1
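Since that script is shown as an image, here is a minimal sketch matching the description above (the file is written to the working directory; the exact format is an assumption):

# Minimal sketch: write a timestamped log file as described above
time_stamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
log_file <- paste0("R_Scripts_Logs_", time_stamp, ".TXT")
writeLines(paste("Script successfully invoked by scheduler at", Sys.time()), con = log_file)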

Create a batch file with the commands below:

@echo off
"Location of R.exe" CMD BATCH "Location of R Script"

Example: 

2.png


Steps for scheduling R Script using Windows Task Scheduler:

Open Windows Task Scheduler & create a Basic Task.

3

Step 1: Provide your task name & description.

step-1

Step 2: Provide trigger timings details as per your requirement.

step-2

step-3

Step 3: In the Action step, provide the location of the batch file for the script. After that, save the task.

step-5

step-6

Task has been scheduled.

step8

At 22:05, task triggered the batch file automatically.

Here is the output of that script:

outut

When you run an R script using R CMD BATCH, a .Rout file is also generated automatically, containing the script commands and the elapsed time.

Below is the .Rout file of the script my R Script.R.

output


Thanks!

Happy Learning! Your feedback would be appreciated!

Visualization of Tweets on Google Maps


Folks,

In this blog we will learn how to visualize tweets on Google Maps using R!

A Twitter app is required to get started with this. If you don't have a Twitter application, go to the Twitter Developer link and sign in with your credentials, then go to Twitter Apps & click on the "Create New App" button to create a new application.

1.png

Once you’ve done this, make a note of your Keys & Access Token.

  • Consumer Key (API Key)
  • Consumer Secret (API Secret)
  • Access Token
  • Access Token Secret

R Packages required: 

install.packages("ggmap"): It allows us to access maps from the Google Maps API.

The ggmap package has a get_map function that can download maps from the Google Maps API.

install.packages("twitteR"): It provides an interface to the Twitter web API.

The twitteR package has a searchTwitter function that can search tweets based on a supplied search string.


Extracting Maps: 

Extract the map of India using get_map with the coordinates below. Coordinates were extracted using this blog.

bounds.png

R Commands: 

7.png

I have selected maptype = "terrain"; other maptype options include satellite & roadmap.
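Since the commands above are shown as an image, here is a hedged sketch of that step (the centre coordinates and zoom level are assumptions, not the exact values used):

library(ggmap)

india_map <- get_map(location = c(lon = 78.96288, lat = 20.593684),
                     zoom = 5, maptype = "terrain")
ggmap(india_map)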

Here is the output of ggmap(india_map).

8.png


Extracting Tweets: 

Twitter OAuth Setup:

2.png

Extracting 500 tweets with the tag @narendramodi within a 1000 km radius of the given latitude/longitude.

20.593684° N, 78.96288° E is the location of India, extracted using this blog.

3.png
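Since those commands are also shown as an image, here is a hedged reconstruction of that step (the geocode string combines the latitude, longitude and radius given above):

library(twitteR)

my_tweets <- searchTwitter("@narendramodi", n = 500,
                           geocode = "20.593684,78.96288,1000km")
my_tweets_df <- twListToDF(my_tweets)   # 500 observations, 16 variables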

Here is the output of View(my_tweets_df): 500 observations & 16 variables.

4

5

Excluding rows from the dataframe where longitude/latitude = NA & taking only the last two columns (unique data).

6.png
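A hedged sketch of that filtering step (twListToDF returns longitude and latitude as the last two columns):

my_tweets_df <- my_tweets_df[!is.na(my_tweets_df$longitude) & !is.na(my_tweets_df$latitude), ]
my_tweets_df <- unique(my_tweets_df[, c("longitude", "latitude")])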

Here is the final output of View(my_tweets_df): 206 observations & 2 variables.

1111


Mapping of Tweets & Google Map: 

Converting both columns of my_tweets_df to numeric.

11

Plotting my_tweets_df on the map.

9.png
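A hedged sketch of the conversion and plotting steps shown in the images above (point colour and size are just illustrative choices):

my_tweets_df$longitude <- as.numeric(my_tweets_df$longitude)
my_tweets_df$latitude <- as.numeric(my_tweets_df$latitude)

ggmap(india_map) +
  geom_point(data = my_tweets_df, aes(x = longitude, y = latitude),
             colour = "red", alpha = 0.5, size = 2)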

Here is the output of the plot.

It seems like more tweets (with the tag @narendramodi) are coming from Mumbai & Delhi.

10.png


Thanks!

Happy Learning!

Text Analysis – Facebook Post Comments


Folks,

In this blog we will learn how to analyze the comments on a public Facebook post using the Facebook Graph API Explorer & R!

A Facebook developer account is required to get started with the Facebook Graph API.

If you don't have a Facebook developer account, you can upgrade your personal Facebook account to a developer account from this link.

After registering as a Facebook developer, go to "Tools & Support" -> "Graph API Explorer".

To explore the Graph API, a token & permissions are required, so just click on "Get Token".

1.png

~ Courtesy Facebook Developer

As public profiles are included in the permissions by default, just click on "Get Access Token".

4.png

Below is my access token, which will expire after some time, as shown in the token info.

5.png
Graph Explorer

Now that we have the token, let's explore.


Extracting Comments from the Public Facebook Post.

The first thing you need is the post ID. See the steps below to get it.

Suppose the post below is the one we want to analyze. Click on the post date/time (see the highlighted box below). ~ Post courtesy Facebook

3.png

Copy below Id. This is the post Id.

2.png

Go to the Graph Explorer.

Type “Post_id/comments” in below box & click on Submit.

6.png

You can also give a limit for the number of comments to return, like this:

Post_id/comments?limit=<number>

My Input:

7.png

Output: Below are the post comments in JSON format.

If you want more comments, click on "next" for the next page of comments.

7.png

Click on “Get Code” to get the cURL code. Copy this URL, we will use this URL in R.

8.png

Text Analysis in R

R Packages required: 

install.packages("RCurl"): It allows us to compose general HTTP requests and provides convenient functions to fetch data.

install.packages(“rjson”): It allows us to converts JSON object into R objects and vice-versa.

install.packages("tm"): A text mining package for R. It offers a number of transformations that ease the tedium of cleaning data.

R Commands:

111

The URL used in the image above is copied from the cURL code in the Graph API Explorer.
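For reference, a hedged sketch of those commands (the post ID and access token are placeholders; the real URL comes from the Graph API Explorer's "Get Code" window):

library(RCurl)
library(rjson)

url <- "https://graph.facebook.com/<post_id>/comments?access_token=<access_token>"   # placeholders
raw_json <- getURL(url, ssl.verifypeer = FALSE)
comments <- fromJSON(raw_json)

# Each element of comments$data carries a "message" field with the comment text
comment_text <- sapply(comments$data, function(x) x$message)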

Output: The first page gives me only 25 comments. We will analyze only the first page here, i.e. 25 comments.

output

Cleaning & Analyzing Data:

Creating a corpus & removing extra spaces, special characters & other unwanted things.

cleaning.png
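A hedged sketch of that cleaning step with the tm package (the exact transformations used in the image may differ):

library(tm)

corpus <- Corpus(VectorSource(comment_text))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stripWhitespace)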

Creating Term Document Matrix:

1111.png
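A hedged sketch of building the term-document matrix and word frequencies from that corpus:

tdm <- TermDocumentMatrix(corpus)
word_freq <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)
head(word_freq)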

Here are the 760 extracted words with their frequencies.

1112.png

Creating Wordcloud: 

install.packages(“wordcloud”): For plotting a word cloud

lastt.png

In this word cloud we are taking only 100 words with a minimum frequency of 2.
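A hedged sketch of that word cloud call (the colour palette is an illustrative choice):

library(wordcloud)
library(RColorBrewer)

wordcloud(names(word_freq), word_freq,
          max.words = 100, min.freq = 2,
          random.order = FALSE, colors = brewer.pal(8, "Dark2"))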

Output:

lasttttttt.png

Graph API Reference ~ Facebook Developer. For more details, please read this Link.

Feedback and suggestions are most welcome. If you have any questions, please comment.


Thanks!

Happy Learning!