Stock Market Analysis Using R

Folks,

In this blog we will learn how to extract & analyze stock market data using R!

Using the quantmod package, we will first extract the stock data and then create some charts for analysis.

Quantmod – “Quantitative Financial Modeling and Trading Framework for R”!

It is an R package designed to assist the quantitative trader in the development, testing, and deployment of statistically based trading models. It has many features, so do check out its link.

Check out this blog for a Quantmod getSymbols R Shiny App – Link


R Packages Required:-

install.packages("quantmod")

Extracting Stock Market Data–

Function getSymbols: it loads and manages data from multiple sources.

getSymbols("SYMBOL", src="SOURCE", from="YYYY-MM-DD", to="YYYY-MM-DD")

Some supported src values are yahoo, google, oanda, etc.

In this blog we will first extract Bombay Stock Exchange data using the yahoo finance source. The Bombay Stock Exchange index symbol is ^BSESN.

1) Analyze One Month Data of Bombay Stock Exchange- 

library(quantmod)

getSymbols("^BSESN",src="yahoo" , from ="2016-10-23", to = Sys.Date())

View(BSESN)

Here is the BSESN output data (an xts object). You can see separate columns holding the Open, High, Low, Close, Volume & Adjusted stock prices.

High refers to the highest price the stock touched that day, Low refers to the lowest price at which the stock traded that day, Close refers to the closing price of that particular stock, and Volume refers to the number of shares traded that day.

[Screenshot: BSESN xts data showing Open, High, Low, Close, Volume & Adjusted columns]
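quantmod also provides column accessor functions (Op, Hi, Lo, Cl, Vo, Ad), so you can pull a single series out of the xts object, for example:

head(Cl(BSESN))   # closing prices
head(Vo(BSESN))   # traded volume
head(Ad(BSESN))   # adjusted closing prices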

Output Charts:- 

chart_Series(BSESN)

As you can see in the chart below, there was a huge dip after 8 Nov 2016, which may be due to demonetization in India.

[Chart: BSESN one-month series showing the dip after 8 Nov 2016]
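To look at that dip more closely, you can subset the xts object by date before charting it. A minimal sketch, zooming into November 2016:

chart_Series(BSESN['2016-11'])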

2) Analyze One Year Data of Bombay Stock Exchange- 

getSymbols("^BSESN",src="yahoo" , from ="2015-10-23", to = Sys.Date())

chart_Series(BSESN,type = "candlesticks")

Output Chart:-

[Chart: BSESN one-year candlestick chart]

3) Complete Data of Bombay Stock Exchange –

It will provide you with all available data from 2007 onwards.

getSymbols("^BSESN",src="yahoo")

Quantmod has some other features. For more details, please visit this Link.


Thanks!

Happy Learning! Your feedback would be appreciated!

Interactive Maps in R: Leaflet

Folks,

Leaflet is one of the most popular open-source JavaScript libraries for interactive maps.

There are many ways to visualize latitude and longitude data on a map using R, such as the ggmap or RgoogleMaps packages, but these packages generate static map images only. Leaflet allows users to zoom in and out in a very interactive way.

In this blog we will learn how to create an interactive map using Leaflet in R, and also how to map and style latitude and longitude data using R & the leaflet package!

Basics of Leaflet in R – bigdataenthusiast.com/2016/12/12/Leaflet.html


R Packages Required:  

install.packages("leaflet")

1) Creating a map using leaflet –

R Code – 

[Screenshot: R code for the basic leaflet map]

Here leaflet() initializes the leaflet workspace & addTiles() brings in the default OpenStreetMap tiles. OpenStreetMap is an open-source project that creates a free, editable map of the world.
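Since the code above is shown as a screenshot, here is a minimal sketch of what it likely looks like (leaflet functions are chained with the %>% pipe):

library(leaflet)

# Initialize the leaflet workspace and add the default OpenStreetMap tiles
m = leaflet() %>%
addTiles()
m   # prints the map in the RStudio Viewer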

Output –

R Viewer (Snapshot) – It allows users to zoom in and out in a very interactive way.

[Screenshot: interactive map in the RStudio Viewer]

I have published this output on RPubs. Click on the link below to see the interactive output!

Output Link – rpubs.com/BdataEnthusiast/InteractiveMap

2) Creating a map with a single marker –

Suppose a user wants to mark the India Gate co-ordinates, 28.61293° N, 77.229564° E, on the map.

R Code – 

[Screenshot: R code for the single-marker map]

Here addMarkers adds markers to the map, e.g. at given geo co-ordinates, with a popup, link, etc.
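A minimal sketch of the code behind the screenshot (the popup text is just illustrative):

library(leaflet)

# Mark the India Gate co-ordinates and attach a popup
leaflet() %>%
addTiles() %>%
addMarkers(lng = 77.229564, lat = 28.61293, popup = "India Gate")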

Output –

I have published this output on RPubs. Click on the link below to see the interactive output!

Link – rpubs.com/BdataEnthusiast/InteractiveMap01

Output GIF –

[GIF: interactive map with the India Gate marker]

3) Creating a map with multiple markers –

Suppose a user wants to mark multiple co-ordinates on the map.

E.g. below is an R dataframe (india_smart_cities) containing the latitude & longitude of 30 proposed smart cities in India.

[Screenshot: india_smart_cities dataframe]

R Code – 

[Screenshot: R code for the multiple-marker map]
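A minimal sketch of the code behind the screenshot, assuming india_smart_cities has columns named City, Latitude and Longitude (adjust the names to match your dataframe):

library(leaflet)

# One marker per smart city; the ~ formulas refer to columns of india_smart_cities
m = leaflet(data = india_smart_cities) %>%
addTiles() %>%
addMarkers(lng = ~Longitude, lat = ~Latitude, popup = ~City)
m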

Output – Click on the link below to see the interactive output!

Link – rpubs.com/BdataEnthusiast/IndiaSmartCities


If you also want to share your interactive map outside of the RStudio environment, just click on the Save as Web Page option under Export. It will generate an HTML file.
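Alternatively, you can save the widget from code with the htmlwidgets package; a small sketch, assuming the map object m from the sketch above:

library(htmlwidgets)

# Save the leaflet map as a standalone HTML file
saveWidget(m, file = "india_smart_cities_map.html")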


Check out this awesome Leaflet R Shiny app blog. This basic R Shiny app allows you to locate your geographic coordinates on an interactive Leaflet map.

Check out the Leaflet R Shiny App here shinyapps.io/LeafletShinyR/

[Screenshot: Leaflet R Shiny app]

For more details of R leaflet package, please visit this Link.


Thanks!

Happy Learning! Your feedback would be appreciated!

Sentiment Analysis in R

Folks,

In this blog we will do sentiment analysis of Trump & Clinton tweets using R!

We will use Microsoft Cognitive Services (Text Analytics API) in R to calculate sentiment scores for the tweets!


Step 1) Twitter Data Extraction

Extract tweets about Trump & Clinton using the twitteR package.

library(twitteR)
setup_twitter_oauth(Consumer_API_Key, Consumer_API_Secret, Access_Token, Access_Token_Secret)

clinton_tweets = searchTwitter("Hillary Clinton+@HillaryClinton", n=200, lang="en")
trump_tweets = searchTwitter("Donald Trump+@realDonaldTrump", n=200, lang="en")

trump_tweets_df = do.call("rbind", lapply(trump_tweets, as.data.frame))
trump_tweets_df = subset(trump_tweets_df, select = c(text))

clinton_tweets_df = do.call("rbind", lapply(clinton_tweets, as.data.frame))
clinton_tweets_df = subset(clinton_tweets_df, select = c(text))

If you are new to the twitteR package, please visit this blog & learn how to set up a Twitter application & OAuth in R.

Step 2) Cleaning of Tweets

Cleaning both dataframes – trump_tweets_df & clinton_tweets_df.

Below is just sample code for cleaning the text in R.

# Removing retweet entities, mentions, links, punctuation, digits, control characters and extra spaces
clinton_tweets_df$text = gsub("(RT|via)((?:\\b\\W*@\\w+)+)", " ", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("@\\w+", "", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("http\\w+", "", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("[[:punct:]]", "", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("[[:digit:]]", "", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("[[:cntrl:]]", "", clinton_tweets_df$text)
clinton_tweets_df$text = gsub("[[:blank:]]+", " ", clinton_tweets_df$text)

# Removing duplicate tweets
clinton_tweets_df["DuplicateFlag"] = duplicated(clinton_tweets_df$text)
clinton_tweets_df = subset(clinton_tweets_df, clinton_tweets_df$DuplicateFlag == FALSE)
clinton_tweets_df = subset(clinton_tweets_df, select = -c(DuplicateFlag))
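The same steps need to be applied to trump_tweets_df as well. One way is to wrap them in a small helper function (a hypothetical convenience, not from the original code) and reuse it:

# Hypothetical helper applying the same cleaning steps to any tweets dataframe
clean_tweets = function(df) {
  df$text = gsub("(RT|via)((?:\\b\\W*@\\w+)+)", " ", df$text)
  df$text = gsub("@\\w+", "", df$text)
  df$text = gsub("http\\w+", "", df$text)
  df$text = gsub("[[:punct:]]", "", df$text)
  df$text = gsub("[[:digit:]]", "", df$text)
  df$text = gsub("[[:cntrl:]]", "", df$text)
  df$text = gsub("[[:blank:]]+", " ", df$text)
  df[!duplicated(df$text), , drop = FALSE]   # drop duplicate tweets
}

trump_tweets_df = clean_tweets(trump_tweets_df)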

Here is a snapshot of the cleaned data frames trump_tweets_df & clinton_tweets_df.

[Screenshot: cleaned trump_tweets_df & clinton_tweets_df]

Step 3) Calculate Sentiment Scores

We will use Microsoft Cognitive Services (Text Analytics API) in R to calculate sentiment scores for the tweets. If you are new to Microsoft Cognitive Services, please visit this blog.

Calculating sentiment scores for trump_tweets_df –

library(jsonlite)
library(httr)

# Creating the request body for Text Analytics API
trump_tweets_df["language"] = "en"
trump_tweets_df["id"] = seq.int(nrow(trump_tweets_df))
request_body_trump = trump_tweets_df[c(2,3,1)]

# Converting tweets dataframe into JSON
request_body_json_trump = toJSON(list(documents = request_body_trump))

# Calling text analytics API
result_trump = POST("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment",
body = request_body_json_trump,
add_headers(.headers = c("Content-Type"="application/json","Ocp-Apim-Subscription-Key"="my_Subscription-Key")))

Output = content(result_trump)

score_output_trump = data.frame(matrix(unlist(Output), nrow=100, byrow=T))
score_output_trump$X1 = as.numeric(as.character(score_output_trump$X1)) * 10
score_output_trump["Candidate"] = "Trump"

Calculating sentiment scores for clinton_tweets_df –


# Creating the request body for Text Analytics API
clinton_tweets_df["language"] = "en"
clinton_tweets_df["id"] = seq.int(nrow(clinton_tweets_df))
request_body_clinton = clinton_tweets_df[c(2,3,1)]

# Converting tweets dataframe into JSON
request_body_json_clinton = toJSON(list(documents = request_body_clinton))

result_clinton = POST("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment",
body = request_body_json_clinton,
add_headers(.headers = c("Content-Type"="application/json","Ocp-Apim-Subscription-Key"="my-Subscription-Key")))

Output_clinton = content(result_clinton)

score_output_clinton = data.frame(matrix(unlist(Output_clinton), nrow=100, byrow=T))
score_output_clinton$X1 = as.numeric(as.character(score_output_clinton$X1)) * 10

score_output_clinton["Candidate"] = "Clinton"

Here is a snapshot of the sentiment scores for trump_tweets_df & clinton_tweets_df.

Here X1 is the sentiment score & X2 is the ID of the tweet in the dataframe.

Scores close to 10 indicate positive sentiment, while scores close to 0 indicate negative sentiment.

[Screenshot: sentiment scores dataframe]

Snapshot –

[Screenshot: sample of scored tweets]

Step 4) Sentiment Analysis Output

Boxplot of the sentiment scores.


final_score = rbind(score_output_clinton,score_output_trump)

library(ggplot2)

cols = c("#7CAE00", "#00BFC4")
names(cols) = c("Clinton", "Trump")

# boxplot
ggplot(final_score, aes(x = Candidate, y = X1, group = Candidate)) +
geom_boxplot(aes(fill = Candidate)) +
scale_fill_manual(values = cols) +
geom_jitter(colour = "gray40",
position = position_jitter(width = 0.5), alpha = 0.3)

Box Plot –

Here you can see that the Trump median (7.1) is greater than the Clinton median (6.7).

[Boxplot: sentiment score distributions for Trump & Clinton]


Summary of Sentiment Scores –

[Screenshot: summary of sentiment scores by candidate]
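The summary statistics shown above can be reproduced with base R, for example:

# Per-candidate summary and median of the sentiment scores
tapply(final_score$X1, final_score$Candidate, summary)
aggregate(X1 ~ Candidate, data = final_score, FUN = median)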


Thanks!

Happy Learning! Your feedback would be appreciated!