Tag Archives: R programming

Tables with R

People around Thanksgiving table enjoying dinner
Thanksgiving Table 11/26/2009

Cirque du Soleil's Kurios show has an act where the performers mirror a table. It is amazing to see people upside down, mirroring a table.

The R programming language has several packages for making tables. Base R has a function called table, which is often good enough. Sometimes you want more. At a meeting last night someone said pander was the best package. Someone else said that they liked htmlTable better. There are also xtable and tables. The tables package was written to be like SAS PROC TABULATE. There are many choices, so pick the one whose directions you understand and that meets your publishing needs. Which one is better depends on your point of view.

table

tables

xtable

htmlTable

pander
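As a starting point, here is a minimal sketch of the base R table function mentioned above. It uses the built-in mtcars data set, which is my choice for illustration and not something from the meeting.

```r
# Count how many cars in the built-in mtcars data have each number of cylinders
cyl_counts <- table(mtcars$cyl)
print(cyl_counts)

# Two-way table: cylinders (rows) by number of gears (columns)
cyl_by_gear <- table(cyl = mtcars$cyl, gear = mtcars$gear)
print(cyl_by_gear)
```

The other packages take tables like these and format them for HTML, LaTeX, or Markdown output.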

Thanksgiving Table right side up
table right side up

Functions in R

Hug Point Oregon Coast

Wish I was at the coast.

R does a lot with functions. Let’s start with a simple function statement. Base R uses the keyword function to create a function.

In the following code:

f is the object's name

x is the argument

the function body, x + 1, goes between { }

Pretty simple.

f <- function(x) {x + 1}

Take
f(4)

results

[1] 5

Write your own code and try other functions. Writing a function in R is easier than in many other languages.
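Here is one more function to try, as a small sketch. The name add_n and the default value are made up for this example.

```r
# A function with two arguments; n defaults to 1 when it is left out
add_n <- function(x, n = 1) {
  x + n
}

one_more <- add_n(4)        # uses the default n = 1
ten_more <- add_n(4, n = 10)
print(one_more)
print(ten_more)

# R functions are vectorized here, so a whole vector works too
print(add_n(c(1, 2, 3)))
```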

ggplot2

ggplot2 book cover
ggplot2: Elegant Graphics for Data Analysis

Latest edition of Hadley Wickham’s book ggplot2

Springer International Publishing 2016

This is a major update. I spent a lot of time going over the last chapters in the book.

Part 3, Data Analysis, covers a different way of using ggplot2. Instead of doing the analysis and then plotting, you do both parts at the same time using ggplot2 and other useful new packages.

Chapter 9 covers tidy data. Tidy data has variables in columns and observations in rows. That is straightforward, but the data doesn’t always come that way. The packages tidyr and dplyr help with tidying up data.
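To show what tidying means, here is a rough sketch that reshapes a small wide table into tidy form. It uses base R's reshape function so that it runs without extra packages; tidyr has functions for the same job. The data frame and its column names are invented for this example.

```r
# Untidy: the years are spread across column names instead of being a variable
wide <- data.frame(
  country = c("A", "B"),
  y1999   = c(10, 20),
  y2000   = c(15, 25)
)

# Tidy: one row per observation, with columns country, year, cases
long <- reshape(
  wide,
  direction = "long",
  varying   = c("y1999", "y2000"),
  v.names   = "cases",
  timevar   = "year",
  times     = c(1999, 2000),
  idvar     = "country"
)
rownames(long) <- NULL
print(long)
```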

One of the things covered in Chapter 10 is pipes and the package magrittr. Using pipes makes for cleaner code.
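A quick sketch of why pipes read more cleanly. This assumes R 4.1 or later, which has a native pipe |>. The book uses the %>% pipe from magrittr, which works in a similar way for a chain like this one.

```r
x <- c(4, 9, 16)

# Nested calls have to be read from the inside out
nested <- round(mean(sqrt(x)), 1)

# The same steps with a pipe read left to right
piped <- x |> sqrt() |> mean() |> round(1)

print(nested)
print(piped)
```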

Chapter 11, Modelling for Visualization, introduces the new package called broom. The broom package takes the messy output of model functions such as lm, glm, and anova and makes it tidy.

The beginning of the book covers aes(), which you need for your plot, and the geom functions, which you keep adding as layers.

This is a good book for learning how to use ggplot2 and new techniques for analyzing data.
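To give a feel for what tidy model output looks like, here is a base R imitation of the idea. It does not call broom. The object name tidy_like is made up, the column names are my choice, and the model on the built-in mtcars data is only an example.

```r
# Fit a simple linear model on the built-in mtcars data
fit <- lm(mpg ~ wt, data = mtcars)

# summary(fit) prints nicely but is awkward to work with;
# pull the coefficient matrix out and turn it into a plain data frame
coefs <- as.data.frame(coef(summary(fit)))

tidy_like <- data.frame(
  term      = rownames(coefs),
  estimate  = coefs[["Estimate"]],
  std.error = coefs[["Std. Error"]],
  p.value   = coefs[["Pr(>|t|)"]],
  row.names = NULL
)
print(tidy_like)
```

With one row per model term, the result can go straight into a plot or be combined with results from other models.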

Mastering Social Media Mining with R

Sharan Kumar Ravindran, published by Packt, September 2015

A useful R book that covers current social media and data science techniques.

My favorite library in this book is from chapter six, SocialMediaMineR.

The function get_facebook from the SocialMediaMineR package takes a URL and returns a data frame of shares, likes, etc. The function is easy to use. You do not need OAuth, just a link. It works like this:

> library(SocialMediaMineR)
> get_facebook("https://www.packtpub.com")
trying URL 'http://curl.haxx.se/ca/cacert.pem'
Content type [unreadable] length 256338 bytes (250 Kb)
opened URL
downloaded 250 Kb

                       url            normalized_url
1 https://www.packtpub.com https://www.packtpub.com/
  share_count like_count comment_count total_count click_count
1         432        361           155         948           0
      comments_fbid commentsbox_count
1 10150745127795008                 0

This one function could keep you occupied for a long time.

But there are other useful libraries in this book: ROAuth for OAuth, twitteR for Twitter, Rfacebook for Facebook, and rgithub for GitHub.

The book covers exploratory data analysis (EDA) in the chapter on GitHub.

Sentiment analysis is covered in the chapter on Twitter.

The book briefly covers a lot. There are many other books that cover a single topic in more detail. Read this book to discover what you want to explore.

 

Building a Recommendation System with R

Written by Suresh K. Gorakala and Michele Usuelli, published by Packt Publishing, 2015

This is a whole book on a topic that is often only a single chapter in a book. It is a book for people who already know R and machine learning.

The book uses math equations, not just code, to teach the concepts.

It covers the confusion matrix for classification, along with sensitivity and specificity. There are lots of details about type one and type two errors. This clearly written section will help you understand what the two types of error are and why you don’t want either of them.
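A small sketch of a confusion matrix in base R, with sensitivity and specificity worked out from it. The labels are invented for this example and are not from the book. A type one error is a false positive and a type two error is a false negative.

```r
# Made-up true labels and predictions for ten cases
actual <- factor(
  c("yes", "yes", "yes", "yes", "no", "no", "no", "no", "no", "no"),
  levels = c("yes", "no")
)
predicted <- factor(
  c("yes", "yes", "yes", "no", "yes", "no", "no", "no", "no", "no"),
  levels = c("yes", "no")
)

# Confusion matrix: predictions in rows, truth in columns
cm <- table(predicted = predicted, actual = actual)
print(cm)

tp <- as.numeric(cm["yes", "yes"])  # true positives
fn <- as.numeric(cm["no", "yes"])   # false negatives (type two errors)
fp <- as.numeric(cm["yes", "no"])   # false positives (type one errors)
tn <- as.numeric(cm["no", "no"])    # true negatives

sensitivity <- tp / (tp + fn)  # share of actual "yes" cases that were found
specificity <- tn / (tn + fp)  # share of actual "no" cases that were found
print(c(sensitivity = sensitivity, specificity = specificity))
```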

Classification similarity measures include Euclidean Distance, Cosine Distance and Pearson Correlation.
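The three measures can be tried out in a few lines of base R. This is only a sketch with two made-up vectors. Cosine distance is taken here as one minus cosine similarity, which is one common convention and may not be the exact definition the book uses.

```r
# Two made-up rating vectors; b is exactly twice a
a <- c(1, 2, 3)
b <- c(2, 4, 6)

# Euclidean distance: straight-line distance between the two points
euclidean <- sqrt(sum((a - b)^2))

# Cosine similarity compares direction only, ignoring length
cosine_similarity <- sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
cosine_distance <- 1 - cosine_similarity

# Pearson correlation, using the built-in cor function
pearson <- cor(a, b)

print(c(euclidean = euclidean,
        cosine_distance = cosine_distance,
        pearson = pearson))
```

The two vectors point in the same direction, so the cosine distance is zero and the correlation is one, even though the Euclidean distance is not zero.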

Dimensionality reduction techniques include Principal Component Analysis.

Data mining techniques include K-means clustering and Support Vector Machines.

Recommender system techniques include collaborative filtering and content-based filtering.
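For a quick look at Principal Component Analysis, base R has the prcomp function. This sketch uses the built-in iris data, which is my choice for illustration and not an example from the book.

```r
# PCA on the four numeric columns of iris, centered and scaled
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Share of the total variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Keep only the first two components: 4 columns reduced to 2
scores <- pca$x[, 1:2]
print(head(scores))
```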

 

The R package for the book is recommenderlab.

recommenderlab: Lab for Developing and Testing Recommender Algorithms by Michael Hahsler at http://CRAN.R-project.org/package=recommenderlab

Other packages used are lsa, e1071, and cluster.

 

Beginning Data Science with R

Beginning Data Science with R, written by Manas A. Pathak, published by Springer Publishing, 2014.
ISBN 978-3-319-12065-2

Code examples at extras.springer.com

This book is written for people who already know how to code and want to learn R for data science.

The book covers how to install and use R, but not an IDE like RStudio.

Chapter 2 includes control structures and functions. It explains that functions in R are treated as first-class objects, a fundamental property of functional programming languages.

Chapter 3 is on getting data into R. How to get data into R is a common question. Years ago I was puzzled about getting data into R. I didn’t want to type it all into an array. You don’t have to type in the data. R will read from, pull from, and connect to all sorts of data sources.
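Here is a short sketch of what first-class functions look like in practice: functions are passed as arguments, returned from other functions, and stored in a list. All the names are made up for this example.

```r
add_one   <- function(x) x + 1
times_two <- function(x) x * 2

# A function that takes two functions and returns a new function
compose <- function(f, g) {
  function(x) g(f(x))
}

add_then_double <- compose(add_one, times_two)
result <- add_then_double(3)   # (3 + 1) * 2
print(result)

# Functions can be stored in a list like any other object
ops <- list(add_one = add_one, times_two = times_two)
applied <- sapply(ops, function(op) op(10))
print(applied)
```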
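As a minimal sketch of reading data instead of typing it in, base R's read.csv turns CSV text into a data frame. The text is given inline here so the example runs on its own; normally you would pass a file name or URL instead. The names and scores are invented.

```r
# CSV content as a string, standing in for a file on disk
csv_text <- "name,score
Ann,90
Bob,85"

# read.csv accepts the content through its text argument
scores <- read.csv(text = csv_text)
print(scores)
print(mean(scores$score))
```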

Chapter 4 is a nice overview of data visualization.

The book goes on to cover necessary topics and techniques in data science. What I want to point out is that Section 7.3.1 on nearest neighbors uses a package that I haven’t used before, kknn. The package is straightforward to use. The author, Pathak, has written an easy-to-grasp explanation of the technique.

This is a good book to get you started coding in R for data science.

color in R

There are 101 numbered shades of gray in R, gray0 through gray100, along with lightgray, lightslategray, slategray, darkgray, and darkslategray. That is way more shades of gray than I will ever use. I think I will try lavenderblush4 and chocolate4.

colors()
[1] “white” “aliceblue” “antiquewhite” “antiquewhite1”
[5] “antiquewhite2” “antiquewhite3” “antiquewhite4” “aquamarine”
[9] “aquamarine1” “aquamarine2” “aquamarine3” “aquamarine4”
[13] “azure” “azure1” “azure2” “azure3”
[17] “azure4” “beige” “bisque” “bisque1”
[21] “bisque2” “bisque3” “bisque4” “black”
[25] “blanchedalmond” “blue” “blue1” “blue2”
[29] “blue3” “blue4” “blueviolet” “brown”
[33] “brown1” “brown2” “brown3” “brown4”
[37] “burlywood” “burlywood1” “burlywood2” “burlywood3”
[41] “burlywood4” “cadetblue” “cadetblue1” “cadetblue2”
[45] “cadetblue3” “cadetblue4” “chartreuse” “chartreuse1”
[49] “chartreuse2” “chartreuse3” “chartreuse4” “chocolate”
[53] “chocolate1” “chocolate2” “chocolate3” “chocolate4”
[57] “coral” “coral1” “coral2” “coral3”
[61] “coral4” “cornflowerblue” “cornsilk” “cornsilk1”
[65] “cornsilk2” “cornsilk3” “cornsilk4” “cyan”
[69] “cyan1” “cyan2” “cyan3” “cyan4”
[73] “darkblue” “darkcyan” “darkgoldenrod” “darkgoldenrod1”
[77] “darkgoldenrod2” “darkgoldenrod3” “darkgoldenrod4” “darkgray”
[81] “darkgreen” “darkgrey” “darkkhaki” “darkmagenta”
[85] “darkolivegreen” “darkolivegreen1” “darkolivegreen2” “darkolivegreen3”
[89] “darkolivegreen4” “darkorange” “darkorange1” “darkorange2”
[93] “darkorange3” “darkorange4” “darkorchid” “darkorchid1”
[97] “darkorchid2” “darkorchid3” “darkorchid4” “darkred”
[101] “darksalmon” “darkseagreen” “darkseagreen1” “darkseagreen2”
[105] “darkseagreen3” “darkseagreen4” “darkslateblue” “darkslategray”
[109] “darkslategray1” “darkslategray2” “darkslategray3” “darkslategray4”
[113] “darkslategrey” “darkturquoise” “darkviolet” “deeppink”
[117] “deeppink1” “deeppink2” “deeppink3” “deeppink4”
[121] “deepskyblue” “deepskyblue1” “deepskyblue2” “deepskyblue3”
[125] “deepskyblue4” “dimgray” “dimgrey” “dodgerblue”
[129] “dodgerblue1” “dodgerblue2” “dodgerblue3” “dodgerblue4”
[133] “firebrick” “firebrick1” “firebrick2” “firebrick3”
[137] “firebrick4” “floralwhite” “forestgreen” “gainsboro”
[141] “ghostwhite” “gold” “gold1” “gold2”
[145] “gold3” “gold4” “goldenrod” “goldenrod1”
[149] “goldenrod2” “goldenrod3” “goldenrod4” “gray”
[153] “gray0” “gray1” “gray2” “gray3”
[157] “gray4” “gray5” “gray6” “gray7”
[161] “gray8” “gray9” “gray10” “gray11”
[165] “gray12” “gray13” “gray14” “gray15”
[169] “gray16” “gray17” “gray18” “gray19”
[173] “gray20” “gray21” “gray22” “gray23”
[177] “gray24” “gray25” “gray26” “gray27”
[181] “gray28” “gray29” “gray30” “gray31”
[185] “gray32” “gray33” “gray34” “gray35”
[189] “gray36” “gray37” “gray38” “gray39”
[193] “gray40” “gray41” “gray42” “gray43”
[197] “gray44” “gray45” “gray46” “gray47”
[201] “gray48” “gray49” “gray50” “gray51”
[205] “gray52” “gray53” “gray54” “gray55”
[209] “gray56” “gray57” “gray58” “gray59”
[213] “gray60” “gray61” “gray62” “gray63”
[217] “gray64” “gray65” “gray66” “gray67”
[221] “gray68” “gray69” “gray70” “gray71”
[225] “gray72” “gray73” “gray74” “gray75”
[229] “gray76” “gray77” “gray78” “gray79”
[233] “gray80” “gray81” “gray82” “gray83”
[237] “gray84” “gray85” “gray86” “gray87”
[241] “gray88” “gray89” “gray90” “gray91”
[245] “gray92” “gray93” “gray94” “gray95”
[249] “gray96” “gray97” “gray98” “gray99”
[253] “gray100” “green” “green1” “green2”
[257] “green3” “green4” “greenyellow” “grey”
[261] “grey0” “grey1” “grey2” “grey3”
[265] “grey4” “grey5” “grey6” “grey7”
[269] “grey8” “grey9” “grey10” “grey11”
[273] “grey12” “grey13” “grey14” “grey15”
[277] “grey16” “grey17” “grey18” “grey19”
[281] “grey20” “grey21” “grey22” “grey23”
[285] “grey24” “grey25” “grey26” “grey27”
[289] “grey28” “grey29” “grey30” “grey31”
[293] “grey32” “grey33” “grey34” “grey35”
[297] “grey36” “grey37” “grey38” “grey39”
[301] “grey40” “grey41” “grey42” “grey43”
[305] “grey44” “grey45” “grey46” “grey47”
[309] “grey48” “grey49” “grey50” “grey51”
[313] “grey52” “grey53” “grey54” “grey55”
[317] “grey56” “grey57” “grey58” “grey59”
[321] “grey60” “grey61” “grey62” “grey63”
[325] “grey64” “grey65” “grey66” “grey67”
[329] “grey68” “grey69” “grey70” “grey71”
[333] “grey72” “grey73” “grey74” “grey75”
[337] “grey76” “grey77” “grey78” “grey79”
[341] “grey80” “grey81” “grey82” “grey83”
[345] “grey84” “grey85” “grey86” “grey87”
[349] “grey88” “grey89” “grey90” “grey91”
[353] “grey92” “grey93” “grey94” “grey95”
[357] “grey96” “grey97” “grey98” “grey99”
[361] “grey100” “honeydew” “honeydew1” “honeydew2”
[365] “honeydew3” “honeydew4” “hotpink” “hotpink1”
[369] “hotpink2” “hotpink3” “hotpink4” “indianred”
[373] “indianred1” “indianred2” “indianred3” “indianred4”
[377] “ivory” “ivory1” “ivory2” “ivory3”
[381] “ivory4” “khaki” “khaki1” “khaki2”
[385] “khaki3” “khaki4” “lavender” “lavenderblush”
[389] “lavenderblush1” “lavenderblush2” “lavenderblush3” “lavenderblush4”
[393] “lawngreen” “lemonchiffon” “lemonchiffon1” “lemonchiffon2”
[397] “lemonchiffon3” “lemonchiffon4” “lightblue” “lightblue1”
[401] “lightblue2” “lightblue3” “lightblue4” “lightcoral”
[405] “lightcyan” “lightcyan1” “lightcyan2” “lightcyan3”
[409] “lightcyan4” “lightgoldenrod” “lightgoldenrod1” “lightgoldenrod2”
[413] “lightgoldenrod3” “lightgoldenrod4” “lightgoldenrodyellow” “lightgray”
[417] “lightgreen” “lightgrey” “lightpink” “lightpink1”
[421] “lightpink2” “lightpink3” “lightpink4” “lightsalmon”
[425] “lightsalmon1” “lightsalmon2” “lightsalmon3” “lightsalmon4”
[429] “lightseagreen” “lightskyblue” “lightskyblue1” “lightskyblue2”
[433] “lightskyblue3” “lightskyblue4” “lightslateblue” “lightslategray”
[437] “lightslategrey” “lightsteelblue” “lightsteelblue1” “lightsteelblue2”
[441] “lightsteelblue3” “lightsteelblue4” “lightyellow” “lightyellow1”
[445] “lightyellow2” “lightyellow3” “lightyellow4” “limegreen”
[449] “linen” “magenta” “magenta1” “magenta2”
[453] “magenta3” “magenta4” “maroon” “maroon1”
[457] “maroon2” “maroon3” “maroon4” “mediumaquamarine”
[461] “mediumblue” “mediumorchid” “mediumorchid1” “mediumorchid2”
[465] “mediumorchid3” “mediumorchid4” “mediumpurple” “mediumpurple1”
[469] “mediumpurple2” “mediumpurple3” “mediumpurple4” “mediumseagreen”
[473] “mediumslateblue” “mediumspringgreen” “mediumturquoise” “mediumvioletred”
[477] “midnightblue” “mintcream” “mistyrose” “mistyrose1”
[481] “mistyrose2” “mistyrose3” “mistyrose4” “moccasin”
[485] “navajowhite” “navajowhite1” “navajowhite2” “navajowhite3”
[489] “navajowhite4” “navy” “navyblue” “oldlace”
[493] “olivedrab” “olivedrab1” “olivedrab2” “olivedrab3”
[497] “olivedrab4” “orange” “orange1” “orange2”
[501] “orange3” “orange4” “orangered” “orangered1”
[505] “orangered2” “orangered3” “orangered4” “orchid”
[509] “orchid1” “orchid2” “orchid3” “orchid4”
[513] “palegoldenrod” “palegreen” “palegreen1” “palegreen2”
[517] “palegreen3” “palegreen4” “paleturquoise” “paleturquoise1”
[521] “paleturquoise2” “paleturquoise3” “paleturquoise4” “palevioletred”
[525] “palevioletred1” “palevioletred2” “palevioletred3” “palevioletred4”
[529] “papayawhip” “peachpuff” “peachpuff1” “peachpuff2”
[533] “peachpuff3” “peachpuff4” “peru” “pink”
[537] “pink1” “pink2” “pink3” “pink4”
[541] “plum” “plum1” “plum2” “plum3”
[545] “plum4” “powderblue” “purple” “purple1”
[549] “purple2” “purple3” “purple4” “red”
[553] “red1” “red2” “red3” “red4”
[557] “rosybrown” “rosybrown1” “rosybrown2” “rosybrown3”
[561] “rosybrown4” “royalblue” “royalblue1” “royalblue2”
[565] “royalblue3” “royalblue4” “saddlebrown” “salmon”
[569] “salmon1” “salmon2” “salmon3” “salmon4”
[573] “sandybrown” “seagreen” “seagreen1” “seagreen2”
[577] “seagreen3” “seagreen4” “seashell” “seashell1”
[581] “seashell2” “seashell3” “seashell4” “sienna”
[585] “sienna1” “sienna2” “sienna3” “sienna4”
[589] “skyblue” “skyblue1” “skyblue2” “skyblue3”
[593] “skyblue4” “slateblue” “slateblue1” “slateblue2”
[597] “slateblue3” “slateblue4” “slategray” “slategray1”
[601] “slategray2” “slategray3” “slategray4” “slategrey”
[605] “snow” “snow1” “snow2” “snow3”
[609] “snow4” “springgreen” “springgreen1” “springgreen2”
[613] “springgreen3” “springgreen4” “steelblue” “steelblue1”
[617] “steelblue2” “steelblue3” “steelblue4” “tan”
[621] “tan1” “tan2” “tan3” “tan4”
[625] “thistle” “thistle1” “thistle2” “thistle3”
[629] “thistle4” “tomato” “tomato1” “tomato2”
[633] “tomato3” “tomato4” “turquoise” “turquoise1”
[637] “turquoise2” “turquoise3” “turquoise4” “violet”
[641] “violetred” “violetred1” “violetred2” “violetred3”
[645] “violetred4” “wheat” “wheat1” “wheat2”
[649] “wheat3” “wheat4” “whitesmoke” “yellow”
[653] “yellow1” “yellow2” “yellow3” “yellow4”
[657] “yellowgreen”
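Rather than counting by eye, the list can be searched. This is a small sketch using grep on the result of colors(); the regular expression is my own and only matches the numbered gray names.

```r
all_colors <- colors()

# Names that are "gray" followed only by digits: gray0 through gray100
numbered_grays <- grep("^gray[0-9]+$", all_colors, value = TRUE)

print(length(all_colors))
print(length(numbered_grays))

# Look up the red, green, and blue values of a named color
print(col2rgb("chocolate4"))
print(col2rgb("lavenderblush4"))
```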

Applied Predictive Modeling

Applied Predictive Modeling by Max Kuhn and Kjell Johnson, published by Springer, 2013

This is such a good book that it has taken me a while to work through it. All the while I kept finding examples of why people should read the book.

The summary in 2.3 does a good job of explaining why this subject is so important. It is easy to pick a model but hard to get it right, with reliable, trustworthy results.

I was asked what models were in the book. All the commonly used ones, like K-Nearest Neighbors, plus regression models like Multivariate Adaptive Regression Splines and Cubist regression trees.

Classification models include Nearest Shrunken Centroids and nonlinear classification models.

The examples are well thought out, with R packages and example code.

Take your time and work through this book.

 

 

 

Guide to Programming and Algorithms using R


Good book of useful algorithms programmed using R.

Written by Ozgur Ergul, published by Springer Publishing 2013.

I like how the book starts with cooking an omelette as an example of algorithm development.

Section 3.2.3 covers the Towers of Hanoi with detailed instructions and R code.

Section 4.5.1 is about the Traveling Salesman problem.

Chapter 6 has various sorting algorithms: bubble sort, insertion sort, and quick sort. Table 6.1 is a comparison of the sorting methods.
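For a taste of the problem, here is my own short recursive sketch of the Towers of Hanoi in R. It is not the code from the book, and the function name and peg labels are made up.

```r
# Return the list of moves needed to shift n disks from one peg to another
hanoi <- function(n, from = "A", to = "C", via = "B") {
  if (n == 0) {
    return(character(0))
  }
  c(
    hanoi(n - 1, from, via, to),                      # clear the way
    paste("Move disk", n, "from", from, "to", to),    # move the largest disk
    hanoi(n - 1, via, to, from)                       # stack the rest on top
  )
}

moves <- hanoi(3)
print(moves)
print(length(moves))  # 2^3 - 1 = 7 moves
```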
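Bubble sort is the simplest of the three to write down. This is my own sketch in R, not the version from the book.

```r
# Bubble sort: repeatedly swap neighbors that are in the wrong order
bubble_sort <- function(v) {
  n <- length(v)
  if (n < 2) {
    return(v)
  }
  for (i in 1:(n - 1)) {
    swapped <- FALSE
    # After pass i, the last i elements are already in place
    for (j in 1:(n - i)) {
      if (v[j] > v[j + 1]) {
        tmp <- v[j]
        v[j] <- v[j + 1]
        v[j + 1] <- tmp
        swapped <- TRUE
      }
    }
    # Stop early when a full pass makes no swaps
    if (!swapped) {
      break
    }
  }
  v
}

sorted <- bubble_sort(c(5, 2, 9, 1))
print(sorted)
```

For real work the built-in sort function is the better choice; writing the algorithm out is for learning how it works.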

Now I don’t have to figure out how to turn Java code into R code.

Chapter 7 covers solving linear systems of equations: Gaussian elimination, LU factorization, pivoting, Cholesky factorization, and Gauss-Jordan elimination.

The book goes on with more useful information on file processing.
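The book builds these methods by hand. For comparison, here is a sketch that solves a small system with base R's built-in functions, once with solve and once through a Cholesky factorization. The matrix and right-hand side are made up for this example.

```r
# Solve A x = b for a small symmetric positive definite matrix
A <- matrix(c(4, 2,
              2, 3), nrow = 2, byrow = TRUE)
b <- c(10, 8)

# Direct solve
x <- solve(A, b)
print(x)

# Cholesky: chol() returns upper triangular R with t(R) %*% R equal to A
R <- chol(A)
z <- forwardsolve(t(R), b)   # solve t(R) z = b
x_chol <- backsolve(R, z)    # solve R x = z
print(x_chol)
```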