Linear Mixed-Effects Models Using R by Andrzej Gałecki and Tomasz Burzykowski, published by Springer, is a book that covers a lot of material on linear models in depth.
The book has clear instructions on how to program in R.
Chapter 4 covers model reduction, which compares a null model with an alternative model when the two are nested. Model reduction is a topic that coders need to discuss. I have talked with many people who have put everything into a regression model just because they could.
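Here is a minimal sketch of what comparing nested models can look like in base R. The mtcars data and the choice of predictors are my own assumptions for illustration; they are not taken from the book.

```r
# Null (reduced) model: mpg explained by weight only
fit_null <- lm(mpg ~ wt, data = mtcars)

# Alternative (full) model: adds horsepower and quarter-mile time.
# The null model is nested in it because its terms are a subset.
fit_alt <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# F-test of the null model against the alternative
cmp <- anova(fit_null, fit_alt)
print(cmp)
```

If the test gives no evidence that the extra terms help, the smaller model is the one to keep.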
Section 5.2 has the proper form for model formulas:
R expression ~ term.1 + term.2 + … + term.k
It is nice to see this spelled out so clearly.
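To make the form concrete, here is a small sketch of a formula object in base R. The variable names are borrowed from the built-in mtcars data and are only an illustration, not an example from the book.

```r
# A formula with a response on the left and three terms on the right
f <- mpg ~ wt + hp + qsec

class(f)      # the object is of class "formula"
all.vars(f)   # every variable named in the formula, response first

# The right-hand-side terms, as R parses them
term_labels <- attr(terms(f), "term.labels")
print(term_labels)
```

A formula is just an unevaluated description of the model, so it can be built and inspected before any data is attached.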
Chapter 8 shows how to use the nlme package.
Part Three covers LMs that relax the assumptions of independence and homogeneity of variance. This is a topic that I needed information on.
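As a rough sketch of relaxing those two assumptions with nlme, the code below follows the example on the gls() help page, which uses the Ovary data that ships with the package. It is not the book's own example, and the choice of corAR1 and varPower is only one of many possible structures.

```r
library(nlme)

# Relax independence: AR(1) correlation among observations within each mare
fm1 <- gls(follicles ~ sin(2 * pi * Time) + cos(2 * pi * Time),
           data = Ovary,
           correlation = corAR1(form = ~ 1 | Mare))

# Relax homogeneity of variance as well: variance is a power of the fitted values
fm2 <- update(fm1, weights = varPower())

# Compare the two fits (same fixed effects, different variance structure)
print(anova(fm1, fm2))
```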
This is a good reference book.
Machine Learning for Hackers gets you started using R for machine learning. The book does a good job telling you how to install R and where to find help.
All the code and data for this book are at https://github.com/johnmyleswhite/ML_for_Hackers.git
Sadly there is not an R package.
There are lots of examples of how to explore data using ggplot2. Other packages covered include plyr, which they liken to MapReduce; tm, a text-mining package that shows up alongside the regression material; glmnet and its lambda parameter; and the class package, which provides the k-nearest neighbor algorithm.
There is also good information on how to work with APIs and JSON using RCurl, RJSONIO, and igraph.
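For a feel of the k-nearest neighbor function in the class package, here is a minimal sketch. The iris data, the train/test split, and k = 5 are my own assumptions; the book works through its own data sets.

```r
library(class)

set.seed(1)                          # make the random split repeatable
idx <- sample(nrow(iris), 100)       # 100 rows for training, 50 for testing

train <- iris[idx, 1:4]              # the four numeric measurements
test  <- iris[-idx, 1:4]
train_labels <- iris$Species[idx]

# Classify each test row by a vote among its 5 nearest training rows
pred <- knn(train, test, cl = train_labels, k = 5)

# Share of test rows classified correctly
accuracy <- mean(pred == iris$Species[-idx])
print(accuracy)
```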
This book is written for hackers, people who already know how to code. The theory is found in other books. More detail on specific techniques and R code is in other books. This book is a good starting point for machine learning and R.
I took a break from trying to figure out how to get the data that goes along with the books that I am reading, to read a Springer book:
Fisher, Neyman, and the Creation of Classical Statistics, by Erich L. Lehmann.
The book was a nice break. I enjoyed reading about the human traits of the founders of modern classical statistics. The author put a lot of work into finding and citing the writings of Fisher and Neyman.
I learned that Ronald Aylmer Fisher was a Wrangler, the Cambridge term for a student placed in the top class in the mathematics examinations. I have been puzzled by the term data wrangler, thinking about rodeos and the West. It makes more sense for it to mean the best student, although a lasso might come in handy when fetching data.
It was fun to read about "Silver Jubilee of My Dispute with Fisher" by Neyman. Twenty-five years of arguments. Wow, that is a conflict.
The book ends with a discussion on the irony of Bayesian Inference.
This is a well-done book that I recommend reading. I also think that it would make a great graphic novel.
Recently I updated R to 3.0.1 "Good Sport". I wanted to download a package that wasn't available in 2.4 "toasted marshmallows". The book said that the package works with 2.5. I guess that it works only with 2.5, because it doesn't work in 3.0.1 either.
The point of this post is to remind myself to keep a list of the packages that I am using. When I upgraded, R didn't keep all the packages. At first I was puzzled and surprised. Then I figured out that upgrading into a new folder was part of the problem.
I am going to solve the problem by starting up my other computer, the MacBook, and comparing packages. I try to keep my Windows and Mac R environments the same.
Next time I upgrade I am going to write down a list of packages.
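A minimal sketch of keeping that list with base R follows. The file name and location are hypothetical; I use tempdir() only so the snippet runs anywhere, and in practice the file should live somewhere outside the R installation folder so it survives the upgrade.

```r
# Before upgrading: record the names of all installed packages
pkgs <- rownames(installed.packages())
pkg_file <- file.path(tempdir(), "installed_packages.rds")  # hypothetical path
saveRDS(pkgs, pkg_file)

# After upgrading: read the old list and see what the new install lacks
old_pkgs <- readRDS(pkg_file)
to_install <- setdiff(old_pkgs, rownames(installed.packages()))
print(to_install)

# Uncomment to reinstall whatever is missing from CRAN
# install.packages(to_install)
```

The same saved file can be copied to a second computer to compare its packages against the first.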
I am enjoying reading R for Business Analytics, authored by A Ohri. I like the short interviews with people like Hadley Wickham, author of ggplot2 (ch 5.10), and James Dixon, founder of Pentaho (ch 18.104.22.168).
We discussed Pentaho at a recent Quantified Self Meet-up. After learning about Pentaho, I was pleased to find a section on it in this book.
Along with how to work with R and every current database, cloud service, APIs, and JSON, there is a section on PostgreSQL, my favorite database.
After reading this book I feel more confident about getting data into R.
The information about graphics covers just about everything. Chapter 5 has code for pie charts and Venn diagrams, and even code for a word cloud.
Chapter 6, Building Regression Models, covers multicollinearity and heteroscedasticity, which I don't think are talked about often.
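As a rough illustration of checking for both problems, here is a sketch in base R. It is my own example on the mtcars data, not code from the book. The variance inflation factors and the studentized (Koenker) form of the Breusch-Pagan statistic are computed by hand so that no extra packages are needed.

```r
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)
predictors <- c("wt", "hp", "disp")

# Multicollinearity: variance inflation factor 1 / (1 - R^2), where R^2
# comes from regressing each predictor on the other predictors
vif <- sapply(predictors, function(v) {
  others <- setdiff(predictors, v)
  r2 <- summary(lm(reformulate(others, response = v), data = mtcars))$r.squared
  1 / (1 - r2)
})
print(vif)

# Heteroscedasticity: regress the squared residuals on the predictors;
# n * R^2 is compared with a chi-squared distribution, df = number of predictors
sq_resid <- resid(fit)^2
aux <- lm(sq_resid ~ wt + hp + disp, data = mtcars)
bp_stat <- nrow(mtcars) * summary(aux)$r.squared
bp_p <- pchisq(bp_stat, df = length(predictors), lower.tail = FALSE)
print(c(statistic = bp_stat, p.value = bp_p))
```

Large inflation factors point to predictors that overlap heavily, and a small p-value suggests the residual variance is not constant.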
A note about the code in this book: he uses = as the assignment operator, not <-.
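A small sketch of how the two operators behave, using made-up variable names. At the top level they do the same thing, but inside a function call = names an argument while <- still assigns.

```r
a = 5        # assignment with =
b <- 5       # assignment with <-
same <- identical(a, b)
print(same)

# Inside a call, = only names the argument; no variable called x is created here
m1 <- median(x = 1:10)

# Inside a call, <- assigns y first and then passes its value along
m2 <- median(y <- 1:10)
print(y)
```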
Each chapter has a summary at the end listing all the packages and functions used in the chapter. I am finding this to be a very useful book on business analytics.