Instant PostgreSQL Starter

By Daniel K. Lyons, published by Packt Publishing

http://www.packtpub.com/instant-postgresql-starter/book
I wish that this book had been available when I first started using PostgreSQL. It would have saved me a lot of trouble.
The installation instructions are straightforward. The quick start section has clear SQL instructions.
The "Top 9 features you need to know about" section covers things like properly storing passwords, encryption using pgcrypto, and backup and restore, which are necessary for all databases.

XLConnect

Wow this works sweet. Thank you Six Sigma with R.
I have an Excel worksheet that I need to analyze. Excel files are not always the smoothest thing to read into R.
I just downloaded and used XLConnect. On the first try it did exactly what I wanted.

Dummy code:
library(XLConnect)
wb <- loadWorkbook("toyprob.xls")  # open the workbook
data.toyprob <- readWorksheet(wb, sheet = 3)  # read the third sheet into a data frame
str(data.toyprob)  # check the structure of what was read

In each assignment, the name on the left of <- is the object, and the expression on the right is what gets assigned to it.

R Error Messages

I spent a good part of yesterday trying to figure out what an error message meant. I was trying to draw a classification tree and kept getting an environmental error message. I couldn’t figure out what was wrong. I searched for an answer, only to find nothing useful. Then I remembered about vectorization and turning my data into a data object. I didn’t think I needed it here, since I was following the example exactly. But I did.
Useful information on Data objects is in Six Sigma with R, Emilio Cano, Javier Moguerza and Andres Redchuk; Chapter 2.4.

Useful information on subsetting is in R Cookbook, Paul Teetor; Chapter 5.24

Examples of what worked:
library(rpart)
toycat <- subset(datatoycat, select = c(animal, eye, fur, legs))

toy <- rpart(toycat, method = "class")
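For anyone who wants to try the same steps without my data, here is a minimal sketch. The values in datatoycat are made up for illustration (only the column names come from my example), and I have written the model as an explicit formula instead of passing the data frame directly. The control settings are my own assumption, needed only because the made-up data set is so small.

```r
library(rpart)  # recommended package that ships with standard R installs

# Hypothetical toy data: the values are invented, only the column names
# animal, eye, fur and legs come from the post.
datatoycat <- data.frame(
  animal = factor(c("cat", "cat", "dog", "dog", "bird", "bird", "fish", "fish")),
  eye    = c(2, 2, 2, 2, 2, 2, 2, 2),
  fur    = factor(c("yes", "yes", "yes", "yes", "no", "no", "no", "no")),
  legs   = c(4, 4, 4, 4, 2, 2, 0, 0),
  colour = c("grey", "black", "brown", "white", "red", "blue", "gold", "grey")
)

# Keep only the columns the tree should see.
toycat <- subset(datatoycat, select = c(animal, eye, fur, legs))

# Fit a classification tree. minsplit, minbucket and cp are loosened so that
# a tiny data set can split at all; xval = 0 turns off cross-validation.
toy <- rpart(
  animal ~ eye + fur + legs,
  data    = toycat,
  method  = "class",
  control = rpart.control(minsplit = 2, minbucket = 1, cp = 0, xval = 0)
)

print(toy)

# Predicted class for each row of the training data.
predicted <- predict(toy, toycat, type = "class")
```

With real data the default control settings are usually the better starting point.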

Ignite OSCON

I am presenting at Ignite OSCON 2013. My talk is "Is There a Cat in Here, Data Mining with Toys". I am busy working on my slides. It is difficult to condense data mining into 20 slides in five minutes, but I am having fun doing it. I have lots of great pictures for my slides. The books I have been using for the theory and practice of data mining are The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani and Jerome Friedman (Springer), and one that I now owe the library fines on, Introduction to Algorithms by Thomas Cormen.

Open Source Bridge 2013

I spoke at Open Source Bridge this year. My slides:

Study Design OSB – https://docs.google.com/presentation/d/1DK2Y7SWKyNljORjHY8KaAsU3o72zo52uS9wkRa-gs94/edit?usp=sharing

Open Source Bridge is a great conference. We were sad to see it over. Registration is already open for next year. I hope to see you there. http://opensourcebridge.org/

Instant R Starter, the missing piece

Instant R Starter from Packtpub.com is a book that I wish had been available a while ago. The book has information that I had to dig for when I needed it, such as the special values NA, NaN and Inf: NA is a missing value, NaN is "not a number" and Inf is infinity. It also explains how they are used in R.
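A quick sketch of how the special values behave, using only base R. The vector x is a made-up example.

```r
# A small vector with one missing value (made-up example data).
x <- c(1, NA, 3)

missing_flags <- is.na(x)               # FALSE TRUE FALSE
plain_mean    <- mean(x)                # NA, because one value is missing
clean_mean    <- mean(x, na.rm = TRUE)  # 2, missing value dropped first

not_a_number <- 0 / 0   # NaN
infinity     <- 1 / 0   # Inf

# NaN counts as missing, but NA does not count as NaN.
is.na(NaN)    # TRUE
is.nan(NA)    # FALSE

# is.finite() is FALSE for all three special values.
finite_flags <- is.finite(c(1, Inf, NA, NaN))  # TRUE FALSE FALSE FALSE
```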

It has clear directions on working with vectors and on how vectors can be used as arguments of functions.
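As a small illustration of vectors as function arguments (the legs vector is my own made-up example, not one from the book):

```r
# Number of legs on four hypothetical toys.
legs <- c(4, 2, 0, 4)

doubled    <- legs * 2          # arithmetic applies to every element: 8 4 0 8
roots      <- sqrt(c(1, 4, 9))  # a function applied to a whole vector: 1 2 3
total_legs <- sum(legs)         # a function that reduces a vector to one value: 10
four_only  <- legs[legs == 4]   # a logical vector used to pick elements: 4 4
```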

It is a clear, concise book that covers R programming with code and examples of loops and how to make your own functions. It is the missing piece in my library of R books.
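Here is a minimal sketch of a loop inside a home-made function. The function name count_with_loop and the data are my own inventions for illustration, not taken from the book.

```r
# Count how many elements of `values` equal `target`, using an explicit loop.
count_with_loop <- function(values, target) {
  count <- 0
  for (v in values) {
    if (v == target) {
      count <- count + 1
    }
  }
  count  # the last expression is the return value
}

legs <- c(4, 2, 0, 4)

loop_count   <- count_with_loop(legs, 4)  # 2
vector_count <- sum(legs == 4)            # 2, same answer without a loop
```

The vectorized version is shorter, but writing the loop out once makes it clear what the one-liner is doing.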

Six Sigma with R

Six Sigma with R: Statistical Engineering for Process Improvement
Cano, Emilio L.; Martinez Moguerza, Javier; Redchuk, Andrés
Publication year 2012
ISBN 978-1-4614-3651-5
I am a Six Sigma Black Belt. Six Sigma with R is a straightforward book that seamlessly matches my other Six Sigma books. The R code is understandable and easy to reuse. I am using the book to help me write a talk for Open Source Bridge 2013.
http://opensourcebridge.org/sessions/1127

Is There a Cat in Here?

I did a session at BarCamp Portland 7. I brought a plastic bin of toys and asked the question: is there a cat in here? I talked through and demoed how we would go about answering it. It is very slow to inspect each item and verify whether it is a cat. First, how would we know if we had a cat? We concluded that a cat has four legs, a head and fur. We took samples out of the bin and classified them into groups. I showed different types of classification trees, including a discussion of red-black trees. Members of the group discussed their own big data issues and sorts, like coming up with inspection criteria that allow you to make large cuts at the beginning and never look at that data again. We got through 80% of the toys and concluded that there wasn’t a cat in the bin.
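The idea of making large cuts first can be sketched in R. The toys data frame below is entirely made up, and the three checks are just the four legs, head and fur rule from the session applied one at a time.

```r
# Hypothetical bin of toys; names and values are invented for illustration.
toys <- data.frame(
  name = c("ball", "robot", "plush dog", "plastic horse", "rubber duck", "teddy bear"),
  legs = c(0, 2, 4, 4, 2, 2),
  head = c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE),
  fur  = c(FALSE, FALSE, TRUE, FALSE, FALSE, TRUE)
)

# Cut 1: anything without four legs is set aside and never looked at again.
step1 <- subset(toys, legs == 4)

# Cut 2: of what is left, keep only the toys with fur.
step2 <- subset(step1, fur)

# Cut 3: keep only the toys with a head.
candidates <- subset(step2, head)

print(candidates$name)
```

Anything that survives all three cuts still needs a closer look, because the rule cannot tell a cat from other furry four-legged toys.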