Separating Data in R

I had some messy data to turn tidy: a column of data that needed to be separated into two columns. All the directions I found were obscure and not helpful. Try searching for a regular expression on the web.
One of the things I puzzled over was \\.+ until I found out it was a regular expression pattern, the kind you pass to gsub(), which is much easier to search on. The delimiter was another puzzling thing until I realized that I could treat it the same as when I read CSV files. This is the R code that worked.


tidymessydata <- separate(messydata, State.ZIP, into = c("State", "Zip"), sep = " ")

separate() is a function from the tidyr package.

messydata is the data.frame and State.ZIP is the column that should be two.

into gives the new column names.

sep is the delimiter argument; a space is what the column was separated on. I pressed the space bar between the quotation marks.
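
To try the call on something small, here is a minimal sketch. The three-row messydata below is made up for illustration (only the State.ZIP column name comes from my data), and it assumes the tidyr package is installed.

```r
# separate() comes from the tidyr package
library(tidyr)

# Hypothetical stand-in for the real data: one column holding state and ZIP together
messydata <- data.frame(
  City = c("Portland", "Seattle", "Boise"),
  State.ZIP = c("OR 97201", "WA 98101", "ID 83702"),
  stringsAsFactors = FALSE
)

# Split State.ZIP on the single space into two new columns
tidymessydata <- separate(messydata, State.ZIP, into = c("State", "Zip"), sep = " ")

print(tidymessydata)
```

Zip comes back as character, which keeps any leading zeros. The original State.ZIP column is dropped unless you add remove = FALSE.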

Hopefully this is clearer than what I found for directions.
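
As for the \\.+ that puzzled me, here is a small base R sketch of what that pattern does when handed to gsub(). The strings are invented for illustration. Inside an R string, \\. stands for a literal dot and + means one or more of them. separate() also treats a character sep as a regular expression, so the same pattern could be used there.

```r
# Invented values where the two parts are joined by one or more dots
x <- c("OR.97201", "WA...98101")

# "\\.+" matches a run of one or more literal dots; replace each run with one space
cleaned <- gsub("\\.+", " ", x)

print(cleaned)
```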



The Cox Model and Its Applications

The Cox Model and Its Applications was published in Springer Briefs in Statistics in 2016. It was written by Mikhail Nikulin and Hong-Dar Isaac Wu.

I enjoyed reading this book although it has no code examples. I think I can figure out the code from the precise equations.

The Cox proportional hazards model is a regression model used in survival analysis. It was put forward by Sir David Cox in 1972.
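
Since the book has no code, here is a minimal sketch of fitting a proportional hazards model in R. It is not from the book; it assumes the survival package and its bundled lung data set, and the choice of covariates (age and sex) is only for illustration.

```r
library(survival)

# lung ships with the survival package: time in days, status 1 = censored, 2 = dead
# Surv() pairs each follow-up time with its event indicator, so censoring is handled
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# exp(coef) is the estimated hazard ratio for a one-unit change in each covariate
print(summary(fit)$coefficients)
```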

Chapter 2 covers the basic concepts for models, including classical parametric models and how to handle censored data.

Chapter 3 covers the Cox proportional hazards model, including the tampered failure time model.

Chapter 5 is about Cross-effect Models of Survival Functions.

Section 5.2 covers Parametric Weibull Regression with Heteroscedastic Shape Parameter.

There are lots more models. I recommend reading the book with a card beside you on which you have written, in a way you understand, the definitions and symbols used in the book.


RStudio & GitHub

Last night I learned what step I was missing to use RStudio and GitHub together. When I needed to push code to GitHub I couldn’t get it to work. This worked:

First make a repository on GitHub.

Then copy the SSH URL for cloning the repository.

Open RStudio, make a new project for the repository, go to the Tools menu, choose Version Control, and pick Git.

Next, set up version control for the project.

Paste the SSH clone URL into the RStudio box for GitHub.

Then the rest happens: RStudio is linked to GitHub and you can commit, push, and pull.

I am glad I finally figured this out. Going to user groups is beneficial.

Web Application Development with R using Shiny

Chris Beeley wrote Web Application Development with R using Shiny, second edition, published by Packt Publishing in January 2016.

Shiny's user interface is built on Bootstrap.

This is a good book to read even if you were not planning on using Shiny because it covers a lot about web app development.

I have found R and Shiny to be useful tools for data scientists to communicate with developers. They make great mock-ups.
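
A mock-up can be very small. The sketch below is my own illustration, not code from the book; it assumes the shiny package is installed, and the input and output names (n, hist) are made up.

```r
library(shiny)

# fluidPage() builds a Bootstrap-based page layout
ui <- fluidPage(
  titlePanel("Mock-up"),
  sliderInput("n", "Number of points", min = 10, max = 500, value = 100),
  plotOutput("hist")
)

# The server redraws the histogram whenever the slider changes
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(rnorm(input$n), main = "Random normal sample", xlab = "Value")
  })
}

# Build the app object; call runApp(app) in an interactive session to launch it
app <- shinyApp(ui = ui, server = server)
```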

The code for the book is on

jQuery Essentials

Troy Miles wrote jQuery Essentials published by Packt Publishing 2016. This is a good enough book that twice I started a review of it. The code for this book is available to download on

The book has good coverage of the DOM (Document Object Model). I like the section in Chapter 9 about never modifying the DOM in a loop.

Chapter 8, about separation of concerns, covers unit tests. It tells you how to use events to decouple code and how to break the code into logical units. Separation of concerns is a useful software architecture pattern.

The book covers the key fundamentals of jQuery.

Practical DevOps

Practical DevOps by Joakim Verona, published by Packt Publishing, 2016.

I am taking a DevOps class through Hack Oregon. I found this book useful and recommended it to my class. We are learning how to use Ansible for provisioning, and this book was most helpful. Chapter Seven has code for using Ansible and Docker together. I am working on getting this to work.

Vagrant UP?

I am taking a DevOps class. We are using Vagrant. Saturday I lost my box. I typed vagrant up on the command line in a terminal window and nothing happened. I was thinking it would pop up like web servers do.
The command that I was missing was vagrant ssh. This command opens an SSH (secure shell) session into the virtual machine.

The vagrant provision command runs the provisioners again, which lets you make changes and add things like games to your virtual machine.

Useful vagrant commands:

vagrant up

vagrant provision

vagrant ssh

Python Data Science Essentials

Authors: Alberto Boschetti and Luca Massaron. Published by Packt, April 2015.

I am a Data Scientist who usually codes in R. It was a challenge to get comfortable enough in Python code to review the book. Python comes in a lot of flavors. I used Anaconda Launcher to run Jupyter notebooks. The code is on the publisher's page.

With broad strokes, in six chapters, it covers the fundamentals of Data Science using Python. The pretty blue mosaic tile swirl on the cover catches your eye.

My favorite chapter is Chapter Five, on Social Network Analysis. I like the table on graph types, nodes, and edges. For example, Twitter is a directed graph: people are nodes and follows are edges. It is a very useful table for writing code.

Get the code, run the notebooks, have fun.