

Tutorial: Basic data processing with R

R can do a lot of amazing things, but to use just about any of R's many features you first need to import your data and get it into the appropriate shape. For R beginners, this "data wrangling" task can be daunting. Fortunately, ComputerWorld's Sharon Machlis has created an in-depth tutorial covering many data preparation tasks, which is well worth working through to get a sense of data handling in R. This 8-page tutorial provides step-by-step instructions in the R language for adding columns to a data set, aggregating data by subgroup, sorting data, and reshaping data (converting "wide" data sets to "long" data sets, and vice versa). Unlike older R tutorials, this guide uses newer contributed R packages (including Hadley Wickham's reshape2 package) for many tasks. That's a good choice: especially at the earlier stages of learning R, it's well worth learning these modern data manipulation tools rather than the more complicated standard R syntax. Check out the full tutorial at the link below.

ComputerWorld: 4 data wrangling tasks in R for advanced beginners
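To give a feel for the four tasks listed above, here is a minimal sketch in R. It is not taken from the tutorial: the `sales` data frame and its column names are invented for illustration, and the sketch sticks to base R functions (`aggregate()`, `order()`, `reshape()`) so that it runs without installing anything. The tutorial itself reaches for contributed packages such as reshape2, whose `melt()` and `dcast()` functions handle the reshaping step.

```r
# Hypothetical sales figures, invented purely for illustration.
sales <- data.frame(
  region = c("East", "East", "West", "West"),
  rep    = c("Ann", "Bob", "Cy", "Dee"),
  q1     = c(10, 20, 30, 40),
  q2     = c(15, 25, 35, 45),
  stringsAsFactors = FALSE
)

# 1. Add a column computed from existing columns.
sales$total <- sales$q1 + sales$q2

# 2. Aggregate by subgroup: sum of the totals within each region.
by_region <- aggregate(total ~ region, data = sales, FUN = sum)

# 3. Sort the rows by total, largest first.
sorted <- sales[order(-sales$total), ]

# 4a. Reshape "wide" to "long": one row per rep per quarter.
long <- reshape(
  sales[, c("region", "rep", "q1", "q2")],
  direction = "long",
  varying   = c("q1", "q2"),   # the wide columns to stack
  v.names   = "sales",         # name of the stacked value column
  timevar   = "quarter",       # name of the column recording the source
  times     = c("q1", "q2"),
  idvar     = "rep"
)

# 4b. Reshape "long" back to "wide": one row per rep again.
wide <- reshape(
  long,
  direction = "wide",
  idvar     = "rep",
  timevar   = "quarter",
  v.names   = "sales"
)

print(by_region)
print(sorted)
print(long)
print(wide)
```

The base `reshape()` call needs several arguments to describe what is a single idea, which is a fair illustration of why the post recommends the contributed packages to beginners.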


More Stories By David Smith

David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire), overseeing the product management of S-PLUS and other statistical and data mining products.

David Smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on Twitter at @RevoDavid.