TL;DR: Just look at the Gist. Summary: Act I, in which I try and fail. Act II, in which I think I succeed but actually failed without knowing it till I tried to use it. Act III, in which I return to my beginning, ponder the universe, dive deep into the depths of the abyss, and come back with the magic bean that makes everything work.
R doing what R does really, really, really, really, really, really, *R*eally well: visualization. Folks, this might be THE plot to use to visualize distributions of discrete/categorical variables or simultaneous distributions of multiple continuous variables, replacing or at least taking a seat alongside violin plots as the current best approach, IMHO. Source code repository: ggjoy. Example of use (EDIT: this plot style is named after the band Joy Division, because of a similar graphic on one of their album covers).
There are several functions that calculate principal component statistics in R. Two of these are “prcomp()” and “princomp()”. The “prcomp()” function has fewer features, but is numerically more stable than “princomp()”. Both of these functions can be invoked by simply passing in a suitable data frame, in which case all columns will be used:

    pca1 = prcomp(d)
    pca2 = princomp(d)

Alternatively, the columns to be used can be specified using a formula notation:
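The two call styles can be sketched as follows. This is only an illustration under stated assumptions: the excerpt never defines `d`, so the four numeric columns of the built-in `iris` data set stand in for it, and the column names in the formulas come from that stand-in rather than from the original post.

```r
# Assumption: `d` is not defined in the post, so the numeric columns
# of the built-in iris data set are used as a stand-in.
d <- iris[, 1:4]

# Data-frame interface: every column of `d` is used.
pca1 <- prcomp(d)
pca2 <- princomp(d)

# Formula interface: a one-sided formula (no response) names the
# columns to include; `data` says where to look them up.
pca3 <- prcomp(~ Sepal.Length + Sepal.Width, data = d)
pca4 <- princomp(~ Sepal.Length + Sepal.Width, data = d)

# The component standard deviations differ slightly between the two
# functions: prcomp() divides by n - 1, princomp() divides by n.
n <- nrow(d)
ratio <- unname(pca2$sdev) / pca1$sdev

summary(pca1)
```

The stability difference comes from the method used: `prcomp()` works from a singular value decomposition of the centered data, while `princomp()` takes the eigendecomposition of the covariance matrix. Both expect numeric columns, so factor columns have to be dropped or left out of the formula first.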
R

R is like a microwave oven. It is capable of handling a wide range of pre-packaged tasks, but can be frustrating or inappropriate when trying to do even simple things that are outside of its (admittedly vast) library of functions. Ever tried to make toast in a microwave? There has been a push to start using R for simulations and phylogenetic analysis, and I am actually rather ambivalent about this.