# How to Install R on an HPC: A Comedy in T̶w̶o̶ -- NO -- THREE Acts (a.k.a. 'The Longest Day')

TL;DR: Just look at the Gist. Summary: Act I, in which I try and fail. Act II, in which I think I succeed but have actually failed, without knowing it until I try to use it. Act III, in which I return to my beginning, ponder the universe, dive deep into the depths of the abyss, and come back with the magic bean that makes everything work.

# The Rise of the Giants

All whales are large by any measure, but one group in particular, the baleen whales (Mysticeti), is especially large and, interestingly, only became really big relatively recently. Why did they get so big? Ed Yong (on Twitter) writes about the rise of these majestic giants in a series of great articles here and here, based on two separate yet related studies by Slater et al. and Gearty et al.

# Estimate Time for Job Completion (With Progress Updates) When Tar'ing Huge Directories

For the sake of future me, I am recording here the coolest shell trick I’ve learned this year.

(Linux): `tar cf - /folder-with-big-files -P | pv -s $(du -sb /folder-with-big-files | awk '{print $1}') | gzip > big-files.tar.gz`

(OSX): `tar cf - /folder-with-big-files -P | pv -s $(($(du -sk /folder-with-big-files | awk '{print $1}') * 1024)) | gzip > big-files.tar.gz`

with output looking like:

`4.69GB 0:04:50 [16.3MB/s] [==========================> ] 78% ETA 0:01:21`

Requires ‘pv’: https://github.
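The idea behind the trick is to measure the total size up front and then report how much of it has passed through the pipe. As a rough illustration of that idea only, here is a minimal Python sketch that swaps `pv` for the standard-library `tarfile` module. The function names `directory_size` and `tar_with_progress` are hypothetical, the size estimate counts regular files only (so it will not match `du` exactly), and progress is reported per file rather than per byte.

```python
import os
import tarfile


def directory_size(root):
    """Sum the sizes in bytes of all regular files under root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def tar_with_progress(root, archive_path, report=print):
    """Write a gzipped tar of root, calling report() after each file is added.

    Returns (bytes_done, bytes_total). Only regular files are archived,
    so empty directories and symlinks are left out of this sketch.
    """
    root = os.path.abspath(root)
    base = os.path.dirname(root)  # keep the top-level folder name in the archive
    total = directory_size(root)
    done = 0
    with tarfile.open(archive_path, "w:gz") as archive:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if not os.path.isfile(path) or os.path.islink(path):
                    continue
                archive.add(path, arcname=os.path.relpath(path, base))
                done += os.path.getsize(path)
                percent = 100 * done // total if total else 100
                report(f"{done}/{total} bytes ({percent}%)")
    return done, total
```

For genuinely huge directories the shell pipeline is the better tool, since `pv` also gives throughput and an ETA; the sketch is only meant to show where the percentage comes from.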

# The Traveler's Restaurant Process --- A Better Description of the Dirichlet Process for Partitioning Sets

I. "Have Any of These People Ever Been to a Chinese Restaurant?" The Dirichlet process is a stochastic process that can be used to partition a set of elements into a set of subsets. In biological modeling, it is commonly used to assign elements into groups, such as molecular sequence sites into distinct rate categories. Very often, an intuitive explanation as to how it works invokes the "Chinese Restaurant Process"
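For readers who have not met it, the standard "Chinese Restaurant Process" picture can be sketched in a few lines of Python: each new element joins an existing group with probability proportional to that group's current size, or starts a new group with probability proportional to a concentration parameter. This is a minimal sketch of the textbook process, not of the alternative description this post proposes; the function name `chinese_restaurant_partition` and the parameter name `alpha` are illustrative choices.

```python
import random


def chinese_restaurant_partition(n_elements, alpha, rng=None):
    """Partition the elements 0..n_elements-1 into groups ("tables").

    alpha is the concentration parameter and is assumed to be > 0;
    larger values tend to produce more, smaller groups.
    """
    rng = rng or random.Random()
    tables = []
    for element in range(n_elements):
        # Existing tables are weighted by their size; a new table by alpha.
        weights = [len(table) for table in tables] + [alpha]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(tables):
            tables.append([element])  # open a new table
        else:
            tables[choice].append(element)  # join an existing table
    return tables


if __name__ == "__main__":
    # Example: partition 10 sites into rate categories with alpha = 1.0.
    print(chinese_restaurant_partition(10, alpha=1.0, rng=random.Random(42)))
```

Passing a seeded `random.Random` makes a run repeatable, which helps when comparing partitions drawn under different values of `alpha`.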

# 'Joy Plots' -- Great Plot Style for Visualizing Distributions on Discrete/Categorical or Multiple Continuous Variables

R doing what R does really, really, really, really, really, really, *R*eally well: visualization. Folks, this might be THE plot to use to visualize distributions of discrete/categorical variables or simultaneous distributions of multiple continuous variables, replacing or at least taking up a seat alongside the violin plots as the current best approach IMHO. Source code repository: ggjoy Example of use (EDIT: This plot style is named after the “Joy Division”, due to a similar graphic on one of their album covers.

# 'Pre-Columbian Mycobacterial Genomes Reveal Seals As A Source Of New World Human Tuberculosis'

When, in 1994, definitive evidence of tuberculosis in humans was reported from pre-Columbian America, it was startling. Conventional understanding had pegged tuberculosis as part of the new, exotic, and (to immunologically-naive populaces) deadly menagerie of pathogens brought by Europeans over to the Americas. While there were suggestions of pre-Columbian tuberculosis in the Americas, these were based on lesions on bones, which were ambiguous. Unlike previous cases, however, the Chiribaya mummy from 1000-1300 CE in Peru was shown beyond doubt to have been exposed to tuberculosis:

# 'Early Paleocene landbird supports rapid phylogenetic and morphological diversification of crown birds after the K–Pg mass extinction'

A new fossil provides some insight into the critical K–Pg boundary around which most modern bird lineages radiated: doi: 10.1073/pnas.1700188114

# Multispecies Coalescent Species Delimitation: Conflating Populations with Species in the Grey Zone (Evolution 2017 Talk)

Folks! The always fantastic Evolution meetings were a blast. So many great talks, and, perhaps more importantly, great catching up with so many friends, collaborators, and colleagues! I presented a talk on our PNAS paper showing how the Multispecies Coalescent model, when used for “species” delimitation, actually delimits Wright-Fisher populations. Titled “Multispecies Coalescent Species Delimitation: Conflating Populations with Species in the Grey Zone”, the entire talk can be viewed here:

# 'Phylogenomics reveals rapid, simultaneous diversification of three major clades of Gondwanan frogs at the Cretaceous–Paleogene boundary'

Some nice work that ties the timing of the radiation of three independent lineages of frogs, constituting the majority of modern living frogs, to about the time the major groups of dinosaurs took a hit (literally and figuratively!). Compelling and interesting story, with lots of intriguing follow-up questions. A more general article covering the findings is available here. Yan-Jie Feng, David C. Blackburn, Dan Liang, David M. Hillis, David B.

# Solving the 'Could not find all biber source files' Error

Biblatex is a fantastic bibliography/citation manager for LaTeX. It beats the older BibTeX thanks to its much easier customization and configuration. It does, however, have one bug that can be very perplexing to figure out because of the misleading error message that results: “Could not find all biber source files”. At first glance this message seemed straightforward enough to send me poking about the project file structure and build system, checking paths and names.