Distractions and amusements, with a sandwich and coffee.
With some very smart people, I work on problems in data visualization applied to cancer research and genome analysis. Previously I was involved in fingerprint mapping, system administration, computer security, fashion photography, medical imaging and LHC particle physics. My work is guided by a need to rationalize, make things pretty, combine science with art, mince words, find good questions, help make connections between ideas and explain complicated things. All while exercising snark.
Circos is software that generates circularly composited views of genomic data and annotations.
Figures created by Circos are engaging, pretty and informative.
Circos is particularly suited for visualizing alignments, conservation, and intra- and inter-chromosomal relationships. (presentations on Circos; drawn heavily from Tufte's Visual Display of Quantitative Information)
This image reached the finalist stage at the 2009 National Science Foundation Visualization Challenge.
December 2009 saw the 10th Anniversary of the Genome Sciences Center. Some commemorative swag was handed out, among which was a stainless steel water bottle with the following image.
The image contains a QR Code, a two-dimensional barcode, which encodes the names of all current employees at the Center.
Lexical analysis of the 2008 US Presidential and Vice-Presidential Debates indicates that the candidates' speech patterns (especially those of candidates paired in a debate) are extremely similar, and that the speech of the vice-presidential candidates is less complex than that of the presidential candidates (uniqueness is lower, repetition is higher).
Palin has the longest sentences, Biden repeats himself the most and has the smallest vocabulary, while patterns for Obama and McCain are eerily similar.
Use Atom feeds of candidates' word lists to create Wordles.
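To make the measures above concrete, here is a minimal Python sketch of how sentence length, vocabulary size, uniqueness and repetition could be computed from a transcript. The function name `lexical_summary` and the exact metric definitions are assumptions made for illustration; they are not necessarily the definitions used in the debate analysis.

```python
import re
from collections import Counter


def lexical_summary(text):
    """Return simple speech statistics for a transcript string.

    All metric definitions here are illustrative assumptions.
    """
    # Crude sentence split: break on runs of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    # Words are runs of letters and apostrophes, lower-cased.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        raise ValueError("text contains no words")
    counts = Counter(words)
    return {
        "words": len(words),
        # Vocabulary: number of distinct words.
        "vocabulary": len(counts),
        "mean_sentence_length": len(words) / len(sentences),
        # Uniqueness: distinct words as a fraction of all words spoken.
        "unique_fraction": len(counts) / len(words),
        # Repetition: fraction of spoken words that occur more than once.
        "repeated_fraction": sum(c for c in counts.values() if c > 1) / len(words),
    }


if __name__ == "__main__":
    sample = "We will win. We will not lose. Victory is ours!"
    for name, value in lexical_summary(sample).items():
        print(name, value)
```

Comparing two speakers then amounts to running the function on each transcript and looking at the differences between the resulting numbers.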
carpalx is a keyboard optimizer which rearranges letter positions on a keyboard to minimize typing effort. Discover the magical XBUL keyboard layout, which minimizes the effort of typing English text. Or, if you dare, venture into the land of the disfigured TNWCLR keyboard layout, which makes typing English text excruciatingly painful.
High Dynamic Time Range images (HDTR) are single-frame composites of a set of time-lapse photos.
The bioinformatics Perl workshop offers courses to help you learn Perl and apply it to your work. We have courses on introductory Perl, intermediate Perl, and others. Learn how to use map, grep and sort more efficiently or how to perform data analysis at the command line. The workshop is open to the public (given at the GSC 570 W 7th location) and all slides from each lecture are available online.
clusterpunch is a mini-benchmarker for clusters designed to monitor availability of resources.
portknocking is a network authentication method in which a client establishes a connection to a host which presents no open ports.
color encoding of vectors: Color::TupleEncode, mapping tuples to colors and visually comparing numbers.
short-read sequencing genome coverage tables: tables of read coverage for haploid, diploid and triploid genomes for a given sequencing redundancy.
genome coverage simulator: explore whole genome shotgun statistics.
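For a sense of what such coverage statistics look like, the sketch below uses the standard idealized shotgun model, in which reads land uniformly at random and the depth at any base follows a Poisson distribution with mean equal to the redundancy c. This is an assumption for illustration: the tables and simulator may use a different or more detailed model, and the function names are hypothetical. The split of redundancy across chromosome copies for diploid and triploid genomes is likewise an assumed simplification.

```python
import math


def depth_probability(k, c):
    """Poisson probability that a base is covered by exactly k reads
    when the mean depth (redundancy) is c."""
    return math.exp(-c) * c**k / math.factorial(k)


def fraction_covered(c, min_depth=1):
    """Expected fraction of bases covered by at least min_depth reads."""
    return 1.0 - sum(depth_probability(k, c) for k in range(min_depth))


def per_copy_depth(c, ploidy):
    """Mean depth on each chromosome copy, assuming reads sample the
    copies equally (an illustrative simplification)."""
    return c / ploidy


if __name__ == "__main__":
    for ploidy, label in [(1, "haploid"), (2, "diploid"), (3, "triploid")]:
        for c in (1, 5, 10, 30):
            covered = fraction_covered(per_copy_depth(c, ploidy))
            print(f"{label:9s} redundancy {c:2d}x  per-copy coverage {covered:.4f}")
```

Under this model, a redundancy of 1 is expected to cover about 63% of a haploid genome at least once, since the uncovered fraction is e^(-c).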
Image color summarizer produces statistics about an image's mean/median hue, saturation and intensity values. It's fun to play with and can be (eventually) used to auto-tag images based on color content.
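A minimal sketch of this kind of summary, using only the Python standard library, is shown below. It is not the summarizer's actual code: the function name `summarize_colors` is hypothetical, pixels are assumed to be 8-bit RGB tuples, "intensity" is assumed to mean the V channel of HSV, and the mean hue is computed as a circular mean because hue is an angle.

```python
import colorsys
import math
from statistics import mean, median


def summarize_colors(pixels):
    """Summarize a list of (r, g, b) tuples with components in 0..255."""
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    hues = [h * 360 for h, _, _ in hsv]   # degrees, 0..360
    sats = [s for _, s, _ in hsv]         # 0..1
    vals = [v for _, _, v in hsv]         # 0..1, used here as "intensity"

    # Circular mean: average the unit vectors for each hue angle so that
    # hues near 0 and near 360 (both red) do not average to cyan.
    x = mean(math.cos(math.radians(h)) for h in hues)
    y = mean(math.sin(math.radians(h)) for h in hues)
    mean_hue = math.degrees(math.atan2(y, x)) % 360

    return {
        "mean_hue": mean_hue,
        "median_hue": median(hues),  # plain median; ignores wrap-around
        "mean_saturation": mean(sats),
        "median_saturation": median(sats),
        "mean_intensity": mean(vals),
        "median_intensity": median(vals),
    }


if __name__ == "__main__":
    # Two red pixels and one green pixel.
    print(summarize_colors([(255, 0, 0), (255, 0, 0), (0, 255, 0)]))
```

Statistics like these could then be binned into named color ranges to produce tags, which is one way the auto-tagging idea might be approached.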
My cover design on the 11 April 2022 Cancer Cell issue depicts cellular heterogeneity as a kaleidoscope generated from immunofluorescence staining of the glial and neuronal markers MBP and NeuN (respectively) in a GBM patient-derived explant.
LeBlanc VG et al. Single-cell landscapes of primary glioblastomas and matched explants and cell lines show variable retention of inter- and intratumor heterogeneity (2022) Cancer Cell 40:379–392.E9.
Browse my gallery of cover designs.
My cover design on the 4 April 2022 Nature Biotechnology issue is an impression of a phylogenetic tree of over 200 million sequences.
Konno N et al. Deep distributed computing to reconstruct extremely large lineage trees (2022) Nature Biotechnology 40:566–575.
My cover design on the 17 March 2022 Nature issue depicts the evolutionary properties of sequences at the extremes of the evolvability spectrum.
Vaishnav ED et al. The evolution, evolvability and engineering of gene regulatory DNA (2022) Nature 603:455–463.
Celebrate `\pi` Day (March 14th) and finally hear what you've been missing.
“three one four: a number of notes” is a musical exploration of how we think about mathematics and how we feel about mathematics. It tells stories from the very beginning (314…) to the very (known) end of π (...264) as well as math (Wallis Product) and math jokes (Feynman Point), repetition (nn) and zeroes (null).
The album is scored for solo piano in the style of 20th century classical music – each piece has a distinct personality, drawn from styles of Boulez, Feldman, Glass, Ligeti, Monk, and Satie.
Each piece is accompanied by a piku (or πku), a poem whose syllable count is determined by a specific sequence of digits from π.
Check out art from previous years: 2013 `\pi` Day, 2014 `\pi` Day, 2015 `\pi` Day, 2016 `\pi` Day, 2017 `\pi` Day, 2018 `\pi` Day, 2019 `\pi` Day, 2020 `\pi` Day and 2021 `\pi` Day.
My design appears on the 25 January 2022 PNAS issue.
The cover shows a view of Earth that captures the vision of the Earth BioGenome Project: understanding and conserving genetic diversity on a global scale. Continents from the AuthaGraph projection, which preserves areas and shapes, are represented as a double helix of 32,111 bases. Short sequences of 806 unique species, sequenced as part of EBP-affiliated projects, are mapped onto the double helix of the continent (or ocean) where the species is commonly found. The length of the sequence is the same for each species on a continent (or ocean) and the sequences are separated by short gaps. Individual bases of the sequence are shown as colored dots. Species appear along the path in alphabetical order (by Latin name) and the first base of the first species is identified by a small black triangle.
Lewin HA et al. The Earth BioGenome Project 2020: Starting the clock. (2022) PNAS 119(4) e2115635118.
As part of the COVID Charts series, I fix a muddled and storyless graphic tweeted by Adrian Dix, British Columbia's Health Minister.
I show you how to fix color schemes to make them colorblind-accessible and effective in revealing patterns, how to reduce redundancy in labels (a key but overlooked part of many visualizations) and how to extract a story out of a table to frame the narrative.