ooh… aah… TreeMap Figures

Lately I’ve been struggling to find a good way to visually represent annotations for genomic loci. Specifically, I have a set of outlier SNP loci that I need to describe, for publication, in a way that is both reductive and hierarchical. More importantly, it has to look awesome, because everybody who reads my paper needs to remember it.

Well, after several days of scratching my head at icicle plots and sunburst plots, I finally found something that is truly pleasing to the eye.

Lucky for me, there is a great new R package called treemap that draws… well… treemaps. The best way to describe a treemap is just to show you one:

[Figure: treemap of gene associations for the outlier SNP loci]

Every color on the plot represents a category, and within each colored area are its sub-categories. In the above example, most of my outlier SNP loci were associated with intron sequences, and a small portion of those were in splice site donor regions (at the 5′ end of an intron). Likewise, the exon box is divided into exon, 3′ untranslated region, and 5′ untranslated region, according to the proportion of SNPs found in those parts of the gene.

The plot effectively conveys the information I need it to, but it also looks like a piece of mid-century modern art. Looking at it makes me feel a bit… groovy.

So, how did I make it? There is a useful online tutorial found at: https://rpubs.com/brandonkopp/creating-a-treemap-in-r

But, since you’re already here I’ll show you specifically what I did.

 

> library(treemap)

> assoc1 <- c(rep("Intron", 2), rep("Exon", 3), "Upstream < 5 kbp", "Downstream < 5 kbp", "Intergenic")

> assoc2 <- c("Intron", "Splice site donor", "Exon", "5′ UTR", "3′ UTR", "Upstream < 5 kbp", "Downstream < 5 kbp", "Intergenic")

> assoc3 <- as.numeric(c(39,3,11,2,4,15,12,14))

What I’ve just done above is make three vectors: two levels of categorical labels and one of quantitative values. You might notice that all the numbers add up to 100. That’s right, they are the percentages of loci found in each category… but raw counts should work just as well, since the box areas are drawn in proportion to the values.
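As a quick sanity check, the percentages can be verified in base R, and raw counts can be converted to percentages if that is what you start from. This is only a sketch: the numbers in `raw_counts` are invented for illustration and are not the real locus counts.

```r
# Percentages from the post, in the same order as assoc2
assoc3 <- c(39, 3, 11, 2, 4, 15, 12, 14)
stopifnot(sum(assoc3) == 100)  # the eight categories cover all loci

# Hypothetical raw counts (made up for illustration only)
raw_counts <- c(78, 6, 22, 4, 8, 30, 24, 28)

# Convert counts to percentages of the total
pct <- 100 * raw_counts / sum(raw_counts)
print(pct)
```

Either `assoc3` or `raw_counts` could be used as the size variable; only the relative sizes matter for the box areas.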

Next I combine these vectors into a data frame and plot.

> assocX <- as.data.frame(cbind(assoc1, assoc2))

> assocX <- cbind(assocX, assoc3)

> names(assocX) <- c("association", "sub-assoc", "proportion")

> assocX
         association          sub-assoc proportion
1             Intron             Intron         39
2             Intron  Splice site donor          3
3               Exon               Exon         11
4               Exon             5′ UTR          2
5               Exon             3′ UTR          4
6   Upstream < 5 kbp   Upstream < 5 kbp         15
7 Downstream < 5 kbp Downstream < 5 kbp         12
8         Intergenic         Intergenic         14
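The same table can probably be built in a single call to `data.frame()`. The two-step `cbind()` above matters because binding all three vectors at once with `cbind()` would coerce the numbers to character. The sketch below is an alternative, not what I actually ran; `assocX2` is a name chosen here to avoid overwriting `assocX`.

```r
assoc1 <- c(rep("Intron", 2), rep("Exon", 3),
            "Upstream < 5 kbp", "Downstream < 5 kbp", "Intergenic")
assoc2 <- c("Intron", "Splice site donor", "Exon", "5′ UTR", "3′ UTR",
            "Upstream < 5 kbp", "Downstream < 5 kbp", "Intergenic")
assoc3 <- c(39, 3, 11, 2, 4, 15, 12, 14)

# check.names = FALSE keeps the non-syntactic column name "sub-assoc" as written;
# stringsAsFactors = FALSE keeps the labels as plain character vectors
assocX2 <- data.frame(association = assoc1,
                      `sub-assoc` = assoc2,
                      proportion  = assoc3,
                      check.names = FALSE,
                      stringsAsFactors = FALSE)

str(assocX2)  # proportion stays numeric, the two label columns stay character
```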

> treemap(assocX, index=c("association","sub-assoc"), vSize="proportion", type="index", palette="Set1", title="Gene Associations", fontsize.title=14, fontsize.labels=14)
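In the call above, the first entry of index defines the large colored boxes and the second defines the tiles nested inside them, so each large box should cover the summed proportion of its tiles. A minimal base R sketch of those totals follows; `top_level` is a name introduced here for illustration, and the data frame is rebuilt with only the two columns it needs.

```r
assocX <- data.frame(
  association = c(rep("Intron", 2), rep("Exon", 3),
                  "Upstream < 5 kbp", "Downstream < 5 kbp", "Intergenic"),
  proportion  = c(39, 3, 11, 2, 4, 15, 12, 14),
  stringsAsFactors = FALSE
)

# Sum the sub-category percentages within each top-level category
top_level <- aggregate(proportion ~ association, data = assocX, FUN = sum)

# Sort so the largest box comes first
top_level <- top_level[order(-top_level$proportion), ]
print(top_level)
```

Intron comes out at 42 (39 + 3) and Exon at 17 (11 + 2 + 4), which matches the relative sizes of the two largest boxes in the figure.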

And that’s it. There’s nothing else to it.
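For a publication figure, one option is to draw the treemap into a PDF graphics device instead of the screen. This is a sketch under assumptions: the file name, the 7 x 5 inch page size, and the `drawn` flag are my own choices for illustration, and the plotting step is skipped if the treemap package is not installed.

```r
assocX <- data.frame(
  association = c(rep("Intron", 2), rep("Exon", 3),
                  "Upstream < 5 kbp", "Downstream < 5 kbp", "Intergenic"),
  `sub-assoc` = c("Intron", "Splice site donor", "Exon", "5′ UTR", "3′ UTR",
                  "Upstream < 5 kbp", "Downstream < 5 kbp", "Intergenic"),
  proportion  = c(39, 3, 11, 2, 4, 15, 12, 14),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical output path; a temporary directory keeps the example tidy
out_file <- file.path(tempdir(), "gene_associations.pdf")
drawn <- FALSE

if (requireNamespace("treemap", quietly = TRUE)) {
  pdf(out_file, width = 7, height = 5)  # open a PDF device (size is an assumption)
  treemap::treemap(assocX, index = c("association", "sub-assoc"),
                   vSize = "proportion", type = "index", palette = "Set1",
                   title = "Gene Associations",
                   fontsize.title = 14, fontsize.labels = 14)
  dev.off()                             # close the device so the file is written
  drawn <- file.exists(out_file)
}
```

A vector format like PDF scales cleanly for print; swapping `pdf()` for another graphics device should work the same way.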

Here’s another example. This time, instead of gene association, I made a treemap of putative gene function.

[Figure: treemap of putative gene function]

Graphs like these are information sugar: people absorb them quickly and enjoy the stimulation. Bold colors and geometric patterns drive information into people’s brains as effectively as any drug. Ergo, I dare say that these treemaps are some of the most effective figures I’ve ever come across.

Expect to see them in virtually all my future publications.

