Introduction

Visualizing a matrix helps us understand our data, especially after ordering and/or clustering. Of course, this only works when the data is small enough to be readable in a figure. This post shows some visualization possibilities without trying to be exhaustive.

First we need a matrix to plot. Let’s create one. It has more structure than real data usually has, so the patterns are easy to see in the plots.

A <- matrix(c(2,5,2,1,0,0,0,0,1,0,
              0,0,0,1,3,5,6,0,0,1,
              0,0,0,2,0,0,1,2,7,2,
              4,6,2,5,1,0,0,1,0,0,
              0,1,0,0,3,5,4,0,0,1,
              0,0,1,0,0,2,0,3,5,7,
              3,1,4,0,1,0,0,0,0,2,
              0,0,0,1,3,4,6,0,0,1), byrow=TRUE, nrow=8, ncol=10)
colnames(A) <- letters[1:10]
rownames(A) <- LETTERS[1:8]
print(A)
##   a b c d e f g h i j
## A 2 5 2 1 0 0 0 0 1 0
## B 0 0 0 1 3 5 6 0 0 1
## C 0 0 0 2 0 0 1 2 7 2
## D 4 6 2 5 1 0 0 1 0 0
## E 0 1 0 0 3 5 4 0 0 1
## F 0 0 1 0 0 2 0 3 5 7
## G 3 1 4 0 1 0 0 0 0 2
## H 0 0 0 1 3 4 6 0 0 1

Basic figures

The first figure shows the matrix without any ordering or clustering. ggplot() needs data in long format, which the melt() function from {reshape2} can easily produce. If you work in rmarkdown, the fig.align="center" chunk option is recommended.

library(reshape2)
library(ggplot2)

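#long format: one row per cell (Var1 = row name, Var2 = column name, value)
longData <- melt(A)
#drop the zero cells so that only non-zero entries are drawn
longData <- longData[longData$value != 0, ]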
ggplot(longData, aes(x = Var2, y = Var1)) + 
  geom_raster(aes(fill=value)) + 
  scale_fill_gradient(low="grey90", high="red") +
  labs(x="letters", y="LETTERS", title="Matrix") +
  theme_bw() + theme(axis.text.x=element_text(size=9, angle=0, vjust=0.3),
                     axis.text.y=element_text(size=9),
                     plot.title=element_text(size=11))

plot of chunk basic_matrix_figure
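
To see what the long format looks like without any extra package, here is a minimal sketch in base R. It assumes that as.data.frame(as.table()) is an acceptable stand-in for melt(); the names A_small and long_small are made up for this illustration.

```r
# A small hypothetical matrix with row and column names
A_small <- matrix(c(2, 0, 1,
                    0, 3, 0), byrow = TRUE, nrow = 2,
                  dimnames = list(c("A", "B"), c("a", "b", "c")))

# as.table() + as.data.frame() gives one row per cell,
# with columns Var1 (row name), Var2 (column name) and value
long_small <- as.data.frame(as.table(A_small), responseName = "value")

# keep only the non-zero cells, as in the post
long_small <- long_small[long_small$value != 0, ]
print(long_small)
```

The column names Var1, Var2 and value match the ones the ggplot() calls in this post rely on.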

Reordering the rows and columns makes the structure clearer. One of the best ways to order a matrix is the seriate() function in the {seriation} package. Let’s see how it works.

library(seriation)

set.seed(2)
o <- seriate(A, method="BEA_TSP")

#with the same longData as earlier
#factor levels must be the row/column names, taken in the seriated order
longData$Var1 <- factor(longData$Var1, levels=rownames(A)[get_order(o, dim=1)])
longData$Var2 <- factor(longData$Var2, levels=colnames(A)[get_order(o, dim=2)])
ggplot(longData, aes(x = Var2, y = Var1)) + 
  geom_raster(aes(fill=value)) + 
  scale_fill_gradient(low="grey90", high="red") +
  labs(x="letters", y="LETTERS", title="Matrix") +
  theme_bw() + theme(axis.text.x=element_text(size=9, angle=0, vjust=0.3),
                     axis.text.y=element_text(size=9),
                     plot.title=element_text(size=11))

plot of chunk ordered_matrix_figure

seriate() chooses its starting point at random, which is why each run of the code can give a different plot. If you want to get the same plot every time, set the seed with set.seed() before calling it.
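
If reproducibility matters more than the quality of the ordering, a deterministic method is another option. The following is only a sketch that swaps in hierarchical clustering from base R (hclust() on dist()) in place of seriate(); it is not the method used above, and the names row_levels and col_levels are made up here.

```r
# The example matrix from this post
A <- matrix(c(2,5,2,1,0,0,0,0,1,0,
              0,0,0,1,3,5,6,0,0,1,
              0,0,0,2,0,0,1,2,7,2,
              4,6,2,5,1,0,0,1,0,0,
              0,1,0,0,3,5,4,0,0,1,
              0,0,1,0,0,2,0,3,5,7,
              3,1,4,0,1,0,0,0,0,2,
              0,0,0,1,3,4,6,0,0,1), byrow = TRUE, nrow = 8, ncol = 10)
colnames(A) <- letters[1:10]
rownames(A) <- LETTERS[1:8]

# Hierarchical clustering of the rows and of the columns
# (Euclidean distance and complete linkage, the defaults of dist() and hclust())
row_order <- hclust(dist(A))$order
col_order <- hclust(dist(t(A)))$order

# Names in the new order; these can serve as factor levels for Var1 and Var2
row_levels <- rownames(A)[row_order]
col_levels <- colnames(A)[col_order]
print(A[row_levels, col_levels])
```

No random numbers are involved, so the result does not depend on the seed.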

Clustered matrix

Some structure is visible in the seriated matrix. Let’s suppose that this matrix represents a bipartite graph whose nodes can be clustered, for example with the Louvain method implemented in the {igraph} package. This post focuses on the graphics, so the graph theory behind it isn’t explained.

Our aim is to give the same colour to elements that belong to the same cluster.

First, cluster the matrix elements.

library(igraph)

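#build a bipartite graph from A, used as a weighted incidence (biadjacency) matrix
#graph.incidence() is the older name of this function; newer igraph releases call it graph_from_biadjacency_matrix()
g <- graph_from_incidence_matrix(A, weighted = TRUE)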

#clustering with Louvain algorithm
lou <- cluster_louvain(g)
df.lou <- data.frame(lou$names,lou$membership)

After that, join the cluster information to the longData that we want to plot.

library(dplyr)

#the same longData as earlier
longData <- left_join(longData, df.lou, by=c("Var1"="lou.names"))
colnames(longData)[4] <- "Var1_clust"
longData$Var2 <- as.factor(longData$Var2)
longData <- left_join(longData, df.lou, by=c("Var2"="lou.names"))
colnames(longData)[5] <- "Var2_clust"
longData$colour <- ifelse(longData$Var1_clust==longData$Var2_clust, longData$Var1_clust, 0)
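
The colour rule in the last line is easy to check on a tiny example: a cell gets its cluster number only when its row and its column are in the same cluster, and 0 otherwise. The membership vectors below are hypothetical values chosen for the illustration, not the output of cluster_louvain().

```r
# Hypothetical cluster memberships of the row and the column of four cells
row_clust <- c(1, 2, 3, 2)
col_clust <- c(1, 3, 3, 2)

# Same cluster on both sides: keep the cluster id; otherwise use 0 (drawn grey later)
colour <- ifelse(row_clust == col_clust, row_clust, 0)
print(colour)
```

Only the second cell gets 0, because its row (cluster 2) and its column (cluster 3) disagree.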

Fill colours by clusters

Lastly, plot the clustered matrix with all the cluster information.

longData$Var1 <- factor(longData$Var1, levels=unique(arrange(longData, Var1_clust)[,1]))
longData$Var2 <- factor(longData$Var2, levels=unique(arrange(longData, Var2_clust)[,2]))
#levels must be names
longData$colour <- factor(longData$colour)
#colour variables must be factors (discrete scale), otherwise ggplot treats them as continuous
ggplot(longData, aes(x = Var2, y = Var1, fill=colour)) + 
  geom_raster() + 
  scale_fill_manual(values=c("grey80", "#B40404", "#0B6121", "#FFBF00")) +
  labs(x="letters", y="LETTERS", title="Matrix") +
  theme_bw() + theme(axis.text.x=element_text(size=9, angle=0, vjust=0.3),
                     axis.text.y=element_text(size=9),
                     plot.title=element_text(size=11),
                     legend.text=element_text(size=7))

plot of chunk clustered_figure

Colours of axis ticks by clusters

Axis ticks can be coloured as well, but this needs some preparation. The values can also be indicated in the cells. But be careful not to put too much information on one figure.

#use the same longData as earlier

#coloring axes ticks
axis.y.colour <- (longData %>% select(Var1, Var1_clust) %>% unique %>% arrange(Var1_clust) %>% select(Var1_clust))[,1] %>%
  plyr::mapvalues(from=c(1:3), to=c("#B40404", "#0B6121", "#FFBF00"))
axis.x.colour <- (longData %>% select(Var2, Var2_clust) %>% unique %>% arrange(Var2_clust) %>% select(Var2_clust))[,1] %>%
  plyr::mapvalues(from=c(1:3), to=c("#B40404", "#0B6121", "#FFBF00"))
ggplot(longData, aes(x = Var2, y = Var1, fill=colour)) + 
  geom_raster() + 
  scale_fill_manual(values=c("grey80", "#B40404", "#0B6121", "#FFBF00")) +
  labs(x="letters", y="LETTERS", title="Matrix") +
  geom_point(aes(size=value)) +
  theme_bw() + theme(axis.text.x=element_text(size=9, angle=0, vjust=0.3),
                     axis.text.y=element_text(size=9),
                     plot.title=element_text(size=11),
                     legend.text=element_text(size=7)) +
theme(axis.text.y=element_text(colour=axis.y.colour),axis.text.x=element_text(colour=axis.x.colour))

plot of chunk clustered_figure2
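
The mapping from cluster ids to axis colours can also be written without plyr. This is a minimal sketch in base R, assuming the cluster ids are the integers 1 to 3 as above; the vector cluster_ids is a made-up example, not the actual Louvain result.

```r
# Hypothetical cluster ids of five axis labels, already sorted by cluster
cluster_ids <- c(1, 1, 2, 3, 3)

# One colour per cluster; position i holds the colour of cluster i
cluster_palette <- c("#B40404", "#0B6121", "#FFBF00")

# Indexing the palette by cluster id yields one colour per axis label
axis_colours <- cluster_palette[cluster_ids]
print(axis_colours)
```

This only works while the cluster ids are consecutive integers starting at 1; for other labels a named vector would be needed.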


Be happyR! 🙂
