biographr: Overlaying figures on a ggplot scatterplot

This follows on from a post I made on biomathr about graph diameters and eigenvalues (here), but is really a post about getting ggplot to overlay different types of figures on top of each other.

Let's suppose you want to make a 2d scatterplot of some data (which I do). And over the top of that image, you want to put a smaller plot, or a jpeg or some other kind of annotation (which I also want to do).

There's probably a simpler solution than the one I present, which involves using arrangeGrob to approximate the required grid point (I played around with annotation_custom, but I could only add a single graph using that).

First we make a bunch of graphs on 8 vertices that are all connected and are pairwise non-isomorphic (4156 of them, if that's important to you). We are interested in how the diameter, $d$, of the graphs (the max intervertex distance in the graph) varies with the Laplacian eigenvalues of the graphs. Specifically, how the second smallest of these eigenvals varies with diameter. Since we're only interested in connected graphs, distances and diameters are all well defined and $ \lambda_2 > 0$.

Now, we know that $$1 \leq d \leq n-1 $$ for a graph of order $n$, and that $d=1$ for the complete graph only and $d=n-1$ for the path of length n-1 only. Furthermore, from Mohar, we know that $$d \geq 4 / (n \lambda_2)$$ and that $$ d \leq 2 \left \lceil{\frac{\delta_{max} + \lambda_2}{4 \lambda_2} ln(n-1)} \right \rceil, $$ though, for the 4154 graphs that are neither complete nor paths, these constraints may be no better than $2 \leq d \leq n-2$. Here, $\delta_{max}$ is the highest vertex degree in the graph.

For each graph, we can determine the residuals between Mohar's inequalities and the observed diameter (where * is the Mohar-predicted upper-bound):
$$ m_{upper} = \frac{1}{*} (* - d)$$

$$ m_{lower} = \frac{1}{d} (d - \frac{4}{n \lambda_2}) $$

In the following, we evaluate these functions on each graph (other than the path and complete graph) and plot the upper against the lower residual as a scatter plot using ggplot and igraph (and some functions I've described here or in biomathr)

##########
library(plyr) source('scripts/diss_funcs.R')
source('scripts/diss_script.R')

igraph_ggplot <- function(
X,
coord.func = layout.fruchterman.reingold,
offset = 0.5,
add.vertices = TRUE,
add.labels = TRUE
){
# Converts a graph into x/y coords,
# then plots the graph as straight-line joined coords
d.frame <- igraph_to_ggplot_df(X, coord.func)
# Note that if I don't change the offset myself,
# arrangeGrob cuts off distal bits of my graph
x.range <- range(d.frame$x)
y.range <- range(d.frame$y)
g <- ggplot(data = d.frame,
aes(x = x, y = y, group = element.name, label = element.label)
) + geom_line(
) + theme_nothing(
) + xlim(x.range[1] - offset, x.range[2] + offset
) + ylim(y.range[1] - offset, y.range[2] + offset)
if(add.vertices){
g <- g + geom_point(colour = 'darkgoldenrod1', size = 10
) + geom_point(colour = 'white', size = 7)
}
if(add.labels){
g <- g + geom_text(hjust = 0.5,
vjust = 0.5,
aes(colour = element.type))
}
g
}

transparent_theme <- theme(
axis.title.x = element_blank(),
axis.title.y = element_blank(),
axis.text.x = element_blank(),
axis.text.y = element_blank(),
axis.ticks = element_blank(),
panel.grid = element_blank(),
axis.line = element_blank(),
panel.background = element_rect(fill = "transparent",colour = NA),
plot.background = element_rect(fill = "transparent",colour = NA)
)

lambda_seq <- function(g){
ev <- eigen(graph.laplacian(g))$values
ev
}

lambda_2 <- function(g){
rev(lambda_seq(g))[2]
}

g8 <- Filter(function(g){
is.connected(g) && ! diameter(g) %in% c(1, 7)
}, make_all_graphs(8))

# distributions of the diameter and eigenvalue for g8:
g8.details <- do.call('rbind', lapply(g8, function(g){
data.frame(diam = diameter(g), L2 = lambda_2(g))
}))
summary(as.factor(g8.details$diam))
# 2 3 4 5 6
# 1512 2167 419 50 6

# upper and lower bounds on the diameter, as given by Mohar
m_upper <- function(n, lam2, degmax){
2 * ceiling((degmax + lam2) * log(n-1) / (4 * lam2))
}

m_lower <- function(n, lam2){
4 / (n * lam2)
}

# What do these look like for n=8:
# We use a less strict inequality, where degmax is replaced by n-1
inequal.figure <- function(n){
x <- seq(0, n, length.out = 1000)
inequal.df <- data.frame(
x = x,
mohar.low = m_lower(n, x),
mohar.upp = m_upper(n, x, n-1),
trivial.low = 2,
trivial.upp = n-2
)
inequal.df$low <- apply(inequal.df[, c('mohar.low', 'trivial.low')], 1, max)
inequal.df$upp <- apply(inequal.df[, c('mohar.upp', 'trivial.upp')], 1, min)
gp <- ggplot(data = inequal.df) +
geom_line(aes(x = x, y = low), col = 'blue' ) +
geom_line(aes(x = x, y = mohar.low), col = 'blue', linetype = 'dotted') +
geom_line(aes(x = x, y = trivial.low), col = 'blue', linetype = 'dashed') +
geom_line(aes(x = x, y = upp), col = 'red' ) +
geom_line(aes(x = x, y = mohar.upp), col = 'red', linetype = 'dotted') +
geom_line(aes(x = x, y = trivial.upp), col = 'red', linetype = 'dashed') +
xlim(0, n) +
ylim(0, n) +
xlab(expression(lambda[2])) +
ylab("Diameter (bound)")
gp
}

ifig <- inequal.figure(8) + geom_point(
data = g8.details, aes(x = L2, y = diam)
)

# Diameters are only possible if they lie above the solid blue line and below the solid red line (unless complete / path graph)

# There's quite a wide range of $\lambda_2$ values for any given diameter. What do the graphs look like that have diameter d and a range of different values of $lambda_2$

which.median <- function(x){
mid <- ceiling(length(x) / 2) #
order(x)[mid]
}
g8.details$idx <- 1:nrow(g8.details)

ddply(g8.details,
.(diam),
summarize,
L2.min = min(L2),
L2.med = median(L2),
L2.max = max(L2),
idx.min = idx[which.min(L2)],
idx.med = idx[which.median(L2)],
idx.max = idx[which.max(L2)]
)

# diam L2.min L2.med L2.max idx.min idx.med idx.max
# 1 2 1.0000000 2.0000000 6.0000000 3755 4041 4151
# 2 3 0.3542487 0.9467493 2.3542487 162 3039 2828
# 3 4 0.2384428 0.5291319 1.0000000 140 225 1372
# 4 5 0.2022567 0.2672468 0.5107114 63 104 429

# 5 6 0.1667170 0.1930416 0.2354934 7 3 31

graph.imgs <- lapply(
c(3755, 4041, 4151, 140, 225, 1372, 7, 3, 31),
function(idx){
g <- g8[[idx]]
p <- igraph_ggplot(g, add.vertices = FALSE, add.labels = FALSE)
p
})
ag <- arrangeGrob(grobs = graph.imgs, nrow = 3)
grid.newpage()
grid.draw(ag)

##############################################################
Which is great, but suppose we want to plot these graphs over
the top of a scatter plot?...

###############################################################

lam2_lower <- function(n){
# requires n>=5 for calculation
A <- 2 * log(n-1) / (n-4)
2 * (n-1) * A / (1 - 2 * A)
}
x <- seq(5, 5000, 1)
plot(x, lam2_lower(x))

# m_upper_resid <- function(g){
require(igraph)
d <- diameter(g)
deg <- max(degree(g))
L2 <- lambda_2(g)
n <- length(V(g))
fit <- m_upper(n = n, lam2 = L2, degmax = deg)
resid <- (1 / fit) * (fit - d)
stopifnot(resid >= 0)
resid
}

m_lower_resid <- function(g){
require(igraph)
d <- diameter(g)
L2 <- lambda_2(g)
n <- length(V(g))
fit <- m_lower(n = n, lam2 = L2)
resid <- (1 / d) * (d - fit)
stopifnot(resid >= 0)
resid
}

resid.df <- data.frame(
lower = sapply(g8, m_lower_resid),
upper = sapply(g8, m_upper_resid)
)

p <- ggplot(resid.df, aes(x = lower, y = upper)) + geom_point(
) + xlim(0, 1) + ylim(0, 1)
p # plot of residuals for upper and lower bound

# Would prefer to be able to indicate what some example graphs look like though.

# extremes of the residuals:
rows <- unique(unlist(lapply(resid.df, function(col){
c(which.min(col), which.max(col))
})))
resid.df[rows, ]
# lower upper
# 140 0.4757653 0.7777778
# 3466 0.9583333 0.0000000
# 551 0.5493984 0.8000000

graph1 <- arrangeGrob(
igraph_ggplot(g8[[rows[1]]], add.vertices = FALSE, add.labels = FALSE
) + transparent_theme)
graph2 <- arrangeGrob(
igraph_ggplot(g8[[rows[2]]], add.vertices = FALSE, add.labels = FALSE
) + transparent_theme)
empty <- rectGrob(gp = gpar(col = NULL))

# split the graphing region into a 4 * 5 grid
# and put the igraphs in grids that are pretty close to their scatterpoint
overlay.matrix <- matrix(
rep(1, 4 * 5),
nrow = 4, ncol = 5, byrow = TRUE
)
overlay.matrix[1, 3] <- 2
overlay.matrix[4, 5] <- 3
overlay <- arrangeGrob(
grobs = list(empty, graph1, graph2),
layout_matrix = overlay.matrix
)
g <- ggplotGrob(p)
g
# TableGrob (6 x 5) "layout": 8 grobs
# z cells name grob
# 1 0 (1-6,1-5) background rect[plot.background.rect.808]
# 2 3 (3-3,3-3) axis-l absoluteGrob[GRID.absoluteGrob.800]
# 3 1 (4-4,3-3) spacer zeroGrob[NULL]
# 4 2 (3-3,4-4) panel gTree[GRID.gTree.788]
# 5 4 (4-4,4-4) axis-b absoluteGrob[GRID.absoluteGrob.794]
# 6 5 (5-5,4-4) xlab text[axis.title.x.text.802]
# 7 6 (3-3,2-2) ylab text[axis.title.y.text.804]
# 8 7 (2-2,4-4) title text[plot.title.text.806]
# note that 'panel' lies in grid points
# 3-3 (t-b) and 4-4 (l-r)
# this is what we want to 'overlay' overlay onto:
g1 <- gtable_add_grob(
g,
grobs = overlay,
t = 3, l = 4, b = 3, r = 4
)
grid.newpage()
grid.draw(g1)

biographr

Tuesday, 10 May 2016

Overlaying figures on a ggplot scatterplot

No comments:

Post a Comment