Eigen-analysis of Linear Model Behavior in R

This post is actually about replicating the figures in Otto and Day: A Biologist’s Guide to Mathematical Modeling in Ecology and Evolution. The figures I’m interested in for this post are Figures 9.1 and 9.2 in the chapter ‘General Solutions and Transformations – Models with Multiple Variables’. However, these figures are actually about analyzing model behavior of linear models of the form. So even though I’m replicating figures, the techniques apply to any linear model of the form:

ModelFormalthough there can be more than two variables and equations. These models can be simplified into matrix format:

mat_explicit

which is equivalent to:

MatForm

The long-term behavior of these models is determined by the eigenvalues and eigenvectors:

transA is a matrix of eigenvectors and D is a diagonal matrix of eigenvalues and n(0) is a vector of starting conditions. In their example, n(0) = [2,1] and

M

Using R, n(t), or the output at each time step, can be calculated using the transform above:

M <- matrix( c(20/33, -2/11, 8/33, 46/33) , ncol=2 )
A <- eigen(M)$vectors
D <- diag(eigen(M)$values)

N <- array( dim=c(11, 2) )

n0 <- matrix( c(2,1) )

N[1,] <- n0

for(i in 2:11){
    N[i,] <- A %*% D^(i-1) %*% solve(A) %*% n0
}

N <- as.data.frame(N)

The dataframe N now contains n(t) for each time step. I had to use D^(i-1) because N[1,] was actually n(0), and N[2,] was D^1, etc. Plotting these results, including the eigenvectors, gives Figure 9.1 from the Otto and Day. I arbitrarily scaled the eigenvectors, but I modified the axes to be as similar to the original figure as possible

a <- sqrt(eigen(M)$values) * 5
ends <- abs(A) %*% diag(a)

par(mar=c(4,4,1,1))

library(shape)

plot(V2 ~ V1, N, bty='l',
 xlab='Species 1', ylab='Species 1',
 pch=16,
 ylim=c(0, 12),
 xlim=c(0, 5),
 yaxs='i',
 xaxs='i')

Arrows(0,0, ends[1,], ends[2,])

text(N[1,], 't = 0', pos=3)
text(N[2,], 't = 1', pos=3)
text(N[11,], 't = 10', pos=3)

text(ends[1,], ends[2,],
    c('Second Eigenvector', 'First Eigenvector'),
    pos=c(2,3))

R_Fig91

This example shows how to use R to examine the behavior of a linear model at multiple time steps. It also shows how the system is dominated by the eigenvector with the largest eigenvalue (see how the points begin to converge and lie entirely along the second eigen vector) NOTE: In my example, I switched the names of the eigenvectors to be consistent with the book. R actually organizes the eigenvectors for you in order of eigenvalue, such that the labels of the eigenvectors in the graph above are actually switched from what R would output.

We can transform the model output n(t) to use the eigenvectors as the coordinate axes:

yt <- array( dim=c(11, 2) )
for(i in 1:nrow(N)){
    yt[i,] <- solve(A) %*% N[i,]
}

par(mar=c(4,4,1,1))
plot(yt, pch=16,
    ylab='First Eigenvector', xlab='Second Eigenvector',
    yaxs='i', xaxs='i',
    ylim=c(0, 2), xlim=c(0, 12), bty='l')

text(yt[1,1], yt[1,2], 't = 0', pos=4)
text(yt[11,1], yt[11,2], 't = 10', pos=3)

which gives

R_Fig92

Again, the axes here are reversed from the book because R automatically organizes the eigenvectors. Also note that transforming n(t) onto the eigenvectors uses the formula:mappingwhich is different from the way we map observations onto principle components (i.e. eigenvectors) during PCA. Actually, in a PCA, the formula above and the traditional formula  (y(t) = n(t)A) are equivalent:

X1 <- rnorm(20, 0, 10)
Y1 <- X1 + rnorm(20, 0, 2)

Z <- cbind(X1, Y1)

cmat <- cor(Z)

A <- eigen(cmat)$vectors

Zt <- Z %*% A

yt <- array(dim=c(20, 2))
for(i in 1:nrow(yt)){
    yt[i,] <- solve(A) %*% Z[i,]
}

round(Zt, 3) == round(yt, 3)

This is not the case for linear models (try it). The difference is that PCA operates on the variance-covariance (or correlation) matrix of a set of observations. Correlation and covariance matrices are symmetrical and thus the inverse of A is the same as the transpose of A (again, try it). For example, n(t)A is equivalent to t(A) t(n(t)) where t( ) is the transpose (I have no idea how to do superscripts in WordPress). However, because in PCA the eigenvectors are from a symmetric matrix, t(A) A^-1, and A^-1 t(n(t)) n(t)A. So the two formulas are identical.

Linear model matrices aren’t necessarily symmetrical and the eigenvectors aren’t necessarily orthogonal (see the first figure). Thus, A^-1 t(n(t)) does NOT equal n(t)A That’s what I’ve figured out so far, anyway (this is the problem with learning statistics before linear algebra, rather than the other way around. I got used to using a formula that’s specific to symmetric matrices and never realized it wouldn’t work elsewhere). Just be sure to use the right mapping formula. When in doubt, use A^-1n(t). That’s my pretty pathetic explanation.

Python in Ecology – Stability Analysis of a Predator-Prey Model

As you may notice, my posts bounce back and forth between pop science, commentary, and programming/statistics/math. I basically blog on whatever I happen to be doing at the moment, or whatever I happen to be thinking about (but not everything I happen to be thinking about, which today involved alot of The NeverEnding Story because my dog kind of looks like Falcor, except cuter, and Spaceballs, because Mel Brooks).

Other that that, I’m working my way through this excellent book on modeling in biology by Sarah P. Otto and Troy Day: A Biologist’s Guide to Mathematical Modeling in Ecology and Evolution. Because I’m cheap/broke, I’m learning to use Python to work through some of the problems in this book, and I figured it might be good to share my experience for other interested ecologists who can’t afford Matlab/Mathematica/Maple or who embrace the open-source philosophy.

Today, I was working through one of their examples on equilibria and stability of predator-prey models (using a Type II functional response, albeit parameterized like a Monod function rather than the traditional Type II response equation). I thought I’d share how I got through the example using Python.

Here are the coupled, nonlinear ODE’s:resource

consumer

The first step is to define the variables and enter the equations. We define the variables as non-negative, although they can take on 0 values. Technically, r can be negative, but this won’t impact the results.

import sympy as sm
R, P, r, K, c, e, a, d = sm.symbols('R, P, r, K, c, e, a, d', negative=False)

res = r*R*(1 - R/K) - P*c*R/(a+R)
cons = e*c*R*P/(a+R) - d*P

print res
print cons

The next step is to define the equalities, setting these equations equal to zero (which is the condition for equilibria).

resEqual = sm.Eq(res, 0)
consEqual = sm.Eq(cons, 0)

Solving these ODEs individually would give us the equilibrium conditions for resources and consumers individually, but not simultaneously. The individual equilibria are null clines, where, for example, resources exhibit no change but predator abundances can still fluctuate. The simultaneous equilibria are the true equilibria, where neither resources nor consumers change. Finding the true equilibria is as simple as solving the system of equations simultaneously for both R and C.

equilibria = sm.solve( (resEqual, consEqual), R, P )
print equilibria

This will return three equilibria, with two elements each. The elements are in the order we specified in sm.solve( ), R then P. So the first equilibrium, (0,0), is R = 0 and P = 0. This is intuitive. The system is at equilibrium when nothing is there. The third equilibrium, skipping the second for now, is (K/2 – a/2 + sqrt((K + a)^2)/2, 0). Sympy is a little too strict and absolute sometimes, but you can verify that this simplifies to (K,0), which is R = K and P = 0. When predators are absent, resources go to carrying capacity. The third equilibrium is (ad/(ce – d), -aer(-Kce + Kd + ad)/(K(ce – d)^2) ). This is the only equilibrium with both predators and resources present.

Next, we want to know some properties about these equilibria. Are they stable? Are they unstable? Since this model is purely symbolic, these questions are actually the same: What are the conditions of stability for these equilibria? We’ll use the Jacobian matrix to answer this question (I’m not covering what the Jacobian matrix is here, the book will tell you that).

First, set up a matrix containing the differential equations. Then, set up a matrix containing the variables you want to differentiate (in this case R and P). It is incredibly important that the equations and variables are entered in the same order, that is the resource equation goes first and the resource variable goes first.

eqMat = sm.Matrix([ res, cons ])
resMat = sm.Matrix([ R, P ])

jacMat = eqMat.jacobian(resMat)

Next, we can substitute our equilibrium values into the Jacobian matrix and get the eigenvalues. These eigenvalues will tell us whether the conditions of stability. Start with the equilibrium R=0, P=0

eq1mat = jacMat.subs([ (R,0), (P,0) ])
print eq1mat

This is a diagonal matrix with the elements r and -d, which means the eigenvalues are r and -d. d is the mortality rate of predators and cannot be negative. r is the population growth rate of resources and can be negative (if the population declines). Equilibrium stability requires that both eigenvalues be negative, meaning that populations will shrink towards the equilibrium. The only time the R=0,P=0 equilibrium is stable is when r < 0, or resources have a negative growth rate. Otherwise, resources will expand away from the origin at rate r.

Next, we examine the R=K, P=0 equilibrium.

eq3mat = jacMat.subs([ (R,K), (P,0) ])
print eq3mat

This is an upper triangle matrix, meaning that the eigenvalues are again the diagonal elements, which are -r and Kce/(K + a) – d. Both eigenvalues are required to be negative for stability. So r must be > 1 (positive population growth rate), otherwise resources will fall away from the equilibrium if perturbed. Also, Kce/(K+a) – d must be < 0, which means Kce/(K+a) < d. The easiest way to interpret this is if c (the encounter rate between predators and prey) and e (the conversion efficiency of prey to predators once consumed) are sufficiently small. That is, if predators encounter prey rarely, and are very bad at making new predators, then predator populations will fail to establish. Otherwise, the equilibrium is unstable.

The final equilibrium is (ad/(ce – d), -aer(-Kce + Kd + ad)/(K(ce – d)^2) ), and this is the most complicated. Rather than type it all out, we’ll just substitute the values in from the saved equilibria.

eq2mat = jacMat.subs([ (R, equilibria[1][0]), (P, equilibria[1][1]) ])

print eq2mat

This is a very complex matrix. Taking the eigenvalues results in a complex mess:

evs = eq2mat.eigenvals()
print evs.keys()[0]
print evs.keys()[1]

NOTE: sympys saves the eigenvalues as a dictionary, so we have to print the dictionary keys to get the eigenvalues. That’s what the evs.keys( ) command is about.

These eigenvalues are incredibly complex, but they boil down to this:

-sqrt(some complicated function) / (2Kce(ce – d))

and

+sqrt(some complicated function) / (2Kce(ce – d))

We know two things. The square root value HAS to be positive. It CAN contain a -1, but that would just make the eigenvalue imaginary. If we assume the square root is positive, we can say that the eigenvalues are

-x / (2Kce(ce – d)) and +x / (2Kce(ce – d))

If the square root IS negative, then the eigenvalues are

-x * i /(2Kce(ce – d)) and +x * i / (2Kce(ce – d))

This makes no difference for stability (but it does make a difference whether the function oscillates or not). Regardless, notice that these eigenvalues have the same denominator, and are opposite in sign. The only way that + x / (2Kce(ce – d)) can be negative is if 2Kce(ce-d) is negative. However, if that were true, then -x / (2Kce(ce – d)) would be positive. Essentially, there’s no way both of these values can be negative, so this equilibrium is inherently unstable for any parameter values.

Let me know if I screwed that up, but I think it’s correct.

The Federal Sequester Slashes My Dissertation

Most of you are familiar with the federal sequester, the automatic across the board spending cuts in discretionary spending meant to help reduce spending deficits and balance the budget. (Ironically the main drivers of deficit, like entitlement programs, were exempt from these cuts, but that’s a different issue). Most of you are also probably familiar with the poster child of the sequester, FAA furloughs, which Congress has just passed a “band-aid” to prevent (whether the FAA furloughs were necessary is, again, an altogether different issue). Unfortunately, research and education programs were hit particularly hard by the sequester.

Image

Background

I’m a graduate student  in Biology at Florida International University. Usually, grad students pay the bills by being teaching assistants (TA). At most university, grad students TA for Fall and Spring semesters and have the summer off to do research (usually unpaid, but we pro-rate our stipends to still get income over the summer). At FIU, TAs are required to teach 3 semesters, Fall, Spring, and Summer. Needless to say, this makes getting  research done immensely difficult, especially for field ecologists who have pressing needs to be elsewhere (Australia, Maryland, in the Everglades, out on the water in Key Largo, there are a few examples of places we need to be other than teaching). Most of us have managed to find external funding (NSF, EPA, or research-assistantships from our advisors).

Image

This is me, not TAing and getting my dissertation going

I have a generous fellowship through FIU that gave me two years free from teaching, which have been very productive. Unfortunately, my fellowship ends in August. Which means I’d have to TA three semesters a year again without obtaining another source of income. Which means my ability research would be shot for at least a year, or until I manage to find other sources of funding.

How The Sequester May Have Hurt My Dissertation

I applied for, and received, a fellowship through the Smithsonian to cover my salary next year, pending a budget resolution, which would free me from having to TA. Unfortunately, the sequester happened, and Congress cut the Smithsonian’s budget by about 6% from 2012 levels. Of that 94%, Congress has actually only given the Smithsonian 75%, making the Smithsonian’s current operating budget for 2013 somewhere around 70% of 2012 levels. My fellowship is in that missing 30%. (At least, this is how it was explained to me by my sponsor at SI. Also of note, the sequester is why the Smithsonian is closing galleries and exhibits [and yes, the link is from Fox News and yes, I did that on purpose]).

Currently, I’m trying to work out a solution with FIU that would allow me to get my research done while I wait on word from the Smithsonian as to whether or not they’ll get the remaining 25% of their budget. It looks like we’ve come up with a few viable alternatives, but nothing is yet in writing. In fact, FIU may not even have a TA position available for me, as my advisor didn’t request one because we thought I had the Smithsonian fellowship. Best case scenario: The Smithsonian gets the missing 25% and manages to fund my fellowship, at least in part. This looking increasingly unlikely. Worst case scenario: The Smithsonian doesn’t get its full budget and FIU doesn’t have a TA spot available. At that point, I become unemployed (or at least unpaid). This is more likely than I’d prefer, although I’m reasonably confident I can avoid that. Next-to-worse case scenario: I have a TA spot, but I have to TA on campus rather than online: My dissertation will be delayed for at least a year and the quality of the work will decline substantially. Most likely scenario: A get a TA spot to TA an online class, which resolves most of these issues. This is, hopefully, the most likely resolution, but frankly, it’s still too uncertain.

Needless to say, the past month has been quite stressful.