Time series and Gaussian process

Yes, trees above 15" received strong insecticide every other year.

Yes, lat/long is recorded. Let me know how you will use it and I will try to provide this info to you.

No, the treatment was applied in June and observation was performed in August. The root cause for dying trees is larvae that eat the tree from inside so tree doesn’t get food. Insects are just producing the larvae. June is chosen for treatment because it is time insects are mating so by reducing insect population we effectively reduce larvae population. August is chosen for observation because it is good to estimate damage from larvae by that time. If you go further it is indirect measurement of larvae population. I think there is a way to think about it as controlling the spread of infectious disease.

In short, in a GP we can include lat/long as two dimensions in the process. Right now, I have density of treated trees in a radius of tree 21, but the radii could overlap, and this is potentially useful.

I still haven’t decided yet how I’ll use this information.

First I I’ll develop a model to predict canopy loss with high accuracy. With this model, we should be able to determine which inputs will keep canopy loss at a minimum. After we get this rolling, I’ll come back and we’ll discuss further.

Thanks.

@linas

A few things. Please expound upon anything, where ever you can. Send me links to references as needed.

  1. can you please send me the lat long, or anything that I can use to model spatial component equivalently? I’m thinking that the pests are more likely to travel to closer trees, so we should definitely include a spatial component. What do you think?

  2. Can you describe anything you know about treatments X,T at a qualitative level (air they administered via a spray, are they painted on the stumps of trees, as much information as possible).

  3. Please describe the pests. Do they fly? How fast do they migrate? What is their species? Is there more than one species? Do the different treatments interact with different pests differently? As much information as possible.

  4. Please describe the trees. I want to know their species, which ones are prone to which pests, how quickly they grow. Are they planted in concrete? Is this an urban environment or forested? What about their soil? Does this soil breed pests quicker? Are there areas of the environment that have different soil? If you have the species of the measure trees, please include their number and species together.

Please answer all of this in detail as much as possible, and I’ll incorporate all of this into the model. I’ll come by later with more questions.

Anything about the environment I should include, please do elaborate on. Again, I’m no ecologist.

But yeah - just so you know what I’m trying to do, from a naive statistical point of view. I’m trying to exploit as many hierarchies as possible. So it could be that we find the pest likes one species of tree, and the pesticide X isn’t working. Something like that. I remember this happening with the emerald ash borer in Michigan (the pests liked ash trees). May be we can find some phenomena like that.

Really, I just need categories but I’m hoping some discussion with spark new ones I can create hierarchies on

I am sorry if I missed some of your questions. Maybe it is a time to switch to non-web browser based email client? Please let me know if I missed something.

Anyway, the insect is https://www.americanforests.org/magazine/article/will-we-kiss-our-ash-goodbye/.

We have 3 ash species: blue, green, and white. The strong treatment is Emamectin Benzoate (labeled as T) every other year, the weak is Xytec (labeled as X) every year.
The application is urban forest. Virtual city with 30% ash trees is described in “Evaluation of potential strategies to SLow Ash Mortality (SLAM) caused by emerald ash borer (Agrilus planipennis): SLAM in an urban forest” by Deborah G. McCullough & Rodrigo J. Mercader.

Data is data.csv (20.7 KB)

Sure, my personal email is andrezapico@yahoo.com. Please send me an email and we can continue.

A few things. In the new dataset, now I don’t have density of treated trees in the area, and the ID’s have switched, so I can’t be sure which “Density” and “Distance” correspond to which tree. This is still potentially useful information I’d like.

Can you post the email discussion here or at least relevant parts and code? This is pretty interesting!

2 Likes

Seconded!

Absolutely. Currently I have a code which is far from perfect but works and gives reasonable predictions (predicted % of trees with thin < 30 matches observed). If interested I can post the code and associate dataset.

1 Like

I think I’ll just post code and data here. Multilevel/hierarchical models do a lot of the heavy lifting. Plots and code i can post soon. First I just did hierarchical based on tree ID which drastically improved prediction (for a linear model. adding higher order terms didn’t help much). But this is not “generalizable” or “extrapolatable” in that this model I couldn’t apply to new trees “out of sample”.

Can you please share it with us?

Unscaled data preprocessdata.csv (37.9 KB)
Stan model GP5year.stan (5.3 KB)
Predictive posterior for each tree (red x - observed, blue or green vertical line - 90% CR of yP) GP5year_6_1_100coverage.pdf (17.4 KB)
Predictive probability of thin<30% GP5year_6_1_100_thin
Sampling statistics (n_eff is not terrific while Rhat is good) GP5year_6_1_100_estimates.txt (2.9 KB)
Data file used for simulation (ID, relativeYear columns are not used, EByear column is not used either since GP is different for each other year to correspond to the treatment schedule with strong insecticide) GP5year.data.r (58.8 KB)

Obviously there is a potential for improvement. There are 4 chains with 1000 warmup and 1000 sampling. Trace is GP5year_6_1_100trace.pdf (338.5 KB)

1 Like

Few additions. Bayesian p-value based on \chi^2 discrepancy is 0.5 and the graphical representation is


It is possible that I messed up but looks like model is pretty good. Any comments?
The convergence/efficiency statistics estimates.txt (3.6 KB)
I would appreciate any critique - I mean it.

1 Like

Hey @linas! Thanks for being patient.

Via email you said you’d be happy to run this on a server. I don’t have access to one right now, so it would be splendid if you could run this for me.

I’m uploading the following:

  1. R notebook (canopy_loss_gp.R)
  2. the Stan code (gp_regression_ard.stan)
  3. raw data file you’ve sent me, who’s name I’ve changed (data_species.csv)

I’ve been very careful massaging the data, and this should only need to be run once.

Once you run the R file, you can run the GP model from command stan with:

./gp_regression_ard sample data file=gp_trees_demo.input.R output file=trees_demo_output.csv.

There shouldn’t be any divergences, but proposals with initially get reject a lot due to the covariance matrix.

What I’m doing is “generating counterfactual predictions from the GP”.

What we’ll get from this model:

  1. A time series of canopy Thin from Tree 1, had it received:
    a. treatment T (T)
    b. treatment X (X)
    c. no treatment (U)

What else would you like to learn?

We can also generate whatever we want. In words (please, no math, just English) can you please describe what you’d like to learn? Some examples would be:

  1. A Map at time T, for all trees untreated (treatment U)
  2. At location L (given in lat, long), what would the Canopy Time Series look like if this were a White Ash as opposed to a Blue Ash.

After that, it’s just a matter of formatting data and waiting patiently.

Possible Extensions

We can do a survival model, but I’ve since forgotten everything I read about survival analysis and I have to freshen up before I make myself look silly, as I’ve done previously in this thread :).

And the documents:

canopy_loss_gp.R (4.3 KB)
gp_regression_ard.stan (2.0 KB)
data.csv (20.7 KB)

Forget this stuff below, I was just doing the “LME” model instead with Stan, but it’s not particularly useful for us, I think since it’s spatial GPs are the way to go.

3 Likes

Thanks a lot for the model. I knew that you have excellent ideas.

Just a quick check. On my server

Gradient evaluation took 2.38 seconds
1000 transitions using 10 leapfrog steps per transition would take 23800 seconds.
Adjust your expectations accordingly!

Bit long…

The question in English.
Generate completely new trees. Treat some with T, treat some with X. Some are green ash, some white ash, some blue ash. Predict thin every year for untreated trees. What it we want - find proportion of treatment with X and T so probability of thin<30 after 5 years for untreated trees.

My concern with this model (as it is right now) is that the same GP is applied for treated and untreated trees. The physical processes are different. I think we can modify the model easily or perhaps having a covariate for year and when T is applied will do.

EDITS: I did some edits…

Totally agree. I have access to powerful GPUs on the server. Unfortunately stan can’t run there because they don’t support open CL. I have a weak one on Linux PC.

@Linas. This model ran with optimization in a few minutes. I’ve put code in the R code to generate posterior predictive for unseen data. All you need to do to compare against unseen data, is write the posterior predictive, as I’ve done in the code I shared with you, and plot the generated canopy time series and compare with the real canopy time series. If this gives you trouble, please let me know I can I write up the code for you.

Do you mean:

Generating Counterfactual Predictions from a Gaussian Process

scenario 1: Generate Thin time-series, when all trees untreated (U):

I have problem with some of the functions in the stan model. I cloned it yesterday but stanc gave me:

A returning function was expected but an undeclared identifier ‘gp_exp_quad_cov’ was supplied

Few thoughts. From physical point of view it makes sense to do train/test this way:
Assume train data is X_{train},y_{train}
Assume test data is X_{test},y_{test}

Then we train GP on full covariance matrix X=X_{train},X_{test} but using only observations y_{train} and considering y_{test} as missing. Then we predict missing test observations.

No, for a testing sample we want everything to be “unseen” by the model. In a GP, we compute the cross covariance matrix between training data and testing data, K_{X_*,X}. The posterior predictive mean function for noisy observations then becomes: \bar{f}_* =K_{X_*,X}(K_{X, X} + \sigma_n^2 I)^{-1}y. For a reference, see Gaussian Processes for Machine Learning, Chapter 2, equation 2.3. The whole book is worth reading.

Can you please share a more verbose output? Was this Rstan? Probably because it hasn’t been added to Rstan yet. This should work in command stan. Another developer, Rok, has just added these functions to the language in the new compiler. Thanks much, Rok.

Anyway - I’ve simplified the problem a bit in order to speed up execution time, I’ll post results on the simplified problem.