I found that when it is hardcoded, sampling almost halts. I didn’t wait until sampling was complete to check convergence. I am using the GP rather freely: it is just another model to predict the true value of canopy. Thus I have 3 estimable parameters: \alpha, \beta, \sigma. While the first two are parameters of the kernel, the last one is a small nugget that is used in some R libraries for fitting GPs. I don’t remember which, but I can find out. Then I add observation error to the true value of canopy. So strictly speaking it is a GP on the true value, which I don’t know, or perhaps a GP with errors in variables.
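To make the setup concrete, here is a minimal numerical sketch of that structure, assuming a squared-exponential kernel with magnitude \alpha and length scale \beta, a small nugget on the diagonal, and separate observation error added on top of the latent "true canopy" values (all names and values here are illustrative, not taken from the actual model):

```python
import numpy as np

def sq_exp_cov(x, alpha, beta, nugget):
    # Squared-exponential kernel with magnitude alpha and length scale beta,
    # plus a small "nugget" on the diagonal for numerical stability.
    d = x[:, None] - x[None, :]
    K = alpha**2 * np.exp(-0.5 * (d / beta)**2)
    return K + nugget**2 * np.eye(len(x))

rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 5)          # e.g. relative years 0..4
K = sq_exp_cov(x, alpha=1.0, beta=1.0, nugget=1e-3)

# latent "true" canopy values, then observation error on top of them
f_true = rng.multivariate_normal(np.zeros(len(x)), K)
sigma_obs = 0.1
y_obs = f_true + rng.normal(0.0, sigma_obs, size=len(x))
```

Note that the nugget and sigma_obs play very similar roles here, which is exactly the identifiability concern raised below.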
If the value at t=1 depends on the value at t=0, and so forth, this is called an autoregressive model.
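For reference, a minimal AR(1) simulation, where each value is a linear function of the previous one plus Gaussian noise (parameter values here are arbitrary):

```python
import numpy as np

def simulate_ar1(alpha, beta, sigma, y0, T, rng):
    # AR(1): each value depends linearly on the previous one plus noise,
    # y_t = alpha + beta * y_{t-1} + eps_t,  eps_t ~ N(0, sigma).
    y = [y0]
    for _ in range(T):
        y.append(alpha + beta * y[-1] + rng.normal(0.0, sigma))
    return np.array(y)

rng = np.random.default_rng(1)
y = simulate_ar1(alpha=0.0, beta=0.8, sigma=0.5, y0=1.0, T=5, rng=rng)
```

With sigma=0 the series decays deterministically toward zero when |beta| < 1, which is the usual stationarity condition.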
Yes, I tried AR(1), but it didn’t go too far. Perhaps the AR slopes should be a GP.
It’s because the model isn’t specified correctly. I was being nice earlier, but the model doesn’t make sense.
I’ve defined how to use GP in time series data above, with time data being one of the X dimensions. I’m happy to show you how to fit an informative GP regression model, but it would be helpful to me if you’d describe your data for me, as I’d like to know more about it before I estimate a model.
Thanks, this is why I wanted to get an opinion from experts, and I appreciate your help. What is your opinion about a GP with errors in variables?
Relative year - the year from when the experiment starts: 0-5
DBH - proxy for tree height, in inches
There are two types of treatment: T and X.
DistU - distance to the nearest untreated tree (in m)
DistT - same, to the nearest tree treated with T
DensU - # of untreated trees within a 100 m radius
DensT - same, with trees treated with T
DistX and DensX - as above, for trees treated with X
EBYear - relative year of treatment with X
I am really happy with the progress.
EDIT: I am trying to run gp_regression_ard.stan and am stuck with gp_exp_quad_cov. I don’t have it in rstan 2.19.2, and it is not in cmdstan 2.21.0 either. I replaced it with GP1.stan (944 Bytes). Will post results.
Thanks I’ll get back with you in a week or so.
If you have a separate term for the observation error, then it probably doesn’t make sense to have a GP with a nugget, because the nugget does the same thing as the observation error. At least if you have a linear link function and a Gaussian error distribution. In models where this isn’t the case, such as a Poisson model, the effects are still somewhat similar, but not identical. Therefore an error term and a nugget together can make the model unidentifiable or very difficult to sample from, depending on the model structure.
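The overlap between nugget and observation error can be seen directly in the Gaussian case: after marginalizing out the latent GP, only the sum of the nugget variance and the observation variance appears in the marginal covariance of y, so the two are not separately identified. A small numerical illustration (hypothetical kernel and values):

```python
import numpy as np

def marginal_cov(K, nugget_sd, obs_sd):
    # With a Gaussian likelihood, marginalizing the latent GP gives
    # Cov(y) = K + (nugget^2 + sigma_obs^2) * I, so only the SUM of the
    # two variances is identified, not each one separately.
    n = K.shape[0]
    return K + (nugget_sd**2 + obs_sd**2) * np.eye(n)

x = np.linspace(0, 4, 5)
d = x[:, None] - x[None, :]
K = np.exp(-0.5 * d**2)              # toy squared-exponential kernel

C1 = marginal_cov(K, nugget_sd=0.3, obs_sd=0.4)
C2 = marginal_cov(K, nugget_sd=0.4, obs_sd=0.3)
C3 = marginal_cov(K, nugget_sd=0.5, obs_sd=0.0)
# All three parameterizations give the identical marginal covariance,
# since 0.3^2 + 0.4^2 = 0.4^2 + 0.3^2 = 0.5^2 = 0.25.
```

This is why the likelihood surface has a ridge along nugget^2 + sigma^2 = const, which HMC can struggle with.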
Not sure what you mean.
I meant that the parameters in AR(1) may be modeled as a GP. Just a thought…
Well, for each tree the change from year to year is governed by covariates that don’t change between years, so the \alpha, \beta in AR(1) should be constant from year to year. But the change is too random to be explainable just by normal noise, so I thought to model the change as a GP. In fact, I tried AR(1) without a lot of success. Frankly, I am looking for a model that makes sense, describes the data well, and makes good predictions for an urban forest. Therefore I am open to various models; unfortunately, I am unable to try all of them…
Really strange. Can you upload your CmdStan version of ‘/stan/src/stan/lang/function_signatures.h’?
It should be there: I added this to the language many months ago, so unless someone’s deleted it, it will be there.
Hi @linas, can you please also tell me, if you remember, whether you cloned CmdStan or installed it from the tarball?
Earlier, you mentioned that when a tree’s canopy starts to die, you say that the tree is dying and you stop taking measurements.

Do you have a measurement of canopy lost?

Do you have a unique identifier for all of the trees? Typically in this case we can use a time-varying survival model, which would lend itself well to using a GP prior. But usually we have identifiers for the trees (or patients). If you don’t, it makes things hard. I could use an ad hoc metric, such as tree height, to determine which tree is which, but this is not ideal, since the trees probably grow each year, and due to measurement error, etc. So I’m a bit crippled without this information.
In summary, canopy lost would be nice, but not crucial if we have unique IDs for the trees. These facts change how I need to model the data.
I think it is my fault. I usually check for syntax errors with stanc_builder in R and my rstan is 2.19.1. Sorry for the false alarm.
No worries! Just have to push to get my software out. More important is the question about the trees data, above.
Also, I notice you’ve scaled the data (relative years, tree heights in inches). It would be beneficial for me to have the data on the original scale (I’ll rescale and unrescale accordingly).
We call it thin=canopy lost. It is a visual observation. Thin is 0, 10, …, 100.
Yes. Survival model would be awesome.
Relative year = 1 through 5
dbh is in inches
distance in m
density is in 1/m^2
EByear is 0 or 1. Large trees are treated every other year; the insecticide is very strong and insects die instantly. Trees < 15" are treated each year with a different insecticide which is not so strong, and the insects just get drunk/lost. EByear is the year the strong treatment is done.
Data is attached. Thanks for looking into it.
data.csv (35.0 KB)
EDIT: Can you please critique why this model is mis-specified?
YT_t\sim multivariate\_normal(YT_{t-1},K(X_t,\alpha,\rho))
Y_t\sim normal(YT_t,\sigma)
where YT_t is the vector of true values of canopy loss for each tree, Y_t is the vector of observed values of canopy loss for each tree, t=1,…,5, YT_0 is an estimable parameter vector, \sigma is the observation error, \alpha is the magnitude of the squared exponential kernel K, and \rho is a vector of length scales, one for each covariate. X_t is the design matrix for year t.
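As written, the model can be simulated forward: draw YT_0, then for each year draw YT_t from a multivariate normal centered at YT_{t-1} with an ARD squared-exponential covariance built from that year’s design matrix, and add observation noise. The following is only a prior-predictive sketch with made-up dimensions and covariates, to make the notation concrete:

```python
import numpy as np

def ard_sq_exp(X, alpha, rho):
    # Squared-exponential kernel over rows of X with one length scale per
    # covariate (ARD): K_ij = alpha^2 * exp(-0.5 * sum_k ((x_ik - x_jk)/rho_k)^2)
    Z = X / rho                       # scale each column by its length scale
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return alpha**2 * np.exp(-0.5 * d2)

rng = np.random.default_rng(2)
n_trees, n_cov, T = 4, 3, 5           # made-up sizes
alpha, sigma = 1.0, 0.1               # made-up parameter values
rho = np.array([1.0, 2.0, 0.5])       # one length scale per covariate

YT = rng.normal(size=n_trees)         # YT_0: latent starting values
Y = []
for t in range(T):
    X_t = rng.normal(size=(n_trees, n_cov))       # hypothetical design matrix
    K = ard_sq_exp(X_t, alpha, rho) + 1e-8 * np.eye(n_trees)  # jitter
    YT = rng.multivariate_normal(YT, K)           # YT_t | YT_{t-1}
    Y.append(YT + rng.normal(0.0, sigma, size=n_trees))       # observed Y_t
Y = np.array(Y)                       # shape (T, n_trees)
```

Simulating from the model like this is also the easiest way to see whether the generated trajectories resemble plausible canopy-loss data at all.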
Great, thank you for all of this additional info. It’s really helping me understand the problem, especially the bit about the weaker insecticide; anything is helpful.
I know this is not the dataset, but it would help if we had the relative day. You can exclude identifying information, but it would be more realistic, if we’re modeling time dependency, to have the day/month. I’ve made the mistake of assuming even spacing when it wasn’t, and it made some silly inferences, so we’d better include this.
Also, I’m on my phone right now. I haven’t looked at the csv yet, but 5 is 5 years ago, 4 is four years ago… etc., correct?
Yeah, so a few things. There are a lot of small details that can go wrong if you’re not careful, so the written model has a few weird things. Usually capital Y is a matrix, and YT would be matrix multiplication, so that’s a notation mistake. Also, generating your covariance matrix, you have K(X_t) as the covariance matrix and (correcting the subscript mistakes, I think you mean) Y_{t-1}, so we’re modeling the past with the covariance of the future, which won’t be possible until we figure out time travel. So that doesn’t make sense.
But an AR model with the covariance of the past I don’t think is unheard of. I think I flipped through INLA to check out some models and (I think) there was something like y_t = f_{t-1} + \eta, so I don’t think that’s unheard of, if that’s what you meant to write down.
But yeah, the second line just doesn’t make sense. Like, we have matrix multiplication with T as the mean function? And what’s the difference between Y and Y_t? No idea. Not to say your idea isn’t possible; the math just isn’t accurate to me, and how it interacts with the other part doesn’t really make sense.
So, sometimes it helps to just write it in words, and then we write the model accordingly.
Ok… and then, if that’s what you meant, and we have the code you posted above, the code totally doesn’t correspond to your model at all. After the code corresponds to your model, we can use the loveliness that is Statistical Computation to realize that our model is not unified with the data-generating process. Indicators that your model is not unified with the generating process are: a) prior predictive checks, b) posterior predictive checks, c) the folk theorem, and d) of course divergences/HMC diagnostics (convergence, effective sample size, etc.).
And I can elaborate on a)–d) if you need; I find the vocab confusing sometimes.
So in summary, for all of this to be fruitful we need:
1. a specific research question and an understanding of the data, its collection, and the process
2. a model that answers the research question and is mathematically accurate
3. to satisfy the Bayesian computation
4. model checking
(3 and 4 are a, b and c.)
Which is cool, because 3–4 can be an indicator if there’s something not quite right in steps 1–2.
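To make item a) concrete: a prior predictive check just means drawing parameters from their priors, simulating data from the model, and asking whether the simulations look like plausible data before conditioning on anything. Here is a toy sketch for a variable like thin (canopy loss on the 0–100 scale), with entirely hypothetical priors:

```python
import numpy as np

def prior_predictive_thin(n_draws, n_obs, rng):
    # Very rough prior predictive check for "thin" (canopy loss, 0..100):
    # draw parameters from their priors, simulate datasets, and look at
    # the range of values the prior implies. These priors are hypothetical.
    sims = []
    for _ in range(n_draws):
        mu = rng.normal(50.0, 20.0)          # prior for mean thin
        sigma = abs(rng.normal(0.0, 10.0))   # prior for spread
        sims.append(rng.normal(mu, sigma, size=n_obs))
    return np.array(sims)

rng = np.random.default_rng(3)
sims = prior_predictive_thin(n_draws=100, n_obs=20, rng=rng)
# If many simulated values fall far outside 0..100, the priors (or the
# normal likelihood itself) conflict with what "thin" can actually be.
frac_outside = np.mean((sims < 0) | (sims > 100))
```

A posterior predictive check is the same comparison, but simulating from parameter draws conditioned on the data instead of the prior.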
EDIT: I’m looking at this now. So EBYear is an indicator of whether the tree was big, and thus treated every year, so if it’s 1 it was treated this year (and also big) and 0 otherwise?
EDIT 2: Along with the exact date of administering the pesticides, do you have the trees’ locations geocoded? That would be very helpful.
EDIT 3: I realize you’ve answered my question about years. But also, I’m assuming treatment was administered at the time you went out into the field to take measurements? How and when did you administer the pesticide?
The date is in August each year.
Thanks a lot for the critique. Yes, my intention was to answer 1–4. The research question is to take a completely new placement of trees and model tree health over time, assuming treatment with the strong insecticide every other year and the weak insecticide every year, and assuming only a fraction of the trees are treated.
Yes, the idea for this formulation came from AR. I have modified the code a bit, but would like to have a model that makes sense. YT and Y are vectors, while K is a kernel (matrix). Sorry, I didn’t follow the convention.
If you could, I would appreciate you leading me through a)–d). Checking mixing visually is good, but I would like to generate all the other key items.
Once more, thanks for your help. I am curious if survival models would be a good choice; I am not an expert. BTW, the normal error may not be realistic: the error is low when thin is close to 0 or 100 and is highest when thin is 50. I have tried a beta distribution, but the data has 0s and 100s. A zero-one-inflated beta, maybe?
You’re not asking the right questions. Should be “what is the objective and what do we want to learn?” And we build models around that.
I’m muting this post and I’ll tag you when I have the first part done, which is a model to predict thin.
First, think about what you would like to know and draft a very concrete statement around that, and we’ll work with that. I’ve been trying to pull information regarding the ecology, but you keep getting distracted by modeling. You’re not thinking about what these models represent. For now, please describe what information we’d like to obtain.
If you draft a research statement, feel free to tag me. No longer than one page, 1–2 sentences per concrete goal.
Thanks for working with me, and giving me such an interesting problem.
Also, guys, I never said statistical computation was easy. Relax.
Glad the problem is interesting.
There are many objectives. One of them is to predict what treatment is necessary to keep the canopy loss of untreated trees under control, i.e., that canopy loss after x years of treatment is under some threshold with high probability (95%). While the data are gathered for sites A/B/C, we want to make predictions for some virtual city. My approach is to build a model of the canopy-loss dynamics and then run Monte Carlo simulations for different tree placements in the city. I am open to ideas. We assume that we treat tall trees with the strong insecticide every second year and small trees with the weak insecticide every year.
1. What % of trees have to be treated in the virtual city to keep the canopy loss of untreated trees under control?
2. What is the critical distance between treated and untreated trees for the treatment to have an effect on untreated trees?
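Once a model of canopy-loss dynamics is fitted, question 1 could be approached by wrapping it in a Monte Carlo loop over treatment fractions. The sketch below uses a purely hypothetical stand-in for the fitted model (the hard-coded numbers are placeholders, to be replaced by posterior predictive draws from the real model):

```python
import numpy as np

def mc_canopy_loss(p_treated, n_trees, n_sims, rng):
    # Hypothetical stand-in for the fitted model: untreated trees' loss
    # shrinks as the overall treated fraction rises. Replace the base-loss
    # and shrinkage numbers with posterior predictive draws.
    losses = []
    for _ in range(n_sims):
        treated = rng.random(n_trees) < p_treated
        base = rng.normal(60.0, 15.0, size=n_trees)           # placeholder
        loss = np.clip(base * (1.0 - 0.7 * p_treated), 0.0, 100.0)
        losses.append(loss[~treated].mean() if (~treated).any() else 0.0)
    return np.array(losses)

rng = np.random.default_rng(4)
losses = mc_canopy_loss(p_treated=0.5, n_trees=200, n_sims=500, rng=rng)
# Monte Carlo estimate of P(mean untreated canopy loss < threshold);
# sweep p_treated to find the smallest fraction with probability >= 0.95.
prob_under = np.mean(losses < 40.0)
```

The same loop structure would answer question 2 by sweeping a distance covariate instead of the treated fraction.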
Thanks for this response and your time invested in me, and thank you for the learning opportunity. I’ll spend my free time working on answering 1 and 2, and I’ll get back with you when I have something useful.