Straightforward question:
Are there particular LLMs/AIs that do well with troubleshooting/creating Stan models?
Not sure about best, but I mostly use the paid version of ChatGPT to double-check my models. Even the paid version will still do weird things like put the functions block down at the bottom of the program. It helps to have a really good idea of what the model should roughly look like. A prompt like this:
I have monthly time series data. I want to fit a model to the data and use the model to forecast future values. Let’s call the data y[t]. Here are some things I want to include:
1. Seasonality. If the value in March is high in one year it is probably also high in the following year, and so on for other months. So I want ‘month effects’.
2. Autocorrelation in month effects. The month effect, for a given month of the year, is not iid random. If the March effect is high for several years and then drops a lot, it is more likely to stay low than to jump back up.
3. Local trend. If the value has increased for the past several months it will probably keep increasing, although it might not.
4. Site. I want a site effect since each site is likely to be slightly different.
Please write a Stan program to do this, and include comments that show where each of those four requirements is implemented.
will get you this:
data {
  int<lower=1> T; // Number of time points
  int<lower=1> S; // Number of sites
  vector[T] y; // Observed data
  array[T] int<lower=1, upper=12> month; // Month indicator for each time point
  array[T] int<lower=1, upper=S> site; // Site indicator for each time point
}
parameters {
  real mu; // Overall mean
  real<lower=0> sigma_mu; // Standard deviation of overall mean
  real<lower=0> sigma; // Residual standard deviation
  vector[12] gamma; // Month effects
  vector[S] eta; // Site effects
  vector[T] trend; // Local trend
}
transformed parameters {
  vector[T] y_hat; // Fitted values
  for (t in 1:T) {
    // Incorporate seasonality with month effects
    y_hat[t] = mu + gamma[month[t]] + eta[site[t]] + trend[t];
  }
}
model {
  // Priors for overall mean, month effects, site effects, and residual standard deviation
  mu ~ normal(0, 1);
  sigma_mu ~ normal(0, 1);
  gamma ~ normal(0, 1);
  eta ~ normal(0, 1);
  sigma ~ normal(0, 1);
  // Prior for local trend with autocorrelation
  for (t in 2:T) {
    trend[t] ~ normal(trend[t-1], sigma);
  }
  // Likelihood
  y ~ normal(y_hat, sigma);
}
which is pretty close. But being more specific, and even feeding in a simplified model, helps.
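For anyone checking the output against the prompt: the program has month effects (1), a site effect (4), and a random-walk trend. But the month effects are a single vector[12] with no year-to-year autocorrelation (2), and the trend has no slope component (3). Also, sigma_mu gets a prior but is never used, and the trend innovations share sigma with the residuals. Here is a minimal sketch of how those gaps could be closed. It has not been fitted to real data, and several things are my own assumptions rather than anything in the ChatGPT output: the long data layout, the names N, Y, time_idx, year_idx, level, slope, and the sigma_* scales, and all the prior scales, which are placeholders. It uses the array syntax available from Stan 2.26 onward.

```stan
data {
  int<lower=1> N;                            // number of observations (long format, assumed)
  int<lower=2> T;                            // number of distinct monthly time points
  int<lower=1> Y;                            // number of calendar years covered
  int<lower=1> S;                            // number of sites
  vector[N] y;                               // observed values
  array[N] int<lower=1, upper=T> time_idx;   // time point of each observation
  array[N] int<lower=1, upper=Y> year_idx;   // year of each observation
  array[N] int<lower=1, upper=12> month;     // month of year of each observation
  array[N] int<lower=1, upper=S> site;       // site of each observation
}
parameters {
  vector[T] level;              // local level of the trend
  vector[T] slope;              // local slope of the trend
  array[Y] vector[12] gamma;    // month effects, one vector per year
  vector[S] eta;                // site effects
  real<lower=0> sigma_y;        // residual sd
  real<lower=0> sigma_level;    // sd of level innovations
  real<lower=0> sigma_slope;    // sd of slope innovations
  real<lower=0> sigma_gamma;    // sd of year-to-year change in month effects
  real<lower=0> sigma_eta;      // sd of site effects
}
model {
  // Placeholder half-normal priors on all scales (assumed, not tuned)
  sigma_y ~ normal(0, 1);
  sigma_level ~ normal(0, 1);
  sigma_slope ~ normal(0, 1);
  sigma_gamma ~ normal(0, 1);
  sigma_eta ~ normal(0, 1);

  // Requirement 1: seasonality, month effects in the first year
  gamma[1] ~ normal(0, 1);
  // Requirement 2: each month's effect follows a random walk across years
  for (k in 2:Y) {
    gamma[k] ~ normal(gamma[k - 1], sigma_gamma);
  }

  // Requirement 3: local linear trend (level plus slowly changing slope)
  level[1] ~ normal(0, 5);
  slope[1] ~ normal(0, 1);
  level[2:T] ~ normal(level[1:(T - 1)] + slope[1:(T - 1)], sigma_level);
  slope[2:T] ~ normal(slope[1:(T - 1)], sigma_slope);

  // Requirement 4: site effects
  eta ~ normal(0, sigma_eta);

  // Likelihood
  {
    vector[N] y_hat;
    for (n in 1:N) {
      y_hat[n] = level[time_idx[n]] + gamma[year_idx[n], month[n]] + eta[site[n]];
    }
    y ~ normal(y_hat, sigma_y);
  }
}
```

One caveat with this sketch: level, gamma, and eta can all absorb a constant shift, so the intercept is only weakly identified through the priors. A sum-to-zero constraint on the site or month effects would be a natural next thing to ask the LLM for.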
Hi Ara,
thanks for the response. I am using ChatGPT “pro” as well, and for simple models it does indeed work pretty decently. I guess it is also interesting to see others’ prompts and figure out how the LLM responds to them. That code came out clean.
Let me see if I can dump some of my wandering convos with ChatGPT about Stan somewhere that you can access.