Autoregression on non-linear model parameters in brms

Dear all,

I am trying to fit a model with a Gaussian optimum and ZI Poisson distribution. I’m trying to see whether I can fit autocorrelation between years in said optimum.

Right now, my brm call looks like this:

prior <- 
    prior(gamma(0.01, 0.01), class = "b", nlpar = "Wmax", lb = 0) +
    prior(gamma(0.01, 0.01), class = "b",  nlpar = "omega", lb = 0) +
    prior(normal(0, 10), class = "b", nlpar = "theta") +
    prior(cauchy(0,5), class = "sd", group = "Year", nlpar = "theta") +
    prior(cauchy(0,5), class = "sd", group = "Year", nlpar = "Wmax")

formula <- 
    bf(Fitness ~ Wmax * exp(-(((Pheno.scale - theta) / omega)^2)),
       omega ~ 1,
       theta + Wmax ~ (1|Year),
       nl = TRUE)

mod_paper <- 
    brm(formula = formula,
        data    = data,
        prior   = prior,
        family  = zero_inflated_poisson(link = "identity"),
        warmup  = 500,
        iter    = 1500,
        thin    = 1)

Based on this response from Paul Bürkner, it seems I need to define the auto-correlation process from within a function and perform a call like this:

formula <- 
    bf(Fitness ~ nlfun(Wmax, Pheno.scale, theta, omega),
       omega ~ 1,
       theta + Wmax ~ (1|Year),
       nl = TRUE,
       loop = FALSE)

What I am not sure to understand is how I can put a stochastic process within such a non-linear “formula” call. Would for example the following thing work (setting up sigma as new parameter of course):

theta[n] <- rnorm(1, theta[n-1], sigma)

It fells somehow weird to include such a random call within the non-linear function itself. What would be best practice here?

I would suggest, nlfun to look as follows (in Stan language):

vector nlfun(vector Wmax, vector PS, vector theta, 
             vector omega, vector AR) {
    int N = rows(Wmax);
    vector[N] theta_new = theta;
    vector[N] mu;
    mu[1] = Wmax[1] * exp(-(((PS[1] - theta_new[1]) / omega[1])^2));
    for (n in 2:N) {
      theta_new[n] += AR[n] * theta_new[n-1];
      mu[n] = Wmax[n] * exp(-(((PS[n] - theta_new[n]) / omega[n])^2));
    }
    return mu;
  }

I haven’t tested this so I don’t know if it’s entirely correct, but you should get the idea.

You can then put this into brms via

nlfun <- stanvar(
  scode = "<stancode of nlfun>",
  block = "functions"
)

formula <- 
  bf(Fitness ~ nlfun(Wmax, Pheno.scale, theta, omega, AR),
     omega + AR ~ 1,
     theta + Wmax ~ (1|Year),
     nl = TRUE,
     loop = FALSE)

I will try to implement this next week, thanks! Just to make just I understand the cleverness behind this. The autoregression is deterministic within the nlfun():

theta_new[n] += AR[n] * theta_new[n-1];

but the “noise” is still there by keeping the random effect over years:

theta + Wmax ~ (1|Year)

Am I understanding this correctly?

(also, why is AR a vector here, shouldn’t it be a scalar, i.e. a common slope for all autoregressions?)

I think you understood it correctly.

Due to the way non-linear models are handeled, AR has to be a vector but since you specify AR ~ 1 in the formula, it will be constant across observations (so mathematically a single scalar).

Thank you so much for your help! I’ll try to remember to report here my failures and (hopefully) successes in implementing this.

Using this code:

# Prior distribution
prior <- 
    prior(gamma(0.01, 0.01), class = "b", nlpar = "Wmax", lb = 0) +
    prior(gamma(0.01, 0.01), class = "b",  nlpar = "omega", lb = 0) +
    prior(normal(0, 100), class = "b", nlpar = "theta") +
    prior(normal(0, 100), class = "b", nlpar = "AR") +
    prior(cauchy(0,5), class = "sd", group = "Year", nlpar = "theta") +
    prior(cauchy(0,5), class = "sd", group = "Year", nlpar = "Wmax")

# Initial values
inits <- list(b_Wmax_Intercept  = map(1:4, ~ wmax),
              b_omega_Intercept = map(1:4, ~ omega),
              b_theta_Intercept = map(1:4, ~ theta),
              sd_Intercept_Year_theta   = map(1:4, ~ sd_th),
              sd_Intercept_Year_Wmax    = map(1:4, ~ sd_wmax),
              zi                = map(1:4, ~ est["zi"]))

# Custom code for the model
nlfun <- stanvar(
    scode = "vector nlfun(vector Wmax, vector PS, vector theta, 
             vector omega, vector AR) {
                 int N = rows(Wmax);
                 vector[N] theta_new = theta;
                 vector[N] mu;
                 mu[1] = Wmax[1] * exp(-(((PS[1] - theta_new[1]) / omega[1])^2));
                 for (n in 2:N) {
                    theta_new[n] += AR[n] * theta_new[n-1];
                    mu[n] = Wmax[n] * exp(-(((PS[n] - theta_new[n]) / omega[n])^2));
                 }
                 return mu;
             }",
    block = "functions"
)

# BRMS formula for the model
formula <- 
    bf(Fitness ~ nlfun(Wmax, Pheno.scale, theta, omega, AR),
       omega + AR ~ 1,
       theta + Wmax ~ (1|Year),
       nl = TRUE,
       loop = FALSE)

# Running brms on the model
mod <- 
    brm(formula = formula,
        inits   = transpose(inits),
        data    = data,
        prior   = prior,
        family  = zero_inflated_poisson(link = "identity"),
        warmup  = 500,
        iter    = 1500,
        thin    = 1)

I get the following output:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

No matches for: 

  nlfun(vector, vector, vector, vector, vector)

Function nlfun not found.
  error in 'model481d10ed6147_file481d6598c7f6' at line 143, column 62
  -------------------------------------------------
   141:   } 
   142:   // compute non-linear predictor 
   143:   mu = nlfun(nlp_Wmax , C_1 , nlp_theta , nlp_omega , nlp_AR);
                                                                     ^
   144:   // priors including all constants 
  -------------------------------------------------

Error in stanc(model_code = paste(program, collapse = "\n"), model_name = model_cppname,  : 
  failed to parse Stan model 'file481d6598c7f6' due to the above error.
Error in stanc(model_code = paste(program, collapse = "\n"), model_name = model_cppname,  : 
  failed to parse Stan model 'file481d6598c7f6' due to the above error.

I get the same output from make_stancode(). Something I did wrong?

you didn’t pass nlfun to the stanvars argument of brm.

I see, yes. For the record, in case someone else reads this, it looks like this:

mod <- 
    brm(formula = formula,
        inits   = transpose(inits),
        data    = data,
        prior   = prior,
        family  = zero_inflated_poisson(link = "identity"),
        stanvars = nlfun,
        warmup  = 500,
        iter    = 1500,
        thin    = 1)

It’s running now, thank you!

Hi Paul,

A last, but possibly important question. What is “N” in the loop of your Stan code? I get that it’s the length of Wmax, but to what does it expend to? Especially, how do I ensure than N is the number of years? Or, more problematic, is N the total number of observations (i.e. number of rows of the design matrix)?

Thanks a lot for your help!

N is the number of observations. If you want it to loop over years you need to add years as a covariate to nlfun and handle that internally there.

As I suspected. I’m trying to use years as a covariate, but variable types in Stan is a nightmare. The covariate needs to be passed as a vector, hence reals, although they are integers, and there seems to be no way to use that covariate to produce anything integer-like.

For now, I cannot even find a way to extract the number of years to loop over and I suspect that if I construct a variable like this [1, 1, 2, 2, 3, 3, 3] representing the sequence of years, I will not be able to use it as indices, since the covariate cannot be passed as anything else than a vector… Any help would be much welcome as I’m running circles…

Incidentally, in the code you kindly produced, why do you use the += operator in:

theta_new[n] += AR[n] * theta_new[n-1];

I’m also wondering whether theta_new[n-1] should not be theta[n-1] in this line, otherwise I feel like this is a series going to zero, but this might be my lack of understanding of some of the Stan/BRMS mechanics.

Thank you for your help and sorry to be a pain.

Hmm I hadn’t thought about the integer problem… At this point I would think it’s easiest to move to using Stan directly by extracting the brms generated stan code (via make_stancode) and the corresponding data (via make_standata) and pass that stuff directly to Stan (after making the necessary changes to your stan code).
Handling integer covariates may also be possible in brms directly at some point but currently isn’t.

The code

theta_new[n] += AR[n] * theta_new[n-1];

just means

theta_new[n] = theta_new[n] + AR[n] * theta_new[n-1];

Actually, now that I think more about the data structure I wonder, what are you actually trying to autocorrelate? Over consequitive years? If so, how to we handle multiple observations per year? Or are there actually multiple timseries, which we want to autocorrelate separately?

Our data consists of fitness information (the response variable) and measures of a phenotypic trait. For many breeding events a year, we want to infer the optimal value (in terms of maximum of fitness) for the trait, hence the Gaussian optimum (a very classical model of fitness optimum). Because fitness is a number of offspring, it typically follows a zero-inflated error distribution.

The aim is this, for each year, to infer the optimum theta (based on all breeding events this year) and possible autocorrelations between optima (so between theta's) between years.

EDIT: Maybe to be more precise, the only thing we want to autocorrelate are the theta's (maybe the Wmax as well at some point, independently, but it’s basically the same problem structure…). So the “breeding events” are not time series, only the years are.

Ok I see. In that case, we need to abbdandon the nlfun approach I think. I recommend using the original non-linear formulation in brms, extract the corresponding stan code and add the autocorrelation to the r_theta_ parameters (the varying effects of years for theta) in the same way I described above with theta_new.

The difference will be that we autorcorrelate the varying effects of years (of theta) directly rather than working on the linear predictor, which contains multiple observations per year.

Something like this code (see addition l.122 and l.135-137)?

// generated with brms 2.4.0
functions { 
    
    /* zero-inflated poisson log-PDF of a single response 
     * Args: 
     *   y: the response value 
     *   lambda: mean parameter of the poisson distribution
     *   zi: zero-inflation probability
     * Returns:  
     *   a scalar to be added to the log posterior 
     */ 
    real zero_inflated_poisson_lpmf(int y, real lambda, real zi) { 
        if (y == 0) { 
            return log_sum_exp(bernoulli_lpmf(1 | zi), 
                               bernoulli_lpmf(0 | zi) + 
                               poisson_lpmf(0 | lambda)); 
        } else { 
            return bernoulli_lpmf(0 | zi) +  
            poisson_lpmf(y | lambda); 
        } 
    }
    /* zero-inflated poisson log-PDF of a single response 
     * logit parameterization of the zero-inflation part
     * Args: 
     *   y: the response value 
     *   lambda: mean parameter of the poisson distribution
     *   zi: linear predictor for zero-inflation part 
     * Returns:  
     *   a scalar to be added to the log posterior 
     */ 
    real zero_inflated_poisson_logit_lpmf(int y, real lambda, real zi) { 
        if (y == 0) { 
            return log_sum_exp(bernoulli_logit_lpmf(1 | zi), 
                               bernoulli_logit_lpmf(0 | zi) + 
                               poisson_lpmf(0 | lambda)); 
        } else { 
            return bernoulli_logit_lpmf(0 | zi) +  
            poisson_lpmf(y | lambda); 
        } 
    }
    /* zero-inflated poisson log-PDF of a single response
     * log parameterization for the poisson part
     * Args: 
     *   y: the response value 
     *   eta: linear predictor for poisson distribution
     *   zi: zero-inflation probability
     * Returns:  
     *   a scalar to be added to the log posterior 
     */ 
    real zero_inflated_poisson_log_lpmf(int y, real eta, real zi) { 
        if (y == 0) { 
            return log_sum_exp(bernoulli_lpmf(1 | zi), 
                               bernoulli_lpmf(0 | zi) + 
                               poisson_log_lpmf(0 | eta)); 
        } else { 
            return bernoulli_lpmf(0 | zi) +  
            poisson_log_lpmf(y | eta); 
        } 
    }
    /* zero-inflated poisson log-PDF of a single response 
     * log parameterization for the poisson part
     * logit parameterization of the zero-inflation part
     * Args: 
     *   y: the response value 
     *   eta: linear predictor for poisson distribution
     *   zi: linear predictor for zero-inflation part 
     * Returns:  
     *   a scalar to be added to the log posterior 
     */ 
    real zero_inflated_poisson_log_logit_lpmf(int y, real eta, real zi) { 
        if (y == 0) { 
            return log_sum_exp(bernoulli_logit_lpmf(1 | zi), 
                               bernoulli_logit_lpmf(0 | zi) + 
                               poisson_log_lpmf(0 | eta)); 
        } else { 
            return bernoulli_logit_lpmf(0 | zi) +  
            poisson_log_lpmf(y | eta); 
        } 
    }
    // zero-inflated poisson log-CCDF and log-CDF functions
    real zero_inflated_poisson_lccdf(int y, real lambda, real zi) { 
        return bernoulli_lpmf(0 | zi) + poisson_lccdf(y | lambda); 
    }
    real zero_inflated_poisson_lcdf(int y, real lambda, real zi) { 
        return log1m_exp(zero_inflated_poisson_lccdf(y | lambda, zi));
    }
} 
data { 
    int<lower=1> N;  // total number of observations 
    int Y[N];  // response variable 
    int<lower=1> K_omega;  // number of population-level effects 
    matrix[N, K_omega] X_omega;  // population-level design matrix 
    int<lower=1> K_theta;  // number of population-level effects 
    matrix[N, K_theta] X_theta;  // population-level design matrix 
    int<lower=1> K_Wmax;  // number of population-level effects 
    matrix[N, K_Wmax] X_Wmax;  // population-level design matrix 
    // covariate vectors 
    vector[N] C_1;
    // data for group-level effects of ID 1
    int<lower=1> J_1[N];
    int<lower=1> N_1;
    int<lower=1> M_1;
    vector[N] Z_1_theta_1;
    // data for group-level effects of ID 2
    int<lower=1> J_2[N];
    int<lower=1> N_2;
    int<lower=1> M_2;
    vector[N] Z_2_Wmax_1;
    int prior_only;  // should the likelihood be ignored? 
} 
transformed data { 
} 
parameters { 
    vector<lower=0>[K_omega] b_omega;  // population-level effects 
    vector[K_theta] b_theta;  // population-level effects 
    vector<lower=0>[K_Wmax] b_Wmax;  // population-level effects 
    real<lower=0,upper=1> zi;  // zero-inflation probability 
    vector<lower=0>[M_1] sd_1;  // group-level standard deviations
    vector[N_1] z_1[M_1];  // unscaled group-level effects
    vector<lower=0>[M_2] sd_2;  // group-level standard deviations
    vector[N_2] z_2[M_2];  // unscaled group-level effects
    real<lower=-1,upper=1> phi;
} 
transformed parameters { 
    // group-level effects 
    vector[N_1] r_1_theta_1 = sd_1[1] * (z_1[1]);
    // group-level effects 
    vector[N_2] r_2_Wmax_1 = sd_2[1] * (z_2[1]);
} 
model { 
    vector[N] nlp_omega = X_omega * b_omega;
    vector[N] nlp_theta = X_theta * b_theta;
    vector[N] nlp_Wmax = X_Wmax * b_Wmax;
    vector[N] mu;
    for (g in 2:N_1) {
        r_1_theta_1[g] += phi * r_1_theta_1[g-1];
    }
    for (n in 1:N) { 
        nlp_theta[n] += r_1_theta_1[J_1[n]] * Z_1_theta_1[n];
        nlp_Wmax[n] += r_2_Wmax_1[J_2[n]] * Z_2_Wmax_1[n];
        // compute non-linear predictor 
        mu[n] = nlp_Wmax[n] * exp( - (((C_1[n] - nlp_theta[n]) / nlp_omega[n]) ^ 2));
    } 
    // priors including all constants 
    target += gamma_lpdf(b_omega | 0.01, 0.01)
    - 1 * gamma_lccdf(0 | 0.01, 0.01); 
    target += normal_lpdf(b_theta | 0, 100); 
    target += gamma_lpdf(b_Wmax | 0.01, 0.01)
    - 1 * gamma_lccdf(0 | 0.01, 0.01); 
    target += beta_lpdf(zi | 1, 1); 
    target += student_t_lpdf(sd_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
    target += normal_lpdf(z_1[1] | 0, 1);
    target += student_t_lpdf(sd_2 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
    target += normal_lpdf(z_2[1] | 0, 1);
    // likelihood including all constants 
    if (!prior_only) { 
        for (n in 1:N) {
            target += zero_inflated_poisson_lpmf(Y[n] | mu[n], zi);
        }
    } 
} 
generated quantities { 
} 

I don’t know to pass it to brms/rstan however… I guess I need to use stan_model() and then sampling(), but I still need to figure the kinks of it to actually test that code.

The Stan code looks reasonable to me. Perhaps the easiest way to interact with Stan via rstan is rstan::stan.

There was a glitch in the previous code (r_1_theta_1 cannot be overwritten because it’s a transformed parameter, as far as I understand), here’s a working one:

// generated with brms 2.4.0
functions { 
    
    /* zero-inflated poisson log-PDF of a single response 
     * Args: 
     *   y: the response value 
     *   lambda: mean parameter of the poisson distribution
     *   zi: zero-inflation probability
     * Returns:  
     *   a scalar to be added to the log posterior 
     */ 
    real zero_inflated_poisson_lpmf(int y, real lambda, real zi) { 
        if (y == 0) { 
            return log_sum_exp(bernoulli_lpmf(1 | zi), 
                               bernoulli_lpmf(0 | zi) + 
                               poisson_lpmf(0 | lambda)); 
        } else { 
            return bernoulli_lpmf(0 | zi) +  
            poisson_lpmf(y | lambda); 
        } 
    }
    /* zero-inflated poisson log-PDF of a single response 
     * logit parameterization of the zero-inflation part
     * Args: 
     *   y: the response value 
     *   lambda: mean parameter of the poisson distribution
     *   zi: linear predictor for zero-inflation part 
     * Returns:  
     *   a scalar to be added to the log posterior 
     */ 
    real zero_inflated_poisson_logit_lpmf(int y, real lambda, real zi) { 
        if (y == 0) { 
            return log_sum_exp(bernoulli_logit_lpmf(1 | zi), 
                               bernoulli_logit_lpmf(0 | zi) + 
                               poisson_lpmf(0 | lambda)); 
        } else { 
            return bernoulli_logit_lpmf(0 | zi) +  
            poisson_lpmf(y | lambda); 
        } 
    }
    /* zero-inflated poisson log-PDF of a single response
     * log parameterization for the poisson part
     * Args: 
     *   y: the response value 
     *   eta: linear predictor for poisson distribution
     *   zi: zero-inflation probability
     * Returns:  
     *   a scalar to be added to the log posterior 
     */ 
    real zero_inflated_poisson_log_lpmf(int y, real eta, real zi) { 
        if (y == 0) { 
            return log_sum_exp(bernoulli_lpmf(1 | zi), 
                               bernoulli_lpmf(0 | zi) + 
                               poisson_log_lpmf(0 | eta)); 
        } else { 
            return bernoulli_lpmf(0 | zi) +  
            poisson_log_lpmf(y | eta); 
        } 
    }
    /* zero-inflated poisson log-PDF of a single response 
     * log parameterization for the poisson part
     * logit parameterization of the zero-inflation part
     * Args: 
     *   y: the response value 
     *   eta: linear predictor for poisson distribution
     *   zi: linear predictor for zero-inflation part 
     * Returns:  
     *   a scalar to be added to the log posterior 
     */ 
    real zero_inflated_poisson_log_logit_lpmf(int y, real eta, real zi) { 
        if (y == 0) { 
            return log_sum_exp(bernoulli_logit_lpmf(1 | zi), 
                               bernoulli_logit_lpmf(0 | zi) + 
                               poisson_log_lpmf(0 | eta)); 
        } else { 
            return bernoulli_logit_lpmf(0 | zi) +  
            poisson_log_lpmf(y | eta); 
        } 
    }
    // zero-inflated poisson log-CCDF and log-CDF functions
    real zero_inflated_poisson_lccdf(int y, real lambda, real zi) { 
        return bernoulli_lpmf(0 | zi) + poisson_lccdf(y | lambda); 
    }
    real zero_inflated_poisson_lcdf(int y, real lambda, real zi) { 
        return log1m_exp(zero_inflated_poisson_lccdf(y | lambda, zi));
    }
} 
data { 
    int<lower=1> N;  // total number of observations 
    int Y[N];  // response variable 
    int<lower=1> K_omega;  // number of population-level effects 
    matrix[N, K_omega] X_omega;  // population-level design matrix 
    int<lower=1> K_theta;  // number of population-level effects 
    matrix[N, K_theta] X_theta;  // population-level design matrix 
    int<lower=1> K_Wmax;  // number of population-level effects 
    matrix[N, K_Wmax] X_Wmax;  // population-level design matrix 
    // covariate vectors 
    vector[N] C_1;
    // data for group-level effects of ID 1
    int<lower=1> J_1[N];
    int<lower=1> N_1;
    int<lower=1> M_1;
    vector[N] Z_1_theta_1;
    // data for group-level effects of ID 2
    int<lower=1> J_2[N];
    int<lower=1> N_2;
    int<lower=1> M_2;
    vector[N] Z_2_Wmax_1;
    int prior_only;  // should the likelihood be ignored? 
} 
transformed data { 
} 
parameters { 
    vector<lower=0>[K_omega] b_omega;  // population-level effects 
    vector[K_theta] b_theta;  // population-level effects 
    vector<lower=0>[K_Wmax] b_Wmax;  // population-level effects 
    real<lower=0,upper=1> zi;  // zero-inflation probability 
    vector<lower=0>[M_1] sd_1;  // group-level standard deviations
    vector[N_1] z_1[M_1];  // unscaled group-level effects
    vector<lower=0>[M_2] sd_2;  // group-level standard deviations
    vector[N_2] z_2[M_2];  // unscaled group-level effects
    real<lower=-1,upper=1> phi;
} 
transformed parameters { 
    // group-level effects 
    vector[N_1] r_1_theta_1 = sd_1[1] * (z_1[1]);
    // group-level effects 
    vector[N_2] r_2_Wmax_1 = sd_2[1] * (z_2[1]);
} 
model { 
    vector[N] nlp_omega = X_omega * b_omega;
    vector[N] nlp_theta = X_theta * b_theta;
    vector[N] nlp_Wmax = X_Wmax * b_Wmax;
    vector[N_1] r_1_theta_1_new = r_1_theta_1;
    vector[N] mu;
    for (g in 2:N_1) {
        r_1_theta_1_new[g] += phi * r_1_theta_1_new[g-1];
    }
    for (n in 1:N) { 
        nlp_theta[n] += r_1_theta_1_new[J_1[n]] * Z_1_theta_1[n];
        nlp_Wmax[n] += r_2_Wmax_1[J_2[n]] * Z_2_Wmax_1[n];
        // compute non-linear predictor 
        mu[n] = nlp_Wmax[n] * exp( - (((C_1[n] - nlp_theta[n]) / nlp_omega[n]) ^ 2));
    } 
    // priors including all constants 
    target += gamma_lpdf(b_omega | 0.01, 0.01)
    - 1 * gamma_lccdf(0 | 0.01, 0.01); 
    target += normal_lpdf(b_theta | 0, 100); 
    target += gamma_lpdf(b_Wmax | 0.01, 0.01)
    - 1 * gamma_lccdf(0 | 0.01, 0.01); 
    target += beta_lpdf(zi | 1, 1); 
    target += student_t_lpdf(sd_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
    target += normal_lpdf(z_1[1] | 0, 1);
    target += student_t_lpdf(sd_2 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
    target += normal_lpdf(z_2[1] | 0, 1);
    // likelihood including all constants 
    if (!prior_only) { 
        for (n in 1:N) {
            target += zero_inflated_poisson_lpmf(Y[n] | mu[n], zi);
        }
    } 
} 
generated quantities { 
} 

Saving this code in stancode.stan, I can then run Stan using:

standata <- 
    make_standata(formula,
                  data = data,
                  prior = prior,
                  family = zero_inflated_poisson(link = "identity"))

mod <- stan("stancode.stan", data = standata)

This seems to work. A parameter which I don’t understand the role is Z_1_theta_1 (and similar Z_2_Wmax_1[n]). At first, I thought it played the role of a “design vector”, but J_1 is the parameter assigning year effects to the corresponding rows and Z_1_theta_1 seems to just be a vector of 1?

Anyway, thank you so much for your help through this Paul!!