How to declare a variable in model block


#6

Thanks, bgoodri, for all those useful suggestions.
I fixed the error and finally got to compile the model, but I got this error message.

Error in compileCode(f, code, language = language, verbose = verbose) :
Compilation ERROR, function(s)/method(s) not created! Warning message:
running command ‘make -f “C:/PROGRA~1/R/R-34~1.2/etc/x64/Makeconf” -f “C:/PROGRA~1/R/R-34~1.2/share/make/winshlib.mk” SHLIB_LDFLAGS=’(SHLIB_CXXLDFLAGS)' SHLIB_LD='(SHLIB_CXXLD)’ SHLIB=“file16e466222198.dll” WIN=64 TCLBIN=64 OBJECTS=“file16e466222198.o”’ had status 127
In addition: Warning messages:
1: running command ‘“C:/PROGRA~1/R/R-34~1.2/bin//R” CMD config CXX’ had status 1
2: running command ‘C:/PROGRA~1/R/R-34~1.2/bin/x64/R CMD SHLIB file16e466222198.cpp 2> file16e466222198.cpp.err.txt’ had status 1

do you have any idea what is wrong?

Thanks so much!

By the way, you said I can put the gamma(1, 1) out of the loop, but will it be the same?

Thanks again.


#7

Your Rtools is either not installed or configured correctly.

Yes, d ~ exponential(1) is equivalent to but faster than

for (j in 1:T) {
  d[j] ~ gamma(1, 1);
  ...
}

#8

I mean, if you put it outside the loop, will “d” be sampling NT times with 1 sample each time, or just one time with NT of samplings? will it be the same with it being put inside the loop?
Also, can you add a vector with a number in stan?
Thanks


#9

Stan never samples individual parameters. If you put it inside a loop over T and increment the log-kernel once for each scalar that is literally equivalent to putting it outside the loop and incrementing the log-kernel once for the entire vector.

And yes, adding a scalar to a vector is a legal operation in the Stan language.


#10

Thanks for your quick reply.
Just curious, have you ever used Jags? If yes ,how do you compare their speeds? Is Stan better in some ways ?


#11

I haven’t used BUGS since 2003 but it isn’t much of a comparison if you do the comparison correctly. Comparing wall time until something finishes is useless because you can easily make one or the other finish in whatever amount of time you want. It is also useless to compare MCMC algorithms unless both converge to the posterior distribution, which Gibbs samplers such as BUGS and JAGS have difficulty doing in examples with more than a few parameters or without choosing particular sets of distributions for the priors and likelihood. In models where BUGS / JAGS converges and mixes well, you can compare effective sample size per second of wall time with Stan and Stan will usually be at least as good.


#12

Hi, sorry that I keep bugging you with questions. Now I want to make a more complicated model, and revised my code as follows. something went wrong with the data “trial”. The model can be complied but when I apply my data, it showed error message. My data should be good, the variable “trial” only contains 1,2,3,4,5,6,7,8.

data{
int T;
int N;
vector[N] score;
matrix[N, T] A;
int trial[N];
int<lower=0> y1[N,T];
int<lower=0> y2[N,T];
}
transformed data{
matrix[N,T] log_A;
log_A = log(A);
}
parameters{
matrix[1,2] beta;
vector[T] d;
matrix[2,8] ran_trial;
vector[2] s_trial;
}
transformed parameters{
vector[T] log_d;
log_d= log(d)
}
model {
for (o in 1:2){
beta[1,o] ~ normal(0,1)
s_trial[o] ~ student_t(1,0,1)
for(tn in 1:8){
ran_trial[o, tn] ~ normal(0,s_trial[o]}
}}
d ~ exponential(1);
for(j in 1:T) {
   for (i in 1:N) {
y1[i,j] ~ poisson_log(log_A[i,j] +beta[1,1]* score[i] +ran_trial[1,trial[i]]+log_d);
y2[i,j] ~ poisson_log(log_A[i,j] +beta[1,2]* score[i] +ran_trial[2,trial[i]]+log_d);
}}
} 

Error in new_CppObject_xp(fields$.module, fields$.pointer, …) :
Exception: variable does not exist; processing stage=data initialization; variable name=trial; base type=int (in ‘model29045b845223_104d2d1efc277c900643b11430b3da1e’ at line 14)

In addition: Warning message:
In FUN(X[[i]], …) : data with name trial is not numeric and not used
failed to create the sampler; sampling not done


Besides, when I complied the model, it showed warning:
Warning message:
In if (nchar(CXX) > 0) return(TRUE) :
the condition has length > 1 and only the first element will be used

Is it normal?


#13

Offhand, it looks like the data for the variable trial was not input correctly. Are you using RStan? If so, what does the R script that you’re using look like?


#14

Oh I figured out, the trial variable in my data is not numeric. so I put as.numeric(), it worked! THANKS


transformed data{
matrix[N,T] log_A;
log_A = log(A);
}

transformed parameters{
vector[T] log_d;
log_d= log(d)
}

Am I doing it right for the log() as A is a matrix and d is a vector?

Dimension mismatch in assignment; variable name = log_A, num dimensions given = 0; right-hand side dimensions = 2
error in ‘model27e430f5182_2bd322cf0524e4cdf85261f7275688d3’ at line 22, column 8

20: transformed data{
21: matrix[N,T] log_A;
22: log_A= log(A);
           ^

#15

Just show us the R script you’re using. It’s hard to tell what mistakes you’re making when we can’t see your code.


#16
data{
int T;
int N;
vector[N] score;
matrix[N, T] A;
int trial[N];
int<lower=0> y1[N,T];
int<lower=0> y2[N,T];
}
transformed data{
matrix[N,T] log_A;
log_A = log(A);
}
parameters{
matrix[1,2] beta;
vector[T] d;
matrix[2,8] ran_trial;
vector[2] s_trial;
}
transformed parameters{
vector[T] log_d;
log_d= log(d);
}
model {
for (o in 1:2){
beta[1,o] ~ normal(0,1);
s_trial[o] ~ student_t(1,0,1);
for(tn in 1:8){
ran_trial[o, tn] ~ normal(0,s_trial[o]);
}
}}
d ~ exponential(1);
for(j in 1:T) {
   for (i in 1:N) {
y1[i,j] ~ poisson_log(log_A[i,j] +beta[1,1]* score[i] +ran_trial[1,trial[i]]+log_d);
y2[i,j] ~ poisson_log(log_A[i,j] +beta[1,2]* score[i] +ran_trial[2,trial[i]]+log_d);
}}
} 
sm = stan_model(model_code=modelstring)
stan_data = list(score=mydata$score,trial=as.numeric(mydata$trial),
                 y1=y1,y2=y2,A=A, N=N, T=T)
fit= sampling(sm, data=stan_data, chains=3,iter=600,warmup=200,thin=2)

Dimension mismatch in assignment; variable name = log_A, num dimensions given = 0; right-hand side dimensions = 2
error in ‘model27e430f5182_2bd322cf0524e4cdf85261f7275688d3’ at line 22, column 8
20: transformed data{
21: matrix[N,T] log_A;
22: log_A= log(A);
^


#17

What do you get when you run the following lines of R code?:

print(dim(y1))
print(dim(y2))
print(dim(A))
print(N)
print(T)

Also, be careful when using T as a variable in R. By default, R sets T to TRUE, so if you forget to set the value of T yourself, weird things can happen.


#18

print(N)
[1] 4400
print(T)
[1] 30
print(dim(y1))
[1] 4400 30
print(dim(y2))
[1] 4400 30
print(dim(A))
[1] 4400 30
print(N)
[1] 4400
print(T)
[1] 30


#19

I’m not sure what the problem is, but I do notice a lot of missing semicolons in the Stan code that you posted. Indeed, I can’t even get it to compile, which makes me wonder if the code you’ve posted is the code that’s causing the error messages that you’re showing.


#20

Yeah, about the semicolons, because I can not share my code. This is the fake code that I created but similar to mine. I have all the semicolons in my real code, so it is not the problem. I just fixed it in my post. How about now?


#21

Look, if you need to post fake code as a stand-in, then say so upfront and try to make sure that it either compiles or produces the same error that you got from your real code. I had to change

normal(0,s_trial[o];
}

to

normal(0,s_trial[o]);

in order for the code to compile.

Now when I Googled the error message “Dimension mismatch in assignment”, I found a forum post that seemed to indicate that the error happens during compilation. Before the error message that you posted, did you see a message that said, “SYNTAX ERROR, MESSAGE(S) FROM PARSER”?


#22

Yes.
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
Dimension mismatch in assignment; variable name = log_A, num dimensions given = 0; right-hand side dimensions = 2
error in ‘model27e4503d3a3_8003a76c0b33e462a94a34effb8bf314’ at line 23, column 8
20: transformed data{
21: matrix[N,T] log_A;
22: log_A= log(A);
^

PARSER EXPECTED: expression assignable to left-hand side>
Error in stanc(file = file, model_code = model_code, model_name = model_name, :
failed to parse Stan model ‘8003a76c0b33e462a94a34effb8bf314’ due to the above error.

The data in matrix A has 0 as some elements. is it the reason that I can not get the log(A)? I want to use poisson_log to avoid the overflow in the model. there is exp() in the model.


#23

The parser has no idea what’s in matrix A, so no. This is purely a problem with syntax, and one that is not in the “fake” Stan code that you’ve shown here.


#24

The problem is solved by doing:

transformed data{
matrix[N,T] log_A= log(A);
}

instead of

transformed data{
matrix[N,T] log_A;
log_A= log(A);
}

#25

For some easy conjugate problems, like say the Rats problem in BUGS, the speeds are about the same when measured with properly calibrated multi-chain effective sample sizes.

For most of those JAGS and BUGS models with discrete parameters, it’s way faster to run them marginalized in Stan and in BUGS or JAGS.

For hard problems, Stan is infinitely faster because Gibbs will never converge.

For problems in between, the speedup is somewhere between 1 and infinite. This isn’t so much JAGS or BUGS per se, as Gibbs. Switching to faster Gibbs doesn’t help for hard problems—you can run millions of iterations and not converge.

This does require you to code the models well in both systems. Badly coded models will fail to converge in both JAGS and Stan.

Cole Monahan wrote a nice survey. Michael Betancourt’s arXiv papers contain all the details on why HMC is so much faster in high dimensions and for hard problems.