Normal_cdf does not behave as expected (comparison with pnorm)

sillymalagg · April 11, 2022, 9:19pm

Hi everyone,
I am translating a fisheries stock assessment model from R to stan language for Bayesian fitting. I am trying to translate the following R code:

Linf  = 115
nlen = 200
L0 = 1
DL = 1
L = L0:nlen
K     = 0.15
cv = 0.1
no_at_age = c(0,3e+04,1e+04,5e+03,2e+03,1e+02)
maxage = length(no_at_age)

mean_pop_init <- rep(0, length(no_at_age))
sd_pop_init <- rep(0, length(no_at_age))
pop_init_distr <- rep(0, length(no_at_age))
pop_init <- rep(0, nlen)

for (m in 1:maxage) {
  mean_pop_init[m] = Linf - (Linf-L0)*exp(-K*m);
}

for (y in 1:maxage) {
  sd_pop_init[y] = cv * mean_pop_init[y];
}

pr_at_age = no_at_age/sum(no_at_age);
for (a in 1:maxage){
  for (p in 1:nlen) {
    pop_init_distr[p] = pnorm(L[p]+DL,mean_pop_init[a],sd_pop_init[a])-pnorm(L[p],mean_pop_init[a],sd_pop_init[a]);
  }
  pop_init = pop_init + pr_at_age[a]*pop_init_distr;
}
pop_init = pop_init/sum(pop_init)*sum(no_at_age);

See figure attached for the correct outcome of this code. From what I am reading online, in this case pnorm could be substituted by normal_cdf. Surprise surprise, this is not happening. This is what I am writing in stan, I don’t report here the initialisation, the data and the parameters, I just report the code where I substituted pnorm with normal_cdf
Rplot
Rplot01
:

 // calculate pop_init
    for (m in 1:maxage) {
        mean_pop_init[m] = Linf - (Linf-L0)*exp(-K*m);
    }
    
    for (y in 1:maxage) {
      sd_pop_init[y] = cv * mean_pop_init[y];
    }
    
    pr_at_age = no_at_age/sum(no_at_age);
    for (a in 1:maxage){
      for (p in 1:nlen) {
        pop_init_distr[p] = normal_cdf(L[p]+DL,mean_pop_init[a],sd_pop_init[a])-normal_cdf(L[p],mean_pop_init[a],sd_pop_init[a]);
      }
      pop_init = pr_at_age[a]*pop_init_distr;
    }
    pop_init = pop_init/sum(pop_init)*sum(no_at_age);

See figure for the output of this code.
Thank you so much to each one who will spend a bit of time for replying, any help will be very appreciated!

andrjohns · April 13, 2022, 12:46pm

It looks like you’ve got a step missing in your Stan code.

Your R code has:

  pop_init = pop_init + pr_at_age[a]*pop_init_distr;

But your Stan code only has:

  pop_init = pr_at_age[a]*pop_init_distr;

sillymalagg · April 21, 2022, 12:44am

Hi!
Thank you so much for your reply and sorry if i am getting mack to you quite late. Actually you are right, and your answer helped me to fix part of the problem. Nevertheless, there is another issue and this time I am quite sure is especially related to the normal_cdf function. In fact, the peak of length-distribution does not show up at the mean that I have specified. Check out the plot, the grey histogram is the stan output, while the red line is the desired output. I don’t understand what is happening…

Rplot

The code remained the same, the problem what you pointed out, but to make it work I had to initialise pop_init with 0 before looping:

for (k in 1:nlen) {
  pop_init[k] = 0;
}

andrjohns · April 21, 2022, 12:52am

Just to be sure, can you post the full R and Stan code that’s being used now?

sillymalagg · April 21, 2022, 12:57am

sure!

this is the stan code:

for (m in 1:maxage) {
    mean_pop_init[m] = Linf - (Linf-L0)*exp(-K*m);
}

for (y in 1:maxage) {
  sd_pop_init[y] = cv * mean_pop_init[y];
}

pr_at_age = no_at_age/sum(no_at_age);

for (k in 1:nlen) {
  pop_init[k] = 0;
}

for (a in 1:maxage){
  for (p in 1:nlen) {
    pop_init_distr[p] = normal_cdf(L[p]+DL, mean_pop_init[a],sd_pop_init[a])-normal_cdf(L[p], mean_pop_init[a],sd_pop_init[a]);
  }
  pop_init = pop_init + pr_at_age[a]*pop_init_distr;
}

pop_init = pop_init/sum(pop_init)*sum(no_at_age);

this is the R code:

for (m in 1:maxage) {
mean_pop_init[m] = Linf - (Linf-L0)exp(-Km);
}

for (y in 1:maxage) {
sd_pop_init[y] = cv * mean_pop_init[y];
}

pr_at_age = no_at_age/sum(no_at_age);
for (a in 1:maxage){
for (p in 1:nlen) {
pop_init_distr[p] = pnorm(L[p]+DL,mean_pop_init[a],sd_pop_init[a])-pnorm(L[p],mean_pop_init[a],sd_pop_init[a]);
}
pop_init = pop_init + pr_at_age[a]*pop_init_distr;
}
pop_init = pop_init/sum(pop_init)*sum(no_at_age);

andrjohns · April 21, 2022, 12:59am

Can you post the Stan code for mean_pop_init?

andrjohns · April 21, 2022, 1:36am

Ah ignore me, it was already in your code!

As far as I can see those two code snippets look the same. The next step is to debug whether it is the normal_cdf function or some other aspect of the Stan model. The easiest way to do this is to export the normal_cdf function to R and use it in place of pnorm, and see whether you get the same results (with no other changes).

The easiest way to do this is through the stanFunction framework in the StanHeaders package, which allows you to export a Stan function to R. In your case you would run:

# Need to provide 'dummy' initial values to define the needed argument types 
StanHeaders::stanFunction("normal_cdf", y = 0, mu = 0, sigma = 1)

Then you can just replace pnorm with normal_cdf in your R code. If the results of the R code with pnorm differ from those with normal_cdf, then that gives us a good starting point

sillymalagg · April 21, 2022, 2:30pm

It is the first loop, mean_pop_init is just a quantity calculated starting from some parameters (K, L0, Linf and m)
It is initialised as:

mean_pop_init ← rep(0, length(no_at_age))

Also, when printed, it returns the desired values: 16.87929 30.54672 42.31039 52.43547 61.15021 68.65106

Topic		Replies	Views
Setting up priors correctly Modeling rstan , fitting-issues , specification	3	880	May 15, 2023
Normal_lcdf status General	8	1343	January 31, 2019
Basic stan newbe translation from jags (really quick I'm sure) Modeling	5	1624	July 19, 2018
Poisson Likelihood with normal priors Modeling poisson	7	1120	December 13, 2020
Bayesian inference in Stan Modeling	40	5805	July 18, 2017

Normal_cdf does not behave as expected (comparison with pnorm)

Related topics