Could you please give me an example of **marginalizing discrete unknowns out of the posterior**? I think truncating works in this case, because the unknown should be a finite number (less than `1000`, probably).

# MCMC sampling does not work when executed

**ghjk**#43

Do you happen to know whether **PyMC3** would make MCMC much easier when we need the prior distributions to be Poisson?

**bgoodri**#44

The best thing to do even if you were using PyMC3, BUGS, etc. is to marginalize out the discrete unknowns. Otherwise, it is very difficult to sample from the posterior distribution efficiently.

**ghjk**#45

Thank you so much for your insight. For PyMC3 I plan to mimic this link using Metropolis-Hastings rather than NUTS: https://am207.github.io/2017/wiki/Lab7_bioassay.html.

Regarding Stan: would there be a difference if I changed the default sampler from NUTS (an HMC variant) to M-H (what is the argument for selecting the sampler in `stanfit()`, do you know)? HMC primarily works with **continuous** variables, while M-H works **very well** with **discrete** variables.

**Bob_Carpenter**#47

Metropolis-Hastings certainly doesn't work well in general for combinatorially hard problems like variable selection, where there are exponentially many variable configurations and a very high degree of dependency.

In many models that can be expressed using integer parameters, it's almost always more efficient and robust, in Stan or in other systems, to marginalize them out if it's tractable.

For something like missing Poisson data, Metropolis might work. The problem you'll have is that HMC is very sensitive to its tuning parameters, and changing the discrete parameters changes the continuous part of the posterior conditionally. You can code this in PyMC3, but you should be careful in checking convergence. As far as I know, they haven't extensively evaluated whether these models work. For instance, you'll want to be careful about divergences, which indicate potential bias in the estimates (I don't know if PyMC3 reports these, but it probably does). The way to test more generally is to generate fake data from the model and then make sure you can recover the model parameters to within posterior intervals. And please report back if you're successful; I'm curious how something like this will go in PyMC3.

**ghjk**#48

So I read Chapter 15 in the Stan manual, but did not quite see how to apply it to my problem at hand. Basically, I need to marginalize the prior `p(yT|y)` where `yT ~ Poisson(lambda)`, while my likelihood `p(x|yT) ~ Normal(A*yT, sigma)` is very easy to manage, even when its mean contains the parameter we want to estimate. So if I take the log of `p(yT|y) * p(x|yT)`, I would get `log(Poisson(lambda)) + log(Normal(A*yT, sigma))`. Could you please help me model this in Stan so that I can run the built-in `Metropolis-Hastings` algorithm in R? Also, even after I finish the sampling part, would I be able to get the posterior distribution of `x|y` using the `mcmc_trace()` or `plot(stanfit2, pars = c("yT"))` command?
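For concreteness, here is how that log product can be evaluated on a grid of integer `yT` values. This is a NumPy sketch, not Stan, and the scalar values of `lambda`, `A`, `sigma`, and the observation `x` are made up for illustration:

```python
import numpy as np

# Made-up scalar values for illustration only.
lam, A, sigma, x = 4.0, 1.5, 1.0, 6.3

yT = np.arange(0, 50)  # candidate integer values of yT
# log(yT!) via a cumulative sum of logs
log_fact = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, 50)))))
log_prior = yT * np.log(lam) - lam - log_fact                   # log Poisson(yT | lam)
log_lik = (-0.5 * ((x - A * yT) / sigma) ** 2
           - np.log(sigma * np.sqrt(2.0 * np.pi)))              # log Normal(x | A*yT, sigma)
log_post = log_prior + log_lik      # unnormalized log posterior over yT
yT_map = int(yT[np.argmax(log_post)])
```

Exponentiating and normalizing `log_post` over the grid gives the discrete posterior of `yT` directly, which is exactly the computation that marginalization exploits.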

**ghjk**#49

Yes, I will let you know right away if **PyMC3** works out for me. But I still want to get it working with R ;) Could you please show me how to marginalize the product `p(yT|y)p(x|yT)`, where `yT|y ~ Poisson(lambda)` and `x|yT ~ Normal(A*yT, sigma)` (`A` = a matrix; you could either make it up or use the matrix in the OP)?

**bgoodri**#50

```
for (i in 0:2147483647) { // or some other large integer
  target += poisson_log_lpmf(i | log_lambda) + normal_lpdf(x | A * i, sigma);
}
```

To get the posterior predictive distribution of `x`, just draw from a Poisson and use that realization, multiplied by `A`, to draw from a normal distribution with standard deviation `sigma`. You can do that in `generated quantities` or on the outside in R.
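In Python terms, a single predictive draw could look like this (the values of `A`, the Poisson rate, and `sigma` are hypothetical, standing in for posterior draws):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical values; in practice lam and sigma come from posterior draws.
A = np.array([[1.0, 0.5],
              [0.2, 1.0]])
lam = 4.0
sigma = 1.0

yT_draw = rng.poisson(lam, size=A.shape[1])  # draw the discrete latent vector
x_draw = rng.normal(A @ yT_draw, sigma)      # then draw x | yT ~ Normal(A*yT, sigma)
```

Repeating those two lines many times builds up the posterior predictive sample for `x`.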

**ghjk**#51

Thank you so much for your prompt help!! What does `poisson_log_lpmf(i | log_lambda)` stand for? (I did not see this function in the manual; I only saw `poisson_lpmf()`.)

Also, I think I did not make it clear, but what I am trying to find is the posterior distribution of `yT` based on the product `p(x|yT)p(yT|y)` (this final form is attained after some algebraic manipulations already). So shouldn't I replace `i` by `yT[i]`? Furthermore, `i` should be a column vector in order for `A*i` to make sense. But `i` here is just a number, so I am not sure if you are basically **discretizing everything**?

Finally, could you elaborate on `To get the posterior predictive distribution of x just draw from a Poisson and use that realization multiplied by A to draw from a normal distribution with standard deviation sigma`? I mean, didn't you already do that with the `target` variable? Also, in this case, what does my `model` section in Stan become?

**ghjk**#52

I tried your model with the following code in Stan, and it tells me the mismatch in the **variable type** (once again, same error as before;p) for poisson_log_lpmf. The mistake originated from the fact that **parameters** cannot be determined as **int**-type variable, but `poisson_log_lpmf`

requires one input as **int variable** or **int array**. How could we get around this:(

```
data {
  int<lower=1> ncol;
  int<lower=1> nrow;
  vector[ncol] yH;
  int x[nrow];
  matrix[nrow, ncol] A;
  vector[nrow] sigma_x;
  vector[ncol] sigma_y;
  vector[nrow] epsilon;
  int log_lambda[ncol];
}
parameters {
  vector[ncol] yT;
}
model {
  for (i in 1:ncol) {
    target += poisson_log_lpmf(yT[i] | log_lambda[i]) + normal_lpdf(x[i] | A[i, ] * yT, sigma_x[i]);
  }
}
```

**ghjk**#53

@bgoodri: could you please be patient and help me overcome this **difficult** problem? I am stuck… ;p

**bgoodri**#54

`poisson_log_lpmf` will not accept a vector, only an `int` or an `int` array. So, you have to loop over observations and apply `poisson_log_lpmf` to the loop index.

**ghjk**#55

Yes, you are correct. But when I tried that, it still gives me an error that `yT[i]` cannot be a vector. Could you please try my code above yourself and let me know why it still cannot accept `yT[i]` as an int?

Also, could you elaborate on what exactly the `target` variable represents? I think `target` equals the log of `p(yT|y) * p(x|yT)`. How do we go back from the log to the **pdf** of the posterior (aka `p(y|yT)`)? Is it as simple as applying `exp()` to `target`?

Now, I am thinking of re-writing the first term on the RHS of `target += ...`, the `log(Poisson)`, simply as `log(exp(-lambda[i]) * lambda[i]^yT[i] / Gamma(yT[i]+1))`. But I am not sure if `Gamma` accepts `yT` declared as a vector rather than an int. What do you think?

**ghjk**#57

By saying "loop over observations", do you mean I have to loop over `yH` rather than `yT`? Could Stan still work in a model without the parameters being specified?

**bgoodri**#58

Well, you have to specify the continuous parameters, but then Stan can work very well by marginalizing over the discrete unknowns that cannot be declared in the `parameters` block. However, that means for each observation you have to evaluate the log-probability that the discrete count takes on each integer value between 0 and some big number, and then do a `log_sum_exp` with the other distribution. All of this is discussed in the Stan User Manual.
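As a numerical sketch of that recipe (in NumPy rather than Stan, with a made-up Poisson rate, truncation cap, and observation, and a unit-scale normal for simplicity):

```python
import numpy as np

lam, cap, y_obs, sd = 4.0, 200, 3.7, 1.0  # made-up values for illustration
i = np.arange(cap + 1)                    # candidate counts 0..cap
# log(i!) via a cumulative sum of logs
log_fact = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, cap + 1)))))
log_weights = i * np.log(lam) - lam - log_fact  # log Poisson(i | lam)
log_dens = (-0.5 * ((y_obs - i) / sd) ** 2
            - np.log(sd * np.sqrt(2.0 * np.pi)))  # log Normal(y_obs | i, sd)
m = log_weights + log_dens
# log-sum-exp, computed stably by factoring out the max
log_marginal = m.max() + np.log(np.exp(m - m.max()).sum())
```

`log_marginal` is the log of the marginal density of one observation with the discrete count summed out, which is exactly the quantity each `log_sum_exp` term contributes to `target` in Stan.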

**ghjk**#59

I read the Stan User Manual, Chapter 15, but when I tried to apply it to my model above, it simply does not work. The reason is the way the example does marginalization (they have more parameters than I do, and the term I have a Poisson on is the prior, not the likelihood). My prior `(yT|yH, A)` is conditional on 2 (out of 3) parameters already; my likelihood is `(x[i] | d)`. That's why, when you said "specify the continuous parameters", I switched to `vector[ncol] yT`, but then I cannot fit this into `poisson_log_lpmf()` because of the constraint on the input type for that particular function. I am not sure how I could overcome this issue in R :p

**bgoodri**#60

It doesn't matter whether the distribution is for the prior or the likelihood, because the process of marginalization is the same. And it doesn't matter if you try to declare a discrete unknown as an integer array or a vector, because Stan is going to refuse to deal with discrete unknowns.

If you just simulate some data in R according to your generative model, such as

```
x <- rpois(1, lambda)            # lambda is the Poisson rate
y <- rnorm(1, mean = x, sd = 1)
```

and do that many times, you get a marginal distribution of `y`. And if you have a bunch of observed `y` values that come from that data-generating process, you can use that marginal distribution as a likelihood with something like

```
vector[big_integer + 1] log_weights;
vector[big_integer + 1] log_dens;
for (i in 0:big_integer) {
  log_weights[i + 1] = poisson_log_lpmf(i | log_lambda);
  log_dens[i + 1] = normal_lpdf(y | i, 1);
}
target += log_sum_exp(log_weights + log_dens);
```

but you have to do that for each observation.
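Looping over observations can be sketched as a plain function outside Stan as well. This NumPy version (with made-up rate, cap, and data) computes one marginalized log-likelihood term per observation and adds them up:

```python
import numpy as np

def log_marginal(y_obs, lam=4.0, sd=1.0, cap=200):
    """log p(y_obs) with the latent Poisson count summed out via log-sum-exp."""
    i = np.arange(cap + 1)
    log_fact = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, cap + 1)))))
    log_weights = i * np.log(lam) - lam - log_fact        # log Poisson(i | lam)
    log_dens = (-0.5 * ((y_obs - i) / sd) ** 2
                - np.log(sd * np.sqrt(2.0 * np.pi)))      # log Normal(y_obs | i, sd)
    m = log_weights + log_dens
    return m.max() + np.log(np.exp(m - m.max()).sum())    # stable log-sum-exp

y = [3.2, 5.1, 2.7]                               # hypothetical observations
total_log_lik = sum(log_marginal(v) for v in y)   # one marginalized term each
```

This mirrors the Stan pattern above: the inner sum over `i` is the `log_sum_exp`, and the outer loop over observations accumulates into `target`.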