Issue due to too few digits?

Olibats · June 1, 2018, 5:43pm

Hello,
First of all: I’m using CmdStan 2.16.0 with MatlabStan 2.15.1.0.

I have working code in MATLAB that produces curves. To enable Bayesian inference, I ported the code to Stan. Interestingly, at one particular step I get a noisy curve from Stan (seen through print output), although the same curve is smooth in MATLAB.
In this step, the program just multiplies values of about 1e+06 by values of about 1e-04.
The 1e+06 values come from a constant in the data. When I change the constant so that the values being multiplied are about 1e-2 and 1e-4, the issue seems to disappear, so I suspect a precision problem.

So my question is whether Stan keeps far fewer digits than MATLAB does, or whether the data transfer MATLAB -> Stan keeps fewer digits.

Unfortunately, I am not able to make a small example, since my stan code is relatively large.

Regards

aaronjg · June 1, 2018, 9:32pm

MatlabStan wraps CmdStan, which has limited precision in its CSV output.

Olibats · June 2, 2018, 8:21am

Thank you for the reply. It is good to know that the samples transferred back to MATLAB have limited precision.
But I think there could be an internal issue, too.
The print output (e.g. print("name", variable)) shows about 5 decimal digits on the command line. When I compare it with the output of the original MATLAB code rounded to the same number of decimal digits, the two agree before the multiplication mentioned above and differ after it.
The multiplication is between a matrix (20x4200) and a vector (4200).

aaronjg · June 2, 2018, 4:32pm

The print method also truncates rather than printing out the full ~16 decimal digits. Internally, Stan does everything in double-precision floating point.

Olibats · June 2, 2018, 6:38pm

I was able to create a simple example.
I transfer two vectors (called “avec” and “invKY”) from MATLAB to Stan, multiply them, and print the result.
So it is just a product of two vectors, computed once in Stan and once in MATLAB, and I think the difference is really large:
in STAN: -611.6530e-003
in MATLAB: -581.7570e-003

I uploaded the data. It would be nice if someone could confirm the issue.

aaronjg · June 2, 2018, 7:15pm

Those are very different results. I don’t have MATLAB installed, so I can’t confirm. The difference may come from a loss of precision when getting the data into Stan.

Likely this line in RData.m
num_list = strjoin( ts{1}, ',');

See this issue.

It’s been a while since I’ve used MATLAB, so I don’t know the exact solution, but you can probably try converting the elements of ts{1} to strings manually with something like sprintf('%.16e', x) and then joining them with strjoin.

-edit-

I missed the solution in that link. Try changing the line I mentioned to:
num_list = join(num2str(ts{1},"%1.12e"),"," )
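
As a rough illustration of why the format string matters, here is a small Python sketch (not MatlabStan code; the helper `round_trip` is made up for this example). A double needs up to 17 significant digits to survive a round trip through text. `%.16e` writes 17 significant digits, while `%1.12e` writes 13, which is far better than 6 but can still drop the last few bits.

```python
# Sketch: which printf-style formats let a double survive a text round trip.
x = 0.1 + 0.2  # stored as 0.30000000000000004, not exactly 0.3


def round_trip(value, fmt):
    """Format value with a printf-style spec, then parse the text back."""
    return float(fmt % value)


for fmt in ("%.5e", "%.12e", "%.16e"):
    back = round_trip(x, fmt)
    status = "exact" if back == x else "lossy"
    print(fmt, fmt % x, status)
```

Whether the remaining loss at 13 digits matters depends on the model; it is usually negligible compared with the loss at 6 digits.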

Olibats · June 4, 2018, 5:25am

Thanks again for your reply.
I changed the line to
num_list = join(num2str(ts{1},'%1.12e'),',' );
But there is no difference in the result.

I think “+mstan/rdump.m” is used rather than “RData.m”.
The data is then written to the file “temp.data.R”. In this file there are only 6 decimal digits, so that may be where the issue lies. I will see what I have to edit to get more decimal digits.

-edit-

In “+mstan/rdump.m”, I replaced every %d with %.12d. This seems to fix the issue.
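
A likely explanation, as far as I know, is that MATLAB falls back to `%e` (6 digits after the decimal point by default) when `%d` is given a non-integer value, so an explicit precision is needed. The same idea as a Python sketch: the function `rdump_vector` below is hypothetical and is not the code in “+mstan/rdump.m”; it only shows a vector written in R dump style with a configurable number of significant digits.

```python
def rdump_vector(name, values, digits=17):
    """Return one R-dump style assignment line, e.g. 'x <- c(1.5, 2.5)'.

    `digits` is the number of significant digits written per value;
    17 is enough for any double to be read back unchanged.
    """
    body = ", ".join(f"{v:.{digits}g}" for v in values)
    return f"{name} <- c({body})"


# Made-up values with magnitudes similar to those discussed in the thread.
values = [1234567.890123, 1.23456789e-4]
print(rdump_vector("avec", values, digits=6))  # rounded on the way out
print(rdump_vector("avec", values))            # full precision
```

With `digits=6` the written values are already rounded before Stan ever reads them, which matches what was seen in “temp.data.R”.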

Bob_Carpenter · June 5, 2018, 6:59am

I’m not sure that’s currently possible. Rarely are you going to get output from Stan accurate to beyond 6 decimal digits. Furthermore, it’s all going into estimates of expectations, which add further noise.

We’re busy redesigning the I/O piping and one of the things on our to-do list is to make the number of decimal places in output controllable. We’ll also be designing binary outputs.

Olibats · June 5, 2018, 2:06pm

So far, I don’t have a problem with too few decimal digits in the output.
The issue here was that MatlabStan transferred only 6 decimal digits of the data to Stan. As written in my last post, I solved this by editing the “+mstan/rdump.m” file.
But it would be nice to have direct control over the number of decimal digits, or to have something more precise about it in the manual.
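
To see how rounded inputs alone can shift a product of two long vectors, here is a minimal Python sketch. The data are random and made up (they are not the uploaded “avec” and “invKY”), and `round_sig` and `dot` are helper names chosen for this example. The vectors have 4200 entries with magnitudes of roughly 1e6 and 1e-4, as in the description above.

```python
import random


def round_sig(value, digits):
    """Round value to `digits` significant digits via a text round trip."""
    return float(f"{value:.{digits - 1}e}")


def dot(u, v):
    """Plain dot product, summed left to right."""
    return sum(a * b for a, b in zip(u, v))


rng = random.Random(0)  # fixed seed; the data are purely illustrative
n = 4200
avec = [rng.uniform(-1e6, 1e6) for _ in range(n)]
inv_ky = [rng.uniform(-1e-4, 1e-4) for _ in range(n)]

exact = dot(avec, inv_ky)
dot6 = dot([round_sig(a, 6) for a in avec], [round_sig(b, 6) for b in inv_ky])
dot17 = dot([round_sig(a, 17) for a in avec], [round_sig(b, 17) for b in inv_ky])

err6 = abs(dot6 - exact)
err17 = abs(dot17 - exact)
print(f"error with 6 significant digits:  {err6:.3e}")
print(f"error with 17 significant digits: {err17:.3e}")
```

Because the individual products are large compared with their sum, small rounding errors in each input do not cancel and show up directly in the result, while 17 significant digits reproduce the inputs exactly.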

Bob_Carpenter · June 5, 2018, 7:12pm

Yes, it should definitely go into the interface manuals. I added an issue for CmdStan:

We will try to add control over the output, as well as a protobuf-based binary output, which should be faster and more memory efficient, with full precision.
