Issue due to too few digits?

Hello,
First of all: I’m using CmdStan 2.16.0 with MatlabStan 2.15.1.0.

I have working MATLAB code that produces curves. To enable Bayesian inference, I ported the code to Stan. Interestingly, at one particular step I get a noisy curve from Stan (seen through print output), although the curve is smooth in MATLAB.
In this step, the program just multiplies values of about 1e+06 with values of about 1e-04.
The 1e+06 values come from a constant in the data. When I change the constant so that the values to be multiplied are about 1e-2 and 1e-4, the issue seems to disappear, so I suspect a precision problem.

Therefore, my question is whether Stan might keep far fewer digits than MATLAB does, or whether the data transfer from MATLAB to Stan might lose digits.

Unfortunately, I am not able to provide a small example, since my Stan code is relatively large.

Regards

MatlabStan wraps CmdStan, which has limited precision in the CSV output:

Thank you for the reply. It is good to know that the samples transferred back to MATLAB have limited precision.
But I think there could be an internal issue, too.
The print output (e.g., print("name", variable)) shows about 5 decimal digits on the command line. When I compare my Stan output with the original MATLAB output rounded to the same number of digits, the discrepancy appears after the multiplication step mentioned above. Before the multiplication, the Stan output and the rounded MATLAB output agree.
The multiplication is between a matrix (20x4200) and a vector (4200).

The print method also truncates rather than printing out the full ~16 decimal digits. Internally, Stan does everything in double-precision floating point.
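
To illustrate the difference between what gets printed and what is stored, here is a minimal Python sketch (not Stan or MATLAB code). It assumes a printout limited to about six significant digits, roughly as observed in this thread; the value `x` is made up for illustration.

```python
# A double-precision product of a "large" and a "small" value (made-up numbers).
x = 1.0e6 * 1.2345678901234e-4

shown = f"{x:.6g}"   # a printout limited to 6 significant digits
exact = repr(x)      # shortest text that parses back to the same double

print(shown)
print(exact)
print(float(shown) == x)  # the short printout does not recover the stored value
print(float(exact) == x)  # the stored value itself still has full precision
```

So a short printout alone does not show that precision was lost internally; it only limits what can be compared by eye.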

I was able to create a simple example.
I transfer two vectors (called “avec” and “invKY”) from MATLAB to Stan, multiply them, and print the result.
So it is just a multiplication of two vectors, done once in Stan and once in MATLAB, and I think the difference is really large:
in Stan: -611.6530e-003
in MATLAB: -581.7570e-003

I uploaded the data. Would be nice if someone could confirm the issue.

Those are very different results. I don’t have MATLAB installed, so I can’t confirm. The difference may come from a loss of precision when getting the data into Stan.
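
As a rough illustration of how rounded input data can change a product of two vectors, the following Python sketch passes two made-up vectors through text with 7 and with 17 significant digits and compares the resulting dot products. The vectors, their scales (about 1e+06 and 1e-04), and the helper names `through_text` and `dot` are assumptions for this example, not the data from this thread.

```python
import math

def through_text(values, fmt):
    """Write each number as text in the given format, then parse it back."""
    return [float(format(v, fmt)) for v in values]

def dot(u, v):
    """Dot product using an accurately rounded sum."""
    return math.fsum(x * y for x, y in zip(u, v))

n = 4200
# Made-up data: large and small values whose products largely cancel.
a = [1.0e6 * math.sin(0.37 * i + 0.1) for i in range(n)]
b = [1.0e-4 * math.cos(0.91 * i + 0.3) for i in range(n)]

exact = dot(a, b)
rough = dot(through_text(a, ".6e"), through_text(b, ".6e"))    # 7 significant digits
fine = dot(through_text(a, ".16e"), through_text(b, ".16e"))   # 17 significant digits

print(f"full precision : {exact!r}")
print(f"7 digits       : {rough!r}  (abs. error {abs(rough - exact):.3e})")
print(f"17 digits      : {fine!r}  (abs. error {abs(fine - exact):.3e})")
```

How large the error becomes depends on the data: the more the individual products cancel each other, the larger the relative effect of rounding the inputs.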

Likely this line in RData.m
num_list = strjoin( ts{1}, ',');

See this issue.

It’s been a while since I’ve used MATLAB, so I don’t know the exact solution, but you can probably convert the elements of ts{1} to strings manually with sprintf('%.16e', x) and then join them with strjoin.

-edit-

I missed the solution in that link. Try changing the line I mentioned to:
num_list = join(num2str(ts{1},"%1.12e"),"," )

Thanks again for your reply.
I changed the line to
num_list = join(num2str(ts{1},'%1.12e'),',' );
But there is no difference in the result.

I think “+mstan/rdump.m” is used rather than “RData.m”.
Afterwards, the data is written to the file “temp.data.R”. In this file, the values have only 6 decimal digits, so that might be where the issue is located. I will see what I have to edit to get more decimal digits.

-edit-

In “+mstan/rdump.m”, I replaced every %d with %.12d. This seems to fix the issue.
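
For comparison, here is a small Python sketch of writing a vector in R dump style (`name <- c(...)`) with enough digits to survive the round trip. The helpers `rdump_vector` and `parse_rdump_vector` are hypothetical and only mimic the idea; they are not the MatlabStan code, and the sketch assumes the reader accepts exponent notation. As far as I know, MATLAB falls back to exponent notation with six decimals when %d is given a non-integer value, which would be consistent with the six digits seen in “temp.data.R”.

```python
import math

def rdump_vector(name, values, digits=17):
    """Format a numeric vector as one line of R dump text: name <- c(v1, v2, ...)."""
    body = ", ".join(f"{v:.{digits - 1}e}" for v in values)
    return f"{name} <- c({body})"

def parse_rdump_vector(line):
    """Parse a line produced by rdump_vector back into (name, list of floats)."""
    name, _, rest = line.partition(" <- c(")
    return name, [float(tok) for tok in rest.rstrip(")").split(",")]

avec = [0.1, 1.0 / 3.0, 1.0e6 * math.pi, -2.5e-4]  # made-up values

full = rdump_vector("avec", avec)             # 17 significant digits
short = rdump_vector("avec", avec, digits=7)  # 7 significant digits

print(full)
print(short)
print(parse_rdump_vector(full)[1] == avec)    # full-precision text round-trips
print(parse_rdump_vector(short)[1] == avec)   # shortened text does not
```

Seventeen significant digits are enough to reproduce any double exactly, so a format such as %.16e is a safe choice when file size is not a concern.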

I’m not sure that’s currently possible. Rarely are you going to get output from Stan accurate to beyond 6 decimal digits. Furthermore, it’s all going into estimates of expectations, which add further noise.

We’re busy redesigning the I/O piping and one of the things on our to-do list is to make the number of decimal places in output controllable. We’ll also be designing binary outputs.

So far, I don’t have a problem with too few decimal digits in the output.
The issue here was that MatlabStan transferred only 6 decimal digits of the data to Stan. As written in my last post, this problem was solved by editing the “+mstan/rdump.m” file.
But it would be nice to have direct control over the number of decimal digits, or to have something more precise in the manual.

Yes, it should definitely go into the interface manuals. I added an issue for CmdStan:

We will try to add control over the output precision, as well as a protobuf-based binary output, which should be faster and more memory efficient, with full precision.