Here’s the code (I used MatlabStan as recompiling summarystan binary was easier for me than figuring out what to do in rstan). MatlabStan uses CmdStan which leaves the csv files behind. Dashed line is computed using rmse and in the plotting line it’s squared to get mse and 1./4000 is the analytic result for independent draws.
model='normal.stan';
for di=1:11
D=2^(di-1);
for i2=1:100
dat = struct('D',D);
fitn = stan('file',model,'data',dat,'sample_file','schools','file_overwrite',true,'verbose',false,'seed', i2, 'iter', 2000);
fitn.block()
ss=unix('rm -f foo.csv;/u/77/ave/unix/proj/cmdstan/bin/stansummary_ipse --sig_figs=8 schools-1 schools-2 schools-3 schools-4 --csv_file=foo.csv > /dev/null');
unix('egrep -v "^#" foo.csv > bar.csv');
warning off
q=dataset('File','bar.csv','delimiter',',');
warning on
neffss(di,i2)=mean(q.N_Eff(8:end,:)/4000);
rmse(di,i2)=rmeansqr(q.Mean(8:end,:));
ss=unix('rm -f foo.csv;/u/77/ave/unix/proj/cmdstan/bin/stansummary_imse --sig_figs=8 schools-1 schools-2 schools-3 schools-4 --csv_file=foo.csv > /dev/null');
unix('egrep -v "^#" foo.csv > bar.csv');
warning off
q=dataset('File','bar.csv','delimiter',',');
warning on
mneffss(di,i2)=mean(q.N_Eff(8:end,:)/4000);
ss=unix('rm -f foo.csv;/u/77/ave/unix/proj/cmdstan/bin/stansummary_old --sig_figs=8 schools-1 schools-2 schools-3 schools-4 --csv_file=foo.csv > /dev/null');
unix('egrep -v "^#" foo.csv > bar.csv');
warning off
q=dataset('File','bar.csv','delimiter',',');
warning on
oneffss(di,i2)=mean(q.N_Eff(8:end,:)/4000);
end
end
h1=plot(mean(oneffss,2));set(h1,'color',clrs(2,:))
hold on
line(xlim,[1 1],'color','k','linestyle',':')
h2=plot(mean(neffss,2));set(h2,'color',clrs(1,:))
h3=plot((1./4000)./(mean(rmse(:,:).^2,2))','k--');
xlim([0 12])
ylabel('$N_{\mathrm{eff}}/N$')
set(gca,'ytick',0:.5:3)
ylim([0 2.6])
set(gca,'xtick',1:11,'xticklabel',2.^(0:10))
xlabel('Dimensions D')
legend([h1 h2 h3],'Current','Geyer''s initial positive','Estimate from repeated simulations','location','southeast')
I generated 4x1000 iid draws, so that the split chain part stays the same. Here are violinplots with Geyer’s initial positive (blue) and with the current (red, with some jitter for exact 4000 values). This gives idea of overall accuracy for n_eff estimates
