Thanks for linking the paper! I haven't used factor models in Stan, but thought this paper would give me an excuse to look into it.
I played around with it a bit. Tbh, I thought the paper could have been a bit clearer on some things (naming conventions, simulation code...), but I guess they are still working on it.
I think the main point is that they just combine a multilevel regression with a factor structure. For model selection (and identification) they rely on the Bayesian Lasso, using a scale mixture of normals to re-parameterize the Laplace (double-exponential) priors.
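To make the scale-mixture idea concrete, here is a minimal sketch in plain Python. It is not the paper's code. It assumes the textbook representation: if tau2 ~ Exponential(rate = lam^2 / 2) and beta | tau2 ~ Normal(0, tau2), then marginally beta ~ Laplace(0, 1/lam). The function name and the value of `lam` are made up for illustration, and the paper's exact prior (for example, conditioning on a residual scale) may differ.

```python
import random
import statistics


def sample_laplace_via_mixture(lam, n, seed=0):
    """Draw n values that are marginally Laplace(0, 1/lam) via a normal scale mixture."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # Latent variance: tau2 ~ Exponential(rate = lam^2 / 2)
        tau2 = rng.expovariate(lam ** 2 / 2.0)
        # Conditionally normal coefficient: beta | tau2 ~ Normal(0, tau2)
        draws.append(rng.gauss(0.0, tau2 ** 0.5))
    return draws


lam = 2.0  # hypothetical Lasso penalty, chosen only for the demo
draws = sample_laplace_via_mixture(lam=lam, n=200_000)

sample_var = statistics.pvariance(draws)
# A Laplace distribution with scale b = 1/lam has variance 2 * b^2
theoretical_var = 2.0 / lam ** 2
print(f"sample variance {sample_var:.3f} vs Laplace variance {theoretical_var:.3f}")
```

The point of the trick is that, conditional on the latent variances, the coefficients are normal, which is convenient for a Gibbs sampler. It does not by itself give exact zeros in the posterior.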
The way they conceptualize the varying intercepts/slopes (on the regressors) could have been better in my opinion. I guess it is conventional wisdom by now to use multivariate normal priors to model the correlations between intercepts and slopes (using Cholesky-decomposed correlation matrices for the non-centered parameterization). But they don't do that.
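For readers who have not seen the non-centered parameterization, here is a rough sketch of the underlying algebra in plain Python for a single intercept/slope pair. It only simulates from the prior, it is not a Stan program, and the scales and correlation are made-up values. The idea is that correlated effects are built as diag(sigma) * L * z, where L is the Cholesky factor of the correlation matrix and z is standard normal.

```python
import math
import random


def cholesky_corr_2x2(rho):
    """Lower-triangular Cholesky factor of the correlation matrix [[1, rho], [rho, 1]]."""
    return [[1.0, 0.0], [rho, math.sqrt(1.0 - rho ** 2)]]


def draw_intercept_slope(sigma, rho, rng):
    """Non-centered draw: (intercept, slope) = diag(sigma) * L * z with z ~ Normal(0, I)."""
    L = cholesky_corr_2x2(rho)
    z = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    intercept = sigma[0] * (L[0][0] * z[0])
    slope = sigma[1] * (L[1][0] * z[0] + L[1][1] * z[1])
    return intercept, slope


def correlation(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rng = random.Random(1)
sigma = (1.5, 0.5)  # hypothetical group-level standard deviations
rho = 0.6           # hypothetical intercept/slope correlation
pairs = [draw_intercept_slope(sigma, rho, rng) for _ in range(100_000)]
intercepts = [p[0] for p in pairs]
slopes = [p[1] for p in pairs]
emp_corr = correlation(intercepts, slopes)
print(f"empirical correlation {emp_corr:.3f} (target {rho})")
```

In a sampler, z would be the parameter with a standard normal prior, and the correlated effects would be derived quantities. That is what usually helps HMC when group-level variances are weakly identified.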
Regarding the factor part: What they do might work in a Gibbs sampling scheme, but you'd need to put a lot more effort into it with HMC (or NUTS). There is literature about identification restrictions, but they don't really discuss that. The multi-modality issues would not work out fine in Stan...
They rely on the Bayesian Lasso for "model search" (and partly identification...?). I could see that this makes sense if you're interested in MAPs, but for full Bayes the "Bayesian Lasso" is not really doing selection, right? You could maybe think about using some Horseshoe variant, but I'm afraid this would still be hard to fit.
They are not using any MCMC diagnostics, which is kind of disappointing. It says they are building an R package with the sampler (built in C++), which I couldn't find with a quick Google search. I think it would be much better if they had tried to implement their model in a more general framework (like Stan, but really anything more "open").
Rereading this paper to better digest it is on my to-do list. I'm new to synthetic control and factor models, so I have a lot of reading to do. Are you aware of any implementation of synthetic control with factor models in Stan? Do you have any other readings to recommend?
Thanks for reading our paper carefully. I'm embarrassed by the many typos and its incompleteness. We were rushing out the paper for the workshop. We have just posted an updated version. I hope it's clearer now.
Some diagnostic plots are now included in the appendix, though none of the diagnostics will be conclusive. The open-source package will be available soon.
Check out https://arxiv.org/abs/1910.06106 for a latent factor model for synthetic control. He implemented it in pymc3 (https://github.com/eliastuo/bayessynth) and it's super slow. We reimplemented it in Stan, using the multi_normal Cholesky parameterization while also standardizing things to mean 0 and scale 1, and it runs much, much faster. I'll see if I can post the code after getting the appropriate work authorization.
The "synthetic control" piece of the above model is just an additional parameter measuring the intervention. Setting that parameter to 0, as in do(X = 0) if you know Pearl's do-calculus, gives the "synthetic" estimate under no intervention.
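A toy sketch of that idea, in plain Python with made-up numbers: the outcome is a baseline path plus an intervention effect, and the synthetic path is the same prediction with the intervention switched off. The `baseline` list stands in for whatever the latent factor model would predict, and `effect` stands in for a single posterior draw of the intervention parameter; neither comes from the linked paper or its code.

```python
def predict(baseline, effect, treated):
    """Outcome path: baseline plus the intervention effect in periods where treated == 1."""
    return [b + effect * d for b, d in zip(baseline, treated)]


baseline = [10.0, 10.5, 11.0, 11.5, 12.0]  # hypothetical latent-factor prediction
treated = [0, 0, 0, 1, 1]                  # intervention starts in period 4
effect = 2.0                               # hypothetical posterior draw of the effect

observed_fit = predict(baseline, effect, treated)
# do(X = 0): same model, same baseline, intervention term switched off
synthetic = predict(baseline, effect, [0] * len(treated))
gap = [o - s for o, s in zip(observed_fit, synthetic)]
print(gap)
```

Repeating this over all posterior draws would give a posterior distribution for the gap between the fitted and the synthetic series in each post-intervention period.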
In the StanCon piece we don't have any intervention we're interested in estimating. Instead, we're interested in filling in missing data. We combine the latent factor model, though without any intervention variable, with a quadratic solver. This is not necessarily a new idea. What is "new" are two pieces. The first is that it's multivariate and makes extensive use of the estimation of the correlations across the nested time series. And, secondly, the latent factor model and quadratic solver are both contained within one Stan program. The weights are solved jointly with the correlations and we get out an estimate of the variance of the weights. The fact that it's in Stan makes it akin to a stochastic quadratic program.
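For intuition about the quadratic-program part only, here is a deliberately tiny sketch in plain Python. It assumes the classic synthetic-control setup (non-negative weights that sum to one, chosen to minimize squared error) and restricts it to two donor series, where the problem has a closed form: with weights (w, 1 - w), the unconstrained minimizer is clipped to [0, 1]. This is a deterministic point solution with made-up data, not the joint Stan model described above, which also yields uncertainty for the weights.

```python
def two_donor_weight(target, donor_a, donor_b):
    """Weight w in [0, 1] minimizing sum((target - (w * a + (1 - w) * b))^2).

    The objective is a convex quadratic in w, so clipping the unconstrained
    minimizer to [0, 1] gives the constrained optimum.
    """
    diff = [a - b for a, b in zip(donor_a, donor_b)]
    resid = [y - b for y, b in zip(target, donor_b)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        raise ValueError("donor series are identical; the weight is not identified")
    w = sum(r * d for r, d in zip(resid, diff)) / denom
    return min(1.0, max(0.0, w))


donor_a = [1.0, 2.0, 4.0]  # hypothetical donor series
donor_b = [3.0, 2.0, 0.0]
target = [2.5, 2.0, 1.0]   # equals 0.25 * donor_a + 0.75 * donor_b

w = two_donor_weight(target, donor_a, donor_b)
fitted = [w * a + (1.0 - w) * b for a, b in zip(donor_a, donor_b)]
print(w, fitted)
```

With more than two donors there is no such closed form, and a proper quadratic-programming routine (or, as in the post above, a formulation inside the probabilistic model) is needed.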
Check out this fairly new working paper by Abadie for a great overview and survey of the existing literature on synthetic control https://economics.mit.edu/files/17847.
I promised Tuesday and here it is. I updated the graphics and the official release on arXiv is delayed until Wednesday. I'll update with the link once I have it.
Nice paper. Do you have a GitHub repo with the code? I'm learning about synthetic control. My understanding was that synthetic control is nearly equivalent to simulating an intervention, as in the Statistical Rethinking course. I found your paper really cool.
Thanks for supplying the code. I had a go at running it, and it looks like it works. On the way I got some warnings as part of the process (similar to the ones below, for all 4 chains at initialisation):
Chain 1 Iteration: 1 / 1000 [ 0%] (Warmup)
Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Chain 1 Exception: normal_id_glm_lpdf: Scale vector is inf, but must be positive finite! (in '/tmp/RtmpbzKZ2k/model-54596f811ac9.stan', line 126, column 6 to column 89)
Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Chain 1
Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Chain 1 Exception: normal_id_glm_lpdf: Scale vector is inf, but must be positive finite! (in '/tmp/RtmpbzKZ2k/model-54596f811ac9.stan', line 126, column 6 to column 89)
Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Chain 1