Paper: Causal inference with panel data by Pang, Liu, and Xu

Hi everyone,

I just finished reading Pang, Xun, Licheng Liu, and Yiqing Xu, "Bayesian Causal Inference with Time-Series Cross-Sectional Data: A Dynamic Multilevel Latent Factor Model with Hierarchical Shrinkage" (July 12, 2020), and I was wondering if anyone here has implemented it in Stan. Any thoughts about the methodology?

Thanks,

Ignacio

Hey @ignacio!

Thanks for linking the paper! I haven’t used factor models in Stan, but thought this paper would give me an excuse to look into it.

I played around with it a bit. Tbh, I thought the paper could have been a bit clearer on some things (naming conventions, simulation code…), but I guess they are still working on it.

I think the main point is that they just combine a multilevel regression with a factor structure. For model selection (and identification) they rely on the Bayesian Lasso, using a scale mixture of normals to re-parameterize the Laplace (double-exponential) priors.
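To make the re-parameterization concrete, here is a minimal sketch of the standard scale-mixture identity behind the Bayesian Lasso: an exponential prior on the variance plus a conditionally normal coefficient gives a Laplace marginal. The function name, `lam`, and the sample size are illustrative assumptions, and the paper's exact hierarchy may differ from this textbook form.

```python
import math
import random

def sample_laplace_via_mixture(lam, n, seed=0):
    """Draw n values whose marginal distribution is Laplace(0, scale=1/lam)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # Latent local variance: tau2 ~ Exponential(rate = lam^2 / 2).
        tau2 = rng.expovariate(lam ** 2 / 2.0)
        # Conditionally Gaussian coefficient: beta | tau2 ~ Normal(0, tau2).
        draws.append(rng.gauss(0.0, math.sqrt(tau2)))
    return draws

lam = 2.0  # illustrative penalty parameter, not taken from the paper
draws = sample_laplace_via_mixture(lam, 200_000)
mean_abs = sum(abs(b) for b in draws) / len(draws)
variance = sum(b * b for b in draws) / len(draws)

# For Laplace(0, 1/lam): E|beta| = 1/lam and Var(beta) = 2/lam^2.
print("E|beta|:", mean_abs, "theory:", 1.0 / lam)
print("Var    :", variance, "theory:", 2.0 / lam ** 2)
```

The simulated moments should sit close to the Laplace values, which is all the mixture trick claims: the conditionals are Gaussian (convenient for Gibbs), while the marginal prior is still double-exponential.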

The way they conceptualize the varying intercepts/slopes (on the regressors) could have been better in my opinion. I guess it is conventional wisdom by now to use multivariate normal priors to model the correlations between intercepts and slopes (using Cholesky-decomposed correlation matrices for the non-centered parameterization). But they don’t do that.
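As a rough illustration of the non-centered construction mentioned above, the sketch below builds correlated intercept/slope pairs from independent standard normals and the Cholesky factor of a 2x2 correlation matrix. All names and numbers (`mu`, `sigma`, `rho`) are made up for the example; this is not the paper's model.

```python
import math
import random

def draw_intercept_slope(mu, sigma, rho, n, seed=1):
    """Non-centered draws: beta = mu + diag(sigma) * L * z, with z ~ N(0, I)."""
    rng = random.Random(seed)
    # Cholesky factor L of [[1, rho], [rho, 1]] is [[1, 0], [rho, sqrt(1 - rho^2)]].
    l21 = rho
    l22 = math.sqrt(1.0 - rho ** 2)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        intercept = mu[0] + sigma[0] * z1
        slope = mu[1] + sigma[1] * (l21 * z1 + l22 * z2)
        pairs.append((intercept, slope))
    return pairs

def sample_correlation(pairs):
    """Pearson correlation of the two coordinates."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_a * var_b)

pairs = draw_intercept_slope(mu=(1.0, -0.5), sigma=(2.0, 0.3), rho=-0.6, n=50_000)
rho_hat = sample_correlation(pairs)
print("target rho: -0.6, sample rho:", rho_hat)
```

In Stan the same idea is usually written with a `cholesky_factor_corr` parameter and `diag_pre_multiply`, so the sampler works on the uncorrelated `z` values.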

Regarding the factor part: What they do might work in a Gibbs sampling scheme, but you’d need to put a lot more effort into it with HMC (or NUTS). There is literature about identification restrictions, but they don’t really discuss that. The multi-modality issues would not work out fine in Stan…
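The identification problem can be seen in a few lines: rotating the loadings and counter-rotating the factors leaves their product, and therefore the likelihood, unchanged. The matrices below are arbitrary toy numbers chosen only to show the invariance.

```python
import math

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

# Toy example: 3 units x 2 factors, and 2 factors x 4 time periods.
loadings = [[1.0, 0.5], [0.3, -1.2], [0.8, 0.1]]
factors = [[0.2, -0.4, 1.1, 0.7], [1.5, 0.3, -0.6, 0.9]]

# Any orthogonal matrix R works; here a rotation by an arbitrary angle.
theta = 0.7
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

rotated_loadings = matmul(loadings, rotation)            # Lambda * R
rotated_factors = matmul(transpose(rotation), factors)   # R^T * F

original = matmul(loadings, factors)
rotated = matmul(rotated_loadings, rotated_factors)

# Largest entrywise difference between the two fitted surfaces.
max_gap = max(abs(x - y) for row_o, row_r in zip(original, rotated)
              for x, y in zip(row_o, row_r))
print("max |Lambda F - (Lambda R)(R^T F)| =", max_gap)
```

Sign flips and column permutations are special cases of `rotation`, which is why unrestricted factor models produce several equivalent posterior modes that HMC chains can land in separately.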

They rely on the Bayesian Lasso for “model search” (and partly identification…?). I could see that this makes sense if you’re interested in MAPs, but for full Bayes the “Bayesian Lasso” is not really doing selection, right? You could maybe think about using some Horseshoe variant, but I’m afraid this would still be hard to fit.
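A toy one-parameter case shows the difference between the MAP and the full posterior under a Laplace prior. The sketch assumes a single observation `y ~ Normal(beta, 1)` with prior density proportional to `exp(-lam * |beta|)`; the values of `y` and `lam` are illustrative only.

```python
import math

def map_estimate(y, lam):
    """Posterior mode: soft-thresholding of y at lam (unit noise variance)."""
    return math.copysign(max(abs(y) - lam, 0.0), y)

def posterior_mean(y, lam, half_width=10.0, steps=20001):
    """Posterior mean by brute-force integration on an evenly spaced grid."""
    h = 2.0 * half_width / (steps - 1)
    num = 0.0
    den = 0.0
    for i in range(steps):
        b = -half_width + i * h
        # Unnormalized posterior density: Gaussian likelihood times Laplace prior.
        w = math.exp(-0.5 * (y - b) ** 2 - lam * abs(b))
        num += b * w
        den += w
    return num / den

y, lam = 0.5, 1.0
beta_map = map_estimate(y, lam)
beta_mean = posterior_mean(y, lam)
print("MAP:", beta_map)
print("posterior mean:", beta_mean)
```

The mode is exactly zero whenever `|y| <= lam`, while the posterior mean is only shrunk toward zero, so posterior draws never select variables by themselves.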

They are not using any MCMC diagnostics, which is kind of disappointing. It says they are building an R package with the sampler (built in C++), which I couldn’t find with a quick Google search. I think it would be much better if they had tried to implement their model in a more general framework (like Stan, but really anything more “open”).

What’s your take on this paper?

Cheers,
Max

Rereading this paper to better digest it is on my to-do list. I’m new to synthetic control and factor models, so I have a lot of reading to do. Are you aware of any implementation of synthetic control with factor models in Stan? Do you have any other readings to recommend?

My Stancon video has a latent factor model and synthetic control. Will be posting the paper to arxiv, hopefully, in September.

Dear Max,

Thanks for reading our paper carefully. I’m embarrassed by the many typos and its incompleteness. We were rushing out the paper for the workshop. We have just posted an updated version. I hope it’s clearer now.

Some diagnostic plots are now included in the appendix, though none of the diagnostics will be conclusive. The open-source package will be available soon.

Best,
Yiqing

Check out https://arxiv.org/abs/1910.06106 for a latent factor model for synthetic control. The author implemented it in pymc3 (https://github.com/eliastuo/bayessynth) and it’s super slow. We reimplemented it in Stan, using the multi_normal Cholesky parameterization while also standardizing things to mean 0 and standard deviation 1, and it runs much, much faster. I’ll see if I can post the code after getting the appropriate work authorization.

The “synthetic control” piece of the above model is just an additional parameter measuring the intervention. Setting that parameter to 0, as in do(X = 0) if you know Pearl’s do-calculus, gives the “synthetic” estimate under no intervention.
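A schematic version of that idea, with made-up numbers standing in for a fitted model: the same prediction function is evaluated once with the estimated intervention effect and once with the effect set to 0. The names `baseline`, `effect`, and `treated` are hypothetical and not taken from the paper's code.

```python
def predict(baseline, effect, treated):
    """Fitted outcome per period: baseline plus effect where treatment is on."""
    return [m + effect * d for m, d in zip(baseline, treated)]

# Made-up values for illustration only.
baseline = [10.0, 10.5, 11.0, 11.5]   # stand-in for the latent-factor fit
treated = [0, 0, 1, 1]                # intervention switches on in period 3
effect = -2.0                         # stand-in for the intervention parameter

fitted = predict(baseline, effect, treated)
synthetic = predict(baseline, 0.0, treated)   # same model, effect set to 0

gap = [f - s for f, s in zip(fitted, synthetic)]
print("fitted:   ", fitted)
print("synthetic:", synthetic)
print("gap:      ", gap)
```

In a Bayesian fit the same comparison would be made per posterior draw, which gives an interval for the gap rather than a single number.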

In the Stancon piece we don’t have any intervention we’re interested in estimating. Instead, we’re interested in filling in missing data. We combine the latent factor model, though without any intervention variable, with a quadratic solver. This is not necessarily a new idea. What is “new” are two pieces. First, it’s multivariate and makes extensive use of the estimated correlations across the nested time series. Second, the latent factor model and the quadratic solver are both contained within one Stan program. The weights are solved jointly with the correlations and we get out an estimate of the variance of the weights. The fact that it’s in Stan makes it akin to a stochastic quadratic program.
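For readers new to the weighting step, here is a deliberately tiny stand-in for the quadratic program: with only two donor series the simplex constraint reduces to one weight `w` in [0, 1], and the least-squares solution has a closed form. This is a simplification for illustration, not the joint Stan formulation described above, and the data are invented.

```python
def two_donor_weight(target, donor_a, donor_b):
    """Weight w in [0, 1] minimizing sum((y - w*a - (1-w)*b)^2) over periods."""
    # Rewrite the residual as (y - b) - w * (a - b) and solve for w.
    num = sum((y - b) * (a - b) for y, a, b in zip(target, donor_a, donor_b))
    den = sum((a - b) ** 2 for a, b in zip(donor_a, donor_b))
    if den == 0.0:
        return 0.5  # donors are identical, so every weight fits equally well
    # The objective is convex in w, so clipping gives the constrained optimum.
    return min(1.0, max(0.0, num / den))

donor_a = [1.0, 2.0, 4.0]
donor_b = [3.0, 1.0, 2.0]

# Target built as 0.25 * donor_a + 0.75 * donor_b, so the weight is recoverable.
inside = [0.25 * a + 0.75 * b for a, b in zip(donor_a, donor_b)]
w_inside = two_donor_weight(inside, donor_a, donor_b)

# A target outside the range spanned by the donors hits the boundary.
outside = [0.0, 3.0, 6.0]
w_outside = two_donor_weight(outside, donor_a, donor_b)

print("weight for mixed target:  ", w_inside)
print("weight for extreme target:", w_outside)
```

With more donors the same objective becomes a quadratic program over the simplex, and a point solver returns one weight vector with no uncertainty attached, which is the gap the joint Stan model is meant to fill.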

Check out this fairly new working paper by Abadie for a great overview and survey of the existing literature on synthetic control https://economics.mit.edu/files/17847.

Thanks

Thanks a lot @spinkney. I would love to see your Stan reimplementation of this method.

Me too - could you share it, @spinkney?

Thanks for reminding me about this! I’ll cross-post here and on the Stan blog. Give me a week or two as work is really busy.

Coming Tuesday! https://twitter.com/thatpinkney/status/1376158302208425986?s=19

I promised Tuesday and here it is. I updated the graphics and the official release on arXiv is delayed until Wednesday. I’ll update with the link once I have it.

spinkney_improved_bayesian_synthetic_control.pdf (429.3 KB)

Thanks, @spinkney ! :)

Official link: [2103.16244] An Improved and Extended Bayesian Synthetic Control

Github code coming soon

Nice paper. Do you have a GitHub repo with the code? I’m learning about synthetic control. My understanding was that synthetic control is nearly equivalent to simulating an intervention, as in the Statistical Rethinking course. I found your paper so cool.

Thanks for the kind words. Here’s the Stan and R code for the “smoking” data example that includes covariates.

Thanks for supplying the code. I had a go at running it, and it looks like it works. Along the way I got some warnings (similar to those below, for all 4 chains at initialisation):
Chain 1 Iteration: 1 / 1000 [ 0%] (Warmup)
Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Chain 1 Exception: normal_id_glm_lpdf: Scale vector is inf, but must be positive finite! (in '/tmp/RtmpbzKZ2k/model-54596f811ac9.stan', line 126, column 6 to column 89)
Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Chain 1
Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Chain 1 Exception: normal_id_glm_lpdf: Scale vector is inf, but must be positive finite! (in '/tmp/RtmpbzKZ2k/model-54596f811ac9.stan', line 126, column 6 to column 89)
Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Chain 1