Paper: Causal inference with panel data by Pang, Liu, and Xu

ignacio · August 14, 2020, 5:12pm

Hi everyone,

I just finished reading Pang, Xun and Liu, Licheng and Xu, Yiqing, Bayesian Causal Inference with Time-Series Cross-Sectional Data: A Dynamic Multilevel Latent Factor Model with Hierarchical Shrinkage (July 12, 2020), and I was wondering whether anyone here has implemented it in Stan. Any thoughts about the methodology?

Thanks,

Ignacio

Max_Mantei · August 24, 2020, 3:03pm

Hey @ignacio!

Thanks for linking the paper! I haven’t used factor models in Stan, but thought this paper would give me an excuse to look into it.

I played around with it a bit. Tbh, I thought the paper could have been a bit clearer on some of the things (naming conventions, simulation code…), but I guess they are still working on it.

I think the main point is that they just combine a multilevel regression with a factor structure. For model selection (and identification) they rely on the Bayesian Lasso, using the mixture of normals to re-parameterize the Laplace/DoubleExponentials.
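
For anyone who hasn't seen the mixture-of-normals trick, here is a small simulation sketch of the standard identity (my own illustration, not the paper's code, and the paper's exact parameterization may differ): if tau2 ~ Exponential(rate = lam^2 / 2) and beta | tau2 ~ Normal(0, sqrt(tau2)), then marginally beta ~ Laplace(0, 1/lam). The function name and `lam` are hypothetical.

```python
import math
import random

def laplace_via_normal_mixture(lam, n, seed=0):
    """Draw n values whose marginal distribution is Laplace(0, 1/lam).

    Each draw first samples a variance tau2 ~ Exponential(rate = lam^2 / 2),
    then samples beta ~ Normal(0, sqrt(tau2)).
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        tau2 = rng.expovariate(lam ** 2 / 2.0)  # expovariate takes a rate
        draws.append(rng.gauss(0.0, math.sqrt(tau2)))
    return draws

lam = 2.0  # assumed shrinkage parameter, chosen only for illustration
draws = laplace_via_normal_mixture(lam, n=200_000)

mean_abs = sum(abs(b) for b in draws) / len(draws)
variance = sum(b * b for b in draws) / len(draws)

# For Laplace(0, 1/lam): E|beta| = 1/lam and Var(beta) = 2/lam^2.
print("E|beta|  simulated vs. theory:", mean_abs, 1.0 / lam)
print("Var(beta) simulated vs. theory:", variance, 2.0 / lam ** 2)
```

The point of the reparameterization is that, conditional on tau2, everything is normal again, which is convenient for Gibbs steps.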

The way they conceptualize the varying intercepts/slopes (on the regressors) could have been better in my opinion. I guess it is conventional wisdom by now to use multivariate normal priors to model the correlations between intercepts and slopes (using Cholesky-decomposed correlation matrices for the non-centered parameterization). But they don’t do that.
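
To make the non-centered idea concrete, here is a minimal sketch in plain Python (an illustration of the general technique, not anything from the paper): independent standard normals `z` are turned into correlated intercept/slope pairs via `diag(tau) * L * z`, where `L` is the Cholesky factor of a 2x2 correlation matrix. All names and values are assumptions.

```python
import math
import random

def draw_unit_effects(rho, tau_intercept, tau_slope, n_units, seed=0):
    """Non-centered draws of correlated (intercept, slope) pairs.

    z holds independent standard normals; the effects are
    diag(tau) @ L @ z, where L is the Cholesky factor of the 2x2
    correlation matrix [[1, rho], [rho, 1]].
    """
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 correlation matrix, written out by hand.
    l11, l21, l22 = 1.0, rho, math.sqrt(1.0 - rho ** 2)
    effects = []
    for _ in range(n_units):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        intercept = tau_intercept * (l11 * z1)
        slope = tau_slope * (l21 * z1 + l22 * z2)
        effects.append((intercept, slope))
    return effects

def correlation(pairs):
    """Sample Pearson correlation of a list of (a, b) pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs) / n
    var_b = sum((b - mean_b) ** 2 for _, b in pairs) / n
    return cov / math.sqrt(var_a * var_b)

# Hypothetical values: correlation 0.6, scales 2.0 and 0.5.
effects = draw_unit_effects(rho=0.6, tau_intercept=2.0, tau_slope=0.5,
                            n_units=50_000)
est_rho = correlation(effects)
print("estimated correlation:", est_rho)
```

In a sampler, `z`, the scales, and the Cholesky factor would be the parameters, so the geometry the sampler sees is closer to independent standard normals.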

Regarding the factor part: What they do might work in a Gibbs sampling scheme, but you’d need to put a lot more effort into it with HMC (or NUTS). There is literature on identification restrictions, but they don’t really discuss that. The multi-modality issues would not work out fine in Stan…
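
A tiny numerical sketch of why identification matters (toy numbers of my own, not from the paper): rotating the loadings by any orthogonal matrix and counter-rotating the factors leaves the fitted values unchanged, so without restrictions the likelihood cannot tell these solutions apart.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

# Hypothetical loadings (3 units x 2 factors) and factors (2 factors x 4 periods).
loadings = [[1.0, 0.5], [0.2, -1.0], [0.7, 0.3]]
factors = [[0.1, -0.4, 0.9, 0.3], [1.2, 0.0, -0.5, 0.8]]

# Any orthogonal matrix works; here a rotation by an arbitrary angle.
theta = 0.7
rot = [[math.cos(theta), -math.sin(theta)],
       [math.sin(theta), math.cos(theta)]]

rotated_loadings = matmul(loadings, rot)
rotated_factors = matmul(transpose(rot), factors)

fit_original = matmul(loadings, factors)
fit_rotated = matmul(rotated_loadings, rotated_factors)

# Largest entrywise difference between the two sets of fitted values.
max_gap = max(abs(x - y)
              for row_x, row_y in zip(fit_original, fit_rotated)
              for x, y in zip(row_x, row_y))
print("max difference in fitted values:", max_gap)
```

Sign flips and column permutations are special cases of this, and those are the ones that typically show up as separate modes across chains.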

They rely on the Bayesian Lasso for “model search” (and partly identification…?). I could see that this makes sense if you’re interested in MAPs, but for full Bayes the “Bayesian Lasso” is not really doing selection, right? You could maybe think about using some Horseshoe variant, but I’m afraid this would still be hard to fit.

They are not using any MCMC diagnostics, which is kind of disappointing. It says they are building an R package with the sampler (built in C++), which I couldn’t find with a quick Google search. I think it would be much better if they had tried to implement their model in a more general framework (like Stan, but really anything more “open”).

What’s your take on this paper?

Cheers,
Max

ignacio · August 24, 2020, 4:01pm

Rereading this paper to better digest it is on my to-do list. I’m new to synthetic control and factor models, so I have a lot of reading to do. Are you aware of any implementation of synthetic control with factor models in Stan? Do you have any other readings to recommend?

spinkney · August 24, 2020, 6:00pm

My Stancon video has a latent factor model and synthetic control. Will be posting the paper to arxiv, hopefully, in September.

Yiqing_Xu · August 25, 2020, 9:08am

Dear Max,

Thanks for reading our paper carefully. I’m embarrassed by the many typos and its incompleteness. We were rushing out the paper for the workshop. We have just uploaded a new version. I hope it’s clearer now.

Some diagnostic plots are now included in the appendix, though none of the diagnostics will be conclusive. The open-source package will be available soon.

Best,
Yiqing

spinkney · August 25, 2020, 10:03am

Check out https://arxiv.org/abs/1910.06106 for a latent factor model for synthetic control. He implemented it in pymc3 (https://github.com/eliastuo/bayessynth) and it’s super slow. We reimplemented it in Stan, using the multi_normal Cholesky parameterization and standardizing things to mean 0 and standard deviation 1, and it runs much, much faster. I’ll see if I can post the code after getting the appropriate work authorization.

The “synthetic control” piece of the above model is just an additional parameter measuring the intervention. Setting that parameter to 0, as in do(X = 0) if you know Pearl’s do-calculus, gives the “synthetic” estimate under no intervention.
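
A toy illustration of that idea, with made-up numbers rather than anything from the paper or the Stan model: the same prediction function is evaluated once with the intervention parameter and once with it set to 0, and the difference is the estimated effect. The names `predict`, `loading`, and `effect` are hypothetical.

```python
def predict(loading, factors, treated, effect):
    """Mean outcome path: loading * factor_t + effect * treated_t."""
    return [loading * f + effect * d for f, d in zip(factors, treated)]

factors = [1.0, 1.2, 1.1, 1.5, 1.7]  # hypothetical latent factor path
treated = [0, 0, 0, 1, 1]            # intervention switches on in period 4
loading, effect = 2.0, -0.8          # stand-ins for one posterior draw

with_intervention = predict(loading, factors, treated, effect)
# Setting the intervention parameter to 0 gives the "synthetic" path.
synthetic = predict(loading, factors, treated, effect=0.0)

gap = [a - b for a, b in zip(with_intervention, synthetic)]
print("observed-scale prediction:", with_intervention)
print("synthetic (no intervention):", synthetic)
print("gap:", gap)
```

Repeating this for every posterior draw would give a posterior distribution for the gap in each post-intervention period.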

In the Stancon piece we don’t have any intervention we’re interested in estimating. Instead, we’re interested in filling in missing data. We combine the latent factor model, though without any intervention variable, with a quadratic solver. This is not necessarily a new idea. What is “new” are two pieces. The first is that it’s multivariate and makes extensive use of the estimated correlations across the nested time series. The second is that the latent factor model and quadratic solver are both contained within one Stan program. The weights are solved jointly with the correlations, and we get out an estimate of the variance of the weights. The fact that it’s in Stan makes it akin to a stochastic quadratic program.

Check out this fairly new working paper by Abadie for a great overview and survey of the existing literature on synthetic control https://economics.mit.edu/files/17847.

ignacio · August 31, 2020, 4:43pm

Thanks

Thanks a lot, @spinkney. I would love to see your Stan reimplementation of this method.

DanielWeitzenfeld · February 18, 2021, 3:04am

Me too - could you share it, @spinkney?

spinkney · February 18, 2021, 10:12am

Thanks for reminding me about this! I’ll cross post here and on the stan blog. Give me a week or two as work is really busy.

spinkney · March 28, 2021, 1:08pm

Coming Tuesday! https://twitter.com/thatpinkney/status/1376158302208425986?s=19

spinkney · March 30, 2021, 10:57am

I promised Tuesday and here it is. I updated the graphics and the official release on arxiv is delayed until Wednesday. I’ll update with the link once I have it.

spinkney_improved_bayesian_synthetic_control.pdf (429.3 KB)

Max_Mantei · March 30, 2021, 12:39pm

Thanks, @spinkney ! :)

spinkney · March 31, 2021, 12:45am

Official link: [2103.16244] An Improved and Extended Bayesian Synthetic Control

Github code coming soon

Jose_Luis_Canadas · May 2, 2023, 5:15pm

Nice paper. Do you have a GitHub repository with the code? I’m learning about synthetic control. My understanding was that synthetic control is nearly equivalent to simulating an intervention, as in the Statistical Rethinking course. I found your paper really cool.

spinkney · May 3, 2023, 4:06pm

Thanks for the kind words. Here’s the Stan and R code for the “smoking” data example that includes covariates.

MBN · July 18, 2023, 4:33am

Thanks for supplying the code. I had a go at running it, and it looks like it works. Along the way I got some warnings (similar to the ones below, for all 4 chains at initialisation):
Chain 1 Iteration: 1 / 1000 [ 0%] (Warmup)
Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Chain 1 Exception: normal_id_glm_lpdf: Scale vector is inf, but must be positive finite! (in ‘/tmp/RtmpbzKZ2k/model-54596f811ac9.stan’, line 126, column 6 to column 89)
Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Chain 1
Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Chain 1 Exception: normal_id_glm_lpdf: Scale vector is inf, but must be positive finite! (in ‘/tmp/RtmpbzKZ2k/model-54596f811ac9.stan’, line 126, column 6 to column 89)
Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
Chain 1
