Causal Inference

Kun Kuang , Lian Li , Zhi Geng , Lei Xu , Kun Zhang , Beishui Liao , Huaxin Huang , Peng Ding , Wang Miao , Zhichao Jiang

Engineering ›› 2020, Vol. 6 ›› Issue (3): 253-263. DOI: 10.1016/j.eng.2019.08.016

Review
Abstract

Causal inference is a powerful modeling tool for explanatory analysis, which might enable current machine learning to become explainable. How to marry causal inference with machine learning to develop eXplainable Artificial Intelligence (XAI) algorithms is one of the key steps toward artificial intelligence 2.0. With the aim of bringing knowledge of causal inference to scholars of machine learning and artificial intelligence, we invited researchers working on causal inference to write this survey from different aspects of causal inference. This survey includes the following sections: "Estimating average treatment effect: A brief review and beyond" from Dr. Kun Kuang, "Attribution problems in counterfactual inference" from Prof. Lian Li, "The Yule–Simpson paradox and the surrogate paradox" from Prof. Zhi Geng, "Causal potential theory" from Prof. Lei Xu, "Discovering causal information from observational data" from Prof. Kun Zhang, "Formal argumentation in causal reasoning and explanation" from Profs. Beishui Liao and Huaxin Huang, "Causal inference with complex experiments" from Prof. Peng Ding, "Instrumental variables and negative controls for observational studies" from Prof. Wang Miao, and "Causal inference with interference" from Dr. Zhichao Jiang.

Keywords

Causal inference / Instrumental variables / Negative control / Causal reasoning and explanation / Causal discovery / Counterfactual inference / Treatment effect estimation

Cite this article

Kun Kuang, Lian Li, Zhi Geng, Lei Xu, Kun Zhang, Beishui Liao, Huaxin Huang, Peng Ding, Wang Miao, Zhichao Jiang. Causal Inference. Engineering, 2020, 6(3): 253-263. DOI: 10.1016/j.eng.2019.08.016


1. Estimating average treatment effect: A brief review and beyond

Machine learning methods have demonstrated great success in many fields, but most lack interpretability. Causal inference is a powerful modeling tool for explanatory analysis, which might enable current machine learning to make explainable predictions. In this article, we review two classical estimators of causal effects and discuss the remaining challenges in practice. Moreover, we present a possible way to develop explainable artificial intelligence (XAI) algorithms by marrying causal inference with machine learning.

1.1. The setup

We are interested in estimating the causal effect of a binary treatment variable based on the potential outcome framework [1]. For each unit indexed by i = 1, 2, ..., n (n denotes the sample size), we observe a treatment T_i, an outcome Y_i, and a vector of observed variables X_i ∈ R^p, where p refers to the dimension of the observed variables. The pair of potential outcomes for each unit i is (Y_i(1), Y_i(0)), corresponding to its treatment assignment T_i = 1 (treated) or T_i = 0 (control). The observed outcome Y_i is

Y_i = T_i · Y_i(1) + (1 − T_i) · Y_i(0)

Then, the average treatment effect (ATE) is defined as follows:

ATE = E[Y(1)] − E[Y(0)]

where E[·] denotes the expectation operator, and the average treatment effect for the treated (ATT) is defined as ATT = E[Y(1) − Y(0) | T = 1].

To identify the ATE, we assume unconfoundedness, that is, that T is independent of (Y(1), Y(0)) given X, and we assume overlap of the covariate distribution, that is, 0 < P(T = 1 | X) < 1.
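As a toy illustration of this setup (a simulation under our own assumed data-generating process, not data from the article), the naive difference in group means is biased for the ATE whenever a confounder X drives both T and Y:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Assumed data-generating process: a single confounder X affects both
# the treatment probability and the outcome; the true ATE is 2 by design.
X = rng.normal(size=n)
p = 1 / (1 + np.exp(-X))              # P(T = 1 | X); overlap holds: 0 < p < 1
T = rng.binomial(1, p)
Y0 = X + rng.normal(size=n)           # potential outcome under control
Y1 = Y0 + 2.0                         # potential outcome under treatment
Y = T * Y1 + (1 - T) * Y0             # observed outcome

true_ate = np.mean(Y1 - Y0)                    # exactly 2.0 by construction
naive = Y[T == 1].mean() - Y[T == 0].mean()    # biased upward by confounding
print(true_ate, naive)
```

Because units with larger X are both more likely to be treated and more likely to have larger outcomes, the naive estimate overshoots the true effect; this is the confounding that the assumptions above are designed to address.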

1.2. Two estimators

Here, we briefly introduce two of the most promising estimators for treatment effect estimation and discuss them for the case with many observed variables.

1.2.1. Inverse propensity weighting

In fully randomized experiments, the treatment is randomly assigned to units, implying that T is independent of X. In observational studies, however, the treatment T_i is assigned based on X_i. To remove the confounding effect of X, the propensity score, denoted as e(X) = P(T = 1 | X), was proposed to reweight each unit i. Then, the ATE can be estimated by the following:

ATE_IPW = (1/n) Σ_{i=1}^{n} [ T_i Y_i / e(X_i) − (1 − T_i) Y_i / (1 − e(X_i)) ]

By combining propensity weighting and regression, it is also possible to estimate the treatment effect with a doubly robust method [2]. In high-dimensional settings, not all observed variables are confounders. To address this issue, Kuang et al. [3] suggest separating all observed variables into two parts: the confounders for propensity score estimation, and the adjustment variables for reducing the variance of the estimated causal effect.
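A minimal sketch of the IPW estimator on simulated data with a known propensity score (the data-generating process below is our assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Simulated observational data: confounder X, true ATE = 2.
X = rng.normal(size=n)
e = 1 / (1 + np.exp(-X))              # propensity score e(X) = P(T = 1 | X)
T = rng.binomial(1, e)
Y = 2.0 * T + X + rng.normal(size=n)

# Inverse propensity weighting: treated units weighted by 1/e(X),
# control units by 1/(1 - e(X)), which removes the confounding of X.
ate_ipw = np.mean(T * Y / e - (1 - T) * Y / (1 - e))
naive = Y[T == 1].mean() - Y[T == 0].mean()
print(naive, ate_ipw)  # naive is biased; IPW is close to the true ATE of 2
```

In practice e(X) is unknown and must itself be estimated, for example by logistic regression, which is where the high-dimensional issues discussed above arise.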

1.2.2. Confounder balancing

The other promising way to remove the confounding effect is to balance the distribution of confounders between the treated and control groups by sample reweighting with sample weights W, and to estimate the treatment effect as follows:

ATE_CB = (1/n_t) Σ_{i: T_i = 1} Y_i − Σ_{i: T_i = 0} W_i Y_i

where n_t denotes the number of treated units, and the sample weights W can be learned by confounder balancing [4], for example by matching the moments of the confounders:

W = argmin_W || (1/n_t) Σ_{i: T_i = 1} X_i − Σ_{i: T_i = 0} W_i X_i ||², subject to W_i ≥ 0 and Σ_{i: T_i = 0} W_i = 1

In high-dimensional settings, different confounders can contribute differently to the confounding bias. Thus, Kuang et al. [5] suggest jointly learning confounder weights for confounder differentiation and sample weights for confounder balancing, while simultaneously estimating the treatment effect, with a Differentiated Confounder Balancing (DCB) algorithm.
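A minimal moment-matching sketch of confounder balancing on simulated data. The entropy-balancing-style weight parameterization and plain gradient descent below are our illustrative choices; this is not the DCB algorithm of Ref. [5]:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Simulated data (illustrative assumptions): two confounders, true effect = 2.
X = rng.normal(size=(n, 2))
logit = X @ np.array([1.0, -1.0])
T = rng.binomial(1, 1 / (1 + np.exp(-logit)))
Y = 2.0 * T + X @ np.array([1.0, 1.0]) + rng.normal(size=n)

Xt, Xc, Yc = X[T == 1], X[T == 0], Y[T == 0]
target = Xt.mean(axis=0)              # treated covariate means to be matched

# Weights W_i ∝ exp(Xc_i · lam); lam is tuned by gradient descent until the
# weighted control covariate means match the treated means (moment balance).
lam = np.zeros(2)
for _ in range(500):
    w = np.exp(Xc @ lam)
    w /= w.sum()
    imbalance = (w[:, None] * Xc).sum(axis=0) - target
    lam -= 0.5 * imbalance

w = np.exp(Xc @ lam)
w /= w.sum()
effect = Y[T == 1].mean() - np.sum(w * Yc)
print(effect)  # close to the true effect of 2 after balancing
```

After convergence, the reweighted control group has the same covariate means as the treated group, so the confounding bias of the simple mean comparison is removed.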

1.3. Remaining challenges

More promising methods are now available for estimating treatment effects in observational studies, but many challenges remain in making these methods useful in practice. Here are some of the remaining challenges:

1.3.1. From binary to continuous

The leading estimators are designed for estimating the treatment effect of a binary variable and achieve good performance in practice. In many real applications, however, we care not only about the causal effect of a treatment, but also about dose-response functions, where the treatment dose may take on a continuum of values.

1.3.2. Interaction of treatments

In practice, the treatment can consist of multiple variables and their interactions. In social marketing, the combined causal effects of different advertising strategies may be of interest. More work is needed on the causal analyses of treatment combination.

1.3.3. Unobserved confounders

The existence of unobserved confounders is equivalent to violation of the unconfoundedness assumption and is not testable. Controlling high-dimensional variables may make unconfoundedness more plausible but poses new challenges to propensity score estimation and confounder balancing.

1.3.4. Limits on overlap

Although the overlap assumption is testable, it raises several issues in practice, including how to detect a lack of overlap in the covariate distributions, and how to deal with such a lack, especially in high-dimensional settings. Moreover, estimating the treatment effect is only possible for the region of overlap.

Recently, related works have been proposed to address the above challenges, including continuous treatments [6], the interaction of treatments [7], unobserved confounders [8], and limits on overlap [9,10].

1.4. Toward causal and stable prediction

The lack of interpretability of most predictive algorithms makes them less attractive in many real applications, especially those requiring decision-making. Moreover, most current machine learning algorithms are correlation based, leading to instability of their performance across testing data, whose distribution might be different from that of the training data. Therefore, it can be useful to develop predictive algorithms that are interpretable for users and stable to the distribution shift from unknown testing data.

By assuming that the causal knowledge is invariant across datasets, a reasonable way to solve this problem is to explore causal knowledge for causal and stable prediction. Inspired by the confounder-balancing techniques from the literature of causal inference, Kuang et al. [11] propose a possible solution for causal and stable prediction. They propose a global variable balancing regularizer to isolate the effect of each individual variable, thus recovering the causation between each variable and response variable for stable prediction across unknown datasets.

Overall, how to deeply marry causal inference with machine learning to develop XAI algorithms is one of the key steps toward artificial intelligence (AI) 2.0 [12,13]; many open issues, challenges, and opportunities remain.

2. Attribution problems in counterfactual inference

In this section, the input variable X and the outcome variable Y are both binary.

Counterfactual inference is an important part of causal inference. Briefly speaking, counterfactual inference determines the probability that the event y would not have occurred (y = 0) had the event x not occurred (x = 0), given the fact that event x did occur (x = 1) and event y did happen (y = 1), which can be represented as the following equation:

P(y_{x=0} = 0 | x = 1, y = 1)    (6)

where y_{x=0} is a counterfactual notion, denoting the value of y under the setting x = 0 with the effects of all other variables kept unchanged; it is therefore different from the conditional probability P(y = 0 | x = 0). This formula reflects the probability that event y will not occur if event x does not occur; that is, it reflects the necessity of the causality between x and y. In social science or logical science, this is called the attribution problem. It is also known as the "but-for" criterion in jurisprudence. The attribution problem has a long history of study; however, previous methods for addressing it have mostly been case studies, statistical analyses, experimental designs, and so forth; one example is the influential INUS theory put forward by the Australian philosopher Mackie in the 1960s [14]. These methods are basically qualitative, relying on experience and intuition. With the emergence of big data, however, data-driven quantitative study of the attribution problem has developed, making the inference process more scientific and reasonable.

Attribution has a twin problem, which is to determine the probability that the event y would have occurred (y = 1) had the event x occurred (x = 1), given that event x did not occur (x = 0) and event y did not happen (y = 0). Eq. (7) represents this probability:

P(y_{x=1} = 1 | x = 0, y = 0)    (7)

This equation reflects the probability that event x causes event y; that is, it reflects the sufficiency of the causality between x and y.

Counterfactual inference corresponds to human introspection, which is a key feature of human intelligence. Inference allows people to predict the outcome of performing a certain action, while introspection allows people to rethink how they could have improved the outcome, given the known effect of the action. Although introspection cannot change the existing de facto situation, it can be used to correct future actions. Introspection thus provides a mathematical model that uses past knowledge to guide future action. Unless it possesses the ability of introspection, intelligence cannot be called true intelligence.

Introspection is also important in daily life. For example, suppose Ms. Jones and Mrs. Smith both had cancer surgery, and Ms. Jones also had irradiation. Eventually, both recovered. Then Ms. Jones rethought whether she would have recovered had she not undergone irradiation. Obviously, we cannot infer that Ms. Jones would have recovered without irradiation merely from the fact that Mrs. Smith recovered without irradiation.

There is an enormous number of problems of this kind in medical disputes, court trials, and so forth. What we are concerned with is the real causality, once a fact has occurred for a specific individual case. In these situations, general statistical data—such as the recovery rate with irradiation—cannot provide the explanation. Calculating the necessity of causality by means of introspection and attribution inference plays a key role in these areas [14].

As yet, no general calculation method exists for Eq. (6). In cases that involve solving a practical problem, researchers introduce a monotonicity assumption that is satisfied in most cases; that is:

y_{x=1} ≥ y_{x=0}

The intuition of monotonicity is that the effect y of taking an action (x = 1) will not be worse than that of not taking the action (x = 0). For example, in epidemiology, monotonicity fails for a person who would, contrary to expectation, become infected (y_{x=1} = 0, taking y = 1 to denote staying uninfected) after being quarantined (x = 1), but would remain uninfected (y_{x=0} = 1) without quarantine (x = 0). Under monotonicity, Eq. (6) can be rewritten as follows:

P(y_{x=0} = 0 | x = 1, y = 1) = [P(y = 1 | x = 1) − P(y = 1 | x = 0)] / P(y = 1 | x = 1) + [P(y = 1 | x = 0) − P(y_{x=0} = 1)] / P(x = 1, y = 1)    (8)

Eq. (8) has two terms. The first term is named the attributable risk fraction, or the excess risk ratio, and is well known in risk statistics. This term reflects the difference in risk conditioning on x = 1 and x = 0. The second term is the confounding factor, which deserves particular notice. This term reflects the effect confounded by other variables. In a natural environment, a change in y could arise in two different ways: First, it could be directly caused by a change in x; or, second, it could be caused by other variables. This phenomenon is called confounding. The difference P(y = 1 | x = 0) − P(y_{x=0} = 1) denotes the degree of confounding. In some situations, the change in x is accompanied by a change in y, yet x is not the reason for the change in y (e.g., the sun rises after the cock crows). It is possible to exclude confounding by means of scientific experiments to determine the true cause of the change in y. However, scientific experiments can hardly be conducted for many social science problems, or even for some natural science problems. In such cases, only observational data can be obtained. Thus, the question of how to recognize confounding from observational data in order to determine the true causality is a fundamental problem in artificial intelligence.

In order to explain the relationship between the attributable risk fraction and the confounding factor, and their roles in the attribution problem (i.e., the necessity of causality) more specifically, we applied the example in Ref. [15]. In this example, Mr. A goes to buy a drug to relieve his pain and dies after taking the drug. The plaintiff files a lawsuit to ask the manufacturer to take responsibility. The manufacturer and plaintiff provide the drug test results (i.e., experimental data) and survey results (i.e., nonexperimental data), respectively. The data is illustrated in Table 1, where x = 1 denotes taking drugs, while y = 1 denotes death.

Table 1 Experimental and non-experimental data for the example of a drug lawsuit.

The manufacturer’s data comes from strict drug safety experiments, while the plaintiff’s data comes from surveys among patients taking the drug of their own volition. The manufacturer claims that the drug was approved based on the drug distribution regulations. Although it causes a minor increase in the death rate (from 0.014 to 0.016), this increase is acceptable compared with the analgesic effect. Based on the traditional calculation of the attributable risk fraction (excess risk ratio), the responsibility taken by the manufacturer is

(0.016 − 0.014) / 0.016 = 12.5%

The plaintiff argues that the drug test was conducted under experimental protocols, the subjects were chosen randomly, and the subjects did not take the drug of their own volition. Therefore, there is bias in the experiment: the experimental setting differs from the actual situation, and there is a huge difference between observational data and experimental data. Given the fact of the death of Mr. A, the calculation of the manufacturer’s responsibility should obey the counterfactual equation, Eq. (8). The result is

P(y_{x=0} = 0 | x = 1, y = 1) = 1

Therefore, the manufacturer should take full responsibility for the death of Mr. A.

A quick look shows that, based on the survey data, the death rates of those taking and not taking the drug are 0.2% and 2.8%, respectively, which is in favor of the manufacturer. After careful analysis, however, the confounding factor is P(y = 1 | x = 0) − P(y_{x=0} = 1) = 0.028 − 0.014 = 0.014; that is, half of the subjects who died without taking the drug died of causes other than not taking the drug. This part should not be attributed to the drug, so the manufacturer’s responsibility increases. Of course, there is some doubt regarding whether the manufacturer should take full responsibility, as well as regarding the rationality and scientific rigor of the calculation [16]. Nevertheless, this example demonstrates that confounding factors can disturb the discovery of true causality. The question of how to determine confounding factors is a practical problem in causal inference, and is naturally also important in counterfactual inference.
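Under the decomposition of Eq. (8), the plaintiff's result can be reproduced with a few lines of arithmetic. The rates below are those quoted in the text; additionally, we assume the survey sampled drug-takers and non-takers in equal proportion, so that P(x = 1, y = 1) = 0.5 × 0.002 = 0.001 (this sampling proportion is our assumption, not stated in the text):

```python
# Probability of necessity via the decomposition of Eq. (8): PN = ERR + CF.
p_y_x1 = 0.002          # survey death rate among those who took the drug
p_y_x0 = 0.028          # survey death rate among those who did not
p_y_do_x0 = 0.014       # experimental death rate without the drug: P(y_{x=0} = 1)
p_x1_y1 = 0.5 * p_y_x1  # P(x = 1, y = 1), assuming P(x = 1) = 0.5 in the survey

err = (p_y_x1 - p_y_x0) / p_y_x1      # excess risk ratio: -13
cf = (p_y_x0 - p_y_do_x0) / p_x1_y1   # confounding factor: 14
pn = err + cf                         # PN = -13 + 14 = 1
print(err, cf, pn)
```

The large negative excess risk ratio (which superficially favors the manufacturer) is exactly offset, and overturned, by the confounding factor.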

In data science, there are simulated data and objective data, with the latter comprising experimental data and observational data. Although observational data are objective, easily available, and low in cost, their confounding problems become an obstacle for causal inference [17]. In particular, there may be unknown variables (i.e., hidden variables) in the objective world. These variables are not observed, but may have effects on known variables—that is, the known variables may be subject to unmeasured confounding due to unknown variables. In this respect, current studies on confounding are still in their infancy. Readers can refer to Ref. [18] for more detail.

3. The Yule–Simpson paradox and the surrogate paradox

An association measurement between two variables may be dramatically changed from positive to negative by omitting a third variable; this is called the Yule–Simpson paradox [19,20]. The third variable is called a confounder. A numerical example is shown in Table 2. The risk difference (RD) is the difference between the proportion of lung cancer in the smoking group and that in the no-smoking group: RD = (80/200) − (100/200) = −0.10, which is negative. If the 400 persons listed in Table 2 are split into males and females, however, a dramatic change can be seen (Table 3). The RDs for both males and females are positive, at 0.10. That is, smoking appears bad for males and for females separately, yet good for all of these persons taken together.

Table 2 Smoking and lung cancer.

Table 3 Smoking and lung cancer with populations stratified by gender.
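The arithmetic of the paradox is easy to verify. Since the cells of Tables 2 and 3 are not reproduced in this excerpt, the split below is one hypothetical set of counts consistent with the figures quoted in the text (80/200 vs. 100/200 overall; within-gender RD = 0.10):

```python
# Hypothetical cell counts consistent with the text: (cancer cases, group size).
male = {"smoke": (45, 150), "no_smoke": (10, 50)}
female = {"smoke": (35, 50), "no_smoke": (90, 150)}

def rd(group):
    """Risk difference: cancer rate among smokers minus among non-smokers."""
    (c1, n1), (c0, n0) = group["smoke"], group["no_smoke"]
    return c1 / n1 - c0 / n0

overall = {"smoke": (45 + 35, 200), "no_smoke": (10 + 90, 200)}
print(rd(male), rd(female), rd(overall))  # within gender ≈ +0.10; overall ≈ -0.10
```

The sign flip arises because smokers are concentrated in the low-risk gender group in this split: gender confounds the smoking-cancer association.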

The main difference between causal inference and other forms of statistical inference is whether the confounding bias induced by the confounder is considered. For experimental studies, it is possible to determine which variables affect the treatment or exposure; this is particularly true for a randomized experiment, in which the treatment or exposure is randomly assigned to individuals, as there is no confounder affecting the treatment. Thus, randomized experiments are the gold standard for causal inference. For observational studies, it is key to observe a sufficient set of confounders or an instrumental variable that is independent of all confounders. However, neither a sufficient confounder set nor an instrumental variable can be verified by observational data without manipulations.

In scientific studies, a surrogate variable (e.g., a biomarker) is often measured instead of an endpoint that is infeasible to measure; the causal effect of a treatment on the unmeasured endpoint is then predicted by the treatment's effect on the surrogate. The surrogate paradox means that the treatment has a positive effect on the surrogate, and the surrogate has a positive effect on the endpoint, but the treatment may nevertheless have a negative effect on the endpoint [21]. Numerical examples are given in Refs. [21,22]. This paradox also calls into question whether scientific knowledge is useful for policy analysis [23]. As a real example, doctors know that an irregular heartbeat is a risk factor for sudden death; several therapies can correct irregular heartbeats, yet they increase mortality [24].

The Yule–Simpson paradox and the surrogate paradox warn that a conclusion obtained from data can be inverted due to unobserved confounders, and they emphasize the importance of using appropriate approaches to obtain data. To avoid the Yule–Simpson paradox, first, randomization is the gold-standard approach for causal inference. Second, if randomization is prohibited, an experimental approach to obtaining data is preferred, as such an approach attempts to balance all possible unobserved confounders between the two groups to be compared. Third, an encouragement-based experimental approach—in which benefits are randomly assigned to a portion of the involved persons, such that the assignment can change the probability of their exposure—can be used to design an instrumental variable. Finally, for a purely observational approach, it is necessary to verify the assumptions required for causal inference using field knowledge, and to further execute a sensitivity analysis for violations of these assumptions.

The two paradoxes also point out that a syllogism and transitive reasoning may not be applicable to statistical results. Statistically speaking, smoking may be bad for males and bad for females, and the studied population consists of these males and females; yet the statistics may indicate that smoking is good for the population as a whole. Statistics may show that a new drug can correct irregular heartbeats, and it is known that a regular heartbeat promotes survival, both statistically speaking and for individuals; however, the new drug may still shorten the survival time of these persons in terms of statistics.

4. Causal potential theory

Extensive efforts have been made to detect causal direction, evaluate causal strength, and discover causal structure from observations. Examples include not only the studies based on conditional independence and directed acyclic graphs (DAGs) by Pearl, Spirtes, and many others, but also those on the Rubin causal model (RCM), structural equation model (SEM), functional causal model (FCM), additive noise model (ANM), linear non-Gaussian acyclic model (LiNGAM), post-nonlinear (PNL) model, and causal generative neural networks (CGNNs), as well as the studies that discovered star structure [25] and identified the so-called -diagram [26]. To some extent, these efforts share a similar direction of thinking. First, one presumes a causal structure (e.g., merely one direction in the simplest case, or a DAG in a sophisticated situation) for a multivariate distribution, either modeled in parametric form or partly inspected via statistics, subject to certain constraints. Second, one uses observational data to learn the parametric model or estimate the statistics, and then examines whether the model fits the observations and whether the constraints are satisfied; on this basis, one verifies whether the presumed causal structure describes the observations well. Typically, a set of causal structures is presumed as candidates, among which the best is selected.

Causal potential theory (CPT) was recently proposed as a very different way of thinking [27]. In analogy to physics, causality is here regarded as an intrinsic kinetic nature driven by a causal potential energy. Without loss of generality, CPT is introduced by starting with the consideration of a cause-effect relation between a pair of variables x, y in an environment U. Instead of presuming a causal structure (i.e., a specific direction), one estimates a nonparametric distribution p(x, y) from samples of x, y, and obtains the corresponding causal potential energy E(x, y) = −ln p(x, y) in an analogy based on the Gibbs distribution. In such a perspective of causal dynamics, an event occurring at x, y is associated with E(x, y), which yields a force −∇E(x, y) that causes subsequent events, driving the information flow or causal process toward an area with the lowest energy or, equivalently, toward an area in which events have high chances to occur, using the notations g_x = −∂E/∂x and g_y = −∂E/∂y. That is, CPT regards causality as an intrinsic nature of these dynamics and discovers causality by analyzing g_x and g_y.

In this section, we reuse x and y to denote a pair of variables whose relationship might be one of cause and effect.

Table 4 shows two roads for analyzing CPT causality. Road A proceeds by testing a "Yes" or "No" answer on the mutual independence between g_y and y, and on that between g_x and x, resulting in four types of Y-N combinations. The first two types indicate two types of causality. The third type, Y-Y, indicates independence between x and y—that is, that there is no relation between them. The last type, N-N, indicates "unclear?"—that is, further study is needed to determine whether a causal relation still occurs locally, or even reciprocally, in some regions of x, y, although no causal relation is detected globally between x and y. Road A needs an independence test. In contrast, Road B turns the problem into supervised learning, with x, y as inputs to a neural net to fit the two gradient components g_x, g_y, each of which is fit by a different neural net, with one or both of x, y as inputs, respectively. An appropriate one is chosen according to not only fit, but also simplicity. Table 4 lists four types of outcomes based on this method [27].

Table 4 Two roads for analyzing CPT causality.

It is possible to seek a certain estimator to obtain g_x, g_y directly from samples (x_t, y_t), where t = 1, ..., N and N refers to the sample size. It is also possible to obtain them indirectly, by estimating p(x, y) first; that is, by performing a kernel estimate p̂(x, y) = (1/N) Σ_t G(x − x_t, σ²) G(y − y_t, σ²), where G(u − m, σ²) is a Gaussian of mean m and variance σ². Alternatively, it is possible to obtain p(x, y) from one presumed causal structure, and to perform CPT analyses on this.
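As a concrete illustration of the quantities involved (our sketch, not the authors' implementation), one can form a Gaussian kernel estimate of p(x, y), take the causal potential E = −ln p̂, and evaluate the force −∇E numerically; the simulated data, bandwidth, and evaluation point below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2000

# Simulated cause-effect pair (x -> y), purely for illustration.
x = rng.normal(size=N)
y = x**2 + 0.3 * rng.normal(size=N)
pts = np.stack([x, y], axis=1)
h = 0.3  # kernel bandwidth (an arbitrary assumption)

def p_hat(q):
    """Gaussian kernel estimate of the density p(x, y) at point q."""
    d2 = np.sum((pts - q) ** 2, axis=1)
    return np.mean(np.exp(-d2 / (2 * h * h))) / (2 * np.pi * h * h)

def force(q, eps=1e-3):
    """Force -grad E at q, where E(x, y) = -ln p_hat(x, y); central differences."""
    g = np.zeros(2)
    for k in range(2):
        dq = np.zeros(2)
        dq[k] = eps
        # -dE/dq_k equals d(ln p_hat)/dq_k
        g[k] = (np.log(p_hat(q + dq)) - np.log(p_hat(q - dq))) / (2 * eps)
    return g

q = np.array([0.5, 0.25])
print(p_hat(q), force(q))  # density and the force -grad E at one point
```

The force points toward regions of higher estimated density, matching the description of the dynamics driving the causal process toward low-energy areas.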

Experiments on the CauseEffectPairs (CEP) benchmark have demonstrated that a preliminary and simple implementation of CPT achieves performance comparable with that of state-of-the-art methods.

A further development is to explore the estimation of causal structure over multiple-variable distributions and multiple variables, possibly along two directions. One is simply integrating the methods in Table 4 into the famous Peter–Clark (PC) algorithm [28], especially for edges that are difficult to identify by independence and conditional-independence tests. The other is turning the conditions that g_x is uncorrelated with (or independent of) x and that g_y is uncorrelated with (or independent of) y into multivariate polynomial equations, and adding these equations to the -diagram equations in Ref. [26] (e.g., Eq. (29) and Eq. (33)) to get an augmented group of polynomial equations. Then, the well-known Wen-Tsun Wu method may be adopted to check whether the equations have a unique or a finite number of solutions.

5. Discovering causal information from observational data

Causality is a fundamental notion in science, and plays an important role in explanation, prediction, decision-making, and control[28,29]. There are two essential problems to address in modern causality research. One essential problem is the identification of causal effects, that is, identifying the effects of interventions, given the partially or completely known causal structure and some observed data; this is typically known as “causal inference.” For advances in this research direction, readers are referred to Ref. [29] and the references therein. In causal inference, causal structure is assumed to be given in advance—but how can we find causal structure if it is not given? A traditional way to discover causal relations resorts to interventions or randomized experiments, which are too expensive or time-consuming in many cases, or may even be impossible from a practical standpoint. Therefore, the other essential causality problem, which is how to reveal causal information by analyzing purely observational data, has drawn a great deal of attention [28].

In the last three decades, there has been a rapid spread of interest in principled methods for causal discovery, driven in part by technological developments: the ability to collect and store big data with huge numbers of variables and large sample sizes, and increases in the speed of computers. In domains containing measurements such as satellite images of weather, functional magnetic resonance imaging (fMRI) for brain imaging, gene-expression data, or single-nucleotide polymorphism (SNP) data, the number of variables can range into the millions, and there is often very limited background knowledge to reduce the space of alternative causal hypotheses. Causal discovery without the aid of automated search then appears hopeless. At the same time, the availability of faster computers with larger memories and disk space allows practical implementations of computationally intensive automated algorithms that can handle large-scale problems.

It is well known in statistics that "causation implies correlation, but correlation does not imply causation." Perhaps it is fairer to say that correlation does not directly imply causation; in fact, it has become clear that under suitable sets of assumptions, the causal structure (often represented by a directed graph) underlying a set of random variables can be recovered from the variables' observed data, at least to some extent. Since the 1990s, conditional independence relationships in the data have been used for the purpose of estimating the underlying causal structure. Typical (conditional-independence) constraint-based methods include the PC algorithm and fast causal inference (FCI) [28]. Under the assumption that there is no confounder (i.e., no unobserved direct common cause of two measured variables), the result of PC is asymptotically correct. FCI gives asymptotically correct results even when there are confounders. These methods are widely applicable because they can handle various types of causal relations and data distributions, given reliable conditional-independence testing methods. However, they may not provide all the desired causal information, because they output (independence) equivalence classes—that is, sets of causal structures with the same conditional independence relations. The PC and FCI algorithms output graphical representations of the equivalence classes. In cases without confounders, there also exist score-based algorithms that estimate causal structure by optimizing a properly defined score function. Among them, the greedy equivalence search (GES) is a widely used two-phase procedure that directly searches over the space of equivalence classes.
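The conditional-independence tests on which constraint-based methods rely can be illustrated with partial correlation, a standard test statistic for roughly Gaussian data. A minimal sketch on a simulated chain X → Z → Y, where an algorithm such as PC would remove the X-Y edge because X and Y are independent given Z:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated causal chain X -> Z -> Y.
X = rng.normal(size=n)
Z = X + rng.normal(size=n)
Y = Z + rng.normal(size=n)

def partial_corr(a, b, c):
    """Correlation of a and b after linearly regressing out c:
    the core statistic of a Gaussian conditional-independence test."""
    ra = a - np.polyval(np.polyfit(c, a, 1), c)
    rb = b - np.polyval(np.polyfit(c, b, 1), c)
    return np.corrcoef(ra, rb)[0, 1]

marginal = np.corrcoef(X, Y)[0, 1]   # clearly nonzero: X and Y are dependent
conditional = partial_corr(X, Y, Z)  # near zero: X is independent of Y given Z
print(marginal, conditional)
```

In a full implementation, the partial correlation is converted to a p-value (e.g., via the Fisher z-transform) and the test is applied over growing conditioning sets.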

In the past 13 years, it has been further shown that algorithms based on properly constrained FCMs are able to distinguish between different causal structures in the same equivalence class, thanks to additional assumptions on the causal mechanism. An FCM represents the outcome or effect variable Y as a function of its direct causes X and some noise term E, that is, Y = f(X, E), where E is independent of X. It has been shown that, without constraints on the function f, for any two variables, either of them can always be expressed as a function of the other together with an independent noise term [30]. However, if the functional classes are properly constrained, it is possible to identify the causal direction between X and Y, because in the wrong direction the estimated noise and the hypothetical cause cannot be independent (although they are independent in the right direction). Such FCMs include the LiNGAM [31], where causal relations are linear and noise terms are assumed to be non-Gaussian; the post-nonlinear (PNL) causal model [32], which considers nonlinear effects of causes and possible nonlinear sensor/measurement distortion in the data; and the nonlinear ANM [33,34], in which causes have nonlinear effects and noise is additive. For a review of these models and the corresponding causal discovery methods, readers are referred to Ref. [30].
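The identification logic of the nonlinear ANM can be sketched as follows: fit each direction and check whether the residual depends on the hypothetical cause. For brevity, the sketch below replaces a proper independence test (e.g., HSIC) with a crude correlation between squared residuals and the squared regressor, on simulated data; it is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3000

# Simulated ANM with ground truth X -> Y and additive uniform noise.
X = rng.uniform(-2, 2, size=n)
Y = X**3 + rng.uniform(-1, 1, size=n)

def dependence_score(a, b, deg=5):
    """Fit b ~ poly(a) and return |corr(a^2, residual^2)|: a crude stand-in
    for an independence test between regressor and residual."""
    resid = b - np.polyval(np.polyfit(a, b, deg), a)
    return abs(np.corrcoef(a**2, resid**2)[0, 1])

forward = dependence_score(X, Y)   # X -> Y: residual is roughly the noise, independent of X
backward = dependence_score(Y, X)  # Y -> X: residual spread varies with Y
print(forward, backward)  # the smaller score points to the causal direction
```

In the causal direction the residual recovers the independent noise, while in the anticausal direction the residuals are heteroskedastic in the regressor, which is what the score detects.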

Causal discovery exploits observational data. The data are produced not only by the underlying causal process, but also by the sampling process. In practice, for reliable causal discovery, it is necessary to consider specific challenges posed in the causal and sampling processes, depending on the application domain. For example, for multivariate time series data such as mRNA expression series in genomics and blood-oxygenation-level-dependent (BOLD) time series in neuropsychology, finding the causal dynamics generating such data is challenging for many reasons, including nonlinear causal interactions, a much lower data-acquisition rate compared with the underlying rates of change, feedback loops in the causal model, the existence of measurement error, nonstationarity of the process, and possible unmeasured confounding causes. In clinical studies, there is often a large amount of missing data. Data collected on the Internet or in hospitals often suffer from selection bias. Some datasets involve mixed categorical and continuous variables, which may pose difficulties in conditional independence tests and in the specification of appropriate forms of the FCM. Many of these issues have recently been considered, and corresponding methods have been proposed to address them.

Causal discovery has benefited a great deal from advances in machine learning, which provide an essential tool to extract information from data. On the other hand, causal information describes properties of the data-generating process that impose a set of constraints on the data distribution, and it can facilitate understanding and help solve a number of learning problems involving distribution shift or concerning the relationship between different factors of the joint distribution. In particular, for learning under data heterogeneity, it is naturally helpful to learn and model the properties of that heterogeneity, a task that benefits from causal modeling. Such learning problems include domain adaptation (or transfer learning) [35], semi-supervised learning, and learning with positive and unlabeled examples. Leveraging causal modeling for recommender systems and reinforcement learning has become an active research field in recent years.

6. Formal argumentation in causal reasoning and explanation

In this section, we sketch why and how formal argumentation can play an important role in causal reasoning and explanation. Reasoning in argumentation is realized by constructing, comparing, and evaluating arguments [36]. An argument commonly consists of a claim that may be supported by premises, which can be observations, assumptions, or intermediate conclusions of other arguments. The claim, the premises, and the inference relation between them may all be the subject of rebuttals or counterarguments [37]. An argument can be accepted only when it survives all attacks. In AI, formal argumentation is a general formalism for modeling defeasible reasoning. It provides a natural way of justifying and explaining causation, and is complementary to machine learning approaches for learning, reasoning about, and explaining cause-and-effect relations.

6.1. Nonmonotonicity and defeasibility

Causal reasoning is the process of identifying causality, that is, the relationship between a cause and its effect, which is often defeasible and nonmonotonic. On the one hand, causal rules are typically defeasible. A causal rule may be represented in the form "c causes e," where e is some effect and c is a possible cause. The causal connective is not a material implication, but a defeasible conditional with strength or uncertainty. For example, "turning the ignition key causes the motor to start, but it does not imply it, since there are some other factors such as there being a battery, the battery not being dead, there being gas, and so on" [38]. On the other hand, causal reasoning is nonmonotonic, in the sense that causal connections can be drawn tentatively and retracted in light of further information. It may be the case that c causes e, but that c and d jointly do not cause e. For example, an agent believes that turning the ignition key causes the motor to start, but when it learns that the battery is dead, it no longer believes that turning the ignition key will cause the motor to start. In AI, this is the famous qualification problem. Since the potentially relevant factors are typically uncertain, it is not cost-effective to reason about all of them explicitly. So, when doing causal inference, people usually "jump" to conclusions and retract some of them when needed. Similarly, reasoning from evidence to cause is nonmonotonic. If an agent observes some effect e, it is allowed to hypothesize a possible cause c. The reasoning from the evidence to a cause is abductive, since for some evidence, one may accept an abductive explanation if no better explanation is available. However, when new explanations are generated, the old explanation might be discarded.
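
The ignition-key example can be sketched in Dung-style abstract argumentation, where an argument is accepted if it belongs to the grounded extension, computed as the least fixed point of the characteristic function; the argument names below are of course illustrative:

```python
# Nonmonotonicity via abstract argumentation: the causal argument
# "turning the key starts the motor" is accepted until the counterargument
# "the battery is dead" is added, after which it is retracted.
def grounded(arguments, attacks):
    """Grounded extension: least fixed point of F(S) = {a : S defends a}."""
    def defended(s):
        return {a for a in arguments
                if all(any((c, b) in attacks for c in s)
                       for (b, t) in attacks if t == a)}
    s = set()
    while True:
        nxt = defended(s)
        if nxt == s:
            return s
        s = nxt

# Without the battery information, the causal argument is accepted.
ext1 = grounded({"key_turned_starts_motor"}, set())
# New information attacks the argument, and the conclusion is retracted.
ext2 = grounded({"key_turned_starts_motor", "battery_is_dead"},
                {("battery_is_dead", "key_turned_starts_motor")})

print(ext1)  # {'key_turned_starts_motor'}
print(ext2)  # {'battery_is_dead'}
```

Adding knowledge shrinks the set of accepted conclusions, which is exactly the nonmonotonic behavior described above.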

6.2. Efficiency and explainability

From a computational perspective, monotonicity is a crucial property of classical logic: each conclusion obtained by local computation using a subset of the knowledge equals the one obtained by global computation using all of the knowledge. This property does not hold in nonmonotonic reasoning, and therefore the computation can be highly inefficient. Given the nonmonotonicity of causal reasoning, formal argumentation has been shown to be a good candidate for improving efficiency, in comparison with other nonmonotonic formalisms such as default logic and circumscription. The reason is that in formal argumentation, computational approaches can exploit a divide-and-conquer strategy and make maximal use of existing computational results in terms of the reachability between nodes in an argumentation graph [39]. Another important property of causal reasoning in AI is explainability. Traditional nonmonotonic formalisms are not ideal for explanation, since their proofs are not represented in a human-understandable way. Since the purpose of explanation is to let the audience understand, the cognitive process of comparing and contrasting arguments is significant [37]. Argumentation provides such a process by exchanging arguments in terms of justification and argument dialogue [40].

6.3. Connections to machine learning approaches

In explainable AI, there are two components: the explainable model and the explanation interface. The latter includes reflexive explanations that arise directly from the model and rational explanations that come from reasoning about the user's beliefs. To realize this vision, it is natural to combine argumentation and machine learning, in the sense that knowledge is obtained by machine learning approaches, while reasoning and explanation are realized by argumentation. Since argumentation provides a general approach for various kinds of reasoning in the context of disagreement, and can be combined with uncertainty measures such as probability and fuzziness, it is flexible enough to model the knowledge learned from data. An example is when a machine learns features and produces an explanation such as "This face is angry, because it is similar to these examples, and dissimilar from those examples." This is an argument, which might be attacked by other arguments. In order to measure the uncertainty described by words such as "angry," one may choose to use possibilistic or probabilistic argumentation [41]. Different explanations may also be in conflict. For instance, there could be cases invoking specific examples or stories that support a choice, and rejections of an alternative choice that argue against less-preferred answers based on analytics, cases, and data. By using argumentation graphs, these kinds of support-and-attack relations can be conveniently modeled and used to compute the status of conflicting arguments for different choices.

7. Causal inference with complex experiments

The potential outcomes framework for causal inference starts with a hypothetical experiment in which the experimenter can assign every unit to one of several treatment levels. Every unit has potential outcomes corresponding to these treatment levels. Causal effects are comparisons of the potential outcomes among the same set of units. This is sometimes called the experimentalist's approach to causal inference [42]. Readers are referred to Refs. [43–46] for textbook discussions.

7.1. Randomized factorial experiments

Splawa-Neyman [47] first formally discussed the following randomization model. In an experiment with n units, the experimenter randomly assigns (n_1, ..., n_J) units to treatment levels (1, ..., J), where n = n_1 + ... + n_J. Unit i has potential outcomes {Y_i(1), ..., Y_i(J)}, with Y_i(j) being the hypothetical outcome if unit i receives treatment level j. With potential outcomes, we can define causal effects; for example, the comparison between treatment levels j and j' is τ_{jj'} = n⁻¹ Σ_{i=1}^{n} {Y_i(j) − Y_i(j')}. Let T_i be the indicator of the treatment level that unit i actually receives, and let Y_i = Y_i(T_i) be the observed outcome of unit i. With the observed data, Splawa-Neyman [47] proposed to use the difference in sample means, τ̂_{jj'} = Ȳ_j − Ȳ_{j'} with Ȳ_j = n_j⁻¹ Σ_{i: T_i = j} Y_i, as an estimator for τ_{jj'}. He showed that τ̂_{jj'} is unbiased with variance S_j²/n_j + S_{j'}²/n_{j'} − S_{jj'}²/n, where S_j² and S_{j'}² are the sample variances of the potential outcomes Y_i(j) and Y_i(j'), and S_{jj'}² is the sample variance of the individual effects Y_i(j) − Y_i(j'). Note that the randomness comes from the treatment indicators, with all the potential outcomes fixed. Splawa-Neyman [47] further discussed variance estimation and the large-sample confidence interval.
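
A small simulation sketch of this randomization model (the potential outcomes are simulated once and then held fixed; only the treatment assignment is random):

```python
import numpy as np

# Splawa-Neyman's randomization model with two treatment levels: the
# difference-in-means estimator is unbiased over repeated complete
# randomizations, and its variance equals S1^2/n1 + S0^2/n0 - S10^2/n.
rng = np.random.default_rng(1)
n, n1 = 100, 50
y0 = rng.normal(0, 1, n)               # fixed potential outcomes, control
y1 = y0 + 2 + rng.normal(0, 0.5, n)    # fixed potential outcomes, treatment
tau = (y1 - y0).mean()                 # finite-population average effect

s1, s0 = y1.var(ddof=1), y0.var(ddof=1)
s10 = (y1 - y0).var(ddof=1)
neyman_var = s1 / n1 + s0 / (n - n1) - s10 / n  # exact variance of the estimator

ests = []
for _ in range(4000):
    t = np.zeros(n, dtype=bool)
    t[rng.choice(n, n1, replace=False)] = True  # complete randomization
    ests.append(y1[t].mean() - y0[~t].mean())   # observed difference in means
ests = np.array(ests)

assert abs(ests.mean() - tau) < 0.05            # unbiasedness
assert abs(ests.var() - neyman_var) / neyman_var < 0.2
```

Dropping the S_{jj'}²/n term, which is unidentifiable from the observed data, yields the familiar conservative variance estimator.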

We can extend the framework from Ref. [47] to a general causal effect defined as τ = n⁻¹ Σ_{i=1}^{n} C_i τ_i, where τ_i is the individual effect of unit i and the C_i are contrast matrices satisfying C_i 1 = 0. With appropriately chosen contrast matrices, the special cases include analysis of variance [48] and factorial experiments [49,50]. Furthermore, with an appropriately chosen subset of units, the special cases include subgroup analysis, post-stratification [51], and peer effects [52]. Ref. [53] provides general forms of central limit theorems under this setting for asymptotic inference. Ref. [54] discusses split-plot designs, and Ref. [55] discusses general designs.

7.2. The role of covariates in the analysis of experiments

The Splawa-Neyman randomization model [47] also allows for the use of covariates to improve efficiency without strong modeling assumptions. In the case with a binary treatment, for unit i, let {Y_i(1), Y_i(0)} be the potential outcomes, T_i be the binary treatment indicator, and x_i be the pretreatment covariates. The average causal effect τ = n⁻¹ Σ_{i=1}^{n} {Y_i(1) − Y_i(0)} has an unbiased estimator τ̂, the difference between the sample means of the outcomes in the treatment and control groups. Fisher [56] suggested using the analysis of covariance to improve efficiency; that is, running a least squares fit of Y_i on (1, T_i, x_i) and using the coefficient of T_i to estimate τ. Ref. [57] uses the model from Ref. [47] to show that Fisher's analysis of covariance estimator is inferior because it can be even less efficient than τ̂, and the ordinary least squares can give an inconsistent variance estimate. Ref. [58] proposes a simple correction: First, center the covariates to have mean x̄ = 0; second, run a least squares fit of Y_i on (1, T_i, x_i, T_i × x_i) and use the coefficient of T_i to estimate τ; and third, use the Eicker–Huber–White variance estimator [59–61]. With large samples, the estimator from Ref. [58] is at least as efficient as τ̂, and its variance estimate is consistent for the true variance of the estimator.
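
The first two steps of this correction can be sketched as follows for point estimation (simulated data with heterogeneous covariate effects; the Eicker–Huber–White variance step is omitted for brevity):

```python
import numpy as np

# Covariate adjustment following Ref. [58]: center covariates, regress the
# outcome on treatment, covariates, and their interaction, and read off
# the coefficient of the treatment indicator.
rng = np.random.default_rng(2)
n, tau = 10000, 2.0
x = rng.normal(0, 1, (n, 2))
t = rng.permutation(np.repeat([0.0, 1.0], n // 2))  # balanced randomization
# Outcome with covariate effects that differ across treatment arms:
y = (tau * t + x @ np.array([1.0, -0.5])
     + t * (x @ np.array([0.5, 0.3])) + rng.normal(0, 1, n))

diff_in_means = y[t == 1].mean() - y[t == 0].mean()

xc = x - x.mean(axis=0)                    # step 1: center the covariates
design = np.column_stack([np.ones(n), t, xc, t[:, None] * xc])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
tau_lin = beta[1]                          # step 2: coefficient of treatment

print(diff_in_means, tau_lin)
assert abs(tau_lin - tau) < 0.1
```

The interaction terms are what distinguish this estimator from Fisher's analysis of covariance; without them, the estimator can be less efficient than the plain difference in means.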

Ref. [62] extends the analysis to the setting with high-dimensional covariates and replaces the least squares fit by the least absolute shrinkage and selection operator (LASSO) [63]. Ref. [64] examines the theoretical boundary of the estimator from Ref. [58], allowing for a diverging number of covariates. Ref. [65] investigates treatment effect heterogeneity using the least squares fit of Y_i on (1, T_i, x_i, T_i × x_i). Ref. [66] discusses covariate adjustment in factorial experiments, and Ref. [67] discusses covariate adjustment in general designs.

7.3. The role of covariates in the design of experiments

An analyzer can use covariates to improve estimation efficiency. As a dual, a designer can use covariates to improve covariate balance and consequently improve estimation efficiency. Ref. [68] hints at the idea of re-randomization, that is, only accepting random allocations that ensure covariate balance. In particular, we accept a random allocation (T_1, ..., T_n) if and only if M ≤ a, where M is the Mahalanobis distance between the covariate means of the treatment and control groups and a > 0 is a predetermined constant. Ref. [69] formally discusses the statistical properties of re-randomization under the constant treatment effect model with equal group sizes and Gaussian covariates. Ref. [70] develops its asymptotic theory without these assumptions. In particular, Ref. [70] shows that the difference-in-means estimator τ̂ has a non-Gaussian limiting distribution and is more concentrated at τ under re-randomization than under complete randomization. A consequence of the result from Ref. [70] is that when a ≈ 0, the asymptotic variance of τ̂ under re-randomization is identical to that of the estimator from Ref. [58] under complete randomization. Therefore, we can view re-randomization as the dual of regression adjustment.
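
A minimal sketch of the acceptance rule (simulated covariates; the threshold a = 1 and the use of the sample covariance matrix are illustrative choices):

```python
import numpy as np

# Re-randomization: draw complete randomizations and accept only those
# whose Mahalanobis distance between covariate means is below a threshold.
# Accepted allocations are far better balanced than unrestricted ones.
rng = np.random.default_rng(3)
n, n1, a = 100, 50, 1.0
x = rng.normal(0, 1, (n, 3))             # fixed pretreatment covariates

def mahalanobis(t):
    d = x[t].mean(axis=0) - x[~t].mean(axis=0)
    cov = np.cov(x.T) * (1 / n1 + 1 / (n - n1))  # covariance of the mean difference
    return d @ np.linalg.inv(cov) @ d

all_m, accepted_m = [], []
for _ in range(2000):
    t = np.zeros(n, dtype=bool)
    t[rng.choice(n, n1, replace=False)] = True
    m = mahalanobis(t)
    all_m.append(m)
    if m <= a:                            # re-randomization acceptance rule
        accepted_m.append(m)

print(np.mean(all_m), np.mean(accepted_m))
assert np.mean(accepted_m) < np.mean(all_m)   # accepted draws are better balanced
```

A smaller a gives tighter balance at the cost of rejecting more allocations, which is the trade-off behind the a ≈ 0 asymptotics mentioned above.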

Ref. [71] proposes a re-randomization scheme that allows for tiers of covariates, and Ref. [70] derives its asymptotic properties. Refs. [72,73] extend re-randomization to factorial experiments, and Ref. [74] proposes sequential re-randomization.

7.4. Final remarks

Following Ref. [47], I have focused on the repeated sampling properties of estimators under randomized experiments. Alternatively, Fisher randomization tests are finite-sample exact for any test statistics and any designs, under the sharp null hypothesis that Y_i(1) = ... = Y_i(J) for all units i = 1, ..., n [46,75,76]. Refs. [77,78] propose the use of covariate adjustment in randomization tests, and Ref. [69] proposes the use of randomization tests to analyze re-randomization. Refs. [79–81] apply randomization tests to experiments with interference. Refs. [48,50,82] discuss the properties of randomization tests under weak null hypotheses. Refs. [83–85] invert randomization tests to construct exact confidence intervals. Finally, Ref. [86] discusses different inferential frameworks from the missing data perspective.
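
A sketch of a Fisher randomization test with the difference in means as the test statistic (simulated data; 5000 Monte Carlo draws approximate the exact randomization distribution):

```python
import numpy as np

# Fisher randomization test of the sharp null Y_i(1) = Y_i(0) for all i:
# under the sharp null, all potential outcomes are known from the observed
# data, so the null distribution of any statistic can be simulated by
# re-randomizing the treatment labels with the outcomes held fixed.
rng = np.random.default_rng(4)
n, n1 = 100, 50
t_obs = np.zeros(n, dtype=bool)
t_obs[rng.choice(n, n1, replace=False)] = True
y = rng.normal(0, 1, n) + 2.0 * t_obs     # observed outcomes, true effect 2

stat_obs = y[t_obs].mean() - y[~t_obs].mean()

null_stats = []
for _ in range(5000):
    t = np.zeros(n, dtype=bool)
    t[rng.choice(n, n1, replace=False)] = True
    null_stats.append(y[t].mean() - y[~t].mean())  # y unchanged under sharp null

p_value = np.mean(np.abs(null_stats) >= abs(stat_obs))
print(p_value)
assert p_value < 0.05                      # the sharp null is rejected
```

Because the null distribution is generated by the actual randomization mechanism, the test is exact for any choice of statistic and design.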

8. Instrumental variables and negative controls for observational studies

In a great deal of scientific research, the ultimate goal is to evaluate the causal effect of a given treatment or exposure on a given outcome or response variable. Since the work published in Ref. [75], randomized experiments have become a powerful and influential tool for the evaluation of causal effects; however, they are infeasible in many situations due to ethical issues, high cost, or imperfect compliance. In contrast, observational studies offer an important source of data for scientific research. However, causal inference with observational studies is challenging, because confounding may arise. Confounders are covariates that affect both the primary exposure and the outcome. In the presence of unmeasured confounders, statistical association does not imply causation, and vice versa, which is known as the Yule–Simpson paradox [19,20]. Refs. [87,88] review the concepts of confounding, and Refs. [2,89,90] discuss methods for the adjustment of observed confounders, such as regression analysis, propensity score methods, and inverse probability weighting, as well as doubly robust methods. Here, we review two methods for the adjustment of unmeasured confounding: the instrumental variable approach and the negative control approach.

Throughout, we let X and Y denote the exposure and outcome of interest, respectively, and we let U denote an unmeasured confounder; for simplicity, we omit observed confounders, which can be incorporated in what follows by simply conditioning on them. We use lowercase letters to denote realized values of random variables; for example, y denotes a realized value of Y.

The instrumental variable approach, which was first proposed in econometrics literature in the 1920s[91,92], has become a popular method in observational studies to mitigate the problem of unobserved confounding. In addition to the primary treatment and outcome, this approach involves an instrumental variable Z that satisfies three core assumptions:

(1) It has no direct effect on the outcome, that is, Z ⊥ Y | (X, U) (exclusion restriction);

(2) It is independent of the unobserved confounder, that is, Z ⊥ U (independence);

(3) It is associated with the exposure, that is, Z is not independent of X (relevance).

Under these three assumptions, only upper and lower bounds on the causal effect can be derived [93,94], and extra model assumptions are required to achieve identification. The SEM [91,95] and the structural mean model [96] are commonly used models, which in fact achieve identification by assuming effect homogeneity (see Section 16 of Ref. [97]). One such example is the linear regression model Y = βX + U, which encodes a constant causal effect in the regression coefficient β and yields the well-known instrumental variable identification β = cov(Z, Y)/cov(Z, X). Alternatively, in certain situations, especially when Z is a binary treatment assignment that occurs before X, it is sometimes reasonable to assume effect monotonicity: The effect of Z on X is monotone, that is, X(z = 1) ≥ X(z = 0) for all units, which means that no one accepts the opposite of the assigned treatment. The monotonicity assumption leads to identification of the complier average causal effect (CACE), E{Y(x = 1) − Y(x = 0) | X(z = 1) > X(z = 0)} = {E(Y | Z = 1) − E(Y | Z = 0)}/{E(X | Z = 1) − E(X | Z = 0)}, as shown in Ref. [98]. As an extension of the single-instrument case, Refs. [99,100] consider variable selection and estimation with high-dimensional instrumental variables.
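
A numerical sketch of the linear-model identification (the data-generating process below is an illustrative assumption): with an unmeasured confounder U, ordinary least squares is biased, while the ratio cov(Z, Y)/cov(Z, X) recovers β.

```python
import numpy as np

# Instrumental variable identification beta = cov(Z,Y)/cov(Z,X) under the
# linear model Y = beta*X + U, with Z independent of the confounder U.
rng = np.random.default_rng(5)
n, beta = 100000, 2.0
u = rng.normal(0, 1, n)                  # unmeasured confounder
z = rng.normal(0, 1, n)                  # valid instrument: independent of u
x = z + u + rng.normal(0, 1, n)          # exposure driven by z and u
y = beta * x + u + rng.normal(0, 1, n)   # outcome confounded by u

ols = np.cov(x, y)[0, 1] / np.var(x)           # biased by confounding
iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]   # Wald/IV estimator

print(ols, iv)
assert abs(iv - beta) < 0.05
assert abs(ols - beta) > 0.2
```

The bias of the least squares estimate here is cov(X, U)/var(X), which the instrument removes precisely because Z carries no share of U.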

However, in practice, the instrumental variable assumptions may not be met, and the approach is highly sensitive to the violation of any of them. Validity checking and violation detection of these assumptions are important before applying the instrumental variable approach, and have been attracting researchers’ attention [94,101]. In case of a violation of the core assumptions, identification of the causal effect is often impossible, and bounding and sensitivity analysis methods [102,103] have been proposed for causal inference.

Alternatively, we have formally established the double negative control method [104–106] for the adjustment of unmeasured confounding. The negative control approach we have proposed also offers a promising mitigation tool for invalid instrumental variables. Negative control variables are classified into two classes: the negative control outcome W, which is not causally affected by either the exposure or the negative control exposure, that is, W ⊥ (X, Z) | U; and the negative control exposure Z, which has no direct effect on either the outcome or the negative control outcome, that is, Z ⊥ (Y, W) | (X, U). The negative control exposure Z can be viewed as a generalization of an instrumental variable that fails to be independent of the unmeasured confounder, and the negative control outcome W is used to eliminate the resulting bias. Given both a negative control exposure and a negative control outcome, Refs. [104,106] show that the average causal effect is nonparametrically identified under certain regularity conditions. For illustration, consider again the regression model Y = βX + U, and assume that W also follows a linear model in U; then, β can be identified by the following:

β = {cov(Z, Y)cov(X, W) − cov(Z, W)cov(X, Y)}/{cov(Z, X)cov(X, W) − cov(Z, W)var(X)}

This formula also applies to a valid instrumental variable, in which case Z ⊥ U and thus cov(Z, W) = 0 by the negative control outcome assumption, so the formula reduces to β = cov(Z, Y)/cov(Z, X). Therefore, the instrumental variable identification can be viewed as a special case of the negative control approach. However, in contrast to the instrumental variable approach, negative controls rest on weaker assumptions that are more likely to hold in practice. Refs. [107,108] provide elegant surveys on the existence of negative controls in observational studies. Refs. [105,109] point out that negative controls are widely available in time series studies, such as studies of air pollution and public health, as long as no feedback effect is present.
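
A numerical sketch of double negative control adjustment under linear models (the data-generating process and the specific covariance formula β = {cov(Z, Y)cov(X, W) − cov(Z, W)cov(X, Y)}/{cov(Z, X)cov(X, W) − cov(Z, W)var(X)} are illustrative of the linear case): here Z is correlated with U and hence an invalid instrument, yet combining it with the negative control outcome W recovers β.

```python
import numpy as np

# Double negative control under linear structural equations: the naive IV
# ratio is biased because Z depends on the confounder U, while the
# covariance formula eliminates the bias using W, which depends only on U.
rng = np.random.default_rng(6)
n, beta = 400000, 2.0
u = rng.normal(0, 1, n)
z = u + rng.normal(0, 3, n)              # negative control exposure: confounded
x = z + u + rng.normal(0, 1, n)
y = beta * x + 3 * u + rng.normal(0, 1, n)
w = u + rng.normal(0, 1, n)              # negative control outcome: driven by u only

def c(a, b):
    return np.cov(a, b)[0, 1]

naive_iv = c(z, y) / c(z, x)             # biased: Z is not independent of U
beta_nc = ((c(z, y) * c(x, w) - c(z, w) * c(x, y))
           / (c(z, x) * c(x, w) - c(z, w) * np.var(x)))

print(naive_iv, beta_nc)
assert abs(beta_nc - beta) < 0.15
assert abs(naive_iv - beta) > 0.2
```

Setting cov(Z, W) = 0 in the formula, as would hold for a valid instrument, collapses it back to the Wald ratio cov(Z, Y)/cov(Z, X).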

Refs. [107,109,110] examine the use of negative controls for confounding detection or bias reduction when only a negative control exposure or only a negative control outcome is available, but these approaches cannot achieve identification. Refs. [111,112] propose the use of multiple negative control outcomes to remove confounding in statistical genetics, but they must rest on a factor analysis model.

9. Causal inference with interference

The stable unit treatment value assumption plays an important role in the classical potential outcomes framework. It assumes that there is no interference between units [76]. However, interference is likely to be present in many experimental and observational studies, where units socially or physically interact with each other. For example, in educational or social sciences, people enrolled in a tutoring or training program may have an effect on those not enrolled due to the transmission of knowledge [113,114]. In epidemiology, the prevention measures for infectious diseases may benefit unprotected people by reducing the probability of contagion [115,116]. In these studies, one unit’s treatment can have a direct effect on its own outcome as well as a spillover effect on the outcome of other units. The direct and spillover effects are of scientific or societal interest in real problems; they enable an understanding of the mechanism of a treatment effect, and provide guidance for policy making and implementation.

In the presence of interference, the number of potential outcomes of a unit grows exponentially with the number of units. As a result, it is intractable to estimate the direct and spillover effects without restrictions on the interference structure. There has been rapidly growing interest in interference in the literature on the estimation of treatment effects (see Ref. [117] for a recent review). A significant line of work focuses on limited interference within nonoverlapping clusters and assumes that there is no interference between clusters [52,114,118–122]. This is referred to as the partial interference assumption [114]. Recently, several researchers have considered relaxing the partial interference assumption to account for more general structures of interference (e.g., Refs. [123–126]). Variance estimation is more complicated under interference. As pointed out in Ref. [118], it is difficult to calculate the variances of the estimators of the direct and spillover effects even under partial interference. In model-free settings, a typical assumption for obtaining valid variance estimation is that the outcome of a unit depends on the treatments of other units only through a function of those treatments. Ref. [118] provides a variance estimator under the stratified interference assumption, and Ref. [124] generalizes it under a weaker assumption.

If the total number of units is N and the treatment is binary, then there are 2^N potential outcomes for each unit.

Another direction of work targets new designs for estimating treatment effects based on the interference structure. Under the partial interference assumption, Ref. [118] proposes the two-stage randomized experiment as a general experimental solution for the estimation of the direct and spillover effects. In more complex structures such as social networks, researchers have proposed several designs for the point and variance estimation of the treatment effects [127–129].
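
A sketch of a two-stage randomized experiment under stratified interference (the cluster sizes, saturation levels, and outcome model are illustrative assumptions):

```python
import numpy as np

# Two-stage randomization: stage 1 assigns each cluster a treatment
# saturation (80% vs. 20%); stage 2 randomizes individuals within each
# cluster at that saturation. The outcome depends on own treatment
# (direct effect 2) and on the cluster's saturation (spillover 1).
rng = np.random.default_rng(7)
C, m = 500, 10                               # clusters and cluster size
high = rng.random(C) < 0.5                   # stage 1: high vs. low saturation

t = np.zeros((C, m), dtype=bool)
for c in range(C):
    k = 8 if high[c] else 2                  # stage 2: treat k of m units
    t[c, rng.choice(m, k, replace=False)] = True

y = 2.0 * t + 1.0 * high[:, None] + rng.normal(0, 1, (C, m))

# Direct effect: treated vs. control within the same (low) saturation.
direct = y[~high][t[~high]].mean() - y[~high][~t[~high]].mean()
# Spillover effect on controls: high- vs. low-saturation clusters.
spill = y[high][~t[high]].mean() - y[~high][~t[~high]].mean()

print(direct, spill)
assert abs(direct - 2.0) < 0.2
assert abs(spill - 1.0) < 0.2
```

The key design feature is that varying the saturation across clusters creates the contrast needed to separate the spillover effect from the direct effect.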

For the inference under interference, Refs. [130,131] rely on models for the potential outcomes. Ref. [79] develops a conditional randomization test for the null hypothesis of no spillover effect. Ref. [80] extends this test to a larger class of hypotheses restricted to a subset of units, known as focal units. Building on this work, Ref. [132] provides a general procedure for obtaining powerful conditional tests.

Interference brings new challenges. First, deriving asymptotic properties requires advanced techniques. Ref. [133] investigates the consistency of the difference-in-means estimator when the number of units that a unit can interfere with does not grow as quickly as the sample size. Ref. [134] develops central limit theorems for the direct and spillover effects under partial interference and stratified interference. Ref. [52] provides a central limit theorem for a peer effect under partial interference and stratified interference. However, under general interference, the asymptotic properties remain unsolved, even for the simplest difference-in-means estimator. Second, interference becomes even harder to deal with when data complications are present. Refs. [120,121,135,136] consider noncompliance in an interference setting. Ref. [137] examines the censoring of time-to-event data in the presence of interference. However, for other data complications, such as missing data and measurement error, no methods are yet available. Third, most of the literature focuses on the direct effect and the spillover effect. However, interference may be present in other settings, such as mediation analysis (see Ref. [138] for a mediation analysis under interference) and longitudinal studies, where different quantities are of interest. As a result, it is necessary to generalize the commonly used methods in these settings to account for the interference between units.

Compliance with ethics guidelines

Kun Kuang, Lian Li, Zhi Geng, Lei Xu, Kun Zhang, Beishui Liao, Huaxin Huang, Peng Ding, Wang Miao, and Zhichao Jiang declare that they have no conflict of interest or financial conflicts to disclose.

References

[1]

Imbens GW, Rubin DB. Causal inference for statistics, social, and biomedical sciences. New York: Cambridge University Press; 2015.

[2]

Bang H, Robins JM. Doubly robust estimation in missing data and causal inference models. Biometrics 2005;61(4):962–73.

[3]

Kuang K, Cui P, Li B, Jiang M, Yang S, Wang F. Treatment effect estimation with data-driven variable decomposition. In: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence; 2017 Feb 4–9; San Francisco, CA, USA; 2017.

[4]

Athey S, Imbens GW, Wager S. Approximate residual balancing: debiased inference of average treatment effects in high dimensions. J R Stat Soc Ser B (Stat Methodol) 2018;80(4):597–623.

[5]

Kuang K, Cui P, Li B, Jiang M, Yang S. Estimating treatment effect in the wild via differentiated confounder balancing. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2017 Aug 13–17; Halifax, NS, Canada; 2017. p. 265–74.

[6]

Imai K, Van Dyk DA. Causal inference with general treatment regimes: generalizing the propensity score. J Am Stat Assoc 2004;99(467):854–66.

[7]

Egami N, Imai K. Causal interaction in factorial experiments: application to conjoint analysis. J Am Stat Assoc 2019;114(526):529–40.

[8]

Louizos C, Shalit U, Mooij JM, Sontag D, Zemel R, Welling M. Causal effect inference with deep latent-variable models. In: Proceedings of Advances in Neural Information Processing Systems 30; 2017 Dec 4–9; Long Beach, CA, USA; 2017. p. 6446–56.

[9]

Crump RK, Hotz VJ, Imbens GW, Mitnik OA. Dealing with limited overlap in estimation of average treatment effects. Biometrika 2009;96(1):187–99.

[10]

Li F, Thomas LE, Li F. Addressing extreme propensity scores via the overlap weights. Am J Epidemiol 2019;188(1):250–7.

[11]

Kuang K, Cui P, Athey S, Xiong R, Li B. Stable prediction across unknown environments. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; 2018 Aug 19–23; London, UK; 2018. p. 1617–26.

[12]

Zhuang Y, Wu F, Chen C, Pan Y. Challenges and opportunities from big data to knowledge in AI 2.0. Front Inf Technol Elec Eng 2017;18(1):3–14.

[13]

Pan Y. 2018 special issue on artificial intelligence 2.0: theories and applications. Front Inf Technol Elec Eng 2018;19(1):1–2.

[14]

Hoerl C, McCormack T, Beck SR, editors. Understanding counterfactuals, understanding causation: issues in philosophy and psychology. New York: Oxford University Press; 2011.

[15]

Pearl J, Glymour M, Jewell NP. Causal inference in statistics: a primer. Hoboken: John Wiley & Sons; 2016.

[16]

Daniel RM, De Stavola BL, Vansteelandt S. Commentary: the formal approach to quantitative causal inference in epidemiology: misguided or misrepresented? Int J Epidemiol 2016;45(6):1817–29.

[17]

Pearl J. Causal and counterfactual inference. Forthcoming section in the handbook of rationality. Cambridge: MIT press; 2018.

[18]

Goldfeld K. Considering sensitivity to unmeasured confounding: part 1 [Internet]. New York: Keith Golgfeld; 2019 Jan 2 [cited 2019 Jun 1]. Available from: https://www.rdatagen.net/post/what-does-it-mean-if-findings-aresensitive-to-unmeasured-confounding/.

[19]

Yule GU. Notes on the theory of association of attributes in statistics. Biometrika 1903;2(2):121–34.

[20]

Simpson EH. The interpretation of interaction in contingency tables. J R Stat Soc B 1951;13(2):238–41.

[21]

Chen H, Geng Z, Jia J. Criteria for surrogate end points. J R Stat Soc Series B Stat Methodol 2007;69(5):919–32.

[22]

Geng Z, Liu Y, Liu C, Miao W. Evaluation of causal effects and local structure learning of causal networks. Annu Rev Stat Appl 2019;6(1):103–24.

[23]

Pearl J. Is scientific knowledge useful for policy analysis? A peculiar theorem says: no. J Causal Infer 2014;2(1):109–12.

[24]

Fleming TR, DeMets DL. Surrogate end points in clinical trials: are we being misled? Ann Intern Med 1996;125(7):605–13.

[25]

Xu L, Pearl J. Structuring causal tree models with continuous variables. In: Proceedings of the Third Conference on Uncertainty in Artificial Intelligence. Arlington: AUAI Press; 1987. p. 170–9.

[26]

Xu L. Deep bidirectional intelligence: alphazero, deep IA-search, deep IAinfer, and TPC causal learning. Appl Inf 2018;5(1):5.

[27]

Xu L. Machine learning and causal analyses for modeling financial and economic data. Appl Inf 2018;5(1):11.

[28]

Spirtes P, Glymour C, Scheines R. Causation, prediction, and search. 2nd ed. Cambridge: MIT Press; 2001.

[29]

Pearl J. Causality: models, reasoning, and inference. Cambridge: Cambridge University Press; 2000.

[30]

Spirtes P, Zhang K. Causal discovery and inference: concepts and recent methodological advances. Appl Inform 2016;3(1):3.

[31]

Shimizu S, Hoyer PO, Hyvärinen A, Kerminen A. A linear non-gaussian acyclic model for causal discovery. J Mach Learn Res 2006;7:2003–30.

[32]

Zhang K, Hyvärinen A. On the identifiability of the post-nonlinear causal model. In: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence; 2009 Jun 18–21; Montreal, QC, Canada. Arlington: AUAI Press; 2019. p. 647–55.

[33]

Hoyer PO, Janzing D, Mooij JM, Peters J, Scholkopf B. Nonlinear causal discovery with additive noise models. In: Proceedings of International Conference on Neural Information Processing Systems; 2008 Dec 8–13; Vancouver, BC, Canada; 2008. p. 689–96.

[34]

Zhang K, Hyvärinen A. Causality discovery with additive disturbances: an information-theoretical perspective. In: Buntine W, Grobelnik M, Mladenic´ D, Shawe-Taylor J, editors. Machine learning and knowledge discovery in databases. Berlin: Springer; 2009. p. 570–85.

[35]

Zhang K, Schölkopf B, Muandet K, Wang Z. Domain adaptation under target and conditional shift. In: Proceedings of the 30th International Conference on Machine Learning; 2013 Jun 16–21; Atlanta, GA, USA; 2013. p. 819–27.

[36]

Baroni P, Gabbay DM, Giacomin M, Van der Torre L. Handbook of formal argumentation. London: College Publications; 2018.

[37]

Osborne J. Arguing to learn in science: the role of collaborative, critical discourse. Science 2010;328(5977):463–6.

[38]

Shoham Y. Nonmonotonic reasoning and causation. Cogn Sci 1990;14 (2):213–52.

[39]

Liao B, Jin L, Koons RC. Dynamics of argumentation systems: a division-based method. Artif Intell 2011;175(11):1790–814.

[40]

Sklar EI, Azhar MQ. Explanation through argumentation. In: Proceedings of the 6th International Conference on Human–Agent Interaction; 2018 Dec 15– 18; Southampton, UK; 2018. p. 277–85.

[41]

Fazzinga B, Flesca S, Furfaro F. Complexity of fundamental problems in probabilistic abstract argumentation: beyond independence. Artif Intell 2019;268:1–29.

[42]

Pearl J. On a class of bias-amplifying variables that endanger effect estimates. In: Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence; 2010 Jul 8–11; Catalina Island, CA, USA; 2000. p. 425–32.

[43]

Kempthorne O. The design and analysis of experiments. New York: Wiley; 1952.

[44]

Scheffe H. The analysis of variance. New York: John Wiley & Sons; 1959.

[45]

Hinkelmann K, Kempthorne O. Design and analysis of experiments: volume 1: introduction to experimental design. 2nd ed. New York: John Wiley & Sons; 2007.

[46]

Imbens GW, Rubin DB. Causal inference for statistics, social, and biomedical sciences: an introduction. New York: Cambridge University Press; 2015.

[47]

Splawa-Neyman J. On the application of probability theory to agricultural experiments: essay on principles. Section 9. Stat Sci 1990;5(4):465–72.

[48]

Ding P, Dasgupta T. A randomization-based perspective on analysis of variance: a test statistic robust to treatment effect heterogeneity. Biometrika 2018;105(1):45–56.

[49]

Dasgupta T, Pillai NS, Rubin DB. Causal inference from 2K factorial designs by using potential outcomes. J R Stat Soc Series B Stat Methodol 2015;77 (4):727–53.

[50]

Wu J, Ding P. Randomization tests for weak null hypotheses. 2018. arXiv:1809.07419.

[51]

Miratrix LW, Sekhon JS, Yu B. Adjusting treatment effect estimates by poststratification in randomized experiments. J R Stat Soc Series B Stat Methodol 2013;75(2):369–96.

[52]

Li X, Ding P, Lin Q, Yang D, Liu JS. Randomization inference for peer effects. J Am Stat Assoc 2019:1–31.

[53]

Li X, Ding P. General forms of finite population central limit theorems with applications to causal inference. J Am Stat Assoc 2017;112(520):1759–69.

[54]

Zhao A, Ding P, Mukerjee R, Dasgupta T. Randomization-based causal inference from split-plot designs. Ann Stat 2018;46(5):1876–903.

[55]

Mukerjee R, Dasgupta T, Rubin DB. Using standard tools from finite population sampling to improve causal inference for complex experiments. J Am Stat Assoc 2018;113(522):868–81.

[56]

Fisher RA. Statistical methods for research workers. Edinburgh: Oliver and Boyd; 1925.

[57]

Freedman DA. On regression adjustments to experimental data. Adv Appl Math 2008;40(2):180–93.

[58]

Lin W. Agnostic notes on regression adjustments to experimental data: reexamining Freedman’s critique. Ann Appl Stat 2013;7(1):295–318.

[59]

Eicker F. Limit theorems for regressions with unequal and dependent errors. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability; 1967 Jun 21–Jul 18; Berkeley, CA, USA. Berkeley: University of California Press; 1967. p. 59–82.

[60]

Huber PJ. The behavior of maximum likelihood estimates under nonstandard conditions. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability; 1967 Jun 21–Jul 18; Berkeley, CA, USA. Berkeley: University of California Press; 1967. p. 221–33.

[61]

White H. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 1980;48(4):817–38.

[62]

Bloniarz A, Liu H, Zhang CH, Sekhon JS, Yu B. Lasso adjustments of treatment effect estimates in randomized experiments. Proc Natl Acad Sci USA 2016;113(27):7383–90.

[63]

Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Series B Stat Methodol 1996;58(1):267–88.

[64]

Lei L, Ding P. Regression adjustment in completely randomized experiments with a diverging number of covariates. 2018. arXiv:1806.07585.

[65]

Ding P, Feller A, Miratrix L. Decomposing treatment effect variation. J Am Stat Assoc 2019;114(525):304–17.

[66]

Lu J. Covariate adjustment in randomization-based causal inference for 2^K factorial designs. Stat Probab Lett 2016;119:11–20.

[67]

Middleton JA. A unified theory of regression adjustment for design-based inference. 2018. arXiv:1803.06011.

[68]

Cox DR. Randomization and concomitant variables in the design of experiments. In: Anderson TW, Styan GHP, Kallianpur GG, Krishnaiah PR, Ghosh JK, editors. Statistics and probability: essays in honor of CR Rao. Amsterdam: North-Holland; 1982. p. 197–202.

[69]

Morgan KL, Rubin DB. Rerandomization to improve covariate balance in experiments. Ann Stat 2012;40(2):1263–82.

[70]

Li X, Ding P, Rubin DB. Asymptotic theory of rerandomization in treatment–control experiments. Proc Natl Acad Sci USA 2018;115(37):9157–62.

[71]

Morgan KL, Rubin DB. Rerandomization to balance tiers of covariates. J Am Stat Assoc 2015;110(512):1412–21.

[72]

Branson Z, Dasgupta T, Rubin DB. Improving covariate balance in 2^K factorial designs via rerandomization with an application to a New York City department of education high school study. Ann Appl Stat 2016;10(4):1958–76.

[73]

Li X, Ding P, Rubin DB. Rerandomization in 2^K factorial experiments. 2018. arXiv:1812.10911.

[74]

Zhou Q, Ernst PA, Morgan KL, Rubin DB, Zhang A. Sequential rerandomization. Biometrika 2018;105(3):745–52.

[75]

Fisher RA. The design of experiments. Edinburgh: Oliver and Boyd; 1935.

[76]

Rubin DB. Comment on "Randomization analysis of experimental data: the Fisher randomization test". J Am Stat Assoc 1980;75(371):591–3.

[77]

Tukey JW. Tightening the clinical trial. Control Clin Trials 1993;14(4):266–85.

[78]

Rosenbaum PR. Covariance adjustment in randomized experiments and observational studies. Stat Sci 2002;17(3):286–327.

[79]

Aronow PM. A general method for detecting interference between units in randomized experiments. Sociol Methods Res 2012;41(1):3–16.

[80]

Athey S, Eckles D, Imbens GW. Exact p-values for network interference. J Am Stat Assoc 2018;113(521):230–40.

[81]

Basse G, Feller A, Toulis P. Exact tests for two-stage randomized designs in the presence of interference. 2017. arXiv:1709.08036.

[82]

Ding P. A paradox from randomization-based causal inference. Stat Sci 2017;32(3):331–45.

[83]

Rosenbaum PR. Exact confidence intervals for nonconstant effects by inverting the signed rank test. Am Stat 2003;57(2):132–8.

[84]

Rigdon J, Hudgens MG. Randomization inference for treatment effects on a binary outcome. Stat Med 2015;34(6):924–35.

[85]

Li X, Ding P. Exact confidence intervals for the average causal effect on a binary outcome. Stat Med 2016;35(6):957–60.

[86]

Ding P, Li F. Causal inference: a missing data perspective. Stat Sci 2018;33(2):214–37.

[87]

Greenland S, Robins JM, Pearl J. Confounding and collapsibility in causal inference. Stat Sci 1999;14(1):29–46.

[88]

Greenland S, Pearl J. Adjustments and their consequences—collapsibility analysis using graphical models. Int Stat Rev 2011;79(3):401–26.

[89]

Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika 1983;70(1):41–55.

[90]

Horvitz DG, Thompson DJ. A generalization of sampling without replacement from a finite universe. J Am Stat Assoc 1952;47(260):663–85.

[91]

Wright PG. Tariff on animal and vegetable oils. New York: Macmillan; 1928.

[92]

Heckman J. Instrumental variables: a study of implicit behavioral assumptions used in making program evaluations. J Hum Resour 1997;32(3):441–62.

[93]

Manski CF. Nonparametric bounds on treatment effects. Am Econ Rev 1990;80(2):319–23.

[94]

Balke A, Pearl J. Bounds on treatment effects from studies with imperfect compliance. J Am Stat Assoc 1997;92(439):1171–6.

[95]

Goldberger AS. Structural equation methods in the social sciences. Econometrica 1972;40(6):979–1001.

[96]

Robins JM. Correcting for non-compliance in randomized trials using structural nested mean models. Commun Stat Theory Method 1994;23(8):2379–412.

[97]

Hernán MA, Robins JM. Causal inference. Boca Raton: Chapman & Hall; 2011.

[98]

Angrist JD, Imbens GW, Rubin DB. Identification of causal effects using instrumental variables. J Am Stat Assoc 1996;91(434):444–55.

[99]

Lin W, Feng R, Li H. Regularization methods for high-dimensional instrumental variables regression with an application to genetical genomics. J Am Stat Assoc 2015;110(509):270–88.

[100]

Kang H, Zhang A, Cai TT, Small DS. Instrumental variables estimation with some invalid instruments and its application to Mendelian randomization. J Am Stat Assoc 2016;111(513):132–44.

[101]

Wang L, Robins JM, Richardson TS. On falsification of the binary instrumental variable model. Biometrika 2017;104(1):229–36.

[102]

Manski CF, Pepper JV. Monotone instrumental variables: with an application to the returns to schooling. Econometrica 2000;68(4):997–1010.

[103]

Small DS. Sensitivity analysis for instrumental variables regression with overidentifying restrictions. J Am Stat Assoc 2007;102(479):1049–58.

[104]

Miao W, Geng Z, Tchetgen Tchetgen EJ. Identifying causal effects with proxy variables of an unmeasured confounder. Biometrika 2018;105(4):987–93.

[105]

Miao W, Tchetgen Tchetgen E. Invited commentary: bias attenuation and identification of causal effects with multiple negative controls. Am J Epidemiol 2017;185(10):950–3.

[106]

Miao W, Tchetgen ET. A confounding bridge approach for double negative control inference on causal effects. 2018. arXiv:1808.04945.

[107]

Lipsitch M, Tchetgen Tchetgen E, Cohen T. Negative controls: a tool for detecting confounding and bias in observational studies. Epidemiology 2010;21(3):383–8.

[108]

Smith GD. Negative control exposures in epidemiologic studies. Epidemiology 2012;23(2):350–1.

[109]

Flanders WD, Strickland MJ, Klein M. A new method for partial correction of residual confounding in time-series and other observational studies. Am J Epidemiol 2017;185(10):941–9.

[110]

Rosenbaum PR. The role of known effects in observational studies. Biometrics 1989;45(2):557–69.

[111]

Wang J, Zhao Q, Hastie T, Owen AB. Confounder adjustment in multiple hypothesis testing. Ann Stat 2017;45(5):1863–94.

[112]

Gagnon-Bartsch JA, Speed TP. Using control genes to correct for unwanted variation in microarray data. Biostatistics 2012;13(3):539–52.

[113]

Hong G, Raudenbush SW. Evaluating kindergarten retention policy: a case study of causal inference for multilevel observational data. J Am Stat Assoc 2006;101(475):901–10.

[114]

Sobel ME. What do randomized studies of housing mobility demonstrate? Causal inference in the face of interference. J Am Stat Assoc 2006;101(476):1398–407.

[115]

Halloran ME, Struchiner CJ. Causal inference in infectious diseases. Epidemiology 1995;6(2):142–51.

[116]

Halloran ME, Struchiner CJ. Study designs for dependent happenings. Epidemiology 1991;2(5):331–8.

[117]

Halloran ME, Hudgens MG. Dependent happenings: a recent methodological review. Curr Epidemiol Rep 2016;3(4):297–305.

[118]

Hudgens MG, Halloran ME. Toward causal inference with interference. J Am Stat Assoc 2008;103(482):832–42.

[119]

Basse G, Feller A. Analyzing two-stage experiments in the presence of interference. J Am Stat Assoc 2018;113(521):41–55.

[120]

Forastiere L, Mealli F, VanderWeele TJ. Identification and estimation of causal mechanisms in clustered encouragement designs: disentangling bed nets using Bayesian principal stratification. J Am Stat Assoc 2016;111(514):510–25.

[121]

Kang H, Imbens G. Peer encouragement designs in causal inference with partial interference and identification of local average network effects. 2016. arXiv:1609.04464.

[122]

Rigdon J, Hudgens MG. Exact confidence intervals in the presence of interference. Stat Probab Lett 2015;105:130–5.

[123]

Aronow PM, Samii C. Estimating average causal effects under interference between units. 2018. arXiv:1305.6156v4.

[124]

Aronow PM, Samii C. Estimating average causal effects under general interference, with application to a social network experiment. Ann Appl Stat 2017;11(4):1912–47.

[125]

Choi D. Estimation of monotone treatment effects in network experiments. J Am Stat Assoc 2017;112(519):1147–55.

[126]

Forastiere L, Airoldi EM, Mealli F. Identification and estimation of treatment and interference effects in observational studies on networks. 2016. arXiv:1609.06245.

[127]

Eckles D, Karrer B, Ugander J. Design and analysis of experiments in networks: reducing bias from interference. J Causal Inference 2017;5(1): 1–23.

[128]

Eckles D, Kizilcec RF, Bakshy E. Estimating peer effects in networks with peer encouragement designs. Proc Natl Acad Sci USA 2016;113(27):7316–22.

[129]

Jagadeesan R, Pillai N, Volfovsky A. Designs for estimating the treatment effect in networks with interference. 2017. arXiv:1705.08524.

[130]

Bowers J, Fredrickson MM, Panagopoulos C. Reasoning about interference between units: a general framework. Polit Anal 2013;21(1):97–124.

[131]

Toulis P, Kao E. Estimation of causal peer influence effects. In: Proceedings of 30th International Conference on Machine Learning; 2013 Jun 16–21; Atlanta, GA, USA; 2013. p. 1489–97.

[132]

Basse GW, Feller A, Toulis P. Randomization tests of causal effects under interference. Biometrika 2019;106(2):487–94.

[133]

Sävje F, Aronow PM, Hudgens MG. Average treatment effects in the presence of unknown interference. 2017. arXiv:1711.06399.

[134]

Liu L, Hudgens MG. Large sample randomization inference of causal effects in the presence of interference. J Am Stat Assoc 2014;109(505):288–301.

[135]

Imai K, Jiang Z, Malani A. Causal inference with interference and noncompliance in two-stage randomized experiments. Technical report. Princeton: Princeton University; 2018.

[136]

Kang H, Keele L. Spillover effects in cluster randomized trials with noncompliance. 2018. arXiv:1808.06418.

[137]

Loh WW, Hudgens MG, Clemens JD, Ali M, Emch ME. Randomization inference with general interference and censoring. 2018. arXiv:1803.02302.

[138]

Vanderweele TJ, Hong G, Jones SM, Brown JL. Mediation and spillover effects in group-randomized trials: a case study of the 4Rs educational intervention. J Am Stat Assoc 2013;108(502):469–82.
