Publications – Xinlei (Sherry) Wang

Corresponding author *; Doctoral students underlined

Before 2010

1. Wang, X.*, Stokes, S. L., Lim, J., and Chen, M. (2006), “Concomitants of Multivariate Order Statistics with Application to Judgment Post-stratification”. Journal of the American Statistical Association, 101, 1693-1704.

2. Wang, X.*, Lim, J., and Stokes, S. L. (2006), “Forming Post-Strata via Bayesian Treed Capture-recapture Models”. Biometrika, 93(4), 861-876.

3. Gao, Y., Lu, L., and Wang, X. (2006), “Significance Calculation and New Analysis Method in Searching for New Physics at the LHC”. European Physical Journal C, 45, 659-667.

4. Wang, X.* and George, E. I. (2007), “Adaptive Bayesian Criteria in Variable Selection for Generalized Linear Models”. Statistica Sinica, 17(2), 667-690.

5. Lim, J., Wang, X., and Sherman, M. (2007), “An Adjustment of Edge Effects Using an Augmented Neighborhood Model in Spatial Auto-Logistic Models”. Computational Statistics and Data Analysis, 51(8), 3679-3688.

6. Stokes, S. L., Wang, X., and Chen, M. (2007), “Judgment Post-Stratification with Multiple Rankers”. Journal of Statistical Theory and Applications, 6(4), 344-359.

7. Abu-Nimeh, S., Nappa, D., Wang, X., and Nair, S. (2007), “A Comparison of Machine Learning Techniques for Phishing Detection”. ACM International Conference Proceeding Series, 269, 60-69.

8. Wang, X.*, Lim, J. and Stokes, S. L. (2008), “A Nonparametric Mean Estimator for Judgment Post-stratified Data”. Biometrics, 64(2), 355-363.

9. Lim, J., Wang, X.*, Lee, S., and Jung, S. (2008), “A Distribution-Free Test of Constant Mean in Mixed Effects Models”. Statistics in Medicine, 27(19), 3833-3846.

10. Abu-Nimeh, S., Nappa, D., Wang, X., and Nair, S. (2008), “Bayesian Additive Regression Trees-Based Spam Detection for Enhanced Email Privacy”. 2008 Third International Conference on Availability, Reliability and Security, 1044-1051.

11. Abu-Nimeh, S., Nappa, D., Wang, X., and Nair, S. (2008), “A Distributed Architecture for Phishing Detection using Bayesian Additive Regression Trees”. eCrime Researchers Summit, 1-10.

12. Xie, Y., Wang, X., and Story, M. (2009), “Statistical Methods of Background Correction for Illumina Beadarray Data”. Bioinformatics, 25(6), 751-757.

13. Lim, J., Kim, S. J., and Wang, X.* (2009), “Estimation of Stochastically Ordered Survival Functions by Geometric Programming”. Journal of Computational and Graphical Statistics, 18(4), 978-994.

14. Lim, J., Wang, X., and Choi, W. (2009), “Maximum Likelihood Estimation of Ordered Multinomial Probabilities by Geometric Programming”. Computational Statistics and Data Analysis, 53(4), 889-893.

15. Abu-Nimeh, S., Nappa, D., Wang, X., and Nair, S. (2009), “Hardening Email Security using Bayesian Additive Regression Trees”, in Machine Learning, edited by Mellouk, A. and Chebira, A., I-Tech Education and Publishing, Ch. 9.

16. Abu-Nimeh, S., Nappa, D., Wang, X., and Nair, S. (2009), “Distributed Phishing Detection by Applying Variable Selection using Bayesian Additive Regression Trees”. ICC ’09, IEEE International Conference on Communications, 14(18), 1-5.

2010-2015

17. Wang, X.* and Karr, A. F. (2010), “Preserving Data Utility via BART”. Journal of Statistical Planning and Inference, 140(9), 2551-2561.

18. Chen, M., Xia, Y., and Wang, X. (2011), “Managing Supply Uncertainties Through Bayesian Information Update”. IEEE, Robotics and Automation Society, 7(1), 24-36.

19. Chen, M. and Wang, X. (2011), “Approximate Predictive Densities and Their Applications in Generalized Linear Models”. Computational Statistics and Data Analysis, 55(4), 1570-1580.

20. Xiao, G., Wang, X., and Khodursky, A. (2011), “Modeling Three-dimensional Chromosome Structures Using Gene Expression Data”. Journal of the American Statistical Association, 106(493), 61-72. (This paper was highlighted in the May 2011 issue of Amstat News).

21. Wang, X.*, Wang, K., and Lim, J. (2012), “Isotonized CDF Estimation from Judgment Post-stratification Data with Empty Strata”. Biometrics, 68(1), 194-202.

22. Wang, X.*, Chen, M., Khodursky, A., and Xiao, G. (2012), “Bayesian Joint Analysis of Gene Expression Data and Gene Functional Annotations”. Statistics in Biosciences, 4(2), 300-318.

23. Xiao, G., Wang, X., LaPlant, Q., Nestler, E. J., and Xie, Y. (2013), “Detection of Epigenetic Changes Using ANOVA with Spatially Varying Coefficients”. Statistical Applications in Genetics and Molecular Biology, 12(2), 189-205. DOI: 10.1515/sagmb-2012-0057.

24. Wang, X., Zang, M., and Xiao, G. (2013), “Epigenetic Change Detection and Pattern Recognition via Bayesian Hierarchical Hidden Markov Models”. Statistics in Medicine, 32(13), 2292-2307. DOI: 10.1002/sim.5658.

25. Chen, M., Zang, M., Wang, X., and Xiao, G. (2013), “A Powerful Bayesian Meta-analysis Method to Integrate Multiple Gene Set Enrichment Studies”. Bioinformatics, 29(7), 862-869. DOI: 10.1093/bioinformatics/btt068.

26. Lim, J. and Wang, X. (2013), “Response to Letter to the Editor by Dr. Vossoughi”. Statistics in Medicine, 32(4), 717. DOI: 10.1002/sim.5660.

27. Yang, J., Wang, X., Kim, M., Xie, Y., and Xiao, G. (2014), “Detection of Candidate Tumor Driver Genes Using a Fully Integrated Bayesian Approach”. Statistics in Medicine, 33(10), 1784-1800. DOI: 10.1002/sim.6066.

28. Lim, J., Chen, M., Park, S., Wang, X., and Stokes, L. (2014), “Kernel Density Estimators for Ranked Set Samples”. Communications in Statistics, 43, 2156-2168. DOI: 10.1080/03610926.2013.791372.

29. Ahn, S., Lim, J., and Wang, X. (2014), “The Student’s t Approximation to Distributions of Pivotal Statistics from Ranked Set Samples”. Journal of the Korean Statistical Society, 43(4), 643-652. DOI: 10.1016/j.jkss.2014.01.004.

30. Chen, M., Ahn, S., Wang, X., and Lim, J. (2014), “Generalized Isotonized Mean Estimators for Judgment Post-stratification with Multiple Rankers”. Journal of Agricultural, Biological, and Environmental Statistics, 19(4), 405-418.

31. Wang, X.*, Lim, J., Kim, S. J., and Hahn, K. S. (2015), “Estimating Cell Probabilities in Contingency Tables with Constraints on Marginals/Conditionals by Geometric Programming with Applications”. Computational Statistics, 30(1), 107-129. DOI: 10.1007/s00180-014-0525-y.

32. Zang, X., Chen, M., Zhou, Y., Xiao, G., Yang, X.*, and Wang, X.* (2015), “Identification of CDKN3 Gene Expression as a Prognostic Biomarker in Lung Adenocarcinoma via Meta-analysis”. Cancer Informatics, 14, 183-191. DOI: 10.4137/CIN.S17287.

2016

33. Bai, O., Chen, M., and Wang, X.* (2016), “Bayesian Estimation and Testing in Random Effects Meta-Analysis of Rare Binary Events”. Statistics in Biopharmaceutical Research, 8(1), 49-59. DOI: 10.1080/19466315.2015.1096823.

34. Park, S., Kim, S.J., Yu, D., Peña-Llopis, S., Gao, J., Park, J.S., Chen, B., Norris, J., Wang, X., Chen, M., Kim, M., Yong, J., Wardak, Z., Choe, K., Story, M., Starr, T., Cheong, J.H., and Hwang, T.H. (2016), “An integrative somatic mutation analysis to identify pathways linked with survival outcomes across 19 cancer types”. Bioinformatics, 32(11), 1643-1651. DOI: 10.1093/bioinformatics/btv692.

35. Wang, X., Lim, J., and Stokes, S. L. (2016), “Using Ranked Set Sampling with Cluster Randomized Designs for Improved Inference on Treatment Effects”. Journal of the American Statistical Association, 111(516), 1576-1590. DOI: 10.1080/01621459.2015.1093946.

2017

36. Wang, X.*, Ahn, S., and Lim, J. (2017), “Unbalanced Ranked Set Sampling in Cluster Randomized Studies”. Journal of Statistical Planning and Inference, 187, 1-16. DOI: 10.1016/j.jspi.2017.02.005.

37. Ahn, S., Wang, X., and Lim, J. (2017), “On Unbalanced Group Sizes in Cluster Randomized Designs Using Ranked Set Sampling”. Statistics and Probability Letters, 123, 210-217. DOI: 10.1016/j.spl.2016.12.007.

38. Zamanzade, E. and Wang, X. (2017), “Estimation of Population Proportion for Judgment Post-stratification”. Computational Statistics and Data Analysis, 112, 257-269. DOI: 10.1016/j.csda.2017.03.016.

39. Li, L., Wang, X.*, Xiao, G., and Gazdar, A. (2017), “Integrative Gene Set Enrichment Analysis Using Isoform Specific Expression”. Genetic Epidemiology, 41(6), 498-510. DOI: 10.1002/gepi.22052.

40. Jia, G., Wang, X.*, and Xiao, G.* (2017), “PBNPA: a Permutation Based Non-parametric Analysis of CRISPR Screen Data”. BMC Genomics, 18:545. DOI: 10.1186/s12864-017-3938-5.

41. Yu, D., Lim, J., Wang, X., Liang, F., and Xiao, G. (2017), “Enhanced Construction of Gene Regulatory Networks using Hub Gene Information”. BMC Bioinformatics, 18:186. DOI: 10.1186/s12859-017-1576-1.

2018

42. Lu, W., Wang, X.*, Zhan, X., and Gazdar, A. (2018), “Meta-analysis Approaches to Combine Multiple Gene Set Enrichment Studies”. Statistics in Medicine, 37(4), 659-672. DOI: 10.1002/sim.7540.

43. Zamanzade, E. and Wang, X. (2018), “Proportion Estimation in Ranked Set Sampling in the Presence of Tie Information”. Computational Statistics, 33(3), 1349-1366. DOI: 10.1007/s00180-018-0807-x.

44. Son, W., Lim, J. and Wang, X. (2018), “Accuracy of Regularized D-rule for Binary Classification”. Journal of the Korean Statistical Society, 47(2), 150-160. DOI: 10.1016/j.jkss.2017.11.002.

45. Li, L., Bai, O., and Wang, X.* (2018), “An Integrative Shrinkage Estimator for Random-Effects Meta-Analysis of Rare Binary Events”. Contemporary Clinical Trials Communications, 10, 141-147. DOI: 10.1016/j.conctc.2018.04.004.

46. Li, X., Choudhary, P. K., Biswas, S., and Wang, X.* (2018), “A Bayesian Latent Variable Approach to Aggregation of Partial and Top Ranked Lists in Genomic Studies”. Statistics in Medicine, 37(28), 4266-4278. DOI: 10.1002/sim.7920. [Code: https://github.com/xuelilyli/BiG]

47. Wang, T., Lu, R., Kapur, P., Jaiswal, B., Hannan, R., Zhang, Z., Pedrosa, I., Luke, J. J., Zhang, H., Goldstein, L., Yousuf, Q., Gu, Y., McKenzie, T., Joyce, A., Kim, M.S., Wang, X., Luo, D., Onabolu, O., Xie, Z., Chen, M., Filatenkov, A., Torrealba, J., Luo, X., Guo, W., He, J., Stawiski, E., Modrusan, Z., Durinck, S., Seshagiri, S., and Brugarolas, J. (2018), “An Empirical Approach Leveraging Tumorgrafts to Dissect the Tumor Microenvironment in Renal Cell Carcinoma Identifies Missing Link to Prognostic Inflammatory Factors”. Cancer Discovery, 8(9), 1142-1155. DOI: 10.1158/2159-8290.CD-17-1246.

2019

48. Li, X., Wang, X.*, and Xiao, G. (2019), “A Comparative Study of Rank Aggregation Methods for Partial and Top Ranked Lists in Genomic Applications”. Briefings in Bioinformatics, 20(1), 178-189. DOI: 10.1093/bib/bbx101.

49. Li, L. and Wang, X.* (2019), “Meta-Analysis of Rare Binary Events in Groups with Unequal Variability”. Statistical Methods in Medical Research, 28(1), 263-274. DOI: 10.1177/0962280217721246. (This paper won an honorable mention award in the ASA Biopharmaceutical Section’s student paper competition for JSM 2017).

50. Park, S., Lim, J., Wang, X., and Lee, S. (2019), “Permutation-based Testing for Covariance Separability”. Computational Statistics, 34(2), 865-883. DOI: 10.1007/s0018.

51. Wang, F., Shen, L., Zhou, H., Wang, S., Wang, X., and Tao, P. (2019), “Machine Learning Classification Model for Functional Binding Modes of TEM-1 β-Lactamase”, Frontiers in Molecular Biosciences, 6:47. DOI: 10.3389/fmolb.2019.00047.

52. Jia, G., Wang, X.*, Li, Q., Lu, W., Tang, X., Wistuba, I., and Xie, Y. (2019), “RCRnorm: An Integrated System of Random-coefficient Hierarchical Regression Models for Normalizing NanoString nCounter Data”. The Annals of Applied Statistics, 13(3), 1617-1647. DOI: 10.1214/19-AOAS1249.

53. Li, Q., Wang, X., Liang, F., Yi, F., Yang, X., Gazdar, A. and Xiao, G. (2019), “A Bayesian Hidden Potts Mixture Model for Analyzing Lung Cancer Pathological Images”. Biostatistics, 20(4), 565-581. DOI: 10.1093/biostatistics/kxy019.

54. Wang, F., Zhou, H., Wang, X., and Tao, P. (2019), “Dynamical behavior of β-lactamases and penicillin binding proteins in different functional states and its potential role in evolution”. Entropy, 21(11), 1130. DOI: 10.3390/e21111130.

55. Li, Q., Wang, X., Liang, F., and Xiao, G. (2019), “A Bayesian Mark Interaction Model for Analysis of Tumor Pathology Images”. The Annals of Applied Statistics, 13(3), 1708-1732. DOI: 10.1214/19-AOAS1254.

2020

56. Yin, S., Wang, X.*, Jia, G., and Xie, Y. (2020), “MIXnorm: Normalizing Gene Expression Data from RNA Sequencing of Formalin-Fixed Paraffin-Embedded Samples”. Bioinformatics, 36(11), 3401-3408. DOI: 10.1093/bioinformatics/btaa153.

57. Wang, X.*, Wang, M., Lim, J., and Ahn, S. (2020), “Using Ranked Set Sampling with Binary Outcomes in Cluster Randomized Designs”. The Canadian Journal of Statistics, 48(3), 342-365. DOI: 10.1002/cjs.11533.

58. Park, S., Wang, X.*, Lim, J., Xiao, G., Lu, T., and Wang, T.* (2020), “Modeling Immunogenic Neoantigens Using a Bayesian Multiple Instance Regression Model”, Statistical Methods in Medical Research, 29(10), 3032-3047. DOI: 10.1177/0962280220914321.

59. Zhang, C., Chen, M., and Wang, X.* (2020), “Statistical Methods for Quantifying Between-study Heterogeneity in Meta-analysis with Focus on Rare Binary Events”. Statistics and Its Interface, 13(4), 449-464. DOI: 10.4310/SII.2020.v13.n4.a3.

60. Choi, H., Lim, J., Wang, X., and Kwak, M. (2020), “A Self-Consistent Estimator for Interval Valued Data”, Statistics, 54(5), 1005-1029. DOI: 10.1080/02331888.2020.1811282.

61. Song, Z., Zhou, H., Tian, H., Wang, X., and Tao, P. (2020), “Unraveling the Energetic Significance of Chemical Events in Enzyme Catalysis via Machine-Learning based Regression Approach”. Communications Chemistry, 3, 134. DOI: 10.1038/s42004-020-00379-w.

62. Zamanzade, E. and Wang, X.* (2020), “Improved Nonparametric Estimation using Partially Ordered Sets”. In: Chandra G., Nautiyal R., Chandra H. (eds) Statistical Methods and Applications in Forestry and Environmental Sciences, 57-77. Forum for Interdisciplinary Mathematics. Springer, Singapore. DOI: 10.1007/978-981-15-1476-0_5.

2021

63. Cheng, Y., Wang, X.*, and Xia, Y. (2021), “Supervised t-distributed Stochastic Neighbor Embedding for Data Visualization and Classification”. INFORMS Journal on Computing, 33(2), 566-585. DOI: 10.1287/ijoc.2020.0961.

64. Zamanzade, E. and Wang, X.* (2021), “Estimating the Area Under A Receiver Operating Characteristic Curve Using Partially Ordered Sets”, International Journal of Biostatistics, 17(1), 139-152. DOI: 10.1515/ijb-2019-0127.

65. Yin, S., Zhan, X., Yao, B., Xiao, G., Wang, X.*, and Xie, Y.* (2021), “SMIXnorm: Fast and Accurate RNA-seq Data Normalization for Formalin-Fixed Paraffin-Embedded Samples”. Frontiers in Genetics, 12:650795. DOI: 10.3389/fgene.2021.650795.

66. Xiong, D., Zhang, Z., Wang, T., and Wang, X.* (2021), “A Comparative Study of Multiple Instance Learning Methods for Early Cancer Detection Using TCR Repertoire Sequencing Data”. Computational and Structural Biotechnology Journal, 19, 3255-3268. DOI: 10.1016/j.csbj.2021.05.038.

67. Zhang, Z., Xiong, D., Wang, X., Liu, H., and Wang, T. (2021), “Mapping the Functional Landscape of T Cell Receptor Repertoires by Single-T Cell Transcriptomics”. Nature Methods, 18, 92-99.

68. Cao, Y., Zhang, Y., Wang, X., and Chen, M. (2021), “Graphical Modeling of Multiple Biological Pathways in Genomic Studies”. In: Zhao, Y. and Chen, D. (eds) Modern Statistical Methods for Health Research. DOI: 10.1007/978-3-030-72437-5.

69. Trozzi, F., Wang, X., and Tao, P. (2021), “UMAP as Dimensionality Reduction Tool for Molecular Dynamics Simulations of Biomacromolecules: A Comparison study”. Journal of Physical Chemistry B, 125(19), 5022-5034. DOI: 10.1021/acs.jpcb.1c02081.

70. Lu, T., Park, S., Zhu, J., Wang, Y., Zhan, X., Wang, X., Wang, L., Zhu, H., and Wang, T. (2021), “Overcoming Expressional Drop-outs in Lineage Reconstruction from Single-Cell RNA-Sequencing Data”. Cell Reports, 34(1), 108589. DOI: 10.1016/j.celrep.2020.108589.

71. Wang, G., Cheng, Y., Chen, M., and Wang, X.* (2021), “Jackknife Empirical Likelihood Confidence Intervals for Assessing Heterogeneity in Meta-analysis of Rare Binary Event Data”. Contemporary Clinical Trials, 107, 106440. DOI: 10.1016/j.cct.2021.106440.

72. Zhang, C., Wang, X.*, Chen, M., and Wang, T. (2021), “A Comparison of Hypothesis Tests for Homogeneity in Meta-analysis with Focus on Rare Binary Events”, Research Synthesis Methods, 12(4), 408-428. DOI: 10.1002/jrsm.1484.

73. Park, S., Wang, X., and Lim, J. (2021), “Estimating High-dimensional Covariance and Precision Matrices under General Missing Dependency”, Electronic Journal of Statistics, 15(2), 4868-4915. DOI: 10.1214/21-EJS1892.

2022+

74. Xu, C., Wang, X.*, Lim, J., Xiao, G., and Xie, Y. (2022), “RCRdiff: A Fully Integrated Bayesian Method for Differential Expression Analysis Using Raw NanoString nCounter Data”. Statistics in Medicine, 41(4), 665-680.

75. Zhang, Z., Chang, W.Y., Wang, K., Yang, Y., Wang, X., Yao, C., Wu, T., Wang, L., and Wang, T. (2022), “Interpreting the B Cell Receptor Repertoire with Single Cell Gene Expression”. Nature Machine Intelligence, 4, 596-604.

76. Li, Z., Wang, X., Zarazaga, J., Smith-Colin, J., and Minsker, B. (2022), “Do Infrastructure Deserts Exist? Measuring and Mapping Infrastructure Equity”. Cities, 130, 103927.

77. Ahn, S., Wang, X.*, Wang, M., and Lim, J. (2022), “On Continuity Correction for RSS-structured Cluster Randomized Designs with Binary Outcomes”. Metron, 80, 383-397.

78. Ahn, S., Wang, X., and Lim, J. (2022), “Efficient Sample Allocation by Local Adjustment for Unbalanced Ranked Set Sampling”. In: Ng, H. K. T. and Heitjan, D. F. (eds) “Recent Advances on Sampling Methods and Educational Statistics – In Honor of S. Lynne Stokes”, Springer.

79. Kim, Y., Wang, T., Xiong, D., Wang, X., and Park, S. (2022), “Multiple Instance Neural Networks Based on Sparse Attention for Cancer Detection using T-cell receptor sequences”. BMC Bioinformatics, 23:469.

80. Moon, C., Wang, X., and Lim, J. (2022), “Empirical Likelihood Inference for Area under the ROC Curve using Ranked Set Samples”. Pharmaceutical Statistics, 21, 1219-1245.

81. Park, S., Wang, X., and Lim, J., “Sparse Hanson-Wright Inequality for Bilinear Form of Sub-Gaussian Variables”. Stat. In Press.

82. Wang, G., Cheng, Y., Xia, Y., Ling, Q., and Wang, X.* (2022+), “A Bayesian Semi-supervised Approach to Key Phrase Extraction with Only Positive and Unlabeled Data”. Accepted by INFORMS Journal on Computing.