Education

Ph.D. in Statistics, Boston University (2022 - present)
Research Interests:
  • Causal Inference
  • Data Science
  • Statistical Machine Learning

B.A. in Mathematics, Texas Christian University (2019 - 2022)
Minors in Computer Science and Economics
Honors Thesis: Statistical Models for Reading Count Data

Experience

Researcher
Boston University (January 2024 - present)
  • Establishing a theoretical framework for the consistency and asymptotic normality of multi-component intervention packages in a newly developed adaptive clinical trial process, using probability theory, statistical inference, and empirical processes
  • Developing a mixed-effects model with center characteristics to address confounding by indication in the trial process

Statistical Consultant
Boston University (September 2023 - present)
  • Managed several teams totaling 20+ master’s-level students to support statistical analysis for a wide range of research projects conducted by university researchers, averaging 6-7 completed projects every 4 months
  • Developed statistical models using machine learning, causal inference, and predictive modeling in various fields, including healthcare, psychology, and the social sciences
  • Analyzed data, proposed appropriate statistical methods, created informative data visualizations, and evaluated results to ensure the integrity and validity of research projects, specifically to mitigate risk and bias in experimentation and findings

Research Assistant
Texas Christian University (August 2020 - May 2022)
  • Developed 5 discrete/count data models with different regularization configurations and conducted simulations to measure reading accuracy for elementary school students using students’ reading ability data
  • Published the resulting paper in the Journal of Applied Statistics in July 2022 (DOI: 10.1080/02664763.2022.2103101)

Technology Intern - Data Track
New York Life Insurance (May 2021 - August 2021)
  • Built SQL (HiveQL) scripts to transfer historical batches of Type 2 slowly changing dimension tables from the conformed zone to the processed zone for business use in Policy, Product, and Transactions
  • Assisted in developing automated scripts/jobs (Python, PySpark, Glue) to transfer data from an AWS S3 bucket to Redshift
  • Collected URLs and certificates associated with applications to populate a cybersecurity platform, giving the cybersecurity team an easier tracking system during cyber-attack events
  • Built a sorting algorithm in Excel to manage the report of employees taking training courses

Data Scientist Intern
COUNTRY Financial (May 2020 - August 2020)
  • Built two predictive classification models in Python for smoker propensity and liar detection, using user profile data to help reduce insurance fraud
  • Preprocessed the data using SQL on Hive/Spark, performed exploratory data analysis using Matplotlib/Seaborn in Python, and developed a supervised machine learning model with XGBoost that achieved an AUC of 0.74 and a precision of 0.9
  • Presented results to life executive leadership and recommended deployment options; deployment was planned for H2 2020
  • Developed a Python application using Requests to scrape NOAA weather spotter and radar data for claims use cases

Academic Tutor
Texas Christian University (November 2019 - May 2022)
  • Created study materials and plans each semester to help 6 to 10 student-athletes succeed in various subjects, including lower- and upper-division mathematics (Pre-Calculus through Discrete Math) and beginner to intermediate Economics

Publications

Minh Thu Bui, Cornelis J. Potgieter & Akihito Kamata (2022) Penalized likelihood methods for modeling count data, Journal of Applied Statistics, DOI: 10.1080/02664763.2022.2103101

Presentations & Talks

Minh Thu Bui. "Penalized Likelihood Methods for Modeling of Reading Count Data"
  • Honors College Student Research Symposium (SRS), Texas Christian University, April 2022
  • 16th Annual Texas Undergraduate Mathematics Conference (TUMC), Virtual, October 2021