Bilin Nong's Digital Curriculum Vitae
My resume can be downloaded here.
Education
Sep 2021 - June 2026
University of Toronto (St. George)
Honours Bachelor of Science - Statistical Science & Quantitative Biology Major - ASIP Co-op Program
GPA: 3.95/4.00; Average: A+
Experience
May 2024 - Sep 2024
Research Scholar with Prof. Aya Mitani
Data Science Institute, Toronto
- Conducted a systematic literature review to evaluate the influence of global diseases on data science methodologies.
- Built R scripts to compile datasets, integrating disease burden rankings with data science publication metrics.
- Used non-parametric statistical methods to analyze disease research trends from 2010 to 2024.
- Applied hierarchical clustering to build dendrograms and heatmaps that visualize similarity among journals based on their main research focus.
- Delivered a poster presentation at the SUDS research day. (poster)
Apr 2023 - Apr 2024
Research Assistant with Dr. Jochen Weile and Prof. Fritz Roth
Lunenfeld-Tanenbaum Research Institute @Sinai Health, Toronto
- Co-developed TileSeqMave, a bioinformatics pipeline that quantifies the variant effect of genetic mutations, and compiled benchmark sets of variants curated from extensive literature reviews.
- Leveraged programming skills in Python and R to evaluate and enhance the predictive accuracy of variant effect maps across different pipeline versions, achieving a 30% improvement over previous pipelines.
- Formulated recommendations for optimizing the implementation of the TileSeqMave pipeline based on analysis results, which are estimated to reduce computational time by up to 40%.
- Presented the methodologies and key findings of this project to the entire lab and authored a comprehensive report. (report)
May 2023 - Aug 2023
Research Assistant with Prof. Kuan Liu and Prof. Kevin Thorpe
Dalla Lana School of Public Health, Toronto
- Used R to simulate Randomized Controlled Trials (RCTs) with various outcomes under different missing-data mechanisms.
- Applied five common missing data handling methods to each simulated data set and assessed their performance using a range of evaluation metrics, including type I error/power, RMSE, and average width of confidence intervals.
- Provided recommendations on how to deal with missing data in RCTs based on evaluation results, and employed the tidyverse library in R for advanced data visualization.
- Enhanced expertise in missing data methodologies, as well as the design and analysis of RCTs, and gave a poster presentation at the research showcase day. (poster)
Projects
safeTO: A Community Safety Web Application
- Collaboratively developed a community safety website (safeTO) that visualizes crime analytics in Toronto neighbourhoods through an interactive city map.
- Designed Java classes for the data access and persistence layer using Clean Architecture, including functionality to fetch data via HTTP requests, export data in JSON format, and store user emails in a key-value database.
- Implemented an Email Alert class to send yearly crime reports to users, utilizing the Builder design pattern to construct and format email content.
- Applied Clean Architecture and SOLID design principles, integrating various design patterns throughout the project to enhance the web app’s maintainability.
Analysis of Sensory Factors & Coffee Ratings
- Fitted a multiple linear regression (MLR) model to investigate the relationship between the overall ratings of coffee and its sensory aspects using data from the Coffee Quality Institute.
- Developed data analysis skills in R, including cleaning the raw data set, fitting the MLR model, checking the model assumptions via diagnostic plots, applying transformations to mitigate violated assumptions, conducting ANOVA and individual t-tests to check for significant linear relationships, and assessing model goodness of fit with a series of likelihood-based criteria.
- Utilized the tidyverse and car packages in R for efficient data summary, data analysis and visualization, showcasing proficiency in programming using R statistical software.
- Showcased a thorough understanding of data integrity by evaluating the model’s limitations, and demonstrated technical writing skills by authoring a 5,000-word report.
Learning Management System Design
- Applied entity-relationship modelling principles to design a schema for a learning management system database, specifically tailored to support the functionalities of a web app for managing student assignments.
- Developed and executed complex SQL queries to facilitate data retrieval and analysis, demonstrating a deep understanding of relational databases and SQL.
- Embedded SQL queries into Python using the psycopg2 library, showcasing the ability to integrate SQL with a high-level programming language for efficient data manipulation.
- Conducted testing and validation of database functionalities, ensuring accuracy and reliability of the data, and thereby facilitating insightful analytics for educational management and improvement.
Technical Skills
I am comfortable with the following languages/applications:
- Python
- Java
- SQL
- R
- Git & GitHub
- HTML & CSS
- Unix Shell
- Bash scripting
- LaTeX & Markdown
- Microsoft Access & Google Suite
Relevant Courses
Here is an exhaustive list of relevant courses I have taken:
- MAT135 (Calculus I)
- MAT136 (Calculus II)
- MAT235 (Multivariable Calculus)
- MAT223 (Linear Algebra I)
- MAT224 (Linear Algebra II)
- MAT246 (Abstract Mathematics)
- BIO130 (Molecular and Cell Biology)
- BIO230 (From Genes to Organisms)
- BCH242 (Intro Biochemistry) (group presentation)
- BCB330 (Special Project in Bioinformatics)
- HMB265 (Human Genetics)
- CHM136 (Intro Organic Chem I)
- CHM247 (Intro Organic Chem II)
- CSC108 (Intro to Computer Programming)
- CSC148 (Intro to Computer Science)
- CSC207 (Software Design)
- CSC343 (Intro to Databases)
- STA237 (Probability, Statistics and Data Analysis I)
- STA238 (Probability, Statistics and Data Analysis II)
- STA302 (Methods of Data Analysis I: Regression Analysis)
- STA303 (Methods of Data Analysis II: Categorical Data Analysis)
- STA314 (Intro to Statistical Learning)
- STA475 (Survival Analysis) (infographic)
Honours and Awards
2024
Samuel Beatty In-Course Scholarship
- Given to students with outstanding academic performance in second, third or fourth year, taking a specialist program offered by the Departments of Mathematics, Physics, Statistics or Computer Science. ($1000 CAD)
2023, 2024
Summer Undergraduate Data Science (SUDS) Research Program Award
- Awarded to fund a summer research project for undergraduate students in the data science field. ($7200 CAD)
2022, 2023
Dean’s List Scholar
- Awarded to students with an excellent GPA at the University.