About Me

Hi there. I received my PhD in biostatistics this July. I am now studying at the Metis Data Science Bootcamp and will graduate at the end of this September. After the bootcamp, I am looking for a full-time job in data science. I have a strong background in applied math and statistical modeling, and I am interested in machine learning and statistical applications across data science. I prefer the San Francisco area, but I am open to other locations.

If you want to expand your data science team or are just starting to build one, do not hesitate to contact me.


Skills

  • Python
  • R
  • SQL
  • Matlab
  • JavaScript
  • HTML & CSS
  • Statistical Modeling
  • Machine Learning
  • Hadoop

Education

Nanyang Technological University

July, 2016
Ph.D. Biostatistics

Jilin University

June, 2010
B.S. Mathematics and Applied Mathematics
June, 2009
B.A. Economics

Online Courses

April, 2016
Machine Learning by Stanford University on Coursera
May, 2016
Hadoop Platform and Application Framework by University of California, San Diego on Coursera

Research Experience

Nanyang Technological University

2011 - present
Ph.D. Researcher
  • Designed a computational algorithm to map BS-Seq reads to a reference genome.
  • Developed a statistical method to cluster samples in RNA-Seq data and wrote an R package that implements it.
  • Designed a novel clustering algorithm, then built a statistical model to detect differentially methylated regions in WGBS data, and wrote an R package that implements it.

Jilin University

2009 - 2010
Undergraduate Researcher
  • Applied the System Dynamics approach to analyze the complex relationship between several energy targets and GDP in Jilin Province.

Publications

Estimation and variable selection for generalised partially linear single-index models, Journal of Nonparametric Statistics. 2014; 26(1): 171-185.

Letter to the Editor, The Annals of Applied Statistics. 2013; 7(2): 1244-1246.

TAMeBS: a sensitive bisulfite-sequencing read mapping tool for DNA methylation analysis, Accepted by IEEE BIBM 2014.

Penalized model-based clustering for RNA-Seq count data, Submitted.

Detecting Differentially Methylated Regions from Whole-Genome Bisulfite Sequencing Data via Three-Dimensional Rank Clustering, Submitted.

PhD Thesis

TWO CLUSTERING PROBLEMS IN ANALYZING NEXT GENERATION SEQUENCING DATA, 2016