I received my B.Sc. and M.Sc. in Biology from Zhejiang University (Hangzhou, China) and my Ph.D. in Botany from the University of Georgia. I also received an M.Sc. in Computer Science from Mississippi State University. I worked at the San Diego Supercomputer Center as a Programmer in 2001, the Samuel Roberts Noble Foundation as an Assistant Scientist from 2002 to 2003, the SUNY at Buffalo Department of Computer Science and Engineering as a Research Scientist in 2004, and the Kansas State University Division of Biology as a Research Assistant Professor from 2005 to 2006. I joined the Clemson University Department of Genetics and Biochemistry as an Assistant Professor in 2007 and have been an Associate Professor since 2013.


The research in my lab has focused on biological knowledge discovery, genomic data integration and mining, and computational systems biology. We previously developed machine learning models and web-based tools for biomedical research, including BindN and BindN+ for predicting DNA/RNA-binding residues in protein sequences, MuStab for protein stability prediction, and seeSUMO for protein sumoylation site prediction. We also integrated and mined the vast amount of publicly available gene expression data to understand the molecular pathways involved in human diseases, including intellectual disability, autism, and cancer.

Recently, we have been developing machine learning and data mining approaches for the functional annotation of human long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs) by leveraging the vast amount of genetic and genomic data (“biological big data”). We have constructed machine learning models and gene co-expression networks to predict and prioritize candidate lncRNAs associated with intellectual disability and autism spectrum disorders. We have also applied deep learning techniques to the prediction and pattern analysis of lncRNA subcellular localization, circRNA back-splicing code, and RNA-protein interactions. Our studies demonstrate that genomic data mining can not only give insights into RNA functions in gene regulation and 3D genome organization, but also provide valuable information for experimental studies of candidate genes associated with human diseases.

Recent Publications

