Associate Professor, Department of Genetics and Biochemistry
I received my B.Sc. and M.Sc. in Biology from Zhejiang University (Hangzhou, China) and my Ph.D. in Botany from the University of Georgia. I also received an M.Sc. in Computer Science from Mississippi State University. I worked at the San Diego Supercomputer Center as a Programmer in 2001, at the Samuel Roberts Noble Foundation as an Assistant Scientist from 2002 to 2003, in the Department of Computer Science and Engineering at SUNY at Buffalo as a Research Scientist in 2004, and in the Division of Biology at Kansas State University as a Research Assistant Professor from 2005 to 2006. I joined the Department of Genetics and Biochemistry at Clemson University as an Assistant Professor in 2007 and have been an Associate Professor since 2013.
The research in my lab has focused on biological knowledge discovery, genomic data integration and mining, and computational systems biology. We previously developed machine learning models and web-based tools for biomedical research, including BindN and BindN+ for predicting DNA/RNA-binding residues in protein sequences, MuStab for protein stability prediction, and seeSUMO for protein sumoylation site prediction. We have also integrated and mined the large body of publicly available gene expression data to understand the molecular pathways involved in human diseases, including intellectual disability, autism, and cancer.
Recently, we have been developing machine learning and data mining approaches for the functional annotation of human long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs) by leveraging the vast amount of genetic and genomic data (“biological big data”). We have constructed machine learning models and gene co-expression networks to predict and prioritize candidate lncRNAs associated with intellectual disability and autism spectrum disorders. We have also applied deep learning techniques to the prediction and pattern analysis of lncRNA subcellular localization, circRNA back-splicing code, and RNA-protein interactions. Our studies demonstrate that genomic data mining can not only give insights into RNA functions in gene regulation and 3D genome organization, but also provide valuable information for experimental studies of candidate genes associated with human diseases.
Wang J, Wang L. 2019. Deep learning of the back-splicing code for circular RNA formation. Bioinformatics btz382. https://www.ncbi.nlm.nih.gov/pubmed/31077303
Gudenas BL, Wang J, Kuang S, Wei A, Cogill SB, Wang L. 2019. Genomic data mining for functional annotation of human long noncoding RNAs. Journal of Zhejiang University – Science B 20: 476-487.
Ang CE, Ma Q, Wapinski OW, Fan S, Flynn RA, Lee QY, Coe B, Onoguchi M, Olmos VH, Do BT, Dukes-Rimsky L, Xu J, Tanabe K, Wang L, Elling U, Penninger JM, Zhao Y, Qu K, Eichler EE, Srivastava A, Wernig M, Chang HY. 2019. The novel lncRNA lnc-NR2F1 is pro-neurogenic and mutated in human neurodevelopmental disorders. eLife 8: e41770.
Gudenas BL, Wang L. 2018. Prediction of lncRNA subcellular localization with deep learning from sequence features. Scientific Reports 8: 16385.
Cogill SB, Srivastava AK, Yang MQ, Wang L. 2018. Co-expression of long non-coding RNAs and autism risk genes in the developing human brain. BMC Systems Biology 12: 91.
Yang X, Kuang S, Wang L, Wei Y. 2018. MHC class I chain-related A: Polymorphism, regulation and therapeutic value in cancer. Biomedicine & Pharmacotherapy 103: 111-117.
Wang J, Wang L. 2017. Prediction of back-splicing sites reveals sequence compositional features of human circular RNAs. In Proceedings of 2017 IEEE 7th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS 2017). IEEE.
Li J, Yang Y, Guevara J, Wang L, Cao W. 2017. Identification of a prototypical single-stranded uracil DNA glycosylase from Listeria innocua. DNA Repair (Amst) 57: 107-115.
Gudenas BL, Srivastava AK, Wang L. 2017. Integrative genomic analyses for identification and prioritization of long non-coding RNAs associated with autism. PLoS ONE 12: e0178532.
Gudenas BL, Wang L. 2017. A genetic algorithm for finding discriminative functional motifs in long non-coding RNAs. In: Cai Z, Daescu O, Li M (eds) Bioinformatics Research and Applications. ISBRA 2017. Lecture Notes in Computer Science, vol 10330, pp. 408-413. Springer.
Xia B, Liu Y, Guevara J, Li J, Jilich C, Yang Y, Wang L, Dominy BN, Cao W. 2017. Correlated mutation in the evolution of catalysis in uracil DNA glycosylase superfamily. Scientific Reports 7: 45978.
Cogill SB, Wang L. 2016. Support vector machine model of developmental brain gene expression data for prioritization of autism risk gene candidates. Bioinformatics 32: 3611-3618.