Shanghai Institutes for Biological Sciences, China
Title: Deciphering the mechanisms of complex diseases using network and machine learning approaches
Biography: Tao Huang
Complex diseases, such as cancer, involve dysfunctions at multiple levels, and several methods exist to identify the dysfunctions at each level. At the genomic level, a GWAS (genome-wide association study) can detect SNPs (single nucleotide polymorphisms) associated with a disease phenotype. However, many SNPs identified by GWAS lie in intergenic regions and cannot be annotated to specific genes. With the help of eQTL (expression quantitative trait loci) analysis, the downstream genes of the GWAS SNPs can be found. Based on the co-expression network, regulatory network, protein-protein interaction network, or even protein-chemical interaction network of these genes, the possible signal transduction cascade triggered by the genetic perturbations can then be traced. By integrating all these omics data, we can establish a systems biology model of how the GWAS SNPs affect the expression of their direct target genes, how these direct targets affect secondary target genes, proteins, or metabolites, and how they eventually cause catastrophic pathological changes at the network or pathway level. With such a comprehensive network, we can study not only the mechanism of a single disease but also the relationships among several traits. For example, schizophrenia and anti-tuberculosis drug-induced hepatotoxicity frequently co-occur, yet they do not share common genes. We mapped the genes of both conditions onto the network and revealed their common drivers through two-way RWR (random walk with restart) analysis. As another example, lung cancer involves dysfunctions at several levels, but how the dysfunctions at different levels connect to each other is still unknown. We investigated the differences in mutation, methylation, mRNA expression, and microRNA expression between cancer and normal tissues using machine learning methods, then mapped the dysfunctions at each level onto the comprehensive network. By connecting them to each other with shortest paths, we discovered the genes that occur most frequently on these shortest paths and that may drive tumorigenesis at multiple levels.
Overall, network and machine learning approaches are effective for dissecting complex systems and revealing the mechanisms of diseases.
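
The two network analyses mentioned in the abstract, RWR and shortest-path gene counting, can be illustrated with a minimal Python sketch on a toy network. Everything here is an assumption made for illustration and not the speaker's actual pipeline: the gene names (`a1`, `hub`, and so on) are placeholders, the restart probability of 0.3 is arbitrary, and "two-way" is interpreted as running RWR once from each disease's gene set and scoring every other gene by the smaller of its two visiting probabilities.

```python
from collections import Counter, deque


def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until it stops changing.

    Assumes an undirected graph in which every node has at least one neighbor.
    """
    nodes = list(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            # Each node spreads its non-restart mass evenly over its neighbors.
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


def two_way_rwr(adj, seeds_a, seeds_b, restart=0.3):
    """Rank non-seed genes by min(p_a, p_b): an assumed 'two-way' score."""
    p_a = random_walk_with_restart(adj, seeds_a, restart)
    p_b = random_walk_with_restart(adj, seeds_b, restart)
    seeds = set(seeds_a) | set(seeds_b)
    scores = {n: min(p_a[n], p_b[n]) for n in adj if n not in seeds}
    return sorted(scores, key=scores.get, reverse=True)


def shortest_path(adj, source, target):
    """Return one shortest path (breadth-first search), or None if unreachable."""
    prev = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                queue.append(v)
    return None


def shortest_path_gene_counts(adj, sources, targets):
    """Count how often each interior gene lies on a source-to-target shortest path."""
    counts = Counter()
    for s in sources:
        for t in targets:
            path = shortest_path(adj, s, t)
            if path:
                counts.update(path[1:-1])  # endpoints are excluded
    return counts


# Hypothetical toy network given as an undirected adjacency list.
toy_network = {
    "a1": ["hub", "x"], "a2": ["hub"],
    "b1": ["hub", "y"], "b2": ["hub"],
    "hub": ["a1", "a2", "b1", "b2"],
    "x": ["a1"], "y": ["b1"],
}

# Two gene sets with no gene in common, as in the two-disease example.
shared_ranking = two_way_rwr(toy_network, ["a1", "a2"], ["b1", "b2"])

# Genes flagged at two different levels, connected by shortest paths.
path_counts = shortest_path_gene_counts(toy_network, ["x", "a2"], ["y", "b2"])

print(shared_ranking[0], path_counts.most_common(1))
```

In this toy network the two gene sets share no member, yet both analyses point to the same connecting node. On a real interaction network one would more likely use weighted edges and an established graph library, and assess the significance of the scores against randomized gene sets.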