Discovery of Indirect Interactions Between Genes by Deep Learning Using Gene Expression Data



Thesis Type: Postgraduate

Institution Of The Thesis: Yildiz Technical University, Faculty of Chemical and Metallurgical Engineering, Department of Bioengineering, Turkey

Approval Date: 2022

Thesis Language: Turkish

Student: GÜLCE ÇELEN

Supervisor: Alper Yılmaz

Abstract:

Gene regulatory networks are graph-based mathematical models used to elucidate complex regulatory relationships between transcription factors and their target genes. Given the number of possible gene pairs, experimentally testing every transcription factor-target pair is infeasible. Therefore, many computational approaches have been developed to reconstruct gene regulatory networks from different types of biological data. However, accurately uncovering regulatory interactions from gene expression data remains a challenge in computational biology. A convolutional neural network (CNN), a class of deep learning methods, can be used to predict transcription factor-target pairs from gene expression data. In this study, a similar approach is applied to human RNA-Seq data to predict human transcription factor-target pairs from normal and tumor samples. Human transcription factor-target interactions were retrieved from the TRRUST database, and human RNA-Seq data for tumor and normal samples were retrieved from UCSC Toil, which contains data from the TCGA and GTEx projects, respectively. For each transcription factor-target pair, expression values were extracted and split into 80% training, 10% validation, and 10% test data. Two separate CNN models were developed to predict transcription factor-target pairs, one for normal samples and one for tumor samples. The CNN model trained on tumor samples achieved 92% validation accuracy, while the model trained on normal samples achieved 91% validation accuracy. Our approach has the potential to extend the existing and ever-growing human gene regulatory networks. In addition, separate transcription factor-target predictions for normal and tumor samples may reveal transcription factor-target interactions that play a role in tumor mechanisms.
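The 80%/10%/10% partition mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the split was performed, so the shuffling step, the fixed seed, the function name split_pairs, and the placeholder pair names below are all assumptions made for illustration only, not the thesis's actual procedure.

```python
import random


def split_pairs(pairs, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle items and split them into train/validation/test lists.

    Hypothetical helper: the fractions follow the abstract (80/10/10),
    but the shuffle-then-slice strategy is an assumption.
    """
    shuffled = list(pairs)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    # Whatever remains after train and validation becomes the test set.
    test = shuffled[n_train + n_val:]
    return train, val, test


# Placeholder (transcription factor, target) names, not real TRRUST entries.
example_pairs = [(f"TF{i}", f"GENE{i}") for i in range(100)]
train, val, test = split_pairs(example_pairs)
```

Taking the test set as the remainder means no pair is dropped when the total is not evenly divisible by the chosen fractions.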