Integration analysis of three omics data using penalized regression methods: an application to bladder cancer

dc.contributor.author: Pineda Sanjuan, Silvia
dc.contributor.author: Real, Francisco X.
dc.contributor.author: Kogevinas, Manolis
dc.contributor.author: Carrato, Alfredo
dc.contributor.author: Chanock, Stephen J.
dc.contributor.author: Malats, Núria
dc.contributor.author: Steen, Kristel van
dc.contributor.editor: McConkey, David
dc.date.accessioned: 2024-01-31T15:48:10Z
dc.date.available: 2024-01-31T15:48:10Z
dc.date.issued: 2015-12-08
dc.description.abstract: Omics data integration is becoming necessary to investigate the genomic mechanisms involved in complex diseases. During the integration process, many challenges arise, such as data heterogeneity, a small number of individuals relative to the number of parameters, multicollinearity, and the difficulty of interpreting and validating results, owing to their complexity and to limited knowledge of the underlying biological processes. To overcome some of these issues, innovative statistical approaches are being developed. In this work, we propose a permutation-based method to concomitantly assess significance and correct for multiple testing with the MaxT algorithm. This was applied with penalized regression methods (LASSO and ENET) when exploring relationships between common genetic variants, DNA methylation and gene expression measured in bladder tumor samples. The overall analysis flow consisted of three steps: (1) for each gene expression probe, SNPs/CpGs were selected within a 1 Mb window upstream and downstream of the gene; (2) LASSO and ENET were applied to assess the association between each expression probe and the selected SNPs/CpGs in three multivariable models (SNP, CpG, and Global models, the latter integrating SNPs and CpGs); and (3) the significance of each model was assessed using the permutation-based MaxT method. We identified 48 genes whose expression levels were significantly associated with both SNPs and CpGs. Importantly, 36 (75%) of them were replicated in an independent data set (TCGA), and the performance of the proposed method was checked with a simulation study. We further support our results with a biological interpretation based on an enrichment analysis. The approach we propose reduces computational time and is flexible and easy to implement when analyzing several types of omics data. Our results highlight the importance of integrating omics data by applying appropriate statistical strategies to discover new insights into the complex genetic mechanisms involved in disease conditions.
dc.description.department: Depto. de Estadística y Ciencia de los Datos
dc.description.faculty: Fac. de Estudios Estadísticos
dc.description.refereed: TRUE
dc.description.sponsorship: Fondo de Investigaciones Sanitarias (España)
dc.description.sponsorship: Red Temática de Investigación Cooperativa en Cáncer (España)
dc.description.sponsorship: Obra Social Fundación "la Caixa"
dc.description.sponsorship: Instituto de Salud Carlos III
dc.description.sponsorship: European Cooperation in Science and Technology
dc.description.status: pub
dc.identifier.citation: Pineda S, Real FX, Kogevinas M, Carrato A, Chanock SJ, Malats N, et al. (2015) Integration Analysis of Three Omics Data Using Penalized Regression Methods: An Application to Bladder Cancer. PLoS Genet 11(12): e1005689. doi:10.1371/journal.pgen.1005689
dc.identifier.doi: 10.1371/journal.pgen.1005689
dc.identifier.essn: 1553-7404
dc.identifier.issn: 1553-7390
dc.identifier.officialurl: https://doi.org/10.1371/journal.pgen.1005689
dc.identifier.relatedurl: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1005689
dc.identifier.uri: https://hdl.handle.net/20.500.14352/97298
dc.issue.number: 12
dc.journal.title: PLoS Genetics
dc.language.iso: eng
dc.page.final: 22
dc.page.initial: 1
dc.publisher: Plos
dc.relation.projectID: PI06-1614
dc.relation.projectID: RD12/0036/0034
dc.relation.projectID: RD12/0036/0050
dc.relation.projectID: PI12-00815
dc.relation.projectID: F2-2008-201663
dc.relation.projectID: F2-2008-201333
dc.relation.projectID: BM1204
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.accessRights: open access
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject.cdu: 616-006.04
dc.subject.keyword: Omics data
dc.subject.keyword: Integration analysis
dc.subject.keyword: Bladder Cancer
dc.subject.ucm: Ciencias Biomédicas
dc.subject.ucm: Oncología
dc.subject.unesco: 2404.01 Bioestadística
dc.title: Integration analysis of three omics data using penalized regression methods: an application to bladder cancer
dc.type: journal article
dc.type.hasVersion: AM
dc.volume.number: 11
dspace.entity.type: Publication
relation.isAuthorOfPublication: 9ff02bb9-3623-452e-ad72-8bb19687ec4e
relation.isAuthorOfPublication.latestForDiscovery: 9ff02bb9-3623-452e-ad72-8bb19687ec4e
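
Illustrative sketch (not part of the deposited record): the three-step flow summarized in the abstract can be outlined in a few lines of standard-library Python. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names (`within_window`, `lasso_r2`, `maxt_adjusted_pvalues`), the fixed penalty `lam`, the use of the fraction of variance explained as the per-model statistic, and the toy data are all hypothetical choices made for illustration. The record does not specify the statistic or the tuning procedure used in the paper, and ENET is omitted for brevity.

```python
import random


def within_window(gene_start, gene_end, position, window=1_000_000):
    """Step 1: keep a SNP/CpG if it lies within `window` bp of the gene."""
    return gene_start - window <= position <= gene_end + window


def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def soft_threshold(z, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_r2(columns, y, lam, n_iter=50):
    """Step 2: fit a LASSO by coordinate descent on centered data, minimizing
    (1/2n)*||y - Xb||^2 + lam*||b||_1, and return the fraction of variance
    explained (an assumed statistic, chosen only for this sketch)."""
    n = len(y)
    cols = [center(c) for c in columns]
    resid = center(y)
    total_ss = sum(r * r for r in resid)
    if total_ss == 0.0:
        return 0.0
    beta = [0.0] * len(cols)
    col_ms = [sum(v * v for v in c) / n for c in cols]
    for _ in range(n_iter):
        for j, c in enumerate(cols):
            if col_ms[j] == 0.0:
                continue
            rho = sum(v * r for v, r in zip(c, resid)) / n + col_ms[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / col_ms[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                resid = [r - delta * v for r, v in zip(resid, c)]
                beta[j] = new_beta
    return 1.0 - sum(r * r for r in resid) / total_ss


def maxt_adjusted_pvalues(models, lam=0.1, n_perm=99, seed=0):
    """Step 3: single-step maxT. Each observed statistic is compared with the
    permutation distribution of the maximum statistic over all models."""
    rng = random.Random(seed)
    names = list(models)
    observed = {k: lasso_r2(models[k][0], models[k][1], lam) for k in names}
    n = len(models[names[0]][1])
    exceed = {k: 0 for k in names}
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)  # the same sample permutation is used for every model
        max_stat = max(
            lasso_r2(models[k][0], [models[k][1][i] for i in order], lam)
            for k in names
        )
        for k in names:
            if max_stat >= observed[k]:
                exceed[k] += 1
    return {k: (exceed[k] + 1) / (n_perm + 1) for k in names}


# Toy data (simulated, hypothetical): one probe driven by the first predictor,
# one probe that is pure noise.
rng = random.Random(1)
n = 40
predictors = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(3)]
signal_y = [2.0 * predictors[0][i] + 0.3 * rng.gauss(0, 1) for i in range(n)]
noise_y = [rng.gauss(0, 1) for _ in range(n)]
models = {"gene_signal": (predictors, signal_y), "gene_noise": (predictors, noise_y)}
adj_p = maxt_adjusted_pvalues(models)
print(adj_p)
```

Because every model is compared against the same distribution of maxima, significance assessment and multiple-testing correction happen in a single pass over the permutations, which is the property the abstract attributes to the MaxT approach.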

Original bundle

Name: Integration analysis of three omics data using penalized regression methods.pdf
Size: 1.92 MB
Format: Adobe Portable Document Format
