
Classification of COVID19 Patients Using Robust Logistic Regression

dc.contributor.authorGhosh, Abhik
dc.contributor.authorJaenada Malagón, María
dc.contributor.authorPardo Llorente, Leandro
dc.date.accessioned2023-06-22T11:02:44Z
dc.date.available2023-06-22T11:02:44Z
dc.date.issued2022-09-21
dc.descriptionCRUE-CSIC (Transformative Agreements 2022)
dc.description.abstractCoronavirus disease 2019 (COVID-19) has triggered a global pandemic affecting millions of people. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes COVID-19, is hypothesized to enter the human body via the airway epithelium, where it initiates a host response. The expression levels of upper-airway genes that interact with SARS-CoV-2 could therefore be a telltale sign of infection. However, gene expression data are suspected of containing contamination errors introduced by the techniques used to extract such information, and clinical diagnoses may contain labelling errors owing to the limited specificity and sensitivity of diagnostic tests. We propose fitting a regularized logistic regression model as a classifier for COVID-19 diagnosis, which simultaneously identifies genes related to the disease and predicts COVID-19 cases from the expression values of the selected genes. We apply a robust estimation method based on the density power divergence to obtain stable results that are insensitive to contamination or labelling errors in the data, and we compare its performance with that of the classical maximum likelihood estimator under different penalties, including the LASSO and the general adaptive LASSO penalties.
dc.description.departmentDepto. de Estadística e Investigación Operativa
dc.description.facultyFac. de Ciencias Matemáticas
dc.description.refereedTRUE
dc.description.sponsorshipMinisterio de Ciencia e Innovación (MICINN)
dc.description.sponsorshipMinisterio de Universidades
dc.description.sponsorshipDepartment of Science and Technology (DST), Government of India
dc.description.statuspub
dc.eprint.idhttps://eprints.ucm.es/id/eprint/74726
dc.identifier.doi10.1007/s42519-022-00295-3
dc.identifier.issn1559-8608
dc.identifier.officialurlhttps://doi.org/10.1007/s42519-022-00295-3
dc.identifier.urihttps://hdl.handle.net/20.500.14352/72041
dc.issue.number4
dc.journal.titleJournal of Statistical Theory and Practice
dc.language.isoeng
dc.publisherSpringer Nature
dc.relation.projectIDPGC2018-095194-B-100
dc.relation.projectIDFPU 19/01824
dc.relation.projectIDSRG/2020/000072
dc.rightsAttribution 3.0 Spain
dc.rights.accessRightsopen access
dc.rights.urihttps://creativecommons.org/licenses/by/3.0/es/
dc.subject.cdu519.22
dc.subject.cdu616.98:578.834
dc.subject.keywordDensity power divergence
dc.subject.keywordHigh-dimensional data
dc.subject.keywordSparse logistic regression
dc.subject.keywordCOVID-19
dc.subject.keywordGene expression
dc.subject.ucmMathematical statistics (Mathematics)
dc.subject.ucmInfectious diseases
dc.subject.ucmMedical genetics
dc.subject.ucmBiomathematics
dc.subject.unesco1209 Statistics
dc.subject.unesco3205.05 Infectious Diseases
dc.subject.unesco2410.07 Human Genetics
dc.subject.unesco2404 Biomathematics
dc.titleClassification of COVID19 Patients Using Robust Logistic Regression
dc.typejournal article
dc.volume.number16
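
The abstract in this record describes a LASSO-penalized logistic regression fitted with a density power divergence (DPD) loss instead of the usual likelihood. The following is a minimal sketch of that idea, not the authors' implementation: the record does not state the optimization algorithm, so a plain proximal-gradient (soft-thresholding) loop is assumed here, and the function names, the tuning value `gamma=0.5`, the step size, and the omission of an intercept (features assumed centered) are all illustrative choices.

```python
import numpy as np


def sigmoid(eta):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * eta))


def dpd_loss(beta, X, y, gamma):
    """Average DPD loss for Bernoulli responses; requires gamma > 0."""
    p = sigmoid(X @ beta)
    q = 1.0 - p
    model_term = p ** (1 + gamma) + q ** (1 + gamma)
    data_term = y * p ** gamma + (1 - y) * q ** gamma
    return np.mean(model_term - (1 + 1 / gamma) * data_term)


def dpd_grad(beta, X, y, gamma):
    """Gradient of dpd_loss with respect to beta."""
    p = sigmoid(X @ beta)
    q = 1.0 - p
    # Derivative with respect to the linear predictor, written so that
    # no negative powers of p or q appear.
    d_eta = (1 + gamma) * (
        p * q * (p ** gamma - q ** gamma)
        - y * q * p ** gamma
        + (1 - y) * p * q ** gamma
    )
    return X.T @ d_eta / len(y)


def soft_threshold(v, t):
    # Proximal operator of t * ||v||_1.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def fit_dpd_lasso(X, y, gamma=0.5, lam=0.05, step=0.1, n_iter=500):
    """Proximal gradient descent on dpd_loss + lam * ||beta||_1 (assumed solver)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        beta = soft_threshold(beta - step * dpd_grad(beta, X, y, gamma), step * lam)
    return beta


if __name__ == "__main__":
    # Synthetic illustration only: two informative features, 10% flipped labels.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    beta_true = np.zeros(10)
    beta_true[:2] = [2.0, -2.0]
    y = (rng.random(200) < sigmoid(X @ beta_true)).astype(float)
    flip = rng.random(200) < 0.10
    y[flip] = 1.0 - y[flip]
    print(np.round(fit_dpd_lasso(X, y), 3))
```

Larger values of `gamma` downweight observations that the model fits poorly, which is what gives some protection against mislabelled cases, while `lam` controls how many coefficients are shrunk exactly to zero. Both would normally be chosen by a data-driven criterion rather than fixed as above.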

Download

Original bundle

Name: pardo_classification.pdf
Size: 852.32 KB
Format: Adobe Portable Document Format
