Person:
Vélez Serrano, Daniel

First Name
Daniel
Last Name
Vélez Serrano
Affiliation
Universidad Complutense de Madrid
Faculty / Institute
Ciencias Matemáticas
Department
Estadística e Investigación Operativa
Area
Estadística e Investigación Operativa

Search Results

  • Item
    Prediction of in-hospital mortality after pancreatic resection in pancreatic cancer patients: A boosting approach via a population-based study using health administrative data
    (PLoS ONE, 2017) Velez-Serrano, Jose F.; Vélez Serrano, Daniel; Hernandez Barrera, Valentin; Jimenez Garcia, Rodrigo; Lopez de Andres, Ana; Carrasco Garrido, Pilar; Álvaro Meca, Alejandro
    Background: One reason for the aggressiveness of pancreatic cancer is that it is diagnosed late, which often limits both the available therapeutic options and patient survival. Long-term survival of pancreatic cancer patients is not possible if the tumor is not resected, even among patients who receive chemotherapy in the earliest stages. The main objective of this study was to create a prediction model for in-hospital mortality after a pancreatectomy in pancreatic cancer patients. Methods: We performed a retrospective study of all pancreatic resections in pancreatic cancer patients in Spanish public hospitals (2013). Data were obtained from records in the Minimum Basic Data Set. To develop the prediction model, we used a boosting method. Results: The in-hospital mortality of pancreatic resections in pancreatic cancer patients was 8.48% in Spain. Our model showed high predictive accuracy, with an AUC of 0.91 and a Brier score of 0.09, which indicated that the probabilities were well calibrated. In addition, a sensitivity analysis restricted to the information available prior to surgery showed that the model retains high predictive accuracy, with an AUC of 0.802. Conclusions: In this study, we developed a nationwide system that is capable of generating accurate and reliable predictions of in-hospital mortality after pancreatic resection in patients with pancreatic cancer. Our model could help surgeons understand the importance of the patients’ characteristics prior to surgery and the health effects that may follow resection.
  • Item
    A method for K-Means seeds generation applied to text mining
    (Statistical Methods & Applications, 2015) Vélez Serrano, Daniel; Sueiras, Jorge; Ortega, Alejandro; Velez, Jose F.
    This paper proposes a methodology for producing a set of seeds to be used as the starting point of K-Means-type unsupervised classification algorithms for text mining. Our proposal uses the eigenvectors obtained from principal component analysis to extract initial seeds, after appropriate treatment, in search of lightly overlapping clusters that are also clearly identified by keywords. This work is motivated by the authors' interest in the problem of identifying previously unknown topics and themes in short texts. To validate the method, it was applied to a sample of labeled e-mails (NG20) that represents a gold standard within the field of text mining. Specifically, some corpora referenced in the literature have been used, configured according to a mix of topics contained in the sample. The proposed method improves on the results of the other state-of-the-art methods to which it is compared.
  • Item
    Predicting Sex in White Rhinoceroses: A Statistical Model for Conservation Management
    (Animals, 2023) Martínez, Leticia; Andrés Gamazo, Paloma Jimena De; Caperos, José Manuel; Silván Granado, Gema; Fernández-Morán, Jesús; Casares, Miguel; Crespo, Belén; Vélez Serrano, Daniel; Sanz San Miguel, Luis; Cáceres Ramos, Sara Cristina; Illera Del Portal, Juan Carlos
    Ensuring the effective management of every rhinoceros population is crucial for securing a future for the species, especially considering the escalating global threat of poaching and the challenges faced in captive breeding programs for this endangered species. Steroid hormones play pivotal roles in regulating diverse biological processes, making fecal hormonal determinations a valuable non-invasive tool for monitoring adrenal and gonadal endocrinologies and assessing reproductive status, particularly in endangered species. The purpose of this study was to develop a statistical model for predicting the sex of white rhinoceroses using hormonal determinations obtained from a single fecal sample. To achieve this, 562 fecal samples from 15 individuals of the Ceratotherium simum species were collected, and enzyme immunoassays were conducted to determine the concentrations of fecal cortisol, progesterone, estrone, and testosterone metabolites. The biological validation of the method provided an impressive accuracy rate of nearly 80% in predicting the sex of hypothetically unknown white rhinoceroses. Implementing this statistical model for sex identification in white rhinoceroses would yield significant benefits, including a better understanding of the structure and dynamics of wild populations. Additionally, it would enhance conservation management efforts aimed at protecting this endangered species. By utilizing this innovative approach, we can contribute to the preservation and long-term survival of white rhinoceros populations.
  • Item
    Churn and Net Promoter Score forecasting for business decision-making through a new stepwise regression methodology
    (Knowledge-Based Systems, 2020) Vélez Serrano, Daniel; Ayuso, A.; Perales-González, C.; Rodríguez González, Juan Tinguaro
    Companies typically have to make relevant decisions regarding their clients’ fidelity and retention on the basis of analytical models developed to predict both their churn probability and Net Promoter Score (NPS). Although the predictive capability of these models is important, interpretability is a crucial factor as well, because the decisions made from their results have to be properly justified. In this paper, a novel methodology for developing analytical models that balance predictive performance and interpretability is proposed, with the aim of enabling better decision-making. It proceeds by fitting logistic regression models through a modified stepwise variable selection procedure, which automatically selects input variables while keeping their business logic, previously validated by an expert. In synergy with this procedure, a new method for transforming independent variables is also proposed, in order to better deal with ordinal targets and to avoid some logistic regression issues with outliers and missing data. The combination of these two proposals with some competitive machine-learning methods earned the leading position in the NPS forecasting task of an international university talent challenge posed by a well-known global bank. The application of the proposed methodology and the results it obtained at this challenge are described as a case study.
  • Item
    Analyzing the influence of contrast in large-scale recognition of natural images
    (Integrated Computer-Aided Engineering, 2016) Sánchez, Ángel; Moreno, A. Belén; Vélez Serrano, Daniel; Vélez, José F.
    This paper analyzes both the isolated influence of illumination quality in 2D facial recognition and the influence of contrast measures in large-scale recognition of low-resolution natural images. First, using the Yale Face Database B, we show that, when the illumination quality of facial images is estimated separately (through a fuzzy inference system that combines average brightness and global contrast of the patterns) and the same images are recognized using a multilayer perceptron, there is a nearly linear correlation between the illumination and recognition results. Second, we introduce a new contrast measure, called Harris Points Measured Contrast (HPMC), which assigns contrast values to images more consistently with their recognition rate than the other global and local contrast analysis methods compared. For our experiments on image contrast analysis, we used the CIFAR-10 dataset, with 60,000 images, and convolutional neural networks as classification models. Our results can help decide whether a given test image is worth using for further recognition tasks, based on its contrast as calculated with the proposed HPMC metric.
  • Item
    The PANDEMYC Score. An Easily Applicable and Interpretable Model for Predicting Mortality Associated With COVID-19
    (Journal of Clinical Medicine, 2020) Torres Macho, Juan; Ryan Múrua, Pablo; Valencia, Jorge; Pérez-Butragueño, Mario; Jiménez González De Buitrago, Eva; Fontán-Vela, Mario; Izquierdo García, Elsa; Fernandez-Jimenez, Inés; Álvaro-Alonso, Elena; Lazaro, Andrea; Alvarado, Marta; Notario, Helena; Resino, Salvador; Vélez Serrano, Daniel; Meca, Alejandro
    This study aimed to build an easily applicable prognostic model based on routine clinical, radiological, and laboratory data available at admission, to predict mortality in hospitalized patients with coronavirus disease 2019 (COVID-19). Methods: We retrospectively collected clinical information from 1968 patients admitted to a hospital. We built a predictive score based on a logistic regression model in which explanatory variables were discretized using classification trees that facilitated the identification of the optimal sections for predicting inpatient mortality in patients admitted with COVID-19. These sections were translated into a score indicating the probability of a patient’s death, thus making the results easy to interpret. Results: Median age was 67 years, 1104 patients (56.4%) were male, and 325 (16.5%) died during hospitalization. Our final model identified nine key features: age, oxygen saturation, smoking, serum creatinine, lymphocytes, hemoglobin, platelets, C-reactive protein, and sodium at admission. The discrimination of the model was excellent in the training, validation, and test samples (AUC: 0.865, 0.808, and 0.883, respectively). We constructed a prognostic scale to determine the probability of death associated with each score. Conclusions: We designed an easily applicable predictive model for early identification of patients at high risk of death due to COVID-19 during hospitalization.
  • Item
    Affective homogeneity in the Spanish general election debate. A comparative analysis of social networks political agents
    (Information, Communication and Society, Taylor and Francis, 2018) Robles Morales, José Manuel; Vélez Serrano, Daniel; De Marco, Stefano; Rodríguez González, Juan Tinguaro; Gómez González, Daniel
    Many experts in the social sciences are studying the extent to which the agents of democratic political systems tend to strengthen their points of view so much that it reduces their capacity to engage and debate with those who hold different points of view. This phenomenon, called polarization, is also present in public debate on social networks and has generated a significant number of studies and empirical research. In this context, two noteworthy factors in the study of polarization are the concepts of ‘homophily’ and ‘homogeneity’. These terms refer to the fragmenting effect of social networks and are the consequence of the common characteristics and attributes of the members that comprise them. In this work, we analyze this phenomenon in relation to the General Elections for the Presidency of Spain and, particularly, in the case of the candidature of the political party UnidosPodemos. We used data from the Twitter social network to analyze the subjects of debate and the affective positions in relation to each of these. We found that the most active political agents had postures that were clearly homogenized in affective terms. Finally, we discuss the polarizing effects of this homogenization.
  • Item
    Teoría de cópulas aplicada a la predicción [Copula theory applied to forecasting]
    (2007) Vélez Serrano, Daniel; Quesada Paloma, Vicente
    This thesis proposes methodologies that use copula theory for predictive purposes and, as a practical application, addresses the short- and medium-term forecasting of natural gas demand in Madrid. In both cases, the process starts from a forecast that does not take into account the influence of weather on domestic consumption, and uses copula functions to estimate the expected deviation from that forecast under different scenarios defined by the values of temperature variables. In the medium term, where the goal is to predict the maximum daily value that can be expected for consumption over the next two years (the peak value, in energy-sector terminology), the initial forecast is made with a linear model whose regressor is the annual cyclical behavior of the series, identified through curve-smoothing techniques (wavelets, regression splines, ...). Copula functions are then used to simulate the distribution of the maximum expected increase in demand under an especially adverse weather situation, such as a cold wave. In the short term (daily), an iterative algorithm is proposed that starts from the residual process of an ARIMA model fitted solely on historical demand, and replaces transfer function models with copulas to explain the influence of the thermal factor. The copula function that best describes the demand/temperature dependence relationship is selected according to a goodness-of-fit test for distributions based on the Pearson statistic. In a theoretical context, in case the test does not allow a choice among the candidate copula families, a method is suggested for constructing empirical, nonparametric copulas that attain an optimal value with respect to the Pearson expression.