García Sánchez, Carlos

Universidad Complutense de Madrid
Faculty / Institute
Arquitectura de Computadores y Automática
Arquitectura y Tecnología de Computadores

Search Results

  • Publication
    NMF-mGPU: non-negative matrix factorization on multi-GPU systems
    (Biomed Central LTD, 2015-02-13) Mejía Roa, Edgardo; Tabas Madrid, Daniel; Setoain Rodrigo, Javier; García Sánchez, Carlos; Tirado Fernández, Francisco; Pascual Montano, Alberto
    Background: In the last few years, the Non-negative Matrix Factorization (NMF) technique has attracted great interest in the Bioinformatics community, since it is able to extract interpretable parts from high-dimensional datasets. However, the computing time required to process large data matrices may become impractical, even for a parallel application running on a multiprocessor cluster. In this paper, we present NMF-mGPU, an efficient and easy-to-use implementation of the NMF algorithm that takes advantage of the high computing performance delivered by Graphics-Processing Units (GPUs). Driven by the ever-growing demands of the video-games industry, the graphics cards usually provided in PCs and laptops have evolved from simple graphics-drawing platforms into high-performance programmable systems that can be used as coprocessors for linear-algebra operations. However, these devices may have a limited amount of on-board memory, which other NMF implementations on GPU do not take into account. Results: NMF-mGPU is based on CUDA (Compute Unified Device Architecture), NVIDIA's framework for GPU computing. On devices with little available memory, large input matrices are transferred blockwise from the system's main memory to the GPU's memory and processed accordingly. In addition, NMF-mGPU has been explicitly optimized for the different CUDA architectures. Finally, platforms with multiple GPUs can be synchronized through MPI (Message Passing Interface). In a four-GPU system, this implementation is about 120 times faster than a single conventional processor, and more than four times faster than a single GPU device (i.e., a super-linear speedup). Conclusions: Applications of GPUs in Bioinformatics are receiving more and more attention due to their outstanding performance when compared to traditional processors. In addition, their relatively low price makes them a highly cost-effective alternative to conventional clusters. In life sciences, this is an excellent opportunity to facilitate the daily work of bioinformaticians who are trying to extract biological meaning from hundreds of gigabytes of experimental information. NMF-mGPU can be used "out of the box" by researchers with little or no expertise in GPU programming on a variety of platforms, such as PCs, laptops, or high-end GPU clusters. NMF-mGPU is freely available at
  • Publication
    A low cost matching motion estimation sensor based on the NIOS II microprocessor.
    (MDPI AG, 2012-10) González, Diego; Botella Juan, Guillermo; Meyer Baese, Uwe; García Sánchez, Carlos; Sanz, Concepción; Prieto Matías, Manuel; Tirado Fernández, Francisco
    Medical imaging has become an essential diagnostic tool in clinical practice; at present, pathologies can be detected earlier than ever before. Its use is no longer confined to radiology but increasingly extends to computer-based imaging processes prior to surgery. Motion analysis, in particular, plays an important role in analyzing the activities or behaviors of live objects in medicine. This short paper presents several low-cost hardware implementation approaches for the new generation of tablets and/or smartphones for estimating motion compensation and segmentation in medical images. These systems have been optimized for breast cancer diagnosis using magnetic resonance imaging technology, which has several advantages over traditional X-ray mammography, for example, obtaining patient information during a short period. This paper also addresses the challenge of offering a medical tool that runs on widespread portable devices, both tablets and smartphones, to aid in patient diagnostics.
  • Publication
    Prácticas adaptadas inclusivas para el desarrollo de tecnologías emergentes en docencia Tic (Reconnet)
    (2019-08-01) Botella Juan, Guillermo; Barrio García, Alberto Antonio del; García Sánchez, Carlos; Clemente Barreira, Juan Antonio; Bernabé García, Sergio; Roa Romero, Carlos; Ahmed Fahmy Amin, Hesham; Ezquerro Rodríguez, José Miguel; Cao García, Francisco Javier; Sierra López, Ángel
    The objectives achieved in the project were: I) To continue preparing the framework for rapid development of VHDL code on Altera FPGAs from Matlab code and the Simulink graphical environment, built in the ALLIANCE project (PID 217 CONV. 16/17), making it useful and ACCESSIBLE for the rapid design of applications in science and technology courses. II) To apply this semi-automatic environment in specific hardware design courses such as Sist. Empotrados Distribuidos (Distributed Embedded Systems), analyzing the trade-off between the learning curve and the optimality of the solution, and comparing it with traditional techniques in these courses. III) To adapt inclusive material for use as an immediate case study in related courses of the Máster de Secundaria (specialty in Computer Science and Technology), given the closeness of content and technical affinity.
  • Publication
    Framework para el desArroLLo ágIl de código RTL orientAdo a programadores software, ingeNieros y CiEntíficos (ALLIANCE)
    (2018-09-28) Botella Juan, Guillermo; Barrio García, Alberto Antonio del; Recas Piorno, Joaquín; García Sánchez, Carlos; Ezquerro Rodríguez, José Miguel; Roa Romero, Carlos; Fariña Fernández, Daniel; López Alonso, José Manuel; Cao García, Francisco Javier; Sierra López, Ángel
    FPGAs are of great importance in scientific contexts because of their parallel processing capability and their low cost and low power consumption as acceleration platforms. Well-known, established developers of scientific tools, such as Mathworks, have produced high-level synthesis tools that speed up the design process. This teaching innovation project uses these tools to develop an acceleration platform that is useful in educational contexts (specifically in scientific and technical courses taught at the UCM). An environment based on the FPGA-in-the-Loop paradigm is built with Simulink, which allows models to be run directly on the FPGA. Results on accuracy and performance will be presented, along with comparisons against models developed with a classical development paradigm.
  • Publication
    Fast-Coding Robust Motion Estimation Model in a GPU
    (2015-02-10) García Sánchez, Carlos; Botella Juan, Guillermo; Sande, Francisco de; Prieto Matías, Manuel
    Nowadays vision systems are used for countless purposes. Moreover, motion estimation is a discipline that allows relevant information to be extracted, such as pattern segmentation, 3D structure, or object tracking. However, the real-time requirements of most applications have limited its consolidation, since high performance systems must be adopted to meet response times. With the emergence of the highly parallel devices known as accelerators, this gap has narrowed. Two extreme endpoints in the spectrum of the most common accelerators are Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs), which usually offer higher performance rates than general-purpose processors. Moreover, the use of GPUs as accelerators involves the efficient exploitation of any parallelism in the target application. This task is not easy because performance rates are affected by many aspects that programmers must overcome. In this paper, we evaluate the OpenACC standard, a directive-based programming model that favors porting any code to a GPU, in the context of a motion estimation application. The results confirm that this programming paradigm is suitable for this kind of image processing application, achieving a very satisfactory acceleration in convolution-based problems such as the well-known Lucas & Kanade method.
  • Publication
    Customized Nios II multi-cycle instructions to accelerate block-matching techniques
    (SPIE, 2015-02-27) González, Diego; Botella Juan, Guillermo; García Sánchez, Carlos; Meyer Bäse, Anke; Meyer Bäse, Uwe; Prieto Matías, Manuel
    This study focuses on accelerating motion estimation algorithms, which are widely used in video coding standards, by using both the paradigm based on Altera Custom Instructions and the efficient combination of SDRAM and On-Chip memory of the Nios II processor. First, complete code profiling is carried out before the optimization in order to detect the time leaks affecting the motion compensation algorithms. Then, a multi-cycle Custom Instruction is implemented and added to the specific embedded design. The approach is based on optimizing SOC performance by using an efficient combination of On-Chip memory and SDRAM with regard to the reset vector, exception vector, stack, heap, read/write data (.rwdata), read only data (.rodata), and program text (.text) in the design. Furthermore, this approach aims to enhance these algorithms by incorporating Custom Instructions in the Nios II ISA. Finally, both methods are combined to build the final embedded system. The present contribution thus facilitates motion coding for low-cost Soft-Core microprocessors, particularly the RISC architecture of Nios II implemented in FPGA. It enables us to construct an SOC which processes 50×50 @ 180 fps.
  • Publication
    Enseñanza de coMputación cuántica Práctica pAra esTudiantes de Informática: Arquitectura y programación (EMPATIA)
    (2021-10) Botella Juan, Guillermo; Del Barrio García, Alberto Antonio; Carrascal de la Heras, Ginés; García Sánchez, Carlos; Murillo Montero, Raúl; García Moreno, Daniel; Fahmy Amin, Hesham Ahmed; Mas Aguilar, Juan; Roa Romero, Carlos; Sierra López, Angel
    A quantum simulation and computing platform based on low-cost hardware and container technology, with the option of running in the cloud. Also, a teaching methodology for the first practical Quantum Computing course at the UCM, "Arquitectura y Programación de Computadores Cuánticos", taught at the Facultad de Informática.
  • Publication
    Herramientas para el diseño y gestión de Guías Docentes digitales
    (2021-03-09) García Payo, María del Carmen; Aranda Iriarte, José Ignacio; Franco Peláez, Francisco Javier; Tenllado Van Der Reijden, Christian Tomás; García Sánchez, Carlos; Gómez Pérez, José Ignacio; Riveira Martín, Mercedes del Carmen; Sanmartino Rodríguez, Julio Antonio; Payo Rubio, Marina; Pino Hernández, Javier; Díaz Núñez, Guillermo Jesús; Villar Serrano, Daniel
    The aim of this project is to build a web tool that lets lecturers update the teaching sheets for their course online through a web form, storing the guide information in a database, so that the system flags the changes made, manages user access and permissions, and allows the course sheets to be exported and generated in various formats while respecting the sections and conditions of the Memoria de Verificación (VERIFICA).
  • Publication
    Evaluation of Intel's DPC++ Compatibility Tool in heterogeneous computing
    (Elsevier, 2022-04-08) Castaño Roldán, Germán; Faqir-Rhazoui, Youssef; García Sánchez, Carlos; Prieto Matías, Manuel
    The Intel DPC++ Compatibility Tool is a component of the Intel oneAPI Base Toolkit. This tool automatically transforms CUDA code into Data Parallel C++ (DPC++), thus assisting in the migration process. DPC++ is an implementation of the programming standard for heterogeneous computing known as SYCL, which unifies the development of parallel applications on CPUs, GPUs or even FPGAs. This paper analyzes the DPC++ Compatibility Tool by considering the manual intervention required and the problems encountered while migrating the Rodinia benchmarks. For this suite, this tool achieves an impressive rate of almost 87% for code successfully migrated. Moreover, a comparative study of the performance obtained by the migrated code was carried out, showing a moderate overhead in most of the migrated examples. Finally, a performance comparison on different devices was also performed.
  • Publication
    Funcionamiento de la herramienta OpenIRS-UCM y sus sinergias con Moodle
    (Universidad Complutense de Madrid, 2012) García Sánchez, Carlos; Castro Rodríguez, Fernando; Chaver Martínez, Daniel Ángel; Tenllado Van Der Reijden, Christian; Gómez Pérez, José Ignacio; López Orozco, José Antonio; Piñuel Moreno, Luis
    Interactive response systems have been gaining acceptance in the educational community in recent years, and clear evidence of this is the growing number of commercial systems available on the market today. However, most solutions rely on systems that are closed, rigid, and dependent on the software installed on the lecturer's computer. In this work we present a new free tool, which we have named OpenIRS-UCM, that incorporates most of the features of commercial applications, with the advantage of integrating several types of commercial clickers with other devices such as smartphones, PDAs, laptops, etc. It also allows interaction with the Moodle virtual campus platform, exponentially increasing its possibilities of use.