<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="static/style.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-06-29T15:09:12Z</responseDate><request verb="GetRecord" identifier="oai:docta.ucm.es:20.500.14352/133698" metadataPrefix="oai_dc">https://docta.ucm.es/rest/oai/request</request><GetRecord><record><header><identifier>oai:docta.ucm.es:20.500.14352/133698</identifier><datestamp>2026-03-03T00:57:58Z</datestamp><setSpec>com_20.500.14352_14</setSpec><setSpec>col_20.500.14352_15</setSpec></header><metadata><oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:doc="http://www.lyncode.com/xoai" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
   <dc:title>Hybrid reward-driven reinforcement learning for efficient quantum circuit synthesis</dc:title>
   <dc:creator>Giordano, Sara</dc:creator>
   <dc:creator>Sen, Kornikar</dc:creator>
   <dc:creator>Martín-Delgado Alcántara, Miguel Ángel</dc:creator>
   <dc:subject>004.27</dc:subject>
   <dc:subject>004.85</dc:subject>
   <dc:subject>530.145</dc:subject>
   <dc:subject>Circuit depth</dc:subject>
   <dc:subject>Circuit optimization</dc:subject>
   <dc:subject>Quantum circuits</dc:subject>
   <dc:subject>Reinforcement learning</dc:subject>
   <dc:subject>Computer science</dc:subject>
   <dc:subject>Quantum theory</dc:subject>
   <dc:subject>Artificial intelligence (Computer science)</dc:subject>
   <dc:subject>1203.04 Artificial Intelligence</dc:subject>
   <dc:subject>2212.12 Quantum Field Theory</dc:subject>
   <dc:subject>1203.02 Algorithmic Languages</dc:subject>
   <dc:description>© The Author(s) 2026.
Next Generation EU PRTR-C17.
W911NF-14-1-0103.</dc:description>
   <dc:description>A reinforcement learning (RL) framework is introduced for the efficient synthesis of quantum circuits that generate specified target quantum states from a fixed initial state, addressing a central challenge in both the Noisy Intermediate-Scale Quantum (NISQ) era and future fault-tolerant quantum computing. The approach utilizes tabular Q-learning, based on action sequences, within a discretized quantum state space, to effectively manage the exponential growth of the space dimension. The framework introduces a hybrid reward mechanism, combining a static, domain-informed reward that guides the agent toward the target state with customizable dynamic penalties that discourage inefficient circuit structures such as gate congestion and redundant state revisits. The resulting reward is circuit-aware, in contrast to most current work on this topic, which relies primarily on fidelity-based rewards. By leveraging sparse matrix representations and state-space discretization, the method enables practical navigation of high-dimensional environments while minimizing computational overhead. Benchmarking on graph-state preparation tasks for up to seven qubits, we demonstrate that the algorithm consistently discovers minimal-depth circuits with optimized gate counts. Moreover, extending the framework to a universal gate set still yields low-depth circuits, highlighting the algorithm’s robustness and adaptability. The results confirm that this RL-driven approach, with its fully circuit-aware reward, efficiently explores the complex quantum state space and synthesizes near-optimal quantum circuits, providing a resource-efficient foundation for quantum circuit optimization.</dc:description>
   <dc:description>Ministerio de Ciencia e Innovación (España)</dc:description>
   <dc:description>Agencia Estatal de Investigación</dc:description>
   <dc:description>European Commission</dc:description>
   <dc:description>Comunidad de Madrid</dc:description>
   <dc:description>Ministerio de Transformación Digital y de la Función Pública (España)</dc:description>
   <dc:description>U.S. Army Research Office</dc:description>
   <dc:description>Depto. de Física Teórica</dc:description>
   <dc:description>Fac. de Ciencias Físicas</dc:description>
   <dc:description>TRUE</dc:description>
   <dc:description>pub</dc:description>
   <dc:date>2026-03-02T18:05:38Z</dc:date>
   <dc:date>2026-02-03</dc:date>
   <dc:type>journal article</dc:type>
   <dc:type>VoR</dc:type>
   <dc:identifier>https://hdl.handle.net/20.500.14352/133698</dc:identifier>
   <dc:identifier>2524-4906</dc:identifier>
   <dc:identifier>10.1007/s42484-026-00359-8</dc:identifier>
   <dc:identifier>2524-4914</dc:identifier>
   <dc:language>eng</dc:language>
   <dc:relation>info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2021-122547NB-I00/ES/TECNOLOGIAS CLAVE PARA COMPUTACION CUANTICA/</dc:relation>
   <dc:relation>TEC-2024/COM-84 QUITEMAD-CM</dc:relation>
   <dc:relation>Giordano, Sara, et al. "Hybrid Reward-Driven Reinforcement Learning for Efficient Quantum Circuit Synthesis". Quantum Machine Intelligence, vol. 8, no. 1, June 2026, p. 9. https://doi.org/10.1007/s42484-026-00359-8.</dc:relation>
   <dc:rights>Attribution-NonCommercial-NoDerivatives 4.0 International</dc:rights>
   <dc:rights>http://creativecommons.org/licenses/by-nc-nd/4.0/</dc:rights>
   <dc:rights>open access</dc:rights>
   <dc:format>application/pdf</dc:format>
   <dc:publisher>Springer</dc:publisher>
</oai_dc:dc></metadata></record></GetRecord></OAI-PMH>