Hybrid reward-driven reinforcement learning for efficient quantum circuit synthesis
| dc.contributor.author | Giordano, Sara | |
| dc.contributor.author | Sen, Kornikar | |
| dc.contributor.author | Martín-Delgado Alcántara, Miguel Ángel | |
| dc.date.accessioned | 2026-03-02T18:05:38Z | |
| dc.date.available | 2026-03-02T18:05:38Z | |
| dc.date.issued | 2026-02-03 | |
| dc.description | © The Author(s) 2026. Next Generation EU PRTR-C17. W911NF-14-1-0103. | |
| dc.description.abstract | A reinforcement learning (RL) framework is introduced for the efficient synthesis of quantum circuits that generate specified target quantum states from a fixed initial state, addressing a central challenge in both the Noisy Intermediate-Scale Quantum (NISQ) era and future fault-tolerant quantum computing. The approach utilizes tabular Q-learning, based on action sequences, within a discretized quantum state space, to effectively manage the exponential growth of the space dimension. The framework introduces a hybrid reward mechanism, combining a static, domain-informed reward that guides the agent toward the target state with customizable dynamic penalties that discourage inefficient circuit structures such as gate congestion and redundant state revisits. This is a circuit-aware reward, in contrast to the current trend of works on this topic, which are primarily fidelity-based. By leveraging sparse matrix representations and state-space discretization, the method enables practical navigation of high-dimensional environments while minimizing computational overhead. Benchmarking on graph-state preparation tasks for up to seven qubits, we demonstrate that the algorithm consistently discovers minimal-depth circuits with optimized gate counts. Moreover, extending the framework to a universal gate set still yields low-depth circuits, highlighting the algorithm’s robustness and adaptability. The results confirm that this RL-driven approach, with our completely circuit-aware method, efficiently explores the complex quantum state space and synthesizes near-optimal quantum circuits, providing a resource-efficient foundation for quantum circuit optimization. | |
| dc.description.department | Depto. de Física Teórica | |
| dc.description.faculty | Fac. de Ciencias Físicas | |
| dc.description.refereed | TRUE | |
| dc.description.sponsorship | Ministerio de Ciencia e Innovación (España) | |
| dc.description.sponsorship | Agencia Estatal de Investigación | |
| dc.description.sponsorship | European Commission | |
| dc.description.sponsorship | Comunidad de Madrid | |
| dc.description.sponsorship | Ministerio de Transformación Digital y de la Función Pública (España) | |
| dc.description.sponsorship | U.S. Army Research Office | |
| dc.description.status | pub | |
| dc.identifier.citation | Giordano, Sara, et al. "Hybrid Reward-Driven Reinforcement Learning for Efficient Quantum Circuit Synthesis". Quantum Machine Intelligence, vol. 8, no. 1, June 2026, p. 9. DOI.org (Crossref), https://doi.org/10.1007/s42484-026-00359-8. | |
| dc.identifier.doi | 10.1007/s42484-026-00359-8 | |
| dc.identifier.essn | 2524-4914 | |
| dc.identifier.issn | 2524-4906 | |
| dc.identifier.officialurl | https://dx.doi.org/10.1007/s42484-026-00359-8 | |
| dc.identifier.relatedurl | https://link-springer-com.bucm.idm.oclc.org/article/10.1007/s42484-026-00359-8 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14352/133698 | |
| dc.issue.number | 1 | |
| dc.journal.title | Quantum Machine Intelligence | |
| dc.language.iso | eng | |
| dc.page.final | 9-19 | |
| dc.page.initial | 9-1 | |
| dc.publisher | Springer | |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2021-122547NB-I00/ES/TECNOLOGIAS CLAVE PARA COMPUTACION CUANTICA/ | |
| dc.relation.projectID | TEC-2024/COM-84 QUITEMAD-CM | |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | en |
| dc.rights.accessRights | open access | |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | |
| dc.subject.cdu | 004.27 | |
| dc.subject.cdu | 004.85 | |
| dc.subject.cdu | 530.145 | |
| dc.subject.keyword | Circuit depth | |
| dc.subject.keyword | Circuit optimization | |
| dc.subject.keyword | Quantum circuits | |
| dc.subject.keyword | Reinforcement learning | |
| dc.subject.ucm | Computer science | |
| dc.subject.ucm | Quantum theory | |
| dc.subject.ucm | Artificial intelligence (Computer science) | |
| dc.subject.unesco | 1203.04 Artificial Intelligence | |
| dc.subject.unesco | 2212.12 Quantum Field Theory | |
| dc.subject.unesco | 1203.02 Algorithmic Languages | |
| dc.title | Hybrid reward-driven reinforcement learning for efficient quantum circuit synthesis | |
| dc.type | journal article | |
| dc.type.hasVersion | VoR | |
| dc.volume.number | 8 | |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | 1cfed495-7729-410a-b898-8196add14ef6 | |
| relation.isAuthorOfPublication.latestForDiscovery | 1cfed495-7729-410a-b898-8196add14ef6 |
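
The abstract describes tabular Q-learning over a discretized state space with a hybrid reward (a static term that rewards reaching the target plus dynamic penalties for inefficient structure). The following is a minimal toy sketch of that idea for a two-qubit graph state, not the authors' implementation: the gate set (`H0`, `H1`, `CZ`), the reward constants, the rounding-based discretization, and every function name here are assumptions chosen for illustration. Only a per-gate cost and a state-revisit penalty are modeled; the gate-congestion penalty and sparse matrix representations mentioned in the abstract are omitted.

```python
import random
from collections import defaultdict

S = 2 ** -0.5  # 1/sqrt(2)

def apply_h(state, qubit):
    """Apply a Hadamard to one qubit of a 2-qubit real state vector."""
    out = list(state)
    bit = 1 << qubit
    for i in range(4):
        if not i & bit:
            a, b = state[i], state[i | bit]
            out[i], out[i | bit] = S * (a + b), S * (a - b)
    return out

def apply_cz(state):
    """Controlled-Z flips the sign of the |11> amplitude."""
    out = list(state)
    out[3] = -out[3]
    return out

# Hypothetical toy gate set; H and CZ keep all amplitudes real.
ACTIONS = {
    "H0": lambda s: apply_h(s, 0),
    "H1": lambda s: apply_h(s, 1),
    "CZ": apply_cz,
}
START = [1.0, 0.0, 0.0, 0.0]    # |00>
TARGET = [0.5, 0.5, 0.5, -0.5]  # two-qubit graph state CZ (H x H) |00>

def discretize(state):
    """Round amplitudes so numerically equal states share one table key."""
    return tuple(round(a, 3) + 0.0 for a in state)  # + 0.0 removes -0.0

def hybrid_reward(next_key, target_key, visited,
                  gate_cost=0.05, revisit_penalty=0.05):
    """Static target bonus minus dynamic penalties (assumed constants)."""
    static = 1.0 if next_key == target_key else 0.0
    penalty = gate_cost
    if next_key in visited:  # discourage redundant state revisits
        penalty += revisit_penalty
    return static - penalty

def train(episodes=500, max_gates=6, alpha=0.5, gamma=0.9,
          epsilon=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning; returns the Q-table."""
    rng = random.Random(seed)
    q = defaultdict(float)
    names = list(ACTIONS)
    target = discretize(TARGET)
    for _ in range(episodes):
        state = START
        visited = {discretize(state)}
        for _ in range(max_gates):
            s = discretize(state)
            if rng.random() < epsilon:
                a = rng.choice(names)
            else:
                a = max(names, key=lambda n: q[(s, n)])
            nxt = ACTIONS[a](state)
            n = discretize(nxt)
            done = n == target
            r = hybrid_reward(n, target, visited)
            best_next = 0.0 if done else max(q[(n, b)] for b in names)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            visited.add(n)
            state = nxt
            if done:
                break
    return q

def greedy_circuit(q, max_gates=6):
    """Follow the learned greedy policy from START; return (gates, reached)."""
    state, circuit = START, []
    target = discretize(TARGET)
    for _ in range(max_gates):
        s = discretize(state)
        if s == target:
            break
        a = max(ACTIONS, key=lambda n: q[(s, n)])
        circuit.append(a)
        state = ACTIONS[a](state)
    return circuit, discretize(state) == target

if __name__ == "__main__":
    print(greedy_circuit(train()))
```

In this toy setting the shortest circuit has three gates (a Hadamard on each qubit followed by `CZ`). Because the revisit penalty depends on the episode history, the reward is not strictly Markovian in the state alone; the sketch keeps both penalties small relative to the target bonus so that they shape the search without outweighing the goal.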
Download
Original bundle
- Name: Quantum Mach. Intell. 8, 9 (2026).pdf
- Size: 4.75 MB
- Format: Adobe Portable Document Format


