Aplicación de técnicas de machine learning para predecir el consumo de gas en transacciones Ethereum
Publication date
2024
Abstract
Para la ejecución de cualquier transacción en Ethereum, se debe establecer la cantidad de gas que puede ser consumida por la transacción durante su ejecución, como si de combustible se tratase. Diariamente numerosas transacciones fallan por establecer una cantidad de gas inferior a la necesaria para su correcta ejecución. El compilador de Solidity es capaz de inferir una cota superior sobre el consumo de gas de los métodos públicos de un contrato. Sin embargo, en ocasiones el compilador devuelve “infinito” como valor, debido a la complejidad de los métodos analizados, lo que resulta en una información poco útil para el usuario. La falta de una cota superior informada da lugar a estos posibles errores por el desconocimiento de la cantidad de gas que consumirá dicha ejecución. Dado que la cantidad de gas que consume una transacción está especificada por un modelo de coste bien definido, es predecible y no depende de la fluctuación del mercado ni de la saturación de la red. El objetivo de este TFG es aplicar técnicas de machine learning para inferir el consumo de gas necesario para ejecutar una determinada transacción. Para ello, se han desarrollado y evaluado distintos modelos de aprendizaje automático para cada función pública de los 100 contratos más ejecutados desde 2023, tomando como parámetro el input de la transacción. A su vez, se ha realizado un análisis del bytecode de un contrato simple para las funciones con los diferentes costes distinguidos en las evaluaciones: constante, en función del parámetro de entrada, en función del tamaño de la entrada y accediendo al storage. A partir del grafo de control de flujo se ha obtenido una función de coste. La escritura en storage supone una incertidumbre que propicia que el consumo de gas no sea predecible únicamente considerando los valores de los argumentos con los que se ha llamado a la transacción, pues depende del estado del storage en el momento de ejecutar la transacción.
Por otro lado, para los casos en los que no interviene el storage es posible aproximar una función de coste a partir de los parámetros de entrada, ya que no depende de ningún agente externo.
To execute any transaction in Ethereum, the sender must specify the amount of gas that the transaction may consume during its execution, as if it were fuel. Numerous transactions fail every day because the amount of gas set is lower than the amount required for correct execution. The Solidity compiler is able to infer an upper bound on the gas consumption of a contract's public methods. However, the compiler sometimes returns “infinity” as the value because of the complexity of the analyzed methods, which is of little use to the user. The lack of an informative upper bound gives rise to these errors, since the amount of gas that the execution will consume is unknown. Because the amount of gas consumed by a transaction is specified by a well-defined cost model, it is predictable and does not depend on market fluctuations or network congestion. The objective of this bachelor's thesis (TFG) is to apply machine learning techniques to infer the gas consumption required to execute a given transaction. For this purpose, different machine learning models have been developed and evaluated for each public function of the 100 most executed contracts since 2023, taking the transaction input as the parameter. In addition, the bytecode of a simple contract has been analyzed for functions representing the different cost behaviors distinguished in the evaluations: constant, dependent on the value of the input parameter, dependent on the size of the input, and accessing storage. A cost function has been derived from the control flow graph. Writing to storage introduces an uncertainty that makes gas consumption unpredictable when considering only the values of the arguments used to call the transaction, since it depends on the state of the storage at the time the transaction is executed.
On the other hand, for cases where storage is not involved, it is possible to approximate a cost function from the input parameters, since the cost does not depend on any external agent.
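The contrast drawn in the abstract between storage-free and storage-dependent functions can be illustrated with a minimal sketch. It is not the thesis's code or data: the input sizes, the gas constants (BASE, PER_BYTE, STORE_NEW, STORE_UPDATE) and the hidden storage state are all hypothetical values chosen for illustration, and a plain least-squares line stands in for whichever models the authors actually evaluated.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y ~ intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def max_abs_error(xs, ys, intercept, slope):
    """Largest absolute gap between observed and predicted gas."""
    return max(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))


# Hypothetical input sizes in bytes (assumption, not measured data).
sizes = [32, 64, 96, 128, 160, 192]

# Toy storage-free function: gas grows linearly with input size.
BASE, PER_BYTE = 30_000, 16  # illustrative constants only
pure_gas = [BASE + PER_BYTE * n for n in sizes]

# Toy storage-writing function: an extra cost that depends on state the
# caller's arguments do not reveal (whether the slot was empty or not).
slot_was_empty = [True, False, False, True, False, True]  # hidden state
STORE_NEW, STORE_UPDATE = 20_000, 5_000  # illustrative constants only
storage_gas = [
    gas + (STORE_NEW if empty else STORE_UPDATE)
    for gas, empty in zip(pure_gas, slot_was_empty)
]

pure_fit = fit_line(sizes, pure_gas)
storage_fit = fit_line(sizes, storage_gas)
pure_err = max_abs_error(sizes, pure_gas, *pure_fit)
storage_err = max_abs_error(sizes, storage_gas, *storage_fit)

print(f"storage-free fit: intercept={pure_fit[0]:.0f}, slope={pure_fit[1]:.2f}")
print(f"max error without storage: {pure_err:.1f} gas")
print(f"max error with storage:    {storage_err:.1f} gas")
```

In this toy setting the line recovers the storage-free cost almost exactly, while the storage-writing variant leaves errors of several thousand gas units because the model never sees the state of the slot.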
Description
Bachelor's thesis (Trabajo de Fin de Grado) in Computer Engineering, Facultad de Informática UCM, Departamento de Sistemas Informáticos y Computación, academic year 2023/2024.
All materials developed by the authors of this work are available for use and viewing on GitHub at the following link:
https://github.com/Charlisteeron/TFG.