Deep Q-Networks with an LSTM Feature Extractor for Algorithmic Trading
Telatin, Giuseppe
2024/2025
Abstract
This thesis investigates the development and application of a Deep Q-Network (DQN) with a Long Short-Term Memory (LSTM) feature extractor for algorithmic trading. The proposed model aims to capture temporal dependencies in financial time series and improve decision-making in stock trading. The LSTM extracts informative features from the time series, and the DQN learns effective trading strategies through reinforcement learning. The design combines the DQN's capacity to learn optimal policies with the LSTM's strength in modeling sequential data, allowing the agent to make better-informed trading decisions. The method uses experience replay and two neural networks, one for online learning and one for computing target Q-values, to stabilize training. Hyperparameters are tuned with Optuna, and the model is optimized with the Adam optimizer, using Kaiming Normal weight initialization and layer normalization in the LSTM. We examine two reward functions that account not only for performance but also for the agent's risk aversion. The method is evaluated across different asset classes, including the S&P 500, gold, and individual stocks such as Disney and Intel, using the Sharpe Ratio, Sortino Ratio, and Maximum Drawdown as performance metrics. The model showed promising results and was able to generate profits, though not consistently. This thesis extends prior research on a hybrid architecture that integrates reinforcement learning with time series feature extraction, offering new insights into the capabilities of deep learning models for financial trading.
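As a rough illustration of the hybrid architecture the abstract describes, the following is a minimal PyTorch sketch, not the thesis's actual code: an LSTM feature extractor feeding a fully connected Q-value head. The class name, layer sizes, the three-action space (sell, hold, buy), and the exact placement of layer normalization and Kaiming Normal initialization are assumptions; in particular, PyTorch's `nn.LSTM` has no built-in layer normalization, so it is applied here to the LSTM output as an approximation.

```python
import torch
import torch.nn as nn

class LSTMDQN(nn.Module):
    """Hypothetical LSTM feature extractor + DQN head (illustrative sketch)."""

    def __init__(self, n_features: int, hidden_size: int = 64, n_actions: int = 3):
        super().__init__()
        # LSTM extracts features from the price/indicator time series
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        # Layer normalization on the LSTM output (assumed placement)
        self.norm = nn.LayerNorm(hidden_size)
        # Q-value head: one Q-value per trading action (sell, hold, buy)
        self.head = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, n_actions),
        )
        # Kaiming Normal weight initialization for the linear layers
        for m in self.head:
            if isinstance(m, nn.Linear):
                nn.init.kaiming_normal_(m.weight, nonlinearity="relu")
                nn.init.zeros_(m.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window_length, n_features)
        out, _ = self.lstm(x)
        features = self.norm(out[:, -1, :])  # last time step summarizes the window
        return self.head(features)           # Q-values: (batch, n_actions)
```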
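The abstract also mentions experience replay and a separate target network for training stability. A minimal sketch of one DQN update under those two mechanisms is shown below, reusing the hypothetical `LSTMDQN` class from the previous sketch; the buffer size, batch size, discount factor, learning rate, and loss choice are all illustrative assumptions.

```python
import random
from collections import deque

import torch
import torch.nn.functional as F

gamma = 0.99
replay = deque(maxlen=100_000)             # experience replay buffer of (s, a, r, s', done)
online = LSTMDQN(n_features=5)             # network being trained
target = LSTMDQN(n_features=5)             # frozen copy providing target Q-values
target.load_state_dict(online.state_dict())
opt = torch.optim.Adam(online.parameters(), lr=1e-4)  # Adam, as in the abstract

def train_step(batch_size: int = 64) -> None:
    # Assumes the buffer already holds at least batch_size transitions
    states, actions, rewards, next_states, dones = zip(*random.sample(replay, batch_size))
    s = torch.stack(states)                         # (B, window, n_features)
    a = torch.tensor(actions).unsqueeze(1)          # (B, 1)
    r = torch.tensor(rewards, dtype=torch.float32)
    s2 = torch.stack(next_states)
    d = torch.tensor(dones, dtype=torch.float32)

    q = online(s).gather(1, a).squeeze(1)           # Q(s, a) from the online network
    with torch.no_grad():
        q_next = target(s2).max(dim=1).values       # max_a' Q_target(s', a')
    y = r + gamma * (1.0 - d) * q_next              # bootstrapped TD target

    loss = F.smooth_l1_loss(q, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Every N steps one would sync the target network:
    # target.load_state_dict(online.state_dict())
```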
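For reference, the three evaluation metrics named in the abstract can be computed from a series of periodic returns as in the NumPy sketch below; the annualization factor of 252 trading days and a zero risk-free rate are illustrative assumptions, not values taken from the thesis.

```python
import numpy as np

def sharpe_ratio(returns: np.ndarray, rf: float = 0.0, periods: int = 252) -> float:
    """Annualized excess return per unit of total volatility."""
    excess = returns - rf / periods
    return float(np.sqrt(periods) * excess.mean() / excess.std(ddof=1))

def sortino_ratio(returns: np.ndarray, rf: float = 0.0, periods: int = 252) -> float:
    """Like Sharpe, but penalizes only downside volatility."""
    excess = returns - rf / periods
    downside = np.sqrt(np.mean(np.minimum(excess, 0.0) ** 2))
    return float(np.sqrt(periods) * excess.mean() / downside)

def max_drawdown(returns: np.ndarray) -> float:
    """Worst peak-to-trough loss of the equity curve, as a fraction."""
    equity = np.cumprod(1.0 + returns)      # equity curve built from returns
    peak = np.maximum.accumulate(equity)    # running maximum
    return float(np.max((peak - equity) / peak))
```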
| File | Access | Type | Size | Format |
|---|---|---|---|---|
| 888060-1274133.pdf | open access | Other attached material | 1.48 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14247/23392