Portfolio Optimization with Deep Reinforcement Learning: Dynamic Strategy Allocation and Execution under Realistic Transaction Costs

ZOUAOUI, LINA FATMA

2024/2025

Abstract

Classical portfolio optimization models, such as Markowitz’s mean-variance framework and the extensions that followed, like the Capital Asset Pricing Model (CAPM) and multi-factor models, commonly assume normally distributed returns and frictionless markets. Although they provide reliable results in theory, they fail to capture the complexities of real financial markets. In practice, returns are non-stationary and highly volatile, market regimes shift abruptly, and portfolio profitability is strongly affected by transaction costs. These limitations call for a data-driven approach. This thesis addresses the problem of dynamic portfolio optimization under realistic market conditions, incorporating transaction costs, drawdown constraints, and asset dependencies. The primary objective is to develop and train a deep reinforcement learning model that selects a strategy from a set of classical investment techniques and continuously adjusts portfolio allocations. The proposed design implements Proximal Policy Optimization (PPO) augmented with a transformer-based feature extractor to capture nonlinear temporal dependencies among assets and macroeconomic factors, and it acts in a hybrid action space. The model simulates near-real-world conditions by incorporating a transaction cost model, risk constraints, liquidity constraints, and drawdown penalties into a reward function based on the differential Sharpe ratio. Moreover, we use walk-forward analysis with rolling train, validation, and test windows to obtain reliable out-of-sample performance estimates. Finally, we benchmark the model’s performance against SPY. Empirical results on US assets from 1999 to 2023 show that the proposed framework struggles to achieve high risk-adjusted returns consistently across market regimes, which highlights the limitations of deep reinforcement learning for portfolio management.
Future research may extend the framework to multi-policy agents and explore alternative reward designs for more stable training.
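
As a rough illustration of the reward idea named in the abstract, the sketch below computes a differential Sharpe ratio in the commonly cited Moody and Saffell form, applied to returns net of a proportional transaction cost. It is not the thesis's implementation: the class and function names, the adaptation rate `eta`, the cost rate, and the turnover-based cost model are all assumptions made for the example, and the drawdown, risk, and liquidity terms mentioned in the abstract are left out.

```python
from dataclasses import dataclass


@dataclass
class DifferentialSharpe:
    """Online differential Sharpe ratio (hypothetical helper, not from the thesis)."""

    eta: float = 0.01  # assumed adaptation rate of the moving averages
    a: float = 0.0     # exponential moving average of returns
    b: float = 0.0     # exponential moving average of squared returns

    def update(self, r: float) -> float:
        """Return the differential Sharpe ratio for return r, then update state."""
        delta_a = r - self.a
        delta_b = r * r - self.b
        variance = self.b - self.a ** 2
        if variance > 1e-12:
            dsr = (self.b * delta_a - 0.5 * self.a * delta_b) / variance ** 1.5
        else:
            # Not enough history to estimate the variance: emit a neutral reward.
            dsr = 0.0
        self.a += self.eta * delta_a
        self.b += self.eta * delta_b
        return dsr


def net_return(gross_return, old_weights, new_weights, cost_rate=0.001):
    """Subtract an assumed proportional cost on turnover from the gross return."""
    turnover = sum(abs(n - o) for n, o in zip(new_weights, old_weights))
    return gross_return - cost_rate * turnover


# Usage: one reward per rebalancing step.
dsr = DifferentialSharpe(eta=0.01)
step_return = net_return(0.01, [0.5, 0.5], [0.7, 0.3], cost_rate=0.001)
reward = dsr.update(step_return)
```

Because the moving averages start at zero, the first step yields a neutral reward of zero; later steps reward returns that raise the running Sharpe ratio and penalize those that mainly add variance.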

Degree programme: DATA ANALYTICS FOR BUSINESS AND SOCIETY
Academic year: 2024
Supervisor: CORAZZA, MARCO
Appears in types: Master's degree thesis

Files in this item:

File	Size	Format
Portfolio_Optimization_with_Deep_Reinforcement_Learning__Dynamic_Strategy_Allocation_and_Execution_under_Realistic_Transaction_Costs_MASTER_THESIS finalFORMAT.pdf (not available)	1.39 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14247/28294