Skip to content

0xdarkman/Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020

 
 

Repository files navigation

Deep Reinforcement Learning for Automated Stock Trading: An Ensemble Strategy

This repository refers to the codes for ICAIF 2020 paper.

Abstract

Stock trading strategies play a critical role in investment. However, it is challenging to design a profitable strategy in a complex and dynamic stock market. In this paper, we propose a deep ensemble reinforcement learning scheme that automatically learns a stock trading strategy by maximizing investment return. We train a deep reinforcement learning agent and obtain an ensemble trading strategy using the three actor-critic based algorithms: Proximal Policy Optimization (PPO), Advantage Actor Critic (A2C), and Deep Deterministic Policy Gradient (DDPG). The ensemble strategy inherits and integrates the best features of the three algorithms, thereby robustly adjusting to different market conditions. In order to avoid the large memory consumption in training networks with continuous action space, we employ a load-on-demand approach for processing very large data. We test our algorithms on the 30 Dow Jones stocks which have adequate liquidity. The performance of the trading agent with different reinforcement learning algorithms is evaluated and compared with both the Dow Jones Industrial Average index and the traditional min-variance portfolio allocation strategy. The proposed deep ensemble scheme is shown to outperform the three individual algorithms and the two baselines in terms of the risk-adjusted return measured by the Sharpe ratio.

Reference

Hongyang Yang, Xiao-Yang Liu, Shan Zhong, and Anwar Walid. 2020. Deep Reinforcement Learning for Automated Stock Trading: An Ensemble Strategy. In ICAIF ’20: ACM International Conference on AI in Finance, Oct. 15–16, 2020, Manhattan, NY. ACM, New York, NY, USA.

Installation:

git clone https://github.com/AI4Finance-LLC/Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020.git

Prerequisites

For OpenAI Baselines, you'll need system packages CMake, OpenMPI and zlib. Those can be installed as follows

Ubuntu

sudo apt-get update && sudo apt-get install cmake libopenmpi-dev python3-dev zlib1g-dev

Mac OS X

Installation of system packages on Mac requires Homebrew. With Homebrew installed, run the following:

brew install cmake openmpi

Windows 10

To install stable-baselines on Windows, please look at the documentation.

Create and Activate Virtual Environment

cd into this repository

cd Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020

Under folder /Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020, create a virtual environment

pip install virtualenv

Virtualenvs are essentially folders that have copies of python executable and all python packages.

Virtualenvs can also avoid packages conflict.

Create a virtualenv venv under folder /Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020

virtualenv -p python3 venv

To activate a virtualenv:

source venv/bin/activate

Dependencies

The script has been tested running under Python >= 3.6.0, with the folowing packages installed:

Under virtualenv (venv):

pip install -r requirements.txt

Run DRL Ensemble Strategy

python run_DRL.py

Backtesting

Use Quantopian's pyfolio package to do the backtesting.

Backtesting script

Status

Version History [click to expand]
  • 1.0.1 Changes: added ensemble strategy
  • 0.0.1 Simple version

Data

The stock data we use is pulled from Compustat database via Wharton Research Data Services.

Ensemble Strategy

Our purpose is to create a highly robust trading strategy. So we use an ensemble method to automatically select the best performing agent among PPO, A2C, and DDPG to trade based on the Sharpe ratio. The ensemble process is described as follows:

  • Step 1. We use a growing window of 𝑛 months to retrain our three agents concurrently. In this paper we retrain our three agents at every 3 months.
  • Step 2. We validate all 3 agents by using a 12-month validation- rolling window followed by the growing window we used for train- ing to pick the best performing agent which has the highest Sharpe ratio. We also adjust risk-aversion by using turbulence index in our validation stage.
  • Step 3. After validation, we only use the best model which has the highest Sharpe ratio to predict and trade for the next quarter.

Performance

About

Deep Reinforcement Learning for Automated Stock Trading: An Ensemble Strategy. ICAIF 2020.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Jupyter Notebook 97.1%
  • Python 2.9%