PhilipWaddilove/Backtesting-Algos

Backtesting Algo Strategies

Identifying meaningful signals for different asset classes

Code developed by Philip Millsapugh, Rasha Mosaad, Rawlric Sumner, Mitchel Voloshin, and Phil Waddilove

Table of Contents

  1. Introduction
  2. Code Outline
  3. Datasets
  4. Backtesting
  5. Evaluation of signals
  6. Combining signals with Deep Learning
  7. Conclusions and potential improvements

 


1. Introduction

This project evaluates the efficacy of common algo trading signals. The analysis attempts to answer the following questions.

For a given asset or asset class:

  1. Which strategy delivers the most profitable signals?
  2. How do the risks and returns of each strategy compare?
  3. Does a combination of strategies using machine learning work better than each strategy in isolation?
  4. [Can additional signals generated by Natural Language Processing be additive to this process?]

2. Code Outline

  • Retrieve historic closing price data for an ETF or cryptocurrency using Python libraries (Alpaca API, Pandas and associated data cleaning libraries)
  • Calculate signals for a given lookback window on the following trading strategies:
    • Simple Moving Average ('SMA')
    • Exponentially Weighted Moving Average ('EMA')
    • Bollinger Bands (‘BBD’), and
    • Relative Strength Index ('RSI')
  • Simulate strategy performance over 10 years of historic data, trading on long, neutral and short signals
  • Evaluate using descriptive statistics: Return (%), StDev, Sharpe Ratio, Sortino Ratio, Max Drawdown (%)
  • Employ deep learning libraries (tensorflow.keras Sequential and scikit-learn) to combine and optimize signals from all strategies
  • [Add signals derived from Natural Language Processing ('NLP') analytics to refine the strategy]
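The four signal calculations listed above can be sketched in Pandas as follows. This is a minimal illustration, not the repository's code: the function names, the default windows (50/200 for the moving averages, 20 days and 2 standard deviations for Bollinger Bands, 14 days and 30/70 thresholds for RSI), the simple-average RSI variant, and the +1/0/-1 encoding for long/neutral/short are all assumptions.

```python
import numpy as np
import pandas as pd


def sma_signal(close, short=50, long=200):
    """+1 when the short SMA is above the long SMA, -1 when below, 0 during warm-up."""
    fast = close.rolling(short).mean()
    slow = close.rolling(long).mean()
    return np.sign(fast - slow).fillna(0).astype(int)


def ema_signal(close, short=50, long=200):
    """Same crossover rule as sma_signal, using exponentially weighted averages."""
    fast = close.ewm(span=short, adjust=False).mean()
    slow = close.ewm(span=long, adjust=False).mean()
    return np.sign(fast - slow).fillna(0).astype(int)


def bollinger_signal(close, window=20, num_std=2.0):
    """Mean-reversion rule: long below the lower band, short above the upper band."""
    mid = close.rolling(window).mean()
    band = num_std * close.rolling(window).std()
    below = (close < mid - band).astype(int)
    above = (close > mid + band).astype(int)
    return below - above


def rsi_signal(close, window=14, oversold=30, overbought=70):
    """Long when RSI is below `oversold`, short when above `overbought`."""
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(window).mean()
    avg_loss = (-delta).clip(lower=0).rolling(window).mean()
    rsi = 100 - 100 / (1 + avg_gain / avg_loss)
    # Comparisons against NaN (warm-up or flat prices) are False, giving 0.
    return (rsi < oversold).astype(int) - (rsi > overbought).astype(int)


def build_signals(close):
    """Collect all four signals in one DataFrame, one column per strategy."""
    return pd.DataFrame(
        {
            "SMA": sma_signal(close),
            "EMA": ema_signal(close),
            "BBD": bollinger_signal(close),
            "RSI": rsi_signal(close),
        }
    )
```

Each function takes a Pandas Series of closing prices indexed by date, so `build_signals(prices["close"])` would yield one signal column per strategy for a hypothetical `prices` DataFrame.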

Outline


3. Datasets

The Alpaca API can deliver closing price information over 10 years for the following ETFs (all subsets of the Standard & Poor's 500 index, which the 'SPY' ETF tracks)

Assets


4. Backtesting

Conducts dummy trades on signals from historic data for each strategy. Considers lot size, initial capital, and a ticker chosen by the user. Trades whole lots only (i.e. fractional lots cannot be traded, an 'all-or-nothing' approach).

| Function | Description |
| --- | --- |
| signal_position | a function into which we can pass a dataframe containing a 'signal' column, user defined lot size ('share_size'), initial capital contribution, and an ETF ticker |
| position | long/short * lot size |
| entry/exit position | measures positional change in signal, i.e. where a trade takes place |
| portfolio holdings | Cumulative sum of trades |
| portfolio cash | Initial capital, less: amounts invested (long or short, whole lots) in ETF |
| portfolio total | cash + holdings |
| portfolio daily returns | daily delta (as %) in portfolio total |
| portfolio cumulative returns | overall delta (as %) on initial capital |
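The columns described above can be sketched as a single Pandas function. This is an illustrative reconstruction under assumptions, not the repository's implementation: the column names, the default `share_size` and `initial_capital`, and the choice to value holdings at the closing price are hypothetical, and the ticker argument is omitted because the sketch takes the price data directly.

```python
import pandas as pd


def signal_position(df, share_size=100, initial_capital=100_000.0):
    """Simulate whole-lot trading on a DataFrame with 'close' and 'signal' columns.

    'signal' is assumed to be +1 (long), 0 (neutral) or -1 (short).
    """
    out = df.copy()
    # Target position: long/short * lot size.
    out["position"] = out["signal"] * share_size
    # Change in position, i.e. the shares traded on each row.
    # The first row has no previous position, so it trades the full position.
    out["entry_exit"] = out["position"].diff().fillna(out["position"])
    # Holdings: cumulative shares traded, valued at the closing price.
    out["holdings"] = out["close"] * out["entry_exit"].cumsum()
    # Cash: initial capital less the cumulative cost of all trades.
    out["cash"] = initial_capital - (out["close"] * out["entry_exit"]).cumsum()
    out["total"] = out["cash"] + out["holdings"]
    # Daily and cumulative returns, as fractions rather than percentages.
    out["daily_return"] = out["total"].pct_change()
    out["cumulative_return"] = out["total"] / initial_capital - 1
    return out
```

For example, buying 10 shares at 11 and selling them at 11 after the price touched 12 leaves the portfolio total back at the initial capital, with a 1% cumulative return on the intermediate day when starting from 1,000.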

Backtesting


5. Evaluation of signals

A broad index investor (SPY) using an SMA (50/200-day window) performed well over the last 10 years, but the same strategy performs poorly for the tech sector and crypto. RSI is a more consistent performer.
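The descriptive statistics named in the Code Outline (Return, StDev, Sharpe Ratio, Sortino Ratio, Max Drawdown) can be computed from a portfolio total series roughly as follows. This is a sketch under assumptions: the function name `evaluate`, the 252-day annualization, and a zero risk-free rate are not taken from the repository.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # assumed number of trading days per year


def evaluate(total):
    """Summarize a portfolio total (equity curve) Series.

    Assumes a zero risk-free rate for the Sharpe and Sortino ratios.
    """
    returns = total.pct_change().dropna()
    ann_mean = returns.mean() * TRADING_DAYS
    stdev = returns.std() * np.sqrt(TRADING_DAYS)
    # Downside deviation: root mean square of the negative returns only.
    downside = np.sqrt((returns.clip(upper=0) ** 2).mean()) * np.sqrt(TRADING_DAYS)
    # Drawdown: fractional drop from the running peak of the equity curve.
    drawdown = total / total.cummax() - 1
    return {
        "return_pct": (total.iloc[-1] / total.iloc[0] - 1) * 100,
        "stdev": stdev,
        "sharpe": ann_mean / stdev,
        "sortino": ann_mean / downside,
        "max_drawdown_pct": drawdown.min() * 100,
    }
```

Running the same function over the portfolio total of each strategy gives directly comparable rows for a results table.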

Evaluation


6. Combining signals with Deep Learning

  • Calibrated a Long Short-Term Memory ('LSTM') deep learning model given time series data going back to 1/1/2010
  • LSTM model takes trading signals from each strategy and applies all of them as features in order to predict prices of the selected ETF
  • Tried several different epoch counts (50, 100, 500, 1,000) and found that 500 epochs gave the best trade-off between loss and computing power
  • Standard 70/30 train/test split
  • Given the number of features (8), we chose a 2-layer neural network with 0.33 (33%) dropout at each layer
  • LSTM model is free to decide what information is relevant and what can be discarded, optimizing over each successive iteration/epoch.
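The setup described above can be sketched as follows. This is an assumption-laden illustration rather than the project's model: the helper names, the 30-day input window, the 32 units per layer, the Adam optimizer, the mean squared error loss, and the chronological (unshuffled) split are all hypothetical choices. Only the 70/30 split, the two LSTM layers, the 0.33 dropout, and the 8 features come from the text.

```python
import numpy as np


def make_windows(features, target, window=30):
    """Turn a (rows, n_features) array into overlapping windows for an LSTM.

    Each sample holds `window` consecutive rows of features; its label is the
    target value on the row that immediately follows the window.
    """
    X, y = [], []
    for end in range(window, len(features)):
        X.append(features[end - window:end])
        y.append(target[end])
    return np.array(X), np.array(y)


def chronological_split(X, y, train_frac=0.7):
    """Split without shuffling so the test set lies strictly after the training set."""
    cut = int(round(len(X) * train_frac))
    return X[:cut], X[cut:], y[:cut], y[cut:]


def build_lstm(window, n_features, units=32, dropout=0.33):
    """Two stacked LSTM layers, each followed by dropout, with one price output."""
    # Imported here so the data helpers above run without TensorFlow installed.
    from tensorflow.keras import Input, Sequential
    from tensorflow.keras.layers import LSTM, Dense, Dropout

    model = Sequential(
        [
            Input(shape=(window, n_features)),
            LSTM(units, return_sequences=True),
            Dropout(dropout),
            LSTM(units),
            Dropout(dropout),
            Dense(1),
        ]
    )
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model
```

With 8 feature columns, a model built by `build_lstm(30, 8)` could then be trained with `model.fit(X_train, y_train, epochs=500, shuffle=False)`. Scaling the features and prices before windowing (for example with scikit-learn's `MinMaxScaler`) is usually advisable.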

7. Conclusions and potential improvements

  • RSI is a relatively consistent performer, but the 'all-or-nothing' characteristic of the backtesting methodology needs work
  • Deep Learning model performs fairly well in upwardly trending markets but does not do well in downward market movements
    • Data used to train the model covers a period of low volatility and upward trend
    • Model does not predict rebounds following corrections - important to note as to why Quantitative Strategies and Hedge Funds underperformed in 2020: models were not trained on “V-shaped” recoveries, which led these strategies and funds to underperform on the rally
  • Ran out of time to incorporate NLP into the Deep Learning module - an additional exercise

Appendix

  • Dependencies: Solidity, Remix, Ganache, Metamask

About

Employing machine learning to backtest common algo strategies on SPX and other ETFs, and evaluate the performance of combinations of these strategies
