9/23/2019

Introduction

Objective

The primary objective of this project is to predict the direction of movement of the volume-weighted average price (VWAP) of the two most important buy-side and sell-side levels in a limit order book over the next 1 minute, using deep learning.

Issues to tackle

  • Volumes are not available at the end of each second and must be extrapolated
  • As in other financial problems, the i.i.d. assumption fails; an explicit model of the dependence between observations is required
  • The sample dependence of the model should be examined; in particular, the model should perform significantly better than the majority-class guess
  • The threshold used for sampling observations must be balanced against the minimum number of observations deep learning models require

Data source

Privately acquired data; only results will be shown.

Feature Engineering

Time bars (discretization)

Time bars are series summaries created at fixed time intervals. In limit order books they generally show a significant amount of overlap between consecutive bars. However, we apply deep learning methods that account for the correlation between observations.

Labeling

The return for the current minute is calculated from the features computed in the previous minute. The label is split into 3 categories: 0 if |return| < thr, 1 if return > thr, and -1 if return < -thr.
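A minimal sketch of this labeling rule, assuming the 1-minute returns are held in a pandas Series and thr is the chosen threshold (both names are illustrative):

    import pandas as pd

    def label_returns(returns: pd.Series, thr: float) -> pd.Series:
        """Map each 1-minute return to one of 3 classes: -1, 0, or 1."""
        labels = pd.Series(0, index=returns.index)  # |return| < thr -> 0
        labels[returns > thr] = 1                   # upward move
        labels[returns < -thr] = -1                 # downward move
        return labels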

Problem

The entire close-price feature column is normalized at once. Therefore, there can be time windows in which the normalized close price has a non-zero mean. This may cause issues when deep learning is combined with batch normalization. However, global normalization is required if the price level (and not just the volatility) is related to the outcome.
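The effect can be seen in a small, self-contained sketch (the synthetic price path is purely illustrative): a local window of a globally z-scored series need not have zero mean.

    import numpy as np

    rng = np.random.default_rng(0)
    close = 100.0 + np.cumsum(rng.normal(size=1000))  # synthetic close-price path

    z = (close - close.mean()) / close.std()  # normalize the entire column at once

    # A local window of the globally normalized series can have a non-zero mean,
    # which is what interacts with downstream batch normalization.
    print(z[:60].mean())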

Deep learning algorithms

Fully connected network

In a fully connected network, each neuron of a layer is connected to every neuron in the next (and previous) layer. A weighted sum of the outputs from the previous layer is taken, and a non-linear ‘activation’ is applied to the resulting weighted sum. This creates more complex non-linear features from the original set of features.
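A minimal Keras sketch of such a network for the 3-class problem; the layer sizes and n_features are illustrative, not the tuned architecture from this project:

    from tensorflow import keras
    from tensorflow.keras import layers

    n_features = 40  # illustrative width of the engineered feature vector

    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        layers.Dense(64, activation="relu"),    # weighted sum + non-linear activation
        layers.Dense(32, activation="relu"),
        layers.Dense(3, activation="softmax"),  # 3 classes (e.g. -1/0/1 remapped to 0/1/2)
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])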

Convolutional neural network

A convolutional neural network applies window-based weighted filtering to the data, and pooling then summarizes the filtered features from the previous layer. Changing the filter size leads to downstream features that retain different amounts of information from the original input.
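A minimal 1D-convolutional sketch over a window of time bars; the window length, feature count, and filter settings are illustrative:

    from tensorflow import keras
    from tensorflow.keras import layers

    window, n_features = 60, 40  # illustrative: 60 time bars, 40 features per bar

    model = keras.Sequential([
        keras.Input(shape=(window, n_features)),
        layers.Conv1D(filters=32, kernel_size=5, activation="relu"),  # filter size is tuned
        layers.MaxPooling1D(pool_size=2),  # pooled features from the filtered output
        layers.Flatten(),
        layers.Dense(3, activation="softmax"),
    ])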

Recurrent neural network

A recurrent neural network incorporates information from previous time steps into the current (and future) time steps. When unrolled, it has a template-like structure in which the same computation is repeated at each time step.
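A minimal LSTM-based sketch with illustrative shapes; the number of LSTM units is one of the tuned hyperparameters listed later:

    from tensorflow import keras
    from tensorflow.keras import layers

    window, n_features = 60, 40  # illustrative shapes

    model = keras.Sequential([
        keras.Input(shape=(window, n_features)),
        layers.LSTM(64),                        # LSTM units are a tuned hyperparameter
        layers.Dense(3, activation="softmax"),
    ])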

Hierarchical recurrent neural network

A hierarchical recurrent neural network adds an additional operation that summarizes each time step’s feature set, gathering timestamp-level (column-encoding) information in addition to the row encoding performed by a regular RNN.
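One way to realize this in Keras, assuming the column encoding is a per-time-step dense summary applied before the row-encoding RNN (an interpretation of the description above, not a confirmed implementation detail):

    from tensorflow import keras
    from tensorflow.keras import layers

    window, n_features = 60, 40    # illustrative shapes
    col_units, row_units = 32, 64  # column/row encoding units (tuned hyperparameters)

    inputs = keras.Input(shape=(window, n_features))
    # Column encoding: summarize each time step's feature vector independently
    col = layers.TimeDistributed(layers.Dense(col_units, activation="relu"))(inputs)
    # Row encoding: a regular RNN pass across the encoded time steps
    row = layers.LSTM(row_units)(col)
    outputs = layers.Dense(3, activation="softmax")(row)
    model = keras.Model(inputs, outputs)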

Training setup

Sampling

The data set is arranged in ascending chronological order and is sampled for deep learning in the following way (a code sketch follows the list):

  • Training: 0%-60%
  • Validation: 60%-80%
  • Test: 80%-100%
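A minimal chronological 60/20/20 split, assuming the rows are already sorted by time (function and variable names are illustrative):

    import numpy as np

    def chronological_split(X: np.ndarray, y: np.ndarray):
        """Split time-ordered data into 60% train, 20% validation, 20% test."""
        n = len(X)
        i, j = int(0.6 * n), int(0.8 * n)
        return (X[:i], y[:i]), (X[i:j], y[i:j]), (X[j:], y[j:])

No shuffling is done before the split, so the test set always lies strictly after the training and validation periods.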

Hyperparameters

  • Number of hidden layers and number of neurons in hidden layers (Fully connected)
  • Filter size (CNN)
  • LSTM units (RNN)
  • Row and column encoding units (Hierarchical RNN)

Hyperparameters are tuned using randomized grid search.
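A minimal sketch of randomized grid search: configurations are drawn at random from the grid and each is scored on the validation set. The search spaces below are illustrative, not the project's actual ranges:

    import random

    rng = random.Random(0)

    # Illustrative search spaces for the hyperparameters listed above
    grid = {
        "hidden_units": [32, 64, 128],  # fully connected
        "filter_size":  [3, 5, 7],      # CNN
        "lstm_units":   [32, 64, 128],  # RNN / hierarchical RNN
    }

    def sample_config():
        """Draw one random point from the hyperparameter grid."""
        return {name: rng.choice(values) for name, values in grid.items()}

    configs = [sample_config() for _ in range(20)]  # evaluate each on the validation set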

Results and conclusion

  • Evaluation metrics: Accuracy, class-wise precision, class-wise recall, class-wise F1
  • Results: The hierarchical RNN significantly outperformed the majority-class guess on 3 out of 4 tickers on the test set; on the remaining ticker, all models performed worse than a random guess
  • Scope for improvement: better preprocessing and vectorization of the text features; an exponential-memory model could be used to smooth the news feature matrix

Performance of deep learning models

Performance of classical machine learning models