ICML 2019 Expo Talk
June 9, 2019
Machine Learning Applications in Financial Data Analysis
Cubist Systematic Strategies is a quantitative and systematic trading firm that has been active and successful in the global stock, futures, bond, and options markets for the last 15 years. Cubist uses state-of-the-art machine learning techniques to perform complex quantitative data analyses in order to identify market anomalies and participate in price discovery. The lifecycle of financial quantitative analysis usually consists of data collection, cleaning and organization, rigorous modeling, testing and peer review, and automation. This challenging process is further complicated by the fact that financial data is, in general, non-stationary, unstructured, heterogeneous, multivariate, and asynchronous. Also, due to the feedback loop between our actions and the marketplace, our footprints are unavoidably entrenched in historical asset price data. Recent developments in machine learning (ML) and the abundance of data have opened up myriad potential applications in various parts of the financial research pipeline. In this talk, we will present the nature of the prominent problems we face in financial data analysis, emphasizing how they differ from problems in domains where recent ML advances have found successful applications, such as computer vision and speech recognition. First, with the goal of forecasting future security returns at various timescales, we’ll introduce and visualize some of the relevant but non-stationary, unstructured, heterogeneous, multivariate, and asynchronous data that our ML algorithms process. Next, using specific examples, we’ll discuss how advanced ML algorithms 1) are likely to fail as out-of-the-box solutions when faced with such complex financial data, but 2) can also reveal powerful knowledge when combined with insights into market structure. Finally, we’ll demonstrate how Cubist turns forecasts at various horizons into successful trades in the marketplace when liquidity and risk constraints are also considered.