Data Science Q&A: Olga Kokareva, Synthesis

“Alternative data” has been a perennial buzzword for institutional investors over the past few years. Is it time for this non-market, non-fundamental data to go mainstream?

Olga Kokareva, a managing director at quantitative investment firm Synthesis, took a short break from her busy schedule to share her thoughts with IntelAlley regarding the state of alternative data.

Are we close to a Golden Age of alternative data? 

We are in the early stage of adoption of what will eventually become the new normal of investment research. The mere fact that we still call this data “alternative” means this is just the beginning. So far, we only see early adopters actively using alternative data and the early majority testing the waters. Large traditional asset managers, private equity firms, and institutional investors will start implementing big data analysis at scale, driven by client interest and peer pressure. Collecting data in real time from underlying sources and applying machine learning techniques to process it will become an integral part of the quest for alpha.

Is the number of alternative data sources growing at the same rate as it had in the past year or two?

It is growing much faster. Quintillions of bytes of data are created daily; millions of connected devices collect and share data. On top of that, tech entrepreneurs now have a much better idea of how to monetize data by partnering with financial firms. Many tech startups list selling data exhaust to Wall Street as a projected revenue stream in their VC pitches and business plans. Increasing demand, combined with growing interest from private equity and strategic investors, will attract new players to the industry, and the competition will push existing data vendors to innovate and improve data quality.

What are the issues that financial institutions are still getting wrong regarding the sourcing of alternative data?

First of all, it is wrong to underestimate the role of big data insights in building a competitive investment strategy. At the very least, every asset manager has to assess the opportunity strategically.

Another misconception is expecting revolutionary results right away. It takes time and investment to assemble the right team, put the infrastructure in place, and streamline the process. Plus, most of the actively marketed data sets don’t contain a strong enough signal on their own; they only work as one piece of an information puzzle. It is hardly possible to achieve the optimal ROI on the first attempt, and meaningful results may take a while to show.

Has the role of the data scientist evolved much? Are they still spending the majority of their time cleaning data?

A highly skilled data scientist with domain expertise is not supposed to be cleaning data. Ideally, there should be a separate engineering team, but obviously, not all firms can afford to have both in-house. Data pre-processing doesn’t typically involve sharing any proprietary information and can be outsourced entirely. There are service providers who can cover data preparation and, in some cases, help with feature extraction so that the asset manager’s data team can focus entirely on data analysis. The demand for outsourced data processing will grow as more asset managers start incorporating big data insights into their research.

Have the dynamics between data suppliers and consumers changed much? Do you find yourself beating the bushes for providers, or are the providers knocking down your door to sell their wares?

The role of data scouts is evolving: we now receive tons of incoming inquiries from providers, and the challenge is to prioritize the evaluation of new data sets. Even at a fairly large firm, the capacity for in-depth testing is not unlimited. To find the balance, buy-side firms need to establish a robust vetting process. However, it would be a mistake to assume that alternative data is entirely a buyers’ market. Some valuable data sources are not yet commercialized, and proactively searching for and negotiating to acquire new data sets is still a big part of a data scout’s role.