Assessing market data quality splits into two levels. At the basic level, you examine the quality of the channel itself, by which I mean looking at information that answers questions such as the following (a rough sketch of two of these checks follows the list):
- How much bandwidth is my market data consuming?
- Am I experiencing microbursts?
- Do I have sequence gaps?
- Am I missing packets?
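To make that concrete, here is a minimal sketch of two of those channel-level checks: sequence-gap detection and microburst detection. The packet fields, window size and byte threshold are all assumptions for illustration; real feeds and sensible thresholds vary by venue.

```python
# Channel-level health sketch: sequence gaps and microbursts.
# Packet structure (sequence number, size, timestamp) is assumed for illustration.
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int      # hypothetical per-channel sequence number
    size: int     # bytes on the wire
    ts: float     # receive timestamp in seconds


class ChannelHealth:
    def __init__(self, burst_window_ms: float = 1.0, burst_bytes: int = 125_000):
        self.expected_seq = None
        self.gaps = 0                     # number of gap events
        self.missing = 0                  # total packets missed
        self.window = deque()             # packets inside the rolling burst window
        self.window_bytes = 0
        self.burst_window = burst_window_ms / 1000.0
        self.burst_bytes = burst_bytes    # ~1 Gbps sustained over 1 ms, as an example
        self.microbursts = 0

    def on_packet(self, pkt: Packet) -> None:
        # Sequence-gap check: any jump past the expected next number is a gap.
        if self.expected_seq is not None and pkt.seq > self.expected_seq:
            self.gaps += 1
            self.missing += pkt.seq - self.expected_seq
        self.expected_seq = pkt.seq + 1

        # Microburst check: bytes received inside a short rolling time window.
        self.window.append(pkt)
        self.window_bytes += pkt.size
        while self.window and pkt.ts - self.window[0].ts > self.burst_window:
            self.window_bytes -= self.window.popleft().size
        if self.window_bytes > self.burst_bytes:
            self.microbursts += 1
```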
Answering these questions tells us whether the channel is healthy. Then there is market data assessment that looks at the quality of the data itself. This goes much deeper and can be used to assess, for example, whether all of the symbols a firm expects to see are ticking; whether they are ticking at the rate you would expect on both the bid and offer side; and whether tick price movements are within a normal range.
This is where assessing market data quality takes things to a new level. You could have a market data channel that seems perfectly healthy: no sequence gap problems, no microbursts. But in reality you've got a symbol or two that are simply not ticking, or ticking on one side and not the other, or a whole market missing from a consolidated feed. Worst of all, you can have sudden price movements so extreme they may be signalling the next market crash.
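As a rough illustration of that kind of content-level check, the sketch below flags symbols that have gone quiet entirely, or that are only updating on one side of the book. The quote fields and the quiet-time threshold are my own assumptions, not a recommendation.

```python
# Content-level liveness sketch: which symbols have stopped ticking,
# and which are ticking on only one side of the book?
import time


class SymbolLiveness:
    def __init__(self, max_quiet_secs: float = 5.0):
        self.max_quiet = max_quiet_secs   # assumed threshold, tune per symbol/venue
        self.last_bid = {}                # symbol -> timestamp of last bid update
        self.last_ask = {}                # symbol -> timestamp of last ask update

    def on_quote(self, symbol: str, side: str, ts: float) -> None:
        if side == "bid":
            self.last_bid[symbol] = ts
        else:
            self.last_ask[symbol] = ts

    def check(self, now: float | None = None) -> dict:
        """Return a symbol -> alert map for anything that looks stale."""
        now = now if now is not None else time.time()
        alerts = {}
        for sym in set(self.last_bid) | set(self.last_ask):
            bid_quiet = now - self.last_bid.get(sym, 0.0)
            ask_quiet = now - self.last_ask.get(sym, 0.0)
            if bid_quiet > self.max_quiet and ask_quiet > self.max_quiet:
                alerts[sym] = "not ticking"
            elif bid_quiet > self.max_quiet:
                alerts[sym] = "bid side quiet"
            elif ask_quiet > self.max_quiet:
                alerts[sym] = "offer side quiet"
        return alerts
```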
In the US, over the past few years the SEC has introduced several new rules that require exchanges to halt trading when prices move too far, too fast. Many of these requirements came in response to market crashes or sudden price movements that affected not only the exchange but also the companies being traded. Lessons have been learned from big IPOs, from episodes where prices moved disruptively quickly, and from other crises.
However, no one can accurately predict where the next threat to market stability will come from. I'm sure that when it hits, even more regulation will follow, but for the firms that suffer huge losses because of the quality of the market data they were trading on at the time, it will come too late.
Take, for example, a firm active in a market when prices start moving very quickly. If you are only monitoring the quality of the channel, everything would probably look good: there might be no gaps, and the data is actually reaching you. But if you are looking at the quality of the data itself, you might see that the price fluctuations of certain symbols have started to go haywire.
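One way to spot that kind of behaviour is to compare each tick's move against the symbol's own recent history. The sketch below uses a rolling mean and standard deviation of returns; the window size, warm-up length and threshold are illustrative assumptions.

```python
# Price-sanity sketch: flag a tick whose move sits far outside the
# symbol's recent return distribution.
from collections import defaultdict, deque
import math


class PriceMoveMonitor:
    def __init__(self, window: int = 200, z_threshold: float = 8.0):
        self.z = z_threshold
        self.returns = defaultdict(lambda: deque(maxlen=window))  # recent returns per symbol
        self.last_px = {}                                         # last price per symbol

    def on_trade(self, symbol: str, price: float) -> bool:
        """Return True if this tick's move looks abnormal."""
        prev = self.last_px.get(symbol)
        self.last_px[symbol] = price
        if prev is None or prev <= 0:
            return False
        r = (price - prev) / prev
        hist = self.returns[symbol]
        abnormal = False
        if len(hist) >= 30:                       # require some history before alerting
            mean = sum(hist) / len(hist)
            var = sum((x - mean) ** 2 for x in hist) / len(hist)
            std = math.sqrt(var)
            abnormal = std > 0 and abs(r - mean) > self.z * std
        hist.append(r)
        return abnormal
```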
Being able to spot very quickly that a price movement is not normal can save thousands, if not millions, of dollars. Without this insight, an issue inside the system subscribed to the data can easily go undetected. Even without a price jump, a symbol may be ticking on the bid side but not on the offer side, generating misinformed trades, which can be hugely expensive.
Furthermore, there’s a lot of value in assessing the data’s quality at both a channel and content level in real-time, because trying to do this as a post-trade processing activity involves much more heavy lifting.
Access to information detailing the quality of the actual data being used to formulate trading decisions can help traders trying to understand why they hit, or didn't hit, a certain price, or why they didn't get the price spreads they expected. All of these things are very much connected to whether the market data received was healthy and available. It's the ability to correlate why a bad trading decision was taken with what was happening in the market at that point in time.
Monitoring market data quality at this level isn't yet the standard. Many firms are still at the stage of wanting first to ensure they are getting all of the channels they expect. Are these channels alive, and are they missing any data? They are laying down a foundation so they can essentially answer the question: is my channel healthy?
Once they can correct all of the problems this level of monitoring detects, and can determine whether the issue lies on their own network, whether their connection to the exchange is too small, or whether the exchange itself is having problems, then they're ready to move to the next level.
This is a methodical approach. The first layer is a building block. You can't really assess the quality of the data without first knowing that the channel itself is sound. If you're not sure you're getting all of the data, how do you know whether it's the quality of the data that's causing the problem, or the fact that you're simply not receiving the full channel?
Once a firm is ready to look at the quality of the data itself, it's recommended that they monitor the data both before it enters their systems and after, so they can detect whether the way the data is being processed is healthy. If you see corresponding problems on both sides, you know the problem is with the data as received. If you only see a problem after the data has been processed, you know the problem is happening inside your own system.
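A minimal sketch of that before-and-after comparison, assuming you can tap the feed at both points and compute the same simple metrics at each, might look like this. The metric choice and field names are mine, purely for illustration.

```python
# Pre/post-processing comparison sketch: compute the same metrics at a tap
# before the data enters your systems and at a tap after processing, then diff.
from dataclasses import dataclass, field


@dataclass
class TapMetrics:
    messages: int = 0
    last_price: dict = field(default_factory=dict)   # symbol -> last price seen at this tap

    def on_message(self, symbol: str, price: float) -> None:
        self.messages += 1
        self.last_price[symbol] = price


def compare(pre: TapMetrics, post: TapMetrics) -> list[str]:
    """Divergence only on the 'post' side points at your own processing."""
    findings = []
    if post.messages < pre.messages:
        findings.append(f"{pre.messages - post.messages} messages lost inside our own processing")
    for sym, px in pre.last_price.items():
        if sym not in post.last_price:
            findings.append(f"{sym}: seen on the wire but never emitted downstream")
        elif post.last_price[sym] != px:
            findings.append(f"{sym}: downstream price {post.last_price[sym]} is behind the wire price {px}")
    return findings
```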
In summary, I feel monitoring at both a channel and a content level is important. But it's when you are monitoring the quality of the data itself that being alerted to emerging issues becomes really valuable, because the issue's impact is much easier to determine. Take, for instance, a trading desk being alerted to the fact that certain symbols are not ticking as expected. This could indicate that the data feeding trading decisions is behind the market and causing loss-making trades. With that insight, the desk can decide whether to halt trading, trade on a different market or take another approach entirely. Whereas if you missed a few messages due to a sequence gap, you can't accurately determine what impact that will have, or what an appropriate response would be. Yes, being aware of missed messages tells the network or infrastructure teams that there is a problem, but how, or even whether, it affects trading is unknown.
So whilst firms need to understand the quality of their market data channel, the real benefit comes from being able to accurately interpret the impact of data quality issues, and choosing to stop at the first level can prove extremely costly.