There’s no doubt in my mind: being able to look into what is happening inside a server, blade or stack of servers, and to correlate that activity with what is happening outside on the network, is extremely valuable. It’s what allows firms to gain a real end-to-end understanding of what is going on in their environment.
Components are starting to consolidate. Market data engines and matching engines, pricing engines and gateways are being pulled together, often into a single application or server, and are now communicating via memory.
This can present problems, because with classic network-based monitoring you can only see a message going into this big blob of activity and a message coming out again. If that round trip takes a second, that’s huge; usually it’s simply too much. But realistically, now what?
There could be several applications working away inside, along with the network cards and their stack, the operating system and any additional applications. Without a view inside this big blob of “stuff”, how can you possibly know where the excess time was spent? And if you don’t know that, how do you know what you need to do to fix it?
With a view inside, you can monitor a message as it goes in, see it as it crosses the network interface card, then the operating system, and then as it enters the application. This drill-down provides a detailed understanding of what is happening internally, so you can find out where the problem is and zoom in. In testing environments, where you are testing the integration of multiple applications into one server, it’s particularly important to be able to do this.
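To make the drill-down concrete, here is a minimal sketch of how per-hop timings could be derived once a message has been timestamped at each point along its path. This is not Application Tap’s API. The tap point names, the `hop_latencies` and `slowest_segment` helpers, and the sample timestamps are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical tap points, in the order a message passes through them.
HOPS = ["wire_in", "nic", "os", "app_in", "app_out", "wire_out"]


def hop_latencies(ts: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return (segment, nanoseconds) for each pair of consecutive tap points."""
    return [(f"{a}->{b}", ts[b] - ts[a]) for a, b in zip(HOPS, HOPS[1:])]


def slowest_segment(ts: Dict[str, int]) -> Tuple[str, int]:
    """Return the segment where the message spent the most time."""
    return max(hop_latencies(ts), key=lambda seg: seg[1])


# Made-up timestamps (in nanoseconds) for a single message.
sample = {
    "wire_in": 0,
    "nic": 2_000,
    "os": 9_000,
    "app_in": 15_000,
    "app_out": 915_000,
    "wire_out": 930_000,
}

for name, nanos in hop_latencies(sample):
    print(f"{name:>18}: {nanos} ns")
print("slowest:", slowest_segment(sample))
```

From the outside, only the total between `wire_in` and `wire_out` would be visible. With the intermediate timestamps, the same message shows that most of the time was spent between `app_in` and `app_out`.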
To enable our clients to gain this detailed visibility, Velocimetrics has Application Tap. It’s used in different ways. For instance, one of our clients is currently building a new network protocol, and they don’t want to build decoders just yet because the decoders would have to change with each version of the protocol during development. This client uses Application Tap in their testing environment to measure performance. Once they get the performance they want, they’ll settle on a spec.
Using Application Tap means they can do all this testing, see what’s happening inside the application and correlate the timings with what is happening on the wire. Then they’ll write the decoder once, not potentially 100 times. It really is a way to divide and conquer, adding to what you can do from the outside, with minimal impact on the application’s performance.
We’ve also seen clients use Application Tap to measure the performance of a very particular piece of hardware and to determine which change to their system caused a problem.
With Application Tap, when changes are introduced, clients can see what is happening inside their server. If a client is replacing all of their network interface cards and messages are now taking extra time within the server, the client can measure the timings of all hops inside the server and determine which change caused the difference.
It’s not just about understanding the overall change, but also whether individual components remained stable when various changes were introduced. It enables clients to see how everything behaves when they replace just one thing. For instance, if they upgrade their operating system, Application Tap lets them see what has changed.
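The before-and-after comparison described above can be sketched as follows, assuming per-segment latency samples have already been collected for a baseline run and for a run after the change. The `segment_shift` and `unstable_segments` helpers, the segment names, the threshold and the numbers are all hypothetical, not part of any Velocimetrics product.

```python
from statistics import median
from typing import Dict, List


def segment_shift(before: Dict[str, List[int]],
                  after: Dict[str, List[int]]) -> Dict[str, float]:
    """Median latency change per segment (after minus before), in ns."""
    return {
        seg: median(after[seg]) - median(before[seg])
        for seg in before
        if seg in after
    }


def unstable_segments(before: Dict[str, List[int]],
                      after: Dict[str, List[int]],
                      threshold_ns: int) -> List[str]:
    """Segments whose median moved by more than threshold_ns, largest first."""
    shifts = segment_shift(before, after)
    moved = [seg for seg, delta in shifts.items() if abs(delta) > threshold_ns]
    return sorted(moved, key=lambda seg: -abs(shifts[seg]))


# Made-up samples (ns): a baseline run and a run after swapping the NICs.
baseline = {
    "nic": [2_000, 2_100, 1_900],
    "os": [7_000, 7_200, 6_900],
    "app": [15_000, 15_100, 14_900],
}
after_nic_swap = {
    "nic": [5_000, 5_200, 4_900],
    "os": [7_000, 7_100, 6_950],
    "app": [15_000, 15_050, 14_950],
}

print(segment_shift(baseline, after_nic_swap))
print(unstable_segments(baseline, after_nic_swap, threshold_ns=500))
```

Comparing medians per segment, rather than only the end-to-end total, shows which components stayed stable and which did not. In this made-up data, only the `nic` segment moves. A real comparison would likely need far more samples and a more robust statistic than a three-sample median.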
It’s about being able to see what happened not just from the outside but from the inside too. And this level of visibility, especially in testing environments, where you are trying to work out what is going on, can be incredibly useful.