
Building a Trading System using Linux Containers

Good idea or bad?

In this article, we want to talk a bit about how we help our clients with the challenging performance issues facing financial markets. Beeks works with many clients to implement Kubernetes on the bare metal servers that we provide in colocation facilities and as part of our ‘deploy anywhere’ Proximity Cloud solution. One of the topics that often comes up early in these discussions is whether containers (the fundamental building blocks of a Kubernetes deployment) impose a performance overhead that would impact latency-sensitive applications.

 

Background on virtualization technologies

Any virtualization technology adds a layer of abstraction, and that layer introduces a performance penalty. This was certainly the case for the early virtualization techniques, which emulated a default set of underlying hardware in software. Initially, QEMU, Xen and VMware had to emulate a physical server; later they started to introduce para-virtualized drivers. At around this point, Intel and AMD introduced CPU instruction set extensions that help hypervisors translate the memory regions assigned to different virtual machines. These extensions give a virtual machine faster access to memory than if the mapping had to be done in software. The overhead was greatly reduced, but you could still see a slight performance drop from virtualization. In our experience, if everything is optimized, you can reduce the overhead to around 10%.

In parallel to the traditional hypervisor-based virtualization technologies, there have always been alternative approaches to isolating workloads on an individual host. Sun Solaris had something called Zones (in 2005!), FreeBSD had Jails, and Linux had cgroups back in 2006 (initially developed by engineers at Google), alongside container projects such as OpenVZ. In contrast to the hypervisor approach (virtualizing hardware), these technologies focus on carefully isolating workloads. Using them, you lose the freedom to run binaries intended for different operating systems, but in theory you can isolate multiple workloads without facing any tangible performance penalty.

 

Linux Containers

OK, after this long introduction, what is a Linux container? In addition to the aforementioned cgroups, it uses namespaces, which were added to the Linux kernel over a number of releases. Both technologies provide separation. Cgroups limit and account for resource usage such as CPU time and memory, while namespaces separate what a process can see: PIDs, mount points, date/time settings and network stacks.
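Both mechanisms can be observed directly on any Linux host through the /proc filesystem. The following is a minimal sketch rather than a reference tool; the function name show_isolation is ours and purely illustrative, and the script assumes a Linux system where /proc/<pid>/ns and /proc/<pid>/cgroup are available.

```shell
#!/bin/sh
# Show which namespaces and cgroups a process belongs to (Linux only).
# Usage: show_isolation [PID]   (defaults to the current shell)
# Note: show_isolation is an illustrative helper, not a standard command.
show_isolation() {
    pid="${1:-$$}"
    echo "namespaces of PID $pid:"
    for ns in /proc/"$pid"/ns/*; do
        # Each entry is a symlink such as "net:[4026531840]"; the number
        # identifies the namespace instance the process is attached to.
        printf '  %s\n' "$(readlink "$ns")"
    done
    echo "cgroup membership of PID $pid:"
    sed 's/^/  /' /proc/"$pid"/cgroup
}

show_isolation "$@"
```

Run without arguments, it describes the shell itself, which shows that even an ordinary host process already lives inside a set of namespaces and a cgroup.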

We can see, therefore, that a container is not a virtualization technology but an isolation technology. This means that, with no hypervisor intermediating the containers’ access to hardware resources, containers should in theory be able to maintain the same performance levels as physical hardware. However, in practice there have often been doubts that this is the case.

 

So, in short, is there any extra layer in the kernel to provide such abstraction?

The short answer is no.

Remember the main task of a kernel is to allow multiple processes to share common hardware, in particular to allow them to effortlessly use memory, network, and disk.

Every process (called a task in kernel terminology) is represented by a structure called task_struct, a massive structure whose definition runs to hundreds of lines of code.
Inside this struct there are pointers to other data structures, including the task’s cgroup membership (the cgroups field) and its set of namespaces (the nsproxy field).

This means every process, regardless of whether it’s running on the host operating system or in a container, is managed in exactly the same way. So there is no overhead in running a container, since its processes are handled exactly as they would be for any process on the base OS. The only difference is that a containerized process may point to a different set of namespaces (basically, a different list).
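One way to see this "different list" in practice is to compare the namespace identifiers of two processes. The sketch below is illustrative only: compare_namespaces is a hypothetical helper name, and it assumes a Linux host with /proc mounted. To compare a host shell with a container, the container's main PID could be obtained with something like docker inspect --format '{{.State.Pid}}' <container>, and reading another user's namespace links generally requires root.

```shell
#!/bin/sh
# Compare the namespaces of two processes, one line per namespace type (Linux only).
# Usage: compare_namespaces PID_A PID_B
# Note: compare_namespaces is an illustrative helper, not a standard command.
compare_namespaces() {
    for ns in /proc/"$1"/ns/*; do
        name="${ns##*/}"
        a="$(readlink "/proc/$1/ns/$name" 2>/dev/null)"
        b="$(readlink "/proc/$2/ns/$name" 2>/dev/null)"
        if [ -z "$a" ] || [ -z "$b" ]; then
            printf '%-18s unreadable (try running as root)\n' "$name"
        elif [ "$a" = "$b" ]; then
            printf '%-18s shared     %s\n' "$name" "$a"
        else
            printf '%-18s different  %s vs %s\n' "$name" "$a" "$b"
        fi
    done
}

# With no argument, compare this shell with its parent process.
compare_namespaces "$$" "${1:-$PPID}"
```

Two ordinary host processes would normally report every namespace as shared. A process inside a container would be expected to differ in several of them, although a container started with host networking (for example Docker's --net=host option) keeps the same net namespace as the host.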

To clarify with an example, think about running all your applications (a browser, Office, Slack and terminals) on a single desktop, or organizing them across two desktops, ‘web’ and ‘productivity’. Your computer will execute them as usual, but they are organized in a more convenient way. Container technology likewise organizes workloads in a more convenient way, but it still relies on the classic kernel; no extra layer is needed.

 

Testing theory with a quick real-world test

So that is what we can understand from reviewing the Linux source code.

Now let’s look at a real-world example to demonstrate that there’s no overhead from running within a container. We will compare a network test on physical servers with the same test on a container setup. We’ll use two servers with 25 Gb network cards connected to the same switch. The software I chose is iperf, which can simulate almost any traffic. In my test I will use multicast traffic, which is widely used in financial systems.

On one physical server iperf will be listening for incoming connections:
iperf -s -u -B 226.94.1.1 -i 10

On the other physical server I will start iperf, which will stream traffic at a target bandwidth of 1 Gbit/s:
iperf -c 226.94.1.1 -u -T 1 -t 30 -i 1 -b 1G

The results are presented below:
[ ID] Interval          Transfer     Bandwidth      Jitter    Lost/Total Datagrams
[ 2]  0.00-10.00 sec    1.11 GBytes  957 Mbits/sec  0.013 ms  0/813800 (0%)
[ 2]  10.00-20.00 sec   1.11 GBytes  957 Mbits/sec  0.023 ms  0/813805 (0%)
[ 2]  20.00-30.00 sec   1.11 GBytes  957 Mbits/sec  0.013 ms  0/813805 (0%)
[ 2]  0.00-30.00 sec    3.34 GBytes  957 Mbits/sec  0.027 ms  0/2441478 (0%)
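As a sanity check, the transfer and bandwidth columns can be recomputed from the datagram counts alone. This is a rough sketch under two assumptions that the article does not state: a UDP payload of 1470 bytes per datagram (the iperf 2 default when -l is not given) and iperf's convention of reporting bytes in binary units and bits in decimal units. The function names are ours.

```shell
#!/bin/sh
# Recompute iperf's reported figures from the datagram counts.
# ASSUMPTION: 1470-byte UDP payload (iperf 2 default when -l is not set).
PAYLOAD=1470

# throughput_mbits DATAGRAMS SECONDS -> payload throughput in Mbit/s (10^6 bits)
throughput_mbits() {
    awk -v n="$1" -v s="$2" -v p="$PAYLOAD" \
        'BEGIN { printf "%.0f\n", n * p * 8 / s / 1e6 }'
}

# transfer_gbytes DATAGRAMS -> total payload in GBytes (2^30 bytes)
transfer_gbytes() {
    awk -v n="$1" -v p="$PAYLOAD" \
        'BEGIN { printf "%.2f\n", n * p / (1024 * 1024 * 1024) }'
}

throughput_mbits 813800 10    # first 10-second interval -> 957
transfer_gbytes 813800        # -> 1.11
throughput_mbits 2441478 30   # whole 30-second run -> 957
transfer_gbytes 2441478       # -> 3.34
```

Under those assumptions the recomputed values agree with the reported 957 Mbits/sec, 1.11 GBytes and 3.34 GBytes, so the table is internally consistent.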

The same test was run using a container image with the same version of iperf:

docker run -it --net=host beeksgroup/iperf -s -u -B 226.94.1.1 -i 10
docker run -it --net=host beeksgroup/iperf -c 226.94.1.1 -u -T 1 -t 30 -i 1 -b 1G

You can also use Podman in this example if you want an open source alternative to Docker.

Results on the containers are as follows:

[ ID] Interval          Transfer     Bandwidth      Jitter    Lost/Total Datagrams
[ 2]  0.00-10.00 sec    1.11 GBytes  957 Mbits/sec  0.013 ms  0/813800 (0%)
[ 2]  10.00-20.00 sec   1.11 GBytes  957 Mbits/sec  0.023 ms  0/813805 (0%)
[ 2]  20.00-30.00 sec   1.11 GBytes  957 Mbits/sec  0.013 ms  0/813805 (0%)
[ 2]  0.00-30.00 sec    3.34 GBytes  957 Mbits/sec  0.027 ms  0/2441478 (0%)

As you can see, the bandwidth was exactly the same and the jitter was very similar across all the tests. So the observed performance matched the theory in this case, which is always nice to find!

So we’ve shown that, in our test, containers operating on bare metal hosts can be as performant as applications deployed directly onto the bare metal.

 

From simple deployments to Kubernetes for capital markets

It’s important to note that this simple example focuses on traffic between two containers, both started with net=host so that they use the host’s network stack directly. Our tests were also relatively simple in terms of the traffic profile. Running different profiles, or a more elaborate container setup orchestrated using Kubernetes, might reveal edge cases where the network abstraction imposed by the Kubernetes Container Network Interface (CNI) layer adds measurable overhead. As container deployments get more complex, you also start to find that latency can be introduced by the load balancing technologies that are used to direct incoming traffic to the appropriate container.

Beeks aims to give customers who want to run co-located workloads a full choice of container, pure bare metal, or virtual machine — bringing the flexibility of the cloud to front-office capital markets workloads. In future articles we’ll look at more of the kinds of performance challenges and myths that Beeks helps our internal and external customers with.


	
