Posts

  • Optimizing low-latency trading systems

    Modern hardware coupled with modern compilers allows software developers to write optimized programs with little effort. For this reason, developers are discouraged from micro-optimizing the source code by default. However, recent trends indicate that the growth in the general-purpose single-core performance has slowed down significantly, as can be observed in the figure below. As a consequence, there is now more pressure to optimize at the software layer. While compilers do help with most software optimizations, some areas require developers to open the hood to go above and beyond what compilers offer out of the box.

  • x32 ABI and optimizing memory-intensive applications

    Recall that a hash map, also called an unordered map, is implemented using a hash table data structure. The hash table data structure is simply an array that is indexed by keys which typically go through a hash function. These indices potentially collide, and chaining is a common way to handle these collisions. Each element of the hash table array is a linked list, and the collisions are resolved by simply appending the key-value pair to the linked list forming a bucket. The idea here is that there are a constant number of items in the bucket on average, ideally not more than one, and that property results in constant-time lookup, insert and delete operations on the map.

  • Minimizing redundant network reads

    In the context of trading, the market data represents the state of the exchange. The data provides information related to the currently available orders on the exchange for each security. This data is common to all trading clients connected to the exchange. For this reason, maintaining a TCP connection with each client is expensive since the same data would then be redundantly sent over the network to each client, thus hogging the bandwidth. Hence, stock exchanges typically send market data over multicast UDP. The UDP protocol is not reliable, however. The exchanges provide a recovery service through which clients can request the dropped packets over a more reliable connection backed by TCP. These connections are rate-limited, however, since the exchange does not expect too many packet drops.

  • Marius Script

    The task here is to develop a tool to simplify development on a state-of-the-art graph neural network designed at Madison called MariusGNN. To begin, graph neural networks are a more recent family of deep neural network architectures that are an amazing fit for graph style data. The book on graph representation learning provides an amazing introduction to the topic. In short, graph neural networks (GNNs for short) use a combination of message passing, neighborhood sampling, feed forward networks, and wisdom from deep neural networks to train embeddings of graph nodes. GNNs can be used for various purposes but can be mainly catergorized into node classification, edge prediction, and graph global classification. The arXiV paper on MariusGNN gives an amazing explanation on the same.