OpenMAMDA and the Open Data Model

On April 30, we announced the release of OpenMAMA 2.1 at the Linux Foundation End User Summit at the New York Stock Exchange. This release adds C++ and Java (JNI) bindings to OpenMAMA, Windows support, and numerous bug fixes. It also includes OpenMAMDA (C++ and Java), an object-oriented framework for writing market data applications built on top of OpenMAMA. While OpenMAMA provides a content-agnostic abstraction layer for messaging constructs like publishers and subscribers, OpenMAMDA processes market data messages and presents them through domain-specific objects and interfaces like quotes, trades, and order books. The 2.1 launch and the End User Summit were both great successes. Members of the OpenMAMA community, including myself, Feargal O'Sullivan from NYSE Technologies, Shawn McAllister from Solace Systems, and Scott Parsons from Exegy, delivered several presentations. Many members of the OpenMAMA steering and technical committees from the US and Europe also attended the conference, which allowed us to convene several face-to-face meetings to discuss governance and technical issues. By far the most interesting and lively discussions addressed OpenMAMDA and the Open Data Model.

Around the same time NYSE Technologies started to consider open sourcing MAMA, we commenced work on formalizing the data model with which our feed handlers normalize market data from hundreds of market data feeds. A formal, well-documented data model improves the quality and consistency of feed handler implementations. It also lets users understand more clearly how feed handlers process information, because it defines how raw market data maps to the normalized message flow from the feed handlers. We quickly recognized that to realize the full potential of open sourcing MAMA, the data model, with MAMDA as its reference implementation, must be open as well. OpenMAMA integrates interprocess communication under a single API and allows interoperation between previously incompatible messaging systems; however, the absence of a standard representation for content diminishes much of the value of open communication. Our vision of a community platform that enables open, unified market data deployments with components from multiple vendors therefore requires not only OpenMAMA but also OpenMAMDA and a data model based on a published open standard. Although the NYSE Technologies Open Data Model is a separate initiative from OpenMAMA and OpenMAMDA, the two efforts are intimately related, and the OpenMAMA governance and technical communities are actively involved in the Open Data Model project.

Based on our conversations at the End User Summit, there are several points on which everyone agrees. OpenMAMDA must be the reference implementation of the Open Data Model. OpenMAMDA needs to support multiple, "pluggable" data models and not restrict users to any single data model. The Open Data Model must be extensible, and we must produce a data model and application framework complete enough and powerful enough to support all the market data feeds, asset classes, and so on. Meeting these criteria is a very (perhaps overly) ambitious goal. Some of the more contentious issues include how to deal with semantic information that varies wildly between venues. For example, it is very difficult to normalize security statuses across multiple venues and regions. Moreover, different exchanges and data providers convey market events with different messages, message types, or sequences of messages. The temporal and stateful nature of market data further complicates developing a comprehensive market data model and API. Caching, conflation, computed fields, and reference data muddy the waters further. Les Spiro from Tick42 (formerly DOT) correctly suggests that messages from the exchange do not necessarily correlate one to one with market data events from an OpenMAMDA API perspective; furthermore, he suggests a very flexible approach where API events map to messages (or sequences of messages) based on the message content. Clearly, we will not find a "one size fits all" solution, and the OpenMAMDA implementation, which currently hard-codes many aspects of the current data model, must evolve into a much more flexible solution. While the Data Model Working Group hashes through these complex issues, we also intend to explore concrete data model representation options and API implementation strategies. These proofs of concept will allow us to narrow the scope and focus the larger effort into something more manageable.
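To make the idea of content-based mapping a little more concrete, here is a minimal sketch in Python. It is not OpenMAMDA code and does not reflect any design the working group has agreed on; the class names (DataModel, Event), the message field names (type, sym, side, px, bid, ask), and both example data models are hypothetical and exist only for illustration. It shows two "pluggable" data models producing the same quote event, one from a single message and one from a sequence of two messages.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Message = Dict[str, object]  # hypothetical: a decoded message as a field map


@dataclass
class Event:
    """Hypothetical API-level market data event."""
    kind: str
    symbol: str
    fields: Dict[str, object]


class DataModel:
    """Hypothetical pluggable data model: an ordered list of content-based rules."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._rules: List[Tuple[Callable[[Message], bool],
                                Callable[[Message], Optional[Event]]]] = []

    def add_rule(self, matches, build) -> None:
        self._rules.append((matches, build))

    def map(self, msg: Message) -> Optional[Event]:
        # The first rule whose predicate matches the message content wins.
        # A builder may return None when the event is not yet complete.
        for matches, build in self._rules:
            if matches(msg):
                return build(msg)
        return None


def make_single_message_model() -> DataModel:
    """One message carries the whole quote."""
    model = DataModel("single-message")
    model.add_rule(
        lambda m: m.get("type") == "Q",
        lambda m: Event("quote", m["sym"], {"bid": m["bid"], "ask": m["ask"]}),
    )
    return model


def make_split_quote_model() -> DataModel:
    """Bid and ask arrive separately; the event fires once both are seen."""
    model = DataModel("split-quote")
    pending: Dict[str, Dict[str, object]] = {}  # symbol -> partial quote

    def build(m: Message) -> Optional[Event]:
        partial = pending.setdefault(m["sym"], {})
        partial["bid" if m["side"] == "B" else "ask"] = m["px"]
        if len(partial) < 2:
            return None  # still waiting for the other side
        return Event("quote", m["sym"], pending.pop(m["sym"]))

    model.add_rule(lambda m: m.get("side") in ("B", "A"), build)
    return model


def run(model: DataModel, messages: List[Message]) -> List[Event]:
    """Feed messages through a data model and collect the resulting events."""
    events = []
    for msg in messages:
        event = model.map(msg)
        if event is not None:
            events.append(event)
    return events
```

In this sketch the application only ever sees Event objects, so swapping one data model for another changes the rules but not the consuming code. A real implementation would also have to deal with the caching, conflation, and sequencing concerns mentioned above, which this example ignores.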

One experiment currently under consideration attempts to determine the effort required to adapt OpenMAMDA to a different data model. In this experiment, we plan to pass market data from an Exegy ticker plant through OpenMAMA into OpenMAMDA to test the feasibility of modifying the current version of OpenMAMDA to normalize and process the data properly. How much does OpenMAMDA need to change to produce consistent order books, quotes, trades, etc. given data with a new normalization scheme? This experiment will identify some of the challenges we might face in reinventing OpenMAMDA to work with multiple data models.
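One simple way to picture part of this experiment is as a field-level translation step placed in front of unchanged consumer code. The Python sketch below is purely illustrative: the field names (BestBid, BestAsk, Ticker, bid_price, ask_price, symbol) and the helper functions are hypothetical and are not taken from Exegy, OpenMAMA, or OpenMAMDA. It translates a message from one naming scheme to another and reports the fields that have no mapping, which gives a rough first indication of how much adaptation work remains.

```python
from typing import Dict, List

# Hypothetical mapping from a vendor's field names to the names that
# existing consumer code already expects.
FIELD_MAP: Dict[str, str] = {
    "BestBid": "bid_price",
    "BestAsk": "ask_price",
    "Ticker": "symbol",
}


def translate(msg: Dict[str, object], field_map: Dict[str, str]) -> Dict[str, object]:
    """Rename known fields; pass unknown fields through unchanged."""
    return {field_map.get(name, name): value for name, value in msg.items()}


def unmapped_fields(msg: Dict[str, object], field_map: Dict[str, str]) -> List[str]:
    """List fields the mapping does not cover, in message order."""
    return [name for name in msg if name not in field_map]


def best_bid_ask(msg: Dict[str, object]):
    """Stand-in for consumer code written against the existing field names."""
    return msg["bid_price"], msg["ask_price"]
```

Renaming fields is only the easiest case. Differences in semantics, message types, and message sequences, as discussed above, cannot be resolved with a lookup table, and measuring that gap is the point of the experiment.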

In the next week or two, we will be adding more information around the Open Data Model/OpenMAMDA working group to the web site.