DB Design Tips

Random DB Design Tips

    On this page I’ll be posting various tips on DB Design and performance. Check frequently as the tips will be periodically archived. To obtain a full archive of tips or query on a specific issue you would like help on kindly email me at the above email under ‘Contact’

Designing Effective Primary Processing Queues

    Today I would like to discuss an overview of design for polling the database for new entries from the main processing thread of an application. When I talk about the main processing thread I am referring to trades in a trading system, licenses in a licensing system and so forth. This is as opposed to resource modification queues that are different in nature – those will be discussed later.

    Let’s take the case of a trading system as an example. Suppose that our system dealt with both equity and fixed income trades or even a myriad of derivative products. Each of these trade types has its own characteristics that are independent of each other (for example equity trades do not have accumulated interest used in settlement, etc). Now suppose that we have a backend application server which needs to process new trades and wants to use the database to poll for new trades. Let’s look at a few alternatives to approach this:

    • Put all trade types into one table and poll new records from that table.
        Pros:

      • Simplicity of design
      • Lack of required table joins to retrieve info

        Cons:

      • Large row size leads to poor retrieval and storage performance
      • Polling for new records intereferes with all trades of all types and all their attributes
      • Increased chance for deadlocking
      • Row size may easily exceed page size hampering additional column expansion
      • Slower polling times as more data to parse
    • Separate all trade types and poll at each individual table.
        Pros:

      • More normalized tables resulting in better performance for queries of all sorts
      • Smaller table sizes means better polling retrieval times

        Cons:

      • Lack of centralized queue forces design of several polling threads without complicated multi-queue resolution logic
      • Trade attribute data is tied with polling mechanism which may not need to read those attributes resulting in degraded performance
  • Separate trade types but have a central small queuing table that joins to each trade table by numeric UID.
      Pros:

    • Small queuing table results in quick polling retrieval
    • Polling queue separation from primary tables results in less interference
    • Optionally queuing table can be distributed into several tables across multiple polling servers

      Cons:

    • Table design may result in an additional join but this can be avoided with proper per-field denormalization

    So as you can see I prefer the final option, although one word of caution here – it is favorable to keep around the polled items by having the polling table be the keeper of a trade’s numeric UID. The benefit here is that the alternative of deleting queue entries as we go along would result in an almost always empty table except during peak volume times. During peak volume times an almost empty table would have poor performance due to outdated data distribution statistics and so become a bottleneck. Rather I suggest rows from the queue table(s) are purged periodically along with the primary entities (trades) themselves.

Addendum:
This article will have a second version with sample table schemas for each of the above cases in the near future. For further explanation of the above or advice as it pertains to your specific case, please fee free to email me as per above ‘Contact’ top bar item.