SQL Server, Code Name: Hekaton

•November 21, 2012

Microsoft has now announced Hekaton, an in-memory DB which it claims can improve transaction performance in the order of 50 times over SQL Server for some scenarios.

Aside from being able to load tables or the entire DB into memory, Hekaton also achieves higher performance by adopting a different concurrency model from that of SQL Server. SQL Server apps which encounter concurrency challenges due to latch contention or blocking should realise significant benefits from moving to Hekaton: it does not take latches when accessing data, and blocking between readers and writers should be removed by its new form of row versioning, which is used to implement all transaction isolation levels.
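To get my head around why row versioning removes reader/writer blocking, here is a toy sketch in Python. It is purely my own illustration and not how Hekaton is implemented: the names `VersionedStore`, `Transaction` and `WriteConflict` are made up, it is single-threaded, and it only models snapshot-style reads with a "first committer wins" check. The point is simply that a reader picks the row version that was current when it started, so it never waits on a writer.

```python
import itertools


class WriteConflict(Exception):
    """Raised when another transaction committed the same key first."""


class VersionedStore:
    """Toy multi-version store: a write appends a new row version
    instead of overwriting the old one (hypothetical, not Hekaton)."""

    def __init__(self):
        self._clock = itertools.count(1)  # logical timestamps
        self._versions = {}               # key -> [(commit_ts, value), ...]

    def begin(self):
        return Transaction(self, start_ts=next(self._clock))

    def _read(self, key, ts):
        # Return the newest version committed at or before ts.
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None

    def _commit(self, txn):
        # Validate: fail if any written key has a version newer than our snapshot.
        for key in txn.writes:
            versions = self._versions.get(key)
            if versions and versions[-1][0] > txn.start_ts:
                raise WriteConflict(key)
        commit_ts = next(self._clock)
        for key, value in txn.writes.items():
            self._versions.setdefault(key, []).append((commit_ts, value))
        return commit_ts


class Transaction:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}  # buffered until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store._read(key, self.start_ts)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        return self.store._commit(self)
```

A reader that began before a writer committed keeps seeing the old version, and two writers racing on the same row end with the second one getting a `WriteConflict` rather than being blocked.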

Hekaton compiles T-SQL to native code, which effectively means fewer CPU cycles are required to execute the same stored procs.

Some of the target scenarios for Hekaton include scale-up, low latency (could this be appropriate for tick storage?) and ETL.

So from what I’ve read, Hekaton looks pretty interesting. Of course it’s not going to be appropriate for all scenarios, but it’s definitely worth a much closer look.


//build/–The tent, and the goodies

•November 1, 2012

This year //build/ is on campus, which means I get to catch up with many of my old colleagues. Since there are no buildings large enough to house a couple of thousand attendees, Microsoft has erected a huge tent on the football field in the middle of campus… if it wasn’t for the AstroTurf it might have turned out a little like Glastonbury, given how much it’s been raining!


And yes, it’s true: this year Microsoft has been pretty generous. All attendees have been given a Microsoft Slate and a Nokia 920 running Windows Phone 8, thanks Microsoft. For me the timing is pretty good: after trying an iPhone and Android I’ve been considering giving Windows Phone a go, so no excuse now.

//build/–Service Bus for Windows Server

•November 1, 2012

Some of my old colleagues recently shipped Service Bus 1.0 for Windows Server. Microsoft is striving for ‘symmetry’ across its services, whereby the same services are available both in the cloud on Azure and on-premises. This release is also interesting because it targets scenarios where until now only MSMQ or BizTalk were positioned. Service Bus currently supports both queues (durable FIFO queues) and topics (durable publish / subscribe). As well as durable queuing, Service Bus also enables store and forward capabilities using the topic support. Unsurprisingly, Service Bus’s durable queues use SQL Server as their persistence layer, which should make it easier to architect highly available deployments.
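For anyone less familiar with the difference between the two messaging shapes, here is a small in-memory sketch in Python. It is not the Service Bus API and nothing here is durable; the `Queue` and `Topic` classes and their method names are my own, purely to show the semantics. A queue hands each message to exactly one receiver in FIFO order, while a topic copies every message into each subscription, which then holds it until that subscriber gets round to receiving it (the store and forward part).

```python
from collections import deque


class Queue:
    """FIFO queue: each message is delivered to exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Oldest message first; None when the queue is empty.
        return self._messages.popleft() if self._messages else None


class Topic:
    """Publish/subscribe: every subscription gets its own copy of each message."""

    def __init__(self):
        self._subscriptions = {}

    def subscribe(self, name):
        # Each subscription is backed by its own queue, so a slow or
        # offline subscriber simply finds its messages waiting later.
        self._subscriptions[name] = Queue()
        return self._subscriptions[name]

    def publish(self, message):
        for subscription in self._subscriptions.values():
            subscription.send(message)
```

Publishing one message to a topic with two subscriptions leaves one copy waiting in each, whereas two receivers on a plain queue would compete for it.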

Service Bus also now supports the AMQP protocol, which is rapidly growing in adoption.

//build/–HDInsight (Hadoop)

•October 31, 2012

Microsoft has now released HDInsight, its Hadoop distribution. It can be run on Azure or Windows Server, and also as a single-box installation.

It’s all about data – NoSQL

•October 28, 2012

It’s been on my TODO list for a while now to take a proper look at the NoSQL offerings in the market. Building highly scalable architectures means you need to think carefully about your data tier, caching, transaction management and I/O in general. Scaling systems like BizTalk meant you needed a very performant disk subsystem due to the high level of DB I/O. As I’ve mentioned before, this level of durability isn’t always required when building distributed systems: resiliently storing transactions in memory (i.e. on at least 2 nodes) might be sufficient for many applications, and journaling approaches, as in LMAX, are also a good alternative.
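As a rough illustration of the journaling idea, here is a minimal Python sketch. It is a generic append-only journal of my own devising, not LMAX’s implementation; the `JournaledStore` class and the JSON-lines file format are assumptions made for the example. Every change is appended to the journal and synced to disk before the in-memory state is updated, so the state can be rebuilt after a crash by replaying the file.

```python
import json
import os
import tempfile


class JournaledStore:
    """In-memory key/value state backed by an append-only journal
    (hypothetical sketch, one JSON object per line)."""

    def __init__(self, path):
        self.path = path
        self.state = {}
        self._replay()
        self._journal = open(path, "a", encoding="utf-8")

    def _replay(self):
        # Rebuild in-memory state by re-applying every journalled entry in order.
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as journal:
            for line in journal:
                entry = json.loads(line)
                self.state[entry["key"]] = entry["value"]

    def put(self, key, value):
        # Journal first, then apply: sequential appends are cheap for the disk.
        self._journal.write(json.dumps({"key": key, "value": value}) + "\n")
        self._journal.flush()
        os.fsync(self._journal.fileno())
        self.state[key] = value

    def close(self):
        self._journal.close()


# Usage: write two entries, then recover the state from the journal alone.
path = os.path.join(tempfile.mkdtemp(), "journal.log")
store = JournaledStore(path)
store.put("order-1", "new")
store.put("order-1", "cancelled")
store.close()

recovered = JournaledStore(path)
```

The appeal is that reads are served from memory while the only disk work is sequential appends, rather than the random I/O a database page update involves.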

I’ve recently started to look at MongoDB. Under the hood it uses memory-mapped files, so if you throw enough RAM at Mongo it will hold all of the data in memory, making it very performant. There are various options around sharding and replication of data across nodes; the trick then is to ensure that writes are either replicated across nodes or the journal is flushed on write, so that data is not lost. I’ve worked on a few architectures over the last few years where we’ve used a distributed cache in front of a SQL database to significantly improve read performance, as well as the performance of serving up repeated query results. Mongo is interesting since it means putting a distributed cache in front of your database is not required. Of course, Mongo lacks a lot of the sophistication of SQL databases, and many of those gaps have to be filled with client code.
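For context, the cache-in-front-of-SQL arrangement I mention above is usually some form of the cache-aside pattern. Here is a minimal Python sketch under a couple of loud assumptions: a plain dict stands in for the distributed cache, SQLite stands in for the SQL database, and the `CachedQueries` class is a made-up name for illustration. Invalidation is deliberately crude (clear everything on write), which is exactly the sort of thing that gets harder in a real deployment.

```python
import sqlite3


class CachedQueries:
    """Cache-aside: check the cache first, fall back to the database on a miss."""

    def __init__(self, connection):
        self.connection = connection
        self.cache = {}  # stand-in for a distributed cache (assumption)
        self.hits = 0
        self.misses = 0

    def query(self, sql, params=()):
        key = (sql, params)
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        # Miss: run the query against the database and remember the result.
        self.misses += 1
        rows = self.connection.execute(sql, params).fetchall()
        self.cache[key] = rows
        return rows

    def invalidate(self):
        # Crude invalidation: drop everything whenever the data changes.
        self.cache.clear()


# Usage: the second identical query is served from the cache.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (symbol TEXT, qty INTEGER)")
conn.execute("INSERT INTO trades VALUES ('MSFT', 100)")
queries = CachedQueries(conn)
first = queries.query("SELECT qty FROM trades WHERE symbol = ?", ("MSFT",))
second = queries.query("SELECT qty FROM trades WHERE symbol = ?", ("MSFT",))
```

With a store that keeps its working set in memory, this extra layer (and the invalidation headaches that come with it) may not be needed.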

A colleague of mine pointed me to this comparison, which is a good overview of many of the main NoSQL offerings. NoSQL is not going to be appropriate for all scenarios, but the great thing is it’s causing more and more people to question and challenge traditional approaches to building distributed systems.

Algo Wars–Massive Quote Stuffing or Just a Production Test …

•October 26, 2012

CNBC reported another interesting series of events in the weird world of algos. An apparently new algo made up some 4% of the quote traffic during the first week of October, reportedly using up 10% of the capacity of the US stock market on October 8th. But the really interesting bit… it didn’t make a single trade; all of the orders were cancelled.

This appears to be either quote stuffing on a huge scale or perhaps an organisation that, trying to avoid another Knight Capital event, decided it needed production testing!

Nanex has some of the research around the event.

Aerospike – Real-time NoSQL DB

•August 29, 2012

Aerospike’s real-time NoSQL DB looks interesting. In their own words: “Aerospike, Inc. offers the only real-time NoSQL database and key-value store that delivers predictable high performance for mission-critical, Web-scale applications. Aerospike’s flash-optimized, shared-nothing architecture scales linearly, consistently processing over 200k transactions per second per node with sub-millisecond latency. With automatic fail-over, replication, and cross data center synchronization, the Aerospike database reliably stores billions of objects and terabytes of data—while providing 100% uptime and a 10x improvement in TCO over other NoSQL databases.”