Thursday, September 11, 2008

London Stock Exchange suffers .NET Crash

It should have been a great day on the London Stock Exchange. The U.S. government had announced on the Sunday before that it was coming to the rescue of Freddie Mac and Fannie Mae. Trading would have been extremely brisk, but then, at 9:15 AM GMT, the Exchange's software failed due to "connectivity issues." Six hours and 45 minutes later, the London Exchange, along with the Johannesburg Stock Exchange, which uses the LSE's trading platform TradElec, was finally back up.

That was no consolation to traders. As one trader told Reuters, "We have the biggest takeover in the history of the known world ... and then we can't trade. It's terrible."

So what happened? Officially, the LSE first said, "We will be investigating this and will do everything we can to make sure this doesn't reoccur." Later, the LSE gave the vague explanation that "It was software-related, a coincidence, due to two processes we couldn't have foreseen," and not caused by high volume. The spokesperson added, "We've introduced a fix and we're confident it will not happen again."

Somehow "we couldn't have foreseen" and "we're confident it will not happen again" don't fit very well together.

So what really happened? I doubt we'll ever get a detailed, nitty-gritty explanation, but I have friends in London and... Well, let me just make the following points about TradElec. First, TradElec runs on more than 100 HP ProLiant servers in several locations in London. These servers are running Windows Server 2003.

On top of this runs the TradElec software itself. This is a custom set of C# and .NET programs, created by Microsoft and Accenture, the global consulting firm. Its back-end databases, believe it or not, run on Microsoft SQL Server 2000. The goal was to maintain sub-ten-millisecond response times. In short, it's meant to be a real-time system.

The programmers and serious database administrators in the audience can already see where this is going. Sorry, Microsoft, the .NET Framework is simply incapable of performing this kind of work, and SQL Server 2000, or any version of SQL Server really, can't possibly handle the transaction load of the world's number-three stock exchange on a consistent basis.

For ages, I'd been hearing from friends who trade on the LSE about how slow the system could get. Now I know why.

What I find really amazing is that the LSE's software stack hadn't blown its top earlier. Even setting aside my feelings for Linux, there's simply no way I'd recommend Server 2003, .NET and SQL Server for a job even a tenth this size. If a customer of mine insisted that they didn't want open source - more fool them - I'd recommend Sun Solaris, JEE (Java Enterprise Edition) and Oracle; or IBM AIX or z/OS, WebSphere and DB2.

What I'd really prefer to see is RHEL (Red Hat Enterprise Linux), JBoss, and MySQL or Oracle; or Novell's SLES (SUSE Linux Enterprise Server), JEE, and, again, MySQL or Oracle for the DBMS engine. In any case, though, the real moral of this story is that if you really want HA (high availability) or HPC (high performance computing), Microsoft's products should be at the bottom of your list. Unix, mainframes, and, yes, Linux are far, far better for companies that need fast and reliable computing.

You don't have to believe me though. The New York Stock Exchange has already started to use Linux on its servers.
