Monday, November 29, 2010

TritonSort benchmark for Indy GraySort

25 years ago, Jim Gray started benchmarking sort performance, and the efforts continue, as sort performance is a wonderful tool for incrementally advancing the state of the art in systems performance. You can read more about the overall sort benchmarking world at the Sort Benchmark site.

One of the 2010 winners, the TritonSort team at UCSD, has posted an overview article about their work on this year's benchmark. Although it doesn't include the complete technical details, the article is still quite informative and worth reading.

One of the particular aspects they studied was "per-server efficiency", arguing that in the modern world of staggeringly scalable systems, it's interesting to ensure that you aren't wasting resources, but rather are carefully using the resources in an efficient manner:

Recently, setting the sort record has largely been a test of how much computing resources an organization could throw at the problem, often sacrificing on per-server efficiency. For example, Yahoo’s record for Gray sort used an impressive 3452 servers to sort 100 TB of data in less than 3 hours. However, per server throughput worked out to less than 3 MB/s, a factor of 30 less bandwidth than available even from a single disk. Large-scale data sorting involves carefully balancing all per-server resources (CPU, memory capacity, disk capacity, disk I/O, and network I/O), all while maintaining overall system scale. We wanted to determine the limits of a scalable and efficient data processing system. Given current commodity server capacity, is it feasible to run at 30 MB/s or 300 MB/s per server? That is, could we reduce the required number of machines for sorting 100 TB of data by a factor of 10 or even 100?
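
The throughput figures quoted above are easy to verify with back-of-the-envelope arithmetic. Here's a quick sketch (the numbers come from the passage; the helper function and server estimates are my own illustration, assuming decimal units where 1 TB = 10^6 MB):

```python
# Sanity-check the per-server throughput figures quoted in the article.

def per_server_mb_s(total_tb, servers, hours):
    """Average per-server throughput in MB/s (1 TB = 10^6 MB)."""
    total_mb = total_tb * 1_000_000
    seconds = hours * 3600
    return total_mb / (seconds * servers)

# Yahoo's Gray sort record: 100 TB, 3452 servers, ~3 hours.
yahoo = per_server_mb_s(100, 3452, 3)
print(f"Yahoo: {yahoo:.1f} MB/s per server")  # ~2.7 MB/s, i.e. under 3 MB/s

# Rough server counts needed to sort 100 TB in the same 3 hours
# at the 30 MB/s and 300 MB/s per-server targets the article poses.
for target in (30, 300):
    servers = 100 * 1_000_000 / (3 * 3600 * target)
    print(f"At {target} MB/s per server: ~{servers:.0f} servers")
```

This confirms the article's framing: roughly 300 servers at 30 MB/s, or about 30 servers at 300 MB/s, versus Yahoo's 3452, which is the factor-of-10-to-100 reduction they set out to explore.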

Their article goes on to describe the complexities of balancing the configuration of the four basic system resources (CPU, memory, disk, and network), and how there continues to be no simple technique that makes this complex problem tractable:

We had to revise, redesign, and fine tune both our architecture and implementation multiple times. There is no one right architecture because the right technique varies with evolving hardware capabilities and balance.

I hope that the TritonSort team will take the time to write up more of their findings and their lessons learned, as I think many people, myself included, can learn a lot from their experiences.
