Daily Archives: 10/03/2011

Amazon Will Crown the TOP500

The system architecture used for cloud computing provides a powerful environment for web services, but not for the tightly coupled parallel processes required by High Performance Computing (HPC), which renders the vast data centers built for the cloud useless for supercomputing purposes. The paper “Can Cloud Computing reach the TOP500” presents this very well: it shows that on Amazon EC2 the cost of solving a linear system increases exponentially with the problem size, the opposite of what happens on a genuine supercomputer.

Real market demand for running HPC workloads in the cloud led Amazon to offer a new Cluster Compute Instance, which improves the performance of typical HPC applications by an order of magnitude over the previous EC2 instance types. A cluster of these instances processes information at a rate of 41.8 TeraFLOPS and ranks #231 on the TOP500 list as of November 2010. What is really noteworthy is that Amazon released a new Cluster GPU Instance on Amazon EC2 some months later, but the LINPACK benchmark was run without the available GPUs, so the Amazon cluster would likely rank much higher on the TOP500 list. Almost every supercomputer at the top of the list now uses GPUs, so they have become an almost mandatory prerequisite (except for Cray). The available benchmarks show excellent performance, but the real GPU speed-up on a standard benchmark remains unknown.
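To put the 41.8 TeraFLOPS figure in perspective, here is a rough Python sketch of how long a LINPACK-style dense solve would take at that sustained rate. It assumes the usual HPL operation count of 2/3·n³ + 2·n² floating-point operations for an n×n system. The function names and the sample problem size are my own illustrative choices, not values from the TOP500 submission.

```python
def hpl_flops(n: int) -> float:
    """Floating-point operations credited for solving a dense n x n system
    (standard HPL count: 2/3 * n^3 + 2 * n^2)."""
    return (2.0 / 3.0) * n**3 + 2.0 * n**2


def solve_seconds(n: int, sustained_tflops: float) -> float:
    """Estimated wall time, assuming the machine sustains the given rate
    for the whole run (an idealization)."""
    if sustained_tflops <= 0:
        raise ValueError("sustained_tflops must be positive")
    return hpl_flops(n) / (sustained_tflops * 1e12)


if __name__ == "__main__":
    n = 100_000  # hypothetical problem size, chosen only for illustration
    print(f"n = {n}: about {solve_seconds(n, 41.8):.1f} s at 41.8 TFLOPS")
```

The cubic term dominates, so doubling the problem size multiplies the work by roughly eight. That is why the interconnect and memory per node matter so much once the matrix no longer fits comfortably on a few nodes.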

To reach the top of the list, Amazon should provide more RAM per node and much better interconnects than 10 Gigabit Ethernet (e.g. InfiniBand FDR or faster Ethernet). Meanwhile, one thing is for sure: I’m having fun renting the GPGPU clusters from $2.10/server/hour, with each node delivering a TeraFLOP using its dual NVIDIA Tesla M2050 GPUs, and getting immediate access to the performance of an HPC GPU cluster with no upfront investment or long-term commitment.
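As a back-of-the-envelope check on those numbers, the following Python sketch estimates how many GPU nodes would be needed to match a target rate and what that would cost per hour. The price and per-node TeraFLOP figures come from this post. The `efficiency` parameter (the fraction of peak actually sustained on LINPACK) is an assumption of mine, since the real GPU speed-up is unknown, and the function names are hypothetical.

```python
import math

PRICE_PER_NODE_HOUR = 2.10   # USD per server per hour, as quoted in the post
PEAK_TFLOPS_PER_NODE = 1.0   # roughly one TeraFLOP per node, as quoted in the post


def nodes_needed(target_tflops: float, efficiency: float = 1.0) -> int:
    """Nodes required to reach target_tflops if each node sustains
    efficiency * peak. efficiency is an assumed value, not a measurement."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return math.ceil(target_tflops / (PEAK_TFLOPS_PER_NODE * efficiency))


def hourly_cost(nodes: int) -> float:
    """Rental cost in USD per hour for the given number of nodes."""
    return nodes * PRICE_PER_NODE_HOUR


if __name__ == "__main__":
    for eff in (1.0, 0.5):  # ideal case and an assumed 50% of peak
        n = nodes_needed(41.8, eff)
        print(f"efficiency {eff:.0%}: {n} nodes, ${hourly_cost(n):.2f}/hour")
```

This ignores interconnect limits entirely, which is exactly the weak point mentioned above, so treat the result as a lower bound on the node count rather than a prediction.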