CoolThreads - Great! Now, How About Some CoolSpin?

If you’ve followed computing in the Life Sciences, you’ll know that during the last ten years there’s been constant talk about the so-called “data explosion”. In the late 1990s, much to our amusement, barely a week would go by without our receiving some marketing letter from a vendor telling us how our data was “literally exploding”… whatever “literally exploding data” looks like! However, while our data was growing, it wasn’t really a big issue. All we had to do was buy a bit more disk. It was incredibly easy to deal with. The truth is - our general storage and CPU needs in biotechnology were nothing compared to the needs of my friends working in the financial sector.

Fast forward to 2006. The times they are a-changin’. Life sciences data may, at last, really be about to errr… explode. We’re currently working on a life sciences project where the volume of data and the computational need really are a bit of a challenge. In fact, if the project scales up to the high end of projected levels, we’re talking about a genuinely huge volume of data. How huge? Well, a bit of context may help here.

Earlier this week, I was meeting with some of the key people in CERN and other particle physics labs to talk about utility / grid computing. CERN is building a compute infrastructure to gear up for the switch-on of what will be the world’s largest scientific instrument - the Large Hadron Collider (LHC) - in 2007. When the LHC goes live, the particle physics community will be generating about 10 Petabytes of data per year. That counts as a huge volume of data, I think you’ll agree. At full-scale, however, our project will be generating about 80% more data than that. Around 18 Petabytes of data per year.
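For anyone who wants to check the sums, here is a rough back-of-envelope sketch in Python. The 10 Petabytes and the 80% come from the figures above; the decimal units (1 PB = 1000 TB), the 365-day year and the function names are my own assumptions for illustration, not anything from the LHC project.

```python
# Back-of-envelope check of the data volumes quoted in the post.
# Assumptions (not from the post): 1 PB = 1000 TB and a 365-day year.

LHC_PB_PER_YEAR = 10   # figure quoted in the post
EXTRA_PERCENT = 80     # "about 80% more", as quoted in the post


def scaled_volume_pb(base_pb, extra_percent):
    """Return base_pb increased by extra_percent percent."""
    return base_pb * (100 + extra_percent) / 100


def tb_per_day(pb_per_year, tb_per_pb=1000, days_per_year=365):
    """Convert a yearly volume in PB to an average daily volume in TB."""
    return pb_per_year * tb_per_pb / days_per_year


if __name__ == "__main__":
    project_pb = scaled_volume_pb(LHC_PB_PER_YEAR, EXTRA_PERCENT)
    print(f"Project at full scale: {project_pb:.0f} PB per year")
    print(f"LHC average:     {tb_per_day(LHC_PB_PER_YEAR):.1f} TB per day")
    print(f"Project average: {tb_per_day(project_pb):.1f} TB per day")
```

Under those assumptions, 18 Petabytes per year averages out at roughly 49 Terabytes of new data every day, which is the number that really matters when you are sizing storage.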

As we discussed our respective approaches to the compute and storage challenges we faced, it rapidly became clear we shared the same philosophy when it comes to CPU and storage for high-performance computing. The particle physics community working on the LHC project is much further ahead in their build-out than we are - we’re just at the start of our project. I was, then, keen to learn whatever lessons I could from their experiences. So, what was one of their biggest surprises in building the grid utility compute facility? The answer: running costs for hard drives.

Now, anyone who’s been involved in building computer grids or server farms in the past will be well aware of the issues surrounding power. That’s why technologies like Sun’s CoolThreads are so potentially useful. However, this addresses only the power consumption of servers, i.e. CPU. That’s great. The problem comes when you start adding large amounts of storage to the system in the form of magnetic disk. The power costs just for hard drive storage for the particle physics guys on the LHC project are of the order of millions of dollars per year. In other words, that’s the cost of electricity simply to keep the disks spinning.

CoolThreads is great - I’m a big fan. However, surely no-one is betting that data needs are going to decrease over time? They surely won’t. It’s now time, then, for the computing hardware companies to think about the power consumption of hard drives. I hereby claim the trademark CoolSpin™ for use in the field of computing! Of course, I’ll sell it for a reasonable price to any hardware company that’s serious about reducing the power costs for storage ;-)
