Martin Fink explores why computers spend most of their time shuffling data between different storage types, and why HPE's Memristor memory is the ideal vehicle for universal memory.
Four hours and 31 minutes. That’s how much longer my HP Folio 1040 laptop estimates I can work given the energy stored in its battery. Where does all that energy go? My CPU meter shows only slight activity. I’m only running a couple of apps and I’m not typing that fast. It turns out that today’s computers spend most of their time and energy shuffling data between tiers of storage and memory. In modern systems, this hierarchy can be more than 10 layers deep. In geek land, we call this the “memory hierarchy.” We’re all so used to this that few of us ever think to question it. But on the face of it, this is an odd way to go about computing. Why not hold all your data in main memory all of the time? In this post, I want to take a look at how we came to work the way we do, and what comes next.
Why do we have a hierarchy?
It’s a question of scarcity. To keep up with the processor you need the fastest memory possible, and since the 1970s the fastest memories have required continuous power. Computers have always been built with as much fast memory as a user can afford, with the required capacity coming from cheaper, but slower, technologies. The memory hierarchy evolved because fast memory is expensive both to buy and to run. A primary task of an operating system is to manage this hierarchy, delivering the right data to applications on demand and filing away the results. Historically, this hierarchy was a brilliant way of achieving the price/performance combination users needed, and it has worked for decades. However, we believe we’re reaching a point where the memory hierarchy is holding us back. Today, scientists, mathematicians and economists spend their careers working out how to perform their calculations instead of doing their actual work. They are forced to translate simple equations into complex parallel processing tasks because we can’t afford to buy, run or efficiently program computers with enough horsepower to accomplish what we need.
The top of the hierarchy—fast but expensive and volatile
The hierarchy of memory comprises three major layers. SRAM is used for on-chip cache memory, DRAM is used for main memory and mass storage is provided by Flash and hard disk drives. Let’s start by looking at the first two layers:
- The memory that shares silicon with the microprocessor is static RAM (SRAM). Each bit is stored in a circuit of (usually) six transistors. Speed is paramount because SRAM has to keep up with the gigahertz pace of the microprocessor. The problem is that SRAM cells can take up most of the space on the chip (the most expensive real estate on the planet!). They’re also the most difficult transistors to run reliably at low voltages and high frequencies, making them expensive to design and fabricate.
- Dynamic RAM (DRAM) stores each bit as electric charge in a capacitor. The problem is that a DRAM capacitor is a leaky bucket of electrons. You have to keep refilling the capacitor every few milliseconds or the data will be lost. This wastes both time—you can’t access data while a refresh is in progress—and power. As DRAM cells scale down, these twin problems get progressively worse.
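The refresh cost in the second bullet is easy to put a rough number on. The sketch below estimates the fraction of time a DRAM device is busy refreshing; the constants (8,192 refresh commands per 64 ms window, roughly 350 ns per refresh) are typical DDR4-era ballpark assumptions, not the specifications of any particular part:

```python
# Back-of-the-envelope DRAM refresh overhead.
# All constants are ballpark, DDR4-era assumptions, not vendor specs.

REFRESH_WINDOW_S = 64e-3   # every row must be refreshed within ~64 ms
REFRESH_COMMANDS = 8192    # refresh commands issued per window
REFRESH_TIME_S = 350e-9    # device busy time per refresh command (~tRFC)

def refresh_overhead(window=REFRESH_WINDOW_S,
                     commands=REFRESH_COMMANDS,
                     t_rfc=REFRESH_TIME_S):
    """Fraction of time the DRAM is unavailable because it is refreshing."""
    return commands * t_rfc / window

if __name__ == "__main__":
    pct = refresh_overhead() * 100
    print(f"~{pct:.1f}% of the time is spent refreshing")  # ~4.5%
```

A few percent may not sound like much, but it is pure overhead that grows as densities rise, which is exactly the scaling problem described above.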
The bottom of the hierarchy—slow but cheap and permanent
The final layer of our memory hierarchy retains information in the absence of power:
- Flash is increasingly used for at least part of mass storage today. It’s much quicker than a hard drive but still very slow compared to DRAM. Flash is slow because data must be read and written in large pages, and erased—“flashed”—in even larger blocks before it can be rewritten. It’s like picking up a dictionary when you only want a single word. This speed limitation isn’t a problem today because of the memory hierarchy—we have SRAM and DRAM to do the rapid work.
- Hard disk drives are used for the bulk of mass storage today. Although Flash is catching up, hard drives still offer the lowest cost-per-bit this side of magnetic tape. But they’re glacially slow and energy-inefficient. Their performance is also inconsistent. If two blocks of data happen to be next to each other, it’s not too bad. But if the blocks are far apart, millions of clock cycles can be wasted while the read/write head drags itself across the platter. Newton still matters, and F=ma still means that moving drive heads and rotating platters burns energy.
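To see why those wasted cycles pile up, it helps to express each tier’s access latency in CPU clock ticks. The sketch below uses commonly quoted ballpark latencies (1 ns SRAM, 100 ns DRAM, 100 µs flash read, 10 ms disk seek) at an assumed 3 GHz clock; none of these figures describe a specific product:

```python
# Rough latency ladder for the memory hierarchy, expressed in CPU cycles.
# Latencies are ballpark assumptions, not measurements of any device.

CLOCK_HZ = 3e9  # assume a 3 GHz processor

LATENCY_S = {
    "SRAM cache": 1e-9,    # ~1 ns on-chip access
    "DRAM":       100e-9,  # ~100 ns main-memory access
    "Flash":      100e-6,  # ~100 us NAND page read
    "Hard disk":  10e-3,   # ~10 ms seek plus rotation
}

def cycles_lost(latency_s, clock_hz=CLOCK_HZ):
    """How many clock ticks the CPU waits for one access at this tier."""
    return round(latency_s * clock_hz)

if __name__ == "__main__":
    for tier, latency in LATENCY_S.items():
        print(f"{tier:10s} {cycles_lost(latency):>12,} cycles")
```

At these assumed figures a single disk seek costs roughly 30 million cycles, seven orders of magnitude more than an on-chip cache hit, which is why random access to spinning media wastes millions of clock cycles per block.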