Comments

(newest first)

  • John F | Mon, 25 Apr 2016 08:30:40 UTC

    M Swain: everything is a cache tuning problem (your physical workstation, the economy). To stretch an analogy, adding SCMs to a system is like using an electric motor instead of an ICE to drive a car: suddenly you don't need a gearbox. Or, more broadly analogous to datacenters, adding an electric motor to an ICE lets you do a setup like the Chevrolet Volt/Vauxhall Ampera, which seems to have a crazy drivetrain but according to user reviews is incredibly reliable. Anyway. I could have commented more directly had I read this closer in time to my university studies, but as it is, please forgive my strained analogies.
  • M Swain | Fri, 08 Jan 2016 04:48:18 UTC

    How is this not a cache tuning problem?
  • Amit Golander | Thu, 07 Jan 2016 20:54:02 UTC

    Interesting article.
     
    For those who want to further investigate the impact of such radically different storage on applications, Plexistor has made its software-defined memory (SDM) available for download. The community edition (CE) is free.
    http://www.plexistor.com/download/  
    
    It is multi-tiered, using persistent memory (or DRAM) as the first tier and flash as the second tier.
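    As a rough, generic illustration of what a two-tier placement policy can look like (a toy sketch with made-up names, not Plexistor's actual interface): hot blocks are promoted into a small fast tier standing in for persistent memory, and the least-recently-used block is demoted to a slow tier standing in for flash when the fast tier fills up.

      /* Toy two-tier placement sketch (illustrative only). */
      #include <stdio.h>

      #define FAST_CAPACITY 2   /* pretend the PM tier holds only 2 blocks */
      #define NBLOCKS       5

      struct block {
          int  tier;            /* 0 = fast (PM), 1 = slow (flash) */
          long last_access;     /* logical clock used for LRU demotion */
      };

      static struct block blocks[NBLOCKS];
      static long clock_now = 0;
      static int  fast_used = 0;

      /* Demote the least-recently-used block currently in the fast tier. */
      static void demote_coldest(void) {
          int victim = -1;
          for (int i = 0; i < NBLOCKS; i++)
              if (blocks[i].tier == 0 &&
                  (victim < 0 || blocks[i].last_access < blocks[victim].last_access))
                  victim = i;
          if (victim >= 0) {
              blocks[victim].tier = 1;   /* block's data would move to flash here */
              fast_used--;
              printf("demote block %d to flash tier\n", victim);
          }
      }

      /* On access, promote a cold block into the fast tier, evicting if needed. */
      static void access_block(int id) {
          clock_now++;
          if (blocks[id].tier == 1) {
              if (fast_used == FAST_CAPACITY)
                  demote_coldest();
              blocks[id].tier = 0;       /* block's data would move to PM here */
              fast_used++;
              printf("promote block %d to PM tier\n", id);
          }
          blocks[id].last_access = clock_now;
      }

      int main(void) {
          for (int i = 0; i < NBLOCKS; i++) blocks[i].tier = 1;  /* all cold */
          int trace[] = {0, 1, 0, 2, 3, 0};
          for (unsigned i = 0; i < sizeof trace / sizeof trace[0]; i++)
              access_block(trace[i]);
          return 0;
      }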
  • brent | Thu, 07 Jan 2016 20:52:09 UTC

    Managing data hazards at a rate of more than 1 million IOPS per DIMM: well done.
    
    Figure 1: Can you please add latency for InfiniBand and Fibre Channel?
    Figure 2: Can you please check whether the orange 'network' data series is relevant?
  • Hans | Thu, 07 Jan 2016 20:09:05 UTC

    If SCMs are so expensive that they need to be utilized to the max to earn their value, and we need to change everything around the SCMs to reach that maximum utilization, then why would an enterprise buy these SCMs? That would not make sense, imho.
    
    If you are building an HPC cluster and you need to wrestle the last cycle and IOP out of your system, then you might want to use SCMs in special configurations to reach that point. But as you mention, typical enterprise applications won't benefit from these types of configurations, as they are not optimized to use special hardware.
    
    For the rest I agree with FredInIT. SSDs were also "way too expensive" to use in the DC a few years ago. So in a few years' time I expect SCMs to also reach a price point where compression and dedup are not that important anymore, and therefore you will use them in your application as NVRAM, not as disk (see the sketch below). The NIC will then use RDMA to transfer data directly between the memory banks of the nodes in a DC. The OS won't see disk anymore (and very little of the NIC :-)), only memory.
    
    My 2cts.
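    A minimal sketch of that "NVRAM as memory, not as disk" model, assuming Linux and a persistent-memory region already exposed as a file at the made-up path /mnt/pmem/data (at least one page in size): the application maps it and updates a record in place, with no read()/write() calls and no serialization step.

      /* Sketch: treat a persistent-memory-backed file as ordinary memory.
       * Assumes /mnt/pmem/data already exists and is at least 4 KiB. On a real
       * DAX mount one would typically use MAP_SYNC (or libpmem) plus CPU cache
       * flushes instead of msync, but MAP_SHARED + msync keeps the sketch portable. */
      #include <fcntl.h>
      #include <stdio.h>
      #include <sys/mman.h>
      #include <unistd.h>

      struct record {            /* application data, stored directly in the mapping */
          long counter;
          char note[56];
      };

      int main(void) {
          int fd = open("/mnt/pmem/data", O_RDWR);
          if (fd < 0) { perror("open"); return 1; }

          struct record *r = mmap(NULL, sizeof *r, PROT_READ | PROT_WRITE,
                                  MAP_SHARED, fd, 0);
          if (r == MAP_FAILED) { perror("mmap"); return 1; }

          /* Work on the data in place: no block I/O path, no copy into DRAM buffers. */
          r->counter++;
          snprintf(r->note, sizeof r->note, "updated in place, run %ld", r->counter);

          if (msync(r, sizeof *r, MS_SYNC) != 0) perror("msync");  /* make it durable */
          printf("counter is now %ld\n", r->counter);

          munmap(r, sizeof *r);
          close(fd);
          return 0;
      }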
    
    
  • Brian Too | Thu, 07 Jan 2016 19:41:25 UTC

    Never forget that in modern computing, system utilization ratios are far from the most important design consideration.  Very far.
    
    Thus, I would support modest redesign and rearchitecting of systems to optimize them. Just be sure to keep the big picture in mind. This is low-level technical work, best performed in conjunction with other needed systems maintenance and development. And when it comes to third-party applications or technology stacks, the relevant system modifications are often out of your responsibility or control.
    
    Spend too much analytical effort on matters your employer does not value, and you will work your way right out of a job. With the best of intentions, of course.
    
  • MikeTheOld | Thu, 07 Jan 2016 19:37:55 UTC

    Interesting article. One other thought: early PCs had CPUs that weren't all that much faster than storage, so some had storage access modes that treated it much like RAM (random disk access in BASIC, for instance, and the TRS-80 Model 100's use of battery-backed static RAM for both storage and processing). NVRAM that can operate at normal RAM speeds could in principle make that possible again, especially for smaller devices. For instance, if the price were to drop enough, why would a phone or smaller device need both flash (used as a disk) and RAM? Just work on the data directly in the NVRAM, with backups to the cloud or other external storage, either scheduled or on demand. Among other things, that should allow for near-instant start, since there would be no copying of data from one form of storage to another.
  • Bruce | Thu, 07 Jan 2016 18:20:28 UTC

    A lot of the issues raised in this article seem to be the same issues that HP (now HPE) is trying to address with "The Machine" project, where the system architecture is supposedly structured around memory rather than around the CPU. Technical details of the implementation are still rather thin, however.
  • FredInIT | Thu, 07 Jan 2016 18:02:32 UTC

    Interesting article. I see how you are trying to use SCM as a means to predict Bell's Law playing out across the industry. Unfortunately, your entire premise rests on SCM remaining more expensive than the rest of the system's components for the foreseeable future. With the deep architecture redesign you are proposing (and running up against the limits described by Amdahl's Law; see the worked example below), the early adopters would be in a 1-3 year range, with wide-ranging adoption in the 3-5 year range. During that time frame SCMs will have had a chance to experience Moore's and Grosch's laws. In other words, by the time you rearchitect the system to support high-priced SCMs, that updated architecture will no longer be needed, because the price of SCM will no longer be a driving force.
    Does this mean that developing architectures to more fully realize the performance gains of SCM is futile? Absolutely not! But the gains are driven by overall system efficiency, not today's relatively high cost of a single component.
    Secondly, you view the cost of storage as dominating data-center design? Um... nope... it's the cost of cooling. Only a very tiny sliver of the electricity consumed is actually used in the computational process; the vast majority is lost as heat. Hence the HUGE efforts on DC bus distribution, hot/cold aisles, hot-ceiling/cold-floor layouts, liquid cooling (which was heavily used in bipolar mainframes), etc., and on making compute units much, much more energy efficient. Along with that, you need to cite your sources for many of the claims in the 'Balanced Systems' section. This is a research paper, not a white paper. All claims need to be verifiable and reproducible.
    Regardless of my pedantic nit-picking of your paper, you guys have the right ideas and are thinking in the right direction long-term regarding improving overall compute capabilities.
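    As a rough worked example of the Amdahl's Law point above (the numbers are made up, not taken from the paper): if storage accounts for a fraction p of an application's run time and SCM speeds that fraction up by a factor s, the overall speedup is

      \[
        S = \frac{1}{(1 - p) + p/s}, \qquad
        p = 0.4,\; s = 100 \;\Rightarrow\; S = \frac{1}{0.6 + 0.004} \approx 1.66,
      \]

    so even a 100x faster storage device buys less than a 2x application speedup unless the surrounding system is rearchitected to shrink the non-storage fraction.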
  • many word such insight wow | Thu, 07 Jan 2016 17:44:54 UTC

    *cough* PC-centrism *cough*, which is itself telling, since that crap was never suitable for serving anything or for filling up datacentres the way it does, not even with all the bolt-ons that are supposed to make the whole contraption "usable".
  • Chris Adkin | Wed, 06 Jan 2016 23:33:51 UTC

    This is one of the most interesting articles on storage I've read. I'd be interested to get the authors' thoughts on how well NVMe addresses the issue of block management within the kernel, on Intel Data Direct I/O as a solution for memory becoming a potential bottleneck, and on the trade-offs of tiering versus having to cater for different code paths in the storage software stack.