Over the past few years, a number of storage technologies and markets have been taking shape. Recently there have been more and more reports of enterprise storage systems adopting storage-class memory (SCM). Unlike in the past, SCM is no longer used only as a read-write cache; it is also being used in the persistent storage layer. So how much impact will this technology have on the storage industry?

Ideally, SCM is a new storage technology with speed comparable to DRAM at a cost close to that of a traditional hard disk. In practice, only its read speed currently approaches DRAM; write speed still lags behind. Moreover, now that the unit cost of SSDs is approaching that of traditional hard disks, SCM does not yet offer enough cost-effectiveness to serve as the underlying bulk storage device. Even in existing all-flash array (AFA) systems built to extract the utmost performance from NVMe SSDs, the latency contributed by the software stack itself can no longer be ignored.

Compared with SSDs, SCM media cuts access latency by several orders of magnitude (from hundreds of microseconds down to hundreds of nanoseconds), which makes software-stack latency even more prominent. The traditional software stack from application to kernel is cleanly decomposed into functional layers, which suits slow storage media, but for ultra-fast media such as SCM it becomes the bottleneck. For the same reason, network latency becomes the dominant contributor to end-to-end latency in an SCM system, so building a fast, stable network is the key to fully exploiting SCM media performance.
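
To make the contrast concrete, here is a minimal sketch of kernel-bypass access on Linux: a file on a DAX-enabled persistent-memory mount is mapped into the address space and accessed with plain loads and stores, skipping the block layer and page cache that a read()/write() path would traverse. The /mnt/pmem path and file name are assumptions for illustration, not a real deployment.

```c
/* Minimal sketch: byte-addressable access to SCM via mmap on a
 * DAX-mounted file, instead of going through the kernel block
 * stack with read()/write(). Assumes /mnt/pmem is a hypothetical
 * DAX-enabled mount (e.g. ext4 -o dax on a pmem device). */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE (4 * 1024 * 1024)

int main(void)
{
    int fd = open("/mnt/pmem/scratch", O_CREAT | O_RDWR, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, REGION_SIZE) != 0) { perror("ftruncate"); return 1; }

    /* With DAX, this mapping points straight at the SCM media:
     * the loads and stores below do not enter the kernel at all. */
    char *pmem = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (pmem == MAP_FAILED) { perror("mmap"); return 1; }

    strcpy(pmem, "written with a store, not a syscall");

    /* msync provides a persistence point without a full
     * block-layer round trip per access. */
    msync(pmem, REGION_SIZE, MS_SYNC);

    printf("%s\n", pmem);
    munmap(pmem, REGION_SIZE);
    close(fd);
    return 0;
}
```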

Storage-class memory retains its contents across power loss like NAND flash while approaching DRAM speed, which makes it a preferred high-speed storage medium in place of flash. Because of flash's inherent design, SCM has a clear advantage here. One of the biggest sources of flash performance problems and latency is the garbage collection needed to make room for new writes: when data is written to a flash drive, the old data cannot be overwritten in place. The drive must write the new data to a fresh block elsewhere and mark the old copy invalid, erasing it later in a step that can stall disk I/O.
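
The out-of-place write behavior described above can be illustrated with a toy flash translation layer: an overwrite never touches the old page; it claims a fresh one, remaps the logical address, and leaves the stale copy for garbage collection. All sizes and structures below are simplified for illustration and are not a real FTL.

```c
/* Toy flash translation layer showing why flash cannot overwrite
 * in place: an overwrite claims a fresh page, remaps the logical
 * address, and invalidates the old page. Garbage collection must
 * later erase blocks full of invalid pages to reclaim space. */
#include <stdio.h>

#define NUM_PAGES   8
#define UNMAPPED   -1

enum page_state { FREE, VALID, INVALID };

static enum page_state state[NUM_PAGES];
static int l2p[NUM_PAGES];          /* logical -> physical map */
static int next_free = 0;

static int write_logical(int lpn)
{
    if (next_free == NUM_PAGES)     /* out of free pages: GC needed */
        return -1;
    if (l2p[lpn] != UNMAPPED)       /* overwrite: old page can't be reused */
        state[l2p[lpn]] = INVALID;  /* it is only marked stale */
    state[next_free] = VALID;       /* new data lands somewhere else */
    l2p[lpn] = next_free++;
    return l2p[lpn];
}

int main(void)
{
    for (int i = 0; i < NUM_PAGES; i++) l2p[i] = UNMAPPED;

    write_logical(3);               /* first write of logical page 3 */
    write_logical(3);               /* overwrite: old copy invalidated */
    write_logical(3);

    int invalid = 0;
    for (int i = 0; i < NUM_PAGES; i++)
        if (state[i] == INVALID) invalid++;

    /* Three writes of one logical page consumed three physical
     * pages; two now sit invalid until GC erases the block. */
    printf("physical pages used: %d, awaiting GC: %d\n", next_free, invalid);
    return 0;
}
```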

Byte-addressable nonvolatile storage over NVMe/PCIe opens a new chapter in storage architecture innovation. SCM is typically used as an extended cache or as persistent storage for the highest-performance tier, so in most cases it is positioned to complement NAND rather than replace it. HPE announced that it would use Intel’s Optane as an extension of the DRAM cache, and HPE 3PAR 3D Cache test data shows latency reduced by 50% and I/O increased by 80%.

The goal and potential of SCM technology lie in bridging the read-write speed gap between DRAM and SSD. In modern information systems, the performance gap between internal devices wastes a great deal of power, and the time spent moving data back and forth has become the weak link in overall performance. This is why registers and caches sit between the processor and memory, and why SCM is being introduced as a memory buffer or SSD cache to address the same problem.

In general, SCM has two uses: caching and persistent storage. ① HPE uses SCM as a cache. From an implementation standpoint, building a read-write cache this way is relatively simple. Reportedly, 3PAR and Nimble can keep latency at 300 μs or less, with most I/O latency staying under 200 μs. With Optane as a cache in 3PAR, HPE claims latency is halved compared with before, and is said to be 50% faster than Dell EMC’s PowerMax with NVMe SSDs.
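
As a rough sketch of why the read-cache approach is comparatively simple: look up a fast SCM tier first, fall back to NAND on a miss, and fill the cache on the way back. The tier functions and sizes below are stand-ins for illustration, not any vendor's implementation.

```c
/* Minimal sketch of the SCM-as-read-cache pattern: check a small
 * fast tier first, fall back to the NAND tier on a miss, and
 * populate the cache on the way back. Tier access is simulated. */
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 64
#define BLOCK_SIZE  4096

struct slot {
    long tag;                       /* which block is cached, -1 = empty */
    char data[BLOCK_SIZE];
};

static struct slot scm_cache[CACHE_SLOTS];

/* Stand-in for a slow NAND read (hundreds-of-microseconds class). */
static void nand_read(long block, char *buf)
{
    memset(buf, (int)(block & 0xff), BLOCK_SIZE);
}

static void cached_read(long block, char *buf)
{
    struct slot *s = &scm_cache[block % CACHE_SLOTS];   /* direct-mapped */
    if (s->tag == block) {                              /* SCM hit: fast path */
        memcpy(buf, s->data, BLOCK_SIZE);
        return;
    }
    nand_read(block, buf);                              /* miss: go to NAND */
    s->tag = block;                                     /* fill the SCM tier */
    memcpy(s->data, buf, BLOCK_SIZE);
}

int main(void)
{
    for (int i = 0; i < CACHE_SLOTS; i++) scm_cache[i].tag = -1;

    char buf[BLOCK_SIZE];
    cached_read(42, buf);           /* miss: served from NAND, then cached */
    cached_read(42, buf);           /* hit: served from the SCM tier */
    printf("block 42 first byte: 0x%02x\n", (unsigned char)buf[0]);
    return 0;
}
```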

② Most SCM today is used as a cache, but unlike HPE, Dell EMC’s PowerMax uses SCM as a storage tier. PowerMax connects to servers over low-latency NVMe-oF, and with SCM behind it, data access is faster still. In the PowerMax implementation every port is fully utilized: each port has its own independent queues and can handle more I/O, and PowerMax can provide separate queues for reads, writes, small-block requests, large-block requests, and other workload classes.
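
A minimal sketch of this per-class queueing idea: classify each request by direction and transfer size, then route it to a dedicated queue so small reads never wait behind large writes. The class names and the 64 KiB threshold are illustrative assumptions, not PowerMax internals.

```c
/* Sketch of per-class I/O queueing: each request is routed to a
 * dedicated queue by direction and transfer size. Only queue
 * depths are tracked here; a real system would dequeue and
 * service each class independently. */
#include <stdio.h>
#include <stdbool.h>

#define SMALL_IO_LIMIT (64 * 1024)  /* illustrative small/large cutoff */

enum io_class { READ_SMALL, READ_LARGE, WRITE_SMALL, WRITE_LARGE, NUM_CLASSES };

static const char *class_name[NUM_CLASSES] = {
    "read/small", "read/large", "write/small", "write/large"
};

static int queue_depth[NUM_CLASSES];    /* one queue per class */

static enum io_class classify(bool is_write, size_t bytes)
{
    if (is_write)
        return bytes <= SMALL_IO_LIMIT ? WRITE_SMALL : WRITE_LARGE;
    return bytes <= SMALL_IO_LIMIT ? READ_SMALL : READ_LARGE;
}

static void submit(bool is_write, size_t bytes)
{
    queue_depth[classify(is_write, bytes)]++;
}

int main(void)
{
    submit(false, 4096);            /* small read */
    submit(false, 1 << 20);         /* 1 MiB read */
    submit(true, 8192);             /* small write */
    submit(true, 4 << 20);          /* 4 MiB write */
    submit(false, 512);             /* another small read */

    for (int c = 0; c < NUM_CLASSES; c++)
        printf("%-12s queue depth: %d\n", class_name[c], queue_depth[c]);
    return 0;
}
```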

Moreover, different ports can be assigned to different controllers, which makes RAID more efficient. For example, when a failed disk needs to be rebuilt, both controllers can participate at the same time. A single controller used to take about 7-8 hours to rebuild a 7,200 RPM hard disk, while dual-controller operation takes only about 2.5 hours, roughly a threefold speedup (7.5 h / 2.5 h = 3).
