Back in 2020, Microsoft announced that a crucial part of the Xbox Series X's Velocity Architecture - DirectStorage - would be coming to PC. Since then, Microsoft has been fairly quiet about this feature, other than announcing a limited developer preview last summer. In that announcement, Microsoft told us that the DirectStorage SDK is compatible with Windows 11 and Windows 10 version 1909 and newer. They stated that GPU asset decompression would be available in a later preview. Finally, they mentioned a new and improved storage stack in Windows 11 that would allow DirectStorage to perform more efficiently than on Windows 10 (since then, we have covered this new Windows 11 storage stack in our articles on I/O Rings and BypassIO). At this time, DirectStorage is still under NDA. However, we will take a deeper dive into this technology, as well as the workload we can expect in next-gen games.

It wasn't too long ago that SSDs were relegated to a small subset of high-end PCs. The majority of gamers, be it on console or PC, were still gaming on mechanical hard disk drives, and as such, game developers were primarily targeting these systems as a baseline. This meant that games were designed for slow drives, using low queue depths with block sizes between 4kB and 16kB, resulting in long loading screens and the occasional streaming-related stutter. Over the years, developers have started to use block sizes of 32kB and beyond more frequently for more efficient loading of assets, but by and large, they were still heavily limited by antiquated storage stacks that would create significant CPU bottlenecks and would not allow asynchronous reads.

Microsoft DirectStorage aims to provide more efficient loading of assets from an NVMe SSD to the GPU with a multi-faceted approach:

  1. It will support a batch I/O submission system to allow for asynchronous reads. Advanced storage devices like NVMe SSDs require a high queue depth, as opposed to serialized requests that we have had until now, in order to achieve their potential bandwidth.
  2. GPU based asset decompression. GPUs are far faster and more efficient when dealing with the decompression of game assets than the CPU.
  3. A new and improved storage stack in Windows 11. The new I/O Rings API will play an important role in the batch I/O request system mentioned in point #1, and BypassIO will reduce the CPU overhead when performing reads from disk. These are both Windows 11 exclusive features, so it remains to be seen how everything will be handled in Windows 10.
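
To make the difference between serialized requests and batch submission from point #1 more concrete, here is a minimal sketch in Python. This is not the DirectStorage API, which is still under NDA. A thread pool stands in for the batch submission queue, and the names `read_block`, `read_serial`, `read_batched`, and `make_test_file` are hypothetical helpers invented for this illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 64 * 1024  # 64kB, within the block size range discussed in this article


def read_block(path, offset, size=BLOCK_SIZE):
    """Read one block; each call opens its own handle so calls can overlap."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def read_serial(path, offsets):
    """Serialized requests: the next read is issued only after the previous
    one has completed, so at most one request is ever outstanding."""
    return [read_block(path, off) for off in offsets]


def read_batched(path, offsets, queue_depth=16):
    """Batch submission: every request is handed over up front and up to
    queue_depth reads are in flight at once. Results keep request order."""
    with ThreadPoolExecutor(max_workers=queue_depth) as pool:
        futures = [pool.submit(read_block, path, off) for off in offsets]
        return [f.result() for f in futures]


def make_test_file(num_blocks=32):
    """Hypothetical helper: write a scratch file where block i is filled
    with the byte value i % 256, so reads are easy to verify."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_blocks):
            f.write(bytes([i % 256]) * BLOCK_SIZE)
    return path
```

Both functions return the same data. The only difference is how many requests the drive gets to see at once. A small scratch file served from the operating system's cache will not show a meaningful speed difference, so treat this as an illustration of the submission pattern rather than a benchmark.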

Microsoft has been working closely with GPU vendors when it comes to the asset decompression side of things. According to a presentation on the Microsoft Game Stack YouTube channel from last year, the GPU based asset decompression will be handled through the DirectCompute API. NVIDIA has already outlined their solution - called RTX IO - which will allow reads through DirectStorage to remain compressed while being transferred to the GPU for decompression.

[Image: NVIDIA's RTX IO announcement graphic for the GeForce RTX 30 series]

Intel has also been working closely with Microsoft on this technology, and in their presentation at GDC 2021, Intel outlined how DirectStorage not only makes asset streaming more efficient, but also makes things much easier on developers by replacing hundreds of lines of code.

Another partner on this project has been Phison. The large manufacturer of NAND flash controllers - who is set to release their new PCIe 5.0 controller this year - has been working closely with Microsoft to develop this new technology. At CES 2022 last month, Phison had this to say regarding the future of gaming:

[Image: slide from Phison's CES 2022 presentation]

The last part in particular is key. High queue depth random reads of large block sizes from 32kB to 64kB, and possibly even 128kB, will play a crucial role in maximizing the performance of NVMe SSDs and keeping GPUs fed with data in order to allow for larger and more detailed game worlds. We hope gamers and SSD reviewers alike keep this in mind going forward. It's easy to look at the maximum sequential read speeds of an SSD and be amazed by the 7 GB/s that a PCIe 4.0 drive, or the 14 GB/s that a PCIe 5.0 drive, could offer. But those workloads are seldom seen in practice. Gaming workloads are primarily random in nature, and the block sizes are far larger than 4kB. We hope to see publications that review SSDs take into consideration random reads of block sizes from 32kB to 128kB. Too many are currently content to only write about a drive's sequential or random 4K performance, and while random 4K performance is important for many day-to-day tasks, it serves little purpose for gamers, who are among the top consumers of SSDs. Microsoft says that DirectStorage will be optimized for both sequential and random workloads.
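
A quick back-of-the-envelope calculation shows why block size matters so much here. The sketch below works out how many random reads per second (IOPS) a drive would have to sustain to reach a given bandwidth at a given block size. It assumes decimal gigabytes for bandwidth and 1024 bytes per kB for block sizes, and `required_iops` is a name made up for this example, not something from any SDK.

```python
KB = 1024           # bytes per "kB" of block size (assumption for this sketch)
GB = 1_000_000_000  # bytes per GB of bandwidth, as drive vendors quote it


def required_iops(bandwidth_gb_s, block_kb):
    """Random-read operations per second needed to sustain the given
    bandwidth (in GB/s) when every read is block_kb kilobytes."""
    return bandwidth_gb_s * GB / (block_kb * KB)
```

Under these assumptions, saturating a 7 GB/s drive with 4kB random reads would take roughly 1.7 million IOPS, while 64kB reads would need only about 107 thousand. That gap is why random 4K figures tell us little about how a drive will cope with the larger blocks that games are expected to request.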

Another interesting thing to note about Phison's presentation is their statement that developers will start relying on NVMe SSDs to deliver a fast and consistent stream of data to GPUs in 2022. Could this imply that we may start seeing games that support DirectStorage this year? That is difficult to say, because while developers have had access to the DirectStorage SDK since last summer, it is not yet finalized. Back when Microsoft released the SDK as a limited developer preview last July, they mentioned that the GPU decompression aspect would be available at a later date. No announcement of this addition to the SDK has been made yet. However, Microsoft recently posted a job listing for a Senior Software Engineer to join their DirectStorage team. The position seems to have now been filled.

The DirectStorage team delivers world-class graphics and IO technologies for Windows and Xbox. Help shape the future of gaming as new, high speed, storage devices become available. We are looking for engineers with experience and a strong desire to build cutting edge graphics and storage systems.

Responsibilities

  • Software development work in a small team in your area of specialty on functionality, performance and quality
  • Engage directly with game developer partners to identify and fix specific issues, gather feedback, and gain insight into future product development
  • Contribute to SDK components that enable game developers to take full advantage of the available hardware
  • Be a key stakeholder for quality, documentation and tooling

Preferred Qualifications

  • Experience in game development
  • Experience with compression technologies

It looks like Microsoft is still working on finalizing the asset decompression part of the SDK, but given that it is so early in the year, odds are good that they can at least get this into the hands of developers sometime in 2022, as Phison stated in their presentation.

Until then, make sure to check out our article on the Sampler Feedback Streaming demo that we ran on our test system. SFS is a very exciting technology that aims to allow hundreds of GBs of assets to be displayed on current consumer GPUs by greatly reducing memory requirements, and alongside DirectStorage, will allow for the design of massive game worlds that are rich with detail. There are some interesting discoveries we made when running the demo. You can also check out the video below:

Stay in touch with Compusemble

To stay up to date on tech news, as well as to see our scores for PC components such as GPUs, CPUs, and SSDs, visit our site and follow us on Twitter.

Visit our YouTube channel for all your tech and gaming content needs.