Alibaba Cloud Storage Wins the Best Paper Award at the Top Global Conference FAST


Good news to share:

At the recent global storage summit FAST 2026, Alibaba Cloud, in collaboration with Shanghai Jiao Tong University and Solidigm, presented a new paper that once again won the Best Paper award. This is the third time Alibaba Cloud Storage Research has received this highest honor in the international academic community over the past four years.

The paper systematically reviews the “three generations of evolution” of storage architecture and introduces Latte, a brand-new cloud-integrated storage architecture. Combining Alibaba Cloud’s extensive engineering experience, it offers a new paradigm to break the “impossible triangle” of storage performance, cost, and reliability.

Below is an in-depth analysis of the Best Paper titled “Here, There and Everywhere: The Past, the Present and the Future of Local Storage in Cloud.”

In the cloud-native era, local storage has long faced a trade-off between peak performance and engineering usability: it must deliver microsecond-level latency while also supporting multi-tenant isolation, elastic operations, and high availability. Drawing on Alibaba Cloud's large-scale production practice, the research team systematically reviews in the paper how local disk technology evolved across three generations, from pure software optimization to hardware-software co-design. They then propose a new cloud-integrated storage architecture, Latte, and lay out a clear roadmap for its evolution.

Figure | New Cloud-Integrated Storage Architecture

// Evolving Towards Hardware-Software Co-Design, Like Upgrading Coffee-Making Technology

The paper vividly compares the three generations of storage technology to the process of refining coffee: how local disk technology evolved from pure software optimization to hardware-software co-design.

The first generation of storage technology is like espresso—Alibaba Cloud pioneered a user-space polling architecture that is very fast, unlocking NVMe SSD potential but sacrificing CPU efficiency, like emptying the entire kitchen just to brew a cup of coffee.

The second generation is like a double shot of espresso—introducing hardware assistance to improve isolation, but hardware rigidity makes it hard to keep up with the rapid iteration of SSDs, like buying a machine that can only brew specific beans; when the hardware upgrades, it falls behind.

The final evolution to the third generation is like Ristretto—an ultra-precise coffee. This architecture, with its hardware-software co-design, retains high-speed hardware channels while using a programmable “smart brain” to flexibly adapt to new disks, bringing storage performance close to the physical disk limit in large-scale applications, truly balancing speed, security, and future upgrades.

// New Hybrid Architecture Latte, Combining Ultra-Fast Local Storage Response with Infinite Cloud Elasticity

Building on this review, the paper proposes Latte, a next-generation hybrid architecture that merges the ultra-fast response of local storage with the elasticity of the cloud. Through intelligent scheduling, it achieves 95.6% tail-latency prediction accuracy with less than 10% CPU overhead. Its caching strategy avoids "discard old for new" churn, nearly eliminating write amplification while achieving an 80% read hit rate. An elastic disaster-recovery mechanism absorbs sudden traffic spikes locally while the cloud silently keeps a backup, so even if a server crashes unexpectedly, the system can quickly recover and keep serving.
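To make the hybrid idea concrete, here is a minimal, illustrative sketch of a local tier fronting a slower cloud-backed store. This is not Latte's actual design or API; the class name, the plain LRU eviction policy, and the dict-like cloud store are all assumptions chosen for brevity, whereas the paper's architecture uses intelligent scheduling and smarter admission policies.

```python
from collections import OrderedDict

class HybridReadCache:
    """Illustrative only: a fast local LRU tier in front of a slower
    cloud-backed store. All names here are hypothetical, not Latte's API."""

    def __init__(self, cloud_store, capacity):
        self.cloud_store = cloud_store      # dict-like fallback tier (stand-in for cloud reads)
        self.capacity = capacity            # max blocks held locally
        self.local = OrderedDict()          # block_id -> data, in LRU order
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.local:
            self.hits += 1
            self.local.move_to_end(block_id)    # refresh LRU position
            return self.local[block_id]
        self.misses += 1
        data = self.cloud_store[block_id]       # slow path: fetch from cloud tier
        self.local[block_id] = data             # admit into the local tier
        if len(self.local) > self.capacity:
            self.local.popitem(last=False)      # evict the coldest block
        return data

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

On a skewed workload where hot blocks fit in the local tier, repeated reads are served locally and only cold blocks fall through to the cloud store, which is the tiering effect the paper's read-hit-rate figure describes.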

Figure | IO Latency Comparison Experiment Results

// Building a New Storage Foundation for the AI Era

Today, as large AI models become mainstream workloads, the Latte architecture is even more valuable. It can serve as a high-performance, large-capacity, cost-effective elastic cache layer, addressing the common problem of GPUs idling while they wait for data. Meanwhile, through its "local absorption + intelligent shunting" design, Latte significantly improves response speed and resource utilization, providing a "data-on-demand" storage foundation for large-model inference: faster responses, lower costs, and more flexible scaling.

From Espresso to Latte, this is not only an evolution of storage forms but also a microcosm of the underlying cloud computing architecture shifting from “resource islands” to “pooled integration.” With this achievement, Alibaba Cloud demonstrates how to leverage hardware-software co-design and cloud-native integration to lay a more solid foundation for cloud-native databases, AI inference, and big data analytics, leading the global storage technology into a new era of intelligence.

FAST, short for Conference on File and Storage Technologies, was founded in 2002. It is a top international conference focused on storage, jointly organized by the USENIX Association and ACM SIGOPS, the operating systems professional organization of the Association for Computing Machinery. It represents the highest level in the field of computer storage worldwide. Over the past twenty-plus years, FAST has promoted the development of many storage-related technologies, including hardware-software integration, RAID, flash file systems, non-volatile memory technologies, and distributed storage.
