For AI, the Sky May Be the Limit—But Data Storage Could Be, Too

This blog post was provided by Melody Zacharias, Technical Evangelist Director AI at Pure Storage.

June 17, 2025

Data storage is the backbone of AI, but as model complexity and data intensity increase, efficient, high-performance storage platforms will be critical to support AI’s unique and evolving demands.

Behind every AI project is a pipeline of interlocked technologies in the data center: not just high-powered GPUs, but networking, memory, and storage arrays. Each plays a critical role, but one, in particular, is emerging as an unsung hero for AI: the data storage platform.

At Pure Storage, so many of our conversations with customers about AI come down to challenges with data—from its exponential growth and complexity to idle GPUs and data center footprints. They’re quickly discovering that for AI, traditional data storage can’t keep pace.

AI Has Always Been about the Data

Looking back, AI’s journey has always been intertwined with storage innovation. Early AI efforts were limited by algorithmic complexity and data scarcity; as algorithms advanced, memory and storage became the bottlenecks. New, high-performance storage helped enable breakthroughs like ImageNet and GPT-3, whose training data and checkpoints pushed storage requirements into the petabyte range.

With each AI push, storage responds with better capacity, speed, and scalability. To manage the exabyte-scale workloads of generative AI with sub-millisecond latency, we’ll need an entirely new generation of storage, like the new FlashBlade//EXA™, the world’s highest-performance data storage platform.

And we’re just getting started.

AI’s Demands on Data

Why is the highest end of storage so important to AI, and why won’t hard disk systems cut it? It’s a problem of volume, velocity, and performance that legacy systems can’t solve, and one that compounds the financial pressure to show ROI on AI.

Traditional storage systems struggle to keep pace with petabyte-scale multimodal data sets and the aggregation-heavy demands of data science and AI/ML workloads like autonomous driving and genomics research, creating bottlenecks from insufficient IOPS. High latency from spinning disks and outdated caching mechanisms delays data access, stretching time to insight and reducing efficiency. But if that data resides on flash systems from the start, systems that offer four to five times the performance of disk at a similar price, AI deployments get a serious leg up.
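To see why throughput matters so much, consider a rough back-of-the-envelope sketch of how the storage tier caps GPU utilization during training. All of the numbers below (GPU count, per-GPU ingest rate, storage throughput) are illustrative assumptions, not measurements of any particular product; the point is simply that when aggregate storage throughput falls short of what the accelerators can consume, the gap shows up directly as idle GPU time.

```python
# Illustrative model: how storage throughput caps GPU utilization.
# All figures are hypothetical assumptions for the sake of the example.

def gpu_utilization(num_gpus: int,
                    ingest_per_gpu_gbps: float,
                    storage_throughput_gbps: float) -> float:
    """Fraction of time GPUs can be kept busy, assuming the only limit
    is how fast the storage tier can feed training data."""
    demand = num_gpus * ingest_per_gpu_gbps   # aggregate read demand (GB/s)
    supply = storage_throughput_gbps          # what storage can deliver (GB/s)
    return min(1.0, supply / demand)

# Hypothetical cluster: 64 GPUs, each consuming ~2 GB/s of training data.
num_gpus, per_gpu = 64, 2.0

for label, throughput in [("assumed disk-based array, ~20 GB/s reads", 20.0),
                          ("assumed all-flash platform, ~100 GB/s reads", 100.0)]:
    util = gpu_utilization(num_gpus, per_gpu, throughput)
    print(f"{label}: ~{util:.0%} GPU utilization")
```

In this toy model, adding GPUs without scaling storage throughput only increases the idle fraction, which is why storage is sized alongside compute rather than after it.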

Storage Must Be Optimized for the AI Pipeline

Clearly, it’s not possible to push the performance envelope of networking and compute in the data center without also addressing data storage. And storage has to be optimized for the demands of each stage of the AI pipeline, from data curation and processing to checkpointing, training and inference, and metadata management.
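To make one of those stages concrete, here is a small arithmetic sketch of checkpointing. The model size, bytes per parameter, and write bandwidths are hypothetical assumptions chosen only to show the shape of the calculation: the slower the storage tier can absorb a checkpoint, the longer the training job stalls each time state is persisted.

```python
# Illustrative sketch: how write bandwidth affects checkpoint stalls.
# Parameter count, bytes per parameter, and bandwidths are hypothetical.

def checkpoint_seconds(params_billions: float,
                       bytes_per_param: int,
                       write_bandwidth_gbps: float) -> float:
    """Time to persist one full checkpoint, ignoring optimizer state and overlap."""
    checkpoint_gb = params_billions * bytes_per_param  # e.g. 70B params * 2 bytes = 140 GB
    return checkpoint_gb / write_bandwidth_gbps

for label, bw in [("assumed disk tier, 5 GB/s writes", 5.0),
                  ("assumed flash tier, 50 GB/s writes", 50.0)]:
    secs = checkpoint_seconds(params_billions=70, bytes_per_param=2,
                              write_bandwidth_gbps=bw)
    print(f"{label}: ~{secs:.0f} s per checkpoint")
```

A real pipeline also writes optimizer state and may overlap checkpoint writes with compute, so actual stall times vary; the sketch only illustrates why write bandwidth belongs in the storage conversation.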

For example, in training and inference, large language models (LLMs) can run into issues that lead to hallucinations and irrelevant insights. Here’s where architectures like retrieval-augmented generation (RAG) play a critical role, but they also test the limits of traditional storage.

In an interview with Six Five – On the Road, Pure Storage CEO Charlie Giancarlo said, “In RAG, you want to access a large fraction (or all) of the data in an enterprise. That means you have to be able to access it.” RAG relies on efficient retrieval from external knowledge bases during inference, and disk-based systems simply can’t deliver the performance that retrieval demands.
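To make the retrieval step concrete, below is a minimal sketch of the RAG pattern: embed the query, find the most similar documents in a knowledge base, and prepend them to the prompt before calling the LLM. The embedding function and the three-document in-memory corpus are toy placeholders; a production deployment would use a real embedding model and a vector store whose query latency depends heavily on the storage underneath it.

```python
# Minimal sketch of retrieval-augmented generation (RAG).
# embed() and the in-memory corpus are toy placeholders, not a real model or index.
import math

def embed(text: str) -> list[float]:
    # Toy "embedding": normalized character-frequency vector.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

corpus = {
    "doc1": "Quarterly revenue grew on strong flash storage demand.",
    "doc2": "The training cluster was expanded with additional GPUs.",
    "doc3": "Checkpointing policy: persist model state every thousand steps.",
}
index = {doc_id: embed(text) for doc_id, text in corpus.items()}

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda d: cosine(q, index[d]), reverse=True)
    return [corpus[d] for d in ranked[:k]]

query = "How often are model checkpoints written?"
context = "\n".join(retrieve(query))
prompt = f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
print(prompt)  # the augmented prompt would then be sent to the LLM
```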

The sky may be the limit for AI, but data storage can be, too.

What about the Energy Demands of AI?

When OpenAI CEO Sam Altman warned that AI’s future depends on a breakthrough in energy efficiency, he underscored a challenge the entire industry faces. One part of the answer sits in the data center itself: more energy-efficient, climate-friendly storage.

Demand for power from Canada’s data centers is expected to grow at a yearly rate of 9%, reaching 1.16 gigawatts (GW) by 2030, with generative AI workloads in hyperscale data centers requiring 5 to 10 times more power than traditional facilities.[1]

“As you start adding GPUs, you need a lot of power and cooling. Data centers aren’t sold by the square foot anymore; they’re sold by the megawatt,” shared Giancarlo. “If you’re pressing up against the limits of your power envelope in your data center…you’re stuck, [or you’re] forced to expand to another data center or bring in more power.”

It’s something Pure Storage systems were designed to offset, with ten times the reliability of disk systems while requiring one-fifth the space, power, and cooling.

No Limits on Data, Limitless Possibility with AI

In a recent study, nearly half of Canadian respondents cited data quality and integration as the top barrier to moving from AI pilots to full-scale implementation.[2] The need for unified data platforms has never been greater. Prioritizing storage as a core component of AI strategy can unlock ROI, drive continuous innovation, and help businesses maintain a competitive edge. Pure Storage is here to help.

Learn more about how Pure Storage can accelerate your AI journey. Let’s power the future of AI—together.


[1] https://www.nationalobserver.com/2025/01/23/analysis/canada-data-centre-ai-power-grid

[2] https://itbrief.ca/story/canadian-firms-to-boost-ai-investments-significantly-by-2025
