How to Unleash AI Creativity with VALDI and Storj

Kevin Leffew
April 4, 2023

In the rapidly evolving world of generative AI, researchers and developers face significant challenges related to computational resources, storage, and cost management. Traditional cloud providers often come with hefty price tags, and centralized storage solutions struggle to keep up with the ever-growing demand for efficient data access.

VALDI and Storj have joined forces to overcome these historical hurdles, offering a powerful distributed cloud and decentralized storage solution that revolutionizes the way AI models are trained and deployed. In this post, I'll dive into a real-world example of how this groundbreaking combination has enabled the training of a feedforward Artificial Neural Network (ANN) on the MNIST dataset with remarkable cost savings and efficiency improvements compared to traditional approaches.


Training a machine learning model on VALDI’s distributed cloud

VALDI demonstrates training of a feedforward Artificial Neural Network (ANN) on the MNIST dataset, whose training set contains 60,000 images of handwritten digits. The neural network model is defined and trained using the popular GPU-enabled TensorFlow Python library. Upon completion of the training, the model is able to identify handwritten digits with near 100% accuracy. The code is available here.
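To make the setup concrete, here is a minimal sketch of what a feedforward MNIST classifier can look like in TensorFlow's Keras API. It is not the code from the VALDI demo: the single 128-unit hidden layer, the Adam optimizer, the epoch count, and the function names are all assumptions chosen for illustration, and the sketch loads the copy of MNIST bundled with Keras rather than pulling it from Storj.

```python
def count_dense_params(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def build_model(hidden_units=128):
    """Feedforward classifier: 784 pixels -> one hidden layer -> 10 digit classes.

    The layer width and optimizer are illustrative assumptions.
    """
    # Imported here so count_dense_params works without TensorFlow installed.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),                       # 28x28 image -> 784 vector
        tf.keras.layers.Dense(hidden_units, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),  # one output per digit
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def train(epochs=5):
    """Train on the Keras-bundled MNIST copy (an assumption; the demo uses Storj)."""
    import tensorflow as tf

    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]
    model = build_model()
    model.fit(x_train, y_train, epochs=epochs,
              validation_data=(x_test, y_test))
    return model
```

Calling `train()` on a GPU-enabled machine runs the full loop. `count_dense_params([784, 128, 10])` gives the number of trainable parameters for this particular layout, which is a quick way to see how small the model is compared with the generative models the rest of the post has in mind.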

A visualization of the structure of the artificial neural network trained on VALDI with Storj.


Using Storj for training data retrieval plus model storage

On the backend, as shown in the following diagram, the decentralized Storj network is used in conjunction with VALDI to serve the training dataset to the VALDI network device(s) doing the training, plus store the intermediate and final models.
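The post does not say which Storj interface the demo uses to serve the dataset. One common option is Storj's hosted S3-compatible gateway, which lets a training node fetch objects with a standard S3 client. The sketch below assumes that route and uses `boto3`; the environment variable names, bucket, and object key are hypothetical placeholders.

```python
import os

# Storj's hosted S3-compatible gateway endpoint.
STORJ_GATEWAY = "https://gateway.storjshare.io"


def s3_client_kwargs(env=os.environ):
    """Build boto3 client arguments from environment variables.

    STORJ_ENDPOINT, STORJ_ACCESS_KEY and STORJ_SECRET_KEY are hypothetical
    names chosen for this sketch; use whatever your deployment defines.
    """
    return {
        "endpoint_url": env.get("STORJ_ENDPOINT", STORJ_GATEWAY),
        "aws_access_key_id": env["STORJ_ACCESS_KEY"],
        "aws_secret_access_key": env["STORJ_SECRET_KEY"],
    }


def fetch_dataset(bucket, key, dest):
    """Download one object (for example, an MNIST archive) to a local path."""
    import boto3  # third-party dependency, imported only when a download is requested

    client = boto3.client("s3", **s3_client_kwargs())
    client.download_file(bucket, key, dest)
    return dest
```

A training node would call something like `fetch_dataset("my-bucket", "mnist.npz", "/tmp/mnist.npz")` before starting the training loop, where both the bucket and key are placeholders.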

An illustration of the architecture used to interface between the VALDI and Storj networks.

Training this relatively simple ML model is enough to demonstrate the benefits of distributed and decentralized cloud services. By using the distributed VALDI network for compute cycles, in conjunction with the decentralized Storj network for data storage and retrieval, the cost of popular ML training workflows can be substantially reduced. This approach also eliminates dependence on larger, more expensive traditional cloud providers. Finally, note that a similar architecture could be used for ML model inference.

Upon completion of model training, a series of intermediate model checkpoints (intermediate-model-N.tar.gz), as well as the final model (mnist-model.tar.gz), are available in the user’s Storj bucket.
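As a rough illustration of how archives with these names could be produced, the sketch below packages a saved model directory into a gzip-compressed tarball and hands it to an S3-style client for upload. It assumes that N is a checkpoint index and that the bucket is reached through an S3-compatible client such as `boto3`; neither detail is confirmed by the demo, and the function names are hypothetical.

```python
import tarfile
from pathlib import Path


def archive_name(step=None):
    """Return the archive name: a numbered checkpoint, or the final model."""
    if step is None:
        return "mnist-model.tar.gz"
    return f"intermediate-model-{step}.tar.gz"


def package_model(model_dir, out_dir, step=None):
    """Compress a saved model directory into out_dir and return the archive path."""
    out_path = Path(out_dir) / archive_name(step)
    with tarfile.open(out_path, "w:gz") as tar:
        # Store files under the directory's own name, not its absolute path.
        tar.add(model_dir, arcname=Path(model_dir).name)
    return out_path


def upload_archive(client, bucket, archive_path):
    """Upload the archive using any boto3-style S3 client (an assumption)."""
    archive_path = Path(archive_path)
    client.upload_file(str(archive_path), bucket, archive_path.name)
```

In a training loop, `package_model(checkpoint_dir, out_dir, step=n)` followed by `upload_archive(...)` after each checkpoint, and one final call with `step=None`, would leave the bucket with the layout described above.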


AI innovation moves faster with the decentralized cloud

In conclusion, the collaboration between VALDI and Storj has demonstrated the immense potential for distributed cloud computing and decentralized storage solutions in the generative AI community. By combining these cutting-edge technologies, researchers and developers can now effectively tackle complex projects while minimizing costs and maximizing efficiency.

As we continue to push the boundaries of AI capabilities, it is crucial to explore and adopt innovative approaches like the one provided by Storj. We encourage you to take advantage of Storj's decentralized storage platform to empower your own AI projects and contribute to the future of generative AI. Embrace the power of decentralization and start leveraging Storj as the object storage layer for your machine learning workflows today!

Get started with Storj: Sign up at storj.io/signup?partner=kevin, and check out our documentation at docs.storj.io

Get started with VALDI: To learn more about VALDI, visit https://www.valdi.ai, and check out the documentation at https://docs.valdi.ai
