Cloud Storage as a File System in AI Training
Cloud Storage is a common choice for Vertex AI and AI Platform users to store their training data, models, checkpoints, and logs.
Now, with Cloud Storage FUSE, training jobs on both platforms can access their data on Cloud Storage as files in the local file system.
This post introduces Cloud Storage FUSE for Vertex AI Custom Training. The feature works in a very similar way on AI Platform Training.
Cloud Storage FUSE provides three benefits over the traditional ways of accessing Cloud Storage:
- Training jobs can start quickly without downloading any training data.
- Training jobs can perform I/O easily at scale, without the friction of calling the Cloud Storage APIs, handling the responses, or integrating with client-side libraries.
- Training jobs can leverage the optimized performance of Cloud Storage FUSE.
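To make the "files in the local file system" idea concrete, here is a minimal sketch of how a training script might read an object with ordinary file I/O instead of a client library. It assumes that buckets are exposed under a /gcs/<bucket-name> mount root inside the training container; that root, the helper names gcs_uri_to_local_path and count_lines, and the example bucket and object names are illustrative assumptions, not something this post has defined.

```python
from pathlib import Path

# Assumption: buckets appear as directories under this root in the training container.
GCS_FUSE_ROOT = Path("/gcs")


def gcs_uri_to_local_path(uri: str, root: Path = GCS_FUSE_ROOT) -> Path:
    """Translate gs://bucket/object into the matching path under the mount root."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a Cloud Storage URI: {uri!r}")
    remainder = uri[len(prefix):]
    # Reject an empty bucket name; a leading slash would also escape the mount root.
    if not remainder or remainder.startswith("/"):
        raise ValueError(f"missing bucket name in URI: {uri!r}")
    return root / remainder


def count_lines(path: Path) -> int:
    """Read a mounted object with plain file I/O, with no storage client involved."""
    with path.open("r", encoding="utf-8") as f:
        return sum(1 for _ in f)


if __name__ == "__main__":
    # Hypothetical bucket and object names, for illustration only.
    local = gcs_uri_to_local_path("gs://my-bucket/data/train.csv")
    print(local)
    if local.exists():
        print(count_lines(local))
```

Because the script only deals with paths, the same code can be exercised on a laptop by passing a different root, which is why the mount root is a parameter rather than a hard-coded string.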
The problems
Traditionally, training jobs have two ways to use data from Cloud Storage.
- They can use gsutil to download the entire dataset prior to training. This may take hours depending on the dataset size, which significantly slows down the start-up of the jobs.