Java – Listing blobs in GCS with Java using paging keeps returning the same page of blobs

I'm trying to list a large number of GCS blobs using the Java API. Because there are so many blobs, I tried to use paging, but I keep getting the same page. The code looks like this:

Storage storage = StorageOptions.newBuilder().setCredentials(credentials).build().getService();

Page<Blob> allBlobs = storage.list(
    myBucketName,
    Storage.BlobListOption.pageSize(5000),
    Storage.BlobListOption.prefix("some prefix"));

while (allBlobs.hasNextPage()) {
   Page<Blob> page = allBlobs.getNextPage();
   for (Blob blob : page.getValues()) {
     // ... do something ...
   }
}

It looks like I visit the same page over and over again. I checked the token with allBlobs.getNextPageToken() and it is the same on every iteration. Am I missing a step that moves the page forward? Doesn't getNextPage() do that? The Page interface defines only a few methods, so I'm not sure what I'm missing.

Solution

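getNextPage() returns a new Page object and does not advance allBlobs itself, so a loop that keeps calling allBlobs.getNextPage() fetches the same second page every time (and never processes the first page). Use the iterateAll method instead. The example from the documentation is copied here for completeness: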

Page<Blob> blobs =
    storage.list(
        bucketName,
        BlobListOption.currentDirectory(),
        BlobListOption.prefix(directory));

// iterateAll() walks every blob across all pages, requesting further pages as needed
for (Blob blob : blobs.iterateAll()) {
   // do something with the blob
}
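
If you do want to handle pages one at a time (for example to process them in batches), the loop has to reassign the page variable on each iteration. The following is only a minimal sketch of that pattern, not code from the library: SimplePage and pageOf are hypothetical stand-ins so that it runs without GCS credentials. SimplePage mirrors the three Page methods used in the question (hasNextPage, getNextPage, getValues) and assumes, like the real Page, that getNextPage() returns null when there are no more pages.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PagingSketch {

    /** Hypothetical stand-in for the library's Page<T>; only the methods used below. */
    interface SimplePage<T> {
        boolean hasNextPage();
        SimplePage<T> getNextPage();
        Iterable<T> getValues();
    }

    /** Hypothetical in-memory page over items[from, from + size), used instead of storage.list(...). */
    static <T> SimplePage<T> pageOf(List<T> items, int from, int size) {
        int to = Math.min(from + size, items.size());
        return new SimplePage<T>() {
            public boolean hasNextPage() { return to < items.size(); }
            public SimplePage<T> getNextPage() {
                return hasNextPage() ? pageOf(items, to, size) : null;
            }
            public Iterable<T> getValues() { return items.subList(from, to); }
        };
    }

    /** Visits the first page, then keeps replacing `page` with its successor. */
    static <T> List<T> collectAll(SimplePage<T> first) {
        List<T> out = new ArrayList<>();
        SimplePage<T> page = first;
        while (page != null) {
            for (T value : page.getValues()) {
                out.add(value); // do something with the value
            }
            // The key step: advance the loop variable instead of re-querying the first page.
            page = page.hasNextPage() ? page.getNextPage() : null;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(collectAll(pageOf(names, 0, 2))); // prints [a, b, c, d, e]
    }
}
```

With the real API, the same shape would presumably start from Page<Blob> page = storage.list(...) and end each iteration with page = page.getNextPage(). Unlike the loop in the question, this also processes the first page.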
