• Anon4343
    0
    When using Azure Storage Archive, Synthetic Full Backups are disabled and each backup job re-uploads the entire backup set in a single file. Am I missing something or is Azure Storage Archive not feasible for daily backups? I don't recall this behaviour when using AWS Glacier (Legacy).
  • Anon4343
    0
    And then for restoring a single file, does the application download a compressed file containing many files?
  • David Gugick
    118
    What type of backup are you running? File or Image? And are you using the new backup format?

    I'll make a few comments and can adjust based on your reply:

    Generally speaking, archive tiers do not support synthetic full backups because of limitations in their storage access times and/or APIs, and Azure Storage Archive is one of the tiers that is not supported for synthetic fulls. You can see the list of cloud storage options that support synthetic full backups with the new backup format on this help page: https://help.msp360.com/cloudberry-backup/backup/about-backups/new-backup-format/synthetic-full-backup.

    You could use a lifecycle policy to move the data from a hot Azure storage tier to a less expensive one on a schedule. That way, you get the advantage of synthetic fulls while your older backup data moves to the less expensive Azure Archive tier. Whether that works for you may depend on your retention needs, so feel free to elaborate on how you keep backup data.
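
    Just as a rough sketch of what such a lifecycle rule could look like (the prefix, day threshold, and account/resource-group names below are placeholders, and you'd want to confirm the prefix matches where your backup data actually lands):

    # Sketch: build an Azure Blob lifecycle-management policy that moves
    # backup blobs to the Archive tier 30 days after last modification.
    # The "backups/" prefix and 30-day threshold are assumptions.
    import json

    policy = {
        "rules": [
            {
                "enabled": True,
                "name": "move-older-backups-to-archive",
                "type": "Lifecycle",
                "definition": {
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        "prefixMatch": ["backups/"],  # placeholder container/prefix
                    },
                    "actions": {
                        "baseBlob": {
                            "tierToArchive": {"daysAfterModificationGreaterThan": 30}
                        }
                    },
                },
            }
        ]
    }

    # Write the policy to a file and apply it with the Azure CLI, e.g.:
    #   az storage account management-policy create \
    #       --account-name <storage-account> --resource-group <resource-group> \
    #       --policy @lifecycle-policy.json
    with open("lifecycle-policy.json", "w") as f:
        json.dump(policy, f, indent=2)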

    The new backup format does group files together as you described, but not necessarily into a single file object. That could be the case if the total size of the files is under the archive size limit and the archive is created fast enough (the archives themselves are dynamically managed based on size and speed of creation). The benefits are faster backup and restore speeds, plus the other new backup format advantages (https://mspbackups.com/AP/Help/backup-and-restore/about-backup/backup-format/about-format).

    In contrast, the legacy file backup format manages backups at the file object level. Meaning, each file backed up results in one or more objects being created in backup storage each time (the file itself and, optionally, its NTFS permissions). Since each object needs to be created/uploaded individually via the cloud APIs, this creates a lot of I/O and the associated latency when file counts are high and/or files are small. This is largely eliminated with the new backup format; the sketch below illustrates why fewer, larger objects help.
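
    To illustrate the difference in API traffic (a generic sketch only, not our implementation; the bucket and key names are placeholders):

    # One PUT per file (legacy-style) vs. packing the same files into a
    # single tar archive and uploading it once (new-format-style).
    import io
    import tarfile
    from pathlib import Path

    import boto3  # assumes AWS credentials are already configured

    s3 = boto3.client("s3")
    files = list(Path("photos").glob("*.jpg"))  # placeholder source folder

    # Legacy-style: one request (and one round-trip latency) per file.
    for path in files:
        s3.upload_file(str(path), "my-backup-bucket", f"legacy/{path.name}")

    # New-format-style: many files grouped into one archive object, one request.
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path in files:
            tar.add(path, arcname=path.name)
    buf.seek(0)
    s3.upload_fileobj(buf, "my-backup-bucket", "new-format/archive-0001.tar")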

    When you restore, the software will restore every archive that is needed based on the restore criteria.
  • David Gugick
    118
    Let me clarify one point. When we're restoring, it's only necessary to restore the needed blocks within each archive that contains the files selected for restore. We do not have to restore the entire archive file. For example, if you have a file that is one megabyte in size and it's contained within an archive that's one gigabyte, we only have to restore the blocks that contain that one-megabyte file. There's no reason to restore the entire 1 GB archive file.
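
    For context, the cloud storage APIs support ranged reads, which is what makes that kind of partial restore possible in general. Here is a generic sketch (placeholder names and offsets, not our internal code):

    # Fetch just a byte range of a large archive object instead of
    # downloading the whole thing. Real backup software knows the offsets
    # from its own index/metadata; these are made up for illustration.
    import boto3

    s3 = boto3.client("s3")

    start, end = 104_857_600, 105_906_175  # hypothetical 1 MB slice of a 1 GB archive
    resp = s3.get_object(
        Bucket="my-backup-bucket",
        Key="new-format/archive-0001.tar",
        Range=f"bytes={start}-{end}",  # standard HTTP range request
    )
    chunk = resp["Body"].read()  # only ~1 MB transferred, not 1 GB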
  • Anon4343
    0
    Thanks David. It looks like the synthetic full backup saves time and bandwidth by copying within the cloud storage account. When I noticed the recommendation for periodic full backups, I got the impression that without a synthetic full, the entire data set would need to be re-uploaded periodically. Many backup applications don't run a scheduled full backup or require the user to manually refresh the full; instead, they upload a percentage of already backed-up data along with the new data. That way, over the course of a month, a full backup does exist; it's just made up of the incremental backups taken throughout the month.

    I'm backing up over a terabyte of photo files that are typically 35 MB each. The backups are being made in case of a drive failure or a once-a-year folder retrieval. It sounds like the backup software uploads archive files of a certain size containing a partial file or multiple files, depending on their size. With archive storage, block-level changes can't be made in place, so new blocks are uploaded to storage rather than replacing existing blocks, and old blocks are automatically deleted according to the retention policy.

    It sounds like I should migrate my data from AWS Glacier to either the new S3 Glacier Deep Archive or Azure Archive.
  • David Gugick
    118
    I think if you're just backing up files, I'd stick with the legacy file backup format. That format is incremental forever, and you'll never need to run another full backup. You can manage your retention at the file version level, which will make it very easy to manage what you need to keep long-term.
  • Anon4343
    0
    Great, thank you. I'll continue to use the older AWS Glacier destination, since the application still supports it after changing the config file to show all destination options.
  • David Gugick
    118
    We still support Glacier, but through the newer S3 Glacier API. You would register your AWS S3 account, and then on the Compression and Encryption Options page of the Backup Wizard (for stand-alone) or the Advanced Options page (for Managed), for both legacy and new backup formats, there is a drop-down list called Storage Class that lets you select the S3 storage class (see the sketch after the list below). From there, you can select:

    • Standard
    • Intelligent-Tiering
    • Standard-Infrequent Access
    • One Zone-Infrequent Access
    • Glacier
    • Glacier Instant Retrieval
    • Glacier Deep Archive
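
    Those correspond to the storage class values the S3 API itself accepts. As a generic illustration of a direct upload to a given class (placeholder bucket and key, not how our software issues its requests):

    # Upload an object straight to a chosen S3 storage class. "DEEP_ARCHIVE"
    # corresponds to the "Glacier Deep Archive" choice above; other values
    # include STANDARD, INTELLIGENT_TIERING, STANDARD_IA, ONEZONE_IA,
    # GLACIER, and GLACIER_IR.
    import boto3

    s3 = boto3.client("s3")
    s3.upload_file(
        "archive-0001.tar",
        "my-backup-bucket",          # placeholder bucket
        "backups/archive-0001.tar",  # placeholder key
        ExtraArgs={"StorageClass": "DEEP_ARCHIVE"},
    )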