Following up: support confirmed that S3 buckets with object locking enabled are not supported.
I just want to publicly request that object locking support be a priority.
Console passwords, 2FA deletes, etc. are great, but object locking is BETTER. Even if an attacker gets your password via a keylogger or by breaching your mailbox or mail server, and even if they gain the ability to reset your passwords, disable your 2FA, or access the storage account itself, they still can’t delete objects that are locked, can’t remove the object lock, can’t shorten the object lock, and can’t delete buckets with locked objects, short of closing the AWS account as root. (And even then AWS might hang onto the data for a few months, allowing you to reopen the account – I’m not 100% clear on that.)
So this even gives you some security against insiders and rogue IT admins.
*In compliance mode, a protected object version can’t be overwritten or deleted by any user, including the root user in your AWS account. When an object is locked in compliance mode, its retention mode can’t be changed, and its retention period can’t be shortened. Compliance mode ensures that an object version can’t be overwritten or deleted for the duration of the retention period.*
https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lock-overview.html
It looks like the only requirement for cloudberry to support object locking is to include the otherwise optional Content-MD5 header?
*The base64-encoded 128-bit MD5 digest of the message (without the headers) according to RFC 1864. This header can be used as a message integrity check to verify that the data is the same data that was originally sent. Although it is optional, we recommend using the Content-MD5 mechanism as an end-to-end integrity check.*
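For anyone wondering how small that header is, here is a rough sketch of computing a Content-MD5 value in Python using only the standard library. The function name `content_md5` and the sample body are my own placeholders, not anything from cloudberry or an AWS SDK; the one thing to get right is that the base64 encoding is applied to the raw 16-byte digest, not to the hex string.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return a Content-MD5 header value for the given request body.

    Per RFC 1864 this is the base64 encoding of the raw 128-bit
    (16-byte) MD5 digest, not of the 32-character hex string.
    """
    digest = hashlib.md5(body).digest()  # raw bytes, length 16
    return base64.b64encode(digest).decode("ascii")


# Hypothetical usage: attach the value to the headers of an upload request.
body = b"example backup chunk"
headers = {"Content-MD5": content_md5(body)}
print(headers)
```

The header only covers the body of a single request, so a client that uploads in parts would presumably need to compute it for each part it sends.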
This seems like a no-brainer to add!
As it stands, I am now looking to set up replication from the storage cloudberry points at to a bucket with object locking, so that even if the cloudberry target is destroyed, I have some sort of secure, immutable backup… of the backup.
The only limitation of object locking is that once an object is older than its lock period, it can be deleted. That means if you are doing a file-based backup with a 3-month retention and an object doesn’t change for 3 months, it can be deleted. For block-based backups, I’m hoping scheduled full backups within the retention window will resolve it; I’m not sure if there is a solution for file-based backups. It might need cloudberry support to re-upload or duplicate the object periodically, or extend the retention period of the current version, or something to ensure the versions that should be retained never have an expired lock, no matter what.
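To make the gap concrete, here is a small sketch of the date arithmetic only. It makes no AWS calls, and the function names, the 90-day retention, and the 14-day margin are all assumptions I picked for illustration. It shows an unchanged file losing its protection after the retention period, and a check a backup tool could run to decide when a lock is close enough to expiry that it should be extended.

```python
from datetime import datetime, timedelta, timezone


def retain_until(uploaded_at: datetime, retention: timedelta) -> datetime:
    """Date on which the lock on an object version lapses."""
    return uploaded_at + retention


def is_deletable(uploaded_at: datetime, retention: timedelta, now: datetime) -> bool:
    """True once the lock has lapsed, even if this is still the only copy."""
    return now >= retain_until(uploaded_at, retention)


def needs_extension(lock_expires: datetime, now: datetime, margin: timedelta) -> bool:
    """True when the lock expires within `margin`, so it should be extended."""
    return lock_expires - now <= margin


uploaded = datetime(2020, 1, 1, tzinfo=timezone.utc)
retention = timedelta(days=90)

# A file that never changes is only uploaded once, so its lock is never renewed.
print(is_deletable(uploaded, retention, uploaded + timedelta(days=30)))   # still locked
print(is_deletable(uploaded, retention, uploaded + timedelta(days=120)))  # lock has lapsed

# Hypothetical periodic job: flag locks that expire within the next 14 days.
expires = retain_until(uploaded, retention)
print(needs_extension(expires, uploaded + timedelta(days=80), timedelta(days=14)))
```

Extending fits the compliance-mode rules quoted above, since they only forbid shortening the retention period, but something still has to run that check on a schedule.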
Hopefully I can target cloudberry at a version-locked bucket to do a restore… otherwise, to restore I’ll need to replicate back to a regular bucket… or something.
I’m working on testing these various scenarios.
It’s frustrating that cloudberry doesn’t seem to be taking more serious ownership of this issue, or providing transparency on issues and best practices to mitigate known and theoretical attack vectors.
While, for example, a breach of the storage provider is beyond what you are responsible for, you should identify that as a risk. Some of your supported storage providers do not provide any sort of bucket replication, object locking, or usable mitigations. As much as I like backblaze the company, that is why I am looking at alternatives now. These limitations should be clearly called out. And for those that do support it, you should be providing deployment guides and recommendations on how to use it effectively. This is part of deploying your product; we shouldn’t each be figuring it out for ourselves, or getting caught out because we didn’t think of it. This affects all of us.
I will also say Datto claims their SIRIS line already has redundant cloud replication (cloud to 2nd-tier cloud), so that a compromised onsite backup device or customer account that somehow manages to destroy the local and customer-accessible cloud backups will not be able to reach the 2nd-tier cloud backup. This is not “spy-thriller” stuff; this is the reality of what we are contending with in the wild right now.