Calculating optimum Thread Count and Chunk Size for lots of small files
I have seen a few discussions on this, but I am not sure whether there is a basic formula I can use to find the best settings for threads and chunk size. Like others, I experience slow backups when my files are small. In some cases I have millions of files in the range of 2k-100k. I have tried setting threads to 20, and this helps somewhat, but the rate is still around 1MB or so. Other servers with somewhat bigger files (say 400k-600k) seem to run better, at around 10Mbs. SQL backups are even better, running around 45Mbs.
Chunk size probably will not affect small-file performance, since the chunk size is likely larger than the file size. That just means each file will be uploaded in a single chunk, which is the best case for small files. As for threads, you can try increasing the count further to see if that helps. It may, but at some point you'll likely be thread limited. As you're seeing, the per-file (per-object) latency is the cause of the slow rate.
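To illustrate why per-file latency dominates with small files, here is a rough back-of-the-envelope sketch in Python. It is not how the product computes anything; it assumes a simplified model in which each thread uploads one file at a time and every file pays a fixed overhead before its bytes are sent. The function name and all the numbers in the example (latency, link speed, file sizes) are made up for illustration.

```python
def estimated_throughput(file_size_bytes, threads, per_file_latency_s, bandwidth_bytes_per_s):
    """Rough aggregate upload rate in bytes/s under a simplified model.

    Assumption: each thread handles one file at a time, and each file costs
    a fixed per-file overhead plus the time needed to send its bytes.
    """
    transfer_time = file_size_bytes / bandwidth_bytes_per_s
    per_file_time = per_file_latency_s + transfer_time
    rate = threads * file_size_bytes / per_file_time
    # The combined rate of all threads cannot exceed the link itself.
    return min(rate, bandwidth_bytes_per_s)


if __name__ == "__main__":
    link = 12_500_000   # hypothetical link speed, about 100 Mbit/s
    latency = 0.5       # hypothetical per-file overhead in seconds
    for size in (50_000, 500_000, 100_000_000):
        for threads in (5, 20, 40):
            rate = estimated_throughput(size, threads, latency, link)
            print(f"size={size:>11,} B  threads={threads:>2}  ~{rate / 1e6:.2f} MB/s")
```

In this model, small files leave the link mostly idle no matter how large the chunk size is, and adding threads helps only until the link or some other resource becomes the limit. Large files such as SQL backups reach the link limit with few threads.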
A few other things to consider:
1 - Consider turning off logging (Settings - Logging), or make sure the log is on a fast drive (SSD) or on a different drive than the files being backed up. I would try disabling logging first as a test.
2 - Make sure the Repository (Settings - Repository) is on a fast drive
3 - If the files do not compress well, consider disabling compression. Compression is generally low CPU as we use an efficient algorithm, but it's worth a try.
4 - Make sure you are using the closest region for your cloud provider (assuming you have the choice). You may be able to use tools available at the cloud provider or elsewhere to help determine your latency to each region. Can you share which cloud you are using?
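For point 3, one quick way to check whether your files compress well is to sample some of them outside the backup product. The sketch below uses Python's standard `zlib` module as a stand-in. That is an assumption, since the algorithm the product actually uses may differ, so treat the result only as a rough indication. The function names and the example path are hypothetical.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Compressed size divided by original size (close to 1.0 means no savings)."""
    if not data:
        return 1.0
    return len(zlib.compress(data, level)) / len(data)


def sample_ratio(root: str, max_files: int = 200, max_bytes: int = 1_000_000) -> float:
    """Overall ratio across up to max_files files found under root."""
    total_in = 0
    total_out = 0
    seen = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if seen >= max_files:
                break
            try:
                with open(os.path.join(dirpath, name), "rb") as handle:
                    data = handle.read(max_bytes)
            except OSError:
                continue  # skip files that cannot be read
            total_in += len(data)
            total_out += len(zlib.compress(data))
            seen += 1
        if seen >= max_files:
            break
    return total_out / total_in if total_in else 1.0


if __name__ == "__main__":
    # Hypothetical path: point this at a folder that is part of the backup.
    print(f"sample ratio: {sample_ratio('/path/to/backup/source'):.2f}")
```

A ratio near 1.0 suggests the data is already compressed (images, archives, media), in which case disabling compression costs little. A much lower ratio suggests compression is reducing the amount of data uploaded.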
We are working on product improvements for your exact use case, but we do not expect those features to be available until sometime next year.