Backups are the lifeline of any BCP/DR plan, and it's imperative to have them configured correctly. It's also important to periodically check backups and restore them to test servers to ensure that everything is working as expected.
It's important to define a backup policy for each service or server and apply it. A simple one-size-fits-all policy will not meet all your requirements and can be cost-ineffective.
Backups are a treasure trove, but misconfigured settings can cost you dearly in lost data, lost customer confidence and compliance issues.
Backup policies are shaped by the following factors:
Frequency of backups
Retention of backups
What's being backed up?
What's the recovery point objective (RPO) for each of the services?
The more frequently backups run, the more backup data you store, which significantly impacts your backup costs. It's also important to consider how long you want to retain backups. This is why defining a proper backup policy matters.
The following questions can help you tune the backup policy further:
What's my RPO? This tells you, in case of data loss, how old the most recent backup can be before the loss becomes unacceptable. For example, web servers might already have their code in Git or your favourite code repository, so they can be backed up once per day, but databases might need to be backed up far more frequently (something like every 4 hours). If your data is more critical still, such as a requirement of no data loss for a DB, consider setting up a worker node for the DB. The worker node can sync the DB in real time, and using a combination of backups and relay logs you should be able to recover all data by restoring a backup and replaying logs up to the exact minute.
What to exclude from backups? It's important to identify data that is not critical and weigh the impact of leaving it out of your backups. For example, if you already have a centralized logging system, you should be able to exclude all logs from being backed up.
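The two questions above can be turned into quick checks. The following is a minimal sketch, not part of any E2E Networks tooling: the function names, the exclude patterns and the sample paths are all hypothetical and only illustrate the reasoning. It assumes that with periodic backups the worst-case data loss equals the interval between two runs, and that a point-in-time recovery starts from the newest backup taken at or before the target time.

```python
from datetime import datetime, timedelta
from fnmatch import fnmatchcase
from typing import List, Optional, Sequence

# Hypothetical exclude patterns; adjust to the paths on your own servers.
EXCLUDE_PATTERNS = ["/var/log/*", "/var/cache/*", "*.tmp"]


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """With periodic backups, the worst-case data loss equals the interval
    between two runs, so the interval must not exceed the RPO."""
    return backup_interval <= rpo


def pick_restore_point(
    backups: Sequence[datetime], target: datetime
) -> Optional[datetime]:
    """Return the newest backup taken at or before `target`, or None.

    Logs would then be replayed from that backup up to `target`.
    """
    candidates = [b for b in backups if b <= target]
    return max(candidates) if candidates else None


def select_for_backup(
    paths: Sequence[str], patterns: Sequence[str] = EXCLUDE_PATTERNS
) -> List[str]:
    """Keep only the paths that match none of the exclude patterns."""
    return [
        p for p in paths
        if not any(fnmatchcase(p, pat) for pat in patterns)
    ]


# Made-up example: a daily backup cannot satisfy a 4-hour RPO.
print(meets_rpo(timedelta(hours=24), rpo=timedelta(hours=4)))  # False
print(meets_rpo(timedelta(hours=4), rpo=timedelta(hours=4)))   # True
```

The same idea applies whatever tool actually performs the backup: decide the RPO per service first, then check that the schedule and the exclude list are consistent with it.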
To help you achieve an optimal backup strategy, E2E Networks has launched new backup templates that make it easier to meet various RPOs. We use two major factors to create multiple pre-configured strategies: frequency of backups and term of backups.
Accordingly, the following backup configurations are available:
Low Frequency - Short Term Backups
Low Frequency - Long Term Backups
High Frequency - Short Term Backups
High Frequency - Long Term Backups
As the names suggest, low frequency means a backup every 6 hours and high frequency means a backup every 4 hours. Short term backups typically retain 12 backups of the 4- or 6-hourly runs, 5 daily backups and 12 weekly backups. Long term backups retain 12 backups of the 4- or 6-hourly runs, 5 daily backups, weekly backups for up to 8 weeks, 12 monthly backups and 2 yearly backups. You can choose between short term and long term backups depending on your requirements.
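To compare the four templates side by side, the retention figures above can be written down as data. This is only an illustrative sketch: the `Template` class and its field names are hypothetical and do not correspond to any E2E Networks API, and the totals are an upper bound because a single backup may be kept under more than one retention tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    name: str
    interval_hours: int  # 6 for low frequency, 4 for high frequency
    intraday: int        # retained 4- or 6-hourly backups
    daily: int
    weekly: int
    monthly: int = 0
    yearly: int = 0

    def max_restore_points(self) -> int:
        """Upper bound on the number of backups retained at any time."""
        return (self.intraday + self.daily + self.weekly
                + self.monthly + self.yearly)

    def intraday_coverage_hours(self) -> int:
        """Hours of history covered at the full backup frequency."""
        return self.interval_hours * self.intraday


# Retention counts as described in the text above.
SHORT_TERM = dict(intraday=12, daily=5, weekly=12)
LONG_TERM = dict(intraday=12, daily=5, weekly=8, monthly=12, yearly=2)

TEMPLATES = [
    Template("Low Frequency - Short Term", 6, **SHORT_TERM),
    Template("Low Frequency - Long Term", 6, **LONG_TERM),
    Template("High Frequency - Short Term", 4, **SHORT_TERM),
    Template("High Frequency - Long Term", 4, **LONG_TERM),
]

for t in TEMPLATES:
    print(f"{t.name}: worst-case loss {t.interval_hours}h, "
          f"up to {t.max_restore_points()} restore points, "
          f"{t.intraday_coverage_hours()}h at full frequency")
```

Under these figures, the frequency decides the worst-case data loss (4 or 6 hours), while the term decides how far back you can restore.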
Apart from this, other features are available, such as the MySQL backup add-on, which, if configured, can take a consistent snapshot of a server running MySQL, and the ability to archive backups to E2E Object Store (EOS). Together these make a compelling case for configuring backups on all your nodes.