Have you been told that you don't need backups or replication because snapshots will fulfil all your data protection needs?
In this article, we will walk through the differences between snapshots, replication and backups, and you may just want to reconsider your DR (disaster recovery) strategy afterwards. "Snapshots are all the backup you will ever need" is something you tend to hear only from storage vendors. More often than not, these are vendors that don't have a complete set of data protection solutions.
Don't get us wrong. We are not saying that snapshots are bad. Snapshots have their place in the data protection chain, but they are certainly not backups, nor do they replace the use case for replication. Having said that, your data protection needs will differ from the next person's, so it is possible that snapshots are all you need. More often, though, enterprises use a combination of all three data protection technologies.
So let's go through the three data protection methods, a little of how each works and which use cases it fits best.
Snapshots are also known as point-in-time copies. By definition, a point-in-time copy is a view of the data as it was at the moment the snapshot was triggered. Snapshots are by far the fastest and most efficient of the three methods. On some systems, taking one is almost instantaneous.
So let's look at how it works. You have the master copy, and you keep writing data to it. When you initiate a snapshot, the system typically just places a little bookmark or marker. From then on, every new write is tracked in a journal of changes. When you trigger the next snapshot, another journal is started. The more snapshots you take and the longer you keep them, the larger the journals grow, and eventually they impact performance.
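The bookmark-and-journal idea can be sketched in a few lines of Python. This is a simplified model, not any vendor's implementation: the `Volume` class, its `base` dictionary and its list of `journals` are names made up for illustration, and real storage systems track blocks very differently.

```python
class Volume:
    """Toy model of a volume with journal-based snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.base = dict(blocks)  # the master copy: block id -> data
        self.journals = [{}]      # change journals; the last one is active

    def write(self, block_id, data):
        # New writes never touch the master copy; they land in the active journal.
        self.journals[-1][block_id] = data

    def snapshot(self):
        # The "bookmark": freeze the active journal and start a new one.
        self.journals.append({})
        return len(self.journals) - 1  # snapshot id = number of frozen journals

    def read(self, block_id, snapshot_id=None):
        # A point-in-time view is the master copy plus the journals frozen so far.
        journals = self.journals if snapshot_id is None else self.journals[:snapshot_id]
        for journal in reversed(journals):  # newest change wins
            if block_id in journal:
                return journal[block_id]
        return self.base[block_id]


vol = Volume({0: "a", 1: "b"})
monday = vol.snapshot()
vol.write(0, "A")
tuesday = vol.snapshot()
vol.write(1, "B")

print(vol.read(0, monday), vol.read(0, tuesday), vol.read(1))  # a A B
```

In this model every read may have to walk the chain of journals before it falls through to the master copy, which is one way to picture why long snapshot chains cost performance.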
So why are snapshots not backups? Because there is an interdependency between the snapshots and the data you want to recover. Say you want to recover data from a point in time: what you get back is actually a combination of the master copy and one or more snapshots. If any of those components is corrupted or destroyed, you have nothing left to recover from. For example, if the master copy is dead or corrupt, you cannot restore from any of the snapshots that depend on it either. On top of that, in most storage subsystems the snapshots live on the same storage as the master copy.
This is not best practice in general, because a failure of the volume or the storage system takes every snapshot down with it. It's a bit like putting all your eggs in one basket. Having said that, because snapshots only deal with references and pointers, they are fast and powerful, which makes them great for quick recovery. Snapshots work well if you only need to retain recovery points for a couple of days, and how useful they are depends heavily on how often you take them. The longer you keep snapshots, the more resources it takes to hold the journals and all the changed blocks.
Many vendors have their own implementations that help alleviate this issue, but they really only delay the inevitable. The limitations still exist.
Now let's look at replication. As the name suggests, replication simply means copying data to another storage system. That can be another system in the same data center, but often it is a remote one, to protect against data center failures as well. There are generally two types of replication: asynchronous and synchronous. Let's start with async. With async replication, data is typically replicated at a set interval, perhaps every five minutes, with the changes since the last cycle shipped to the remote site. In the event of a disaster, the worst case is that you lose up to five minutes of data. This is usually expressed as a recovery point objective, or RPO, of five minutes.
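To make the five-minute RPO concrete, here is a minimal Python sketch of interval-based async replication. The `AsyncReplicator` class and its `tick` method are hypothetical, and time is passed in as plain seconds rather than read from a clock so the example stays deterministic.

```python
class AsyncReplicator:
    """Toy async replication: changes are shipped once per interval (illustrative only)."""

    def __init__(self, interval_s=300):
        self.interval_s = interval_s
        self.primary = {}
        self.remote = {}
        self.pending = {}   # changes not yet shipped to the remote site
        self.last_sync = 0

    def write(self, key, value):
        # The host is acknowledged as soon as the local write lands.
        self.primary[key] = value
        self.pending[key] = value

    def tick(self, now_s):
        # Called periodically; ships pending changes once the interval has elapsed.
        if now_s - self.last_sync >= self.interval_s:
            self.remote.update(self.pending)
            self.pending.clear()
            self.last_sync = now_s


rep = AsyncReplicator(interval_s=300)
rep.write("invoice-1", "paid")
rep.tick(now_s=300)            # first cycle: invoice-1 reaches the remote site
rep.write("invoice-2", "paid")
rep.tick(now_s=420)            # only two minutes since the last cycle: nothing shipped

# If the primary site is lost now, the remote copy is missing invoice-2.
print(sorted(rep.remote))      # ['invoice-1']
print(sorted(rep.pending))     # ['invoice-2']
```

Whatever sits in `pending` when disaster strikes is the data you lose, and in this sketch it can never be more than one interval's worth of writes.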
Sync replication, on the other hand, replicates every I/O as it is written to the storage system. It commits both the local and the remote write before acknowledging to the host that the write is good. Mission-critical apps that cannot tolerate any data loss often opt for sync replication. In RPO terms, sync replication gives you what we call RPO equals zero, which simply means no data loss.
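The commit-both-then-acknowledge behaviour can be sketched in Python as well. This is a toy built on assumptions: `SyncReplicator` and its `remote_up` flag are invented for illustration, and a real array would also have to decide what to do with the local write when the remote site is unreachable.

```python
class SyncReplicator:
    """Toy sync replication: a write is acknowledged only after both sites commit."""

    def __init__(self):
        self.primary = {}
        self.remote = {}
        self.remote_up = True  # hypothetical flag standing in for the replication link

    def write(self, key, value):
        if not self.remote_up:
            # Without the remote commit there is no acknowledgement to the host.
            return False
        self.primary[key] = value  # local commit
        self.remote[key] = value   # remote commit (this is where link latency is paid)
        return True                # only now is the host told the write is good


rep = SyncReplicator()
print(rep.write("invoice-1", "paid"))  # True
print(rep.remote == rep.primary)       # True: every acknowledged write is on both sites

rep.remote_up = False
print(rep.write("invoice-2", "paid"))  # False: never acknowledged, so nothing is lost
```

Because the acknowledgement waits for the remote commit, every write pays the round trip to the remote site.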
So why would anybody pick async replication then? As you can tell, sync replication demands a lot of bandwidth and is very sensitive to latency, since every write has to wait for the remote site. Async, by comparison, tolerates far more modest bandwidth and higher latency, which makes it significantly cheaper. The advantage of replication is its ability to recover very quickly, with minimal data loss, when a data center fails completely or the primary storage is lost. You already have a copy of the data and are ready to resume business. Having said that, it is not without its caveats. Because every block written is replicated, a corrupted block, or a whole bunch of data deleted accidentally or maliciously, will be replicated too. As the saying goes: dirty block in, dirty block out! This makes replication great for business continuity and for insulation against primary storage failures, but not so great if you want the ability to roll back to an earlier point in time, which brings me to my very last item.
Backups have been around pretty much since the beginning of time. Over time they have evolved to resemble a combination of snapshots and replication. In the simplest form, you make a full copy of the primary data every time you run the backup, which for most organizations is once a day. Assuming you run it at 8:00 p.m., you get a point-in-time replica of the data exactly as it looked at that time, similar to a snapshot. Run it seven days a week and you will have seven independent copies of the data, one for 8:00 p.m. on each of the last seven days. If the third copy is corrupted, you still have the second or the fourth copy to recover from, which is not the case with snapshots or replication. Backups are also perfect for long-term retention, because as long as you have the capacity and resources you can keep them for as long as you want.
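The independence of each copy is the key point, and a small Python sketch may help show it. The `BackupStore` class, its `retention` parameter and the day labels are all assumptions made for illustration; it models only daily full copies, not incremental or deduplicated backups.

```python
import copy
from collections import OrderedDict


class BackupStore:
    """Toy backup store holding independent full copies (illustrative only)."""

    def __init__(self, retention=7):
        self.retention = retention
        self.copies = OrderedDict()  # label -> full copy, oldest first

    def run_backup(self, label, primary):
        # Each backup is a deep copy, so it shares nothing with the other
        # copies or with the primary data.
        self.copies[label] = copy.deepcopy(primary)
        while len(self.copies) > self.retention:
            self.copies.popitem(last=False)  # expire the oldest copy

    def restore(self, label):
        return copy.deepcopy(self.copies[label])


primary = {"orders": [1, 2]}
store = BackupStore(retention=7)
store.run_backup("day-1", primary)
primary["orders"].append(3)
store.run_backup("day-2", primary)
primary["orders"].append(4)
store.run_backup("day-3", primary)

del store.copies["day-3"]      # pretend the third copy is corrupted
primary.clear()                # and the primary data is lost as well

print(store.restore("day-2"))  # {'orders': [1, 2, 3]}
print(store.restore("day-1"))  # {'orders': [1, 2]}
```

Losing one copy, or even the primary data, leaves the remaining copies fully restorable, which is exactly what the snapshot and replication models cannot promise.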
You may be thinking that the storage consumed by backups must be massive, and surely that is an issue. Yes, of course, but capabilities such as dedupe and compression help with that problem, and I will not go into depth on them today. The biggest issue with backups is generally time. Of the three methods, backup takes the longest to protect data and the longest to recover it. Advanced capabilities such as incremental-forever backups and dedupe appliances have improved backup and recovery performance over the years, but backup is still the slowest of the three technologies we have covered today. So depending on your needs and requirements, you may need only one of the three data protection methods, or a combination of all three.
To summarize my recommendations for the most cost-effective and fundamental form of data protection for every enterprise: backup is a must! I cannot stress this enough. You need to have backups. For short-term data protection, in the range of three to five days, snapshots are the way to go, though I would still recommend backups alongside them. For fast recovery, snapshots and replication are the way to go. Mission-critical applications will definitely require replication together with backups. Hopefully this has been useful. I know it can seem like a lot for people who are new to the data protection domain, and the three can all sound the same in some sense, but there are subtle differences between them.