“A web-scale, masterless architecture which ensures data integrity while being application aware and immutable, and which provides performance”…it sounds like witchcraft or magic, and to be completely transparent, I thought it was too.
In this post I am going to walk through the Rubrik architecture, focusing in particular on the underlying file system, named “Atlas”, which is integral to providing web scale, performance and immutability.
To really understand this you have to go back to where Rubrik’s exceptionally smart engineers come from. The founding engineer for the file system previously worked at Google on Google Colossus. Background reading on Google HERE
Note: for background reading on Rubrik, please see a great overview HERE
So let's take care of what Rubrik looks like under the hood and what this masterless web-scale architecture is. A Rubrik Brik (2U) consists of 4 nodes. Each node looks the same (compute, flash, HDD and memory) and is fully independent, which means the Brik is a masterless system that acts like a cluster. To scale, you simply add another Brik (or several) and Rubrik will see more nodes it can write to; it will self-balance as a background task and spread workloads across the additional nodes. The real key here, though, is what is occurring under the hood. To visualize this, see the diagram below:
The reason that Rubrik can achieve web scale (think Google-like scale) is down to Atlas, the distributed file system built from the ground up by the engineers at Rubrik, which achieves huge scale while still maintaining performance.
Atlas runs on each local node but interacts with all the nodes. You can think of each node as a peer in a distributed system such as a blockchain, where if one node dies the others can pick up the workload and no data is lost. In essence, there are no master or slave nodes! The cluster management in the diagram above feeds information on cluster health and so on to Atlas, so Atlas can make decisions on where to place copies of data to achieve maximum redundancy and locality. Atlas is doing many tasks underneath to constantly improve performance, locality and redundancy, and it needs to interface with cluster management to base these decisions on the resources available and their health.
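To make the placement idea a bit more concrete, here is a minimal sketch of how a peer could pick nodes for copies of data using health information. This is purely my own illustration and not Rubrik's actual algorithm: the names Node and place_copies, the free_tb field and the ranking rules (local node first for locality, then spread across Briks for redundancy, then most free space) are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical view of a node as reported by cluster management.
    name: str
    brik: str
    healthy: bool
    free_tb: float


def place_copies(nodes, copies, local_name=None):
    """Pick `copies` healthy nodes for one piece of data (illustrative only)."""
    healthy = [n for n in nodes if n.healthy]
    if len(healthy) < copies:
        raise ValueError("not enough healthy nodes for the requested copies")

    # Locality first (the local node sorts to the front), then most free space.
    ranked = sorted(healthy, key=lambda n: (n.name != local_name, -n.free_tb, n.name))

    chosen, used_briks = [], set()
    # First pass: at most one copy per Brik, to maximise redundancy.
    for n in ranked:
        if len(chosen) < copies and n.brik not in used_briks:
            chosen.append(n)
            used_briks.add(n.brik)
    # Second pass: if there are fewer Briks than copies, reuse Briks.
    for n in ranked:
        if len(chosen) < copies and n not in chosen:
            chosen.append(n)
    return [n.name for n in chosen]


cluster = [
    Node("a1", "brik-A", True, 10.0),
    Node("a2", "brik-A", True, 20.0),
    Node("b1", "brik-B", False, 50.0),  # unhealthy, never chosen
    Node("b2", "brik-B", True, 5.0),
]
print(place_copies(cluster, 2, local_name="a1"))
```

Because every peer can run the same function over the same health information, no single node has to act as the master that hands out placements.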
Atlas has a global namespace, with data stored on either flash or spinning disk and with distributed metadata (Callisto). The metadata is redundant: it is replicated across 3 nodes (on SSD) and backed up to HDD. This gives us all the information we need about the data which Rubrik ingests, and protects against any failures. Metadata provides Rubrik with its fast, Google-like search, as it is stored on SSD, and it is distributed throughout the system to ensure maximum availability.
So with Rubrik there are no forklift upgrades to gain more performance, and no scaling silos or pockets of kit. Simply add more Briks and the Rubrik cluster keeps expanding.
What about Integrity?
Hardware can be unreliable, that's just a fact. Things do fail, and the way to mitigate failure is to have a system which can tolerate it without incurring outages, lengthy rebuilds or an impact on performance. If you build for failure, then life is easier when failures happen. Web-scale companies like AWS, Azure, Google and Facebook all build their architecture to tolerate hardware failures, and they expect them.
Rubrik can handle either a dual disk failure or one node failure, but it is more the way Atlas stores your data which is of interest. You can think of Atlas as a versioned file system where you cannot modify what has been written (append-only by design). So once a snapshot is ingested, that version of the data can never be modified, and this is a benefit to you. Think about what backup is doing: it has to prove compliance and, as a more recent hot topic, it means that ransomware cannot modify your copies of data (for a post on ransomware see this post).
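If the append-only idea feels abstract, a toy version looks something like the sketch below. It is only an illustration of "versions can be added but never changed"; the class AppendOnlyStore and its methods are hypothetical names of mine and say nothing about how Atlas is actually implemented.

```python
class AppendOnlyStore:
    """Toy versioned store: new versions can be appended, old ones never change."""

    def __init__(self):
        self._versions = []

    def append(self, data: bytes) -> int:
        # Store an immutable copy and return its version number.
        self._versions.append(bytes(data))
        return len(self._versions) - 1

    def read(self, version: int) -> bytes:
        return self._versions[version]

    def overwrite(self, version: int, data: bytes) -> None:
        # There is deliberately no code path that modifies an existing version.
        raise PermissionError("ingested snapshots are immutable")


store = AppendOnlyStore()
monday = store.append(b"snapshot taken on Monday")
tuesday = store.append(b"snapshot taken on Tuesday")

try:
    store.overwrite(monday, b"encrypted by ransomware")
except PermissionError as err:
    print("write rejected:", err)

print(store.read(monday))
```

The point is that the only way to "change" data is to append a new version, so the older versions are always there to restore from.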
Ah, but what about when I perform a live mount? Surely the data is modified? Very simply, NO. If, for instance, we bring a SQL DB back to a point in time, the snapshot chain is read from but any changes are stored in a journal / log file. This makes it perfect for testing. Think of ransomware and an infection: the key is to test your data. Rubrik provides a way to spin up a copy of your virtual machine or database from an immutable point in time, with any changes written to a separate area, meaning you can test whether your data is safe and pick a restore time!
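The live mount behaviour described above can be pictured as an overlay: reads fall through to the immutable snapshot unless the block has been changed, and changes only ever land in a journal. The sketch below is my own simplification with hypothetical names (LiveMount, write, read, discard) and block-level granularity chosen for brevity; it is not Rubrik's code.

```python
class LiveMount:
    """Toy live mount: the snapshot is read-only, changes go to a journal."""

    def __init__(self, snapshot_blocks):
        self._base = tuple(snapshot_blocks)  # immutable point-in-time copy
        self._journal = {}                   # block index -> changed data

    def write(self, index: int, data: bytes) -> None:
        if not 0 <= index < len(self._base):
            raise IndexError("block index out of range")
        self._journal[index] = data          # the snapshot itself is untouched

    def read(self, index: int) -> bytes:
        # Prefer the journal, otherwise fall back to the snapshot.
        return self._journal.get(index, self._base[index])

    def discard(self) -> None:
        # Throw away the test changes; the snapshot is exactly as ingested.
        self._journal.clear()


snapshot = (b"row-1", b"row-2", b"row-3")
mount = LiveMount(snapshot)
mount.write(1, b"row-2-modified-during-test")

print(mount.read(1))   # the change is visible through the mount
print(snapshot[1])     # the underlying snapshot still holds the original
```

Discarding the journal returns you to the untouched point in time, which is why a mount like this is safe to use for testing.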
So if I lose copies of my data, how does Rubrik keep running and ensure no data loss? The answer lies with erasure coding (Reed-Solomon 4+2); prior to Rubrik version 3.0, mirroring was used. If a disk fails, erasure coding automatically kicks in to rebuild the missing data and return the cluster to full protection within hours. (Remember Rubrik is writing across multiple nodes.)
“No RAID here!”
Note: Erasure coding was added to vastly improve usable capacity, giving roughly 66% raw-to-usable while still providing a rebuild time of only hours for large drives. With erasure coding we chunk data into 4 data blocks and 2 code blocks; you can then lose any two of those six blocks and still rebuild.
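The 66% figure falls straight out of the 4 data + 2 code layout, as the small arithmetic sketch below shows. The function names and the 60 TB raw figure are hypothetical and only there for illustration; the loss check simply reflects that a Reed-Solomon stripe with 2 code blocks tolerates up to 2 missing blocks.

```python
from fractions import Fraction


def usable_fraction(data_blocks: int, code_blocks: int) -> Fraction:
    """Share of raw capacity that holds real data in one erasure-coded stripe."""
    return Fraction(data_blocks, data_blocks + code_blocks)


def usable_tb(raw_tb, data_blocks: int = 4, code_blocks: int = 2) -> float:
    """Usable capacity for a given raw capacity (ignores any other overhead)."""
    return float(raw_tb * usable_fraction(data_blocks, code_blocks))


def can_rebuild(lost_blocks: int, code_blocks: int = 2) -> bool:
    """A stripe can be rebuilt while losses do not exceed the code blocks."""
    return 0 <= lost_blocks <= code_blocks


print(usable_fraction(4, 2))   # fraction of raw capacity that is usable
print(usable_tb(60))           # usable TB from a hypothetical 60 TB raw
print(can_rebuild(2), can_rebuild(3))
```

So 4 of every 6 blocks written are real data, which is two thirds of raw capacity, and a third lost block in the same stripe is the point at which a rebuild is no longer possible.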
Atlas provides continuous validation, which means data is checked at 3 points: 1) when it is written, 2) when it is read and 3) during background / maintenance tasks. To do this Rubrik leverages 2 types of checksum (CRCs). This is good news for customers: it means Rubrik is constantly checking your data and, if needed, it can rebuild any corruption. I will leave fingerprints, which “Cerebro” takes care of, until another time; essentially it is a set of algorithms which employ a more rigorous end-to-end check.
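As a rough picture of those three checkpoints, here is a minimal sketch that stores a CRC alongside each block and verifies it on write, on read and during a background scrub. It uses a single CRC-32 from Python's standard zlib module as a stand-in; the post does not say which two checksum types Rubrik uses, and the class ChecksummedStore and its methods are hypothetical names of mine.

```python
import zlib


class ChecksummedStore:
    """Toy block store that validates a CRC on write, on read and in a scrub."""

    def __init__(self):
        self._blocks = {}  # key -> (data, crc)

    def write(self, key: str, data: bytes) -> None:
        crc = zlib.crc32(data)
        stored = bytes(data)
        # 1) Check on write: what we stored must match what we were given.
        if zlib.crc32(stored) != crc:
            raise IOError(f"checksum mismatch while writing {key}")
        self._blocks[key] = (stored, crc)

    def read(self, key: str) -> bytes:
        data, crc = self._blocks[key]
        # 2) Check on read: never hand back silently corrupted data.
        if zlib.crc32(data) != crc:
            raise IOError(f"checksum mismatch while reading {key}")
        return data

    def scrub(self):
        # 3) Background check: list every block whose CRC no longer matches.
        return [k for k, (d, c) in sorted(self._blocks.items())
                if zlib.crc32(d) != c]


store = ChecksummedStore()
store.write("block-1", b"backup data")
store.write("block-2", b"more backup data")
print(store.scrub())  # nothing to report while the data is intact
```

In a real system, anything a scrub like this reports would be rebuilt from the redundant copies or code blocks rather than just listed.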
With fault tolerance and erasure coding, bear in mind that when you grow the cluster, Rubrik is spreading the data across it under the hood to give you maximum availability, so the cluster can still run and perform its job: ingesting your backups and handling operational tasks (restores, live mounts etc.). It is doing all of this while maintaining performance and self-healing from any failures.
So yes, it's like a cat riding a unicorn, holding a golden gun and shooting rainbow flames. It's that good!
I am speaking at an event (see this LINK) covering ransomware along with many other security topics and talks. Top of my list will be discussing how, with the Atlas file system, your data will not be corrupted, and showing a demo of how to actually recover in seconds. Hope to see you there.
Please like, share, comment and ask questions!
The original talk on this subject was given at Tech Field Day 12 by the man who wrote Atlas himself, see this LINK