Monday, August 5, 2019

Understand How vSAN Protects Data

vSAN protects data in several ways. This post briefly covers each of them.

Storage Policy-Based Management

Storage Policy-Based Management (SPBM) from VMware enables precise control of storage services. Like other storage solutions, vSAN provides services such as availability levels, capacity consumption, and stripe widths for performance. A storage policy contains one or more rules that define service levels.
 
Storage policies are created and managed using the vSphere Web Client. Policies can be assigned to virtual machines and individual objects such as a virtual disk. Storage policies are easily changed or reassigned if application requirements change. These modifications are performed with no downtime and without the need to migrate virtual machines from one datastore to another. SPBM makes it possible to assign and modify service levels with precision on a per-virtual machine basis.
 
Failures to Tolerate (FTT)

FTT defines how many failures an object can tolerate before it becomes unavailable.

Fault Domains: “Fault domain” is a term that comes up often in availability discussions. In IT, a fault domain usually refers to a group of servers, storage, and/or networking components that would be impacted collectively by an outage. A common example of this is a server rack. If a top-of-rack switch or the power distribution unit for a server rack were to fail, it would take all the servers in that rack offline even though the server hardware is functioning properly. That server rack is considered a fault domain.
 
Each host in a vSAN cluster is an implicit fault domain. vSAN automatically distributes components of a vSAN object across fault domains in a cluster based on the Number of Failures to Tolerate rule in the assigned storage policy. The following diagram shows a simple example of component distribution across hosts (fault domains). The two larger components are mirrored copies of the object and the smaller component represents the witness component.
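As a rough illustration of the three-component layout described above, the sketch below computes how many replicas and fault domains a mirrored (RAID 1) object needs for a given FTT value. The function name is made up for this post, and the formula (FTT + 1 replicas, 2 x FTT + 1 fault domains) applies to mirroring only; RAID 5/6 erasure coding has different requirements.

```python
def mirroring_requirements(ftt: int) -> dict:
    """Replica and fault-domain counts for a RAID 1 (mirrored) object.

    Hypothetical helper for illustration; not part of any VMware API.
    """
    if ftt < 0:
        raise ValueError("FTT must be zero or greater")
    return {
        "replicas": ftt + 1,               # full copies of the data
        "min_fault_domains": 2 * ftt + 1,  # hosts (or racks) needed for quorum
    }


for ftt in (1, 2, 3):
    print(ftt, mirroring_requirements(ftt))
```

With FTT set to 1 this gives two replicas spread over at least three fault domains, which matches the two mirror copies plus one witness described above.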
 


To mitigate the risk of a rack-level failure, place the servers in a vSAN cluster across server racks and configure a fault domain for each rack in the vSAN UI. This instructs vSAN to distribute components across server racks to eliminate the risk of a rack failure taking multiple objects offline. This feature is commonly referred to as “Rack Awareness”. The diagram below shows component placement when three servers in each rack are configured as separate vSAN fault domains.
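The idea behind rack awareness can be sketched in a few lines. This is a simplified model, not vSAN's actual placement algorithm: it picks one host per fault domain and assigns each component to a different domain. The host names, rack labels, and the place_components function are all assumptions made for the example.

```python
def place_components(components, host_to_domain):
    """Assign each component to a host in a distinct fault domain.

    Simplified illustration only; real vSAN placement also weighs
    capacity, disk groups, and other policy rules.
    """
    first_host_per_domain = {}
    for host, domain in sorted(host_to_domain.items()):
        # Keep the first host seen in each fault domain.
        first_host_per_domain.setdefault(domain, host)
    if len(components) > len(first_host_per_domain):
        raise ValueError("not enough fault domains for the requested components")
    return dict(zip(components, first_host_per_domain.values()))


# Hypothetical six-host cluster spread over three racks.
hosts = {
    "esx01": "rack-A", "esx02": "rack-A",
    "esx03": "rack-B", "esx04": "rack-B",
    "esx05": "rack-C", "esx06": "rack-C",
}
placement = place_components(["replica-1", "replica-2", "witness"], hosts)
print(placement)
```

Because no two components share a rack, losing any single rack leaves two of the three components available.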

 

Disk Group

A disk group is a unit of physical storage capacity on a host and a group of physical devices that provide performance and capacity to the vSAN cluster. On each ESXi host that contributes its local devices to a vSAN cluster, devices are organized into disk groups.

Each disk group must have one flash cache device and one or multiple capacity devices. The devices used for caching cannot be shared across disk groups, and cannot be used for other purposes. A single caching device must be dedicated to a single disk group. In hybrid clusters, flash devices are used for the cache layer and magnetic disks are used for the storage capacity layer.

Consumed Capacity

Consumed capacity is the amount of physical capacity consumed by one or more virtual machines at any point. Many factors determine consumed capacity, including the consumed size of your VMDKs, protection replicas, and so on. When calculating for cache sizing, do not consider the capacity used for protection replicas.
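To make the effect of protection replicas concrete, here is a minimal estimate, assuming RAID 1 mirroring where each replica is a full copy of the data. The function name and the 100 GB figure are illustrative, and the estimate ignores witness components, swap objects, snapshots, and on-disk format overhead.

```python
def consumed_capacity_gb(vmdk_used_gb: float, ftt: int) -> dict:
    """Estimate capacity for a mirrored (RAID 1) virtual disk.

    Hypothetical helper; a rough estimate, not a sizing tool.
    """
    replicas = ftt + 1
    return {
        # Capacity that lands on the capacity tier, replicas included.
        "with_replicas_gb": vmdk_used_gb * replicas,
        # Basis for cache sizing, which excludes protection replicas.
        "cache_sizing_basis_gb": vmdk_used_gb,
    }


print(consumed_capacity_gb(100, ftt=1))
```

A VMDK that has consumed 100 GB takes up roughly 200 GB of raw capacity with FTT set to 1, while the cache sizing calculation still starts from 100 GB.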


Object-Based Storage

vSAN stores and manages data in the form of flexible data containers called objects. An object is a logical volume that has its data and metadata distributed across the cluster. For example, every VMDK is an object, as is every snapshot. When you provision a virtual machine on a vSAN datastore, vSAN creates a set of objects comprised of multiple components for each virtual disk. It also creates the VM home namespace, which is a container object that stores all metadata files of your virtual machine. Based on the assigned virtual machine storage policy, vSAN provisions and manages each object individually, which might also involve creating a RAID configuration for every object.

When vSAN creates an object for a virtual disk and determines how to distribute the object in the cluster, it considers the following factors:

- vSAN verifies that the virtual disk requirements are applied according to the specified virtual machine storage policy settings.
- vSAN verifies that the correct cluster resources are used at the time of provisioning. For example, based on the protection policy, vSAN determines how many replicas to create. The performance policy determines the amount of flash read cache allocated for each replica and how many stripes to create for each replica and where to place them in the cluster.
- vSAN continually monitors and reports the policy compliance status of the virtual disk. If you find any noncompliant policy status, you must troubleshoot and resolve the underlying problem.

vSAN Datastore

After you enable vSAN on a cluster, a single vSAN datastore is created. It appears as another type of datastore in the list of datastores that might be available, including Virtual Volume, VMFS, and NFS. A single vSAN datastore can provide different service levels for each virtual machine or each virtual disk. In vCenter Server, storage characteristics of the vSAN datastore appear as a set of capabilities. You can reference these capabilities when defining a storage policy for virtual machines. When you later deploy virtual machines, vSAN uses this policy to place virtual machines in the optimal manner based on the requirements of each virtual machine.

Objects and Components

Each object is composed of a set of components, determined by capabilities that are in use in the VM Storage Policy. For example, with Primary level of failures to tolerate set to 1, vSAN ensures that the protection components, such as replicas and witnesses, are placed on separate hosts in the vSAN cluster, where each replica is an object component. In addition, in the same policy, if the Number of disk stripes per object is configured to two or more, vSAN also stripes the object across multiple capacity devices and each stripe is considered a component of the specified object. When needed, vSAN might also break large objects into multiple components.

Virtual Machine Compliance Status: Compliant and Noncompliant

A virtual machine is considered noncompliant when one or more of its objects fail to meet the requirements of its assigned storage policy. For example, the status might become noncompliant when one of the mirror copies is inaccessible. If your virtual machines are in compliance with the requirements defined in the storage policy, the status of your virtual machines is compliant. From the Physical Disk Placement tab on the Virtual Disks page, you can verify the virtual machine object compliance status.


Component State: Degraded and Absent States

vSAN acknowledges the following failure states for components:

- Degraded. A component is Degraded when vSAN detects a permanent component failure and determines that the failed component cannot recover to its original working state. As a result, vSAN starts to rebuild the degraded components immediately. This state might occur when a component is on a failed device.
- Absent. A component is Absent when vSAN detects a temporary component failure where components, including all its data, might recover and return vSAN to its original state. This state might occur when you are restarting hosts or if you unplug a device from a vSAN host. vSAN starts to rebuild the components in absent status after waiting for 60 minutes.
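The difference between the two states comes down to when the rebuild starts. The sketch below models that timing, using the 60-minute wait mentioned above; the state strings and the rebuild_start function are invented for this example and do not correspond to a vSAN API.

```python
from datetime import datetime, timedelta

# Wait time before rebuilding absent components, as stated above.
ABSENT_REPAIR_DELAY = timedelta(minutes=60)


def rebuild_start(state: str, detected_at: datetime) -> datetime:
    """Return when a rebuild would begin for a failed component."""
    if state == "degraded":
        # Permanent failure: rebuild begins right away.
        return detected_at
    if state == "absent":
        # Possibly temporary failure: wait in case the component returns.
        return detected_at + ABSENT_REPAIR_DELAY
    raise ValueError(f"unknown component state: {state}")


failure_time = datetime(2019, 8, 5, 9, 0)
print(rebuild_start("degraded", failure_time))
print(rebuild_start("absent", failure_time))
```

Waiting on absent components avoids copying large amounts of data for something as routine as a host reboot.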

Object State: Healthy and Unhealthy

Depending on the type and number of failures in the cluster, an object might be in one of the following states:

- Healthy. When at least one full RAID 1 mirror is available, or the minimum required number of data segments are available, the object is considered healthy.
- Unhealthy. An object is considered unhealthy when no full mirror is available or the minimum required number of data segments are unavailable for RAID 5 or RAID 6 objects. If fewer than 50 percent of an object's votes are available, the object is unhealthy. Multiple failures in the cluster can cause objects to become unhealthy. When the operational status of an object is considered unhealthy, it impacts the availability of the associated VM.

Witness

A witness is a component that contains only metadata and does not contain any actual application data. It serves as a tiebreaker when a decision must be made regarding the availability of the surviving datastore components, after a potential failure. A witness consumes approximately 2 MB of space for metadata on the vSAN datastore when using on-disk format 1.0, and 4 MB for on-disk format version 2.0 and later.

vSAN 6.0 and later maintains a quorum by using an asymmetrical voting system where each component might have more than one vote to decide the availability of objects. Greater than 50 percent of the votes that make up a VM’s storage object must be accessible at all times for the object to be considered available. When 50 percent or fewer votes are accessible to all hosts, the object is no longer accessible to the vSAN datastore. Inaccessible objects can impact the availability of the associated VM.
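The quorum rule is easy to check by hand, and the following sketch does the same arithmetic in code. The component names, the vote counts, and the object_accessible function are assumptions for illustration; actual vote assignment is decided by vSAN.

```python
def object_accessible(votes: dict, reachable: set) -> bool:
    """Apply the quorum rule: strictly more than 50 percent of votes.

    `votes` maps a component name to its vote count; `reachable` is the
    set of component names that are currently accessible.
    """
    total = sum(votes.values())
    available = sum(count for name, count in votes.items() if name in reachable)
    # Integer comparison avoids floating-point rounding at exactly 50 percent.
    return 2 * available > total


# Hypothetical FTT=1 mirrored object: two replicas and a witness, one vote each.
votes = {"replica-1": 1, "replica-2": 1, "witness": 1}
print(object_accessible(votes, {"replica-1", "witness"}))
print(object_accessible(votes, {"replica-1"}))
```

With one replica and the witness reachable, two of three votes are present and the object stays accessible. With only a single replica reachable, the object loses quorum even though a full copy of the data still exists, which is why the witness matters.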

I hope this has been informative and thank you for reading!
