File System Utilization Misleading Indicator of Array Efficiency
-
Since storage arrays do not keep track of usage, array utilization has been a mystery to many storage teams. Usage should be defined as storage with data written to it, either directly from file systems or applications or as a result of some form of data protection or replication. To attempt to understand the utilization of their storage systems, some storage teams have deployed agents from their Storage Resource Management (SRM) tools to all their application servers to collect file system usage information. Typically the user must then export this information into a spreadsheet and manually correlate the file system information with each storage array LUN, a painstaking and error-prone process.
This approach incorrectly assumes that the utilization of the arrays is equal to the utilization of the file systems. MonoSphere® has evaluated tens of petabytes of storage information and has consistently found that the utilization of the file systems logically connected to the array through the SAN is not representative of the utilization of the array. There are many reasons for this; two common examples are:
- File system utilization does not account for dark storage – storage that is either not claimed by application servers or not assigned to a file system to which applications have written data. Accounting for dark storage results in a much lower utilization figure.
- File system utilization does not fully account for the entire LUN. Imagine a file system with RAID 4 data protection created on a volume group containing four LUNs, where the file system is striped across three of the LUNs and the fourth LUN holds parity. If the file system holds 75 GB of data, and the file system and each LUN are 100 GB in size (meaning that the file system is striped across a 33.3 GB partition from each LUN), the utilization of the file system would be 75 percent, but each LUN in the volume group would be only 25 percent utilized.
As a result, MonoSphere has seen the utilization of storage arrays to be around 50 percent of the utilization of the file systems on application servers. For example, if the overall utilization of your file systems is 60 percent, array utilization will typically be somewhere between 25 and 35 percent.
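The arithmetic behind the RAID 4 example above can be checked with a short sketch. The function name and parameters are illustrative assumptions, not part of any MonoSphere tool, and the sketch assumes data is striped evenly across the data LUNs and that the parity LUN holds parity only for the stripes that have been written.

```python
def raid4_utilization(fs_used_gb, fs_size_gb, data_luns, lun_size_gb):
    """Return (file system utilization, per-LUN utilization) as fractions.

    Hypothetical helper: assumes the file system is striped evenly
    across `data_luns` LUNs of `lun_size_gb` each.
    """
    fs_util = fs_used_gb / fs_size_gb
    # Each data LUN holds an equal share of the written data; under
    # RAID 4 the parity LUN holds a matching amount of parity.
    used_per_lun_gb = fs_used_gb / data_luns
    lun_util = used_per_lun_gb / lun_size_gb
    return fs_util, lun_util


# Values from the example: 75 GB used in a 100 GB file system,
# striped across three 100 GB LUNs (a fourth LUN holds parity).
fs_util, lun_util = raid4_utilization(75, 100, 3, 100)
print(f"file system: {fs_util:.0%}, each LUN: {lun_util:.0%}")
```

With these inputs the file system reports 75 percent utilization while each LUN is only 25 percent utilized, which is the gap a file-system-only view hides.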
MonoSphere's Storage Horizon® automatically determines current and predicts future storage array usage, enabling storage teams to more efficiently manage their arrays, resulting in lower capital spending and less administration overhead.
Strategies for Setting Snapshot Reserves
-
One of the tough challenges for storage teams is determining how large to make snapshot reserves. Setting the snapshot reserve too large wastes a great deal of storage, since nothing else can use the reserved space. Setting the snapshot reserve too small could cause snapshots (or other data writes) to fail if snapshot usage consumes all the space available in the volume. MonoSphere has observed customers successfully use the two philosophies below in NetApp® environments.
- Set the snapshot reserve to zero and closely manage the space available in the volume (and in the aggregate in non-space-guaranteed environments). This philosophy ensures that excessive snapshot reserves do not waste storage; however, daily management of the available volume space is required to ensure that the volume does not run out of storage.
- Set the snapshot reserve to 115 percent of peak snapshot usage. This philosophy ensures that snapshots have enough space to complete and that snapshot usage won't interfere with other storage usage; however, daily management of the snapshot reserve size is required to ensure that it stays the right size in relation to snapshot usage.
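As a rough illustration of the second philosophy, the sketch below sizes a reserve at 115 percent of the peak observed snapshot usage and expresses it as a percentage of the volume. The function names, the sample usage figures, and the 500 GB volume size are all hypothetical; in practice the usage history would come from whatever monitoring is already in place.

```python
def recommended_reserve_gb(snapshot_usage_gb, headroom_percent=115):
    """Size the reserve as a percentage of peak snapshot usage.

    Hypothetical helper: `snapshot_usage_gb` is a list of observed
    snapshot usage samples in GB (for example, one per day).
    """
    peak_gb = max(snapshot_usage_gb)
    return peak_gb * headroom_percent / 100


def reserve_as_volume_percent(reserve_gb, volume_size_gb):
    """Express a reserve size as a percentage of the volume size."""
    return 100 * reserve_gb / volume_size_gb


# Made-up daily snapshot usage samples for a 500 GB volume.
daily_usage_gb = [40, 52, 47, 60, 55]
reserve_gb = recommended_reserve_gb(daily_usage_gb)
reserve_pct = reserve_as_volume_percent(reserve_gb, 500)
print(f"reserve: {reserve_gb:.1f} GB ({reserve_pct:.1f}% of volume)")
```

Because the result depends on the peak, the calculation needs to be rerun whenever new usage samples arrive, which is the daily management effort this philosophy requires.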
MonoSphere's Storage Horizon provides the automated daily monitoring to assist storage teams with either of these two philosophies.