Tuesday, March 24, 2009

NetApp RAW to usable space calculation formula

All of a sudden today I got the question "Is there any quick formula to calculate the usable capacity of a NetApp filer?" In all my career I had never come across such a formula, so I curiously started hunting on the web and found a number of sites offering such a tool, but most of them were either dead or pointed to a tool developed by Nick Bernstein. It is a small tool available for free download: you select the disk type, RAID type and some other options, and it gives you a usable-space number from your raw space. Although it gives very close figures, and sometimes the exact number, I found it is not the whole story, as there are some more constraints one has to keep in mind while quoting usable space.

So I kept hunting and came across a blog post from Jim of HP, where he criticises NetApp's less-than-clear approach of not publishing any such formula, followed by a number of comments from some NetApp big-shots.

After going through the post and the accusations each side made about the other's product, I found an easy-to-understand formula with an example posted by a NetApp guy. Below is the formula they gave, to which I have added the points I think are required while calculating usable size.

First things first: as we all know, the space advertised by manufacturers is in base-10 units, which is just a marketing trick, whereas the system actually sees the disk in base-2 units once you connect it. So I will use GB in base 2, which is the space we can actually use.

Let's take the example of 20 FC disks of 144GB each, which after converting to base-2 numbers comes to 136,000MB / 1024 = 132.8GB per disk, so 132.8GB x 20 = 2656GB.
(You can check the base-2 size of each disk with the sysconfig -r command.)
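If you want to script this conversion, here is a minimal Python sketch of the step above; the 136,000MB right-sized figure is taken from the example, and sysconfig -r remains the authoritative source for your own disks:

# Convert a right-sized disk capacity from MB to base-2 GB,
# then total it across the shelf (values from the example above).
RIGHT_SIZED_MB = 136000   # 144GB FC disk as reported by sysconfig -r
DISK_COUNT = 20

disk_gb = RIGHT_SIZED_MB / 1024          # ~132.8GB per disk in base 2
total_raw_gb = disk_gb * DISK_COUNT      # ~2656GB raw across 20 disks
print(f"{disk_gb:.1f} GB per disk, {total_raw_gb:.0f} GB raw")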

Now we will use the disks in RAID-DP, so each RAID group will reserve 2 disks for parity, which leaves us with (20 - 2) x 132.8GB = 2390.4GB, roughly 2390GB.
(Please note that here I have taken the example of FC disks, where the maximum number of disks per RAID group is 28; check the NetApp Storage Management Guide under the topic "Maximum and Default RAID Group Size", or see the online version on the NOW site.)
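The parity deduction is simple arithmetic; a quick sketch, assuming a single RAID group as in the example (the post's rounded 132.8GB per-disk figure is reused here):

# RAID-DP reserves 2 parity disks per RAID group; 20 FC disks fit in
# one RAID group (maximum 28 for FC), so a single group is assumed.
disk_gb = 132.8                          # per-disk base-2 size from above
data_disks = 20 - 2                      # disks minus the RAID-DP parity pair
after_parity_gb = data_disks * disk_gb   # 18 * 132.8 = 2390.4GB
print(f"{after_parity_gb:.1f} GB after parity")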

The NetApp system stores an additional 8-byte checksum for every 512 bytes of data, which is ~1.5% overhead, or about 35GB in this case. So we are left with 2355GB.

Now reduce 10% for WAFL overhead, which brings it to 2120GB.
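The two percentage overheads simply chain on top of the post-parity figure; a sketch using the numbers above (the small difference from the post's 2120GB is just rounding):

# Chain the checksum and WAFL overheads on the post-parity capacity.
after_parity_gb = 2390.4                            # from the parity step
after_checksum_gb = after_parity_gb * (1 - 0.015)   # ~1.5% checksum: ~2355GB
usable_gb = after_checksum_gb * (1 - 0.10)          # 10% WAFL: ~2120GB
print(f"{after_checksum_gb:.0f} GB after checksums, {usable_gb:.0f} GB usable")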

Now change the default aggregate snapshot reserve from 2% to 0%. Why? Because aggregate-level snapshots are primarily used for MetroCluster, so with the reserve set to 0% there is nothing further to subtract here.

So to summarize, here is an easy step-by-step calculation (a Python sketch of the full calculation follows the list):
1. Check the available disk capacity in base-2 numbers (a)
2. Number of disks - hot spares - parity disks required by the RAID groups (depends on RAID type and disk type/count) = number of disks that will be used in the aggregate (b)
3. a * b = raw space available (c)
4. c - 1.5% = space after the additional checksum overhead (d)
5. d - 10% = space after WAFL overhead (e)
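Putting the whole thing together, here is a minimal sketch of the five steps; the function name and default percentages are mine, using the FC/BCS figures from this post:

# Steps 1-5 above: raw base-2 space minus checksum and WAFL overheads.
def usable_space_gb(disk_size_gb, total_disks, hot_spares, parity_disks,
                    checksum_overhead=0.015, wafl_overhead=0.10):
    data_disks = total_disks - hot_spares - parity_disks  # step 2 (b)
    raw_gb = disk_size_gb * data_disks                    # step 3 (c)
    after_checksum_gb = raw_gb * (1 - checksum_overhead)  # step 4 (d)
    return after_checksum_gb * (1 - wafl_overhead)        # step 5 (e)

# The worked example: 20 x 132.8GB FC disks, no spares, one RAID-DP parity pair
print(f"{usable_space_gb(132.8, 20, 0, 2):.0f} GB")      # ~2119GB usable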

So in a nutshell, about 88.65% of the raw base-2 space is usable, since (1 - 0.015) x (1 - 0.10) = 0.985 x 0.90 = 0.8865.

NetApp's own link on raw-to-usable conversion is a good read if you want to know anything further.

Update:
On SATA disks the space used for checksums with the BCS type is more than 11%, but if you use ZCS the net loss from checksums is a little under 2%, which is consumed from the WAFL reserve.
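For comparison, a sketch of the two SATA checksum schemes mentioned above; the 1-in-9 BCS ratio is my assumption for the "more than 11%" figure (one checksum sector per eight 512-byte data sectors), and the ZCS figure is the post's "a little under 2%":

BCS_OVERHEAD = 1 / 9    # ~11.1%: assumed 1 checksum sector per 8 data sectors
ZCS_OVERHEAD = 0.02     # "a little under 2%", consumed from the WAFL reserve

raw = 1000.0            # arbitrary raw SATA GB, just for comparison
print(f"BCS: {raw * (1 - BCS_OVERHEAD):.0f} GB, "
      f"ZCS: {raw * (1 - ZCS_OVERHEAD):.0f} GB left after checksums")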
Update:
Post updated following the changes in the ONTAP 8.1 release.

5 comments:

Anonymous said...

I should add that "usable" space gets reduced even more with block-level protocols (iSCSI/FCP) when using LUNs, since the default of 100% fractional reserve allocates the same amount of space as your LUN for FR (i.e. a 200GB LUN means 200GB used for FR as soon as you take a snapshot).

This reduces your "usable" space even further.

Reducing the FR is an option, but comes with risks, obviously.

Unknown said...

Yes, you are right, but that gives you peace of mind: once you have snapshots enabled, then no matter how many changes you make on a LUN, it will always have space to write and will not run out of space. Say you have a 100GB LUN and you are using snapshots to revert to the original state of the LUN, even after you fill the LUN, delete everything and fill it again, or run indexing on Exchange or other types of data stored on that LUN. So in a nutshell, if you want to go to the extreme and have an insurance policy, then you have to put your fat wallet on a diet.

Дмитрий said...

You forgot about spare disks (best practice: 2 disks per disk type per controller) and about the root vol (Data ONTAP needs some room on the root vol for itself, plus the mailbox disks need some space on the root vol).

Unknown said...

No, I didn't forget about the spare and root volume disks; I just didn't add them because their number is fixed and deriving a percentage for them is highly variable. For example, if your system has 1000 disks, then 3 (root volume) + 2 (spares) = 5 disks will account for just 0.5%, whereas if you have only 10 disks they will account for 50%.

Hope you got the point of not adding them to my calculation formula.

Cheers,

Anonymous said...

Very nice blog. Thank you!