Monday, December 7, 2009

Restoring data from snapshot through snaprestore in NetApp

OK, so now you have allocated the correct snap reserve space and configured snap schedules and snap autodelete, users have access to their snapshots, and they recover their own data without any involvement from the backup team. Everyone is happy, so you are happy. Then, all of a sudden, on a Friday evening you get a call from the VP of marketing, crying on the phone that he has lost all the data on his network drive. Windows shows a recovery time of 2 hours, but he wants his 1 GB PST to be accessible now, because he is on VPN with a client and needs to pull some old mails from it. That’s nothing abnormal: he had a lot of data, and to recover it Windows has to read everything from the snapshot and then write it back to the network drive, which will obviously take time. Now what would you say? Will you tell him to navigate to his PST and recover that first (which shouldn’t take long on a fast connection) and then try to recover the rest, or will you say “OK, I have recovered all your data while we were talking on the phone” and become the hero?

Well, I must say I would like to use the opportunity to become the hero with a minute or less of work, but before we do that, here are a few things to note.

For volume snaprestore:

  • The volume must be online and must not be a mirror.
  • When reverting the root volume, the filer will be rebooted.
  • Non-root volumes do not require a reboot; however, when reverting a non-root volume, all ongoing access to the volume must be terminated, just as is done when a volume is brought offline.

For single-file snaprestore:

  • The volume used for restoring the file must be online and must not be a mirror.
  • If restore_as_path is specified, the path must be a full path to a filename, and must be in the same volume as the volume used for the restore.
  • Files other than normal files and LUNs are not restored. This includes directories (and their contents), and files with NT streams.
  • If there is not enough space in the volume, the single file snap restore will not start.
  • If the file already exists (in the active file system), it will be overwritten with the version in the snapshot.

There are two ways to restore data. First, system admins can use the “snap restore” command, invoked from SMO, SMVI, FilerView or the system console. Second, end users can restore by copying files from the .snapshot or ~snapshot directory, or by using the revert function in Windows XP or newer systems. However, restoring data through the snap restore command is very quick (seconds), even for TBs of data. The syntax for snap restore is as below.

“snap restore -t vol|file -s <snapshot_name> -r <restore-as-path> <volume_name | path_to_file>”

Use “-t vol” with a volume name to revert a whole volume, or “-t file” with the full path of a file to restore a single file. The “-r <restore-as-path>” argument applies to single-file restores; if you don’t want to restore the file to a different place, leave it out and the filer will replace the current file with the version in the snapshot. If you don’t provide a snapshot name, the system will show you all available snapshots and prompt you to select the one you want to restore from.
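The way the arguments combine can be sketched as a small Python helper that only assembles the command string; it does not talk to a filer. The function name `build_snap_restore` and its parameters are made up for illustration, and the sketch assumes paths without spaces.

```python
from typing import Optional


def build_snap_restore(target: str,
                       restore_type: str = "file",
                       snapshot: Optional[str] = None,
                       restore_as: Optional[str] = None) -> str:
    """Assemble a 'snap restore' command line as a plain string.

    target       -- volume name (for 'vol') or full file path (for 'file')
    snapshot     -- snapshot name; if omitted the filer prompts for one
    restore_as   -- alternate full path, single-file restores only
    """
    if restore_type not in ("vol", "file"):
        raise ValueError("restore_type must be 'vol' or 'file'")
    if restore_as is not None and restore_type != "file":
        raise ValueError("-r is only meaningful for a single-file restore")

    parts = ["snap", "restore", "-t", restore_type]
    if snapshot:
        parts += ["-s", snapshot]
    if restore_as:
        parts += ["-r", restore_as]
    parts.append(target)
    return " ".join(parts)


print(build_snap_restore("/vol/testvol/RootQtree/test.pst", snapshot="nightly.5"))
# prints: snap restore -t file -s nightly.5 /vol/testvol/RootQtree/test.pst
```

Leaving out `snapshot` gives the interactive form, where the filer lists the snapshots and asks which one to use.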

Here’s the simplest form of this command, as an example of recovering a file.

testfiler> snap restore -t file /vol/testvol/RootQtree/test.pst

WARNING! This will restore a file from a snapshot into the active filesystem. If the file already exists in the active filesystem, it will be overwritten with the contents from the snapshot.

Are you sure you want to do this? yes

The following snapshots are available for volume testvol:

date            name
------------    ---------
Nov 17 13:00    hourly.0
Nov 17 11:00    hourly.1
Nov 17 09:00    hourly.2
Nov 17 00:00    weekly.0
Nov 16 21:00    hourly.3
Nov 16 19:00    hourly.4
Nov 16 17:00    hourly.5
Nov 16 15:00    hourly.6
Nov 16 00:00    nightly.0
Nov 15 00:00    nightly.1
Nov 14 00:00    nightly.2
Nov 13 00:00    nightly.3
Nov 12 00:00    nightly.4
Nov 11 00:00    nightly.5
Nov 10 00:00    weekly.1
Nov 09 00:00    nightly.6
Nov 03 00:00    weekly.2
Oct 27 00:00    weekly.3

Which snapshot in volume testvol would you like to revert the file from? nightly.5

You have selected file /vol/testvol/RootQtree/test.pst, snapshot nightly.5

Proceed with restore? yes
testfiler>

Sunday, December 6, 2009

Snapshot configuration in NetApp

OK, first of all let me admit that my last post sounded more like a sales pitch than something technical, though I am not a NetApp employee, nor am I paid by anyone to blog. However, I must say that whatever I tried to show there is pretty much available in similar form from other vendors, so it was more about general awareness of the technology than about a particular vendor. In this post I will talk about Snapshot configuration and other functions in NetApp, so let’s start.

What is snapshot copy?

A Snapshot copy is a frozen, read-only image of a volume or an aggregate that captures the state of the file system at a point in time. Each volume can hold a maximum of 255 Snapshot copies at one time.

Snapshots can be taken by the system on a pre-defined schedule, by Protection Manager policies, SMO, SMVI or FilerView, or by manually running a command at the system console or through custom scripts.

How to disable client access to snapshot copy?

To disable client access to the .snapshot directory of a volume, use the “vol options <volume_name> nosnapdir on” command.

Notes:

  • Please DO NOT use any of the snap family of commands without a volume name. On systems with lots of volumes and many snapshots it may drive the CPU to its peak and can hang the system, which may result in a system panic.
  • Use “-A” if you want to run these commands against an aggregate, and replace the volume name with the aggregate name.

How to Configure Snapshots through system console?

It’s always recommended that when you provision a volume you look at the snapshot reserve and schedule. By default, 20% of the space is reserved for snapshots when a volume is created, and most of the time you need to change that for efficient usage of space and snapshots. Always ask the requester what the rate of change is, how many snapshots he wants to have access to and when he wants the snapshots to be taken, because if you take a snapshot of some Oracle data while the database is not in hot-backup mode, it’s just an utter waste, and the same goes for VMs.

So once you have those details, do a little calculation and then use these commands to configure.

  1. ‘snap reserve <volume name> <snapshot reserve as a percentage of volume size>’
    Example:
    ‘snap reserve testvol 10’
    This command will allocate 10% of the space for snapshots on volume “testvol”.
  2. ‘snap sched <volume name> <weeks> <days> <hours>@<list>’
    Example:
    ‘snap sched testvol 4 7 7@9,11,13,15,17,19,21’

    This command defines the automatic snapshot schedule. Here you specify how many weekly, daily and hourly snapshots you want to retain, as well as the times at which hourly snapshots will be taken. In the given example, volume testvol keeps 4 weekly, 7 daily and 7 hourly snapshots, where hourly snapshots are taken at 9, 11, 13, 15, 17, 19 and 21 hours of system local time. Please make sure that ‘nosnap’ is set to off in the volume options.
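To make the layout of the schedule string concrete, here is a minimal Python sketch that splits a spec such as the one in the example into its parts. The function name `parse_snap_sched` is hypothetical, and the sketch assumes all three fields are present, as they are in the example above.

```python
def parse_snap_sched(spec: str) -> dict:
    """Split a schedule spec like '4 7 7@9,11,13,15,17,19,21'.

    Returns how many weekly, nightly and hourly snapshots are kept,
    plus the list of hours at which hourly snapshots are taken.
    """
    weekly, nightly, hourly = spec.split()
    count, _, hour_list = hourly.partition("@")
    return {
        "weekly": int(weekly),
        "nightly": int(nightly),
        "hourly": int(count),
        # An empty list here only means no '@list' was given in the spec.
        "hours": [int(h) for h in hour_list.split(",")] if hour_list else [],
    }


sched = parse_snap_sched("4 7 7@9,11,13,15,17,19,21")
print(sched["weekly"], sched["nightly"], sched["hourly"], sched["hours"])
# prints: 4 7 7 [9, 11, 13, 15, 17, 19, 21]
```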

How to take snapshots manually?

To take a snapshot manually, run the command below.

“snap create <volume name> <snapshot name>”

Here volume name is the name of the volume you want to take a snapshot of, and snapshot name is the name you want to identify the snapshot with.

How to list snapshots?

You can check the status of the snapshots associated with any volume with the command

“snap list <volume name>”

After issuing the above command you will get output similar to this

testfiler> snap list testvol

Volume testvol
working...

  %/used       %/total  date          name
----------  ----------  ------------  --------
 36% (36%)    0% ( 0%)  Dec 02 16:00  hourly.0
 50% (30%)    0% ( 0%)  Dec 02 12:00  hourly.1
 61% (36%)    0% ( 0%)  Dec 02 08:00  hourly.2
 62% ( 5%)    0% ( 0%)  Dec 02 00:01  nightly.0
 69% (36%)    0% ( 0%)  Dec 01 20:00  hourly.3
 73% (36%)    0% ( 0%)  Dec 01 16:00  hourly.4
 77% (36%)    0% ( 0%)  Dec 01 00:01  nightly.1

What if you are running low on snap reserve?

Sometimes, due to an excessive rate of change in the data, the snapshot reserve fills up very quickly and snapshots spill over into the data area of the volume. To remediate this you have to either extend the volume or delete old snapshots.

To resize the volume use the “vol size” command, and to delete old snapshots use the “snap delete” command, which I will cover in the next section. However, before deleting, if you want to check how much free space you can gain from a snapshot, use the command below.

“snap reclaimable <volume name> <snapshot name> [<snapshot name> …]”

Running the above command will give you output as below. You can add multiple snapshot names one after another if deleting one snapshot does not give you the required free space. Please note that you should select snapshots for deletion only in oldest-to-latest order; otherwise, blocks freed by deleting a middle snapshot will still be locked by the snapshot that follows it.

testfiler> snap reclaimable testvol nightly.1 hourly.4
Processing (Press Ctrl-C to exit) ............
snap reclaimable: Approximately 9572 Kbytes would be freed.
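Why a middle snapshot frees so little can be pictured with a toy model in Python. This is only an illustration, not how the filer computes the figure: each snapshot is modelled as a set of block numbers it references, and a block is freed only when no remaining snapshot and no active file references it. The function name `reclaimable` and the block numbers are made up.

```python
def reclaimable(snapshots: dict, active: set, to_delete: list) -> set:
    """Return the blocks that would be freed by deleting `to_delete`.

    A block is freed only if nothing else still points at it: not the
    active file system and not any snapshot that is being kept.
    """
    doomed = set().union(*(snapshots[name] for name in to_delete))
    still_held = set(active)
    for name, blocks in snapshots.items():
        if name not in to_delete:
            still_held |= blocks
    return doomed - still_held


# Oldest to newest; neighbouring snapshots share most of their blocks.
snaps = {
    "nightly.1": {1, 2, 3},
    "hourly.4": {2, 3, 4},
    "hourly.3": {3, 4, 5},
}
active = {4, 5, 6}

print(reclaimable(snaps, active, ["hourly.4"]))               # set()
print(reclaimable(snaps, active, ["nightly.1"]))              # {1}
print(reclaimable(snaps, active, ["nightly.1", "hourly.4"]))  # {1, 2}
```

In this toy data, deleting only the middle snapshot frees nothing because its neighbours still hold every one of its blocks, while deleting from the oldest end frees blocks at each step.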

How to delete snapshot?

To delete a snapshot, use the snap delete command with the volume name and snapshot name, in the fashion below.

“snap delete <volume name> <snapshot name>”

Running this command will print similar information on screen

testfiler> snap delete testvol hourly.5

Wed Dec 2 16:58:29 GMT [testfiler: wafl.snap.delete:info]: Snapshot copy hourly.5 on volume testvol NetApp was deleted by the Data ONTAP function snapcmd_delete. The unique ID for this Snapshot copy is (67, 3876).

How to know what is the actual rate of change?

Sometimes a particular volume will run out of snap reserve space very often, because snapshots fill it up well before the old snaps expire and get deleted by the autodelete function (if you have configured it), and you will want to resize the snap reserve accurately to avoid any issues. To check the actual rate of change in KB per hour, calculated across all the snapshots or between two snapshots on a given volume, you can use the snap delta command.

“snap delta <volume name> [<1st snapshot name> <2nd snapshot name>]”

testfiler> snap delta testvol

Volume testvol
working...

From Snapshot   To                   KB changed  Time         Rate (KB/hour)
--------------- -------------------- ----------- ------------ ---------------
hourly.0        Active File System   30044          0d 00:28  63176.635
hourly.1        hourly.0             552            0d 02:00  276.000
hourly.2        hourly.1             552            0d 01:59  276.115
weekly.0        hourly.2             628            0d 09:00  69.680
hourly.3        weekly.0             468            0d 03:00  155.956
hourly.4        hourly.3             552            0d 01:59  276.115
hourly.5        hourly.4             500            0d 02:00  249.895
hourly.6        hourly.5             548            0d 01:59  274.038
nightly.0       hourly.6             560            0d 14:59  37.334
nightly.1       nightly.0            700            0d 23:59  29.171
nightly.2       nightly.1            5392           1d 00:00  224.666
nightly.3       nightly.2            820            0d 23:59  34.172
nightly.4       nightly.3            2920           0d 23:59  121.687
nightly.5       nightly.4            880            1d 00:00  36.666
weekly.1        nightly.5            1111956        1d 00:00  46307.381
nightly.6       weekly.1             632            1d 00:00  26.333
weekly.2        nightly.6            42420          6d 00:00  294.583
weekly.3        weekly.2             8892           7d 00:00  52.928

Summary...

From Snapshot   To                   KB changed  Time         Rate (KB/hour)
--------------- -------------------- ----------- ------------ ---------------
weekly.3        Active File System   1209016       21d 13:29  2336.320
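The Rate column is simply KB changed divided by elapsed hours, and the same figure can feed a rough reserve estimate. The Python sketch below shows the arithmetic. The helper names are hypothetical, and multiplying the average rate by the retention window is my own rule of thumb rather than anything the filer calculates. The output shows times to the minute only, so results can differ slightly from the filer’s own Rate figures.

```python
import re


def elapsed_hours(span: str) -> float:
    """Convert a snap delta time such as '21d 13:29' into hours."""
    match = re.fullmatch(r"(\d+)d\s+(\d+):(\d+)", span.strip())
    if match is None:
        raise ValueError(f"unexpected time format: {span!r}")
    days, hours, minutes = map(int, match.groups())
    return days * 24 + hours + minutes / 60


def rate_kb_per_hour(kb_changed: int, span: str) -> float:
    """Average change rate over the interval, in KB per hour."""
    return kb_changed / elapsed_hours(span)


def reserve_estimate_kb(rate: float, retention_days: int) -> float:
    """Rough guess: space needed if changes keep arriving at `rate`
    for as long as the oldest snapshot is retained."""
    return rate * 24 * retention_days


print(rate_kb_per_hour(552, "0d 02:00"))   # 276.0, matching the hourly.1 row

summary_rate = rate_kb_per_hour(1209016, "21d 13:29")
# Estimated snapshot space in KB for a 28-day (4 weekly) retention window.
print(round(reserve_estimate_kb(summary_rate, 28)))
```

With the summary row above and a 28-day retention this comes out at roughly 1.5 GB, which you would then compare with the reserve actually configured on the volume.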

 

That was all about configuring, creating and deleting snapshots, but what good is it if you don’t know how to restore the data from the snapshots you have put so much work into? So, in the next post I will address how to restore data from a snapshot through the snap restore command.

Friday, December 4, 2009

Snapshots in NetApp

Volumes and data:

The volume used for the test was a flexible volume named ‘buffer_aggr12’. For data I used the “My Documents” folder from my laptop, with the sync tools from Microsoft keeping ‘My Documents’ in sync with a CIFS share created from volume buffer_aggr12.

Snapshot configuration:

Scheduled snapshots were configured at 9, 11, 13, 15, 17, 19 and 21 hours, and the retention was 4 weekly, 7 daily and 7 hourly snapshots, with a 20% space reserve for snapshots.

The coolest part of snapshots is flexibility. As an administrator, once you have configured them you no longer have to look after them: the filer takes snapshots on the defined schedule, and if you have configured ‘snap autodelete’ it will also purge expired snapshots as per your retention period. So effectively you never have to worry about managing hundreds of old snapshots lying in the volume and eating up space (except when the change rate of the data overshoots and snapshots start spilling into the data area). As an end user, your backups are just a click away, because snapshots integrate well with the shadow copy services of Windows 2000, XP or Vista and you can recover from them whenever you need.

Here’s the configuration of snapshot for my test volume ‘buffer_aggr12’

AMSNAS02> snap sched buffer_aggr12
Volume buffer_aggr12: 4 7 7@9,11,13,15,17,19,21

AMSNAS02> snap reserve buffer_aggr12
Volume buffer_aggr12: current snapshot reserve is 20% or 157286400 k-bytes.
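The two figures in that output are enough to work back to the volume size. Here is a short Python sketch of the arithmetic; the function names are made up for illustration, and the conversion assumes binary multiples (1 GB = 1024 × 1024 KB).

```python
def volume_size_kb(reserve_kb: int, reserve_percent: int) -> float:
    """Total volume size implied by a reserve size and its percentage."""
    return reserve_kb * 100 / reserve_percent


def reserve_kb(total_kb: float, reserve_percent: int) -> float:
    """Reserve size for a given volume size and percentage."""
    return total_kb * reserve_percent / 100


total_kb = volume_size_kb(157286400, 20)
print(total_kb)                  # 786432000.0
print(total_kb / 1024 / 1024)    # 750.0 (GB)
print(reserve_kb(total_kb, 10))  # 78643200.0, i.e. 75 GB at a 10% reserve
```

So the 20% reserve of 157286400 KB corresponds to a 750 GB volume with 150 GB set aside for snapshots.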

As I had been running this test for months, there were enough snaps for me to play with. You can see below that these snapshots go all the way back to 20th July, which is the 4-week-old snapshot, and at any time I can recover from it with just a right click.

How to recover files or folders from snapshot:

There are two ways to recover the data from snapshots.

As an end user you can recover your data from Windows Explorer by just right-clicking in an empty space while you are in the share in which you lost your data. Here’s an example of this.

a) This is a screenshot of my share folder. As you can see, my pst file is corrupted and shows 0 KB.
image

b) To recover this, right-click on any empty area and go to Properties > Previous Versions. It shows me all the snapshots taken for this folder, as shown in the screenshot below.

image image

c) Now at this point I can either revert the whole folder to a previous state or copy it to another location to recover a deleted file, but my point here is to revert a corrupted file rather than recover a deleted one. So I will just right-click on that file and navigate to the Previous Versions tab in the Properties dialog box. It shows me the changes captured by snapshots at different times, so I can just select the date I want to revert to and click Restore.
image

d) Now it starts replacing the corrupted file with the one from the snapshot. It’s taking a long time because the file in question is >1 GB in size and I am on a WAN link, so it’s slow. There is another way to do it, and that’s recovering directly from the filer console, which recovers in seconds but is unfortunately not available to end users.
image

e) Now here’s the screenshot of my before and after.

image image

As an administrator you can recover a file or a whole volume within seconds, because when you do it from the filer console the system doesn’t have to copy the old file from the snapshot to a temp location, delete the old file and then change the recovered file’s metadata. Instead it just changes the block pointers internally, so it’s blazing fast. Here’s an example of this.

a) In this test I will again use the same pst file, which is corrupted, but this time we will recover it from the console. So first log in to the filer and do a snap list to see which snapshots are available.
AMSNAS02> snap list buffer_aggr12
Volume buffer_aggr12
working…
  %/used       %/total  date          name
----------  ----------  ------------  --------
  0% ( 0%)    0% ( 0%)  Aug 14 17:00  hourly.0
  0% ( 0%)    0% ( 0%)  Aug 14 15:00  hourly.1
 40% (40%)    0% ( 0%)  Aug 14 13:00  hourly.2
 40% ( 0%)    0% ( 0%)  Aug 14 11:00  hourly.3
 40% ( 0%)    0% ( 0%)  Aug 14 09:00  hourly.4
 40% ( 0%)    0% ( 0%)  Aug 14 00:00  nightly.0
 40% ( 0%)    0% ( 0%)  Aug 13 21:00  hourly.5
 40% ( 0%)    0% ( 0%)  Aug 13 19:00  hourly.6
 40% ( 0%)    0% ( 0%)  Aug 13 00:00  nightly.1
 41% ( 0%)    0% ( 0%)  Aug 12 00:00  nightly.2
 57% (39%)    0% ( 0%)  Aug 11 00:00  nightly.3
 57% ( 0%)    0% ( 0%)  Aug 10 00:00  weekly.0
 57% ( 0%)    0% ( 0%)  Aug 09 00:00  nightly.4
 57% ( 0%)    0% ( 0%)  Aug 08 00:00  nightly.5
 57% ( 0%)    0% ( 0%)  Aug 07 00:00  nightly.6
 57% ( 0%)    0% ( 0%)  Aug 03 00:00  weekly.1
 65% (35%)    0% ( 0%)  Jul 27 00:00  weekly.2
 65% ( 0%)    0% ( 0%)  Jul 20 00:00  weekly.3

b) Now to recover the file, give the command below and it recovers it in just a second.
AMSNAS02> snap restore -t file -s nightly.5 /vol/buffer_aggr12/RootQtree/test.pst

WARNING! This will restore a file from a snapshot into the active filesystem.  If the file already exists in the active filesystem, it will be overwritten with the contents from the snapshot.

Are you sure you want to do this? yes

You have selected file /vol/buffer_aggr12/RootQtree/test.pst, snapshot nightly.5

Proceed with restore? yes

AMSNAS02>

c) Here’s the screenshot of my folder, which confirms the file is back in its previous state.
image
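The reason the console restore is so much faster than the copy through Windows can be pictured with a toy model in Python. This is a deliberately simplified sketch, not the filer’s real on-disk design: a file is just a list of block numbers, a snapshot freezes those lists, and a restore puts the old list back without touching any data. The class and method names are made up for illustration.

```python
class ToyVolume:
    """Toy volume: files point at blocks, snapshots freeze the pointers."""

    def __init__(self):
        self.blocks = {}      # block number -> data
        self.active = {}      # path -> tuple of block numbers
        self.snapshots = {}   # snapshot name -> {path: tuple of block numbers}
        self._next_block = 0

    def write(self, path, chunks):
        """Write a file; new data always lands in new blocks."""
        numbers = []
        for chunk in chunks:
            self.blocks[self._next_block] = chunk
            numbers.append(self._next_block)
            self._next_block += 1
        self.active[path] = tuple(numbers)

    def snap_create(self, name):
        """Freeze the current pointers; no data is copied."""
        self.snapshots[name] = dict(self.active)

    def snap_restore_file(self, name, path):
        """Point the active file back at the snapshot's blocks."""
        self.active[path] = self.snapshots[name][path]

    def read(self, path):
        return b"".join(self.blocks[n] for n in self.active[path])


vol = ToyVolume()
vol.write("test.pst", [b"old ", b"mails"])
vol.snap_create("nightly.5")
vol.write("test.pst", [b""])             # the file gets "corrupted" to 0 bytes
blocks_before = len(vol.blocks)
vol.snap_restore_file("nightly.5", "test.pst")
print(vol.read("test.pst"))              # b'old mails'
print(len(vol.blocks) == blocks_before)  # True: the restore wrote no new data
```

The restore step does the same amount of work whether the file is 1 KB or 1 GB, whereas the copy from the Previous Versions tab has to move every byte across the WAN link.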

Now as you can see, it was quite easy to use and very useful too. But to have snapshots you need some extra space reserved in the volume, especially if your data is changing very frequently, since more changes mean more space needed to store changed blocks. Things get more complicated if you are trying to take a snapshot of a VM, Exchange or database volume, because before the snapshot is taken the application has to put itself in hot-backup mode so that a consistent copy can be made. Most applications have this functionality available, but you have to use a script or SnapManager so that the application can tell the filer to take the snapshot once it is prepared, and the filer can tell the application to resume its normal activity once the snapshot is taken.