Showing posts with label Bug. Show all posts

Saturday, June 26, 2010

Operations Manager Efficiency Plugin

After the release of DFM 3.8.1, NetApp released a nice little plugin for DFM called the ‘Operations Manager Storage Efficiency Dashboard Plugin’. The name is quite long, but the plugin is good: it cleverly pulls storage utilization data from the DFM database and presents it in a nice Flash-based web page.

It’s useful when you have to show higher management the current storage utilization and the savings that come from NetApp thin provisioning, dedupe, FlexClone and other features, and it fits very well with NetApp’s storage efficiency mantra. The best part is that after you install the plugin you don’t have to do anything else, and you can access it from anywhere on the network without installing any software. However, there isn’t a simple way to reach the page even when you are right inside the OM web page, as there is no link pointing to the dashboard, so you have to remember the location to access it later or, for people like me, bookmark it in your browser.

The most common problem with the plugin comes from a lack of foresight in how it was built. Here’s what I mean. Usually we install the DFM server on c:\ and move the perfdata, DB, script folder and other bits and pieces to a different drive for easy backup or, in the case of a cluster, for the clustering setup, and this is where the script falls apart. The script expects to be sitting in its default location with the web folder right next to it, and it acts accordingly, whereas in the real situation the web folder is on c:\ and the script is on some other volume.

Now there isn’t any way to rectify the behaviour of the script or the web server, as the Apache instance running on DFM can’t be configured to use any folder other than the one sitting inside the installation directory (AFAIK), and the script provides no switches to tell it the location of the original web folder where it needs to copy its content.

So in a nutshell, even though the script executes and copies all the files required for showing the dashboard, it’s useless unless you figure out for yourself what’s going wrong and why the page is not showing in your browser.

Overcoming this limitation is easy enough for folks in a Unix environment, as creating an alias to the original web folder makes everything work fine, but for Windows folks like me creating a shortcut doesn’t work.

So here’s the way to correct the problem.

Download the plugin from the NOW ToolChest. Extract the zip and edit the file ‘package.xml’, changing the string “dfmeff.exe” to “dfmeff.bat”. Next, create a new batch file called “dfmeff.bat” with the contents below.

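@echo off
rem Run the original plugin executable from its relocated script folder
D:\DFM\script-plugins\dfmeff\dfmeff.exe
rem Copy the dashboard files it produced into the web folder under the DFM installation directory
xcopy D:\DFM\web\*.* "C:\Program Files\NetApp\DataFabric\DFM\web" /Q /I /Y /R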

Obviously you have to change the paths to match your installation. Once you have created the batch file and added its reference in the XML file you are good to go: just zip everything again using any zip software and use the new zip file as the plugin source for installation in DFM.
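If you would rather script the repackaging than do it by hand, here is a minimal Python sketch of the same steps. It is only an illustration under a few assumptions of mine: the function name `repackage_plugin` is made up, the string swap is done on raw bytes rather than by parsing the XML, and the batch file is placed in the same folder of the archive as ‘package.xml’. Check the result against your own plugin zip before installing it.

```python
import posixpath
import zipfile


def repackage_plugin(src_zip, dst_zip, batch_text,
                     old_name=b"dfmeff.exe", new_name=b"dfmeff.bat",
                     manifest="package.xml"):
    """Write dst_zip as a copy of src_zip whose manifest points at a wrapper batch file."""
    patched = False
    with zipfile.ZipFile(src_zip) as src, \
            zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if posixpath.basename(item.filename) == manifest:
                # Swap the executable name for the batch file name (byte-level, no XML parsing).
                data = data.replace(old_name, new_name)
                # Assumption: the wrapper belongs in the same archive folder as the manifest.
                folder = posixpath.dirname(item.filename)
                dst.writestr(posixpath.join(folder, new_name.decode("ascii")), batch_text)
                patched = True
            # Every original entry, including the real executable, is copied across.
            dst.writestr(item, data)
    if not patched:
        raise ValueError(f"{manifest} not found in {src_zip}")
```

You would call it with the downloaded zip, a name for the new zip, and the text of the batch file shown above (with your own paths), then point DFM at the new zip.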

Update:
Just noticed a video showing the features of the plugin on the NetApp community site: http://communities.netapp.com/videos/1209

Saturday, November 7, 2009

Restrict snapmirror access by host and volume on NetApp

Recently one of my fellow NetApp admin friends asked me a very general question:

“How do you restrict your data to be copied through snapmirror?”

Like any other normal NetApp guy, I gave the same old vanilla answer.

“Go to the snapmirror.allow file and put in the host name if you have set snapmirror.access to legacy, or put the hostnames directly in the snapmirror.access option in host=host1,host2 format.”

But he wanted a more granular level of permission, so my next answer was:

“You can also use snapmirror.checkip.enable so that another system merely reporting the same hostname will not be able to access the data.”

But even with that he wasn’t happy, and he asked if there was any other way to restrict snapmirror access on a per-volume basis. At this point I said, “No, NetApp doesn’t provide this level of granular access.”

So the topic stopped there, but the question stayed in my mind and kept haunting me: why isn’t there any such way?

Fast forward to last week: when I had some extra time on my hands I started searching the net for this, and fortunately I found a way on the NOW site to get this to work.

It was recorded under the Bugs section with Bug ID # 80611, which reads:

“There is an unsupported undocumented feature of the /etc/snapmirror.allow file, such that if it is filled as follows:
    hostA:vol1
    hostA:vol29
    hostB:/vol/vol0/q42
    hostC
and "options snapmirror.access legacy" is issued, then the desired access policy will be implemented. Again note that this is unsupported and undocumented so use at your own risk.”

So yes, NetApp says there is a way to do it, but they also say, in effect, that it may sometimes break other functionality or not work as expected.
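To make the intended policy concrete, here is a small Python model of the file format quoted in the bug text. It is purely my reading of that format: the function names are made up, and the exact-string matching of volume or qtree paths is an assumption, since how the filer really matches these entries is undocumented. It does not talk to a filer at all.

```python
def parse_allow(text):
    """Parse snapmirror.allow-style lines into {host: set of paths, or None for any path}."""
    rules = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        host, sep, path = line.partition(":")
        if not sep:
            # A bare host name (like "hostC") allows the whole system.
            rules[host] = None
        elif host not in rules:
            rules[host] = {path}
        elif rules[host] is not None:
            rules[host].add(path)
    return rules


def is_allowed(rules, host, path):
    """Return True if this model would let `host` mirror `path`."""
    if host not in rules:
        return False
    allowed = rules[host]
    # Assumption: paths are compared as exact strings.
    return allowed is None or path in allowed
```

With the four sample lines from the bug text, this model lets hostA read vol1 and vol29 only, hostB only the qtree /vol/vol0/q42, and hostC anything, while every other host is refused.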

Having found this, I sent the details to my friend, but unfortunately he doesn’t want to try it on his production systems and he has no test systems available.

So if any of you want to try it, or have tried it before, please share your experience in the comments.

Saturday, September 12, 2009

SSH broken if you disable Telnet in ontap 7.3.1

And here’s another bug which we hit last month.

Last month when I was setting up our new filers I disabled telnet on the systems, along with lots of other tweaking, but later on when I tried to connect to the system with SSH it refused. Thinking that I might have turned off some other deep registry feature, I went through the entire registry but couldn’t find anything suspicious.

So I turned on SSH verbose login, tried re-running SSH setup with different passkey sizes and what not, but no joy. Finally I tried enabling telnet and voila, it worked. By the time it worked it was around 7 pm, so I called it a day and left the office scratching my head.

The next morning I again started looking for something obvious I was missing, but no, I couldn’t find anything even on the NOW site, so I opened a case with NetApp. Even the NetApp guy was not able to understand why the system was behaving like this, but finally in the late evening that NetApp chap came back to me with BURT # 344484, which was fixed in 7.3.1.1P2.

Now there was a big problem, as I wasn’t quite ready to upgrade my systems to a patched version, so I decided to leave telnet enabled and wait for 7.3.2 to arrive. But since then I have been getting bugged by the IT security team: I was trying to get these systems connected to the network so I could start allocating some space and get rid of the low-space warnings, but these guys were not allowing it because telnet was enabled on them. Finally, last week when I noticed that 7.3.2RC1 and 8.0RC1 were available on the NOW site I breathed a sigh of relief, as I believe 7.3.2 GA should be available within a month. Then I can finally have my systems meet my organization’s security policy and, more importantly, get rid of the pending space allocation requests.

Friday, April 3, 2009

False API warning message in syslog of NetApp filer

Message:

api.output.invalidSchema:warning]: Error in API output schema validation. API name: snapshot-autodelete-list-info. Detailed error: Missing output: options.


OnTap Version: 7.3, 7.3.1

I found this message in the syslog of all our filers but had no clue where it was coming from, because I have never called any API, so I started searching the net. After a lot of searching I found that it’s a bug in OnTap, which complains about one API. As of today there is no fix, just a workaround, and moreover, as per NetApp, this bug is not scheduled to be fixed.

Surprising, huh? Well, here is the workaround.


priv set advanced

registry set state.api.schema_output_validate.enable off


For more details about this bug (# 339742), go to the link below:

http://now.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=339742
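If you want to see how often a message like this shows up before and after applying the workaround, a rough Python sketch like the one below can count events in a saved copy of a filer's syslog. The function name and the regular expression are my own, written against the "event.name:severity]: text" shape of the message shown above, so treat the pattern as an assumption and adjust it to your actual log lines.

```python
import re
from collections import Counter

# Matches the "event.name:severity]:" part of a message line (assumed format).
EVENT_RE = re.compile(r"([A-Za-z0-9_.]+):(\w+)\]:\s*(.*)")


def count_events(lines):
    """Count (event name, severity) pairs found in an iterable of syslog lines."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            event, severity, _text = match.groups()
            counts[(event, severity)] += 1
    return counts
```

Feed it the lines of the saved file and look up the key ("api.output.invalidSchema", "warning") in the result; lines that do not fit the pattern are simply ignored.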

Thursday, April 2, 2009

False Disk Shelf mismatch error on Active/Active NetApp Filer

Message:

cf.fsm.shelfCount.fewerShelves:CRITICAL]: Disk shelf count mismatch: partner sees more of our A shelves on its B loop (14) than we do (13).

OnTap Version: 7.3.1

I hit this bug while doing a giveback of the cluster. I checked all the hardware and disks, but they were working fine and there was no problem. Further checking on the NOW site shed some light, and I found a workaround for this.

Type the command 'options cf.takeover.on_disk_shelf_miscompare off'. This should be executed explicitly even if the value of the option 'cf.takeover.on_disk_shelf_miscompare' is already off.

As of now there is no fix for this, and NetApp has identified it as bug ID 349449.

http://now.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=349449