
10 NetApp Storage Configuration Exercises: Checking Health Status and Performance

Source: 六九路网
NetApp Storage Fundamentals Study Summary (Part 13)

Contents

I. Checking Status and Performance
1.1 Overview
1.2 Some Useful Administrative Commands
1.3 Using the statit Command
1.4 Running the Special Boot Commands
II. Status Checking and Performance Management
2.1 Checking the System
2.1.1 sysconfig
2.1.2 sysstat
2.1.3 options Commands for Optimizing CPU Performance

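I. Checking Status and Performance

1.1 Overview

Objectives:
Identify the admin-privilege commands (priv admin).
Explain what the admin-privilege commands do.
Define some general commands that are especially useful to administrators.

The command line provides four types of commands for basic system administration and troubleshooting:
Admin-privilege commands, used for day-to-day administration.
Advanced-privilege commands, used for special tasks such as system tuning, testing, and gathering statistics. Used incorrectly, these commands can destroy data, so avoid staying in advanced privilege mode for long.
Options commands.
Flash boot commands, which are available while the device is starting up.

Typing a question mark (?) at the prompt lists the admin-privilege commands. They mainly cover disk management, network and system administration, and the management of physical and virtual interfaces. The admin-privilege commands are grouped by category below.

Configuration commands (highlighted in yellow in the original figure)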

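Notes on some of these commands (software, source):

The software command downloads Data ONTAP software images from an HTTP or HTTPS server to the filer and manages, installs, or upgrades them. The image is usually a SETUP.EXE file published by NetApp. After download it is stored in /etc/software on the root volume.

tan> arp -a    == resolves IP addresses to MAC addresses
(192.168.0.1) at (incomplete)
tan> arp -n tan
tan (192.168.0.105) -- no entry

The source command reads a file of filer commands and executes it line by line. If one line fails, source does not stop with an error. It carries on with the remaining lines, so the end result may not be what you expect. Always give the full path of the file, because Data ONTAP has no concept of a current directory.

tan> source -v /etc/rc
#Auto-generated by setup Wed Mar 24 05:06:20 GMT 2010
hostname tan
ifconfig ns0 `hostname`-ns0 mediatype auto
route add default 192.168.0.1 1
add net default: gateway 192.168.0.1: entry already exists
routed on
options dns.enable off
options nis.enable off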
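The "keep going after a failed line" behaviour of source is easy to overlook, so here is a minimal Python sketch of it. This is only a model for illustration, not how Data ONTAP is implemented: run_script, fake_execute, and the rule that the route command fails are all hypothetical.

```python
from typing import Callable, List, Tuple


def run_script(lines: List[str],
               execute: Callable[[str], None]) -> List[Tuple[int, str, str]]:
    """Run every command in order; record failures instead of stopping."""
    failures = []
    for number, raw in enumerate(lines, start=1):
        command = raw.strip()
        if not command or command.startswith("#"):
            continue  # skip blank lines and comments
        try:
            execute(command)
        except Exception as exc:
            # Like `source`, keep going: later lines still run.
            failures.append((number, command, str(exc)))
    return failures


def fake_execute(command: str) -> None:
    # Hypothetical stand-in for the filer: pretend the route already exists.
    if command.startswith("route add"):
        raise RuntimeError("entry already exists")


rc_lines = [
    "#Auto-generated by setup",
    "hostname tan",
    "route add default 192.168.0.1 1",
    "routed on",
]
failures = run_script(rc_lines, fake_execute)
print(failures)
```

The point of collecting failures in a list is that the caller has to inspect it afterwards. With source, the equivalent is reading the -v output carefully, since the script as a whole does not stop at the first failed line.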

Disk Management commands

Notes on some of these commands, with the storage command covered in detail.

dns: displays DNS information and controls the DNS subsystem.

tan> dns info    == shows the state of the DNS resolver
DNS is disabled
tan> dns flush    == removes every entry from the DNS cache
DNS cache flushed.

storage: manages the disks, SCSI adapters, and Fibre Channel adapters in the storage subsystem. It can enable or disable adapters and list disk information.

tan> storage show adapter
Slot: v0    == the slot the adapter is in
Description: Fibre Channel Host Adapter v0 (Network Appliance VHA rev. 15)
Firmware Rev: 42
FC Node Name: d:c7b:f40500:000000
FC Packet Size: 2112
Link Data Rate: 0 Gbit
SRAM Parity: Yes
External GBIC: No
State: Enabled
In Use: Yes
Redundant: Yes    == whether the adapter is redundant

Slot: v1
Description: Fibre Channel Host Adapter v1 (Network Appliance VHA rev. 15)
Firmware Rev: 42
FC Node Name: d:d7b:f40500:000000
FC Packet Size: 2112
Link Data Rate: 0 Gbit
SRAM Parity: Yes
External GBIC: No
State: Enabled
In Use: Yes
Redundant: Yes
......

tan> storage show adapter v0    == name an adapter to show only that adapter
Slot: v0
Description: Fibre Channel Host Adapter v0 (Network Appliance VHA rev. 15)
Firmware Rev: 42
FC Node Name: d:c7b:f40500:000000
FC Packet Size: 2112
Link Data Rate: 0 Gbit
SRAM Parity: Yes
External GBIC: No
State: Enabled
In Use: Yes
Redundant: Yes

tan> storage show    == shows all components
Slot: v0
Description: Fibre Channel Host Adapter v0 (Network Appliance VHA rev. 15)
Firmware Rev: 42
FC Node Name: d:c7b:f40500:000000
FC Packet Size: 2112
Link Data Rate: 0 Gbit
SRAM Parity: Yes
External GBIC: No
State: Enabled
In Use: Yes
Redundant: Yes
storage show hub [ -a ] [ ] not implemented for simulator
storage show expander [ -a ] [ ] not implemented for simulator
DISK   SHELF  BAY  SERIAL    VENDOR  MODEL     REV
-----  -----  ---  --------  ------  --------  ----
v4.16  1      0    13740500  NETAPP  VD-100MB  0042
v4.17  1      1    13740501  NETAPP  VD-100MB  0042
v4.18  1      2    13740502  NETAPP  VD-100MB  0042
v5.16  1      0    10604900  NETAPP  VD-500MB  0042
v5.17  1      1    10604901  NETAPP  VD-500MB  0042

tan> storage show shelf
storage show shelf [ -a ] [ ] not implemented for simulator
tan> storage show tape
storage show { tape | mc } not implemented for simulator

The next command disables the adapter in slot v0. You would do this, for example, before replacing an external SCSI device attached to that adapter when the device is not hot-swappable.

tan> storage disable adapter v0
Thu Apr 1 04:18:41 GMT [rc:notice]: Taking loop attached to Fibre Channel adapter v0 offline.
Host adapter v0 disable succeeded
tan> storage show adapter v0
Slot: v0
Description: Fibre Channel Host Adapter v0 (Network Appliance VHA rev. 15)
Firmware Rev: 42
FC Node Name: d:c7b:f40500:000000
FC Packet Size: 2112
Link Data Rate: 0 Gbit
SRAM Parity: Yes
External GBIC: No
State: Disabled
In Use: No
Redundant: Yes

The -p option of storage show disk lists the primary and secondary paths to each disk. A disk can be attached through both port A and port B. When both paths are up, one is the primary path and the other is the secondary path.

tan> storage show disk -p
PRIMARY  PORT  SECONDARY  PORT  SHELF  BAY
-------  ----  ---------  ----  -----  ---
v4.16    B                      1      0    == v0 was disabled above, so v4 is now the primary path and there is no secondary path
v4.17    B                      1      1
v4.18    B                      1      2
v5.16    B     v1.16      A     1      0
v5.17    B     v1.17      A     1      1
v5.18    B     v1.18      A     1      2
v5.19    B     v1.19      A     1      3
v5.20    B     v1.20      A     1      4
v5.21    B     v1.21      A     1      5
v5.22    B     v1.22      A     1      6

After v0 is enabled again, v0 becomes the primary path and v4 the secondary path:

tan> storage enable adapter v0
Thu Apr 1 04:33:06 GMT [rc:notice]: Bringing loop attached to Fibre Channel adapter v0 online
Host adapter v0 enable succeeded
tan> storage show disk -p
PRIMARY  PORT  SECONDARY  PORT  SHELF  BAY
-------  ----  ---------  ----  -----  ---
v0.16    A     v4.16      B     1      0
v0.17    A     v4.17      B     1      1
v0.18    A     v4.18      B     1      2
v5.16    B     v1.16      A     1      0
v5.17    B     v1.17      A     1      1
v5.18    B     v1.18      A     1      2

This shows that v0 and v4 back each other up, as do v5 and v1.

df: displays free disk space.

tan> df -h    == human-readable output, units chosen automatically
Filesystem   total  used  avail  capacity  Mounted on
/vol/vol0/   241MB  89MB  152MB  37%       /vol/vol0/
tan> df -m
Filesystem   total  used  avail  capacity  Mounted on
/vol/vol0/   241MB  89MB  152MB  37%       /vol/vol0/
tan> df -g
Filesystem   total  used  avail  capacity  Mounted on
/vol/vol0/   0GB    0GB   0GB    37%       /vol/vol0/
tan> df -k
Filesystem   total     used     avail     capacity  Mounted on
/vol/vol0/   247644KB  91512KB  156132KB  37%       /vol/vol0/
tan> df -t
Filesystem   total  used  avail  capacity  Mounted on
/vol/vol0/   0TB    0TB   0TB    37%       /vol/vol0/

The commands above report the same free space in different units.

tan> df -r -h    == shows the reserved space on the volume
Filesystem            total  used  avail  reserved  Mounted on
/vol/vol0/            241MB  89MB  152MB  0MB       /vol/vol0/
/vol/vol0/.snapshot   0MB    13MB  0MB    0MB       /vol/vol0/.snapshot
tan> df -A -m    == shows space usage per aggregate
Aggregate         total   used   avail   capacity
aggr0             256MB   243MB  13MB    95%
aggr0/.snapshot   13MB    12MB   0MB     95%
aggr2             2137MB  0MB    2137MB  0%
aggr2/.snapshot   112MB   0MB    112MB   0%
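The capacity column in the df output can be reproduced from the used and total columns. The Python sketch below parses the df -k line shown above and recomputes the percentage. The helper names (parse_df_k, used_percent) are made up for this example, and rounding to the nearest integer is an assumption that happens to match the 37% shown here; the rounding rule Data ONTAP actually uses is not stated in the source.

```python
DF_K_OUTPUT = """\
Filesystem   total     used     avail     capacity  Mounted on
/vol/vol0/   247644KB  91512KB  156132KB  37%       /vol/vol0/
"""


def parse_df_k(text):
    """Parse `df -k` style text into a list of dicts with sizes in KB."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 5:
            continue
        # Each size looks like "247644KB"; strip the two-letter unit.
        total, used, avail = (int(f[:-2]) for f in fields[1:4])
        rows.append({
            "filesystem": fields[0],
            "total_kb": total,
            "used_kb": used,
            "avail_kb": avail,
            "reported_pct": int(fields[4].rstrip("%")),
        })
    return rows


def used_percent(row):
    """Used space as a whole-number percentage of total (assumed rounding)."""
    return round(100 * row["used_kb"] / row["total_kb"])


rows = parse_df_k(DF_K_OUTPUT)
for row in rows:
    print(row["filesystem"], used_percent(row), row["reported_pct"])
```

For /vol/vol0/ this gives 91512 / 247644 ≈ 36.95%, which rounds to the 37% that df reports, and used plus avail adds up to total exactly.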

System and Networking Management commands

Notes on some of these commands:

maxfiles: increases the number of files a volume can hold.

tan> maxfiles vol0
Volume vol0: maximum number of files is currently 19990 (6054 used).
tan> maxfiles vol0 30000    == raises the file limit for vol0
The new maximum number of files specified is more than twice as big as it needs to be, based on current usage patterns. Increasing the maximum number of files consumes disk space, and the number can never be decreased. Configuring a large number of inodes can also result in less available memory after an upgrade, which means you might not be able to run WAFL_check.
The new maximum number of files will be rounded to 29985.
Are you sure you want to increase the maximum number of files? yes
tan> maxfiles vol0
Volume vol0: maximum number of files is currently 29985 (6053 used).

tan> uptime    == how long the system has been up
5:32am up 2:11 0 NFS ops, 0 CIFS ops, 0 HTTP ops, 0 FCP ops, 0 iSCSI ops

vscan: controls virus scanning of files on the storage system.

tan> vscan on
Warning: CIFS clients will not be allowed to open files because there are no virus scanners registered with the filer.
Are you sure? yes
Thu Apr 1 05:39:35 GMT [vscan.server.connectedNone:warning]: CIFS: Virus scanning is enabled but no vscan (anti-virus) servers are connected to the filer.
Thu Apr 1 05:39:35 GMT [vscan.enable:info]: CIFS: Virus scanning has been enabled.
Virus scanning is enabled
tan> vscan
Virus scanning is enabled.
No vscan servers are connected.
List of extensions to scan:
001,002,386,3GR,??_,ACE,ACM,ADE,ADP,ADT,AP?,ARC,ARJ,ASA,ASD,ASP,AX?,B64,BA?,BIN,BMP,BO?,BZ?,CAB,CC?,
CDR,CDX,CEO,CGI,CHM,CL?,CMD,CNV,CO?,CPL,CPT,CPY,CRT,CSC,CSS,CSV,D?B,DAT,DEV,DIF,DL?,DO?,DOC,DOT,
DQY,DRV,EE?,EFV,EML,EX?,EXE,FDF,FMT,FO?,FPH,FPW,GF?,GIM,GIX,GMS,GNA,GW?,GWI,GZ?,HDI,HHT,HLP,HT?,
HWD,ICE,ICS,IM?,IN?,IQY,ISP,ITS,JAR,JP?,JS?,LGP,LIB,LNK,LWP,LZH,M3U,MB0,MB1,MB2,MBR,MD?,MHT,MOD,MPD,
MPP,MPT,MRC,MS?,MSG,MSO,NAP,NEW,NWS,OB?,OC?,OFT,OL?,OLE,OTM,OV?,PCD,PCI,PD?,PDF,PF?,PHP,PI?,PLG,
POT,PP?,PPZ,PRC,PWZ,QLB,QPW,QQY,QTC,RAR,REG,RMF,RQY,RTF,SCR,SCT,SH?,SIS,SKV,SLK,SMM,SPL,SRF,SWF,SYS,
TAR,TAZ,TBZ,TD0,TFT,TGZ,TLB,TSP,UNP,URL,UUU,VB?,VBS,VS?,VVV,VWP,VXD,WBK,WIZ,WMV,WP?,WRI,WRL,WRZ,
WS?,X32,XL?,XML,XRF,XSL,XTP,XX?,Z0M,Z??,ZI?,ZIP,ZL?,ZZZ
List of extensions not to scan:
Extensions-not-to-scan list is empty.
Number of files scanned: 0
Number of scan failures: 0
Number of throttled requests: 0

useradmin: manages access control for the storage system.

useradmin user command argument...
useradmin domainuser command argument...
useradmin group command argument...
useradmin role command argument...
useradmin whoami

user: a user can belong to one or more groups.
domainuser: requires CIFS to be running; the user is authenticated through a Windows domain.
group: a container for users and domainusers; a group can have one or more roles.
role: a set of capabilities, that is, permissions to perform particular actions.
There are six built-in capability sets: login-*, cli-*, api-*, security-*, compliance-* and filerview-readonly.

tan> useradmin role add tanyx -a login-*,cli-help*,cli-ifconfig*    == creates a role
Thu Apr 1 06:57:58 GMT [useradmin.added.deleted:info]: The role 'tanyx' has been added.
Role added.
tan> useradmin group add test -r tanyx    == creates a group
Thu Apr 1 07:00:31 GMT [useradmin.added.deleted:info]: The group 'test' has been added.
Group added.
tan> useradmin user add wangjun -g test    == creates a user
New password:
Retype new password:
User added.
tan> Thu Apr 1 07:02:32 GMT [useradmin.added.deleted:info]: The user 'wangjun' has been added.

Testing what this new user can do:

Data ONTAP (tan.)
login: wangjun
Password:
tan> Thu Apr 1 07:03:43 GMT [console_login_mgr:info]: wangjun logged in from console
tan> useradmin whoami
Thu Apr 1 07:03:53 GMT [useradmin.unauthorized.user:warning]: User 'wangjun' denied access - missing required capability: 'cli-useradmin'
Permission denied, user wangjun does not have access to useradmin
tan> ifconfig -a
ns0: flags=848043 mtu 1500
inet 192.168.0.105 netmask 0xffffff00 broadcast 192.168.0.255
ether 00:50:56:11:e6:ba (auto-100tx-fd-up)
ns1: flags=8042 mtu 1500
ether 00:50:56:12:e6:ba (auto-unknown-cfg_down)
lo: flags=1948049 mtu 9188
inet 127.0.0.1 netmask 0xff000000 broadcast 127.0.0.1

Groups and users created with useradmin are unrelated to /etc/passwd and /etc/group.

routed: a daemon started at boot that manages the routing table.

tan> routed status
RIP snooping is on
Gateway      Metric  State  Time Last Heard
192.168.0.1  1       ALIVE  Thu Apr 1 03:22:03 GMT 2010
0 free gateway entries, 1 used
tan> routed off
tan> routed on

traceroute: lists the route taken to a network host.

tan> traceroute tan
traceroute to tan (192.168.0.105), 30 hops max, 40 byte packets
1 tan (192.168.0.105) 0.013 ms 0.070 ms 0.075 ms

secureadmin: configures SSH and SSL.

tan> secureadmin setup ssh
SSH Setup
---------
Determining if SSH Setup has already been done before...no
SSH server supports both ssh1.x and ssh2.0 protocols.
SSH server needs two RSA keys to support ssh1.x protocol. The host key is generated and saved to file /etc/sshd/ssh_host_key during setup. The server key is re-generated every hour when SSH server is running.
SSH server needs a RSA host key and a DSA host key to support ssh2.0 protocol. The host keys are generated and saved to /etc/sshd/ssh_host_rsa_key and /etc/sshd/ssh_host_dsa_key files respectively during setup.
SSH Setup will now ask you for the sizes of the host and server keys.
For ssh1.0 protocol, key sizes must be between 384 and 2048 bits.
For ssh2.0 protocol, key sizes must be between 768 and 2048 bits.
The size of the host and server keys must differ by at least 128 bits.
Please enter the size of host key for ssh1.x protocol [768] :
Please enter the size of server key for ssh1.x protocol [512] :
Please enter the size of host keys for ssh2.0 protocol [768] :
You have specified these parameters:
host key size = 768 bits
server key size = 512 bits
host key size for ssh2.0 protocol = 768 bits
Is this correct? [yes]
Setup will now generate the host keys. It will take a minute.
After Setup is finished the SSH server will start automatically.
tan> secureadmin enable ssh2
tan> secureadmin status
ssh2 - active
ssh1 - inactive
ssl - inactive

rdate: sets the system time from a remote host.

iscsi: manages the iSCSI service.

tan> iscsi start
Thu Apr 1 07:36:32 GMT [iscsi.service.startup:info]: iSCSI service startup
iSCSI service started
tan> iscsi status
iSCSI service is running
tan> iscsi session show
No active sessions
tan> iscsi ?
The following commands are available; for more information type "iscsi help "
alias       interface  security  stats
connection  isns       session   status
help        nodename   show      stop
initiator   portal     start     tpgroup

fcp: commands for managing Fibre Channel target adapters and the FCP target protocol.

logger: writes a message to the system log file.

tan> logger
(Enter '.' and carriage return to end message)
kkjkj
.

ipspace: available only when the storage system has a vfiler license.

environment: displays information about the physical environment of the storage system.

tan> environment status
Environment for channel v0
Number of shelves monitored: 1 enabled: yes
Environmental failure on shelves on this channel? no
Channel: v0
Shelf: 1
SES device path: local access: v4.17
Module type: LRC; monitoring is active
Shelf status: normal condition
SES Configuration, via loop id 17 in shelf 1:
logical identifier=0x0b00000000000000
vendor identification=XYRATEX
product identification=DiskShelf14
product revision level=1111
Vendor-specific information:
Product Serial Number: Optional Settings: 0x00
Status reads attempted: 1638; failed: 0
Control writes attempted: 18; failed: 0
Shelf bays with disk devices installed:
2, 1, 0
with error: none
Power Supply installed element list: 1, 2; with error: none
Power Supply information by element:
[1] Serial number: sim-PS12345-1 Type:
Firmware version:
[2] Serial number: sim-PS12345-2 Type:
Firmware version:
tan> environment status shelf
tan> environment shelf_power_status
tan> environment chassis fans    == shows the status of everything in the environment other than the shelves
tan> environment chassis power

The partner command works as follows. Suppose two storage systems, toaster1 and toaster2, form a cluster. When toaster2 goes down, toaster1 takes over. Because you can no longer type commands on the toaster2 console, you must reach toaster2 by entering commands on toaster1. For example, to find the maximum number of files on toaster2:

toaster1(takeover)> partner maxfiles    == note the prompt
Volume vol0: maximum number of files is currently 241954 (3194 used).
Volume vol1: maximum number of files is currently 241954 (3195 used).

The same thing step by step:

toaster1(takeover)> partner    == enters partner mode
toaster2/toaster1> maxfiles
Volume vol0: maximum number of files is currently 241954 (3194 used).
Volume vol1: maximum number of files is currently 241954 (3195 used).
toaster2/toaster1> partner    == leaves partner mode
toaster1(takeover)>
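Going back to the useradmin example earlier in this section: the role tanyx was created with the capabilities login-*, cli-help*, and cli-ifconfig*, which is why wangjun could run ifconfig but was refused useradmin (missing cli-useradmin). The Python sketch below imitates that check with shell-style wildcard matching. Treating the capability patterns as simple globs is an assumption made for illustration, and ROLE_CAPABILITIES and role_allows are hypothetical names, not part of Data ONTAP.

```python
from fnmatch import fnmatchcase

# Capabilities granted to the role in the transcript above.
ROLE_CAPABILITIES = {
    "tanyx": ["login-*", "cli-help*", "cli-ifconfig*"],
}


def role_allows(role, capability):
    """Return True if any pattern granted to the role matches the capability."""
    patterns = ROLE_CAPABILITIES.get(role, [])
    return any(fnmatchcase(capability, pattern) for pattern in patterns)


for needed in ("cli-ifconfig", "cli-useradmin"):
    verdict = "allowed" if role_allows("tanyx", needed) else "denied"
    print(needed, verdict)
```

Under this model cli-ifconfig matches the pattern cli-ifconfig* and is allowed, while cli-useradmin matches none of the three patterns, which agrees with the "missing required capability" message in the transcript.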

Service and protocols commands

Files and directories commands

Device control commands

1.2 Some Useful Administrative Commands

tan> ifstat ns0    == prints interface and driver statistics
-- interface ns0 (0 hours, 10 minutes, 32 seconds) --
RECEIVE
Frames/second: 0 | Bytes/second: 0 | Errors/minute: 0
Discards/minute: 0 | Total frames: 221 | Total bytes: 14548
Total errors: 0 | Total discards: 0 | Multi/broadcast: 221
No buffers: 0 | Non-primary u/c: 0 | Tag drop: 0
Vlan tag drop: 0 | Vlan untag drop: 0 | Read errors: 0
TRANSMIT
Frames/second: 0 | Bytes/second: 0 | Errors/minute: 0
Discards/minute: 0 | Total frames: 32 | Total bytes: 3376
Total errors: 0 | Total discards: 0 | Multi/broadcast: 32
Queue overflows: 0 | No buffers: 0 | Write errors: 0
LINK_INFO
Current state: up | Up to downs: 0 | Speed: 100m
Duplex: full | Flowcontrol: none

tan> nfsstat    == displays statistical information about NFS (Network File System) and RPC (Remote Procedure Call) for the filer
Server rpc:
TCP:
calls badcalls nullrecv badlen xdrcall
0 0 0 0 0
UDP:
calls badcalls nullrecv badlen xdrcall
0 0 0 0 0
Server nfs:
calls badcalls
0 0
Server nfs V2: (0 calls)
null getattr setattr root lookup readlink read
0 0% 0 0% 0 0% 0 0% 0 0% 0 0% 0 0%
wrcache write create remove rename link symlink
0 0% 0 0% 0 0% 0 0% 0 0% 0 0% 0 0%
mkdir rmdir readdir statfs
0 0% 0 0% 0 0% 0 0%
Read request stats (version 2)
0-511 512-1023 1K-2047 2K-4095 4K-8191 8K-16383 16K-32767 32K-65535 64K-131071 > 131071
0 0 0 0 0 0 0 0 0 0
Write request stats (version 2)
0-511 512-1023 1K-2047 2K-4095 4K-8191 8K-16383 16K-32767 32K-65535 64K-131071 > 131071
0 0 0 0 0 0 0 0 0 0
Server nfs V3: (0 calls)
null getattr setattr lookup access readlink read
0 0% 0 0% 0 0% 0 0% 0 0% 0 0% 0 0%
write create mkdir symlink mknod remove rmdir
0 0% 0 0% 0 0% 0 0% 0 0% 0 0% 0 0%
rename link readdir readdir+ fsstat fsinfo pathconf
0 0% 0 0% 0 0% 0 0% 0 0% 0 0% 0 0%
commit
0 0%
Read request stats (version 3)
0-511 512-1023 1K-2047 2K-4095 4K-8191 8K-16383 16K-32767 32K-65535 64K-131071 > 131071
0 0 0 0 0 0 0 0 0 0
Write request stats (version 3)
0-511 512-1023 1K-2047 2K-4095 4K-8191 8K-16383 16K-32767 32K-65535 64K-131071 > 131071
0 0 0 0 0 0 0 0 0 0

tan> sysstat
CPU  NFS  CIFS  HTTP  Net kB/s   Disk kB/s     Tape kB/s     Cache
                      in   out   read  write   read  write   age
0%   0    0     0     0    0     3     14      0     0       >60
0%   0    0     0     0    0     5     22      0     0       >60

tan*> wrfile /etc/test    == how to write a file
test1    == press Ctrl+C to finish
read: error reading standard input: Interrupted system call
tan*> rdfile /etc/test    == reads the file contents
test1
tan*> wrfile -a /etc/test test2    == appends a line
tan*> rdfile /etc/test    == note the prompt
test1
test2
tan*> mv /etc/test /etc/test1

1.3 Using the statit Command

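The statit command generates a detailed report of system utilization. Because the report is long, it is best to capture the output to a file or to view it in a terminal that can scroll back.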

The command is available only at the advanced privilege level. statit -b starts collecting statistics, and statit -e stops the collection and prints the report.

tan> priv set advanced
tan*> statit -b
tan*> statit -e
Hostname: tan ID: 0099908572 Memory: 512 MB
NetApp Release 7.3: Thu Jul 24 12:55:28 PDT 2008
Start time: Wed Mar 24 09:05:47 GMT 2010

CPU Statistics
9.147117 time (seconds) 100 %
0.017650 system time 0 %
0.000915 rupt time 0 % (915 rupts x 1 usec/rupt)
0.016735 non-rupt system time 0 %
9.129467 idle time 100 %
0.170104 time in CP 2 % 100 %
0.000017 rupt time in CP 0 % (17 rupts x 1 usec/rupt)

Miscellaneous Statistics (per second)
901.38 hard context switches
0.00 NFS operations
0.00 CIFS operations
0.00 HTTP operations
0.00 NetCache URLs
0.00 streaming packets
0.00 network KB received
0.00 network KB transmitted
4.37 disk KB read
19.68 disk KB written
0.98 NVRAM KB written
0.00 nolog KB written
0.00 WAFL bufs given to clients
0.00 checksum cache hits ( 0%)
0.00 no checksum - partial buffer
0.00 FCP operations
0.00 iSCSI operations

WAFL Statistics (per second)
6.23 name cache hits ( 77%)
1.86 name cache misses ( 23%)
39.36 buf hash hits ( 90%)
4.15 buf hash misses ( 10%)
19.46 inode cache hits ( 100%)
0.00 inode cache misses ( 0%)
9.07 buf cache hits ( 100%)
0.00 buf cache misses ( 0%)
0.00 blocks read
0.00 blocks read-ahead
0.00 chains read-ahead
0.00 dummy reads
0.00 blocks speculative read-ahead
3.83 blocks written
0.44 stripes written
0.00 blocks over-written
0.11 wafl_timer generated CP
0.00 snapshot generated CP
0.00 wafl_avail_bufs generated CP
0.00 dirty_blk_cnt generated CP
0.00 full NV-log generated CP
0.00 back-to-back CP
0.00 flush generated CP
0.00 sync generated CP
0.00 wafl_avail_vbufs generated CP
0.00 deferred back-to-back CP
0.00 container-indirect-pin CP
0.00 low mbufs generated CP
0.00 low datavecs generated CP
46.24 non-restart messages
0.00 IOWAIT suspends
0.00 next nvlog nearly full msecs
0.00 dirty buffer susp msecs
0.00 nvlog full susp msecs
98848 buffers

RAID Statistics (per second)
2.30 xors
0.00 long dispatches [0]
0.00 long consumed [0]
0.00 long consumed hipri [0]
0.00 long low priority [0]
0.00 long high priority [0]
0.00 long monitor tics [0]
0.00 long monitor clears [0]
0.00 long dispatches [1]
0.00 long consumed [1]
0.00 long consumed hipri [1]
0.00 long low priority [1]
0.00 long high priority [1]
0.00 long monitor tics [1]
0.00 long monitor clears [1]
6 max batch
0.55 blocked mode xor
0.22 timed mode xor
0.00 fast adjustments
0.00 slow adjustments
0 avg batch start
0 avg stripe/msec
0.66 tetrises written
0.00 master tetrises
0.00 slave tetrises
2.30 stripes written
1.75 partial stripes
0.55 full stripes
4.04 blocks written
0.00 blocks read
1.09 1 blocks per stripe size 3
0.66 2 blocks per stripe size 3
0.55 3 blocks per stripe size 3

Network Interface Statistics (per second)
iface side bytes packets multicasts errors collisions pkt drops
ns0 recv 0.00 0.00 0.00 0.00 0.00
    xmit 0.00 0.00 0.00 0.00 0.00
ns1 recv 0.00 0.00 0.00 0.00 0.00
    xmit 0.00 0.00 0.00 0.00 0.00
vh  recv 0.00 0.00 0.00 0.00 0.00
    xmit 0.00 0.00 0.00 0.00 0.00

Disk Statistics (per second)
ut% is the percent of time the disk was busy.
xfers is the number of data-transfer commands issued per second.
xfers = ureads + writes + cpreads + greads + gwrites
chain is the average number of 4K blocks per command.
usecs is the average disk round-trip time per 4K block.

disk ut% xfers ureads--chain-usecs writes--chain-usecs cpreads-chain-usecs greads--chain-usecs gwrites-chain-usecs
/aggr0/plex0/rg0:
v4.16 2 1.20 0.22 1.00 60000 0.87 1.88 10000 0.11 3.00 6667 0.00 .... . 0.00 .... .
v4.17 1 0.55 0.00 .... . 0.44 3.00 7500 0.11 3.00 6667 0.00 .... . 0.00 .... .
v4.18 1 0.66 0.00 .... . 0.55 2.80 7857 0.11 2.00 10000 0.00 .... . 0.00 .... .
/aggr1/plex0/rg0:
v5.19 0 0.00 0.00 .... . 0.00 .... . 0.00 .... . 0.00 .... . 0.00 .... .
v5.35 0 0.00 0.00 .... . 0.00 .... . 0.00 .... . 0.00 .... . 0.00 .... .
v5.20 0 0.00 0.00 .... . 0.00 .... . 0.00

The statit report contains the following sections:

CPU statistics
Multiprocessor statistics
CSMP domain switches
Miscellaneous statistics
WAFL® statistics
RAID statistics
Network interface statistics
Disk statistics
Aggregate statistics
Spares and other disks
FCP statistics
iSCSI statistics
Tape statistics
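Two pieces of arithmetic in the statit report shown earlier can be checked by hand: the legend's formula xfers = ureads + writes + cpreads + greads + gwrites, and the hit percentages printed next to the WAFL cache counters. The Python sketch below does both, using only numbers copied from that report. The function names are made up for this example, and computing the percentage as hits / (hits + misses) is an assumption that is consistent with the figures shown.

```python
# Per-disk rates copied from the Disk Statistics section of the report.
DISK_ROWS = {
    "v4.16": {"xfers": 1.20, "ureads": 0.22, "writes": 0.87,
              "cpreads": 0.11, "greads": 0.00, "gwrites": 0.00},
    "v4.17": {"xfers": 0.55, "ureads": 0.00, "writes": 0.44,
              "cpreads": 0.11, "greads": 0.00, "gwrites": 0.00},
    "v4.18": {"xfers": 0.66, "ureads": 0.00, "writes": 0.55,
              "cpreads": 0.11, "greads": 0.00, "gwrites": 0.00},
}


def xfers_from_parts(row):
    """Sum the five transfer types named in the statit legend."""
    return (row["ureads"] + row["writes"] + row["cpreads"]
            + row["greads"] + row["gwrites"])


def hit_ratio(hits, misses):
    """Hit percentage, assuming it is hits / (hits + misses)."""
    return 100 * hits / (hits + misses)


for disk, row in DISK_ROWS.items():
    print(disk, round(xfers_from_parts(row), 2), row["xfers"])

print("name cache hit %:", round(hit_ratio(6.23, 1.86)))
print("buf hash hit %:", round(hit_ratio(39.36, 4.15)))
```

For v4.16 the parts add up to 0.22 + 0.87 + 0.11 = 1.20, matching the xfers column, and the two hit ratios come out at 77% and 90%, matching the percentages printed in the WAFL Statistics section.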

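1.4 Running the Special Boot Commands

To reach the special boot menu, run halt or reboot and then press CTRL+C while the system is starting. The menu is used mainly for fresh installations and for troubleshooting.

Options (2) and (5) are generally used for troubleshooting. Options (4) and (4a) are most useful when installing a system.

Option (1) is simply a normal boot.

Option (2) also boots normally but does not process /etc/rc. You can then run the contents of the rc file by hand (ifconfig, cifs setup, NFS, and so on) to rule out a configuration problem in /etc/rc as the cause of abnormal behaviour.

Option (3) is for when the password has been forgotten.

Option (4) is normally run only once, at installation. It initializes every disk, and once confirmed it cannot be undone. The operation takes a long time, depending on the number of disks, typically several hours, and all data previously on the disks is lost.

Option (5) provides a mode in which only a subset of commands is available. It is normally used to deal with disk-related problems. The /etc/rc file is not processed, WAFL volumes are recognized but not usable, very few system services start, and NFS and CIFS are unavailable.

Besides pressing CTRL+C after halt or reboot, you can also reach the special boot menu as follows:
First halt the system.
Then run setenv floppy-boot?true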

This session is logged in /sim/node1/sessionlogs/log
floppy boot? yes
NetApp Release 7.3: Thu Jul 24 12:55:28 PDT 2008
Copyright (c) 1992-2008 Network Appliance, Inc.
Starting boot on Sat Apr 3 10:43:43 GMT 2010
(1) Normal boot.
(2) Boot without /etc/rc.
(3) Change password.
(4) Initialize all disks.
(4a) Same as option 4, but create a flexible root volume.
(5) Maintenance mode boot.
Selection (1-5)?

*> ?    == note the prompt
fcadmin sasadmin storage aggr
fcstat sasstat sysconfig disk
halt sata version disk_list
help scsi vol disk_mung
raid_config sesdiag xortest environment

Selection (1-5)? 4a    == the result of choosing 4a
Zero disks and install a new file system? yes
This will erase all the data on the disks, are you sure? yes
Zeroing disks takes about 56 minutes.
..................................................................
Sat Apr 3 11:35:36 GMT [raid.disk.zero.done:notice]: Disk v0.16 Shelf ? Bay ? [NETAPP VD-100MB 0042] S/N [13740500] : disk zeroing complete
.Sat Apr 3 11:35:37 GMT [raid.disk.zero.done:notice]: Disk v0.18 Shelf ? Bay ? [NETAPP VD-100MB 0042] S/N [13740502] : disk zeroing complete
Sat Apr 3 11:35:37 GMT [raid.disk.zero.done:notice]: Disk v0.17 Shelf ? Bay ? [NETAPP VD-100MB 0042] S/N [13740501] : disk zeroing complete
...................

When initialization finishes, the system prompts you to run setup. All licenses are lost after the disks are reinitialized.

node1> aggr offline aggr2    == in normal mode an aggregate that contains flexible volumes cannot be taken offline
aggr offline: Cannot offline aggregate 'aggr2' because it contains one or more flexible volumes.
*> aggr offline aggr2    == in maintenance mode it works, which shows that maintenance mode can do things normal mode cannot
Aggregate 'aggr2' is now offline.
*> aggr read_fsid aggr2    == this command is available only in maintenance mode; every aggregate must have a different FSID
Aggregate aggr2 has an FSID of 0x4caba711.
*> aggr read_fsid aggr0
Aggregate aggr0 has an FSID of 0x4caba506.

(1) Normal boot.
(2) Boot without /etc/rc.
(3) Change password.
(4) Initialize all disks.
(4a) Same as option 4, but create a flexible root volume.
(5) Maintenance mode boot.
Selection (1-5)? 2    == choosing option 2
booting without /etc/rc and without various system daemons..
> ifconfig -a    == the interfaces have no IP address because /etc/rc was not read
ns0: flags=8042 mtu 1500
ether 00:50:56:1a:68:49 (auto-unknown-cfg_down)
ns1: flags=8042 mtu 1500
ether 00:50:56:1b:68:49 (auto-unknown-cfg_down)
lo: flags=1948049 mtu 4064
inet 127.0.0.1 netmask 0xff000000 broadcast 127.0.0.1
ether 00:00:00:00:00:00 (Shared memory)

(1) Normal boot.
(2) Boot without /etc/rc.
(3) Change password.
(4) Initialize all disks.
(4a) Same as option 4, but create a flexible root volume.
(5) Maintenance mode boot.
Selection (1-5)? 3    == choosing option 3; a forgotten password can be changed here
New password:
Retype new password:
They don't match; try again.
New password:
Retype new password:
Password changed.

Use the printenv command to view the environment settings:

Hidden boot commands: type 22/7 at the selection prompt (see the figure below):

Selection (1-5)? Readonly    == boots read-only
Selection (1-5)? vol_clear_inconsistent vol1 aggr2
vol_clear_inconsistent: successfully enqueued
Selection (1-5)? WAFL_check
In a cluster, you MUST ensure that the partner is (and remains) down, or that takeover is manually disabled on the partner node, because clustering software is not started or fully enabled in WAFL_check mode.
FAILURE TO DO SO CAN RESULT IN YOUR FILESYSTEMS BEING DESTROYED
Continue with boot? yes
Check aggr2? yes
Check aggr0? yes
Checking aggr2...
WAFL_check NetApp Release 7.3
Starting at Sat Apr 3 12:15:52 GMT 2010
Phase 1: Verify fsinfo blocks.
Phase 2: Verify metadata indirect blocks.
Phase 3: Scan inode file.
Phase 3a: Scan inode file special files.
Phase 3a time in seconds: 0
Phase 3b: Scan inode file normal files.
Phase 3b time in seconds: 0
Phase 3 time in seconds: 0
Phase 4: Scan directories.
Phase 4 time in seconds: 1
Phase 5: Check volumes.
Phase 5a: Check volume inodes
Phase 5a time in seconds: 0
Phase 5b: Check volume contents
Checking volume vol1...
Phase [5.1]: Verify fsinfo blocks.
Phase [5.2]: Verify metadata indirect blocks.
Phase [5.3]: Scan inode file.
Phase [5.3a]: Scan inode file special files.
Phase [5.3a] time in seconds: 0
Phase [5.3b]: Scan inode file normal files.
Phase [5.3b] time in seconds: 0
Phase [5.3] time in seconds: 0
Phase [5.4]: Scan directories.
Phase [5.4] time in seconds: 0
Phase [5.6]: Clean up.
Phase [5.6a]: Find lost nt streams.
Phase [5.6a] time in seconds: 0
Phase [5.6b]: Find lost files.
Phase [5.6b] time in seconds: 0
Phase [5.6c]: Find lost blocks.
Phase [5.6c] time in seconds: 0
Phase [5.6d]: Check blocks used.
Phase [5.6d] time in seconds: 0
Phase [5.6] time in seconds: 0
Volume vol1 WAFL_check time in seconds: 0
(No filesystem state changed.)
Phase 5b time in seconds: 0
Phase 6: Clean up.
Phase 6a: Find lost nt streams.
Phase 6a time in seconds: 0
Phase 6b: Find lost files.
Phase 6b time in seconds: 3
Phase 6c: Find lost blocks.
Phase 6c time in seconds: 0
Phase 6d: Check blocks used.
Phase 6d time in seconds: 0
Phase 6 time in seconds: 3
WAFL_check total time in seconds: 5
(No filesystem state changed.)
Checking aggr0...
WAFL_check NetApp Release 7.3
Starting at Sat Apr 3 12:15:57 GMT 2010
Phase 1: Verify fsinfo blocks.
Phase 2: Verify metadata indirect blocks.
Phase 3: Scan inode file.
Phase 3a: Scan inode file special files.
Phase 3a time in seconds: 0
Phase 3b: Scan inode file normal files.
Phase 3b time in seconds: 0
Phase 3 time in seconds: 0
Phase 4: Scan directories.
Phase 4 time in seconds: 0
Phase 5: Check volumes.
Phase 5a: Check volume inodes
Phase 5a time in seconds: 0
Phase 5b: Check volume contents
Checking volume vol0...
Phase [5.1]: Verify fsinfo blocks.
Phase [5.2]: Verify metadata indirect blocks.
Phase [5.3]: Scan inode file.
Phase [5.3a]: Scan inode file special files.
Phase [5.3a] time in seconds: 0
Phase [5.3b]: Scan inode file normal files.
Phase [5.3b] time in seconds: 0
Phase [5.3] time in seconds: 0
Phase [5.4]: Scan directories.
Phase [5.4] time in seconds: 1
Phase [5.6]: Clean up.
Phase [5.6a]: Find lost nt streams.
Phase [5.6a] time in seconds: 0
Phase [5.6b]: Find lost files.
Phase [5.6b] time in seconds: 0
Phase [5.6c]: Find lost blocks.
Phase [5.6c] time in seconds: 0
Phase [5.6d]: Check blocks used.
Phase [5.6d] time in seconds: 0
Phase [5.6] time in seconds: 0
Volume vol0 WAFL_check time in seconds: 1
(No filesystem state changed.)
Phase 5b time in seconds: 1
Phase 6: Clean up.
Phase 6a: Find lost nt streams.
Phase 6a time in seconds: 0
Phase 6b: Find lost files.
Phase 6b time in seconds: 0
Phase 6c: Find lost blocks.
Phase 6c time in seconds: 0
Phase 6d: Check blocks used.
Phase 6d time in seconds: 0
Phase 6 time in seconds: 0
WAFL_check total time in seconds: 2
(No filesystem state changed.)
Press Enter to reboot system.

Selection (1-5)? wafliron
add net 127.0.0.0: gateway 127.0.0.1
Sat Apr 3 12:19:39 GMT [fmmb.current.lock.disk:info]: Disk v0.18 is a local HA mailbox disk.
Sat Apr 3 12:19:39 GMT [fmmb.current.lock.disk:info]: Disk v0.17 is a local HA mailbox disk.
Sat Apr 3 12:19:39 GMT [fmmb.instStat.change:info]: normal mailbox instance on local side.
Sat Apr 3 12:19:40 GMT [fmmb.current.lock.disk:info]: Disk v4.16 is a partner HA mailbox disk.
Sat Apr 3 12:19:40 GMT [fmmb.instStat.change:info]: normal mailbox instance on partner side.
Sat Apr 3 12:19:42 GMT [raid.vol.replay.nvram:info]: Performing raid replay on volume(s)
Restoring parity from NVRAM
Sat Apr 3 12:19:42 GMT [raid.cksum.replay.summary:info]: Replayed 0 checksum blocks.
Sat Apr 3 12:19:42 GMT [raid.stripe.replay.summary:info]: Replayed 0 stripes.
Sat Apr 3 12:19:43 GMT [wafl.iron.start:notice]: Starting wafliron on aggregate aggr0.
Sat Apr 3 12:19:43 GMT [wafl.iron.start:notice]: Starting wafliron on volume vol0.
Replaying WAFL log
.
Sat Apr 3 12:19:47 GMT [rc:notice]: The system was down for 874 seconds
Sat Apr 3 12:19:47 GMT [javavm.javaDisabled:warning]: Java disabled: Missing /etc/java/rt131.jar.
Sat Apr 3 12:19:47 GMT [dfu.firmwareUpToDate:info]: Firmware is up-to-date on all disk drives
Sat Apr 3 12:19:48 GMT [netif.linkUp:info]: Ethernet ns0: Link up.
Sat Apr 3 12:19:48 GMT [netif.linkUp:info]: Ethernet ns1: Link up.
add net default: gateway 192.168.0.1
Sat Apr 3 12:19:48 GMT [perf.archive.start:info]: Performance archiver started. Sampling 20 objects and 187 counters.
Sat Apr 3 12:19:49 GMT [httpd.servlet.jvm.down:warning]: Java Virtual Machine is inaccessible. FilerView cannot start until you resolve this problem.
Sat Apr 3 12:19:49 GMT [snmp.agent.msg.access.denied:warning]: Permission denied for SNMPv3 requests from root. Reason: Password is too short (SNMPv3 requires at least 8 characters).
Sat Apr 3 12:19:49 GMT [sysconfig.sysconfigtab.openFailed:notice]: sysconfig: table of valid configurations (/etc/sysconfigtab) is missing. Sat Apr 3 12:19:50 GMT [mgr.boot.disk_done:info]: NetApp Release 7.3 boot complete. Last disk update written at Sat Apr 3 12:04:59 GMT 2010 Sat Apr 3 12:19:50 GMT [cf.fm.unexpectedAdapter:warning]: Warning: clustering is not licensed yet an interconnect adapter was found. NVRAM will be divided into two parts until adapter is removed Sat Apr 3 12:19:50 GMT [cf.fm.unexpectedPartner:warning]: Warning: clustering is not licensed yet the node once had a cluster partner Sat Apr 3 12:19:50 GMT [mgr.boot.reason_ok:notice]: System rebooted after running WAFL_check. Sat Apr 3 12:19:50 GMT [wafl.scan.start:info]: Starting wafliron demand on aggregate aggr0. Sat Apr 3 12:19:50 GMT [wafl.scan.start:info]: Starting wafliron demand on volume vol0. Password: Sat Apr 3 12:19:51 GMT [wafl.iron.completion.times:info]: Mounting phase of volume vol0 took 3s 770ms. Sat Apr 3 12:19:51 GMT [wafl.iron.completion.times:info]: Inode scanning phase of volume vol0 took 1s 450ms. Sat Apr 3 12:19:51 GMT [wafl.iron.completion.times:info]: Lost blocks search phase of volume vol0 took 190ms. Sat Apr 3 12:19:51 GMT [wafl.iron.completion.times:info]: Lost inodes search phase of volume vol0 took 30ms. Sat Apr 3 12:19:52 GMT [wafl.scan.iron.done:info]: Volume vol0, wafliron completed. Sat Apr 3 12:19:53 GMT [wafl.iron.completion.times:info]: Mounting phase of aggregate aggr0 took 4s 449ms. Sat Apr 3 12:19:53 GMT [wafl.iron.completion.times:info]: Inode scanning phase of aggregate aggr0 took 3s 139ms. Sat Apr 3 12:19:53 GMT [wafl.iron.completion.times:info]: Lost blocks search phase of aggregate aggr0 took 11ms. Sat Apr 3 12:19:53 GMT [wafl.iron.completion.times:info]: Lost inodes search phase of aggregate aggr0 took 30ms. Sat Apr 3 12:19:53 GMT [wafl.iron.mount.times:info]: Rootdir mount phase of aggregate aggr0 took 40ms. 
Sat Apr 3 12:19:53 GMT [wafl.iron.mount.times:info]: Activemap mount phase of aggregate aggr0 took 20ms. Sat Apr 3 12:19:53 GMT [wafl.iron.mount.times:info]: Snap inofiles mount phase of aggregate aggr0 took 0ms. Sat Apr 3 12:19:53 GMT [wafl.iron.mount.times:info]: Snap selfcover mount phase of aggregate aggr0 took 0ms. Sat Apr 3 12:19:53 GMT [wafl.iron.mount.times:info]: Snapdir mount phase of aggregate aggr0 took 20ms. Sat Apr 3 12:19:53 GMT [wafl.iron.mount.times:info]: Snapmaps mount phase of aggregate aggr0 took 0ms. Sat Apr 3 12:19:53 GMT [wafl.iron.mount.times:info]: Summary map mount phase of aggregate aggr0 took 0ms. Sat Apr 3 12:19:53 GMT [wafl.iron.mount.times:info]: Refcnt mount phase of aggregate aggr0 took 10ms. Sat Apr 3 12:19:53 GMT [wafl.iron.mount.times:info]: Metadir mount phase of aggregate aggr0 took 160ms. Sat Apr 3 12:19:53 GMT [wafl.iron.mount.times:info]: Flex vols mount phase of aggregate aggr0 took 140ms. Sat Apr 3 12:19:53 GMT [wafl.scan.iron.done:info]: Aggregate aggr0, wafliron completed.
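The boot messages above share a regular shape: a timestamp, a bracketed event.name:severity tag, and the message text. As a minimal sketch for working with a saved console capture, the Python below filters such messages by severity. The regular expression and the name parse_boot_messages are illustrative assumptions inferred only from the capture shown here; they are not part of Data ONTAP.

```python
import re

# Assumed shape of one console message, inferred from the capture above:
#   "Sat Apr 3 12:19:47 GMT [rc:notice]: The system was down for 874 seconds"
MESSAGE_RE = re.compile(
    r"^(?P<time>\w{3} \w{3} +\d+ [\d:]+ \w+) "
    r"\[(?P<event>[\w.]+):(?P<severity>\w+)\]: "
    r"(?P<text>.*)$"
)


def parse_boot_messages(lines):
    """Return one dict per line that matches the assumed message shape."""
    parsed = []
    for line in lines:
        match = MESSAGE_RE.match(line.strip())
        if match:  # plain lines such as "Replaying WAFL log ." are skipped
            parsed.append(match.groupdict())
    return parsed


sample = [
    "Sat Apr 3 12:19:47 GMT [rc:notice]: The system was down for 874 seconds",
    "Sat Apr 3 12:19:47 GMT [javavm.javaDisabled:warning]: Java disabled: Missing /etc/java/rt131.jar.",
    "Replaying WAFL log .",
    "Sat Apr 3 12:19:52 GMT [wafl.scan.iron.done:info]: Volume vol0, wafliron completed.",
]

messages = parse_boot_messages(sample)
warnings = [m for m in messages if m["severity"] == "warning"]
for m in warnings:
    print(m["time"], m["event"], m["text"])
```

Filtering on "warning" first is usually the quickest way to spot items such as the missing Java runtime or the SNMPv3 password complaint among the informational wafliron timings.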

2. Checking Status and Managing Performance

2.1 Checking the System

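Good performance comes from hardware, software, and communication protocols all running together at their best. Continuous monitoring, combined with a set of NetApp commands, lets you tune the system to reduce latency, improve data throughput, and reach optimal performance. The tools below check the status and health of the different components in the system.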

2.1.1 sysconfig

The first command is sysconfig. Use it to verify that memory is reported correctly and to display the disk shelves, tape drives, and NICs.

tan> sysconfig
NetApp Release 7.3: Thu Jul 24 12:55:28 PDT 2008
System ID: 0099908572 (tan)
System Serial Number: 987654-32-0 (tan)
Model Name: Simulator
Processors: 1
slot 0: NetApp Virtual SCSI Host Adapter v0
        3 Disks: 0.3GB
        1 shelf with LRC
slot 1: NetApp Virtual SCSI Host Adapter v1
        25 Disks: 13.0GB
        2 shelves with LRC
slot 2: NetApp Virtual SCSI Host Adapter v2
slot 3: NetApp Virtual SCSI Host Adapter v3
slot 4: NetApp Virtual SCSI Host Adapter v4
        3 Disks: 0.3GB
        1 shelf with LRC
slot 5: NetApp Virtual SCSI Host Adapter v5
        25 Disks: 13.0GB
        2 shelves with LRC
slot 6: NetApp Virtual SCSI Host Adapter v6
slot 7: NetApp Virtual SCSI Host Adapter v7
slot 8: NetApp Virtual SCSI Host Adapter v8
        4 Tapes: VT-100MB
                 VT-100MB
                 VT-100MB
                 VT-100MB

tan> sysconfig -a    == shows detailed information for each I/O device

tan> sysconfig -c    == checks the system hardware configuration
sysconfig: There are no configuration errors.

tan> sysconfig -d    == shows the disks in the system
Device     HA  SHELF BAY  CHAN   Disk Vital Product Information
---------- --------------- ----- ------------------------------
v4.16      v4    1    0   FC:B   13740500
v4.17      v4    1    1   FC:B   13740501
v4.18      v4    1    2   FC:B   13740502
v5.16      v5    1    0   FC:B   10604900
v5.17      v5    1    1   FC:B   10604901
v5.18      v5    1    2   FC:B   10604902
v5.19      v5    1    3   FC:B   10604903
v5.20      v5    1    4   FC:B   10604904
v5.21      v5    1    5   FC:B   10604905
v5.22      v5    1    6   FC:B   10604906
v5.24      v5    1    8   FC:B   10604907
v5.25      v5    1    9   FC:B   10605008
v5.26      v5    1   10   FC:B   10605009
v5.27      v5    1   11   FC:B   10605010
v5.28      v5    1   12   FC:B   10605011
v5.29      v5    1   13   FC:B   10605012
v5.32      v5    2    0   FC:B   10605013
v5.33      v5    2    1   FC:B   10605014
v5.34      v5    2    2   FC:B   10605015
v5.35      v5    2    3   FC:B   10605016
v5.36      v5    2    4   FC:B   10605017
v5.37      v5    2    5   FC:B   10605018
v5.38      v5    2    6   FC:B   10605019
v5.39      v5    2    7   FC:B   10605020
v5.40      v5    2    8   FC:B   10605021
v5.41      v5    2    9   FC:B   10605022
v5.42      v5    2   10   FC:B   10605023
v5.43      v5    2   11   FC:B   10605024

tan> sysconfig -t    == shows information about the tape drives in the system
Tape drive (v8.0) NETAPP VT-100MB
rst0l  - rewind device, format is: VT-100MB (100 MB)
nrst0l - no rewind device, format is: VT-100MB (100 MB)
urst0l - unload/reload device, format is: VT-100MB (100 MB)
rst0m  - rewind device, format is: VT-100MB (100 MB)
nrst0m - no rewind device, format is: VT-100MB (100 MB)
urst0m - unload/reload device, format is: VT-100MB (100 MB)
rst0h  - rewind device, format is: VT-100MB (100 MB)
nrst0h - no rewind device, format is: VT-100MB (100 MB)
urst0h - unload/reload device, format is: VT-100MB (100 MB)
rst0a  - rewind device, format is: VT-100MB (w/compression)
nrst0a - no rewind device, format is: VT-100MB (w/compression)
urst0a - unload/reload device, format is: VT-100MB (w/compression)

tan> sysconfig -v    == shows the Data ONTAP version plus every device in the PCI slots and the memory; this is the most complete listing
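When the sysconfig -d listing is long, it can help to tally disks per adapter and shelf and compare the totals with the per-slot summary printed by plain sysconfig. The following is a minimal sketch, assuming each data row has six whitespace-separated fields as in the listing above; the function name count_disks_per_shelf is hypothetical, and the sample is a four-row excerpt of that listing.

```python
from collections import Counter


def count_disks_per_shelf(output):
    """Count disks per (adapter, shelf) from `sysconfig -d`-style rows.

    Assumes each data row has six whitespace-separated fields:
    device, adapter, shelf, bay, channel, product id.
    """
    counts = Counter()
    for line in output.splitlines():
        fields = line.split()
        # Skip the header, the dashed separator, and any other non-disk line.
        if len(fields) != 6 or not fields[2].isdigit():
            continue
        adapter, shelf = fields[1], int(fields[2])
        counts[(adapter, shelf)] += 1
    return counts


sample = """\
Device     HA  SHELF BAY  CHAN   Disk Vital Product Information
v4.16      v4    1    0   FC:B   13740500
v4.17      v4    1    1   FC:B   13740501
v5.16      v5    1    0   FC:B   10604900
v5.32      v5    2    0   FC:B   10605013
"""

for (adapter, shelf), n in sorted(count_disks_per_shelf(sample).items()):
    print(f"adapter {adapter} shelf {shelf}: {n} disk(s)")
```

A mismatch between such a tally and the disk count that sysconfig reports for a slot would be a hint to look at the shelf or loop in question.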

The figure above is a schematic of the storage-related hardware.

2.1.2 sysstat

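The best command for viewing CPU utilization is sysstat [interval]. For example, sysstat 1 refreshes the output every second (the default interval is 15 seconds). The output lets us answer the following questions:

Is the load steady, or does it fluctuate?

Is CPU utilization so high that the system cannot keep up with I/O requests?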

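tan> sysstat 1
 CPU   NFS  CIFS  HTTP    Net kB/s    Disk kB/s    Tape kB/s  Cache
                          in   out   read write   read write    age
  0%     0     0     0     0     0      0     0      0     0    >60
  0%     0     0     0     0     0      0     0      0     0    >60

The CPU column shows how busy the CPU is; a value of 70%-80% means the system is fairly busy.
The Net kB/s columns show network traffic in kilobytes per second.
The Disk kB/s columns show disk I/O in kilobytes per second. The filer uses NVRAM to limit write traffic to disk and RAM to cache read data, so a disk read happens only when the data is not in the cache. Ideally, disk writes happen about once every 10 seconds: with a one-second interval you see nine rows with 0 in the write column, followed by one large write. Continuous writes increase CPU load and eventually degrade write performance.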
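The two questions sysstat is meant to answer (steady versus fluctuating load, and whether the CPU is too busy) can be checked mechanically on a saved capture. The sketch below is one possible way to do so in Python. The function name summarize_cpu, the 70% threshold taken from the rule of thumb above, and the sample rows are all illustrative assumptions; the sample figures are invented and are not measurements from the simulator.

```python
def summarize_cpu(rows, busy_threshold=70):
    """Summarize the CPU column of `sysstat`-style data rows.

    rows: iterable of data lines whose first field looks like "12%".
    busy_threshold: percent at or above which a sample counts as busy
    (an assumption based on the 70%-80% rule of thumb).
    """
    samples = []
    for row in rows:
        fields = row.split()
        # Header lines do not start with a percentage and are ignored.
        if fields and fields[0].endswith("%"):
            samples.append(int(fields[0].rstrip("%")))
    if not samples:
        return None
    return {
        "samples": len(samples),
        "average": sum(samples) / len(samples),
        "spread": max(samples) - min(samples),  # large spread = fluctuating load
        "busy_samples": sum(1 for s in samples if s >= busy_threshold),
    }


# Invented rows for illustration only.
sample_rows = [
    " CPU   NFS  CIFS  HTTP    Net kB/s    Disk kB/s    Tape kB/s  Cache",
    " 12%     0     0     0    10    20      0     0      0     0    >60",
    " 75%     0     0     0    90    40      0   800      0     0    >60",
    " 15%     0     0     0    12    18      0     0      0     0    >60",
]

print(summarize_cpu(sample_rows))
```

A single busy sample next to a large write, as in the invented rows, is consistent with the periodic flush described above; many consecutive busy samples would point to sustained load instead.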

2.1.3 options Commands Related to Optimizing CPU Performance

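tan> options raid.reconstruct
raid.reconstruct.perf_impact medium    == controls how much CPU and disk bandwidth RAID group reconstruction may take from client I/O; raise the value (high) to speed up reconstruction, or lower it (low) to reduce the impact on clients

tan> options raid.scrub
raid.scrub.duration          360
raid.scrub.enable            on
raid.scrub.perf_impact       low
raid.scrub.schedule

tan> options vol.copy
vol.copy.throttle            10

tan> options wafl.max        == defines the maximum directory size
wafl.maxdirsize              5242

tan> vol options vol0 maxdirsize 5000    == defines the maximum directory size for volume vol0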
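Because these options are easy to change and easy to forget, one approach is to save the output of the options command and compare it with a known baseline. The Python below is a minimal sketch of that idea. The names parse_options and diff_options are hypothetical helpers, and the baseline values are invented for illustration rather than recommended settings.

```python
def parse_options(output):
    """Parse `options`-style output ("name value" per line) into a dict.

    An option printed with no value (as raid.scrub.schedule is above)
    maps to an empty string.
    """
    settings = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        if not parts:
            continue
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def diff_options(current, baseline):
    """Return {name: (baseline_value, current_value)} where values differ."""
    return {
        name: (expected, current.get(name))
        for name, expected in baseline.items()
        if current.get(name) != expected
    }


captured = """\
raid.scrub.duration          360
raid.scrub.enable            on
raid.scrub.perf_impact       low
raid.scrub.schedule
"""

# Hypothetical baseline for illustration; not a recommended configuration.
baseline = {"raid.scrub.enable": "on", "raid.scrub.perf_impact": "medium"}

current = parse_options(captured)
print(diff_options(current, baseline))
```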
