NFS service does not start properly after a server reboot

welman00chijian 2011-06-02 08:35:11

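I installed Ubuntu 10.04 on a server and set up NFS to share the HOME directories of several clients. The clients are configured to mount the share automatically at boot.
At first everything worked fine. Yesterday I rebooted the server while several clients were still running. Since then, NFS does not work properly after the server boots: the clients can no longer mount the share automatically, and a manual mount fails as well, with this message:
mount to NFS server 'node1:/export/homes' failed: RPC Error: Success
If I restart NFS by hand on the server, everything works again.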

I assumed the problem was on the server side, so I filtered /var/log/messages and /var/log/syslog with
cat | grep -iE '(rpc|nfs)'
and found nothing unusual in the output:
Jun 2 14:12:07 node1 rpc.statd[1134]: Version 1.1.6 Starting
Jun 2 14:12:07 node1 rpc.statd[1134]: Flags:
Jun 2 14:12:07 node1 kernel: [ 13.438549] RPC: Registered udp transport module.
Jun 2 14:12:07 node1 kernel: [ 13.438549] RPC: Registered tcp transport module.
Jun 2 14:12:07 node1 kernel: [ 13.438549] RPC: Registered tcp NFSv4.1 backchannel transport module.
Jun 2 14:12:07 node1 kernel: [ 13.467135] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
Jun 2 14:12:08 node1 kernel: [ 13.795788] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
Jun 2 14:12:08 node1 kernel: [ 13.803972] NFSD: starting 90-second grace period
It looks like the service started cleanly. Running rpcinfo -p localhost | grep nfs shows:
100003 2 udp 2049 nfs
100003 3 udp 2049 nfs
100003 4 udp 2049 nfs
100003 2 tcp 2049 nfs
100003 3 tcp 2049 nfs
100003 4 tcp 2049 nfs
Checking port 2049 with netstat at the same time gives:
tcp 0 0 0.0.0.0:2049 0.0.0.0:* LISTEN -
udp 0 0 0.0.0.0:2049 0.0.0.0:* -
So port 2049 is held by NFS and is in the LISTEN state.
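The rpcinfo check above can be condensed into a one-line summary per transport, which makes it easier to compare the state right after boot with the state after a manual restart. This is only a rough sketch: nfs_versions is a helper name made up for this post, and it assumes the five-column program/version/proto/port/service layout shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: summarize which NFS versions are registered
# with the portmapper, grouped by transport.
# Reads `rpcinfo -p` style lines on stdin:
#   program version proto port service
nfs_versions() {
    awk '$5 == "nfs" { seen[$3] = seen[$3] " " $2 }
         END { for (p in seen) print p ":" seen[p] }' | sort
}

# On the server this would be used as (not executed here):
#   rpcinfo -p localhost | nfs_versions
```

Running it once after boot and once after restarting NFS would show whether the set of registered versions actually differs between the two states.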

Once I restart the NFS service, a manual mount works without problems.
syslog shows:
Jun 2 14:32:11 node1 kernel: [ 1216.968817] nfsd: last server has exited, flushing export cache
Jun 2 14:32:12 node1 kernel: [ 1218.043658] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
Jun 2 14:32:12 node1 kernel: [ 1218.043686] NFSD: starting 90-second grace period
Apart from the export cache being flushed, there is no difference from the boot-time log.
If I only reset the exports by hand with
sudo exportfs -arv
the clients still cannot mount.

There is one more thing I do not understand. Before restarting NFS, I used showmount to list the currently mounted client IPs, and it showed:
showmount -a localhost
All mount points on localhost:
192.168.0.2:/export/homes
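It looks as if a client had already mounted the share. 192.168.0.2 is the IP of one of the clients.
Still, this should not be a client problem, because once NFS is back to normal, a client that I reboot mounts the share automatically.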

That is as far as my analysis goes. Any advice would be appreciated, thanks!


welman00chijian 2011-06-02


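The problem is solved, although I still do not really understand why.
When mounting, I added the option -o nfsvers=2
to force NFS version 2, and the mount succeeded. After adding nfsvers=2 to the mount options in the clients' fstab, the automatic mount works too.
I looked at the NFS manual, which says that mount.nfs defaults to nfsvers 3. On the server,
rpcinfo -t localhost nfs shows that versions 2, 3 and 4 are all ready.
So I still do not understand why version 2 has to be specified explicitly.
I am leaving the thread open for now in the hope that someone more experienced can explain.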
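For anyone who wants to reproduce the workaround, here is a small sketch of what the pinned-version setup might look like. The mount point /home and the helper name nfs_fstab_line are assumptions for illustration, as are the trailing dump/pass fields; the server name and export path are the ones from the error message above.

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/fstab entry that pins the
# NFS protocol version via the nfsvers mount option.
# Usage: nfs_fstab_line SERVER:EXPORT MOUNTPOINT VERSION
nfs_fstab_line() {
    printf '%s %s nfs nfsvers=%s 0 0\n' "$1" "$2" "$3"
}

# Entry for the export from this thread (/home is an assumed mount point).
nfs_fstab_line node1:/export/homes /home 2

# The equivalent one-off manual mount would be (not executed here):
#   sudo mount -t nfs -o nfsvers=2 node1:/export/homes /home
```

Generating the line with a helper rather than typing it keeps the version in one place, so trying nfsvers=3 again later is a one-character change.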

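As an aside, while going through /var/log/messages earlier I came across this error:
kernel: [ 11.176079] svc: failed to register lockdv1 RPC service (errno 97).
At first I thought it was the cause. After some searching, it turned out to be mainly related to Ubuntu enabling IPv6 by default.
Changing the GRUB_CMDLINE_LINUX entry in /etc/default/grub to
GRUB_CMDLINE_LINUX="ipv6.disable=1"
disables IPv6 and makes the error go away.
However, this error is not directly related to the NFS problem.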
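A quick way to confirm the GRUB change is in place before rebooting is sketched below. has_ipv6_disabled is a made-up helper name, and the check only looks at the GRUB_CMDLINE_LINUX line, not GRUB_CMDLINE_LINUX_DEFAULT. On Ubuntu the edited file only takes effect after the GRUB configuration is regenerated with update-grub and the machine is rebooted.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given grub defaults file sets
# ipv6.disable=1 on its GRUB_CMDLINE_LINUX line.
has_ipv6_disabled() {
    grep -Eq '^GRUB_CMDLINE_LINUX=.*ipv6\.disable=1' "$1"
}

# Only inspect the real file when it is present and readable.
if [ -r /etc/default/grub ]; then
    if has_ipv6_disabled /etc/default/grub; then
        echo "ipv6.disable=1 is set in /etc/default/grub"
    else
        echo "ipv6.disable=1 is not set in /etc/default/grub"
    fi
fi

# After editing the file, regenerate the config and reboot (not executed here):
#   sudo update-grub
```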