NFS client kernel issue?

NFS client kernel issue?

Keith Jackson-2
Hi,
I've run into a strange problem, and I'm not sure where to ask about  
it. It appears to be kernel related, so hopefully someone here can  
shed some light on it for me. I have over a dozen Ubuntu 9 server
machines that share an NFS filesystem served up on a FreeBSD machine.
We recently upgraded the FreeBSD box to FreeBSD 7.2. The Ubuntu
machines are running the standard 2.6.28-11-server kernel.

At this point, we started having problems with NFS on all of the
Ubuntu machines. They would come up, mount the NFS shares and then
5-10 minutes later NFS would be hung. I ran wireshark on one of them,  
while doing "ls -l ~username". When NFS was working everything looked  
fine. Once it hung, I would see the normal YP MATCH RPC call looking  
for the entry for "username". The NIS server would return the password  
entry for "username" as I'd expect. When things were working, I'd then  
see the standard NFS traffic. When NFS was hosed, the YP MATCH reply  
would be the last net traffic I would see to the NFS server. Nothing  
NFS related would ever happen. An strace would show that the ls was  
hung in a stat64 call.
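
A hang like the one described above can be checked for without tying up an interactive shell. The following is a minimal sketch, not a definitive tool: it runs `os.stat()` in a child process and reports whether the call finished before a deadline. The function name `probe_stat` and the 5-second default are assumptions made for illustration. The child is deliberately not waited on after a timeout, because a process blocked on an unresponsive NFS mount may not exit promptly.

```python
import subprocess
import sys
import time


def probe_stat(path, timeout=5.0):
    """Return 'ok', 'error', or 'hung' for a stat() of path.

    The stat runs in a child process so that a blocked NFS call
    cannot block the caller. (Hypothetical helper, for illustration.)
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        rc = child.poll()
        if rc is not None:
            # Exit status 0 means stat() succeeded; anything else means
            # it raised (for example, the path does not exist).
            return "ok" if rc == 0 else "error"
        time.sleep(0.05)
    # Deadline passed: try to kill the child, but do not wait for it.
    child.kill()
    return "hung"
```

Calling `probe_stat("/home/username")` every minute after boot and logging the first "hung" result would give a timestamp to line up against a packet capture.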

A reboot of the system would cause things to start working for a  
while, but 5-10 minutes later NFS would hang. I eventually updated to  
a 2.6.30 kernel from: http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.30/
This has fixed the problem.

Any thoughts on what the problem might be, or how I might go about  
debugging it better?
thx,
--keith
--------------------------------------------------------------------------------------------------------
Keith R. Jackson                                     email: [hidden email]
MS: 50B-2239                                         phone: 510-486-4401
Lawrence Berkeley National Lab        url: http://www-itg.lbl.gov/~kjackson/
----------------------------------------------------------------------------------------------------------



--
kernel-team mailing list
[hidden email]
https://lists.ubuntu.com/mailman/listinfo/kernel-team
Re: NFS client kernel issue?

Andy Whitcroft-3
On Mon, Jun 29, 2009 at 04:29:25PM -0700, Keith Jackson wrote:

> Any thoughts on what the problem might be, or how I might go about  
> debugging it better?

I am unsure whether the YP stuff is done in the kernel, but there do
seem to be a number of reverted patches in the v2.6.28 to v2.6.30
period.  As an example:

    nfsd: Revert "svcrpc: take advantage of tcp autotuning"

It would be worth testing v2.6.29 as the midpoint to try and split the
difference and help reduce the gap.  Also get a bug filed in launchpad
and let us know its number.

-apw

Re: NFS client kernel issue?

Stefan Bader-2
In reply to this post by Keith Jackson-2
I cannot recall anything NFS specific that would matter. But there is quite a
bunch of stable patches that need to get into Jaunty. I have preview packages
of those at https://launchpad.net/~stefan-bader-canonical/+archive/jaunty
which you could try as well. Nothing directly says rpc or nfs but maybe
something in net or vfs.

Regards,
Stefan



--

When all other means of communication fail, try words!



RE: NFS client kernel issue?

joakim-2
In reply to this post by Keith Jackson-2

Hi,

I am not sure that what you describe is a kernel problem. I saw similarly
strange behavior on my desktop, Ubuntu 7.04 and later, on our office network,
which uses NIS and NFS. I finally tracked the problem down to the limit of
16 GIDs in the RPC specification on which NFS is based. Ubuntu
has many predefined GIDs, and if you have a number of NIS GIDs as well, the
excess is just silently discarded.

This can cause strange silent access violations on systems where the GID is
sometimes taken from the RPC and in other cases looked up locally on the server
through NIS based on the UID, depending on the service. The following thread
talks more about it:

http://forums13.itrc.hp.com/service/forums/questionanswer.do?admit=109447627+1246378870139+28353475&threadId=1135775

It would be nice if Ubuntu limited the number of local GIDs or at least
prioritized the GIDs supplied over NIS. Maybe not a kernel team issue?
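
To see whether an account is affected by the 16-GID limit, the group list can be checked on a client. This is only a sketch: the helper names are made up for illustration, and it assumes the first 16 entries of the list are the ones that survive, which may not match what a particular client actually sends. Note also that `os.getgrouplist` includes the primary GID in its result.

```python
import os
import pwd

# Limit on supplementary GIDs in an AUTH_SYS (AUTH_UNIX) RPC credential.
AUTH_SYS_MAX_GIDS = 16


def split_gids(gids, limit=AUTH_SYS_MAX_GIDS):
    """Split a GID list into (kept, dropped) at the given limit.

    Assumption: the first `limit` entries are kept; real clients may
    order the list differently before truncating.
    """
    gids = list(gids)
    return gids[:limit], gids[limit:]


def gids_for_user(username):
    """Return all GIDs of a user as resolved on this machine (local and NIS)."""
    entry = pwd.getpwnam(username)
    return os.getgrouplist(username, entry.pw_gid)
```

For example, `kept, dropped = split_gids(gids_for_user("username"))` leaves a non-empty `dropped` list when the account belongs to more groups than the credential can carry.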

BR

Joakim

Re: NFS client kernel issue?

Keith Jackson-2
In reply to this post by Andy Whitcroft-3
On Jun 30, 2009, at 5:46 AM, Andy Whitcroft wrote:

> It would be worth testing v2.6.29 as the midpoint to try and split the
> difference and help reduce the gap.  Also get a bug filed in launchpad
> and let us know its number.


Ok, I filed a bug in launchpad: #394413

I will add this information to launchpad, but I did some further
testing. 2.6.29-02062905-generic from the mainline tree also has this
problem, as do vmlinuz-2.6.28-11-server and vmlinuz-2.6.27-11-server.
vmlinuz-2.6.24-21-server seems to be working fine. Since the
problem is showing up in the mainline, I will focus on doing some
more testing there to see if I can find exactly which version is the
first to exhibit this problem.
--keith
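
The results reported in this thread can be tabulated to show where the behaviour changes. The sketch below is illustrative only: it mixes Ubuntu and mainline builds as if they were directly comparable, orders them by upstream version number alone, and uses made-up helper names.

```python
import re


def kernel_key(name):
    """Extract (major, minor, patch) from a name like 'vmlinuz-2.6.28-11-server'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", name)
    if match is None:
        raise ValueError("no kernel version found in %r" % name)
    return tuple(int(part) for part in match.groups())


def transitions(results):
    """List adjacent (older, newer, change) triples where the outcome flips."""
    ordered = sorted(results.items(), key=lambda item: kernel_key(item[0]))
    changes = []
    for (older, older_ok), (newer, newer_ok) in zip(ordered, ordered[1:]):
        if older_ok != newer_ok:
            changes.append((older, newer, "regressed" if older_ok else "fixed"))
    return changes


# True means NFS kept working, False means it hung (as reported in the thread).
RESULTS = {
    "vmlinuz-2.6.24-21-server": True,
    "vmlinuz-2.6.27-11-server": False,
    "vmlinuz-2.6.28-11-server": False,
    "2.6.29-02062905-generic": False,
    "2.6.30": True,
}

if __name__ == "__main__":
    for older, newer, change in transitions(RESULTS):
        print("%s between %s and %s" % (change, older, newer))
```

Each reported gap is a candidate range for further testing, whether with intermediate builds or with a bisection of the kernel sources.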
