Rob Garth
Mildly Useful Stuff

NFS errors and File Server crashes

January 17th, 2007 by robg

Ahh, the joys of being a Linux admin. Last week the RAID array connected to our main file server suffered a hardware fault. Not to worry, we have a backup. I NFS-mounted the system from the backup box and expected everything to work … it should have worked.

Of course nothing is ever simple, and the NFS code in the 2.6.18 kernel and later appears to be broken. We had already been bitten by this bug: we had previously been a complete Tru64 shop, and while we still had some of those servers in use it bit us hard. We decided to press on and upgraded those machines to Linux. But now it appears that Samba cannot safely export NFS-mounted shares. It should be able to, and always has in the past, but if a Windows box mounted one of the Samba shares and attempted to write a file, the write would fail and, worse, clobber the file, writing nonsensical data on top of the existing contents.

We quickly heard about the issues from our users. I did a DNS fiddle and a yum to install Samba, and was happily sharing files from the backup machine, with the exception of a few machines which had to rejoin the domain.

What to do with the file server? I am guessing the issues I saw were caused by, or related to, the same NFS bug, so no supported version of Fedora was an option for the file server, as the bug is in both FC5 and FC6. I installed CentOS 4.4 on the box and had it ready to go with Samba again. I ran it up as the Domain Master, and all the shares were working correctly, including Samba shares of NFS mounts. I breathed a sigh of relief.
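
One quick way to check a share for the kind of clobbering described above is a write, overwrite and read-back test. The following is a minimal Python sketch, not what was run at the time. The function name `roundtrip_ok` is made up for illustration, and it assumes the script runs on a client that has the Samba share mounted. A read-back can be served from the client's cache, so a passing result is a hint rather than proof.

```python
import os
import tempfile


def roundtrip_ok(directory, size=1 << 20):
    """Create a file in `directory`, overwrite it, and check the read-back.

    Returns True if the bytes read back match the bytes last written.
    (Hypothetical helper, for illustration only.)
    """
    first = os.urandom(size)
    second = os.urandom(size)
    fd, path = tempfile.mkstemp(dir=directory, prefix="roundtrip-")
    try:
        # Initial write, flushed to the server as far as the client allows.
        with os.fdopen(fd, "wb") as f:
            f.write(first)
            f.flush()
            os.fsync(f.fileno())
        # Overwrite the existing file: the case reported as clobbering data.
        with open(path, "wb") as f:
            f.write(second)
            f.flush()
            os.fsync(f.fileno())
        with open(path, "rb") as f:
            return f.read() == second
    finally:
        # Clean up the scratch file whether or not the check passed.
        os.remove(path)
```

Calling `roundtrip_ok("/mnt/share")`, where `/mnt/share` stands in for wherever the share is mounted, returns `True` when the overwritten file reads back intact. If the write itself fails, the exception propagates, which is also a useful signal.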

Of course there were more problems: none of our Windows machines or Linux boxes could join our domain. We had never had this problem before. I ran Samba at all kinds of debug levels to try to diagnose the problem. I was getting some weird LDAP errors, but we don’t use LDAP as a back end for Samba. A Google search returned nothing useful, but I did notice others having similar problems on similar versions of Samba. CentOS ships with an older version of Samba.

I downloaded the Fedora Core 6 SRPM of Samba and built it on the CentOS box. Success! The shares worked, and all of the machines could join the domain.

Not the best way to spend my Sunday. If I whinge about the QA in Fedora you’re all going to tell me that I shouldn’t be running it on a production server, so I won’t. The Samba errors I was seeing in CentOS were surprising though.

