Automatically Resolve NFS Stale File Handle Errors

Basically, an NFS filehandle is a pointer on the NFS server to the actual file (or directory). In most instances, this pointer is an amalgamation of an inode, disk device, and inode generation of the referenced file on the server. As such, the filehandle is essentially meaningless to the NFS client that has an NFS share mounted.


This is a companion discussion topic for the original entry at https://engineerworkshop.com/blog/automatically-resolve-nfs-stale-file-handle-errors-in-ubuntu-linux/

Cool, and thank you very much. However (You knew there was a butt in there somewhere (pun intended…))

Box 1 (sagetower1) that has the vital data is an UNRAID box that is being used to capture recordings from my PVR device. Because it is capturing streaming video, I want it to use the cache drive to capture and move to regular storage later via the mover process. This appears to be the guy that has the stale NFS mount problem.

Box 2 (tower) is another UNRAID box that is running PLEX in a VM, and has mounted an NFS share from the above sagetower1 box in order to catalog and share up my videos for streaming/downloading to various portable devices (ipads, iphones, laptops, etc).

So, tower sees the stale NFS handles when it goes to query the share for new videos to catalog.

So, my question is, which one of the above boxes do I need to place this script on? And, I haven’t played with UNRAID enough to know, how do I install this script into whichever of the UNRAID boxes that needs to be “fixed”?

In your case, Box 2 (tower) is a client of the NFS server running on Box 1 (sagetower1). The above script goes on the client, so it would go on Box 2 (tower).

Since you’re using Unraid, I would recommend using the User Scripts plugin to add the cron task. Then it’s just a matter of Add New Script > Edit Script > Copy and paste the script above > Save Changes. After that, back on the main User Scripts plugin page, change the schedule to “Custom” and in the field that appears enter either */5 * * * *, if you want the script to check every 5 minutes or * * * * * if you want it to run every minute. (The Unraid User Scripts plugin follows the same format as regular cron jobs).
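For reference, here is roughly how the two schedules above would look as entries in a regular crontab. This is only an illustration of the five-field format: the script path is a hypothetical placeholder, and in the User Scripts plugin you enter just the schedule portion, not the command.

```
# Fields: minute hour day-of-month month day-of-week command
# Run every 5 minutes (hypothetical script path):
*/5 * * * * /usr/local/bin/stale-nfs-check.sh
# Run every minute (hypothetical script path):
* * * * * /usr/local/bin/stale-nfs-check.sh
```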

-TorqueWrench

Sigh I feel like I’m on a treadmill with no definite destination.

I created your script via the User Scripts plugin on my Tower box (box 2). When I try to run it from the command line via ‘bash -x /boot/config/plugins/user.scripts/scripts/staleHandleDestroyer/script’, it basically doesn’t do anything. It appears that the ‘df’ command doesn’t recognize stale NFS handles, or at least doesn’t call them out. So, I did some more research, and found the script that you mentioned for the Proxmox solution. That script does in fact locate and identify mount points that have stale handles. However, your Proxmox solution relies on autofs to remount the drive after it’s been unmounted. So, I thought to myself, I’ll just test this out and manually run a ‘mount -a’. Well, in Unraid, that seems to do nothing. When I try to manually mount my NFS drive from the command line, I get an error ‘mount: /mnt/remotes/sagetower1_sagetv_media: can’t find in /etc/fstab.’
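For anyone who wants to check detection separately from remounting, here is a minimal sketch of the "probe each NFS mount with stat" idea. It only reports and never unmounts anything. The function names, the 2-second limit, and the choice to match on the filesystem type column are my own assumptions for illustration, not part of the original script, and it assumes GNU coreutils (`timeout`, `stat -t`) are available.

```shell
#!/bin/bash
# Sketch: list NFS mount points and probe each one for staleness.
# Function names and the 2-second limit are illustrative assumptions.

# Print mount points whose filesystem type (3rd field) is nfs or nfs4.
# The mounts file is a parameter so the function can be tried on a sample file.
# Mount points containing spaces appear as \040 in this file and are not decoded here.
list_nfs_mounts() {
    local mounts_file="${1:-/proc/mounts}"
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "$mounts_file"
}

# Return 0 if stat fails or does not finish within 2 seconds, 1 if it succeeds.
# A path that simply does not exist is also reported as failing.
is_stale() {
    local path="$1"
    if timeout 2 stat -t "$path" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

# Report only; nothing is unmounted or remounted here.
report_stale_mounts() {
    local mountpoint
    list_nfs_mounts "$@" | while IFS= read -r mountpoint; do
        if is_stale "$mountpoint"; then
            echo "stale: $mountpoint"
        fi
    done
}
```

After sourcing the file, running `report_stale_mounts` with no arguments checks the mounts listed in /proc/mounts. Matching on the type column avoids picking up unrelated lines that merely contain the string "nfs" somewhere in a path.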

I was able to manually mount the nfs share via the GUI, but that is definitely NOT a long term solution.

I was wondering if you had any other insights for this UNRAID situation, or do I need to go back to the Unraid forums and ask there?

(Thank you so much for the insights you have provided and I apologize if I’m being a pest. Just trying to figure out this “exciting” environment that is called UNRAID).

Just wanted to let you know, I think I’ve found a solution on the help page. All necessary scripts seem to be included with the new UD scheme, so I’m going to try their script: ‘/usr/local/sbin/rc.unassigned mount auto’

Thanks

TopJob

Just to wrap this discussion up, and for any others that may stumble across this, here is my script that I am using:

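#!/bin/bash
# Log a timestamped start message.
echo "$0" "$(date '+%Y-%m-%d %H:%M:%S')" ": running"
# Check every NFS mount point listed in /etc/mtab.
for mountpoint in $(grep nfs /etc/mtab | awk '{print $2}')
do

   # If stat prints nothing within 1 second, treat the mount as stale.
   read -t1 < <(stat -t "$mountpoint" 2>&-)
   if [ -z "$REPLY" ] ; then
           echo "NFS mount $mountpoint stale. Removing..."
           umount -f -l "$mountpoint"
           # Remount using the UD script mentioned above.
           /usr/local/sbin/rc.unassigned mount auto
   fi

done
# Log a timestamped completion message.
echo "$0" "$(date '+%Y-%m-%d %H:%M:%S')" ": done"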

PS: sorry for the weird editing, but if I put all of the script in blockquotes then I seem to lose all of the indenting. This seems to preserve indenting as I want.
