Scheduled SMART Checks
For years, hard disks (both spinning rust and SSDs) have had a built-in monitoring system called SMART, which tracks various metrics about the health of your disk. In the old days, if you were lucky, you might get some warning that your disk was about to fail because it would start to make a nasty noise. In the modern era of SSDs you likely won’t get any warning, and suddenly, boom, your laptop won’t boot or mount the disk.
Obviously nothing is perfect, and any monitoring can miss a failure, but the potential of some warning is better than definitely not getting any. This is also no substitute for a proper backup and recovery strategy, but in most home situations people don’t have spare laptops or hard drives just sitting around.
It would be relatively easy for operating system vendors to detect SMART-capable drives and automatically run a check every so often. If it fails, they could pop up a warning about a potential imminent failure. As far as I know, though, no one does this.
There is a simple command line tool that lets you interrogate SMART attributes yourself, smartctl. It is available in all Linux distribution package repositories, and in both Homebrew and MacPorts for Mac OS. I’m not a Windows user, so I’m not sure about using it there.
On both my work and my personal MacBooks, and on my home Linux server, I run something similar to the following. First I have a script which enables SMART monitoring, then triggers a short test, waits for it to complete and finally emails me the output.
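#!/bin/bash
# Enable SMART monitoring on the disk.
/opt/local/sbin/smartctl --all /dev/disk1 -s on
# Start a short self-test, discarding both stdout and stderr.
/opt/local/sbin/smartctl --all /dev/disk1 -t short > /dev/null 2>&1
# Give the self-test ten minutes to finish.
sleep 600
# Email the full SMART report.
/opt/local/sbin/smartctl --all /dev/disk1 | /usr/bin/mail -s 'MacBook SMART' myemailaddress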
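The script above emails the full report every time. If you would rather only look closely when something is wrong, smartctl's exit status can help: according to the smartctl(8) man page it is a bitmask, with individual bits set for different kinds of problem. The following is a minimal sketch of decoding it. The function name `describe_smartctl_status` is my own invention, and the messages are paraphrased from the man page, so check them against the version you have installed.

```shell
#!/bin/bash
# Translate smartctl's exit status (a bitmask, documented in smartctl(8))
# into one human-readable line per problem found.
# describe_smartctl_status is a hypothetical helper, not part of smartmontools.
describe_smartctl_status() {
    local status=$1
    if (( status == 0 )); then
        echo "no problems reported"
        return 0
    fi
    # Bits 0-2: smartctl itself had trouble (bad arguments, device open
    # failure, or a failed SMART command).
    if (( status & 7 )); then echo "smartctl could not run cleanly against the device"; fi
    if (( status & 8 )); then echo "SMART status check returned DISK FAILING"; fi
    if (( status & 16 )); then echo "prefail attributes at or below threshold"; fi
    if (( status & 32 )); then echo "attributes were at or below threshold in the past"; fi
    if (( status & 64 )); then echo "device error log contains errors"; fi
    if (( status & 128 )); then echo "self-test log contains errors"; fi
    return 0
}

# Usage sketch (paths as in the script above):
#   /opt/local/sbin/smartctl --all /dev/disk1 > /tmp/smart.txt
#   describe_smartctl_status $?
```

A non-zero status could then be used to change the email subject, for example, so that a problem report stands out from the routine weekly ones.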
On Linux it’s easy to use cron to trigger this script weekly. On Mac OS it’s a little more complicated, but the following plist file will do it. Simply place it in (replacing your username and script file name as appropriate), and it’ll get run weekly, when you switch your laptop on.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>smartctl</string>
    <key>Program</key>
    <string>/Users/andrew/bin/smartctl</string>
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key>
        <integer>0</integer>
        <key>Minute</key>
        <integer>0</integer>
        <key>Weekday</key>
        <integer>1</integer>
    </dict>
    <key>AbandonProcessGroup</key>
    <true/>
</dict>
</plist>
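For the Linux side, the weekly cron schedule mentioned earlier could look something like the sketch below. It mirrors the plist's schedule (midnight on weekday 1, which is Monday in both launchd and cron). The helper name `weekly_cron_line` and the script path in the usage comment are hypothetical, so substitute wherever you keep your own script.

```shell
#!/bin/bash
# Print a crontab line that runs the given script at 00:00 every Monday.
# weekly_cron_line is a hypothetical helper used only for illustration.
weekly_cron_line() {
    local script=$1
    printf '0 0 * * 1 %s\n' "$script"
}

# To append the entry to your existing crontab (assumed script location):
#   ( crontab -l 2>/dev/null; weekly_cron_line "$HOME/bin/smart-check.sh" ) | crontab -
```

Keeping the existing entries via `crontab -l` matters here, because `crontab -` replaces the whole table with whatever it reads from standard input.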
On my work laptop smartctl is reporting PASSED, but one attribute is failing. I guess we just need to wait and see what happens. Fingers crossed it holds out until I can get an Apple-silicon-powered MacBook Pro.
173 Wear_Leveling_Count 0x0032 096 096 100 Old_age Always FAILING_NOW 12309581794157
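In that line the normalised value (096) has dropped below the threshold (100), which is why the WHEN_FAILED column reads FAILING_NOW even though the overall result is PASSED. Since the overall verdict can hide this, it may be worth picking such lines out of the report automatically. Here is a rough sketch using awk. It assumes the usual ten-column ATA attribute table that smartctl prints (with WHEN_FAILED as the ninth column), and `failing_attributes` is a name I made up.

```shell
#!/bin/bash
# Read smartctl attribute output on stdin and print every attribute whose
# WHEN_FAILED column (field 9) is something other than "-".
# Assumes the ten-column ATA attribute table; failing_attributes is a
# hypothetical helper, not part of smartmontools.
failing_attributes() {
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^0x/ && NF >= 10 && $9 != "-" {
        printf "%s: %s (value %s, threshold %s)\n", $2, $9, $4, $6
    }'
}

# Usage sketch (paths as in the script earlier in the post):
#   /opt/local/sbin/smartctl --all /dev/disk1 | failing_attributes
```

Fed the line above, this prints `Wear_Leveling_Count: FAILING_NOW (value 096, threshold 100)`, and it prints nothing for attributes whose WHEN_FAILED column is `-`.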