http://forum.havetheknowhow.com/viewtop ... 1991#p1991
I have, however, thought of a better solution, which makes the output for logging and email content much more readable and informative.
To do this, I have used the contents of the /dev/disk/by-id folder.
First, in a terminal, I enter:
Code: Select all
~$ ls -l /dev/disk/by-id
total 0
lrwxrwxrwx 1 root root 9 Aug 1 09:13 ata-Hitachi_HDS724040ALE640_PK1310PAG0VMBJ -> ../../sdb
lrwxrwxrwx 1 root root 10 Aug 1 09:13 ata-Hitachi_HDS724040ALE640_PK1310PAG0VMBJ-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 9 Aug 1 09:13 ata-MATSHITABD-MLT_UJ240AS_WJ42_003694 -> ../../sr0
lrwxrwxrwx 1 root root 9 Aug 1 09:13 ata-OCZ-NOCTI_OCZ-F412PBYMZ7MZ4E6W -> ../../sdc
lrwxrwxrwx 1 root root 10 Aug 1 09:13 ata-OCZ-NOCTI_OCZ-F412PBYMZ7MZ4E6W-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 9 Aug 1 09:13 ata-SAMSUNG_SSD_830_Series_S0XZNEAC711934 -> ../../sda
lrwxrwxrwx 1 root root 10 Aug 1 09:13 ata-SAMSUNG_SSD_830_Series_S0XZNEAC711934-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Aug 1 09:13 ata-SAMSUNG_SSD_830_Series_S0XZNEAC711934-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Aug 1 09:13 ata-SAMSUNG_SSD_830_Series_S0XZNEAC711934-part3 -> ../../sda3
lrwxrwxrwx 1 root root 10 Aug 1 09:13 dm-name-Server-root -> ../../dm-0
lrwxrwxrwx 1 root root 10 Aug 1 09:13 dm-name-Server-swap_1 -> ../../dm-1
lrwxrwxrwx 1 root root 10 Aug 1 09:13 dm-name-Server-System -> ../../dm-2
lrwxrwxrwx 1 root root 10 Aug 1 09:13 dm-uuid-LVM-Z8LZg70hTKbj7AoTEU12IP81IeP5fgLdb9h3Rs23fJ2io8zxPjbpedP4eUrC3OVw -> ../../dm-2
lrwxrwxrwx 1 root root 10 Aug 1 09:13 dm-uuid-LVM-Z8LZg70hTKbj7AoTEU12IP81IeP5fgLdG5s3dBd4rFdt34hdJoyhJxB5oCw7l6RI -> ../../dm-1
lrwxrwxrwx 1 root root 10 Aug 1 09:13 dm-uuid-LVM-Z8LZg70hTKbj7AoTEU12IP81IeP5fgLdVJQezcxe7NQVplQeVfQFMlqY4dAwq73D -> ../../dm-0
lrwxrwxrwx 1 root root 9 Aug 1 09:13 scsi-SATA_Hitachi_HDS7240_PK1310PAG0VMBJ -> ../../sdb
lrwxrwxrwx 1 root root 10 Aug 1 09:13 scsi-SATA_Hitachi_HDS7240_PK1310PAG0VMBJ-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 9 Aug 1 09:13 scsi-SATA_OCZ-NOCTI_OCZ-F412PBYMZ7MZ4E6W -> ../../sdc
lrwxrwxrwx 1 root root 10 Aug 1 09:13 scsi-SATA_OCZ-NOCTI_OCZ-F412PBYMZ7MZ4E6W-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 9 Aug 1 09:13 scsi-SATA_SAMSUNG_SSD_830S0XZNEAC711934 -> ../../sda
lrwxrwxrwx 1 root root 10 Aug 1 09:13 scsi-SATA_SAMSUNG_SSD_830S0XZNEAC711934-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Aug 1 09:13 scsi-SATA_SAMSUNG_SSD_830S0XZNEAC711934-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Aug 1 09:13 scsi-SATA_SAMSUNG_SSD_830S0XZNEAC711934-part3 -> ../../sda3
lrwxrwxrwx 1 root root 9 Aug 1 09:13 wwn-0x5000cca22bc063f2 -> ../../sdb
lrwxrwxrwx 1 root root 10 Aug 1 09:13 wwn-0x5000cca22bc063f2-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 9 Aug 1 09:13 wwn-0x5002538043584d30 -> ../../sda
lrwxrwxrwx 1 root root 10 Aug 1 09:13 wwn-0x5002538043584d30-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Aug 1 09:13 wwn-0x5002538043584d30-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Aug 1 09:13 wwn-0x5002538043584d30-part3 -> ../../sda3
lrwxrwxrwx 1 root root 9 Aug 1 09:13 wwn-0x5e83a97edd3455aa -> ../../sdc
lrwxrwxrwx 1 root root 10 Aug 1 09:13 wwn-0x5e83a97edd3455aa-part1 -> ../../sdc1
In this listing, the line
lrwxrwxrwx 1 root root 9 Aug 1 09:13 scsi-SATA_Hitachi_HDS7240_PK1310PAG0VMBJ -> ../../sdb
is what I'm looking for.
The path to this symbolic link is therefore:
/dev/disk/by-id/scsi-SATA_Hitachi_HDS7240_PK1310PAG0VMBJ
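To double-check which device node a by-id name points at, the symlink can be resolved directly rather than read off the listing. This is only a minimal sketch: the function name resolve_by_id and its optional second argument (the directory to look in) are my own assumptions, added so it can be tried against any folder of symlinks.

```shell
#!/bin/bash
# Hypothetical helper: resolve a by-id name to the device node it points at.
# The optional second argument is an assumption added for testing; it
# defaults to the real /dev/disk/by-id folder.
resolve_by_id() {
    local id=$1
    local dir=${2:-/dev/disk/by-id}
    # readlink -f follows the symlink and prints the canonical target path
    readlink -f "$dir/$id"
}

# Example call (on the server above this should print /dev/sdb):
# resolve_by_id scsi-SATA_Hitachi_HDS7240_PK1310PAG0VMBJ
```

Because the link target is relative (../../sdb), readlink -f is used here to get the full /dev path instead of the raw link text.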
So the modified Thermal Shutdown script for this method will look like:
Code: Select all
#!/bin/bash
#PURPOSE: Script to check temperature of installed hard drives and report/shutdown if specified temperatures exceeded
#
# Modified for this server!!
#
# AUTHOR: feedback[AT]HaveTheKnowHow[DOT]com
# Expects three arguments:
# 1. Warning temperature
# 2. Critical shutdown temperature
# 3. If argument 3 is present then just check the drive with that ID (its name under /dev/disk/by-id)
# eg. using ./DriveTemps.sh 35 45
# will warn when temperature of one or more drives reaches 35degrees and shutdown when any one of them hits 45
# eg. using ./DriveTemps.sh 35 45 scsi-SATA_Hitachi_HDS7240_PK1310PAG0VMBJ
# will warn when temperature of that drive reaches 35degrees and shutdown when it hits 45
# NOTES:
# Change the string ">>/home/server/Logs" as required
# Substitute your own email address in the line which starts "/usr/sbin/ssmtp"
# Change the command MyList='...' to the /dev/disk/by-id names of your drives, separated by spaces. In this case I'm checking 1 drive
# Assumes /usr/sbin/smartctl -n standby -a /dev/disk/by-id/$i returns the string 'Temperature_Celsius' somewhere
echo "JOB RUN AT $(date)"
echo '============================'
echo ''
echo 'Drive Warning Limit set to =>' $1
echo 'Drive Shutdown Limit set to =>' $2
echo ''
echo ''
if [ $# -eq 2 ]
then
    MyList='scsi-SATA_Hitachi_HDS7240_PK1310PAG0VMBJ'
    echo 'Testing all drives'
else
    # Plain string, not an array, so the "for i in $MyList" loops below see it
    MyList=$3
    echo 'Testing only drive with ID: '$3
fi
echo ''
for i in $MyList
do
    echo 'Drive /dev/disk/by-id/'$i
    /usr/sbin/smartctl -n standby -a /dev/disk/by-id/$i | grep Temperature_Celsius
done
echo ''
echo ''
for i in $MyList
do
    #Check state of drive 'active/idle' or 'standby'
    stra=$(/sbin/hdparm -C /dev/disk/by-id/$i | grep 'drive' | awk '{print $4}')
    echo 'Testing Drive with ID: '$i
    # Quoted so the test does not break if hdparm returned nothing
    if [ "${stra}" = 'standby' ]
    then
        echo ' Drive with ID: '$i ' is in standby'
        echo ''
    else
        # Column 10 of the Temperature_Celsius line is the raw temperature
        str2=$(/usr/sbin/smartctl -n standby -a /dev/disk/by-id/$i | grep Temperature_Celsius | awk '{print $10}')
        if [ "${str2}" -ge "$1" ]
        then
            echo '========================================' >>/home/server/Logs/DriveWarning.Log
            echo $(date) >>/home/server/Logs/DriveWarning.Log
            echo '' >>/home/server/Logs/DriveWarning.Log
            echo 'WARNING: TEMPERATURE FOR DRIVE with ID: '$i 'EXCEEDED' $1 '=>' $str2 >>/home/server/Logs/DriveWarning.Log
            echo '' >>/home/server/Logs/DriveWarning.Log
            echo '========================================' >>/home/server/Logs/DriveWarning.Log
            echo '========================================'
            echo $(date)
            echo ''
            echo 'WARNING: TEMPERATURE FOR DRIVE with ID: '$i 'EXCEEDED' $1 '=>' $str2
            echo ''
            echo '========================================'
        fi
        if [ "${str2}" -ge "$2" ]
        then
            echo '========================================' >>/home/server/Logs/DriveWarning.Log
            echo $(date) >>/home/server/Logs/DriveWarning.Log
            echo '' >>/home/server/Logs/DriveWarning.Log
            echo 'CRITICAL: TEMPERATURE FOR DRIVE with ID: '$i 'EXCEEDED' $2 '=>' $str2 >>/home/server/Logs/DriveWarning.Log
            echo '' >>/home/server/Logs/DriveWarning.Log
            echo '========================================' >>/home/server/Logs/DriveWarning.Log
            echo '========================================'
            echo $(date)
            echo ''
            echo 'CRITICAL: TEMPERATURE FOR DRIVE with ID: '$i 'EXCEEDED' $2 '=>' $str2
            echo ''
            echo '========================================'
            # Send the email first, otherwise it only goes out after the server wakes up again
            /usr/sbin/ssmtp ******@****** </home/server/Logs/DriveWarning.Log
            echo 'Email Sent.....'
            /usr/sbin/pm-hibernate
            exit
        else
            # NOTE: this branch also runs when only the warning limit was exceeded
            echo ''
            echo ' Temperature of Drive with ID: '$i' is OK at =>' $str2
            echo ''
        fi
    fi
done
# NOTE: printed whenever no drive hit the shutdown limit, even if a warning was raised
echo 'All Drives are within limits'
echo ''
To enable the hibernation feature, just install pm-utils:
Code: Select all
sudo apt-get install pm-utils
A test run with a warning limit of 40 and a shutdown limit of 55 gives:
Code: Select all
server@Server:~/Scripts$ sudo ./DriveTempShutdown.sh 40 55
[sudo] password for server:
JOB RUN AT Sat Aug 4 19:32:44 BST 2012
============================
Drive Warning Limit set to => 40
Drive Shutdown Limit set to => 55
Testing all drives
Drive /dev/disk/by-id/scsi-SATA_Hitachi_HDS7240_PK1310PAG0VMBJ
194 Temperature_Celsius 0x0002 139 139 000 Old_age Always - 43 (Min/Max 22/47)
Testing Drive with ID: scsi-SATA_Hitachi_HDS7240_PK1310PAG0VMBJ
========================================
Sat Aug 4 19:32:47 BST 2012
WARNING: TEMPERATURE FOR DRIVE with ID: scsi-SATA_Hitachi_HDS7240_PK1310PAG0VMBJ EXCEEDED 40 => 43
========================================
Temperature of Drive with ID: scsi-SATA_Hitachi_HDS7240_PK1310PAG0VMBJ is OK at => 43
All Drives are within limits
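Note that the run above prints both a WARNING line and an "is OK" line for the same 43 degree reading. One possible way to tidy this up, purely as a sketch, is to pull the temperature parsing and the limit check into two small functions that return exactly one state per reading. The function names get_temp and classify_temp are my own assumptions and are not part of the original script; the sample line is the smartctl line from the run above.

```shell
#!/bin/bash
# Hypothetical helper: read "smartctl -a" style output on stdin and print the
# raw value (10th column) of the Temperature_Celsius attribute.
get_temp() {
    awk '/Temperature_Celsius/ {print $10; exit}'
}

# Hypothetical helper: compare a temperature against the warning and
# shutdown limits and print exactly one of OK, WARNING or CRITICAL.
classify_temp() {
    local temp=$1 warn=$2 crit=$3
    if [ "$temp" -ge "$crit" ]; then
        echo CRITICAL
    elif [ "$temp" -ge "$warn" ]; then
        echo WARNING
    else
        echo OK
    fi
}

# Try it on the smartctl line from the test run, without touching a real drive
line='194 Temperature_Celsius 0x0002 139 139 000 Old_age Always - 43 (Min/Max 22/47)'
temp=$(echo "$line" | get_temp)
echo "$temp => $(classify_temp "$temp" 40 55)"   # prints "43 => WARNING"
```

Checking the shutdown limit first means a drive over both limits is reported once as CRITICAL rather than as a warning followed by a second message.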
I hope that this is as useful to others as I find it to be. Changing to the disk/by-id/ symlink prevents the script from breaking on a hardware change, since the sd# letters can be reassigned but the by-id name stays with the drive, and it also tells you exactly which drive has overheated.
Further improvements could be made by resolving the drive ID back to its sd# name; ie: scsi-SATA_Hitachi_HDS7240_PK1310PAG0VMBJ -> ../../sdb. Maybe adding something along the lines of: ls -l /dev/disk/by-id | grep $i might work? Giving the volume label would also be useful; perhaps this could be grep-ed from /dev/disk/by-label using the result of the previous step (ie. sdb):
Code: Select all
server@Server:~/Scripts$ ls -l /dev/disk/by-label
total 0
lrwxrwxrwx 1 root root 10 Aug 4 18:46 4TB_Storage -> ../../sdb1
lrwxrwxrwx 1 root root 10 Aug 4 18:46 Recordings -> ../../sdc1
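As a rough sketch of that idea, the by-id symlink can be resolved to its disk and then matched against the targets of the by-label symlinks, instead of grep-ing the ls output. The function name labels_for_id and its two optional directory arguments are my own assumptions (the directories default to the real /dev/disk locations), and the matching only handles sd-style names where a partition is the disk name followed by a number.

```shell
#!/bin/bash
# Hypothetical helper: print the volume labels of the partitions that live on
# the disk behind a given by-id name. The two directory arguments are
# assumptions added for testing; they default to the real /dev/disk folders.
labels_for_id() {
    local id=$1
    local id_dir=${2:-/dev/disk/by-id}
    local label_dir=${3:-/dev/disk/by-label}
    local disk link target
    disk=$(readlink -f "$id_dir/$id") || return 1    # e.g. /dev/sdb
    for link in "$label_dir"/*; do
        [ -L "$link" ] || continue                   # skip anything that is not a symlink
        target=$(readlink -f "$link")                # e.g. /dev/sdb1
        # a partition of the disk is the disk name followed by digits
        case $target in
            "$disk"[0-9]*) basename "$link" ;;
        esac
    done
}

# Example call (with the listing above this should print 4TB_Storage):
# labels_for_id scsi-SATA_Hitachi_HDS7240_PK1310PAG0VMBJ
```

The result could then be added to the warning lines in the log and email alongside the drive ID.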
Food for thought, anyway.
