Problems with cron table
Posted: July 16th, 2013, 8:43 pm
I am following this wonderful guide but I have come across some issues that quite frankly I cannot solve on my own.
Problem 1: I was testing the HDD temp script, and to confirm it was working I set the limits low enough to trigger the shutdown. Specifically, I used 15 and 25 as the numerical arguments. However, now I cannot get back into my server to restore the correct values. Additionally, I am not getting emails saying that my server has shut down.
Problem 2: I am getting an error when trying to test the CPU core temp script, but I cannot post any specifics as I can't keep my server on for longer than a minute. I'll post more details when I can access the information.
####EDIT####
I got into my system and changed the values. Okay so here are my two scripts.
CPUTempShutdown
echo "JOB RUN AT $(date)"
echo "======================================="
echo ''
echo 'CPU Warning Limit set to => '$1
echo 'CPU Shutdown Limit set to => '$2
echo ''
echo ''
sensors
echo ''
echo ''
for i in 0 1
do
    str=$(sensors | grep "Core $i:")
    newstr=${str:14:2}
    if [ ${newstr} -ge $1 ]
    then
        echo '============================' >>/home/thebigbaddie/Documents/CoreTempLogs/CPUWarning.Log
        echo $(date) >>/home/thebigbaddie/Documents/CoreTempLogs/CPUWarning.Log
        echo '' >>/home/thebigbaddie/Documents/CoreTempLogs/CPUWarning.Log
        echo ' WARNING: TEMPERATURE CORE' $i 'EXCEEDED' $1 '=>' $newstr >>/home/thebigbaddie/Documents/CoreTempLogs/CPUWarning.Log
        echo '' >>/home/thebigbaddie/Documents/CoreTempLogs/CPUWarning.Log
        echo '============================' >>/home/thebigbaddie/Documents/CoreTempLogs/CPUWarning.Log
    fi
    if [ ${newstr} -ge $2 ]
    then
        echo '============================'
        echo ''
        echo 'CRITICAL: TEMPERATURE CORE' $i 'EXCEEDED' $2 '=>' $newstr
        echo ''
        echo '============================'
        /sbin/shutdown -h now
        /usr/sbin/ssmtp dylan.server244@gmail.com </home/thebigbaddie/Documents/CoreTempLogMsgs/msg.txt
        echo 'Email Sent.....'
        exit
    else
        echo ' Temperature Core '$i' OK at =>' $newstr
        echo ''
    fi
done
echo 'Both CPU Cores are within limits'
echo ''
If I run that in Webmin, I get the following errors.
/home/thebigbaddie/Documents/MyScripts/CPUTempShutdown.sh: line 47: [: -ge: unary operator expected
/home/thebigbaddie/Documents/MyScripts/CPUTempShutdown.sh: line 57: [: -ge: unary operator expected
/home/thebigbaddie/Documents/MyScripts/CPUTempShutdown.sh: line 47: [: -ge: unary operator expected
/home/thebigbaddie/Documents/MyScripts/CPUTempShutdown.sh: line 57: [: -ge: unary operator expected
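For context, `[: -ge: unary operator expected` is the message bash prints when the variable on the left of a `[ ... -ge ... ]` test expands to nothing, so a likely cause here is that `newstr` is empty because `grep "Core $i:"` matched no line. Below is a minimal sketch of a guarded comparison, assuming bash. The helper name `is_at_least` and the limit of 70 are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if $1 is a whole number and is >= $2.
# An empty or non-numeric reading returns 2 ("no reading") instead of
# breaking the test with "unary operator expected".
is_at_least() {
    local reading=$1 limit=$2
    case $reading in
        ''|*[!0-9]*) return 2 ;;   # empty, or contains a non-digit
    esac
    [ "$reading" -ge "$limit" ]
}

newstr=''        # what the script ends up with when grep matches nothing
status=0
is_at_least "$newstr" 70 || status=$?

case $status in
    0) echo 'over limit' ;;
    2) echo 'no temperature reading' ;;
    *) echo 'within limit' ;;
esac
```

Quoting the variables is what keeps the test well-formed; the separate return code for a missing reading is a design choice so that a parsing failure is not silently reported as "within limits".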
Next problem: I am not getting emails from either script. Here is the second script.
DriveTempShutdown
echo "JOB RUN AT $(date)"
echo '============================'
echo ''
echo 'Drive Warning Limit set to =>' $1
echo 'Drive Shutdown Limit set to =>' $2
echo ''
echo ''
if [ $# -eq 2 ]
then
    MyList='a b c d e f'
    echo 'Testing all drives'
else
    MyList=($3)
    echo 'Testing only the system drive'
fi
echo ''
for i in $MyList
do
    echo 'Drive /dev/sd'$i
    /usr/sbin/smartctl -n standby -a /dev/sd$i | grep Temperature_Celsius
done
echo ''
echo ''
for i in $MyList
do
    #Check state of drive 'active/idle' or 'standby'
    stra=$(/sbin/hdparm -C /dev/sd$i | grep 'drive' | awk '{print $4}')
    echo 'Testing Drive sd'$i
    if [ ${stra} = 'standby' ]
    then
        echo ' Drive sd'$i 'is in standby'
        echo ''
    else
        str1='/usr/sbin/smartctl -n standby -a /dev/sd'$i
        str2=$($str1 | grep Temperature_Celsius | awk '{print $10}')
        if [ ${str2} -ge $1 ]
        then
            echo '============================' >>/home/thebigbaddie/Documents/HDDTempLogs/DriveWarning.Log
            echo $(date) >>/home/thebigbaddie/Documents/HDDTempLogs/DriveWarning.Log
            echo '' >>/home/thebigbaddie/Documents/HDDTempLogs/DriveWarning.Log
            echo 'WARNING: TEMPERATURE FOR DRIVE sd'$i 'EXCEEDED' $1 '=>' $str2 >>/home/thebigbaddie/Documents/HDDTempLogs/DriveWarning.Log
            echo '' >>/home/thebigbaddie/Documents/HDDTempLogs/DriveWarning.Log
            echo '============================' >>/home/thebigbaddie/Documents/HDDTempLogs/DriveWarning.Log
        fi
        if [ ${str2} -ge $2 ]
        then
            echo '============================'
            echo ''
            echo 'CRITICAL: TEMPERATURE FOR DRIVE sd'$i 'EXCEEDED' $2 '=>' $str2
            echo ''
            echo '============================'
            /sbin/shutdown -h now
            /usr/sbin/ssmtp dylan.server244@gmail.com </home/thebigbaddie/Documents/HDDTempLogMsgs/msg.txt
            echo 'Email Sent.....'
            exit
        else
            echo ' Temperature of Drive '$i' is OK at =>' $str2
            echo ''
        fi
    fi
done
echo 'All Drives are within limits'
echo ''
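One possible reason no email arrives: both scripts call `/sbin/shutdown -h now` before `ssmtp`, so the machine may already be going down when the mail is attempted. Below is a minimal sketch that sends the message first and adds a dry-run switch for testing with low limits such as 15 and 25, assuming bash. The names `notify_then_shutdown`, `DRY_RUN`, `MAIL_CMD` and `SHUTDOWN_CMD`, the address `user@example.com` and the path `/tmp/msg.txt` are all hypothetical placeholders, not part of the original scripts.

```shell
#!/bin/bash
# Hypothetical wrapper: send the alert first, then halt.
# DRY_RUN=1 (the default in this sketch) only prints the commands, so a
# test run with very low limits cannot power the machine off.
DRY_RUN=${DRY_RUN:-1}
MAIL_CMD=${MAIL_CMD:-/usr/sbin/ssmtp}
SHUTDOWN_CMD=${SHUTDOWN_CMD:-/sbin/shutdown}

notify_then_shutdown() {
    local recipient=$1 msgfile=$2
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $MAIL_CMD $recipient < $msgfile"
        echo "would run: $SHUTDOWN_CMD -h now"
        return 0
    fi
    # Mail first; a failed send is reported but does not block the shutdown.
    "$MAIL_CMD" "$recipient" < "$msgfile" || echo 'Email failed' >&2
    "$SHUTDOWN_CMD" -h now
}

# Placeholder arguments; with DRY_RUN=1 nothing is sent and nothing halts.
notify_then_shutdown user@example.com /tmp/msg.txt
```

Setting `DRY_RUN=0` in the cron entry would enable the real commands once the limits have been checked by hand.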
Here is the output from the sensors command.
nct6776-isa-0290
Adapter: ISA adapter
Vcore: +1.46 V (min = +0.00 V, max = +1.74 V)
in1: +0.20 V (min = +0.00 V, max = +0.00 V) ALARM
AVCC: +3.33 V (min = +2.98 V, max = +3.63 V)
+3.3V: +3.33 V (min = +2.98 V, max = +3.63 V)
in4: +0.55 V (min = +0.00 V, max = +0.00 V) ALARM
in5: +1.69 V (min = +0.00 V, max = +0.00 V) ALARM
3VSB: +3.46 V (min = +2.98 V, max = +3.63 V)
Vbat: +3.38 V (min = +2.70 V, max = +3.30 V) ALARM
fan1: 0 RPM (min = 0 RPM) ALARM
fan2: 3488 RPM (min = 0 RPM) ALARM
fan3: 0 RPM (min = 0 RPM) ALARM
fan4: 0 RPM (min = 0 RPM) ALARM
fan5: 0 RPM (min = 0 RPM) ALARM
SYSTIN: +38.0°C (high = +0.0°C, hyst = +0.0°C) ALARM sensor = thermistor
CPUTIN: +35.5°C (high = +80.0°C, hyst = +75.0°C) sensor = thermistor
AUXTIN: -31.5°C (high = +80.0°C, hyst = +75.0°C) sensor = thermistor
cpu0_vid: +0.000 V
intrusion0: ALARM
intrusion1: ALARM
k10temp-pci-00c3
Adapter: PCI adapter
temp1: +1.2°C (high = +70.0°C)
(crit = +70.0°C, hyst = +69.0°C)
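Note that the output above has no `Core 0:` or `Core 1:` lines; the k10temp chip reports a single `temp1:` value, so `grep "Core $i:"` would return nothing on this machine. Below is a minimal sketch of reading a value by its label with awk rather than by a fixed character offset, assuming bash and a POSIX awk. The helper name `read_temp` is hypothetical, and the sample text is copied from the output above.

```shell
#!/bin/bash
# Hypothetical helper: read sensors-style text on stdin and print the
# integer part of the first signed number that follows "<label>:".
# Prints nothing if no line starts with that label.
read_temp() {
    awk -v label="$1" '
        index($0, label ":") == 1 {
            rest = substr($0, length(label) + 2)   # text after "label:"
            if (match(rest, /[+-][0-9]+/)) {
                t = substr(rest, RSTART, RLENGTH)
                sub(/^[+]/, "", t)                 # drop a leading plus sign
                print t
            }
            exit
        }'
}

sample='CPUTIN: +35.5°C (high = +80.0°C, hyst = +75.0°C) sensor = thermistor
temp1: +1.2°C (high = +70.0°C)'

printf '%s\n' "$sample" | read_temp 'CPUTIN'   # prints 35
printf '%s\n' "$sample" | read_temp 'temp1'    # prints 1
printf '%s\n' "$sample" | read_temp 'Core 0'   # prints nothing
```

In a real script the input would come from `sensors | read_temp temp1` or similar; which label carries the meaningful CPU temperature on this board is not something the output alone settles.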
Thanks for looking.