Site monitoring with Python and cron
I recently switched to hosting all of my own websites. While it is liberating to have much more control over my web host, it demands more maintenance time and better tools for monitoring the server.
UPDATE: It looks like Mark has made this a full project on GitHub and added timing the requests and command-line options! This is a perfect example of how OSS projects are started. Check out his introductory post.
Checking site availability with Python
I didn’t feel that this script was big enough to go full OO with it, but if you want to add to it, fork the gist on GitHub and provide a link in the comments. You know what’d really be cool is if someone used timeit to get the response time and set thresholds for when the site is too slow.
#!/usr/bin/env python
# sample usage: checksites.py yoursite.com othersite.org
import logging
import os
import pickle
import socket
import sys
from smtplib import SMTP

try:
    from http.client import HTTPConnection  # Python 3
except ImportError:
    from httplib import HTTPConnection  # Python 2


def email_alert(message, status):
    '''Email the status as the subject and the message as the body'''
    fromaddr = 'you@gmail.com'
    toaddrs = 'yourphone@txt.att.net'
    server = SMTP('smtp.gmail.com:587')
    server.starttls()
    server.login('you', 'password')
    # A blank line must separate the headers from the body
    server.sendmail(fromaddr, toaddrs,
                    'Subject: %s\r\n\r\n%s' % (status, message))
    server.quit()


def get_site_status(url):
    '''Return "up" if HEAD / answers with a 200, otherwise "down"'''
    response = get_response(url)
    try:
        if response.status == 200:
            return 'up'
    except AttributeError:
        # response is None when the connection failed
        pass
    return 'down'


def get_response(url):
    '''Return response object from URL'''
    try:
        conn = HTTPConnection(url)
        conn.request('HEAD', '/')
        return conn.getresponse()
    except socket.error:
        return None
    except Exception:
        logging.error('Bad URL: %s', url)
        sys.exit(1)


def get_headers(url):
    '''Return all headers from the URL response'''
    response = get_response(url)
    try:
        return response.getheaders()
    except AttributeError:
        return 'Headers unavailable'


def compare_site_status(prev_results):
    '''Report changed status based on previous results'''

    def is_status_changed(url):
        status = get_site_status(url)
        friendly_status = '%s is %s' % (url, status)
        print(friendly_status)
        if url in prev_results and prev_results[url] != status:
            logging.warning(friendly_status)
            # Email status messages
            email_alert(str(get_headers(url)), friendly_status)
        prev_results[url] = status

    return is_status_changed


def is_internet_reachable():
    '''Checks Google then Yahoo just in case one is down'''
    if (get_site_status('www.google.com') == 'down'
            and get_site_status('www.yahoo.com') == 'down'):
        return False
    return True


def load_old_results(file_path):
    '''Attempts to load most recent results'''
    pickledata = {}
    if os.path.isfile(file_path):
        with open(file_path, 'rb') as picklefile:
            pickledata = pickle.load(picklefile)
    return pickledata


def store_results(file_path, data):
    '''Pickles results to compare on next run'''
    with open(file_path, 'wb') as output:
        pickle.dump(data, output)


def main(urls):
    # Setup logging to store time
    logging.basicConfig(level=logging.WARNING, filename='checksites.log',
                        format='%(asctime)s %(levelname)s: %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S')
    # Load previous data
    pickle_file = 'data.pkl'
    pickledata = load_old_results(pickle_file)
    # Check sites only if the Internet is reachable
    if is_internet_reachable():
        status_checker = compare_site_status(pickledata)
        # A plain loop runs the checks eagerly on Python 2 and 3
        for url in urls:
            status_checker(url)
    else:
        logging.error('Either the world ended or we are not connected to the net.')
    # Store results in pickle file
    store_results(pickle_file, pickledata)


if __name__ == '__main__':
    # First arg is script name, skip it
    main(sys.argv[1:])
Basically, this script just checks if the internet is available, then checks each site. If a previous result is available and is different, it sends an email with the headers received so you can get a good idea of what’s going on. Even cooler, you can use the email address specific to your cell phone carrier to get text messages when your sites’ availability changes.
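To make the "previous result is available and is different" rule concrete, here is a minimal sketch of just that comparison, separated from the networking and email parts. The `changed_sites` helper and the sample dictionaries are hypothetical names for illustration and are not part of the script above.

```python
def changed_sites(previous, current):
    """Return {url: (old_status, new_status)} for sites whose status changed.

    Sites with no previous result are skipped, mirroring the
    'url in prev_results' check in the script.
    """
    return {
        url: (previous[url], status)
        for url, status in current.items()
        if url in previous and previous[url] != status
    }


# Hypothetical results from the last run and from this run
previous = {'eriwen.com': 'up', 'yoursite.com': 'up'}
current = {'eriwen.com': 'up', 'yoursite.com': 'down', 'newsite.org': 'up'}

print(changed_sites(previous, current))
# {'yoursite.com': ('up', 'down')}
```

Only `yoursite.com` would trigger an alert here: `eriwen.com` did not change, and `newsite.org` has no earlier result to compare against.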
NOTE: You must have some sort of mailer daemon installed. See How to setup Gmail with sSMTP. You can try it out by editing the appropriate parts of the script and then doing:
chmod +x checksites.py
./checksites.py eriwen.com yoursite.com
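Once the basic check works, the response-time idea mentioned earlier could be sketched roughly like this. It assumes Python 3 and uses `timeit.default_timer` as the clock; the `timed_call` and `classify` helpers, the 2-second threshold, and the `fake_check` stand-in are all hypothetical and would need tuning for a real site.

```python
from timeit import default_timer

SLOW_AFTER = 2.0  # hypothetical threshold in seconds; tune for your sites


def timed_call(func, *args):
    """Call func(*args) and return (result, elapsed_seconds)."""
    start = default_timer()
    result = func(*args)
    return result, default_timer() - start


def classify(status, elapsed, slow_after=SLOW_AFTER):
    """Turn an up/down status plus a duration into 'up', 'slow' or 'down'."""
    if status != 'up':
        return 'down'
    return 'slow' if elapsed > slow_after else 'up'


def fake_check(url):
    """Stand-in for a real status check so the sketch runs offline."""
    return 'up'


status, elapsed = timed_call(fake_check, 'yoursite.com')
print('yoursite.com is %s (%.3fs)' % (classify(status, elapsed), elapsed))
```

In the script above, `fake_check` would be replaced by `get_site_status`, and the pickled results could then store 'slow' as a third state so that a slowdown also triggers an alert.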
Scheduling it with cron
I’ve already shown you the ins and outs of basic cron scheduling. We can have this run every 5 minutes by typing crontab -e and then adding:
*/5 * * * * ./path/to/checksites.py yourwebsite.com othersite.org
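If the `*/5` in the minute field looks cryptic, the following sketch expands a minute field into the actual minutes it matches. It is a simplified illustration that only handles `*`, single values, ranges, steps and comma-separated lists; `expand_minute_field` is a hypothetical helper, not something cron provides.

```python
def expand_minute_field(field):
    """Expand a cron minute field such as '*/5' or '0-59/5' into minutes."""
    minutes = set()
    for part in field.split(','):
        rng, _, step = part.partition('/')
        step = int(step) if step else 1
        if rng == '*':
            start, end = 0, 59
        elif '-' in rng:
            start, end = (int(n) for n in rng.split('-'))
        else:
            start = end = int(rng)
        minutes.update(range(start, end + 1, step))
    return sorted(minutes)


print(expand_minute_field('*/5'))
# [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
print(expand_minute_field('*/5') == expand_minute_field('0-59/5'))
# True
```

So the entry above runs the script twelve times an hour, at minutes 0, 5, 10 and so on, and `0-59/5` is just a longer way of writing the same schedule.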
What do you think? Tell me how you’d make it more “pythonic” or otherwise improve it in the comments.
Isn’t “0-59/5 * * * *” the same as “*/5 * * * *”?
otherwise cool script !
Serves me right for writing that after midnight… Thanks for the correction :)
You are doing a wonderful thing here on the Internet. I wish you the very best. Kindest regards.
Hey, if you don’t want to host your own site monitoring you can check out http://howsthe.com
Enjoy
There are lots of technology bits doing this. Actually, I like the free version of aremysitesup.com but it looks like it checks every hour or so.
I guess I just enjoy the home-grown stuff since I can control it fully. Maybe I’m just a control-freak ;)
Hi Eric, thanks for the script.
[...] and really made the code more usable and expendable which he explains in his blog post, Site monitoring with Python and cron. He had made the script accept command line arguments and even added a way to send email through [...]
I did something similar with cron and sticking a few simple shell commands together, rather than using an external script:
https://secure.grepular.com/blog/index.php/2009/10/08/monitor-your-website-automatically-from-cron/
enjoyed the read, i will bookmark your page and share it with my friends
Good job! Eric
[...] Last but not least, something for hardcore Pythonistas: first, Python Speed Performance Tips, and second, something for sysadmins who need to have everything under control: Site monitoring with Python and cron. [...]
Looks like http://www.aremysitesup.com ;)
This is great! Thanks for this post. I am new at development and this is a big help.