Site monitoring with Python and cron

I recently switched to hosting all of my own websites. While it is liberating to have much more control over my web host, it also demands more maintenance time and better tools for monitoring the server.

While browsing my GitHub account I came across Mark Sanborn’s site monitoring script and thought: “Hey, this is a good idea, let’s see what I can make of it.” I have been meaning to post more Python here, so I updated his code a bit and thought I’d share it with you. I hope you have ideas for improvements.

UPDATE: It looks like Mark has made this a full project on GitHub and added request timing and command-line options! This is a perfect example of how OSS projects get started. Check out his introductory post.

Checking site availability with Python

I didn’t feel that this script was big enough to go full OO with it, but if you want to add to it, fork the gist on GitHub and provide a link in the comments. You know what’d really be cool is if someone used timeit to get the response time and set thresholds for when the site is too slow.
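As a rough sketch of that response-time idea, the snippet below times a single HEAD request and classifies the result against a threshold. It is written for Python 3, where httplib is named http.client, and it uses time.perf_counter rather than timeit, since timeit is geared toward repeated micro-benchmarks. The names head_status, timed_check, classify and the 2-second SLOW_THRESHOLD are made up for illustration and are not part of the script below.

```python
import time
from http.client import HTTPConnection

SLOW_THRESHOLD = 2.0  # seconds; hypothetical cutoff, tune it for your sites


def head_status(host, timeout=10):
    """Send a HEAD request for '/' and return the HTTP status code."""
    conn = HTTPConnection(host, timeout=timeout)
    try:
        conn.request('HEAD', '/')
        return conn.getresponse().status
    finally:
        conn.close()


def timed_check(host, fetch=head_status, clock=time.perf_counter):
    """Return (status, elapsed_seconds) for one check of host."""
    start = clock()
    status = fetch(host)
    return status, clock() - start


def classify(status, elapsed, threshold=SLOW_THRESHOLD):
    """Map a status code and response time to 'up', 'slow' or 'down'."""
    if status != 200:
        return 'down'
    return 'slow' if elapsed > threshold else 'up'
```

Calling classify(*timed_check('yoursite.com')) would then give a three-way status that could be stored and compared between runs in the same way as the plain up/down result.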

#!/usr/bin/env python

# sample usage: checksites.py yoursite.com othersite.org

import pickle, os, sys, logging, socket
try:
    from http.client import HTTPConnection  # Python 3
except ImportError:
    from httplib import HTTPConnection  # Python 2
from smtplib import SMTP

def email_alert(message, status):
    fromaddr = 'you@gmail.com'
    toaddrs = 'yourphone@txt.att.net'

    server = SMTP('smtp.gmail.com:587')
    server.starttls()
    server.login('you', 'password')
    server.sendmail(fromaddr, toaddrs, 'Subject: %s\r\n%s' % (status, message))
    server.quit()

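def get_site_status(url):
    '''Return "up" if the site answers with HTTP 200, otherwise "down"'''
    response = get_response(url)
    # get_response returns None when the connection fails
    if response is not None and response.status == 200:
        return 'up'
    return 'down'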

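def get_response(url):
    '''Return response object from URL, or None if the connection fails'''
    try:
        conn = HTTPConnection(url)
        conn.request('HEAD', '/')
        return conn.getresponse()
    except socket.error:
        return None
    except Exception:
        # logging takes printf-style arguments, so the URL needs a %s placeholder
        logging.error('Bad URL: %s', url)
        sys.exit(1)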

def get_headers(url):
    '''Return all response headers for the URL, or a placeholder string'''
    response = get_response(url)
    try:
        return response.getheaders()
    except AttributeError:
        # response is None when the site could not be reached
        return 'Headers unavailable'

def compare_site_status(prev_results):
    '''Report changed status based on previous results'''

    def is_status_changed(url):
        status = get_site_status(url)
        friendly_status = '%s is %s' % (url, status)
        print(friendly_status)
        if url in prev_results and prev_results[url] != status:
            # Log the site name along with its new status
            logging.warning(friendly_status)
            # Email status messages
            email_alert(str(get_headers(url)), friendly_status)
        prev_results[url] = status

    return is_status_changed

def is_internet_reachable():
    '''Checks Google then Yahoo just in case one is down'''
    if get_site_status('www.google.com') == 'down' and get_site_status('www.yahoo.com') == 'down':
        return False
    return True

def load_old_results(file_path):
    '''Attempts to load most recent results'''
    pickledata = {}
    if os.path.isfile(file_path):
        with open(file_path, 'rb') as picklefile:
            pickledata = pickle.load(picklefile)
    return pickledata

def store_results(file_path, data):
    '''Pickles results to compare on next run'''
    with open(file_path, 'wb') as output:
        pickle.dump(data, output)

def main(urls):
    # Setup logging to store time
    logging.basicConfig(level=logging.WARNING, filename='checksites.log',
            format='%(asctime)s %(levelname)s: %(message)s',
            datefmt='%Y-%m-%d %H:%M:%S')

    # Load previous data
    pickle_file = 'data.pkl'
    pickledata = load_old_results(pickle_file)

    # Check sites only if the Internet is reachable
    if is_internet_reachable():
        status_checker = compare_site_status(pickledata)
        # An explicit loop runs every check (map is lazy on Python 3)
        for url in urls:
            status_checker(url)
    else:
        logging.error('Either the world ended or we are not connected to the net.')

    # Store results in pickle file
    store_results(pickle_file, pickledata)

if __name__ == '__main__':
    # First arg is script name, skip it
    main(sys.argv[1:])

Basically, this script just checks if the internet is available, then checks each site. If the previous result is available and is different, it sends an email with the headers received so you might get a good idea what’s going on. Even cooler, you can use the email specific to your cell phone carrier to get text messages when your sites’ availability changes.
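To make the "changed since last run" step and the alert email a little more concrete, here is a small standalone sketch for Python 3. The helper names status_changes and build_alert are hypothetical and do not appear in the script above. It uses the standard library's email.message.EmailMessage to build a properly formed message instead of formatting the Subject line by hand.

```python
from email.message import EmailMessage


def status_changes(previous, current):
    """Yield (url, old, new) for each site whose status differs from the last run."""
    for url, new in current.items():
        old = previous.get(url)
        # Sites seen for the first time have no old status and are skipped
        if old is not None and old != new:
            yield url, old, new


def build_alert(url, status, headers, fromaddr, toaddr):
    """Build an alert email; headers is a list of (name, value) pairs."""
    msg = EmailMessage()
    msg['From'] = fromaddr
    msg['To'] = toaddr
    msg['Subject'] = '%s is %s' % (url, status)
    body = '\n'.join('%s: %s' % pair for pair in headers)
    msg.set_content(body or 'Headers unavailable')
    return msg
```

A message built this way can be handed to smtplib's SMTP.send_message on Python 3, and the toaddr can just as well be a carrier's email-to-text address.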

NOTE: You must have some sort of mailer daemon installed. See How to setup Gmail with sSMTP. You can try it out by editing the appropriate parts of the script and then doing:

chmod +x checksites.py
./checksites.py eriwen.com yoursite.com

Scheduling it with cron

I’ve already shown you the ins and outs of basic cron scheduling. We can have this run every 5 minutes by typing crontab -e and then adding:

*/5 * * * * /path/to/checksites.py yourwebsite.com othersite.org
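One thing to watch for under cron: the script opens checksites.log and data.pkl by relative name, so they end up in whatever directory the job starts in, which is typically the user's home directory and not the script's own folder. A minimal sketch of one way around that, assuming you want both files to live next to the script (the helper name beside_script is made up for this example):

```python
import os


def beside_script(filename, script_path):
    """Return an absolute path to filename in the same directory as the script."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(script_dir, filename)
```

Inside main() this could be used as pickle_file = beside_script('data.pkl', __file__), with the same treatment for the log filename.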

What do you think? Tell me how you’d make it more “pythonic” or otherwise improve it in the comments.


Responses (13)

  1. xrado says:

    isn’t “0-59/5 * * * *” same as “*/5 * * * *”
    otherwise cool script !

  2. [...] Site monitoring with Python and cron – Eric Wendelin’s Blog (tags: python sysadmin) [...]

  3. login says:

    You are doing a wonderful thing here on the Internet. I wish you the very best. Kindest regards.

  4. Vitaly Babiy says:

    Hey, if you don’t want to host your own site monitoring you can check out http://howsthe.com

    Enjoy

    • There are lots of technology bits doing this. Actually, I like the free version of aremysitesup.com but it looks like it checks every hour or so.

      I guess I just enjoy the home-grown stuff since I can control it fully. Maybe I’m just a control-freak ;)

  5. Federico says:

    Hi Eric, thanks for the script.

  6. [...] and really made the code more usable and expendable which he explains in his blog post, Site monitoring with Python and cron. He had made the script accept command line arguments and even added a way to send email through [...]

  7. I did something similar with cron and sticking a few simple shell commands together, rather than using an external script:

    https://secure.grepular.com/blog/index.php/2009/10/08/monitor-your-website-automatically-from-cron/

  8. votar chicas says:

    enjoyed the read, i will bookmark your page and share it with my friends

  9. Bien says:

    Good job! Eric

  10. [...] Last but not least, something for hardcore Pythonistas: first, Python Speed Performance Tips, and second, something for sysadmins who need to keep everything under control: Site monitoring with Python and cron. [...]
