Scheduled Reports with CRON for App Engine

Introduction

Web developers need to keep regular track of how customers use their application, so they can make sound decisions about which improvements to add in future releases. This post shows how to set up scheduled reports in Google App Engine running Python. The scheduled reports will contain your application’s usage data, plus any other metrics that you may want to measure and monitor over time.

Setting up a Reports Handler

In your application root, create a file that will contain our Python script that measures your application’s usage data. In the case of this post, I’ll set the filename to reports.py. This file will contain your reports handler and should import classes from your application model. Say we simply want to measure the total number of domains and the number of boards in all the domains, then we need to import the Board and Domain classes from our model. Let’s also import the Mail Python API so we can send our reports in the form of emails.

from google.appengine.api import mail
from google.appengine.ext import db
from google.appengine.api import namespace_manager

import tornado.wsgi
import tornado.locale

from script import BaseHandler
from model import Board, Domain

Writing the Reports Handler

After importing the modules and classes we need, we can start writing the reports handler. Let’s call our handler ReportsHandler; it inherits from the BaseHandler class we imported above. We can then write the code that measures the total number of domains and boards in the application. Let’s assume that our application model gives every domain its own namespace, i.e. we can only access a domain’s Board entities from within that domain’s namespace. This means that we need to set the namespace on every iteration over the domains in order to count the total number of boards among all the domains. The code for these measurements is given below:

class ReportsHandler(BaseHandler):

    def get(self):
        namespace_manager.set_namespace("-global-")
        
        total_num_of_domains = Domain.all().count()
        total_num_of_boards = 0
        
        domains = Domain.all().run(batch_size=200)
        for domain in domains:
            namespace_manager.set_namespace(domain.name)
            total_num_of_boards += Board.all().count()

Now let’s set up the email that will carry the report. We pass the variables total_num_of_domains and total_num_of_boards as template values, and we name the email template file report.txt, stored in the templates directory.

class ReportsHandler(BaseHandler):

    def get(self):
        ...
        # email address of recipient
        user_address = "user@email.com"

        if mail.is_email_valid(user_address):
            # email address of sender in the <>
            sender_address = "Your Reporter <reporter_email@here.com>"
            subject = "Your Report Subject"
            template_values = { 
                'total_num_of_domains': total_num_of_domains,
                'total_num_of_boards': total_num_of_boards
            }
            body = self.render_string("templates/report.txt", **template_values)
            mail.send_mail(sender_address, user_address, subject, body)

After writing the ReportsHandler contents, we specify the URL that will trigger the report. Let’s use “root_URL/reports” for our example. Put the following code at the bottom of reports.py.

app = tornado.wsgi.WSGIApplication([
    (r"/reports", ReportsHandler)
])

Writing the Report Template

We use Tornado to render the template values in our email. Our email contents are located in report.txt, so let’s modify this to contain the following simple template:

This is the scheduled report for your application.

Total number of domains: {{ total_num_of_domains }}
Total number of boards: {{ total_num_of_boards }}

Your loyal reporter
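
A placeholder in the template that has no matching key in template_values will make rendering fail, so it can be worth checking the two against each other. The following is a rough sketch of such a check. The function name missing_template_values is hypothetical, and the regular expression assumes the template only uses plain variable names inside the double braces, not arbitrary Tornado expressions.

```python
import re

# Matches simple placeholders such as {{ total_num_of_domains }}.
# Assumption: the template uses plain variable names only, not expressions.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def missing_template_values(template_text, template_values):
    """Return the placeholder names in the template that have no matching key."""
    names = set(PLACEHOLDER.findall(template_text))
    return sorted(names - set(template_values))


# Sample template text and values, for illustration only.
report_template = (
    "Total number of domains: {{ total_num_of_domains }}\n"
    "Total number of boards: {{ total_num_of_boards }}\n"
)
values = {"total_num_of_domains": 12, "total_num_of_boards": 340}
missing = missing_template_values(report_template, values)
```

An empty list means every placeholder has a value; any names returned are the ones to add to template_values or to correct in report.txt.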

Specifying the Schedule using CRON

Google App Engine uses CRON to run scheduled tasks. The supported schedule formats are described in the App Engine cron documentation. Now create the file cron.yaml in your application root. Let’s say we want to run our report every day at 8:30 am. Recall that the ReportsHandler is triggered through the URL “root_URL/reports”. The CRON file must then contain the following code:

cron:
- description: daily usage report
  url: /reports
  schedule: every day 08:30
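
Because /reports is an ordinary URL, anyone who knows it could trigger the report. App Engine adds the header X-Appengine-Cron: true to requests issued by its cron service and strips that header from external requests, so the handler can check for it before doing any work. Below is a minimal sketch. The function name is_cron_request is hypothetical, and a plain dict stands in for the request headers, which in a Tornado handler would be self.request.headers.

```python
def is_cron_request(headers):
    """Return True if the headers carry App Engine's cron marker.

    `headers` is any mapping with a .get() method. A plain dict is
    case-sensitive on the header name, whereas Tornado's own headers
    object normalizes the name for you.
    """
    return headers.get("X-Appengine-Cron", "").lower() == "true"


# Example header sets, for illustration only.
from_cron = is_cron_request({"X-Appengine-Cron": "true"})
from_browser = is_cron_request({"User-Agent": "Mozilla/5.0"})
```

Restricting the URL with login: admin in app.yaml is another common option, since cron requests are still allowed to reach admin-only URLs.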

We’re done setting up our scheduled email reports! From here, you can customize the reports handler further to include many more measurements.

  • curious to hear details on your namespacing strategy.. “-global-” >> seems very structured

    • Jeremi Joslin

      Gregory, we separate the data of all our customers using namespaces named after their Google Apps domain. But the list of domains is in one global namespace called “-global-”. We could do an article about how we organized the data in CollabSpot Boards.