using GAE as a web cron

introduction

i said that we could use GAE as a web cron at http://d.hatena.ne.jp/takahirox/20110424/1303652584

i had trouble with cron when i made a bot with ruby.


http://d.hatena.ne.jp/takahirox/20090523/1243086497


perhaps GAE can solve this problem: a python GAE application uses "urllib2" to call your bot, which is written in ruby or another language and executed via CGI on another server.

i tried that, and these are my notes.

i hope they help anyone who wants to use a web cron.

what i did

i made a bot on a web server, not on GAE, and i let a GAE application execute the bot periodically. that is, i use a GAE application as a web cron.

the bot i made


this bot tweets the time. that's all it does. it's very boring and simple :)

features

  • good
    • you can make a bot without knowing python or java
      • you can also use ruby, perl, and other libraries that have no python or java equivalent.
    • you're not limited by GAE's restrictions.
  • bad
    • security
      • your bot needs to be executable via CGI.
      • so anyone can access it.
      • you have to protect it with a password or something similar.

get a twitter account for a bot

first of all, get a twitter account for the bot. then you need to get its tokens and secrets.

i'll skip explaining that here. check this entry.


in my case, i reused @supertimebot, the account of a bot i made before. you can read about it at http://d.hatena.ne.jp/takahirox/20090523/1243086497

make a bot on any web server you like

after that, make a bot on any web server you like.

in my case, i made it on xrea with ruby.

if you want to make the same bot, you need to set up your own rubygems directory in your xrea space so that you can use the "twitter" gem.
how to do that is written here. have a look at it.
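
as a rough idea of what this configuration amounts to (an assumption on my part, not a copy of the linked instructions): gems installed into a directory you own, for example with gem install --install-dir, can be made visible to the CGI script by pointing GEM_HOME at that directory before loading them. the directory path and the helper name use_local_gems are hypothetical.

```ruby
require 'rubygems'

# point rubygems at a gem directory inside your own account.
# returns the directory that rubygems now uses as its gem home.
def use_local_gems(dir)
  ENV['GEM_HOME'] = dir
  Gem.clear_paths   # forget the cached paths so GEM_HOME is read again
  Gem.dir
end

# hypothetical location; replace it with wherever you installed the gems
# use_local_gems('/path/to/your/account/gems')
# require 'twitter'
```

the call has to come before require 'twitter', otherwise rubygems looks only in the system-wide directories.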


also, the bot is built on a twitter client in ruby that i made before. check it.


here is the source code of the bot.

#!/usr/local/bin/ruby

# main.cgi

require 'rubygems'
require 'twitter'
require 'time'
require 'cgi'

OAUTH_CONSUMER_KEY    = 'your consumer token'
OAUTH_CONSUMER_SECRET = 'your consumer secret'
OAUTH_ACCESS_TOKEN    = 'your access token'
OAUTH_ACCESS_SECRET   = 'your access token secret'

PROXY_ADDR = nil

# authorize with the tokens of the bot account
oauth = Twitter::OAuth.new( OAUTH_CONSUMER_KEY, OAUTH_CONSUMER_SECRET )
oauth.authorize_from_access( OAUTH_ACCESS_TOKEN, OAUTH_ACCESS_SECRET )
base = Twitter::Base.new( oauth )
# tweet the current time as text
base.update Time.now.to_s

cgi = CGI.new
puts cgi.header
puts 'homuhomu.'

note that the bot needs to be executable via CGI in this case. (the URL of my bot is secret :P)

configure the GAE

finally, let the GAE application execute the bot.

register a GAE application for the bot, then just create these files and deploy them.

the details are here.


app.yaml

application: supertimebot
version: 1
runtime: python
api_version: 1

handlers:
- url: /.*
  script: main.py

cron.yaml

cron:
- description: cron job name
  url: /
  schedule: every 20 minutes

main.py

import urllib2

urllib2.urlopen( 'your bot URL' )

if everything went well, the bot now tweets periodically.

security

as i said, you need to care about security, because the bot can be executed by anyone.

some possible security measures are:

  • check a password
  • check the user agent
  • check the IP address
  • record an access log


for example, like this.

#!/usr/local/bin/ruby

# main.cgi

require 'rubygems'
require 'twitter'
require 'time'
require 'cgi'

OAUTH_CONSUMER_KEY    = 'your consumer token'
OAUTH_CONSUMER_SECRET = 'your consumer secret'
OAUTH_ACCESS_TOKEN    = 'your access token'
OAUTH_ACCESS_SECRET   = 'your access token secret'

PROXY_ADDR = nil

if /\(\+http:\/\/code\.google\.com\/appengine; appid: supertimebot\)/ =~ ENV[ 'HTTP_USER_AGENT' ] then
  oauth = Twitter::OAuth.new( OAUTH_CONSUMER_KEY, OAUTH_CONSUMER_SECRET )
  oauth.authorize_from_access( OAUTH_ACCESS_TOKEN, OAUTH_ACCESS_SECRET )
  base = Twitter::Base.new( oauth )

  # tweet the current time as text
  base.update Time.now.to_s
end

cgi = CGI.new
puts cgi.header
puts 'homuhomu.'

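# append the time and every environment variable to the access log.
# the block form closes the file when it finishes.
File.open( './homu.txt', 'a' ) do |f|
  f.puts Time.now
  ENV.each do |key, value|
    f.puts key, value
  end
  f.puts
end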

this checks the user agent and records an access log.

but, of course, that's not enough.
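
the user agent is only a request header, so any client can send the same string. one way to go further is the password check from the hints above. the following is only a minimal sketch under my own assumptions: the BOT_SECRET constant, the "key" parameter, and the helper names secure_equal? and allowed_request? are all hypothetical and not part of the bot above.

```ruby
require 'digest'

# hypothetical shared secret; the GAE side would have to send the same value,
# for example as a "key" parameter when it calls the bot URL
BOT_SECRET = 'change this to a long random string'

# compare two strings without stopping at the first different byte.
# hashing both sides first makes the compared values the same length.
def secure_equal?(a, b)
  da = Digest::SHA256.digest(a.to_s)
  db = Digest::SHA256.digest(b.to_s)
  diff = 0
  da.bytes.zip(db.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# true only when the user agent names the GAE app AND the key matches
def allowed_request?(user_agent, key, secret = BOT_SECRET)
  return false if key.nil? || key.empty?
  from_gae = user_agent.to_s.include?('appid: supertimebot')
  from_gae && secure_equal?(key, secret)
end
```

in main.cgi this could replace the regexp condition, for example allowed_request?( ENV[ 'HTTP_USER_AGENT' ], cgi[ 'key' ] ) with cgi created before the check. note that the access log writes every ENV value, and QUERY_STRING would contain the key if it is sent in the URL, so in that case the log file should not sit in a publicly readable directory.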

conclusion

as you saw, we can make a bot on any web server we like, without being tied to GAE and python or java.

but it's not perfect. as i said, we need to think about security.


by the way, i wrote this entry on a shinkansen (bullet train). it was very exciting for me, though it might be an ordinary situation for some of you.