Category: Geek

Delayed Queues for RQ

April 15, 2013 » Geek

I really like RQ. It’s got a sensible API and it’s built for Python, which I can’t say about Resque; pyres is fine, but it’s not the same.

My one beef with RQ is that I can’t delay a job for an arbitrary amount of time. You queue it up and it runs. To get around that, I built a simple delayed queue system for RQ, using the same redis backend.

My delayed queues leverage sorted sets to store and select jobs. We put the job in with its earliest run timestamp as the score, then a cron job or daemon pulls jobs out when they are ready and pushes them over to RQ. Simple enough!
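To make the mechanism concrete, here is the same put-with-score, select-by-score idea sketched against a plain in-memory list instead of redis (purely illustrative; no redis connection required):

```python
import bisect
import time

delayed = []  # (run_at, payload) pairs, kept sorted by run timestamp

def delay(payload, seconds, now=None):
    '''Schedule a payload, like ZADD with a score of now + seconds.'''
    now = time.time() if now is None else now
    bisect.insort(delayed, (now + seconds, payload))

def pop_ready(now=None):
    '''Return due payloads, like ZRANGEBYSCORE 0 now followed by ZREM.'''
    now = time.time() if now is None else now
    ready = []
    while delayed and delayed[0][0] <= now:
        ready.append(delayed.pop(0)[1])
    return ready
```

The redis version below is the same shape, with the sorted set doing the ordering for us.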

Here’s the really relevant code, everything else is trimming.

Delaying Jobs

    def delay(self, queue, job, seconds, *args, **kwargs):
        '''Delay a queue job by a number of seconds.'''
        payload = pickle.dumps({'job': job, 'queue': queue, 'args': args,
                                'kwargs': kwargs, 'id': uuid.uuid1().hex})
        # Score the member with its earliest allowed run timestamp.
        self.redis.zadd('queue:delayed', payload, self._now() + seconds)

Waking Jobs

    def enqueue_delayed_jobs(self, now=None):
        '''Enqueue and clear out ready delayed jobs.'''
        if now is None:
            now = self._now()
        count = 0
        for pickled_job in self.redis.zrangebyscore('queue:delayed', 0, now):
            # zrem is atomic, so if several daemons race for the same
            # member, only the one that removes it will enqueue it.
            if self.redis.zrem('queue:delayed', pickled_job):
                job = pickle.loads(pickled_job)
                Queue(job['queue'], connection=self.redis).enqueue(
                    job['job'], *job['args'], **job['kwargs'])
                count += 1
        return count

You can run this at a granularity as low as one second the way it is written, and I’m sure you could go tighter if you wanted. We run it at a minute granularity, since our jobs are not highly time sensitive.
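The waking side can be a cron entry or a trivial daemon loop; here is a minimal sketch (illustrative only; it assumes an object exposing the enqueue_delayed_jobs method above, and max_polls exists just to make the loop stoppable):

```python
import time

def run_delayed_worker(dq, poll_seconds=1.0, max_polls=None):
    '''Poll for ready delayed jobs until interrupted (or max_polls).'''
    polls = 0
    while max_polls is None or polls < max_polls:
        dq.enqueue_delayed_jobs()
        time.sleep(poll_seconds)
        polls += 1
```

The poll_seconds value is what sets your granularity.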

And because zrem is atomic, a given job can only be removed from the sorted set once. As long as each daemon only enqueues jobs it successfully removes, you can run the enqueue daemon/cron job on multiple machines and everything should work fine, which is great for availability.

Grab the code and examples here if you are interested.

Goodbye Omaha (April Fools 2013)

April 1, 2013 » Geek, Life

Rick Astley

This year I decided to pull a tiny April fools joke. I decided this at four in the afternoon, on the day of, so I didn’t have time to prepare.

I cashed in on John Henry Müller’s recent departure from the Omaha area to pretend I was leaving too.

I quickly threw up a page on my blog that would redirect to a Rick Roll. Then I realized that various Twitter clients follow HTTP redirects and unwrap links to get the “real” URL and page title. I’ve been burned by that before, so I changed tactics slightly and moved the redirect into JavaScript. That let me put a sliver of believable content onto the page so that Facebook shares would work too.

The redirect page.

Finally, I tweeted it out, and started trapping suckers!

The Tweet.

The reaction.

I didn’t get Google Analytics on immediately, so I might have missed some folks, but I got 30 uniques on it that day from Facebook and Twitter.


Not bad at all. SPN even wrote about it!

Impromptu logging from a connection

October 27, 2012 » Geek

I recently participated in a live streamed event that provided a “watching now” counter. Basically, it was a super simple node.js script which incremented or decremented a variable as users joined and left the channel, and broadcast the count to its subscribers. What I didn’t realize until right before the event was that we might want a record of how many users were on the page at a given point in the broadcast. With so little time before the broadcast, I didn’t want to tinker with the server and break it, so I did the next best thing: I logged from the subscriber side.

I put up a quick PHP script on my laptop that allowed cross-domain access from the origin server and logged the incoming counter.

  // Allow the broadcast site's origin (placeholder domain here).
  header('Access-Control-Allow-Origin: http://example.com');
  header('Access-Control-Allow-Methods: GET, POST, OPTIONS');
  header('Access-Control-Allow-Credentials: true');
  header('Access-Control-Allow-Headers: Content-Type, *');
  file_put_contents('log.txt', time() . ', ' . $_REQUEST['count'] . "\n", FILE_APPEND);

Then, in Chrome’s JavaScript console, I just hooked updates from the socket into an XHR to provide the values to my PHP.

socket.on('update', function ( data ) { $.get('http://localhost/logger.php', { count: data.count } ); } );

It worked like a charm, I didn’t have to mess with the server at a crucial point, and we got the data we needed.
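Since the log is just timestamp, count lines, reading it back later for a graph is trivial (a sketch; the path matches the PHP above):

```python
def read_counts(path='log.txt'):
    '''Parse "timestamp, count" lines into (int, int) pairs.'''
    pairs = []
    with open(path) as fh:
        for line in fh:
            ts, count = line.strip().split(', ')
            pairs.append((int(ts), int(count)))
    return pairs
```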

Let the Facebook Object Debugger Into Staging

October 27, 2012 » Geek

One important, and often overlooked, aspect of modern web development is Open Graph tags. You know, those meta tags with weird attributes that break your page validation? That’s a whole other topic though.

Today, I want to talk about the Facebook Object Debugger, and giving it access to an HTTP Auth protected environment, such as a staging or pre-launch production site. This is Apache specific, so nginx fans will have to look elsewhere.

Assume you have this setup in your Apache config or htaccess:

AuthUserFile /var/www/staging/.htpasswd
AuthType Basic
AuthName "Secure Area"
Require valid-user

The easiest way that I’ve found to make this work is to accept based on user agent. I originally tried allowing it based on IP address, but the debugger uses many IP addresses, and after I had added a half dozen I gave up and switched to user agent.

Be aware that, because of this, it’s quite easy for someone to fake their UA and gain access, so I recommend only using this code while you actively use the debugger, and turning it off afterwards. That also prevents leaks if someone pastes the URL into an actual Facebook comment.

AuthUserFile /var/www/staging/.htpasswd
AuthType Basic
AuthName "Secure Area"
Require valid-user
# Allow from Facebook
SetEnvIfNoCase User-Agent facebookexternalhit.* facebook
Order allow,deny
Allow from env=facebook
Satisfy Any

Pretty easy!
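If you want to sanity-check the pattern before touching Apache, SetEnvIfNoCase is just an unanchored, case-insensitive regex match against the header; the UA string below is the general shape the Facebook crawler sent at the time (a sketch, not Apache itself):

```python
import re

# Mirrors: SetEnvIfNoCase User-Agent facebookexternalhit.* facebook
FACEBOOK_UA = re.compile(r'facebookexternalhit.*', re.IGNORECASE)

def is_facebook(user_agent):
    '''True if this request would get the "facebook" env var set.'''
    return bool(FACEBOOK_UA.search(user_agent))
```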

Check out this page at AskApache for a nice guide to SetEnvIfNoCase.

Hashes Are Not *$&%@! Magic

September 27, 2012 » Geek

I’m going to get on a programming soapbox real quick and cover a topic that seems to confuse some people.

Hashes Are Not *$&%@! Magic

Some people seem to think that swapping out a secret with a hashed version of that secret makes it all safe and cozy, but that’s simply not true.

Yes, cryptographic hashes are a very important part of digital security, for a number of good reasons, but they have to be applied in a manner which takes the whole system into account.

The impetus for this post was a login integration I recently updated, where some other developer had foolishly applied hashes.

Essentially, we were cross-posting a login form on one website to another. Nothing fancy. Ignore the lack of CSRF control.

<form method="POST" action="http://theotherguys.saas/login">
  <label for="user">Username</label>
  <input type="text" name="user" id="user" />
  <label for="password">Password</label>
  <input type="password" name="password" id="password" />
  <button type="submit">Log In</button>
</form>

The New Form

But the new form would need a change. Instead of sending the username and password, we would send the username, and an MD5 hash of the concatenation of username and password.

Now, I’m sure when this idea was implemented, it was sold as a way to authenticate the user, without exposing their password in plaintext (note that they don’t use SSL). Brilliant!

Yes, it does obscure the plaintext password, but it is not any more secure.
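Concretely, the replacement scheme boils down to this (a sketch of what’s described above; the function name is mine):

```python
import hashlib

def login_token(username, password):
    '''The scheme in question: MD5 over username concatenated with password.'''
    return hashlib.md5((username + password).encode('utf-8')).hexdigest()

# The browser now sends the username plus login_token(user, password)
# over the (non-SSL) wire. An eavesdropper who captures that token can
# replay it verbatim; it is just as much a reusable shared secret as
# the plaintext password was.
```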

You see, they didn’t think about the system as a whole, they were just focused on obscuring the password.

All that happened here is a substitution of shared secrets.

Previously, the server compared the username and password sent in to what it had on file. Now it compares the username and the hashed password to what it has on file. Do you see what we did? We’ve simply swapped the secret of the plaintext password for the secret of the hashed password. I can still intercept your form submission over the wire and steal your credentials.

I don’t have to prove I know the password, I have to prove I know the secret.

Zero gain, and you’ve added complexity.

MD5, lol

As a bonus, they picked MD5, probably because it’s been implemented many times, there is a JavaScript version readily available, and it tends to be one of the first hashes people learn about, due to its age.

But MD5 is weak. And we have the salt, if you can call it that, in the username. An old 2 GHz P4 can try about 20 million hashes a second, and throwing a modern GPU at it you can test several billion hashes a second. If we want the plaintext password, we can get it unless it is reasonably long (7+ characters) and fairly complex (at least one non-alphanumeric character).
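Rough arithmetic with those rates shows why (illustrative numbers only, using a 7-character lowercase-alphanumeric keyspace):

```python
keyspace = 36 ** 7           # 78,364,164,096 seven-char lowercase-alnum candidates
gpu_rate = 5000000000        # ~several billion MD5/s on a modern GPU
p4_rate = 20000000           # ~20 million MD5/s on an old 2 GHz P4

gpu_seconds = keyspace / gpu_rate     # under 16 seconds on the GPU
p4_hours = keyspace / p4_rate / 3600  # roughly an hour, even on the P4
```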

(╯°□°)╯︵ ┻━┻

For an extra thought, consider how they must be storing these passwords. Either their scheme has always been MD5(CONCAT(username, password)), or they are storing them in plaintext and are (hopefully) migrating to hashes.