Tag: Python

Dashwire Photo Export

February 6, 2012 » Geek

I used to have a Windows phone, and I used a cool service called Dashwire to sync all my photos, contacts, etc. to the web.

Dashwire is shutting down this month, which is sad, but I haven’t used it in a long time, since I switched to Android. However, I’ve still got photos on there, and I’d like to grab them before they are gone. Unfortunately, Dashwire doesn’t have an easy export option that I could find.

Pass one used the photos RSS feed available on the site. I wrote a script to grab the feed, parse it, and download the photos.
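A minimal sketch of what such a script might look like, written for Python 3. It assumes the feed is plain RSS 2.0 and that each photo hangs off its item as an `<enclosure url="...">` element; the function names and the `photos` directory are made up for illustration, and the feed URL is left for you to supply.

```python
import os
import urllib.request
import xml.etree.ElementTree as ET


def photo_urls(rss_xml):
    """Return the URL of every <enclosure> found in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    # Assumption: each photo is attached to its <item> as <enclosure url="...">.
    return [enc.get("url") for enc in root.iter("enclosure") if enc.get("url")]


def download_all(feed_url, dest_dir="photos"):
    """Fetch the feed, then save each photo it lists into dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    with urllib.request.urlopen(feed_url) as response:
        urls = photo_urls(response.read())
    for url in urls:
        # Name each file after the last path segment of its URL.
        filename = os.path.join(dest_dir, os.path.basename(url))
        urllib.request.urlretrieve(url, filename)
    return urls
```

Keeping the parsing in its own function means it can be checked against a saved copy of the feed before anything is downloaded.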

It works as advertised, but it only drags down public photos, and only the most recent 30.

For round two I opened up the Dashwire dashboard and poked around their AJAX calls.

Turns out they have an images.json with an entry for every one of your photos.

To get this file, log into the dashboard, then download http://my.dashwire.com/images.json.

Once you have that, you will also need to figure out your user guid. You can get it with JavaScript; it’s stored in Dashwire.User.guid.

Then just plug it into the script below to get your stuff out!
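A rough sketch of the idea, with loud caveats: the layout of images.json and the real photo URL pattern are not shown here, so this version assumes the file is a JSON array of objects and takes the URL pattern as a template you supply. Every name in it (`build_photo_urls`, `download_photos`, the template placeholders) is hypothetical.

```python
import json
import os
import urllib.request


def build_photo_urls(entries, guid, url_template):
    """Fill url_template once per entry, using the guid plus the entry's own fields."""
    urls = []
    for entry in entries:
        # The guid argument wins if an entry happens to carry its own "guid" key.
        fields = dict(entry, guid=guid)
        urls.append(url_template.format(**fields))
    return urls


def download_photos(json_path, guid, url_template, dest_dir="photos"):
    """Read the saved images.json and download every photo it lists."""
    with open(json_path) as f:
        entries = json.load(f)  # assumption: a JSON array of objects
    os.makedirs(dest_dir, exist_ok=True)
    for url in build_photo_urls(entries, guid, url_template):
        target = os.path.join(dest_dir, os.path.basename(url))
        urllib.request.urlretrieve(url, target)
```

With a made-up template such as `"http://example.com/{guid}/{id}.jpg"`, each entry's `id` field would be dropped into the URL next to your guid. Swap in whatever fields and pattern the dashboard actually uses.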

Now sit back and enjoy as it downloads your photos!

Hurry up though, it all shuts down on February 15th.

Naive Search with JavaScript

February 2, 2012 » Geek

On a recent project I needed to add basic search functionality to a web interface. This isn’t unusual, but what was different is that the data in this project was fairly static, and was kept in JSON on disk (and cached in memory). Additionally, there would be a point where it was moved off of our servers, onto an entirely different stack (hence the JSON).

There was minimal server side processing so far, and I did not want to add more overhead that would need to be ported. So, I decided to implement my search in JavaScript (with some help from Python). My idea was to do very basic string matching on a pre-built index. In this article I am going to lay out my implementation, but on a dummy data set.

The Process

This search breaks down into three steps:

  1. Build Search Index
  2. Perform Search
  3. Connect To Interface

Before we start writing that, let’s cover the data.

The Data

Here is the data I will be sifting through. To keep it simple, I’m just searching through some sentences with associated IDs. It’s short and simple, but with some tweaks you can apply this to bigger data sets, as I did.

My data set will be represented in JSON for portability and easy interpretation from JavaScript.

data.json (92278f)

The Index

To facilitate easy matching, I’m going to build a search index that will be a dictionary, with individual words (tokens) as the keys, and arrays of IDs as the values.

I’ve decided on Python for my index builder. Handling JSON from Python is easy, though if you have a version older than 2.6 you will need to use a different import, such as simplejson.

The core of this functionality is the tokenizer. To build this, we need to determine our rules. Since this is a simple search, I’m going to tokenize on word boundaries and ignore case, and I will only accept the characters A-Z, single quote and dash inside of a word.

Here is my implementation of the tokenizer, which follows these rules:

build_index.py (92278f)
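A minimal sketch of a tokenizer that follows those rules, not necessarily the one in the gist. The regular expression and the choice to trim quotes and dashes from the ends of a token are my own reading of "inside of a word".

```python
import re

# Runs of letters, single quotes and dashes; everything else is a boundary.
WORD_RE = re.compile(r"[a-z'-]+")


def tokenize(text):
    """Split text into lowercase tokens made of a-z, ' and -."""
    tokens = []
    for candidate in WORD_RE.findall(text.lower()):
        # Quotes and dashes only count inside a word, so trim them from the ends.
        token = candidate.strip("'-")
        if token:
            tokens.append(token)
    return tokens
```

For example, `tokenize("Don't panic -- it's well-known!")` returns `["don't", "panic", "it's", "well-known"]`: the stray `--` is dropped, while the quote and dash inside words are kept.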

The remainder of the Python is simply opening and parsing the JSON.

build_index.py (92278f)
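Roughly, the index build could look like the sketch below. It assumes data.json is a single JSON object mapping each ID to its sentence, which may not match the real file, and it repeats a compact tokenizer so the snippet runs on its own.

```python
import json
import re


def tokenize(text):
    """Lowercase tokens of a-z, ' and -, with quotes/dashes trimmed from the ends."""
    words = (w.strip("'-") for w in re.findall(r"[a-z'-]+", text.lower()))
    return [w for w in words if w]


def build_index(sentences):
    """Map each token to the sorted list of IDs whose sentence contains it."""
    index = {}
    for sentence_id, sentence in sentences.items():
        for token in tokenize(sentence):
            # A set keeps an ID from being listed twice for a repeated word.
            index.setdefault(token, set()).add(sentence_id)
    return {token: sorted(ids) for token, ids in index.items()}


def build_index_from_file(path="data.json"):
    """Load the data file (assumed shape: {id: sentence}) and index it."""
    with open(path) as f:
        return build_index(json.load(f))
```

Storing sorted lists rather than sets keeps the result directly serializable to JSON later on.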

Run that, and we get the dictionary we wanted:

output.txt (92278f)

JavaScript

Now that we have an index to work with, let’s convert it to JSON. This just takes a little tweak to the Python.

build_index.py (89ba77)
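The tweak amounts to serializing the dictionary instead of printing it. A small sketch, assuming the index already maps tokens to lists of IDs; the function names and the compact separators are my own choices.

```python
import json


def index_to_json(index):
    """Serialize the token -> IDs mapping as compact JSON text."""
    # sort_keys keeps the output stable between runs, which makes diffs readable.
    return json.dumps(index, sort_keys=True, separators=(",", ":"))


def write_index(index, path="index.json"):
    """Write the serialized index to disk for the browser to fetch."""
    with open(path, "w") as f:
        f.write(index_to_json(index))
```

Dropping the whitespace between separators trims the file a little, which matters more as the index grows.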

That gives us a JSON file to work with.

index.json (89ba77)

Now we’re going to load that JSON with an AJAX call, so we have the index to work with.

search.js (89ba77)

Once we have that data, finding matches is just a matter of looking them up in Search.index. Below I’ve used a loop that will search through an array of terms, and then accumulate the matches.

search.js (89ba77)
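The search itself runs in the browser, but the lookup logic is easy to see in a few lines of Python. This is only an illustration of the idea: counting hits per ID and ranking by that count is an assumption about how the matches might be accumulated.

```python
from collections import Counter


def search(index, terms):
    """Return the IDs matching any term, most-matched first."""
    hits = Counter()
    for term in terms:
        # A term missing from the index simply contributes no matches.
        for item_id in index.get(term, []):
            hits[item_id] += 1
    # Sort by hit count (descending), then by ID so ties come out in a stable order.
    ranked = sorted(hits.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item_id for item_id, _ in ranked]
```

An ID that matches two of the search terms sorts ahead of one that matches only a single term, which gives a crude relevance ordering for free.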

It works!

Hook It Up

Just a little bit remains to connect this thing. First we need to port over the tokenizer from Python to get a consistent result. Pretty easy port.

search.js (216a15)

Tokenize!
JavaScript tokenize in action.

Last we just need to apply our functions to some inputs, which we hook up with some jQuery.

index.html (216a15)

Hooked Up

Make It Better

This is not, by any means, a complete system. You would want to grab the results and match up the IDs to a more useful output. Additionally, you could drop the AJAX call and use a script tag to bring in the index.

There are plenty of improvements you can make, and I’d love to know if you make them!

For reference, here is the complete source: https://gist.github.com/1673557.

Thursday Quote: Eric S. Raymond

June 30, 2011 » Geek

“..the way humans (especially engineer-humans) perceive beauty is intimately related to our ability to process and understand complexity. A language that makes it hard to write elegant code makes it hard to write good code.”

– Eric S. Raymond
Why Python?

Compiling CPython Modules with Xcode 4

April 7, 2011 » Geek

I’ve had trouble compiling native extensions on the Mac before, but I finally found a fix.

You just need the correct ARCHFLAGS environment variable. You can set this in your .bashrc or set it right before running python setup.py build.

This works because Xcode dropped the PPC compiler in v4, and with that variable we tell the setup script not to bother trying to compile for that arch, just i386 and x86_64.
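If you would rather not touch your shell configuration, the same thing can be done from a small Python wrapper. This is a sketch: the ARCHFLAGS value below is reconstructed from the description above (i386 and x86_64 only), and the helper names are hypothetical.

```python
import os
import subprocess
import sys


def build_env(base_env=None):
    """Copy the environment and restrict the build to the Intel architectures."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumed value: skip PPC, build only for i386 and x86_64.
    env["ARCHFLAGS"] = "-arch i386 -arch x86_64"
    return env


def build_extension(project_dir="."):
    """Run `python setup.py build` in project_dir with ARCHFLAGS set."""
    command = [sys.executable, "setup.py", "build"]
    return subprocess.call(command, cwd=project_dir, env=build_env())
```

Calling `build_extension()` from the directory holding setup.py returns the build's exit code, so a zero means the compile went through.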

Much thanks to Y.H. Wong in this Super User thread.


MarkedUp: The Power of Python

August 5, 2010 » Consume, Geek

Yesterday I was stuck at the office while Darcy got a flat replaced. I decided to make a Markdown editor real quick, since I’m always using the web dingus to preview it, and that gets old.

I found the python-markdown2 library, and then decided to try out the WebKit integration in Qt4.

The best word for it is painless. Seriously, I got the first version glued together in about ten minutes, including finding the markdown library and reading all the docs for QWebView. It’s a brutal hack, but it worked just fine, and only took up 45 lines of code.

I’ve since dropped in three more markup languages, as well as cleaned it up to something readable. But I doubt I have more than a couple hours invested thus far.

I think this is a quintessential demonstration of the power of Python, which breaks down to three things.

  1. Fast development
    With no compile cycle, the barrier to creating a prototype is lower.
  2. Huge Ecosystem
    There is a library for everything and it’s all easy to get to and use.
  3. Good Documentation & Intuitive Behavior
    Most packages have good documentation, and those that don’t still behave intuitively. Just pydoc around until you find what you need.

MarkedUp is now feature complete, stable, and available at your local GitHub.

The MarkedUp Editor