adrian holovaty


April 13, 2006, 4:43 PM ET

New at washingtonpost.com: Faces of the Fallen 2.0

Just launched at work: Faces of the Fallen, a browsable database of U.S. service members who've died in Iraq and Afghanistan.

Technically this is version 2 of the site. The first version, which washingtonpost.com has had for years, was a single-page Flash application. Now we've given it the full browsable-database treatment, with permalinks for everything.

Django-powered, the site lets you browse by age, death date, home state and city, military branch or multiple search criteria. Each soldier gets his or her own page, as does each date, American city, age, military branch, etc. There's an RSS feed for recent casualties, a feed for each state and a feed for each military branch. We've integrated Google Maps on several pages to highlight service members' hometowns.

Let me know if you have any ideas for how we can improve the site.


April 7, 2006, 9:27 AM ET

How I'm using Amazon S3 to serve media files

As traffic to chicagocrime.org has steadily increased, I've been looking for ways to tweak the site's performance. The site runs on a rented dedicated server with Apache/mod_python, PostgreSQL and Django. (I'd love to bite the bullet and buy proper servers but haven't done so yet. Donations are welcome!)

One thing that's always bugged me is that chicagocrime.org's Apache instance serves both the dynamic, Django-powered pages and static media files such as the CSS and images. It's inefficient for a single Apache instance to act as both an application server (mod_python) and a media server. A bunch of Apache configuration tweaks can improve performance of one aspect of serving but are somewhat detrimental to the other aspect. For example, using the KeepAlive directive improves Apache's media-serving capabilities, but KeepAlive is detrimental in a server arrangement that mainly churns out dynamic pages. So if a single Apache instance does both media serving and dynamic page creation, you can't optimize for both cases.

(When I worked at LJWorld.com, we had the luxury of separate application, media and database servers, and we have a similar setup where I work now, but I can't afford separate servers for my little side projects.)

The solution hit me the other day -- I can just use Amazon's new S3 data-storage service to host chicagocrime.org's media files, so my own Apache server can focus on serving dynamic pages. S3 is very cheap -- 15 cents a month for each gig of storage (and I have only 936 K of media files) and 20 cents per gig of bandwidth. That's peanuts.
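To put rough numbers on "peanuts," here's a back-of-the-envelope sketch using the two prices quoted above. The function name and the 1.5 gig transfer figure are made up for illustration; only the rates and the 936 K come from this post.

```python
# Prices quoted in this post (April 2006); treat them as illustrative constants.
STORAGE_USD_PER_GB_MONTH = 0.15
BANDWIDTH_USD_PER_GB = 0.20


def monthly_cost(stored_kb, transferred_gb):
    """Estimate one month's S3 bill in dollars (hypothetical helper)."""
    stored_gb = stored_kb / (1024.0 * 1024.0)
    storage = stored_gb * STORAGE_USD_PER_GB_MONTH
    bandwidth = transferred_gb * BANDWIDTH_USD_PER_GB
    return storage + bandwidth


# 936 K of media files; the 1.5 gigs of monthly transfer is an assumed figure.
print("$%.2f" % monthly_cost(936, 1.5))  # prints $0.30
```

At this size the storage charge is a small fraction of a cent, so the bill is almost entirely bandwidth.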

It was easy to get this working; took less than an hour total. Here's what I did:

  1. First, I signed up for an Amazon S3 account. Do that by clicking "Sign Up For Web Service" on the main S3 page. When you sign up, you get two codes: an access key ID and secret access key.

  2. Next, I created an S3 "bucket" for my chicagocrime.org media files. An account can have multiple buckets. As far as I can tell, it's just a way of keeping your S3 stuff in separate containers. I did this by using the free S3 Python bindings. Just download the file, unzip it and put the file S3.py somewhere on your Python path. To create a bucket named 'mybucketname', do this:

    import S3
    conn = S3.AWSAuthConnection('your access key', 'your secret key')
    conn.create_bucket('mybucketname')
    
  3. Next, I wrote a Python script that uploaded my media files to this bucket and made them publicly readable. S3 has a bunch of complex authentication stuff, but all I wanted to do was use S3, essentially, as a Web hosting service. Here's the script I used, and here's how to use it:

    $ cd /directory/with/media/files/
    $ find | python /path/to/update_s3.py

    The script is kind of cool because it uses Python's mimetypes to determine the content type of each file in order to pass that to the S3 API. Otherwise it's pretty straightforward.

  4. Finally, it was just a matter of changing my chicagocrime.org templates to point to S3's URLs rather than my own URLs. That was a snap, thanks to Django's template inheritance and includes.
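As a rough illustration of step 3, here's a minimal sketch of the content-type part of such an upload script. It is not the script linked above: the function name `plan_uploads` and its `is_file` parameter are hypothetical, and the actual upload call is left out so the example runs without an S3 account. It only works out, for each path that `find` prints, the key and the headers you'd send: a `Content-Type` guessed with Python's mimetypes module and S3's `x-amz-acl: public-read` header to make the file publicly readable.

```python
import mimetypes
import os.path
import posixpath


def plan_uploads(lines, is_file=os.path.isfile):
    """Turn lines of `find` output into (key, headers) pairs.

    `is_file` is injectable so the sketch can be tried without real files.
    """
    plan = []
    for line in lines:
        # "./css/style.css\n" becomes "css/style.css"; a blank line becomes "."
        key = posixpath.normpath(line.strip())
        if key == "." or not is_file(key):
            continue  # skip blank lines, the top directory and subdirectories
        # Fall back to a generic binary type when the extension is unknown.
        content_type = mimetypes.guess_type(key)[0] or "application/octet-stream"
        headers = {"x-amz-acl": "public-read", "Content-Type": content_type}
        plan.append((key, headers))
    return plan


# Demo with made-up paths; pretend every path is a regular file.
demo = ["./css/style.css\n", "./images/logo.png\n", ".\n"]
for key, headers in plan_uploads(demo, is_file=lambda path: True):
    print("%s -> %s" % (key, headers["Content-Type"]))
```

In a real run you'd pass `sys.stdin` instead of the demo list and hand each key, the file's contents and the headers to whatever S3 client you're using.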

Now chicagocrime.org's media files are served directly off of S3, at a cost of 35 cents a month, and my Apache is happier.



