<?xml version="1.0"?><rss version="0.91">
	<channel>
		<title>Holovaty.com</title>
		<link>http://www.holovaty.com/</link>
		<description>Holovaty.com is a weblog discussing technical aspects of news Web sites.</description>
		<language>en-us</language>
		<item>
			<title>New EveryBlock cities launched</title>
			<description>&lt;p&gt;We've just added two cities to EveryBlock: &lt;a href=&quot;http://charlotte.everyblock.com/&quot;&gt;Charlotte&lt;/a&gt; and &lt;a href=&quot;http://philly.everyblock.com/&quot;&gt;Philadelphia&lt;/a&gt;! I've &lt;a href=&quot;http://blog.everyblock.com/2008/jun/30/twonewcities/&quot;&gt;written more over at the EveryBlock blog&lt;/a&gt;.&lt;/p&gt;</description>
			<link>http://www.holovaty.com/blog/archive/2008/06/30/1926</link>
		</item>
		<item>
			<title>Request: Headless HTML rendering engine?</title>
			<description>&lt;!--pythonfeed--&gt;&lt;p&gt;Warning: Seriously geeky request ahead!&lt;/p&gt;

&lt;p&gt;I'm looking for a way to render arbitrary Web pages -- including CSS and JavaScript -- and access the resulting DOM tree programmatically, i.e., in an automated/headless fashion. I want to be able to ask the following questions of the resulting DOM tree:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For a given element, what are the font family, size, and color of the text?&lt;/li&gt;
&lt;li&gt;How tall and wide (in pixels) is a given &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;table&amp;gt;&lt;/code&gt;, etc.?&lt;/li&gt;
&lt;li&gt;What are the x/y coordinates of a given element (from the upper-left corner of the page, or lower-left, or wherever)?&lt;/li&gt;
&lt;li&gt;For a given element, what is its text content?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The rendering must be state-of-the-art, handling advanced CSS that Firefox, Safari and IE handle. It should work on Linux. Bonus points if there's a Python API for this magical DOM tree.&lt;/p&gt;

&lt;p&gt;This is all stuff that standard in-page JavaScript could accomplish, but the catch is that I need to be able to do it in a completely automated way, on arbitrary pages, on a headless server.&lt;/p&gt;

&lt;p&gt;I know &lt;a href=&quot;http://en.wikipedia.org/wiki/Gecko_(layout_engine)&quot;&gt;Gecko&lt;/a&gt; and &lt;a href=&quot;http://en.wikipedia.org/wiki/WebKit&quot;&gt;WebKit&lt;/a&gt; provide this, but I'm not sure where to start with them. The docs and articles I've read seem to be focused more on embedding the full browser window in a GUI application than on embedding the rendering engine itself and manipulating the resulting pages.&lt;/p&gt;

&lt;p&gt;Help! If you have any clues, I'd be grateful if you left a comment or &lt;a href=&quot;http://holovaty.com/contact/&quot;&gt;got in touch with me&lt;/a&gt;.&lt;/p&gt;</description>
			<link>http://www.holovaty.com/blog/archive/2008/05/02/0136</link>
		</item>
		<item>
			<title>Check out my Radiohead remix</title>
			<description>&lt;p&gt;Radiohead is holding a &quot;contest&quot; called &lt;a href=&quot;http://www.radioheadremix.com/&quot;&gt;Radiohead Remix&lt;/a&gt;, in which they're inviting fans to remix the song called &quot;Nude&quot; from their latest album. They've released the raw tracks -- separate, isolated audio clips of vocals, guitar, percussion, etc. -- and are encouraging people to remix the tracks to create something different, then upload it to &lt;a href=&quot;http://www.radioheadremix.com/&quot;&gt;radioheadremix.com&lt;/a&gt;. I put &quot;contest&quot; in quotes because there's no prize other than a guarantee that the band members will listen to your remix. But that's still kind of a cool prize.&lt;/p&gt;

&lt;p&gt;I listened to a bunch of the submitted remixes on Wednesday and was kind of disappointed that none of the ones I listened to did anything interesting &lt;em&gt;musically&lt;/em&gt;. Most of them retained the same techno/electronica feel of the original song, kept the song's melody intact and added a couple of drum beats. So tonight, I took a shot at making my own remix.&lt;/p&gt;

&lt;p&gt;For context, I'd suggest listening to the original song first. You can find it on the &lt;a href=&quot;http://en.wikipedia.org/wiki/In_rainbows&quot;&gt;&quot;In Rainbows&quot; album&lt;/a&gt; or listen to &lt;a href=&quot;http://www.radioheadremix.com/remix/?id=238&quot;&gt;this remix&lt;/a&gt; to get an idea of the song's melody/mood.&lt;/p&gt;

&lt;p&gt;My remix is called &quot;Nude (jazzy acoustic),&quot; and you can listen to it &lt;a href=&quot;http://www.radioheadremix.com/remix/?id=565&quot;&gt;on the site&lt;/a&gt; or using this embedded player:&lt;/p&gt;

&lt;object width=&quot;425&quot; height=&quot;355&quot;&gt;&lt;param name=&quot;movie&quot; value=&quot;http://radioheadremix.com/widget/remix_widget.swf?remix_id=565&quot;&gt;&lt;/param&gt;&lt;embed src=&quot;http://radioheadremix.com/widget/remix_widget.swf?remix_id=565&quot; type=&quot;application/x-shockwave-flash&quot; wmode=&quot;transparent&quot; width=&quot;425&quot; height=&quot;355&quot;&gt;&lt;/embed&gt;&lt;/object&gt;

&lt;p&gt;It uses only Thom Yorke's vocal track from the recording, which means I was able to change the song's chords from the classic Radiohead melancholy to something a bit happier/jazzier. It has four guitar tracks -- two rhythm, one bass and one fingerstyle melody part for the extended fadeout. I cut up the vocal track in six places to fit the rhythm better, but the pieces are in the same order as on the original recording, including the extended wordless vocals at the end.&lt;/p&gt;

&lt;p&gt;It's kind of soulful, in a weird falsetto-Eddie-Vedder-ish way, especially compared to the original recording (which is also soulful, but in a much different way!).&lt;/p&gt;

&lt;p&gt;If you like it, please vote and tell all your friends to vote! I'd love for Radiohead to hear this. :-)&lt;/p&gt;
</description>
			<link>http://www.holovaty.com/blog/archive/2008/04/04/0324</link>
		</item>
		<item>
			<title>EveryBlock hiring a Python screen-scraping expert</title>
			<description>&lt;!-- djangofeed --&gt; &lt;!-- pythonfeed --&gt;
&lt;p&gt;Attention Python screen-scraping experts! We're looking to hire another full-time developer at &lt;a href=&quot;http://www.everyblock.com/&quot;&gt;EveryBlock&lt;/a&gt;. Our site, which just launched a few weeks ago, compiles a wealth of granular geographic data and publishes it on a block-by-block basis. We offer a distinct Web page (plus an RSS feed and e-mail alerts) for every city block in &lt;a href=&quot;http://chicago.everyblock.com/&quot;&gt;Chicago&lt;/a&gt;, &lt;a href=&quot;http://nyc.everyblock.com/&quot;&gt;New York&lt;/a&gt; and &lt;a href=&quot;http://sf.everyblock.com/&quot;&gt;San Francisco&lt;/a&gt;. We're expanding to more cities and more data sources. And we have a ton of fun features and projects up our sleeves.&lt;/p&gt;

&lt;p&gt;This position involves contributions to all of our site's technology and data, with a concentration on screen-scraping public data from government Web sites. Some specifics we're looking for are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mastery of screen-scraping&lt;/li&gt;
&lt;li&gt;Experience programming in Python&lt;/li&gt;
&lt;li&gt;Experience with geographic data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Experience with &lt;a href=&quot;http://www.djangoproject.com/&quot;&gt;Django&lt;/a&gt; is a nice-to-have.&lt;/p&gt;

&lt;p&gt;For more on EveryBlock, check out &lt;a href=&quot;http://blog.everyblock.com/2008/jan/23/launch/&quot;&gt;our launch announcement&lt;/a&gt; and &lt;a href=&quot;http://www.fimoculous.com/archive/post-3860.cfm&quot;&gt;this recent interview&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This is an opportunity to work on an exciting and important project with a talented and experienced Web development team. We're currently only four people, so you'll have a lot of freedom and opportunities to make a difference.&lt;/p&gt;

&lt;p&gt;This is a full-time, salaried position, on-location in our modest downtown Chicago office. We're a startup, funded by a grant, trying to make the world a better place. Please &lt;a href=&quot;http://www.holovaty.com/contact/&quot;&gt;contact me&lt;/a&gt; if you're interested or have any questions. Tell me about the gnarliest site you've ever scraped.&lt;/p&gt;
</description>
			<link>http://www.holovaty.com/blog/archive/2008/02/18/1928</link>
		</item>
		<item>
			<title>A couple of EveryBlock interviews</title>
			<description>&lt;!--djangofeed--&gt; &lt;!--pythonfeed--&gt;

&lt;p&gt;Back in 2006, I had a &lt;a href=&quot;http://www.ojr.org/ojr/stories/060605niles/&quot;&gt;very enjoyable interview with Robert Niles at Online Journalism Review&lt;/a&gt;. Now, Robert and I have gotten back together for another e-mail conversation about my latest project, EveryBlock: &lt;a href=&quot;http://www.ojr.org/ojr/stories/080206niles/&quot;&gt;check it out&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;And there's more! Earlier today, Rex Sorgatz published &lt;a href=&quot;http://www.fimoculous.com/archive/post-3860.cfm&quot;&gt;an interview with me about EveryBlock&lt;/a&gt;, with more of a technology focus.&lt;/p&gt;

&lt;p&gt;Thanks to both Robert and Rex for the great questions.&lt;/p&gt;</description>
			<link>http://www.holovaty.com/blog/archive/2008/02/15/0032</link>
		</item>
	</channel>
</rss>
