Wednesday, October 19, 2005

Email tacks

Rob Bray suggests a different tack to take for spam prevention/reduction.

Rob blogged about his resistance to the penny tax on email that Tim Bray suggested in a recent blog post. In this installment, however, he changes his mind and comes up with a brilliant solution: only unanswered emails cost you anything. Most of us write emails to people and receive answers from those same people; spammers, on the other hand, send emails to thousands of people and only a few (idiots) reply. If we could make it really costly to send unanswered emails and cheap to send answered ones, we might have the beginnings of a really nice, low-tech solution.

This suggestion of Rob's reminds me of a socio-economic system called Stone Society described by Peter Merel. The system involves creating an artificial resource that is then exchanged and manipulated by participants in order to allow decision-making to proceed.

In the world that Rob describes, you would exchange tokens freely with people with whom you have a back-and-forth. Spammers would simply send you tokens which you could accumulate. In other words, spam would be beneficial to you even if you didn't want to receive it.

If the tokens here were indeed pennies you would actually get paid to receive spam. You could still have email filters to make sure you don't have to read it. This plan is all about raising the barrier to entry for spammers.
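
A toy model makes the incentive concrete. This is only a sketch under made-up assumptions, not anything taken from Rob's post or from Stone Society: the Ledger class, the starting balance of 10 tokens and the addresses are all hypothetical. Every message moves one token from sender to recipient, so a reply hands the token straight back.

```python
from collections import defaultdict


class Ledger:
    """Toy model: each message moves one token from sender to recipient."""

    def __init__(self, starting_balance=10):
        # Hypothetical starting allowance; the right number is anyone's guess.
        self.balances = defaultdict(lambda: starting_balance)

    def send(self, sender, recipient):
        """Sending costs the sender one token, which the recipient keeps."""
        if self.balances[sender] < 1:
            raise ValueError(f"{sender} has no tokens left")
        self.balances[sender] -= 1
        self.balances[recipient] += 1


ledger = Ledger()

# A normal conversation: the token goes out and comes straight back.
ledger.send("me@example.org", "friend@example.org")
ledger.send("friend@example.org", "me@example.org")

# A spam run: nobody replies, so every token sent is gone for good.
for n in range(5):
    ledger.send("spammer@example.org", f"victim{n}@example.org")
```

After this run the two correspondents are back where they started, the spammer is five tokens poorer and each victim is one token richer.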

Tuesday, October 18, 2005

Hacking the iPod nano

I hacked together a little Python script that scrapes Atom feeds and transforms them into iPod "contacts", allowing me to read blog posts on my nano. It uses html2text right now, but a lot of entries with embedded HTML come out looking pretty weird. I'll post the script once I clean up the text generation.

I managed to get a couple of the vCalendar features working and found out that even though Apple says you need iSync or iCal to use the nano to add TODO items, you can just add standard vTodo objects to a vCalendar stream and it works just fine.

Also, the rdate property is what controls repeating tasks/events - not any of the other ones that purportedly serve the same purpose.
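
For illustration, a vTodo inside a vCalendar stream can be generated with a few lines of Python. This is a sketch only: the make_vtodo helper and the sample values are made up, the property set is a minimal vCalendar 1.0 guess rather than a list of what the nano actually requires, and RDATE is included as an optional extra because it is the property that appeared to control repetition.

```python
def make_vtodo(summary, due, rdate=None):
    """Return a vCalendar 1.0 stream holding a single VTODO.

    `due` and `rdate` are date-time strings such as "20051020T090000".
    """
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:1.0",
        "BEGIN:VTODO",
        f"SUMMARY:{summary}",
        f"DUE:{due}",
    ]
    if rdate is not None:
        # Assumption: a single recurrence date is enough for the example.
        lines.append(f"RDATE:{rdate}")
    lines += ["END:VTODO", "END:VCALENDAR"]
    # vCalendar uses CRLF line endings.
    return "\r\n".join(lines) + "\r\n"


todo = make_vtodo("Post the Atom script", "20051020T090000")
```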

Now it's time to go to bed...

Sunday, October 16, 2005

String/XML representations in YAXL

Damian Cugley asks


Could you eliminate the need for the text member variable by using str(elt) instead? Or would that interfere with using str to return the XML representation for the whole element?


In YAXL v0.0.14 I've split the representation of an element between str and repr so that the former would be a shortcut to return the text property and the latter would return the XML representation of the element.

I'm pleased with the separation since now if you evaluate an element at the interactive prompt, it returns the XML for the element but if you print or interpolate the element its string value will be used. There is also a nice parallel with the way nodes are handled in XPath.
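
The split is easy to picture with a stand-in class. The Element below is hypothetical and written only for this illustration; it is not YAXL's implementation, and it ignores children and attributes entirely.

```python
class Element:
    """Stand-in for illustration only; not YAXL's actual implementation."""

    def __init__(self, name, text=""):
        self.name = name
        self.text = text

    def __str__(self):
        # print() and string interpolation use the text content.
        return self.text

    def __repr__(self):
        # The interactive prompt shows the XML form.
        if not self.text:
            return f"<{self.name} />"
        return f"<{self.name}>{self.text}</{self.name}>"


title = Element("title", "Fall cleaning")
heading = "Post: %s" % title  # interpolation goes through __str__
```

Evaluating title at the prompt goes through repr and shows the markup, while print(title) goes through str and shows only the text.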

YAXL elements retain the text property in order to continue to support the oodles of screaming fans who have already shipped billion-dollar projects based on my little library.

Thanks for the suggestion Damian!

Fall cleaning

Justin and I cleaned out some of the more cobwebby parts of mystic yesterday. We wanted to install Python 2.4 with Subversion, Trac and Apache 2. It took a fair bit of wrangling (i.e. building from source) to get everything working together, after which Justin realised that we hadn't upgraded to Sarge (duh!). He rebuilt everything using the latest packages and it looks OK so far.

The transition from Apache 1.3 to 2.x was pretty smooth with the SSL stuff at the high end weighing in at about 20-30 minutes to figure out. Although the new sites-available/sites-enabled organization is nicely architected, it does make for some pretty painful contortions when performing wholesale changes on a group of virtual hosts. For example, we started off testing the new installation of Apache 2.x on port 81 (which required that we split out the vhost declarations into files of their own) and then proceeded to switch it to port 80 (which ended up requiring a quick cat/sed/re-direct combination). Without a good knowledge of shell scripts or some scripting language this would have been really painful.
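
The same kind of wholesale port change can be sketched in Python instead of cat/sed. The function names are made up, and the sketch assumes the port only appears in Listen, NameVirtualHost and <VirtualHost> lines without an explicit IP address on the Listen line; it is not the exact command we ran.

```python
import re
from pathlib import Path


def switch_port(config_text, old_port, new_port):
    """Rewrite the port in Listen, NameVirtualHost and <VirtualHost> lines."""
    pattern = re.compile(
        rf"^(\s*(?:Listen\s+|NameVirtualHost\s+\S*:|<VirtualHost\s+\S*:)){old_port}\b",
        re.MULTILINE,
    )
    # \g<1> keeps the directive prefix and avoids ambiguity with the digits.
    return pattern.sub(rf"\g<1>{new_port}", config_text)


def switch_port_in_dir(directory, old_port, new_port):
    """Apply switch_port to every file in a directory of vhost files."""
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            original = path.read_text()
            updated = switch_port(original, old_port, new_port)
            if updated != original:
                path.write_text(updated)
```

Something like switch_port_in_dir("/etc/apache2/sites-available", 81, 80) would then cover the whole group of virtual hosts in one go, assuming that is where the vhost files live.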

The other rough item on our plate was permissions when using the Berkeley DB version of a Subversion repository. My advice: don't go there. Just stick to FSFS and you should be OK. That said, I had originally created the repository in FSFS but had subsequently dumped it and loaded it onto mystic. It seems that I did something wrong because it loaded in Berkeley DB mode. All of that is fixed now.

We also tossed in Apache Subversion support for good measure, so now we can set up anonymous browsing and reading of our SVN repositories.

Saturday, October 15, 2005

iPod nano in da house

Vem and I finally both got our iPod nanos! What a cool gadget. These things are too slick to be real!

My one small gripe is that we ordered a bunch of stuff from Apple and they shipped it all separately, with the predictable result that the iPod armband for Vem's gym sessions arrived first (wow, look - an armband), followed by her iPod, followed several days and a trip to the local FedEx detention centre for unclaimed packages later by mine own.

Rumors about the nano being scratch-prone seem to have some truth to them, as I can already (after a day and a half) detect small scoring on the exterior plastic. Of course, we ordered protective "skins", but since Apple hasn't shipped them yet (and nobody in their right mind would leave an unopened nano sitting on the shelf for three weeks) we may find that our nanos are not as pristine as they could be. Ah well...

Still, the photo feature is pretty sweet (useless but sweet), although it would be nice to be able to set a photo that displays automatically when you power on (sort of like Vem's digital camera). The rating feature is very cool (I was just ranting to Vem the other day about needing one of these). iTunes is not all it could be, but I haven't started playing around with the iPod libraries out there to see what I can hack up.

Anyways, overall we love our nanos and would buy them again: 9.8/10

Monday, October 10, 2005

YAXL v0.0.12 released

Over the weekend I've managed to work in a number of bugfixes, add some more XPath support, add sequence-style access for children and implement a whole slew of namespace features. The latest source version includes 78 unit-tests that cover all this fancy functionality.

I'm still debating the question of whether or not to allow access to children via property names (like xmltramp). The problem is that you end up having to mangle either child names or "real" property names. You also need a mangled way of dealing with node-sets. YAXL currently supports XPath even in XML fragments, so theoretically you should use that, but it's hard not to like root.title as opposed to root('title'). More complex queries, however, reveal the power of XPath: compare [x for x in root.head.children if x.localname == 'style'] to root('head//style') to retrieve all style elements of an HTML document. Or even root('//p') to retrieve all the paragraphs.
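
The same contrast can be tried with nothing but the standard library. The sketch below uses ElementTree as a stand-in, so the calls are ElementTree's (find, findall and its limited path syntax) rather than YAXL's, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<html>"
    "<head><title>t</title><style>a{}</style><style>b{}</style></head>"
    "<body><p>one</p><div><p>two</p></div></body>"
    "</html>"
)

# Manual traversal: walk the children of <head> and filter by tag name.
head = doc.find("head")
styles_manual = [child for child in head if child.tag == "style"]

# Path queries: one expression each, matching at any depth.
styles_query = doc.findall(".//style")
paragraphs = doc.findall(".//p")
```

The list comprehension only looks at the immediate children of head, while the path queries find matching elements however deeply they are nested.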

Ideally, YAXL would support both methods but I still need to come to terms with all the mangling required.

Friday, October 07, 2005

Expat doesn't support namespace prefixes

This comes via "Crest", who sent in some nice stack traces generated by YAXL. I wrote in a fix yesterday to handle a missing QName, but I didn't realise that expat would throw a SAXException when asked to report namespace prefixes. As a result, nobody using expat can use the parse function until I add a try/except for that exception.
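
The guard could look something like the sketch below. It is not the code that went into YAXL; the make_namespace_parser name and the boolean it returns are made up. It relies only on the standard xml.sax API, where a parser that cannot honour a feature raises SAXNotSupportedException or SAXNotRecognizedException, both subclasses of SAXException.

```python
from xml.sax import (
    SAXNotRecognizedException,
    SAXNotSupportedException,
    make_parser,
)
from xml.sax.handler import feature_namespace_prefixes, feature_namespaces


def make_namespace_parser():
    """Return (parser, prefixes_reported) with namespace processing enabled."""
    parser = make_parser()
    parser.setFeature(feature_namespaces, True)
    try:
        # Ask for qnames to be passed to startElementNS.
        parser.setFeature(feature_namespace_prefixes, True)
        prefixes_reported = True
    except (SAXNotSupportedException, SAXNotRecognizedException):
        # The parser cannot report prefixes; carry on without qnames.
        prefixes_reported = False
    return parser, prefixes_reported


parser, prefixes_reported = make_namespace_parser()
```

When prefixes_reported comes back False, the content handler has to cope with a qname of None.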

Thursday, October 06, 2005

SAX Parser Funny Stuff

It turns out that the startElementNS method on a SAX ContentHandler is not necessarily passed the QName of the element. The name is passed as a URI/element name tuple, but the QName is not mandatory unless the feature_namespace_prefixes feature is enabled. The following code shows how to do this:


from xml.sax import make_parser
from xml.sax.handler import feature_namespaces, feature_namespace_prefixes

# MyContentHandler is your own ContentHandler subclass (defined elsewhere)
contentHandler = MyContentHandler()

parser = make_parser()

# Perform namespace processing
parser.setFeature(feature_namespaces, True)

# Report original prefixed names (i.e. qname is passed to startElementNS)
parser.setFeature(feature_namespace_prefixes, True)

parser.setContentHandler(contentHandler)

# inputSource is a filename, URL, file-like object or InputSource
parser.parse(inputSource)


I was made aware of this issue when YAXL failed to parse XML source properly on somebody else's machine; the elements were being named "None". That tip led to this fix. Merci Marc!
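
On the handler side, the defensive version is to never assume the qname is there. The sketch below is not YAXL's handler; NameCollector and collect_names are made-up names. It falls back to the local name from the (URI, local name) tuple whenever qname is None.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import ContentHandler, feature_namespaces


class NameCollector(ContentHandler):
    """Record one name per start tag, preferring the qname when present."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        uri, localname = name
        # qname may be None when the parser does not report prefixes.
        self.names.append(qname if qname is not None else localname)


def collect_names(xml_bytes):
    """Parse a bytes document and return the collected element names."""
    handler = NameCollector()
    parser = make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.names


names = collect_names(b'<root xmlns="urn:example"><child /></root>')
```

With a prefixed document such as <a:root xmlns:a="urn:example">, a parser that does not report prefixes would give back the bare local names instead of "None".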

YAXL Reviewed

Jeremy Jones blogs about YAXL (already!). I agree with Jeremy that ElementTree is pretty much the best XML tool there is out there for Python. I also agree that the children attribute should not contain descendants. It never really did but since YAXL Elements are represented as XML fragments, it may have seemed that way.

Here's what an interactive session looks like:


>>> import yaxl
>>> x = yaxl.Element('x')
>>> x.append('y').append('z').append('w')
<w />
>>> x.children
[<y><z><w /></z></y>]


It may seem from the output that the z and w elements are returned as part of children but they are just part of the XML fragment that y outputs.

In the latest release (0.0.6) you can also "call" the element with an XPath of child::* and get back a list of immediate children.

Wednesday, October 05, 2005

YAXL gets XPath

I added some basic XPath support to YAXL this morning. It now supports both abbreviated and unabbreviated XPath queries on Element objects. Currently only attribute-value and node/node-set selections are supported. Also, only the following axes are allowed in queries: self, parent, ancestor, ancestor-or-self, child, descendant and descendant-or-self.

Especially cool (IMO) is the fact that YAXL supports two methods of performing XPath queries: you can call select on an instance of Element or you can call an instance of Element and supply the XPath as the only parameter.
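
The two call styles boil down to one method with two spellings. The wrapper below is a hypothetical illustration, not YAXL's code: it delegates to ElementTree's findall, which only understands a limited abbreviated path syntax, so unabbreviated queries such as child::* would not work here.

```python
import xml.etree.ElementTree as ET


class Element:
    """Hypothetical wrapper showing the two call styles; not YAXL's code."""

    def __init__(self, node):
        self._node = node

    def select(self, path):
        # Delegate to ElementTree's limited path support.
        return [Element(found) for found in self._node.findall(path)]

    # Calling the instance is just another spelling of select().
    __call__ = select

    def __repr__(self):
        return f"<{self._node.tag} />"


root = Element(ET.fromstring("<x><y><z /></y><y /></x>"))
via_method = root.select("y")
via_call = root("y")
```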

Tuesday, October 04, 2005

Something new, something old

I just launched a new site at http://www.ilowe.net. I will be retiring the schmeez.org site (the permanent redirects are already in place for the 3 people that visit regularly) for many reasons (not least of which is having to spell it every time I tell somebody where my "homepage" is).

I also released the first version of YAXL, a Python module for reading, writing and manipulating XML. It will form the basis for an Atom library I am currently writing.