Naruto Fetcher Jutsu!

A few months ago, a friend told me about “Naruto”, a Japanese manga from which two TV series were made. I now hate this friend, because I was completely hooked by the story and wasted many hours that could’ve been spent more productively :) I watched nearly all of the dubbed episodes of Naruto (I got tired of the later filler episodes) as well as the Shippuden episodes.

The current story arc in Shippuden is also a filler arc that seems to be going nowhere, so I decided to get the manga to read the canon story. There are, as of this writing, 445 chapters. That’s a lot of chapters to download by hand! (I could read them online, but I found the picture quality to be severely lacking.) Being a lazy programmer and all, instead of clicking individually on every link, I decided to write a script to fetch them all. I could’ve simply used a for loop, but I wanted to download many archives in parallel to make the process go faster, so I wrote the script in Python and used the new multiprocessing module to make the parallelism easy (trivial, even!).

This is an extremely simple script, nothing fancy, but I give it to you anyway:

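from multiprocessing import Pool
import subprocess

URL = 'http://www.narutochuushin.com/downloads/script/downloads.php?title=manga_chapter%03d'

def get(n):
    # Fetch chapter n with wget; -q keeps the output quiet.
    subprocess.call(['wget', '-q', URL % n])

if __name__ == '__main__':
    # Ten worker processes download chapters 1 through 445 in parallel.
    pool = Pool(10)
    pool.map(get, range(1, 446))
    pool.close()
    pool.join()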

Hope you enjoy!

Explaining the null debate

A few weeks ago there was a discussion about null references on an IRC channel I hang out in. That was at the time of QCon London, when Sir Tony Hoare was giving a keynote presentation calling null references his billion-dollar mistake. We talked about null for about an hour before we all gave up, realizing that the discussion was never going to get anywhere.

I like IRC, but for long discussions, especially those involving many parties, it’s not the best medium out there. This blog post is an attempt to concisely explain the problem with null.

First, what’s null? Null is a way in many languages to convey the idea of “nothingness”. The concept of nothingness is an important one in computer science, and hardly anyone is suggesting that we do without it. This is not what people are arguing about.

The debate is over whether nothingness should be explicit or implicit.

In a language like Java, nothingness is implicit. A method that expects a String parameter can also be passed null. A method that expects an array of doubles can also be passed null. We can think of null as a member of every reference type in Java. However, its semantics are different from every other value of that type; calling a method or accessing a field on null results in a NullPointerException. This means that special care has to be taken when dealing with null. And because the compiler can’t know whether we are calling .length() on null or a valid string, it cannot prevent us from doing bad things. The responsibility for safe code rests entirely on the programmer’s shoulders.
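Python’s None behaves much the same way as Java’s null here, so the problem can be sketched without leaving Python. This is only an illustration; shout and try_shout are made-up names, not something from the IRC discussion:

```python
def shout(s):
    # Nothing in the signature says whether None is an acceptable argument.
    return s.upper() + "!"

def try_shout(value):
    # The mistake only surfaces at runtime, when a method is looked up on None.
    try:
        return shout(value)
    except AttributeError as exc:
        return "failed: %s" % type(exc).__name__

print(try_shout("hello"))
print(try_shout(None))
```

Both calls are accepted without complaint; only running the second one reveals the error.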

On the other hand, in a language like Haskell, nothingness is explicit. Haskell has a type called Maybe a (this could be written Maybe<T> in Java) whose values come in exactly two forms: Nothing, which is the value you use to describe the absence of a value, and Just x, which we can think of as a box containing the actual value. (The value needs to be “pulled out” of the box to be used normally.) The type Maybe String (a concrete instance of Maybe a) cannot be used where a String is expected, and vice versa; you cannot pass Nothing to a function that expects a String, because the type checker will not allow it. This eliminates the NullPointerException problem. Furthermore, because Haskell knows that Maybe a has two constructors, it can warn you when it detects that you forgot to handle one of the cases in a function. Haskell can actually help us avoid making errors.
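To make the “box” idea concrete, here is a rough imitation of Maybe in Python. It is a sketch under heavy assumptions: the classes Nothing and Just and the function length_or_zero are invented for illustration, and Python checks nothing ahead of time, so this shows only the shape of the idea, not the compile-time guarantee Haskell gives.

```python
class Nothing(object):
    """Stands for the absence of a value."""
    def __repr__(self):
        return "Nothing"

class Just(object):
    """A box holding an actual value."""
    def __init__(self, value):
        self.value = value
    def __repr__(self):
        return "Just(%r)" % (self.value,)

def length_or_zero(maybe_string):
    # Both cases are handled explicitly; the string has to be
    # pulled out of the box before len() can be applied to it.
    if isinstance(maybe_string, Just):
        return len(maybe_string.value)
    return 0

print(length_or_zero(Just("hello")))
print(length_or_zero(Nothing()))
```

Calling len directly on a Just fails, because the box is not a string. In Haskell that mismatch would be reported by the type checker before the program ever ran.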

Response to the prefix syntax “debate”

In a recent blog post, Brian Carper came out in defense of Lisp’s unusual syntax, citing regularity and ease of manipulation (by humans and computer programs). The response in the comments and on Reddit was mixed, with many people (myself included) echoing Brian’s sentiment and as many people voicing their distaste for Lisp’s syntax.

A common criticism put forth by opponents of prefix syntax is that it makes maths “unnatural”. They write formulas in infix and prefix styles and expect the reader to see how much clearer, more natural, and more intuitive infix is. Here’s an example taken from a comment:

1 + 2*f(x,y) + 3*g(x)

vs.

(+ 1 (* 2 (f x y)) (* 3 (g x)))

Of course, most Lisp programmers would probably prefer to break down the prefix version into multiple lines to show the structure of the expression more clearly:

(+ 1
   (* 2 (f x y))
   (* 3 (g x)))

The “problem” raised with this solution was that the expression now takes three lines and is apparently “66% less productive” than the equivalent infix representation. I have my doubts about this claim. :)

However, if you throw in comparison, boolean, and bitwise operators, now you can have some fun. Quick, parenthesize the following expression without looking up your C reference: 1 & 2 * 3 || 4 + 5 ^ 6 < 7 - 8 == 9.

Regardless of how one feels about whether infix operators are more natural, it doesn’t really matter, because most of us don’t write programs with a lot of long mathematical expressions. In fact, I’d bet that the vast majority of arithmetic operations are adding 1 or subtracting 1 from a value. Hardly worthy of a debate.

Functions are used a lot more than operators, and in all mainstream languages they have the prefix form and nobody seems too stumped by them. Why is that? And if you aren’t stumped by prefix syntax, wouldn’t you like it if it was the same for every operation so that your code was effectively a tree that you could manipulate at compile-time with macros to add your own syntax and extensions to the language? Surely you would!
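To give a feel for “code as a tree”, here is a small Python sketch that stores the earlier prefix expression as nested tuples, evaluates it, and rewrites it. Everything in it is an assumption made for illustration: the names evaluate and swap_operator, and the definitions chosen for f and g. It also runs at runtime, whereas Lisp macros transform code at compile time, so it only imitates the idea.

```python
import operator
from functools import reduce

OPS = {'+': operator.add, '*': operator.mul}

def evaluate(expr, env):
    """Evaluate a prefix expression stored as nested tuples."""
    if isinstance(expr, tuple):
        head = expr[0]
        args = [evaluate(arg, env) for arg in expr[1:]]
        if head in OPS:
            return reduce(OPS[head], args)  # operators take any number of arguments
        return env[head](*args)             # anything else is a function call
    if isinstance(expr, str):
        return env[expr]                    # variable lookup
    return expr                             # numeric literal

def swap_operator(expr, old, new):
    """Return a copy of the tree with operator `old` replaced by `new`."""
    if isinstance(expr, tuple):
        head = new if expr[0] == old else expr[0]
        return (head,) + tuple(swap_operator(arg, old, new) for arg in expr[1:])
    return expr

# (+ 1 (* 2 (f x y)) (* 3 (g x)))
expr = ('+', 1, ('*', 2, ('f', 'x', 'y')), ('*', 3, ('g', 'x')))
env = {'x': 2, 'y': 3, 'f': lambda a, b: a + b, 'g': lambda a: a * a}

print(evaluate(expr, env))                           # prints 23
print(evaluate(swap_operator(expr, '+', '*'), env))  # prints 120
```

Because operators and function calls share one shape, the rewriting function needs no special cases for precedence or associativity.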

Bitten by dynamic typing :-/

Quick post about a bug that I fixed this morning that was quite embarrassing. I had a Python function in which I did something like the following:

if getattr(obj, method):
    ...

Experienced Pythonistas will spot the problem immediately: if obj doesn’t have the attribute named by method, getattr will raise an AttributeError. What I should have done (and this is how I fixed the bug) was to pass a third argument to getattr to specify a default value for when the attribute doesn’t exist.
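A minimal sketch of the fix, using stand-in names (Plugin and call_if_present are hypothetical, not the code from the actual project):

```python
class Plugin(object):
    def start(self):
        return "started"

def call_if_present(obj, method):
    # The third argument to getattr is returned when the attribute is
    # missing, so no AttributeError is raised.
    func = getattr(obj, method, None)
    if func:
        return func()
    return None

print(call_if_present(Plugin(), "start"))
print(call_if_present(Plugin(), "stop"))
```

With the two-argument form, the second call would have raised AttributeError instead of quietly returning None.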

I usually like dynamic typing, but this time I hated it: if this had been Haskell, the type checker would’ve refused to compile my code.

Oh well, what doesn’t kill you makes you stronger I guess. I probably should have had a test for this particular case too.

Discovering new settings for your Emacs

If you’re like me, you enjoy configuring Emacs so that it is a reflection of what would be the perfect editor for you. That process usually involves finding out about new features and functionality and adding them to your init.el file. Thanks to GitHub, you can now view thousands of init.el files. Just do a search for path:init.el and you’ll find the configuration file of every Emacs user who deemed it useful to keep their configuration in a VCS.
