Clojure tutorial: fetching web comics (part 4)

December 28, 2008

Just a quick note before we start: I opened a GitHub repository for the code in these articles. You can find it at this address: http://github.com/gnuvince/blog-comics/tree/master

Welcome to part four of our simple Clojure comics fetcher tutorial. Like last time, I will try to keep today’s article short and sweet. We’ll discuss error handling; specifically, we’ll see what to do when a site does not respond.

The first thing to do is to add a comic to *comics* that we know will fail. The great Perry Bible Fellowship web comic is no more, so we shall use it to test a web site that is down. Add the following to *comics* and try executing the script.

  {:name "Perry Bible Fellowship"
   :url "http://www.pbfcomics.com"
   :prefix "http://www.pbfcomics.com/archive_b/"
   :regex #"PBF.+?\.gif"}

Unless pbfcomics.com is back up when you read this article, your program should hang indefinitely. By default, JVM sockets do not time out. Add the following to your code to set a 5-second timeout (the values are in milliseconds) for all connections:

  (System/setProperty "sun.net.client.defaultConnectTimeout" "5000")
  (System/setProperty "sun.net.client.defaultReadTimeout" "5000")
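
If a global setting is too broad, the same timeouts can be set on a single connection instead. Here is a minimal sketch, assuming a hypothetical helper named open-with-timeout that is not part of the tutorial's code. setConnectTimeout and setReadTimeout are standard java.net.URLConnection methods, and both take milliseconds:

```clojure
(import '(java.net URL))

;; Hypothetical helper (not in the original script): returns a URLConnection
;; for url-string with both timeouts set to timeout-ms milliseconds.
;; openConnection only creates the connection object; nothing is sent
;; over the network until the connection is actually used.
(defn open-with-timeout
  [url-string timeout-ms]
  (doto (.openConnection (URL. url-string))
    (.setConnectTimeout (int timeout-ms))
    (.setReadTimeout (int timeout-ms))))
```

Reading from a connection created this way should throw a java.net.SocketTimeoutException once the limit is exceeded, just like the global properties do.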

Running the script again should give you a long ugly error message:

  $ clj comics.clj
  [ Penny-Arcade, We The Robots and Xkcd ]

  Exception in thread "main" java.net.SocketTimeoutException:
  connect timed out (comics.clj:0)

  [ Big stack trace ]

The program threw an exception when it could not connect to pbfcomics.com, but we never handled it. Let’s do so with try/catch. Change the doseq at the end of the script to this:

(doseq [comic *comics*]
  (try
   (println (:name comic) ":" (fetch-comic comic))
   (catch Exception e
     (println "Couldn't fetch" (:name comic) e))))

try attempts to execute the forms under it, and if one throws an exception, try looks through the catch clauses to see whether that particular exception is handled. A catch clause has the form (catch ExceptionName varname body*). A try block may have multiple catch clauses.

Here, we catch Exception, which means that all exceptions, whatever they may be, will be handled the same way. It’s not the most “proper” thing to do, but it’ll do for now.
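
For comparison, here is a rough sketch of what a more specific version with several catch clauses could look like. The function describe-failure and its messages are made up for illustration; only the exception classes are real. The clauses are tried in order, so the most specific class goes first (SocketTimeoutException is a subclass of IOException):

```clojure
;; Hypothetical example: calls the zero-argument function thunk and
;; returns either its result or a string describing what went wrong.
(defn describe-failure
  [thunk]
  (try
    (thunk)
    ;; Most specific exception first.
    (catch java.net.SocketTimeoutException e
      (str "timed out: " (.getMessage e)))
    ;; Any other I/O problem.
    (catch java.io.IOException e
      (str "I/O error: " (.getMessage e)))
    ;; Everything else.
    (catch Exception e
      (str "unexpected error: " (.getMessage e)))))
```

With this shape, a timeout and, say, a malformed response could be reported differently instead of sharing one generic message.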

Running the script again, you should see the following when you get to PBF.

Couldn't fetch Perry Bible Fellowship \
#<SocketTimeoutException java.net.SocketTimeoutException: connect timed out>

When we start outputting XML, we’ll simply skip comics that couldn’t be fetched.
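
One possible way to do that skipping, sketched under assumptions: fetch-or-nil and fetch-all are hypothetical names, and fetch-fn stands in for the fetch-comic function from the earlier parts. A failed fetch becomes nil, and keep drops the nils:

```clojure
;; Hypothetical helper: returns a map with the comic's name and fetched
;; image, or nil if fetch-fn throws any exception.
(defn fetch-or-nil
  [fetch-fn comic]
  (try
    {:name (:name comic) :image (fetch-fn comic)}
    (catch Exception _ nil)))

;; Hypothetical helper: fetches every comic and keeps only the successes.
;; doall forces the lazy sequence so all fetching happens right here.
(defn fetch-all
  [fetch-fn comics]
  (doall (keep #(fetch-or-nil fetch-fn %) comics)))
```

In the script this would be called as something like (fetch-all fetch-comic *comics*).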


Peep Code: Meet Emacs

December 26, 2008

Yesterday, I bought the Meet Emacs screencast. I wasn’t sure initially if it was worth the $9, but after watching it, I can say without any doubt that it is worth every single penny.

The pace is extremely fast, so this cannot really be used as a follow-along tutorial. The caster covers a lot of material, so you’ll probably need to watch it a few times to get everything. The screencast starts with installation, configures Emacs with the Emacs starter kit, and then gets into the different features of Emacs.

New users will learn what makes Emacs such a powerful editor; experienced users will probably learn a new thing or two (for me, it was whitespace-mode and magit).

If you don’t know Emacs, but are wondering what the fuss is about, you should definitely buy this screencast.


Clojure tutorial: fetching web comics (part 3)

December 20, 2008

Today’s installment of our comic fetcher tutorial is gonna be shorter than the previous two. We’re not going to add any functionality; we’re just going to put the program in a namespace. Take the program from last time and change the import statement at the top to this:

(ns net.gnuvince.comics
  (:import (java.net URL)
          (java.lang StringBuilder)
          (java.io BufferedReader InputStreamReader)
          (org.htmlparser Parser)
          (org.htmlparser.visitors NodeVisitor)
          (org.htmlparser.tags ImageTag)))

ns switches to a new namespace, creating it if it doesn’t exist. You can also specify Java classes to import in that particular namespace. Let’s look at it with a sample REPL session:

user=> (load-file "comics4.clj")
[output from our program]

; Fully qualified name works, because it's in our CLASSPATH
user=> org.htmlparser.Parser
org.htmlparser.Parser
; However, we have not imported the Parser class in the user namespace
user=> Parser
java.lang.Exception: Unable to resolve symbol: Parser in this context (NO_SOURCE_FILE:0)

; in-ns is used to switch namespaces
user=> (in-ns 'net.gnuvince.comics)
#<Namespace net.gnuvince.comics>
; FQN still works
net.gnuvince.comics=> org.htmlparser.Parser
org.htmlparser.Parser
; Just the name of the class works, because we imported it in this namespace
net.gnuvince.comics=> Parser
org.htmlparser.Parser 

; Go back to user namespace
net.gnuvince.comics=> (in-ns 'user)
#<Namespace user>
; It is still not in the user namespace
user=> Parser
java.lang.Exception: Unable to resolve symbol: Parser in this context (NO_SOURCE_FILE:0)
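
The same behaviour can be checked without the htmlparser jar. The following is only a sketch: the namespace names example.imports and example.other are invented, and java.net.URL is used because it ships with the JDK. ns-resolve looks a symbol up in a given namespace and returns nil when it is not known there:

```clojure
(ns example.imports
  (:import (java.net URL)))

;; Inside example.imports, the short name URL maps to the imported class.
(def resolved-here (ns-resolve 'example.imports 'URL))

;; A freshly created namespace only has the default java.lang imports,
;; so the short name URL is unknown there and ns-resolve returns nil.
(def other-ns (create-ns 'example.other))
(def resolved-elsewhere (ns-resolve other-ns 'URL))
```

Here resolved-here holds the java.net.URL class and resolved-elsewhere is nil, which mirrors the Parser lookups in the session above.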

Blind programming

December 11, 2008

I was wondering about something silly today: what would be an ideal programming language syntax for a blind person? I am neither blind nor visually impaired, and I don’t know anybody who is, but the question was interesting to me. I’ll go with 100% pure conjecture here:

  • Assembler: with one instruction per line, I think following the code would definitely be doable.
  • C: the syntax can be a bit dense at times, but if somebody is careful not to code like it’s the IOCCC, I think this could be manageable.
  • Perl: too many symbols; I don’t think that it would be easy to “hear” Perl code.
  • Python: the syntax is simple, but I wonder how the significant indentation would play here; for a sighted person, it’s quite evident which code is grouped together.
  • Lisp: Lisp code sometimes has a tendency to be more nested than code in other languages, so I don’t know if it’d be easy to follow. The regular syntax would probably help though.
  • Factor: an interesting one; being able to read left to right would probably be very helpful. Factor’s style guide strongly encourages writing extremely short words, which should be helpful as well.

I’d love to hear from blind or visually impaired programmers about their experiences. I’d also love to hear from other programmers about their blind (har har har!) conjecture.