Just a quick note before we start: I opened a GitHub repository for the code of these articles. You can find it at this address: http://github.com/gnuvince/blog-comics/tree/master
Welcome to part four of our simple Clojure comics fetcher tutorial. Like last time, I will try to keep today’s article short and sweet. We’ll discuss error handling; specifically, we’ll see what to do when a site does not respond.
The first thing to do is to add a comic to *comics* that we know will fail. The great Perry Bible Fellowship web comic is no more, so we shall use it to test a web site that is down. Add the following to *comics* and try executing the script.
{:name "Perry Bible Fellowship"
 :url "http://www.pbfcomics.com"
 :prefix "http://www.pbfcomics.com/archive_b/"
 :regex #"PBF.+?\.gif"}
Unless pbfcomics.com is back up when you read this article, your program should hang indefinitely. By default, JVM sockets never time out. Add the following to your code to set a 5-second timeout for all connections:
(System/setProperty "sun.net.client.defaultConnectTimeout" "5000")
(System/setProperty "sun.net.client.defaultReadTimeout" "5000")
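Those two properties set defaults for the whole JVM. If you would rather set the timeout on one connection at a time, java.net.URLConnection has setConnectTimeout and setReadTimeout methods. Here is a minimal sketch of that approach; open-with-timeout is a made-up helper name, not something from the comics script, and the rest of this article keeps using the system properties.

```clojure
(import '(java.net URL URLConnection))

;; Hypothetical helper (not part of the comics script): prepares a
;; connection with the same timeout, in milliseconds, for connecting
;; and for reading. Nothing is sent over the network at this point.
(defn open-with-timeout
  [^String url-string timeout-ms]
  (let [^URLConnection conn (.openConnection (URL. url-string))]
    (doto conn
      (.setConnectTimeout (int timeout-ms))
      (.setReadTimeout (int timeout-ms)))))

;; The timeouts only kick in once the connection is actually used,
;; for example when calling .getInputStream on the returned object.
(def pbf-conn (open-with-timeout "http://www.pbfcomics.com" 5000))
```

With this variant, a slow site would still throw a SocketTimeoutException, just as it does with the global properties.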
Running the script again should give you a long ugly error message:
$ clj comics.clj
[ Penny-Arcade, We The Robots and Xkcd ]
Exception in thread "main" java.net.SocketTimeoutException: connect timed out (comics.clj:0)
[ Big stack trace ]
The program threw an exception when it could not connect to pbfcomics.com, but we never handled it. Let’s do so with try/catch. Change the doseq at the end of the script to this:
(doseq [comic *comics*]
  (try
    (println (:name comic) ":" (fetch-comic comic))
    (catch Exception e
      (println "Couldn't fetch" (:name comic) e))))
try will attempt to execute the forms under it, and if one throws an exception, try will look through its catch clauses, in order, to see if this particular exception is handled. A catch clause has the form (catch ExceptionName varname body*). A try block may have multiple catch clauses.
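To make the "multiple catch clauses" part concrete, here is a small sketch that is separate from the comics script. describe-failure is a hypothetical function made up for this illustration; it calls whatever function it is given and reports which clause handled the exception. The clauses are tried from top to bottom, so the most specific exception class has to come first (SocketTimeoutException is a subclass of IOException).

```clojure
;; Hypothetical helper for illustration only: calls f and returns either
;; its result or a string describing which catch clause handled the failure.
(defn describe-failure [f]
  (try
    (f)
    ;; Most specific class first; a SocketTimeoutException is also an IOException.
    (catch java.net.SocketTimeoutException e
      (str "timed out: " (.getMessage e)))
    (catch java.io.IOException e
      (str "I/O error: " (.getMessage e)))
    ;; Fallback for everything else.
    (catch Exception e
      (str "unexpected error: " (.getMessage e)))))

(describe-failure #(throw (java.net.SocketTimeoutException. "connect timed out")))
;; => "timed out: connect timed out"

(describe-failure #(throw (java.io.IOException. "stream closed")))
;; => "I/O error: stream closed"
```

If the IOException clause came first, it would also swallow the timeouts and the more specific clause would never run.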
Here, we catch Exception, which means that all exceptions, whatever they may be, will be handled the same way. It’s not the most “proper” thing to do, but that’ll do for now.
Running the script again, you should see the following when you get to PBF.
Couldn't fetch Perry Bible Fellowship #<SocketTimeoutException java.net.SocketTimeoutException: connect timed out>
When we start outputting XML, we’ll simply skip comics that couldn’t be fetched.
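As a rough preview of what "skipping" could look like, here is one possible sketch. It is not the code the next article will necessarily use: try-fetch, fetched-comics, the :image key and fake-fetch are all names invented for this example, and fake-fetch merely stands in for fetch-comic so that the snippet runs without touching the network.

```clojure
;; Hypothetical helper: returns the result of fetch-fn, or nil if it throws.
(defn try-fetch [fetch-fn comic]
  (try
    (fetch-fn comic)
    (catch Exception _
      nil)))

;; Hypothetical helper: keeps only the comics that could be fetched and
;; stores the result under an :image key (an assumed name).
(defn fetched-comics [fetch-fn comics]
  (for [comic comics
        :let [image (try-fetch fetch-fn comic)]
        :when image]
    (assoc comic :image image)))

;; Stand-in for fetch-comic: fails for PBF, "succeeds" for everything else.
(defn fake-fetch [comic]
  (if (= (:name comic) "Perry Bible Fellowship")
    (throw (java.net.SocketTimeoutException. "connect timed out"))
    (str (:name comic) ".png")))

(fetched-comics fake-fetch [{:name "Xkcd"} {:name "Perry Bible Fellowship"}])
;; => ({:name "Xkcd", :image "Xkcd.png"})
```

Returning nil on failure keeps the calling code simple, at the cost of throwing away the reason for the failure; that trade-off seems acceptable if all we want is to leave the comic out of the output.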
Posted by gnuvince