A few months ago, I wrote a Python library to parse Starcraft replay files. For the past few weeks, I’ve been meaning to try writing a Clojure implementation and I finally got the time to start working on that this week. In this post, I’d like to show how I tackled the problem of reading binary data in what I find to be an elegant manner.
Because Clojure can use all the Java libraries, I figured it would be a good idea to use the java.nio package. A problem that quickly arose was that calling the methods manually was both ugly and unnecessarily verbose. Here’s an example of what it looked like:
(let [game-engine (.get buf)
      game-frames (.getInt buf)
      _ (.get buf (make-array Byte/TYPE 3))
      save-time (Date. (long (* 1000 (.getInt buf))))
      _ (.get buf (make-array Byte/TYPE 3))
      ; etc
      ])
The code wasn’t “symmetric” enough for my taste: different fields were read in visibly different ways, and fields that needed more than one byte/word/dword of data were just plain ugly (not shown in the code above). It was clear that I needed something more declarative. What I wanted was to list the different fields and give each one a name, a length, a data type and an optional function to apply to the value read (for example, to convert a long into a Date). So the first step I took was to write the code I wanted to have, and only then worry about making it work. Here’s how I can read the header data of a replay file:
(defn parse-headers
  [buf]
  (parse-buffer buf
                [:game-engine 1 :byte]
                [:game-frames 1 :dword]
                [nil 3 :byte]
                [:save-time 1 :dword #(Date. (long (* 1000 %)))]
                [nil 12 :byte]
                [:game-name 28 :string]
                [:map-width 1 :word]
                [:map-height 1 :word]
                [nil 16 :byte]
                [:creator-name 24 :string]
                [nil 1 :byte]
                [:map-name 26 :string]
                [nil 38 :byte]
                [:players-data 432 :byte parse-players-data]
                [:player-spot-color 8 :dword]
                [:player-spot-index 8 :byte]))
With this “specification” in hand, I wrote the code to make it work.
(defn read-field
  [buf n type]
  (letfn [(null-string [buf n]
            ;; Read a nul-terminated string. All n bytes are consumed;
            ;; the result stops at the first nul or at length n,
            ;; whichever comes first.
            (let [cs (doall (for [_ (range n)] (char (.get buf))))]
              (apply str (take-while #(not= % \u0000) cs))))
          (read-field-aux [n type]
            ;; Read n data and return it as a single value if n is 1,
            ;; as a vector otherwise.
            (let [f ({:byte (memfn get)
                      :word (memfn getShort)
                      :dword (memfn getInt)} type)
                  values (into [] (for [_ (range n)] (f buf)))]
              (if (= n 1)
                (first values)
                values)))]
    (cond (= type :string) (null-string buf n)
          (some #{type} [:byte :word :dword]) (read-field-aux n type))))
(defn parse-buffer
  "A v-form is a vector of the form: [:field-name length :type func?]
  Each v-form is read from buf and the whole data is returned as a map.
  If a field-name is nil, the data is not returned (but the field is
  read nonetheless to move forward in the buffer)."
  [buf & v-forms]
  (apply
   hash-map
   (mapcat (fn [[field-name size type func]]
             (let [data (read-field buf size type)]
               (if (nil? field-name)
                 nil
                 [field-name (if func
                               (func data)
                               data)])))
           v-forms)))
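To make the v-form format concrete, here is a small sketch that runs the same idea against a hand-built 12-byte buffer instead of a real replay. The buffer layout, the `sample-buffer` helper and the little-endian byte order are assumptions made for this illustration (a ByteBuffer is big-endian unless told otherwise). `read-field` and `parse-buffer` are repeated in condensed form only so the snippet runs on its own.

```clojure
(import '(java.nio ByteBuffer ByteOrder))

;; Condensed stand-ins for the functions above, so this runs standalone.
(defn read-field [^ByteBuffer buf n type]
  (if (= type :string)
    ;; Consume all n bytes, keep the characters before the first NUL.
    (let [cs (doall (for [_ (range n)] (char (.get buf))))]
      (apply str (take-while #(not= % \u0000) cs)))
    (let [f ({:byte  (fn [^ByteBuffer b] (.get b))
              :word  (fn [^ByteBuffer b] (.getShort b))
              :dword (fn [^ByteBuffer b] (.getInt b))} type)
          values (vec (for [_ (range n)] (f buf)))]
      (if (= n 1) (first values) values))))

(defn parse-buffer [buf & v-forms]
  (apply hash-map
         (mapcat (fn [[field-name size type func]]
                   (let [data (read-field buf size type)]
                     (when field-name
                       [field-name (if func (func data) data)])))
                 v-forms)))

(defn sample-buffer
  "Hypothetical test data: 1 byte, 1 little-endian dword, 1 padding byte,
  then a 6-byte NUL-padded ASCII string (12 bytes in total)."
  []
  (let [^ByteBuffer b (ByteBuffer/allocate 12)]
    (.order b ByteOrder/LITTLE_ENDIAN)
    (.put b (byte 1))
    (.putInt b (int 45149))
    (.put b (byte 0))
    (.put b (.getBytes "Sea\u0000\u0000\u0000" "US-ASCII"))
    (.flip b) ; switch the buffer from writing to reading
    b))

(def parsed
  (parse-buffer (sample-buffer)
                [:game-engine 1 :byte]
                [:game-frames 1 :dword]
                [nil          1 :byte]   ; skipped, but still consumed
                [:game-name   6 :string #(.toUpperCase ^String %)]))
```

Here `parsed` is a map with three entries: `:game-engine` is 1, `:game-frames` is 45149 and `:game-name` is "SEA". The padding byte is read but never appears in the result, and the optional fourth element of a v-form is applied to the raw value before it is stored.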
And here’s the data from a Starcraft replay file after it’s been read:
{:game-name "MBC_Sea[Shield]",
:game-engine 1,
:map-width 128,
:players-data ({:name "MBC_Sea[Shield]", :race "Terran", :player-number 0, :type :human, :slot-number 0}
{:name "", :race nil, :player-number 1, :type nil, :slot-number -1}
{:name "CJ sAviOr", :race "Zerg", :player-number 2, :type :human, :slot-number 1}
{:name "", :race nil, :player-number 3, :type nil, :slot-number -1}
{:name "", :race "Zerg", :player-number 4, :type nil, :slot-number -1}
{:name "", :race "Protoss", :player-number 5, :type nil, :slot-number -1}
{:name "", :race "Terran", :player-number 6, :type nil, :slot-number -1}
{:name "", :race "Zerg", :player-number 7, :type nil, :slot-number -1}
{:name "", :race "Zerg", :player-number 8, :type nil, :slot-number -1}
{:name "", :race "Zerg", :player-number 9, :type nil, :slot-number -1}
{:name "", :race "Zerg", :player-number 10, :type nil, :slot-number -1}
{:name "", :race "Zerg", :player-number 11, :type nil, :slot-number -1}),
:save-time #<Date Sat Jun 28 15:05:16 EDT 2008>,
:player-spot-index [1 1 1 1 0 0 0 0],
:game-frames 45149,
:player-spot-color [4 1 7 0 2 6 3 5],
:creator-name "MBC_Sea[Shield]",
:map-height 128,
:map-name "Andromeda 1.0"}
Posted by gnuvince