A couple of weeks ago, I was asked at work if I was interested in transferring from the infrastructure team to the development team. I wasn’t sure at the time, so I asked if I could check out the nature of the work before making my choice. I was given a book on the language used and a Word document that explained the different style guidelines.
The programming environment was HyperScript for JSheet. Most of you have probably never heard of it, and you can be thankful for that: it’s certainly the suckiest development environment and language I have ever seen. However, I’m not here to talk about it today; I’ll do that in another post at a later time. For now, I want to focus on two elements of the guidelines that would’ve driven me nuts if I had accepted the position.
Do not use single character variables
The first guideline was that you could not use single-character variable names. That sounds mostly reasonable, since some people go nuts with variable names nobody can understand. However, it was taken to the extreme here. In most, if not all, programming languages, when you need to loop over an array using indexes, you use i when you have one dimension, i and j when you have two dimensions, i, j and k when you have three dimensions, and you rarely need to go beyond that. I know it’s used in Ruby, Python, C, C++, Java, Forth, Smalltalk, Lisp, etc. The guidelines, however, strictly prohibited this. You had to call your index variable i_index. When you had more than one dimension, you had to use i_something_meaningful_in_the_array.
This seriously hurts readability, makes the code horizontally longer and brings no benefit. Because i, j and k are so ubiquitous in other programming languages, using them would actually make the code more readable.
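To make the comparison concrete, here is a minimal sketch in Python rather than HyperScript. The function names, the sample grid and the exact index names i_row_index and i_column_index are my own assumptions about what the guideline would have produced; the document only gave the general pattern.

```python
def total_short(grid):
    """Sum a 2-D list using the conventional i and j indexes."""
    total = 0
    for i in range(len(grid)):
        for j in range(len(grid[i])):
            total += grid[i][j]
    return total


def total_guideline(grid):
    """Same sum, with index names in the style the guidelines asked for.

    The names i_row_index and i_column_index are hypothetical.
    """
    total = 0
    for i_row_index in range(len(grid)):
        for i_column_index in range(len(grid[i_row_index])):
            total += grid[i_row_index][i_column_index]
    return total


grid = [[1, 2, 3], [4, 5, 6]]
print(total_short(grid))      # 21
print(total_guideline(grid))  # 21
```

Both functions do exactly the same thing; the only difference is how far the inner line stretches to the right.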
Use Hungarian notation for all variables
The second, and most annoying, thing in the guidelines was that all variables were to be named in a Hungarian style. For those of you who don’t know what that means, it’s a naming convention where you put relevant information about the variable (most often, its type) in the variable name itself. Some examples would be floatAverage, arrayEmployees or stringFirstName. The reasoning behind this naming scheme, as far as I can understand, is to give the programmer more information about the variable, so that it’s easier to follow a program and to jump right into it.
Some programmers may agree with that; I don’t. I find that this extra information clutters the names and makes the code harder to read: instead of reading it like you’d read instructions, you are constantly reminded of implementation details.
The conventions I would’ve had to follow were these:
g_ : global variable
p_ : function parameter
a_ : array
i_ : integer
f_ : float
c_ : char
s_ : string
h_ : constant
dt_ : DateTime
Nice, eh? Well it gets even better: these are pluggable together. Yes, yes, so if you have a global array of employee names, the name of the variable would be gas_employee_names. Maybe it’s just a question of habit, but it seems that when I read code like that, my mind stops to analyse the meaning of ‘gas’. “Oh right, that’s a global array of strings.” I don’t believe that’s a good practice.
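Here is roughly the decoding my brain has to do every time, sketched in Python. The prefix table is copied from the list above; the decode_prefix function and the rule that prefixes are simply concatenated before the first underscore are my own assumptions for illustration, not something the guidelines spelled out.

```python
# Prefix table copied from the guideline list above.
PREFIXES = {
    "g": "global variable",
    "p": "function parameter",
    "a": "array",
    "i": "integer",
    "f": "float",
    "c": "char",
    "s": "string",
    "h": "constant",
    "dt": "DateTime",
}


def decode_prefix(name):
    """Split a name like 'gas_employee_names' into prefix meanings and the rest.

    Hypothetical helper: assumes everything before the first underscore
    is a run of prefixes taken from PREFIXES.
    """
    prefix, _, rest = name.partition("_")
    meanings = []
    pos = 0
    while pos < len(prefix):
        # Try the two-letter prefix ("dt") before the one-letter ones.
        for length in (2, 1):
            chunk = prefix[pos:pos + length]
            if chunk in PREFIXES:
                meanings.append(PREFIXES[chunk])
                pos += len(chunk)
                break
        else:
            raise ValueError(f"unknown prefix in {name!r}")
    return meanings, rest


print(decode_prefix("gas_employee_names"))
# (['global variable', 'array', 'string'], 'employee_names')
```

That is a lookup table and a loop just to read one variable name, and the reader has to run it in their head.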
A while ago, Blaine Buxton, a Ruby and Smalltalk programmer, wrote a blog entry where he mentioned that he used a thesaurus to find the best word to describe an object. I think that carefully choosing names wins over Hungarian notation. For instance, if you have a variable named as_first_names, why don’t you just lose the “as_” prefix? First, it can be mistaken for the word “as”, and second, it’s completely unneeded here: first_names would indicate more than one first name, and I don’t think anyone is named 17.3 or -47, so it’s safe to assume that you’re dealing with strings. The same goes for salaries: it’s pretty safe to assume that it’s a collection of floats, don’t you think?
There’s also the question of when you stop. In Smalltalk, for instance, there is a large set of collection classes which can contain any sort of object. What do you call an instance variable holding an ordered collection of DateTime objects? ivOcDtEmployeePunchTimes? That name looks like somebody was fighting a spider before typing “EmployeePunchTimes”. And with the number of classes in Smalltalk, you may end up confusing the prefixes: does “Dt” stand for DateTime or for DispatchTeam?
Hungarian notation is also a problem when you decide to change the implementation of your program. For example, say you decide to use a different kind of number in an array and replace integers with floats. You then need to change all instances of gai_variable to gaf_variable. Search and replace makes that pretty easy, but it’s still an extra operation. Had the name been chosen to evoke an array of numbers without specifying what kind of numbers, that extra operation would be unnecessary.
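A small sketch of that situation, again in Python. The scores data and the average function are made up for the example; only the gai_ and gaf_ prefixes come from the guidelines.

```python
def average(scores):
    """Average a list of numbers; the name says nothing about int vs. float."""
    return sum(scores) / len(scores)


# Hungarian style: the data starts out as a global array of integers.
gai_scores = [70, 80, 90]
print(average(gai_scores))

# After switching to floats, the old name would lie, so every use has to
# be renamed from gai_scores to gaf_scores.
gaf_scores = [70.5, 80.5, 89.0]
print(average(gaf_scores))

# Neutral name: the same variable survives the change of number type.
scores = [70, 80, 90]
scores = [70.5, 80.5, 89.0]
print(average(scores))
```

With the neutral name, only the data changes. With the prefixed name, the change ripples through every line that mentions the variable.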
In the book “Smalltalk Best Practice Patterns”, Kent Beck talks extensively about choosing good names to communicate your intentions clearly and about making sure that your code flows when it is read. Leo Brodie has similar thoughts in “Thinking Forth”. My opinion is that every time you read a variable name in Hungarian notation, you switch from thinking about the solution to thinking about the implementation of the solution.
By the way, I am not going to be programming in HyperScript for JSheet. This means that in a week and a half I will have no job, but I think I’m happier being unemployed than having a job I would hate.