Tag Archives: Java

Book: Java Generics and Collections

During preparations for a job interview I noticed that my understand of Java Generics had some holes in it; and during the interview itself it became clear that there were also a few holes in my knowledge about the corner cases of the Java Collections framework introduced with Java 5 and 6. So, after looking around a bit, which book could be more suiting than Java Generics and Collections?

Now, I wasn’t completely lost for either of the two, but they were both introduced after I had my Java programming course (in Java 1.4) and apparently since then there hasn’t been a situation requiring much more than your typical

List<String> strList = new ArrayList<String>();

and the like (perhaps there rarely are?). However, when I came across the follwing signature:

public static <T extends Comparable<? super T>> T max(
    Collection<? extends T> coll) {
    // iterate to find max ...
}

it was clear that there was something not so trivial going on; a nice little example of why this crooked specification is useful is given in Section 3.3, but funnily enough I find the use of keywords extends and super a bit counter-intuitive here — mostly likely because I’ve “grown up” with the formal <: notation instead.

Now, in general the chapters on generics are well-written and usefully sprinkled with illustrative examples. There could sometimes have been more technical explanations, and perhaps links to formal type theory (it was written by Philip Wadler after all), however, overall it’s a highly recommended read!

It even taught me a little peculiar thing about Java bytecode, namely that generics allows one to overload methods based on return type:

class Works {

    public static void main(String[] args) {
        Works o = new Works();
        int i = o.<Integer>foo();
        boolean b = o.<Boolean>foo();
    }

    public <T extends Integer> int foo() {
        return 0;
    }

    public <T extends Boolean> boolean foo() {
        return false;
    }

}

which wouldn’t compile without generics:

class DoesntWork {

    public static void main(String[] args) {
        DoesntWork o = new DoesntWork();
        int i = o.foo();
        boolean b = o.foo();
    }

    public int foo() {
        return 0;
    }

    public boolean foo() {
        return false;
    }

}

since the compiler doesn’t know which method to pick in main. However, the bytecode calls a method by looking up in an index table, and with generics the compiler has enough information to differentiate these.

For the second part of the book, on the Collection framework, the book disappoints a bit: it covers most classes and uses, but few details are given and there’s quite a bit of fluff. It gave me the overview I needed, but not much more than going over the online Java documentation.

Book: Programming Interviews Exposed

In my preparation for potentially taking on a general software developer role, Programming Interviews Exposed was recommended to me on several occasions and hence gave it a shot. It’s not too bad a read, and would most likely recommended it in the future, despite is having a few glitches; at the very least, it serves its purpose as a good source of concrete problems to solve.

Overall the book has the decent “Wrox feeling” I remember from way back in the days of secondary school reading Professional ASP.NET and Professional C#, and later Beginning Visual C++ 6 in the early university days (fond memories of a world opening up): on the surface the paper feels the same, even a few pages with faded print, and the contents is the same easy-to-read-yet-with-good-substance. It contains some good non-technical advise as well, so for its price I don’t see why anyone looking for a job in this area (and perhaps a bit rusty in the practical areas of computer science) shouldn’t pick up a copy.

The trees do not grow all the way into Heaven though, as the Danish saying goes, and there are a few place with room for improvement; for the interested here’s my two cents.

First of all, the writing follows the idea that the reader could have come up with every step of reasoning the authors take by just thinking the right way (and not letting much room for the use of prior knowledge). When coming up with a proposal, this approach may seem a bit strange yet it still teaches you the tricks. However, in analysing a solution they often skip arguments that could potentially teach you something. As a concrete example, the two-pointer solution for detecting cycles in a linked list (page 60 in the third edition) seems more like a useful trick to know than something I would have come up with in an interview situation; fair enough I’ve now learnt it. However, they give no argument to the claim that it has a linear running time; while I may not see the obvious, coming up with a proof was a useful exercise, taking me a car ride and the use of modular arithmetic to find. In short, you’ll need to look elsewhere for brushing up on proving and analysing algorithms (go Cormen et al.!).

Secondly, the given solutions sometimes seem to ignore standard mathematical thinking, and simply brute force their way through, in turn making it more confusing (at least to me) than it has to be. One concrete example is the problem of generating all combinations of characters in a string (page 116 in the third edition), which has a simple mathematical formulation in the form of the power set that also leads to an easy recursive implementation. Yet they do not follow nor mention this approach, instead doing a lot of hand-waving to end up with something that works but leaves me less certain about its correctness (their solution leaves out the empty string for instance, which may or may not be an error, but at least we know when the goal is a well-known concept such as the power set).

Thirdly, they do not always follow the official Java code conventions, which honestly seems a bit silly after several companies have highlighted the importance of this during interviews. For instance, the following code (page 93 in the third edition):

public void foo(String str) {
    int i, length;
    Character c;

    length = str.length();
    for (i = 0; i < length; i++) {
        c = str.charAt(i);
        // do first scan ...
    }
    for (i = 0; i < length; i++) {
        c = str.charAt(i);
        // do second scan ...
    }
}

goes against both the advise on initialisation at the point of declaration (length) and declaring local indices within their for-loop (i). Although a bit contradictory, it also goes against the advise given in Effective Java (Item 45 in the second edition) of not declaring a variable before use (c). To me, rewriting the code to this:

public void foo(String str) {
    int length = str.length();
    for (int i = 0; i < length; i++) {
        Character c = str.charAt(i);
        // do first scan ...
    }
    for (int i = 0; i < length; i++) {
        Character c = str.charAt(i);
        // do second scan ...
    }
}

seems both safer and easier to follow.

That was a lot of bashing actually; let me just make sure to express that overall I enjoyed reading the book, and almost all problems in it are interesting and with an entertaining side; of the latter I especially liked the pancake sorting algorithm. So, getting a copy is not a bad idea!

If curious, my programming solutions are available at the progin project on GitHub.

Book: Java Puzzlers

It’s always a bit embarrassing to acknowledge, but Java Puzzlers: Traps, Pitfalls, and Corner Cases is one of the books on my bookshelf that has remained largely unread for quite a while (since 2006 in fact; I have a habit of writing the acquisition date inside the cover). Yet this week I finally finished it, and for a book on programming languages it’s actually a pretty good read. Here’s a few off the top of my head reasons.

Most importantly of course is that it’s educational. While a few of the puzzles are indeed corner cases that I suspect most will only rarely if ever encounter (such as the weird behaviour that may arise in floating point arithmetic), most of the material seems likely to come in handy from time to time when a bug creeps in. And not only for the simple bug: on a few occasions were the bugs I found in the puzzles not really bugs at all, with the real problem being something a lot more profound.

Fortunately, while the depth of the material sometimes makes the puzzle format of the book seem a bit quaint (for instance when the bug lies deep in the Java framework), the book also contains an elaborate overview/index in the back so that it’s also useful for those of us without photographic memory.

Which brings me to the final point, namely that the presentation is also quite pleasant: the font is easy to read, the writing is concise, and — most importantly — the code snippets are nicely formatted, to the point, and without clutter (to see how much impact this can have just open the horrible Thinking in Java). Due to the puzzle/solution nature of the book some blank space had to occur, but even this is nicely filled, with visual illusions.

Kalami: Parsing

Next up in the presentation of the little toy language Kalami is parsing. I remember how we used SableCC during my undergraduate years, and I also remember how we simply couldn’t make it pretty no matter what measure we used! Perhaps it can be made better in the Java environment than we ever managed to, but after I saw how it is done in the Caml environment I never looked back: all of a sudden it became fun to write parsers and interpreters!

So, parsing is basically about how to turn a input text string into an in-memory representation in the form of an abstract syntax tree. I don’t want to go into a lot of background but a good source for slightly more details is Chapter 12 in the Ocaml Manual.

The files needed can be found in the ZIP file kalami-parsing.zip; out of these the most important are structure.ml, lexer.mll, and parser.mly.

Starting with structure.ml we first have a piece of Caml code defining what we would like the abstract syntax tree of a Kalami programme to look like:

type identifier = string
type number = int
type bound = int

type expression =
	ExprNumber of number
	| ExprVariable of identifier
	| ExprProduct of expression * expression
	| ExprDivision of expression * expression
	| ExprSum of expression * expression
	| ExprSubtraction of expression * expression
	| ExprPower of expression * expression

type condition =
	CndTrue
	| CndFalse
	| CndEqual of expression * expression
	| CndLessthan of expression * expression
	| CndNegation of condition
	| CndOr of condition * condition
	| CndAnd of condition * condition

type statement =
	StmtLet of identifier * expression * statement
	| StmtIf of condition * statement * statement
	| StmtGuess of identifier * statement
	| StmtGuessLowerBound of identifier * ...
	| StmtGuessLowerUpperBound of identifier * ...
	| StmtAccept
	| StmtReject

Essentially, the above code defines expressions, conditions, and statements as labelled tuples, such that when we for instance write “4*5” in Kalami then this can be represented in memory as expression ExprProduct(ExprNumber(4), ExprNumber(5)). We shall later see that in the end a Kalami programme is simply a statement.

By the way, note the elegance with which we can express an abstract syntax tree in Caml: were we to express the same in Java it would look a lot less intuitive, at least in my experience. This is one of the great strengths of Caml, of which there are plenty more when we start evaluating and analysing the abstract syntax tree in later posts.

Next up is the lexer responsible for the first layer of abstraction of the input, namely turning the raw string into a stream of tokens. Looking in lexer.mll we see that several regular expressions are used to abstract strings, from the very basic labelling of characters and keywords:

	| '('				{ LPAREN }
	| ')'				{ RPAREN }
	| '+'				{ PLUS }
	| '-'				{ MINUS }

	| "let"				{ LET }
	| "in"				{ IN }
	| "if"				{ IF }
	| "then"			{ THEN }

to the slightly more advanced regular expressions:

	| (['0'-'9']+ as i)		{ INT(int_of_string i) }

that also includes a native Caml instruction for turning a string into a number.

Besides this abstraction of strings as tokens not much happens in the lexer, perhaps apart from the little extra code needed for allowing nested comments, and the insistence that STRs, and in turn identifiers, start with a digit (an easy way to distinguish them from INTs).

However, in parser.mly we may use these tokens to put together a grammar. In particular we see from rule main that an input (now a string of tokens) is a Kalami programme if it can be parsed according to rule “stmt” and its sub-rules as given by:

id:
	STR				{ $1 }
;

expr:
	INT				{ ExprNumber($1) }
	| id				{ ExprVariable($1) }
	| expr MULT expr		{ ExprProduct($1, $3) }
	| expr DIV expr			{ ExprDivision($1, $3) }
	| expr PLUS expr		{ ExprSum($1, $3) }
	| expr MINUS expr		{ ExprSubtraction($1...) }
	| expr EXP expr			{ ExprPower($1, $3) }
	| LPAREN expr RPAREN		{ $2 }
;

cnd:
	expr EQ expr			{ CndEqual($1, $3) }
	| expr LT expr			{ CndLessthan($1, $3) }
	| NOT cnd			{ CndNegation($2) }
	| cnd OR cnd			{ CndOr($1, $3) }
	| cnd AND cnd			{ CndAnd($1, $3) }
;

stmt:
	LET id ASSIGN expr IN stmt	{ StmtLet($2, $4, $6) }
	| IF cnd THEN stmt OTHERWI...	{ StmtIf($2, $4, $6) }
	| GUESS id IN stmt		{ StmtGuess($2, $4) }
	| GUESS id FROM expr IN ...	{ StmtGuessLowerBo... }
	| GUESS id FROM expr TO ...	{ StmtGuessLowerUp... }
	| ACCEPT			{ StmtAccept }
	| REJECT			{ StmtReject }
;

Besides these rules the file also contains instructions defining the precedence of for example the mathematical operations as well as their associativity, as needed to help the parser solve ambiguity.

Having looked at the primary files for the parser we turn to the plumbing needed to put them together. File printing.ml simply takes a statement and recursively prints it on the screen, and file kalamiparsing.ml contains Caml code that reads the input, invokes the lexer and the parser, and prints the statement in case no error occurred:

open Printing

let _ =
	try
		let lexbuf = Lexing.from_channel stdin in
		let stmt = Parser.main Lexer.token lexbuf in

		print_string "\n*** Statement ***\n\n";
		print_stmt stmt;
		print_string "\n\n"

	with 
		Lexer.Eof ->
			print_string "End of file\n";
			exit 0
		| Parsing.Parse_error ->
			print_string "Parsing error\n";
			exit 1

The only thing missing now is how to compile everything. This is not completely straight-forward since the lexer and the parser are referencing each other, and as a result we must execute the compilation commands in the specific order as seen in file compile.sh. Doing so will produce the kalamiparsing binary which we can tell to either load a programme from a file through

./kalamiparsing < "inputfile"

or run “interactively” by simply executing

./kalamiparsing

and typing a programme directly into it (ending with an EOF signal by pressing control + d).

In the next post we’ll look at how we may evaluate an in-memory Kalami programme in order to find a satisfying guess; as mentioned previously this requires something slightly more involved than a trivial brute-force.