Industry

From Academia to Industry

So, now that the year is coming to an end and I’ve been in my first “real job” for about half a year or so, I figured it was about time to write a few lines about what happens when you enter the software industry as a full-time software developer, coming rom days spent reading papers, attending seminars, teaching, and trying to make proofs go through.

Well, there has obviously been a lot of catching up to do. Getting used to the Android ecosystem is one thing (which is understated given my rather limited prior knowledge), but also a refresher of the basic skills and an update on where the industry has been going over the last 5-6 years — since I already have my nose in Android a lot these days I’m only going to mention here that 1) I like the upgrade from iOS and am not missing the latter at all, and 2) I’m surprised how much guessing is involved in developing for this system due to inadequate, or at least uneasy to find, documentation.

The refreshing part has been quite enjoyable, and involved going through the CLRS algorithms book again, re-finding the beauty in software architecture and general design patterns, and covering concrete Java stuff such as best practices and its concurrency library.

At the same time it was also good to catching up on where the industry was gone. There were some things that I didn’t see coming at all (JavaScript on the server for instance), and some things that I was particular happy to see, such as functional languages (e.g. OCaml and Haskell) growing in popularity and influencing the mainstream languages. Finally having the time to go deeper in the NoSQL databases was good, and again happy to see that some of the more theoretical concepts are making their way into industry such as graph databases (agreed, relational database has relational algebra, but never found it overwhelmingly exciting).

All of this was before getting a job though, and after starting it’s been incredible nice to actually have some energy/time/conscience left in the evenings to look into other interesting things — and since this is pure pleasure I’ve been taking a breadth-first approach opening several books at the same time:

Besides software-related topics I also finally had the time to get a non-algorithmic point of view on game theory, including concrete games and mechanism design. And finally, getting involved in business, management, and interviewing processes at work has also been very interesting, admittedly still having to draw a lot from teaching and early university group work.

Book: Programming Pearls

Another book I came across when refreshing the more practical computer science skills was Jon Bentley’s Programming Pearls (second edition), composed of columns — blog posts, I suppose, by today’s terminology — written by him for the ACM back in the 1980s. And while it is somewhat outdated in terms of technology, the strength of this little book lies in something entirely more timeless.

The book covers several topic within software construction, including programme verification, testing, and code tuning, but most pages concern the construction of good algorithms to solve a given problem. Incidentally, one of my favourite problems from the book was posed to me at a job interview, supporting the claim that the book is well-known within the software community, at least by senior staff (as a side-note, I mentioned that I’d already encountered the problem from reading this book, and funnily enough that almost seemed as good an answer as an algorithm).

The problem was this (section 2.4 and 2.8):

Given a dictionary of words, find all sets of anagrams.

The good thing is that the book not only poses interesting questions, it also discusses several solutions, comparing them in terms of time and space efficiency. For the anagram problem above there’s the brute force approach as usual, yet as soon as you remember equivalence classes and their representatives you’re close to a much better algorithm.

Another favourite was presented in chapter 8:

Given a array of n integers (positive and negative), find a maximum sum contiguous subarray.

For instance, in [ 31, -41, 59, 26, -53, 58, 97, -93, -23, 84 ] one solution is [ 59, 26, -53, 58, 97 ] with sum 187, obtained by removing the first two and last three elements. Again there’s the obvious brute force approach — e.g. computing the sum for all C(n, 2) pairs of indices — but by a bit of dynamic programming it’s possible to reach an O(n) algorithm that only scans the array twice.

Now, as mentioned above, the book does not just talk about algorithms but through various topics aims to improve your overall programming skills. And for that I recommend it highly: it will probably not teach you a new technology but it might improve the quality of your code.

Which brings me to a pattern I’ve noticed over the years: my technology books usually end up somewhere in the basement, mostly kept for nostalgic reasons, whereas books like this one, focusing on the fundamentals, are much more likely to still sit high on my office bookshelf, and as such still attractive to buy in dead-wood format.

The Power of Pressure

The other day I was, relatively speaking, very close to landing a good job in the software industry. However, during the last 5% of the interview process pressure kicked in and my mental processes collapsed, leaving me starring at the white board unable it seemed, to even spell my name.

At this point it had already been a surprisingly long recruiting process, consisting of more than 8 hours of problem solving interviews. I’d had a break for the weekend, and was now called in for a final interview with the big boss. This was of course an indication that they were considering me seriously, however the recruiting agent had let me know that a few of the interviewers had expressed a wish that I’d been a bit faster at coming up with solutions.

The first half hour with the big boss went well, focusing more on me being a cultural fit than technical skill. But then, for the last half hour he suddenly asked me another problem solving question: implement method transposeMatrix(int[] matrix, int n) where the matrix is assumed to be of size n by n represented by a one-dimensional array, and with only a constant memory overhead (so no allocation of another array).

I quickly realised how the algorithm would work, using two pointers running respectively out of the rows and down the columns. However, when it came to writing the code I froze up — and not because it’s difficult code, but because I knew that me getting the job or not most likely came down to these particular 30 minutes. As a result, I got self-conscious and my focus started drifting away from the white board towards the big boss looking over my shoulder.

The outcome was that I turned completely blank, fighting to bring back focus where it was desperately needed. But it didn’t happen until after 20 minutes of nonsense, when I got a 5 minute break while he left to pick up a new whiteboard marker. When he came back my focus was there already and the solution came quickly:

void transposeMatrix(int[] matrix, int n) {
    for (int row = 0; row < n-1; row++) {
        for (int col = row+1; col < n; col++) {
            int i = row * n + col;
            int j = col * n + row;
            // swap matrix[i] and matrix[j]
        }
    }
}

yet it was too late; given the performance I illustrated that day I wouldn’t have hired myself, so in full understanding of their decision.

The frustrating part instead, is that this can still happen to me when I want something badly (it never happens when I don’t care). But what’s to learn from this? Well, I’m better at problem solving with a piece of paper in front of me and 5 minutes of “quite time”; I’ve been hesitating to take this at interviews for a fear of disrupting the flow, but it seems it can come to a point where it’s worth it. Also, I’ve started on ways of controlling nerves by basically caring less about the surroundings; since this is rooted in a current insecurity (from being unemployed?) it is easy to practise in everyday life.

Book: Programming Interviews Exposed

In my preparation for potentially taking on a general software developer role, Programming Interviews Exposed was recommended to me on several occasions and hence gave it a shot. It’s not too bad a read, and would most likely recommended it in the future, despite is having a few glitches; at the very least, it serves its purpose as a good source of concrete problems to solve.

Overall the book has the decent “Wrox feeling” I remember from way back in the days of secondary school reading Professional ASP.NET and Professional C#, and later Beginning Visual C++ 6 in the early university days (fond memories of a world opening up): on the surface the paper feels the same, even a few pages with faded print, and the contents is the same easy-to-read-yet-with-good-substance. It contains some good non-technical advise as well, so for its price I don’t see why anyone looking for a job in this area (and perhaps a bit rusty in the practical areas of computer science) shouldn’t pick up a copy.

The trees do not grow all the way into Heaven though, as the Danish saying goes, and there are a few place with room for improvement; for the interested here’s my two cents.

First of all, the writing follows the idea that the reader could have come up with every step of reasoning the authors take by just thinking the right way (and not letting much room for the use of prior knowledge). When coming up with a proposal, this approach may seem a bit strange yet it still teaches you the tricks. However, in analysing a solution they often skip arguments that could potentially teach you something. As a concrete example, the two-pointer solution for detecting cycles in a linked list (page 60 in the third edition) seems more like a useful trick to know than something I would have come up with in an interview situation; fair enough I’ve now learnt it. However, they give no argument to the claim that it has a linear running time; while I may not see the obvious, coming up with a proof was a useful exercise, taking me a car ride and the use of modular arithmetic to find. In short, you’ll need to look elsewhere for brushing up on proving and analysing algorithms (go Cormen et al.!).

Secondly, the given solutions sometimes seem to ignore standard mathematical thinking, and simply brute force their way through, in turn making it more confusing (at least to me) than it has to be. One concrete example is the problem of generating all combinations of characters in a string (page 116 in the third edition), which has a simple mathematical formulation in the form of the power set that also leads to an easy recursive implementation. Yet they do not follow nor mention this approach, instead doing a lot of hand-waving to end up with something that works but leaves me less certain about its correctness (their solution leaves out the empty string for instance, which may or may not be an error, but at least we know when the goal is a well-known concept such as the power set).

Thirdly, they do not always follow the official Java code conventions, which honestly seems a bit silly after several companies have highlighted the importance of this during interviews. For instance, the following code (page 93 in the third edition):

public void foo(String str) {
    int i, length;
    Character c;

    length = str.length();
    for (i = 0; i < length; i++) {
        c = str.charAt(i);
        // do first scan ...
    }
    for (i = 0; i < length; i++) {
        c = str.charAt(i);
        // do second scan ...
    }
}

goes against both the advise on initialisation at the point of declaration (length) and declaring local indices within their for-loop (i). Although a bit contradictory, it also goes against the advise given in Effective Java (Item 45 in the second edition) of not declaring a variable before use (c). To me, rewriting the code to this:

public void foo(String str) {
    int length = str.length();
    for (int i = 0; i < length; i++) {
        Character c = str.charAt(i);
        // do first scan ...
    }
    for (int i = 0; i < length; i++) {
        Character c = str.charAt(i);
        // do second scan ...
    }
}

seems both safer and easier to follow.

That was a lot of bashing actually; let me just make sure to express that overall I enjoyed reading the book, and almost all problems in it are interesting and with an entertaining side; of the latter I especially liked the pancake sorting algorithm. So, getting a copy is not a bad idea!

If curious, my programming solutions are available at the progin project on GitHub.

The Ten Minutes After Enlightenment

I had a job interview earlier today, which didn’t go completely awful, yet didn’t go terribly well either. During the interview I only found a simple solution to one of the problems and then the ten-minutes-after-enlightenment happened: what you should have said pops up in your head just after the interview is over. Sometimes it may take a book or an internet search for this to happen, but in these cases it seems to be because of nerves settling down and the brain starting to work again, fortunately. Why? Well perhaps a methodology can help, by extending the offline preparation phase a bit.

What happened? Well, I was asked to come up with an algorithm for computing a certain result, and the only viable thing that came to mind was a brute force method (not completely unexpected by the way, as important exams have earlier been seen to make my brain strike and confusion taking over — one might assume it to be different after 18+ years in the educational systems, yet here we are). This is obviously not the best solution, but when nothing else presents itself I’ve found earlier that coding this naive approach often leads to an insight that in turn yields an improvement, sometimes simply in the form of a better understanding of the problem and the objective.

So, I started coding the naive solution, hoping something better would show up along the way. Unfortunately it didn’t, and I even managed to mess up the coding despite the fact that I’ve used the same programming pattern numerous times, not least in my functional programming. At that point desperation kicked in, a dark well opened up in the ground, and the universe decided to test my ability to catch helping hands while tumbling towards a deep bottom. The usual stuff. Fortunately though, my interviewer was very kind and basically put out a platform for me to fall on. I made it up the well again, and by then time ran out and the interview ended (with me having goosebumps, a mild shaking, and the urge to play the guitar loudly).

And then it happens. Less than ten minutes later, after having first summoned a ringing in my right ear and next looked out the window from the back of my armchair, a better idea that I wish we could have pursued occurred to me: a data structure could have helped (a radix tree in this case), at the very least by providing a space/time trade-off. But the interviewer is now long gone and this is for your enjoyment only.

This insight may of course have been just around the corner independently of the interview ending or not, yet my point here is that in my state at the time it seems unlikely: the problem, I suspect, lies more in not being in your best working state than in the problem being too difficult for the time frame.

So, how to improve? Given that it’s difficult to just stop being nervous, one thing I’ll try for the next one is to improve my methodology. More concretely, I’ll keep with me a list of the most commonly used data structures:

  • hash table
  • (doubly) linked list
  • heap
  • binary search tree
  • radix
  • etc.

and if needed, go over them asking How could the problem be solved using one of these? Similarly, I’ll try with a list of important problems (and good algorithms for them):

  • sorting
  • breadth-first search
  • max flow
  • etc.

in the hope that a suitable reduction could be enough. And last but not least, it may also pay off to keep a list of the most common programming techniques:

  • tail recursion
  • dynamic programming
  • greedy
  • etc.

in case nothing else yields.