Monday, June 23, 2014

Designing an Access Control System

Designing the Proper Thing

One thing I've learned in my career is that in order to build something that will outlast its original intention, you need to boil the idea down to a fundamental use case and design the interface to your system around that. A clean interface, whether that's a user interface or an API, makes all the difference. It's really the Unix philosophy of "do one thing and do it well". Once you have that clean, well-defined interface, the actual implementation can be redone or built upon as needed without forcing consumers of your code to change anything to compensate. And not making users redesign around your short-sightedness goes a long way toward making them want to use your software.

Shortly after I started at Liquid Web, I was tasked with building a system that would let us restrict access to portions of our internal applications to just the employees whose jobs required it. The idea is common, and there are systems out there designed to handle similar workloads, LDAP being the most notable. However, what I found while researching options was that many of them foisted the bulk of the actual restriction effort onto the user interface, which I found problematic from a maintenance perspective.

Imagine you're writing an interface, and you want to selectively hide parts of the UI so people who can't do those actions aren't distracted by them being there. You can go about that in a few ways. For a group-based approach, you can say "Is Joe a member of the flibbertygibbit group? Ok, he sees the flibbertygibbit widget". For a role-based approach, the question becomes "Is Nancy a SuperHeadAdminPerson? Ok, she can see the 'delete user' button". In either case, what happens when the organization restructures itself and suddenly the flibbertygibbit group has all new responsibilities that don't include the flibbertygibbit widget? What if a new level is inserted above SuperHeadAdminPerson (SuperDuperHeadAdminPerson), and SuperHeadAdminPerson is no longer allowed to delete users? Why, you get to wait until the developers have time to retool the entire interface so people stop seeing the wrong things. That's just bad voodoo, so I wasn't too keen on taking that approach. I had to find a better way.

After much pondering, some caffeine, probably a nap or two, and much wasted time on Reddit, I found a light bulb and turned it on. What access control boils down to, after you strip away all the groups and roles and everything else, is: "Can this person do this action?". It's deceptively simple. Can Jimbob give this customer a $1 billion credit? Can Suzie delete this SuperUser account? Can Dr Evil get sharks with friggin laser beams on their heads? It's really that simple. So, given that simple idea, could we design a system around the question "Can $user do $action?". I should note at this point that LDAP could do that at the time, but the way it did it wasn't very natural, and it wouldn't scale to the thousands of possible 'actions' we were envisioning. It's possible that has since changed, but I haven't had to revisit it, so I don't know. So we went about designing a system that would answer this question quickly, frequently, and with maximum flexibility.

Designing the Thing Proper

They say when you have a nail, you want to hit it with a hammer. Wait, no, what was it again? I can't remember. Anyway, when we went to design the system, we opted to use a) Perl and b) Postgres, because both are excellent tools for building performant, scalable systems, and they just happened to be what everything else at the company was using (except for some crappy legacy systems using MySQL). Oh, and I forgot one other critical piece: memcached. Now we had a stew going.

Actually, not quite yet. We didn't want a situation where we had to flip a toggle for every user and every possible action; that just wouldn't scale. Thousands of users times potentially tens of thousands of actions is begging to grind to a halt. I was wishing for some nice hierarchical data structure where we could define blanket permit/deny statements, then make more granular exceptions to them, like "Jimbob can do anything to an account... except delete it". While I was pondering this, my boss kindly walked over and yelled something in my ear. "LTREE, you idiot", he said. His version of reality may vary on that. If you don't know ltree, go check it out; it's amazing. It's basically exactly what I wanted. With that tidbit, I was off to the races. Now I could set permissions:

jimbob CAN: Account
jimbob CAN'T: Account.Delete

And then when I ask the question:

Can jimbob create an Account? - yes!
Can jimbob update an Account? - yes!
Can jimbob delete an Account? - no!

But I didn't have to specifically tell the system about Account.Create and Account.Update, because jimbob can already do Account (there's an implicit .*, so Account = Account.*).
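To make that concrete, here's a minimal sketch of the idea in Perl and SQL. The table layout, column names, database name, and the can_do function are all invented for illustration; the real system was considerably more involved. The key bit is ltree's @> operator ("is ancestor of or equal to") combined with nlevel(), so the deepest rule at or above the requested action wins:

use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=access', '', '', { RaiseError => 1 });

# Assumes: CREATE EXTENSION ltree;
#          CREATE TABLE permission (user_name text, action ltree, allowed boolean);
# with rows like ('jimbob', 'Account', true) and ('jimbob', 'Account.Delete', false).
sub can_do {
    my ($user, $action) = @_;

    # The deepest rule at or above the requested action wins, so a broad
    # 'Account' grant is overridden by a more specific 'Account.Delete' denial.
    my ($allowed) = $dbh->selectrow_array(q{
        SELECT allowed
          FROM permission
         WHERE user_name = ?
           AND action @> ?::ltree          -- rule is ancestor of (or equal to) the action
         ORDER BY nlevel(action) DESC      -- most specific match first
         LIMIT 1
    }, undef, $user, $action);

    return defined $allowed ? $allowed : 0;   # no matching rule means "no"
}

print can_do('jimbob', 'Account.Update') ? "yes\n" : "no\n";   # yes
print can_do('jimbob', 'Account.Delete') ? "yes\n" : "no\n";   # no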

So, going back to my original comment about clean interfaces: as long as the system would continue to answer "Can $user do $action?", it didn't matter how much complexity went into producing those answers. All any consumer of my system needed to know was the answer to that question, and its thousands of siblings.

So, despite my earlier comments about groups and roles, they are still useful constructs for defining what people can or can't do, as long as you're not exposing that level of detail through the interface. So, to make a long story just a wee bit shorter, this was the basic structure we came up with for how the system would answer the ultimate question:

Action - the thing that can or can't be done
Role - a collection of rules about which actions can or can't be done
Group - a collection of users that are assigned the same roles
User - can have roles, be assigned to groups that have other roles, and can also have user-specific rules about additional actions outside the scope of its other roles

So, answering the question was potentially a multi-step process (sketched in code after the list):

1. Can this user do this action?
2. Can any role this user has do this action?
3. Does any role on any group assigned to this user have permission to do this action?
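In code, the lookup order might look something like this. This is a self-contained toy with hard-coded data standing in for the real tables, and it sidesteps questions the real system had to answer (like how conflicts between two roles are resolved):

use strict;
use warnings;

# Toy data standing in for the real database.
my %user_rules  = ( jimbob => { 'Account.Delete' => 0 } );   # user-specific exceptions
my %user_roles  = ( jimbob => ['support'] );
my %user_groups = ( jimbob => ['billing_dept'] );
my %group_roles = ( billing_dept => ['billing'] );
my %role_rules  = (
    support => { 'Account' => 1 },
    billing => { 'Invoice' => 1 },
);

# Walk from the action up through its ancestors: Account.Delete, then Account.
sub rule_for {
    my ($rules, $action) = @_;
    for (my $path = $action; length $path; $path =~ s/\.?[^.]+$//) {
        return $rules->{$path} if exists $rules->{$path};
    }
    return undef;
}

sub can_do {
    my ($user, $action) = @_;
    my @rule_sets = (
        $user_rules{$user} || {},                                        # 1. the user's own rules
        (map { $role_rules{$_} || {} } @{ $user_roles{$user} || [] }),   # 2. the user's roles
        (map { $role_rules{$_} || {} }                                   # 3. roles on the user's groups
         map { @{ $group_roles{$_} || [] } } @{ $user_groups{$user} || [] }),
    );
    for my $rules (@rule_sets) {
        my $answer = rule_for($rules, $action);
        return $answer if defined $answer;   # the first rule set with an opinion wins
    }
    return 0;                                # default deny
}

print can_do('jimbob', 'Account.Update') ? "yes\n" : "no\n";   # yes (via the support role)
print can_do('jimbob', 'Account.Delete') ? "yes\n" : "no\n";   # no (user-level exception)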

This setup gave us a huge amount of flexibility in defining permissions for various users, and we then mapped all the company departments to groups and put their employees (users) in those groups. Now we had the best of both worlds: broad definition of permissible actions for whole departments, with the ability to make exceptions where needed. And the consumer of the system still only cared about one simple thing.

We made pretty extensive use of memcached to prevent repeated calculations of the same data in quick succession, and from that we had a system that still uses very few resources despite powering access control for both our public API and internal intra-department API, as well as many other internal systems. Not bad for a few days' work (ok ok, it really took about a month).
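Conceptually, the caching layer was just a thin wrapper around the question. Something in this spirit (Cache::Memcached was the natural module at the time; the key scheme, TTL, and the toy can_do stand-in here are invented for illustration):

use strict;
use warnings;
use Cache::Memcached;

my $memd = Cache::Memcached->new({ servers => ['127.0.0.1:11211'] });

sub can_do { my ($user, $action) = @_; return $action !~ /Delete/; }   # toy stand-in for the real lookup

sub can_do_cached {
    my ($user, $action) = @_;
    my $key    = "acl:$user:$action";            # hypothetical key scheme
    my $cached = $memd->get($key);
    return $cached if defined $cached;

    my $answer = can_do($user, $action) ? 1 : 0;
    $memd->set($key, $answer, 60);               # short TTL so rule changes propagate
    return $answer;
}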

What's the big deal?

So, I keep going on about how a clean interface enables you to become one with the universe or something. Due to the flexible design of the system and the single clear point of interaction, we were able to adapt it for use by customer accounts on our public API and web UI with about 2 hours of work, mostly to allow running multiple copies pointing at different databases. When we needed to add rate-limiting to the API, we knew it was really just a slight adjustment of "Can $user do $action?" to "Can $user do $action... again?". All of the same idioms lined up, so we simply added a layer to track requests to the methods in a performant way, so that asking that question became cheap enough to be useful for that purpose.
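In spirit, that layer was a counter bolted onto the same question. A minimal sketch of the idea, not our actual implementation: the fixed one-minute window, key scheme, and limit are all made up here, and in practice you'd also check the plain "Can $user do $action?" answer first:

use strict;
use warnings;
use Cache::Memcached;

my $memd = Cache::Memcached->new({ servers => ['127.0.0.1:11211'] });

# "Can $user do $action... again?" - a fixed-window counter per user+action.
sub can_do_again {
    my ($user, $action, $limit) = @_;
    my $window = int(time() / 60);               # one-minute buckets
    my $key    = "rate:$user:$action:$window";
    $memd->add($key, 0, 120);                    # create the counter if absent (no-op otherwise)
    my $count  = $memd->incr($key) // 0;         # fail open if memcached is unreachable
    return $count <= $limit;
}

print can_do_again('jimbob', 'API.Account.List', 100) ? "go ahead\n" : "slow down\n";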

This was really a fluke project. I've never had another project last as long as this one has with as few modifications required to keep up with the needs of the company. In hindsight, I should have made a simpler way to define the roles, as that was the biggest stumbling block for people who worked on the system later, and I was maybe a tad over-aggressive with the memcached use, but besides that, it's held up extremely well. Perhaps those who are now maintaining it will disagree with that assessment.

Tuesday, June 17, 2014

My thoughts on technical interviews

If you know me at all, you've probably heard me rant about technical interviews. For those who don't, perhaps a little background is in order. I didn't finish my CS degree (well, technically I was enrolled in Computer Engineering, but I didn't finish that either). I started working in the dot-com boom of the late 90s to save up some money to go back to school, and then I started making too much money to qualify for aid, but not enough to actually pay for school. So, I kept working. I've consistently been a top performer on each team I've been a part of, and I've written some fairly complex systems over the years. However, I've always worked with web-based technologies, and always in dynamic, high-level languages (Perl, Javascript, Python). If I had finished my degree, I'd undoubtedly have made different choices about how to approach some of those problems, but for the most part, my lack of a degree hasn't hindered my ability to write good software (several people with degrees, some advanced, have complimented me on my code over the years). So, when I rant about interviews, understand where I'm coming from.

The problem with technical interviews is they are almost solely focused on scholastic CS knowledge or trivia.  Despite all the evidence to the contrary, people continue to insist that this is an appropriate way to interview programmers.  Before I explain how I think interviews should go, I want to provide a few thoughts on what's wrong with the current process.

Google says there's no correlation between interview and job performance

Google, a company well-known for its difficult technical interviews, and also a company well-known for making data-based decisions, keeps statistics on interview performance and measures them against on-the-job performance. They've concluded that there is absolutely no correlation between the two. This means that as a measure of ability to perform on the job, the types of questions that Google asks (mostly advanced CS-style problems) are useless. If you don't believe me, believe it straight from the mouth of the person who tracks this data at Google. In an interview with the NY Times, SVP of People Operations Laszlo Bock said:

Years ago, we did a study to determine whether anyone at Google is particularly good at hiring. We looked at tens of thousands of interviews, and everyone who had done the interviews and what they scored the candidate, and how that person ultimately performed in their job. We found zero relationship.

See http://www.nytimes.com/2013/06/20/business/in-head-hunting-big-data-may-not-be-such-a-big-deal.html for the full synopsis; there are some other gems to be found.

I fully agree with this assessment (it's hard to disagree with the data, anyway). The ability to solve complex CS questions in an interview setting is so far removed from what someone will encounter in a day-to-day programming job that I don't even bother asking these sorts of questions when I conduct interviews. It tells you very little. I've worked with people with advanced degrees who aced the interviews, then petered out on the job because they were bored with the mundane tasks they got assigned. The mentality of someone who enjoys building a software product is vastly different from that of someone who enjoys understanding advanced CS theory. If you can find someone who does both well, then kudos to you. I've only met maybe a handful of them in the last 15 years.

Stress is the mindkiller

Interviews are stressful. Most introverts don't do well under stressful conditions. You might say "well, the job will be stressful", but it's not the same kind of stress. Many introverts, myself included, view an interview as an amalgamation of an interrogation and a test. Under that sort of scrutiny, it can be overwhelming to be asked something we don't readily have an answer to. Even after conducting a few dozen interviews myself, I have to subvocalize reminders when I'm interviewing that not knowing the answer is ok. This took a long time for me to realize, and I still don't always remember. At a minimum, you should help the interviewee feel comfortable before digging into the difficult questions. Even better, if you can see they're totally stumped, rather than beating a dead horse and turning a bad interview into a nightmare, move on to another question that might be more apropos to their experience.

I know he can get the job, but can he do the job?

The person coming in has no idea what sorts of questions they're going to be asked, because frankly, most interviewers ask about things completely irrelevant to the job they're hiring for. I was once asked how to reverse a string in place for a PHP developer job. With my background in Perl, the answer is simply: reverse $string. It's built in to the language. This wasn't acceptable, obviously, because it's a programming interview. Having never encountered the problem before, I got stuck for a while, and finally worked it out after one of the interviewers reminded me that in C, strings are arrays of characters (a fact I'd forgotten over the previous decade of working with Perl and Javascript, where strings are built-in types that can't be manipulated as arrays). Looking back now that I've solved it once, it seems rather silly that I couldn't figure it out quickly, but that's how interviews are. Chances are the interviewer took much longer than an hour to learn what they're quizzing you on, so how is it fair to expect someone to figure it out in less time than that, under the scrutiny of the interview? I quickly answered much more challenging problems, so at the wrap-up one interviewer even called that out and asked me why I had struggled with what he considered a basic problem while acing more difficult ones (I found the others trivial, given my experience).
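(For the curious, the answer they were fishing for is the classic two-pointer swap. Here's the shape of it, sketched in Perl by pretending the string is a char array the way C does; in real Perl you'd obviously just call reverse.)

use strict;
use warnings;

sub reverse_in_place {
    my @chars = split //, $_[0];          # treat the string as an array of characters, C-style
    my ($i, $j) = (0, $#chars);
    while ($i < $j) {
        @chars[$i, $j] = @chars[$j, $i];  # swap the ends, walk inward
        $i++;
        $j--;
    }
    return join '', @chars;
}

print reverse_in_place('interview'), "\n";   # weivretni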

For another job, the interviewer asked all sorts of in-depth theoretical questions like "how would you design a system to determine addresses within a certain radius of a given address, given a billion addresses?". I sure expected to be doing super-complex stuff when I managed to get the job (despite not answering those questions well, I did well enough otherwise that they took a chance on me). Guess what I worked on for my first project? Nope, not the next Google Maps, not even the next Gmail; we added a blog feature to our website builder. A simple blog: posts, comments, tags. That's it. And since it was assumed that I couldn't do anything complicated, the boss pre-wrote the entire model layer for me. Never mind that I'd done this sort of thing for years, and went on to fix several bugs in his implementation, add new features, and optimize some of his queries.

So, try to keep the technical questions to things relevant to the job they've applied for. A web developer might not know what a directed acyclic graph is, and they shouldn't need to in order to be effective. Likewise, an embedded systems programmer might not know how to optimize the delivery of web assets to the browser.

To whiteboard or not to whiteboard, that is the question

Whiteboarding code or pseudo-code is fairly common in programming interviews, but I've never had a good experience with it, on either side of the table.  It's hard to write code on a whiteboard.  Save the whiteboard for designing object hierarchies or other high-level concepts, like the things you'd actually use a whiteboard for in your real job.

Some who agree with the uselessness of whiteboard coding have moved on to something which is, IMO, far worse: live-coding.  I'd say never have someone live-code, but if you are going to, at least never have a person live-code on a machine that is vastly different from their normal experience.  Ask ahead of time and make sure you get their editor of choice and OS of choice available to them, and try to avoid awkward hardware like tiny keyboards that they might not be used to. Programmers spend a great deal of time optimizing their environments to their particular taste.  Asking them to just sit down and code in a foreign environment under time pressure is a recipe for disaster.  

Let me share another anecdotal bad example. I once had an interviewer sit me down and, without spending any time on pleasantries apart from "this is my name", hand me his tiny laptop and give me the choice of vim or emacs to start coding in. I can get by in vim, but it's hardly my editor of choice, and I have a hard time typing on tiny keyboards. He and the other interviewer (who said maybe two words the entire time) sat on either side of me and watched over my shoulders as I coded solutions to their trivia problems. The first one was relatively easy, and despite the environment, I knocked it out pretty quickly. After that, he gave me what he described as an "NP hard" problem (note: using terms like "NP hard" makes you sound like an asshat). The solution was only sort of complex if you've solved it before, but having never seen a problem even remotely similar, I was stumped. When I'm stumped, I clam up for a bit. In a real job situation, I would turn to google for a bit, and if still stumped, I'd engage coworkers. In an interview, I have trouble discussing the problem, since I don't know the interviewer and I'm horribly shy around new people until I know where they're coming from. So, I futzed my way through it, and we ended the interview with my having a vague idea of the direction we needed to go, but nothing even close to working. It took me a while to recognize the need for recursion, despite using recursion fairly frequently in my work.

Needless to say, I didn't get the job. They actually interrupted the next interview (which was going shockingly well) and sent me packing. Another note: don't do this; it's humiliating to all involved. The people interviewing me, the person who had to escort me out, and obviously me. Given that experience, I wouldn't ever apply to that company again, and I've steered more than one colleague away from them. Frustrated by my inability to solve the problem, I spent an hour the next morning in my own environment and hashed out a solution. Not to toot my own horn too much, but I seriously doubt the interviewer solved it that quickly the first time they encountered it.

People can only implement algorithms they've already seen


The justification I always see for live-coding or whiteboarding in interviews is that people need to understand the "basics" or they won't be able to handle the bigger things. There are two faulty assumptions in that statement.

1. What you consider the "basics" is not universally true. Some people consider set theory and graph theory to be basics. Other people consider advanced data structures or algorithms to be basics. Given that the interviewee is not you, what they consider necessary knowledge differs, and you shouldn't expect them to have all the same experiences as you. Someone who gets stumped on a question about data structures, but has only worked with higher-level languages, might not be a bad hire; they've simply never had to implement what you're asking about. Talking to them about building distributed systems, if that's something they've spent a good deal of time doing, is a much better way to gauge their level of skill.
2. If the person hasn't seen the pattern before, they aren't going to be able to solve it under the stress and time constraints of an interview. Human brains are pattern-matching machines. So, if the interviewee has seen a pattern before, even if the problem was different, they might well be able to solve your problem with a little prodding. If they've never seen it before, no matter how much prodding you do, they aren't going to solve it. On the job, they'd have additional resources to discover the solution, so having them do it in an interview setting doesn't give you any indication of how they'd do on the job. I've read about this issue in several articles and books, but I don't keep notes like I should, and I can't find the references to share. Perhaps I'll rediscover them and update this at a later date.

But, without CS trivia and whiteboard coding, what is left?

Given that I think most traditional programming interview techniques are largely worthless, you may be wondering how one should actually conduct a programming interview. I basically consider it a three-step process:

Code is worth a thousand words

The first part of an interview should involve evaluating code the candidate has written in a normal setting.  You should get this code ahead of the interview, evaluate it beforehand, then discuss your findings with the candidate during the interview.  

As for how to get the code, there are basically two options:

1. Ask for a code sample of something they've worked on that they're proud of. A GitHub profile is ideal, but barring that, they can email in some code. It shouldn't matter if the code is in a language that isn't used at your job. Good programmers can pick up new programming languages.
2. If they really don't have any code to share (this should be a red flag, but is not always a nonstarter), then give them a problem to solve in the domain of the job they're applying for and a set amount of time to complete it.  The problem should not take an experienced developer more than a few hours to solve, so keep it basic.  Again, avoid CS theory quiz type problems.  Use something practical (if the job is working with REST APIs, have them write a client for an existing REST API, for example). Now you have a code sample.

Given the code sample, you can get a feel for how the programmer thinks. Look for things like how they structure their object hierarchy, how well they comment their code, how readable it is, whether the API to their objects makes sense and is consistent, whether there are any glaring performance issues, whether it actually works, whether they wrote tests, etc.

Dig into their experience

From the same NY Times interview mentioned above, there's this gem:


Behavioral interviewing also works — where you’re not giving someone a hypothetical, but you’re starting with a question like, “Give me an example of a time when you solved an analytically difficult problem.” The interesting thing about the behavioral interview is that when you ask somebody to speak to their own experience, and you drill into that, you get two kinds of information. One is you get to see how they actually interacted in a real-world situation, and the valuable “meta” information you get about the candidate is a sense of what they consider to be difficult.

This jibes with my experience conducting interviews. Everyone applying to be a programmer has some experience working on a programming project; for recent grads, that might just be a project they did for school. Regardless, having someone speak to their experience, and drilling in for details on parts you find interesting or disturbing, is a great way to get a feel for how a potential candidate thinks. Go in depth on their design decisions, what other approaches they considered, what they'd do differently now, etc. This will tell you far more about their abilities than any CS knowledge question possibly can.

Determine cultural fit

Last, but certainly not least, is determining cultural fit. Is the candidate going to work well with your team? Do they react badly when confronted? If you're a startup-type company, do they work well with shifting priorities and quick iterations? If you're a huge behemoth, do they work well having to coordinate with 15 other departments just to get their work done? Do they have an agreeable personality that you would be able to work with? I once gently corrected an interviewee, who then went on to tell me I was wrong (I wasn't). Despite my objections, he was hired, and he lasted maybe 6 months. He was abrasive, and people weren't sad to see him go. He left because he didn't think people listened to him (they likely didn't, because he was often wrongheaded and didn't react well to constructive criticism). This one is harder to accomplish than the previous two, as some people are experts at hiding their personalities in interviews, but do your best to weed out the people who aren't going to fit in on your team, no matter how technically brilliant they might be.

The exception to the rule

Every rule has exceptions, and this one is no different. If your team is developing a programming language, or designing a new data storage system, or working in embedded systems, the traditional types of programming interview questions probably do apply. However, those sorts of positions are exceedingly rare, yet those sorts of questions are exceedingly common, and therein lies the disconnect.

Friday, June 13, 2014

Binary units vs SI units

As a follow-on to my first post, I'll do a quick explanation of the binary vs SI prefix thing I mentioned.

Quick question: is a kilobyte 1000 bytes or 1024 bytes? Is a gigabyte 1 billion bytes or 1024 * 1024 * 1024 bytes? If you said 1024, you're actually wrong. That's a kibibyte (or, in the gigabyte case, a gibibyte). Early operating systems approximated 1000 with 1024, a nearby power of two, to make calculations more efficient, but kilo is a standard SI prefix meaning 1000 (kilogram, for example). They co-opted the term, and thus was born the 1024-byte "kilobyte". The industry caught back up at some point and renamed the binary approximations, but there's still a lot of confusion out there because memory manufacturers and a lot of software still refer to the binary sizes with the standard prefixes. 1GB of RAM is actually 1 GiB (gibibyte). Interestingly, network bandwidth and hard disk capacity don't suffer from this problem: 1Gb of bandwidth is one billion bits.

So when your computer says your 1TB hard drive is really only 931GB, it's your OS that's wrong (it's reporting GiB and calling them GB). Quit blaming the marketing departments of hard disk manufacturers because Microsoft can't be bothered to read a spec.
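The arithmetic, for the record (a quick Perl sketch):

use strict;
use warnings;

my $tb = 1_000_000_000_000;                  # 1 TB as the manufacturer means it: 10^12 bytes
printf "1 TB = %.0f GiB\n", $tb / 2**30;     # prints "1 TB = 931 GiB" - there's your "missing" space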

More info at wikipedia: http://en.wikipedia.org/wiki/Kibibyte

Trying to do a tech blog

Despite being in the industry as a developer for 15 years, I've never bothered to write a tech blog before. Recently, a coworker sent out an email explaining the difference between binary and SI units for measuring memory and disk size. I didn't realize this wasn't common knowledge for developers and sysadmins at this point, but several people thanked him for telling them something entirely new to them. That made me realize that maybe I do know some things worth sharing, so I'm going to try to write occasionally about subjects of interest to me in a hopefully easy-to-digest fashion.

Hopefully it won't be too dry.