Approaches to Performance Evaluation: Microsoft, Culture, and Stack Ranking

Recently, a former academic colleague who knows of my interest in organizational culture brought an article about Microsoft to my attention.  It describes how, under Steve Ballmer as CEO, Microsoft employed a “stack ranking” performance evaluation process.*  Stack ranking, which was first used by Jack Welch at GE, is based on the concept that any unit will show a roughly normal distribution of performance, making it possible to rank employees against one another and assign fixed percentages of them to “excellent,” “good,” “average,” “below average,” and “poor” categories.
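
To make that mechanic concrete, here is a minimal Python sketch of a forced-distribution ranking. The bucket labels come from the description above, but the quota percentages, names, and scores are my own illustrative assumptions, not Microsoft’s actual figures.

```python
# Illustrative sketch of forced-distribution ("stack") ranking.
# The quota percentages below are assumptions for illustration only,
# not Microsoft's actual figures.

# (label, share of the unit forced into that bucket)
QUOTAS = [
    ("excellent", 0.10),
    ("good", 0.25),
    ("average", 0.40),
    ("below average", 0.15),
    ("poor", 0.10),
]

def stack_rank(scores):
    """scores: dict mapping employee -> raw performance score.
    Returns a dict mapping employee -> forced bucket label."""
    ordered = sorted(scores, key=scores.get, reverse=True)  # best first
    labels, start = {}, 0
    for label, share in QUOTAS:
        count = round(share * len(ordered))
        for name in ordered[start:start + count]:
            labels[name] = label
        start += count
    for name in ordered[start:]:  # rounding leftovers land in the last bucket
        labels[name] = QUOTAS[-1][0]
    return labels

# Even a uniformly strong team gets a forced "poor" tail:
team = {"a": 91, "b": 90, "c": 89, "d": 88, "e": 87,
        "f": 86, "g": 85, "h": 84, "i": 83, "j": 82}
print(stack_rank(team))
```

What the sketch makes visible is that the low buckets get filled no matter how strong the group actually is, which is precisely the dynamic the article blames for the behavior described next.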

The author of the article bemoaned the way this ranking system created an unintended, non-collaborative culture, as employees became obsessed with avoiding being shunted into one of the low categories relative to their peers.  Developers don’t want to work with star developers, for fear of dropping in the rankings, and employees withhold information, holding back the progress their colleagues could make if everything were shared.  Talk about a counter-productive rewards system!

My former colleague, though, who had some familiarity with W.L. Gore & Associates, remembered that Gore also ranked Associates against others on the same “contribution list” in terms of their contribution to the success of the Enterprise.  He wondered whether Gore’s process had a similarly deleterious effect.

No performance or reward process is perfect or without drawbacks, but I responded that I thought the Gore system of “compensation for contribution” and its contribution ranking process were perceived by Associates as reasonably fair and, in fact, strengthened rather than undercut the culture of collaboration so prized at Gore:

Actually, Gore’s (approach to ranking) is somewhat different (from Microsoft’s), and with good effect.

The similarity is that at Gore everyone is ranked within a particular function, project, or business.  Rankings go from top to bottom.  Associates are ranked on the basis of ‘contribution,’ which is loosely defined.  And Associates rank each other.  The ranking lists, or contribution lists, are then used to determine compensation.  Ideally, those ranked highest would get paid the most, and those ranked lowest the least.

This does not necessarily translate into ‘excellent’ or ‘poor’ performance.  A seasoned engineer may be ranked highest; an engineer one year out of college may be ranked at the bottom.  But that doesn’t mean she is not a contributing Associate, or that she has no future at Gore.  It just means that, at this early point in her career, she is not perceived as contributing as much as more senior engineers.

The only time someone at the bottom of the ranking list gets identified as a poor performer is when he has been at the bottom of the ranking list for several years in a row, performance improvement programs (with the help of his sponsor and leaders and HR) have been put in place, and there is still no perception of increased contribution.  At that point, an Associate might be encouraged to look for some other commitments, or even to leave Gore.

I think most Associates feel the system is pretty fair.  There are complaints about whether one has to rank people one doesn’t know well (no, you don’t), or whether others who don’t know you well might be ranking you (yes, they might; we have to trust their judgment about their capacity to rank you).  There is a contribution/compensation committee (usually of leaders and senior Associates) that reviews all the ranking information, makes a judgment about whether some rankings are outliers, and can adjust the rank of any Associate.
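
The post doesn’t spell out how the committee actually conducts this review, so the following is a hypothetical sketch only: it assumes each Associate receives a list of rank positions from the inputters who chose to rank them, and it flags wide disagreement for discussion (using a standard-deviation threshold I picked arbitrarily) rather than adjusting anyone automatically.

```python
# Hypothetical illustration only: the post does not describe the
# committee's actual method. This sketch flags Associates whose rank
# positions disagree widely across inputters, so the committee can
# discuss those cases rather than accept an averaged number blindly.
from statistics import mean, stdev

def flag_for_discussion(ranks_by_inputter, spread_threshold=3.0):
    """ranks_by_inputter: dict of associate -> list of rank positions
    (1 = highest) given by the inputters who chose to rank them.
    Returns (consensus rank order, associates needing discussion)."""
    consensus = {a: mean(r) for a, r in ranks_by_inputter.items()}
    disputed = [a for a, r in ranks_by_inputter.items()
                if len(r) > 1 and stdev(r) > spread_threshold]
    order = sorted(consensus, key=consensus.get)  # best (lowest mean) first
    return order, disputed

ranks = {
    "alice": [1, 2, 1, 2],    # inputters broadly agree
    "bert":  [3, 3, 4, 3],
    "carla": [2, 9, 1, 10],   # sharp disagreement: worth talking about
}
order, disputed = flag_for_discussion(ranks)
print(order)     # ['alice', 'bert', 'carla'] by mean rank
print(disputed)  # ['carla']
```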

In fact, because all Associates rank each other, the Gore system often has the opposite of the Microsoft system’s presumed effect.  Instead of encouraging one-upmanship and politics, the contribution process tends to punish one-upmanship and reward cooperative and collaborative behavior.

Indeed, those who collaborate well and bring projects to successful completion will be ranked positively, even if they are not the stars on the project team.  At Gore, there is a belief that no one succeeds on his or her own, and that each can make unique and complementary contributions that help us all “win.”  And those who withhold information, especially in ways that limit the contributions of others, will be ranked negatively.

As I see it, the Gore ranking system works well, generally speaking, for several reasons.  For one thing, the inclusion of many inputters in the assessment of contribution usually keeps Associates free from the “tyranny of the boss,” the single individual who has the capacity to label someone an excellent or poor performer.  The review by the compensation committee also provides a check and balance in those situations where a pleasing personality might be confused with contribution, or where contributions may be unknown to most members of the ranking list (which sometimes happens to senior Associates whose many contributions pre-date many of the inputters).  That review has the further upside of highlighting for significant leaders up-and-coming Associates whose performance might be off their radar screens, and it can surface brewing performance issues that might otherwise come as a surprise.

Gore’s system is not perfect, and the Enterprise is experimenting with tweaks to the Associate ranking process.  One experiment has inputters rate those on the contribution list in categories from high to low contribution (which, when averaged across all inputters, still yields a ranking, and which still doesn’t necessarily translate into excellent-to-poor performance).  Another experiment involves rating Associates on their practice of the culture, to make sure that “high contributors” are judged not only for “what they do” but also for “how they do it.”
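
To illustrate the arithmetic of that first experiment: if each inputter’s high-to-low category is mapped to a number, averaging across inputters still produces an ordering.  A minimal sketch, with the caveat that the category labels, the 5-to-1 scale, and the sample ratings are my assumptions, not Gore’s actual instrument:

```python
# Sketch of the first experiment described above, under assumed values:
# the 5..1 scale and the category names are illustrative, not Gore's.
from statistics import mean

# Assumed mapping from a high-to-low contribution category to a number.
CATEGORY_VALUE = {"very high": 5, "high": 4, "medium": 3,
                  "low": 2, "very low": 1}

def rank_from_ratings(ratings_by_inputter):
    """ratings_by_inputter: dict of associate -> list of category labels,
    one per inputter. Averaging the numeric values still yields a
    ranking, just as averaging raw rank positions would."""
    scores = {a: mean(CATEGORY_VALUE[c] for c in cats)
              for a, cats in ratings_by_inputter.items()}
    return sorted(scores, key=scores.get, reverse=True)  # best first

ratings = {
    "dev_1": ["very high", "high", "very high"],
    "dev_2": ["medium", "high", "medium"],
    "dev_3": ["low", "medium", "low"],
}
print(rank_from_ratings(ratings))  # ['dev_1', 'dev_2', 'dev_3']
```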

There are better and worse ways to rank performance, but in the Microsoft example, the approach led to non-collaboration, which can only ever be harmful to an organization’s culture and overall success. Ideally, performance ranking and rewards systems encourage a collaborative culture, and that, in my opinion, is what Gore’s “compensation for contribution” manages to achieve.

*Amid much criticism and employee dissatisfaction, Microsoft has since discontinued the practice of stack ranking.

5 responses to “Approaches to Performance Evaluation: Microsoft, Culture, and Stack Ranking”

  1. Great review, Michael. Two comments, for emphasis mostly (and I underscore and will borrow your “far from perfect” caveats)…. For me, the core, or heartbeat (would “soul” be too hyperbolic?!), of Gore’s ranking process is what we do with the output of the rankings: we talk about them! We try to understand the nuance of an Associate’s contribution, not simply observed output. The dialogue that good committees have (and the fact that there are committees at all!) is likely what differentiates this process the most.

    My second reflection, prompted by your post: historically (and I believe currently, though for sure philosophical consistency is among the challenges), committees are not bound by considering only the past 12 months of contribution. The idea of past, present, and future contribution is, for me, a huge differentiator from other performance assessment processes. I think it’s also among the factors of this process that keeps us aligned with our values and principles.

    Thanks for the forum to reflect!

    • Jane, thank you for your thoughtful points about the Gore contribution and compensation process. Great upgrades! Your comment about how the rankings are used (“we talk about them!”) had me make a connection to another Gore process, the annual culture survey. In both cases, Gore gets input from LOTS of Associates, uses fairly sophisticated statistical procedures to help tease out the main messages in the data, and then “we talk about them!” So unlike those organizations that promote “data-driven decisions,” it would seem more appropriate to say that at Gore, “data drives discussion, and discussion drives decisions.” What gets captured in that phrasing is that at Gore, regardless of how “statistically robust” the data are, judgment is always required. And collective judgment is almost always honed by open, constructive dialogue. Michael

  2. Stack ranking appears to me to be both a fair and just method for rewarding contribution. I am very unforgiving of organizations that just provide a 1-2% across-the-board increase, rather than directing those funds to the most productive individuals, who are not satisfied with mediocrity.

    • Michael, I agree with your sentiment about across-the-board increases, especially if that approach is taken to avoid “hard” decisions about who is actually contributing a lot and (more commonly) who isn’t actually pulling their weight. I think the more problematic issue with stack ranking is when the leader (boss, manager) makes the decision alone, without adequate input from an employee’s peers and co-workers, who probably have a much more complete and nuanced view of contribution (or not). Richly rewarding someone the boss likes can be destructive if the boss’s perception of contribution and value doesn’t square with the broader view of the individual. When that happens, it is often not “performance” but “politics” that is perceived as the path to rewards and success. But back to your point: NOT richly rewarding those who ARE making significant contributions is equally damaging.

  3. Pingback: Job Performance | The Tale Of Humoural Medicine
