Is it just me, or does everyone in this Amazon video have bags under their eyes?

So, basically, a significant subset of teams at Amazon maintain mission-critical, customer-facing systems (Tier 1).

What often happened was this: at some point in the past, some team built the original service, system or platform and likely did a pretty fine job of it. Once the problem was largely or entirely solved for a time, the bulk of the work was done, and the product had transitioned into more of a support/maintenance mode, a lot of the original developers moved on to other things (inside or outside of Amazon), because that's what good developers tend to do when they get bored. At that point, maintaining the system was often left to teams full of fledgling managers and inexperienced developers (usually college-hire SDE1s). This is particularly common (and severe) at Amazon because they do not hire a large number of the experienced and capable DevOps and SRE types who would normally do this job at other companies. They hire SDEs to do it instead (which will make more sense in the following paragraph).

Inevitably, the system reaches the limits of its original design for one reason or another: somebody thinks it needs new features, it's running an aging stack that somebody needs or wants to replace, it was never intended to scale beyond some level, or any number of other reasons. At that point, some VP or director or somebody else starts to apply pressure to add new features to the original, or to rebuild it to a new specification. Only now, instead of an experienced team of SREs who know the system and would be capable of that task, you have a bunch of college hires faced with a code base they scarcely understand, and potentially incompetent managers who want to crack the whip to "prove themselves" so they can move up or over to another team. So they start to hack away at it, piling on quick fixes and technical debt, which destabilize the system and increase the ops (on-call) load, along with working hours. People naturally burn out and quit, so they're replaced with more inexperienced people, and whatever small amount of knowledge the departing people managed to accumulate is lost once again. Nobody smart sticks around long enough to really bring things under control, and what has become a truly vicious cycle continues almost unabated.

A lot of companies (including Google) have the exact same problem, but most of them have simply done a better job of addressing it (usually with a different staffing model).

To be fair to Amazon, what I just described is not the company-wide norm; it's just common and significant enough to have produced a lot of disgruntled and very vocal former employees. There are other teams that have intense workloads due to external competitive pressure (certain parts of AWS, etc.), but they make up for it by giving you genuinely cool stuff to do, and you'll see the same thing within certain orgs at Google, Facebook and Microsoft anyhow. There are also a lot of teams that are genuinely great to work on across the board (relaxed workload and cool stuff). Mine was one of them; I worked wonderful hours on brand new technology and had an altogether great experience.

/r/cscareerquestions Thread