Xbox Live Status: Limited

...continued from above

 

Gmail, for example, does not go down for a few hours once a week because of limits on the number of "hardware gates" supporting the system (Microsoft apparently had only a handful in the comparable spot behind this week's inadequacy). For a component that is core to Gmail and Google's hosting, it would be absurd for Google to ignore such a glaring bottleneck/point of failure/lack of redundancy when these general principles were learned decades ago.

Having a scalable system, with hardware that adapts to changing stresses, needs, traffic, and hiccups, is a result of core designs and architectural implementations built to be extensible and loosely coupled from the supported product and the user end.

Microsoft would have bought a few servers, added routers, or added whatever resource was necessary to prevent the interruption and provide the required capacity if it currently had a clean product base on which to scale seamlessly. What we see instead is MS crippled by configuration hell, runaway complexity, and possibly a bit of a domino effect of service interruption as hardware fails or systems fall under quota. Having products, edge cases, and services lazily or hastily attached to specific endpoints means other hardware cannot be repurposed to help ease uneven load, DDoS attacks, or just the expected hardware hiccups of a large, complex environment. Engineers fixing the MS landscape might first restrict or disable some services to stop the bleeding on the now overburdened remaining nodes taking on extra work after a failure. Hardware may be readily available, but crap hardcoded configuration might mean engineers manually initializing the system and the configuration of every coupled system attached to it, which is hell for any NOCC admin.

Even if Microsoft buys extra hardware and adds further redundancy down each segment and branch of the system, the poor design and cheap/fast construction of the larger system are akin to a physical disability endured by a professional athlete. Specialized resources and configuration coupling are costly from a redundancy standpoint. In Amazon's case, if 10 clusters at 10 units of capacity each typically run at 50% load, their system is at 50/100. If one cluster fails and its load is ideally routed to the 9 remaining clusters, each of those rises from 50% to about 55.6% (50/90), an increase of roughly 5 percentage points in this idealistic example. Perhaps one cluster gets repurposed, leaving some at 70% to ease the spike in traffic. The point is that having an independent system that can scale means extra processing is shifted around to adapt to the change in traffic.
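For anyone who wants to check the arithmetic, here is a minimal sketch of that idealized 10-cluster example. The function name and parameters are made up for illustration, and it assumes load spreads perfectly evenly across the survivors, which real routing rarely manages.

```python
def load_after_failure(clusters: int, capacity_per_cluster: float,
                       utilization: float, failed: int = 1) -> float:
    """Utilization of each surviving cluster after `failed` clusters drop out.

    Assumes (idealistically) that the total load is rerouted evenly
    across every surviving cluster.
    """
    if not 0 <= failed < clusters:
        raise ValueError("failed must be between 0 and clusters - 1")
    total_load = clusters * capacity_per_cluster * utilization
    surviving_capacity = (clusters - failed) * capacity_per_cluster
    return total_load / surviving_capacity


before = 0.50
after = load_after_failure(clusters=10, capacity_per_cluster=10,
                           utilization=before)
print(f"before: {before:.1%}, after one failure: {after:.1%}")
print(f"increase: {(after - before) * 100:.1f} percentage points")
```

With the numbers above, each surviving cluster ends up at 50/90 of its capacity, so the whole failure is absorbed as a small bump everywhere instead of an outage somewhere.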

A traffic spike will inevitably tax some component of the system, possibly well beyond double the redundant capacity of a non-mutable system (one that cannot shift resources around). It isn't just about hardware. Microsoft can, and likely has, overcompensated by adding redundancy to specialized portions of their landscape/system to handle component failures or traffic spikes, but that is an inefficient investment. It requires them to buy or contract more resources than needed, and they never reach Amazon/Google/etc. capacity without effectively doubling and tripling the specialized components involved in that system.
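A rough way to see why specialized redundancy is the expensive kind: compare how much total capacity you have to own to survive a failure in each model. This is only a sketch under simplifying assumptions (the function names are hypothetical, pooled nodes are fully interchangeable, and a specialized component can only be backed up by a full copy of itself), reusing the 50 units of load from the 10-cluster example.

```python
def pooled_capacity_needed(peak_load: float, nodes: int,
                           tolerated_failures: int = 1) -> float:
    """Total capacity to own when any node can take over any work.

    The surviving nodes must carry the whole load, so each node is sized
    at peak_load / survivors and we own `nodes` of them.
    """
    survivors = nodes - tolerated_failures
    if survivors <= 0:
        raise ValueError("need at least one surviving node")
    return peak_load * nodes / survivors


def specialized_capacity_needed(peak_load: float,
                                tolerated_failures: int = 1) -> float:
    """Total capacity to own when each component needs full standby copies.

    Every tolerated failure costs one more complete duplicate of the
    component, so capacity grows as 2x, 3x, and so on.
    """
    return peak_load * (1 + tolerated_failures)


for failures in (1, 2):
    pooled = pooled_capacity_needed(50, nodes=10, tolerated_failures=failures)
    special = specialized_capacity_needed(50, tolerated_failures=failures)
    print(f"tolerate {failures}: pooled {pooled:.1f} units, "
          f"specialized {special:.1f} units")
```

Under these assumptions the pooled system pays a small premium that shrinks as the pool grows, while the coupled system pays the full price of the component again for every failure it wants to survive.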

We the gamers, other MS customers, and MS themselves (breaking service contracts) all suffer in the end. Other software developers share my/our fateful curse: the problems of communication and objective leadership plaguing IT today. The push to maximize profit to benefit (ideally) employee salaries, investor returns, and the ability to remain competitive means costs will always be cut and money spent only with justification, whether in IT, marketing, finance, or any other company area.

Microsoft is good at making money, as we have all witnessed from Bill Gates and beyond... much better than myself. Perhaps software design decisions are only 20/20 in hindsight for me; I'm certainly not claiming to hold the business expertise to criticize past decisions naively without context. But after a long train of repeated fuckups from Microsoft without the kind of remedy any business today would see as justified and fiscally affordable, it is apparent there are much greater technical debts burdening their fixes and development.

Lastly, Microsoft is not alone in its stagnation; most companies today are behind. However, its primary intellectual assets, means to revenue, and a possibly still misguided operating model are not nearly as healthy as they were ten years prior. After over two dominant, monopolized decades, both the MS investors on Wall Street and gamers like us should know better by now.

/rant


I'll still be playing games on both console and my PC (no issues on Steam). This was not fanboy hate nor a dismissal of Microsoft's successes, as it has had those too. The above is just a rant from a developer with little to no proofreading and 20 minutes of rambling... perhaps unrelatable to you and others who may read it.

cheers
