Will the AWS outage make retailers think twice about cloud?

Amazon Web Services (AWS) has become the backbone of so many business-critical tools that it’s almost impossible to imagine the internet without it. So when AWS experiences a cloud outage like the one that occurred yesterday, the repercussions can be highly disruptive — and not “disruptive” in a good way.

The Amazon S3 storage service experienced problems Tuesday afternoon that affected a range of websites and services, as reported by TechCrunch. The S3 service hosts images, full websites and the back-ends of apps. Well-known websites, including Business Insider and the file-sharing portion of Slack, were affected by the outage. Because the back-end of Nest is hosted on S3, users were reportedly unable to control their IoT home devices such as thermostats.

Cloud adoption can be a thornier issue in retail than in other spaces. With tech startups, for instance, leveraging the cloud provides a way to build out a tool without investing in expensive in-house infrastructure and the IT talent to maintain it. But retailers have been slower to adopt cloud solutions due to concerns over potential business disruption while migrating business-critical applications from legacy infrastructures. Other worries include being dependent on a third party for maintaining data security and concerns about ownership of business-critical data.

However, a study by IHL Group cited by RetailWire indicated that retailers plan to invest more of their software budgets in cloud solutions this year than last, intending to spend 34 percent of their budget on the cloud in 2017, up from 26 percent in 2016.

It is unclear whether there will be any long-term impact from yesterday’s outage. The AWS Service Health Dashboard indicated that the “increased error rates for Amazon S3” had been resolved as of 2:08 PM PST.

There have been a few widely reported AWS outages in recent years. In September 2015, a five-hour outage disrupted big-name websites like Netflix, IMDb and Reddit, as reported in The Register. And in 2016, weather-related power outages, backup generator failures and a technological glitch led to an extended outage in Sydney, Australia.

BrainTrust

"IT disasters can and will happen. Retailers need to work with technology partners to devise a risk mitigation strategy."

Ralph Jacobson

Global Retail & CPG Sales Strategist, IBM


"The most frustrating thing about yesterday’s outage was the inability to communicate with users to let them know there was an issue..."

Larry Negrich

Director, SaaS Marketing, Zebra Technologies


"...the CIO has a job … and part of that job is to insure redundancy, backups, alternatives, etc."

Paula Rosenblum

Co-founder, RSR Research


DISCUSSION QUESTIONS: Should an AWS outage like the one that happened yesterday give pause to retailers considering cloud migrations? How should retailers judge the risk of using cloud services against outages and inefficiencies in their own in-house systems?

21 Comments
Bob Phibbs
Trusted Member
7 years ago

It should give all of us pause, not just retailers. If that had been Visa, MasterCard or Amex out for four hours it would have been bedlam. The lack of transparency as to why it occurred in the first place makes all of us wonder how vulnerable commerce is in yet another way due to Amazon.

Sterling Hawkins
Reply to  Bob Phibbs
7 years ago

There’s simply not a better alternative than relying on a large hosting provider for the vast majority of companies. The cost, risks and expertise required to bring something like that totally in-house are far in excess of the rewards. And online activities are now critical for retailers and consumers. Sure, it’s a moderate risk and it needs to be managed alongside all other business risks.

Ken Lonyai
Member
7 years ago

By its nature, technology is prone to these events. No one will be 100 percent safe from them even in-house, so it comes down to evaluating the particular vendor providing IT services vs. the cost of doing the same in-house.

What’s more important is having a hardened and tested contingency plan for when the inevitable does happen, including taking ownership of the situation to customers.

Ian Percy
Member
Reply to  Ken Lonyai
7 years ago

Ken, you know I always agree with you … but on this I have a slightly differing opinion, as expressed in my post.

Ken Lonyai
Member
Reply to  Ian Percy
7 years ago

Nah… we don’t disagree. We’ve chatted about fault testing for software, but the reality is that in the wild, most software is not 100 percent hardened, nor is hardware. Hardware can fail at a moment’s notice despite testing. So, given the “reality” of most software/hardware deployments and the fact that there’s no evidence that retailers with in-house IT are any more hardened/fault-tolerant/secure than vendors, which is what I was commenting on, our positions are pretty much the same.

Adrian Weidmann
Member
7 years ago

The short answer is no. That being said, I would caution retailers to understand the complete cloud infrastructure and workflow of where their assets are flowing. Target would not be happy if their business processes were flowing through Amazon’s infrastructure. I have advocated leveraging private cloud infrastructure in specific instances to ensure privacy and competitive advantage.

Brian Numainville
Active Member
7 years ago

What was particularly interesting was how even Amazon was perplexed at how to communicate with customers. My wife had bought a Kindle book and it went through three times as three different orders. When I called, at the tail-end of the day, the customer service rep at Amazon said that they were upgrading their system and asked if I could please call back in two hours. Really? Of course they had to put some spin on it but the whole world knows there is a much bigger problem than some upgrades! This event does indeed paint a picture of some much bigger vulnerabilities.

Cathy Hotka
Trusted Member
7 years ago

Could a similar outage have happened at a retailer’s own website? Of course. Bob Phibbs makes a good point about transparency, but I’m guessing that there won’t be any long-term ramifications.

Ricardo Belmar
Active Member
7 years ago

I wouldn’t say it will slow down the adoption of cloud by retailers or other enterprises — the march to the cloud is inevitable at this point. However, it WILL (and SHOULD) cause everyone evaluating AWS and any other cloud provider to ask more questions regarding internal maintenance processes, troubleshooting procedures, transparency, accountability, etc. as part of the selection process. A retailer implementing new systems in their data center would carefully plan, evaluate and ask these questions to themselves, so why not to their cloud provider? Especially when we’re talking about real business impact for failure.

Now I might ask the question differently about retailers relying on AWS in the first place — given that the more business they give AWS, the more they are “feeding the beast” that is Amazon in general! Do retailers want to prop up their competitor this way? Or should they be considering Microsoft Azure or others with higher priority?

Ralph Jacobson
Member
7 years ago

IT disasters can and will happen. Retailers need to work with technology partners to devise a risk mitigation strategy. Also, they must research the stability of ALL cloud service platforms for their long-term stability histories. There’s nothing here that should halt retailers’ migration to or continuance on cloud. This is simply a matter of due diligence in their technology provider selection.

Brandon Rael
Active Member
7 years ago

Certainly the concern is real, not only for all retailers but for all businesses and organizations that are dependent on Amazon Web Services. The monopolization of Amazon in this space demands that there is transparent guidance as to what happened and how this outage occurred, and that they take steps to alleviate the risks associated with mass outages.

While companies should formulate their contingency plans with their IT and consulting partners, Amazon should equally take accountability and have a risk mitigation strategy in place for any mass outages. So many entities are dependent on their services, especially in our forever-connected world of commerce.

Tom Redd
7 years ago

No. Cloud is the direction as long as it is a pure platform and apps were built for it. Depending on the business and the data management system used, there are ways to ensure a non-stop style solution so that a fully operational backup with real-time data is available. Cloud platforms are not the risk; the solutions that businesses run on them are the risk.

Larry Negrich
7 years ago

Yesterday’s four-hour outage, and outages in the past, should be a wake-up call to any business that exclusively uses AWS. AWS likely has a number of failsafes built in but each company should have some downstream capability that, at the least, allows for some communication with users/shoppers. The most frustrating thing about yesterday’s outage was the inability to communicate with users to let them know there was an issue with Amazon’s service. If you are building a retail business that relies on a greater percentage of revenue being generated from online sales, then outages will negatively impact revenue.

Paula Rosenblum
Noble Member
7 years ago

You know, I am fine (more or less) with moving functions to the cloud, but the CIO has a job … and part of that job is to insure redundancy, backups, alternatives, etc. It should be part of a BOD’s fiduciary responsibility to insure this is all in place. Just as a disaster recovery plan had to be in place back in the day (complete with hot site/cold site, etc.), different levels of recovery plans must be in place in the age of the cloud.

Hey kids, this is not the vendor’s job, it’s YOURS. BOD-on-down. It’s YOURS.

Shep Hyken
Active Member
7 years ago

An outage similar to what happened at AWS shouldn’t make a retailer think twice about the cloud. It should make them think twice about the backup plan they have. It’s going to happen again. If it was a big disruption to the business this time around, will it be a big disruption next time? Plan ahead!

Ian Percy
Member
7 years ago

I had to read this several times and it seems to me something is missing. The word “outage” implies that there was a power failure somewhere and, if so, more than Amazon would have been affected. But a power failure is mentioned only in the reference to a 2016 problem in Sydney.

So was it an electrical power problem OR a software fault? If the latter, why don’t we admit it? By far most computer failures are caused by faulty software, costing the economy almost $60 billion annually. Horror stories about software failure abound; just look up “The ten worst software failures in history” as a starting point. Not for the faint of heart.

Why, today, do we still have software failures? It’s because developers won’t recognize that fault-free software source code IS possible. As Lawrence Paulson wrote for the Association for Computing Machinery: “The software development industry claims it is simply too difficult to build correct software. But such a position looks increasingly absurd!” Amen to that!

Di Di Chan
Member
7 years ago

When there’s an electricity outage, most people don’t reconsider going back to using the dependable fire as their primary source of light and heat. When there’s an internet outage, most people don’t question if they should go back to using the library as their main method of research.

Yes, the AWS outage was annoying. Yes, it disrupted many businesses and workflows for about five hours yesterday. However, compared to the typical recovery time of the more common electricity, internet, or even weather-related outages that people and businesses are used to accepting (and adapting to) as part of life, AWS recovered pretty quickly without much praise.

The problem isn’t whether cloud technology is still a worthwhile upgrade; the problem is developing unrealistic expectations that technology has to be either perfect or nothing. Precisely because cloud technology is so convenient and advanced, any disruption feels disastrous and unacceptable. I don’t think the AWS outage should make retailers think twice about the cloud any more than they would think twice about electricity, internet, or bad weather.

Ken Morris
Trusted Member
7 years ago

I don’t think it should slow retailers’ cloud move. The market really needs to understand the definition of cloud in retail. Retailers are embracing a hybrid cloud (on- and off-premise) approach, as they did with credit authorization many years ago. Today’s new network technology (SD-WAN) gives retailers low-cost, super-redundant networks with 100% uptime SLAs that actually cost less than their MPLS backbones. Real-time retail will not be slowed by this Amazon blip, as cost savings will drive retail to embrace the paradigm.

Phil Rubin
Member
7 years ago

Cloud computing, done right, is still reliable and the economics are such that retailers have no real choice. That said, and as Paula Rosenblum aptly points out, retail leaders need to be responsible for their own technology stacks, disaster recovery, backups, switchovers, etc. These leaders are retail executives and vendor executives. No technology is perfect and just like any other “vendor,” beyond the product itself it’s the people behind it that matter most.

gordon arnold
7 years ago

There is a perceptible absence of discussion of how this downtime has affected consumers’ near- and long-term confidence in e-commerce reliability, business continuity and security. This is and always will be a concern that must have ownership of the highest priority in any and all aspects of retail business information technology systems design and use.

The race to cloud computing to save investment dollars must be a decision that is weighed with the concerns of the consumer. While the various IT mavens were hard at work to fix a preventable crash, you may rest assured that many of the consumers were looking for an alternative business solution. Now those concerns are in the public eye by means of the turmoil of word of mouth and speculative reporting. AWS gets a pass on this by injecting aimless rhetorical comments to deflect the concerns and possession of responsibilities of this crash. They know full well who owns the backlash from consumers — the retailers.

As retail continues to expand their IT system capabilities, it is imperative for decision makers to understand the needs of the company’s near- and long-term hardware and software investments. The use of proven fault tolerant and expandable infrastructure with a fully funded and staffed support and maintenance plan will greatly reduce the chances of a crash like this.

Harley Feldman
7 years ago

Retailers should continue to use cloud-based solutions. The benefits substantially outweigh the cost and risks. The article cited two outages in the past two years, probably fewer than the number of outages for the retailers’ in-house systems. The advantages of cloud services are substantial: sharing development and infrastructure costs over many users, being more configurable to meet the demands of the retailer’s business cycles and growth, utilizing the latest technology and taking advantage of upgrades as delivered by the service provider, and outsourcing much of the IT labor to the provider. Since the odds of a failure are so low, and the advantages so large, it only makes sense to move many IT services to the cloud.

Assessing the risk should be done by evaluating the lost business and goodwill that will come from an outage and weighing that against the benefits of outsourcing the labor and technology development to the service provider. On the lost business side (in this case because Amazon had so many businesses down), the retailer’s customers will attribute any lost productivity and wasted time to Amazon’s failure, and it will have a negligible effect on the retailer’s reputation. The customers will come back in short order, and retailers should continue their transition of IT systems to the cloud.