Why do IT service outages keep happening at retail?

Earlier this week it was reported that Target’s checkouts, website and mobile app went down as a result of a technical glitch. It was the third such outage for the retail chain in as many months and, while the glitches do not appear to have had a material effect on Target’s earnings to date, these incidents must be concerning to management, at the very least.

While Target has made headlines with its outages, it is far from being alone when it comes to IT-related disruptions, according to a new survey of 300 IT decision makers by LogicMonitor. The study found that 96 percent of organizations have experienced disruptions, with the typical organization surveyed experiencing five outages and five brownouts over the past three years. Ten percent of organizations had 10 or more outages and brownouts over the same period.

“Organizations today are increasingly dependent on the availability of their IT infrastructure,” said Gadi Oren, vice president of technology evangelism at LogicMonitor, in a statement. “A single IT outage can have huge negative business impacts including lost revenue and compliance failure, as well as decreased customer satisfaction and a tarnished brand reputation.”

The biggest costs associated with IT glitches, according to the survey, are lost revenue and productivity, compliance and mitigation costs, damage to the brand and lower stock prices for publicly traded companies.

Survey respondents pointed to six common causes for slowed or downed systems. These included network failure, usage spikes, human error, software malfunction, hardware failure and third-party outages.

According to the survey’s participants, 51 percent of the outages and 53 percent of the brownouts they experienced could have been avoided. The two biggest misses when it came to avoiding IT service disruptions were a failure to notice when usage neared “danger level” and when hardware or software performance showed signs of degradation. 
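The two misses described above, usage nearing a “danger level” and performance gradually degrading, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the 80 percent warning level, the three-reading window and the 1.5x tolerance are hypothetical and do not come from the LogicMonitor survey or any particular monitoring product.

```python
from statistics import mean


def usage_alert(usage_pct, warn_at=80.0):
    """Return True when the latest utilization reading (in percent)
    is at or above the warning level. `warn_at` is an assumed value."""
    return usage_pct[-1] >= warn_at


def degradation_alert(latencies_ms, window=3, tolerance=1.5):
    """Return True when the mean of the most recent `window` latency
    readings exceeds `tolerance` times the mean of all earlier readings.
    Returns False when there is not enough history to form a baseline."""
    if len(latencies_ms) <= window:
        return False
    baseline = mean(latencies_ms[:-window])
    recent = mean(latencies_ms[-window:])
    return recent > tolerance * baseline


if __name__ == "__main__":
    # Hypothetical readings: utilization climbing toward capacity,
    # and response times rising well above their earlier level.
    usage = [55.0, 62.0, 71.0, 83.0]
    latency = [100, 105, 98, 102, 160, 175, 190]
    print("usage warning:", usage_alert(usage))
    print("degradation warning:", degradation_alert(latency))
```

A real deployment would likely need smoothing, per-system thresholds and alert routing. The sketch only shows that both warning signs can be checked with simple comparisons against recent history.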

BrainTrust

"This isn't just an issue for retail, but it's much more visible to the public when a retailer is impacted."

David Dorf



"Two words: technical debt. Retailers have spent years not spending enough on technology and building a robust future-proof architecture."

Oliver Guy

Global Industry Architect, Microsoft Retail


"Planning, funding, and installing technology are huge issues for retailers. So huge that retailers stop thinking about the issue once the technology is up and running."

Camille P. Schuster, PhD.

President, Global Collaborations, Inc.


Discussion Questions

What do you see as the most common reasons for IT outages and brownouts within retail and consumer-direct brand organizations? How can these incidents be avoided?

14 Comments
Mark Ryski
Noble Member
4 years ago

IT is at the heart of all retail business processes, and it’s inevitable that these systems will fail at some point. While the specific reasons for the IT outages are difficult to generalize, there are a number of factors that contribute: 1.) antiquated legacy systems that are difficult or expensive to replace, 2.) limited IT budgets, squeezed due to challenging comp-sales results and 3.) reduced IT resources due to head office staff cuts.

Neil Saunders
Famed Member
4 years ago

One of the issues for some retailers is the fact that systems are, in fact, a patchwork of different technologies and functions added over time. As the demand put on them increases they occasionally fall over. For others, it can be about capacity – which is why a lot of retailers have failures at peak times such as Black Friday.

Art Suriano
Member
4 years ago

A big part of the problem today is that there is less quality checking and testing on just about everything. When it comes to technology, we are in such a rush to get the newest version of software out long before all the bugs are ironed out. Apple’s recent release of iOS 13 is a clear example of that. Unfortunately it has become the new norm, and those working in IT are no different, often due to the pressure from above. When you look at the complexities involved in the day to day activity at a retail chain like Target, it’s a wonder there aren’t more outages. Technology is excellent and I, as most of us do, love the newest and greatest device, software, or app. However we all have to take a step back and try to be a bit patient. We need to accept the fact that either we give all the providers the time they need to make sure everything is working correctly without the bugs or, if we choose to demand everything as soon as possible, we understand that there are going to be problems.

Paula Rosenblum
Noble Member
4 years ago

All the reasons listed are the symptoms, but the cause is definitely as Art describes – lack of quality control. It’s rare that I find a CIO who understands the nuts and bolts of “keeping the lights on” because everyone has become enamored with the need for speed, failing fast, and other ways to race to keep up with the business.

My own opinion, as a former CIO, is that the art/skill is gradually falling into the dustbin of history. Yes, CIOs are supposed to understand the business. They are also supposed to understand THEIR business. That includes testing, security monitoring and pro-active activities, and fall-back plans in the event of outages.

This is not limited to retail. Remember the implementation of ACA? That was an embarrassment because it missed the basics of IT 101: have a single party in charge.

Ralph Jacobson
Member
4 years ago

I remember POS systems going down in the ’80s. Of course this is nothing new. With today’s increased quantity of systems in stores, as well as increased workloads for those systems, I’m not surprised the challenges continue.

Bottom line, devise “mini-disaster” plans for contingent operations continuity in case of outages. There are guides online you can find with a simple search.

David Dorf
4 years ago

This isn’t just an issue for retail, but it’s much more visible to the public when a retailer is impacted. We’ve become more and more dependent on complex technologies while also reducing IT costs. Unfortunately, I think many retailers look at technology as a necessary evil instead of a market differentiator, and that attitude impacts their funding. Then starved budgets sometimes lead to outages and breaches, both of which end up being very costly.

Ricardo Belmar
Active Member
4 years ago

The underlying infrastructure that supports most store systems, such as the store network, is, unfortunately, an area most retailers fail to invest enough in, leading to these types of outages. Too often retailers address network infrastructure after events such as these rather than looking at this as a preventive investment. The result is that critical apps like POS, CRM, mobile, and any others in the store suddenly run slowly when you least want them to. Our own survey-based research has found that retailers may lose up to 3 percent of annual revenues to slow app performance that leads to outages just like this. At the same time, we found that retailers that invest in the right monitoring and control technologies to prevent these slow-down conditions see a revenue increase up to 6 percent. It’s a significant revenue swing that’s at stake.

For every retailer I’ve spoken with suffering from these problems, I’ve spoken to another who understands how quickly an ROI can be achieved with this kind of investment. It amazes me this isn’t considered by more and more retailers given it’s an easily solvable problem with the right technology that if done right, won’t break their IT budgets. The allure of bright new shiny tech often overshadows investments in network technology until outages like what happened to Target occur.

Ken Morris
Trusted Member
4 years ago

I believe the root cause of these outages is the complexity created by technology silos. Over the years retailers have created islands of automation that have created a support monster. The solution here is unified commerce, one version of software servicing all channels. Until retailers wake up and embrace this vision we will continue to see these types of outages as it is almost impossible to keep this multi-tiered Frankenstein’s Monster running without more people or less complexity.

Oliver Guy
Member
4 years ago

Two words: technical debt.

Retailers have spent years not spending enough on technology and building a robust future-proof architecture. A presentation I saw by an independent analyst a couple of years ago stated: “If you averaged spending 2 percent or less of revenue on IT during any of the last 20 years, you have been under-investing in IT and it is probably the largest threat to your business today.” (Contact me if you want a copy of the presentation.)

But they are not alone. About eight years ago I was talking to a retired friend who had spent her working life in banking. I told her about a retailer I had been working with and how all their systems were badly connected and overall very shaky due to unsupported hardware and software. (The retailer in question had bought multiple hardware spares on eBay to minimize future issues from outdated hardware.)

She told me that retail was not alone – she had seen evidence of the same for years in banking, even though she was not in an IT role. At first I did not believe her, then two days later one of the major banks had a major outage that made headlines across the country…

Doug Garnett
Active Member
4 years ago

When companies build themselves too tightly around a technology which is known to have failures, they subject themselves to those failures.

Retailers need to adopt a different stance relative to technology: Today most seem to develop systems presuming they won’t fail. Instead, learn from the airplane industry. Embracing the truth that there WILL BE failure can make everyone safer — including the retailer’s bottom line.

Doug Garnett
Active Member
Reply to Doug Garnett
4 years ago

From a discussion with @JohnWLewis yesterday: “One of the reasons why aviation is so safe is that there is an underlying assumption that perfect safety is not available, at any price.”

Ryan Mathews
Trusted Member
4 years ago

First of all, these are huge, complex systems, so crashes shouldn’t come as a surprise. Second, they are very visible targets for hackers, and so their integrity is constantly being challenged. And third, from a systems and technology position, retailing is still in a transition period. New applications and reporting functions are being constantly added. New algorithms are being developed every day. The number of AI and IoT possibilities has barely been tapped. And customers expect these systems to work instantly, perfectly, and certainly without interruption. That’s a lot to ask, and failure is inevitable. And this isn’t even considering the vagaries associated with the legacy power grids these systems depend on. So these “incidents” can’t be avoided, only mitigated. And given the level of demand and innovation, even that is going to remain a stretch goal for years to come.

Camille P. Schuster, PhD.
Member
4 years ago

Planning, funding, and installing technology are huge issues for retailers. So huge that retailers are worn out and stop thinking about the issue once the technology is up and running. However, the job is not finished at that point. Contingency planning for problems and monitoring performance are also critical tasks for long term success. These activities are often thought to be unnecessary when the vendor promises that specific technology will work and the retailer does not want to spend more money. However, the hit to sales and image can be quite expensive as well.

Kenneth Leung
Active Member
4 years ago

A combination of legacy systems being pushed together because of mergers, plus the addition of systems to address omnichannel retailing, creates a difficult environment to maintain. In addition, the speed of change is reducing the amount of time available for quality assurance and stability testing.