Amazon imposes 90‑day ‘code safety reset’ after outages

Company downplays role of AI-written code, blames ‘engineering team user error’

Amazon is rolling out a 90‑day “code safety reset” across some of its most critical retail systems after a series of outages disrupted customer orders – but the company is pushing back on claims that AI‑written code was to blame.

According to internal documents reported by several outlets, the temporary reset applies to roughly 335 so‑called “Tier‑1” systems that directly affect the customer retail experience, including ordering and payments. The measures are intended to add “controlled friction” to software changes in high‑impact areas while the company strengthens longer‑term safeguards.

During the three‑month period, critical code changes are expected to undergo stricter reviews and more extensive documentation before deployment. Reports citing internal guidance say engineers working on affected systems must secure additional approvals, document their changes in internal tooling and adhere to automated reliability checks aligned with Amazon’s central engineering rules.

The moves follow a run of incidents on Amazon.com in early March. One outage on March 2 caused incorrect delivery times at checkout and is reported to have generated about 1.6 million errors and 120,000 lost orders across global marketplaces. A separate outage on March 5 reportedly led to a further 6.3 million lost orders.

Some media coverage linked those problems directly to AI‑assisted coding tools and suggested Amazon was tightening rules specifically around AI‑written code. In a public correction of a Financial Times report, Amazon said those claims were inaccurate. “In fact, only one of the recent incidents involved AI tools in any way, and in that case the cause was unrelated to AI and instead our systems allowed an engineering team user error to have broader impact than it should have,” the company stated on its corporate site.

Amazon says that across the outages reviewed in a recent internal meeting, only a single incident involved AI‑assisted tooling, and that issue stemmed from an engineer following inaccurate advice inferred from an outdated internal wiki. None of the incidents, the company stresses, involved AI‑written code.

The reset and the outages were discussed at a regular Amazon Stores operations meeting referred to internally as TWiST. In a statement to Canadian Occupational Safety, an Amazon spokesperson says this meeting should not be seen as an emergency all‑hands, but as part of routine governance over the retail platform.

“TWiST is our regular weekly operations meeting with a specific group of retail technology leaders and teams where we review operational performance across our store. As part of normal business, the meeting will include a review of the availability of our website and app as we focus on continual improvement.” 

Amazon says TWiST is limited to the retail technology organization, that recent service incidents did not involve Amazon Web Services (AWS), and that it is inaccurate to suggest junior and mid‑level engineers now require senior sign‑off on every AI‑assisted change. Those points align with Amazon’s public position that reports of sweeping, AI‑specific approval rules were overstated.

For occupational health and safety professionals, the episode highlights a familiar tension: the drive to deploy powerful automation tools quickly versus the need for robust change‑management and review processes in systems that underpin essential work. Amazon’s 90‑day reset amounts to a large‑scale experiment in re‑introducing human and procedural checks into AI‑accelerated development. The aim is not to abandon AI but to ensure that when things do go wrong, a single mistake cannot cascade into millions of failed transactions.