What does it mean to have 'scaled' physical AI in manufacturing?

Scaled means production deployment across multiple lines, multiple shifts, and ideally multiple sites — running 24/7 against real customer SLAs, with measurable business outcomes and a maintenance model that does not depend on the founders being in the room. A pilot on one camera for three months is not scaled. Three years of uptime across hundreds of machines in multiple countries is.

Why does Cisco's 2026 report say only 20% of physical AI deployments have scaled?

Three structural reasons. First, the deployment economics of the rip-and-replace model don't survive a CFO review. Second, cloud-first AI architectures don't survive a factory IT review. Third, vendor lock-in to a single chip family doesn't survive a real-world cost-per-camera calculation. The 20% that scaled built around these three constraints from the start; the 61% that haven't are usually fighting at least one of them.

How can a manufacturer move from the 61% to the 20%?

Pick a vendor whose architecture is retrofit-first (runs on your existing cameras, PLCs, MES and ERP), edge-first (inference on-premise, video stays on-site), and chip-agnostic (so cost-per-camera doesn't lock you to whichever silicon was hot the year you started). Then deploy one line, prove the outcome, and expand only after the first line has been running stably for 60-90 days in production.

61% of factories deploy physical AI. 20% scale it. A view from inside the 20%

Cisco's 2026 State of Industrial AI Report landed earlier this year with a number that did not get the attention it deserved.

61% of industrial users are actively deploying physical AI. Only 20% have successfully scaled it.

Read those two numbers again. They are saying that out of every five factories that started a physical AI project, four are still stuck in pilot or partial rollout. The technology exists. The vendors exist. The CFOs have signed off. The pilots are running. And in 80% of cases, the project never gets past the first line, the first plant, or the first year.

This is the deployment gap. It is the most important number in industrial AI right now, and almost nobody is writing about it honestly — least of all the vendors selling into it.

Deploying physical AI (Cisco 2026): 61%

Scaled physical AI (Cisco 2026): 20%

Source: Cisco 2026 State of Industrial AI Report. The gap between "deploying" and "scaled" is where most physical-AI projects die.

Why I am writing this

I am the CEO of CountAI. We are a deployment company, not a model company. Over the last seven years we have shipped on-premise computer vision across more than 4,500 industrial cameras and 500-plus production machines in seven countries. Some of those machines have been running our AI continuously, 24 hours a day, for more than three years. Real factories, real shifts, real customer SLAs, real money on the line every minute the system is up.

I do not say this to flex. I say it because being in the 20% means I have an unusual view of what is actually breaking in the 80%, and I have stopped seeing anybody in the press or on a conference stage describe it accurately.

What follows is not a survey. It is what I have learned by being inside the gap.

Inside the 20% — CountAI to date

Seven years of running physical AI in real factory conditions.

4,500+ industrial cameras in production

500+ machines running 24/7

7 countries deployed

25+ paying enterprise customers

What "scaled" actually means

Most physical AI vendors and most analyst reports use the word scaled loosely. A vendor with one signed customer and three live cameras will tell you they are "scaling." A factory with one line running an AI model will tell their board they have "deployed" physical AI.

That is not what scaled means. Scaled means four things, all of them at the same time.

Scaled means multi-line. Not one camera, not one machine. Multiple lines, side by side, each one feeding the same intelligence layer.

Scaled means multi-shift. Not just the day shift on weekdays. The night shift, the weekend shift, the changeover, the maintenance window. The system has been awake when nobody from your engineering team is.

Scaled means multi-site. Not one plant. The system has been deployed at a second site, then a third, then a fourth, without the original engineering team having to fly in every time.

Scaled means multi-year. Year one is when the system goes in. Year two is when the camera you mounted in year one starts behaving differently because the lens has fogged or the operator moved it. Year three is when the model needs retraining because the product mix has shifted. A system that has not run through all three years is not scaled. It is in pilot.

By that definition, the 20% number Cisco reported is, if anything, generous. Most operations that say they have scaled have hit the first two and are still finding out about the last two.
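The four-part definition above can be written down as a checklist. This is a minimal sketch, not a formal standard: the `Deployment` fields and the numeric thresholds (at least two lines, at least two sites, at least three years) are assumptions chosen to mirror the wording of the text.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical summary of one physical-AI deployment."""
    lines: int                     # production lines feeding the same system
    sites: int                     # plants where the system is live
    years_in_production: float     # continuous production runtime
    runs_unattended_shifts: bool   # nights, weekends, changeovers covered


def missing_criteria(d: Deployment) -> list[str]:
    """Return the criteria from the text that the deployment does not yet meet."""
    checks = {
        "multi-line": d.lines >= 2,                 # assumed threshold
        "multi-shift": d.runs_unattended_shifts,
        "multi-site": d.sites >= 2,                 # assumed threshold
        "multi-year": d.years_in_production >= 3,   # "all three years"
    }
    return [name for name, ok in checks.items() if not ok]


def is_scaled(d: Deployment) -> bool:
    """Scaled means all four criteria hold at the same time."""
    return not missing_criteria(d)


pilot = Deployment(lines=1, sites=1, years_in_production=0.25,
                   runs_unattended_shifts=False)
print(missing_criteria(pilot))
```

Returning the list of unmet criteria, rather than a bare yes or no, makes it obvious which of the four a given operation is still finding out about.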

Three reasons most pilots never scale

I have walked into enough plants where the previous AI project died that I can now predict, within ten minutes of arriving, what killed it. It is almost always one of the same three causes.

The deployment economics killed it

The first generation of "smart factory" pitches was a rip-and-replace story. New cameras, new VMS, new MES connectors, new cloud platform, new identity layer, three-year program, eight-figure capital exposure. Most CFOs read this kind of project plan and correctly killed it before line one ever went live.

The pilots that survived this were the ones where a single line owner found enough discretionary budget to run a proof of concept. The pilot worked. But there was no path to plant-wide rollout that the CFO would sign — because the economics still required ripping out the existing stack everywhere. The pilot became a stranded asset. The vendor moved on. The 61% counts these.

The cloud-first architecture killed it

The second-most common failure pattern is the vendor whose product depends on streaming video and operational data to their cloud. This works fine on the demo, where the conference WiFi is good and the customer's procurement team isn't in the room.

It does not survive a real factory IT review. Bandwidth is not provisioned for it. The privacy team flags the worker-footage exposure. The legal team in the EU, UK or Australia asks a series of GDPR, Data Protection Act or Australian Privacy Principles questions the vendor cannot answer cleanly. The cybersecurity team finds an inbound firewall rule that nobody is going to approve. The pilot quietly stalls in week four of the rollout review. Nobody calls it dead; it simply never moves.

The chip lock-in killed it

The least-discussed failure mode and, in my experience, the one that gets the most plants stuck after the first site. The vendor's platform is tied to a specific silicon family. The first site goes live. The second site has a different camera layout, different lighting, a different power budget, and the chip-cost math changes. The vendor cannot adapt; the platform cannot run on cheaper, more efficient or more available silicon. The customer pauses the rollout to "re-evaluate." The re-evaluation never finishes.

This is also a category where the 2026 landscape has shifted fast. NVIDIA's Jetson Thor, announced last year and given JetPack 7.2 support at COMPUTEX in June, delivers 2,070 TFLOPS of FP4 inference at the edge. Hailo-8 delivers 26 TOPS at 2.5W. Intel Core Ultra Series 2 integrates CPU, GPU and NPU at up to 99 TOPS in a single fanless platform. Qualcomm Dragonwing is now powering Cognex's new In-Sight 3900 vision system. A platform that is hard-tied to one of these is going to look like the wrong choice within 24 months. A platform that runs across several, which is what we built, survives the chip cycle.
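One common way to avoid hard-tying a platform to one silicon family is to write the application against a small inference interface and choose the concrete runtime per site from configuration. The sketch below illustrates that general pattern only; it is not a description of CountAI's implementation. `InferenceBackend`, `StubBackend`, the `BACKENDS` registry and the backend names are all hypothetical, and a real backend would wrap a vendor SDK instead of returning canned results.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class InferenceBackend(Protocol):
    """What the application needs from any accelerator runtime."""
    def infer(self, frame: bytes) -> list[str]: ...


@dataclass
class StubBackend:
    """Placeholder runtime; a real one would wrap a vendor SDK."""
    label: str

    def infer(self, frame: bytes) -> list[str]:
        # Pretend every non-empty frame yields one detection.
        return [f"{self.label}:detection"] if frame else []


# Hypothetical registry: the site config names a backend, so application
# code never depends on a specific vendor runtime.
BACKENDS: dict[str, Callable[[], InferenceBackend]] = {
    "gpu_module": lambda: StubBackend("gpu_module"),
    "npu_accelerator": lambda: StubBackend("npu_accelerator"),
}


def load_backend(site_config: dict) -> InferenceBackend:
    """Build the backend named in a site's configuration."""
    key = site_config["backend"]
    if key not in BACKENDS:
        raise ValueError(f"unknown backend {key!r}; known: {sorted(BACKENDS)}")
    return BACKENDS[key]()


def inspect_frame(backend: InferenceBackend, frame: bytes) -> list[str]:
    """Application logic is written once against the interface."""
    return backend.infer(frame)


site_a = load_backend({"backend": "gpu_module"})
site_b = load_backend({"backend": "npu_accelerator"})
print(inspect_frame(site_a, b"\x00"), inspect_frame(site_b, b"\x00"))
```

With this shape, a second site with a different power budget changes one configuration value, not the inspection code.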

The 80% that did not scale rarely failed at AI. They failed at deployment economics, cloud architecture, or chip lock-in. The model worked. The system around the model did not.

Why this is suddenly the right moment to fix it

Three things converged in the first half of 2026 that make the deployment gap look closeable for the first time.

The chips finally cost the right amount. Edge inference that needed an $8,000 GPU two years ago now runs on a $200 module. Hailo-8 sits at 2.5W. Jetson Orin Nano fits inside a smart camera body. Intel Core Ultra delivers 99 TOPS in a fanless industrial form factor. The hardware no longer dominates the unit economics of deployment.

The reference customers have arrived. Siemens and NVIDIA announced in early 2026 that they are building the world's first fully AI-driven, adaptive manufacturing site, with their Electronics Factory in Erlangen, Germany as the first blueprint. Boston Dynamics' production Atlas is scheduled to go into Hyundai's Georgia plant by 2028. UK robotics company Humanoid signed a binding deployment agreement with Schaeffler in May. These announcements move physical AI from "interesting" to "the thing your board is going to ask you about next quarter."

The deployment playbook is now written. Retrofit, edge-first, chip-agnostic. Read MES and ERP through their APIs. Ingest existing camera streams via RTSP. Run inference on an on-premise edge server. Keep raw video on-site. Render the signal by role — operator alert, supervisor shift summary, plant manager cross-line, CFO variance to plan. Anyone telling you something different in 2026 is selling you the 2019 architecture.
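The "render the signal by role" step can be pictured as one stream of on-premise detections projected into four views. This is a minimal sketch under stated assumptions: the `Event` fields, the function names, the sample events and the `planned_defects` figure are all hypothetical, and a real system would feed these views from its own inference output and from MES and ERP data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    line: str     # production line identifier
    shift: str    # e.g. "day" or "night"
    defect: str   # label produced by on-premise inference


def operator_alert(event: Event) -> str:
    """Operator view: one immediate alert per event."""
    return f"[{event.line}] {event.defect} detected"


def supervisor_shift_summary(events: list[Event], shift: str) -> dict[str, int]:
    """Supervisor view: defect counts for a single shift."""
    return dict(Counter(e.defect for e in events if e.shift == shift))


def plant_manager_cross_line(events: list[Event]) -> dict[str, int]:
    """Plant manager view: event counts compared across lines."""
    return dict(Counter(e.line for e in events))


def cfo_variance_to_plan(events: list[Event], planned_defects: int) -> int:
    """CFO view: actual defect count minus the planned figure."""
    return len(events) - planned_defects


events = [
    Event("line-1", "night", "missing_cap"),
    Event("line-1", "night", "missing_cap"),
    Event("line-2", "day", "label_skew"),
]
print(operator_alert(events[0]))
print(supervisor_shift_summary(events, "night"))
print(plant_manager_cross_line(events))
print(cfo_variance_to_plan(events, planned_defects=5))
```

Note that only derived events appear here; raw video never leaves the edge server in this picture, which is the point of keeping inference on-premise.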

Deloitte's State of Generative AI in the Enterprise work, looking at the same audience, found that 58% of business leaders are already using physical AI in some capacity, a figure that rises to 80% once plans for the next two years are counted. The deployment wave is coming. The scaling wave is the one that is open.

Want a 30-minute call with the founder of a company actually in the 20%?

If your operation is somewhere in the 61% — pilot running, scaling stuck — I will spend 30 minutes with you walking through what blocked the scale-up and what to ask of any vendor before signing the next contract. No deck. No salesperson. Just the conversation. If at the end you want to evaluate CountAI for one of your lines, we move to that. If not, you walk away with a clearer playbook for whichever vendor you do choose.

Email Harsha directly →

Goes to my inbox. I usually reply the same day.

What I would test before you scale

If you are running a physical AI pilot today and the board is asking how you scale it, three questions are worth asking yourself before you commit the capital.

Have you run for a full quarter on the chip you are going to scale on? Not the lab chip. The production chip, in the production thermal envelope, with the production camera mix. If the answer is no, your scale-up is going to surface failure modes you have not budgeted for.

Has the system run a full night shift without the engineering team in the building? If your operations team cannot recover from the inevitable failure mode at 3am without somebody from the vendor or the data team on call, you have a pilot, not a scaled system.

Has the model retrained itself, or been retrained, since the day it went live? If the answer is no, you have not yet hit the real durability test. Production conditions drift. Models drift with them. Scaled means the platform handled the drift — not that the original demo is still running.
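One simple way to notice the drift described in the third question is to compare recent model confidence against a baseline captured at go-live. The sketch below is an illustration only, not a recommended production method. Using mean confidence as a drift proxy, the threshold of 2.0 baseline standard deviations and the sample numbers are all assumptions; real monitoring would typically also look at input statistics and labeled spot checks.

```python
from statistics import mean, pstdev


def drift_score(baseline: list[float], recent: list[float]) -> float:
    """Shift of the recent mean from the baseline mean, in baseline standard deviations."""
    spread = pstdev(baseline)
    if spread == 0:
        # A constant baseline: any change at all counts as unbounded drift.
        return 0.0 if mean(recent) == mean(baseline) else float("inf")
    return abs(mean(recent) - mean(baseline)) / spread


def needs_retraining_review(baseline: list[float], recent: list[float],
                            threshold: float = 2.0) -> bool:
    """Flag the model for review when the shift exceeds the assumed threshold."""
    return drift_score(baseline, recent) > threshold


# Hypothetical per-day mean detection confidences.
go_live = [0.91, 0.93, 0.92, 0.94, 0.90]
this_week = [0.78, 0.81, 0.80, 0.79, 0.82]
print(needs_retraining_review(go_live, this_week))
```

A check like this does not retrain anything by itself; it only tells the operations team that production conditions have moved away from what the model saw on day one.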

The factories that get this right inside the next 24 months are the ones that move from the 61% to the 20%. The ones that don't are going to spend the next decade explaining to their board why physical AI never quite delivered.

Frequently asked questions

What does "scaled" physical AI actually look like?

Multiple lines, multiple shifts, multiple sites, multiple years, running against real customer SLAs without the original engineering team in the room. A pilot on one line for three months is not scaled. Several hundred machines across multiple countries, running continuously for years, is.

Why does Cisco's 2026 report say only 20% of industrial AI deployments scaled?

The three structural reasons I keep seeing: the rip-and-replace deployment economics never survived CFO review; cloud-first architectures never survived factory IT review; and platforms hard-tied to one chip family never survived the second-site cost-per-camera math. The 20% built around all three constraints from the start.

How does a manufacturer move from the 61% to the 20%?

Pick a vendor whose architecture is retrofit-first, edge-first, and chip-agnostic. Deploy one line. Prove the outcome. Run it through three shifts and 60-90 days of production conditions. Then — and only then — expand. The 20% did not get there by going fast. They got there by going stable.

Which chips are competitive for industrial edge AI in 2026?

NVIDIA Jetson Thor for the high end (2,070 FP4 TFLOPS, 128GB RAM, but 40-130W power). Intel Core Ultra Series 2 for fanless industrial form factors (up to 99 platform TOPS). Hailo-8 / 10H for power-efficient embedded vision (Hailo-8: 26 TOPS at 2.5W). Qualcomm Dragonwing as a new entrant via Cognex's In-Sight 3900. A serious platform runs across several of these so the chip choice can follow the use case, not the other way around.

How long has CountAI been running physical AI in production?

Seven years. Today we are running across more than 4,500 industrial cameras and 500-plus production machines in seven countries. Some of those deployments have been live continuously for more than three years.

Sources cited: Cisco 2026 State of Industrial AI Report; Deloitte State of Generative AI in the Enterprise (2026); NVIDIA Jetson Thor announcement and JetPack 7.2 (COMPUTEX June 2026); Siemens-NVIDIA Erlangen partnership announcement; Cognex In-Sight 3900 launch (May 2026); Hailo-8 / Hailo-10H product specifications; Intel Core Ultra Series 2 platform specifications.
