The hype around artificial intelligence often centers on large language models (LLMs) and generative AI (genAI), but the reality of deploying AI in industrial settings tells a very different story. In a recent conversation with Nikita Golovko, an AI architect at Siemens, we explored why LLMs fail in manufacturing environments and what types of AI actually work when stakes are high.
The Fundamental Problem with LLMs in Industry
Industrial automation demands absolute determinism. A programmable logic controller must make concrete decisions: stop the production line or don’t. It cannot say “probably stop.” This is where LLMs fundamentally fail: these models are probabilistic by nature — the same input can produce different outputs, and results lack reproducibility and transparency.
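The contrast can be sketched in a few lines. A minimal, PLC-style rule (the safety limits below are hypothetical, chosen only for illustration) always maps the same inputs to the same decision — there is no sampling step that could yield “probably stop”:

```python
def stop_line(temperature_c: float, vibration_mm_s: float) -> bool:
    """Deterministic PLC-style rule: identical inputs always produce
    the identical decision. Limits are illustrative, not real specs."""
    TEMP_LIMIT_C = 90.0
    VIBRATION_LIMIT_MM_S = 7.1
    return temperature_c > TEMP_LIMIT_C or vibration_mm_s > VIBRATION_LIMIT_MM_S
```

Running this twice on the same sensor readings is guaranteed to give the same answer — the property an LLM cannot offer.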
The “confidence illusion” compounds this problem. When an AI system reports 94% confidence in detecting a defect, this doesn’t mean the answer is correct. Confidence is merely the model’s belief in its own output, not a guarantee of accuracy. Additionally, model drift — where live data gradually diverges from the data the model was trained on while the deployed system remains unchanged — creates silent failures. The system keeps operating without alerts, quietly producing degraded results under the hood.
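Catching drift before it becomes a silent failure means watching the inputs, not just the outputs. A crude sketch, assuming a stored baseline of training-time feature values: flag an alert when a live batch drifts far from the baseline mean (production systems typically use stronger tests such as Kolmogorov–Smirnov or the population stability index):

```python
import statistics

def drift_alert(baseline: list[float], live: list[float], z_limit: float = 3.0) -> bool:
    """Flag drift when the live batch mean falls far outside the
    baseline distribution (a crude z-score check for illustration)."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    z = abs(statistics.mean(live) - mu) / sigma
    return z > z_limit
```

The point is architectural: the check runs continuously alongside the model, so a stale model raises an alert instead of degrading in silence.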
What Actually Works: Classical ML and Computer Vision
Rather than LLMs, industrial applications succeed with classical machine learning models like decision trees and random forests. These models offer interpretability — you can trace the decision logic. They’re also deterministic and reproducible, meeting industrial reliability requirements.
Computer vision models, another critical category, excel at analyzing images for object detection, classification, and segmentation. These systems can identify defects better than humans in many cases, yet humans remain essential for diagnosis and decision-making.
Decision trees prove particularly valuable in regulated environments. Banks use them for loan approval decisions, satisfying GDPR Article 22 requirements for explainability. When a decision affects humans, transparency matters legally and ethically.
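What “traceable decision logic” looks like can be shown with a hand-rolled tree (the thresholds and features here are invented for illustration, not real lending criteria): every branch taken is recorded, so the full decision path can be handed to a regulator or customer.

```python
def approve_loan(income: float, debt_ratio: float) -> tuple[bool, list[str]]:
    """A tiny decision tree that records each branch it takes,
    making the decision fully explainable after the fact."""
    trace = []
    if debt_ratio > 0.4:
        trace.append("debt_ratio > 0.4 -> reject")
        return False, trace
    trace.append("debt_ratio <= 0.4")
    if income >= 30_000:
        trace.append("income >= 30000 -> approve")
        return True, trace
    trace.append("income < 30000 -> reject")
    return False, trace
```

The same property — a path of explicit threshold comparisons — is what libraries expose for trained trees, and it is exactly what a probabilistic LLM cannot provide.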
A Hybrid Approach: Root Cause Analysis
Nikita demonstrates how generative AI can augment classical systems without replacing them. In their manufacturing platform, computer vision models detect defects and classify them. This structured data travels to the cloud where an LLM summarizes relevant documentation to suggest root causes. The LLM isn’t making critical decisions — it’s synthesizing existing knowledge to assist human operators.
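The division of labor can be sketched as follows. The structure and names below are illustrative (the `summarize` callable stands in for whatever LLM provider is plugged in): the CV stage has already produced a structured defect record, and the LLM only condenses existing documentation around it.

```python
from dataclasses import dataclass

@dataclass
class Defect:
    kind: str          # e.g. "scratch" -- produced upstream by the CV model
    station: str
    confidence: float

def suggest_root_causes(defect: Defect, docs: dict[str, str], summarize) -> str:
    """The LLM (injected as `summarize`) never decides whether a defect
    exists; it only synthesizes documentation for the human operator."""
    context = docs.get(defect.kind, "no documentation found")
    prompt = (f"Defect '{defect.kind}' at {defect.station} "
              f"(confidence {defect.confidence:.0%}). Docs: {context}")
    return summarize(prompt)
```

Because the LLM sits behind a plain function boundary, a wrong or unavailable summary degrades the operator’s convenience, not the detection pipeline.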
This approach acknowledges LLMs’ actual strength: summarizing and explaining existing information, not generating new knowledge or making autonomous decisions.
Architectural Defense: The Simplex Model
To integrate AI safely, Nikita advocates the simplex architecture: three components working in concert. An AI model provides analysis, deterministic business logic offers a fallback, and a monitoring system watches for problems. When confidence scores drop or input distributions shift unexpectedly, the system switches to deterministic logic or human operators.
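A minimal sketch of that switch, with a single confidence threshold standing in for the fuller monitoring (input-distribution checks, drift alerts) described above:

```python
def simplex_decide(features, ai_model, fallback_rule, min_confidence=0.9):
    """Simplex pattern: trust the AI verdict only while the monitor
    is satisfied; otherwise route to deterministic business logic."""
    verdict, confidence = ai_model(features)
    if confidence >= min_confidence:
        return verdict, "ai"
    return fallback_rule(features), "fallback"
```

The second return value records which path produced the decision — useful for auditing how often the system actually falls back.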
This requires AI gateways — specialized middleware that evaluates model metadata and routes decisions accordingly. Circuit breakers, familiar from resilience patterns, can stop using a failing model and restore service through alternatives.
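A bare-bones circuit breaker for a model endpoint might look like this (a sketch: real implementations add a half-open state and a timeout before retrying the primary):

```python
class CircuitBreaker:
    """Trip after `max_failures` consecutive model errors and route
    all calls to the alternative until the model succeeds again."""
    def __init__(self, model, alternative, max_failures=3):
        self.model, self.alternative = model, alternative
        self.max_failures, self.failures = max_failures, 0

    def __call__(self, x):
        if self.failures >= self.max_failures:
            return self.alternative(x)      # breaker open: skip the model
        try:
            result = self.model(x)
            self.failures = 0               # success closes the breaker
            return result
        except Exception:
            self.failures += 1
            return self.alternative(x)
```

Wrapped this way, a failing model degrades service gracefully instead of taking the whole decision path down with it.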
Documentation and Governance
An arc42 extension for AI systems provides comprehensive documentation covering data sources, model behavior, runtime deployment, and risks. This framework, aligned with the upcoming EU AI Act requirements, ensures transparency throughout the model lifecycle.
When external vendors ship new versions of their hosted LLMs every month, local deployment becomes essential. Using open-source models with hexagonal architecture principles allows swapping implementations without disrupting business logic.
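The hexagonal idea reduces to a port the business logic depends on and adapters behind it. A minimal sketch (all names here are hypothetical, and the local call is stubbed):

```python
from typing import Protocol

class Summarizer(Protocol):
    """The 'port': business logic sees only this interface."""
    def summarize(self, text: str) -> str: ...

class LocalModelAdapter:
    """Adapter for a locally deployed open-source model (inference stubbed)."""
    def summarize(self, text: str) -> str:
        return text[:40]   # placeholder for a real local inference call

def root_cause_report(summarizer: Summarizer, log: str) -> str:
    # Depends only on the port, never on a vendor SDK, so adapters
    # can be swapped without touching this function.
    return "Root cause summary: " + summarizer.summarize(log)
```

Replacing a cloud vendor with a local model then means writing one new adapter; `root_cause_report` and everything above it stay untouched.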
Conclusion
AI succeeds in industrial settings not through cutting-edge LLMs, but through thoughtful application of proven ML techniques combined with robust architectural patterns. The key insight: treat AI as a specialized component with known limitations, isolate it from critical control loops, monitor its behavior constantly, and maintain human oversight. These aren’t constraints unique to manufacturing—they represent best practices all organizations should adopt. Industrial applications simply demand them from day one.