What cannabis LPs need to know about the 2025 draft governing AI in batch release, QC, and defect detection.
If anyone on your team pastes deviations into ChatGPT, or if your LP is evaluating a cannabis AI vendor wrapping GPT, Claude, or Gemini for batch review, stop and read EU GMP Annex 22. This new 6-page annex, published for consultation on 7 July 2025 alongside the Annex 11 and Chapter 4 revisions, sets the scope boundary for AI in GMP. It excludes generative AI, large language models (LLMs), and continuously-learning models. EU GMP Annex 22 applies only to static, deterministic models with direct impact on patient safety, product quality, or data integrity.
This article covers what EU GMP Annex 22 requires and disallows before 2027-2028 enforcement. See also the broader 2025 EU GMP draft package.
What EU GMP Annex 22 actually says
EU GMP Annex 22 is a new 6-page annex with ten sections plus a glossary. It extends (not replaces) Annex 11, applying to “computerised systems … where Artificial Intelligence models are used in critical applications with direct impact on patient safety, product quality or data integrity” (§1).
What it means for your LP: any AI in batch release, deviation triage, QC classification, defect detection, yield prediction, or environmental-anomaly detection now has a 6-page GMP annex attached. Consultation closed 7 October 2025; enforcement expected 2027-2028 per PIC/S commentary. Chapter 4 §4.24: accountability for data processed with AI “rests with the regulated user.” Liability cannot shift to the vendor.
How GrowerIQ covers it: GrowerIQ is the Annex 11 system of record for the batch, lot, and deviation data an LP’s AI tools may read from or hand results into via human approval. We do not embed AI inside critical-GMP workflows, so customers do not inherit an AI vendor’s §3-§10 liability through us. GrowerIQ’s own generative-AI features (AI Form Builder, AI Report Builder, AI Task Planner, AI Label Builder) run only at design time to draft a template, which a human reviews and locks before daily operations run it deterministically. No AI sits in the runtime path.
Red flags:
- Vendor cannot say whether EU GMP Annex 22 applies to their product.
- Contract transfers accountability to the AI vendor: Chapter 4 §4.24 violation.
- No gap analysis six months after consultation close.
What counts as a static, deterministic AI model?
Under Annex 22 §1, a compliant model must be frozen at deployment (no learning after release) and return identical outputs for identical inputs. Every other class is out of scope or disallowed in critical GMP.
| Model type | Behaviour | In scope? | Use under EU GMP Annex 22 |
|---|---|---|---|
| Static + deterministic | Frozen weights; identical outputs for identical inputs | Yes | Allowed in critical GMP with full §2-§10 documentation |
| Dynamic / continuously-learning | Adapts during use | No | “Should not be used in critical GMP applications” (§1) |
| Probabilistic output | Same inputs may return different outputs | No | “Should not be used in critical GMP applications” (§1) |
| Generative AI / LLMs | GPT, Claude, Gemini class | No | Disallowed in critical GMP; non-critical HITL only (§1) |
Before buying any AI tool for a GMP-critical workflow, confirm in writing: are weights frozen at deployment, and do identical inputs return identical outputs? A "no" to either question disallows the tool under §1.
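If the vendor will not answer the second question, a QA team can script the check itself. The sketch below is a minimal illustration, not a validation protocol: `predict_fn` stands in for whatever prediction call the vendor exposes (a hypothetical interface), and it simply confirms that a frozen challenge set returns byte-identical outputs on repeated runs.

```python
import hashlib
import json

def output_fingerprint(result) -> str:
    """Hash a model output so two runs can be compared byte-for-byte."""
    return hashlib.sha256(json.dumps(result, sort_keys=True).encode()).hexdigest()

def check_determinism(predict_fn, challenge_inputs, runs: int = 3) -> bool:
    """Return True only if every challenge input yields an identical output on every run.

    predict_fn       -- the vendor's prediction call (hypothetical interface)
    challenge_inputs -- a frozen, version-controlled set of test inputs
    """
    for sample in challenge_inputs:
        fingerprints = {output_fingerprint(predict_fn(sample)) for _ in range(runs)}
        if len(fingerprints) != 1:
            return False  # §1: identical inputs must return identical outputs
    return True
```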
How GrowerIQ covers it: seed-to-sale calculations (yield, potency normalisation, loss reconciliation) in GrowerIQ are rule-based and deterministic, out of scope of Annex 22. GrowerIQ does not ingest or execute customer ML models in the batch-release path, so §10.2 artefact-hashing and input-snapshot obligations remain with the AI tool the customer chose.
Red flags:
- Vendor cannot confirm weights are frozen at deployment: §1 violation.
- Model returns different outputs for identical inputs: §1 violation.
- Vendor cannot categorise their product as static, dynamic, or generative.
Can I use ChatGPT for batch review?
No. Generative AI and LLMs are excluded from critical GMP under EU GMP Annex 22 §1. ChatGPT, Claude, and Gemini cannot sit in the batch-release, deviation-classification, or QC-decisioning path. In non-critical workflows they are allowed only with a trained human-in-the-loop (HITL) signing off every output.
LLMs “should not be used” in critical GMP under EU GMP Annex 22. Non-critical use is conditional on “personnel with adequate qualification and training … ensuring that the outputs from such models are suitable for the intended use, i.e. a human-in-the-loop (HITL)” (§1).
Disallowed in critical path (examples): LLM-driven batch release, deviation classification, change-control impact assessment, QP release recommendations, audit-trail review. Any use that directly influences product quality, patient safety, or data integrity without a qualified human in the loop falls under the §1 exclusion.
Allowed non-critical (HITL per output, examples only, not exhaustive): meeting notes, first-pass supplier-document translation, precedent-deviation suggestions, first-draft SOPs that go through full review. Any non-critical use where a qualified human reviews every output before it influences a GMP record is workable.
How GrowerIQ covers it: GrowerIQ’s generative-AI features (Form Builder, Report Builder, Task Planner, Label Builder) draft templates at design time. Those drafts go through normal review, edit, and sign-off before the template is locked. Once locked, the template runs deterministically; no LLM output reaches the batch record at runtime. The Annex 11 §12-§13 audit trail and e-signature bind the human who approved the template, not an ongoing LLM output stream.
Red flags:
- A QA analyst files an LLM-drafted deviation without line-by-line human edit: §1 HITL breach.
- LLM-generated SOPs enter the PQS without named approval per output: §1 plus Chapter 4 §4.24.
- A cannabis AI vendor’s product is an LLM wrapper positioned for batch review: §1 violation at the scope boundary.
Yield prediction, defect detection, QC classification
You need an SME-authored intended-use document approved before testing, a fully-characterised input sample space including rare variations, and documentation reviewable by the regulated user whether the model is in-house or vendor-supplied.
§3.1 requires the intended-use description “documented and approved before the start of acceptance testing”, authored by a process SME. §3.2 requires subgroup stratification. §2.2 states documentation “should be available and reviewed by the regulated user irrespective of whether a model is trained, validated and tested in-house or … provided by a supplier.” §2.1 demands multidisciplinary sign-off.
Cannabis scenarios:
- Trim defect classifier: subgroups by defect (stem, seed, mould, under/over-dried), strain, lighting, growth stage.
- Yield prediction: by room, strain, clone generation, veg/flower stage, irrigation.
- Environmental anomaly detection: by room, season, equipment vendor.
- QC classification (potency, terpene, water activity): by analyte, lab, instrument.
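A minimal sketch of what that subgroup characterisation can look like, assuming labelled QC records have been exported to a CSV. The column names (`strain`, `room`, `growth_stage`, `defect_type`, `human_label`, `model_label`) and the 30-sample floor are illustrative assumptions, not a prescribed schema or threshold.

```python
import pandas as pd

# Hypothetical export of labelled QC records; column names are assumptions,
# not a GrowerIQ report schema.
records = pd.read_csv("trim_qc_labelled_export.csv")
records["model_matches_human"] = records["model_label"] == records["human_label"]

# §3.2: characterise the sample space by subgroup before acceptance testing.
subgroup_cols = ["strain", "room", "growth_stage", "defect_type"]
coverage = (
    records.groupby(subgroup_cols)
    .agg(n_samples=("human_label", "size"),
         model_agreement=("model_matches_human", "mean"))
    .reset_index()
)

# Flag subgroups too thin to support the intended-use claim.
thin = coverage[coverage["n_samples"] < 30]
print(coverage.sort_values("n_samples"))
print(f"{len(thin)} subgroups fall below the illustrative 30-sample floor")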
Where your AI features meet Annex 22
GrowerIQ is the cultivation and manufacturing system of record. Its audit trail, e-signatures, and change control satisfy Annex 11 §12-§13 for the records a customer’s separate AI tool may write into. The AI model, version lock, feature-attribution artefacts, and drift monitoring live with the AI vendor, not with GrowerIQ.
How GrowerIQ covers it: GrowerIQ captures the subgroup metadata Annex 22 §3.2 expects, including strain, room, equipment, batch stage, and operator, on every lot and activity record. Customers building the §3.1 intended-use documentation for their AI tool can extract this data through GrowerIQ’s standard reporting to characterise the training sample space.
Red flags:
- Intended-use document written by the vendor’s data scientist, not a process SME: §3.1.
- Sample space ignores rare variations (outdoor flower, summer humidity, strain-specific defects): §3.1.
- Vendor says “our documentation is proprietary, trust us”: §2.2.
Explainability: SHAP, LIME, and heatmaps
Under EU GMP Annex 22 §8.1, any AI model used in critical GMP must capture the features behind each decision. SHAP, LIME, and heatmaps are named examples. “It is a neural network, we cannot explain it” is a §8.1 violation.
§8.2 requires that “a review of these features should be part of the process for approval of test results.” When an inspector asks “show me the features the model used to reject this batch,” the LP must produce an artefact tied to that record.
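What producing that artefact can look like, as a hedged sketch: it assumes a tree-based scikit-learn-style classifier and the open-source `shap` package, and writes one JSON attribution file per decision that can then be attached to the batch record. In practice the AI vendor generates this artefact; the sketch only shows that a per-decision, retrievable record of the features is feasible.

```python
import json
import shap  # open-source explainability package; one of the techniques §8.1 names

def explain_decision(model, feature_row, batch_id, decision):
    """Write a per-decision feature-attribution artefact (§8.1) that can be
    attached to the batch record and reviewed at result approval (§8.2).

    feature_row -- a one-row pandas DataFrame for the decision being explained.
    """
    explainer = shap.Explainer(model)      # dispatches to TreeExplainer for tree models
    attributions = explainer(feature_row)  # attributions for this single decision
    artefact = {
        "batch_id": batch_id,
        "decision": decision,
        "feature_attributions": dict(
            zip(feature_row.columns, attributions.values[0].tolist())
        ),
    }
    with open(f"shap_{batch_id}.json", "w") as fh:
        json.dump(artefact, fh, indent=2)
    return artefact
```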
How GrowerIQ covers it: a customer can attach a SHAP, LIME, or heatmap artefact from their AI vendor to the relevant batch record using GrowerIQ’s standard attachment and audit-trail framework. The artefact is produced by the AI vendor; GrowerIQ preserves the version-controlled association to the batch so an inspector can retrieve it on demand. This layers cleanly on top of ALCOA++ data integrity for cannabis manufacturing.
Red flags:
- AI cannot produce SHAP, LIME, or heatmap for a batch-reject decision: §8.1.
- Vendor says “the model is proprietary, we cannot explain it”: §8.1 plus §8.2.
- Artefacts exist but are not linked to the specific batch record: §8.1 plus Annex 11 §12 gap.
Acceptance criteria and the “no decrease” trap
EU GMP Annex 22 §4 requires SME-signed acceptance criteria (confusion matrix, sensitivity, specificity, F1) set before testing begins, and the AI must perform at least as well as the process it replaces. §4.2 requires criteria “documented and approved before the start of acceptance testing.” §4.3 is the “no decrease” clause: “The acceptance criteria of a model, should be at least as high as the performance of the process it replaces.”
Note: §4.3 cross-references “Annex 11 2.7”, but §2.7 is Security. The matching principle is Annex 11 §2.8, No risk increase. Expect correction in the final annex; consultants citing “§2.7” have copied the draft’s typo.
Two traps: the SME signs off the criteria before testing begins (no post-rationalisation), and you cannot install an AI classifier unless you know how well the manual process performs. Replacing two-person visual trim-QC with a camera classifier requires the humans' baseline reject, false-reject, and false-pass rates, gathered by a documented method.
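A minimal sketch of the §4.3 comparison, assuming binary accept/reject labels and scikit-learn; the baseline figures are placeholders standing in for the documented manual-QC study, not real data.

```python
from sklearn.metrics import confusion_matrix, recall_score, f1_score

# Manual-process baseline from the documented study (illustrative placeholders).
BASELINE = {"sensitivity": 0.92, "specificity": 0.88, "f1": 0.90}

def acceptance_check(y_true, y_pred):
    """§4.2/§4.3: score the model on the independent test set and compare it
    against the pre-approved criteria, which may not sit below the incumbent
    manual process."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    results = {
        "sensitivity": recall_score(y_true, y_pred),  # true-positive rate
        "specificity": tn / (tn + fp),                # true-negative rate
        "f1": f1_score(y_true, y_pred),
    }
    failures = {k: v for k, v in results.items() if v < BASELINE[k]}
    return results, failures  # anything in `failures` breaches the §4.3 floor
```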
How GrowerIQ covers it: GrowerIQ retains the historical reject, pass, and deviation records an LP needs to characterise the incumbent manual process. Customers extract this data via standard reports to compute the §4.3 baseline performance against which the replacing AI must be measured.
Red flags:
- Acceptance criteria set after testing concluded: §4.2.
- No baseline performance for the manual process: §4.3 plus Annex 11 §2.8.
- Baseline is a gut-feel estimate, not a documented study.
Test-data independence and the 4-eyes principle
EU GMP Annex 22 §6 requires controls ensuring the people who trained a model have not accessed the test data. For small cannabis LPs with one data scientist, §6.5’s 4-eyes principle (pair with a colleague who never touched the test set) is the realistic path.
Clauses: §5.1 (stratified full-sample-space test data), §5.3 (label verification via independent experts or validated equipment), §5.6 (generative-AI test data “is not recommended and any use hereof should be fully justified”), §6.1-§6.2 (access control plus audit trail on test data), §6.4 (physical objects in final test must not have been used in training), §6.5 (4-eyes pairing). §6.4 matters for cannabis: dried-flower samples in final acceptance must not be the same physical units used to train the model.
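One way to satisfy §6.4 and §5.1 together is to split at the level of the physical sample rather than the image. The sketch below assumes a labelled-image export with a `physical_sample_id` column (a hypothetical schema) and is illustrative only.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical export: one row per labelled image, each tied to a physical flower sample.
images = pd.read_csv("labelled_trim_images.csv")

# Split at the level of the physical sample, not the image, so no unit used in
# training can reappear in final acceptance testing (§6.4).
units = images.drop_duplicates("physical_sample_id")[["physical_sample_id", "defect_type"]]
train_ids, test_ids = train_test_split(
    units["physical_sample_id"],
    test_size=0.2,
    stratify=units["defect_type"],  # §5.1: keep the hold-out stratified across the sample space
    random_state=22,
)

train_df = images[images["physical_sample_id"].isin(train_ids)]
test_df = images[images["physical_sample_id"].isin(test_ids)]
assert not set(train_df["physical_sample_id"]) & set(test_df["physical_sample_id"])

# §6.1-§6.2: the test set then lives in a separately access-controlled, audit-trailed
# location the model's trainer cannot read; §6.5 pairs them with a colleague who can.
test_df.to_csv("restricted/test_set_v1.csv", index=False)
```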
How GrowerIQ covers it: when an LP’s AI training corpus comes from GrowerIQ data exports, role-based access governs who may pull data and the audit trail records the export events. The LP is still responsible for building the separate controlled test-data repository Annex 22 §6.1-§6.5 requires; GrowerIQ is the source-data side of that pipeline, not the test-data substrate itself.
Red flags:
- Same data scientist trained and test-validated with no 4-eyes pairing recorded: §6.5.
- Test data generated by ChatGPT, Claude, or another LLM: §5.6.
- Same physical samples used in training re-used for final acceptance: §6.4.
Confidence scores and operation discipline
Every AI prediction in critical GMP must carry its confidence score (§9.1). Below the SME-signed threshold, the model should flag “undecided” and route to a human (§9.2). Post-deployment, EU GMP Annex 22 §10 mandates change control on the model, hosting system, upstream process, and physical inputs (lens, lighting, sensor firmware), with a documented retest each time (§10.1); configuration control via hash, checksum, or signature (§10.2); performance-metric monitoring (§10.3); input-drift monitoring (§10.4); and HITL record retention (§10.5).
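A compressed sketch of three of those obligations, with illustrative values and interfaces; the 0.90 threshold and the single-feature drift check are assumptions, not recommendations.

```python
import hashlib

CONFIDENCE_FLOOR = 0.90  # SME-approved threshold (illustrative value)

def route_prediction(label: str, confidence: float) -> str:
    """§9.1/§9.2: every prediction carries its confidence; below the approved
    threshold the result becomes 'undecided' and goes to a qualified human."""
    return label if confidence >= CONFIDENCE_FLOOR else "undecided"

def model_fingerprint(artefact_path: str) -> str:
    """§10.2: hash the deployed model artefact so any change is detectable
    under change control."""
    with open(artefact_path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()

def input_drift(reference_mean: float, recent_values: list[float], tolerance: float) -> bool:
    """§10.4: a deliberately simple drift flag on one input feature; real
    monitoring would cover the full feature set with an agreed statistical test."""
    recent_mean = sum(recent_values) / len(recent_values)
    return abs(recent_mean - reference_mean) > tolerance
```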
How GrowerIQ covers it: GrowerIQ provides the records substrate beneath a customer-run model. If a human logs the model’s output into GrowerIQ by approving or rejecting a batch decision, the e-signature, timestamp, and reason are captured per Annex 11 §12-§13. Model version control, artefact hashing, confidence thresholds, and drift monitoring are obligations that sit with the AI tool the customer runs, not with GrowerIQ. GrowerIQ does not host, execute, or version customer ML models.
Red flags:
- AI returns a decision label but no confidence score: §9.1.
- No “undecided” state: §9.2.
- Deployed model artefact has no hash or version lock: §10.2.
- No drift check since go-live: §10.4.
- HITL approvals not individually recorded: §10.5 plus Annex 11 §12.2.
EU GMP Annex 22 extends (not replaces) Annex 11: the parent annex’s access control, audit trail, electronic signatures, periodic review, and security clauses all apply to the system hosting the AI. For the full picture, see our companion piece on Annex 11 computerised systems.
Frequently asked questions
Can I use ChatGPT, Claude, or Gemini for batch review?
No, not in the critical path. EU GMP Annex 22 §1 excludes generative AI and LLMs from critical GMP. Non-critical use requires a trained HITL owning every output.
My AI vendor says the model is “always learning.” Is that compliant?
No. §1 rules out dynamic, continuously-learning models in critical GMP. Ask the vendor to confirm in writing that weights are frozen at deployment.
Does EU GMP Annex 22 require SHAP specifically?
§8.1 names SHAP, LIME, and heatmaps as examples. Any feature-attribution or visual-explainability technique works, as long as the features behind each decision are captured and reviewable.
We are small: one data scientist. How do we do test-data independence?
§6.5’s 4-eyes principle: pair the data scientist with a colleague who never accessed the test data, procedurally enforced and audit-trailed.
Who owns AI accountability?
You do. Chapter 4 §4.24: accountability for data processed with AI rests with the regulated user. §2.2 adds that vendor-supplied models do not reduce the LP’s burden.
Does GrowerIQ ship AI that brings my LP under EU GMP Annex 22?
No. GrowerIQ’s AI features are confined to non-critical, design-time template authoring: AI Form Builder, AI Report Builder, AI Task Planner, and AI Label Builder use generative AI to draft a form, report, task plan, or label template, which a human then reviews, edits, and locks. Once locked, the template runs deterministically in daily operations, with no AI in the runtime path. GMP-critical calculations (yield, potency, loss, inventory math, batch release) are all rule-based and deterministic. GrowerIQ does not execute, host, or version customer ML models, so Annex 22 §3 to §10 obligations stay with whichever AI vendor an LP chooses to bring.
Last updated: April 2026
Start your EU GMP Annex 22 gap analysis now
EU GMP Annex 22 is the most consequential piece of the 2025 EU GMP draft for any cannabis LP buying AI tools, running shadow-IT LLMs, or positioning for EU entry. Run your gap analysis against §1 scope, §4 acceptance criteria, §8 explainability, and §10 operations. See how GrowerIQ’s audit trail, change control, and configuration management support LPs running AI.