The GHG inventory under the GHG Protocol is a rule engine that's been stable since 2004: agreed system boundaries, defined emission factors, deterministic calculation. That's exactly the shape of work at which today's frontier models peak. This assessment breaks the workflow into eight sub-tasks and runs each one through the full methodology (Stages 0–4). Two tasks land in the Expert band — a first for the methodology.
Per-sub-task assessment
Stage 0 — operationalisation per sub-task
Before the assessment runs, every sub-task gets a concrete setup: which inputs feed it, which rubric structures the work, which tools come in, what shape the output takes, what gets reviewed and how.
For emission-factor assignment, for example: a versioned EF library (DEFRA, ecoinvent, BAFA) plus a mapping table for fuels and energy carriers. The LLM proposes factors; the output is an EF-assigned dataset with source and version per entry — the foundation for the calculation that follows.
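The setup described above can be sketched in a few lines of Python. This is a minimal illustration, not the assessment's actual tooling: the names `EmissionFactor`, `EF_LIBRARY`, `CARRIER_MAP`, and `assign_factor` are hypothetical, and the factor values and version strings are placeholders rather than published DEFRA figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmissionFactor:
    carrier: str             # normalised energy carrier
    unit: str                # activity unit the factor applies to
    kg_co2e_per_unit: float  # factor value
    source: str              # library the value comes from
    version: str             # library release, logged per entry


# Placeholder values for illustration only, NOT real published factors.
EF_LIBRARY = {
    ("natural_gas", "kWh"): EmissionFactor(
        "natural_gas", "kWh", 0.2, "DEFRA", "placeholder-v1"),
    ("diesel", "litre"): EmissionFactor(
        "diesel", "litre", 2.5, "DEFRA", "placeholder-v1"),
}

# Hypothetical mapping table: raw ERP labels -> normalised carrier names.
CARRIER_MAP = {
    "Erdgas": "natural_gas",
    "Natural gas": "natural_gas",
    "Diesel": "diesel",
}


def assign_factor(record: dict) -> dict:
    """Attach a factor with source and version, or flag the row for review."""
    carrier = CARRIER_MAP.get(record["label"])
    ef = EF_LIBRARY.get((carrier, record["unit"]))
    if ef is None:
        # No silent guessing: unmapped rows go to the human reviewer.
        return {**record, "ef": None, "status": "needs_review"}
    return {
        **record,
        "ef": ef.kg_co2e_per_unit,
        "ef_source": ef.source,
        "ef_version": ef.version,
        "status": "proposed",
    }
```

In this sketch the LLM's role would be to propose entries for the mapping table; the lookup itself stays deterministic, and every assigned row carries its source and version.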
Each of the eight sub-tasks gets the same treatment before scoring. Skip it — intuitive AI use, no defined setup — and the savings drop sharply while the error risk climbs. The Expert verdict for EF assignment and calculation, in particular, depends on doing Stage 0 properly.
Model capabilities — Stage 2
Scored against today's frontier models (Claude / GPT / Gemini, 2026).
Strong:
- Rule lookup and deterministic calculation (D07=3 on T3/T4). EF assignment and Σ activity·EF are straightforward tool-use tasks for frontier models.
- Tables and structure (D20b: vision on standard PDFs is solid, table reasoning is robust). ERP exports and EF tables come through cleanly.
- Context window (D10b). 200k+ tokens is plenty for multi-site datasets plus methodology docs in one run.
Weak:
- Calibration / metacognition (D08). The OECD AI Capability Indicators (ACI) put frontier models at L2 of 5. For T6 and T7, that means model claims about data quality and year-over-year (YoY) drift need human verification.
- Reporting layer (T8). The Revised ESRS 2026 is out of distribution. D17b=0 under ISSA 5000 limited assurance. The model alone won't carry T8.
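The deterministic calculation listed among the strengths above, Σ activity·EF, can be sketched as follows. The row layout (`scope`, `amount`, `ef`) and the function name `total_emissions` are assumptions made for this illustration; factors are taken to be in kg CO2e per activity unit.

```python
from collections import defaultdict


def total_emissions(rows):
    """Sum activity * EF per scope; returns tonnes CO2e keyed by scope.

    Each row is assumed to carry: scope (int), amount (activity data),
    and ef (kg CO2e per activity unit).
    """
    totals = defaultdict(float)
    for row in rows:
        if row["ef"] is None:
            # A row without an assigned factor must not be dropped silently.
            raise ValueError(f"row without emission factor: {row}")
        totals[row["scope"]] += row["amount"] * row["ef"] / 1000.0  # kg -> t
    return dict(totals)
```

Raising on a missing factor, rather than skipping the row, keeps gaps in EF assignment visible before anything reaches the inventory total.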
Deployment readiness — Stage 3
The default reflects a typical setup: 10–50 sites; ERP data available but uneven; a few Excel islands; no central energy-monitoring platform; a compact sustainability function (1–3 people).
What it takes to reach the upper end of the range:
- D04b — inputs available digitally. Fuel, electricity, and refrigerant data as ERP exports or supplier invoices in text form (PDFs fine).
- D11 — user skill. A sustainability lead with basic GHG Protocol knowledge, willing to learn the templates and the EF library.
- D12 — tool integration. LLM API in an EU cloud region; read access to the ERP, no deep integration needed.
- D26b — monitoring. A documented review cadence for AI output before anything goes into a report.
- D02b — blast radius. ISSA 5000 limited assurance applies from December 2026, so re-performance of the calculation has to be possible (see Stage 4).
For more mature setups — a central energy-monitoring platform, established ERP pipelines, a dedicated assurance function — the upper end of the range moves towards 90–100 %. The DEPLOY modifiers are already baked into the ranges above (+10).
Governance — Stage 4
Governance defines the controls that AI use requires, independently of whether the task is technically automatable. Unlike the double materiality assessment (DMA), the GHG inventory has no veto triggers: all eight sub-tasks contribute positively.
What matters most for the GHG inventory:
- Reperformable audit trail. Assurance requires that every calculation can be reproduced from identical inputs. Model version, EF source, EF version, and reviewer decision are logged per calculation run. See ISSA 5000, ESRS E1.
- AI labelling in reporting. External disclosures must be identifiable as AI-supported. See EU AI Act Art. 50, ESRS reporting requirements.
- Sampling oversight. Reporting accountability stays human. AI delivers proposals; the sustainability lead reviews; management signs off on the final inventory. See IAASB professional skepticism.
Less critical here:
- Fundamental rights — low. The GHG inventory doesn't touch vulnerable stakeholders directly.
- Misuse / dual-use — low. Not a safety-critical task.
- Data protection — covered by a standard DPA and an EU cloud setup; no special categories of personal data in play.
Recommended controls:
- Version-locked EF library with a documented update cadence — keeps drift between years from sneaking in.
- Audit trail per calculation run covering model version, prompt, input sources, output, and reviewer — the basis for reperformance by the assurance provider.
- AI labelling in ESRS E1 and CDP reporting — external visibility of AI support.
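One way to picture the audit trail per calculation run is a frozen record with a content fingerprint. This is a sketch under assumptions: the class `CalculationRun`, its field names, and the choice of a SHA-256 hash over canonical JSON are illustrative and not prescribed by ISSA 5000 or ESRS E1.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CalculationRun:
    model_version: str
    prompt: str
    input_sources: tuple      # e.g. file names or export IDs
    ef_source: str
    ef_version: str
    output_t_co2e: float
    reviewer: str
    reviewer_decision: str    # e.g. "accepted", "rejected"

    def fingerprint(self) -> str:
        """Stable hash of the full record: same inputs, same fingerprint."""
        payload = json.dumps(asdict(self), sort_keys=True, ensure_ascii=False)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

If a re-performed run with the same model version, prompt, inputs, and EF version yields a different fingerprint, the output changed and the difference needs an explanation.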
Where to go from here
Three takeaways from the GHG inventory as a workflow.
AI runs the rule engine on its own. T3 (EF assignment) and T4 (calculation) are Expert tasks. Lookup and Σ activity·EF run under named oversight — the human re-performs by sampling, not step by step. That's the efficiency lever: those two tasks together are 20 % of the workflow effort, and AI carries almost all of it.
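Re-performing by sampling, as described above, might look like the sketch below. The function name `reperform_sample`, the row fields, and the fixed seed are assumptions for illustration; a real sampling plan would be agreed with the assurance provider.

```python
import math
import random


def reperform_sample(rows, sample_size, seed=0, rel_tol=1e-9):
    """Recompute amount * ef for a random sample and return mismatching rows.

    Each row is assumed to carry: amount, ef (kg CO2e per unit), and
    reported_kg (the value the AI-supported run produced).
    """
    rng = random.Random(seed)  # fixed seed keeps the sample itself reproducible
    picked = rng.sample(rows, min(sample_size, len(rows)))
    return [
        row for row in picked
        if not math.isclose(row["amount"] * row["ef"], row["reported_kg"],
                            rel_tol=rel_tol)
    ]
```

An empty result means the sampled rows reproduce; any returned row is a finding for the reviewer to follow up.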
The human keeps boundary, data hunting, and reporting. T1 (system boundaries) stays Consultant — group-structure calls (control approach, JVs, carve-outs) stay human-final. T2 (activity data collection) is still 25 % of the effort, with AI as a Collaborator on aggregation — but the chase with site owners stays human. T8 (reporting) is Tool: AI drafts XBRL mappings and narratives, but accountability for external disclosures stays with management and the lead.
Stage 0 matters twice as much here. Templates, EF library, calculation sheet — these aren't nice-to-haves. They're the precondition for Expert at T3/T4 and Collaborator at T2/T5/T6/T7. Skip them and every task drops a band; the workflow aggregate falls from 76 to ~63, and T3/T4 lose what makes them Expert.
The methodology in detail
The assessment framework used here (5 stages, 42 dimensions, 41 institutions) is documented on the methodology page, including Stage 0 operationalisation, the full source matrix, and the recommendation logic on the autonomy scale. To compare directly with the DMA, see the DMA assessment.