Worked example · GHG inventory · 2026

Greenhouse Gas Inventory (Scope 1+2+3) — AI potential, assessed

Eight sub-tasks along the GHG Protocol Corporate Standard. The five-stage methodology applied to a workflow that's been stable since 2004 — and the first time it returns an Expert-level verdict.

Very high · 60–85 % time saved · AI as Collaborator
Aggregate 76.4. Higher AI potential than the double materiality assessment (DMA), because the GHG methodology has been stable for roughly twenty years and no stakeholder vetoes apply.

The GHG inventory under the GHG Protocol is a rule engine that's been stable since 2004: agreed system boundaries, defined emission factors, deterministic calculation. That's exactly the shape of work at which today's frontier models peak. This assessment breaks the workflow into eight sub-tasks and runs each one through the full methodology (Stages 0–4). Two tasks land in the Expert band — a first for the methodology.


Per-sub-task assessment

T1 · System-boundary setting (organisational + operational boundary) · AI as Consultant · 50–80 % time saved
T2 · Activity-data collection (fuels, electricity, refrigerants) · AI as Collaborator · 65–90 % time saved
T3 · Emission-factor assignment (DEFRA, ecoinvent, BAFA, ProBas) · AI as Expert · 75–100 % time saved
T4 · Calculation & aggregation (Scope 1+2 location/market) · AI as Expert · 75–100 % time saved
T5 · Scope 3 screening (15-category matrix) · AI as Collaborator · 65–90 % time saved
T6 · Data-quality assessment (pedigree matrix) · AI as Collaborator · 65–90 % time saved
T7 · Consolidation & YoY check (outlier detection, sector benchmark) · AI as Collaborator · 65–90 % time saved
T8 · ESRS E1 / CDP reporting (mapping + narrative disclosures) · AI as Tool · 30–60 % time saved
Aggregate across all eight tasks: very high · 60–85 % time saved · workflow headline 76.4 (Collaborator band).
First Expert verdict of the methodology. T3 (EF assignment, headline 90.6) and T4 (calculation, headline 92.6) are the first real-world sustainability tasks where all four Expert conditions hold at once: rule engine (D07=3), full inter-rater convergence (D32=3), methodological stability (D26a=3), deterministic verifiability (D01=3). For the DMA, not a single task cleared Expert.
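Read as a rule, the Expert verdict is a conjunction: all four dimensions must reach their top score together. A minimal sketch of that check follows. The dimension IDs and the score of 3 come from the paragraph above; the function names and the dictionary layout are hypothetical, and the full methodology may apply further criteria not modelled here.

```python
# Dimension IDs and the required score of 3 are taken from the text above.
# The names and data layout are hypothetical illustrations.
EXPERT_CONDITIONS = {
    "D07": "rule engine",
    "D32": "inter-rater convergence",
    "D26a": "methodological stability",
    "D01": "deterministic verifiability",
}
REQUIRED_SCORE = 3


def meets_expert_conditions(scores: dict) -> bool:
    """True only if all four conditions reach the required score at once."""
    return all(scores.get(dim) == REQUIRED_SCORE for dim in EXPERT_CONDITIONS)


def failed_conditions(scores: dict) -> list:
    """Labels of the conditions that block an Expert verdict."""
    return [
        label
        for dim, label in EXPERT_CONDITIONS.items()
        if scores.get(dim) != REQUIRED_SCORE
    ]
```

A task that scores 3 on three dimensions and 2 on D32 would fail the check, with `failed_conditions` naming inter-rater convergence as the blocker.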

Stage 0 — operationalisation per sub-task

Before the assessment runs, every sub-task gets a concrete setup: which inputs feed it, which rubric structures the work, which tools come in, what shape the output takes, what gets reviewed and how.

For emission-factor assignment, for example: a versioned EF library (DEFRA, ecoinvent, BAFA) plus a mapping table for fuels and energy carriers. The LLM proposes factors; the output is an EF-assigned dataset with source and version per entry — the foundation for the calculation that follows.
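As an illustration of that setup, here is a minimal sketch of a versioned EF library plus mapping table. It does not model the LLM proposal step; it only shows the structure a proposed factor could be checked against. All names are hypothetical, and the factor values are placeholders, not published DEFRA, ecoinvent, or BAFA figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmissionFactor:
    source: str              # library the factor comes from, e.g. "DEFRA"
    version: str             # library release, recorded per entry
    carrier: str             # canonical fuel or energy carrier
    unit: str                # activity unit the factor applies to
    kg_co2e_per_unit: float


# Placeholder values for illustration only, not real published factors.
EF_LIBRARY = {
    ("natural_gas", "kWh"): EmissionFactor("DEFRA", "2025", "natural_gas", "kWh", 0.25),
    ("diesel", "litre"): EmissionFactor("DEFRA", "2025", "diesel", "litre", 2.5),
}

# Mapping table: free-text labels from site data to canonical carriers.
CARRIER_MAP = {
    "erdgas": "natural_gas",
    "natural gas": "natural_gas",
    "diesel": "diesel",
}


def assign_factor(label: str, unit: str) -> EmissionFactor:
    """Return the library factor for a label, or raise so a human can review."""
    carrier = CARRIER_MAP.get(label.strip().lower())
    if carrier is None:
        raise LookupError(f"no mapping for {label!r}; route to human review")
    factor = EF_LIBRARY.get((carrier, unit))
    if factor is None:
        raise LookupError(f"no factor for {carrier} per {unit} in the library")
    return factor
```

Because every returned factor carries its source and version, the EF-assigned dataset can record both per entry, and unmapped labels fail loudly instead of receiving a guessed factor.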

Each of the eight sub-tasks gets the same treatment before scoring. Skip it — intuitive AI use, no defined setup — and the savings drop sharply while the error risk climbs. The Expert verdict for EF assignment and calculation, in particular, depends on doing Stage 0 properly.


Model capabilities — Stage 2

Scored against today's frontier models (Claude / GPT / Gemini, 2026).

CAPAB coverage for GHG inventory: ~73 %

Strong:

Weak:


Deployment readiness — Stage 3

Default (mid-sized company with a compact sustainability function): ~62 %

The default reflects a typical setup: 10–50 sites; ERP data available but uneven; a few Excel islands; no central energy-monitoring platform; a compact sustainability function (1–3 people).

What it takes to reach the upper end of the range:

For more mature setups (a central energy-monitoring platform, established ERP pipelines, a dedicated assurance function), the upper end of the range moves towards 90–100 %. The DEPLOY modifiers (+10) are already baked into the ranges above.


Governance — Stage 4

Governance sets the controls that AI use brings with it — independently of whether the task is technically automatable. Unlike the DMA, the GHG inventory has no veto triggers: all eight sub-tasks contribute positively.

What matters most for the GHG inventory:

Less critical here:

Recommended controls:

  1. Version-locked EF library with a documented update cadence — keeps drift between years from sneaking in.
  2. Audit trail per calculation run covering model version, prompt, input sources, output, and reviewer — the basis for reperformance by the assurance provider.
  3. AI labelling in ESRS E1 and CDP reporting — external visibility of AI support.
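The audit trail in control 2 could be captured as one immutable record per calculation run. The sketch below is an assumption about shape, not a prescribed schema: the field names mirror the five items listed above, while the class name and the fingerprint helper are hypothetical additions.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CalculationRunRecord:
    model_version: str     # which model produced the run
    prompt: str            # exact prompt text used
    input_sources: tuple   # identifiers of the input files or tables
    output: str            # result as delivered to the reviewer
    reviewer: str          # named person who signed off

    def fingerprint(self) -> str:
        """SHA-256 over the serialised record, so later edits are detectable."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Storing the fingerprint alongside the record gives an assurance provider a simple way to confirm that the run being re-performed is the run that was logged.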

Where to go from here

Three takeaways from the GHG inventory as a workflow.

AI runs the rule engine on its own. T3 (EF assignment) and T4 (calculation) are Expert tasks. The factor lookup and the sum Σ activity × EF run under named oversight: the human re-performs a sample, not every step. That's the efficiency lever: those two tasks together are 20 % of the workflow effort, and AI carries almost all of it.
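The calculation and the sampling-based re-performance can be sketched in a few lines. This is an illustration under stated assumptions: the function names, the 10 % sample share, and the fixed seed are hypothetical choices, and the input figures in the usage note are placeholders rather than real activity data or emission factors.

```python
import random


def total_emissions_kg(rows):
    """Sum activity * EF over all rows; each row is (activity_amount, kg_co2e_per_unit)."""
    return sum(amount * factor for amount, factor in rows)


def sample_for_reperformance(rows, share=0.1, seed=0):
    """Draw a reproducible random sample of rows for a human to recompute.

    The share and the seed are illustrative defaults, not recommended values.
    """
    k = max(1, round(len(rows) * share))
    return random.Random(seed).sample(rows, k)
```

For example, `total_emissions_kg([(1000.0, 0.25), (200.0, 2.5)])` adds 250 kg and 500 kg. The fixed seed makes the sample repeatable, so the reviewer and the assurance provider look at the same rows.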

The human keeps boundary, data hunting, and reporting. T1 (system boundaries) stays Consultant: group-structure calls (control approach, JVs, carve-outs) stay human-final. T2 (activity-data collection) accounts for 25 % of the effort, with AI as a Collaborator on aggregation, but the chase with site owners stays human. T8 (reporting) is Tool: AI drafts XBRL mappings and narratives, but accountability for external disclosures stays with management and the lead.

Stage 0 matters twice as much here. Templates, EF library, calculation sheet — these aren't nice-to-haves. They're the precondition for Expert at T3/T4 and Collaborator at T2/T5/T6/T7. Skip them and every task drops a band; the workflow aggregate falls from 76 to ~63, and T3/T4 lose what makes them Expert.

Caveat: the time-saved figures all assume professional operationalisation per sub-task and a clean, versioned EF library. Casual chat use with no setup lands at 30–50 % time saved with elevated error risk, and no Expert verdict.

The methodology in detail

View the full methodology →

The assessment framework used here (5 stages, 42 dimensions, 41 institutions) is documented on the methodology page — including Stage 0 operationalisation, the full source matrix, and the recommendation logic on the autonomy scale. To compare directly with the DMA, see DMA assessment →

Get an assessment for your own context

This walk-through reflects a typical mid-sized CSRD setup. For your real situation — your ERP, your site structure, your assurance requirements — you'd want a tailored read. 3 weeks of diagnostic work, fixed price, no lock-in.