Terra Studio/AI Agents for Climate

Section 2 of 5 · 15 min read

Designing Agentic Workflows

Not every task is worth automating, and not every automation should involve an agent. This section is about the decisions that happen before you build anything: what makes a task agent-worthy, how to map it into steps, and where human oversight isn't optional.

Workflow planning

What makes a task agent-worthy

The goal isn't to build the most sophisticated system possible — it's to pick the right level for each task. Most high-value work lives in simpler automation: a triggered workflow, a scheduled script, a skill that applies a consistent process. Agents earn their added complexity in a narrow set of situations.

A task is agent-worthy when it has most of these properties:

01 · Multi-step with decision points

The task requires more than a fixed sequence — there are branches based on what earlier steps find. "Pull data, clean it, and email a summary" is automatable without an agent. "Pull data, assess whether the anomaly is likely measurement error or real, and route accordingly" needs judgment at the branch point.

02 · Information-gathering across sources

The task involves retrieving information from multiple external sources that can't all be known in advance. Research synthesis across 50 recent papers. Cross-referencing emissions claims against three independent databases. Comparing adaptation plans across 30 countries with inconsistent formats.

03 · Repetitive at scale

The same workflow needs to run for many instances — each facility in a portfolio, each project in a pipeline, each week's worth of monitoring data. A human could do one; the agent handles all of them while the human reviews the outputs.

04 · Coordination-heavy

The task involves pulling from multiple systems, combining outputs, and producing something that would require several context switches for a human. Meeting notes → CRM → follow-up draft. Satellite data → database → alert → notification.

The workflow design process

A workflow that works reliably in production starts with a design phase before any building happens. The sequence:

1 · Define the goal precisely

Not "automate our monitoring" but "every day at 6am, check the Global Forest Watch API for deforestation alerts in our 23 monitored polygons, filter for events covering more than 5 hectares, and post a formatted summary to the #monitoring Slack channel." Vague goals produce agents that wander.
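A goal stated at that level of precision translates almost directly into code. Below is a minimal sketch of the filtering and formatting logic from the example; the `Alert` fields are illustrative, not the real Global Forest Watch schema, and fetching from the API and posting to Slack are left out.

```python
from dataclasses import dataclass

# Hypothetical alert record. Field names are illustrative,
# not the actual Global Forest Watch API schema.
@dataclass
class Alert:
    polygon_id: str
    area_ha: float
    date: str

def filter_significant(alerts, min_area_ha=5.0):
    """Keep only alerts covering more than the threshold area."""
    return [a for a in alerts if a.area_ha > min_area_ha]

def format_summary(alerts):
    """Render a Slack-ready daily summary."""
    if not alerts:
        return "No deforestation alerts above 5 ha today."
    lines = [f"- {a.polygon_id}: {a.area_ha:.1f} ha on {a.date}" for a in alerts]
    return "Deforestation alerts (>5 ha):\n" + "\n".join(lines)
```

Notice how every number in the goal statement ("5 hectares", "23 polygons", "daily at 6am") becomes either a parameter or a scheduler setting, with nothing left for the agent to interpret.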

2 · Map every step

List the complete sequence of actions, including the ones that feel obvious. Data retrieval → parsing → cleaning → comparison against baseline → threshold logic → output formatting → delivery. Every step is a potential failure point.
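One way to make "every step is a potential failure point" concrete is to represent the workflow as an ordered list of named steps, so a failure is attributed to a specific stage rather than to the pipeline as a whole. A sketch, with the step functions standing in for real retrieval, parsing, and cleaning logic:

```python
# Each step is a named function so a failure can be traced to one stage.
def run_pipeline(raw, steps):
    """Run data through an ordered list of (name, fn) steps.

    Returns (result, None) on success, or (None, failing_step_name)
    so the caller knows exactly where the pipeline broke.
    """
    data = raw
    for name, fn in steps:
        try:
            data = fn(data)
        except Exception:
            return None, name
    return data, None
```

Usage: `run_pipeline(raw_csv, [("parse", parse), ("clean", clean), ("compare", compare_to_baseline)])`. The point of the structure is that the step map you wrote on paper and the code that runs are the same list.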

3 · Identify tool needs

For each step, what capability does the agent need? Web access? Code execution? A specific API key? File write permissions? An email integration? Each tool adds capability and blast radius. Grant only what's needed.

4 · Design human checkpoints

Before any consequential action, require human review. Consequential means: it sends something externally, modifies a database, or produces outputs that flow into decisions. "Review before sending" is not optional on high-stakes workflows — it's what makes the workflow trustworthy.
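The checkpoint can be enforced in code rather than left to convention: the send function is simply unreachable without an approval. A minimal sketch, where `approve_fn` is whatever surfaces the draft to a person (a CLI prompt, a Slack interactive message, a ticket queue) and `send_fn` is the actual delivery call:

```python
def gated_send(draft, approve_fn, send_fn):
    """Only call send_fn if a human explicitly approves the draft.

    approve_fn and send_fn are placeholders for whatever review
    surface and delivery channel the workflow actually uses.
    """
    if approve_fn(draft):
        send_fn(draft)
        return "sent"
    return "held for revision"
```

The design choice worth noting: approval is the default-off path. If the approval step errors out or is skipped, nothing gets sent.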

Climate workflow patterns

Four patterns show up repeatedly in climate agentic workflows. Recognizing the pattern helps you apply the right design approach.

Monitoring pipeline

Continuous data ingestion → anomaly detection → threshold logic → alert routing. Used in: deforestation watch, methane leak detection, emissions compliance tracking, extreme weather alerts. Key design requirement: clear anomaly definitions before deployment.

Research synthesis

Retrieval across multiple sources → structured extraction → comparison → summary generation. Used in: literature reviews, policy landscape analysis, technology readiness assessments, competitive intelligence. Key design requirement: citation tracking so outputs can be verified.
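Citation tracking is easiest when it is built into the extraction step rather than bolted on at summary time: every extracted claim carries its source identifier from the moment it enters the pipeline. A sketch, with hypothetical helper names:

```python
# Illustrative: each extracted claim is paired with its source document
# at extraction time, so the final summary is verifiable line by line.
def extract_claim(text, source_id):
    """Pair a finding with the document it came from."""
    return {"claim": text, "source": source_id}

def summarize(claims):
    """Render claims with inline citation markers."""
    return " ".join(f"{c['claim']} [{c['source']}]" for c in claims)
```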

Stakeholder coordination

CRM pull → meeting prep → follow-up drafting → calendar management. Used in: partnership development, investor reporting, community engagement tracking. Key design requirement: human review before any external communication.

Reporting automation

Data collection from multiple systems → normalization → template population → quality checks → distribution. Used in: GHG inventory reporting, ESG disclosure, project progress reports. Key design requirement: data provenance tracking at every step.
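Provenance tracking at every step can be as simple as wrapping each value with the trail of sources and transforms that produced it, so a number in the final report can always be traced back. A minimal sketch of that idea (the class and field names are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class Traced:
    """A value plus the trail of sources and transforms that produced it."""
    value: object
    provenance: list = field(default_factory=list)

    def apply(self, fn, note):
        """Transform the value and append a provenance entry."""
        return Traced(fn(self.value), self.provenance + [note])
```

Usage: start with `Traced(reading, ["source: facility meter A"])` and every normalization or unit conversion adds a note. When a quality check flags a figure, the full chain is attached to the figure itself.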

What can go wrong

Workflow failures fall into predictable categories. Knowing them before you build is more valuable than debugging them after deployment.

Scope creep

Agents given broad goals tend to expand their scope. "Analyze the data" becomes file writes in unexpected places, API calls you didn't intend, and outputs in formats that break downstream steps. Define goals narrowly and explicitly.

Error propagation

A misclassification or parsing error in step 2 produces incorrect inputs for every subsequent step — and the agent has no way to know it went wrong. Without explicit validation checkpoints, errors compound silently. Build verification steps.
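A validation checkpoint between steps is the simplest defense: instead of passing suspect data forward, the pipeline halts at the step where the data first went bad. A sketch of one such checkpoint, with the checks themselves as named predicates:

```python
def validate_or_halt(records, checks):
    """Run named checks; raise instead of passing bad data downstream.

    checks is a list of (name, predicate) pairs, so the error message
    says which rule failed rather than failing silently later.
    """
    for name, check in checks:
        bad = [r for r in records if not check(r)]
        if bad:
            raise ValueError(f"validation '{name}' failed for {len(bad)} record(s)")
    return records
```

Dropping a call like this between parsing and comparison converts a silent compounding error into a loud stop at the step that caused it.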

Automation bias

Once a workflow is running reliably, humans stop reviewing outputs carefully. When something eventually goes wrong — and it will — no one catches it until the downstream consequences show up. The governance question isn't just "who reviews?" but "are they actually reviewing, and do they have the context to catch errors?"

The rule on permissions: start narrow. Authorize a specific folder rather than your entire filesystem. One API endpoint rather than full account access. Errors in agentic workflows can affect real data in real systems. Wider permissions make errors harder to contain.
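"Authorize a specific folder" can be enforced in the workflow itself with a path check before any write. A sketch, using a hypothetical sandbox directory (`is_relative_to` requires Python 3.9+):

```python
from pathlib import Path

# Hypothetical sandbox folder the agent is allowed to write into.
ALLOWED_DIR = Path("/srv/agent-workspace")

def safe_write_path(requested, allowed_dir=ALLOWED_DIR):
    """Resolve a requested path and refuse anything outside the sandbox.

    Resolving first defeats "../" tricks, since the final absolute
    path is what gets checked, not the string the agent supplied.
    """
    target = (allowed_dir / requested).resolve()
    if not target.is_relative_to(allowed_dir.resolve()):
        raise PermissionError(f"write outside sandbox refused: {target}")
    return target
```

The same containment logic applies to API scopes: check the concrete endpoint being called against an explicit allowlist, not the agent's stated intent.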

Exercise

Workflow Mapper

Describe a climate task you want to automate. Answer four quick questions. Get back a structured workflow with step-by-step design, tools needed, and required human checkpoints.

How often does this task need to run?

Does it need to browse the web or call external APIs?

Does it involve writing or modifying files?

What's the consequence if it makes a mistake?