Agentic AI Glossary
Set of actions an agent can perform.
Interaction between agents to achieve shared objectives.
Managing dependencies and communication across agents.
Identifying and resolving issues in agent workflows.
Process of measuring performance and reliability of agents.
System used to evaluate agent outputs and behavior.
Software platform used to build and manage AI agents.
Policies controlling agent behavior and usage.
Managing creation, deployment, monitoring, and updates of agents.
Recording actions and decisions for debugging.
Continuous cycle of observe → plan → act → evaluate followed by an agent.
Layer connecting agents with tools, APIs, and data sources.
Tracking performance metrics of agent systems.
Monitoring agent actions, decisions, and performance.
Number of steps or depth an agent considers when planning actions.
Execution environment where agents run.
Structured prompting, control flow, and tooling layered around a base model to elicit reliable agentic behavior.
Increasing capacity to handle more tasks or users.
Internal context, memory, and execution status of an agent.
Managing versions of agent logic and configurations.
System that manages execution of agent workflows.
Benchmark suite that evaluates LLM agent capabilities across diverse interactive environments.
AI systems capable of autonomously planning, reasoning, and executing tasks to achieve defined goals with minimal human input.
Retrieval-augmented generation where the agent actively decides when to retrieve, what queries to issue, and how to use the returned context.
Software entity that perceives inputs, makes decisions, and performs actions to achieve objectives.
Assistive agent that helps users perform tasks interactively.
Performance cost incurred when making a model safer or more aligned with human intent.
Connecting agents with external systems for execution.
Process requiring human validation before execution.
Record of agent actions for accountability and compliance.
Using metrics or models to evaluate agent outputs.
Agent that operates independently without continuous human supervision.
Fully automated workflow executed by agents without human input.
Mapping high-level agent intents to concrete executable actions in the target environment.
Managing configurations across environments.
Machine-readable manifest describing an agent’s identity, capabilities, and supported protocols for discovery.
Comparing performance across systems.
Managing communication and dependencies across agents.
Estimating cost of running agents.
Identifying and fixing issues.
CI/CD pipeline for deploying agents.
Measuring agent performance and reliability.
System used to evaluate agent outputs.
Platform used to build and manage agents.
Policies controlling agent usage and behavior.
Transfer of control and state between one agent and another within a multi-step task.
Simplified rules enabling faster decision-making.
Managing creation, deployment, updates, and monitoring.
Recording execution details.
Continuous cycle of observe, plan, act, and evaluate.
Layer connecting agents to tools and APIs.
Tracking performance metrics.
Monitoring behavior and outputs.
Defined identity, tone, and behavioral profile assigned to an agent to shape its responses.
Depth of steps an agent considers when planning actions.
Strategy or rule set guiding agent decision-making.
Step-by-step reasoning process used to solve complex problems.
Prompting method where the model generates verification questions about its own draft output and revises the answer based on its own answers to those questions.
Mechanism that halts repeated failures in agent workflows.
Agents running on cloud infrastructure.
Ensuring agents adhere to regulatory requirements.
Agent that operates a computer through its graphical interface, controlling the mouse, keyboard, and screen, rather than via APIs.
Alignment approach where the model critiques and revises its own outputs against a written set of guiding principles.
Dynamically adding relevant data into prompts during execution.
Maximum amount of input, typically measured in tokens, that an agent can process at once.
Agent that interacts with users through natural language.
Property of an agent that reliably accepts correction, shutdown, or modification from authorized operators.
Cost associated with executing a single agent workflow.
Risk of exposing sensitive data through agent outputs.
Component that determines actions based on goals and inputs.
Agent behavior producing consistent outputs for the same input.
Controlled execution ensuring predictable outcomes.
Agents deployed across multiple systems or nodes.
Preference fine-tuning method that optimizes directly on pairwise preference data without training a separate reward model.
Agents deployed on edge devices for local execution.
Agent that acts through a physical or simulated body, grounding perception and action in an environment.
Platform used to deploy and manage agents at scale.
External system or context in which an agent operates.
Agent triggered by events such as alerts or data updates.
Record of actions taken by an agent during task execution.
Alternative execution path when primary logic fails.
Structured mechanism where agents invoke APIs or tools instead of generating free text.
Mechanism where outputs influence future decisions.
Constraints ensuring safe and reliable agent behavior.
AI system designed to achieve specific outcomes through iterative decision-making.
Generalization of Tree of Thoughts (ToT) in which reasoning steps form a graph rather than a tree, enabling merging, backtracking, and cycles across thought paths.
Failure where an agent learns a proxy objective that matches training data but diverges from the intended goal under distribution shift.
Benchmark of real-world assistant tasks requiring reasoning, tool use, web browsing, and multi-step execution.
Using human input to assess agent performance.
Agent structure with parent-child relationships for delegation of tasks.
Incorrect or fabricated output generated by an agent.
System where humans supervise or intervene in agent execution.
Managing permissions for agent actions.
Technique for inferring an agent’s reward function from observed expert behavior rather than specifying it manually.
Structured repository of information used by agents.
Time taken by an agent to process input and execute actions.
Decomposition strategy that solves simpler subproblems first and uses their answers as context to solve progressively harder ones.
Persistent storage of knowledge across sessions.
Research area that reverse-engineers the internal computations of models to understand why an agent produced a given decision.
Mechanism allowing agents to retain context across interactions.
Open protocol that standardizes how agents connect to external tools, data sources, and context providers.
Search algorithm that incrementally builds a search tree from randomized rollouts, used in reasoning agents to select high-value action sequences.
System where multiple agents collaborate to solve complex tasks.
Using agents to execute complex multi-stage workflows.
Hybrid approach combining neural models with symbolic logic or structured knowledge representations.
Agent behavior where outputs vary due to probabilistic models.
Converting agent outputs into structured formats for execution.
Set of inputs an agent can perceive from its environment.
Central agent that manages and coordinates multiple agents or workflows.
Reward model that evaluates only the final answer of a reasoning trace, ignoring intermediate steps.
Component that processes inputs such as text, images, or signals.
Limits defining what actions an agent can perform.
Two-stage prompting pattern that first produces an explicit plan and then executes each step, improving reliability on multi-step tasks.
Design pattern where one component plans tasks and another executes them.
Breaking down a goal into smaller executable steps.
Formal model for sequential decision-making where the agent cannot fully observe the true state of the environment.
Protecting sensitive data handled by agents.
Agent that anticipates tasks and acts independently.
Reward model that scores each intermediate reasoning step rather than only the final outcome, used to train or select reasoning traces.
Instruction used to guide agent behavior.
Linking multiple prompts where outputs feed into subsequent steps.
Designing prompts to improve agent outputs.
Attack where malicious inputs manipulate agent behavior.
Framework combining reasoning and action steps in agent workflows.
Agent that responds immediately to inputs without planning ahead.
Logical evaluation used by agents to determine next actions.
Structured adversarial testing used to uncover unsafe, harmful, or misaligned agent behaviors before deployment.
Process where agents review past actions to improve future decisions.
Technique where an agent verbalizes self-feedback after failed attempts and stores it in episodic memory to improve subsequent trials.
Technique combining retrieval from external sources with generation.
Rules defining how agents retry failed operations.
Failure mode where an agent maximizes its reward signal in ways that violate the designer’s true intent.
Variant of RLHF that substitutes AI-generated preference labels for human labels to reduce annotation cost.
Alignment technique that fine-tunes a model against a reward model trained on human preference rankings.
Accumulated reusable collection of learned procedures or programs an agent can compose when solving new tasks.
System where one agent independently handles all tasks.
Controlled setup used to test agent behavior.
Temporary context used during a single interaction.
Agent capable of adapting its behavior using feedback.
Restricting agent access to authorized systems.
Techniques that allow humans to supervise AI systems on tasks whose outputs humans cannot fully evaluate directly.
Isolated environment for safe execution of agent actions.
Mechanism preventing harmful or unintended actions.
Instruction defining agent behavior, constraints, and role.
Benchmark that evaluates agents on resolving real GitHub issues across open-source software repositories.
Specialized agent responsible for a subset of tasks within a system.
Enforcing specific formats such as JSON in agent responses.
Accuracy of intermediate reasoning steps.
Tracking current progress and context of an agent.
Behavior that technically satisfies a written specification but violates the underlying goal behind it.
Misuse of external tools due to incorrect reasoning.
Compute consumption measured in tokens during agent execution.
Mechanism to stop long-running or stalled agent tasks.
Dividing complex problems into manageable subtasks.
Measure of how correctly an agent completes tasks.
Triggering external tools during agent execution.
Catalog of tools available to an agent.
Process of choosing the appropriate tool for a task.
Ability of an agent to interact with external tools or APIs.
Agent designed to rely on external tools for task completion.
Model trained to decide autonomously when to call an external API, which API to call, and how to integrate the result.
Reasoning framework where an agent explores multiple branching thought paths and evaluates intermediate states before committing to a solution.
Agent that autonomously handles customer queries and support tasks.
Agent that processes and analyzes datasets automatically.
Agent that automates infrastructure management and deployments.
Agent that gathers and synthesizes information from multiple sources.
Agent that assists with lead qualification and outreach.
Storage of embeddings used for semantic retrieval.
Internal learned model of the environment used by an agent to predict outcomes and plan without acting in the real world.
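Several of the entries above describe mechanisms that are easier to see working together: the observe, plan, act, evaluate cycle, the catalog of tools available to an agent, the record of actions taken during execution, and the mechanism that halts repeated failures. The following is a minimal sketch only, not a reference implementation of any particular framework. The names `Agent`, `CircuitBreaker`, `plan`, `run`, and the two toy tools are hypothetical and chosen for illustration, and the "environment" is just an integer the agent tries to move toward a goal value.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CircuitBreaker:
    """Halts further attempts once consecutive failures reach a threshold."""
    max_failures: int = 3
    failures: int = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.max_failures

    def record(self, success: bool) -> None:
        # A success resets the counter; a failure increments it.
        self.failures = 0 if success else self.failures + 1


@dataclass
class Agent:
    # Tool catalog: maps a tool name to a callable (hypothetical toy tools).
    tools: Dict[str, Callable[[int], int]]
    breaker: CircuitBreaker = field(default_factory=CircuitBreaker)
    # Record of actions taken during task execution.
    log: List[str] = field(default_factory=list)

    def plan(self, state: int, goal: int) -> str:
        # Toy policy: choose a tool by comparing the state with the goal.
        return "increment" if state < goal else "decrement"

    def run(self, state: int, goal: int, max_steps: int = 20) -> int:
        for _ in range(max_steps):
            # Evaluate: stop when the goal is met or the breaker has tripped.
            if state == goal or self.breaker.is_open:
                break
            name = self.plan(state, goal)        # plan
            try:
                state = self.tools[name](state)  # act, then observe new state
                self.breaker.record(True)
                self.log.append(f"{name} -> {state}")
            except Exception as exc:
                self.breaker.record(False)
                self.log.append(f"{name} failed: {exc}")
        return state


if __name__ == "__main__":
    agent = Agent(tools={"increment": lambda x: x + 1,
                         "decrement": lambda x: x - 1})
    print(agent.run(state=0, goal=3))
    print(agent.log)
```

In this sketch the step limit (`max_steps`) plays the role of a simple timeout, and the breaker stops the loop after three consecutive tool failures instead of retrying indefinitely. A real system would typically add backoff between retries and a fallback path, which are omitted here for brevity.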