Payment Governance for AutoGen Agents
AutoGen is designed around conversations between agents. That makes it unusually good at exploratory multi-agent workflows - and unusually prone to the "one agent's spending authority leaks through the whole conversation" failure mode. This guide shows how to wire xBPP into AutoGen's register_function pattern so every payment in every conversation flows through the same policy.
```shell
pip install pyautogen vanar-xbpp
```

xBPP runs in-process, synchronously, with no external dependencies. It doesn't need access to your LLM, your AutoGen config, or any external services.
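To make the verdict surface concrete before any AutoGen wiring, here is a minimal pure-Python stand-in for evaluate(). The policy fields and thresholds below are illustrative assumptions for this sketch, not xBPP's actual schema:

```python
# Hypothetical stand-in for xbpp.evaluate, for illustration only.
# Real xBPP policies have their own schema; the fields here are assumptions.
from dataclasses import dataclass, field

@dataclass
class Verdict:
    decision: str                      # "ALLOW" | "ESCALATE" | "BLOCK"
    reasons: list = field(default_factory=list)
    message: str = ""

    def to_dict(self) -> dict:
        return {"decision": self.decision, "reasons": self.reasons, "message": self.message}

def evaluate_stub(tx: dict, policy: dict) -> Verdict:
    # Hard rules first, then the grey zone, then the default allow
    if tx["recipient"] in policy.get("blocked_recipients", []):
        return Verdict("BLOCK", ["recipient_blocked"], "Recipient is on the block list")
    if tx["amount"] > policy.get("block_above", float("inf")):
        return Verdict("BLOCK", ["amount_over_hard_limit"], "Amount exceeds hard limit")
    if tx["amount"] > policy.get("escalate_above", float("inf")):
        return Verdict("ESCALATE", ["amount_over_soft_limit"], "Needs human approval")
    return Verdict("ALLOW")

policy = {"escalate_above": 100, "block_above": 1000, "blocked_recipients": ["0xbad"]}
print(evaluate_stub({"amount": 50, "recipient": "0xabc", "currency": "USDC"}, policy).decision)   # ALLOW
print(evaluate_stub({"amount": 500, "recipient": "0xabc", "currency": "USDC"}, policy).decision)  # ESCALATE
```

The three-tier shape (ALLOW / ESCALATE / BLOCK) is the part that matters for the AutoGen integration below; everything else about the stand-in is incidental.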
AutoGen uses register_function to expose callable tools to an AssistantAgent. Put the xBPP evaluation inside the registered function's body.
```python
import autogen
from xbpp import evaluate
import json

with open('policies/autogen-agent.json') as f:
    POLICY = json.load(f)

def pay(amount: float, recipient: str, currency: str = "USDC") -> dict:
    """Send a payment to a recipient."""
    verdict = evaluate(
        {"amount": amount, "currency": currency, "recipient": recipient},
        POLICY
    )
    if verdict.decision == "BLOCK":
        return {
            "status": "blocked",
            "reasons": verdict.reasons,
            "message": verdict.message
        }
    if verdict.decision == "ESCALATE":
        approved = human_approval.request({
            "amount": amount,
            "recipient": recipient,
            "verdict": verdict.to_dict()
        })
        if not approved:
            return {"status": "declined_by_human"}
    tx = execute_payment(amount, recipient, currency)
    return {"status": "sent", "tx_hash": tx.hash}
```
```python
assistant = autogen.AssistantAgent(
    name="buyer",
    llm_config={"config_list": config_list}
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config=False
)

# Register the governed tool on both sides
autogen.agentchat.register_function(
    pay,
    caller=assistant,     # the AssistantAgent that can *request* the call
    executor=user_proxy,  # the UserProxyAgent that actually *runs* the call
    description="Send a USDC payment to a recipient"
)
```

The dual registration (caller + executor) is AutoGen-specific. The policy evaluation runs on the executor side - the UserProxyAgent - which is exactly where you want it: at the point of actual execution, outside the LLM's control.
AutoGen's killer feature is conversations between multiple agents. The same pattern scales: register the governed pay function on whichever agent is designated as the executor, and every call from any other agent in the conversation flows through xBPP.
```python
researcher = autogen.AssistantAgent("researcher", llm_config=...)
buyer = autogen.AssistantAgent("buyer", llm_config=...)
analyst = autogen.AssistantAgent("analyst", llm_config=...)
executor = autogen.UserProxyAgent("executor", human_input_mode="NEVER")

# Register the governed function for each agent allowed to propose payments;
# the executor side - and therefore the policy check - is the same for all of them
autogen.agentchat.register_function(
    pay,
    caller=researcher,  # researcher can propose payments
    executor=executor,
    description="Send a USDC payment"
)

autogen.agentchat.register_function(
    pay,
    caller=buyer,       # buyer can also propose payments
    executor=executor,
    description="Send a USDC payment"
)

# Now spin up a GroupChat
groupchat = autogen.GroupChat(
    agents=[researcher, buyer, analyst, executor],
    messages=[],
    max_round=20
)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=...)
```

All four agents participate in the conversation, but only executor actually runs any code - and every payment it runs is evaluated against the same policy. The group chat stays free-form; governance stays deterministic.
When multiple AssistantAgents can call pay, it's useful to know which one proposed each call. AutoGen passes the caller identity through the conversation, but not directly into the registered function's arguments. The workaround is to thread it through via a closure or a thread-local:
```python
from threading import local

_context = local()

def make_pay(agent_name: str):
    def pay_wrapper(amount: float, recipient: str, currency: str = "USDC") -> dict:
        _context.agent = agent_name
        return pay(amount, recipient, currency)
    return pay_wrapper

# Register per-agent wrappers
autogen.agentchat.register_function(
    make_pay("researcher"),
    caller=researcher,
    executor=executor,
    description="Send a USDC payment (as researcher)"
)
```

Inside the main pay function, read _context.agent and include it in the transaction passed to evaluate(). Now your audit log tells you which AutoGen agent originated each payment.
AutoGen's UserProxyAgent normally pauses for human input between turns, which would interact badly with an automated agent. Set human_input_mode="NEVER" to disable that pause - xBPP's ESCALATE verdict becomes your human-in-the-loop mechanism instead, and it only fires for specifically grey-zone transactions instead of every single turn.
```python
executor = autogen.UserProxyAgent(
    name="executor",
    human_input_mode="NEVER",
    code_execution_config=False  # we don't need arbitrary code exec
)
```

This gives you the best of both modes: the conversation runs autonomously, and escalations pause only when the policy says they should.
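The pay() example above assumes a human_approval helper with a request() method. Here is one minimal in-process sketch with the actual decision injected as a callable; a real implementation would page a human via Slack, a ticket queue, or similar:

```python
# Hypothetical human_approval helper - the name and .request() shape come from
# the pay() example; the callback mechanics here are an assumption.
from typing import Callable

class HumanApproval:
    def __init__(self, decide: Callable[[dict], bool]):
        self._decide = decide  # swap in a Slack ping, ticket queue, etc.

    def request(self, escalation: dict) -> bool:
        # Block until a human (or their stand-in) answers yes or no
        return self._decide(escalation)

# Toy stand-in: auto-approve anything up to 500
human_approval = HumanApproval(lambda esc: esc["amount"] <= 500)
print(human_approval.request({"amount": 200, "recipient": "0xabc"}))  # True
```

Because request() blocks, an ESCALATE verdict naturally pauses just that one tool call while the rest of the conversation machinery stays untouched.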
AutoGen emits callbacks for tool calls and their results. Hook into these to stream xBPP verdicts into your observability stack:
```python
def on_tool_result(recipient, messages=None, sender=None, config=None):
    """Watch governed-tool results as they flow through the executor."""
    # Tool output is stringified into message content; normalize quotes before matching
    content = str((messages or [{}])[-1].get("content", "")).replace('"', "'")
    agent_name = sender.name if sender else "unknown"
    if "'status': 'blocked'" in content:
        metrics.increment("xbpp.block", tags={"agent": agent_name})
    elif "'status': 'declined_by_human'" in content:
        metrics.increment("xbpp.escalate_declined", tags={"agent": agent_name})
    return False, None  # (final, reply): observe only, never consume the message

executor.register_reply([autogen.AssistantAgent], on_tool_result)
```

Every block, every declined escalation, every allow - surfaced as a metric you can alert on.
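The hook above assumes a metrics client with an increment(name, tags=...) method, in the statsd/Datadog style. If you don't have one handy, a counting stand-in is enough to verify the hook fires:

```python
# Minimal stand-in for a statsd/Datadog-style metrics client, for local testing.
from collections import Counter

class Metrics:
    def __init__(self):
        self.counts = Counter()

    def increment(self, name: str, tags=None):
        # Fold the tags into the counter key so per-agent counts stay separate
        key = (name, tuple(sorted((tags or {}).items())))
        self.counts[key] += 1

metrics = Metrics()
metrics.increment("xbpp.block", tags={"agent": "buyer"})
metrics.increment("xbpp.block", tags={"agent": "buyer"})
print(sum(metrics.counts.values()))  # 2
```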
Does this work with AutoGen Studio?

Yes - the underlying runtime is the same; you just register the function before launching Studio.

What about the event-driven (autogen-core) model?

The event-driven model uses Topic and message subscriptions but still calls registered handlers. xBPP goes inside the handler, same as in the conversation model.

Can the governed tool be async?

Yes. async def pay(...) works with register_function as long as your AutoGen version supports async tools. xBPP's evaluate() is synchronous and runs in a few microseconds, so it doesn't block the async loop.

Does evaluate() see the AutoGen conversation?

No - and it shouldn't. Policy evaluation is a function of the transaction and the policy, nothing else. This is intentional: the same policy should produce the same verdict whether it's called from AutoGen, CrewAI, or a shell script.
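That purity is easy to check in a test: for any evaluator of this shape, the same (transaction, policy) pair must always yield the same verdict. Sketched here with a toy stand-in; substitute the real evaluate in your own suite:

```python
# Determinism check: evaluation must depend only on (transaction, policy).
# evaluate_stub is a toy stand-in, not xBPP's implementation.
def evaluate_stub(tx, policy):
    return "BLOCK" if tx["amount"] > policy["block_above"] else "ALLOW"

tx = {"amount": 2000, "recipient": "0xabc", "currency": "USDC"}
policy = {"block_above": 1000}

# 100 calls, one distinct verdict - no hidden state, no conversation context
verdicts = {evaluate_stub(tx, policy) for _ in range(100)}
print(verdicts)  # {'BLOCK'}
```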