Track cost and usage

Learn how to track token usage, estimate costs, and configure prompt caching with the Claude Agent SDK.

The Claude Agent SDK provides detailed token usage information for each interaction with Claude. This guide explains how to track usage accurately and how to interpret the cost figures the SDK reports.

warning

The total_cost_usd and costUSD fields are client-side estimates, not authoritative billing data. The SDK computes them locally from a price table bundled at build time, so they can differ from what you are actually billed when:

Pricing changes
The installed SDK version does not recognize a model
Billing rules apply that the client cannot model

Use these fields for development insight and rough budgeting. For authoritative billing, use the Usage and Cost API or the Usage page in the Claude Console.

Understand token usage

The TypeScript and Python SDKs expose the same usage data under different field names:

TypeScript provides per-step token breakdowns on each assistant message and a cumulative total on the result message.
Python provides per-step token breakdowns on each assistant message and a cumulative total on the result message (total_cost_usd and a usage dict).

Get the total cost of a query

The result message marks the end of the agent loop for a query() call and includes total_cost_usd, the cumulative estimated cost across all steps:

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({ prompt: "Summarize this project" })) {
  if (message.type === "result") {
    console.log(`Total cost: $${message.total_cost_usd}`);
  }
}

from claude_agent_sdk import query, ResultMessage
import asyncio


async def main():
    async for message in query(prompt="Summarize this project"):
        if isinstance(message, ResultMessage):
            print(f"Total cost: ${message.total_cost_usd or 0}")


asyncio.run(main())

Track per-step and per-model usage

Track per-step usage

Each assistant message contains a nested BetaMessage with an id and a usage object holding token counts. When Claude uses tools in parallel, multiple messages share the same id. Keep track of the IDs you have already counted and skip duplicates:

import { query } from "@anthropic-ai/claude-agent-sdk";

const seenIds = new Set<string>();
let totalInputTokens = 0;
let totalOutputTokens = 0;

for await (const message of query({ prompt: "Summarize this project" })) {
  if (message.type === "assistant") {
    const msgId = message.message.id;

    // Parallel tool calls share the same ID, so count each ID only once
    if (!seenIds.has(msgId)) {
      seenIds.add(msgId);
      totalInputTokens += message.message.usage.input_tokens;
      totalOutputTokens += message.message.usage.output_tokens;
    }
  }
}

console.log(`Steps: ${seenIds.size}`);
console.log(`Input tokens: ${totalInputTokens}`);
console.log(`Output tokens: ${totalOutputTokens}`);
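
The same de-duplication can be sketched in Python. This is a minimal sketch rather than the SDK's own API: sum_unique_steps is a hypothetical helper that takes plain (message_id, usage) pairs, so how you read the ID and usage off each assistant message is left to the caller and is an assumption here.

```python
from typing import Iterable, Mapping, Tuple


def sum_unique_steps(
    steps: Iterable[Tuple[str, Mapping[str, int]]],
) -> dict:
    """Sum token usage over (message_id, usage) pairs, counting each ID once.

    Hypothetical helper: the pairs are assumed to come from assistant
    messages, where parallel tool calls repeat the same message ID.
    """
    seen_ids = set()
    totals = {"steps": 0, "input_tokens": 0, "output_tokens": 0}
    for message_id, usage in steps:
        if message_id in seen_ids:
            continue  # already counted this step
        seen_ids.add(message_id)
        totals["steps"] += 1
        totals["input_tokens"] += usage.get("input_tokens", 0)
        totals["output_tokens"] += usage.get("output_tokens", 0)
    return totals


if __name__ == "__main__":
    # Illustrative data only: "msg_1" appears twice, as it would for
    # two parallel tool calls issued in the same step.
    sample = [
        ("msg_1", {"input_tokens": 100, "output_tokens": 20}),
        ("msg_1", {"input_tokens": 100, "output_tokens": 20}),
        ("msg_2", {"input_tokens": 50, "output_tokens": 5}),
    ]
    print(sum_unique_steps(sample))
```

Keying on the message ID rather than on message position keeps the totals correct no matter how many messages a single step fans out into.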

Break down usage by model

The result message includes modelUsage, a map from model name to per-model token counts and cost:

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({ prompt: "Summarize this project" })) {
  if (message.type !== "result") continue;

  for (const [modelName, usage] of Object.entries(message.modelUsage)) {
    console.log(`${modelName}: $${usage.costUSD.toFixed(4)}`);
    console.log(`  Input tokens: ${usage.inputTokens}`);
    console.log(`  Output tokens: ${usage.outputTokens}`);
    console.log(`  Cache read: ${usage.cacheReadInputTokens}`);
    console.log(`  Cache creation: ${usage.cacheCreationInputTokens}`);
  }
}

Accumulate cost across multiple calls

The SDK does not provide a session-level total, so if your application makes multiple query() calls, accumulate the totals yourself:

import { query } from "@anthropic-ai/claude-agent-sdk";

let totalSpend = 0;

const prompts = [
  "Read the files in src/ and summarize the architecture",
  "List all exported functions in src/auth.ts"
];

for (const prompt of prompts) {
  for await (const message of query({ prompt })) {
    if (message.type === "result") {
      totalSpend += message.total_cost_usd;
      console.log(`This call: $${message.total_cost_usd}`);
    }
  }
}

console.log(`Total spend: $${totalSpend.toFixed(4)}`);

from claude_agent_sdk import query, ResultMessage
import asyncio


async def main():
    total_spend = 0.0

    prompts = [
        "Read the files in src/ and summarize the architecture",
        "List all exported functions in src/auth.ts",
    ]

    for prompt in prompts:
        async for message in query(prompt=prompt):
            if isinstance(message, ResultMessage):
                cost = message.total_cost_usd or 0
                total_spend += cost
                print(f"This call: ${cost}")

    print(f"Total spend: ${total_spend:.4f}")


asyncio.run(main())

Handle errors, caching, and token discrepancies

Track cost on failed conversations

Both success and error result messages include usage and total_cost_usd. Always read cost data from the result message, regardless of its subtype.
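
A minimal Python sketch of reading cost without branching on the subtype follows. cost_from_result is a hypothetical helper, and the SimpleNamespace objects are stand-ins for result messages so the example runs without the SDK. The attribute names subtype, total_cost_usd, and usage come from the text above, while the subtype strings and numbers in the sample are purely illustrative.

```python
from types import SimpleNamespace
from typing import Any


def cost_from_result(result: Any) -> dict:
    """Extract cost fields from a result message, whether it succeeded or not.

    Hypothetical helper: it never inspects the subtype to decide whether to
    record cost, and it falls back to 0 / {} if a field is missing or None.
    """
    return {
        "subtype": getattr(result, "subtype", None),
        "cost_usd": getattr(result, "total_cost_usd", None) or 0.0,
        "usage": getattr(result, "usage", None) or {},
    }


if __name__ == "__main__":
    # Stand-in objects with illustrative values, not real SDK output.
    ok = SimpleNamespace(
        subtype="success", total_cost_usd=0.0123, usage={"input_tokens": 10}
    )
    failed = SimpleNamespace(
        subtype="error_during_execution", total_cost_usd=0.004, usage={"input_tokens": 4}
    )
    for result in (ok, failed):
        print(cost_from_result(result))
```

Recording the failed run's cost alongside the successful one keeps spend totals honest, since tokens consumed before a failure still count.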

Track cache tokens

The Agent SDK uses prompt caching automatically. The usage object includes two additional fields for tracking the cache:

cache_creation_input_tokens: tokens used to create new cache entries (billed at a higher rate)
cache_read_input_tokens: tokens read from existing cache entries (billed at a lower rate)
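
One way to use these two fields is to estimate how much of the input was served from cache. The sketch below is an illustration under stated assumptions, not part of the SDK: cache_summary is a hypothetical helper, it takes the usage object as a plain dict, and it assumes input_tokens counts only the uncached portion, so that total input is the sum of all three fields.

```python
from typing import Mapping


def cache_summary(usage: Mapping[str, int]) -> dict:
    """Summarize cache effectiveness from a usage dict.

    Assumption: input_tokens excludes cached tokens, so total input is
    input_tokens + cache_creation_input_tokens + cache_read_input_tokens.
    """
    uncached = usage.get("input_tokens", 0) or 0
    created = usage.get("cache_creation_input_tokens", 0) or 0
    read = usage.get("cache_read_input_tokens", 0) or 0
    total = uncached + created + read
    return {
        "total_input_tokens": total,
        # Share of input tokens served from existing cache entries.
        "cache_read_share": read / total if total else 0.0,
    }


if __name__ == "__main__":
    # Illustrative numbers only.
    sample_usage = {
        "input_tokens": 100,
        "cache_creation_input_tokens": 300,
        "cache_read_input_tokens": 600,
    }
    print(cache_summary(sample_usage))
```

A low read share across repeated calls with the same prefix may suggest that cache entries are expiring between calls.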

Extend the prompt cache TTL to one hour

Cache entries written by the SDK use a 5-minute TTL by default when you authenticate with an API key or run on Bedrock, Vertex AI, or Foundry. To request a 1-hour TTL, set the ENABLE_PROMPT_CACHING_1H environment variable:

from claude_agent_sdk import ClaudeAgentOptions, query
import asyncio


async def main():
    options = ClaudeAgentOptions(
        env={
            "CLAUDE_CODE_USE_BEDROCK": "1",
            "ENABLE_PROMPT_CACHING_1H": "1",
        },
    )

    async for message in query(prompt="Summarize this project", options=options):
        print(message)


asyncio.run(main())

import { query } from "@anthropic-ai/claude-agent-sdk";

const options = {
  env: {
    ...process.env,
    CLAUDE_CODE_USE_BEDROCK: "1",
    ENABLE_PROMPT_CACHING_1H: "1",
  },
};

for await (const message of query({ prompt: "Summarize this project", options })) {
  console.log(message);
}

Related documentation

TypeScript SDK Reference - Complete API documentation
Sessions - Managing sessions
Permissions - Managing tool permissions
