Amazon Kendra

Amazon Kendra คือบริการ intelligent enterprise search ที่ขับเคลื่อนด้วย Machine Learning ออกแบบมาเพื่อให้ผลการค้นหาที่เข้าใจความหมาย (semantic search) ไม่ใช่แค่การจับคู่คำ (keyword matching) Kendra สามารถตอบคำถามแบบ natural language เช่น "นโยบายลาป่วยมีกี่วัน" หรือ "วิธีติดตั้ง VPN สำหรับ Windows 11" และดึงคำตอบที่แม่นยำจากเอกสารหลายแสนหน้าได้ภายในวินาที

Kendra รองรับ connector กว่า 40 แหล่งข้อมูล ครอบคลุม document repositories, CRM, ticketing systems และ databases ทำให้องค์กรสามารถสร้าง unified search ข้ามระบบต่าง ๆ ได้โดยไม่ต้องย้ายข้อมูล รองรับ document-level permissions เพื่อให้แต่ละคนเห็นเฉพาะเอกสารที่ตนมีสิทธิ์เข้าถึง

AWS Docs: https://docs.aws.amazon.com/kendra/latest/dg/what-is-kendra.html

สถาปัตยกรรม

ฟีเจอร์หลัก

Data Source Connectors

Kendra รองรับ connector สำหรับแหล่งข้อมูลยอดนิยม:

Amazon S3 — sync เอกสาร PDF, Word, HTML, PowerPoint จาก S3 bucket
SharePoint Online/Server — sync เอกสาร, sites, libraries จาก Microsoft SharePoint
Confluence — sync pages, spaces, blogs จาก Confluence Cloud/Server
Salesforce — sync articles, chatter, knowledge base จาก Salesforce
ServiceNow — sync knowledge articles, incidents จาก ServiceNow
RDS/Aurora — sync ข้อมูลจาก relational database โดยตรง
Web Crawler — crawl website และ intranet ตาม URL ที่กำหนด
Google Drive — sync ไฟล์จาก Google Workspace
Microsoft OneDrive — sync ไฟล์จาก OneDrive
Jira — sync issues, comments จาก Jira Cloud/Server
GitHub — sync repositories, wikis, issues จาก GitHub
Slack — sync messages, files จาก Slack channels

Query Suggestions

ระบบแนะนำคำค้นหาที่เกี่ยวข้องขณะพิมพ์ (auto-complete) เรียนรู้จาก query history ของผู้ใช้ในองค์กร ช่วยให้ผู้ใช้ค้นหาได้รวดเร็วและแม่นยำขึ้น

Document Enrichment

ปรับแต่ง metadata ของเอกสารก่อน index เช่น เพิ่ม tags, แก้ไข title, ลบข้อมูล sensitive ออก หรือเรียก Lambda เพื่อ process เอกสารก่อน sync

Relevance Tuning

ปรับ ranking ของผลการค้นหาโดยกำหนดน้ำหนักให้กับ fields เช่น ให้ความสำคัญกับ document title มากกว่า body text หรือ boost เอกสารที่ถูก update ล่าสุด

Incremental Learning / Feedback

ระบบเรียนรู้จาก feedback ของผู้ใช้ว่า document ใดที่ถูกคลิกหลังค้นหา เพื่อปรับ ranking ให้ดีขึ้นเรื่อย ๆ ตามการใช้งานจริง

Kendra Retrieve API

API สำหรับดึง passage (ข้อความตอน) ที่เกี่ยวข้องจากเอกสาร ใช้สำหรับ RAG (Retrieval-Augmented Generation) pipeline ที่ต้องนำ context ไปให้ LLM เช่น Bedrock หรือ SageMaker ตอบคำถาม

Faceted Search

กรองผลการค้นหาตาม attribute เช่น ประเภทเอกสาร, วันที่อัปเดต, แผนก, ผู้เขียน ช่วยให้ผู้ใช้ narrow down ผลลัพธ์ได้รวดเร็ว

Featured Results

กำหนด result ที่จะแสดงในตำแหน่ง top เสมอสำหรับ query ที่กำหนด เช่น เมื่อพนักงานค้นหา "reset password" ให้แสดงคู่มือ IT helpdesk ก่อนเสมอ

Access Control (Document-level Permissions)

Kendra ตรวจสอบ ACL (Access Control List) ของแต่ละเอกสารจากแหล่งข้อมูลต้นทาง ผู้ใช้จะเห็นเฉพาะเอกสารที่ตนมีสิทธิ์เข้าถึงใน SharePoint, S3, Confluence เป็นต้น

การติดตั้งและการตั้งค่า

สร้าง Kendra Index ผ่าน Console

ไปที่ AWS Console > Amazon Kendra > Create an index
เลือก Edition: Developer หรือ Enterprise
กำหนด IAM role สำหรับ Kendra
สร้าง Data Source: เลือก connector type (S3, SharePoint ฯลฯ)
กำหนด sync schedule (ทุกชั่วโมง, ทุกวัน ฯลฯ)
รัน sync ครั้งแรก และรอ index ข้อมูล
ทดสอบค้นหาใน Search console

IAM Permissions ที่จำเป็น

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kendra:Query",
        "kendra:Retrieve",
        "kendra:SubmitFeedback",
        "kendra:ListIndices"
      ],
      "Resource": "arn:aws:kendra:ap-southeast-1:*:index/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-documents-bucket",
        "arn:aws:s3:::my-documents-bucket/*"
      ]
    }
  ]
}

ติดตั้ง SDK

pip install boto3

วิธีใช้งาน

ตัวอย่าง: ค้นหาด้วย Kendra Query API (Python)

import boto3

client = boto3.client("kendra", region_name="ap-southeast-1")

response = client.query(
    IndexId="INDEX_ID_HERE",
    QueryText="นโยบายการลาป่วยมีกี่วันต่อปี",
    QueryResultTypeFilter="DOCUMENT",
    PageSize=5
)

for result in response["ResultItems"]:
    print("Type:", result["Type"])
    print("Title:", result["DocumentTitle"]["Text"])
    print("Excerpt:", result["DocumentExcerpt"]["Text"])
    print("URI:", result["DocumentURI"])
    print("Score:", result["ScoreAttributes"]["ScoreConfidence"])
    print("---")

ตัวอย่าง: ใช้ Retrieve API สำหรับ RAG

import boto3

kendra = boto3.client("kendra", region_name="ap-southeast-1")
bedrock = boto3.client("bedrock-runtime", region_name="ap-southeast-1")

# ดึง passages จาก Kendra
retrieve_response = kendra.retrieve(
    IndexId="INDEX_ID_HERE",
    QueryText="ขั้นตอนการขอลาพักร้อน",
    PageSize=3
)

# รวม context จาก passages
context = "\n\n".join([
    item["Content"] for item in retrieve_response["ResultItems"]
])

# ส่งให้ LLM ตอบคำถาม
prompt = f"""จากเอกสารต่อไปนี้:
{context}

คำถาม: ขั้นตอนการขอลาพักร้อนมีอะไรบ้าง?
ตอบ:"""

# เรียก Claude บน Bedrock
# ... (Bedrock API call)

ตัวอย่าง: เพิ่มเอกสารลง Index โดยตรง

import boto3
import base64

client = boto3.client("kendra", region_name="ap-southeast-1")

# อ่านไฟล์ PDF
with open("hr_policy.pdf", "rb") as f:
    file_content = base64.b64encode(f.read()).decode()

response = client.batch_put_document(
    IndexId="INDEX_ID_HERE",
    Documents=[
        {
            "Id": "hr-policy-2024",
            "Title": "นโยบาย HR 2024",
            "Blob": file_content,
            "ContentType": "PDF",
            "Attributes": [
                {
                    "Key": "_category",
                    "Value": {"StringValue": "HR Policy"}
                }
            ]
        }
    ]
)

ตัวอย่าง: สร้าง Data Source จาก S3 ด้วย CLI

aws kendra create-data-source \
  --index-id "INDEX_ID" \
  --name "HR-Documents-S3" \
  --type S3 \
  --configuration '{
    "S3Configuration": {
      "BucketName": "hr-documents-bucket",
      "InclusionPrefixes": ["policies/", "procedures/"],
      "DocumentsMetadataConfiguration": {
        "S3Prefix": "metadata/"
      }
    }
  }' \
  --role-arn "arn:aws:iam::123456789:role/KendraS3Role" \
  --schedule "cron(0 12 * * ? *)"

ราคา (ประมาณการในบาท)

ราคาคำนวณจาก 1 USD = 35 บาท

Developer Edition

รายการ	USD	บาท
ค่า index (รายเดือน)	$810/เดือน	~28,350 บาท/เดือน
รองรับ documents	สูงสุด 10,000 เอกสาร	—
Query limit	4,000 queries/วัน	—

Enterprise Edition

รายการ	USD	บาท
ค่า index (รายเดือน)	$7,000/เดือน	~245,000 บาท/เดือน
ค่า query เพิ่มเติม	$0.0007/query	~0.0245 บาท/query
รองรับ documents	สูงสุด 500,000 เอกสาร	—

หมายเหตุ

ไม่มี Free tier สำหรับ production use
Developer Edition เหมาะสำหรับ PoC และ dev/test environment
Enterprise Edition รองรับ SLA 99.9% uptime

ตัวอย่างค่าใช้จ่ายรายเดือน

PoC/Dev (Developer Edition): ~28,350 บาท/เดือน
Enterprise 100,000 เอกสาร + 500,000 query/เดือน: ~245,000 + 12.25 = ~245,012 บาท/เดือน

เหมาะสำหรับ

องค์กรขนาดใหญ่ที่มีเอกสารหลายแสนหน้าและต้องการค้นหาข้อมูลภายในอย่างมีประสิทธิภาพ
บริษัทที่มี knowledge base กระจายอยู่ใน SharePoint, Confluence, Salesforce และ S3
ทีม IT Helpdesk ที่ต้องการ self-service portal ให้พนักงานค้นหาคำตอบเองได้
ทีม HR, Legal ที่ต้องค้นหาข้อมูลในนโยบาย, สัญญา, กฎหมายจำนวนมาก
แพลตฟอร์มที่ต้องการสร้าง RAG (Retrieval-Augmented Generation) pipeline กับ LLM
Call center ที่ต้องการให้ agent ค้นหาข้อมูล product และ policy ได้รวดเร็ว

ใช้ร่วมกับ AWS Services

Amazon S3 — แหล่งเก็บเอกสาร PDF, Word, HTML
Amazon Lex — บอทที่ตอบคำถามโดยค้นหาจาก Kendra
Amazon Bedrock — RAG pipeline ส่ง retrieved passages ให้ LLM
AWS Lambda — document enrichment, custom data source
Amazon CloudWatch — monitor query metrics, indexing errors
AWS IAM / IAM Identity Center — จัดการ user permissions
Amazon OpenSearch — ใช้ร่วมกันสำหรับ hybrid search
AWS Amplify — สร้าง search UI สำหรับ web app
Amazon QuickSight — วิเคราะห์ query patterns และ search analytics

Use Case ตัวอย่าง

1. Knowledge Base ระดับองค์กรสำหรับบริษัทเทคโนโลยีขนาดใหญ่

บริษัทเทคโนโลยีที่มีพนักงาน 10,000 คน deploy Kendra Enterprise Edition เชื่อมต่อกับ Confluence (300,000 pages), SharePoint (150,000 files), Jira (500,000 issues) และ S3 (50,000 PDF) พนักงานสามารถค้นหาคำตอบเกี่ยวกับ IT policy, architecture decisions, deployment procedures ผ่าน intranet portal ที่ใช้ Kendra เป็น backend ลดเวลาค้นหาข้อมูลจากเฉลี่ย 20 นาทีเหลือ 2 นาที ประหยัดเวลาพนักงานรวม 15,000 ชั่วโมง/เดือน

2. Legal Document Search สำหรับสำนักงานกฎหมาย

สำนักงานกฎหมายขนาดใหญ่ใช้ Kendra index สัญญา, คำพิพากษา, กฎหมาย และ legal precedents กว่า 200,000 เอกสาร Kendra Retrieve API ถูกใช้ร่วมกับ Claude บน Bedrock ให้ทนายความถามคำถามเช่น "หาคดีที่เกี่ยวข้องกับการละเมิดสิทธิ์ข้อมูลส่วนบุคคลในช่วง 3 ปีที่ผ่านมา" และได้รับคำตอบพร้อมอ้างอิงเอกสารต้นฉบับ ลดเวลา legal research จากหลายชั่วโมงเหลือไม่กี่นาที

3. Customer Support Knowledge Base สำหรับบริษัทประกันภัย

บริษัทประกันภัยสร้าง customer-facing search portal ที่ index product documentation, FAQ, terms & conditions ลูกค้าค้นหาข้อมูลกรมธรรม์ เงื่อนไขการเคลม และวิธีการชำระเงิน โดยตรงบน website ใช้ Kendra Retrieve API + Bedrock สร้าง AI assistant ที่ตอบคำถามพร้อมแสดง source document ลดปริมาณการโทรเข้า call center ได้ 40% ในช่วง 6 เดือนแรก

สถาปัตยกรรม​

ฟีเจอร์หลัก​

Data Source Connectors​

Query Suggestions​

Document Enrichment​

Relevance Tuning​

Incremental Learning / Feedback​

Kendra Retrieve API​

Faceted Search​

Featured Results​

Access Control (Document-level Permissions)​

การติดตั้งและการตั้งค่า​

สร้าง Kendra Index ผ่าน Console​

IAM Permissions ที่จำเป็น​

ติดตั้ง SDK​

วิธีใช้งาน​

ตัวอย่าง: ค้นหาด้วย Kendra Query API (Python)​

ตัวอย่าง: ใช้ Retrieve API สำหรับ RAG​

ตัวอย่าง: เพิ่มเอกสารลง Index โดยตรง​

ตัวอย่าง: สร้าง Data Source จาก S3 ด้วย CLI​

ราคา (ประมาณการในบาท)​

Developer Edition​

Enterprise Edition​

หมายเหตุ​

ตัวอย่างค่าใช้จ่ายรายเดือน​

เหมาะสำหรับ​

ใช้ร่วมกับ AWS Services​

Use Case ตัวอย่าง​

1. Knowledge Base ระดับองค์กรสำหรับบริษัทเทคโนโลยีขนาดใหญ่​

2. Legal Document Search สำหรับสำนักงานกฎหมาย​

3. Customer Support Knowledge Base สำหรับบริษัทประกันภัย​