Use a context cache

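This guide explains how to use a context cache in a generative AI application.

You can use the REST API or the Python SDK to reference content stored in a context cache from a generative AI application. Before you can use a context cache, you must first create one.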

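Context cache properties

The context cache object that you use in your code has the following properties:

  • name: The full resource name of the context cache. You must use this name when you reference the cache in a request. The name is returned in the response when you create the context cache.

    • Format: projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID
    • Example in a request body:

      "cached_content": "projects/123456789012/locations/us-central1/cachedContents/123456789012345678"
  • model: The resource name of the model used to create the cache.

    • Format: projects/PROJECT_NUMBER/locations/LOCATION/publishers/PUBLISHER_NAME/models/MODEL_ID
  • createTime: A Timestamp that specifies when the context cache was created.

  • updateTime: A Timestamp that specifies when the context cache was most recently updated. Until a cache is updated, its createTime and updateTime are the same.

  • expireTime: A Timestamp that specifies when the context cache expires. The default expiration time is 60 minutes after createTime. You can update the cache to set a new expiration time. After a cache expires, it is marked for deletion and can no longer be used or updated. To use an expired cache, you must recreate it.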
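The name format and the default expiration described above can be illustrated with a short sketch. The helpers `parse_cache_name` and `default_expire_time` are hypothetical names chosen for illustration and are not part of any SDK; the sketch assumes only the documented name format and the 60-minute default.

```python
import re
from datetime import datetime, timedelta, timezone

# Pattern for the documented format:
# projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID
_CACHE_NAME_RE = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/cachedContents/(?P<cache_id>[^/]+)$"
)


def parse_cache_name(name: str) -> dict:
    """Split a full cache resource name into its parts (hypothetical helper)."""
    match = _CACHE_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a full context cache resource name: {name!r}")
    return match.groupdict()


def default_expire_time(create_time: datetime) -> datetime:
    """Return the default expiration: 60 minutes after createTime."""
    return create_time + timedelta(minutes=60)


parts = parse_cache_name(
    "projects/123456789012/locations/us-central1/cachedContents/123456789012345678"
)
print(parts["location"])  # us-central1

created = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(default_expire_time(created).isoformat())  # 2025-01-01T13:00:00+00:00
```

A name that omits the `cachedContents/` segment does not match the pattern and raises `ValueError`, which can catch a malformed name before it is sent in a request.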

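Context cache usage restrictions

You can specify the following features when you create a context cache. Do not specify them again in later requests that use the cache:

  • GenerativeModel.system_instructions: Specifies instructions that the model should follow before it receives instructions from the user. For more information, see System instructions.

  • GenerativeModel.tool_config: Specifies a tool used by the Gemini model, such as a tool used by the function calling feature. For more information, see the tool_config reference.

  • GenerativeModel.tools: Specifies the functions used to build a function calling application. For more information, see Function calling.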
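As a rough illustration of this restriction, the following sketch checks a request configuration before it is sent. It is a hypothetical client-side guard, not SDK behavior: the function name `check_cached_request` and the plain-dict configuration are assumptions, and the keys simply reuse the feature names from the list above.

```python
# Features that are fixed when the cache is created (names from the list above).
CACHE_ONLY_FEATURES = ("system_instructions", "tool_config", "tools")


def check_cached_request(config: dict) -> list:
    """Return the features that should not be set on a request that uses a cache."""
    if not config.get("cached_content"):
        return []  # No cache is referenced, so nothing is restricted.
    return [key for key in CACHE_ONLY_FEATURES if config.get(key) is not None]


ok = {
    "cached_content": (
        "projects/123456789012/locations/us-central1"
        "/cachedContents/123456789012345678"
    )
}
# Same request, but it wrongly repeats a feature that belongs to the cache.
bad = dict(ok, tools=[{"function_declarations": []}])

print(check_cached_request(ok))   # []
print(check_cached_request(bad))  # ['tools']
```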

Context cache usage examples

The following code samples show how to use a context cache in a request.

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import GenerateContentConfig, HttpOptions

client = genai.Client(http_options=HttpOptions(api_version="v1"))

# Full resource name of an existing context cache.
# Example value only; replace it with the name returned when you created your cache.
cache_name = "projects/111111111111/locations/us-central1/cachedContents/1111111111111111111"

# Use content cache to generate text response
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarize the pdfs",
    config=GenerateContentConfig(
        cached_content=cache_name,
    ),
)
print(response.text)
# Example response
#   The Gemini family of multimodal models from Google DeepMind demonstrates remarkable capabilities across various
#   modalities, including image, audio, video, and text....

Go

Learn how to install or update Go.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

import (
	"context"
	"fmt"
	"io"

	genai "google.golang.org/genai"
)

// useContentCacheWithTxt shows how to use content cache to generate text content.
func useContentCacheWithTxt(w io.Writer, cacheName string) error {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	resp, err := client.Models.GenerateContent(ctx,
		"gemini-2.5-flash",
		genai.Text("Summarize the pdfs"),
		&genai.GenerateContentConfig{
			CachedContent: cacheName,
		},
	)
	if err != nil {
		return fmt.Errorf("failed to use content cache to generate content: %w", err)
	}

	respText := resp.Text()

	fmt.Fprintln(w, respText)

	// Example response:
	// The provided research paper introduces Gemini 1.5 Pro, a multimodal model capable of recalling
	// and reasoning over information from very long contexts (up to 10 million tokens).  Key findings include:
	//
	// * **Long Context Performance:**
	// ...

	return nil
}

REST

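You can use REST to use a context cache with a prompt by sending a POST request to the publisher model endpoint through the Vertex AI API.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region where the request to create the context cache was processed.
  • PROMPT_TEXT: The text prompt to submit to the model.

HTTP method and URL: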

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001:generateContent

Request JSON body:

{
  "cachedContent": "projects/PROJECT_NUMBER/locations/LOCATION/cachedContents/CACHE_ID",
  "contents": [
      {"role":"user","parts":[{"text":"PROMPT_TEXT"}]}
  ],
  "generationConfig": {
      "maxOutputTokens": 8192,
      "temperature": 1,
      "topP": 0.95
  },
  "safetySettings": [
      {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
          "category": "HARM_CATEGORY_HARASSMENT",
          "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      }
  ]
}
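A hand-edited JSON body is easy to break, for example with a stray trailing comma, so one option is to generate the URL and body from code. The following is a minimal sketch that uses only the Python standard library. The function name `build_request` is hypothetical, the argument values are examples to replace with your own, and the optional safetySettings field is left out for brevity.

```python
import json


def build_request(project_id, project_number, location, cache_id, prompt_text,
                  model_id="gemini-2.0-flash-001"):
    """Return (url, body) for a generateContent call that references a cache."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/{model_id}:generateContent"
    )
    body = {
        # Full resource name of the cache, in the documented format.
        "cachedContent": (
            f"projects/{project_number}/locations/{location}"
            f"/cachedContents/{cache_id}"
        ),
        "contents": [{"role": "user", "parts": [{"text": prompt_text}]}],
        "generationConfig": {
            "maxOutputTokens": 8192,
            "temperature": 1,
            "topP": 0.95,
        },
    }
    return url, body


url, body = build_request(
    "test-project", "123456789012", "us-central1", "123456789012345678",
    "What are the benefits of exercise?",
)
# json.dumps always emits syntactically valid JSON.
request_json = json.dumps(body, indent=2)
print(url)
print(request_json)
```

Writing `request_json` to a file named request.json produces a body that can be sent to the printed URL.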

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001:generateContent"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001:generateContent" | Select-Object -Expand Content

You should receive a JSON response.

Example curl command

LOCATION="us-central1"
MODEL_ID="gemini-2.0-flash-001"
PROJECT_ID="test-project"
# Example values; replace them with your project number and cache ID.
PROJECT_NUMBER="123456789012"
CACHE_ID="123456789012345678"

# The body is passed through a here-document so that the shell expands the variables.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:generateContent" \
-d @- <<EOF
{
  "cachedContent": "projects/${PROJECT_NUMBER}/locations/${LOCATION}/cachedContents/${CACHE_ID}",
  "contents": [
      {"role":"user","parts":[{"text":"What are the benefits of exercise?"}]}
  ],
  "generationConfig": {
      "maxOutputTokens": 8192,
      "temperature": 1,
      "topP": 0.95
  },
  "safetySettings": [
    {
      "category": "HARM_CATEGORY_HATE_SPEECH",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_HARASSMENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  ]
}
EOF