Document understanding

Gemini models can process documents in PDF format, using native vision to understand the entire document context. This goes beyond simple text extraction, allowing Gemini to:

Analyze and interpret content, including text, images, diagrams, charts, and tables, even in long documents of up to 1000 pages.
Extract information into structured output formats.
Summarize and answer questions based on both the visual and textual elements in a document.
Transcribe document content (e.g. to HTML), preserving layouts and formatting, for use in downstream applications.

Passing inline PDF data

You can pass PDF data inline in the request to generateContent. For PDF payloads under 20 MB, you can choose between uploading base64-encoded documents or directly uploading locally stored files.

The following example shows how to fetch a PDF from a URL and convert it to bytes for processing:

Python

from google import genai
from google.genai import types
import httpx

client = genai.Client()

doc_url = "https://discovery.ucl.ac.uk/id/eprint/10089234/1/343019_3_art_0_py4t4l_convrt.pdf"

# Retrieve the PDF bytes
doc_data = httpx.get(doc_url).content

prompt = "Summarize this document"
response = client.models.generate_content(
  model="gemini-2.5-flash",
  contents=[
      types.Part.from_bytes(
        data=doc_data,
        mime_type='application/pdf',
      ),
      prompt])
print(response.text)

JavaScript

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "GEMINI_API_KEY" });

async function main() {
    const pdfResp = await fetch('https://discovery.ucl.ac.uk/id/eprint/10089234/1/343019_3_art_0_py4t4l_convrt.pdf')
        .then((response) => response.arrayBuffer());

    const contents = [
        { text: "Summarize this document" },
        {
            inlineData: {
                mimeType: 'application/pdf',
                data: Buffer.from(pdfResp).toString("base64")
            }
        }
    ];

    const response = await ai.models.generateContent({
        model: "gemini-2.5-flash",
        contents: contents
    });
    console.log(response.text);
}

main();

Go

package main

import (
    "context"
    "fmt"
    "io"
    "net/http"
    "os"
    "google.golang.org/genai"
)

func main() {

    ctx := context.Background()
    client, _ := genai.NewClient(ctx, &genai.ClientConfig{
        APIKey:  os.Getenv("GEMINI_API_KEY"),
        Backend: genai.BackendGeminiAPI,
    })

    pdfResp, _ := http.Get("https://discovery.ucl.ac.uk/id/eprint/10089234/1/343019_3_art_0_py4t4l_convrt.pdf")
    var pdfBytes []byte
    if pdfResp != nil && pdfResp.Body != nil {
        pdfBytes, _ = io.ReadAll(pdfResp.Body)
        pdfResp.Body.Close()
    }

    parts := []*genai.Part{
        &genai.Part{
            InlineData: &genai.Blob{
                MIMEType: "application/pdf",
                Data:     pdfBytes,
            },
        },
        genai.NewPartFromText("Summarize this document"),
    }

    contents := []*genai.Content{
        genai.NewContentFromParts(parts, genai.RoleUser),
    }

    result, _ := client.Models.GenerateContent(
        ctx,
        "gemini-2.5-flash",
        contents,
        nil,
    )

    fmt.Println(result.Text())
}

REST

DOC_URL="https://discovery.ucl.ac.uk/id/eprint/10089234/1/343019_3_art_0_py4t4l_convrt.pdf"
PROMPT="Summarize this document"
DISPLAY_NAME="base64_pdf"

# Download the PDF
wget -O "${DISPLAY_NAME}.pdf" "${DOC_URL}"

# Check for FreeBSD base64 and set flags accordingly
if [[ "$(base64 --version 2>&1)" = *"FreeBSD"* ]]; then
  B64FLAGS="--input"
else
  B64FLAGS="-w0"
fi

# Base64 encode the PDF
ENCODED_PDF=$(base64 $B64FLAGS "${DISPLAY_NAME}.pdf")

# Generate content using the base64 encoded PDF.
# The prompt is expanded inside double quotes so that its spaces do not split the JSON argument.
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[
          {"inline_data": {"mime_type": "application/pdf", "data": "'"$ENCODED_PDF"'"}},
          {"text": "'"$PROMPT"'"}
        ]
      }]
    }' 2> /dev/null > response.json

cat response.json
echo

jq ".candidates[].content.parts[].text" response.json

# Clean up the downloaded PDF
rm "${DISPLAY_NAME}.pdf"
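
The REST request above embeds the PDF as a base64 string inside the JSON body. As a minimal sketch of that same body assembled in Python with only the standard library, the helper below reuses the field names from the REST example (inline_data, mime_type, data, text). The function name build_inline_pdf_request is hypothetical and is not part of any SDK; the sketch only builds the body and does not send it.

```python
import base64
import json


def build_inline_pdf_request(pdf_bytes: bytes, prompt: str) -> str:
    """Return a JSON request body with the PDF embedded as base64 inline data.

    Hypothetical helper: it mirrors the body of the REST example and
    performs no network call.
    """
    # base64 output is plain ASCII, so it can be placed in a JSON string as-is.
    encoded_pdf = base64.b64encode(pdf_bytes).decode("ascii")
    body = {
        "contents": [
            {
                "parts": [
                    {
                        "inline_data": {
                            "mime_type": "application/pdf",
                            "data": encoded_pdf,
                        }
                    },
                    {"text": prompt},
                ]
            }
        ]
    }
    return json.dumps(body)


if __name__ == "__main__":
    # Placeholder bytes stand in for a real PDF (assumption for illustration).
    print(build_inline_pdf_request(b"%PDF-1.4 placeholder", "Summarize this document"))
```

Building the body with json.dumps avoids the shell quoting pitfalls of splicing variables into a JSON literal by hand.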

You can also read a PDF from a local file for processing:

Python

from google import genai
from google.genai import types
import pathlib

client = genai.Client()

# Path to the local PDF
filepath = pathlib.Path('file.pdf')

prompt = "Summarize this document"
response = client.models.generate_content(
  model="gemini-2.5-flash",
  contents=[
      types.Part.from_bytes(
        data=filepath.read_bytes(),
        mime_type='application/pdf',
      ),
      prompt])
print(response.text)

JavaScript

import { GoogleGenAI } from "@google/genai";
import * as fs from 'fs';

const ai = new GoogleGenAI({ apiKey: "GEMINI_API_KEY" });

async function main() {
    const contents = [
        { text: "Summarize this document" },
        {
            inlineData: {
                mimeType: 'application/pdf',
                data: Buffer.from(fs.readFileSync("content/343019_3_art_0_py4t4l_convrt.pdf")).toString("base64")
            }
        }
    ];

    const response = await ai.models.generateContent({
        model: "gemini-2.5-flash",
        contents: contents
    });
    console.log(response.text);
}

main();

Go

package main

import (
    "context"
    "fmt"
    "os"
    "google.golang.org/genai"
)

func main() {

    ctx := context.Background()
    client, _ := genai.NewClient(ctx, &genai.ClientConfig{
        APIKey:  os.Getenv("GEMINI_API_KEY"),
        Backend: genai.BackendGeminiAPI,
    })

    pdfBytes, _ := os.ReadFile("path/to/your/file.pdf")

    parts := []*genai.Part{
        &genai.Part{
            InlineData: &genai.Blob{
                MIMEType: "application/pdf",
                Data:     pdfBytes,
            },
        },
        genai.NewPartFromText("Summarize this document"),
    }
    contents := []*genai.Content{
        genai.NewContentFromParts(parts, genai.RoleUser),
    }

    result, _ := client.Models.GenerateContent(
        ctx,
        "gemini-2.5-flash",
        contents,
        nil,
    )

    fmt.Println(result.Text())
}

Uploading PDFs using the File API

You can use the File API to upload larger documents. Always use the File API when the total request size (including the files, text prompt, system instructions, etc.) is larger than 20 MB.
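
As a rough illustration of the 20 MB rule, the sketch below estimates the size of a request and reports whether the File API is the safer route. The names base64_size and should_use_file_api are hypothetical. The sketch also assumes that "20 MB" means 20 * 1024 * 1024 bytes and that the base64-encoded size is what counts toward the limit; neither detail is confirmed by this page.

```python
import math

# Assumption: "20 MB" is interpreted as 20 MiB.
INLINE_LIMIT_BYTES = 20 * 1024 * 1024


def base64_size(num_bytes: int) -> int:
    """Size in bytes of the padded base64 encoding of num_bytes of data."""
    return 4 * math.ceil(num_bytes / 3)


def should_use_file_api(
    pdf_bytes: int,
    other_request_bytes: int = 0,
    base64_encoded: bool = True,
) -> bool:
    """Return True when the estimated total request size exceeds the inline limit.

    other_request_bytes covers the text prompt, system instructions, and so on.
    """
    payload = base64_size(pdf_bytes) if base64_encoded else pdf_bytes
    return payload + other_request_bytes > INLINE_LIMIT_BYTES


if __name__ == "__main__":
    for size in (1_000_000, 16 * 1024 * 1024, 25 * 1024 * 1024):
        print(size, should_use_file_api(size))
```

Because base64 grows data by about one third, a file somewhat below 20 MB on disk can still push an inline request over the limit under these assumptions.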

Call media.upload to upload a file using the File API. The following code uploads a document file and then uses the file in a call to models.generateContent.

Large PDFs from URLs

Use the File API to simplify uploading and processing large PDF files from URLs:

Python

from google import genai
from google.genai import types
import io
import httpx

client = genai.Client()

long_context_pdf_path = "https://www.nasa.gov/wp-content/uploads/static/history/alsj/a17/A17_FlightPlan.pdf"

# Retrieve and upload the PDF using the File API
doc_io = io.BytesIO(httpx.get(long_context_pdf_path).content)

sample_doc = client.files.upload(
  # You can pass a path or a file-like object here
  file=doc_io,
  config=dict(
    mime_type='application/pdf')
)

prompt = "Summarize this document"

response = client.models.generate_content(
  model="gemini-2.5-flash",
  contents=[sample_doc, prompt])
print(response.text)

JavaScript

import { createPartFromUri, GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "GEMINI_API_KEY" });

async function main() {

    const pdfBuffer = await fetch("https://www.nasa.gov/wp-content/uploads/static/history/alsj/a17/A17_FlightPlan.pdf")
        .then((response) => response.arrayBuffer());

    const fileBlob = new Blob([pdfBuffer], { type: 'application/pdf' });

    const file = await ai.files.upload({
        file: fileBlob,
        config: {
            displayName: 'A17_FlightPlan.pdf',
        },
    });

    // Wait for the file to be processed.
    let getFile = await ai.files.get({ name: file.name });
    while (getFile.state === 'PROCESSING') {
        getFile = await ai.files.get({ name: file.name });
        console.log(`current file status: ${getFile.state}`);
        console.log('File is still processing, retrying in 5 seconds');

        await new Promise((resolve) => {
            setTimeout(resolve, 5000);
        });
    }
    // Check the refreshed state, not the state returned by the initial upload.
    if (getFile.state === 'FAILED') {
        throw new Error('File processing failed.');
    }

    // Add the file to the contents.
    const content = [
        'Summarize this document',
    ];

    if (file.uri && file.mimeType) {
        const fileContent = createPartFromUri(file.uri, file.mimeType);
        content.push(fileContent);
    }

    const response = await ai.models.generateContent({
        model: 'gemini-2.5-flash',
        contents: content,
    });

    console.log(response.text);

}

main();

Go

package main

import (
    "context"
    "fmt"
    "io"
    "net/http"
    "os"
    "google.golang.org/genai"
)

func main() {

    ctx := context.Background()
    client, _ := genai.NewClient(ctx, &genai.ClientConfig{
        APIKey:  os.Getenv("GEMINI_API_KEY"),
        Backend: genai.BackendGeminiAPI,
    })

    pdfURL := "https://www.nasa.gov/wp-content/uploads/static/history/alsj/a17/A17_FlightPlan.pdf"
    localPdfPath := "A17_FlightPlan_downloaded.pdf"

    respHttp, _ := http.Get(pdfURL)
    defer respHttp.Body.Close()

    outFile, _ := os.Create(localPdfPath)
    defer outFile.Close()

    _, _ = io.Copy(outFile, respHttp.Body)

    uploadConfig := &genai.UploadFileConfig{MIMEType: "application/pdf"}
    uploadedFile, _ := client.Files.UploadFromPath(ctx, localPdfPath, uploadConfig)

    promptParts := []*genai.Part{
        genai.NewPartFromURI(uploadedFile.URI, uploadedFile.MIMEType),
        genai.NewPartFromText("Summarize this document"),
    }
    contents := []*genai.Content{
        genai.NewContentFromParts(promptParts, genai.RoleUser), // Specify role
    }

    result, _ := client.Models.GenerateContent(
        ctx,
        "gemini-2.5-flash",
        contents,
        nil,
    )

    fmt.Println(result.Text())
}

REST

BASE_URL="https://generativelanguage.googleapis.com"
PDF_PATH="https://www.nasa.gov/wp-content/uploads/static/history/alsj/a17/A17_FlightPlan.pdf"
DISPLAY_NAME="A17_FlightPlan"
PROMPT="Summarize this document"

# Download the PDF from the provided URL
wget -O "${DISPLAY_NAME}.pdf" "${PDF_PATH}"

MIME_TYPE=$(file -b --mime-type "${DISPLAY_NAME}.pdf")
NUM_BYTES=$(wc -c < "${DISPLAY_NAME}.pdf")

echo "MIME_TYPE: ${MIME_TYPE}"
echo "NUM_BYTES: ${NUM_BYTES}"

tmp_header_file=upload-header.tmp

# Initial resumable request defining metadata.
# The upload URL is in the response headers; dump them to a file.
curl "${BASE_URL}/upload/v1beta/files?key=${GOOGLE_API_KEY}" \
  -D upload-header.tmp \
  -H "X-Goog-Upload-Protocol: resumable" \
  -H "X-Goog-Upload-Command: start" \
  -H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPE}" \
  -H "Content-Type: application/json" \
  -d "{'file': {'display_name': '${DISPLAY_NAME}'}}" 2> /dev/null

upload_url=$(grep -i "x-goog-upload-url: " "${tmp_header_file}" | cut -d" " -f2 | tr -d "\r")
rm "${tmp_header_file}"

# Upload the actual bytes.
curl "${upload_url}" \
  -H "Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Offset: 0" \
  -H "X-Goog-Upload-Command: upload, finalize" \
  --data-binary "@${DISPLAY_NAME}.pdf" 2> /dev/null > file_info.json

# jq prints the URI with its surrounding double quotes, which the JSON body below relies on.
file_uri=$(jq ".file.uri" file_info.json)
echo "file_uri: ${file_uri}"

# Now generate content using that file
curl "${BASE_URL}/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[
          {"text": "'"$PROMPT"'"},
          {"file_data":{"mime_type": "application/pdf", "file_uri": '$file_uri'}}]
        }]
      }' 2> /dev/null > response.json

cat response.json
echo

jq ".candidates[].content.parts[].text" response.json

# Clean up the downloaded PDF
rm "${DISPLAY_NAME}.pdf"

Large PDFs stored locally

Python

from google import genai
import pathlib

client = genai.Client()

# Path to the local PDF
file_path = pathlib.Path('large_file.pdf')

# Upload the PDF using the File API
sample_file = client.files.upload(
  file=file_path,
)

prompt = "Summarize this document"

response = client.models.generate_content(
  model="gemini-2.5-flash",
  contents=[sample_file, prompt])
print(response.text)

JavaScript

import { createPartFromUri, GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "GEMINI_API_KEY" });

async function main() {
    const file = await ai.files.upload({
        file: 'path-to-localfile.pdf',
        config: {
            displayName: 'A17_FlightPlan.pdf',
        },
    });

    // Wait for the file to be processed.
    let getFile = await ai.files.get({ name: file.name });
    while (getFile.state === 'PROCESSING') {
        getFile = await ai.files.get({ name: file.name });
        console.log(`current file status: ${getFile.state}`);
        console.log('File is still processing, retrying in 5 seconds');

        await new Promise((resolve) => {
            setTimeout(resolve, 5000);
        });
    }
    // Check the refreshed state, not the state returned by the initial upload.
    if (getFile.state === 'FAILED') {
        throw new Error('File processing failed.');
    }

    // Add the file to the contents.
    const content = [
        'Summarize this document',
    ];

    if (file.uri && file.mimeType) {
        const fileContent = createPartFromUri(file.uri, file.mimeType);
        content.push(fileContent);
    }

    const response = await ai.models.generateContent({
        model: 'gemini-2.5-flash',
        contents: content,
    });

    console.log(response.text);

}

main();

Go

package main

import (
    "context"
    "fmt"
    "os"
    "google.golang.org/genai"
)

func main() {

    ctx := context.Background()
    client, _ := genai.NewClient(ctx, &genai.ClientConfig{
        APIKey:  os.Getenv("GEMINI_API_KEY"),
        Backend: genai.BackendGeminiAPI,
    })
    localPdfPath := "/path/to/file.pdf"

    uploadConfig := &genai.UploadFileConfig{MIMEType: "application/pdf"}
    uploadedFile, _ := client.Files.UploadFromPath(ctx, localPdfPath, uploadConfig)

    promptParts := []*genai.Part{
        genai.NewPartFromURI(uploadedFile.URI, uploadedFile.MIMEType),
        genai.NewPartFromText("Give me a summary of this pdf file."),
    }
    contents := []*genai.Content{
        genai.NewContentFromParts(promptParts, genai.RoleUser),
    }

    result, _ := client.Models.GenerateContent(
        ctx,
        "gemini-2.5-flash",
        contents,
        nil,
    )

    fmt.Println(result.Text())
}

REST

# PDF_PATH must point to the local PDF file to upload.
BASE_URL="https://generativelanguage.googleapis.com"
NUM_BYTES=$(wc -c < "${PDF_PATH}")
DISPLAY_NAME=TEXT
tmp_header_file=upload-header.tmp

# Initial resumable request defining metadata.
# The upload URL is in the response headers; dump them to a file.
curl "${BASE_URL}/upload/v1beta/files?key=${GOOGLE_API_KEY}" \
  -D upload-header.tmp \
  -H "X-Goog-Upload-Protocol: resumable" \
  -H "X-Goog-Upload-Command: start" \
  -H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Header-Content-Type: application/pdf" \
  -H "Content-Type: application/json" \
  -d "{'file': {'display_name': '${DISPLAY_NAME}'}}" 2> /dev/null

upload_url=$(grep -i "x-goog-upload-url: " "${tmp_header_file}" | cut -d" " -f2 | tr -d "\r")
rm "${tmp_header_file}"

# Upload the actual bytes.
curl "${upload_url}" \
  -H "Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Offset: 0" \
  -H "X-Goog-Upload-Command: upload, finalize" \
  --data-binary "@${PDF_PATH}" 2> /dev/null > file_info.json

file_uri=$(jq ".file.uri" file_info.json)
echo file_uri=$file_uri

# Now generate content using that file
curl "${BASE_URL}/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[
          {"text": "Summarize this document"},
          {"file_data":{"mime_type": "application/pdf", "file_uri": '$file_uri'}}]
        }]
      }' 2> /dev/null > response.json

cat response.json
echo

jq ".candidates[].content.parts[].text" response.json

You can verify that the API successfully stored the uploaded file and get its metadata by calling files.get. Only the name (and by extension, the uri) is unique.

Python

from google import genai
import pathlib

client = genai.Client()

fpath = pathlib.Path('example.txt')
fpath.write_text('hello')

file = client.files.upload(file='example.txt')

file_info = client.files.get(name=file.name)
print(file_info.model_dump_json(indent=4))

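REST

# file_info.json holds the upload response from the previous step.
# -r strips the JSON quotes; the name already includes the "files/" prefix.
name=$(jq -r ".file.name" file_info.json)

# Get the file of interest to check state
curl "https://generativelanguage.googleapis.com/v1beta/${name}?key=${GOOGLE_API_KEY}" > file_info.json

# Print some information about the file you got.
# files.get returns the file resource directly, without the "file" wrapper of the upload response.
name=$(jq ".name" file_info.json)
echo name=$name
file_uri=$(jq ".uri" file_info.json)
echo file_uri=$file_uri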
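
The JavaScript examples on this page poll files.get until the file leaves the PROCESSING state. The same idea can be sketched in Python without depending on any SDK signature: the hypothetical helper wait_until_processed takes a callable that returns the current state as a string. The state names PROCESSING and FAILED are taken from the JavaScript examples; the timeout and the injected sleep function are additions for illustration.

```python
import time
from typing import Callable


def wait_until_processed(
    get_state: Callable[[], str],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state until it stops returning 'PROCESSING'.

    Raises RuntimeError if processing failed and TimeoutError if the
    file is still processing after roughly timeout_seconds of waiting.
    """
    waited = 0.0
    state = get_state()
    while state == "PROCESSING":
        if waited >= timeout_seconds:
            raise TimeoutError("File is still processing after the timeout.")
        sleep(poll_seconds)
        waited += poll_seconds
        state = get_state()
    if state == "FAILED":
        raise RuntimeError("File processing failed.")
    return state
```

With the Python client, get_state could be a small function that calls client.files.get(name=file.name) and returns the state as a string; how the state is exposed on the returned object is an assumption to verify against the SDK.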

Passing multiple PDFs

The Gemini API can process multiple PDF documents (up to 1000 pages) in a single request, as long as the combined size of the documents and the text prompt stays within the model's context window.

Python

from google import genai
import io
import httpx

client = genai.Client()

doc_url_1 = "https://arxiv.org/pdf/2312.11805"
doc_url_2 = "https://arxiv.org/pdf/2403.05530"

# Retrieve and upload both PDFs using the File API
doc_data_1 = io.BytesIO(httpx.get(doc_url_1).content)
doc_data_2 = io.BytesIO(httpx.get(doc_url_2).content)

sample_pdf_1 = client.files.upload(
  file=doc_data_1,
  config=dict(mime_type='application/pdf')
)
sample_pdf_2 = client.files.upload(
  file=doc_data_2,
  config=dict(mime_type='application/pdf')
)

prompt = "What is the difference between each of the main benchmarks between these two papers? Output these in a table."

response = client.models.generate_content(
  model="gemini-2.5-flash",
  contents=[sample_pdf_1, sample_pdf_2, prompt])
print(response.text)

JavaScript

import { createPartFromUri, GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "GEMINI_API_KEY" });

async function uploadRemotePDF(url, displayName) {
    const pdfBuffer = await fetch(url)
        .then((response) => response.arrayBuffer());

    const fileBlob = new Blob([pdfBuffer], { type: 'application/pdf' });

    const file = await ai.files.upload({
        file: fileBlob,
        config: {
            displayName: displayName,
        },
    });

    // Wait for the file to be processed.
    let getFile = await ai.files.get({ name: file.name });
    while (getFile.state === 'PROCESSING') {
        getFile = await ai.files.get({ name: file.name });
        console.log(`current file status: ${getFile.state}`);
        console.log('File is still processing, retrying in 5 seconds');

        await new Promise((resolve) => {
            setTimeout(resolve, 5000);
        });
    }
    // Check the refreshed state, not the state returned by the initial upload.
    if (getFile.state === 'FAILED') {
        throw new Error('File processing failed.');
    }

    return file;
}

async function main() {
    const content = [
        'What is the difference between each of the main benchmarks between these two papers? Output these in a table.',
    ];

    let file1 = await uploadRemotePDF("https://arxiv.org/pdf/2312.11805", "PDF 1")
    if (file1.uri && file1.mimeType) {
        const fileContent = createPartFromUri(file1.uri, file1.mimeType);
        content.push(fileContent);
    }
    let file2 = await uploadRemotePDF("https://arxiv.org/pdf/2403.05530", "PDF 2")
    if (file2.uri && file2.mimeType) {
        const fileContent = createPartFromUri(file2.uri, file2.mimeType);
        content.push(fileContent);
    }

    const response = await ai.models.generateContent({
        model: 'gemini-2.5-flash',
        contents: content,
    });

    console.log(response.text);
}

main();

Go

package main

import (
    "context"
    "fmt"
    "io"
    "net/http"
    "os"
    "google.golang.org/genai"
)

func main() {

    ctx := context.Background()
    client, _ := genai.NewClient(ctx, &genai.ClientConfig{
        APIKey:  os.Getenv("GEMINI_API_KEY"),
        Backend: genai.BackendGeminiAPI,
    })

    docUrl1 := "https://arxiv.org/pdf/2312.11805"
    docUrl2 := "https://arxiv.org/pdf/2403.05530"
    localPath1 := "doc1_downloaded.pdf"
    localPath2 := "doc2_downloaded.pdf"

    respHttp1, _ := http.Get(docUrl1)
    defer respHttp1.Body.Close()

    outFile1, _ := os.Create(localPath1)
    _, _ = io.Copy(outFile1, respHttp1.Body)
    outFile1.Close()

    respHttp2, _ := http.Get(docUrl2)
    defer respHttp2.Body.Close()

    outFile2, _ := os.Create(localPath2)
    _, _ = io.Copy(outFile2, respHttp2.Body)
    outFile2.Close()

    uploadConfig1 := &genai.UploadFileConfig{MIMEType: "application/pdf"}
    uploadedFile1, _ := client.Files.UploadFromPath(ctx, localPath1, uploadConfig1)

    uploadConfig2 := &genai.UploadFileConfig{MIMEType: "application/pdf"}
    uploadedFile2, _ := client.Files.UploadFromPath(ctx, localPath2, uploadConfig2)

    promptParts := []*genai.Part{
        genai.NewPartFromURI(uploadedFile1.URI, uploadedFile1.MIMEType),
        genai.NewPartFromURI(uploadedFile2.URI, uploadedFile2.MIMEType),
        genai.NewPartFromText("What is the difference between each of the " +
                              "main benchmarks between these two papers? " +
                              "Output these in a table."),
    }
    contents := []*genai.Content{
        genai.NewContentFromParts(promptParts, genai.RoleUser),
    }

    modelName := "gemini-2.5-flash"
    result, _ := client.Models.GenerateContent(
        ctx,
        modelName,
        contents,
        nil,
    )

    fmt.Println(result.Text())
}

REST

BASE_URL="https://generativelanguage.googleapis.com"
DOC_URL_1="https://arxiv.org/pdf/2312.11805"
DOC_URL_2="https://arxiv.org/pdf/2403.05530"
DISPLAY_NAME_1="Gemini_paper"
DISPLAY_NAME_2="Gemini_1.5_paper"
PROMPT="What is the difference between each of the main benchmarks between these two papers? Output these in a table."

# Function to download and upload a PDF.
# Only the file URI is written to stdout so that callers can capture it;
# diagnostic messages go to stderr.
upload_pdf() {
  local doc_url="$1"
  local display_name="$2"

  # Download the PDF
  wget -O "${display_name}.pdf" "${doc_url}"

  local MIME_TYPE=$(file -b --mime-type "${display_name}.pdf")
  local NUM_BYTES=$(wc -c < "${display_name}.pdf")

  echo "MIME_TYPE: ${MIME_TYPE}" >&2
  echo "NUM_BYTES: ${NUM_BYTES}" >&2

  local tmp_header_file=upload-header.tmp

  # Initial resumable request
  curl "${BASE_URL}/upload/v1beta/files?key=${GOOGLE_API_KEY}" \
    -D "${tmp_header_file}" \
    -H "X-Goog-Upload-Protocol: resumable" \
    -H "X-Goog-Upload-Command: start" \
    -H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
    -H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPE}" \
    -H "Content-Type: application/json" \
    -d "{'file': {'display_name': '${display_name}'}}" > /dev/null 2>&1

  local upload_url=$(grep -i "x-goog-upload-url: " "${tmp_header_file}" | cut -d" " -f2 | tr -d "\r")
  rm "${tmp_header_file}"

  # Upload the PDF
  curl "${upload_url}" \
    -H "Content-Length: ${NUM_BYTES}" \
    -H "X-Goog-Upload-Offset: 0" \
    -H "X-Goog-Upload-Command: upload, finalize" \
    --data-binary "@${display_name}.pdf" 2> /dev/null > "file_info_${display_name}.json"

  local file_uri=$(jq ".file.uri" "file_info_${display_name}.json")
  echo "file_uri for ${display_name}: ${file_uri}" >&2

  # Clean up the downloaded PDF
  rm "${display_name}.pdf"

  echo "${file_uri}"
}

# Upload the first PDF
file_uri_1=$(upload_pdf "${DOC_URL_1}" "${DISPLAY_NAME_1}")

# Upload the second PDF
file_uri_2=$(upload_pdf "${DOC_URL_2}" "${DISPLAY_NAME_2}")

# Now generate content using both files
curl "${BASE_URL}/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[
          {"file_data": {"mime_type": "application/pdf", "file_uri": '$file_uri_1'}},
          {"file_data": {"mime_type": "application/pdf", "file_uri": '$file_uri_2'}},
          {"text": "'"$PROMPT"'"}
        ]
      }]
    }' 2> /dev/null > response.json

cat response.json
echo

jq ".candidates[].content.parts[].text" response.json

Technical details

Gemini supports a maximum of 1,000 document pages. Each document page is equivalent to 258 tokens.
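
The per-page figure makes it easy to estimate how much of the context window a set of documents will use. The sketch below applies the two numbers stated above (258 tokens per page, at most 1,000 pages). The function names and the context_window parameter are hypothetical, and treating the page limit as a total across all documents in a request is an assumption.

```python
from typing import Iterable

TOKENS_PER_PAGE = 258  # from the technical details above
MAX_PAGES = 1000       # assumed to apply to the request as a whole


def estimate_document_tokens(page_counts: Iterable[int]) -> int:
    """Estimate the tokens consumed by PDFs with the given page counts."""
    total_pages = sum(page_counts)
    if total_pages > MAX_PAGES:
        raise ValueError(f"{total_pages} pages exceeds the {MAX_PAGES}-page limit")
    return total_pages * TOKENS_PER_PAGE


def fits_in_context(
    page_counts: Iterable[int], prompt_tokens: int, context_window: int
) -> bool:
    """Check whether the documents plus the prompt fit in a given context window.

    context_window is supplied by the caller; no model-specific value is assumed.
    """
    return estimate_document_tokens(page_counts) + prompt_tokens <= context_window


if __name__ == "__main__":
    print(estimate_document_tokens([120, 45]))
```

The estimate covers document pages only; text prompts and system instructions add their own tokens on top.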

There are no specific limits on the number of pixels in a document besides the model's context window. Larger pages are scaled down to a maximum resolution of 3072x3072 while preserving their original aspect ratio, and smaller pages are scaled up to 768x768 pixels. There is no cost reduction for smaller pages, other than bandwidth, and no performance improvement for pages at higher resolution.
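
One possible reading of the scaling rule can be written down as a small function. This is only a sketch under stated assumptions: it treats the longest side as the one compared against 3072 and 768, and it assumes the aspect ratio is preserved when scaling up as well as when scaling down. The page does not spell out either detail, and the name scaled_page_size is hypothetical.

```python
from typing import Tuple


def scaled_page_size(
    width: int, height: int, max_side: int = 3072, min_side: int = 768
) -> Tuple[int, int]:
    """Return the page size after the assumed scaling rule is applied.

    Assumption: the longest side is clamped into [min_side, max_side]
    and the other side is scaled by the same factor.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest > max_side:
        scale = max_side / longest  # scale large pages down
    elif longest < min_side:
        scale = min_side / longest  # scale small pages up
    else:
        scale = 1.0                 # already within the range
    return round(width * scale), round(height * scale)


if __name__ == "__main__":
    for size in ((6144, 3072), (384, 192), (1000, 800)):
        print(size, "->", scaled_page_size(*size))
```

Since the token cost per page is fixed, pre-shrinking pages under this reading would save bandwidth but not tokens.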

Document types

Technically, you can pass other MIME types for document understanding, such as TXT, Markdown, HTML, and XML. However, document vision only meaningfully understands PDFs. Other types are extracted as pure text, and the model cannot interpret what you would see in the rendering of those files. Any file-type specifics such as charts, diagrams, HTML tags, and Markdown formatting are lost.
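
A simple pre-flight check can flag files that will be treated as plain text rather than analyzed visually. The sketch below guesses the MIME type from the file name with Python's standard mimetypes module; the function name document_handling and its two return labels are hypothetical.

```python
import mimetypes


def document_handling(path: str) -> str:
    """Return 'vision' for PDFs and 'text-only' for every other file type.

    The MIME type is guessed from the file extension alone; the file
    contents are not inspected.
    """
    mime_type, _encoding = mimetypes.guess_type(path)
    return "vision" if mime_type == "application/pdf" else "text-only"


if __name__ == "__main__":
    for name in ("report.pdf", "notes.txt", "page.html"):
        print(name, document_handling(name))
```

Files reported as text-only could be converted to PDF before upload if their layout, charts, or diagrams matter to the task.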

Best practices

For best results:

Rotate pages to the correct orientation before uploading.
Avoid blurry pages.
If using a single page, place the text prompt after the page.

What's next

To learn more, see the following resources:

File prompting strategies: The Gemini API supports prompting with text, image, audio, and video data, also known as multimodal prompting.
System instructions: System instructions let you steer the behavior of the model based on your specific needs and use cases.