Veo 3.1
Generate high-quality videos with advanced control using Google DeepMind's Veo 3.1 model.
Endpoint
POST https://api.azerion.ai/v1/videos/generation
Description
Veo 3.1 builds on Veo 3 with richer and better-synchronized audio, first/last frame control for precise transitions, and subject reference images for consistent character and object appearance across generations. It supports text-to-video, image-to-video, and video extension.
Authentication
This endpoint requires authentication using an API key.
Request
{
  "model": "google-veo-3-1",
  "prompt": "A detective in a trench coat walks down a rain-soaked neon-lit alley, footsteps echoing against the wet pavement",
  "n": 1,
  "image": {
    "mime_type": "image/jpeg",
    "url": "https://example.com/images/first-frame.jpg",
    "base_64_encoded": "BASE64_FIRST_FRAME_DATA"
  },
  "last_frame": {
    "mime_type": "image/jpeg",
    "url": "https://example.com/images/last-frame.jpg",
    "base_64_encoded": "BASE64_LAST_FRAME_DATA"
  },
  "reference_images": [
    {
      "image": {
        "mime_type": "image/jpeg",
        "url": "https://example.com/images/reference.jpg",
        "base_64_encoded": "BASE64_REFERENCE_IMAGE_DATA"
      },
      "reference_type": "subject"
    }
  ],
  "video": {
    "mime_type": "video/mp4",
    "url": "https://example.com/videos/input-video.mp4",
    "base_64_encoded": "BASE64_ENCODED_VIDEO_DATA"
  },
  "generate_audio": true,
  "duration": 8,
  "resolution": "1080p",
  "aspect_ratio": "16:9",
  "seed": 42,
  "negative_prompt": "blurry, low quality, distorted",
  "person_generation": "allow_adult",
  "resize_mode": "crop",
  "compression_quality": "optimized",
  "output_format": "url"
}
Request Parameters
Core Parameters
- model (string, required): Model ID. Use google-veo-3-1.
- prompt (string, required): Text description of the desired video. Maximum 4000 characters. Required for text-to-video; optional but recommended for image-to-video.
- n (integer, optional): Number of videos to generate. Range: 1–4. Default: 1.
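The limits on the core parameters can be checked locally before a request is sent. The following is a minimal sketch for the text-to-video case; the helper name `build_core_payload` is hypothetical and not part of the API, and the server remains the authority on what it accepts.

```python
MAX_PROMPT_CHARS = 4000  # documented maximum prompt length


def build_core_payload(prompt: str, n: int = 1) -> dict:
    """Return the core request fields, rejecting values outside the documented limits.

    Hypothetical helper for illustration; it only mirrors the limits listed above.
    """
    if not prompt:
        raise ValueError("prompt is required for text-to-video")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if not 1 <= n <= 4:
        raise ValueError("n must be between 1 and 4")
    return {"model": "google-veo-3-1", "prompt": prompt, "n": n}
```

The returned dictionary can be merged with any optional fields and sent as the JSON request body.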
Image-to-Video
- image (object, optional): Input image for image-to-video generation. Also acts as the first frame when used with last_frame. Recommended: 720p+ resolution with 16:9 or 9:16 aspect ratio. Provide either url or base_64_encoded.
  - mime_type (string): MIME type of the image. Accepted: "image/jpeg", "image/png".
  - url (string): URL of the input image.
  - base_64_encoded (string): Base64-encoded image data.
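An image object carries either a URL or inline base64 data. The sketch below shows one way to build either form in Python; the helper names (`image_from_bytes`, `image_from_file`, `image_from_url`) are hypothetical, and the MIME type of a local file is guessed from its extension, which is an assumption about how your files are named.

```python
import base64
import mimetypes
from pathlib import Path

ACCEPTED_IMAGE_TYPES = {"image/jpeg", "image/png"}  # documented accepted MIME types


def image_from_bytes(data: bytes, mime_type: str) -> dict:
    """Build an image object that embeds the image as base64 text."""
    if mime_type not in ACCEPTED_IMAGE_TYPES:
        raise ValueError(f"unsupported image type: {mime_type}")
    encoded = base64.b64encode(data).decode("ascii")
    return {"mime_type": mime_type, "base_64_encoded": encoded}


def image_from_file(path: str) -> dict:
    """Read a local file and embed it; the MIME type is guessed from the extension."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type not in ACCEPTED_IMAGE_TYPES:
        raise ValueError(f"unsupported image type: {mime_type}")
    return image_from_bytes(Path(path).read_bytes(), mime_type)


def image_from_url(url: str, mime_type: str = "image/jpeg") -> dict:
    """Build an image object that points at a hosted image instead of embedding it."""
    if mime_type not in ACCEPTED_IMAGE_TYPES:
        raise ValueError(f"unsupported image type: {mime_type}")
    return {"mime_type": mime_type, "url": url}
```

Each helper sets only one of url or base_64_encoded, matching the "provide either" rule. The same objects fit the image, last_frame, and reference image fields.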
Last Frame Control
- last_frame (object, optional): Specifies the final frame of the video. Use with image (first frame) to create controlled transitions between two states. Provide either url or base_64_encoded.
  - mime_type (string): MIME type of the image. Accepted: "image/jpeg", "image/png".
  - url (string): URL of the last frame image.
  - base_64_encoded (string): Base64-encoded image data.
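A first/last frame transition pairs image with last_frame in one request body. The following sketch builds such a body from two hosted images; `build_transition_payload` is a hypothetical helper, and using the same MIME type for both frames is an assumption made to keep the example short.

```python
def build_transition_payload(
    prompt: str,
    first_frame_url: str,
    last_frame_url: str,
    mime_type: str = "image/jpeg",
    duration: int = 8,
) -> dict:
    """Build a request body that transitions from a first frame to a last frame.

    Hypothetical helper; both frames are referenced by URL and share one MIME type.
    """
    return {
        "model": "google-veo-3-1",
        "prompt": prompt,
        "image": {"mime_type": mime_type, "url": first_frame_url},  # first frame
        "last_frame": {"mime_type": mime_type, "url": last_frame_url},  # final frame
        "duration": duration,
    }
```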
Reference Images
- reference_images (array, optional): Up to 3 reference images to guide character or object appearance. Each object contains:
  - image (object): The reference image. Provide either url or base_64_encoded.
    - mime_type (string): MIME type. Accepted: "image/jpeg", "image/png".
    - url (string): URL of the reference image.
    - base_64_encoded (string): Base64-encoded image data.
  - reference_type (string): "subject", which guides character/object appearance.
Style reference images ("style" type) are not supported on Veo 3.1. Only "subject" reference type is available.
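The reference image constraints (at most 3 entries, "subject" type only) can be enforced while the list is built. This is a minimal sketch with a hypothetical helper name, `build_reference_images`, that takes hosted image URLs.

```python
MAX_REFERENCE_IMAGES = 3  # documented upper limit


def build_reference_images(urls: list, mime_type: str = "image/jpeg") -> list:
    """Build the reference_images array from image URLs.

    Hypothetical helper; every entry uses the "subject" type, the only one
    available on Veo 3.1.
    """
    if len(urls) > MAX_REFERENCE_IMAGES:
        raise ValueError(f"at most {MAX_REFERENCE_IMAGES} reference images are allowed")
    return [
        {"image": {"mime_type": mime_type, "url": url}, "reference_type": "subject"}
        for url in urls
    ]
```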
Video Extension
- video (object, optional): A previously Veo-generated video to extend. Adds approximately 7 seconds to the original video. Provide either url or base_64_encoded.
  - mime_type (string): MIME type of the video. Accepted: "video/mp4".
  - url (string): URL of the input video.
  - base_64_encoded (string): Base64-encoded video data.
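A video extension request passes the earlier clip in the video field. The sketch below builds such a body and estimates the resulting length from the "approximately 7 seconds" figure; both helper names are hypothetical, and sending a prompt along with the video is an assumption, since this page does not say whether extension requires one.

```python
APPROX_EXTENSION_SECONDS = 7  # documented as approximate, not exact


def build_extension_payload(video_url: str, prompt: str) -> dict:
    """Build a request body that extends a previously Veo-generated MP4.

    Hypothetical helper; the prompt is included on the assumption that it
    describes how the clip should continue.
    """
    return {
        "model": "google-veo-3-1",
        "prompt": prompt,
        "video": {"mime_type": "video/mp4", "url": video_url},
    }


def approx_extended_duration(original_seconds: float) -> float:
    """Rough estimate of the clip length after one extension."""
    return original_seconds + APPROX_EXTENSION_SECONDS
```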
Generation Settings
- generate_audio (boolean, optional): Enable native audio generation with richer and better-synchronized audio than Veo 3. Default: true.
- duration (integer, optional): Video duration in seconds. Accepted: 4, 6, 8. Default: 8.
- resolution (string, optional): Output resolution. Accepted: "720p", "1080p". Default: "720p". Veo 3+ models only.
- aspect_ratio (string, optional): Video aspect ratio. Accepted: "16:9", "9:16". Default: "16:9".
- seed (integer, optional): Seed for reproducible results. Range: 0–4294967295.
- negative_prompt (string, optional): Describe elements to exclude from the generated video.
- person_generation (string, optional): Control person generation. Accepted: "allow_adult" (default), "allow_all", "dont_allow".
- resize_mode (string, optional): How the input image is resized/cropped. Image-to-video only.
- compression_quality (string, optional): Controls output video compression.
- output_format (string, optional): Response format. Accepted: "base64", "url". Default: "base64". When set to "url", the response returns a URL instead of base64-encoded data.
enhance_prompt is not supported on Veo 3.1. It is only available for Veo 2 models.
Veo 3.1 produces richer and better-synchronized audio than Veo 3. For best results, include audio cues in your prompt: describe dialogue, sound effects, and ambient sounds alongside the visual scene.
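The accepted values for the generation settings can be checked on the client before a request is sent. The following is a sketch only: `validate_settings` is a hypothetical helper, it covers just the settings whose accepted values are listed on this page, and it leaves resize_mode and compression_quality unchecked because their values are not documented here.

```python
ACCEPTED_VALUES = {
    "duration": {4, 6, 8},
    "resolution": {"720p", "1080p"},
    "aspect_ratio": {"16:9", "9:16"},
    "person_generation": {"allow_adult", "allow_all", "dont_allow"},
    "output_format": {"base64", "url"},
}
MAX_SEED = 4294967295  # documented upper bound for seed


def validate_settings(settings: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    errors = []
    for name, allowed in ACCEPTED_VALUES.items():
        if name in settings and settings[name] not in allowed:
            errors.append(f"{name} must be one of {sorted(allowed)}")
    seed = settings.get("seed")
    if seed is not None and not 0 <= seed <= MAX_SEED:
        errors.append(f"seed must be between 0 and {MAX_SEED}")
    if "enhance_prompt" in settings:
        errors.append("enhance_prompt is not supported on Veo 3.1")
    return errors
```

For example, `validate_settings({"duration": 5})` reports one problem, because 5 is not among the accepted durations.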
Response
A successful request returns a 200 OK status code with a JSON response body.
When output_format is "base64" (default)
{
  "videos": [
    {
      "base64_encoded": "AAAAIGZ0eXBpc29tAAACAGlzb21pc..."
    }
  ]
}
When output_format is "url"
{
  "videos": [
    {
      "url": "https://storage.example.com/generated-video.mp4"
    }
  ]
}
Response Fields
- videos (array): An array of generated video objects.
  - base64_encoded (string): The base64-encoded video data (MP4, 24 FPS). Present when output_format is "base64".
  - url (string): URL to the generated video. Present when output_format is "url".
Working with Base64 Video Data
The response returns videos as base64-encoded strings by default. For details on decoding, saving, and displaying videos, see the Video Generation page.
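Because the shape of each video object depends on output_format, client code may need to handle both cases. The sketch below normalizes a parsed response body into a list of tagged results; `extract_videos` is a hypothetical helper, and it assumes the response has already been parsed into a Python dictionary.

```python
import base64


def extract_videos(result: dict) -> list:
    """Return one ("bytes", data) or ("url", link) tuple per generated video.

    Hypothetical helper; entries with neither field are skipped.
    """
    videos = []
    for video in result.get("videos", []):
        if "base64_encoded" in video:
            # Default output_format: decode the MP4 bytes from base64 text
            videos.append(("bytes", base64.b64decode(video["base64_encoded"])))
        elif "url" in video:
            # output_format "url": the video must be downloaded separately
            videos.append(("url", video["url"]))
    return videos
```

Entries tagged "bytes" can be written straight to an .mp4 file opened in binary mode, while entries tagged "url" need a separate download step.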
Example Requests
Text-to-Video (cURL)
curl -X POST https://api.azerion.ai/v1/videos/generation \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -d '{
    "model": "google-veo-3-1",
    "prompt": "A detective in a trench coat walks down a rain-soaked neon-lit alley, footsteps echoing against the wet pavement",
    "generate_audio": true,
    "duration": 8,
    "resolution": "1080p",
    "aspect_ratio": "16:9",
    "n": 1
  }'
First/Last Frame Transition (cURL)
curl -X POST https://api.azerion.ai/v1/videos/generation \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -d '{
    "model": "google-veo-3-1",
    "prompt": "A flower blooms from bud to full blossom in a time-lapse style",
    "image": {
      "mime_type": "image/jpeg",
      "url": "https://example.com/images/bud.jpg"
    },
    "last_frame": {
      "mime_type": "image/jpeg",
      "url": "https://example.com/images/blossom.jpg"
    },
    "duration": 6,
    "resolution": "1080p"
  }'
Subject Reference Images (cURL)
curl -X POST https://api.azerion.ai/v1/videos/generation \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -d '{
    "model": "google-veo-3-1",
    "prompt": "The character walks through a bustling marketplace, stopping to examine colorful fabrics",
    "reference_images": [
      {
        "image": {
          "mime_type": "image/jpeg",
          "url": "https://example.com/images/character-ref.jpg"
        },
        "reference_type": "subject"
      }
    ],
    "generate_audio": true,
    "duration": 8,
    "resolution": "1080p"
  }'
Replace YOUR_ACCESS_TOKEN with your actual API key or access token. Refer to the Authentication guide for details on obtaining and using your credentials.
Text-to-Video (Python)
import base64
import os

import requests

api_key = os.environ.get("AZERION_API_KEY")
url = "https://api.azerion.ai/v1/videos/generation"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
}
data = {
    "model": "google-veo-3-1",
    "prompt": "A detective in a trench coat walks down a rain-soaked neon-lit alley, footsteps echoing against the wet pavement",
    "generate_audio": True,
    "duration": 8,
    "resolution": "1080p",
    "aspect_ratio": "16:9",
    "n": 1,
}

response = requests.post(url, headers=headers, json=data)
print(f"Status code: {response.status_code}")
response.raise_for_status()  # Stop here on a non-2xx response
result = response.json()

# Decode the base64 payload (the default output_format) and save it as an MP4 file
video_data = base64.b64decode(result["videos"][0]["base64_encoded"])
with open("generated_video.mp4", "wb") as f:
    f.write(video_data)
print("Video saved as generated_video.mp4")
Text-to-Video (Node.js)
// CommonJS import; this form requires node-fetch v2 (v3 is ESM-only)
const fetch = require('node-fetch');
const fs = require('fs');

const apiKey = process.env.AZERION_API_KEY;
const url = 'https://api.azerion.ai/v1/videos/generation';
const headers = {
  'Content-Type': 'application/json',
  'Authorization': `Bearer ${apiKey}`
};
const data = {
  model: 'google-veo-3-1',
  prompt: 'A detective in a trench coat walks down a rain-soaked neon-lit alley, footsteps echoing against the wet pavement',
  generate_audio: true,
  duration: 8,
  resolution: '1080p',
  aspect_ratio: '16:9',
  n: 1
};

fetch(url, {
  method: 'POST',
  headers: headers,
  body: JSON.stringify(data)
})
  .then(response => {
    // Surface HTTP errors instead of trying to decode an error body as a video
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    return response.json();
  })
  .then(result => {
    // Decode the base64 payload (the default output_format) and save it as an MP4 file
    const videoData = Buffer.from(result.videos[0].base64_encoded, 'base64');
    fs.writeFileSync('generated_video.mp4', videoData);
    console.log('Video saved as generated_video.mp4');
  })
  .catch(error => console.error('Error:', error));
Output Specifications
| Spec | Value |
|---|---|
| Output Format | MP4 |
| Frame Rate | 24 FPS |
| Native Audio | Richer, better-synchronized than Veo 3 |
| Image-to-Video | Supported |
| Video Extension | Supported |
| Reference Images | Up to 3 ("subject" type only) |
| First/Last Frame Control | Supported |
Veo 3 vs Veo 3.1
| Feature | Veo 3 | Veo 3.1 |
|---|---|---|
| Native Audio | Dialogue, SFX, ambient | Richer, better-synchronized |
| Reference Images | Not supported | Up to 3 ("subject" type) |
| First/Last Frame | Not supported | Supported |
| Video Extension | Supported | Supported |
| Max Resolution | 1080p | 1080p |
| Duration | 4, 6, 8 seconds | 4, 6, 8 seconds |
Related Resources
- Veo 3 — Base model without reference images or frame control
- Google Veo Documentation
- List Models