ImageContent

gitpavleenbali edited this page Feb 17, 2026 · 2 revisions

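The Image class wraps image data for multimodal AI interactions. An Image can be created from a file, a URL, raw bytes, a base64 string, or a PIL image, and then passed to ask() or an Agent.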

Import

from pyai.multimodal import Image

Creating Images

From File

image = Image.from_file("photo.jpg")

From URL

image = Image.from_url("https://example.com/image.png")

From Bytes

with open("image.png", "rb") as f:
    image = Image.from_bytes(f.read(), media_type="image/png")
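When the bytes come from an unknown source, the media_type argument has to be worked out somehow. The following is a minimal sketch of one way to do that by checking well-known file signatures. The helper name sniff_media_type is hypothetical and is not part of pyai.

```python
from typing import Optional


def sniff_media_type(data: bytes) -> Optional[str]:
    """Guess an image MIME type from the leading magic bytes, or return None.

    Hypothetical helper for illustration; not part of pyai.
    """
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    if data.startswith(b"BM"):
        return "image/bmp"
    return None
```

The result could then be passed as media_type to Image.from_bytes, falling back to an explicit value when the function returns None.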

From Base64

# base64_string is a str holding the base64-encoded image data
image = Image.from_base64(
    base64_string,
    media_type="image/jpeg"
)
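For readers who do not yet have a base64 string, here is a small sketch of producing one from raw bytes with the standard library. The function name encode_image_bytes and the sample bytes are illustrative assumptions, not part of pyai.

```python
import base64


def encode_image_bytes(data: bytes) -> str:
    """Return the base64 encoding of data as an ASCII string."""
    return base64.b64encode(data).decode("ascii")


# Placeholder bytes standing in for real image data (assumption for the demo).
raw_bytes = b"\xff\xd8\xff\xe0" + b"\x00" * 12
base64_string = encode_image_bytes(raw_bytes)

# Decoding restores the original bytes exactly.
round_tripped = base64.b64decode(base64_string)
```

A string produced this way is the kind of value the base64_string argument above is expected to hold.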

From PIL Image

from PIL import Image as PILImage

pil_img = PILImage.open("photo.jpg")
image = Image.from_pil(pil_img)

Properties

| Property | Type | Description |
|----------|------|-------------|
| width | int | Image width in pixels |
| height | int | Image height in pixels |
| media_type | str | MIME type (image/jpeg, etc.) |
| size_bytes | int | File size in bytes |
| format | str | Image format (png, jpeg, etc.) |
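To make these properties concrete, the sketch below derives the same five values for a PNG using only the standard library. It is an illustration of where such values can come from, not pyai's implementation. The helpers make_png and png_properties are hypothetical names.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type and data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def make_png(width: int, height: int) -> bytes:
    """Create a tiny all-black 8-bit grayscale PNG for demonstration."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is a filter byte (0) followed by `width` pixel bytes.
    raw = (b"\x00" + b"\x00" * width) * height
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(raw))
        + _chunk(b"IEND", b"")
    )


def png_properties(data: bytes) -> dict:
    """Read width and height from the IHDR chunk of a PNG byte string."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return {
        "width": width,
        "height": height,
        "media_type": "image/png",
        "size_bytes": len(data),
        "format": "png",
    }


props = png_properties(make_png(4, 3))
```

Here props["width"] is 4 and props["height"] is 3, and size_bytes is simply the length of the encoded data.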

Methods

resize()

Resize the image while maintaining its aspect ratio:

# Resize to max dimensions
resized = image.resize(max_width=1024, max_height=1024)

# Resize to specific size
resized = image.resize(width=800, height=600)

# Scale by percentage
resized = image.resize(scale=0.5)  # 50% size
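The arithmetic behind an aspect-preserving "fit within a maximum box" resize can be sketched as follows. This is an assumption about how such a resize is typically computed, not a description of pyai's internals. The name fit_within and the choice never to upscale are illustrative.

```python
def fit_within(width: int, height: int, max_width: int, max_height: int) -> tuple:
    """Return (new_width, new_height) that fits the box and keeps the aspect ratio.

    Assumption: images already smaller than the box are left unchanged.
    """
    # Use the tighter of the two constraints; cap at 1.0 to avoid upscaling.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


landscape = fit_within(4000, 3000, 1024, 1024)
portrait = fit_within(1000, 2000, 500, 500)
small = fit_within(800, 600, 1024, 1024)
```

Using a single scale factor for both dimensions is what keeps the proportions intact. Scaling each dimension independently to its own maximum would distort the image.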

convert()

Convert to a different format:

# Convert to JPEG
jpeg_image = image.convert(format="jpeg", quality=85)

# Convert to PNG
png_image = image.convert(format="png")

# Convert to WebP
webp_image = image.convert(format="webp", quality=80)

crop()

Crop a rectangular region of the image:

cropped = image.crop(
    left=100,
    top=50,
    width=400,
    height=300
)
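Many imaging libraries, Pillow included, describe a crop as a (left, upper, right, lower) box rather than as an origin plus a size. The sketch below shows the conversion, with clamping to the image bounds. The function crop_box is a hypothetical helper, and the clamping behaviour is an assumption rather than documented pyai behaviour.

```python
def crop_box(left: int, top: int, width: int, height: int,
             image_width: int, image_height: int) -> tuple:
    """Convert left/top/width/height into a clamped (left, upper, right, lower) box."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Compute the far edges first, then clamp every edge to the image bounds.
    right = min(left + width, image_width)
    lower = min(top + height, image_height)
    left = max(left, 0)
    top = max(top, 0)
    if left >= right or top >= lower:
        raise ValueError("crop region lies outside the image")
    return (left, top, right, lower)


inside = crop_box(100, 50, 400, 300, 1920, 1080)
clipped = crop_box(1800, 1000, 400, 300, 1920, 1080)
```

The first call matches the example above on a 1920x1080 image. The second shows a region that overruns the bottom-right corner being clipped to the image edge.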

to_base64()

Get the image data as a base64-encoded string:

b64_string = image.to_base64()

save()

Save the image to a file:

image.save("output.png")
image.save("output.jpg", format="jpeg", quality=90)

Using with Agents

Single Image

from pyai import ask
from pyai.multimodal import Image

image = Image.from_file("chart.png")
response = ask("Explain this chart", images=[image])

Multiple Images

from pyai import ask
from pyai.multimodal import Image

images = [
    Image.from_file("img1.jpg"),
    Image.from_file("img2.jpg"),
    Image.from_file("img3.jpg")
]

response = ask(
    "What do these images have in common?",
    images=images
)

With Agent

from pyai import Agent
from pyai.multimodal import Image

agent = Agent(
    name="Analyst",
    model="gpt-4o"  # Vision-capable model
)

image = Image.from_url("https://example.com/data.png")
result = agent.run("Analyze this data visualization", images=[image])

Format Support

| Format | Read | Write | Notes |
|--------|------|-------|-------|
| JPEG | ✅ | ✅ | Most efficient for photos |
| PNG | ✅ | ✅ | Best for graphics/screenshots |
| GIF | ✅ | ✅ | Animated GIFs supported |
| WebP | ✅ | ✅ | Good compression |
| BMP | ✅ | ✅ | Uncompressed |
| TIFF | ✅ | ✅ | High quality |

Provider Formats

Different providers expect image content in different request formats:

# OpenAI format
openai_content = image.to_openai_format()

# Anthropic format
anthropic_content = image.to_anthropic_format()

# Select the format by provider name (used internally)
provider_content = image.to_provider_format(provider="openai")
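For orientation, the sketch below builds the base64 image content shapes that the OpenAI Chat Completions and Anthropic Messages APIs accept. It illustrates why a per-provider conversion is needed. Whether pyai's to_openai_format() and to_anthropic_format() return exactly these dictionaries is an assumption, and the function names here are hypothetical.

```python
import base64


def openai_image_part(data: bytes, media_type: str) -> dict:
    """Build an OpenAI-style content part: the image travels as a data URL."""
    b64 = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{media_type};base64,{b64}"},
    }


def anthropic_image_block(data: bytes, media_type: str) -> dict:
    """Build an Anthropic-style content block: media type and data are separate fields."""
    b64 = base64.b64encode(data).decode("ascii")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": b64},
    }


sample = b"abc"  # placeholder bytes standing in for real image data
openai_part = openai_image_part(sample, "image/png")
anthropic_block = anthropic_image_block(sample, "image/png")
```

Both carry the same base64 payload. The difference lies only in how the media type and data are wrapped.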
