ImageContent
gitpavleenbali edited this page Feb 17, 2026 · 2 revisions
The Image class handles image data for multimodal AI interactions.
```python
from pyai.multimodal import Image

# Load from a local file
image = Image.from_file("photo.jpg")

# Load from a URL
image = Image.from_url("https://example.com/image.png")

# Load from raw bytes
with open("image.png", "rb") as f:
    image = Image.from_bytes(f.read(), media_type="image/png")

# Load from a base64-encoded string (base64_string is assumed to be defined elsewhere)
image = Image.from_base64(
    base64_string,
    media_type="image/jpeg"
)

# Load from a PIL image
from PIL import Image as PILImage

pil_img = PILImage.open("photo.jpg")
image = Image.from_pil(pil_img)
```

| Property | Type | Description |
|---|---|---|
| `width` | `int` | Image width in pixels |
| `height` | `int` | Image height in pixels |
| `media_type` | `str` | MIME type (image/jpeg, etc.) |
| `size_bytes` | `int` | File size in bytes |
| `format` | `str` | Image format (png, jpeg, etc.) |
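As a rough illustration of where values such as `width`, `height`, and `size_bytes` can come from, the sketch below reads them directly from PNG bytes using only the standard library. It is not pyai's implementation: `png_dimensions` and `tiny_png` are hypothetical helpers introduced here, and the approach covers PNG only.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> tuple:
    """Return (width, height) read from the IHDR chunk of a PNG (hypothetical helper)."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type, then the IHDR payload.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def _chunk(kind: bytes, payload: bytes) -> bytes:
    """Build one PNG chunk: length, type, payload, CRC over type and payload."""
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)


def tiny_png(width: int, height: int) -> bytes:
    """Build a minimal 8-bit grayscale PNG filled with black pixels (test data only)."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline starts with a filter byte (0 = no filter) followed by the pixel bytes.
    rows = b"".join(b"\x00" + b"\x00" * width for _ in range(height))
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(rows))
        + _chunk(b"IEND", b"")
    )


data = tiny_png(3, 2)
width, height = png_dimensions(data)
size_bytes = len(data)  # analogous to the size_bytes property
print(width, height)    # 3 2
```

A full library would normally delegate this to an imaging package such as Pillow, which handles every format in one place; the manual parse here only shows that the values live in the file header.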
Resize image while maintaining aspect ratio:

```python
# Resize to max dimensions
resized = image.resize(max_width=1024, max_height=1024)

# Resize to specific size
resized = image.resize(width=800, height=600)

# Scale by percentage
resized = image.resize(scale=0.5)  # 50% size
```

Convert to different format:
```python
# Convert to JPEG
jpeg_image = image.convert(format="jpeg", quality=85)

# Convert to PNG
png_image = image.convert(format="png")

# Convert to WebP
webp_image = image.convert(format="webp", quality=80)
```

Crop image region:
```python
cropped = image.crop(
    left=100,
    top=50,
    width=400,
    height=300
)
```

Get base64 encoded string:
```python
b64_string = image.to_base64()
```

Save to file:
```python
image.save("output.png")
image.save("output.jpg", format="jpeg", quality=90)
```

```python
from pyai import ask
from pyai.multimodal import Image

# Ask about a single image
image = Image.from_file("chart.png")
response = ask("Explain this chart", images=[image])
```

```python
from pyai import ask
from pyai.multimodal import Image

# Ask about several images in one request
images = [
    Image.from_file("img1.jpg"),
    Image.from_file("img2.jpg"),
    Image.from_file("img3.jpg")
]
response = ask(
    "What do these images have in common?",
    images=images
)
```

```python
from pyai import Agent
from pyai.multimodal import Image

# Pass an image to an agent backed by a vision-capable model
agent = Agent(
    name="Analyst",
    model="gpt-4o"  # Vision model
)
image = Image.from_url("https://example.com/data.png")
result = agent.run("Analyze this data visualization", images=[image])
```

| Format | Read | Write | Notes |
|---|---|---|---|
| JPEG | ✅ | ✅ | Most efficient for photos |
| PNG | ✅ | ✅ | Best for graphics/screenshots |
| GIF | ✅ | ✅ | Animated GIFs supported |
| WebP | ✅ | ✅ | Good compression |
| BMP | ✅ | ✅ | Uncompressed |
| TIFF | ✅ | ✅ | High quality |
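Each format in the table corresponds to a MIME type of the kind passed as `media_type` elsewhere on this page. The following is a minimal sketch of an extension-based lookup, assuming the standard IANA names for these formats; `MEDIA_TYPES` and `media_type_for` are hypothetical and not part of pyai.

```python
from pathlib import Path

# Hypothetical lookup table covering the formats listed in the table.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
    ".bmp": "image/bmp",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
}


def media_type_for(path: str) -> str:
    """Return the MIME type for a file name, judged by its extension only."""
    suffix = Path(path).suffix.lower()
    try:
        return MEDIA_TYPES[suffix]
    except KeyError:
        raise ValueError(f"unsupported image extension: {suffix!r}") from None


print(media_type_for("photo.JPG"))  # image/jpeg
print(media_type_for("scan.tiff"))  # image/tiff
```

An extension check is cheap but trusts the file name; inspecting the first bytes of the file is more reliable when the source is untrusted.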
Different providers accept different formats:
```python
# OpenAI format
openai_content = image.to_openai_format()

# Anthropic format
anthropic_content = image.to_anthropic_format()

# Auto-detect (used internally)
provider_content = image.to_provider_format(provider="openai")
```

- Multimodal-Module - Module overview
- AudioContent - Audio handling
- VideoContent - Video handling
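For background on the provider conversion methods described earlier, the sketch below builds the inline base64 image blocks that the OpenAI Chat Completions API and the Anthropic Messages API document for image input. Whether `to_openai_format()` and `to_anthropic_format()` return exactly these dictionaries is an assumption; `openai_image_block` and `anthropic_image_block` are hypothetical helpers, not pyai functions.

```python
import base64


def openai_image_block(data: bytes, media_type: str) -> dict:
    """Build an OpenAI-style content part that carries the image as a data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{media_type};base64,{encoded}"},
    }


def anthropic_image_block(data: bytes, media_type: str) -> dict:
    """Build an Anthropic-style content block with a base64 image source."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": encoded},
    }


raw = b"abc"  # stand-in bytes; a real call would pass the image file contents
print(openai_image_block(raw, "image/png")["image_url"]["url"])
# data:image/png;base64,YWJj
print(anthropic_image_block(raw, "image/png")["source"]["data"])
# YWJj
```

The two shapes carry the same information (MIME type plus base64 payload), which is why a single image object can plausibly be converted to either on demand.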