🎨 Data Designer Tutorial: Providing Images as Context for Vision-Based Data Generation¶

📚 What you'll learn¶

This notebook demonstrates how to provide images as context to generate text descriptions using vision-language models.

✨ Visual Document Processing: Converting images to chat-ready format for model consumption
🔍 Vision-Language Generation: Using vision models to generate detailed summaries from images

If this is your first time using Data Designer, we recommend starting with the first notebook in this tutorial series.

📦 Import Data Designer¶

data_designer.config provides access to the configuration API.
DataDesigner is the main interface for data generation.

In [1]:

Copied!





# Standard library imports
import base64
import io
import uuid

# Third-party imports
import pandas as pd
import rich
from datasets import load_dataset
from IPython.display import display
from rich.panel import Panel

# Data Designer imports
import data_designer.config as dd
from data_designer.interface import DataDesigner
# Standard library imports
import base64
import io
import uuid

# Third-party imports
import pandas as pd
import rich
from datasets import load_dataset
from IPython.display import display
from rich.panel import Panel

# Data Designer imports
import data_designer.config as dd
from data_designer.interface import DataDesigner

⚙️ Initialize the Data Designer interface¶

DataDesigner is the main object responsible for managing the data generation process.
When initialized without arguments, the default model providers are used.

In [2]:

Copied!

data_designer = DataDesigner()
data_designer = DataDesigner()

🏗️ Initialize the Data Designer Config Builder¶

The Data Designer config defines the dataset schema and generation process.
The config builder provides an intuitive interface for building this configuration.
When initialized without arguments, the default model configurations are used.

In [3]:

Copied!

config_builder = dd.DataDesignerConfigBuilder()
config_builder = dd.DataDesignerConfigBuilder()

🌱 Seed Dataset Creation¶

In this section, we'll prepare our visual documents as a seed dataset for summarization:

Loading Visual Documents: We use a small pets image dataset containing labeled images
Image Processing: Convert images to base64 format for vision model consumption
Metadata Extraction: Preserve relevant image information (label, etc.)

The seed dataset will be used to generate detailed text descriptions of each image.

In [4]:

Copied!





# Dataset processing configuration
IMG_COUNT = 512  # Number of images to process
BASE64_IMAGE_HEIGHT = 512  # Standardized height for model input

# Load the pets dataset (train split, ~23 MB total)
img_dataset_cfg = {"path": "rokmr/pets", "split": "train"}
# Dataset processing configuration
IMG_COUNT = 512  # Number of images to process
BASE64_IMAGE_HEIGHT = 512  # Standardized height for model input

# Load the pets dataset (train split, ~23 MB total)
img_dataset_cfg = {"path": "rokmr/pets", "split": "train"}

In [5]:

Copied!





def resize_image(image, height: int):
    """
    Resize image while maintaining aspect ratio.

    Args:
        image: PIL Image object
        height: Target height in pixels

    Returns:
        Resized PIL Image object
    """
    original_width, original_height = image.size
    width = int(original_width * (height / original_height))
    return image.resize((width, height))


def convert_image_to_chat_format(record, height: int) -> dict:
    """
    Convert PIL image to base64 format for chat template usage.

    Args:
        record: Dataset record containing image and metadata
        height: Target height for image resizing

    Returns:
        Updated record with base64_image and uuid fields
    """
    image = resize_image(record["image"], height)

    img_buffer = io.BytesIO()
    image.save(img_buffer, format="PNG")
    byte_data = img_buffer.getvalue()
    base64_encoded_data = base64.b64encode(byte_data)
    base64_string = base64_encoded_data.decode("utf-8")

    return record | {"base64_image": base64_string, "uuid": str(uuid.uuid4())}
def resize_image(image, height: int):
    """
    Resize image while maintaining aspect ratio.

    Args:
        image: PIL Image object
        height: Target height in pixels

    Returns:
        Resized PIL Image object
    """
    original_width, original_height = image.size
    width = int(original_width * (height / original_height))
    return image.resize((width, height))


def convert_image_to_chat_format(record, height: int) -> dict:
    """
    Convert PIL image to base64 format for chat template usage.

    Args:
        record: Dataset record containing image and metadata
        height: Target height for image resizing

    Returns:
        Updated record with base64_image and uuid fields
    """
    image = resize_image(record["image"], height)

    img_buffer = io.BytesIO()
    image.save(img_buffer, format="PNG")
    byte_data = img_buffer.getvalue()
    base64_encoded_data = base64.b64encode(byte_data)
    base64_string = base64_encoded_data.decode("utf-8")

    return record | {"base64_image": base64_string, "uuid": str(uuid.uuid4())}

In [6]:

Copied!





# Load and process the image dataset
print("📥 Loading and processing images...")

img_dataset = load_dataset(**img_dataset_cfg).map(
    convert_image_to_chat_format, fn_kwargs={"height": BASE64_IMAGE_HEIGHT}
)
img_dataset = pd.DataFrame(img_dataset[:IMG_COUNT])

print(f"✅ Loaded {len(img_dataset)} images with columns: {list(img_dataset.columns)}")
# Load and process the image dataset
print("📥 Loading and processing images...")

img_dataset = load_dataset(**img_dataset_cfg).map(
    convert_image_to_chat_format, fn_kwargs={"height": BASE64_IMAGE_HEIGHT}
)
img_dataset = pd.DataFrame(img_dataset[:IMG_COUNT])

print(f"✅ Loaded {len(img_dataset)} images with columns: {list(img_dataset.columns)}")

📥 Loading and processing images...

Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.

[16:33:25] [WARNING] Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.

✅ Loaded 512 images with columns: ['image', 'label', 'base64_image', 'uuid']

In [7]:

Copied!

img_dataset.head()
img_dataset.head()

Out[7]:

	image	base64_image	uuid
0	<PIL.JpegImagePlugin.JpegImageFile image mode=...	iVBORw0KGgoAAAANSUhEUgAAAeQAAAIACAIAAADc8YinAA...	16f5f74f-13d5-404b-b14d-c8939f7422eb
1	<PIL.JpegImagePlugin.JpegImageFile image mode=...	iVBORw0KGgoAAAANSUhEUgAAAiQAAAIACAIAAAA9rOAHAA...	3d578437-b695-497c-ae9d-b03e8cc86d66
2	<PIL.JpegImagePlugin.JpegImageFile image mode=...	iVBORw0KGgoAAAANSUhEUgAAAqoAAAIACAIAAADFYNm1AA...	d0ecbe64-8ddc-4dd1-a816-656bdef18876
3	<PIL.JpegImagePlugin.JpegImageFile image mode=...	iVBORw0KGgoAAAANSUhEUgAAAwAAAAIACAIAAAC6lJxtAA...	18020c98-1dbb-48b3-a7ea-2026e05db44e
4	<PIL.PngImagePlugin.PngImageFile image mode=RG...	iVBORw0KGgoAAAANSUhEUgAAAqoAAAIACAIAAADFYNm1AA...	2dbd7d23-d550-4742-824e-7a8527aa4c21

In [8]:

Copied!

# Add the seed dataset containing our processed images
df_seed = pd.DataFrame(img_dataset)[["uuid", "label", "base64_image"]]
config_builder.with_seed_dataset(dd.DataFrameSeedSource(df=df_seed))
# Add the seed dataset containing our processed images
df_seed = pd.DataFrame(img_dataset)[["uuid", "label", "base64_image"]]
config_builder.with_seed_dataset(dd.DataFrameSeedSource(df=df_seed))

Out[8]:

DataDesignerConfigBuilder(
    seed_dataset: df seed
)

In [9]:

Copied!





# Add a column to generate detailed image descriptions
config_builder.add_column(
    dd.LLMTextColumnConfig(
        name="description",
        model_alias="nvidia-vision",
        prompt=(
            "Provide a detailed description of the content in this image in Markdown format. "
            "Describe the main subject, background, colors, and any notable details."
        ),
        multi_modal_context=[dd.ImageContext(column_name="base64_image")],
    )
)

data_designer.validate(config_builder)
# Add a column to generate detailed image descriptions
config_builder.add_column(
    dd.LLMTextColumnConfig(
        name="description",
        model_alias="nvidia-vision",
        prompt=(
            "Provide a detailed description of the content in this image in Markdown format. "
            "Describe the main subject, background, colors, and any notable details."
        ),
        multi_modal_context=[dd.ImageContext(column_name="base64_image")],
    )
)

data_designer.validate(config_builder)

[16:35:02] [INFO] ✅ Validation passed

🔁 Iteration is key – preview the dataset!¶

Use the preview method to generate a sample of records quickly.
Inspect the results for quality and format issues.
Adjust column configurations, prompts, or parameters as needed.
Re-run the preview until satisfied.

In [10]:

Copied!

preview = data_designer.preview(config_builder, num_records=2)
preview = data_designer.preview(config_builder, num_records=2)

[16:35:02] [INFO] 👁️ Preview generation in progress

[16:35:02] [INFO] ✅ Validation passed

[16:35:02] [INFO] ⛓️ Sorting column configs into a Directed Acyclic Graph

[16:35:02] [INFO] 🩺 Running health checks for models...

[16:35:02] [INFO]   |-- 👀 Checking 'nvidia/nemotron-nano-12b-v2-vl' in provider named 'nvidia' for model alias 'nvidia-vision'...

[16:35:03] [INFO]   |-- ✅ Passed!

[16:35:03] [INFO] 🌱 Sampling 2 records from seed dataset

[16:35:03] [INFO]   |-- seed dataset size: 512 records

[16:35:03] [INFO]   |-- sampling strategy: ordered

[16:35:03] [INFO] 📝 llm-text model config for column 'description'

[16:35:03] [INFO]   |-- model: 'nvidia/nemotron-nano-12b-v2-vl'

[16:35:03] [INFO]   |-- model alias: 'nvidia-vision'

[16:35:03] [INFO]   |-- model provider: 'nvidia'

[16:35:03] [INFO]   |-- inference parameters:

[16:35:03] [INFO]   |  |-- generation_type=chat-completion

[16:35:03] [INFO]   |  |-- max_parallel_requests=4

[16:35:03] [INFO]   |  |-- temperature=0.85

[16:35:03] [INFO]   |  |-- top_p=0.95

[16:35:03] [INFO] ⚡️ Processing llm-text column 'description' with 4 concurrent workers

[16:35:03] [INFO] ⏱️ llm-text column 'description' will report progress after each record

[16:35:07] [INFO]   |-- 😸 llm-text column 'description' progress: 1/2 (50%) complete, 1 ok, 0 failed, 0.27 rec/s, eta 3.7s

[16:35:07] [INFO]   |-- 🦁 llm-text column 'description' progress: 2/2 (100%) complete, 2 ok, 0 failed, 0.46 rec/s, eta 0.0s

[16:35:08] [INFO] 📊 Model usage summary:

[16:35:08] [INFO]   |-- model: nvidia/nemotron-nano-12b-v2-vl

[16:35:08] [INFO]   |-- tokens: input=606, output=614, total=1220, tps=260

[16:35:08] [INFO]   |-- requests: success=2, failed=0, total=2, rpm=25

[16:35:08] [INFO] 📐 Measuring dataset column statistics:

[16:35:08] [INFO]   |-- 📝 column: 'description'

[16:35:08] [INFO] 🙌 Preview complete!

In [11]:

Copied!

# Run this cell multiple times to cycle through the 2 preview records.
preview.display_sample_record()
# Run this cell multiple times to cycle through the 2 preview records.
preview.display_sample_record()

[index: 0]

                                                                                                              
                                                 Seed Columns                                                 
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Name         ┃ Value                                                                                       ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ uuid         │ 16f5f74f-13d5-404b-b14d-c8939f7422eb                                                        │
├──────────────┼─────────────────────────────────────────────────────────────────────────────────────────────┤
│ label        │ 0                                                                                           │
├──────────────┼─────────────────────────────────────────────────────────────────────────────────────────────┤
│ base64_image │ iVBORw0KGgoAAAANSUhEUgAAAeQAAAIACAIAAADc8YinAAEAAElEQVR4nOy9V5ckuZEmamZwEREpSna1YAv28JLDHT… │
└──────────────┴─────────────────────────────────────────────────────────────────────────────────────────────┘
                                                                                                              
                                                                                                              
                                              Generated Columns                                               
┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Name        ┃ Value                                                                                        ┃
┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ description │ ```markdown                                                                                  │
│             │ # Black and White Cat Portrait                                                               │
│             │                                                                                              │
│             │ ## Main Subject                                                                              │
│             │ The main subject of this image is a **black and white cat**. The cat is facing the camera,   │
│             │ giving us a clear view of its facial features.                                               │
│             │                                                                                              │
│             │ ## Background                                                                                │
│             │ The background is **neutral and blurred**, which helps to keep the focus on the cat. The     │
│             │ colors in the background are primarily **beige** and **gray**.                               │
│             │                                                                                              │
│             │ ## Colors                                                                                    │
│             │ The cat's fur is predominantly **black**, with a **white** patch on its forehead and around  │
│             │ its muzzle. The cat's eyes are a striking **yellow** color, which stands out against its     │
│             │ black fur.                                                                                   │
│             │                                                                                              │
│             │ ## Notable Details                                                                           │
│             │ - The cat's eyes are **wide open**, giving it an alert and curious expression.               │
│             │ - The cat has **white whiskers**, which contrast with its black fur.                         │
│             │ - There is a **black patch** on the cat's chin, adding to its distinctive appearance.        │
│             │ - The cat's ears are **perked up**, indicating interest or alertness.                        │
│             │ ```                                                                                          │
│             │                                                                                              │
│             │ This detailed description provides a comprehensive overview of the image, focusing on the    │
│             │ main subject, background, colors, and notable details.                                       │
└─────────────┴──────────────────────────────────────────────────────────────────────────────────────────────┘

In [12]:

Copied!

# The preview dataset is available as a pandas DataFrame.
preview.dataset
# The preview dataset is available as a pandas DataFrame.
preview.dataset

Out[12]:

	uuid	label	base64_image	description
0	16f5f74f-13d5-404b-b14d-c8939f7422eb	0	iVBORw0KGgoAAAANSUhEUgAAAeQAAAIACAIAAADc8YinAA...	```markdown\n# Black and White Cat Portrait\n\...
1	3d578437-b695-497c-ae9d-b03e8cc86d66	0	iVBORw0KGgoAAAANSUhEUgAAAiQAAAIACAIAAAA9rOAHAA...	```markdown\n# Detailed Description of the Ima...

📊 Analyze the generated data¶

Data Designer automatically generates a basic statistical analysis of the generated data.
This analysis is available via the analysis property of generation result objects.

In [13]:

Copied!

# Print the analysis as a table.
preview.analysis.to_report()
# Print the analysis as a table.
preview.analysis.to_report()

──────────────────────────────────────── 🎨 Data Designer Dataset Profile ─────────────────────────────────────────

                                                                                                                   
                                                 Dataset Overview                                                  
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ number of records               ┃ number of columns               ┃ percent complete records                    ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 2                               │ 1                               │ 100.0%                                      │
└─────────────────────────────────┴─────────────────────────────────┴─────────────────────────────────────────────┘
                                                                                                                   
                                                                                                                   
                                                📝 LLM-Text Columns                                                
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                  ┃               ┃                              ┃       prompt tokens ┃       completion tokens ┃
┃ column name      ┃     data type ┃         number unique values ┃          per record ┃              per record ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ description      │        string │                   2 (100.0%) │        29.0 +/- 0.0 │         305.5 +/- 103.9 │
└──────────────────┴───────────────┴──────────────────────────────┴─────────────────────┴─────────────────────────┘
                                                                                                                   
                                                                                                                   
╭────────────────────────────────────────────────── Table Notes ──────────────────────────────────────────────────╮
│                                                                                                                 │
│  1. All token statistics are based on a sample of max(1000, len(dataset)) records.                              │
│  2. Tokens are calculated using tiktoken's cl100k_base tokenizer.                                               │
│                                                                                                                 │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
                                                                                                                   
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────

🔎 Visual Inspection¶

Let's compare the original image with the generated description to validate quality:

In [14]:

Copied!





# Compare original image with generated description
index = 0  # Change this to view different examples

# Merge preview data with original images for comparison
comparison_dataset = preview.dataset.merge(pd.DataFrame(img_dataset)[["uuid", "image"]], how="left", on="uuid")

# Extract the record for display
record = comparison_dataset.iloc[index]

print("📄 Original Image:")
display(resize_image(record.image, BASE64_IMAGE_HEIGHT))

print("\n📝 Generated Description:")
rich.print(Panel(record.description, title="Image Description", title_align="left"))
# Compare original image with generated description
index = 0  # Change this to view different examples

# Merge preview data with original images for comparison
comparison_dataset = preview.dataset.merge(pd.DataFrame(img_dataset)[["uuid", "image"]], how="left", on="uuid")

# Extract the record for display
record = comparison_dataset.iloc[index]

print("📄 Original Image:")
display(resize_image(record.image, BASE64_IMAGE_HEIGHT))

print("\n📝 Generated Description:")
rich.print(Panel(record.description, title="Image Description", title_align="left"))

📄 Original Image:

No description has been provided for this image

📝 Generated Description:

╭─ Image Description ─────────────────────────────────────────────────────────────────────────────────────────────╮
│ ```markdown                                                                                                     │
│ # Black and White Cat Portrait                                                                                  │
│                                                                                                                 │
│ ## Main Subject                                                                                                 │
│ The main subject of this image is a **black and white cat**. The cat is facing the camera, giving us a clear    │
│ view of its facial features.                                                                                    │
│                                                                                                                 │
│ ## Background                                                                                                   │
│ The background is **neutral and blurred**, which helps to keep the focus on the cat. The colors in the          │
│ background are primarily **beige** and **gray**.                                                                │
│                                                                                                                 │
│ ## Colors                                                                                                       │
│ The cat's fur is predominantly **black**, with a **white** patch on its forehead and around its muzzle. The     │
│ cat's eyes are a striking **yellow** color, which stands out against its black fur.                             │
│                                                                                                                 │
│ ## Notable Details                                                                                              │
│ - The cat's eyes are **wide open**, giving it an alert and curious expression.                                  │
│ - The cat has **white whiskers**, which contrast with its black fur.                                            │
│ - There is a **black patch** on the cat's chin, adding to its distinctive appearance.                           │
│ - The cat's ears are **perked up**, indicating interest or alertness.                                           │
│ ```                                                                                                             │
│                                                                                                                 │
│ This detailed description provides a comprehensive overview of the image, focusing on the main subject,         │
│ background, colors, and notable details.                                                                        │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

🆙 Scale up!¶

Happy with your preview data?
Use the create method to submit larger Data Designer generation jobs.

In [15]:

Copied!

results = data_designer.create(config_builder, num_records=10, dataset_name="tutorial-4")
results = data_designer.create(config_builder, num_records=10, dataset_name="tutorial-4")

[16:35:08] [INFO] 🎨 Creating Data Designer dataset

[16:35:08] [INFO] ✅ Validation passed

[16:35:08] [INFO] ⛓️ Sorting column configs into a Directed Acyclic Graph

[16:35:08] [INFO] 🩺 Running health checks for models...

[16:35:08] [INFO]   |-- 👀 Checking 'nvidia/nemotron-nano-12b-v2-vl' in provider named 'nvidia' for model alias 'nvidia-vision'...

[16:35:09] [INFO]   |-- ✅ Passed!

[16:35:09] [INFO] ⏳ Processing batch 1 of 1

[16:35:09] [INFO] 🌱 Sampling 10 records from seed dataset

[16:35:09] [INFO]   |-- seed dataset size: 512 records

[16:35:09] [INFO]   |-- sampling strategy: ordered

[16:35:09] [INFO] 📝 llm-text model config for column 'description'

[16:35:09] [INFO]   |-- model: 'nvidia/nemotron-nano-12b-v2-vl'

[16:35:09] [INFO]   |-- model alias: 'nvidia-vision'

[16:35:09] [INFO]   |-- model provider: 'nvidia'

[16:35:09] [INFO]   |-- inference parameters:

[16:35:09] [INFO]   |  |-- generation_type=chat-completion

[16:35:09] [INFO]   |  |-- max_parallel_requests=4

[16:35:09] [INFO]   |  |-- temperature=0.85

[16:35:09] [INFO]   |  |-- top_p=0.95

[16:35:09] [INFO] ⚡️ Processing llm-text column 'description' with 4 concurrent workers

[16:35:09] [INFO] ⏱️ llm-text column 'description' will report progress after each record

[16:35:12] [INFO]   |-- 😴 llm-text column 'description' progress: 1/10 (10%) complete, 1 ok, 0 failed, 0.38 rec/s, eta 23.6s

[16:35:13] [INFO]   |-- 😴 llm-text column 'description' progress: 2/10 (20%) complete, 2 ok, 0 failed, 0.56 rec/s, eta 14.4s

[16:35:13] [INFO]   |-- 🥱 llm-text column 'description' progress: 3/10 (30%) complete, 3 ok, 0 failed, 0.77 rec/s, eta 9.1s

[16:35:13] [INFO]   |-- 🥱 llm-text column 'description' progress: 4/10 (40%) complete, 4 ok, 0 failed, 1.02 rec/s, eta 5.9s

[16:35:15] [INFO]   |-- 😐 llm-text column 'description' progress: 5/10 (50%) complete, 5 ok, 0 failed, 0.84 rec/s, eta 5.9s

[16:35:16] [INFO]   |-- 😐 llm-text column 'description' progress: 6/10 (60%) complete, 6 ok, 0 failed, 0.84 rec/s, eta 4.8s

[16:35:17] [INFO]   |-- 😐 llm-text column 'description' progress: 7/10 (70%) complete, 7 ok, 0 failed, 0.97 rec/s, eta 3.1s

[16:35:17] [INFO]   |-- 😊 llm-text column 'description' progress: 8/10 (80%) complete, 8 ok, 0 failed, 1.10 rec/s, eta 1.8s

[16:35:19] [INFO]   |-- 😊 llm-text column 'description' progress: 9/10 (90%) complete, 9 ok, 0 failed, 0.98 rec/s, eta 1.0s

[16:35:19] [INFO]   |-- 🤩 llm-text column 'description' progress: 10/10 (100%) complete, 10 ok, 0 failed, 0.98 rec/s, eta 0.0s

[16:35:20] [INFO] 📊 Model usage summary:

[16:35:20] [INFO]   |-- model: nvidia/nemotron-nano-12b-v2-vl

[16:35:20] [INFO]   |-- tokens: input=22998, output=2162, total=25160, tps=2373

[16:35:20] [INFO]   |-- requests: success=10, failed=0, total=10, rpm=56

[16:35:20] [INFO] 📐 Measuring dataset column statistics:

[16:35:20] [INFO]   |-- 📝 column: 'description'

In [16]:

Copied!

# Load the generated dataset as a pandas DataFrame.
dataset = results.load_dataset()

dataset.head()
# Load the generated dataset as a pandas DataFrame.
dataset = results.load_dataset()

dataset.head()

Out[16]:

	uuid	base64_image	description
0	16f5f74f-13d5-404b-b14d-c8939f7422eb	iVBORw0KGgoAAAANSUhEUgAAAeQAAAIACAIAAADc8YinAA...	# Image Description The image features a stri...
1	3d578437-b695-497c-ae9d-b03e8cc86d66	iVBORw0KGgoAAAANSUhEUgAAAiQAAAIACAIAAAA9rOAHAA...	Here is the detailed description of the image ...
2	d0ecbe64-8ddc-4dd1-a816-656bdef18876	iVBORw0KGgoAAAANSUhEUgAAAqoAAAIACAIAAADFYNm1AA...	### Detailed Description of the Image: The im...
3	18020c98-1dbb-48b3-a7ea-2026e05db44e	iVBORw0KGgoAAAANSUhEUgAAAwAAAAIACAIAAAC6lJxtAA...	![Cat in Green Cat Bed](image_url) **Descript...
4	2dbd7d23-d550-4742-824e-7a8527aa4c21	iVBORw0KGgoAAAANSUhEUgAAAqoAAAIACAIAAADFYNm1AA...	Image Description: - Main Subject: ...

In [17]:

Copied!

# Load the analysis results into memory.
analysis = results.load_analysis()

analysis.to_report()
# Load the analysis results into memory.
analysis = results.load_analysis()

analysis.to_report()

──────────────────────────────────────── 🎨 Data Designer Dataset Profile ─────────────────────────────────────────

                                                                                                                   
                                                 Dataset Overview                                                  
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ number of records               ┃ number of columns               ┃ percent complete records                    ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 10                              │ 1                               │ 100.0%                                      │
└─────────────────────────────────┴─────────────────────────────────┴─────────────────────────────────────────────┘
                                                                                                                   
                                                                                                                   
                                                📝 LLM-Text Columns                                                
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                  ┃               ┃                              ┃       prompt tokens ┃       completion tokens ┃
┃ column name      ┃     data type ┃         number unique values ┃          per record ┃              per record ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ description      │        string │                  10 (100.0%) │        29.0 +/- 0.0 │          229.0 +/- 54.8 │
└──────────────────┴───────────────┴──────────────────────────────┴─────────────────────┴─────────────────────────┘
                                                                                                                   
                                                                                                                   
╭────────────────────────────────────────────────── Table Notes ──────────────────────────────────────────────────╮
│                                                                                                                 │
│  1. All token statistics are based on a sample of max(1000, len(dataset)) records.                              │
│  2. Tokens are calculated using tiktoken's cl100k_base tokenizer.                                               │
│                                                                                                                 │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
                                                                                                                   
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────

⏭️ Next Steps¶

Now that you've learned how to use visual context for image summarization in Data Designer, explore more:

Experiment with different vision models for specific image types
Try different prompt variations to generate specialized descriptions (e.g., technical details, key findings)
Combine vision-based descriptions with other column types for multi-modal workflows
Apply this pattern to other vision tasks like image captioning, OCR validation, or visual question answering
Generating images with Data Designer