Use Case Recipes
Recipes are a collection of code examples that demonstrate how to leverage Data Designer in specific use cases. Each recipe is a self-contained example that can be run independently.
New to Data Designer?
Recipes provide working code for specific use cases without detailed explanations. If you're learning Data Designer for the first time, we recommend starting with our tutorial notebooks, which offer step-by-step guidance and explain core concepts. Once you're familiar with the basics, return here for practical, ready-to-use implementations.
Tip
These recipes use the Open AI model provider by default. Ensure your OpenAI model provider has been set up using the Data Designer CLI before running a recipe.
-
Text to Python
Generate a dataset of natural language instructions paired with Python code implementations, with varying complexity levels and industry focuses.
Demonstrates:
- Python code generation
- Python code validation
- LLM-as-judge
-
Text to SQL
Generate a dataset of natural language instructions paired with SQL code implementations, with varying complexity levels and industry focuses.
Demonstrates:
- SQL code generation
- SQL code validation
- LLM-as-judge
-
Product Info QA
Generate a dataset that contains information about products and associated question/answer pairs.
Demonstrates:
- Structured outputs
- Expression columns
- LLM-as-judge
-
Multi-Turn Chat
Generate a dataset of multi-turn chat conversations between a user and an AI assistant.
Demonstrates:
- Structured outputs
- Expression columns
- LLM-as-judge
-
Basic MCP Tool Use
Minimal example of MCP tool calling with Data Designer. Defines a simple MCP server with basic tools and generates data that requires tool calls to complete.
Demonstrates:
- MCP tool calling with LocalStdioMCPProvider
- Simple tool server definition
- Tool-augmented text generation
-
PDF Document QA (MCP + Tool Use)
Generate grounded Q&A pairs from PDF documents using MCP tool calls and BM25 search.
Demonstrates:
- MCP tool calling with LocalStdioMCPProvider
- BM25 lexical search for retrieval
- Retrieval-grounded QA generation
- Per-column trace capture