# Build tasks

> **📝 Note**
>
> An LLM-optimized bundle of this entire section is available at [`section.md`](https://www.union.ai/docs/v2/union/user-guide/task-programming/section.md).
> This single file contains all pages in this section, optimized for AI coding agent context.

This section covers the essential programming patterns and techniques for developing robust Flyte workflows. Once you understand the basics of task configuration, these guides will help you build sophisticated, production-ready data pipelines and machine learning workflows.

## What you'll learn

The guides in this section cover key patterns for building effective Flyte workflows, grouped into three areas:

**Data handling and types**
- [**Files and directories**](https://www.union.ai/docs/v2/union/user-guide/task-programming/files-and-directories/page.md): Work with large datasets using Flyte's efficient file and directory types that automatically handle data upload, storage, and transfer between tasks.
- [**DataFrames**](https://www.union.ai/docs/v2/union/user-guide/task-programming/dataframes/page.md): Pass DataFrames between tasks without downloading data into memory, with support for Pandas, Polars, PyArrow, Dask, and other DataFrame backends.
- [**Data classes and structures**](https://www.union.ai/docs/v2/union/user-guide/task-programming/dataclasses-and-structures/page.md): Use Python data classes and Pydantic models as task inputs and outputs to create well-structured, type-safe workflows.
- [**Custom types**](https://www.union.ai/docs/v2/union/user-guide/task-programming/handling-custom-types/page.md): Extend Flyte's type system by defining and registering your own type transformers, and distribute them as plugins.
- [**Custom context**](https://www.union.ai/docs/v2/union/user-guide/task-programming/custom-context/page.md): Use custom context to pass metadata through your task execution hierarchy without adding parameters to every task.

**Execution patterns**
- [**Fanout**](https://www.union.ai/docs/v2/union/user-guide/task-programming/fanout/page.md): Scale your workflows by running many tasks in parallel, perfect for processing large datasets or running hyperparameter sweeps.
- [**Controlling parallel execution**](https://www.union.ai/docs/v2/union/user-guide/task-programming/controlling-parallelism/page.md): Limit concurrent task executions using semaphores or `flyte.map` concurrency for rate-limited APIs, GPU quotas, and resource-constrained workflows.
- [**Human-in-the-loop**](https://www.union.ai/docs/v2/union/user-guide/task-programming/human-in-the-loop/page.md): Pause workflow execution at a checkpoint and wait for a human to provide input or approval before continuing.
- [**Grouping actions**](https://www.union.ai/docs/v2/union/user-guide/task-programming/grouping-actions/page.md): Organize related task executions into logical groups for better visualization and management in the UI.
- [**Container tasks**](https://www.union.ai/docs/v2/union/user-guide/task-programming/container-tasks/page.md): Run arbitrary containers in any language without the Flyte SDK installed, using Flyte's copilot sidecar for seamless data flow.
- [**Remote tasks**](https://www.union.ai/docs/v2/union/user-guide/task-programming/remote-tasks/page.md): Use previously deployed tasks without importing their code or dependencies, enabling team collaboration and task reuse.
- [**Pod templates**](https://www.union.ai/docs/v2/union/user-guide/task-configuration/pod-templates/page.md): Extend tasks with Kubernetes pod templates to add sidecars, volume mounts, and advanced Kubernetes configurations.
- [**Abort and cancel actions**](https://www.union.ai/docs/v2/union/user-guide/task-programming/abort-tasks/page.md): Stop in-progress actions automatically, programmatically, or manually via the CLI and UI.
- [**Other features**](https://www.union.ai/docs/v2/union/user-guide/task-programming/other-features/page.md): Additional patterns such as task forwarding, custom action names, and interoperability between async and sync tasks.

**Development and debugging**
- [**Notebooks**](https://www.union.ai/docs/v2/union/user-guide/task-programming/notebooks/page.md): Write and iterate on workflows directly in Jupyter notebooks for interactive development and experimentation.
- [**Unit testing**](https://www.union.ai/docs/v2/union/user-guide/task-programming/unit-testing/page.md): Test your Flyte tasks using direct invocation for business logic or `flyte.run()` for Flyte-specific features.
- [**Links**](https://www.union.ai/docs/v2/union/user-guide/task-programming/links/page.md): Add clickable URLs to tasks in the Flyte UI, connecting them to external tools like experiment trackers and monitoring dashboards.
- [**Reports**](https://www.union.ai/docs/v2/union/user-guide/task-programming/reports/page.md): Generate custom HTML reports during task execution to display progress, results, and visualizations in the UI.
- [**Traces**](https://www.union.ai/docs/v2/union/user-guide/task-programming/traces/page.md): Add fine-grained observability to helper functions within your tasks for better debugging and resumption capabilities.
- [**Error handling**](https://www.union.ai/docs/v2/union/user-guide/task-programming/error-handling/page.md): Implement robust error recovery strategies, including automatic resource scaling and graceful failure handling.

## When to use these patterns

These programming patterns become essential as your workflows grow in complexity:

- Use **fanout** when you need to process multiple items concurrently or run parameter sweeps. Use **controlling parallel execution** when you need to limit how many of those tasks run at the same time.
- Implement **error handling** for production workflows that need to recover from infrastructure failures.
- Apply **grouping** to organize complex workflows with many task executions.
- Leverage **files and directories** when working with large datasets that don't fit in memory.
- Use **DataFrames** to efficiently pass tabular data between tasks across different processing engines.
- Choose **container tasks** when you need to run code in non-Python languages, use legacy containers, or execute AI-generated code in sandboxes.
- Use **remote tasks** to reuse tasks deployed by other teams without managing their dependencies.
- Apply **pod templates** when you need advanced Kubernetes features like sidecars or specialized storage configurations.
- Use **traces** to debug non-deterministic operations like API calls or ML inference.
- Use **links** to connect tasks to external tools like Weights & Biases, Grafana, or custom dashboards directly from the Flyte UI.
- Create **reports** to monitor long-running workflows and share results with stakeholders.
- Use **custom context** when you need lightweight, cross-cutting metadata to flow through your task hierarchy without becoming part of the task's logical inputs.
- Write **unit tests** to validate your task logic and ensure type transformations work correctly before deployment.
- Use **abort and cancel** to stop unnecessary actions when conditions change, such as early convergence in hyperparameter optimization (HPO) or manual intervention.
- Use **human-in-the-loop** to insert approval gates or data collection checkpoints into automated workflows.

Each guide includes practical examples and best practices to help you implement these patterns effectively in your own workflows.
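To make the relationship between fanout and controlled parallelism concrete, here is a minimal sketch that uses only Python's standard `asyncio` module. It does not use the Flyte SDK: `process_item` and `fan_out` are hypothetical names, and the sleep stands in for real work such as a task call or a rate-limited API request. The fanout and controlling parallel execution guides show how the same idea applies to Flyte tasks.

```python
import asyncio


async def process_item(item: int, limiter: asyncio.Semaphore, stats: dict) -> int:
    """Hypothetical unit of work; in a real workflow this would be a task call."""
    async with limiter:  # wait here until a concurrency slot is free
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # simulate I/O-bound work
        stats["active"] -= 1
        return item * item


async def fan_out(items: list[int], max_concurrency: int) -> tuple[list[int], int]:
    """Start one coroutine per item, but let only `max_concurrency` run at once."""
    limiter = asyncio.Semaphore(max_concurrency)
    stats = {"active": 0, "peak": 0}
    # gather() returns results in the same order as the inputs.
    results = await asyncio.gather(
        *(process_item(item, limiter, stats) for item in items)
    )
    return list(results), stats["peak"]


if __name__ == "__main__":
    squares, peak = asyncio.run(fan_out(list(range(10)), max_concurrency=3))
    print(squares)
    print(f"peak concurrency: {peak}")
```

Without the semaphore, all ten coroutines would run their work at the same time. With it, the fanout is still expressed as a single `gather` call, while the number of items in flight never exceeds the chosen limit.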

## Subpages

- [Files and directories](https://www.union.ai/docs/v2/union/user-guide/task-programming/files-and-directories/page.md)
  - Example usage
  - JSONL files
  - Setup
  - JsonlFile
  - Compression
  - JsonlDir
  - Error handling
  - Batch iteration
- [Data classes and structures](https://www.union.ai/docs/v2/union/user-guide/task-programming/dataclasses-and-structures/page.md)
  - Example: Combining Dataclasses and Pydantic Models
- [DataFrames](https://www.union.ai/docs/v2/union/user-guide/task-programming/dataframes/page.md)
  - Setting up the environment and sample data
  - Create a raw DataFrame
  - Create a flyte.io.DataFrame
  - Automatically convert between types
  - Downloading DataFrames
  - Run the example
  - Polars DataFrames
  - Setup
  - Eager DataFrames
  - Lazy DataFrames
  - Run the example
- [Custom types](https://www.union.ai/docs/v2/union/user-guide/task-programming/handling-custom-types/page.md)
  - Types of extensions
  - Creating a type transformer
  - Step 1: Define your custom type
  - Step 2: Create the type transformer
  - Step 3: Register the transformer
  - Distributing type plugins
  - Configure pyproject.toml
  - Automatic loading
  - Controlling plugin loading
  - Using custom types in tasks
  - DataFrame extensions
  - Best practices
- [Custom context](https://www.union.ai/docs/v2/union/user-guide/task-programming/custom-context/page.md)
  - Overview
  - When to use it and when not to
  - Setting custom context
  - Run-level context
  - Overriding inside a task (local override that affects nested tasks)
  - Adding new keys for nested tasks
  - Accessing custom context
- [Abort and cancel actions](https://www.union.ai/docs/v2/union/user-guide/task-programming/abort-tasks/page.md)
  - Action lifetime
  - Canceling actions programmatically
  - External abort
  - Aborting via the CLI
  - Handling external aborts
- [Raw Container Tasks](https://www.union.ai/docs/v2/union/user-guide/task-programming/container-tasks/page.md)
  - What are Container Tasks?
  - How Data Flows In and Out
  - Basic Usage
  - Template Syntax for Inputs
  - Using Container Tasks in Workflows
  - Advanced: Passing Files and Directories
  - Use Case: Agentic Sandbox Execution
  - Use Case: Legacy and Specialized Containers
  - Use Case: Multi-Language Workflows
  - Configuration Options
  - ContainerTask Parameters
  - Supported Input/Output Types
  - Best Practices
  - Local Execution
  - When to Use Container Tasks
- [Links](https://www.union.ai/docs/v2/union/user-guide/task-programming/links/page.md)
  - Creating a link
  - Using execution metadata
  - Dynamic links with override
- [Reports](https://www.union.ai/docs/v2/union/user-guide/task-programming/reports/page.md)
  - A simple example
  - A more complex example
  - Streaming example
- [Notebooks](https://www.union.ai/docs/v2/union/user-guide/task-programming/notebooks/page.md)
  - Iterating on and running a workflow
  - Accessing runs and downloading logs
- [Remote tasks](https://www.union.ai/docs/v2/union/user-guide/task-programming/remote-tasks/page.md)
  - Prerequisites
  - Basic usage
  - Understanding lazy loading
  - When tasks are fetched
  - Benefits of lazy loading
  - Error handling
  - Eager fetching with `fetch()`
  - Module-level vs dynamic loading
  - Complete example
  - Team A: Spark environment
  - Team B: ML environment
  - Team C: Orchestration
  - Invoke remote tasks in a script.
  - Why use remote tasks?
  - When to use remote tasks
  - How remote tasks work
  - Security model
  - Type system
  - Versioning options
  - Customizing remote tasks
  - Available overrides
  - Override examples
  - Chain overrides
  - Best practices
  - 1. Use meaningful task names
  - 2. Document task interfaces
  - 3. Prefer module-level loading
  - 4. Handle versioning thoughtfully
  - 5. Deploy remote tasks first
  - Limitations
  - Next steps
- [Error handling](https://www.union.ai/docs/v2/union/user-guide/task-programming/error-handling/page.md)
- [Traces](https://www.union.ai/docs/v2/union/user-guide/task-programming/traces/page.md)
  - What are traced functions for?
  - What Gets Traced
  - Errors are not recorded
  - Supported Function Types
  - Task Orchestration Pattern
  - Relationship to Caching and Checkpointing
  - How They Work Together
  - Execution Flow
  - Error Handling and Observability
  - Examples in Practice
  - LLM Pipeline with Traces
- [Grouping actions](https://www.union.ai/docs/v2/union/user-guide/task-programming/grouping-actions/page.md)
  - What are groups?
  - The problem groups solve
  - How groups work
  - Common grouping patterns
  - Sequential operations
  - Parallel processing with groups
  - Multi-phase workflows
  - Nested groups
  - Conditional grouping
  - Key insights
- [Fanout](https://www.union.ai/docs/v2/union/user-guide/task-programming/fanout/page.md)
  - Understanding fanout
  - Example
  - Parallel execution
  - Running the example
  - How Flyte handles concurrency and parallelism
- [Controlling parallel execution](https://www.union.ai/docs/v2/union/user-guide/task-programming/controlling-parallelism/page.md)
  - The problem: unbounded parallelism
  - Using asyncio.Semaphore
  - Using flyte.map with concurrency
  - Running the example
  - When to use each approach
- [Human-in-the-loop](https://www.union.ai/docs/v2/union/user-guide/task-programming/human-in-the-loop/page.md)
  - Setup
  - Automated task
  - Requesting human input
  - Wiring it together
  - Event options
  - Submitting input programmatically
- [Other features](https://www.union.ai/docs/v2/union/user-guide/task-programming/other-features/page.md)
  - Task Forwarding
  - Passing Tasks and Functions as Arguments
  - Custom Action Names
  - Set at Task Definition
  - Override at Call Time
  - Invoking Async Functions from Sync Tasks
  - Async and Sync Task Interoperability
  - Calling Sync Tasks from Async Tasks
  - Using with `flyte.map.aio()`
  - Using AnyIO in Async Tasks
- [Unit Testing Tasks](https://www.union.ai/docs/v2/union/user-guide/task-programming/unit-testing/page.md)
  - Understanding Task Invocation
  - Direct Function Invocation
  - Using `flyte.run()`
  - Testing Business Logic
  - Testing Async Tasks
  - Testing Nested Tasks
  - Testing Type Transformations and Serialization
  - Testing Type Restrictions
  - Testing Nested Tasks with Serialization
  - Testing Traced Functions
  - Best Practices
  - Quick Reference
  - Example Test Suite
  - Future Improvements

---
**Source**: https://github.com/unionai/unionai-docs/blob/main/content/user-guide/task-programming/_index.md
**HTML**: https://www.union.ai/docs/v2/union/user-guide/task-programming/
