# Serve and deploy apps

> **📝 Note**
>
> An LLM-optimized bundle of this entire section is available at [`section.md`](https://www.union.ai/docs/v2/union/user-guide/serve-and-deploy-apps/section.md).
> This single file contains all pages in this section, optimized for AI coding agent context.

Flyte provides two ways to run apps: **serve** (for development) and **deploy** (for production). This section covers both and how they differ.

## Serve vs Deploy

### `flyte serve`

Serving is designed for development and iteration:

- **Dynamic parameter modification**: You can override app parameters when serving
- **Quick iteration**: Faster feedback loop for development
- **Interactive**: Better suited for testing and experimentation

### `flyte deploy`

Deployment is designed for production use:

- **Immutable**: Apps are deployed with fixed configurations
- **Production-ready**: Optimized for stability and reproducibility
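The contrast between overridable and fixed configuration can be pictured with a small, Flyte-independent sketch. Everything here (`AppConfig`, `serve_config`, `deploy_config`) is a hypothetical stand-in introduced for illustration only; it is not part of the Flyte SDK and does not describe how Flyte implements either mode.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AppConfig:
    """Hypothetical stand-in for an app's configuration (not a Flyte class)."""

    name: str
    port: int
    cpu: str


def serve_config(base: AppConfig, **overrides) -> AppConfig:
    """Serve-style: derive a one-off variant of the config for a dev session."""
    return replace(base, **overrides)


def deploy_config(base: AppConfig) -> AppConfig:
    """Deploy-style: the config is used exactly as it was defined."""
    return base


base = AppConfig(name="my-app", port=8080, cpu="1")
dev = serve_config(base, cpu="2")  # override applied only to this variant
prod = deploy_config(base)         # unchanged definition

print(dev.cpu, prod.cpu)  # prints: 2 1
```

The base definition stays untouched in both cases; only the serve-style path produces a modified copy.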

## Using the Python SDK

### Serve

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
# ]
# ///

"""Serve example: run a Streamlit app for development."""

import flyte
import flyte.app

# Describe the app: container image, start command, port, and resources.
app_env = flyte.app.AppEnvironment(
    name="my-app",
    image=flyte.app.Image.from_debian_base().with_pip_packages("streamlit==1.41.1"),
    args=["streamlit", "hello", "--server.port", "8080"],
    port=8080,
    resources=flyte.Resources(cpu="1", memory="1Gi"),
)

if __name__ == "__main__":
    flyte.init_from_config()
    # Serve the app and print the URL where it can be reached.
    app = flyte.serve(app_env)
    print(f"Served at: {app.url}")
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/serve-and-deploy-apps/serve_and_deploy_examples.py*

### Deploy

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
# ]
# ///

"""Deploy example: deploy a Streamlit app for production."""

import flyte
import flyte.app

# Describe the app: container image, start command, port, and resources.
app_env = flyte.app.AppEnvironment(
    name="my-app",
    image=flyte.app.Image.from_debian_base().with_pip_packages("streamlit==1.41.1"),
    args=["streamlit", "hello", "--server.port", "8080"],
    port=8080,
    resources=flyte.Resources(cpu="1", memory="1Gi"),
)

if __name__ == "__main__":
    flyte.init_from_config()
    deployments = flyte.deploy(app_env)
    # Access the deployed app URL from the first deployment's environments.
    for deployed_env in deployments[0].envs.values():
        print(f"Deployed: {deployed_env.deployed_app.url}")
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/serve-and-deploy-apps/serve_and_deploy_examples.py*
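
Because the serve and deploy examples share the same `AppEnvironment` definition, one script can cover both modes. The following is a minimal sketch that reuses only the SDK calls shown in this section. The `--mode` flag and the `parse_mode` and `run` helpers are hypothetical names introduced for illustration. `flyte` is imported inside `run` so that the argument parsing can be exercised without a configured backend.

```python
import argparse


def parse_mode(argv: list[str]) -> str:
    """Return "serve" or "deploy" from a hypothetical --mode flag."""
    parser = argparse.ArgumentParser(description="Serve or deploy my-app")
    parser.add_argument("--mode", choices=["serve", "deploy"], default="serve")
    return parser.parse_args(argv).mode


def run(mode: str) -> None:
    """Serve or deploy the same app environment, depending on mode."""
    # Imported lazily so parse_mode can be used without flyte installed.
    import flyte
    import flyte.app

    app_env = flyte.app.AppEnvironment(
        name="my-app",
        image=flyte.app.Image.from_debian_base().with_pip_packages("streamlit==1.41.1"),
        args=["streamlit", "hello", "--server.port", "8080"],
        port=8080,
        resources=flyte.Resources(cpu="1", memory="1Gi"),
    )

    flyte.init_from_config()
    if mode == "serve":
        app = flyte.serve(app_env)
        print(f"Served at: {app.url}")
    else:
        deployments = flyte.deploy(app_env)
        for deployed_env in deployments[0].envs.values():
            print(f"Deployed: {deployed_env.deployed_app.url}")
```

To use it as a script, call `run(parse_mode(sys.argv[1:]))` from an `if __name__ == "__main__":` guard. With no flag the sketch defaults to serving, which matches the development-first workflow described above.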

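## Using the CLI

Both commands take the path to the Python file followed by the name of the `AppEnvironment` variable defined in it (`app_env` in the Python examples above).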

### Serve

```bash
flyte serve path/to/app.py app_env
```

### Deploy

```bash
flyte deploy path/to/app.py app_env
```
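
When the CLI is driven from an automation script, the two commands differ only in the subcommand. The sketch below builds the same command lines shown above and runs them with `subprocess`. The helper names `build_flyte_command` and `run_flyte` are hypothetical, and actually running a command assumes the `flyte` CLI is installed and configured.

```python
import subprocess


def build_flyte_command(mode: str, path: str, env_name: str) -> list[str]:
    """Build the argument list for `flyte serve` or `flyte deploy`."""
    if mode not in ("serve", "deploy"):
        raise ValueError(f"mode must be 'serve' or 'deploy', got {mode!r}")
    return ["flyte", mode, path, env_name]


def run_flyte(mode: str, path: str, env_name: str) -> int:
    """Run the command and return its exit code (requires flyte on PATH)."""
    completed = subprocess.run(build_flyte_command(mode, path, env_name), check=False)
    return completed.returncode


# Show the command that would be executed, without running it.
print(build_flyte_command("serve", "path/to/app.py", "app_env"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the file path contains spaces.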

## Next steps

- [**How app serving works**](https://www.union.ai/docs/v2/union/user-guide/serve-and-deploy-apps/how-app-serving-works/page.md): Understanding the serve process and configuration options
- [**How app deployment works**](https://www.union.ai/docs/v2/union/user-guide/serve-and-deploy-apps/how-app-deployment-works/page.md): Understanding the deploy process and configuration options
- [**Activating and deactivating apps**](https://www.union.ai/docs/v2/union/user-guide/serve-and-deploy-apps/activating-and-deactivating-apps/page.md): Managing app lifecycle
- [**Model training and serving**](https://www.union.ai/docs/v2/union/user-guide/basic-project/page.md): Train a model with tasks and serve it via FastAPI
- [**Prefetching models**](https://www.union.ai/docs/v2/union/user-guide/serve-and-deploy-apps/prefetching-models/page.md): Download and shard HuggingFace models for vLLM and SGLang

## Subpages

- [How app serving works](https://www.union.ai/docs/v2/union/user-guide/serve-and-deploy-apps/how-app-serving-works/page.md)
  - Overview
  - Using the Python SDK
  - Overriding parameters
  - Advanced serving options
  - Using CLI
  - Return value
  - Best practices
  - Troubleshooting
- [How app deployment works](https://www.union.ai/docs/v2/union/user-guide/serve-and-deploy-apps/how-app-deployment-works/page.md)
  - Overview
  - Using the Python SDK
  - Deployment plan
  - Overriding App configuration at deployment time
  - Activation/deactivation
  - Using the CLI
  - Example: Full deployment configuration
  - Best practices
  - Deployment status and return value
  - Troubleshooting
- [Activating and deactivating apps](https://www.union.ai/docs/v2/union/user-guide/serve-and-deploy-apps/activating-and-deactivating-apps/page.md)
  - Activation
  - Activate after deployment
  - Activate an app
  - Check activation status
  - Deactivation
  - Lifecycle management
  - Typical deployment workflow
  - Blue-green deployment
  - Using CLI
  - Activate
  - Deactivate
  - Check status
  - Best practices
  - Automatic activation with serve
  - Example: Complete deployment and activation
  - Troubleshooting
- [Prefetching models](https://www.union.ai/docs/v2/union/user-guide/serve-and-deploy-apps/prefetching-models/page.md)
  - Why prefetch?
  - Basic prefetch
  - Using Python SDK
  - Using CLI
  - Using prefetched models
  - Prefetch options
  - Custom artifact name
  - With HuggingFace token
  - With resources
  - Sharding models for multi-GPU
  - vLLM sharding
  - Using shard config via CLI
  - Using prefetched sharded models
  - CLI options
  - Complete example
  - Best practices
  - Troubleshooting

---
**Source**: https://github.com/unionai/unionai-docs/blob/main/content/user-guide/serve-and-deploy-apps/_index.md
**HTML**: https://www.union.ai/docs/v2/union/user-guide/serve-and-deploy-apps/
