# Architecture
> This bundle contains all pages in the Architecture section.
> Source: https://www.union.ai/docs/v2/union/deployment/selfmanaged/architecture/

=== PAGE: https://www.union.ai/docs/v2/union/deployment/selfmanaged/architecture ===

# Architecture

> **📝 Note**
>
> An LLM-optimized bundle of this entire section is available at [`section.md`](https://www.union.ai/docs/v2/union/deployment/selfmanaged/section.md).
> This single file contains all pages in this section, optimized for AI coding agent context.

This section covers the architecture of the Union.ai data plane.
It provides an overview of the components and their interactions within the system.
Understanding the architecture is crucial for effectively deploying and managing your Union.ai cluster.

=== PAGE: https://www.union.ai/docs/v2/union/deployment/selfmanaged/architecture/overview ===

# Overview

The Union.ai architecture consists of two components, referred to as planes — the control plane and the data plane.

![](../../../_static/images/deployment/architecture.svg)

## Control plane

The control plane:
  * Runs within the Union.ai AWS account.
  * Provides the user interface through which users can access authentication, authorization, observation, and management functions.
  * Is responsible for placing executions onto data plane clusters and performing other cluster control and management functions.

## Data plane

Union.ai operates one control plane for each supported region, which supports all data planes within that region. You can choose the region in which to locate your data plane. Currently, Union.ai supports the `us-west`, `us-east`, `eu-west`, and `eu-central` regions, and more are being added.

### Data plane nodes

Worker nodes are responsible for executing your workloads. You have full control over the configuration of your worker nodes. When worker nodes are not in use, they automatically scale down to the configured minimum.

## Union.ai operator

The Union.ai hybrid architecture lets you maintain ultimate ownership and control of your data and compute infrastructure while enabling Union.ai to handle the details of managing that infrastructure.

Management of the data plane is mediated by a dedicated operator (the Union.ai operator) resident on that plane.
This operator is designed to perform its functions with only the very minimum set of required permissions.
It allows the control plane to spin up and down clusters and provides Union.ai's support engineers with access to system-level logs and the ability to apply changes as per customer requests.
It _does not_ provide direct access to secrets or data.

In addition, communication is always initiated by the Union.ai operator in the data plane toward the Union.ai control plane, not the other way around.
This further enhances the security of your data plane.

Union.ai is SOC-2 Type 2 certified. A copy of the audit report is available upon request.

## Registry data

Registry data is comprised of:

* Names of workflows, tasks, launch plans, and artifacts
* Input and output types for workflows and tasks
* Execution status, start time, end time, and duration of workflows and tasks
* Version information for workflows, tasks, launchplans, and artifacts
* Artifact definitions

This type of data is stored in the control plane and is used to manage the execution of your workflows.
This does not include any workflow or task code, nor any data that is processed by your workflows or tasks.

## Execution data

Execution data is comprised of::

* Event data
* Workflow inputs
* Workflow outputs
* Data passed between tasks (task inputs and outputs)

This data is divided into two categories: *raw data* and *literal data*.

### Raw data

Raw data is comprised of:

* Files and directories
* Dataframes
* Models
* Python-pickled types

These are passed by reference between tasks and are always stored in an object store in your data plane.
This type of data is read by (and may be temporarily cached) by the control plane as needed, but is never stored there.

### Literal data

* Primitive execution inputs (int, string... etc.)
* JSON-serializable dataclasses

These are passed by value, not by reference, and may be stored in the Union.ai control plane.

## Data privacy

If you are concerned with maintaining strict data privacy, be sure not to pass private information in literal form between tasks.

=== PAGE: https://www.union.ai/docs/v2/union/deployment/selfmanaged/architecture/kubernetes-rbac ===

# Kubernetes Access Controls

## Roles

See the [dataplane helm charts](https://github.com/unionai/helm-charts/tree/main/charts/dataplane) for detailed information about Roles and ClusterRoles.

### Role Permissions Summary

##### `proxy-system-secret`
- Scoped to `union` namespace
- Permissions on secrets: get, list, create, update, delete

##### `operator-system`
- Scoped to `union` namespace
- Permissions on secrets and deployments: get, list, watch, create, update

##### `union-operator-admission` (for webhook)
- Scoped to `union` namespace
- Permissions on secrets: get, create

### ClusterRole Permissions Summary

#### Metrics and Monitoring Roles

##### `release-name-kube-state-metrics`

- **Purpose**: Collects metrics from Kubernetes resources
- **Access Pattern**: Read-only (`list`, `watch`) to numerous resources across multiple API groups
- **Scope**: Comprehensive - covers core resources, workloads, networking, storage, and authentication

##### `prometheus-operator`
- **Access**: Full control (`*`) over Prometheus monitoring resources
- **Key Permissions**:
  - Complete access to monitoring.coreos.com API group resources
  - Full access to statefulsets, configmaps, secrets
  - Pod management (list, delete)
  - Service/endpoint management
  - Read-only for nodes, namespaces, ingresses

##### `union-operator-prometheus`
- **Access**: Read-only access to metrics sources
- **Resources**: nodes, services, endpoints, pods, endpointslices, ingresses
- **Special**: Access to `/metrics` and `/metrics/cadvisor` endpoints

#### Resource Management Roles

##### `clustersync-resource`
- **Access**: Full control (`*`) over core and RBAC resources
- **Resources**:
  - Core: configmaps, namespaces, pods, resourcequotas, secrets, services, serviceaccounts
  - RBAC: roles, rolebindings, clusterrolebindings
- **API Groups**: `""` (core) and `rbac.authorization.k8s.io`

##### `proxy-system`
- **Access**: Read-only (`get`, `list`, `watch`)
- **Resources**: events, flyteworkflows, pods/log, pods, rayjobs, resourcequotas

#### Workflow Management Roles

##### `operator-system`
- **Access**: Full control over Flyte workflows, CRUD for core resources
- **Resources**:
  - Full access to flyteworkflows
  - Management of pods, configmaps, resourcequotas, podtemplates, nodes
  - Access to `/metrics` endpoint

##### `flytepropeller-webhook-role`
- **Access**: Get, create, update, patch
- **Resources**: mutatingwebhookconfigurations, secrets, pods, replicasets/finalizers
##### `flytepropeller-role`
- **Access**: Varied per resource type
- **Key Permissions**:
  - Read-only for pods
  - Event management
  - CRD management
  - Full control over flyteworkflows including finalizers

## Service Access

### `operator/operator-proxy`
Service that provides access to both cluster resources and cloud provider APIs, particularly focused on compute resource management.

#### Kubernetes Resources

##### Core Resources
- Pods: Access via informers to monitor and manage pod lifecycle.
- Nodes: Access to retrieve node information.
- ResourceQuotas: Read access.
- ConfigMaps: Access for configuration management
- Secrets: Access for credentials storage
- Namespaces: Referenced in container/pod identification contexts

##### Custom Resources
- FlyteWorkflows: Management of v1alpha1.FlyteWorkflow resources
- Kueue Resources (optional): Access to ResourceFlavor, ClusterQueue, and other queue resources
- Karpenter NodePools (optional): For AWS-based compute resource management

##### Cloud Provider Resources
- Object Storage: Read/write operations to cloud storage buckets

##### Authentication and Configuration
- OAuth: Uses app ID for authentication with Union cloud services
- Service Account Roles: Configured via UserRoleKey and UserRole
- Cluster Information: Access to cluster metadata and metrics

### `FlytePropeller/PropellerWebhook`
Kubernetes operator that executes Flyte graphs natively on Kubernetes.

#### Kubernetes Resources
- Manages pod creation for executions
- Secret injection

#### Custom Resources
- FlyteWorkflows: Management of v1alpha1.FlyteWorkflow resources

