@negz
Last active September 22, 2025 22:55
Transaction-based Package Coordination System Specification

Context and Problem Statement

GitHub Issue: crossplane/crossplane#6766

Current Problem: Crossplane's package manager makes incremental changes without validating complete operations upfront, leading to:

  • No pre-flight validation of dependency resolution
  • CRD/MRD conflicts discovered too late
  • Risky complex operations with no guarantee of success
  • Concurrent package operations that can interfere with each other

Solution: Introduce Transaction-based coordination where complete package operations are validated before execution and serialized to prevent conflicts.

Transaction Design Philosophy

Key Goals:

  1. Serialize package operations - Only one Transaction can execute at a time via exclusive Lock access
  2. Thorough pre-flight validation - Validate complete dependency trees, CRD compatibility, and resource requirements before making any changes

Important: Transactions are NOT for rollback - Unlike database transactions, these focus on coordination and validation rather than rollback capabilities. Users handle "undo" scenarios through standard Kubernetes manifest management (git revert, kubectl apply previous state).

Key Files for Context

Current Package Manager Implementation:

  • apis/pkg/v1/package_types.go - Package, Provider, Configuration, Function types
  • apis/pkg/v1/revision_types.go - PackageRevision types and lifecycle
  • apis/pkg/v1beta1/lock.go - Lock resource (central coordination point)
  • internal/controller/pkg/manager/reconciler.go - Package controllers (create revisions)
  • internal/controller/pkg/revision/reconciler.go - Revision controller (installs packages)
  • internal/controller/pkg/revision/dependency.go - Dependency resolution logic
  • internal/controller/pkg/resolver/reconciler.go - Resolver controller (manages Lock)
  • internal/xpkg/fetch.go - Package fetching using go-containerregistry
  • internal/dag/dag.go - Dependency graph implementation

Modern Controller Patterns:

  • internal/controller/ops/ - Reference implementation for modern controller patterns
  • internal/controller/ops/operation/constructor.go - Constructor pattern with dependency injection
  • internal/controller/ops/operation/reconciler.go - Modern reconciler structure

Architecture Overview

Current Flow

Provider controller creates ProviderRevisions → ProviderRevision + Resolver controllers collaborate to resolve dependencies and update Lock

New Flow (Transaction Mode)

User creates/updates Provider → Provider controller creates Transaction → Transaction controller validates + resolves tags to digests + creates ProviderRevisions + installs CRDs + updates Lock → ProviderRevision manages runtime

Key Architectural Changes

  • Transaction controller: Does all heavy lifting (validation, tag→digest resolution, ProviderRevision creation, CRD installation, Lock management)
  • Provider/Configuration/Function: User-facing APIs and configuration knobs, create Transactions
  • ProviderRevision (and the other PackageRevision kinds): Runtime management and health reporting only
  • Lock contains revisions: Lock.Packages contains revision-level info (digests), not package-level info (tags)

API Design

Transaction Resource

Location: apis/pkg/v1alpha1/transaction_types.go

// Transaction represents a complete proposed state of the package system
// and validates the entire operation before making changes.
type Transaction struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`
    
    Spec   TransactionSpec   `json:"spec,omitempty"`
    Status TransactionStatus `json:"status,omitempty"`
}

type TransactionSpec struct {
    // Change represents the type of change to make to the package system.
    Change ChangeType `json:"change"`
    
    // Install specifies parameters for installing a package.
    // Required when Change is "Install".
    // +optional
    Install *InstallSpec `json:"install,omitempty"`
    
    // Delete specifies parameters for deleting a package.
    // Required when Change is "Delete".
    // +optional
    Delete *DeleteSpec `json:"delete,omitempty"`
    
    // Replace specifies parameters for replacing one package with another.
    // Required when Change is "Replace".
    // +optional
    Replace *ReplaceSpec `json:"replace,omitempty"`
    
    // RetryLimit configures how many times the transaction may fail. When the
    // failure limit is exceeded, the transaction will not be retried.
    // Follows the same pattern as Operation resources for consistency.
    // +optional
    // +kubebuilder:default=5
    RetryLimit *int64 `json:"retryLimit,omitempty"`
}

type InstallSpec struct {
    // Package is a complete snapshot of the Package resource configuration
    // at the time the Transaction was created. This ensures stable inputs
    // across retries and provides all information needed to create PackageRevisions.
    Package PackageSnapshot `json:"package"`
}

type DeleteSpec struct {
    // Source is the OCI repository of the package to delete.
    Source string `json:"source"`
}

type ReplaceSpec struct {
    // Source is the OCI repository of the package to be replaced.
    Source string `json:"source"`
    
    // Package is a complete snapshot of the new Package resource configuration.
    // The Transaction controller uses this to create the replacement package
    // and resolve its dependency tree for equivalence validation.
    Package PackageSnapshot `json:"package"`
}

type PackageSnapshot struct {
    // APIVersion of the Package resource (e.g., "pkg.crossplane.io/v1")
    APIVersion string `json:"apiVersion"`
    
    // Kind of the Package resource (e.g., "Provider", "Configuration", "Function")
    Kind string `json:"kind"`
    
    // Metadata contains essential Package metadata (name, labels)
    // Excludes fields not needed for PackageRevision creation (generation, ownerRefs, etc.)
    Metadata PackageMetadata `json:"metadata"`
    
    // Spec is the complete Package.Spec configuration
    Spec runtime.RawExtension `json:"spec"`
}

type PackageMetadata struct {
    // Name of the Package resource
    Name string `json:"name"`
    
    // UID of the Package resource, required for creating owner references
    UID types.UID `json:"uid"`
    
    // Labels applied to the Package resource
    // +optional
    Labels map[string]string `json:"labels,omitempty"`
}

type ChangeType string

const (
    // ChangeTypeInstall installs the specified package version.
    // If the package source already exists, this updates it to the new version.
    ChangeTypeInstall ChangeType = "Install"
    
    // ChangeTypeDelete removes the specified package source from the system.
    ChangeTypeDelete ChangeType = "Delete"
    
    // ChangeTypeReplace replaces one package source with another.
    // Used for migrating between package ecosystems while preserving CRDs and managed resources.
    ChangeTypeReplace ChangeType = "Replace"
)

type TransactionStatus struct {
    xpv1.ConditionedStatus `json:",inline"`
    
    // Number of transaction failures. Incremented each time the transaction
    // fails and retries. When this reaches RetryLimit, the transaction
    // will not be retried again.
    Failures int64 `json:"failures,omitempty"`
    
    // TransactionNumber is a monotonically increasing number assigned when
    // the Transaction acquires the Lock. Used for ordering Transaction execution.
    TransactionNumber int64 `json:"transactionNumber,omitempty"`
    
    // ProposedLockPackages contains the complete proposed Lock state after resolving
    // the requested change and all dependencies. This represents what the Lock
    // will contain if the Transaction succeeds.
    ProposedLockPackages []LockPackage `json:"proposedLockPackages,omitempty"`
    
    // StartTime when transaction execution began
    StartTime *metav1.Time `json:"startTime,omitempty"`
    
    // CompletionTime when transaction finished (success or failure)
    CompletionTime *metav1.Time `json:"completionTime,omitempty"`
}

// Transaction condition types
const (
    // TypeValidated indicates the transaction has completed pre-flight validation
    TypeValidated xpv1.ConditionType = "Validated"
    
    // TypeInstalled indicates the transaction has completed package installation
    TypeInstalled xpv1.ConditionType = "Installed"
    
    // Additional standard conditions from xpv1.ConditionedStatus:
    // - Synced: Most recent reconcile loop succeeded (True) or hit an error (False)
    // - Ready: Transaction completed successfully overall
)

// Transaction condition progression:
// 1. Validated=True when pre-flight validation succeeds (False with specific reason if validation fails)
// 2. Installed=True when CRDs/ProviderRevisions created successfully (False if installation fails)  
// 3. Ready=True when entire transaction completes successfully (False if any step fails)
// 4. Synced=True/False set at every reconcile return - indicates if latest reconcile succeeded
//
// Failed transactions are not rolled back: per the design philosophy above,
// users undo changes through standard Kubernetes manifest management.

// Validation condition reasons
const (
    // Dependency validation reasons
    ReasonMissingDependency   = "MissingDependency"
    ReasonVersionConflict     = "VersionConflict"
    ReasonCircularDependency  = "CircularDependency"
    
    // Schema validation reasons
    ReasonCRDConflict         = "CRDConflict"
    ReasonSchemaIncompatible  = "SchemaIncompatible"
    
    // Success reason
    ReasonAllValidationsPassed = "AllValidationsPassed"
    
    // Future validation reasons:
    // ReasonRBACInsufficient, ReasonQuotaExceeded, ReasonNetworkPolicyViolation, etc.
)
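
The field comments on TransactionSpec say each sub-spec is "required when Change is ...". A minimal sketch of that check follows. It uses trimmed stand-ins for the spec types, and `ValidateChange` is a hypothetical helper, not part of the proposal. Rejecting sub-specs that do not match `Change` is an added assumption; the spec only states which one is required.

```go
package main

import (
	"fmt"
	"strings"
)

// ChangeType mirrors the enum defined in the spec above.
type ChangeType string

const (
	ChangeTypeInstall ChangeType = "Install"
	ChangeTypeDelete  ChangeType = "Delete"
	ChangeTypeReplace ChangeType = "Replace"
)

// Empty stand-ins for the spec's sub-specs; only nil vs non-nil matters here.
type (
	InstallSpec struct{}
	DeleteSpec  struct{}
	ReplaceSpec struct{}
)

// TransactionSpec is a trimmed copy of the spec's TransactionSpec.
type TransactionSpec struct {
	Change  ChangeType
	Install *InstallSpec
	Delete  *DeleteSpec
	Replace *ReplaceSpec
}

// ValidateChange is a hypothetical helper: it checks that the sub-spec
// matching Change is set and (by assumption) that the other two are not.
func ValidateChange(s TransactionSpec) error {
	populated := map[ChangeType]bool{
		ChangeTypeInstall: s.Install != nil,
		ChangeTypeDelete:  s.Delete != nil,
		ChangeTypeReplace: s.Replace != nil,
	}
	if _, known := populated[s.Change]; !known {
		return fmt.Errorf("unknown change type %q", s.Change)
	}
	for ct, set := range populated {
		field := "spec." + strings.ToLower(string(ct))
		if ct == s.Change && !set {
			return fmt.Errorf("%s is required when change is %q", field, s.Change)
		}
		if ct != s.Change && set {
			return fmt.Errorf("%s must not be set when change is %q", field, s.Change)
		}
	}
	return nil
}

func main() {
	ok := TransactionSpec{Change: ChangeTypeInstall, Install: &InstallSpec{}}
	bad := TransactionSpec{Change: ChangeTypeDelete, Install: &InstallSpec{}}
	fmt.Println(ValidateChange(ok))
	fmt.Println(ValidateChange(bad))
}
```

In a real API this rule would more likely be expressed as CRD validation than as controller code; the sketch only makes the intended invariant explicit.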

Lock Resource Extensions

Location: apis/pkg/v1beta1/lock.go (extend existing)

The Lock resource serves as the central coordination point for Transaction execution. In transaction mode, coordination state is stored in annotations to avoid API changes while maintaining backward compatibility.

Transaction Coordination Annotations:

const (
    // AnnotationCurrentTransaction indicates which Transaction currently has exclusive access
    // to mutate this Lock. Only one Transaction can execute at a time.
    AnnotationCurrentTransaction = "pkg.crossplane.io/current-transaction"
    
    // AnnotationNextTransactionNumber stores the next available transaction number.
    // Incremented by each Transaction when it releases the Lock.
    AnnotationNextTransactionNumber = "pkg.crossplane.io/next-transaction-number"
    
    // AnnotationTransactionHistoryLimit dictates how many completed transactions to retain.
    // Defaults to 10. Can be disabled by explicitly setting to 0.
    AnnotationTransactionHistoryLimit = "pkg.crossplane.io/transaction-history-limit"
)

Transaction Mode Behavior:

  • Lock.Status conditions are not set when transaction feature flag is enabled
  • All coordination happens through annotations and Transaction resources
  • Legacy resolver controller continues to set conditions when feature flag is disabled
  • Lock resource structure remains unchanged for backward compatibility
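
Because coordination state lives in annotations, every value must be parsed from a string. The sketch below shows one way the history limit annotation could be read, using the default of 10 stated above. `HistoryLimit` is a hypothetical helper, and rejecting negative values is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

const (
	// AnnotationTransactionHistoryLimit is the annotation key from the spec.
	AnnotationTransactionHistoryLimit = "pkg.crossplane.io/transaction-history-limit"

	// defaultHistoryLimit is the default stated in the spec.
	defaultHistoryLimit = 10
)

// HistoryLimit is a hypothetical helper that reads the retention limit from
// the Lock's annotations. A missing annotation yields the default; an
// explicit 0 is returned as-is so the caller can treat it as "disabled".
func HistoryLimit(annotations map[string]string) (int, error) {
	raw, ok := annotations[AnnotationTransactionHistoryLimit]
	if !ok {
		return defaultHistoryLimit, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("parse %s=%q: %w", AnnotationTransactionHistoryLimit, raw, err)
	}
	if n < 0 {
		// Assumption: negative limits are invalid rather than "unlimited".
		return 0, fmt.Errorf("%s must not be negative, got %d", AnnotationTransactionHistoryLimit, n)
	}
	return n, nil
}

func main() {
	fmt.Println(HistoryLimit(nil))
	fmt.Println(HistoryLimit(map[string]string{AnnotationTransactionHistoryLimit: "0"}))
	fmt.Println(HistoryLimit(map[string]string{AnnotationTransactionHistoryLimit: "many"}))
}
```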

Transaction Status Examples

Complete Transaction Example

apiVersion: pkg.crossplane.io/v1alpha1
kind: Transaction
metadata:
  name: install-provider-aws-s3
spec:
  change: "Install"
  install:
    package:
      apiVersion: "pkg.crossplane.io/v1"
      kind: "Provider"
      metadata:
        name: "provider-aws-s3"
      spec:
        package: "xpkg.crossplane.io/crossplane-contrib/provider-aws-s3:v1.0.0"

status:
  failures: 0
  transactionNumber: 42
  
  proposedLockPackages:
    - name: "provider-gcp-compute-existing123"  # Unchanged existing package
      source: "xpkg.crossplane.io/crossplane-contrib/provider-gcp-compute"
      version: "v2.0.0@sha256:existing123"
      dependencies: []
    - name: "provider-family-aws-abc123def456"  # Auto-discovered dependency
      source: "xpkg.crossplane.io/crossplane-contrib/provider-family-aws"
      version: "v1.1.0@sha256:abc123def456"
      dependencies: []
    - name: "provider-aws-s3-def456abc123"      # Requested package
      source: "xpkg.crossplane.io/crossplane-contrib/provider-aws-s3"
      version: "v1.0.0@sha256:def456abc123"
      dependencies: [...]

  conditions:
  - type: Validated
    status: "True"
    reason: "AllValidationsPassed"
    message: "Pre-flight validation completed successfully"
  - type: Installed
    status: "True"
    reason: "InstallationComplete"
    message: "Package installation completed successfully"
  - type: Ready
    status: "True"
    reason: "TransactionComplete"
    message: "Transaction completed successfully"
  
  startTime: "2024-01-15T10:34:00Z"
  completionTime: "2024-01-15T10:34:45Z"

Validation Failure - Missing Dependency

status:
  conditions:
  - type: Validated
    status: "False"
    reason: "MissingDependency"
    message: "Package provider-aws-s3 requires provider-family-aws >=v1.1.0, but no compatible version found"
    lastTransitionTime: "2024-01-15T10:30:00Z"
  - type: Installed
    status: "Unknown"
    reason: "InstallationPending"
    message: "Installation pending validation completion"
    lastTransitionTime: "2024-01-15T10:30:00Z"
  - type: Ready
    status: "False"
    reason: "ValidationFailed"
    message: "Transaction validation failed"
    lastTransitionTime: "2024-01-15T10:30:00Z"

Validation Failure - CRD Conflict

status:
  conditions:
  - type: Validated
    status: "False"
    reason: "CRDConflict"
    message: "Multiple packages install incompatible versions of CRD buckets.s3.aws.crossplane.io"
    lastTransitionTime: "2024-01-15T10:32:00Z"
  - type: Installed
    status: "Unknown"
    reason: "InstallationPending"
    message: "Installation pending validation completion"
    lastTransitionTime: "2024-01-15T10:32:00Z"
  - type: Ready
    status: "False"
    reason: "ValidationFailed"
    message: "Transaction validation failed"
    lastTransitionTime: "2024-01-15T10:32:00Z"

Installation In Progress

status:
  conditions:
  - type: Validated
    status: "True"
    reason: "AllValidationsPassed"
    message: "Pre-flight validation completed successfully"
    lastTransitionTime: "2024-01-15T10:33:00Z"
  - type: Installed
    status: "Unknown"
    reason: "InstallationInProgress"
    message: "Package installation in progress"
    lastTransitionTime: "2024-01-15T10:33:15Z"
  - type: Ready
    status: "Unknown"
    reason: "TransactionInProgress"
    message: "Transaction is currently executing"
    lastTransitionTime: "2024-01-15T10:33:15Z"

Installation Failure

status:
  conditions:
  - type: Validated
    status: "True"
    reason: "AllValidationsPassed"
    message: "Pre-flight validation completed successfully"
    lastTransitionTime: "2024-01-15T10:33:00Z"
  - type: Installed
    status: "False"
    reason: "CRDInstallationFailed"
    message: "Failed to install CRD buckets.s3.aws.crossplane.io"
    lastTransitionTime: "2024-01-15T10:33:30Z"
  - type: Ready
    status: "False"
    reason: "InstallationFailed"
    message: "Transaction installation failed"
    lastTransitionTime: "2024-01-15T10:33:30Z"

Complete Success

status:
  conditions:
  - type: Validated
    status: "True"
    reason: "AllValidationsPassed"
    message: "Pre-flight validation completed successfully"
    lastTransitionTime: "2024-01-15T10:34:00Z"
  - type: Installed
    status: "True"
    reason: "InstallationComplete"
    message: "Package installation completed successfully"
    lastTransitionTime: "2024-01-15T10:34:45Z"
  - type: Ready
    status: "True"
    reason: "TransactionComplete"
    message: "Transaction completed successfully"
    lastTransitionTime: "2024-01-15T10:34:45Z"
  startTime: "2024-01-15T10:34:00Z"
  completionTime: "2024-01-15T10:34:45Z"
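
The status examples above imply a simple rule for deriving Ready from Validated and Installed. The sketch below encodes that rule as read from the examples. The `Condition` type is a trimmed stand-in for the real condition type, and `ReadyFor` is a hypothetical helper.

```go
package main

import "fmt"

// Status is a condition status as it appears in the YAML examples.
type Status string

const (
	StatusTrue    Status = "True"
	StatusFalse   Status = "False"
	StatusUnknown Status = "Unknown"
)

// Condition is a trimmed stand-in holding only status and reason.
type Condition struct {
	Status Status
	Reason string
}

// String renders the condition the way the examples describe it.
func (c Condition) String() string {
	return fmt.Sprintf("Ready=%s (%s)", c.Status, c.Reason)
}

// ReadyFor is a hypothetical helper that derives the Ready condition from
// the Validated and Installed statuses, following the examples above.
func ReadyFor(validated, installed Status) Condition {
	switch {
	case validated == StatusFalse:
		return Condition{StatusFalse, "ValidationFailed"}
	case installed == StatusFalse:
		return Condition{StatusFalse, "InstallationFailed"}
	case validated == StatusTrue && installed == StatusTrue:
		return Condition{StatusTrue, "TransactionComplete"}
	default:
		// Validation or installation has not finished yet.
		return Condition{StatusUnknown, "TransactionInProgress"}
	}
}

func main() {
	fmt.Println(ReadyFor(StatusFalse, StatusUnknown))
	fmt.Println(ReadyFor(StatusTrue, StatusUnknown))
	fmt.Println(ReadyFor(StatusTrue, StatusFalse))
	fmt.Println(ReadyFor(StatusTrue, StatusTrue))
}
```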

Controller Architecture

Transaction Controller

Location: internal/controller/pkg-tx/transaction/

Responsibilities:

  • Validate complete package operations before execution (dependency resolution, version constraints)
  • Resolve package tags to specific OCI digests
  • Manage exclusive Lock access throughout entire operation
  • Create ProviderRevisions with resolved digests for all packages (main + dependencies)
  • Label created resources with Transaction reference for traceability
  • Install CRDs/MRDs directly (reusing Establisher interface)
  • Update Lock state with revision information (digests, not tags)
  • Create Provider resources for missing dependencies
  • Remove packages not in desired state (handle deletions with dependency validation)
  • Garbage collect completed transactions

Key Insight: Transaction controller creates the actual ProviderRevisions that get added to the Lock, ensuring atomic tag→digest resolution and Lock consistency. Resources are labeled for traceability rather than tracked via forward references.

Injected Dependencies:

type Reconciler struct {
    client     client.Client
    log        logging.Logger
    record     event.Recorder
    conditions conditions.Manager
    
    // Injected interfaces
    validator   TransactionValidator // Pre-flight validation
    lock        LockManager         // Exclusive Lock access
    establisher Establisher         // CRD/MRD installation (reused from revision controller)
    fetcher     xpkg.Fetcher        // Package fetching and tag→digest resolution
    parser      parser.Parser       // Package parsing
    revisioner  Revisioner          // ProviderRevision creation and management
}

Modified Package Controllers

Location: internal/controller/pkg-tx/manager/ (new transaction-aware versions)

Key Changes:

  • Create Transactions when Provider spec changes (tag updates, new packages, deletions)
  • Watch Transaction status to update Provider status
  • Do NOT create ProviderRevisions - Transaction controller handles this
  • Update Provider.Status.CurrentRevision based on Transaction results
  • Simplified logic focused on Transaction coordination and status reflection

Transaction Naming Strategy: Package controllers must ensure exactly one Transaction per spec change to prevent duplicates:

// Generate a deterministic Transaction name from the Package's identity and generation
input := fmt.Sprintf("%s/%s/%s/gen-%d", 
    pkg.APIVersion, pkg.Kind, pkg.Name, pkg.Generation)
hash := sha256.Sum256([]byte(input))
shortHash := hex.EncodeToString(hash[:4]) // 4 bytes = 8 hex characters

transactionName := fmt.Sprintf("tx-%s-%s", pkg.Name, shortHash)

// Check if Transaction already exists before creating
existing := &Transaction{}
err := client.Get(ctx, types.NamespacedName{Name: transactionName}, existing)
if err == nil {
    // Transaction exists, don't create duplicate
    return reconcile.Result{}, nil
}
// kerrors is k8s.io/apimachinery/pkg/api/errors. Only NotFound means the
// Transaction should be created; any other error is returned and retried.
if !kerrors.IsNotFound(err) {
    return reconcile.Result{}, err
}

Benefits:

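  • Collision-resistant: Hash includes GVK + name + generation for uniqueness across package types; because the hash is truncated to 4 bytes, collisions are very unlikely rather than impossible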
  • Idempotent: Multiple reconciles won't create duplicate Transactions
  • Deterministic: Same Package state always generates same Transaction name
  • Chronological ordering: Use metadata.creationTimestamp for temporal ordering

Dependency Interfaces

TransactionValidator

type TransactionValidator interface {
    // Validate performs comprehensive pre-flight validation using fail-fast approach.
    // Returns error with specific reason for condition status.
    Validate(ctx context.Context, t *v1alpha1.Transaction) error
}

Implementation: NewDependencyValidator() - reuses logic from existing dependency resolution

Validation Flow: Fail-fast sequential validation

  1. Dependency Resolution: Validate complete dependency graph, version constraints
  2. Schema Compatibility: Validate CRD/MRD compatibility and conflicts
  3. Replace Equivalence (for Replace operations): Validate that old and new dependency trees are functionally equivalent
  4. Future Validations: RBAC, resource quotas, network policies, etc.
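
The fail-fast flow above can be sketched as an ordered list of phases that stops at the first error. This is an illustration under assumptions: `Phase`, `RunFailFast`, and `ReasonOf` are hypothetical names, and the phase bodies are placeholders rather than real validation logic.

```go
package main

import (
	"errors"
	"fmt"
)

// ValidationError carries the condition reason and message for a failed phase.
type ValidationError struct {
	Reason  string
	Message string
}

func (e *ValidationError) Error() string { return e.Reason + ": " + e.Message }

// Phase is a hypothetical pairing of a phase name with its check.
type Phase struct {
	Name  string
	Check func() error
}

// RunFailFast runs phases in order and stops at the first failure.
// It returns the names of the phases that were executed.
func RunFailFast(phases []Phase) ([]string, error) {
	var ran []string
	for _, p := range phases {
		ran = append(ran, p.Name)
		if err := p.Check(); err != nil {
			return ran, fmt.Errorf("%s: %w", p.Name, err)
		}
	}
	return ran, nil
}

// ReasonOf extracts the condition reason from a wrapped ValidationError.
// It returns "" if err does not wrap one.
func ReasonOf(err error) string {
	var ve *ValidationError
	if errors.As(err, &ve) {
		return ve.Reason
	}
	return ""
}

func main() {
	phases := []Phase{
		{Name: "DependencyResolution", Check: func() error { return nil }},
		{Name: "SchemaCompatibility", Check: func() error {
			return &ValidationError{Reason: "CRDConflict", Message: "incompatible CRD versions"}
		}},
		// Not reached, because the previous phase fails.
		{Name: "ReplaceEquivalence", Check: func() error { return nil }},
	}
	ran, err := RunFailFast(phases)
	fmt.Println(ran, ReasonOf(err))
}
```

Wrapping with `%w` keeps the phase name in the message while letting the controller recover the reason for the Validated condition.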

Replace Operation Validation: For Replace operations, the validator performs additional equivalence checking:

  • Dependency Tree Analysis: Resolve both old (from current Lock + replace.source) and new (from replace.package) dependency trees
  • CRD/MRD Compatibility: Ensure new dependency tree provides same CRDs/MRDs as old tree, with compatible schemas
  • Resource Ownership: Validate that new providers can establish control over existing managed resources
  • Dependency Mapping: Automatically infer which old dependencies should be replaced by which new dependencies (e.g., upbound/provider-family-gcp → crossplane-contrib/provider-family-gcp)

This prevents Replace operations that would orphan existing managed resources or create unresolvable ownership conflicts.
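
One part of the equivalence check, that the new tree provides every CRD the old tree provides, reduces to a set difference. The sketch below shows only that part and does not compare schemas. The `Tree` type, the helper names, and the CRD names in `main` are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Tree is a hypothetical, simplified dependency tree: it maps a package
// source to the names of the CRDs that package provides.
type Tree map[string][]string

// crdSet flattens a Tree into the set of CRD names it provides.
func crdSet(t Tree) map[string]bool {
	set := map[string]bool{}
	for _, crds := range t {
		for _, crd := range crds {
			set[crd] = true
		}
	}
	return set
}

// MissingCRDs returns, sorted, the CRDs the old tree provides that the new
// tree does not.
func MissingCRDs(oldTree, newTree Tree) []string {
	have := crdSet(newTree)
	var missing []string
	for crd := range crdSet(oldTree) {
		if !have[crd] {
			missing = append(missing, crd)
		}
	}
	sort.Strings(missing)
	return missing
}

// CheckCoverage fails if replacing oldTree with newTree would leave CRDs
// without a provider. Schema compatibility is out of scope for this sketch.
func CheckCoverage(oldTree, newTree Tree) error {
	if missing := MissingCRDs(oldTree, newTree); len(missing) > 0 {
		return fmt.Errorf("replacement does not provide CRDs: %v", missing)
	}
	return nil
}

func main() {
	oldTree := Tree{"upbound/provider-gcp-compute": {"instances.example.org", "disks.example.org"}}
	newTree := Tree{"crossplane-contrib/provider-gcp-compute": {"instances.example.org"}}
	fmt.Println(CheckCoverage(oldTree, newTree))
}
```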

Error Handling: Each validation phase returns specific error types that map to condition reasons:

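// Example validation errors
type ValidationError struct {
    Reason  string // Maps to condition reason
    Message string // Maps to condition message
}

// Error implements the error interface so Validate can return a ValidationError.
func (e ValidationError) Error() string { return e.Reason + ": " + e.Message }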

// Dependency validation errors
var (
    ErrMissingDependency  = ValidationError{Reason: ReasonMissingDependency}
    ErrVersionConflict    = ValidationError{Reason: ReasonVersionConflict}
    ErrCircularDependency = ValidationError{Reason: ReasonCircularDependency}
)

// Schema validation errors  
var (
    ErrCRDConflict        = ValidationError{Reason: ReasonCRDConflict}
    ErrSchemaIncompatible = ValidationError{Reason: ReasonSchemaIncompatible}
)

LockManager

type LockManager interface {
    // AcquireLock attempts to gain exclusive access to the Lock for a Transaction.
    // Returns the current Lock packages (what's currently installed).
    // Returns ErrLockHeldByAnotherTransaction if lock is held by a different Transaction.
    // If the Transaction already holds the lock, returns the current Lock state.
    AcquireLock(ctx context.Context, t *v1alpha1.Transaction) ([]LockPackage, error)
    
    // ReleaseLock releases exclusive access and updates Lock state with new packages.
    // Only the Transaction that currently holds the lock can release it.
    ReleaseLock(ctx context.Context, t *v1alpha1.Transaction, packages []LockPackage) error
}

// Lock acquisition errors
var (
    ErrLockHeldByAnotherTransaction = errors.New("lock is held by another transaction")
    ErrTransactionDoesNotHoldLock   = errors.New("transaction does not hold the lock")
)

Implementation: NewAPILockManager() - manages Lock annotations using optimistic concurrency control

Lock Acquisition Flow:

  1. Check current holder: If pkg.crossplane.io/current-transaction annotation matches this Transaction, return current packages
  2. Check for conflicts: If held by another Transaction that still exists, return ErrLockHeldByAnotherTransaction
  3. Acquire lock: Set pkg.crossplane.io/current-transaction annotation to this Transaction using atomic update
  4. Handle races: If update fails due to conflict, return ErrLockHeldByAnotherTransaction

Controller Usage (representative example):

// Acquire lock and get current installed packages
currentPackages, err := r.lock.AcquireLock(ctx, transaction)
if err != nil {
    if errors.Is(err, ErrLockHeldByAnotherTransaction) {
        // Set waiting condition and requeue
        r.conditions.For(transaction).MarkConditions(/* waiting condition */)
        return reconcile.Result{RequeueAfter: 30 * time.Second}, nil
    }
    return reconcile.Result{}, err
}

// Validate using injected validator
if err := r.validator.Validate(ctx, transaction); err != nil {
    r.lock.ReleaseLock(ctx, transaction, currentPackages)
    r.conditions.For(transaction).MarkConditions(/* validation failed condition */)
    return reconcile.Result{}, err
}

// Mark validation success
r.conditions.For(transaction).MarkConditions(/* validation success condition */)

// Install packages using injected interfaces - process transaction.Spec.Change
// Based on transaction.Spec.Change, use appropriate spec (Install, Delete, Replace)
// Fetch package using r.fetcher
// Parse package using r.parser  
// Install CRDs using r.establisher
// Create PackageRevision using r.revisioner (sets Package as owner)
// Handle errors by releasing lock and setting conditions

// Build proposedLockPackages with complete new Lock state
transaction.Status.ProposedLockPackages = newPackages

// Success - release lock with new state
r.lock.ReleaseLock(ctx, transaction, newPackages)
r.conditions.For(transaction).MarkConditions(/* success conditions */)
return reconcile.Result{}, r.client.Status().Update(ctx, transaction)

Note: This is a representative example showing the key patterns. The actual implementation will handle specific error cases, condition details, and interface method signatures.

Revisioner Interface

type Revisioner interface {
    // Create creates a PackageRevision with the specified digest and sets the Package as owner.
    // The Package (Provider/Configuration/Function) becomes the owner of the PackageRevision.
    Create(ctx context.Context, pkg LockPackage, digest string, owner client.Object) (client.Object, error)
}

Implementation: NewTransactionRevisioner() - creates PackageRevisions with proper owner references

Key Behavior:

  • Sets Package as owner of PackageRevision (not Transaction)
  • Labels PackageRevision with Transaction reference for traceability
  • Handles Provider, Configuration, and Function revision creation uniformly

Package Configuration Resolution

Challenge: When creating PackageRevisions, the Transaction controller must resolve configuration from multiple sources with proper precedence:

  1. Package.Spec fields (e.g., packagePullSecrets, revisionActivationPolicy) - Apply only to the root package that triggered the Transaction
  2. ImageConfig resources - Apply to any package whose OCI source matches the ImageConfig's prefix rules
  3. System defaults - Fallback values

Key Insight: Root packages (user-requested) inherit Package.Spec configuration, while dependency packages only receive ImageConfig-based configuration.

Proposed Interface (sketch):

// Extends/wraps the existing xpkg.ConfigStore interface
type PackageConfigurationResolver interface {
    ResolveCompleteConfig(ctx context.Context, req PackageConfigRequest) (*ResolvedPackageConfig, error)
}

type PackageConfigRequest struct {
    Source      string        // OCI source to resolve config for
    RootPackage client.Object // Package resource (nil for dependencies)
}

Design Principles:

  • Package snapshotting - Transaction.Spec contains complete Package configuration for stable retry inputs, while ImageConfig resolution happens at execution time
  • Leverage existing logic - Reuse the current xpkg.ConfigStore interface that already handles ImageConfig matching and precedence
  • Clear precedence - ImageConfig settings override Package.Spec settings for overlapping fields
  • Dependency distinction - Root packages get Package.Spec + ImageConfig, dependencies get ImageConfig only

Transaction Controller Usage (sketch):

// For each package in the dependency tree
config, err := r.configResolver.ResolveCompleteConfig(ctx, PackageConfigRequest{
    Source:      lockPackage.Source,
    RootPackage: rootPackageIfApplicable, // nil for dependencies
})

// Use resolved config when creating PackageRevision
revision := buildPackageRevision(lockPackage, config)

This approach maintains the current Package and ImageConfig APIs while enabling the Transaction controller to properly resolve configuration for complete dependency trees.
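
The layering described in this section can be sketched with plain maps. This is a simplification under stated assumptions: `Settings` and `Resolve` are hypothetical, configuration is flattened to string fields, ImageConfig matching is assumed to pick the longest matching prefix, and ImageConfig is applied last so that it overrides Package.Spec on overlapping fields, as the design principles state.

```go
package main

import (
	"fmt"
	"strings"
)

// Settings is a hypothetical flat view of configuration fields.
type Settings map[string]string

// Resolve layers system defaults, then Package.Spec (root packages only),
// then the ImageConfig whose prefix is the longest match for source.
// Pass a nil rootSpec for dependency packages.
func Resolve(source string, rootSpec Settings, imageConfigs map[string]Settings, defaults Settings) Settings {
	out := Settings{}
	for k, v := range defaults {
		out[k] = v
	}
	for k, v := range rootSpec { // ranging over a nil map is a no-op
		out[k] = v
	}
	best, found := "", false
	for prefix := range imageConfigs {
		if strings.HasPrefix(source, prefix) && (!found || len(prefix) > len(best)) {
			best, found = prefix, true
		}
	}
	if found {
		for k, v := range imageConfigs[best] {
			out[k] = v
		}
	}
	return out
}

// Describe renders the resolved settings for one package source.
func Describe(source string, s Settings) string {
	return fmt.Sprintf("%s: %v", source, s)
}

func main() {
	defaults := Settings{"revisionActivationPolicy": "Automatic"}
	imageConfigs := map[string]Settings{
		"xpkg.crossplane.io/crossplane-contrib": {"packagePullSecrets": "registry-secret"},
	}
	rootSpec := Settings{"packagePullSecrets": "root-secret", "revisionActivationPolicy": "Manual"}

	root := "xpkg.crossplane.io/crossplane-contrib/provider-aws-s3"
	dep := "xpkg.crossplane.io/crossplane-contrib/provider-family-aws"
	fmt.Println(Describe(root, Resolve(root, rootSpec, imageConfigs, defaults)))
	fmt.Println(Describe(dep, Resolve(dep, nil, imageConfigs, defaults)))
}
```

The secret names and default values in `main` are illustrative only.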

Replace Operation Execution

Replace operations require coordinated dependency tree replacement to handle scenarios like migrating from upbound/provider-gcp-compute to crossplane-contrib/provider-gcp-compute, where both packages have different family provider dependencies that compete for the same CRDs.

Execution Flow:

  1. Dependency Tree Mapping: Automatically infer which old dependencies should be replaced by new ones based on naming patterns and dependency relationships
  2. Ownership Transfer Coordination:
    • Create new PackageRevisions for replacement packages
    • Set old PackageRevisions to manual activation policy
    • Deactivate old PackageRevisions only after new ones establish control
    • Verify new providers can manage existing managed resources
  3. Atomic Cleanup: Remove old packages only after successful ownership transfer
  4. Lock State Update: Update Lock with new dependency tree, removing old packages

Key Insight: Replace operations turn the complex manual 7-step provider migration process into a single atomic Transaction that validates equivalence upfront and coordinates the entire dependency tree replacement safely.
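
The spec does not define how dependency mapping is inferred beyond "naming patterns and dependency relationships". As one possible naming heuristic, the sketch below pairs old and new sources that share the same final path segment. `MapByRepoName` and `Summary` are hypothetical, and a real implementation would likely need more signals than the name alone.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// MapByRepoName pairs each old source with the new source that has the same
// final path segment (for example "provider-family-gcp"). Old sources with
// no counterpart are returned sorted as unmatched.
func MapByRepoName(oldSources, newSources []string) (map[string]string, []string) {
	byName := map[string]string{}
	for _, n := range newSources {
		byName[path.Base(n)] = n
	}
	mapping := map[string]string{}
	var unmatched []string
	for _, o := range oldSources {
		if n, ok := byName[path.Base(o)]; ok {
			mapping[o] = n
		} else {
			unmatched = append(unmatched, o)
		}
	}
	sort.Strings(unmatched)
	return mapping, unmatched
}

// Summary reports how many old sources were mapped and how many were not.
func Summary(mapping map[string]string, unmatched []string) string {
	return fmt.Sprintf("%d mapped, %d unmatched", len(mapping), len(unmatched))
}

func main() {
	oldSources := []string{"upbound/provider-gcp-compute", "upbound/provider-family-gcp"}
	newSources := []string{"crossplane-contrib/provider-gcp-compute", "crossplane-contrib/provider-family-gcp"}
	mapping, unmatched := MapByRepoName(oldSources, newSources)
	fmt.Println(Summary(mapping, unmatched))
	fmt.Println(mapping["upbound/provider-family-gcp"])
}
```

Any unmatched old source would be a reason to fail validation rather than guess, since a wrong pairing could orphan managed resources.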

Implementation Phases

Phase 1: Core Transaction Infrastructure

  1. API definitions - Transaction types, Lock extensions
  2. Transaction controller - Basic validation, CRD installation, Lock management
  3. Lock manager - Exclusive access coordination
  4. Dependency validator - Reuse existing dependency resolution logic
  5. Feature flag - Alpha feature flag for transaction mode

Phase 2: Package Controller Integration

  1. Transaction-aware package controllers - Create Transactions instead of direct operations
  2. Install transaction flow - Complete install coordination via Transactions
  3. Delete transaction flow - Coordinated deletion with dependency validation
  4. Provider → ProviderRevision coordination - Maintain ownership while using Transactions
  5. End-to-end testing - Verify transaction mode works for basic package operations

Phase 3: Advanced Features

  1. Transaction garbage collection - Implement retention policies based on the Lock's pkg.crossplane.io/transaction-history-limit annotation
  2. Manual transaction support - Enable user-created Transactions for complex operations
  3. Enhanced validation - Add CRD conflict detection, compatibility checks

Key Design Decisions

Transaction Creates ProviderRevisions

  • Transaction controller resolves Provider tags to specific OCI digests
  • Transaction controller creates ProviderRevisions with resolved digests
  • Lock contains revision-level information (digests), not package-level (tags)
  • Ensures atomic tag→digest resolution and Lock consistency
  • Transaction controller sets Package as owner of PackageRevision when creating it

Transaction Does the Heavy Lifting

  • Transaction controller handles validation, revision creation, CRD installation, and Lock management
  • Package controllers become "triggers" and "status reflectors"
  • ProviderRevisions focus solely on runtime management and health reporting
  • Clear separation: Transaction = coordination/installation, ProviderRevision = runtime

Exclusive Lock Access Throughout Operation

  • pkg.crossplane.io/current-transaction annotation prevents concurrent transactions
  • Transaction holds lock from validation through revision creation and CRD installation
  • Atomic compare-and-swap operations for lock acquisition
  • No gaps where validated state can become invalid

Transaction Sets Proper Owner References

  • Transaction controller creates PackageRevisions with no controller owner references
  • PackageRevision gets non-controller owner references to both Package (parent) and Transaction (creator)
  • Transaction handles all lifecycle management explicitly, not relying on Kubernetes GC
  • Package controller focuses on status reflection, not revision creation

Owner Reference Strategy for Audit Trail

  • Package → Transaction: Non-controller owner reference for relationship visibility (kubectl tree)
  • Transaction → Created Resources: Non-controller owner references on all resources Transaction creates
  • Latest Transaction Only: When multiple Transactions affect the same resource, only the latest appears in owner references
  • Complete History: Full audit trail preserved in Transaction resources themselves, queryable via labels
  • Benefits: Clean current state visualization while maintaining complete historical audit trail

Coordinated Deletion with Dependency Validation

  • Package removal handled via ChangeTypeDelete transactions specifying the package source
  • Transaction controller determines complete deletion impact (package revisions, unused dependencies)
  • Transaction controller validates no other packages depend on what's being deleted
  • Transaction controller explicitly deletes PackageRevisions, CRDs, and potentially unused dependencies rather than relying on Kubernetes garbage collection
  • Future: automatic cleanup of packages that were only dependencies of the deleted package
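
The "no other packages depend on what's being deleted" check can be sketched as a scan of the current Lock packages. The `LockPackage` type here is a simplified stand-in that lists dependencies as plain source strings, and `Dependents` and `ValidateDelete` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"sort"
)

// LockPackage is a simplified stand-in for the Lock's package entries.
// Dependencies holds the sources of the packages this one requires.
type LockPackage struct {
	Source       string
	Dependencies []string
}

// Dependents returns, sorted, the sources of all packages that directly
// depend on the given source.
func Dependents(pkgs []LockPackage, source string) []string {
	var out []string
	for _, p := range pkgs {
		for _, d := range p.Dependencies {
			if d == source {
				out = append(out, p.Source)
				break
			}
		}
	}
	sort.Strings(out)
	return out
}

// ValidateDelete rejects a deletion while any other package still depends
// on the package being deleted.
func ValidateDelete(pkgs []LockPackage, source string) error {
	if deps := Dependents(pkgs, source); len(deps) > 0 {
		return fmt.Errorf("cannot delete %s: required by %v", source, deps)
	}
	return nil
}

func main() {
	const (
		family = "xpkg.crossplane.io/crossplane-contrib/provider-family-aws"
		s3     = "xpkg.crossplane.io/crossplane-contrib/provider-aws-s3"
	)
	pkgs := []LockPackage{
		{Source: family},
		{Source: s3, Dependencies: []string{family}},
	}
	fmt.Println(ValidateDelete(pkgs, family))
	fmt.Println(ValidateDelete(pkgs, s3))
}
```

Checking direct dependents is enough to block an unsafe deletion; working out which dependencies become unused afterwards is the separate "future" cleanup step.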

Complete State Specification

  • Transaction.Spec contains complete operation specification (Install/Delete/Replace)
  • Transaction.Status.ProposedLockPackages represents the final desired Lock state after the operation
  • Enables comprehensive validation of operations including dependency resolution and deletion impact
  • PackageSnapshot provides stable inputs for retry scenarios
  • Lock annotations control retention of completed Transactions

Resource Labeling and Owner References for Traceability

  • Transaction controller sets both labels and non-controller owner references on created resources
  • Standard labels: pkg.crossplane.io/transaction: "transaction-name" for user queries
  • Owner references: PackageRevision → Package (parent) and PackageRevision → Transaction (creator)
  • Enables queries: kubectl get providers -l pkg.crossplane.io/transaction=tx-123
  • Enables relationship visualization: kubectl tree shows Package → PackageRevision relationships
  • Transaction status focuses on Transaction state, not resource inventory

Backward Compatibility

  • Dual controller architecture: Leave internal/controller/pkg/ untouched, implement new controllers under internal/controller/pkg-tx/
  • Feature flag control: Toggle between legacy and transaction controller sets
  • No API breaking changes: Only additions (Transaction API, Lock annotations)
  • Identical Package behavior: Both controller sets create identical PackageRevisions and manage identical Package resources

Testing Strategy

Unit Tests

  • Transaction validation logic
  • Lock manager exclusive access
  • Dependency resolution reuse

Integration Tests

  • End-to-end transaction workflows
  • Feature flag behavior
  • Package controller integration
  • Lock coordination under concurrent access

E2E Tests

  • Complex multi-package operations
  • Failure and retry scenarios
  • Migration from legacy to transaction mode
  • Performance with large dependency graphs

Migration Path

  1. Alpha release - Feature flag off by default, opt-in transaction mode

    • Implement controllers under internal/controller/pkg-tx/
    • Feature flag toggles which controller set runs
    • Legacy controllers remain as fallback
  2. Beta release - Feature flag on by default, legacy mode available

  3. GA release - Transaction mode only, remove legacy controllers

  4. Future - Remove feature flag and internal/controller/pkg/

This specification provides the foundation for implementing transaction-based package coordination while maintaining backward compatibility and following established Crossplane patterns.
