How Our ‘As-Code’ Approach Enabled a Smooth Migration of 450 Flows in Less Than 40 Working Days

Migrating workflow orchestration is a complex process, especially when dealing with hundreds of deployments. When Prefect announced that Prefect Cloud 1 accounts would be frozen, we faced the challenge of upgrading our client’s extensive infrastructure to Prefect 2 while ensuring zero downtime.
While the new version introduced several improvements, such as improved scheduling and enhanced cloud-native capabilities, the migration itself was quite a challenge. Still, we successfully transitioned all workflows seamlessly without interfering with data ingestion continuity through careful planning, automation, and leveraging Python’s flexibility. This case study explores our migration strategy’s key challenges, solutions, and impact.

The Challenge: Overcoming Key Migration Obstacles

While the shift from Prefect 1 to Prefect 2 introduced major improvements in architecture, performance, and scalability, it also required us to reevaluate several critical aspects:

  • Decentralized Architecture: Prefect 2 moved from a monolithic system to a distributed, cloud-native approach, requiring a rethink of deployment strategies.
  • Changes in API and SDK: The transition introduced a new flow and task structure, requiring code refactoring.
  • Scheduler and Execution Model Updates: Prefect 2 introduced new scheduling mechanisms that replaced traditional cron-based scheduling.
  • Infrastructure and Deployment Adjustments: The move to Prefect 2’s deployment-first model required restructuring how workflows were defined and executed.
  • Scaling and Monitoring Enhancements: Improved logging, error handling, and monitoring features required adaptation to new paradigms.

With over 450 flows to migrate, our top priority was to ensure a smooth transition while avoiding disruptions to existing workflows.

Our Solution: Making the Most of Our ‘As-Code’ Approach

Our migration approach leveraged Python’s flexibility and our “as code” framework to make the transition as seamless as possible. Since we had already structured all workflows in Python, adapting them to Prefect 2’s new architecture was significantly easier. Key strategies included:

  • Incremental Refactoring: Updating flow and task definitions using Python decorators (@flow, @task) while maintaining backward compatibility where possible.
  • Automated Deployment Adjustments: Using scripts to modify arguments, update deployments, and efficiently handle hundreds of Prefect flows.
  • Dual-Environment Setup: Running Prefect 1 and Prefect 2 in parallel to validate new workflows without disrupting existing ones.
  • Real-time Monitoring, Feedback Loops, and Error Handling: Actively monitoring both systems to track metrics, logs, and task state changes and detect issues early.
  • Zero-Downtime Migration Strategy: Phasing the transition so Prefect 1 and Prefect 2 ran side by side until full validation was completed.

Planning the Migration: Scoping, Strategy, and Risk Mitigation

Before we dove into the technical part of the process, we carefully planned each aspect of the migration to reduce risk and ensure a smooth transition. We laid a foundation for a structured, interruption-free rollout by:

  • Defining Scope and Dependencies: We identified the flows to be migrated, mapped out their dependencies, and assessed any components that could affect downstream systems.
  • Ensuring Data Consistency: In cases where only part of the architecture was impacted, we ensured data outputs remained compatible (using glue code where needed) to keep consuming systems stable.
  • Avoiding Conflicts: We adjusted Prefect 2 schedules to avoid conflicts with Prefect 1 workflows (e.g., preventing simultaneous API calls) and planned a controlled handoff to production scheduling after validation.
  • Setting Clear Milestones and Ownership: We broke the migration into testable phases, defined checkpoints to validate progress, and assigned clear responsibilities to each team involved.

Implementation: Seven Steps to Fully Migrated Flows

The migration process followed a structured approach:

Step 1: Refactoring Code: Adopting Prefect 2 Syntax and Logic

In Prefect 1, workflows had to be defined as directed acyclic graphs (DAGs), explicitly declaring dependencies between tasks. This requirement has been removed in Prefect 2. Flows are now defined using plain Python logic without needing to be structured as DAGs. In practice, this meant we transitioned from Prefect 1’s @prefect.task and @prefect.flow to Prefect 2’s @task and @flow decorators. We also ensured compatibility by testing each refactored flow in an isolated environment.

Here is an example of a code modification:

Before (Prefect 1):

from prefect import task, Flow

@task
def extract_data():
    return "some data"

@task
def transform_data(data):
    return data.upper()

# Explicit flow definition using context manager - DAG built through assignment

with Flow("ETL Flow") as flow:
    raw_data = extract_data()
    processed_data = transform_data(raw_data) 

After (Prefect 2):

from prefect import flow, task

@task
def extract_data():
    return "some data"

@task
def transform_data(data):
    return data.upper()

@flow  # Flow is now a decorated function
def etl_flow():
    data = extract_data()  # Automatic dependency tracking
    transformed = transform_data(data)

Step 2: Monitoring and Error Handling: Observability-First Mindset

We adopted Prefect 2’s integrated UI and observability features, enhancing visibility into task and flow states. We added detailed logging, fine-grained retry logic, and state handlers for robust error handling. Real-time feedback loops enabled us to catch issues early and react fast.

Step 3: Optimizing Performance: Using Dynamic Task Mapping

We wrote custom scripts to update flow parameters and deployment configurations across all the flows, significantly reducing manual effort and making the migration scalable. We also implemented dynamic task mapping, which allowed us to parallelize operations and optimize performance during execution.

Step 4: Adapting Infrastructure: Ensuring Compatibility with Prefect 2

We restructured our infrastructure to align with Prefect 2’s architecture, including setting up new work pools and queues for more efficient task distribution. Then, we ensured compatibility with our Kubernetes setup for agent deployment, maintaining complete control over execution environments.

Step 5: Defining Deployments: Building Around Prefect 2’ Model

We used Prefect’s new deployment-first model to configure flows, specifying storage blocks, work pools, and environment details. We kept business logic separate from deployment logic and ensured repositories pointed to the correct branches to improve maintainability.

Step 6: Automating at Scale: Removing Manual Work

To handle migration at scale, we created scripts to deploy flows based on specific projects or workflows, modify arguments to select the appropriate environment, and activate or deactivate deployments for individual projects.

Step 7: Testing and Validation: Ensuring a Smooth Rollout

To avoid disruption, we ran Prefect 1 and Prefect 2 flows in parallel and compared outputs for consistency. We also built in rollback options, allowing us to quickly revert to the previous version in case issues were found, and set real-time notifications to resolve any issues proactively.

The Result: How Our Approach Made the Migration Seamless

Our “as-code” approach played a crucial role in ensuring a smooth migration. Rather than focusing solely on post-migration performance, the real success lay in how our methodology prevented disruptions and minimized manual work throughout the transition:

  • Continuous Operations, No Downtime: Running Prefect 1 and Prefect 2 in parallel allowed us to transition without interrupting existing workflows.
  • Faster Transition with Automation: We avoided manual reconfiguration by leveraging Python scripts, significantly accelerating migration.
  • Minimal Impact on Daily Work: Because our workflows were already defined in code, changes could be made programmatically, ensuring developers could continue their tasks without delays.
  • Smooth Scaling and Deployment: The “as-code” approach made it easy to adapt to Prefect 2’s deployment model and infrastructure updates.
  • Better Monitoring and Debugging: Prefect 2’s improved logging and error handling seamlessly integrated into our workflows, allowing for faster issue resolution.

Conclusions

Migrating to Prefect 2 was a complex but necessary shift that improved workflow orchestration capabilities. Because we had adopted an “as-code” approach from the start, we could move quickly and efficiently with minimal disruption. Strategic planning, automation, and Python’s flexibility were key enablers of this seamless transition. For teams still on Prefect 1, adopting a structured migration strategy and leveraging Prefect 2’s new features can unlock better scalability, performance, and monitoring capabilities. As Prefect continues to evolve, staying adaptable and leveraging automation will remain key to managing workflow orchestration efficiently.