Blog

Welcome to The Scalable Way Blog, your source for insights, strategies, and best practices in data platform engineering, analytics, and data science.

Blog posts

Why Data Teams Struggle Without Separate Dev and Prod Environments

Data Engineering, Dev vs Prod, Data Infrastructure, CI/CD

When development and production share the same data environment, even small changes can trigger costly outages. This article explains why separating dev and prod is foundational for reliable analytics, and how teams can do it without overengineering or blowing the budget.

Data Platform Cost Optimization: Practical Strategies for Query Performance, Storage, and Cloud Resource Management

Data Platform Optimization, Cloud Cost Management, Query Performance, Data Engineering

Learn how to reduce data platform costs without sacrificing performance. This guide breaks down actionable techniques across query tuning, incremental data loading, cloud resource management, and storage lifecycle design.

SAP Data Ingestion with Python: A Technical Breakdown of Using the SAP RFC Protocol

SAP, Python, Data Integration, RFC Protocol, Data Engineering

Streamline SAP data integration with Python by leveraging the RFC protocol. This interview with the lead engineer of a new SAP RFC Connector explores the challenges of large-scale data extraction and explains how a C++ integration improves stability, speed, and reliability for modern data workflows.

CI/CD for Data Workflows: Automating Prefect Deployments with GitHub Actions

prefect, prefect worker, github actions, CI/CD, data workflows, data platform architecture, productized data platform

The final part of the Data Platform Infrastructure on GCP series covers CI/CD for Prefect deployments using GitHub Actions and Docker. Automate flow builds and worker updates, and streamline orchestration across environments.

Scaling Secure Data Access: A Systematic RBAC Approach Using Entra ID

Data Governance, Access Management, RBAC, Entra ID, Security Architecture

Establish scalable, secure access controls for your data platform with a systematic RBAC strategy built on Microsoft Entra ID. This article outlines a five-phase implementation, from user persona mapping to automated auditing, designed to balance flexibility, compliance, and operational efficiency.

Getting to Your First Flow Run: Prefect Worker & Deployment Setup

prefect, data platform architecture, workflow orchestration, Kubernetes, productized data platform

Run your first data ingestion workflow with Prefect, Docker, and Kubernetes. This guide walks through containerized flow execution, Prefect worker deployment, and clean deployment configs, laying the foundation for a scalable, maintainable orchestration layer.