Migrating Data from SQL Server to PostgreSQL with Sling

Slinger avatar
Slinger
Cover for Migrating Data from SQL Server to PostgreSQL with Sling

Introduction

Migrating data from SQL Server to PostgreSQL is a common requirement in modern data architectures, but it often comes with significant challenges. Traditional approaches involve complex ETL processes, custom scripts, or expensive commercial tools. These methods can be time-consuming, error-prone, and require extensive knowledge of both database systems.

Enter Sling, an open-source data movement and transformation platform that simplifies the migration process. With its intuitive CLI and comprehensive platform features, Sling provides a streamlined approach to database migrations, handling the complexities of data type mappings, schema conversions, and performance optimizations automatically.

Understanding Sling

Sling is a modern data movement tool designed to make data migrations and replications as simple as possible. When it comes to SQL Server to PostgreSQL migrations, Sling offers several key advantages:

  • Simplified Connection Management: Easy setup of database connections with secure credential handling
  • Automatic Schema Conversion: Intelligent mapping of SQL Server data types to their PostgreSQL equivalents
  • Flexible Migration Options: Support for both simple table copies and complex data transformations
  • Performance Optimized: Built-in optimizations for efficient data transfer
  • Multiple Operation Modes: Support for full refresh, incremental, and snapshot-based migrations

Whether you’re migrating a single table or an entire database, Sling provides the tools and flexibility needed to handle your specific use case.

Installation

Getting started with Sling is straightforward. You can install it using various package managers or download it directly. Here are the installation options:

Install using Homebrew (macOS)

brew install slingdata-io/sling/sling

Install using curl (Linux)

curl -LO 'https://github.com/slingdata-io/sling-cli/releases/latest/download/sling_linux_amd64.tar.gz' \
  && tar xf sling_linux_amd64.tar.gz \
  && rm -f sling_linux_amd64.tar.gz \
  && chmod +x sling

Install using Scoop (Windows)

scoop bucket add sling https://github.com/slingdata-io/scoop-sling.git
scoop install sling

Install using Python pip

pip install sling

After installation, verify that Sling is properly installed by checking its version:

# Check Sling version
sling --version

For more detailed installation instructions and system requirements, visit the Sling CLI Getting Started Guide.

Setting Up Database Connections

Before we can start migrating data, we need to configure connections to both our source SQL Server and target PostgreSQL databases. Sling provides several ways to manage database connections, including environment variables and a dedicated configuration file.

SQL Server Connection

Let’s start by setting up the SQL Server connection. You can configure it using the sling conns set command or by adding it to your environment:

# Using sling conns set
sling conns set MSSQL type=sqlserver host=<host> user=<user> database=<database> password=<password> port=<port>

# Or using environment variable
export MSSQL='sqlserver://myuser:[email protected]:1433?database=mydatabase'

For SQL Server connections with additional options:

# SQL Server with instance name and encryption
export MSSQL='sqlserver://myuser:[email protected]:1433/my_instance?database=master&encrypt=true&TrustServerCertificate=true'

PostgreSQL Connection

Similarly, set up the PostgreSQL connection:

# Using sling conns set
sling conns set POSTGRES type=postgres host=<host> user=<user> database=<database> password=<password> port=<port>

# Or using environment variable
export POSTGRES='postgresql://myuser:[email protected]:5432/mydatabase'

Using Environment File

For a more permanent solution, you can create a ~/.sling/env.yaml file:

connections:
  MSSQL:
    type: sqlserver
    host: <host>
    user: <user>
    port: <port>
    instance: <instance>
    database: <database>
    schema: <schema>
    password: <password>
    encrypt: 'true'
    trust_server_certificate: 'true'

  POSTGRES:
    type: postgres
    host: <host>
    user: <user>
    port: <port>
    database: <database>
    schema: <schema>
    password: <password>

Testing Connections

After setting up your connections, it’s important to verify they work correctly:

# Test SQL Server connection
sling conns test MSSQL

# Test PostgreSQL connection
sling conns test POSTGRES

You can also list available tables in your connections:

# List tables in SQL Server
sling conns discover MSSQL

# List tables in PostgreSQL
sling conns discover POSTGRES

For more details about connection configuration, visit:

Basic Data Migration with CLI Flags

Once your connections are set up, you can start migrating data using Sling’s CLI flags. This approach is perfect for simple migrations or when you want to quickly transfer data between databases.

Simple Table Migration

Here’s a basic example of migrating a single table:

# Migrate a single table from SQL Server to PostgreSQL
sling run \
  --src-conn MSSQL \
  --src-stream "select * from sales.customers" \
  --tgt-conn POSTGRES \
  --tgt-object "public.customers"

Migration with Column Selection

You can specify which columns to include in the migration:

# Migrate specific columns
sling run \
  --src-conn MSSQL \
  --src-stream "select customer_id, first_name, last_name, email from sales.customers" \
  --tgt-conn POSTGRES \
  --tgt-object "public.customers"

Incremental Migration

For large tables, you might want to use incremental migration based on a timestamp or ID column:

# Incremental migration using last_updated column
sling run \
  --src-conn MSSQL \
  --src-stream "select * from sales.orders" \
  --tgt-conn POSTGRES \
  --tgt-object "public.orders" \
  --mode incremental \
  --update-key "last_updated"

Migration with Transformations

You can also apply simple transformations during migration:

# Migration with column renaming and type casting
sling run \
  --src-conn MSSQL \
  --src-stream "select 
      id as customer_id,
      UPPER(first_name) as first_name,
      UPPER(last_name) as last_name,
      CONVERT(VARCHAR(100), email) as email_address
    from sales.customers" \
  --tgt-conn POSTGRES \
  --tgt-object "public.customers_transformed"

For more information about CLI flags and options, visit the CLI Flags Overview.

Advanced Migration with Replication YAML

While CLI flags are great for simple migrations, Sling’s replication YAML configuration provides more control and flexibility for complex migration scenarios. This approach allows you to define multiple streams, set detailed options, and maintain your migration configuration in version control.

Basic Replication Configuration

Here’s a basic example of a replication YAML file (sqlserver_to_postgres.yaml):

# Basic replication configuration
source: MSSQL
target: POSTGRES

defaults:
  mode: full-refresh
  target_options:
    add_new_columns: true
    column_casing: snake

streams:
  sales.customers:
    object: public.customers
    primary_key: [customer_id]
    
  sales.orders:
    object: public.orders
    primary_key: [order_id]
    mode: incremental
    update_key: last_updated

To run this replication:

# Run the replication configuration
sling run -r sqlserver_to_postgres.yaml

Complex Replication Example

Here’s a more advanced configuration showing various features:

# Advanced replication configuration
source: MSSQL
target: POSTGRES

defaults:
  mode: full-refresh
  target_options:
    add_new_columns: true
    column_casing: snake
    table_keys:
      primary_key: [id]
      unique_key: [email]
    table_ddl: |
      CREATE TABLE IF NOT EXISTS {table} (
        {column_types},
        PRIMARY KEY(id)
      )

streams:
  # Custom SQL query with transformations
  "select 
      c.customer_id,
      c.first_name,
      c.last_name,
      c.email,
      COUNT(o.order_id) as total_orders,
      SUM(o.total_amount) as lifetime_value
    from sales.customers c
    left join sales.orders o on c.customer_id = o.customer_id
    group by c.customer_id, c.first_name, c.last_name, c.email":
    object: public.customer_summary
    columns:
      customer_id: bigint
      first_name: string(100)
      last_name: string(100)
      email: string(255)
      total_orders: int
      lifetime_value: decimal(15,2)
    primary_key: [customer_id]

  # Incremental load with runtime variables
  "select * from sales.orders where order_date >= {last_run_date}":
    object: public.orders
    mode: incremental
    update_key: order_date
    source_options:
      datetime_format: "YYYY-MM-DD HH:mm:ss"
    target_options:
      table_keys:
        primary_key: [order_id]

  # Multiple tables using wildcard
  "sales.*":
    object: "public.{stream_table}"
    single: false
    select: [-password, -credit_card_number]  # Exclude sensitive columns
    target_options:
      add_new_columns: true
      column_casing: snake

Key Features Explained

Runtime Variables

Sling supports various runtime variables that you can use in your configurations:

  • {stream_schema}: The name of the current table schema being processed
  • {stream_table}: The name of the current table being processed
  • {incremental_value}: The last value of the update key for incremental loads

For more information about runtime variables, visit the Runtime Variables Documentation.

Source Options

Source options allow you to control how data is read from SQL Server:

source_options:
  datetime_format: "YYYY-MM-DD HH:mm:ss"
  limit: 1000000  # Limit the number of rows
  empty_as_null: true

Target Options

Target options control how data is written to PostgreSQL:

target_options:
  add_new_columns: true  # Automatically add new columns
  column_casing: snake  # Convert column names to snake_case
  table_ddl: |  # Custom table creation SQL
    CREATE TABLE IF NOT EXISTS {table} (
      {column_types},
      PRIMARY KEY(id)
    )
  table_keys:  # Define table constraints
    primary_key: [id]
    unique_key: [email]
    foreign_key:
      - columns: [customer_id]
        references: customers(id)

For more detailed information about replication configuration, visit:

Sling Platform Overview

While the CLI is powerful for local development and simple migrations, the Sling Platform provides a comprehensive web interface for managing and monitoring your data migrations at scale. Let’s explore its key components and features.

Platform Components

The Sling Platform consists of several key components:

  • Web Interface: A modern UI for managing connections, creating replications, and monitoring jobs
  • Agents: Distributed workers that execute data operations in your infrastructure
  • Job Scheduler: Built-in scheduling and orchestration capabilities
  • Monitoring Dashboard: Real-time visibility into job execution and performance

Connection Management

The platform provides a user-friendly interface for managing database connections:

Sling Platform Connections

You can easily:

  • Add and configure new connections
  • Test connection health
  • View and manage connection permissions
  • Browse available tables and schemas

Replication Editor

The platform includes a powerful editor for creating and managing replication configurations:

Sling Platform Editor

Key features include:

  • Visual replication builder
  • Syntax highlighting for YAML configurations
  • Real-time validation
  • Version control integration

Job Monitoring

Monitor your migration jobs in real-time:

Sling Platform Job Monitoring

The monitoring dashboard provides:

  • Real-time job status
  • Detailed execution logs
  • Performance metrics
  • Error reporting and troubleshooting

Agent Management

Manage your Sling agents through the platform:

Sling Platform Agent Management

Features include:

  • Agent deployment and configuration
  • Health monitoring
  • Resource utilization tracking
  • Security management

For more information about the Sling Platform, visit:

Additional Resources

For more detailed information and examples, check out these resources:

Join our community: