Transfer Data from Cloudflare D1 to PostgreSQL with Sling

Slinger avatar
Slinger
Cover for Transfer Data from Cloudflare D1 to PostgreSQL with Sling

Moving data between different database systems can be a complex and time-consuming task, especially when dealing with cloud-native databases like Cloudflare D1. Traditional approaches often require writing custom scripts, handling data type conversions, and managing the entire ETL (Extract, Transform, Load) process manually. This complexity increases when you need to maintain data consistency and handle schema changes between source and target databases.

Enter Sling - a modern data movement platform that simplifies the process of transferring data between different database systems. In this guide, we’ll walk through the process of moving data from Cloudflare D1 to PostgreSQL using Sling, demonstrating how to set up connections, configure replications, and manage the entire data transfer process efficiently.

Whether you’re migrating a complete database or setting up continuous synchronization between D1 and PostgreSQL, Sling provides the tools and features you need to accomplish your data movement goals with minimal effort. Let’s dive in and see how Sling makes this process straightforward and reliable.

Prerequisites and Installation

Before we begin transferring data from D1 to PostgreSQL, let’s ensure you have everything needed to follow along with this guide.

Installing Sling

Sling offers multiple installation methods to suit your environment:

# Install using Homebrew (macOS)
brew install slingdata-io/sling/sling

# Install using curl (Linux)
curl -LO 'https://github.com/slingdata-io/sling-cli/releases/latest/download/sling_linux_amd64.tar.gz' \
  && tar xf sling_linux_amd64.tar.gz \
  && rm -f sling_linux_amd64.tar.gz \
  && chmod +x sling

# Install using Scoop (Windows)
scoop bucket add sling https://github.com/slingdata-io/scoop-sling.git
scoop install sling

# Install using Python pip
pip install sling

After installation, verify that Sling is properly installed:

# Check Sling version
sling --version

For more detailed installation instructions, visit the Sling CLI Getting Started Guide.

Understanding the Connections

Before we start moving data, it’s important to understand the connection requirements for both Cloudflare D1 and PostgreSQL. Let’s look at what you’ll need for each database system.

D1 Connection Requirements

To connect to Cloudflare D1, you’ll need:

  • Cloudflare Account ID
  • API Token with appropriate permissions
  • D1 Database name

PostgreSQL Connection Requirements

For PostgreSQL, you’ll need:

  • Host address
  • Port number (default is 5432)
  • Database name
  • Username
  • Password
  • SSL mode (if required)

Both connections can be configured using environment variables, the Sling CLI, or a YAML configuration file. Let’s explore each method in detail.

Setting Up Connections

Let’s configure both our source (D1) and target (PostgreSQL) connections. Sling provides multiple ways to manage these connections securely.

Setting Up D1 Connection

You can set up your D1 connection using any of these methods:

Using Environment Variables

The simplest way is to use environment variables:

# Set D1 connection using environment variable
export D1_SOURCE='d1://account_id:api_token@database_name'

Using the Sling CLI

Alternatively, use the sling conns set command:

# Set up D1 connection with individual parameters
sling conns set d1_source type=d1 account_id=your_account_id api_token=your_api_token database=your_database_name

# Or use a connection URL
sling conns set d1_source url="d1://account_id:api_token@database_name"

Using the Sling Environment File

You can also add the connection details to your ~/.sling/env.yaml file:

connections:
  d1_source:
    type: d1
    account_id: your_account_id
    api_token: your_api_token
    database: your_database_name

Setting Up PostgreSQL Connection

Similarly, let’s set up the PostgreSQL connection:

Using Environment Variables

# Set PostgreSQL connection using environment variable
export PG_TARGET='postgresql://user:pass@host:5432/dbname?sslmode=require'

Using the Sling CLI

# Set up PostgreSQL connection with individual parameters
sling conns set pg_target type=postgres host=host.ip user=myuser database=mydatabase password=mypass port=5432

# Or use a connection URL
sling conns set pg_target url="postgresql://user:pass@host:5432/dbname?sslmode=require"

Using the Sling Environment File

Add to your ~/.sling/env.yaml:

connections:
  pg_target:
    type: postgres
    host: host.ip
    user: myuser
    password: mypass
    port: 5432
    database: mydatabase
    sslmode: require
    schema: public

Testing Your Connections

After setting up both connections, it’s important to verify they’re working correctly:

# Test D1 connection
sling conns test d1_source

# Test PostgreSQL connection
sling conns test pg_target

# List all configured connections
sling conns list

# Discover available tables in D1
sling conns discover d1_source

For more details about connection configuration, visit:

Creating Data Replications

Once your connections are set up, you can start configuring your data replications. Sling provides two main approaches for this: using CLI flags or YAML configuration files. Let’s explore both methods.

Using CLI Flags

The CLI approach is great for quick, one-off transfers or when you’re testing your setup. Here are some examples:

Basic Replication

# Simple table transfer with default options
sling run --source d1_source --target pg_target --stream users

Advanced Replication with Options

# Transfer with specific options for both source and target
sling run \
  --src-conn d1_source \
  --tgt-conn pg_target \
  --src-stream users \
  --mode incremental \
  --select "id, name, email, created_at" \
  --tgt-options '{ "column_casing": "snake", "add_new_columns": true }'

For more details about CLI flags, visit the CLI Flags Documentation.

Using YAML Configuration

For more complex scenarios or production environments, using a YAML configuration file is recommended. This approach provides better version control and reusability.

Basic YAML Configuration

Create a file named d1_to_postgres.yaml:

# Define source and target connections
source: d1_source
target: pg_target

# Default settings for all streams
defaults:
  mode: incremental
  target_options:
    # Automatically add new columns if they appear in source
    add_new_columns: true
    # Convert column names to snake_case
    column_casing: snake

# Define the tables to replicate
streams:
  # Single table replication
  users:
    # Use runtime variable for target table name
    object: public.{stream_table}
    # Specify primary key for incremental updates
    primary_key: [id]
    # Track updates using timestamp column
    update_key: updated_at

Advanced YAML Configuration

Here’s a more complex example that shows additional features:

source: d1_source
target: pg_target

defaults:
  mode: incremental
  target_options:
    add_new_columns: true
    column_casing: snake

streams:
  # Multiple table replications
  'users':
    object: public.{stream_table}
    primary_key: [id]
    update_key: updated_at
    # Select specific columns
    sql: |
      select
        id, 
        first_name, 
        last_name, 
        email,
        created_at,
        updated_at
      from users
      deleted_at IS NULL
    
  'orders':
    object: public.{stream_table}
    primary_key: [order_id]
    update_key: modified_at
    # Define table keys for proper indexing
    target_options:
      add_new_columns: true
      column_casing: snake
      table_keys:
        index: [user_id, status]
    
    # Add source options
    source_options:
      batch_size: 1000
    
  'order_items':
    object: public.{stream_table}
    primary_key: [order_id, item_id]
    update_key: updated_at
    # Define column mappings
    columns:
      order_id: bigint
      item_id: integer
      quantity: integer
      price: decimal(10,2)

To run a YAML-based replication:

# Run the replication using the configuration file
sling run -r d1_to_postgres.yaml

For more details about replication configuration, visit:

Next Steps

Now that you have your D1 to PostgreSQL data transfer set up, here are some ways to take your Sling usage to the next level:

Additional Resources

Community and Support

Join the Sling community to get help and share experiences:

Start small with simple replications and gradually expand your usage as you become more comfortable. Sling’s flexibility means it can grow with your needs, from simple database syncs to complex data pipelines.