Moving data between different database systems can be a complex and time-consuming task, especially when dealing with cloud-native databases like Cloudflare D1. Traditional approaches often require writing custom scripts, handling data type conversions, and managing the entire ETL (Extract, Transform, Load) process manually. This complexity increases when you need to maintain data consistency and handle schema changes between source and target databases.
Enter Sling - a modern data movement platform that simplifies the process of transferring data between different database systems. In this guide, we’ll walk through the process of moving data from Cloudflare D1 to PostgreSQL using Sling, demonstrating how to set up connections, configure replications, and manage the entire data transfer process efficiently.
Whether you’re migrating a complete database or setting up continuous synchronization between D1 and PostgreSQL, Sling provides the tools and features you need to accomplish your data movement goals with minimal effort. Let’s dive in and see how Sling makes this process straightforward and reliable.
Prerequisites and Installation
Before we begin transferring data from D1 to PostgreSQL, let’s ensure you have everything needed to follow along with this guide.
Installing Sling
Sling offers multiple installation methods to suit your environment:
# Install using Homebrew (macOS)
brew install slingdata-io/sling/sling
# Install using curl (Linux)
curl -LO 'https://github.com/slingdata-io/sling-cli/releases/latest/download/sling_linux_amd64.tar.gz' \
&& tar xf sling_linux_amd64.tar.gz \
&& rm -f sling_linux_amd64.tar.gz \
&& chmod +x sling
# Install using Scoop (Windows)
scoop bucket add sling https://github.com/slingdata-io/scoop-sling.git
scoop install sling
# Install using Python pip
pip install sling
After installation, verify that Sling is working correctly:
# Check Sling version
sling --version
For more detailed installation instructions, visit the Sling CLI Getting Started Guide.
Understanding the Connections
Before we start moving data, it’s important to understand the connection requirements for both Cloudflare D1 and PostgreSQL. Let’s look at what you’ll need for each database system.
D1 Connection Requirements
To connect to Cloudflare D1, you’ll need:
- Cloudflare Account ID
- API Token with appropriate permissions
- D1 Database name
PostgreSQL Connection Requirements
For PostgreSQL, you’ll need:
- Host address
- Port number (default is 5432)
- Database name
- Username
- Password
- SSL mode (if required)
Both connections can be configured using environment variables, the Sling CLI, or a YAML configuration file. Let’s explore each method in detail.
Setting Up Connections
Let’s configure both our source (D1) and target (PostgreSQL) connections. Sling provides multiple ways to manage these connections securely.
Setting Up D1 Connection
You can set up your D1 connection using any of these methods:
Using Environment Variables
The simplest way is to use environment variables:
# Set D1 connection using environment variable
export D1_SOURCE='d1://account_id:api_token@database_name'
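When a connection is defined through an environment variable, the variable name itself acts as the connection name, so you can reference it directly in other Sling commands, for example:
# The environment variable name doubles as the connection name
sling conns test D1_SOURCE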
Using the Sling CLI
Alternatively, use the sling conns set command:
# Set up D1 connection with individual parameters
sling conns set d1_source type=d1 account_id=your_account_id api_token=your_api_token database=your_database_name
# Or use a connection URL
sling conns set d1_source url="d1://account_id:api_token@database_name"
Using the Sling Environment File
You can also add the connection details to your ~/.sling/env.yaml file:
connections:
  d1_source:
    type: d1
    account_id: your_account_id
    api_token: your_api_token
    database: your_database_name
Setting Up PostgreSQL Connection
Similarly, let’s set up the PostgreSQL connection:
Using Environment Variables
# Set PostgreSQL connection using environment variable
export PG_TARGET='postgresql://user:pass@host:5432/dbname?sslmode=require'
Using the Sling CLI
# Set up PostgreSQL connection with individual parameters
sling conns set pg_target type=postgres host=host.ip user=myuser database=mydatabase password=mypass port=5432
# Or use a connection URL
sling conns set pg_target url="postgresql://user:pass@host:5432/dbname?sslmode=require"
Using the Sling Environment File
Add to your ~/.sling/env.yaml file:
connections:
  pg_target:
    type: postgres
    host: host.ip
    user: myuser
    password: mypass
    port: 5432
    database: mydatabase
    sslmode: require
    schema: public
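Since both connections can live in the same environment file, a complete ~/.sling/env.yaml combining the two entries would look like this:
connections:
  d1_source:
    type: d1
    account_id: your_account_id
    api_token: your_api_token
    database: your_database_name

  pg_target:
    type: postgres
    host: host.ip
    user: myuser
    password: mypass
    port: 5432
    database: mydatabase
    sslmode: require
    schema: public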
Testing Your Connections
After setting up both connections, it’s important to verify they’re working correctly:
# Test D1 connection
sling conns test d1_source
# Test PostgreSQL connection
sling conns test pg_target
# List all configured connections
sling conns list
# Discover available tables in D1
sling conns discover d1_source
For more details about connection configuration, see the Sling connections documentation.
Creating Data Replications
Once your connections are set up, you can start configuring your data replications. Sling provides two main approaches for this: using CLI flags or YAML configuration files. Let’s explore both methods.
Using CLI Flags
The CLI approach is great for quick, one-off transfers or when you’re testing your setup. Here are some examples:
Basic Replication
# Simple table transfer with default options
sling run --src-conn d1_source --src-stream users --tgt-conn pg_target --tgt-object public.users
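You can also pass a custom SQL query as the source stream instead of a table name. The query below is only illustrative; adjust the columns and filter to your own schema:
# Use an ad-hoc SQL query as the source stream (table and columns are examples)
sling run --src-conn d1_source --src-stream 'select id, name, email from users where active = 1' --tgt-conn pg_target --tgt-object public.active_users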
Advanced Replication with Options
# Transfer with specific options for both source and target
sling run \
  --src-conn d1_source \
  --tgt-conn pg_target \
  --src-stream users \
  --tgt-object public.users \
  --mode incremental \
  --primary-key id \
  --update-key created_at \
  --select "id, name, email, created_at" \
  --tgt-options '{ "column_casing": "snake", "add_new_columns": true }'
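If you need to rebuild the target table from scratch rather than loading only new or changed records, the same command can be run in full-refresh mode (a minimal sketch, reusing the connections defined above):
# Drop and recreate the target table with a full copy of the source data
sling run \
  --src-conn d1_source \
  --tgt-conn pg_target \
  --src-stream users \
  --tgt-object public.users \
  --mode full-refresh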
For more details about CLI flags, visit the CLI Flags Documentation.
Using YAML Configuration
For more complex scenarios or production environments, using a YAML configuration file is recommended. This approach provides better version control and reusability.
Basic YAML Configuration
Create a file named d1_to_postgres.yaml:
# Define source and target connections
source: d1_source
target: pg_target

# Default settings for all streams
defaults:
  mode: incremental
  target_options:
    # Automatically add new columns if they appear in the source
    add_new_columns: true
    # Convert column names to snake_case
    column_casing: snake

# Define the tables to replicate
streams:
  # Single table replication
  users:
    # Use a runtime variable for the target table name
    object: public.{stream_table}
    # Specify the primary key for incremental updates
    primary_key: [id]
    # Track updates using a timestamp column
    update_key: updated_at
Advanced YAML Configuration
Here’s a more complex example that shows additional features:
source: d1_source
target: pg_target

defaults:
  mode: incremental
  target_options:
    add_new_columns: true
    column_casing: snake

streams:
  # Multiple table replications
  'users':
    object: public.{stream_table}
    primary_key: [id]
    update_key: updated_at
    # Select specific columns with a custom query
    sql: |
      select
        id,
        first_name,
        last_name,
        email,
        created_at,
        updated_at
      from users
      where deleted_at is null

  'orders':
    object: public.{stream_table}
    primary_key: [order_id]
    update_key: modified_at
    # Define table keys for proper indexing
    target_options:
      add_new_columns: true
      column_casing: snake
      table_keys:
        index: [user_id, status]
    # Add source options
    source_options:
      batch_size: 1000

  'order_items':
    object: public.{stream_table}
    primary_key: [order_id, item_id]
    update_key: updated_at
    # Cast columns to specific target data types
    columns:
      order_id: bigint
      item_id: integer
      quantity: integer
      price: decimal(10,2)
To run a YAML-based replication:
# Run the replication using the configuration file
sling run -r d1_to_postgres.yaml
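During development it can also help to run only a subset of the configured streams, or to enable debug output for troubleshooting. The stream names below match the example configuration:
# Run only specific streams from the configuration file
sling run -r d1_to_postgres.yaml --streams users,orders

# Run with debug logging for troubleshooting
sling run -r d1_to_postgres.yaml -d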
For more details about replication configuration, see the Sling replication documentation.
Next Steps
Now that you have your D1 to PostgreSQL data transfer set up, here are some ways to take your Sling usage to the next level:
Additional Resources
- Sling Documentation - Comprehensive guides and references
- Example Configurations - More replication examples
- CLI Reference - Detailed CLI command documentation
Community and Support
Join the Sling community to get help and share experiences.
Start small with simple replications and gradually expand your usage as you become more comfortable. Sling’s flexibility means it can grow with your needs, from simple database syncs to complex data pipelines.