
Introduction
Moving data between Snowflake accounts or databases is a common requirement in modern data architectures. Whether you’re migrating data between environments, sharing data with partners, or setting up data replication pipelines, the process traditionally involves complex ETL workflows, custom scripts, and significant engineering effort.
Traditional approaches to Snowflake-to-Snowflake data transfer often require:
- Writing custom Python scripts using Snowflake Connector
- Managing authentication and permissions across accounts
- Implementing error handling and retry mechanisms
- Setting up staging areas for data transfer
- Monitoring and maintaining the data pipeline
- Dealing with schema changes and data type mappings
Organizations frequently struggle with these challenges, ending up with brittle solutions that require constant maintenance.
Enter Sling: a modern data movement tool that simplifies this entire process. With its intuitive CLI and powerful platform features, Sling transforms what would typically be a complex undertaking into a straightforward configuration-based approach. In this comprehensive guide, we’ll explore how to use Sling to efficiently move data between Snowflake accounts, covering both simple and advanced scenarios.
Understanding Snowflake-to-Snowflake Data Transfer
Moving data between Snowflake accounts traditionally involves several complex steps and considerations. Let’s examine why this process can be challenging and how Sling addresses these challenges.
Traditional Methods and Their Complexities
The conventional approach to Snowflake data transfer often involves:
Manual Export Process
- Writing SQL queries to extract data
- Managing large result sets
- Handling data type conversions
- Setting up intermediate storage
Custom Scripts Development
- Creating Python/Java scripts using Snowflake connectors
- Implementing error handling and retries
- Managing dependencies and versions
- Maintaining documentation
Resource Management
- Configuring warehouse sizes
- Optimizing query performance
- Managing costs across accounts
- Monitoring system resources
Security Considerations
- Managing credentials securely
- Setting up appropriate roles and permissions
- Ensuring data encryption in transit
- Complying with data governance policies
How Sling Simplifies the Process
Sling addresses these challenges by providing:
- Unified Configuration: Simple YAML-based setup for both source and target
- Automated Handling: Built-in support for data types, schema changes, and error recovery
- Efficient Transfer: Direct data movement without intermediate storage
- Security Integration: Secure credential management and role-based access
- Monitoring: Real-time progress tracking and error reporting
- Scalability: Automatic handling of large datasets and parallel processing
Getting Started with Sling
Before we dive into data transfer configurations, let’s get Sling installed and set up on your system. Sling provides multiple installation methods to suit different operating systems and preferences.
Installation
```bash
# Install using Homebrew (macOS)
brew install slingdata-io/sling/sling

# Install using curl (Linux)
curl -LO 'https://github.com/slingdata-io/sling-cli/releases/latest/download/sling_linux_amd64.tar.gz' \
  && tar xf sling_linux_amd64.tar.gz \
  && rm -f sling_linux_amd64.tar.gz \
  && chmod +x sling

# Install using Scoop (Windows)
scoop bucket add sling https://github.com/slingdata-io/scoop-sling.git
scoop install sling

# Install using Python pip
pip install sling
```
After installation, verify that Sling is properly installed:
```bash
# Check Sling version
sling --version
```
For more detailed installation instructions, visit the Sling CLI Getting Started Guide.
Setting Up Snowflake Connections
Before we can start transferring data, we need to configure connections to both our source and target Snowflake accounts. Sling provides multiple ways to manage these connections securely.
Connection Setup Methods
Using Environment Variables
The simplest way to set up Snowflake connections is through environment variables:
```bash
# Source Snowflake connection
export SNOWFLAKE_SOURCE='snowflake://user:password@account.region/database/schema?warehouse=compute_wh'

# Target Snowflake connection
export SNOWFLAKE_TARGET='snowflake://user:password@account.region/database/schema?warehouse=compute_wh'
```
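One gotcha with URL-style connection strings: passwords containing URL-reserved characters (`@`, `/`, `:`) must be percent-encoded, or the URL will parse incorrectly. A small Python helper illustrates the idea (the function name and values are illustrative, not part of Sling):

```python
from urllib.parse import quote

def snowflake_url(user, password, account, database, schema, warehouse):
    """Build a Snowflake connection URL, percent-encoding the password
    so characters like '@' or '/' don't break the URL structure."""
    return (
        f"snowflake://{user}:{quote(password, safe='')}@{account}"
        f"/{database}/{schema}?warehouse={warehouse}"
    )

# A password containing '@' and '/' is safely encoded:
print(snowflake_url("myuser", "p@ss/word", "myaccount.region",
                    "mydatabase", "myschema", "compute_wh"))
```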
Using the Sling CLI
For a more secure and maintainable approach, use Sling’s connection management commands:
# Set up source Snowflake connection
```bash
# Set up source Snowflake connection
sling conns set snowflake_source type=snowflake \
  account=myaccount.region \
  user=myuser \
  password=mypassword \
  database=mydatabase \
  schema=myschema \
  warehouse=compute_wh

# Set up target Snowflake connection
sling conns set snowflake_target type=snowflake \
  account=myaccount.region \
  user=myuser \
  password=mypassword \
  database=mydatabase \
  schema=myschema \
  warehouse=compute_wh
```
Using YAML Configuration
For a more permanent setup, you can define your connections in the `~/.sling/env.yaml` file:
```yaml
# Environment configuration file
connections:
  snowflake_source:
    type: snowflake
    account: myaccount.region
    user: myuser
    password: mypassword
    database: mydatabase
    schema: myschema
    warehouse: compute_wh
    role: myrole # optional

  snowflake_target:
    type: snowflake
    account: myaccount.region
    user: myuser
    password: mypassword
    database: mydatabase
    schema: myschema
    warehouse: compute_wh
    role: myrole # optional
```
For more details on connection configuration options, visit the Snowflake Connection Documentation.
Verifying Connections
After setting up your connections, it’s important to verify that they work correctly:
```bash
# List all configured connections
sling conns list

# Test source connection
sling conns test snowflake_source

# Test target connection
sling conns test snowflake_target
```
Data Transfer Using CLI Flags
The quickest way to start transferring data between Snowflake accounts is using Sling’s CLI flags. This approach is perfect for simple transfers or when you’re just getting started.
Basic Transfer Example
Here’s a simple example that transfers a single table from one Snowflake account to another:
```bash
# Transfer a single table using CLI flags
sling run \
  --src-conn snowflake_source \
  --tgt-conn snowflake_target \
  --src-stream "sales.orders" \
  --tgt-object "sales.orders_copy" \
  --tgt-options '{ "add_new_columns": true }'
```
Advanced Transfer Example
For more complex scenarios, you can use additional CLI flags to customize the transfer:
```bash
# Transfer the result of a custom SQL query with column selection
sling run \
  --src-conn snowflake_source \
  --tgt-conn snowflake_target \
  --src-stream "select o.*, c.customer_name from sales.orders o join sales.customers c on o.customer_id = c.customer_id" \
  --tgt-object "analytics.enriched_orders" \
  --select 'order_id,order_date,customer_id,customer_name,total_amount' \
  --tgt-options '{
    "add_new_columns": true,
    "column_casing": "snake"
  }'
```
For a complete overview of available CLI flags, visit the CLI Flags Documentation.
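The `column_casing: snake` option used above normalizes target column names: camel- or Pascal-case names become lowercase with underscores. A quick Python sketch of that mapping (an approximation for illustration, not Sling's exact implementation):

```python
import re

def to_snake_case(name: str) -> str:
    """Approximate snake-casing: insert underscores at case
    boundaries, then lowercase everything."""
    s = re.sub(r'(.)([A-Z][a-z]+)', r'\1_\2', name)  # CustomerName -> Customer_Name
    s = re.sub(r'([a-z0-9])([A-Z])', r'\1_\2', s)    # orderId -> order_Id
    return s.lower()

for col in ["OrderId", "CustomerName", "total_amount"]:
    print(to_snake_case(col))
```

Names already in snake_case, like `total_amount`, pass through unchanged.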
Creating Replication YAML Configurations
While CLI flags are great for simple transfers, YAML configurations provide a more maintainable and powerful way to define your data transfer pipelines. Let’s look at some examples of using YAML configurations for Snowflake-to-Snowflake transfers.
Basic YAML Configuration
Here’s a simple example that transfers multiple tables:
```yaml
# snowflake_to_snowflake_basic.yaml
source: snowflake_source
target: snowflake_target

# Global options applied to all streams
defaults:
  primary_key: [id] # Default primary key for all tables
  target_options:
    column_casing: snake # Convert all column names to snake_case
    add_new_columns: true # Add new columns if they appear in source

streams:
  sales.orders:
    description: "Orders table replication"
    mode: full-refresh
    primary_key: [order_id]
    select:
      - order_id
      - order_date
      - customer_id
      - total_amount
      - status
    transforms:
      "*": [replace_non_printable] # Apply to all columns

  sales.customers:
    description: "Customers table replication"
    mode: incremental
    primary_key: [customer_id]
    update_key: last_modified
    select:
      - customer_id
      - customer_name
      - email
      - created_at
      - last_modified
    where: customer_id > 999
    transforms:
      email: [trim_space]
      "*": [replace_non_printable]
```
To run this replication:
```bash
# Run the replication using the YAML config
sling run -r snowflake_to_snowflake_basic.yaml
```
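Conceptually, incremental mode with an `update_key` only transfers rows whose update key is newer than the maximum value already present in the target. A simplified Python sketch of that watermark filter (illustrative only, not Sling's actual implementation):

```python
def incremental_rows(source_rows, target_rows, update_key):
    """Return source rows newer than the target's current watermark."""
    watermark = max((r[update_key] for r in target_rows), default=None)
    if watermark is None:
        return list(source_rows)  # empty target: take everything
    return [r for r in source_rows if r[update_key] > watermark]

source = [{"customer_id": 1, "last_modified": "2024-01-01"},
          {"customer_id": 2, "last_modified": "2024-03-15"}]
target = [{"customer_id": 1, "last_modified": "2024-01-01"}]
print(incremental_rows(source, target, "last_modified"))
```

On the first run the target is empty, so everything is copied; subsequent runs only pick up rows modified since the last sync.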
Advanced YAML Configuration
For more complex scenarios, here’s an example that includes transformations, custom SQL, and advanced options:
```yaml
# snowflake_to_snowflake_advanced.yaml
source: snowflake_source
target: snowflake_target

# Environment variables for runtime
env:
  DAYS_TO_SYNC: "7"
  TARGET_SCHEMA: "analytics"

# Global options applied to all streams
defaults:
  mode: incremental
  primary_key: [id]
  update_key: modified_at

  # Global source options
  source_options:
    datetime_format: "YYYY-MM-DD HH:MI:SS"

  # Global target options
  target_options:
    add_new_columns: true
    column_casing: snake
    table_ddl:
      created_at: timestamp
      modified_at: timestamp
      is_active: boolean

# Replication level hooks
hooks:
  start:
    - type: query
      conn: snowflake_target
      sql: "create schema if not exists {TARGET_SCHEMA}"
  end:
    - type: http
      url: http://my.hook.com/on_finish

streams:
  # Orders with custom SQL and runtime variables
  sales.orders:
    sql: |
      select
        o.order_id,
        o.order_date,
        o.total_amount,
        c.customer_id,
        c.customer_name,
        c.email,
        current_timestamp() as sync_timestamp
      from sales.orders o
      join sales.customers c on o.customer_id = c.customer_id
      where o.modified_at >= current_date - {DAYS_TO_SYNC}
    primary_key: [order_id]
    object: "{TARGET_SCHEMA}.enriched_orders"
    hooks:
      pre:
        - type: log
          message: "Starting orders sync"
      post:
        - type: query
          conn: snowflake_target
          sql: "grant select on {env.TARGET_SCHEMA}.enriched_orders to role analyst"

  # Customer metrics with aggregations
  sales.customer_metrics:
    sql: |
      select
        customer_id,
        count(distinct order_id) as total_orders,
        sum(total_amount) as lifetime_value,
        min(order_date) as first_order_date,
        max(order_date) as last_order_date,
        current_timestamp() as calculated_at
      from sales.orders
      group by customer_id
    primary_key: [customer_id]
    object: "{TARGET_SCHEMA}.customer_metrics"
    mode: full-refresh
```
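When every table in a schema should be replicated with the same settings, Sling also accepts wildcard stream names, which avoids listing tables one by one. A minimal sketch, assuming the `sales` schema and the `snowflake_source`/`snowflake_target` connections configured earlier (treat the runtime variables in `object` as illustrative; check the Runtime Variables documentation for the exact names):

```yaml
# snowflake_to_snowflake_wildcard.yaml (illustrative sketch)
source: snowflake_source
target: snowflake_target

defaults:
  mode: full-refresh
  object: "{target_schema}.{stream_table}" # name targets after source tables

streams:
  sales.*: # replicate every table in the sales schema
```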
To run this replication:
```bash
# Run the advanced replication
sling run -r snowflake_to_snowflake_advanced.yaml
```
For more details on replication configuration options and best practices, visit:
- Replication Structure Documentation
- Source Options Documentation
- Target Options Documentation
- Runtime Variables Documentation
Using Sling Platform
While the CLI is powerful for development and automation, Sling Platform provides a user-friendly web interface for managing and monitoring your data transfers. Let’s explore how to use the Platform for Snowflake-to-Snowflake transfers.
Platform Components
The Sling Platform consists of several key components:
- Web Interface: A modern UI for managing connections, creating replications, and monitoring transfers
- Agents: Lightweight workers that execute data transfers in your infrastructure
- API: RESTful API for programmatic access and integration
- Monitoring: Real-time dashboards and alerts for transfer status
Setting Up Sling Platform
To get started with Sling Platform:
- Sign up at platform.slingdata.io
- Create a new project
- Install and configure a Sling Agent in your environment
- Set up your Snowflake connections through the UI
Creating Transfers in the Platform
The Platform provides an intuitive interface for:
- Connection Management: Securely store and manage your Snowflake credentials
- Replication Design: Visual editor for creating and testing replications
- Scheduling: Set up recurring transfers with flexible schedules
- Monitoring: Track progress and troubleshoot issues
- Team Collaboration: Share configurations and access controls
For detailed instructions on using the Platform, visit the Sling Platform Documentation.
Next Steps
Now that we’ve covered both CLI and Platform approaches, here’s how to get started with your Snowflake-to-Snowflake data transfer:
Choose Your Approach:
- Use CLI for development and simple transfers
- Use Platform for team collaboration and ongoing operations
Set Up Connections:
- Configure source and target Snowflake credentials
- Test connections to ensure proper access
Create Your First Transfer:
- Start with a simple table-to-table transfer
- Test and validate the results
- Gradually add more complex configurations
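A quick way to validate a first transfer is to compare per-table row counts between source and target. A small, generic Python helper for that comparison (the counts would come from `select count(*)` queries against each account; the values shown are hypothetical):

```python
def compare_counts(source_counts, target_counts):
    """Compare per-table row counts and return tables that differ
    or are missing on the target side, as {table: (src, tgt)}."""
    mismatches = {}
    for table, src_n in source_counts.items():
        tgt_n = target_counts.get(table)
        if tgt_n != src_n:
            mismatches[table] = (src_n, tgt_n)
    return mismatches

source = {"sales.orders": 10_000, "sales.customers": 1_200}
target = {"sales.orders": 10_000, "sales.customers": 1_150}
print(compare_counts(source, target))
```

An empty result means every table matched; anything else points at a stream worth re-running or investigating.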
Monitor and Optimize:
- Track transfer performance
- Adjust warehouse sizes if needed
- Set up alerts for any issues
For more examples and detailed documentation, visit the Sling documentation site.
Conclusion
Moving data between Snowflake accounts doesn’t have to be complex. With Sling, you can:
- Eliminate the need for custom scripts and complex ETL processes
- Automate data transfers with simple configurations
- Ensure reliable and efficient data movement
- Monitor and manage transfers through CLI or Platform
Whether you’re performing one-time migrations or setting up ongoing replications, Sling provides the tools and flexibility you need to streamline your Snowflake data operations.
Get started today by installing Sling or signing up for Sling Platform.