Moving Data from JSON to SQLite with Sling

Slinger avatar
Slinger
Cover for Moving Data from JSON to SQLite with Sling

Moving data from JSON files into SQLite databases traditionally requires writing custom scripts, handling data type conversions, and managing error cases. This process can be time-consuming and error-prone, often involving multiple dependencies and complex code.

Sling simplifies this process by providing a streamlined, efficient way to transfer data between JSON files and SQLite databases. With features like automatic data type inference, support for nested JSON structures, and flexible transformation options, Sling eliminates the need for custom scripts and reduces development time significantly.

Key advantages of using Sling include:

  • Automated data type mapping and schema creation
  • Built-in support for complex JSON structures
  • Efficient bulk loading capabilities
  • Real-time data validation
  • Simple command-line interface
  • Flexible configuration options

Let’s explore how to set up and use Sling for your JSON to SQLite data integration needs.

Installation

Getting started with Sling is straightforward. You can install it using various package managers:

# Install using Homebrew (macOS)
brew install slingdata-io/sling/sling

# Install using curl (Linux)
curl -LO 'https://github.com/slingdata-io/sling-cli/releases/latest/download/sling_linux_amd64.tar.gz' \
  && tar xf sling_linux_amd64.tar.gz \
  && rm -f sling_linux_amd64.tar.gz \
  && chmod +x sling

# Install using Scoop (Windows)
scoop bucket add sling https://github.com/slingdata-io/scoop-sling.git
scoop install sling

# Install using Python pip
pip install sling

After installation, verify that Sling is properly installed by checking its version:

# Check Sling version
sling --version

For more detailed installation instructions, visit the Sling CLI Getting Started Guide.

Setting Up Connections

Before we can start moving data, we need to configure our source (local JSON files) and target (SQLite) connections. Sling provides multiple ways to manage connections securely.

Local File Connection

For local JSON files, Sling automatically configures a default connection named LOCAL. You don’t need any additional configuration for accessing local files.

SQLite Connection Setup

For SQLite, you can set up the connection using one of these methods:

Using Environment Variables

The simplest way to set up a SQLite connection is through environment variables:

# Set up SQLite connection using environment variable
export SQLITE='sqlite:///path/to/database.db'

Using the Sling CLI

A more secure and maintainable approach is to use Sling’s connection management commands:

# Set up SQLite connection using sling conns set
sling conns set sqlite_db type=sqlite database=/path/to/database.db

Using YAML Configuration

For a more permanent setup, you can define your connections in the ~/.sling/env.yaml file:

connections:
  sqlite_db:
    type: sqlite
    database: /path/to/database.db

Testing Connections

After setting up your connections, it’s important to verify they work correctly:

# Test SQLite connection
sling conns test sqlite_db

# Test local connection
sling conns test local

For more information about connection management and environment variables, refer to the Sling CLI Environment documentation.

Using CLI Flags for Data Sync

Sling’s command-line interface provides a quick way to transfer data using CLI flags. This approach is perfect for one-off transfers or when you want to test your data pipeline before creating a more permanent configuration.

Basic Example

Here’s a simple example of loading a JSON file into a SQLite table:

# Load a JSON file into a SQLite table
sling run \
  --src-conn local \
  --src-stream "file://data/products.json" \
  --tgt-conn sqlite_db \
  --tgt-object "products"

In this example:

  • --src-conn local: Specifies the source connection (local filesystem)
  • --src-stream: Specifies the source JSON file path
  • --tgt-conn sqlite_db: Specifies the target SQLite connection
  • --tgt-object: Specifies the target table name

Advanced Example

Here’s a more complex example that includes source and target options:

# Load JSON data with specific options
sling run \
  --src-conn local \
  --src-stream "file://data/products.json" \
  --src-options '{ "flatten": true, "empty_as_null": true, "jmespath": "products[*]" }' \
  --tgt-conn sqlite_db \
  --tgt-object "products" \
  --tgt-options '{ "column_casing": "snake", "add_new_columns": true }'

This example includes:

  • flatten: Flattens nested JSON structures
  • empty_as_null: Treats empty strings as NULL values
  • jmespath: Extracts specific data using JMESPath expression
  • column_casing: Converts column names to snake_case
  • add_new_columns: Automatically adds new columns if they appear in the source

For more details about available CLI flags, visit the CLI Flags Overview.

Using YAML Configuration

For more complex data synchronization scenarios or when you want to maintain your configuration in version control, Sling supports YAML-based replication configurations. Let’s look at some examples.

Basic YAML Configuration

Here’s a basic example that loads multiple JSON files into SQLite tables:

# local_to_sqlite.yaml
source: local
target: sqlite_db

streams:
  # Load products data
  file://data/products.json:
    object: products
    mode: full-refresh
    source_options:
      format: json
      empty_as_null: true

  # Load customers data
  file://data/customers.json:
    object: customers
    mode: full-refresh
    source_options:
      format: json
      empty_as_null: true

To run this replication:

# Run the replication configuration
sling run -r local_to_sqlite.yaml

Advanced YAML Configuration

Here’s a more complex example that includes transformations and runtime variables:

# local_to_sqlite_advanced.yaml
source: local
target: sqlite_db

defaults:
  mode: full-refresh
  source_options:
    format: json
    empty_as_null: true
    flatten: true
  target_options:
    column_casing: snake
    add_new_columns: true

streams:
  # Load products with transformations
  file://data/products.json:
    object: products
    source_options:
      jmespath: "products[*]"

  # Load orders with dynamic file names
  "file://data/orders_{stream_date}.json":
    object: "{stream_file_name}"
    source_options:
      jmespath: "orders[*]"
    columns:
      total_amount: decimal(20,6)

env:
  stream_date: ${STREAM_DATE}

To run this replication with runtime variables:

# Run the replication with a specific date
export STREAM_DATE=20240101
sling run -r local_to_sqlite_advanced.yaml

This advanced configuration demonstrates:

  • Default options for all streams
  • Column transformations for specific data types
  • Runtime variables for dynamic file names and table names
  • JMESPath expressions for JSON data extraction
  • Data type handling for amounts

For more details about replication configurations, refer to:

Sling Platform Overview

While the CLI is powerful for local development and automation, the Sling Platform provides a comprehensive web interface for managing your data operations at scale. Let’s explore the key components and features of the platform.

Web Interface

The Sling Platform offers an intuitive web interface for managing your data operations:

Sling Editor Interface

The web interface provides:

  • Visual replication editor
  • Real-time validation
  • Syntax highlighting
  • Auto-completion
  • Version control integration

Connection Management

Manage all your connections in one place:

Sling Connections

Benefits of using the Platform include:

  • Centralized credential management
  • Team access controls
  • Connection health monitoring
  • Easy testing and validation

For more information about the Sling Platform, visit:

Getting Started

Now that we’ve covered the various aspects of using Sling for JSON to SQLite data migration, here are some recommended steps to get started:

  1. Start Small

    • Begin with a simple file transfer
    • Test with a subset of your data
    • Validate the results thoroughly
  2. Explore Features

    • Try different replication modes
    • Experiment with transformations
    • Test various source and target options
  3. Scale Up

    • Move to YAML configurations for complex workflows
    • Implement proper error handling
    • Set up monitoring and alerting
  4. Consider Platform

    • Evaluate the Sling Platform for enterprise needs
    • Set up agents for distributed processing
    • Implement team collaboration workflows

For more examples and detailed documentation, visit https://docs.slingdata.io/.