12 changes: 5 additions & 7 deletions src/documentation/malloy_cli/index.malloynb
@@ -15,19 +15,17 @@ Please note that the CLI is currently not notarized (this is a work in progress)

## Setup

The CLI requires database credentials to function. Credentials can be added with `malloy-cli connections create-{database}`, where `{database}` is one of "bigquery", "postgres", or "duckdb". If you want to create a new named connection, options can be found by passing `--help` to any of the `create-{database}` commands, for example: `malloy-cli connections create-bigquery --help`
The CLI stores connection configuration in `~/.config/malloy/malloy-config.json`. This file uses the shared `malloy-config.json` format described in the [Configuration](../setup/config.malloynb) documentation.

#### Setting up BigQuery if you use gCloud
For detailed setup instructions including environment variables, connection commands, and database-specific configuration, see **[CLI Setup](../setup/cli.malloynb)**.

[`gCloud`](https://cloud.google.com/cli) is a command-line tool to work with Google Cloud. Among other things, it can store authentication information for BigQuery. If you already use gCloud to query BigQuery, setting up a connection is as simple as `malloy connections create-bigquery <name>` - no additional authentication information is required. Note that there are other options that you might want to set, such as billing limits - to see possible options, use `malloy connections create-bigquery --help`.
### Default Connections

#### Default connections

By default, two connections are created if you don't already have a name that overrides them - "bigquery" and "duckdb". If `.malloy`` or `.malloySQL`` files reference these connections, they are created automatically. DuckDB uses a built-in instance of DuckDB, and BigQuery attempts to connect to BigQuery using any existing authentication already stored on your computer (like if you have gcloud installed).
Two connections are created automatically if you don't already have a name that overrides them — `bigquery` and `duckdb`. If `.malloy` or `.malloysql` files reference these connection names, they work without explicit setup. DuckDB uses a built-in instance, and BigQuery attempts to connect using any existing gcloud authentication on your computer.

## Usage

The main commands of the CLI are `run` and `compile` - `run` executes queries and returns results, whereas compile returns SQL for a query or many queries.
The main commands of the CLI are `run` and `compile`: `run` executes queries and returns results, whereas `compile` returns SQL for a query or many queries.

The CLI has detailed usage information for each command. You can get general help with `malloy-cli --help`, and command-specific help and options with `malloy-cli {command} --help`

75 changes: 31 additions & 44 deletions src/documentation/setup/cli.malloynb
@@ -26,75 +26,62 @@ malloy-cli --help

## Configure Connections

The CLI stores its own configuration separately from VS Code in `~/.config/malloy/config.json`.
The CLI stores its own configuration separately from VS Code in `~/.config/malloy/malloy-config.json`. See **[Configuration](config.malloynb)** for the full config file format, connection type properties, environment variables, and default connections.

**Note:** The CLI does not support environment variables for connection configuration. Use the CLI commands below to manage connections.
**Migrating from an older version:** If you have an existing `~/.config/malloy/config.json`, the CLI will automatically migrate it to the new `malloy-config.json` format on first run.

### Connection Commands

```bash
# List all connections
malloy connections list
malloy-cli connections list

# Create a new connection
malloy connections create-<type> <name>
malloy-cli connections create <type> <name> [key=value ...]

# Test a connection
malloy connections test <name>
# Update an existing connection
malloy-cli connections update <name> [key=value ...]

# Show connection details
malloy connections show <name>

# Delete a connection
malloy connections delete <name>
```

### Create a Connection

```bash
# For BigQuery
malloy-cli connections create-bigquery <connection-name>
malloy-cli connections show <name>

# For Postgres
malloy-cli connections create-postgres <connection-name>
# Test a connection
malloy-cli connections test <name>

# For DuckDB
malloy-cli connections create-duckdb <connection-name>
```
# Show available connection types
malloy-cli connections describe

View options for each database type:
# Show properties for a connection type
malloy-cli connections describe <type>

```bash
malloy-cli connections create-bigquery --help
# Delete a connection
malloy-cli connections delete <name>
```

**BigQuery options:**
### Create a Connection

```
-p, --project <id> GCP project ID
-l, --location <region> Query location (default: "US")
-k, --service-account-key-path <path> Path to service account JSON key
-t, --timeout <milliseconds> Query timeout
-m, --maximum-bytes-billed <bytes> Limit bytes scanned
```
Properties use the exact names from the connection type registry, passed as `key=value` pairs. Use `malloy-cli connections describe <type>` to see available properties for any connection type.

### Default Connections
```bash
# DuckDB
malloy-cli connections create duckdb mydb databasePath=/path/to/data.db

Two connections are created automatically:
# DuckDB with MotherDuck
malloy-cli connections create duckdb md motherDuckToken=tok123 databasePath=md:my_database

- **`bigquery`**: Uses existing gCloud authentication
- **`duckdb`**: Uses a built-in DuckDB instance
# BigQuery
malloy-cli connections create bigquery bq projectId=my-project location=US

If your Malloy files reference these connection names, they work without explicit setup.
# Snowflake
malloy-cli connections create snowflake sf account=myorg warehouse=compute_wh

### BigQuery with gCloud
# Postgres
malloy-cli connections create postgres pg host=localhost port=5432 databaseName=mydb
```

If you already use gCloud:
### DuckDB Working Directory

```bash
gcloud auth login --update-adc
malloy-cli connections create-bigquery my-bq
```
When a DuckDB connection does not have `workingDirectory` set in the config, the CLI automatically resolves relative table paths (like `duckdb.table('data.csv')`) relative to the `.malloy` or `.malloysql` file being run. If you set `workingDirectory` explicitly, that value is used instead.
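
The resolution rule described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the CLI's actual code: the function name `resolve_table_path` is hypothetical, and passing absolute paths through unchanged is an assumption the text does not state.

```python
from pathlib import Path


def resolve_table_path(table_path, malloy_file, working_directory=None):
    """Resolve a DuckDB table path the way the paragraph above describes.

    Hypothetical helper for illustration; not part of the Malloy CLI.
    """
    path = Path(table_path)
    if path.is_absolute():
        # Assumption: absolute paths need no resolution.
        return path
    if working_directory:
        # An explicit workingDirectory from the config takes precedence.
        base = Path(working_directory)
    else:
        # Otherwise fall back to the directory of the file being run.
        base = Path(malloy_file).parent
    return base / path


print(resolve_table_path("data.csv", "/proj/models/flights.malloy"))
print(resolve_table_path("data.csv", "/proj/models/flights.malloy", "/data"))
```

With no `workingDirectory`, `data.csv` is looked up next to `flights.malloy`; with one set, the same relative path is looked up under that directory instead.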

---

215 changes: 215 additions & 0 deletions src/documentation/setup/config.malloynb
@@ -0,0 +1,215 @@
>>>markdown
# Configuration: `malloy-config.json`

Both the [VS Code Extension](extension.malloynb) and the [Malloy CLI](cli.malloynb) use a `malloy-config.json` file to configure database connections. This page documents the shared file format, connection types, and environment variable options.

---

## Config File Format

The file contains a `connections` object where each key is a connection name and each value specifies the connection type and its parameters. The **first** connection listed is the default.

```json
{
"connections": {
"my_duckdb": {
"is": "duckdb",
"databasePath": "./data.db"
},
"my_bq": {
"is": "bigquery",
"projectId": "my-project"
}
}
}
```

The `is` field identifies the connection type. All other fields are type-specific parameters documented below.
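
To make the structure concrete, here is a minimal Python sketch that reads a config in this format, separates the `is` field from the type-specific parameters, and picks the first listed connection as the default. The helper name `summarize_config` is hypothetical and not part of Malloy. The sketch assumes "first listed" means first key in file order, which Python's `json` module preserves.

```python
import json

CONFIG_TEXT = """
{
  "connections": {
    "my_duckdb": {"is": "duckdb", "databasePath": "./data.db"},
    "my_bq": {"is": "bigquery", "projectId": "my-project"}
  }
}
"""


def summarize_config(text):
    """Return (default_name, {name: (type, params)}) for a config string.

    Hypothetical helper for illustration; not part of Malloy.
    """
    connections = json.loads(text)["connections"]  # dict keeps file order
    summary = {}
    for name, entry in connections.items():
        # Everything except "is" is a type-specific parameter.
        params = {key: value for key, value in entry.items() if key != "is"}
        summary[name] = (entry["is"], params)
    # Assumption: the first key in the file is the default connection.
    default_name = next(iter(connections), None)
    return default_name, summary


default_name, summary = summarize_config(CONFIG_TEXT)
print(default_name)      # my_duckdb
print(summary["my_bq"])  # ('bigquery', {'projectId': 'my-project'})
```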

### Where the config file lives

- **VS Code Extension** — place `malloy-config.json` in your workspace root. The extension detects it automatically and picks up changes on save. In multi-root workspaces, each root can have its own file.
- **Malloy CLI** — `~/.config/malloy/malloy-config.json`. See the [CLI setup](cli.malloynb) for details.

---

## Connection Types

### `duckdb` — DuckDB / MotherDuck

| Parameter | Type | Description |
|---|---|---|
| `databasePath` | file | Path to .db file (default: `:memory:`) |
| `workingDirectory` | string | Working directory for relative paths |
| `motherDuckToken` | password | MotherDuck auth token |
| `additionalExtensions` | string | Comma-separated DuckDB extensions to load (e.g. `"spatial,fts"`). Built-in: json, httpfs, icu |
| `readOnly` | boolean | Open database read-only |
| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |

### `bigquery` — Google BigQuery

| Parameter | Type | Description |
|---|---|---|
| `projectId` | string | GCP project ID |
| `serviceAccountKeyPath` | file | Path to service account JSON key |
| `location` | string | Dataset location |
| `maximumBytesBilled` | string | Byte billing cap |
| `timeoutMs` | string | Query timeout in ms |
| `billingProjectId` | string | Billing project (if different) |
| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |

### `postgres` — PostgreSQL

| Parameter | Type | Description |
|---|---|---|
| `host` | string | Server host |
| `port` | number | Server port |
| `username` | string | Username |
| `password` | password | Password |
| `databaseName` | string | Database name |
| `connectionString` | string | Full connection string (alternative) |
| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |

### `mysql` — MySQL

| Parameter | Type | Description |
|---|---|---|
| `host` | string | Server host |
| `port` | number | Server port (default: 3306) |
| `database` | string | Database name |
| `user` | string | Username |
| `password` | password | Password |
| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |

### `snowflake` — Snowflake

| Parameter | Type | Description |
|---|---|---|
| `account` | string | Snowflake account identifier (required) |
| `username` | string | Username |
| `password` | password | Password |
| `role` | string | Role |
| `warehouse` | string | Warehouse |
| `database` | string | Database |
| `schema` | string | Schema |
| `privateKeyPath` | file | Path to private key (.pem/.key) |
| `privateKeyPass` | password | Private key passphrase |
| `timeoutMs` | number | Query timeout in ms |
| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |

### `trino` / `presto` — Trino or Presto

| Parameter | Type | Description |
|---|---|---|
| `server` | string | Server hostname |
| `port` | number | Server port |
| `catalog` | string | Catalog name |
| `schema` | string | Schema name |
| `user` | string | Username |
| `password` | password | Password |
| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |

---

## Setup SQL

All connection types support a `setupSQL` parameter. This is a multi-line text field containing SQL statements to execute each time the connection is established.

Each statement must end with `;` at the end of a line. Statements can span multiple lines. Only one statement-ending `;` is allowed per line.

**Legal** — each statement ends with `;` on its own line:
```
SET search_path TO analytics;
CREATE TEMP TABLE foo
AS SELECT 1;
```

**Illegal** — two statements on the same line:
```
SET search_path TO analytics; CREATE TEMP TABLE foo AS SELECT 1;
```
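
The line-based rule above can be illustrated with a small Python sketch. It is a simplified stand-in, not Malloy's actual parser: the function name `split_setup_sql` is hypothetical, the check for a second `;` on a line is naive (it would also reject a `;` inside a string literal or comment), and raising an error for a final statement without `;` is an assumption.

```python
def split_setup_sql(setup_sql):
    """Split setupSQL text into statements using the line-ending ';' rule.

    Hypothetical helper for illustration; not Malloy's implementation.
    """
    statements = []
    current = []
    for line in setup_sql.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines
        # Naive check: a ';' anywhere before the last character means
        # more than one statement terminator on this line.
        if ";" in stripped[:-1]:
            raise ValueError(f"more than one ';' on a line: {stripped!r}")
        current.append(stripped)
        if stripped.endswith(";"):
            # A ';' at the end of a line closes the current statement.
            statements.append("\n".join(current))
            current = []
    if current:
        # Assumption: an unterminated trailing statement is an error.
        raise ValueError("last statement does not end with ';'")
    return statements


legal = "SET search_path TO analytics;\nCREATE TEMP TABLE foo\nAS SELECT 1;\n"
for statement in split_setup_sql(legal):
    print(repr(statement))
```

Run on the legal example, the sketch yields two statements, the second spanning two lines. Run on the illegal example, it raises `ValueError`.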

---

## Environment Variables

Some databases support configuration via environment variables. When environment variables are set, a connection is created automatically using those values. Values set in `malloy-config.json` override environment variables.

### MotherDuck

MotherDuck is configured through a token. In MotherDuck, click **Settings** then copy your token.

```bash
export MOTHERDUCK_TOKEN=your_token_here
```

The default connection name for MotherDuck is `md`.

**Example usage:**
```malloy
source: hacker_news is md.table('sample_data.hn.hacker_news')
```

### MySQL

MySQL connections can be configured entirely through environment variables.

```bash
export MYSQL_USER=readonly
export MYSQL_HOST=db.example.com
export MYSQL_PORT=3306
export MYSQL_PASSWORD=your_password
export MYSQL_DATABASE=analytics
```

The default connection name is `mysql`. `MYSQL_USER` is required; other variables are optional but typically needed.
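
The precedence rule stated at the top of this section (config values override environment variables) can be sketched for the MySQL variables as follows. This is a guess at the semantics, not the actual implementation. It assumes each variable maps to the same-named parameter in the `mysql` table, and that the override applies per field rather than replacing the whole connection. The names `mysql_from_env` and `effective_mysql` are hypothetical.

```python
# Assumed mapping from environment variables to `mysql` connection parameters.
ENV_TO_PARAM = {
    "MYSQL_USER": "user",
    "MYSQL_HOST": "host",
    "MYSQL_PORT": "port",
    "MYSQL_PASSWORD": "password",
    "MYSQL_DATABASE": "database",
}


def mysql_from_env(env):
    """Build a connection entry from env vars, or None if MYSQL_USER is unset."""
    if "MYSQL_USER" not in env:
        return None  # MYSQL_USER is documented as required
    connection = {"is": "mysql"}
    for variable, param in ENV_TO_PARAM.items():
        if variable in env:
            value = env[variable]
            # `port` is typed as a number in the parameter table.
            connection[param] = int(value) if param == "port" else value
    return connection


def effective_mysql(env, config_entry=None):
    """Overlay a malloy-config.json entry on top of env-derived values."""
    merged = mysql_from_env(env) or {}
    merged.update(config_entry or {})  # assumption: per-field override
    return merged or None


env = {"MYSQL_USER": "readonly", "MYSQL_HOST": "db.example.com", "MYSQL_PORT": "3306"}
print(effective_mysql(env, {"is": "mysql", "host": "other.example.com"}))
```

In real use `env` would be `os.environ`; a plain dict is used here so the example is self-contained.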

### Snowflake

```bash
export SNOWFLAKE_ACCOUNT=myorg-myaccount # Required
export SNOWFLAKE_USER=analyst
export SNOWFLAKE_PASSWORD=your_password
export SNOWFLAKE_WAREHOUSE=compute_wh
export SNOWFLAKE_DATABASE=analytics
export SNOWFLAKE_SCHEMA=public
```

Snowflake also supports TOML configuration at `~/.snowflake/connections.toml`. See [Snowflake connection configuration](https://docs.snowflake.com/en/developer-guide/python-connector/python-connector-connect#connecting-using-the-connections-toml-file) for details.

### Trino

```bash
export TRINO_SERVER=https://trino.example.com # Required
export TRINO_USER=analyst # Required
export TRINO_PASSWORD=your_password
export TRINO_CATALOG=hive
export TRINO_SCHEMA=default
```

The default connection name is `trino`.

### Presto

```bash
export PRESTO_HOST=presto.example.com # Required
export PRESTO_PORT=8080 # Defaults to 8080
export PRESTO_USER=analyst
export PRESTO_PASSWORD=your_password
export PRESTO_CATALOG=hive
export PRESTO_SCHEMA=default
```

The default connection name is `presto`.

### Databases Without Environment Variable Support

- **BigQuery**: Uses `gcloud auth login --update-adc` (OAuth) or a service account key file configured in `malloy-config.json`
- **PostgreSQL**: Uses connection parameters in `malloy-config.json` (credentials stored in system keychain for VS Code)
- **DuckDB**: Uses file paths directly, no authentication needed

---

## Default Connections

Two connections are created automatically if you don't already have a connection that overrides them — `bigquery` and `duckdb`. If your Malloy files reference these connection names, they work without explicit setup. DuckDB uses a built-in in-memory instance, and BigQuery attempts to connect using any existing gcloud authentication on your computer.