From e40ffa43e2d81e4c024f5f5113e3705733a43446 Mon Sep 17 00:00:00 2001
From: Michael Toy <66150587+mtoy-googly-moogly@users.noreply.github.com>
Date: Sun, 15 Feb 2026 16:25:22 -1000
Subject: [PATCH] Consolidate connection config docs into shared config reference page

Extract connection types, setup SQL, and environment variable documentation from the extension and CLI pages into a new shared config.malloynb page. Update cross-references and modernize CLI connection command syntax.

Co-Authored-By: Claude Opus 4.6
---
 src/documentation/malloy_cli/index.malloynb |  12 +-
 src/documentation/setup/cli.malloynb        |  75 +++----
 src/documentation/setup/config.malloynb     | 215 ++++++++++++++++++++
 src/documentation/setup/extension.malloynb  | 204 +------------------
 src/table_of_contents.json                  |   4 +
 5 files changed, 259 insertions(+), 251 deletions(-)
 create mode 100644 src/documentation/setup/config.malloynb

diff --git a/src/documentation/malloy_cli/index.malloynb b/src/documentation/malloy_cli/index.malloynb
index b8bd2aa1..517487e3 100644
--- a/src/documentation/malloy_cli/index.malloynb
+++ b/src/documentation/malloy_cli/index.malloynb
@@ -15,19 +15,17 @@ Please note that the CLI is currently not notarized (this is a work in progress)
 ## Setup
-The CLI requires database credentials to function. Credentials can be added with `malloy-cli connections create-{database}`, where `{database}` is one of "bigquery", "postgres", or "duckdb". If you want to create a new named connection, options can be found by passing `--help` to any of the `create-{database}` commands, for example: `malloy-cli connections create-bigquery --help`
+The CLI stores connection configuration in `~/.config/malloy/malloy-config.json`. This file uses the shared `malloy-config.json` format described in the [Configuration](../setup/config.malloynb) documentation.
-#### Setting up BigQuery if you use gCloud
+For detailed setup instructions including environment variables, connection commands, and database-specific configuration, see **[CLI Setup](../setup/cli.malloynb)**.
-[`gCloud`](https://cloud.google.com/cli) is a command-line tool to work with Google Cloud. Among other things, it can store authentication information for BigQuery. If you already use gCloud to query BigQuery, setting up a connection is as simple as `malloy connections create-bigquery ` - no additional authentication information is required. Note that there are other options that you might want to set, such as billing limits - to see possible options, use `malloy connections create-bigquery --help`.
+### Default Connections
-#### Default connections
-By default, two connections are created if you don't already have a name that overrides them - "bigquery" and "duckdb". If `.malloy`` or `.malloySQL`` files reference these connections, they are created automatically. DuckDB uses a built-in instance of DuckDB, and BigQuery attempts to connect to BigQuery using any existing authentication already stored on your computer (like if you have gcloud installed).
+Two connections are created automatically if you don't already have a name that overrides them — `bigquery` and `duckdb`. If `.malloy` or `.malloysql` files reference these connection names, they work without explicit setup. DuckDB uses a built-in instance, and BigQuery attempts to connect using any existing gcloud authentication on your computer.
 ## Usage
-The main commands of the CLI are `run` and `compile` - `run` executes queries and returns results, whereas compile returns SQL for a query or many queries.
+The main commands of the CLI are `run` and `compile` — `run` executes queries and returns results, whereas `compile` returns SQL for a query or many queries.
 The CLI has detailed usage information for each command.
 You can get general help with `malloy-cli --help`, and command-specific help and options with `malloy-cli {command} --help`
diff --git a/src/documentation/setup/cli.malloynb b/src/documentation/setup/cli.malloynb
index 40da8c23..2e78e055 100644
--- a/src/documentation/setup/cli.malloynb
+++ b/src/documentation/setup/cli.malloynb
@@ -26,75 +26,62 @@ malloy-cli --help
 ## Configure Connections
-The CLI stores its own configuration separately from VS Code in `~/.config/malloy/config.json`.
+The CLI stores its own configuration separately from VS Code in `~/.config/malloy/malloy-config.json`. See **[Configuration](config.malloynb)** for the full config file format, connection type properties, environment variables, and default connections.
-**Note:** The CLI does not support environment variables for connection configuration. Use the CLI commands below to manage connections.
+**Migrating from an older version:** If you have an existing `~/.config/malloy/config.json`, the CLI will automatically migrate it to the new `malloy-config.json` format on first run.
 ### Connection Commands
 ```bash
 # List all connections
-malloy connections list
+malloy-cli connections list
 # Create a new connection
-malloy connections create-
+malloy-cli connections create [key=value ...]
-# Test a connection
-malloy connections test
+# Update an existing connection
+malloy-cli connections update [key=value ...]
 # Show connection details
-malloy connections show
-
-# Delete a connection
-malloy connections delete
-```
-
-### Create a Connection
-
-```bash
-# For BigQuery
-malloy-cli connections create-bigquery
+malloy-cli connections show
-# For Postgres
-malloy-cli connections create-postgres
+# Test a connection
+malloy-cli connections test
-# For DuckDB
-malloy-cli connections create-duckdb
-```
+# Show available connection types
+malloy-cli connections describe
-View options for each database type:
+# Show properties for a connection type
+malloy-cli connections describe
-```bash
-malloy-cli connections create-bigquery --help
+# Delete a connection
+malloy-cli connections delete
 ```
-**BigQuery options:**
+### Create a Connection
-```
--p, --project GCP project ID
--l, --location Query location (default: "US")
--k, --service-account-key-path Path to service account JSON key
--t, --timeout Query timeout
--m, --maximum-bytes-billed Limit bytes scanned
-```
+Properties use the exact names from the connection type registry, passed as `key=value` pairs. Use `malloy-cli connections describe ` to see available properties for any connection type.
-### Default Connections
+```bash
+# DuckDB
+malloy-cli connections create duckdb mydb databasePath=/path/to/data.db
-Two connections are created automatically:
+# DuckDB with MotherDuck
+malloy-cli connections create duckdb md motherDuckToken=tok123 databasePath=md:my_database
-- **`bigquery`**: Uses existing gCloud authentication
-- **`duckdb`**: Uses a built-in DuckDB instance
+# BigQuery
+malloy-cli connections create bigquery bq projectId=my-project location=US
-If your Malloy files reference these connection names, they work without explicit setup.
+# Snowflake
+malloy-cli connections create snowflake sf account=myorg warehouse=compute_wh
-### BigQuery with gCloud
+# Postgres
+malloy-cli connections create postgres pg host=localhost port=5432 databaseName=mydb
+```
-If you already use gCloud:
+### DuckDB Working Directory
-```bash
-gcloud auth login --update-adc
-malloy-cli connections create-bigquery my-bq
-```
+When a DuckDB connection does not have `workingDirectory` set in the config, the CLI automatically resolves relative table paths (like `duckdb.table('data.csv')`) relative to the `.malloy` or `.malloysql` file being run. If you set `workingDirectory` explicitly, that value is used instead.
 ---
diff --git a/src/documentation/setup/config.malloynb b/src/documentation/setup/config.malloynb
new file mode 100644
index 00000000..9f3e4f1d
--- /dev/null
+++ b/src/documentation/setup/config.malloynb
@@ -0,0 +1,215 @@
+>>>markdown
+# Configuration: `malloy-config.json`
+
+Both the [VS Code Extension](extension.malloynb) and the [Malloy CLI](cli.malloynb) use a `malloy-config.json` file to configure database connections. This page documents the shared file format, connection types, and environment variable options.
+
+---
+
+## Config File Format
+
+The file contains a `connections` object where each key is a connection name and each value specifies the connection type and its parameters. The **first** connection listed is the default.
+
+```json
+{
+  "connections": {
+    "my_duckdb": {
+      "is": "duckdb",
+      "databasePath": "./data.db"
+    },
+    "my_bq": {
+      "is": "bigquery",
+      "projectId": "my-project"
+    }
+  }
+}
+```
+
+The `is` field identifies the connection type. All other fields are type-specific parameters documented below.
+
+### Where the config file lives
+
+- **VS Code Extension** — place `malloy-config.json` in your workspace root. The extension detects it automatically and picks up changes on save. In multi-root workspaces, each root can have its own file.
+- **Malloy CLI** — `~/.config/malloy/malloy-config.json`. See the [CLI setup](cli.malloynb) for details.
+
+---
+
+## Connection Types
+
+### `duckdb` — DuckDB / MotherDuck
+
+| Parameter | Type | Description |
+|---|---|---|
+| `databasePath` | file | Path to .db file (default: `:memory:`) |
+| `workingDirectory` | string | Working directory for relative paths |
+| `motherDuckToken` | password | MotherDuck auth token |
+| `additionalExtensions` | string | Comma-separated DuckDB extensions to load (e.g. `"spatial,fts"`). Built-in: json, httpfs, icu |
+| `readOnly` | boolean | Open database read-only |
+| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |
+
+### `bigquery` — Google BigQuery
+
+| Parameter | Type | Description |
+|---|---|---|
+| `projectId` | string | GCP project ID |
+| `serviceAccountKeyPath` | file | Path to service account JSON key |
+| `location` | string | Dataset location |
+| `maximumBytesBilled` | string | Byte billing cap |
+| `timeoutMs` | string | Query timeout in ms |
+| `billingProjectId` | string | Billing project (if different) |
+| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |
+
+### `postgres` — PostgreSQL
+
+| Parameter | Type | Description |
+|---|---|---|
+| `host` | string | Server host |
+| `port` | number | Server port |
+| `username` | string | Username |
+| `password` | password | Password |
+| `databaseName` | string | Database name |
+| `connectionString` | string | Full connection string (alternative) |
+| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |
+
+### `mysql` — MySQL
+
+| Parameter | Type | Description |
+|---|---|---|
+| `host` | string | Server host |
+| `port` | number | Server port (default: 3306) |
+| `database` | string | Database name |
+| `user` | string | Username |
+| `password` | password | Password |
+| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |
+
+### `snowflake` — Snowflake
+
+| Parameter | Type | Description |
+|---|---|---|
+| `account` | string | Snowflake account identifier (required) |
+| `username` | string | Username |
+| `password` | password | Password |
+| `role` | string | Role |
+| `warehouse` | string | Warehouse |
+| `database` | string | Database |
+| `schema` | string | Schema |
+| `privateKeyPath` | file | Path to private key (.pem/.key) |
+| `privateKeyPass` | password | Private key passphrase |
+| `timeoutMs` | number | Query timeout in ms |
+| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |
+
+### `trino` / `presto` — Trino or Presto
+
+| Parameter | Type | Description |
+|---|---|---|
+| `server` | string | Server hostname |
+| `port` | number | Server port |
+| `catalog` | string | Catalog name |
+| `schema` | string | Schema name |
+| `user` | string | Username |
+| `password` | password | Password |
+| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |
+
+---
+
+## Setup SQL
+
+All connection types support a `setupSQL` parameter. This is a multi-line text field containing SQL statements to execute each time the connection is established.
+
+Each statement must end with `;` at the end of a line. Statements can span multiple lines. Only one statement-ending `;` is allowed per line.
+
+**Legal** — each statement ends with `;` on its own line:
+```
+SET search_path TO analytics;
+CREATE TEMP TABLE foo
+  AS SELECT 1;
+```
+
+**Illegal** — two statements on the same line:
+```
+SET search_path TO analytics; CREATE TEMP TABLE foo AS SELECT 1;
+```
+
+---
+
+## Environment Variables
+
+Some databases support configuration via environment variables. When environment variables are set, a connection is created automatically using those values. Values set in `malloy-config.json` override environment variables.
+
+### MotherDuck
+
+MotherDuck is configured through a token. In MotherDuck, click **Settings** then copy your token.
+
+```bash
+export MOTHERDUCK_TOKEN=your_token_here
+```
+
+The default connection name for MotherDuck is `md`.
+
+**Example usage:**
+```malloy
+source: hacker_news is md.table('sample_data.hn.hacker_news')
+```
+
+### MySQL
+
+MySQL connections can be configured entirely through environment variables.
+
+```bash
+export MYSQL_USER=readonly
+export MYSQL_HOST=db.example.com
+export MYSQL_PORT=3306
+export MYSQL_PASSWORD=your_password
+export MYSQL_DATABASE=analytics
+```
+
+The default connection name is `mysql`. `MYSQL_USER` is required; other variables are optional but typically needed.
+
+### Snowflake
+
+```bash
+export SNOWFLAKE_ACCOUNT=myorg-myaccount  # Required
+export SNOWFLAKE_USER=analyst
+export SNOWFLAKE_PASSWORD=your_password
+export SNOWFLAKE_WAREHOUSE=compute_wh
+export SNOWFLAKE_DATABASE=analytics
+export SNOWFLAKE_SCHEMA=public
+```
+
+Snowflake also supports TOML configuration at `~/.snowflake/connections.toml`. See [Snowflake connection configuration](https://docs.snowflake.com/en/developer-guide/python-connector/python-connector-connect#connecting-using-the-connections-toml-file) for details.
+
+### Trino
+
+```bash
+export TRINO_SERVER=https://trino.example.com  # Required
+export TRINO_USER=analyst  # Required
+export TRINO_PASSWORD=your_password
+export TRINO_CATALOG=hive
+export TRINO_SCHEMA=default
+```
+
+The default connection name is `trino`.
+
+### Presto
+
+```bash
+export PRESTO_HOST=presto.example.com  # Required
+export PRESTO_PORT=8080  # Defaults to 8080
+export PRESTO_USER=analyst
+export PRESTO_PASSWORD=your_password
+export PRESTO_CATALOG=hive
+export PRESTO_SCHEMA=default
+```
+
+The default connection name is `presto`.
+
+### Databases Without Environment Variable Support
+
+- **BigQuery**: Uses `gcloud auth login --update-adc` (OAuth) or a service account key file configured in `malloy-config.json`
+- **PostgreSQL**: Uses connection parameters in `malloy-config.json` (credentials stored in system keychain for VS Code)
+- **DuckDB**: Uses file paths directly, no authentication needed
+
+---
+
+## Default Connections
+
+Two connections are created automatically if you don't already have a connection that overrides them — `bigquery` and `duckdb`. If your Malloy files reference these connection names, they work without explicit setup. DuckDB uses a built-in in-memory instance, and BigQuery attempts to connect using any existing gcloud authentication on your computer.
diff --git a/src/documentation/setup/extension.malloynb b/src/documentation/setup/extension.malloynb
index 6463a659..5b8a1866 100644
--- a/src/documentation/setup/extension.malloynb
+++ b/src/documentation/setup/extension.malloynb
@@ -50,124 +50,13 @@ There are two ways to configure database connections:
 1. **`malloy-config.json`** (recommended) — a project-level config file checked into source control
 2. **VS Code Settings** — user-level configuration via the command palette
-If a `malloy-config.json` exists in your workspace root, it takes priority over VS Code settings.
+**Note:** If a `malloy-config.json` exists in your workspace root, VS Code settings for connections are ignored entirely — all connections come from the config file.
 ### Project Configuration: `malloy-config.json`
-Place a `malloy-config.json` file in the root of your project (workspace root). The extension detects it automatically and picks up changes whenever you save.
-
-The file contains a `connections` object where each key is a connection name and each value specifies the connection type and its parameters. The first connection listed is the default.
-
-```json
-{
-  "connections": {
-    "my_duckdb": {
-      "is": "duckdb",
-      "databasePath": "./data.db"
-    },
-    "my_bq": {
-      "is": "bigquery",
-      "projectId": "my-project"
-    }
-  }
-}
-```
+Place a `malloy-config.json` file in the root of your project (workspace root). The extension detects it automatically and picks up changes whenever you save. In multi-root workspaces, each workspace root can have its own file with independent connection namespaces.
-In multi-root workspaces, each workspace root can have its own `malloy-config.json` with independent connection namespaces. If you open a single file without a workspace (`code file.malloy`), the extension looks for `malloy-config.json` in the file's directory.
-
-#### Connection Types
-
-**`duckdb`** — DuckDB / MotherDuck
-
-| Parameter | Type | Description |
-|---|---|---|
-| `databasePath` | file | Path to .db file (default: `:memory:`) |
-| `workingDirectory` | string | Working directory for relative paths |
-| `motherDuckToken` | password | MotherDuck auth token |
-| `additionalExtensions` | string | Extra DuckDB extensions to load |
-| `readOnly` | boolean | Open database read-only |
-| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |
-
-**`bigquery`** — Google BigQuery
-
-| Parameter | Type | Description |
-|---|---|---|
-| `projectId` | string | GCP project ID |
-| `serviceAccountKeyPath` | file | Path to service account JSON key |
-| `location` | string | Dataset location |
-| `maximumBytesBilled` | string | Byte billing cap |
-| `timeoutMs` | string | Query timeout in ms |
-| `billingProjectId` | string | Billing project (if different) |
-| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |
-
-**`postgres`** — PostgreSQL
-
-| Parameter | Type | Description |
-|---|---|---|
-| `host` | string | Server host |
-| `port` | number | Server port |
-| `username` | string | Username |
-| `password` | password | Password |
-| `databaseName` | string | Database name |
-| `connectionString` | string | Full connection string (alternative) |
-| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |
-
-**`mysql`** — MySQL
-
-| Parameter | Type | Description |
-|---|---|---|
-| `host` | string | Server host |
-| `port` | number | Server port (default: 3306) |
-| `database` | string | Database name |
-| `user` | string | Username |
-| `password` | password | Password |
-| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |
-
-**`snowflake`** — Snowflake
-
-| Parameter | Type | Description |
-|---|---|---|
-| `account` | string | Snowflake account identifier (required) |
-| `username` | string | Username |
-| `password` | password | Password |
-| `role` | string | Role |
-| `warehouse` | string | Warehouse |
-| `database` | string | Database |
-| `schema` | string | Schema |
-| `privateKeyPath` | file | Path to private key (.pem/.key) |
-| `privateKeyPass` | password | Private key passphrase |
-| `timeoutMs` | number | Query timeout in ms |
-| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |
-
-**`trino`** / **`presto`** — Trino or Presto
-
-| Parameter | Type | Description |
-|---|---|---|
-| `server` | string | Server hostname |
-| `port` | number | Server port |
-| `catalog` | string | Catalog name |
-| `schema` | string | Schema name |
-| `user` | string | Username |
-| `password` | password | Password |
-| `setupSQL` | text | Connection setup SQL ([see below](#setup-sql)) |
-
-#### Setup SQL
-
-All connection types support a `setupSQL` parameter. This is a multi-line text field containing SQL statements to execute each time the connection is established.
-
-Each statement must end with `;` at the end of a line. Statements can span multiple lines. Only one statement-ending `;` is allowed per line.
-
-**Legal** — each statement ends with `;` on its own line:
-```
-SET search_path TO analytics;
-CREATE TEMP TABLE foo
-  AS SELECT 1;
-```
-
-**Illegal** — two statements on the same line:
-```
-SET search_path TO analytics; CREATE TEMP TABLE foo AS SELECT 1;
-```
+See **[Configuration](config.malloynb)** for the full config file format, connection type properties, setup SQL, and environment variables.
 ### VS Code Settings
@@ -186,91 +75,6 @@ VS Code stores these in your user settings:
 ---
-## Environment Variable Configuration
-
-Some databases support configuration via environment variables. Set these before launching VS Code.
-
-### MotherDuck
-
-MotherDuck is configured through a token. In MotherDuck, click **Settings** then copy your token.
-
-```bash
-export MOTHERDUCK_TOKEN=your_token_here
-```
-
-Then launch VS Code. The extension will automatically use the token when connecting to MotherDuck databases.
-
-**Example usage:**
-```malloy
-source: hacker_news is md.table('sample_data.hn.hacker_news')
-```
-
-The default connection name for MotherDuck is `md`.
-
-### MySQL
-
-MySQL connections are configured **only** through environment variables. There is no UI configuration for MySQL in VS Code.
-
-```bash
-export MYSQL_USER=readonly
-export MYSQL_HOST=db.example.com
-export MYSQL_PORT=3306
-export MYSQL_PASSWORD=your_password
-export MYSQL_DATABASE=analytics
-```
-
-Then launch VS Code. The default connection name is `mysql`.
-
-**Note:** `MYSQL_USER` is required. Other variables are optional but typically needed.
-
-### Snowflake
-
-Snowflake can be configured via UI or environment variables:
-
-```bash
-export SNOWFLAKE_ACCOUNT=myorg-myaccount  # Required
-export SNOWFLAKE_USER=analyst
-export SNOWFLAKE_PASSWORD=your_password
-export SNOWFLAKE_WAREHOUSE=compute_wh
-export SNOWFLAKE_DATABASE=analytics
-export SNOWFLAKE_SCHEMA=public
-```
-
-Snowflake also supports TOML configuration at `~/.snowflake/connections.toml`. See [Snowflake connection configuration](https://docs.snowflake.com/en/developer-guide/python-connector/python-connector-connect#connecting-using-the-connections-toml-file) for details.
-
-### Trino
-
-```bash
-export TRINO_SERVER=https://trino.example.com  # Required
-export TRINO_USER=analyst  # Required
-export TRINO_PASSWORD=your_password
-export TRINO_CATALOG=hive
-export TRINO_SCHEMA=default
-```
-
-The default connection name is `trino`.
-
-### Presto
-
-```bash
-export PRESTO_HOST=presto.example.com  # Required
-export PRESTO_PORT=8080  # Defaults to 8080
-export PRESTO_USER=analyst
-export PRESTO_PASSWORD=your_password
-export PRESTO_CATALOG=hive
-export PRESTO_SCHEMA=default
-```
-
-The default connection name is `presto`.
-
-### Databases Without Environment Variable Support
-
-- **BigQuery**: Uses `gcloud auth login --update-adc` (OAuth) or service account key file
-- **PostgreSQL**: Uses connection configuration in VS Code UI (credentials stored in system keychain)
-- **DuckDB**: Uses file paths directly, no authentication needed
-
----
-
 ## Database-Specific Setup
 ### DuckDB
@@ -335,7 +139,7 @@ Add a PostgreSQL connection via **Malloy: Edit Connections**. Enter host, port,
 ### MySQL
-MySQL connections require environment variables (see above). The extension will detect the connection automatically when the variables are set.
+MySQL connections require environment variables — see [Configuration](config.malloynb#mysql). The extension will detect the connection automatically when the variables are set.
 ### Trino and Presto
diff --git a/src/table_of_contents.json b/src/table_of_contents.json
index 943bdb04..6e590dbb 100644
--- a/src/table_of_contents.json
+++ b/src/table_of_contents.json
@@ -28,6 +28,10 @@
     "title": "Database Support",
     "link": "/setup/database_support.malloynb"
   },
+  {
+    "title": "Configuration",
+    "link": "/setup/config.malloynb"
+  },
   {
     "title": "Malloy CLI",
     "link": "/setup/cli.malloynb"
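
Note on the new config.malloynb page: two of the rules it documents (the first connection listed is the default, and each `setupSQL` statement ends with `;` at the end of a line) can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions, not Malloy's actual loader or parser; the function names `default_connection` and `split_setup_sql` are hypothetical. It also does not try to detect the "illegal" case of two statements on one line, since doing that reliably would require understanding SQL string literals.

```python
import json
from typing import List


def default_connection(config_text: str) -> str:
    """Return the name of the first connection listed in the config.

    Hypothetical helper: relies on json.loads preserving the key order
    of the JSON text (dicts keep insertion order in Python 3.7+).
    """
    config = json.loads(config_text)
    return next(iter(config["connections"]))


def split_setup_sql(setup_sql: str) -> List[str]:
    """Split setupSQL text into statements.

    Hypothetical helper: a statement ends at a line whose last
    non-whitespace character is ';' and may span several lines.
    """
    statements = []
    current = []
    for line in setup_sql.splitlines():
        if not line.strip():
            continue  # ignore blank lines between statements
        current.append(line.rstrip())
        if current[-1].endswith(";"):
            statements.append("\n".join(current))
            current = []
    if current:
        raise ValueError("last statement does not end with ';' at end of line")
    return statements
```

With the example config from the page, `default_connection` would return `my_duckdb`, and the "Legal" setup SQL example splits into two statements, the second of which spans two lines.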