Skip to content

Instantly share code, notes, and snippets.

@rdraward
Last active March 20, 2026 14:52
Show Gist options
  • Select an option

  • Save rdraward/7c3d980cb3e4a741c8d461d5b0f34fab to your computer and use it in GitHub Desktop.

Select an option

Save rdraward/7c3d980cb3e4a741c8d461d5b0f34fab to your computer and use it in GitHub Desktop.
Context for Gadget's APL (Axoim Processing Language) to SQL skill

APL to ClickHouse SQL Migration Guide

This guide helps developers familiar with Axiom's APL (Axiom Processing Language) translate their queries to ClickHouse SQL for Grafana.

Axiom Datasets → ClickHouse Tables

Axiom Dataset ClickHouse Table
['core-production'] otel.otel_logs_v1
['core-tracing'] otel.otel_traces_v2

Query Types

See overview.md for Grafana query type documentation. The key difference from Axiom is that Grafana requires you to explicitly select the query type (Table, Logs, Time Series, or Traces).

Key Differences

APL Concept ClickHouse SQL Equivalent
extend operator Common Table Expression (CTE) with WITH clause
bin() / bin_auto() Grafana's $__timeInterval(Timestamp) macro
where _time > ago(1h) Grafana's $__timeFilter(Timestamp) macro
summarize count() by field GROUP BY field with aggregation
Pipe syntax (|) Subqueries or CTEs

The extend Operator → CTEs

In APL, extend creates new calculated fields inline:

['otel-logs']
| extend isError = SeverityText == 'ERROR'
| extend environment = LogAttributes['gadget.environment_id']
| where isError == true
| summarize count() by bin_auto(_time), environment

In ClickHouse SQL, use Common Table Expressions (CTEs) with the WITH clause to achieve the same result. Group all extend operations into a single CTE, then reference the computed fields in your main query.

Pattern: APL extend → SQL CTE

WITH extended AS (
    SELECT
        *,
        -- All your "extend" operations go here
        SeverityText = 'ERROR' AS isError,
        LogAttributes['gadget.environment_id'] AS environment
    FROM otel.otel_logs_v1
    WHERE $__timeFilter(Timestamp)
)
SELECT
    $__timeInterval(Timestamp) as time,
    toString(environment) as environment,
    count() as count
FROM extended
WHERE isError = true
GROUP BY time, environment
ORDER BY time

Why CTEs?

  1. Readability - Computed fields are defined upfront, like APL's extend
  2. Reusability - Reference computed fields multiple times without recalculating
  3. Familiar pattern - Maps directly to APL's pipeline model

Multiple Extends

When you have multiple extend operations in APL:

['otel-logs']
| extend isError = SeverityText == 'ERROR'
| extend duration_sec = duration_ms / 1000
| extend service_type = iff(ServiceName contains 'api', 'API', 'Worker')

Combine them all into one CTE:

WITH extended AS (
    SELECT
        *,
        SeverityText = 'ERROR' AS isError,
        duration_ms / 1000 AS duration_sec,
        if(ServiceName LIKE '%api%', 'API', 'Worker') AS service_type
    FROM otel.otel_logs_v1
    WHERE $__timeFilter(Timestamp)
)
SELECT * FROM extended

Time Binning: bin / bin_auto → Grafana Macros

APL's bin() and bin_auto() functions are used to bucket timestamps for time series visualization.

In Grafana, always use the built-in macros instead:

APL Grafana ClickHouse
bin(_time, 1h) $__timeInterval(Timestamp)
bin_auto(_time) $__timeInterval(Timestamp)
where _time > ago(1h) WHERE $__timeFilter(Timestamp)

Why Grafana Macros?

  1. Auto-adjusts - $__timeInterval automatically picks the right bucket size based on the dashboard time range
  2. Syncs with UI - $__timeFilter uses the time range picker in the dashboard
  3. Optimized - Grafana generates efficient ClickHouse-specific SQL

APL Time Series Example

['otel-logs']
| where _time > ago(24h)
| summarize count() by bin_auto(_time), SeverityText

Equivalent Grafana SQL

SELECT
    $__timeInterval(Timestamp) as time,
    toString(SeverityText) as severity,
    count() as count
FROM otel.otel_logs_v1
WHERE $__timeFilter(Timestamp)
GROUP BY time, severity
ORDER BY time

Critical: Cast Group-By Columns to String

This is crucial for Grafana to work correctly.

In APL, grouping by a field automatically treats it as a label. In Grafana, numeric or non-string fields in the GROUP BY clause will be interpreted as metric values instead of labels, breaking your visualization.

Wrong - Grafana treats status_code as a metric

SELECT
    $__timeInterval(Timestamp) as time,
    SpanAttributes['http.response.status_code'] as status_code,  -- Problem!
    count() as count
FROM otel.otel_traces_v2
WHERE $__timeFilter(Timestamp)
GROUP BY time, status_code
ORDER BY time

Correct - Cast to string for labels

SELECT
    $__timeInterval(Timestamp) as time,
    toString(SpanAttributes['http.response.status_code']) as status_code,
    count() as count
FROM otel.otel_traces_v2
WHERE $__timeFilter(Timestamp)
GROUP BY time, status_code
ORDER BY time

Always Cast These

Field Type Example Fix
Numeric attributes SpanAttributes['http.response.status_code'] toString(...)
Environment IDs LogAttributes['gadget.environment_id'] toString(...)
Application IDs LogAttributes['gadget.application_id'] toString(...)
Durations Duration toString(...)

Common APL → SQL Translations

Filtering with where

APL:

['otel-logs']
| where SeverityText == 'ERROR'
| where LogAttributes['gadget.environment_id'] == '75668'

SQL:

SELECT *
FROM otel.otel_logs_v1
WHERE $__timeFilter(Timestamp)
  AND SeverityText = 'ERROR'
  AND LogAttributes['gadget.environment_id'] = '75668'

Conditional Logic with iff

APL:

| extend status_category = iff(status == '200', 'Success', 'Failure')

SQL (in CTE):

WITH extended AS (
    SELECT
        *,
        if(status = '200', 'Success', 'Failure') AS status_category
    FROM otel.otel_logs_v1
    WHERE $__timeFilter(Timestamp)
)

String Contains

APL:

| where ServiceName contains 'api'

SQL:

WHERE ServiceName LIKE '%api%'

Aggregation with summarize

APL:

| summarize
    total = count(),
    avg_duration = avg(duration_ms)
  by bin_auto(_time), ServiceName

SQL:

SELECT
    $__timeInterval(Timestamp) as time,
    toString(ServiceName) as service,
    count() as total,
    avg(duration_ms) as avg_duration
FROM otel.otel_logs_v1
WHERE $__timeFilter(Timestamp)
GROUP BY time, service
ORDER BY time

Top N with top

APL:

| summarize count() by error_code
| top 10 by count_

SQL:

SELECT
    LogAttributes['error.code'] as error_code,
    count() as count
FROM otel.otel_logs_v1
WHERE $__timeFilter(Timestamp)
GROUP BY error_code
ORDER BY count DESC
LIMIT 10

Full Migration Example

Original APL Query

['otel-logs']
| where _time > ago(24h)
| extend isError = SeverityText == 'ERROR'
| extend env = LogAttributes['gadget.environment_id']
| extend app = LogAttributes['gadget.application_id']
| where isError == true
| summarize
    error_count = count(),
    unique_apps = dcount(app)
  by bin_auto(_time), env
| top 100 by error_count

Migrated ClickHouse SQL for Grafana

WITH extended AS (
    SELECT
        Timestamp,
        SeverityText = 'ERROR' AS isError,
        LogAttributes['gadget.environment_id'] AS env,
        LogAttributes['gadget.application_id'] AS app
    FROM otel.otel_logs_v1
    WHERE $__timeFilter(Timestamp)
)
SELECT
    $__timeInterval(Timestamp) as time,
    toString(env) as environment,
    count() as error_count,
    uniq(app) as unique_apps
FROM extended
WHERE isError = true
GROUP BY time, environment
ORDER BY time
LIMIT 100

Time Series Panel YAML Template

Use this template when adding a time series panel to a Grafana dashboard YAML file:

- id: <unique_id>
  type: timeseries
  title: <Panel Title>
  gridPos:
    x: 0
    "y": 0
    h: 10
    w: 24
  datasource:
    type: grafana-clickhouse-datasource
    uid: df6lrfyg3b4sgf
  fieldConfig:
    defaults:
      custom:
        drawStyle: line
        lineInterpolation: linear
        barAlignment: 0
        barWidthFactor: 0.6
        lineWidth: 1
        fillOpacity: 0
        gradientMode: none
        spanNulls: false
        insertNulls: false
        showPoints: auto
        showValues: false
        pointSize: 5
        stacking:
          mode: none
          group: A
        axisPlacement: auto
        axisLabel: ""
        axisColorMode: text
        axisBorderShow: false
        scaleDistribution:
          type: linear
        axisCenteredZero: false
        hideFrom:
          tooltip: false
          viz: false
          legend: false
        thresholdsStyle:
          mode: "off"
      color:
        mode: palette-classic
      mappings: []
      thresholds:
        mode: absolute
        steps:
          - color: green
            value: null
          - color: red
            value: 80
      displayName: "${__field.labels}"
      unit: short
    overrides: []
  options:
    tooltip:
      mode: single
      sort: desc
      hideZeros: true
      hoverProximity: 100
    legend:
      showLegend: true
      displayMode: table
      placement: bottom
      calcs:
        - sum
      sortBy: Total
      sortDesc: true
  pluginVersion: "12.3.0"
  targets:
    - datasource:
        type: grafana-clickhouse-datasource
        uid: df6lrfyg3b4sgf
      editorType: sql
      format: 0
      meta:
        builderOptions:
          columns: []
          database: ""
          limit: 1000
          mode: list
          queryType: table
          table: ""
      pluginVersion: "4.11.2"
      queryType: timeseries
      rawSql: |
        <YOUR_CLICKHOUSE_SQL_HERE>
      refId: A

Migration Workflow

  1. Identify dataset: ['core-production']otel.otel_logs_v1, ['core-tracing']otel.otel_traces_v2
  2. Parse APL query: Identify filters, extends, summarize, grouping
  3. Map fields: _timeTimestamp, messageBody, service_nameServiceName
  4. Translate functions: replace_regexreplaceRegexpAll, make_setgroupArray, etc.
  5. Build CTE: If extends are present, wrap computed fields in a WITH extended AS (...) clause
  6. Determine panel type: Time series if bin_auto/bin present, table otherwise
  7. Generate panel YAML: Use the template above with the translated query
  8. Add to dashboard: Insert panel into grafana/resources/Dashboard/<id>.yaml
  9. Deploy: Run grafanactl resources push dashboards/<id>

Important Notes

  • Cast to string for labels: Any GROUP BY column must be cast with toString() for Grafana to treat it as a label, not a metric
  • Time filter required: Always include $__timeFilter(Timestamp) as the first WHERE condition
  • Time series structure: First column must be $__timeInterval(Timestamp) as time, must GROUP BY and ORDER BY time
  • Regex escaping: In YAML, backslashes need double escaping: \\d+ becomes \\\\d+ in rawSql

Quick Reference

APL → SQL Operators

APL ClickHouse SQL
extend field = expr WITH cte AS (SELECT *, expr AS field ...)
where condition WHERE condition
summarize count() by field SELECT count() ... GROUP BY field
summarize dcount(field) SELECT uniq(field)
summarize avg(field) SELECT avg(field)
bin_auto(_time) $__timeInterval(Timestamp)
bin(_time, 1h) $__timeInterval(Timestamp) or toStartOfHour(Timestamp)
_time > ago(1h) $__timeFilter(Timestamp)
contains LIKE '%value%'
iff(cond, true, false) if(cond, true, false)
top N by field ORDER BY field DESC LIMIT N
replace_regex(pattern, repl, field) replaceRegexpAll(field, pattern, repl)
make_set(field) groupArray(DISTINCT toString(field))
make_set(field, 1) any(field)
make_set(field, N) arraySlice(groupArray(DISTINCT field), 1, N)

Grafana Query Type Selection

Query Pattern Grafana Query Type
GROUP BY $__timeInterval(Timestamp) Time Series (default for charts)
Raw logs with Body, SeverityText Logs
Trace spans with TraceId, SpanId Traces
Aggregation without time bucketing Table
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment