jq + yq + dasel: shell-friendly JSON, YAML, XML transformations

Install

# Debian / Ubuntu
sudo apt install jq

# yq (Mike Farah's Go implementation, not the Python yq)
sudo curl -L -o /usr/local/bin/yq \
    "https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64"
sudo chmod +x /usr/local/bin/yq

# dasel
sudo curl -L -o /usr/local/bin/dasel \
    "https://github.com/tomwright/dasel/releases/latest/download/dasel_linux_amd64"
sudo chmod +x /usr/local/bin/dasel

# Or via mise (see /tutorials/mise-polyglot-runtime-versions.html)
mise use -g jq yq dasel

jq: the most-useful patterns

# Identity (pretty-print JSON)
curl -s api.example.com/users | jq .

# Pick a field
echo '{"user":{"name":"alice","age":42}}' | jq '.user.name'
# "alice"

# Pick multiple fields, output a new object
... | jq '{name: .user.name, age: .user.age}'

# Array element by index
echo '[1,2,3,4]' | jq '.[2]'

# Map over an array
... | jq '[.[] | .name]'                       # ["alice", "bob", ...]

# Filter array elements
... | jq '[.[] | select(.age > 30)]'

# Combine: select + project
curl -s api.example.com/users \
    | jq '[.[] | select(.status == "active") | {id, name, email}]'

# Aggregate
echo '[1,2,3,4,5]' | jq 'add'                  # 15
echo '[1,2,3,4,5]' | jq 'length'               # 5
echo '[1,2,3,4,5]' | jq 'max'                  # 5

# Group by a field
... | jq 'group_by(.country) | map({country: .[0].country, count: length})'

# Transform array of objects to CSV
... | jq -r '.[] | [.id, .name, .email] | @csv'

# Sort
... | jq 'sort_by(.created_at) | reverse | .[0:10]'

-r outputs raw strings (no quotes); -c produces compact one-line output; --arg name value passes a shell variable in as a jq string; --slurp (-s) reads multiple JSON values into a single array.
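A minimal sketch of those four flags on inline data. The `users` sample and its field names are made up for illustration; only the flags themselves come from the text above.

```shell
#!/bin/sh
# Hypothetical sample data; field names are illustrative only.
users='[{"name":"alice","age":42},{"name":"bob","age":27}]'

# -r: print strings without JSON quoting (one name per line)
printf '%s' "$users" | jq -r '.[].name'

# -c: print each result as compact one-line JSON
printf '%s' "$users" | jq -c '.[]'

# --arg: the value always arrives as a string, so convert it before comparing numbers
min=30
printf '%s' "$users" \
    | jq -c --arg min "$min" '[.[] | select(.age > ($min | tonumber))]'

# --slurp (-s): gather a stream of separate JSON values into one array
printf '1\n2\n3\n' | jq -s -c '.'
```

Because `--arg` always binds a string, numeric comparisons need `tonumber` (or `--argjson`, which parses the value as JSON instead).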

jq in practice: real one-liners

# List all running container names
docker ps --format '{{json .}}' | jq -r '.Names'

# Get the public IP of the first AWS instance tagged Name=web
aws ec2 describe-instances --output json \
    | jq -r 'first(.Reservations[].Instances[] | select(any(.Tags[]?; .Key=="Name" and .Value=="web")) | .PublicIpAddress)'

# Top 10 most-requested endpoints from a JSON log file (one object per line)
jq -r '.uri' nginx-access.log.json | sort | uniq -c | sort -rn | head

# Pull just the body of a Kubernetes ConfigMap as plain text
kubectl get cm nginx-config -o json | jq -r '.data."nginx.conf"'
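The `sort | uniq -c` pipeline over `.uri` above can also be done inside jq, which keeps the result structured until the last step. This is a sketch under assumptions: the `log` sample is invented, and `top_uris` is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Hypothetical log: one JSON object per line with a "uri" field.
log='{"uri":"/a"}
{"uri":"/b"}
{"uri":"/a"}'

# Count requests per URI and print the ten most frequent as "count<TAB>uri".
top_uris() {
    jq -s -r 'group_by(.uri)
              | map({uri: .[0].uri, count: length})
              | sort_by(-.count)
              | .[0:10][]
              | "\(.count)\t\(.uri)"'
}

printf '%s\n' "$log" | top_uris
```

The `-s` flag is what turns the line-per-record log into a single array that `group_by` can work on; for very large logs the `sort | uniq -c` version uses less memory.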

yq: jq syntax for YAML

yq borrows jq's filter style but operates on YAML. The two languages are not identical, but most of the same patterns apply:

# Pretty-print
yq . config.yaml

# Read a value
yq '.services.app.image' docker-compose.yml

# Edit in place — change a Kubernetes deployment's image
yq -i '.spec.template.spec.containers[0].image = "myapp:v2.0"' deployment.yaml

# Add a new value
yq -i '.spec.replicas = 3' deployment.yaml

# Convert YAML to JSON (or back)
yq -o=json config.yaml > config.json
yq -P -o=yaml config.json > config.yaml         # -P = pretty

# Merge two YAML files
yq eval-all 'select(fileIndex==0) * select(fileIndex==1)' base.yaml override.yaml

# Multi-document YAML — extract all Deployment kinds
yq '. | select(.kind == "Deployment")' all-resources.yaml

yq processes multi-document YAML (the kind Kubernetes manifests use, with --- separators) natively. jq has no YAML parser at all, so it can't.

dasel: the universal converter

dasel speaks JSON, YAML, TOML, XML, and CSV with one syntax. The killer use case: converting between formats, and querying formats such as TOML or XML that jq cannot parse.

# JSON ↔ YAML ↔ TOML ↔ XML
dasel -f config.json -r json -w yaml
dasel -f Cargo.toml -r toml -w json '.dependencies'

# Read from XML
dasel -f response.xml -r xml '.root.user.email'

# Edit in place across formats
dasel put -f pyproject.toml -r toml -t string -v "1.2.3" '.project.version'
dasel put -f deployment.yaml -r yaml -t string -v "myapp:v2.0" \
    '.spec.template.spec.containers.[0].image'

# Validate a JSON file
dasel validate -f config.json -r json

dasel's selector syntax differs from jq's: an array index is its own dot-separated segment, so jq's .containers[0] becomes .containers.[0]. In return, one syntax covers cross-format work where jq + yq would need several tools chained.

Working with HTTP APIs

The standard pattern for any REST API:

# Fetch + extract
curl -s -H "Authorization: Bearer $TOKEN" \
    "https://api.github.com/repos/cli/cli/issues?state=open" \
    | jq -r '.[] | "\(.number)\t\(.title)"'

# Paginated fetch + flatten
seq 1 5 | while read -r page; do
    curl -s "https://api.github.com/repos/cli/cli/issues?per_page=100&page=$page"
done | jq -s 'flatten'

# POST with derived JSON
curl -X POST -H "Content-Type: application/json" \
    -d "$(jq -n --arg name "$NAME" --arg email "$EMAIL" \
        '{name: $name, email: $email, role: "viewer"}')" \
    https://api.example.com/users
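The reason to build the POST body with `jq -n --arg` rather than shell string interpolation is escaping. A small sketch with made-up values (the `NAME` and `EMAIL` contents are purely illustrative) shows the difference without touching the network:

```shell
#!/bin/sh
# Hypothetical input containing characters that need JSON escaping.
NAME='Alice "Al" O'\''Neil'
EMAIL='alice@example.com'

# Naive interpolation: the embedded double quotes break the JSON.
naive="{\"name\": \"$NAME\", \"email\": \"$EMAIL\"}"
if ! printf '%s' "$naive" | jq . >/dev/null 2>&1; then
    echo "naive payload is not valid JSON"
fi

# jq -n --arg escapes every value correctly, whatever it contains.
payload="$(jq -n -c --arg name "$NAME" --arg email "$EMAIL" \
    '{name: $name, email: $email, role: "viewer"}')"
printf '%s\n' "$payload"
```

The resulting `$payload` can be passed to `curl -d` exactly as in the POST example above.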

Editing Kubernetes manifests programmatically

# Bump every container's image tag across a multi-doc manifest
yq -i '(.. | select(has("image")) | .image) |= sub(":v.*"; ":v2.0")' resources.yaml

# Add a label to every resource
yq -i '.metadata.labels.tier = "prod"' resources.yaml

# Wrap every .conf file in a directory in a ConfigMap
# (built with jq, since yq has no --arg flag; the JSON output is also valid YAML)
for f in config/*.conf; do
    jq -n --arg name "$(basename "$f" .conf)" --arg content "$(cat "$f")" \
        '{
            apiVersion: "v1", kind: "ConfigMap",
            metadata: {name: $name},
            data: {($name + ".conf"): $content}
        }'
done

CSV the easy way

# JSON array of objects → CSV with headers
jq -r '(.[0] | keys_unsorted) as $keys
       | $keys, (.[] | [.[$keys[]]]) | @csv' users.json > users.csv

# CSV → JSON
dasel -f users.csv -r csv -w json

# Or via Miller (mlr), the next step up for serious CSV work
mlr --c2j cat users.csv
mlr --csv filter '$status == "active"' users.csv
mlr --csv stats1 -a mean,p95 -f duration_ms -g status then sort -nr duration_ms_mean users.csv

Miller is purpose-built for CSV/TSV/JSON tabular data; for serious data-wrangling beyond what jq does cleanly, install mlr as well.
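To see what the jq "JSON array of objects → CSV with headers" filter from this section produces, here is a sketch on a two-row sample. The `users` data and the `to_csv` helper name are illustrative assumptions; the filter itself is the one shown earlier.

```shell
#!/bin/sh
# Hypothetical input: an array of flat objects sharing the same keys.
users='[{"id":1,"name":"alice","email":"alice@example.com"},
        {"id":2,"name":"bob","email":"bob@example.com"}]'

# Emit a header row taken from the first object, then one row per object.
to_csv() {
    jq -r '(.[0] | keys_unsorted) as $keys
           | $keys, (.[] | [.[$keys[]]])
           | @csv'
}

printf '%s' "$users" | to_csv
# "id","name","email"
# 1,"alice","alice@example.com"
# 2,"bob","bob@example.com"
```

The header comes from the first object only, so an object that lacks one of those keys yields an empty field and any extra keys it has are dropped.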

htmlq: jq for HTML

For scraping — the missing fourth tool. htmlq uses CSS selectors:

cargo install htmlq
# Or: brew install htmlq

curl -s https://example.com | htmlq 'h1' --text
curl -s https://example.com | htmlq 'a' --attribute href

The 10 patterns to actually memorize

jq '.field' — access a field
jq '.[]' — iterate array elements
jq '.[] | select(.field == "value")' — filter
jq '[.[] | .field]' — pluck a field across an array
jq -r '.field' — raw output (no quotes)
jq -s 'flatten' — combine multiple JSON inputs
jq 'length' / jq 'add' — basic aggregations
yq -i '.path.to.field = "value"' — in-place YAML edit
yq -o=json file.yaml — format conversion
jq -r '.[] | [.a, .b] | @csv' — CSV output

That covers the vast majority of day-to-day use. The full languages are deeper (jq has variables, functions, modules), but those ten patterns get the shell-side of any "I need to extract / transform this structured data" job done.
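As one worked instance of the list, the `jq -s 'flatten'` pattern can be tried offline. This sketch stands in two invented "pages" for API responses; `merge_pages` is a hypothetical helper name.

```shell
#!/bin/sh
# Two hypothetical API pages, each a JSON array of records.
page1='[{"id":1},{"id":2}]'
page2='[{"id":3}]'

# -s wraps the stream of pages in an outer array; flatten(1) removes that one level.
merge_pages() {
    jq -s -c 'flatten(1)'
}

printf '%s\n%s\n' "$page1" "$page2" | merge_pages
# [{"id":1},{"id":2},{"id":3}]
```

Plain `flatten` gives the same result here. `flatten(1)` is the more conservative choice when the records themselves might be arrays, because it removes only the outer level added by `-s`.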