Skip to content

Elasticsearch Query Workflows with DocKit

This guide covers day-to-day Elasticsearch work: writing queries that don't rot, saving them for later, and moving workflows across environments without losing your mind. If you just need the feature list or download links, head over to the Elasticsearch GUI client page.

JSON5 and inline comments — write queries like a developer

Elasticsearch DSL is expressive, but strict JSON gets noisy once a query has a few nested bool blocks and aggregations. JSON5 lets you drop the quote discipline and add inline // comments. This is how I leave context about why a specific filter or boost exists.

If you save a query today and revisit it next week, those comments are usually the only thing that'll save you. They explain intent directly next to the clause that matters instead of forcing you to hunt through separate notes or Jira tickets.

Strict JSON query:

json
GET /logs-prod-*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "@timestamp": {
              "gte": "now-24h",
              "lte": "now"
            }
          }
        },
        {
          "bool": {
            "should": [
              {
                "term": {
                  "log.level": "error"
                }
              },
              {
                "match_phrase": {
                  "message": "connection refused"
                }
              }
            ],
            "minimum_should_match": 1
          }
        }
      ],
      "filter": [
        {
          "term": {
            "service.name": "checkout-api"
          }
        },
        {
          "terms": {
            "deployment.environment": [
              "staging",
              "production"
            ]
          }
        }
      ]
    }
  },
  "aggs": {
    "by_environment": {
      "terms": {
        "field": "deployment.environment",
        "size": 5
      },
      "aggs": {
        "top_error_messages": {
          "terms": {
            "field": "error.code.keyword",
            "size": 10
          }
        }
      }
    }
  }
}

The same request in DocKit using JSON5:

javascript
GET /logs-prod-*/_search
{
  size: 0,
  query: {
    bool: {
      must: [
        {
          range: {
            '@timestamp': {
              gte: 'now-24h',
              lte: 'now',
            },
          },
        },
        {
          bool: {
            should: [
              {term: {'log.level': 'error'}},
              {match_phrase: {message: 'connection refused'}},
            ],
            minimum_should_match: 1,
          },
        },
      ],
      filter: [
        {term: {'service.name': 'checkout-api'}},
        {
          terms: {
            'deployment.environment': [
              'staging',
              'production',
            ],
          },
        },
      ],
    },
  },
  aggs: {
    by_environment: {
      terms: {
        field: 'deployment.environment',
        size: 5,
      },
      aggs: {
        // Keep the top codes next to the environment split.
        top_error_messages: {
          terms: {
            field: 'error.code.keyword',
            size: 10,
          },
        },
      },
    },
  },
}

JSON5 makes it much easier to modify and review queries without the bracket-matching headache.

Log analysis workflow

For log work, the pattern I use is simple: connect, save the query with a descriptive name, and rerun it every time the incident returns.

Open the cluster that owns the logs. In DocKit, select a connection profile for production or staging and jump into Dev Tools. If your team uses daily indices, keep your query broad enough to survive the next rollover.

Write the error query for the time window you need:

json
GET /logs-*/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"level": "ERROR"}},
        {"range": {"@timestamp": {"gte": "now-1h"}}}
      ]
    }
  },
  "sort": [{"@timestamp": "desc"}]
}

Check the first page of hits. If the results look right, save it as something explicit like "Checkout API Errors - last 1h" rather than just "errors."

Next time that same service trips an alert, don't rebuild the DSL from memory. Load the saved query, adjust the range, and hit run. This keeps your query logic stable while the incident window moves.

Saved operational queries usually live longer than you expect, especially if you switch to JSON5 and annotate why certain filters are there.

Relevance tuning is tedious because you're running the same search dozens of times with tiny scoring changes. The baseline is usually a multi_match search:

json
GET /products/_search
{
  "query": {
    "multi_match": {
      "query": "wireless headphones",
      "fields": ["title^3", "description", "brand^2"]
    }
  }
}

When you start wondering why a title match outranked a strong brand match, use the AI assistant as a query reviewer. Ask it to explain the scoring model and which fields are dominating the ranking.

Once you have a handle on the behavior, make a deliberate second pass:

json
GET /products/_search
{
  "query": {
    "multi_match": {
      "query": "wireless headphones",
      "type": "best_fields",
      "fields": [
        "title^5",
        "brand^3",
        "category^2",
        "description"
      ],
      "tie_breaker": 0.2
    }
  }
}

The assistant can explain what best_fields plus a tie_breaker will actually do before you run it across a million-item catalog. It's much faster than tab-switching to the official docs for a sanity check.

Run the baseline, inspect top hits, ask the assistant why the ordering looks that way, then tighten the boosts. You're still making the decisions; the assistant just handles the "why is Elasticsearch doing this?" part.

Multi-environment query testing

I use this mostly to avoid those "it worked in staging" conversations. Save connection profiles for dev, staging, and prod up front. Keep auth and endpoints in the profile so the query text never needs environment-specific hacks.

Treat the query as the constant and the environment as the variable. Switch between them, run the exact same request, and compare. You're looking for hit counts, result ordering, and mapping differences that might break the query in prod.

This makes it obvious when a mismatch comes from data drift or a mapping update that didn't finish. If staging returns 123 docs and prod returns 0, you know the issue isn't your query—it's the data in production.

Index migration with reindex

Reindex operations are simple in principle but easy to mess up. You need the request, then a way to watch progress without getting bored.

json
POST /_reindex
{
  "source": {
    "index": "old-products"
  },
  "dest": {
    "index": "new-products"
  },
  "script": {
    "source": "ctx._source.price = ctx._source.price * 1.1"
  }
}

Run the request in Dev Tools, then switch to the cluster management view to watch the task. You can track document counts and see if the target index is actually filling at the rate you expect.

If the reindex stalls, you'll catch it from the same tool where you launched it. No need to poll a separate shell every ten minutes.

When to use the AI assistant

I use it when the bottleneck is understanding, not just typing.

If an aggregation tree is structurally valid but impossible to reason about, ask the assistant to explain it before you touch it. If you know the result you need but the DSL is on the tip of your tongue, let it draft the first version and then edit it. And if a query is slow or returns garbage, ask it to point out broad filters or suspicious scoring logic.

If you already know the exact DSL you want, just write it. The assistant is best for shortening the feedback loop around the annoying parts of the stack.

Resources

Last updated: