Introduction Elasticsearch Engineer

Stack Introduction

Elasticsearch Platform

Out-of-the-Box Solutions

Elastic Observability
Elastic Security

Build your own

Elastic Search

Elasticsearch AI Platform

Ingest and Secure Storage
AI / ML and Search
Visualization and Automation

Kibana

Explore
Visualize
Engage

Elasticsearch

Store
Analyze
Machine Learning
Generative AI

Integrations

Connect
Collect
Alert

Elasticsearch Data Journey

Collect, connect, and visualize your data from any source.

flowchart LR

    subgraph Data
        A[Data]
    end

    subgraph Ingest
        B[Beats]
        C[Logstash]
        D[Elastic Agent<br>Integrations]
    end

    subgraph Store
        E[Elasticsearch]
    end

    subgraph Visualize
        F[Kibana]
    end

    A --> B & C & D
    B --> C
    B & C & D --> E
    E --> F

Elasticsearch is a Document Store

Elasticsearch is a distributed document store
Documents are serialized JSON objects that are:
- stored in Elasticsearch under a unique Document ID
- distributed across the cluster and can be accessed immediately from any node

Kibana

Kibana is a front-end app that sits on top of the Elastic Stack
It provides search and data visualization capabilities for data in Elasticsearch

Exploring and Querying Data with Kibana

Start with Discover
- Create a data view to access your data
- Explore the fields in your data
- Examine popular values
- Use the query bar and filters to see subsets of your data

Installation Options

Elastic Cloud

Elastic Cloud Hosted
Elastic Cloud Serverless

Elastic Self-Managed

Elastic Stack
Elastic Cloud on Kubernetes
Elastic Cloud Enterprise

Index Operations

Documents are Indexed into an Index

In Elasticsearch a document is indexed into an index
An index:
- is a logical way of grouping data
- can be thought of as an optimized collection of documents
- is used as both a noun (the collection of documents) and a verb (to store a document in that collection)

Index a Document: curl Example

To index a document, send a POST request that specifies:
- index_name
- _doc resource
- document
By default, Elasticsearch generates the ID for you

$ curl -X POST "localhost:9200/my_blogs/_doc" -H 'Content-Type: application/json' -d'
{
    "title": "Fighting Ebola with Elastic",
    "category": "Engineering",
    "author": {
        "first_name": "Emily",
        "last_name": "Mosher"
    }
}'

Index a Document: Dev Tools > Console

Console lets you interact with the Elasticsearch and Kibana REST APIs
- user-friendly interface to create and submit requests
- view the API docs for a request

Index a Document: PUT vs. POST

When you index a document using:
- PUT: you pass in a document ID with the request; if a document with that ID already exists, the document is replaced and its _version is incremented by 1
- POST: Elasticsearch automatically generates a unique ID for the document

Request:

PUT my_blogs/_doc/6OCz5pEBqWhDYCLiWpe5
{
    "title" : "Fighting Ebola with Elastic",
    "category": "User Stories",
    "author" : {
        "first_name" : "Emily",
        "last_name" : "Mosher"
    }
}

Response:

{
    "_index" : "my_blogs",
    "_type" : "_doc",
    "_id" : "6OCz5pEBqWhDYCLiWpe5",
    "_version" : 2,
    "result" : "updated",
    ...
}
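
The PUT vs. POST choice above can be sketched in Python. This is a minimal illustration that only builds the request pieces and sends nothing; the helper name `build_index_request` is hypothetical and not part of any Elastic client library.

```python
import json


def build_index_request(index, document, doc_id=None):
    """Return (method, path, body) for indexing one document.

    Hypothetical helper: without an ID it targets POST <index>/_doc so that
    Elasticsearch generates the ID; with an ID it targets PUT <index>/_doc/<id>.
    """
    body = json.dumps(document)
    if doc_id is None:
        return "POST", f"/{index}/_doc", body
    return "PUT", f"/{index}/_doc/{doc_id}", body


blog = {
    "title": "Fighting Ebola with Elastic",
    "category": "User Stories",
    "author": {"first_name": "Emily", "last_name": "Mosher"},
}

auto_id = build_index_request("my_blogs", blog)
explicit_id = build_index_request("my_blogs", blog, doc_id="6OCz5pEBqWhDYCLiWpe5")

print(auto_id[0], auto_id[1])
print(explicit_id[0], explicit_id[1])
```

The returned tuple could be handed to any HTTP client together with a `Content-Type: application/json` header, as in the curl example.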

Retrieve a Document

Use a GET request with the document’s unique ID

Request:

GET my_blogs/_doc/6OCz5pEBqWhDYCLiWpe5

Response:

{
    ...
    "_id" : "6OCz5pEBqWhDYCLiWpe5",
    "_source": {
        "title": "Fighting Ebola with Elastic",
        "category": "User Stories",
        "author": {
            "first_name": "Emily",
            "last_name": "Mosher"
        }
    }
}

Create a Document

Index a new JSON document with the _create resource
- guarantees that the document is only indexed if it does not already exist
- cannot be used to update an existing document

Request:

POST my_blogs/_create/4
{
    "title" : "Fighting Ebola with Elastic",
    "category": "Engineering",
    "author" : {
        "first_name" : "Emily",
        "last_name" : "Mosher"
    }
}

Response:

{
    "_index" : "my_blogs",
    "_type" : "_doc",
    "_id" : "4",
    "_version" : 1,
    "result" : "created",
    ...
}

Update Specific Fields

Use the _update resource to modify specific fields in a document
- wrap the fields you want to change in a doc object
- _version is incremented by 1

Request:

POST my_blogs/_update/4
{
    "doc" : {
        "category": "User Stories"
    }
}

Response:

{
    "_index" : "my_blogs",
    "_type" : "_doc",
    "_id" : "4",
    "_version" : 2,
    "result" : "updated",
    ...
}

Delete a Document

Use DELETE to delete an indexed document

Request:

DELETE my_blogs/_doc/4

Response:

{
    "_index": "my_blogs",
    "_type": "_doc",
    "_id": "4",
    "_version": 3,
    "result": "deleted",
    "_shards": {
        "total": 2,
        "successful": 2,
        "failed": 0
    },
    "_seq_no": 3,
    "_primary_term": 1
}
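
To make the _version and result values in the responses above easier to follow, here is a toy in-memory model written in Python. It is not Elasticsearch and does not talk to a cluster; the class `ToyIndex` and its methods are invented for illustration, and the update is a shallow merge only.

```python
class ToyIndex:
    """In-memory stand-in for an index. Illustration only, NOT Elasticsearch."""

    def __init__(self):
        self.docs = {}      # _id -> current document source
        self.versions = {}  # _id -> last version, kept so a delete also gets one

    def _bump(self, doc_id):
        self.versions[doc_id] = self.versions.get(doc_id, 0) + 1
        return self.versions[doc_id]

    def index(self, doc_id, source):
        # Like PUT <index>/_doc/<id>: create the document or replace it.
        result = "updated" if doc_id in self.docs else "created"
        self.docs[doc_id] = dict(source)
        return {"_id": doc_id, "_version": self._bump(doc_id), "result": result}

    def create(self, doc_id, source):
        # Like <index>/_create/<id>: refuse to overwrite an existing document
        # (real Elasticsearch responds with a 409 conflict).
        if doc_id in self.docs:
            raise ValueError(f"document {doc_id} already exists")
        return self.index(doc_id, source)

    def update(self, doc_id, partial):
        # Like <index>/_update/<id> with a "doc" object (shallow merge here).
        self.docs[doc_id].update(partial)
        return {"_id": doc_id, "_version": self._bump(doc_id), "result": "updated"}

    def delete(self, doc_id):
        del self.docs[doc_id]
        return {"_id": doc_id, "_version": self._bump(doc_id), "result": "deleted"}


idx = ToyIndex()
r1 = idx.create("4", {"title": "Fighting Ebola with Elastic", "category": "Engineering"})
r2 = idx.update("4", {"category": "User Stories"})
r3 = idx.delete("4")
print(r1, r2, r3, sep="\n")
```

The three results follow the same 1, 2, 3 version sequence as the create, update, and delete responses shown above.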

Cheaper in Bulk

Use the Bulk API to index many documents in a single API call
- increases the indexing speed
- useful if you need to index a data stream such as log events
Four actions
- create, index, update, and delete
The response is a large JSON structure
- returns individual results of each action that was performed
- failure of a single action does not affect the remaining actions

Bulk API Example

Newline delimited JSON (NDJSON) structure
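- each action is a JSON object on its own line, terminated by a newline
- index, create, and update actions expect the document (or partial doc) as a JSON object on the next single line
- delete actions have no following line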

Example:

POST comments/_bulk
{"index" : {}}
{"title": "Tuning Go Apps with Metricbeat", "category": "Engineering"}
{"index" : {"_id":4}}
{"title": "Elasticsearch Released", "category": "Releases"}
{"create" : {"_id":5}}
{"title": "Searching for needle in", "category": "User Stories"}
{"update" : {"_id":2}}
{"doc": {"title": "Searching for needle in haystack"}}
{"delete": {"_id":1}}
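
An NDJSON body like the one above is easy to get wrong by hand, so a small builder can help. The following Python sketch assumes a hypothetical helper `build_bulk_body` that takes (action, metadata, source) tuples; it only produces the request body string and does not send it.

```python
import json


def build_bulk_body(operations):
    """Build an NDJSON bulk body from (action, metadata, source) tuples.

    Hypothetical helper: 'source' is the document (or {"doc": ...} for update)
    and must be None for delete, which has no second line.
    """
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}))
        if action == "delete":
            continue
        if source is None:
            raise ValueError(f"{action} needs a source line")
        lines.append(json.dumps(source))
    # The last line of a bulk body must also end with a newline.
    return "\n".join(lines) + "\n"


ops = [
    ("index", {}, {"title": "Tuning Go Apps with Metricbeat", "category": "Engineering"}),
    ("index", {"_id": 4}, {"title": "Elasticsearch Released", "category": "Releases"}),
    ("create", {"_id": 5}, {"title": "Searching for needle in", "category": "User Stories"}),
    ("update", {"_id": 2}, {"doc": {"title": "Searching for needle in haystack"}}),
    ("delete", {"_id": 1}, None),
]

bulk_body = build_bulk_body(ops)
print(bulk_body, end="")
```

The resulting string would be sent as the body of `POST comments/_bulk`.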

Upload a File in Kibana

Quickly upload a log file or delimited CSV, TSV, or JSON file
- used for initial exploration of your data
- not intended as part of production process

Understanding Data

Most data can be categorized into:
- (relatively) static data: data set that may grow or change, but slowly or infrequently, like a catalog or inventory of items
- time series data: event data associated with a moment in time that (usually) grows rapidly, like log files or metrics
Elastic Stack works well with either type of data

Searching Data

Different Use Cases

Search
- Typically uses human-generated, error-prone data
- Often uses free-form text fields where anybody can type anything
Observability
- Need to analyze HUGE amounts of data in real time
- Ingest load can vary
Security
- Collect data from MANY different sources with different data formats

Query Languages

Several to choose from:

KQL
Lucene
ES|QL
Query DSL
Elasticsearch SQL
EQL

Basic Structure of Search

In Elasticsearch, search breaks down into two basic parts:
- Queries
  - Which documents meet a specific set of criteria?
- Aggregations
  - Tell me something about a group of documents

Using Query DSL

Send a request using the search API:
- GET <index>/_search

match_all query

A match_all query is the default request for the search API
- Every document is a hit for this search
- Elasticsearch returns 10 hits by default
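
The defaults described above can be written out explicitly. This Python sketch uses a hypothetical helper `build_search_request` that spells out what a search request with no body amounts to; it builds the request only and does not contact a cluster.

```python
import json


def build_search_request(index, query=None, size=10):
    """Return (method, path, body) for a search request (illustrative helper)."""
    if query is None:
        query = {"match_all": {}}  # every document is a hit
    body = json.dumps({"query": query, "size": size})
    return "GET", f"/{index}/_search", body


method, path, body = build_search_request("blogs")
print(method, path)
print(body)
```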

Aggregations

Visualizations on a Kibana dashboard are powered by aggregations

Aggregating Data

Request:

GET blogs/_search
{
  "aggs": {
    "first_blog": {
      "min": {
        "field": "publish_date"
      }
    }
  }
}

Response:

{
  ...
  "aggregations": {
    "first_blog": {
      "value": 1265658554000,
      "value_as_string": "2010-02-08T19:49:14.000Z"
    }
  }
}
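
To show what the min aggregation above computes, here is a plain-Python sketch over a small list of documents. The function `min_date_agg` is hypothetical, and the sample documents other than the 2010-02-08 date are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def min_date_agg(docs, field):
    """Mimic a min aggregation on a date field (illustrative, not Elasticsearch)."""
    dates = [datetime.fromisoformat(d[field]) for d in docs if field in d]
    earliest = min(dates)
    millis = (earliest - EPOCH) // timedelta(milliseconds=1)
    as_string = (
        earliest.strftime("%Y-%m-%dT%H:%M:%S.")
        + f"{earliest.microsecond // 1000:03d}Z"
    )
    return {"value": millis, "value_as_string": as_string}


# Hypothetical sample documents; only the first date comes from the response above.
sample_blogs = [
    {"title": "first", "publish_date": "2010-02-08T19:49:14+00:00"},
    {"title": "later", "publish_date": "2015-06-01T08:00:00+00:00"},
    {"title": "latest", "publish_date": "2021-11-30T12:30:00+00:00"},
]

first_blog = min_date_agg(sample_blogs, "publish_date")
print(first_blog)
```

The value is the earliest date as milliseconds since the Unix epoch, and value_as_string is the same instant formatted as a date.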

ES|QL

A piped query language that delivers advanced search capabilities
- Streamlines searching, aggregating, and visualizing large data sets
- Brings together the capabilities of multiple languages (Query DSL, KQL, EQL, Lucene, SQL, …)
Powered by a dedicated query engine with concurrent processing
- Designed for performance
- Enhances speed and efficiency irrespective of data source and structure

Query

Composed of a series of commands chained together by pipes

Running an ES|QL Query in Dev Tools

Wrap the query in a POST request to the query API
- By default, results are returned as a JSON object
- Use the format option to retrieve the results in alternative formats

Request:

POST /_query
{
  "query": "FROM blogs | KEEP publish_date, authors.full_name | SORT (publish_date)"
}

Request with format:

POST /_query?format=csv
{
  "query": """
      FROM blogs
        | KEEP publish_date, authors.first_name, authors.last_name
        | SORT (publish_date)
  """
}
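
Because an ES|QL query is a chain of piped commands wrapped in a JSON body, it can be assembled programmatically. The Python sketch below assumes a hypothetical helper `build_esql_request`; it only builds the method, path, and body for the query API and sends nothing.

```python
import json


def build_esql_request(source, commands, fmt=None):
    """Return (method, path, body) for an ES|QL query (illustrative helper).

    'commands' are the processing commands that follow FROM; they are
    joined with pipes in the order given.
    """
    query = " | ".join([f"FROM {source}", *commands])
    path = "/_query" if fmt is None else f"/_query?format={fmt}"
    return "POST", path, json.dumps({"query": query})


esql_method, esql_path, esql_body = build_esql_request(
    "blogs",
    ["KEEP publish_date, authors.first_name, authors.last_name", "SORT publish_date"],
    fmt="csv",
)
print(esql_method, esql_path)
print(esql_body)
```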

Running an ES|QL Query in Discover

Select Language ES|QL in the Data View pull-down
Expand the query editor to enter multiline commands
Click the Run button or press command/alt-Enter to run the query

Examples

FROM blogs
| KEEP publish_date, authors.first_name, authors.last_name

FROM blogs
| WHERE authors.last_name.keyword == "Kearns"
| KEEP publish_date, authors.first_name, authors.last_name

FROM blogs
| STATS count = COUNT(*) BY authors.last_name.keyword
| SORT count DESC
| LIMIT 10
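
For readers new to piped queries, the second and third examples can be read as ordinary filter, project, and group-by steps. The Python sketch below mirrors them on a made-up list of rows; the data is hypothetical and the code is only an analogy for what the commands mean, not how ES|QL executes them.

```python
from collections import Counter

# Hypothetical rows standing in for documents in the blogs index.
blog_rows = [
    {"publish_date": "2018-01-10", "authors.first_name": "Alex", "authors.last_name": "Kearns", "title": "a"},
    {"publish_date": "2019-03-22", "authors.first_name": "Emily", "authors.last_name": "Mosher", "title": "b"},
    {"publish_date": "2020-07-05", "authors.first_name": "Alex", "authors.last_name": "Kearns", "title": "c"},
]

keep = ["publish_date", "authors.first_name", "authors.last_name"]

# WHERE authors.last_name.keyword == "Kearns" | KEEP ...
kearns_rows = [
    {field: row[field] for field in keep}
    for row in blog_rows
    if row["authors.last_name"] == "Kearns"
]

# STATS count = COUNT(*) BY authors.last_name.keyword | SORT count DESC | LIMIT 10
top_authors = Counter(row["authors.last_name"] for row in blog_rows).most_common(10)

print(kearns_rows)
print(top_authors)
```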
