Introduction Elasticsearch Engineer
Stack Introduction
Elasticsearch Platform
Out-of-the-Box Solutions
- Elastic Observability
- Elastic Security
Build your own
- Elastic Search
Elasticsearch AI Platform
- Ingest and Secure Storage
- AI / ML and Search
- Visualization and Automation
Kibana
- Explore
- Visualize
- Engage
Elasticsearch
- Store
- Analyze
- Machine Learning
- Generative AI
Integrations
- Connect
- Collect
- Alert
Elasticsearch Data Journey
Collect, connect, and visualize your data from any source.
flowchart LR
subgraph Data
A[Data]
end
subgraph Ingest
B[Beats]
C[Logstash]
D[Elastic Agent<br>Integrations]
end
subgraph Store
E[Elasticsearch]
end
subgraph Visualize
F[Kibana]
end
A --> B & C & D
B --> C
B & C & D--> E
E --> F
Elasticsearch is a Document Store
- Elasticsearch is a distributed document store
- Documents are serialized JSON objects that are:
- stored in Elasticsearch under a unique Document ID
- distributed across the cluster and can be accessed immediately from any node
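To make "serialized JSON object" concrete, here is a minimal sketch using Python's standard json module. The sample document is only an illustration, and nothing here talks to a cluster.

```python
import json

# A document is a JSON object; here it is modelled as a plain Python dict.
document = {
    "title": "Fighting Ebola with Elastic",
    "category": "Engineering",
    "author": {"first_name": "Emily", "last_name": "Mosher"},
}

# Serialize to JSON text, the form in which a document is sent over the REST API.
serialized = json.dumps(document)

# Parsing the text gives back an equal object, so the round trip loses nothing.
restored = json.loads(serialized)
```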
Kibana
- Kibana is a front-end app that sits on top of the Elastic Stack
- It provides search and data visualization capabilities for data in Elasticsearch
Exploring and Querying Data with Kibana
- Start with Discover
- Create a data view to access your data
- Explore the fields in your data
- Examine popular values
- Use the query bar and filters to see subsets of your data
Installation Options
Elastic Cloud
- Elastic Cloud Hosted
- Elastic Cloud Serverless
Elastic Self-Managed
- Elastic Stack
- Elastic Cloud on Kubernetes
- Elastic Cloud Enterprise
Index Operations
Documents are Indexed into an Index
- In Elasticsearch a document is indexed into an index
- An index:
- is a logical way of grouping data
- can be thought of as an optimized collection of documents
- is used as a verb and a noun
Index a Document: curl Example
- To index a document, send a request using POST that specifies:
- the index name
- the _doc resource
- the document
- By default, Elasticsearch generates the ID for you
$ curl -X POST "localhost:9200/my_blogs/_doc" -H 'Content-Type: application/json' -d'
{
"title": "Fighting Ebola with Elastic",
"category": "Engineering",
"author": {
"first_name": "Emily",
"last_name": "Mosher"
}
}
'
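The same request can be sketched in Python with only the standard library. This is an illustration, not an official client: `build_index_request` is a hypothetical helper, and the URL assumes an unsecured cluster on localhost:9200, as in the curl example. The request is built but not sent.

```python
import json
import urllib.request


def build_index_request(base_url, index_name, document):
    """Build (but do not send) a POST request that indexes one document."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index_name}/_doc",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_index_request(
    "http://localhost:9200",  # assumption: local cluster without TLS or auth
    "my_blogs",
    {
        "title": "Fighting Ebola with Elastic",
        "category": "Engineering",
        "author": {"first_name": "Emily", "last_name": "Mosher"},
    },
)
# To send it against a running cluster: urllib.request.urlopen(request)
```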
Index a Document: Dev Tools > Console
- Console lets you interact with the Elasticsearch and Kibana REST APIs
- User-friendly interface to create and submit requests
- View API docs
Index a Document: PUT vs. POST
- When you index a document using:
- PUT: you pass in a document ID with the request; if the document ID already exists, the document is updated and the _version is incremented by 1
- POST: the document ID is automatically generated with a unique ID for the document
Request:
PUT my_blogs/_doc/6OCz5pEBqWhDYCLiWpe5
{
"title" : "Fighting Ebola with Elastic",
"category": "User Stories",
"author" : {
"first_name" : "Emily",
"last_name" : "Mosher"
}
}
Response:
{
"_index" : "my_blogs",
"_type" : "_doc",
"_id" : "6OCz5pEBqWhDYCLiWpe5",
"_version" : 2,
"result" : "updated",
...
}
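The PUT vs. POST rule can be summarized in a few lines of Python. `index_request_line` is a hypothetical helper that only picks the HTTP method and path; it does not contact Elasticsearch.

```python
def index_request_line(index_name, doc_id=None):
    """Return (method, path) for indexing a document.

    With an ID the caller chooses the document ID (PUT);
    without one, Elasticsearch generates the ID (POST).
    """
    if doc_id is None:
        return "POST", f"{index_name}/_doc"
    return "PUT", f"{index_name}/_doc/{doc_id}"


with_id = index_request_line("my_blogs", "6OCz5pEBqWhDYCLiWpe5")
without_id = index_request_line("my_blogs")
```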
Retrieve a Document
- Use a GET request with the document’s unique ID
Request:
GET my_blogs/_doc/6OCz5pEBqWhDYCLiWpe5
Response:
{
...
"_id" : "6OCz5pEBqWhDYCLiWpe5",
"_source": {
"title": "Fighting Ebola with Elastic",
"category": "User Stories",
"author": {
"first_name": "Emily",
"last_name": "Mosher"
}
}
}
Create a Document
- Index a new JSON document with the _create resource
- guarantees that the document is only indexed if it does not already exist
- cannot be used to update an existing document
Request:
POST my_blogs/_create/4
{
"title" : "Fighting Ebola with Elastic",
"category": "Engineering",
"author" : {
"first_name" : "Emily",
"last_name" : "Mosher"
}
}
Response:
{
"_index" : "my_blogs",
"_type" : "_doc",
"_id" : "4",
"_version" : 1,
"result" : "created",
...
}
Update Specific Fields
- Use the _update resource to modify specific fields in a document
- add the doc context
- _version is incremented by 1
Request:
POST my_blogs/_update/4
{
"doc" : {
"category": "User Stories"
}
}
Response:
{
"_index" : "my_blogs",
"_type" : "_doc",
"_id" : "4",
"_version" : 2,
"result" : "updated",
...
}
Delete a Document
- Use DELETE to delete an indexed document
Request:
DELETE my_blogs/_doc/4
Response:
{
"_index": "my_blogs",
"_type": "_doc",
"_id": "4",
"_version": 3,
"result": "deleted",
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"_seq_no": 3,
"_primary_term": 1
}
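The create, update, and delete responses above show _version going 1, 2, 3 for document 4. The toy model below reproduces only that bookkeeping. `ToyIndex` is a hypothetical in-memory stand-in, not Elasticsearch: it ignores shards, sequence numbers, and no-op updates, and it raises a Python exception where a real cluster would return an error response.

```python
class ToyIndex:
    """In-memory stand-in for one index; models version counting only."""

    def __init__(self):
        self._docs = {}      # doc_id -> source dict
        self._versions = {}  # doc_id -> last version number

    def _result(self, doc_id, result):
        # Every successful write bumps the document's version by 1.
        self._versions[doc_id] = self._versions.get(doc_id, 0) + 1
        return {"_id": doc_id, "_version": self._versions[doc_id], "result": result}

    def create(self, doc_id, source):
        # _create only succeeds if the document does not already exist.
        if doc_id in self._docs:
            raise ValueError(f"document {doc_id!r} already exists")
        self._docs[doc_id] = dict(source)
        return self._result(doc_id, "created")

    def update(self, doc_id, doc):
        # Merge the partial "doc" into the stored source (top-level keys only here).
        self._docs[doc_id].update(doc)
        return self._result(doc_id, "updated")

    def delete(self, doc_id):
        del self._docs[doc_id]
        return self._result(doc_id, "deleted")


index = ToyIndex()
created = index.create("4", {"title": "Fighting Ebola with Elastic", "category": "Engineering"})
updated = index.update("4", {"category": "User Stories"})
deleted = index.delete("4")
```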
Cheaper in Bulk
- Use the Bulk API to index many documents in a single API call
- increases the indexing speed
- useful if you need to index a data stream such as log events
- Four actions
- create, index, update, and delete
- The response is a large JSON structure
- returns individual results of each action that was performed
- failure of a single action does not affect the remaining actions
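Because each action reports its own result, a caller usually scans the response for the entries that failed. A minimal sketch follows: `failed_items` is a hypothetical helper, and `sample_response` is hand-written sample data in the general shape of a bulk response (an `items` list with one entry per action), not output captured from a cluster.

```python
def failed_items(bulk_response):
    """Return (position, action, _id, status) for every action that failed."""
    failures = []
    for position, item in enumerate(bulk_response["items"]):
        # Each item holds a single key: the name of the action that was run.
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append((position, action, result.get("_id"), result["status"]))
    return failures


# Hypothetical response: the second action failed, the others succeeded.
sample_response = {
    "errors": True,
    "items": [
        {"index": {"_id": "4", "status": 201, "result": "created"}},
        {"create": {"_id": "5", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
        {"delete": {"_id": "1", "status": 200, "result": "deleted"}},
    ],
}

failures = failed_items(sample_response)
```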
Bulk API Example
- Newline delimited JSON (NDJSON) structure
- index, create, update actions expect a newline followed by a JSON object on a single line
Example:
POST comments/_bulk
{"index" : {}}
{"title": "Tuning Go Apps with Metricbeat", "category": "Engineering"}
{"index" : {"_id":4}}
{"title": "Elasticsearch Released", "category": "Releases"}
{"create" : {"_id":5}}
{"title": "Searching for needle in", "category": "User Stories"}
{"update" : {"_id":2}}
{"doc": {"title": "Searching for needle in haystack"}}
{"delete": {"_id":1}}
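A request body like the one above can be generated rather than typed by hand. The sketch below is one way to do it: `to_ndjson` is a hypothetical helper that writes each action's metadata on one line, then the source on the next line when the action has one, and ends the body with a newline.

```python
import json


def to_ndjson(actions):
    """Build a bulk body from (metadata, source_or_None) pairs."""
    lines = []
    for metadata, source in actions:
        lines.append(json.dumps(metadata))
        if source is not None:  # delete carries no source line
            lines.append(json.dumps(source))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


body = to_ndjson([
    ({"index": {"_id": 4}}, {"title": "Elasticsearch Released", "category": "Releases"}),
    ({"update": {"_id": 2}}, {"doc": {"title": "Searching for needle in haystack"}}),
    ({"delete": {"_id": 1}}, None),
])
```

The resulting string would be sent as the body of `POST comments/_bulk`, typically with the `application/x-ndjson` content type.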
Upload a File in Kibana
- Quickly upload a log file or delimited CSV, TSV, or JSON file
- used for initial exploration of your data
- not intended as part of production process
Understanding Data
- Most data can be categorized into:
- (relatively) static data: data set that may grow or change, but slowly or infrequently, like a catalog or inventory of items
- time series data: event data associated with a moment in time that (usually) grows rapidly, like log files or metrics
- Elastic Stack works well with either type of data
Searching Data
Different Use Cases
- Search
- Typically uses human generated, error-prone data
- Often uses free-form text fields for anybody to type anything
- Observability:
- Need to analyze HUGE amounts of data in real-time
- Ingest load can vary
- Security:
- Collect data from MANY different sources with different data formats
Query Languages
Several to choose from:
- KQL
- Lucene
- ES|QL
- Query DSL
- Elasticsearch SQL
- EQL
Basic Structure of Search
- In Elasticsearch, search breaks down into two basic parts:
- Queries
- Which documents meet a specific set of criteria?
- Aggregations
- Tell me something about a group of documents
Using Query DSL
- Send a request using the search API:
- GET <index>/_search
match_all query
- is the default request for the search API
- Every document is a hit for this search
- Elasticsearch returns 10 hits by default
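Written out explicitly, the default search looks like the body built below. `match_all_search` is a hypothetical helper, and the index name blogs is only an example. The `size` parameter makes the default of 10 hits visible.

```python
import json


def match_all_search(index_name, size=10):
    """Return the request line and JSON body of an explicit match_all search."""
    request_line = f"GET {index_name}/_search"
    body = {
        "query": {"match_all": {}},  # every document is a hit
        "size": size,                # number of hits to return
    }
    return request_line, json.dumps(body)


request_line, body = match_all_search("blogs")
```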
Aggregations
- Visualizations on a Kibana dashboard are powered by aggregations
Aggregating Data
Request:
GET blogs/_search
{
"aggs": {
"first_blog": {
"min": {
"field": "publish_date"
}
}
}
}
Response:
{
...
"aggregations": {
"first_blog": {
"value": 1265658554000,
"value_as_string": "2010-02-08T19:49:14.000Z"
}
}
}
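In the response, `value` is the date in milliseconds since the Unix epoch and `value_as_string` is the same instant in formatted form. The conversion can be checked with a short sketch; `epoch_millis_to_iso` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def epoch_millis_to_iso(value):
    """Format epoch milliseconds as a UTC timestamp with millisecond precision."""
    seconds, millis = divmod(value, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return f"{moment:%Y-%m-%dT%H:%M:%S}.{millis:03d}Z"


first_blog = epoch_millis_to_iso(1265658554000)
# first_blog == "2010-02-08T19:49:14.000Z"
```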
ES|QL
- A piped query language that delivers advanced search capabilities
- Streamlines searching, aggregating, and visualizing large data sets
- Brings together the capabilities of multiple languages (Query DSL, KQL, EQL, Lucene, SQL, …)
- Powered by a dedicated query engine with concurrent processing
- Designed for performance
- Enhances speed and efficiency irrespective of data source and structure
Query
- Composed of a series of commands chained together by pipes
Running an ES|QL Query in Dev Tools
- Wrap the query in a POST request to the query API
- By default, results are returned as a JSON object
- Use the format option to retrieve the results in alternative formats
Request:
POST /_query
{
"query": "FROM blogs | KEEP publish_date, authors.full_name | SORT (publish_date)"
}
Request with format:
POST /_query?format=csv
{
"query": """
FROM blogs
| KEEP publish_date, authors.first_name, authors.last_name
| SORT (publish_date)
"""
}
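When queries are assembled in code, the pipe-separated commands can be kept as a list and joined. The sketch below builds the same two requests: `esql_request` is a hypothetical helper that returns the method, path, and JSON body without sending anything.

```python
import json


def esql_request(commands, fmt=None):
    """Return (method, path, body) for an ES|QL query given as a list of commands."""
    query = " | ".join(commands)  # commands are chained together by pipes
    path = "/_query" if fmt is None else f"/_query?format={fmt}"
    return "POST", path, json.dumps({"query": query})


commands = [
    "FROM blogs",
    "KEEP publish_date, authors.first_name, authors.last_name",
    "SORT (publish_date)",
]
json_request = esql_request(commands)
csv_request = esql_request(commands, fmt="csv")
```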
Running an ES|QL Query in Discover
- Select Language ES|QL in the Data View pull-down
- Expand the query editor to enter multiline commands
- Click the Run button or type command/alt-Enter to run the query
Examples
FROM blogs
| KEEP publish_date, authors.first_name, authors.last_name
FROM blogs
| WHERE authors.last_name.keyword == "Kearns"
| KEEP publish_date, authors.first_name, authors.last_name
FROM blogs
| STATS count = COUNT(*) BY authors.last_name.keyword
| SORT count DESC
| LIMIT 10
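To see what the last query computes, the same steps can be mimicked on a handful of in-memory rows. This is only an analogy in plain Python, not how ES|QL executes, and the rows are made-up sample data.

```python
from collections import Counter

# Hypothetical sample rows standing in for documents in the blogs index.
rows = [
    {"authors.last_name": "Kearns"},
    {"authors.last_name": "Mosher"},
    {"authors.last_name": "Kearns"},
    {"authors.last_name": "Smith"},
    {"authors.last_name": "Mosher"},
    {"authors.last_name": "Kearns"},
]

# STATS count = COUNT(*) BY authors.last_name.keyword
counts = Counter(row["authors.last_name"] for row in rows)

# SORT count DESC | LIMIT 10
top_authors = counts.most_common(10)
```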