Metrics
Elasticsearch is a powerful search and analytics engine for various data types. Monitoring its metrics is vital for maintaining performance, stability, and reliability. The following is a list of essential Elasticsearch metrics in PDS. Understanding these metrics will help administrators optimize performance, troubleshoot issues, and ensure the Elasticsearch cluster runs smoothly.
For an Elasticsearch deployment, the data service metrics are exposed on port 9114.
Access metrics
Below is a step-by-step guide on how to access Elasticsearch metrics for PDS deployments:
- Identify the Elasticsearch pod running in your namespace:
  kubectl get pods -n <your-namespace>
  Look for the pod name that corresponds to your Elasticsearch instance or its sidecar exporter.
- Port-forward from your local machine’s port 9114 to the pod’s port 9114:
  kubectl port-forward -n <your-namespace> <elasticsearch-pod-name> 9114:9114
- Open a browser or use curl to go to http://localhost:9114/metrics. You should see text-based Prometheus metrics output specific to Elasticsearch.
- Alternatively, check for the service exposing the Elasticsearch exporter, for example <release-name>-elasticsearch-exporter:
  kubectl get svc -n <your-namespace>
- Access the metrics through the service:
  - If NodePort, note <nodeport>: http://<node-ip>:<nodeport>/metrics
  - If LoadBalancer, note <loadbalancer-ip>: http://<loadbalancer-ip>:9114/metrics
- Verify metrics:
  - Using curl:
    curl http://<host>:9114/metrics
    Replace <host> with localhost (if using port-forward), <node-ip> (NodePort), or <loadbalancer-ip> (LoadBalancer). For NodePort, also replace 9114 with <nodeport>.
  - Prometheus UI: In Prometheus, navigate to the Expression browser and search for metrics beginning with elasticsearch_ or similar Elasticsearch-related prefixes to confirm they are being scraped.
  - Grafana or other dashboards: If you have Grafana connected to Prometheus, open your dashboard. Check that Elasticsearch metrics (those starting with elasticsearch_) are being ingested and displayed.
- Ensure that any NetworkPolicies or firewall rules allow inbound traffic on port 9114 if you plan to expose it externally.
- Metrics naming conventions can vary depending on the Elasticsearch exporter version. Generally, look for prefixes like elasticsearch_.
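The verification step can also be scripted. The following is a minimal Python sketch, not part of PDS, that parses Prometheus text output and lists the metric names carrying the elasticsearch_ prefix. The function names, the SAMPLE text, and the cluster="demo" label are illustrative assumptions, and the parser is deliberately simplified: it assumes the samples carry no trailing timestamps.

```python
from urllib.request import urlopen

# Made-up sample in the Prometheus text format, for illustration only.
SAMPLE = """\
# HELP elasticsearch_cluster_health_number_of_nodes Number of nodes in the cluster.
# TYPE elasticsearch_cluster_health_number_of_nodes gauge
elasticsearch_cluster_health_number_of_nodes{cluster="demo"} 3
go_goroutines 12
"""


def parse_metrics(text):
    """Parse Prometheus text lines into (name, labels, value) tuples.

    Simplified: assumes each sample line ends with its value (no timestamp).
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        series, _, value = line.rpartition(" ")  # value is the last field
        name, _, labels = series.partition("{")  # labels are optional
        samples.append((name, labels.rstrip("}"), float(value)))
    return samples


def elasticsearch_metric_names(text, prefix="elasticsearch_"):
    """Return the sorted, de-duplicated metric names that start with prefix."""
    return sorted({name for name, _, _ in parse_metrics(text) if name.startswith(prefix)})


def fetch_metrics(host="localhost", port=9114):
    """Fetch the raw metrics text; assumes the endpoint is reachable (for example via port-forward)."""
    with urlopen(f"http://{host}:{port}/metrics", timeout=5) as resp:
        return resp.read().decode("utf-8")
```

With the port-forward active, calling elasticsearch_metric_names(fetch_metrics()) should return a non-empty list if the exporter is serving Elasticsearch metrics. An empty list suggests the wrong pod or port, or an exporter version that uses a different prefix.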
Elasticsearch metrics
Metric name | Description | Metric type |
---|---|---|
elasticsearch_breakers_estimated_size_bytes | Estimated size in bytes of breaker | gauge |
elasticsearch_breakers_limit_size_bytes | Limit size in bytes for breaker | gauge |
elasticsearch_breakers_overhead | Overhead factor used by Elasticsearch breakers. | counter |
elasticsearch_breakers_tripped | Number of times the breaker has tripped | counter |
elasticsearch_cluster_health_active_primary_shards | The number of primary shards in your cluster. This is an aggregate total across all indices. | gauge |
elasticsearch_cluster_health_active_shards | Aggregate total of all shards across all indices, which includes replica shards. | gauge |
elasticsearch_cluster_health_delayed_unassigned_shards | Shards delayed to reduce reallocation overhead | gauge |
elasticsearch_cluster_health_initializing_shards | Count of shards that are being freshly created. | gauge |
elasticsearch_cluster_health_number_of_data_nodes | Number of data nodes in the cluster. | gauge |
elasticsearch_cluster_health_number_of_in_flight_fetch | The number of ongoing shard info requests. | gauge |
elasticsearch_cluster_health_number_of_nodes | Number of nodes in the cluster. | gauge |
elasticsearch_cluster_health_number_of_pending_tasks | Cluster level changes which have not yet been executed | gauge |
elasticsearch_cluster_health_task_max_waiting_in_queue_millis | Max time in millis that a task is waiting in queue. | gauge |
elasticsearch_cluster_health_relocating_shards | The number of shards that are currently moving from one node to another node. | gauge |
elasticsearch_cluster_health_status | Whether all primary and replica shards are allocated. | gauge |
elasticsearch_cluster_health_unassigned_shards | The number of shards that exist in the cluster state, but cannot be found in the cluster itself. | gauge |
elasticsearch_clusterinfo_last_retrieval_failure_ts | Timestamp of the most recent failure when retrieving cluster information. | gauge |
elasticsearch_filesystem_data_available_bytes | Available space on block device in bytes | gauge |
elasticsearch_filesystem_data_free_bytes | Free space on block device in bytes | gauge |
elasticsearch_filesystem_data_size_bytes | Size of block device in bytes | gauge |
elasticsearch_filesystem_io_stats_device_operations_count | Count of disk operations | counter |
elasticsearch_filesystem_io_stats_device_read_operations_count | Count of disk read operations | counter |
elasticsearch_filesystem_io_stats_device_write_operations_count | Count of disk write operations | counter |
elasticsearch_filesystem_io_stats_device_read_size_kilobytes_sum | Total kilobytes read from disk | counter |
elasticsearch_filesystem_io_stats_device_write_size_kilobytes_sum | Total kilobytes written to disk | counter |
elasticsearch_indexing_pressure_current_all_in_bytes | Current total indexing pressure in bytes. | gauge |
elasticsearch_indexing_pressure_limit_in_bytes | Maximum allowed indexing pressure in bytes. | gauge |
elasticsearch_indices_completion_size_in_bytes | Size in bytes of index completion operations. | counter |
elasticsearch_indices_docs | Count of documents on this node | gauge |
elasticsearch_indices_docs_deleted | Count of deleted documents on this node | gauge |
elasticsearch_indices_fielddata_evictions | Evictions from field data | counter |
elasticsearch_indices_fielddata_memory_size_bytes | Field data cache memory usage in bytes | gauge |
elasticsearch_indices_filter_cache_evictions | Evictions from filter cache | counter |
elasticsearch_indices_filter_cache_memory_size_bytes | Filter cache memory usage in bytes | gauge |
elasticsearch_indices_flush_time_seconds | Cumulative flush time in seconds | counter |
elasticsearch_indices_flush_total | Total flushes | counter |
elasticsearch_indices_get_exists_time_seconds | Total time spent on get operations that found the document, in seconds | counter |
elasticsearch_indices_get_exists_total | Total get operations that found the document | counter |
elasticsearch_indices_get_missing_time_seconds | Total time spent on get operations where the document was missing, in seconds | counter |
elasticsearch_indices_get_missing_total | Total get operations where the document was missing | counter |
elasticsearch_indices_get_time_seconds | Total time spent on get operations, in seconds | counter |
elasticsearch_indices_get_total | Total get operations | counter |
elasticsearch_indices_indexing_delete_time_seconds_total | Total time spent deleting documents, in seconds | counter |
elasticsearch_indices_indexing_delete_total | Total indexing deletes | counter |
elasticsearch_indices_indexing_index_time_seconds_total | Cumulative index time in seconds | counter |
elasticsearch_indices_indexing_index_total | Total index calls | counter |
elasticsearch_indices_indexing_is_throttled | Indicates if indexing is throttled (boolean metric). | gauge |
elasticsearch_indices_indexing_throttle_time_seconds_total | Cumulative seconds of index throttling. | counter |
elasticsearch_indices_merges_current | Number of ongoing merge operations. | gauge |
elasticsearch_indices_merges_current_size_in_bytes | Total size in bytes of ongoing merges. | gauge |
elasticsearch_indices_merges_docs_total | Cumulative docs merged | counter |
elasticsearch_indices_merges_total | Total merges | counter |
elasticsearch_indices_merges_total_size_bytes_total | Total merge size in bytes | counter |
elasticsearch_indices_merges_total_throttled_time_seconds_total | Total seconds merges were throttled. | counter |
elasticsearch_indices_merges_total_time_seconds_total | Total time spent merging in seconds | counter |
elasticsearch_indices_query_cache_cache_total | Count of query cache | counter |
elasticsearch_indices_query_cache_cache_size | Size of query cache | gauge |
elasticsearch_indices_query_cache_count | Count of query cache hit/miss | counter |
elasticsearch_indices_query_cache_evictions | Evictions from query cache | counter |
elasticsearch_indices_query_cache_memory_size_bytes | Query cache memory usage in bytes | gauge |
elasticsearch_indices_query_cache_total | Size of query cache total | counter |
elasticsearch_indices_query_miss_count | Number of query misses in Elasticsearch indices. | counter |
elasticsearch_indices_refresh_time_seconds_total | Total time spent refreshing in seconds | counter |
elasticsearch_indices_refresh_total | Total refreshes | counter |
elasticsearch_indices_request_cache_count | Count of request cache hit/miss | counter |
elasticsearch_indices_request_cache_evictions | Evictions from request cache | counter |
elasticsearch_indices_request_cache_memory_size_bytes | Request cache memory usage in bytes | gauge |
elasticsearch_indices_request_miss_count | Number of request misses in Elasticsearch indices. | counter |
elasticsearch_indices_search_fetch_time_seconds | Total search fetch time in seconds | counter |
elasticsearch_indices_search_fetch_total | Total number of fetches | counter |
elasticsearch_indices_search_query_time_seconds | Total search query time in seconds | counter |
elasticsearch_indices_search_query_total | Total number of queries | counter |
elasticsearch_indices_search_scroll_time_seconds | Total seconds spent in search scroll operations. | counter |
elasticsearch_indices_search_scroll_total | Total number of search scroll operations. | counter |
elasticsearch_indices_search_suggest_time_seconds | Total seconds spent generating search suggestions. | counter |
elasticsearch_indices_search_suggest_total | Total number of search suggestion requests. | counter |
elasticsearch_indices_segments_count | Count of index segments on this node | gauge |
elasticsearch_indices_segments_doc_values_memory_in_bytes | Bytes of doc values memory in segments. | gauge |
elasticsearch_indices_segments_fixed_bit_set_memory_in_bytes | Bytes used by fixed bit sets in segments. | gauge |
elasticsearch_indices_segments_index_writer_memory_in_bytes | Bytes used by the index writer. | gauge |
elasticsearch_indices_segments_memory_bytes | Current memory size of segments in bytes | gauge |
elasticsearch_indices_segments_norms_memory_in_bytes | Bytes of norms data in segments. | gauge |
elasticsearch_indices_segments_points_memory_in_bytes | Bytes of points data in segments. | gauge |
elasticsearch_indices_segments_stored_fields_memory_in_bytes | Bytes of stored fields in segments. | gauge |
elasticsearch_indices_segments_term_vectors_memory_in_bytes | Bytes of term vectors in segments. | gauge |
elasticsearch_indices_segments_terms_memory_in_bytes | Bytes of terms data in segments. | gauge |
elasticsearch_indices_segments_version_map_memory_in_bytes | Bytes used by the version map in segments. | gauge |
elasticsearch_indices_store_size_bytes | Current size of stored index data in bytes | gauge |
elasticsearch_indices_store_throttle_time_seconds_total | Throttle time for index store in seconds | counter |
elasticsearch_indices_translog_operations | Total translog operations | counter |
elasticsearch_indices_translog_size_in_bytes | Total translog size in bytes | gauge |
elasticsearch_indices_warmer_time_seconds_total | Total warmer time in seconds | counter |
elasticsearch_indices_warmer_total | Total warmer count | counter |
elasticsearch_jvm_buffer_pool_used_bytes | Used buffer pool size in JVM, in bytes. | gauge |
elasticsearch_jvm_gc_collection_seconds_count | Count of JVM GC runs | counter |
elasticsearch_jvm_gc_collection_seconds_sum | GC run time in seconds | counter |
elasticsearch_jvm_memory_committed_bytes | JVM memory currently committed by area | gauge |
elasticsearch_jvm_memory_max_bytes | JVM memory max | gauge |
elasticsearch_jvm_memory_used_bytes | JVM memory currently used by area | gauge |
elasticsearch_jvm_memory_pool_used_bytes | JVM memory currently used by pool | gauge |
elasticsearch_jvm_memory_pool_max_bytes | JVM memory max by pool | gauge |
elasticsearch_jvm_memory_pool_peak_used_bytes | JVM memory peak used by pool | counter |
elasticsearch_jvm_memory_pool_peak_max_bytes | JVM memory peak max by pool | counter |
elasticsearch_jvm_uptime_seconds | Elasticsearch JVM uptime in seconds. | gauge |
elasticsearch_node_stats_json_parse_failures | Count of failed JSON parse attempts in node stats. | counter |
elasticsearch_node_stats_total_scrapes | Total number of node stats scrapes. | counter |
elasticsearch_node_stats_up | Indicates if node stats scrape was successful (boolean metric). | gauge |
elasticsearch_nodes_roles | Roles assigned to each Elasticsearch node. | gauge |
elasticsearch_os_cpu_percent | Percent CPU used by the OS | gauge |
elasticsearch_os_load1 | Short-term (1-minute) load average | gauge |
elasticsearch_os_load5 | Mid-term (5-minute) load average | gauge |
elasticsearch_os_load15 | Long-term (15-minute) load average | gauge |
elasticsearch_os_mem_actual_free_bytes | Actual free memory in bytes on the OS. | gauge |
elasticsearch_os_mem_actual_used_bytes | Actual used memory in bytes on the OS. | gauge |
elasticsearch_process_cpu_percent | Percent CPU used by process | gauge |
elasticsearch_process_cpu_seconds_total | Process CPU time in seconds | counter |
elasticsearch_process_mem_resident_size_bytes | Resident memory in use by process in bytes | gauge |
elasticsearch_process_mem_share_size_bytes | Shared memory in use by process in bytes | gauge |
elasticsearch_process_mem_virtual_size_bytes | Total virtual memory used in bytes | gauge |
elasticsearch_process_open_files_count | Open file descriptors | gauge |
elasticsearch_scrape_duration_seconds | Duration of the scrape process in seconds. | gauge |
elasticsearch_scrape_success | Indicates whether the scrape succeeded (boolean metric). | gauge |
elasticsearch_thread_pool_active_count | Thread Pool threads active | gauge |
elasticsearch_thread_pool_completed_count | Thread Pool operations completed | counter |
elasticsearch_thread_pool_largest_count | Thread Pool largest threads count | gauge |
elasticsearch_thread_pool_queue_count | Thread Pool operations queued | gauge |
elasticsearch_thread_pool_rejected_count | Thread Pool operations rejected | counter |
elasticsearch_thread_pool_threads_count | Thread Pool current threads count | gauge |
elasticsearch_transport_rx_packets_total | Count of packets received | counter |
elasticsearch_transport_rx_size_bytes_total | Total number of bytes received | counter |
elasticsearch_transport_tx_packets_total | Count of packets sent | counter |
elasticsearch_transport_tx_size_bytes_total | Total number of bytes sent | counter |
elasticsearch_version | Version information for Elasticsearch. | gauge |
go_gc_duration_seconds | Duration of Go garbage collection in seconds. | summary |
go_goroutines | Number of active goroutines in Go. | gauge |
go_info | Metadata about the Go environment. | gauge |
go_memstats_alloc_bytes | Current bytes of allocated heap memory in Go. | gauge |
go_memstats_alloc_bytes_total | Total bytes allocated for Go objects since start. | counter |
go_memstats_buck_hash_sys_bytes | Bytes of memory used by Go’s profiling bucket hash table. | gauge |
go_memstats_frees_total | Total number of freed Go objects. | counter |
go_memstats_gc_sys_bytes | Bytes of memory obtained from the OS for Go garbage collection. | gauge |
go_memstats_heap_alloc_bytes | Allocated bytes in Go’s heap. | gauge |
go_memstats_heap_idle_bytes | Bytes in idle (unused) Go heap. | gauge |
go_memstats_heap_inuse_bytes | Bytes in active Go heap. | gauge |
go_memstats_heap_objects | Number of allocated objects in Go’s heap. | gauge |
go_memstats_heap_released_bytes | Bytes of Go heap memory returned to the OS. | gauge |
go_memstats_heap_sys_bytes | Bytes of Go heap memory obtained from the OS. | gauge |
go_memstats_last_gc_time_seconds | Time of the last Go garbage collection (Unix timestamp). | gauge |
go_memstats_lookups_total | Number of pointer lookups in Go. | counter |
go_memstats_mallocs_total | Total number of malloc operations in Go. | counter |
go_memstats_mcache_inuse_bytes | Bytes of memory used by Go’s mcache structures. | gauge |
go_memstats_mcache_sys_bytes | Bytes of memory obtained from the OS for mcache. | gauge |
go_memstats_mspan_inuse_bytes | Bytes of memory used by Go’s mspan structures. | gauge |
go_memstats_mspan_sys_bytes | Bytes of memory obtained from the OS for mspan. | gauge |
go_memstats_next_gc_bytes | Target heap size of the next Go garbage collection. | gauge |
go_memstats_other_sys_bytes | Bytes used by other system allocations in Go. | gauge |
go_memstats_stack_inuse_bytes | Bytes of stack memory in active use by Go. | gauge |
go_memstats_stack_sys_bytes | Bytes of stack memory obtained from the OS. | gauge |
go_memstats_sys_bytes | Total bytes of memory obtained from the OS for Go. | gauge |
go_threads | Number of OS threads created by the Go runtime. | gauge |
process_cpu_seconds_total | Total CPU time consumed by the process in seconds. | counter |
process_max_fds | Maximum number of open file descriptors allowed. | gauge |
process_open_fds | Current number of open file descriptors. | gauge |
process_resident_memory_bytes | Resident memory size of the process in bytes. | gauge |
process_start_time_seconds | Start time of the process (Unix timestamp). | gauge |
process_virtual_memory_bytes | Virtual memory size of the process in bytes. | gauge |
process_virtual_memory_max_bytes | Maximum amount of virtual memory available to the process. | gauge |
promhttp_metric_handler_requests_in_flight | Number of scrapes of the exporter’s /metrics endpoint currently being served. | gauge |
promhttp_metric_handler_requests_total | Total number of scrapes of the exporter’s /metrics endpoint, by HTTP status code. | counter |
elasticsearch_clusterinfo_up | Up metric for the cluster info collector | gauge |
elasticsearch_exporter_build_info | Build info for the Elasticsearch exporter. | gauge |
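
The metric type column determines how a value should be read: a gauge is meaningful as a point-in-time reading, while a counter only becomes useful as a change over time. The following Python sketch illustrates both cases under stated assumptions. The function names are hypothetical, and the reset handling is a common convention rather than anything PDS prescribes.

```python
def counter_rate(prev_value, curr_value, interval_seconds):
    """Per-second rate of a counter between two scrapes.

    A counter only increases, except when the exporting process restarts and
    the counter resets to zero. On a reset, the current value is treated as
    the whole increase for the interval.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    increase = curr_value - prev_value
    if increase < 0:  # counter reset between the two scrapes
        increase = curr_value
    return increase / interval_seconds


def disk_used_fraction(available_bytes, size_bytes):
    """Fraction of the data filesystem in use, computed from two gauges.

    Intended inputs: elasticsearch_filesystem_data_available_bytes and
    elasticsearch_filesystem_data_size_bytes from the same scrape.
    Returns None when the size is not positive.
    """
    if size_bytes <= 0:
        return None
    return 1.0 - available_bytes / size_bytes
```

For example, two readings of elasticsearch_indices_search_query_total taken 30 seconds apart can be passed to counter_rate to estimate queries per second, while disk_used_fraction needs only a single scrape because both of its inputs are gauges.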