How do you optimize Elasticsearch indexing throughput?
Answer
Several techniques improve indexing throughput. Use the Bulk API (POST /_bulk) to send many index/update/delete operations in a single request; a common starting point is 5–15 MB per bulk request, tuned by benchmarking against your own cluster. Increase the refresh interval: set refresh_interval to "30s" or -1 (disabled) during initial bulk loads, then restore it to "1s". Each refresh writes a new Lucene segment and makes it searchable, and many small segments mean more background merging, so refreshing less often cuts write overhead considerably. Disable replicas during the initial load: set number_of_replicas to 0 while bulk indexing, then restore the original value, so that each document is indexed once on the primary instead of again on every replica; Elasticsearch copies the finished shards to the replicas when they are re-enabled. Use multiple indexing threads or clients to keep all shards busy in parallel. After bulk indexing, and only once the index is no longer being written to, call POST /index/_forcemerge?max_num_segments=1 to consolidate segments and improve query performance.
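As a rough illustration of the bulk-batching and settings steps, here is a minimal Python sketch that uses only the standard library. The function name bulk_batches, the 10 MB target, the index name, and the restore value of 1 replica are assumptions made for the example; sending the requests is left to whatever HTTP client you use.

```python
import json
from typing import Iterable, Iterator

# Assumed target size, picked from the 5-15 MB range; tune by benchmarking.
MAX_BATCH_BYTES = 10 * 1024 * 1024


def bulk_batches(index: str, docs: Iterable[dict],
                 max_bytes: int = MAX_BATCH_BYTES) -> Iterator[str]:
    """Yield NDJSON bodies for POST /_bulk, each roughly max_bytes or less.

    A single document larger than max_bytes is still emitted, alone.
    """
    lines, size = [], 0
    for doc in docs:
        # Each operation is an action line followed by a source line.
        pair = (json.dumps({"index": {"_index": index}}) + "\n"
                + json.dumps(doc) + "\n")
        pair_size = len(pair.encode("utf-8"))
        if lines and size + pair_size > max_bytes:
            yield "".join(lines)
            lines, size = [], 0
        lines.append(pair)
        size += pair_size
    if lines:
        yield "".join(lines)


# Bodies for PUT /<index>/_settings before and after the load.
# The restore values are assumptions; use the index's original settings.
LOAD_SETTINGS = {"index": {"refresh_interval": "-1", "number_of_replicas": 0}}
RESTORE_SETTINGS = {"index": {"refresh_interval": "1s", "number_of_replicas": 1}}
```

Each yielded body would be sent as one POST /_bulk request with the Content-Type: application/x-ndjson header. Running several such senders in parallel, each over its own slice of the documents, corresponds to the multiple-clients advice above.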