This repository contains a modified Dockerfile for building elasticsearch, based on the automated build published to the public Docker Hub Registry. It provides simple recipes for preloading data into the Docker image and running elasticsearch without external persistent data files.
- Install Docker.
- Clone this repository.
- Select a preloaded data recipe (see below).
- Build the elasticsearch Docker image with your preloaded data.
- Load the Docker image into Docker for execution.
docker load -i preloaded-elasticsearch.tar.gz
docker run -d -p 9200:9200 -p 9300:9300 preloaded-elasticsearch
After a few seconds, open http://<host>:9200 to interact with elasticsearch.
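As a quick check that the container is up, the cluster health endpoint can be queried with curl. This is a sketch that assumes the default port mapping from the docker run command above and a local Docker host (localhost stands in for <host>):

```shell
# Query cluster health; fall back to a message if elasticsearch is not up yet.
curl -s 'http://localhost:9200/_cluster/health?pretty' \
    || echo 'elasticsearch is not reachable yet'
```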
This repository is based on a snapshot of the dockerfile/elasticsearch repository. The parent repository creates a Docker image that offers persistent and shared data by mounting a data directory located outside the Docker container. It is not possible to start elasticsearch, load data, and save a new elasticsearch image with the data preloaded, unless you use an external data directory.
This repository removes the mountable external data directory from the elasticsearch docker image build. It provides two recipes for building an elasticsearch docker image with preloaded data.
If you have an existing elasticsearch data directory tree, you can drop it in place when building the elasticsearch Docker image. You then save and compress the image with the drop-in preloaded data, and run it as shown in the Usage section, above.
You can obtain an elasticsearch data directory tree by extracting data from a running (but quiescent) elasticsearch Docker container with:
docker cp ${CONTAINER}:/usr/share/elasticsearch/data extracted-data
where ${CONTAINER} is the ID of the running Docker container as obtained from:
docker ps
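Put together, the extraction step might look like the sketch below. The container ID is a placeholder; substitute the one reported by docker ps:

```shell
# Identify the running container, then copy its data directory to the host.
CONTAINER=abc123def456   # hypothetical ID; substitute the output of docker ps
docker cp "${CONTAINER}:/usr/share/elasticsearch/data" extracted-data \
    || echo 'docker cp failed; is the container running?'
```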
This is a more complex recipe. You first build a modified elasticsearch Docker image, without the mountable external data volume and without preloaded data. You run that image inside a Docker container and load data into it. You then commit the container to a new Docker image, and save and compress the new image with the preloaded data.
The elasticsearch Docker image with preloaded data can be run as shown in the Usage section, above.
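The run/commit/save sequence might be sketched as follows. The image names elasticsearch-nodata and preloaded-elasticsearch are hypothetical; use whatever names your build produced:

```shell
# Start the data-free image; proceed only if Docker could run it.
if CONTAINER=$(docker run -d -p 9200:9200 -p 9300:9300 elasticsearch-nodata); then
    # ... load data here (e.g. with curl), then let elasticsearch quiesce ...
    # Snapshot the container as a new image, then save it compressed.
    docker commit "${CONTAINER}" preloaded-elasticsearch
    docker save preloaded-elasticsearch | gzip > preloaded-elasticsearch.tar.gz
else
    echo 'docker run failed; is Docker installed and running?'
fi
```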
The curl command is used to check the status of elasticsearch before and after data is loaded into it. There is an example, in comments, of how to use curl to load JSON Lines data into elasticsearch.
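For instance, a JSON Lines payload can be posted to the bulk endpoint as sketched below. The index, type, and document are made up for illustration, and the port mapping from the Usage section is assumed:

```shell
# Build a tiny bulk payload: each action line is followed by its document line.
cat > bulk.jsonl <<'EOF'
{"index":{"_index":"books","_type":"book","_id":"1"}}
{"title":"Moby-Dick","author":"Herman Melville"}
EOF

# POST the payload to a running elasticsearch container.
curl -s -XPOST 'http://localhost:9200/_bulk' --data-binary @bulk.jsonl \
    || echo 'bulk load failed; is elasticsearch running?'
```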
- This repository currently contains a snapshot of the dockerfile/elasticsearch repository. Perhaps it should be linked to the parent repository?
- Is there a procedure that should be followed for shutting down elasticsearch in a running Docker container before extracting data from the container or committing the container to a new Docker image?
- The scripts appear to use an elasticsearch.org signing key. Is this really allowable, now that I've modified the scripts?
- The scripts should check for failures and abort with appropriate feedback.
- The scripts should be converted from csh to bash for portability.
- These TODOs should be converted to GitHub issues.