By: Jordan Smith
Date: 15 May 2019
Category: Open Distro for Elasticsearch, Open Source
This article serves as a comprehensive guide for deploying Open Distro for Elasticsearch on Kubernetes, targeting a production-level setup.
At VGT2 Las Vegas, a subsidiary of Amazon focusing on innovative home security devices, we needed a robust solution for handling extensive security log data generated by our products. Our flagship offering, the VGT2 Smart Doorbell, alongside our local neighborhood surveillance feed, aims to enhance community safety.
To effectively aggregate and query our log data, we identified several key requirements. Firstly, we needed user authentication coupled with Role-based Access Control (RBAC) to manage log access securely. Additionally, we sought SAML support for seamless integration with our existing Single Sign-On framework. Given the sensitive nature of the logs, ensuring all communications were encrypted in transit was essential. Lastly, a monitoring system for security alerts based on incoming log data was imperative.
Open Distro for Elasticsearch meets these requirements by offering various authentication methods, from HTTP Basic authentication to Kerberos ticket-based solutions. Its comprehensive RBAC features facilitate stringent access control to our log data, making security management straightforward.
Moreover, Open Distro for Elasticsearch includes SAML support for Kibana, allowing integration with multiple Identity Providers such as AWS Single Sign-On or Okta. TLS encryption is employed for all communications, satisfying our encryption criteria. Furthermore, it provides alerting and monitoring capabilities to set up custom security alerts and health monitoring for our systems. This suite of features made Open Distro for Elasticsearch an ideal fit for our security observability infrastructure.
In our Security Operations team, we were already utilizing Amazon Elastic Container Service for Kubernetes (Amazon EKS) to manage a Kubernetes cluster dedicated to our security tools. We chose to deploy Open Distro for Elasticsearch in Kubernetes as a scaled deployment, benefiting from Kubernetes’ popularity as a container orchestration platform. Its scalability aligns perfectly with our growing logging requirements while minimizing our dependence on configuration management.
In this post, we’ll outline the insights we gained during this process, which might be beneficial for others facing similar challenges.
Prerequisites
This guide is tailored for deployments within Amazon EKS, AWS's managed Kubernetes service.
Make sure all required Kubernetes plugins, such as external-dns or KIAM, are installed in your cluster. You need access to the cluster using the kubectl command-line tool and the corresponding kubeconfig file. Note that annotations for external-dns will not function without deploying the external-dns service, which can be done using the community-maintained Helm chart. Similarly, annotations for pod IAM roles won’t work without KIAM; you can deploy KIAM via its community Helm chart.
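As a quick illustration of why the external-dns plugin matters, here is a hypothetical Service showing the annotation that external-dns acts on. The hostname, namespace, and label values below are placeholders for illustration, not values from the repository manifests:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: elasticsearch
  annotations:
    # external-dns watches for this annotation and creates the DNS record;
    # without the external-dns deployment, the annotation is silently ignored.
    external-dns.alpha.kubernetes.io/hostname: elasticsearch.example.internal
spec:
  type: LoadBalancer
  selector:
    role: client
  ports:
    - port: 9200
      targetPort: 9200
```

The same principle applies to KIAM: pod annotations such as `iam.amazonaws.com/role` only take effect once the KIAM agent and server are running in the cluster.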
For this deployment, TLS certificates need to be bootstrapped, along with an established Certificate Authority for issuing these certificates. For more details on generating your own certificates, check out our earlier post on how to Add Your Own SSL Certificates to Open Distro for Elasticsearch.
Project Plan
Drawing from our past experiences deploying the community version of Elasticsearch on Kubernetes, we opted for a similar approach with Open Distro for Elasticsearch, and sketched out the architecture described below.
We selected Amazon EKS for our managed Kubernetes cluster, taking into account the following factors:
- Our existing Kubernetes cluster in Amazon EKS allows us to scale worker nodes flexibly for security tools, making it an excellent candidate to host the Open Distro for Elasticsearch cluster.
- The cluster comprises eight m5.2xlarge instances as worker nodes, sufficiently robust to support our Elasticsearch deployment.
- Amazon EKS alleviates the burden of managing our own Kubernetes API server, streamlining patching and security processes.
We initiated an eight-node test deployment with a vision for scaling it into a production environment. We also decided to utilize the official Docker images provided by the Open Distro team, saving us the effort of managing our own container images and registry.
Our planned Elasticsearch cluster consisted of three master nodes, two client/coordinating nodes, and three data nodes, structured with the following Kubernetes resource types:
- Deployment for master nodes (stateless)
- Deployment for client nodes (stateless)
- StatefulSet for data nodes (stateful)
We fronted our cluster's Elasticsearch API with an AWS Network Load Balancer (NLB), deployed via the Kubernetes Service resource type. To spread the Elasticsearch master, client, and data nodes across separate EC2 worker nodes, we used Kubernetes pod anti-affinity rules. We also applied taints to the EC2 worker nodes reserved for the master and client nodes, and added matching tolerations to those pods, so that only the intended Elasticsearch pods can schedule onto the dedicated workers.
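The scheduling setup above can be sketched as follows. These are illustrative fragments only; the taint key, label values, and names are assumptions rather than excerpts from the repository manifests:

```yaml
# Taint applied to the EC2 worker nodes reserved for master pods, e.g.:
#   kubectl taint nodes <node-name> dedicated=es-master:NoSchedule

# Pod template fragment for the master Deployment:
spec:
  # The toleration lets master pods land on the tainted, dedicated nodes.
  tolerations:
    - key: dedicated
      operator: Equal
      value: es-master
      effect: NoSchedule
  affinity:
    # Anti-affinity keeps two master pods from sharing one worker node.
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              role: master
          topologyKey: kubernetes.io/hostname
```

The taint keeps unrelated workloads off the dedicated workers; the toleration grants the Elasticsearch pods permission to schedule there; and the anti-affinity rule ensures each replica lands on a distinct host.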
Creating Initial Resources
Begin by cloning the Open Distro for Elasticsearch community repository, which contains Kubernetes manifests for a sample deployment. The files are named according to the resource types they create, with a digit indicating deployment precedence.
Navigate to the open-distro-elasticsearch-kubernetes folder:
$ cd open-distro-elasticsearch-kubernetes
Then, access the elasticsearch subfolder:
$ cd elasticsearch
Next, create a Kubernetes namespace for the Elasticsearch cluster assets using the 10-es-namespace.yml file:
$ kubectl apply -f 10-es-namespace.yml
Proceed to create a discovery service utilizing the Kubernetes Service resource type from the 20-es-svc-discovery.yml file, enabling master nodes to be discoverable over the Elasticsearch transport port, 9300:
$ kubectl apply -f 20-es-svc-discovery.yml
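For orientation, a discovery Service of this kind is typically a headless Service along these lines; the names and selector labels here are assumptions, so defer to the actual 20-es-svc-discovery.yml in the repository:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-discovery
  namespace: elasticsearch
spec:
  # clusterIP: None makes the Service headless, so its DNS name resolves
  # to the individual master pod IPs for transport-layer discovery.
  clusterIP: None
  selector:
    role: master
  ports:
    - name: transport
      port: 9300
```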
Create a Kubernetes ServiceAccount necessary for future StatefulSets using the 20-es-service-account.yml file:
$ kubectl apply -f 20-es-service-account.yml
Establish a Kubernetes StorageClass resource for AWS Elastic Block Storage drives configured as gp2 storage (attached to data nodes) via the 25-es-sc-gp2.yml file:
$ kubectl apply -f 25-es-sc-gp2.yml
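The shape of such a StorageClass is roughly the following sketch; the metadata name and reclaim policy are illustrative assumptions, so the repository's 25-es-sc-gp2.yml remains authoritative:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: elasticsearch-gp2
# The in-tree AWS EBS provisioner creates gp2 volumes on demand for the
# PersistentVolumeClaims generated by the data-node StatefulSet.
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
```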
Finally, create a Kubernetes ConfigMap to bootstrap the relevant Elasticsearch configurations, such as elasticsearch.yml and logging.yml, into the containers upon deployment using the 30-es-configmap.yml file:
$ kubectl apply -f 30-es-configmap.yml
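To show the mechanism, a ConfigMap of this kind embeds the configuration files as keys that are later mounted into the containers. The settings below are minimal placeholder assumptions, not the contents of 30-es-configmap.yml:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch
  namespace: elasticsearch
data:
  # Each key becomes a file when the ConfigMap is mounted as a volume,
  # e.g. at /usr/share/elasticsearch/config/elasticsearch.yml.
  elasticsearch.yml: |
    cluster.name: logs
    network.host: 0.0.0.0
```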