Amazon Onboarding with Learning Manager Chanci Turner


This post is brought to you by Jenna Smith, DevOps Engineer at Tech Innovations, and Chanci Turner, Software Engineer at Amazon.

FireLens for Amazon Elastic Container Service (ECS) simplifies routing container logs with popular open-source logging tools such as Fluentd and Fluent Bit. If you are unfamiliar with FireLens, I recommend reviewing the documentation and our blog post about its architecture and purpose.

Tech Innovations’ Use of FireLens

At Tech Innovations, our goal is to create exceptional mobile applications. With over 100 million downloads and a need for scalable services, we have chosen to utilize AWS Fargate to manage our backend game services. Fargate provides us the flexibility to scale our operations quickly while minimizing management overhead and ensuring uninterrupted service for our users.

We maintain a centralized logging system, which includes an EC2-based Logstash cluster that aggregates and filters logs from both our game servers and client devices, forwarding this data to Amazon OpenSearch Service. FireLens allows us to efficiently send container logs to Logstash, achieving near real-time delivery and scaling seamlessly with our Fargate tasks. This capability is vital as user activity can fluctuate significantly, influenced by various factors like the time of day or in-game events.

One specific feature of Fluent Bit we appreciate is the Memory Buffer Limit (Mem_Buf_Limit), which can be defined in the Input section of its configuration. Although not enabled by default in FireLens, we saw the need to implement this for our Fargate tasks.

Why Set the Memory Buffer Limit?

Our logging infrastructure must be robust and fault-tolerant; the log collector (Fluent Bit) should be able to handle scenarios where the logging destination is temporarily unavailable. When this occurs, Fluent Bit buffers logs in memory and retries sending them, up to a specified Retry Limit.

During stress tests that simulated an unavailable downstream logging system, we found that high log volumes could exhaust the memory allocated to the Fargate task, causing the task to be killed with an out-of-memory error. With a Memory Buffer Limit set, Fluent Bit buffers logs only up to that limit; once it is reached, new logs are not accepted until buffered data has been flushed and space becomes available again.
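To make the trade-off concrete, here is a minimal Python sketch of a size-capped in-memory buffer. It is a toy model, not Fluent Bit's implementation: the class and function names are hypothetical, and the real Mem_Buf_Limit works by pausing the input plugin rather than by counting individual dropped records.

```python
from collections import deque


class BoundedLogBuffer:
    """Toy model of an in-memory log buffer with a byte limit (hypothetical)."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0
        self.dropped = 0
        self.records = deque()

    def append(self, record):
        """Store the record if it fits under the limit; otherwise count it as lost."""
        if self.used_bytes + len(record) > self.limit_bytes:
            self.dropped += 1
            return False
        self.records.append(record)
        self.used_bytes += len(record)
        return True

    def flush(self, max_records):
        """Simulate the destination recovering: send up to max_records, freeing space."""
        sent = 0
        while self.records and sent < max_records:
            self.used_bytes -= len(self.records.popleft())
            sent += 1
        return sent


def simulate_outage(limit_bytes, record_size, records_during_outage):
    """Feed fixed-size records into the buffer while nothing can be flushed."""
    buf = BoundedLogBuffer(limit_bytes)
    for _ in range(records_during_outage):
        buf.append(b"x" * record_size)
    return buf


if __name__ == "__main__":
    buf = simulate_outage(limit_bytes=1000, record_size=100, records_during_outage=25)
    print("buffered:", len(buf.records), "dropped:", buf.dropped, "bytes:", buf.used_bytes)
```

In this model memory use never exceeds the cap, at the cost of losing whatever arrives while the buffer is full, which is the same choice described above.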

This presents a critical trade-off between log integrity and system availability, which may not suit every use case. Each organization must evaluate the risk of their logging destination being inaccessible against the potential impact of an outage. For us, maintaining the availability of our game servers takes precedence over the logs, which primarily assist in debugging.

It’s essential to note that the Memory Buffer Limit is not an absolute constraint on the memory usage of the FireLens container, as memory is needed for other operations as well. Our testing with Mem_Buf_Limit set at 100MB has shown that the FireLens container remains below 250MB total memory usage even under high load.

Understanding FireLens Configuration for Fluentd and Fluent Bit

Before configuring input parameters, it’s important to grasp how FireLens operates, especially in generating the Input section of the Fluent Bit configuration.

Fluentd and Fluent Bit offer powerful features, but their complexity can be daunting. FireLens was designed with two user types in mind:

  1. Users seeking a straightforward method to route logs using Fluentd and Fluent Bit.
  2. Users who desire the full capabilities of these tools while having AWS manage the operational complexities.

FireLens enables Fluentd and Fluent Bit within ECS, and configuration management features were created to streamline their use. This includes the automatic generation of Input plugin definitions by the ECS Agent and a mechanism for translating container log configurations into Output plugin definitions.

As a result, the configuration file for Fluentd or Fluent Bit is “fully managed” by ECS. You can import a custom configuration using the config-file-type and config-file-value options; however, the input definitions will always be generated by ECS. The custom config is then pulled into the generated config using the Fluentd or Fluent Bit include statement.
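As an illustration of where these options live, the following Python sketch builds the log router's container definition as it might appear in an ECS task definition and prints it as JSON. The firelensConfiguration, type, and options keys follow the ECS task definition format; the helper name, container name, image name, and config path are hypothetical placeholders.

```python
import json


def firelens_log_router(image, config_path):
    """Return a container definition for a Fluent Bit log router (illustrative only)."""
    return {
        "name": "log_router",  # hypothetical container name
        "image": image,
        "essential": True,
        "firelensConfiguration": {
            "type": "fluentbit",
            "options": {
                # Load an extra config file that is already inside the image.
                "config-file-type": "file",
                "config-file-value": config_path,
            },
        },
    }


if __name__ == "__main__":
    definition = firelens_log_router(
        image="example-registry/custom-fluent-bit:latest",  # hypothetical image
        config_path="/fluent-bit/etc/extra.conf",  # hypothetical path
    )
    print(json.dumps(definition, indent=2))
```

Note that a file imported this way is added alongside the ECS-generated input section; it does not replace it.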

The generated configurations are always mounted at specific locations within the logging container:

  • Fluentd: /fluentd/etc/fluent.conf
  • Fluent Bit: /fluent-bit/etc/fluent-bit.conf

Most Fluentd and Fluent Bit images utilize these default paths, specified in the container’s entry point definitions. However, you can customize the entry point by creating your own Fluentd or Fluent Bit image and indicating a different config path, which we will use to set Mem_Buf_Limit.

Tutorial: Configuring Input Parameters

You can find the input configurations for FireLens at the following links: the Fluent Bit Generated Input Sections and the Fluentd Generated Input Sections. Logs are always read from a Unix socket located at /var/run/fluent.sock in the container.

As a FireLens user, you can customize your input configuration by overriding the default entry point command for the Fluent Bit container. This tutorial focuses on Fluent Bit and demonstrates how to set the Mem_Buf_Limit parameter, although similar methods can be applied to configure other input parameters as well.

Start by creating a Fluent Bit configuration file containing the following input section:

[INPUT]
    Name forward
    unix_path /var/run/fluent.sock
    Mem_Buf_Limit 100MB

Your complete Fluent Bit configuration should also include your outputs and any additional Fluent Bit features you wish to enable. Save this file as fluent-bit.conf in your project directory. Next, create a Dockerfile with the following contents:

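FROM amazon/aws-for-fluent-bit:latest
# Place the custom config outside /fluent-bit/etc, where ECS mounts its generated config
ADD fluent-bit.conf /fluent-bit/alt/fluent-bit.conf
# Load the AWS output plugins and start Fluent Bit with the custom config path
CMD ["/fluent-bit/bin/fluent-bit", "-e", "/fluent-bit/firehose.so", "-e", "/fluent-bit/cloudwatch.so", "-e", "/fluent-bit/kinesis.so", "-c", "/fluent-bit/alt/fluent-bit.conf"]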
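Because this image bypasses the ECS-generated input section, a missing or misspelled Mem_Buf_Limit would go unnoticed until the buffer grows under load. One option is a small pre-build check like the Python sketch below. It is a deliberately simplified reader for the section-and-key layout shown above, not a full Fluent Bit config parser, and both function names are hypothetical.

```python
def parse_sections(text):
    """Split a classic-format config into (SECTION, {key: value}) pairs."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1].strip().upper(), {}))
        elif sections:
            parts = line.split(None, 1)  # key, then the rest of the line as value
            value = parts[1].strip() if len(parts) > 1 else ""
            sections[-1][1][parts[0].lower()] = value
    return sections


def forward_input_has_limit(text):
    """Return True if a forward [INPUT] section sets Mem_Buf_Limit."""
    for name, props in parse_sections(text):
        if name == "INPUT" and props.get("name", "").lower() == "forward":
            if "mem_buf_limit" in props:
                return True
    return False


if __name__ == "__main__":
    with open("fluent-bit.conf", encoding="utf-8") as handle:
        ok = forward_input_has_limit(handle.read())
    print("Mem_Buf_Limit set on forward input:", ok)
```

Running the script from the project directory before building the image gives a quick confirmation that the limit is actually present in the file being copied in.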
