The Requirements
Our requirements are simple. We run microservices in Docker, using Kubernetes as our deployment platform. We want all of our logs in Splunk. So the requirements are simply to take the logs from our microservice containers, and the logs from Kubernetes itself, and the logs from the host OS, and ship them to Splunk.
How Hard Can That Be?
We followed the 12-Factor App’s recommendations to have all logs for our services go to stdout (and sometimes to stderr). We used the journald logging driver in Docker, and installed Splunk forwarders on each host machine. Yay, all our logs are going to Splunk! Done. What, you mean you actually want to search your logs too?
I suppose you could say that was an implicit requirement. The trouble we had was that our applications were spitting out logs in JSON format, and when the tooling shipped those up to Splunk, it was escaping out all that nice JSON formatting. So we were ending up with log messages like:
{ "__REALTIME_TIMESTAMP": "1495808072502730", "__MONOTONIC_TIMESTAMP": "1607839175886", "PRIORITY": "6", "_PID": "1365", "_COMM": "dockerd", "_SYSTEMD_UNIT": "docker.service", "_HOSTNAME": "kubehost001.example.com", "CONTAINER_ID": "9b67ebfb41e5", "CONTAINER_ID_FULL": "9b67ebfb41e5b5580593f2a6f685ed5a740a2aa0cd13c40f8f19ebfbfe809017", "CONTAINER_NAME": "k8s_awesome-logging-service.41c5113d_awesome-logging-service-2802682077-dnlfg_devEnvironment_5fcb32a0-3be0-11e7-81c6-005056ab1c1d_04634e28", "CONTAINER_TAG": "", "ENVIRONMENT_NAME": "dev", "APP_NAME": "awesome-logging-service", "_TRANSPORT": "journal", "MESSAGE": "{\"name\":\"awesome-logging-service\",\"hostname\":\"awesome-logging-service-2802682077-dnlfg\",\"pid\":15,\"req_id\":\"f95a3f61-ebb0-46d6-b5d5-37f1383258fc\",\"level\":30,\"msg\":\"Received request { url: '/awesome-logging-service/health',\\n method: 'GET',\\n headers: \\n { host: 'someIp:8080',\\n 'user-agent': 'Go-http-client/1.1',\\n 'accept-encoding': 'gzip',\\n connection: 'close' },\\n body: undefined }\",\"time\":\"2017-05-26T14:14:32.501Z\",\"v\":0}", "_SOURCE_REALTIME_TIMESTAMP": "1495808072502003" }
We get some important information, such as the environment and app name, the container ID if we need it, and so on. However, we usually want to search on things in the MESSAGE field. You know, things like req_id="f95a3f61-ebb0-46d6-b5d5-37f1383258fc". But do you see all those backslash "\" characters in the MESSAGE? Those don't play nicely with Splunk: it can't pull out the key/value pairs when the log is escaped like that.
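For context, the kind of search we want to be able to run is a plain key/value search like the sketch below (the index and sourcetype here are placeholders, not our real ones). With the JSON escaped inside MESSAGE, Splunk's automatic field extraction never produces a req_id field, so a search like this comes back empty:

index="k8s" sourcetype="journald" req_id="f95a3f61-ebb0-46d6-b5d5-37f1383258fc"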
It turns out that something along the line, most likely journald, was wrapping our log message inside its own JSON payload, and because it doesn't expect messages to already be JSON, it does the proper thing and escapes them so that your log message can't break that payload. On top of that, we still had some logs that were not in JSON format at all: the default logs coming out of tools like Tomcat.
Iteration 2
To make searching these logs easier, let’s just ship the service’s logs directly to Splunk, without going to stdout or journald or anything. Splunk happens to have an HTTP Event Collector (HEC), and libraries for the languages we use to ship events directly to the HEC. So we plugged that in to get our applications to ship logs directly to Splunk. This made things a bit better:
{ "message": { "name": "awesome-logging-service", "pid": 15, "audit": true, "remoteAddress": "::ffff:10.2.132.0", "remotePort": 46418, "req_id": "365848f5-9a7f-416a-8a08-f99b9043cb90", "req": { "query": {}, "method": "POST", "url": "/awesome-logging-service/do", "headers": { "accept": "text/plain, */*", "content-type": "application/json", "txnguid": "ea8a2c5b-d981-4bb0-8f35-d9a872c24a5b", "txntimestamp": "1495752521990", "content-length": "20", "host": "kubehost002.example.com:31374", "connection": "Keep-Alive", "user-agent": "Apache-HttpClient/4.3.6 (java 1.5)", "accept-encoding": "gzip,deflate" }, "httpVersion": "1.1", "trailers": {}, "version": "*", "timers": { "handler-0": 240, "handler-1": 22, "parseQueryString": 55, "readBody": 2910, "parseBody": 729, "bunyan": 127, "handler-2": 94, "handler-3": 75, "handler-4": 277769 } }, "res": { "statusCode": 201, "headers": { "api-version": "1.0.0", "content-type": "text/plain", "content-length": 53 }, "trailer": false }, "latency": 284, "_audit": true, "msg": "handled: 201", "v": 0 }, "severity": "info" }
A bit verbose, but that's fine. At least now when Splunk gets the message, it is searchable. Splunk understands the JSON formatting, and you can build up complex spath expressions to pull out what you need. But wait, what happened to my ENVIRONMENT_NAME and APP_NAME fields? They are gone now, and there is no good way to bring them back. We also still have trouble with the pieces of our tooling that write only to stdout, such as Tomcat, app initialization, and other third-party tools. So while our service logs themselves are better, we have now made it harder to search across multiple streams, and you always have to think about what type of log event you're looking for and somehow remember what format those logs are in so that you can search for them. Obviously not ideal.
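For reference, all of those HEC client libraries ultimately boil down to an HTTP POST against Splunk's event collector endpoint. A rough sketch with curl, using a placeholder host and token (and -k only because of a self-signed certificate), looks something like this:

curl -k "https://splunk.example.com:8088/services/collector/event" \
  -H "Authorization: Splunk 11111111-2222-3333-4444-555555555555" \
  -d '{
        "event": {"name": "awesome-logging-service", "req_id": "365848f5-9a7f-416a-8a08-f99b9043cb90", "msg": "handled: 201"},
        "sourcetype": "application",
        "index": "dev",
        "host": "awesome-logging-service-2802682077-dnlfg"
      }'

Because the payload travels as real JSON under the event key, nothing gets backslash-escaped on the way in, which is why the Iteration 2 events stay searchable with spath expressions like | spath path=message.req_id output=req_id.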
Iteration 3
Introduce fluentd. Fluentd is basically a small utility that can ingest and reformat log messages from various sources, and can spit them out to any number of outputs. So, just as an example, it can ingest logs from journald, inspect and transform those messages, and ship them up to Splunk.
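At its core that is just a source block and a match block. Here is a stripped-down sketch of the idea, using the same systemd input and Splunk HEC output plugins we ended up with (the server and token are placeholders); the full configuration we actually landed on is shown in Iteration 4 below:

# Read everything from the systemd journal (fluent-plugin-systemd)
<source>
  @type systemd
  path /run/log/journal
  tag journal
</source>

# Forward anything tagged "journal" to Splunk's HTTP Event Collector
<match journal>
  @type splunk-http-eventcollector
  server splunk.example.com:8088
  token 11111111-2222-3333-4444-555555555555
  all_items true
</match>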
So we deployed fluentd as a DaemonSet in our Kubernetes cluster and pointed it at the journald logs. With various filters we can determine which of those logs come from our Kubernetes containers (our services) and which come from the host OS or from Kubernetes itself, and we can act on each of those types of logs differently. We haven't quite fleshed out all the routing and transformations we need to get the logs in a good place, but this is by far the best solution we have worked on. An event now looks something like this:
{ "name": "awesome-logging-service", "index": "dev", "sourcetype": "application", "source": "awesome-logging-service", "host": "awesome-logging-service-2521530320-hzzsh", "hostname": "awesome-logging-service-2521530320-hzzsh", "pid": 15, "level": 20, "msg": "Received request { url: '/awesome-logging-service/health',\n method: 'GET',\n headers: \n { host: 'someIp:8080',\n 'user-agent': 'Go-http-client/1.1',\n 'accept-encoding': 'gzip',\n connection: 'close' },\n body: undefined }", "time": "2017-05-24T16:56:21.393Z", "v": 0, "PRIORITY": "6", "_HOSTNAME": "kubehost001.example.com", "CONTAINER_TAG": "", "_TRANSPORT": "journal", "_PID": "12809", "_COMM": "docker", "_SYSTEMD_UNIT": "docker.service", "CONTAINER_ID": "3cf404b8f08d", "CONTAINER_ID_FULL": "3cf404b8f08daa6ffe4752cd53e3531d2d0a2072a8119774b5f5c65bd50efc1a", "CONTAINER_NAME": "k8s_awesome-logging-service_awesome-logging-service-2521530320-hzzsh_devEnvironment_57843361-3f10-11e7-a70e-00155d019453_13", "_SOURCE_REALTIME_TIMESTAMP": "1495644981402154", "docker": { "container_id": "3cf404b8f08daa6ffe4752cd53e3531d2d0a2072a8119774b5f5c65bd50efc1a" }, "kubernetes": { "container_name": "awesome-logging-service", "namespace_name": "dev", "pod_name": "awesome-logging-service-2521530320-hzzsh", "pod_id": "57843361-3f10-11e7-a70e-00155d019453", "labels": { "app": "awesome-logging-service", "pod-template-hash": "2521530320" }, "host": "kubehost001.example.com", "master_url": "https://10.0.0.1:443/api" } }
Iteration 4
After many small tweaks to the fluentd configs, and what seems like a hundred other things, we landed on something similar to what is below:
DaemonSet
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: example-system
  labels:
    k8s-app: fluentd
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
  labels:
    k8s-app: fluentd
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "namespaces"
  - "pods"
  verbs:
  - "get"
  - "watch"
  - "list"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
  labels:
    k8s-app: fluentd
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: example-system
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: ""
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: example-system
  labels:
    app: fluentd
    version: ${IMAGE_VERSION}
    gitsha: "${GITSHA}"
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
        version: "1.6"
    spec:
      serviceAccountName: fluentd
      containers:
      - name: fluentd
        image: docker-repo.example.com:5000/fluentd:1.6
        env:
        - name: KUBE_NODE
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: RUBY_GC_HEAP_INIT_SLOTS
          value: "500000"
        - name: RUBY_GC_HEAP_GROWTH_FACTOR
          value: "1"
        - name: RUBY_GC_HEAP_GROWTH_MAX_SLOTS
          value: "10000"
        - name: FLUENT_UID
          value: "0"
        resources:
          limits:
            memory: 275Mi
            cpu: 500m
          requests:
            cpu: 100m
            memory: 275Mi
        volumeMounts:
        - name: config
          mountPath: /fluentd/etc
        - name: varlog
          mountPath: /var/log
        - name: journal
          mountPath: /run/log/journal
        - name: ephemeralmount
          mountPath: /mnt/sda1/var/lib/docker/containers
          readOnly: true
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: dockersock
          mountPath: /var/run/docker.sock
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: ephemeralmount
        hostPath:
          path: /mnt/sda1/var/lib/docker/containers
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: dockersock
        hostPath:
          path: /var/run/docker.sock
      - name: journal
        hostPath:
          path: /run/log/journal
      - name: config
        configMap:
          name: fluentd
      imagePullSecrets:
      - name: example-docker-repo
This contains a lot of the recommended setup for running fluentd as a DaemonSet. Along with that, it pulls the custom fluentd image we build and maps in volumes from the host system. Those host volumes are what let us read the system journal logs and let the Kubernetes metadata plugin for fluentd gather information about the containers; we also mount in a ConfigMap to configure fluentd.
ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd
  namespace: example-system
data:
  fluent.conf: |
    @include kubernetes.conf

    <match *.kubernetes.journal.container.fluentd>
      @type splunk-http-eventcollector
      server "{{ splunk_host }}:{{ splunk_port }}"
      token "{{ splunk_token }}"
      protocol "{{ splunk_protocol }}"
      index infrastructure
      sourcetype ${tag}
      source fluentd
      host "#{ENV['KUBE_NODE']}"
      all_items true
      batch_size_limit 1000000
      buffer_type file
      buffer_chunk_limit 700k
      buffer_path /var/log/fluent/buffer-infrastructure-fluentd
      flush_interval 5s
    </match>

    <match audit.kubernetes.journal.container.**>
      @type splunk-http-eventcollector
      server "{{ splunk_host }}:{{ splunk_port }}"
      token "{{ splunk_token }}"
      protocol "{{ splunk_protocol }}"
      index user-gestures
      sourcetype ${tag}
      source fluentd
      host "#{ENV['KUBE_NODE']}"
      all_items true
      batch_size_limit 1000000
      buffer_type file
      buffer_chunk_limit 700k
      buffer_path /var/log/fluent/buffer-user-gestures
      flush_interval 5s
    </match>

    <match app.kubernetes.journal.container.**>
      @type splunk-http-eventcollector
      server "{{ splunk_host }}:{{ splunk_port }}"
      token "{{ splunk_token }}"
      protocol "{{ splunk_protocol }}"
      index app-data
      sourcetype ${tag}
      source fluentd
      host "#{ENV['KUBE_NODE']}"
      all_items true
      batch_size_limit 1000000
      buffer_type file
      buffer_chunk_limit 700k
      buffer_path /var/log/fluent/buffer-app-data
      flush_interval 5s
    </match>

    <match **>
      @type splunk-http-eventcollector
      server "{{ splunk_host }}:{{ splunk_port }}"
      token "{{ splunk_token }}"
      protocol "{{ splunk_protocol }}"
      index infrastructure
      sourcetype ${tag}
      source fluentd
      host "#{ENV['KUBE_NODE']}"
      all_items true
      batch_size_limit 1000000
      buffer_type file
      buffer_chunk_limit 700k
      buffer_path /var/log/fluent/buffer-infrastructure
      flush_interval 5s
    </match>

  kubernetes.conf: |
    <match fluent.**>
      @type null
    </match>

    <source>
      @type systemd
      path /var/log/journal
      read_from_head false
      <storage>
        @type local
        persistent true
        path /var/log/fluent/journal.pos
      </storage>
      tag journal
    </source>

    <source>
      @type systemd
      path /run/log/journal
      read_from_head false
      <storage>
        @type local
        persistent true
        path /var/log/fluent/runjournal.pos
      </storage>
      tag journal
    </source>

    <match journal>
      @type rewrite_tag_filter
      <rule>
        key CONTAINER_NAME
        pattern ^k8s_([\w-]*?)[\._]
        tag kubernetes.journal.container.$1
      </rule>
      <rule>
        key SYSLOG_IDENTIFIER
        pattern ^(.*)
        tag journal.$1
      </rule>
    </match>

    <source>
      @type tail
      format /^(?<time>[^ ]* [^ ,]*)[^\[]*\[[^\]]*\]\[(?<severity>[^ \]]*) *\] (?<msg>.*)$/
      time_format %Y-%m-%d %H:%M:%S
      path /var/log/salt/minion
      pos_file /var/log/fluent/fluentd-salt.pos
      tag salt
    </source>

    <source>
      @type tail
      format syslog
      path /var/log/startupscript.log
      pos_file /var/log/fluent/fluentd-startupscript.log.pos
      tag startupscript
    </source>

    <source>
      @type tail
      format /^time="(?<time>[^)]*)" level=(?<severity>[^ ]*) msg="(?<msg>[^"]*)"( err="(?<error>[^"]*)")?( statusCode=(?<status_code>\d+))?/
      path /var/log/docker.log
      pos_file /var/log/fluent/fluentd-docker.log.pos
      tag docker
    </source>

    <source>
      @type tail
      format none
      path /var/log/etcd.log
      pos_file /var/log/fluent/fluentd-etcd.log.pos
      tag etcd
    </source>

    <filter kubernetes.journal.container.**>
      @type kubernetes_metadata
      annotation_match /^example\..*$/
    </filter>

    <filter kubernetes.journal.container.**>
      @type parser
      key_name MESSAGE
      reserve_data true
      <parse>
        @type multi_format
        @include parsers.conf
      </parse>
    </filter>

    <filter kubernetes.journal.container.**>
      @type record_transformer
      enable_ruby true
      remove_keys hostname,name,msg
      <record>
        message ${record["msg"] ? record["msg"] : record["message"] ? record["message"] : record["MESSAGE"]}
        level ${record["level"] ? record["level"] == 10 ? "TRACE" : record["level"] == 20 ? "DEBUG" : record["level"] == 30 ? "INFO" : record["level"] == 40 ? "WARN" : record["level"] == 50 ? "ERROR" : record["level"] == 60 ? "FATAL" : record["level"] : (record["PRIORITY"] ? record["PRIORITY"] == "0" ? "EMERG" : record["PRIORITY"] == "1" ? "ALERT" : record["PRIORITY"] == "2" ? "CRIT" : record["PRIORITY"] == "3" ? "ERROR" : record["PRIORITY"] == "4" ? "WARN" : record["PRIORITY"] == "5" ? "NOTICE" : record["PRIORITY"] == "6" ? "INFO" : record["PRIORITY"] == "7" ? "DEBUG" : record["PRIORITY"] : record["severity"])}
      </record>
    </filter>

    <match kubernetes.journal.container.**>
      @type rewrite_tag_filter
      @include audit.conf
    </match>

    <match unprocessed.audit.**>
      @type rewrite_tag_filter
      # In here, we want to mark kube-probe health checks
      remove_tag_prefix unprocessed
      <rule>
        key userAgent
        pattern ^(?!kube-probe).*$
        tag ${tag}
      </rule>
      <rule>
        key $.req.headers.user-agent
        pattern ^(?!kube-probe).*$
        tag ${tag}
      </rule>
      <rule>
        key responseStatus
        pattern ^(?!200)$
        tag ${tag}
      </rule>
      <rule>
        key $.res.statusCode
        pattern ^(?!200)$
        tag ${tag}
      </rule>
      <rule>
        # things that make it here we throw away
        key message
        pattern .+
        tag ignore.${tag}
      </rule>
    </match>

    <match ignore.audit.**>
      @type null
    </match>
In the ConfigMap, we define a number of config files for fluentd. The first, fluent.conf, is basically just a Splunk routing file that defines which Splunk index each type of log lands in (in the config above these are the "infrastructure", "user-gestures" (audit), and "app-data" (application) indexes).
The second config file defined in that ConfigMap, kubernetes.conf, is a file with routes to gather and filter out what we care about. We piece out the different sources and match patterns to determine whether a log event came from one of our containers or from a piece of the infrastructure. If it came from one of our containers (the tag starts with kubernetes.journal.container), we do more processing on the event to pull out some Kubernetes annotations, parse the logs, and determine whether they are HTTP audit logs or generic application logs. We have pulled the parsing and audit patterns out into separate files (the parsers.conf and audit.conf includes above) to keep them readable and maintainable.
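Those included files are not reproduced here, but to give a flavor of the parsing side: a parsers.conf feeding the multi_format parse section above could be as simple as a JSON pattern with a plain-text fallback. This is a hypothetical sketch, not our actual file:

# Hypothetical parsers.conf: try JSON first, fall back to raw text
<pattern>
  format json
  time_key time
</pattern>
<pattern>
  format none
</pattern>

A fallback like that is what lets the same pipeline handle both our JSON service logs and the plain-text output from tools like Tomcat.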
We also process the audit logs a bit more, to try and remove all the successful health checks. These are logs that we don’t need in Splunk; they do not add any value and just add a whole lot of clutter.
Dockerfile
FROM ruby:2.3 as builder
WORKDIR /example-fluent
COPY fluent-plugin-example.gemspec Rakefile Gemfile /example-fluent/
COPY src ./src
COPY test ./test
COPY k8s_configmap ./k8s_configmap
RUN bundle install && bundle exec rake test && gem build fluent-plugin-example.gemspec

FROM fluentd:v1.6.2-debian-1.0
LABEL maintainer="example.com"
LABEL owner="Example"
USER root

COPY --from=builder /example-fluent/fluent-plugin-example-*.gem .

RUN buildDeps="sudo make gcc g++ libc-dev ruby-dev" \
    && apt-get update \
    && apt-get install -y --no-install-recommends $buildDeps \
    && apt-get install -y --no-install-recommends iproute \
    && sudo gem install \
        # fluent-plugin-splunk-http-eventcollector:0.3.0 \ (included in our own plugin, until this thing updates)
        fluent-plugin-systemd:1.0.2 \
        fluent-plugin-record-reformer:0.9.1 \
        fluent-plugin-kubernetes_metadata_filter:2.1.6 \
        fluent-plugin-rewrite-tag-filter:2.2.0 \
        fluent-plugin-multi-format-parser:1.0.0 \
        fluent-plugin-example \
    && sudo gem sources --clear-all \
    && SUDO_FORCE_REMOVE=yes \
        apt-get purge -y --auto-remove \
        -o APT::AutoRemove::RecommendsImportant=false \
        $buildDeps \
    && rm -rf /var/lib/apt/lists/* \
        /home/fluent/.gem/ruby/2.3.0/cache/*.gem

CMD [ "fluentd", "-c", "/fluentd/etc/fluent.conf", "-p", "/fluentd/plugins", "--no-supervisor" ]
Our own fluentd image is based on the official fluentd image, and we basically just bundle in all the plugins we need. We also build our own fluentd plugin, which carries a fix for the splunk-http-eventcollector plugin (hence that gem being commented out above) and includes some event translation that is hard to do in the fluentd configs, to help with log event consistency.
Hopefully This Helps
This was my most viewed post from my Blogger site, so it seemed like a great starting point for pulling posts over to this site. Hopefully this rambling excursion and the example files provided here can help somebody if they are facing similar difficulties.
If we were to start over from scratch, I’m sure there are much better ways to do this, including using the fluentd logging driver for Docker directly. That would eliminate many of the hacks we had to add in here.
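For anyone heading that direction, the fluentd logging driver is again just a daemon.json setting on each node. The sketch below assumes fluentd is listening with a forward source (@type forward) on its default port; container stdout/stderr would then arrive in fluentd already tagged, with no journald escaping in the middle:

{
  "log-driver": "fluentd",
  "log-opts": {
    "fluentd-address": "localhost:24224",
    "tag": "docker.{{.Name}}"
  }
}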