
Data analytics platform: storing ElasticBeanstalk logs in BigQuery with fluentd (td-agent-3)

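To broaden the range of analysis we can do, I wanted to collect the access logs of an application running in Rails API mode.
So I decided to build the analytics platform to the following spec:

  • collect logs with fluentd
  • store them in BigQuery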

Prerequisites

  • ElasticBeanstalk (EB below)
    • Puma with Ruby 2.5 running on 64bit Amazon Linux/2.8.6
  • td-agent-3
  • fluent-plugin-bigquery 2.6.1

Contents

  1. Formatting logs on the Rails side
  2. BigQuery credentials setup
  3. .ebextensions setup
  4. Deploy

1. Formatting logs on the Rails side

The default log output from Rails is hard to work with, so I use lograge to tidy it up.

『Railsのログをなんとかしたい人生だった』

Using articles like this one as a reference, I configured the log output as clean, one-line JSON.
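As a rough picture of the target format, the sketch below prints one JSON log line. The field names are taken from the BigQuery schema that appears later in td-agent.conf; every value is made up for illustration, and the helper name `sample_log_line` is hypothetical. It is plain Ruby and does not use lograge itself.

```ruby
require "json"

# Build one log record shaped like the schema used later in td-agent.conf.
# All values are invented for illustration; only the key names come from the article.
def sample_log_line
  {
    host: "api.example.com",
    remote_ip: "203.0.113.10",
    user_id: 42,
    path: "/users/42",
    method: "GET",
    status: 200,
    format: "json",
    controller: "UsersController",
    action: "show",
    duration: 12.3,
    user_agent: "Mozilla/5.0"
  }.to_json
end

# One record per line is what the tail input with the json format expects.
puts sample_log_line
```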

2. BigQuery credentials setup

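To let EB talk to BigQuery, you need to prepare a credential key on the GCP side.

In the GCP console, go to "APIs & Services" > "Credentials" and click "Create credentials".

Page: https://console.cloud.google.com/apis/credentials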

After creating it, download the JSON file.
This JSON file is copied onto the EC2 instance and used as the credentials.

For that reason, put the JSON file in an S3 bucket so that EB can access it.
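Before uploading, a quick sanity check of the downloaded file can save a failed deploy. The sketch below assumes the file is a standard GCP service account key, that is, a JSON object with `type`, `project_id`, `private_key` and `client_email`. The function name and the sample values are hypothetical.

```ruby
require "json"

# Fields a GCP service account key file is expected to contain.
# Assumption: the downloaded JSON is a service account key.
REQUIRED_FIELDS = %w[type project_id private_key client_email].freeze

# Returns a list of problems; an empty list means the file looks usable.
def key_file_problems(json_text)
  data = JSON.parse(json_text)
  problems = REQUIRED_FIELDS.reject { |f| data.key?(f) }.map { |f| "missing #{f}" }
  if data.key?("type") && data["type"] != "service_account"
    problems << "type is not service_account"
  end
  problems
rescue JSON::ParserError
  ["not valid JSON"]
end

# Usage with a made-up key (never commit a real one).
fake_key = {
  type: "service_account",
  project_id: "my-project",
  private_key: "-----BEGIN PRIVATE KEY-----...",
  client_email: "loader@my-project.iam.gserviceaccount.com"
}.to_json
p key_file_problems(fake_key)   # prints []
```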

3. .ebextensions setup

Now for the main part.
Since this is ElasticBeanstalk, every setting and command has to be written under .ebextensions.

The tasks are:

  1. Wire up the BigQuery credentials from the previous step to EB
  2. Configure the td-agent.conf file
  3. Set up the td-agent install
  4. Change the defaults in "/etc/init.d/td-agent"

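Starting with 1.
Fill in the bucket name and the path of the JSON file you placed on S3 in the marked spots below.
Everything else should be fine to copy and paste.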

05_fluentd_bigquery_credentials.config
Resources:
  AWSEBAutoScalingGroup:
    Metadata:
      AWS::CloudFormation::Authentication:
        S3Auth:
          type: "s3"
          buckets: ["bucket name"]
          roleName:
            "Fn::GetOptionSetting":
              Namespace: "aws:autoscaling:launchconfiguration"
              OptionName: "IamInstanceProfile"
              DefaultValue: "aws-elasticbeanstalk-ec2-role"
files:
  /etc/td-agent/big_query.json:
    mode: "000644"
    owner: root
    group: root
    authentication: "S3Auth"
    source: "path to the JSON file"

Next, the contents of the "/etc/td-agent/td-agent.conf" config file.

06-td-agent-config.config
files:
  "/etc/td-agent/td-agent.conf":
    owner: root
    group: root
    content: |
      <source>
        @type tail
        format json
        path /var/app/current/log/"log file name"
        tag "tag name"
        pos_file /var/app/current/log/"pos_file name"
      </source>

      <filter "tag name">
        @type grep
        <exclude>
          key user_agent
          pattern ^ELB-HealthChecker/2.0$
        </exclude>
      </filter>

      <match "tag name">
        @type bigquery_insert

        <buffer>
          @type file
          path /var/log/td-agent/buffer
          timekey 3600
          chunk_limit_size 256m
          queue_limit_length 128
          total_limit_size 10g
          flush_interval 30s
          flush_thread_interval 1.0
          flush_thread_count 15
          retry_max_times 9999999999999
          retry_wait 1s
        </buffer>

        auth_method json_key
        json_key /etc/td-agent/big_query.json

        project "project name"
        dataset "dataset name"

        auto_create_table true

        <inject>
          time_key timestamp
          time_type string
          time_format %Y-%m-%d %H:%M:%S
        </inject>

        schema [
          {"name": "timestamp", "type": "TIMESTAMP"},
          {"name": "host", "type": "STRING"},
          {"name": "remote_ip", "type": "STRING"},
          {"name": "user_id", "type": "INTEGER"},
          {"name": "path", "type": "STRING"},
          {"name": "method", "type": "STRING"},
          {"name": "status", "type": "INTEGER"},
          {"name": "format", "type": "STRING"},
          {"name": "controller", "type": "STRING"},
          {"name": "action", "type": "STRING"},
          {"name": "duration", "type": "FLOAT"},
          {"name": "view", "type": "FLOAT"},
          {"name": "db", "type": "FLOAT"},
          {"name": "user_agent", "type": "STRING"},
          {"name": "referer", "type": "STRING"},
          {"name": "os", "type": "STRING"},
          {"name": "os_version", "type": "STRING"},
          {"name": "browser", "type": "STRING"},
          {"name": "browser_version", "type": "STRING"},
          {"name": "exception_class", "type": "STRING"}, 
          {"name": "exception_message", "type": "STRING"}, 
          {"name": "exception_backtrace", "type": "STRING"}
        ]
      </match>
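The <inject> section above writes the event time into a `timestamp` field as a string. The format string uses strftime-style directives, so plain Ruby can show what such a string looks like. This is only an illustration with fixed epoch values, formatted in UTC to keep the result deterministic; the timezone td-agent actually applies depends on its own settings. The function name is hypothetical.

```ruby
# Same format string as time_format in the <inject> section.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# Format a Unix epoch (in seconds) the way the injected timestamp string is laid out.
def format_event_time(epoch_seconds)
  Time.at(epoch_seconds).utc.strftime(TIME_FORMAT)
end

puts format_event_time(0)   # prints 1970-01-01 00:00:00
```

A string in this layout is one that BigQuery can load into the TIMESTAMP column declared first in the schema.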

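A few brief notes on what each directive does:
<source>: points at the log file generated with lograge. The pos_file gets the same name as the log file, but with a .pos extension.
<filter>: I want health checks kept out of the logs, so they are dropped with <exclude>.
<match>: what to do with matching log records. This part is mostly the template as-is.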
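
To make the <filter> step concrete, here is a small plain-Ruby imitation of it: parse each JSON line and drop the records whose `user_agent` matches the same pattern as in the config. It is not fluentd's implementation, and `drop_health_checks` is a hypothetical name.

```ruby
require "json"

# Same pattern as in the <exclude> block.
# Note that the unescaped "." matches any character, not only a literal dot.
HEALTH_CHECK = %r{^ELB-HealthChecker/2.0$}

# Parse JSON log lines and keep only records that are not ELB health checks.
def drop_health_checks(lines)
  lines.map { |line| JSON.parse(line) }
       .reject { |record| HEALTH_CHECK.match?(record["user_agent"].to_s) }
end

lines = [
  '{"path":"/health","user_agent":"ELB-HealthChecker/2.0"}',
  '{"path":"/users/42","user_agent":"Mozilla/5.0"}'
]
drop_health_checks(lines).each { |record| puts record["path"] }   # prints /users/42
```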

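As an aside, the syntax for these directives differs between versions, and most of the information out there was for 2.x, so this took me quite a while. The official documentation is also shaky on this point, so if someone put together the proper syntax for the 3.x series, I would be moved to tears...

Back to the topic. The td-agent-install config is as follows.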

07-td-agent-install.config
commands:
    01-command:
        command: echo 'Defaults:root    !requiretty' >> /etc/sudoers

    02-command:
        command: curl -s -L https://toolbelt.treasuredata.com/sh/install-amazon1-td-agent3.sh | sh

    03-command:
        command: td-agent-gem install fluent-plugin-bigquery

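This installs td-agent-3 and fluent-plugin-bigquery.

Finally, tweak "/etc/init.d/td-agent" a little.
The file itself is extremely long, but the only thing I changed is the user and group it runs as.
By default TD_AGENT_USER and TD_AGENT_GROUP are both td-agent; I changed them to root.

[error]: #0 Permission denied @ rb_sysopen - /var/app/current/log/log file name

I added this file because of the error above: td-agent could not access the log file due to permissions.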
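
When chasing this kind of error, it helps to look at the permission bits of the log file. The sketch below is a generic Ruby check of whether a file is readable by users other than its owner and group, demonstrated on a temporary file. It is an illustration rather than something the deployment runs, the helper name is hypothetical, and it does not check the directories leading to the file, which also have to be traversable.

```ruby
require "tempfile"

# True if "others" have read permission on the file at path.
# File.world_readable? returns the permission bits (an Integer) or nil.
def readable_by_others?(path)
  !File.world_readable?(path).nil?
end

Tempfile.create("production.log") do |file|
  File.chmod(0o600, file.path)
  puts readable_by_others?(file.path)   # prints false
  File.chmod(0o644, file.path)
  puts readable_by_others?(file.path)   # prints true
end
```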

08-td-agent-init-config.config
files:
  "/etc/init.d/td-agent":
    owner: root
    group: root
    content: |
      #!/bin/sh
      ### BEGIN INIT INFO
      # Provides:          td-agent
      # Required-Start:    $network $local_fs
      # Required-Stop:     $network $local_fs
      # Default-Start:     2 3 4 5
      # Default-Stop:      0 1 6
      # Short-Description: data collector for Treasure Data
      # Description:       td-agent is a data collector
      ### END INIT INFO
      # pidfile:           /var/run/td-agent/td-agent.pid

      export PATH=/sbin:/usr/sbin:/bin:/usr/bin

      TD_AGENT_NAME=td-agent
      TD_AGENT_HOME=/opt/td-agent
      TD_AGENT_DEFAULT=/etc/sysconfig/td-agent
      TD_AGENT_USER=root
      TD_AGENT_GROUP=root
      TD_AGENT_RUBY=/opt/td-agent/embedded/bin/ruby
      TD_AGENT_BIN_FILE=/usr/sbin/td-agent
      TD_AGENT_LOG_FILE=/var/log/td-agent/td-agent.log
      TD_AGENT_PID_FILE=/var/run/td-agent/td-agent.pid
      TD_AGENT_LOCK_FILE=/var/lock/subsys/td-agent
      TD_AGENT_OPTIONS="--use-v1-config"

      # timeout can be overridden from /etc/sysconfig/td-agent
      STOPTIMEOUT=120

      # Read configuration variable file if it is present
      if [ -f "${TD_AGENT_DEFAULT}" ]; then
        . "${TD_AGENT_DEFAULT}"
      fi

      if [ -n "${name}" ]; then
        # backward compatibility with omnibus-td-agent <= 2.2.0. will be deleted from future release.
        echo "Warning: Declaring \$name in ${TD_AGENT_DEFAULT} has been deprecated. Use \$TD_AGENT_NAME instead." 1>&2
        TD_AGENT_NAME="${name}"
      fi

      if [ -n "${prog}" ]; then
        # backward compatibility with omnibus-td-agent <= 2.2.0. will be deleted from future release.
        echo "Warning: Declaring \$prog in ${TD_AGENT_DEFAULT} for customizing \$PIDFILE has been deprecated. Use \$TD_AGENT_PID_FILE instead." 1>&2
        if [ -z "${PIDFILE}" ]; then
          TD_AGENT_PID_FILE="//var/run/td-agent/${prog}.pid"
        fi
        TD_AGENT_LOCK_FILE="//var/lock/subsys/${prog}"
        TD_AGENT_PROG_NAME="${prog}"
      else
        unset TD_AGENT_PROG_NAME
      fi

      if [ -n "${process_bin}" ]; then
        # backward compatibility with omnibus-td-agent <= 2.2.0. will be deleted from future release.
        echo "Warning: Declaring \$process_bin in ${TD_AGENT_DEFAULT} has been deprecated. Use \$TD_AGENT_RUBY instead." 1>&2
        TD_AGENT_RUBY="${process_bin}"
      fi

      if [ -n "${PIDFILE}" ]; then
        echo "Warning: Declaring \$PIDFILE in ${TD_AGENT_DEFAULT} has been deprecated. Use \$TD_AGENT_PIDFILE instead." 1>&2
        TD_AGENT_PID_FILE="${PIDFILE}"
      fi

      if [ -n "${DAEMON_ARGS}" ]; then
        # TODO: Show warning on use of `DAEMON_ARGS`
        # echo "Warning: Declaring \$DAEMON_ARGS in ${TD_AGENT_DEFAULT} has been deprecated. Use \$TD_AGENT_OPTIONS instead." 1>&2
        START_STOP_DAEMON_ARGS=""
        parse_daemon_args() {
          while [ -n "$1" ]; do
            case "$1" in
            "--user="?* )
              echo "Warning: Declaring --user in \$DAEMON_ARGS has been deprecated. Use \$TD_AGENT_USER instead." 1>&2
              TD_AGENT_USER="${1#*=}"
              ;;
            "--user" )
              echo "Warning: Declaring --user in \$DAEMON_ARGS has been deprecated. Use \$TD_AGENT_USER instead." 1>&2
              shift 1
              TD_AGENT_USER="$1"
              ;;
            * )
              START_STOP_DAEMON_ARGS="${START_STOP_DAEMON_ARGS} $1"
              ;;
            esac
            shift 1
          done
        }
        parse_daemon_args ${DAEMON_ARGS}
      fi

      if [ -n "${TD_AGENT_ARGS}" ]; then
        ORIG_TD_AGENT_ARGS="${TD_AGENT_ARGS}"
        TD_AGENT_ARGS=""
        parse_td_agent_args() {
          while [ -n "$1" ]; do
            case "$1" in
            "--group="?* )
              echo "Warning: Declaring --group in \$TD_AGENT_ARGS has been deprecated. Use \$TD_AGENT_GROUP instead." 1>&2
              TD_AGENT_GROUP="${1#*=}"
              ;;
            "--group" )
              echo "Warning: Declaring --group in \$TD_AGENT_ARGS has been deprecated. Use \$TD_AGENT_GROUP instead." 1>&2
              shift 1
              TD_AGENT_GROUP="$1"
              ;;
            "--user="?* )
              echo "Warning: Declaring --user in \$TD_AGENT_ARGS has been deprecated. Use \$TD_AGENT_USER instead." 1>&2
              TD_AGENT_USER="${1#*=}"
              ;;
            "--user" )
              echo "Warning: Declaring --user in \$TD_AGENT_ARGS has been deprecated. Use \$TD_AGENT_USER instead." 1>&2
              shift 1
              TD_AGENT_USER="$1"
              ;;
            * )
              TD_AGENT_ARGS="${TD_AGENT_ARGS} $1"
              ;;
            esac
            shift 1
          done
        }
        parse_td_agent_args ${ORIG_TD_AGENT_ARGS}
      fi

      # Arguments to run the daemon with
      TD_AGENT_ARGS="${TD_AGENT_ARGS:-${TD_AGENT_BIN_FILE} --log ${TD_AGENT_LOG_FILE} ${TD_AGENT_OPTIONS}}"
      START_STOP_DAEMON_ARGS="${START_STOP_DAEMON_ARGS}"

      # Exit if the package is not installed
      [ -x "${TD_AGENT_RUBY}" ] || exit 0

      # Source function library.
      . /etc/init.d/functions

      # Define LSB log_* functions.
      # Depend on lsb-base (>= 3.0-6) to ensure that this file is present.
      . /lib/lsb/init-functions

      # Check the user
      if [ -n "${TD_AGENT_USER}" ]; then
        if ! getent passwd | grep -q "^${TD_AGENT_USER}:"; then
          echo "$0: user for running ${TD_AGENT_NAME} doesn't exist: ${TD_AGENT_USER}" >&2
          exit 1
        fi
        mkdir -p "$(dirname "${TD_AGENT_PID_FILE}")"
        chown -R "${TD_AGENT_USER}" "$(dirname "${TD_AGENT_PID_FILE}")"
        START_STOP_DAEMON_ARGS="${START_STOP_DAEMON_ARGS} --user ${TD_AGENT_USER}"
      fi

      if [ -n "${TD_AGENT_GROUP}" ]; then
        if ! getent group -s files | grep -q "^${TD_AGENT_GROUP}:"; then
          echo "$0: group for running ${TD_AGENT_NAME} doesn't exist: ${TD_AGENT_GROUP}" >&2
          exit 1
        fi
        TD_AGENT_ARGS="${TD_AGENT_ARGS} --group ${TD_AGENT_GROUP}"
      fi

      if [ -n "${TD_AGENT_PID_FILE}" ]; then
        mkdir -p "$(dirname "${TD_AGENT_PID_FILE}")"
        chown -R "${TD_AGENT_USER}" "$(dirname "${TD_AGENT_PID_FILE}")"
        TD_AGENT_ARGS="${TD_AGENT_ARGS} --daemon ${TD_AGENT_PID_FILE}"
      fi

      # 2012/04/17 Kazuki Ohta <k@treasure-data.com>
      # Use jemalloc to avoid memory fragmentation
      if [ -f "${TD_AGENT_HOME}/embedded/lib/libjemalloc.so" ]; then
        export LD_PRELOAD="${TD_AGENT_HOME}/embedded/lib/libjemalloc.so"
      fi

      kill_by_file() {
        local sig="$1"
        shift 1
        local pid="$(cat "$@" 2>/dev/null || true)"
        if [ -n "${pid}" ]; then
          if /bin/kill "${sig}" "${pid}" 1>/dev/null 2>&1; then
            return 0
          else
            return 2
          fi
        else
          return 1
        fi
      }

      #
      # Function that starts the daemon/service
      #
      do_start() {
        # Set Max number of file descriptors for the safety sake
        # see http://docs.fluentd.org/en/articles/before-install
        ulimit -n 65536 1>/dev/null 2>&1 || true
        local RETVAL=0
        daemon --pidfile="${TD_AGENT_PID_FILE}" ${START_STOP_DAEMON_ARGS} "${TD_AGENT_RUBY}" ${TD_AGENT_ARGS} || RETVAL="$?"
        [ $RETVAL -eq 0 ] && touch "${TD_AGENT_LOCK_FILE}"
        return $RETVAL
      }

      #
      # Function that stops the daemon/service
      #
      do_stop() {
        # Return
        #   0 if daemon has been stopped
        #   1 if daemon was already stopped
        #   2 if daemon could not be stopped
        #   other if a failure occurred
        if [ -e "${TD_AGENT_PID_FILE}" ]; then
          # Use own process termination instead of killproc because killproc can't wait SIGTERM
          if kill_by_file -TERM "${TD_AGENT_PID_FILE}"; then
            local i
            for i in $(seq "${STOPTIMEOUT}"); do
              if kill_by_file -0 "${TD_AGENT_PID_FILE}"; then
                sleep 1
              else
                break
              fi
            done
            if kill_by_file -0 "${TD_AGENT_PID_FILE}"; then
              echo -n "Timeout error occurred trying to stop ${TD_AGENT_NAME}..."
              return 2
            else
              rm -f "${TD_AGENT_PID_FILE}"
              rm -f "${TD_AGENT_LOCK_FILE}"
            fi
          else
            return 1
          fi
        else
          if killproc "${TD_AGENT_PROG_NAME:-${TD_AGENT_NAME}}"; then
            rm -f "${TD_AGENT_PID_FILE}"
            rm -f "${TD_AGENT_LOCK_FILE}"
          else
            return 2
          fi
        fi
      }

      #
      # Function that sends a SIGHUP to the daemon/service
      #
      do_reload() {
        kill_by_file -HUP "${TD_AGENT_PID_FILE}"
      }
      do_restart() {
        if ! do_configtest; then
          return 1
        fi
        local val=0
        do_stop || val="$?"
        case "${val}" in
        0 | 1 )
          if ! do_start; then
            return 1
          fi
          ;;
        * ) # Failed to stop
          return 1
          ;;
        esac
      }

      do_configtest() {
        eval "${TD_AGENT_ARGS} ${START_STOP_DAEMON_ARGS} --dry-run -q"
      }

      RETVAL=0
      case "$1" in
      "start" )
        echo -n "Starting ${TD_AGENT_NAME}: "
        do_start || RETVAL="$?"
        case "$RETVAL" in
        0 )
          log_success_msg "${TD_AGENT_NAME}"
          ;;
        * )
          log_failure_msg "${TD_AGENT_NAME}"
          exit 1
          ;;
        esac
        ;;
      "stop" )
        echo -n "Stopping ${TD_AGENT_NAME}: "
        do_stop || RETVAL="$?"
        case "$RETVAL" in
        0 )
          log_success_msg "${TD_AGENT_NAME}"
          ;;
        * )
          log_failure_msg "${TD_AGENT_NAME}"
          exit 1
          ;;
        esac
        ;;
      "reload" )
        echo -n "Reloading ${TD_AGENT_NAME}: "
        if ! do_configtest; then
          log_failure_msg "${TD_AGENT_NAME}"
          exit 1
        fi
        if do_reload; then
          log_success_msg "${TD_AGENT_NAME}"
        else
          log_failure_msg "${TD_AGENT_NAME}"
          exit 1
        fi
        ;;
      "restart" )
        echo -n "Restarting ${TD_AGENT_NAME}: "
        if do_restart; then
          log_success_msg "${TD_AGENT_NAME}"
        else
          log_failure_msg "${TD_AGENT_NAME}"
          exit 1
        fi
        ;;
      "status" )
        if kill_by_file -0 "${TD_AGENT_PID_FILE}"; then
          log_success_msg "${TD_AGENT_NAME} is running"
        else
          log_failure_msg "${TD_AGENT_NAME} is not running"
          exit 1
        fi
        ;;
      "condrestart" )
        if [ -f "${TD_AGENT_LOCK_FILE}" ]; then
          echo -n "Restarting ${TD_AGENT_NAME}: "
          if do_restart; then
            log_success_msg "${TD_AGENT_NAME}"
          else
            log_failure_msg "${TD_AGENT_NAME}"
            exit 1
          fi
        fi
        ;;
      "configtest" )
        if do_configtest; then
          log_success_msg "${TD_AGENT_NAME}"
        else
          log_failure_msg "${TD_AGENT_NAME}"
          exit 1
        fi
        ;;
      * )
        echo "Usage: $0 {start|stop|reload|restart|condrestart|status|configtest}" >&2
        exit 1
        ;;
      esac

commands:
    01-restart-td-agent:
        command: sudo /etc/init.d/td-agent restart

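Once all this is in place, run eb deploy and you are done.

The initial td-agent install is heavy and can sometimes trigger AutoScaling.
There is not much to be done about that, so it may be worth bumping the instance size just for the first deploy.

I also hit some errors during setup, so I plan to add them here when I get around to it.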
