
Web API Per-Second Access Limiter: Python, Ruby, PHP, and Perl Sample Code

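1. Summary

This article presents sample code for a Web API per-second access limiter, implemented in four interpreted languages (Python, Ruby, PHP, and Perl) and running on a memory cache system built with GlusterFS and tmpfs.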

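2. Introduction

In Web API development projects, there are sometimes requirements to restrict access, for example by returning the HTTP status code “429 Too Many Requests” when the number of accesses within a given period exceeds a limit.
Because the reference period is short, such a requirement usually also calls for low load, high speed, and accuracy.
This article therefore presents sample code for a Web API per-second access limiter, implemented in Python, Ruby, PHP, and Perl, and running on a memory cache system built with GlusterFS and tmpfs.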

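3. Requirements

  • If more than N accesses arrive within one second, block the request and return the HTTP status code “429 Too Many Requests”.
  • N depends on the project specifications.
  • Because access is controlled per second, this processing itself must not become a bottleneck.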
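
The rule above can be sketched in its simplest single-process form as a counter keyed by the current whole second. This is only an illustration of "more than N accesses within one second yields 429". The function name `check_access`, the in-memory dictionary, and the injected `now` parameter are assumptions made for this example; the sample code in this article keeps its state in files instead.

```python
import time

# Hypothetical in-memory state: {whole_second: access_count}.
# The article's samples keep this state in files on tmpfs instead.
_counts = {}

def check_access(limit, now=None):
    """Return 200 while at most `limit` accesses fall in the current second, else 429."""
    if now is None:
        now = time.time()
    second = int(now)

    # Forget counters of past seconds so the dictionary does not grow.
    for past in [s for s in _counts if s < second]:
        del _counts[past]

    _counts[second] = _counts.get(second, 0) + 1
    return 429 if _counts[second] > limit else 200

# With N = 2, the third access within the same second is rejected,
# and the counter starts over in the next second.
results = [check_access(2, now=100.1), check_access(2, now=100.5),
           check_access(2, now=100.9), check_access(2, now=101.0)]
print(results)
```

A dictionary like this only works inside one process, which is why a shared data resource is needed once several processes or servers handle the requests.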

4. Environment

  • RHEL 7 family
  • tmpfs
  • GlusterFS 4.1.5
  • Python >= 3
  • PHP >= 5
  • Ruby >= 2
  • Perl >= 5

5. Design

5-1. Overall architecture example

AccessLimiter.png

5-2. Memory cache system configuration example

GlusterFS_tmpfs_architecture.png

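5-3. Web application framework

Even a lightweight framework incurs considerable load before it finishes starting up, so the limiter is implemented before processing enters the framework.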

5-4. Library loading

Loading libraries adds load, so the implementation relies on built-in functionality as much as possible.
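
One rough way to check how much a library load costs is to time the import itself. The sketch below is an assumption-laden illustration, not part of the limiter: `measure_import_seconds` is a hypothetical helper, and it re-imports only the top-level module (submodules that are already loaded stay cached), so the numbers are indicative at best.

```python
import importlib
import sys
import time

def measure_import_seconds(module_name):
    """Return the wall-clock seconds spent importing `module_name` again."""
    # Drop the cached top-level module so the import statement really runs.
    # Submodules that are already loaded stay cached, so this is a rough figure.
    sys.modules.pop(module_name, None)
    started = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - started

# Example: compare a few standard-library modules (results vary by machine).
for name in ('json', 'csv', 'uuid'):
    print(name, round(measure_import_seconds(name) * 1000, 3), 'ms')
```

For a fuller picture, CPython 3.7 and later can print per-module import times when started with `python -X importtime`.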

5-5. Exception handling

Adding load through framework-dependent exception handling would defeat the purpose, so the implementation uses simple, low-level code.
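
As a minimal sketch of what "simple, low-level" can mean here, an error can be carried as a plain (message, code) pair on a built-in exception and turned directly into a raw CGI response. The helper name `build_cgi_response` is hypothetical and exists only for this illustration.

```python
import json

def build_cgi_response(http_code, http_status):
    """Build a raw CGI response (headers plus JSON body) for an error status."""
    headers = [
        'Status: %d %s' % (http_code, http_status),
        'Content-Type: application/json; charset=utf-8',
    ]
    body = json.dumps({'message': http_status})
    # A blank line separates the CGI headers from the body.
    return '\n'.join(headers) + '\n\n' + body

try:
    # The error is carried as plain (message, code) arguments.
    raise Exception('Too Many Requests', 429)
except Exception as e:
    response = build_cgi_response(e.args[1], e.args[0])

print(response)
```

No framework exception class, handler registry, or template is involved, so nothing extra has to be loaded on the rejection path.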

5-6. Data resource

Heavy data resources such as an RDBMS are avoided; an in-memory file system based on tmpfs is used instead.
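
The idea of using a file system as the data resource can be sketched as "one empty file per access, counted by listing the directory". In the sketch below a temporary directory stands in for the tmpfs mount, and `count_accesses_in_second` is a hypothetical helper that takes the timestamp as a parameter so the behavior can be reproduced; it is a simplified illustration, not the article's sample code.

```python
import os
import tempfile

def count_accesses_in_second(directory, now):
    """Record one access at time `now` and count the accesses of that same second."""
    # One empty file per access; the file name is the timestamp itself.
    own_name = '%.6f' % now
    open(os.path.join(directory, own_name), 'w').close()
    # Compare with exactly the value that is encoded in the file name.
    now = float(own_name)

    count = 0
    for name in os.listdir(directory):
        stamp = float(name)
        if int(stamp) == int(now):
            # Accesses recorded later than this one are not counted.
            if stamp <= now:
                count += 1
        elif int(stamp) < int(now):
            # Files from past seconds are no longer needed.
            os.remove(os.path.join(directory, name))
    return count

# A temporary directory stands in for the tmpfs mount in this sketch.
with tempfile.TemporaryDirectory() as root:
    counts = [count_accesses_in_second(root, t) for t in (100.10, 100.20, 100.30, 101.05)]
    remaining = sorted(os.listdir(root))

print(counts, remaining)
```

Because each access only creates its own file, no lock or transaction is needed; the count is derived from whatever files exist at that moment.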

5-7. Data synchronization

Data synchronization uses the file synchronization system provided by GlusterFS.

6. Memory cache system

For how to build it, see “Distributed Fault-Tolerant Memory Cache System with GlusterFS and tmpfs”.

7. Sample code

7-1. Python sample code

Python >= 3

access_limiter.py
#!/usr/bin/env python3
# coding:utf-8

import time
import cgi  # Note: removed from the standard library in Python 3.13 (PEP 594)
import os
from pathlib import Path
import json

# Definition
def limitSecondsAccess():
    try:
        # Init
        ## Access Timestamp Build
        sec_usec_timestamp = time.time()
        sec_timestamp = int(sec_usec_timestamp)

        ## Access Limit Default Value
        ### Depends on Specifications: For Example 10
        access_limit = 10

        ## Roots Build
        ### Depends on Environment: For Example '/cache_client'
        tmp_root = '/cache_client'
        access_root = os.path.join(tmp_root, 'access')

        ## Auth Key
        ### Depends on Specifications: For Example 'app_id'
        auth_key = 'app_id'

        ## Response Content-Type
        ### Depends on Specifications: For Example JSON and UTF-8
        response_content_type = 'Content-Type: application/json; charset=utf-8'

        ### Response Bodies Build
        ### Depends on Design
        response_bodies = {}

        # Authorized Key Check
        query = cgi.FieldStorage()
        auth_id = query.getvalue(auth_key)
        if not auth_id:
            raise Exception('Unauthorized', 401)

        # The Auth Root Build
        auth_root = os.path.join(access_root, auth_id)

        # The Auth Root Check
        if not os.path.isdir(auth_root):
            # The Auth Root Creation
            os.makedirs(auth_root, exist_ok=True)

        # An Access File Creation Using Micro Timestamp
        ## Other data resources, such as a memory cache or RDB transactions, could be used instead.
        ## This sample is lightweight because it requires neither file locking nor transaction processing.
        ## However, in a cluster configuration, file system synchronization is required.
        access_file_path = os.path.join(auth_root, str(sec_usec_timestamp))
        path = Path(access_file_path)
        path.touch()

        # The Access Counts Check
        access_counts = 0
        for base_name in os.listdir(auth_root):
            ## A Access File Path Build
            file_path = os.path.join(auth_root, base_name)

            ## Not File Type
            if not os.path.isfile(file_path):
                continue

            ## The Base Name Data Type Casting
            base_name_sec_usec_timestamp = float(base_name)
            base_name_sec_timestamp = int(base_name_sec_usec_timestamp)

            ## Same Seconds Timestamp
            if sec_timestamp == base_name_sec_timestamp:

                ### Skip an Overtaking Access (One That Arrived After This Access)
                if sec_usec_timestamp < base_name_sec_usec_timestamp:
                    continue

                ### Access Counts Increment
                access_counts += 1

                ### Too Many Requests
                if access_counts > access_limit:
                    raise Exception('Too Many Requests', 429)

                continue

            ## Past Access Files Garbage Collection
            if sec_timestamp > base_name_sec_timestamp:
                try:
                    os.remove(file_path)
                except FileNotFoundError:
                    ### Another Process Already Removed The File
                    pass

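    except Exception as e:
        # Exception Tuple to HTTP Status Code
        ## Only the exceptions raised above carry (message, code);
        ## anything else (OSError, ValueError, ...) is handled as 500 below.
        if len(e.args) == 2 and isinstance(e.args[0], str) and isinstance(e.args[1], int):
            http_status = e.args[0]
            http_code = e.args[1]
        else:
            http_status = 'Internal Server Error'
            http_code = 0

        # 4xx
        if 400 <= http_code <= 499:
            # logging
            ## snip...
            pass
        # 5xx
        elif http_code >= 500:
            # logging
            ## snip...

            ## Do Not Expose The Internal Exception Message
            http_status = 'Internal Server Error'
        else:
            # Logging
            ## snip...

            # HTTP Status Code for The Response
            http_status = 'Internal Server Error'
            http_code = 500

        # Response Headers Feed
        ## print() appends one newline, so "\n" yields the blank line that ends the headers
        print('Status: ' + str(http_code) + ' ' + http_status)
        print(response_content_type + "\n")

        # A Response Body Build
        response_bodies['message'] = http_status
        response_body = json.dumps(response_bodies)

        # The Response Body Feed
        print(response_body)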

# Execution
limitSecondsAccess()
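
In the sample above, the value of app_id is joined into a file system path as received. One possible guard is to accept only IDs that are safe as a directory name before building the path. The allowed character set, the length limit, and the helper name `build_auth_root` below are assumptions for this sketch and would depend on the actual ID specification.

```python
import os
import re

# Assumption for this sketch: IDs consist of 1-64 letters, digits, '-' or '_'.
_AUTH_ID_PATTERN = re.compile(r'[A-Za-z0-9_-]{1,64}')

def build_auth_root(access_root, auth_id):
    """Return the per-ID directory path, or raise if the ID is not a safe name."""
    # A repeated query parameter arrives as a list, so the type is checked too.
    if not isinstance(auth_id, str) or not _AUTH_ID_PATTERN.fullmatch(auth_id):
        # Same (message, code) convention as the sample above.
        raise Exception('Unauthorized', 401)
    return os.path.join(access_root, auth_id)

print(build_auth_root('/cache_client/access', 'client-01'))

try:
    build_auth_root('/cache_client/access', '../../etc')
except Exception as e:
    print(e.args)
```

The same check applies equally to the Ruby, PHP, and Perl versions, where the ID is concatenated into the path in the same way.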

7-2. Ruby sample code

Ruby >= 2

access_limiter.rb
#!/usr/bin/ruby
# -*- coding: utf-8 -*-

require 'fileutils'
require 'cgi'
require 'json'

# Definition
def limitScondsAccess

    begin
        # Init
        ## Access Timestamp Build
        time = Time.now
        sec_timestamp = time.to_i
        sec_usec_timestamp_string = "%10.6f" % time.to_f
        sec_usec_timestamp = sec_usec_timestamp_string.to_f

        ## Access Limit Default Value
        ### Depends on Specifications: For Example 10
        access_limit = 10

        ## Roots Build
        ### Depends on Environment: For Example '/cache_client'
        tmp_root = '/cache_client'
        access_root = tmp_root + '/access'

        ## Auth Key
        ### Depends on Specifications: For Example 'app_id'
        auth_key = 'app_id'

        ## Response Content-Type
        ### Depends on Specifications: For Example JSON and UTF-8
        response_content_type = 'application/json'
        response_charset = 'utf-8'

        ## Response Bodies Build
        ### Depends on Design
        response_bodies = {}

        # Authorized Key Check
        cgi = CGI.new
        if ! cgi.has_key?(auth_key) then
            raise 'Unauthorized:401'
        end
        auth_id = cgi[auth_key]

        # The Auth Root Build
        auth_root = access_root + '/' + auth_id

        # The Auth Root Check
        if ! FileTest::directory?(auth_root) then
            # The Auth Root Creation
            ## FileUtils.mkdir_p raises SystemCallError on failure instead of returning false
            begin
                FileUtils.mkdir_p(auth_root, :mode => 0775)
            rescue SystemCallError
                raise 'Could not create the auth root. ' + auth_root + ':500'
            end
        end

        # An Access File Creation Using Micro Timestamp
        ## Other data resources, such as a memory cache or RDB transactions, could be used instead.
        ## This sample is lightweight because it requires neither file locking nor transaction processing.
        ## However, in a cluster configuration, file system synchronization is required.
        access_file_path = auth_root + '/' + sec_usec_timestamp.to_s
        ## FileUtils.touch raises SystemCallError on failure instead of returning false
        begin
            FileUtils::touch(access_file_path)
        rescue SystemCallError
            raise 'Could not create the access file. ' + access_file_path + ':500'
        end

        # The Access Counts Check
        access_counts = 0
        Dir.glob(auth_root + '/*') do |access_file_path|

            # Not File Type
            if ! FileTest::file?(access_file_path) then
                next
            end

            # The File Path to The Base Name
            base_name = File.basename(access_file_path)

            # The Base Name to Integer Data Type
            base_name_sec_timestamp = base_name.to_i

            # Same Seconds Timestamp
            if sec_timestamp == base_name_sec_timestamp then

                ### The Base Name to Float Data Type
                base_name_sec_usec_timestamp = base_name.to_f

                ### A Overtaken Processing
                if sec_usec_timestamp < base_name_sec_usec_timestamp then
                    next
                end

                ### Access Counts Increment
                access_counts += 1

                ### Too Many Requests
                if access_counts > access_limit then
                    raise 'Too Many Requests:429'
                end

                next
            end

            # Past Access Files Garbage Collection
            if sec_timestamp > base_name_sec_timestamp then
                begin
                    File.unlink access_file_path
                rescue Errno::ENOENT
                    ## Another Process Already Removed The File
                end
            end
        end

        # The Response Feed
        cgi.out({
            ## Response Headers Feed
            'type' => 'text/html',
            'charset' => response_charset,
        }) {
            ## The Response Body Feed
            ''
        }

    rescue => e
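        # Exception to HTTP Status Code
        ## Only the messages raised above end with ':<3-digit code>';
        ## any other exception (e.g. Errno::EACCES) is handled as 500 below.
        matched = /\A(.*):(\d{3})\z/m.match(e.message)
        http_status = matched ? matched[1] : 'Internal Server Error'
        http_code = matched ? matched[2] : '0'

        # 4xx
        if http_code >= '400' && http_code <= '499' then
            # logging
            ## snip...
        # 5xx
        elsif http_code >= '500' then
            # logging
            ## snip...

            # Do Not Expose The Internal Exception Message
            http_status = 'Internal Server Error'
        else
            # Logging
            ## snip...

            # HTTP Status Code for The Response
            http_status = 'Internal Server Error'
            http_code = '500'
        end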

        # The Response Body Build
        response_bodies['message'] = http_status
        response_body = JSON.generate(response_bodies)

        # The Response Feed
        cgi.out({
            ## Response Headers Feed
            'status' => http_code + ' ' + http_status,
            'type' => response_content_type,
            'charset' => response_charset,
        }) {
            ## The Response Body Feed
            response_body
        }
    end
end

limitScondsAccess

7-3. PHP sample code

PHP >= 5

access_limiter.php
<?php

# Definition
function limitSecondsAccess()
{
    try {
        # Init
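        ## Access Timestamp Build
        ### Round-trip through the string form so that this value equals floatval() of the access file name built below
        $sec_usec_timestamp = floatval(strval(microtime(true)));
        ### Cast to integer: the strict comparison (===) with intval() below never matches a string
        $sec_timestamp = intval($sec_usec_timestamp);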

        ## Access Limit Default Value
        ### Depends on Specifications: For Example 10
        $access_limit = 10;

        ## Roots Build
        ### Depends on Environment: For Example '/cache_client'
        $tmp_root = '/cache_client';
        $access_root = $tmp_root . '/access';

        ## Auth Key
        ### Depends on Specifications: For Example 'app_id'
        $auth_key = 'app_id';

        ## Response Content-Type
        ## Depends on Specifications: For Example JSON and UTF-8
        $response_content_type = 'Content-Type: application/json; charset=utf-8';

        ## Response Bodies Build
        ### Depends on Design
        $response_bodies = array();

        # Authorized Key Check
        if (empty($_REQUEST[$auth_key])) {
            throw new Exception('Unauthorized', 401);
        }
        $auth_id = $_REQUEST[$auth_key];

        # The Auth Root Build
        $auth_root = $access_root . '/' . $auth_id;

        # The Auth Root Check
        if (! is_dir($auth_root)) {
            ## The Auth Root Creation
            if (! mkdir($auth_root, 0775, true)) {
                throw new Exception('Could not create the auth root. ' . $auth_root, 500);
            }
        }

        # An Access File Creation Using Micro Timestamp
        /* Other data resources, such as a memory cache or RDB transactions, could be used instead.
         * This sample is lightweight because it requires neither file locking nor transaction processing.
         * However, in a cluster configuration, file system synchronization is required.
         */
        $access_file_path = $auth_root . '/' . strval($sec_usec_timestamp);
        if (! touch($access_file_path)) {
            throw new Exception('Could not create the access file. ' . $access_file_path, 500);
        }

        # The Auth Root Scanning
        if (! $base_names = scandir($auth_root)) {
            throw new Exception('Could not scan the auth root. ' . $auth_root, 500);
        }

        # The Access Counts Check
        $access_counts = 0;
        foreach ($base_names as $base_name) {
            ## A current or parent dir
            if ($base_name === '.' || $base_name === '..') {
                continue;
            }

            ## A Access File Path Build
            $file_path = $auth_root . '/' . $base_name;

            ## Not File Type
            if (! is_file($file_path)) {
                continue;
            }

            ## The Base Name to Integer Data Type
            $base_name_sec_timestamp = intval($base_name);

            ## Same Seconds Timestamp
            if ($sec_timestamp === $base_name_sec_timestamp) {

                ## The Base Name to Float Data Type
                $base_name_sec_usec_timestamp = floatval($base_name);

                ### A Overtaken Processing
                if ($sec_usec_timestamp < $base_name_sec_usec_timestamp) {
                    continue;
                }

                ### Access Counts Increment
                $access_counts++;

                ### Too Many Requests
                if ($access_counts > $access_limit) {
                    throw new Exception('Too Many Requests', 429);
                }

                continue;
            }

            ## Past Access Files Garbage Collection
            if ($sec_timestamp > $base_name_sec_timestamp) {
                @unlink($file_path);
            }
        }
    } catch (Exception $e) {
        # The Exception to HTTP Status Code
        $http_code = $e->getCode();
        $http_status = $e->getMessage();

        # 4xx
        if ($http_code >= 400 && $http_code <= 499) {
            # logging
            ## snip...
        # 5xx
        } else if ($http_code >= 500) {
            # logging
            ## snip...

            # Do Not Expose The Internal Exception Message
            $http_status = 'Internal Server Error';
        # Others
        } else {
            # Logging
            ## snip...

            # HTTP Status Code for The Response
            $http_status = 'Internal Server Error';
            $http_code = 500;
        }

        # Response Headers Feed
        header('HTTP/1.1 ' . $http_code . ' ' . $http_status);
        header($response_content_type);

        # A Response Body Build
        $response_bodies['message'] = $http_status;
        $response_body = json_encode($response_bodies);

        # The Response Body Feed
        exit($response_body);
    }
}

# Execution
limitSecondsAccess();
?>

7-4. Perl sample code

Perl >= 5

access_limiter.pl
#!/usr/bin/perl

use strict;
use warnings;
use utf8;
use Time::HiRes qw(gettimeofday);
use CGI;
use File::Basename;
use JSON;

# Definition
sub limitSecondsAccess {

    eval {
        # Init
        ## Access Timestamp Build
        my ($sec_timestamp, $usec_timestamp) = gettimeofday();
        ### Zero-pad the microseconds; without it, 5123 usec would be read as .5123 sec
        my $sec_usec_timestamp = sprintf('%d.%06d', $sec_timestamp, $usec_timestamp);

        ## Access Limit Default Value
        ### Depends on Specifications: For Example 10
        my $access_limit = 10;

        ## Roots Build
        ### Depends on Environment: For Example '/cache_client'
        my $tmp_root = '/cache_client';
        my $access_root = $tmp_root . '/access';

        ## Auth Key
        ### Depends on Specifications: For Example 'app_id'
        my $auth_key = 'app_id';

        ## Response Content-Type
        ## Depends on Specifications: For Example JSON and UTF-8

        ## Response Bodies Build
        ### Depends on Design
        my %response_bodies;

        # Authorized Key Check
        my $CGI = CGI->new;
        if (! defined($CGI->param($auth_key))) {
            die('Unauthorized`401`');
        }
        my $auth_id = $CGI->param($auth_key);

        # The Auth Root Build
        my $auth_root = $access_root . '/' . $auth_id;

        # The Access Root Check
        if (! -d $access_root) {
            ## The Access Root Creation
            if (! mkdir($access_root)) {
                die('Could not create the access root. ' . $access_root . '`500`');
            }
        }

        # The Auth Root Check
        if (! -d $auth_root) {
            ## The Auth Root Creation
            if (! mkdir($auth_root)) {
                die('Could not create the auth root. ' . $auth_root . '`500`');
            }
        }

        # An Access File Creation Using Micro Timestamp
        ## Other data resources, such as a memory cache or RDB transactions, could be used instead.
        ## This sample is lightweight because it requires neither file locking nor transaction processing.
        ## However, in a cluster configuration, file system synchronization is required.
        my $access_file_path = $auth_root . '/' . $sec_usec_timestamp;
        ## A lexical file handle avoids closing a handle that was never opened
        open(my $fh, '>', $access_file_path)
            or die('Could not create the access file. ' . $access_file_path . '`500`');
        close($fh);

        # The Auth Root Scanning
        my @file_pathes = glob($auth_root . "/*");
        if (! @file_pathes) {
            die('Could not scan the auth root. ' . $auth_root . '`500`');
        }

        # The Access Counts Check
        my $access_counts = 0;
        foreach my $file_path (@file_pathes) {

            ## Not File Type
            if (! -f $file_path) {
                next;
            }

            ## The Base Name Extract
            my $base_name = basename($file_path);

            ## The Base Name to Integer Data Type
            my $base_name_sec_timestamp = int($base_name);

            ## Same Seconds Timestamp
            if ($sec_timestamp == $base_name_sec_timestamp) {

                ## The Base Name to Float Data Type
                my $base_name_sec_usec_timestamp = $base_name;

                ### A Overtaken Processing
                if ($sec_usec_timestamp lt $base_name_sec_usec_timestamp) {
                    next;
                }

                ### Access Counts Increment
                $access_counts++;

                ### Too Many Requests
                if ($access_counts > $access_limit) {
                    die("Too Many Requests`429`");
                }

                next;
            }

            ## Past Access Files Garbage Collection
            if ($sec_timestamp > $base_name_sec_timestamp) {
                unlink($file_path);
            }
        }
    };

    if ($@) {
        # Error Elements Extract
        my @e = split(/`/, $@);

        # Exception to HTTP Status Code
        my $http_status = $e[0];
        my $http_code = '0';
        if (defined($e[1])) {
            $http_code = $e[1];
        }

        # 4xx
        if ($http_code ge '400' && $http_code le '499') {
            # logging
            ## snip...
        # 5xx
        } elsif ($http_code ge '500') {
            # logging
            ## snip...

            ## Do Not Expose The Internal Exception Message
            $http_status = 'Internal Server Error';
        # Others
        } else {
            # logging
            ## snip...

            $http_status = 'Internal Server Error';
            $http_code = '500';
        }

        # Response Headers Feed
        print("Status: " . $http_code . " " . $http_status . "\n");
        print('Content-Type: application/json; charset=utf-8' . "\n\n");

        # A Response Body Build
        ## Pass the hash reference directly instead of assigning it to the global $a
        my %response_bodies;
        $response_bodies{'message'} = $http_status;
        my $response_body = encode_json(\%response_bodies);

        # The Response Body Feed
        print($response_body);
    }

}

# Execution
limitSecondsAccess();

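8. Conclusion

This article presented sample code for a Web API per-second access limiter, implemented in Python, Ruby, PHP, and Perl, and running on a memory cache system built with GlusterFS and tmpfs.
A functional requirement such as a per-second access limiter has a short reference period, so it usually also calls for low load, high speed, and accuracy.
The points listed here are the ones that seemed likely to matter for that; I hope they are of some use.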