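Web API Per-Second Access Limiter: Python, Ruby, PHP, and Perl Sample Code


1. Summary

This article presents sample code for a Web API per-second access limiter, implemented in the interpreted languages Python, Ruby, PHP, and Perl, running on top of a GlusterFS & tmpfs memory cache system.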


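2. Introduction

In Web API service development projects, you may run into a requirement to restrict access, for example by returning the HTTP status code “429 Too Many Requests” when the number of accesses within a given period exceeds a limit.

With this kind of requirement the reference period is short, so the limiter itself is likely to need low load, high speed, and accuracy.

This article therefore shows sample code for a per-second access limiter in Python, Ruby, PHP, and Perl, placed on top of a GlusterFS & tmpfs memory cache system.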


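3. Requirements


  • When accesses exceed N per second, return the HTTP status code “429 Too Many Requests” and block the request.

  • N depends on the project's specifications.

  • Because access is controlled per second, this processing must not become a bottleneck.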
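
The counting rule in the first requirement can be sketched as a pure function. This is only an illustration in Python: the name `is_over_limit` and its parameters are hypothetical, and it assumes the fixed-window interpretation used by the sample code later in this article, where only accesses inside the same wall-clock second, and not later than the current access, are counted.

```python
def is_over_limit(now, access_timestamps, limit):
    """Return True when the accesses up to `now` in the same second exceed `limit`."""
    current_second = int(now)
    count = 0
    for ts in access_timestamps:
        # Count only accesses in the same wall-clock second that are not later than `now`.
        if int(ts) == current_second and ts <= now:
            count += 1
    return count > limit


# Three accesses fall into second 100; with N = 2 the third one is rejected.
accesses = [99.9, 100.1, 100.2, 100.3]
print(is_over_limit(100.2, accesses, 2))  # False
print(is_over_limit(100.3, accesses, 2))  # True
```

Note that a fixed window resets at every second boundary, so a burst straddling two seconds can briefly exceed N; whether that is acceptable depends on the project's specifications.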


4. Environment


  • RHEL 7 family

  • tmpfs

  • GlusterFS 4.1.5

  • Python >= 3

  • PHP >= 5

  • Ruby >= 2

  • Perl >= 5


5. Design


5-1. Overall Architecture Example

AccessLimiter.png


5-2. Memory Cache System Architecture Example

GlusterFS_tmpfs_architecture.png


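5-3. Web Application Framework

Even a lightweight framework incurs considerable load before it finishes starting up, so the limiter is implemented before processing enters the framework.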
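
As one way to picture "before processing enters the framework", here is a minimal sketch in Python. It assumes a WSGI deployment, which is not what the CGI-style samples in this article use, and every name in it (`limit_before_framework`, `is_allowed`, `demo_app`, `record_status`) is hypothetical. The wrapper answers with 429 itself and only calls the framework when the request is allowed.

```python
def limit_before_framework(app, is_allowed):
    """Wrap a WSGI app so the limiter decides before any framework code runs."""
    def wrapped(environ, start_response):
        if not is_allowed(environ):
            # Reject here; the wrapped framework application is never invoked.
            body = b'{"message": "Too Many Requests"}'
            start_response('429 Too Many Requests', [
                ('Content-Type', 'application/json; charset=utf-8'),
                ('Content-Length', str(len(body))),
            ])
            return [body]
        return app(environ, start_response)
    return wrapped


def demo_app(environ, start_response):
    # Stand-in for the real framework entry point (hypothetical).
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok']


statuses = []

def record_status(status, headers):
    # Minimal start_response replacement that only records the status line.
    statuses.append(status)


# The limiter decision is faked with an 'allowed' flag for this demonstration.
guarded = limit_before_framework(demo_app, lambda environ: environ.get('allowed', False))
guarded({'allowed': True}, record_status)
guarded({}, record_status)
print(statuses)
```

In a real setup `is_allowed` would hold the per-second check, and the framework import itself could be deferred until the first allowed request.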


5-4. Library Loading

Loading libraries adds load, so the implementation relies on built-in features wherever possible.


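5-5. Exception Handling

Adding load through framework-dependent exception handling would defeat the purpose, so exceptions are handled with simple, low-level code.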


5-6. Data Resource

Heavy data resources such as an RDBMS are avoided; a tmpfs-based in-memory system is used instead.
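
Since the whole design relies on the access files living in memory, it can be useful to confirm which filesystem actually backs a directory. The following is a rough sketch in Python that parses text in the `/proc/mounts` format; the function name `mount_fstype`, the sample mount table, the host name `gluster1`, and the path `/cache_server` are assumptions for illustration, and escaped characters in mount points (such as `\040` for a space) are not handled.

```python
import os


def mount_fstype(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`, or None."""
    path = os.path.normpath(path)
    best_point, best_type = '', None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip('/') + '/'
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest (most specific) mount point; later entries win ties.
            if len(mount_point) >= len(best_point):
                best_point, best_type = mount_point, fstype
    return best_type


# Hypothetical mount table in /proc/mounts format.
sample = (
    "/dev/sda1 / xfs rw 0 0\n"
    "tmpfs /cache_server tmpfs rw,size=1g 0 0\n"
    "gluster1:/cache /cache_client fuse.glusterfs rw 0 0\n"
)
print(mount_fstype('/cache_server/brick', sample))
print(mount_fstype('/cache_client/access', sample))
```

On a real Linux host the text would come from reading `/proc/mounts`. With a GlusterFS native client mount, the application side would be expected to report `fuse.glusterfs`, while the storage directory on the server side would report `tmpfs`.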


5-7. Data Synchronization

Data is synchronized across nodes with a GlusterFS-based file synchronization system.


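6. Memory Cache System

For how to build it, see the article “GlusterFS と tmpfs による分散型フォールトトレラントメモリキャッシュシステム” (a distributed, fault-tolerant memory cache system built with GlusterFS and tmpfs).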


7. Sample Code


7-1. Python Sample Code

Python >= 3


access_limiter.py

#!/usr/bin/env python3
# coding:utf-8

# NOTE: The cgi module was deprecated in Python 3.11 and removed in Python 3.13.
import time
import cgi
import os
from pathlib import Path
import re
import json

# Definition
def limitSecondsAccess():
    try:
        # Init
        ## Access Timestamp Build
        sec_usec_timestamp = time.time()
        sec_timestamp = int(sec_usec_timestamp)

        ## Access Limit Default Value
        ### Depends on Specifications: For Example 10
        access_limit = 10

        ## Roots Build
        ### Depends on Environment: For Example '/cache_client'
        tmp_root = '/cache_client'
        access_root = os.path.join(tmp_root, 'access')

        ## Auth Key
        ### Depends on Specifications: For Example 'app_id'
        auth_key = 'app_id'

        ## Response Content-Type
        ### Depends on Specifications: For Example JSON and UTF-8
        response_content_type = 'Content-Type: application/json; charset=utf-8'

        ## Response Bodies Build
        ### Depends on Design
        response_bodies = {}

        # Authorized Key Check
        ## The ID becomes a directory name, so only a safe character set is accepted
        ## (the pattern is an example; it depends on specifications)
        query = cgi.FieldStorage()
        auth_id = query.getfirst(auth_key)
        if not auth_id or not re.fullmatch(r'[A-Za-z0-9_-]+', auth_id):
            raise Exception('Unauthorized', 401)

        # The Auth Root Build
        auth_root = os.path.join(access_root, auth_id)

        # The Auth Root Check
        if not os.path.isdir(auth_root):
            # The Auth Root Creation
            os.makedirs(auth_root, exist_ok=True)

        # An Access File Creation Using Micro Timestamp
        ## For example, other data resources such as memory cache or RDB transaction.
        ## In the case of this sample code, it is lightweight because it does not require file locking and transaction processing.
        ## However, in the case of a cluster configuration, file system synchronization is required.
        access_file_path = os.path.join(auth_root, str(sec_usec_timestamp))
        path = Path(access_file_path)
        path.touch()

        # The Access Counts Check
        access_counts = 0
        for base_name in os.listdir(auth_root):
            ## An Access File Path Build
            file_path = os.path.join(auth_root, base_name)

            ## Not File Type
            if not os.path.isfile(file_path):
                continue

            ## The Base Name Data Type Casting
            try:
                base_name_sec_usec_timestamp = float(base_name)
                base_name_sec_timestamp = int(base_name_sec_usec_timestamp)
            except (ValueError, OverflowError):
                ### Not an Access File
                continue

            ## Same Seconds Timestamp
            if sec_timestamp == base_name_sec_timestamp:

                ### An Overtaken Processing
                if sec_usec_timestamp < base_name_sec_usec_timestamp:
                    continue

                ### Access Counts Increment
                access_counts += 1

                ### Too Many Requests
                if access_counts > access_limit:
                    raise Exception('Too Many Requests', 429)

                continue

            ## Past Access Files Garbage Collection
            if sec_timestamp > base_name_sec_timestamp:
                try:
                    os.remove(file_path)
                except FileNotFoundError:
                    ### Already Removed by Another Request
                    pass

    except Exception as e:
        # Exception Tuple to HTTP Status Code
        ## Exceptions not raised by this code (for example OSError) carry no HTTP code
        if len(e.args) == 2 and isinstance(e.args[0], str) and isinstance(e.args[1], int):
            http_status, http_code = e.args
        else:
            http_status, http_code = str(e), 0

        # 4xx
        if 400 <= http_code <= 499:
            # logging
            ## snip...
            pass
        # 5xx
        elif http_code >= 500:
            # logging
            ## snip...

            ## Replace The Internal Exception Message with a Generic HTTP Status
            http_status = 'Internal Server Error'
        else:
            # Logging
            ## snip...

            # HTTP Status Code for The Response
            http_status = 'Internal Server Error'
            http_code = 500

        # Response Headers Feed
        ## print() appends one newline, so "\n" here yields the blank line that ends the headers
        print('Status: ' + str(http_code) + ' ' + http_status)
        print(response_content_type + "\n")

        # A Response Body Build
        response_bodies['message'] = http_status
        response_body = json.dumps(response_bodies)

        # The Response Body Feed
        print(response_body)

# Execution
limitSecondsAccess()



7-2. Ruby Sample Code

Ruby >= 2


access_limiter.rb

#!/usr/bin/ruby
# -*- coding: utf-8 -*-

require 'fileutils'
require 'cgi'
require 'json'

# Definition
def limitSecondsAccess

  begin
    # Init
    ## Access Timestamp Build
    ### Rounded to microseconds so that the value matches the file name exactly
    sec_usec_timestamp_string = "%.6f" % Time.now.to_f
    sec_usec_timestamp = sec_usec_timestamp_string.to_f
    sec_timestamp = sec_usec_timestamp.to_i

    ## Access Limit Default Value
    ### Depends on Specifications: For Example 10
    access_limit = 10

    ## Roots Build
    ### Depends on Environment: For Example '/cache_client'
    tmp_root = '/cache_client'
    access_root = tmp_root + '/access'

    ## Auth Key
    ### Depends on Specifications: For Example 'app_id'
    auth_key = 'app_id'

    ## Response Content-Type
    ### Depends on Specifications: For Example JSON and UTF-8
    response_content_type = 'application/json'
    response_charset = 'utf-8'

    ## Response Bodies Build
    ### Depends on Design
    response_bodies = {}

    # Authorized Key Check
    ## The ID becomes a directory name, so only a safe character set is accepted
    ## (the pattern is an example; it depends on specifications)
    cgi = CGI.new
    auth_id = cgi[auth_key]
    if auth_id !~ /\A[A-Za-z0-9_-]+\z/ then
      raise 'Unauthorized:401'
    end

    # The Auth Root Build
    auth_root = access_root + '/' + auth_id

    # The Auth Root Check
    if ! FileTest::directory?(auth_root) then
      # The Auth Root Creation
      ## FileUtils raises on failure; the rescue below turns that into a 500
      FileUtils.mkdir_p(auth_root, :mode => 0775)
    end

    # An Access File Creation Using Micro Timestamp
    ## For example, other data resources such as memory cache or RDB transaction.
    ## In the case of this sample code, it is lightweight because it does not require file locking and transaction processing.
    ## However, in the case of a cluster configuration, file system synchronization is required.
    access_file_path = auth_root + '/' + sec_usec_timestamp_string
    FileUtils.touch(access_file_path)

    # The Access Counts Check
    access_counts = 0
    Dir.glob(auth_root + '/*') do |file_path|

      # Not File Type
      if ! FileTest::file?(file_path) then
        next
      end

      # The File Path to The Base Name
      base_name = File.basename(file_path)

      # The Base Name to Integer Data Type
      base_name_sec_timestamp = base_name.to_i

      # Same Seconds Timestamp
      if sec_timestamp == base_name_sec_timestamp then

        ### The Base Name to Float Data Type
        base_name_sec_usec_timestamp = base_name.to_f

        ### An Overtaken Processing
        if sec_usec_timestamp < base_name_sec_usec_timestamp then
          next
        end

        ### Access Counts Increment
        access_counts += 1

        ### Too Many Requests
        if access_counts > access_limit then
          raise 'Too Many Requests:429'
        end

        next
      end

      # Past Access Files Garbage Collection
      if sec_timestamp > base_name_sec_timestamp then
        begin
          File.unlink(file_path)
        rescue Errno::ENOENT
          # Already Removed by Another Request
        end
      end
    end

    # The Response Feed
    cgi.out({
      ## Response Headers Feed
      'type' => 'text/html',
      'charset' => response_charset,
    }) {
      ## The Response Body Feed
      ''
    }

  rescue => e
    # Exception to HTTP Status Code
    ## Split at the last colon; exceptions raised by Ruby itself carry no code and become 0
    http_status, _, http_code = e.message.rpartition(':')
    http_code = http_code.to_i

    # 4xx
    if http_code >= 400 && http_code <= 499 then
      # logging
      ## snip...
    # 5xx
    elsif http_code >= 500 then
      # logging
      ## snip...

      # Replace The Internal Exception Message with a Generic HTTP Status
      http_status = 'Internal Server Error'
    else
      # Logging
      ## snip...

      # HTTP Status Code for The Response
      http_status = 'Internal Server Error'
      http_code = 500
    end

    # The Response Body Build
    response_bodies['message'] = http_status
    response_body = JSON.generate(response_bodies)

    # The Response Feed
    cgi.out({
      ## Response Headers Feed
      'status' => http_code.to_s + ' ' + http_status,
      'type' => response_content_type,
      'charset' => response_charset,
    }) {
      ## The Response Body Feed
      response_body
    }
  end
end

# Execution
limitSecondsAccess



7-3. PHP Sample Code

PHP >= 5


access_limiter.php

<?php

# Definition
function limitSecondsAccess()
{
    try {
        # Init
        ## Access Timestamp Build
        ### Rounded to microseconds so that the value matches the file name exactly
        $sec_usec_timestamp_string = sprintf('%.6F', microtime(true));
        $sec_usec_timestamp = floatval($sec_usec_timestamp_string);
        ### Integer type, so the strict comparison (===) below can match
        $sec_timestamp = intval($sec_usec_timestamp);

        ## Access Limit Default Value
        ### Depends on Specifications: For Example 10
        $access_limit = 10;

        ## Roots Build
        ### Depends on Environment: For Example '/cache_client'
        $tmp_root = '/cache_client';
        $access_root = $tmp_root . '/access';

        ## Auth Key
        ### Depends on Specifications: For Example 'app_id'
        $auth_key = 'app_id';

        ## Response Content-Type
        ### Depends on Specifications: For Example JSON and UTF-8
        $response_content_type = 'Content-Type: application/json; charset=utf-8';

        ## Response Bodies Build
        ### Depends on Design
        $response_bodies = array();

        # Authorized Key Check
        ## The ID becomes a directory name, so only a safe character set is accepted
        ## (the pattern is an example; it depends on specifications)
        if (empty($_REQUEST[$auth_key])
            || ! is_string($_REQUEST[$auth_key])
            || ! preg_match('/\A[A-Za-z0-9_-]+\z/', $_REQUEST[$auth_key])) {
            throw new Exception('Unauthorized', 401);
        }
        $auth_id = $_REQUEST[$auth_key];

        # The Auth Root Build
        $auth_root = $access_root . '/' . $auth_id;

        # The Auth Root Check
        if (! is_dir($auth_root)) {
            ## The Auth Root Creation
            ### A concurrent request may create the directory first, so re-check before failing
            if (! @mkdir($auth_root, 0775, true) && ! is_dir($auth_root)) {
                throw new Exception('Could not create the auth root. ' . $auth_root, 500);
            }
        }

        # An Access File Creation Using Micro Timestamp
        /* For example, other data resources such as memory cache or RDB transaction.
         * In the case of this sample code, it is lightweight because it does not require file locking and transaction processing.
         * However, in the case of a cluster configuration, file system synchronization is required.
         */
        $access_file_path = $auth_root . '/' . $sec_usec_timestamp_string;
        if (! touch($access_file_path)) {
            throw new Exception('Could not create the access file. ' . $access_file_path, 500);
        }

        # The Auth Root Scanning
        if (! $base_names = scandir($auth_root)) {
            throw new Exception('Could not scan the auth root. ' . $auth_root, 500);
        }

        # The Access Counts Check
        $access_counts = 0;
        foreach ($base_names as $base_name) {
            ## A current or parent dir
            if ($base_name === '.' || $base_name === '..') {
                continue;
            }

            ## An Access File Path Build
            $file_path = $auth_root . '/' . $base_name;

            ## Not File Type
            if (! is_file($file_path)) {
                continue;
            }

            ## The Base Name to Integer Data Type
            $base_name_sec_timestamp = intval($base_name);

            ## Same Seconds Timestamp
            if ($sec_timestamp === $base_name_sec_timestamp) {

                ### The Base Name to Float Data Type
                $base_name_sec_usec_timestamp = floatval($base_name);

                ### An Overtaken Processing
                if ($sec_usec_timestamp < $base_name_sec_usec_timestamp) {
                    continue;
                }

                ### Access Counts Increment
                $access_counts++;

                ### Too Many Requests
                if ($access_counts > $access_limit) {
                    throw new Exception('Too Many Requests', 429);
                }

                continue;
            }

            ## Past Access Files Garbage Collection
            if ($sec_timestamp > $base_name_sec_timestamp) {
                @unlink($file_path);
            }
        }
    } catch (Exception $e) {
        # The Exception to HTTP Status Code
        $http_code = $e->getCode();
        $http_status = $e->getMessage();

        # 4xx
        if ($http_code >= 400 && $http_code <= 499) {
            # logging
            ## snip...
        # 5xx
        } elseif ($http_code >= 500) {
            # logging
            ## snip...

            # Replace The Internal Exception Message with a Generic HTTP Status
            $http_status = 'Internal Server Error';
        # Others
        } else {
            # Logging
            ## snip...

            # HTTP Status Code for The Response
            $http_status = 'Internal Server Error';
            $http_code = 500;
        }

        # Response Headers Feed
        header('HTTP/1.1 ' . $http_code . ' ' . $http_status);
        header($response_content_type);

        # A Response Body Build
        $response_bodies['message'] = $http_status;
        $response_body = json_encode($response_bodies);

        # The Response Body Feed
        exit($response_body);
    }
}

# Execution
limitSecondsAccess();
?>



7-4. Perl Sample Code

Perl >= 5


access_limiter.pl

#!/usr/bin/perl

use strict;
use warnings;
use utf8;
use Time::HiRes qw(gettimeofday);
use CGI;
use File::Basename;
use JSON;

# Definition
sub limitSecondsAccess {

    eval {
        # Init
        ## Access Timestamp Build
        ### Microseconds are zero-padded to 6 digits so the value is a valid decimal
        my ($sec_timestamp, $usec_timestamp) = gettimeofday();
        my $sec_usec_timestamp = sprintf('%d.%06d', $sec_timestamp, $usec_timestamp);

        ## Access Limit Default Value
        ### Depends on Specifications: For Example 10
        my $access_limit = 10;

        ## Roots Build
        ### Depends on Environment: For Example '/cache_client'
        my $tmp_root = '/cache_client';
        my $access_root = $tmp_root . '/access';

        ## Auth Key
        ### Depends on Specifications: For Example 'app_id'
        my $auth_key = 'app_id';

        # Authorized Key Check
        ## The ID becomes a directory name, so only a safe character set is accepted
        ## (the pattern is an example; it depends on specifications)
        my $cgi = CGI->new;
        my $auth_id = $cgi->param($auth_key);
        if (! defined($auth_id) || $auth_id !~ /\A[A-Za-z0-9_-]+\z/) {
            die('Unauthorized`401`');
        }

        # The Auth Root Build
        my $auth_root = $access_root . '/' . $auth_id;

        # The Access Root Check
        if (! -d $access_root) {
            ## The Access Root Creation
            ### A concurrent request may create the directory first, so re-check before failing
            if (! mkdir($access_root) && ! -d $access_root) {
                die('Could not create the access root. ' . $access_root . '`500`');
            }
        }

        # The Auth Root Check
        if (! -d $auth_root) {
            ## The Auth Root Creation
            if (! mkdir($auth_root) && ! -d $auth_root) {
                die('Could not create the auth root. ' . $auth_root . '`500`');
            }
        }

        # An Access File Creation Using Micro Timestamp
        ## For example, other data resources such as memory cache or RDB transaction.
        ## In the case of this sample code, it is lightweight because it does not require file locking and transaction processing.
        ## However, in the case of a cluster configuration, file system synchronization is required.
        my $access_file_path = $auth_root . '/' . $sec_usec_timestamp;
        open(my $fh, '>', $access_file_path)
            or die('Could not create the access file. ' . $access_file_path . '`500`');
        close($fh);

        # The Auth Root Scanning
        my @file_paths = glob($auth_root . "/*");
        if (! @file_paths) {
            die('Could not scan the auth root. ' . $auth_root . '`500`');
        }

        # The Access Counts Check
        my $access_counts = 0;
        foreach my $file_path (@file_paths) {

            ## Not File Type
            if (! -f $file_path) {
                next;
            }

            ## The Base Name Extract
            my $base_name = basename($file_path);

            ## Not an Access File
            if ($base_name !~ /\A[0-9]+\.[0-9]+\z/) {
                next;
            }

            ## The Base Name to Integer Data Type
            my $base_name_sec_timestamp = int($base_name);

            ## Same Seconds Timestamp
            if ($sec_timestamp == $base_name_sec_timestamp) {

                ### The Base Name as a Decimal Timestamp
                my $base_name_sec_usec_timestamp = $base_name;

                ### An Overtaken Processing (numeric comparison)
                if ($sec_usec_timestamp < $base_name_sec_usec_timestamp) {
                    next;
                }

                ### Access Counts Increment
                $access_counts++;

                ### Too Many Requests
                if ($access_counts > $access_limit) {
                    die('Too Many Requests`429`');
                }

                next;
            }

            ## Past Access Files Garbage Collection
            if ($sec_timestamp > $base_name_sec_timestamp) {
                unlink($file_path);
            }
        }
    };

    if ($@) {
        # Error Elements Extract
        my @e = split(/`/, $@);

        # Exception to HTTP Status Code
        ## Errors raised by Perl itself carry no code and stay 0
        my $http_status = $e[0];
        my $http_code = 0;
        if (defined($e[1]) && $e[1] =~ /\A[0-9]{3}\z/) {
            $http_code = $e[1];
        }

        # 4xx
        if ($http_code >= 400 && $http_code <= 499) {
            # logging
            ## snip...
        # 5xx
        } elsif ($http_code >= 500) {
            # logging
            ## snip...

            ## Replace The Internal Exception Message with a Generic HTTP Status
            $http_status = 'Internal Server Error';
        # Others
        } else {
            # logging
            ## snip...

            $http_status = 'Internal Server Error';
            $http_code = 500;
        }

        # Response Headers Feed
        print("Status: " . $http_code . " " . $http_status . "\n");
        print('Content-Type: application/json; charset=utf-8' . "\n\n");

        # A Response Body Build
        my %response_bodies = ('message' => $http_status);
        my $response_body = encode_json(\%response_bodies);

        # The Response Body Feed
        print($response_body);
    }

}

# Execution
limitSecondsAccess();



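8. Conclusion

This article presented sample code for a Web API per-second access limiter, implemented in Python, Ruby, PHP, and Perl, running on top of a GlusterFS & tmpfs memory cache system.

For a functional requirement like a per-second access limiter, the reference period is short, so low load, high speed, and accuracy are likely to be required.

With that in mind, I have written down a few points that may turn out to be important. I hope they are of some help.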