GeForce GTX 1070 (8GB)
ASRock Z170M Pro4S [Intel Z170chipset]
Ubuntu 16.04 LTS desktop amd64
TensorFlow v1.2.1
cuDNN v5.1 for Linux
CUDA v8.0
Python 3.5.2
IPython 6.0.0 -- An enhanced Interactive Python.
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
GNU bash, version 4.3.48(1)-release (x86_64-pc-linux-gnu)
scipy v0.19.1
geopandas v0.3.0
MATLAB R2017b (Home Edition)
ADDA v.1.3b6
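MemoryEnhancer > MEDC > test_get_list_180130.py > Getting the list of Markdown file names (*.html.md) on GitHub | GitHub REST API v3 | JSON module | BeautifulSoup
In the article above, @SaitoAtsushi kindly showed me how to read the list with GitHub REST API v3 + JSON, and I was trying it out.
Just when I thought it had worked, the next run started returning <Response [403]>.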
{
"message": "API rate limit exceeded for 314.159.265.358. (But here's the good news: Authenticated requests get a higher rate limit. Check out the documentation for more details.)",
"documentation_url": "https://developer.github.com/v3/#rate-limiting"
}
It appears I hit the rate limit.
https://developer.github.com/v3/#rate-limiting
https://developer.github.com/v3/rate_limit/
The latter page shows Status: 200 OK, but I am not sure what to make of it.
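One possible reading of that Status: 200 OK: according to GitHub's documentation, the /rate_limit endpoint reports the current quota and calling it does not count against the limit, so it can answer 200 even while other endpoints answer 403. The following is a minimal sketch of reading such a response, assuming the body has a resources.core object with limit, remaining, and reset (Unix epoch seconds). The sample values are made up, and summarize_rate_limit is a hypothetical helper name, not part of any library.

```python
import json
from datetime import datetime, timezone

# Made-up sample, assumed to follow the shape of the body returned by
# GET https://api.github.com/rate_limit
SAMPLE = """
{
  "resources": {
    "core": {"limit": 60, "remaining": 0, "reset": 1517356800}
  }
}
"""


def summarize_rate_limit(payload):
    """Return (limit, remaining, reset time in UTC) for the core REST API quota."""
    core = payload["resources"]["core"]
    reset_at = datetime.fromtimestamp(core["reset"], tz=timezone.utc)
    return core["limit"], core["remaining"], reset_at


limit, remaining, reset_at = summarize_rate_limit(json.loads(SAMPLE))
print("limit={} remaining={} reset={}".format(limit, remaining, reset_at.isoformat()))
```

With requests, the payload would come from something like rq.get("https://api.github.com/rate_limit").json() instead of the sample string.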
(Added 2018/01/31)
Up to the 60th request, <Response [200]> is returned and the JSON string can be retrieved; from the 61st request on, <Response [403]> is returned.
https://developer.github.com/v3/
For unauthenticated requests, the rate limit allows for up to 60 requests per hour. Unauthenticated requests are associated with the originating IP address, and not the user making requests.
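Since the error message says authenticated requests get a higher limit (the same documentation lists 5000 requests per hour), one option is to send a personal access token with each request. This is a minimal sketch, not something tried in this article: build_headers is a hypothetical helper, and GITHUB_TOKEN is an assumed environment variable name.

```python
import os

# Media type for REST API v3 responses
API_ACCEPT = "application/vnd.github.v3+json"


def build_headers(token=None):
    """Build request headers; add an Authorization header only when a token is given."""
    headers = {"Accept": API_ACCEPT}
    if token:
        headers["Authorization"] = "token " + token
    return headers


# GITHUB_TOKEN is an assumed variable name; without it the request stays unauthenticated
headers = build_headers(os.environ.get("GITHUB_TOKEN"))
print(sorted(headers))
```

The headers would then be passed as rq.get(IN_URL, headers=headers). Keeping the token in an environment variable avoids committing it together with the script.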
Below is the implementation I was trying.
import requests as rq
import json
import sys
IN_URL = "https://api.github.com/repos/yasokada/TechEnglish_170903/contents/data"
res = rq.get(IN_URL)
print(res)
#wrk = json.dumps(wrk, sort_keys = False, indent = 4)
for elem in json.loads(res.text):
    #print(type(elem))
    #print(elem)
    print(elem['name'])
    #sys.exit()
https://github.com/github.com
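I wonder whether reading the github.com page itself with requests and processing it with BeautifulSoup avoids this kind of limit (in practice, it could be read).
Related: GitHub > Page access limits > REST API v3 is 60/hour | https://github.com access is 40/hour
Below is an example of the JSON I was trying to read
(formatted with http://www.ctrlshift.net/jsonprettyprinter/).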
[
    {
        "_links": {
            "git": "https://api.github.com/repos/yasokada/TechEnglish_170903/git/blobs/49445b32d8a945e33b1e546d14758149333ea7b6",
            "html": "https://github.com/yasokada/TechEnglish_170903/blob/master/data/10.html.md",
            "self": "https://api.github.com/repos/yasokada/TechEnglish_170903/contents/data/10.html.md?ref=master"
        },
        "download_url": "https://raw.githubusercontent.com/yasokada/TechEnglish_170903/master/data/10.html.md",
        "git_url": "https://api.github.com/repos/yasokada/TechEnglish_170903/git/blobs/49445b32d8a945e33b1e546d14758149333ea7b6",
        "html_url": "https://github.com/yasokada/TechEnglish_170903/blob/master/data/10.html.md",
        "name": "10.html.md",
        "path": "data/10.html.md",
        "sha": "49445b32d8a945e33b1e546d14758149333ea7b6",
        "size": 953,
        "type": "file",
        "url": "https://api.github.com/repos/yasokada/TechEnglish_170903/contents/data/10.html.md?ref=master"
    },
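Each element of this array carries more than the name, so the listing could also be filtered by type and suffix in one pass. A minimal sketch under stated assumptions: the second entry in the sample (a directory named images) is invented for illustration, and markdown_names is a hypothetical helper.

```python
import json

# First entry is abridged from the listing above; the "images" entry is made up
SAMPLE = """
[
    {"name": "10.html.md", "type": "file",
     "download_url": "https://raw.githubusercontent.com/yasokada/TechEnglish_170903/master/data/10.html.md"},
    {"name": "images", "type": "dir", "download_url": null}
]
"""


def markdown_names(entries, suffix=".html.md"):
    """Return the names of regular files whose name ends with the given suffix."""
    return [
        entry["name"]
        for entry in entries
        if entry.get("type") == "file" and entry["name"].endswith(suffix)
    ]


print(markdown_names(json.loads(SAMPLE)))
```

The same function could take json.loads(res.text) from the script as its argument.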
code v0.2
Added handling for <Response [403]>.
import requests as rq
import json
import sys
IN_URL = "https://api.github.com/repos/yasokada/TechEnglish_170903/contents/data"
res = rq.get(IN_URL)
# wrk = json.dumps(wrk, sort_keys = False, indent = 4)
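# Any non-200 status ends up here; in this article it was the 403 from rate limiting
if res.status_code != 200:
    print("ERROR: Rate limiting")
    print(res)
    print(res.text)
    sys.exit(1)
# Print the file name of each entry in the directory listing
for elem in json.loads(res.text):
    print(elem['name'])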
Example run while the limit is in effect.
$ python3 test_get_list_GitHubAPI_JSON_180130.py
ERROR: Rate limiting
<Response [403]>
{"message":"API rate limit exceeded for 314.159.265.358. (But here's the good news: Authenticated requests get a higher rate limit. Check out the documentation for more details.)","documentation_url":"https://developer.github.com/v3/#rate-limiting"}
Example run when the limit is not in effect.
$ python3 test_get_list_GitHubAPI_JSON_180130.py | head
10.html.md
1096.html.md
1097.html.md
1098.html.md
1099.html.md
1273.html.md
1274.html.md
1275.html.md
1593.html.md
1594.html.md
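The check in code v0.2 reports every failure as rate limiting. GitHub also returns X-RateLimit-Remaining and X-RateLimit-Reset response headers, so a rate-limit 403 could be told apart from other errors, and the wait until the reset could be computed. A minimal sketch, assuming those headers are present; the function names and sample values are hypothetical.

```python
import time


def is_rate_limited(status_code, headers):
    """True when a 403 comes with an exhausted quota (remaining count of zero)."""
    return status_code == 403 and headers.get("X-RateLimit-Remaining") == "0"


def seconds_until_reset(headers, now=None):
    """Seconds left until the quota resets; 0 if the reset time has already passed."""
    if now is None:
        now = time.time()
    reset = int(headers.get("X-RateLimit-Reset", 0))
    return max(0, int(reset - now))


# Made-up header values for illustration
sample_headers = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1517360400",
}
if is_rate_limited(403, sample_headers):
    print("rate limited; retry in", seconds_until_reset(sample_headers, now=1517356800), "s")
```

In the script, the arguments would be res.status_code and res.headers.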