
Trying video search with the YouTube Data API (beginner)

Posted at 2020-11-04, last updated at 2020-11-04

Introduction

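Studying data analysis takes input, but practice matters most, so I had been looking for good data to practice on. Honestly, I cannot yet judge whether YouTube data is good material. Still, I watch YouTube a lot and it is an area I am interested in, so my goal is to become able to extract data for analysis with the **YouTube Data API**, and I plan to write up how to use it bit by bit. To learn the API I used the following page (the API reference).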
https://developers.google.com/youtube/v3/docs?hl=ja

Search processing

As a first step, this time I search for videos under the following conditions and write the results to a CSV file.

Search for videos by a specified keyword (the keyword is given as the first argument)
Show the search results in descending order of view count

I also build a frequency distribution of which channels the resulting videos belong to and write it to a CSV file.
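As a small illustration of the frequency distribution by channel, here is a minimal sketch that uses only the standard library. The function name `count_by_channel` and the sample channel titles are hypothetical, and the script in this article does the same kind of counting with pandas `value_counts()` instead.

```python
from collections import Counter


def count_by_channel(channel_titles):
    """Return (channel, count) pairs, most frequent channel first."""
    return Counter(channel_titles).most_common()


# Hypothetical channel titles standing in for real search results.
sample = ["A", "B", "A", "C", "A", "B"]
for channel, count in count_by_channel(sample):
    print(channel, count)
```

Each pair holds a channel title and the number of search results that belong to it, which is the same shape as the second CSV file this article produces.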

Source code

The source is below. Set the variable "DEVELOPER_KEY" in the program to your own API key. How to issue an API key is omitted here.

searchKeyword.py

# Import libraries
from googleapiclient.discovery import build
import argparse
import pandas as pd

# Set the YouTube Data API key (replace the placeholder with your own key)
DEVELOPER_KEY = "YOUR_API_KEY"
YOUTUBE_API_SERVICE_NAME = "youtube"
YOUTUBE_API_VERSION = "v3"

def searchKeyword(options):
    # Run the keyword search
    youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION, 
                    developerKey=DEVELOPER_KEY)
    searchResults = youtube.search().list(q=options.sw,
                                        type="video",
                                        part="id,snippet",
                                        maxResults=options.max_results,
                                        order="viewCount"
                                        ).execute()
    
    # Split the search results into videos and everything else
    videos = []
    others = []
    for searchResult in searchResults["items"]:
        if searchResult["id"]["kind"] == "youtube#video":
            videos.append(searchResult)
        else:
            others.append(searchResult)

    # Collect video and channel details, then write them to CSV files
    videoTitles = []
    viewCounts = []
    likeCounts = []
    dislikeCounts = []
    favoriteCounts = []
    commentCounts =[]
    videoChannelTitles = []
    stat_list = [viewCounts, likeCounts, dislikeCounts, favoriteCounts, commentCounts]
    stat_keywords = ['viewCount', 'likeCount', 'dislikeCount', 'favoriteCount', 'commentCount']
    for video in videos:
        videoDetail = youtube.videos().list(part="statistics,snippet",
                                            id=video["id"]["videoId"]
                                            ).execute()
        channelDetail = youtube.channels().list(part="snippet", 
                                                id=videoDetail["items"][0]["snippet"]["channelId"]
                                                ).execute()
        
        videoTitles.append(videoDetail["items"][0]["snippet"]["title"])
        # A statistic can be absent from the response; record 0 in that case
        for stat, stat_keyword in zip(stat_list, stat_keywords):
            try:
                stat.append(videoDetail["items"][0]["statistics"][stat_keyword])
            except KeyError:
                stat.append(0)
        videoChannelTitles.append(channelDetail["items"][0]["snippet"]["title"])

    df_videos = pd.DataFrame({"title":videoTitles, "ViewCount":viewCounts, 
                            "channelTitle":videoChannelTitles,"likeCount":likeCounts,
                            "dislikeCount":dislikeCounts, "favoriteCount":favoriteCounts,
                            "commentCount":commentCounts})
    df_videos.to_csv("Search_result_{}.csv".format(options.sw),encoding="utf-8_sig")
    df_videos_countbyChannel = df_videos["channelTitle"].value_counts()
    df_videos_countbyChannel.to_csv("ChannelTitle_{}.csv".format(options.sw),encoding="utf-8_sig")

    return df_videos, df_videos_countbyChannel
    


if __name__ == "__main__":
    # Parse command-line arguments
    parser = argparse.ArgumentParser(description="search Youtube Program...")
    parser.add_argument("sw", help="search Keyword in Youtube")
    parser.add_argument("--max_results", type=int, help="max of search results",
                        default=50)
    options = parser.parse_args()

    searchKeywordResults = searchKeyword(options)
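The script above calls `videos().list` once per video. The `id` parameter of that endpoint accepts a comma-separated list of video IDs, so the lookups could be grouped into fewer requests. Below is a minimal sketch of the grouping step only. The helper name `chunk_ids` and the generated sample IDs are hypothetical, and the batch size of 50 is an assumption based on my understanding of the per-request limit, so check it against the API reference before relying on it.

```python
def chunk_ids(video_ids, size=50):
    """Split video IDs into comma-joined strings of at most `size` IDs each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [",".join(video_ids[i:i + size])
            for i in range(0, len(video_ids), size)]


# Hypothetical IDs standing in for the videoId values from the search results.
ids = ["id{}".format(n) for n in range(120)]
batches = chunk_ids(ids)
print(len(batches))

# Each batch could then be passed to the API in one call (not executed here):
# youtube.videos().list(part="statistics,snippet", id=batch).execute()
```

With this approach the response contains several entries under "items", so the code that reads `videoDetail["items"][0]` would need to loop over the items instead.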

Running it

I actually ran it. This time I specify "量子コンピュータ" (quantum computer) as the search keyword.

$ python searchKeyword.py "量子コンピュータ"
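To make clear how this command line maps onto the script's options, here is a minimal sketch that parses the same arguments from an explicit list instead of from the shell. The wrapper function `build_parser` and the value 10 are for illustration only, while the argument names `sw` and `--max_results` are the ones used in searchKeyword.py.

```python
import argparse


def build_parser():
    """Build a parser with the same arguments as searchKeyword.py."""
    parser = argparse.ArgumentParser(description="Search YouTube by keyword")
    parser.add_argument("sw", help="search keyword")
    parser.add_argument("--max_results", type=int, default=50,
                        help="maximum number of search results")
    return parser


# Equivalent to: python searchKeyword.py "量子コンピュータ" --max_results 10
options = build_parser().parse_args(["量子コンピュータ", "--max_results", "10"])
print(options.sw, options.max_results)
```

When `--max_results` is left out, as in the command above, the default of 50 is used.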

Two files, "Search_result_量子コンピュータ.csv" and "ChannelTitle_量子コンピュータ.csv", were created in the directory containing "searchKeyword.py". Let's check the contents of these two files.

Search_result_量子コンピュータ.csv (first rows only)

No	title	ViewCount	channelTitle	likeCount	dislikeCount	commentCount
0	Quantum Computers Explained – Limits of Human Technology	12915763	Kurzgesagt – In a Nutshell	310808	3405	16871
1	[マインクラフト]疑似量子ビット計算機[理論上世界最速？]	4483432	田辺魅癒喜	60153	2057	9898
2	量子コンピューターは通常のコンピューターと何が違うのか？【日本科学情報】【科学技術】	622469	日本科学情報	8019	435	647
3	世界を変える「量子コンピューター」とは？ホリエモンが解説！【NewsPicksコラボ】	232913	堀江貴文ホリエモン	1443	121	275
4	この世界はシミュレーション⁉もし量子コンピュータが完成したら...【都市伝説】	211623	ミルクティー飲みたい	2722	142	411
5	【驚愕】量子コンピュータの衝撃「想像を絶する勘違い」	144126	イチゼロシステム	1898	121	199
6	スパコンを遥かに凌駕！国産量子コンピューター発表(17/11/20)	121514	ANNnewsCH	1085	47	0
7	【量子力学】「量子コンピューター」と「シュテルン・ゲルラッハの実験」を学ぼう	117389	イケハヤ大学	1311	214	95
8	【挑戦】10分でわかる「量子コンピュータ」	110234	NEX工業	1579	178	187
9	【量子コンピュータ】第一回「量子ビットと重ね合わせ」（10分）	105738	量子コイン	0	0	58
10	ビットコインが崩壊か⁉Googleが量子コンピュータ開発でどのようになるのか？ブロックチェーンの安全性など解説	99405	もふもふ不動産	1675	121	192
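Because the script records 0 whenever a statistic is missing from the response, a 0 in this table can mean either a real zero or a value the API did not return. The API also returns the counts as strings. The sketch below shows one way to keep those cases apart by converting present values to `int` and leaving missing ones as `None`. The function name `parse_statistics` and the sample dictionary are hypothetical and are not taken from the actual response for any video listed above.

```python
STAT_KEYS = ["viewCount", "likeCount", "dislikeCount",
             "favoriteCount", "commentCount"]


def parse_statistics(statistics, keys=STAT_KEYS):
    """Convert string counts to int; keys missing from the response become None."""
    return {key: int(statistics[key]) if key in statistics else None
            for key in keys}


# Hypothetical "statistics" part of a response, with like counts hidden.
sample_statistics = {"viewCount": "1000", "favoriteCount": "0",
                     "commentCount": "58"}
parsed = parse_statistics(sample_statistics)
print(parsed)
```

pandas treats `None` as a missing value, so it can be told apart from a genuine zero later in the analysis.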

The video information seems to have been retrieved as intended.

ChannelTitle_量子コンピュータ.csv (first rows only)

Channel name	Count
量子コイン	7
慶應義塾Keio University	5
DENSO Official Channel	2
神王ＴＶ	2
報道SAMURAI	2
もふもふ不動産	2
jstsciencechannel	1
EE Times Japan	1
ミルクティー飲みたい	1
ブライトサイド | Bright Side Japan	1

This information also seems to have been retrieved as intended.

Conclusion

Building on this should make a lot of interesting things possible. I plan to extend the functionality little by little so that it can do more.
