
Using GAS + Google Sheets + ChatGPT + the arXiv API to summarize new paper abstracts, track them in a spreadsheet, and send email notifications

Last updated at 2024-02-01, posted at 2023-03-30

Using Google Apps Script (GAS), I built a tool that calls the arXiv API and the ChatGPT API, stores paper metadata and abstract summaries in a spreadsheet, and sends an email notification.

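What it looks like

Once a scheduled trigger is set up in GAS, you receive email notifications like the one shown in the screenshot, and the data is written to the spreadsheet.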

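Usage

Get an API key for the ChatGPT API
Create an empty spreadsheet and note its ID
Edit CONFIG in the code to set the API key, the spreadsheet ID, the search conditions, and so on
Set up a time-driven trigger in GAS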
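
The spreadsheet ID from step 2 is the long token between `/d/` and the next `/` in the spreadsheet URL. As a minimal sketch (the helper name `extractSpreadsheetId` and the sample URL are hypothetical and not part of the original script), it could be pulled out like this:

```javascript
// Hypothetical helper: pulls the spreadsheet ID out of a Google Sheets URL.
// Assumes the usual URL shape https://docs.google.com/spreadsheets/d/<ID>/edit...
function extractSpreadsheetId(url) {
  const match = /\/spreadsheets\/d\/([a-zA-Z0-9_-]+)/.exec(url);
  return match ? match[1] : null; // null when the URL does not look like a Sheets URL
}

// Made-up URL for illustration only.
const sampleUrl = "https://docs.google.com/spreadsheets/d/abc123_DEF-456/edit#gid=0";
console.log(extractSpreadsheetId(sampleUrl)); // abc123_DEF-456
```

Copying the ID by hand from the address bar works just as well; the helper only makes the URL structure explicit.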

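Once the trigger is in place, paper metadata and summaries are added to the spreadsheet automatically and an email notification is sent.
The CONFIG settings are described in detail below.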

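CONFIG settings

apiKey: the API key for the ChatGPT API.
spreadsheetId: the ID of the spreadsheet.
promptFormat: the prompt template sent to ChatGPT. The paper information fetched from the arXiv API is appended below it.
email: settings for the outgoing email: recipient, subject, sender name, cc, and bcc. If you do not need cc or bcc, set them to empty lists.
search: the search conditions used to query arXiv.
summaryRequestLimit: the maximum number of summary requests per run.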

const CONFIG = {
  apiKey: "{ChatGPT API key}",
  spreadsheetId: "{spreadsheet ID}",
  promptFormat: "You are a researcher specializing in AI.\n" +
                "Please describe the following paper in Japanese, separating the title, abstract, and related keywords.\n" +
                "Please use bullet points for the main points.\n" +
                "出力は日本語でお願いします。\n", // "Please write the output in Japanese."
  email: {
    recipient: "to_email@example.com",
    subject: "New arXiv papers",
    sender: "arXiv summary bot",
    ccRecipients: ["cc_email1@example.com", "cc_email2@example.com"],
    bccRecipients: ["bcc_email1@example.com", "bcc_email2@example.com"]
  },
  search: {
    terms: ["key word"],
    subject: "cs.AI",
    daysAgo: 1,
    maxResults: 100,
    sortBy: "submittedDate",
    sortOrder: "descending",
  },
  summaryRequestLimit: 5 // maximum number of summary requests per run
};
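
Because a forgotten placeholder only fails later when the API is called, it can help to check the settings up front. The following is a minimal sketch under my own assumptions: `findConfigProblems` and `sampleConfig` are hypothetical names, and the checks are just examples of what one might verify, not rules from the original script.

```javascript
// Hypothetical helper (not in the original script):
// returns a list of problems found in a CONFIG-shaped object.
function findConfigProblems(config) {
  const problems = [];
  // Non-strings, empty strings, and values still wrapped in {...} count as unfilled.
  const isPlaceholder = value =>
    typeof value !== "string" || value === "" || /^\{.*\}$/.test(value);
  if (isPlaceholder(config.apiKey)) problems.push("apiKey is not set");
  if (isPlaceholder(config.spreadsheetId)) problems.push("spreadsheetId is not set");
  if (!config.email || !config.email.recipient) problems.push("email.recipient is missing");
  if (!config.search || !Array.isArray(config.search.terms) || config.search.terms.length === 0) {
    problems.push("search.terms must be a non-empty array");
  }
  if (!Number.isInteger(config.summaryRequestLimit) || config.summaryRequestLimit < 1) {
    problems.push("summaryRequestLimit must be a positive integer");
  }
  return problems;
}

// Example: the API key placeholder has not been replaced yet.
const sampleConfig = {
  apiKey: "{ChatGPT API key}",
  spreadsheetId: "abc123",
  email: { recipient: "to_email@example.com" },
  search: { terms: ["key word"], subject: "cs.AI" },
  summaryRequestLimit: 5
};
console.log(findConfigProblems(sampleConfig)); // one problem: "apiKey is not set"
```

Calling such a check at the top of main and stopping when the list is non-empty would avoid spending a trigger run on a configuration that cannot work.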

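Setting up the time-driven trigger

Configure the main function to run on a schedule.
From the side menu, open the Triggers page, then add a trigger with the button at the bottom right.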
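
The same kind of trigger can also be created from code instead of the UI. This is a minimal sketch, assuming you run it once by hand from the editor. The function name `createDailyTrigger` and the 8 o'clock hour are illustrative choices that the original does not specify. The `scriptApp` parameter is there only so the function can be exercised outside Apps Script; inside GAS it defaults to the built-in `ScriptApp` service.

```javascript
// Hypothetical one-time setup function: schedules main() to run once a day.
// `hour` is interpreted in the script's time zone; 8 is an arbitrary example value.
function createDailyTrigger(hour = 8, scriptApp = ScriptApp) {
  return scriptApp
    .newTrigger("main")   // name of the function to run
    .timeBased()          // clock-driven trigger
    .everyDays(1)         // repeat every day
    .atHour(hour)         // fire within the hour that starts at `hour`
    .create();
}
```

Running it more than once creates duplicate triggers, so check the Triggers page afterwards.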

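Code

I had ChatGPT write up the role of each class.

Classes for the arXiv API

ArxivArticle: represents the paper information fetched from the arXiv API
ArxivFetcher: fetches paper information from the arXiv API, returning the papers that match the given search terms, category, and time period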
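
To make "matching the search terms and category" concrete: the arXiv API's `search_query` parameter takes field-prefixed conditions (`all:` for all fields, `cat:` for the subject category) combined with Boolean operators such as `AND`. The following is a minimal sketch; `buildSearchQuery` is a hypothetical helper and the search term is only an example.

```javascript
// Hypothetical helper: composes an arXiv search_query value from terms and a category.
function buildSearchQuery(terms, subject) {
  // Double quotes make a multi-word term match as a phrase.
  const conditions = terms.map(term => `all:"${term}"`);
  conditions.push(`cat:${subject}`);
  // All conditions must hold, so they are joined with AND.
  return conditions.join(" AND ");
}

const query = buildSearchQuery(["large language model"], "cs.AI");
// The whole query is URL-encoded so that it stays a single parameter value.
const url = "http://export.arxiv.org/api/query?search_query=" +
  encodeURIComponent(query) + "&max_results=5";
console.log(query); // all:"large language model" AND cat:cs.AI
console.log(url);
```

Encoding matters here: an unencoded `&` inside the query would be read as the start of a new URL parameter rather than as part of the search condition.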

Classes for the ChatGPT API

GPTRequester: requests a summary from the ChatGPT API
GPTRequestBuilder: builds the payload for a ChatGPT API request
GPTResponseParser: extracts the summary text from the ChatGPT API response
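
The Chat Completions API returns a JSON object whose `choices` array holds the generated messages. The following is a minimal sketch of the extraction step, using a hand-written sample response. The text inside it is made up for illustration, and `extractContents` is a hypothetical stand-in for the parser class.

```javascript
// Hand-written sample in the shape of a Chat Completions response (content is illustrative only).
const sampleResponse = {
  choices: [
    {
      index: 0,
      message: { role: "assistant", content: "  - point one\n- point two \n" },
      finish_reason: "stop"
    }
  ]
};

// Hypothetical helper: returns the trimmed text of every choice as an array of strings.
function extractContents(response) {
  return response.choices.map(choice => choice.message.content.trim());
}

// One string per choice; join them when a single text is needed.
console.log(extractContents(sampleResponse).join("\n"));
```

Note that the result is an array, so anything that expects a single string (a spreadsheet cell, an email body) needs the values joined first.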

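Classes for the spreadsheet and email

SpreadsheetManager: manages the Google spreadsheet
EmailBodyBuilder: builds the email body by appending each paper's information and summary
EmailSender: sends the email to the given recipient with the given subject and body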

class ArxivArticle {
    constructor(id, title, authors, published, abstract) {
        this.id = id;
        this.title = title;
        this.authors = authors;
        this.published = published;
        this.abstract = abstract;
    }
}

class ArxivFetcher {
    constructor() {
        this.atomNamespace = XmlService.getNamespace("http://www.w3.org/2005/Atom");
    }

    // Formats a date as zero-padded YYYYMMDD in GMT.
    toYYYYMMDD(date) {
        return Utilities.formatDate(date, "GMT", "yyyyMMdd");
    }

    getPastDate(daysAgo) {
        const today = new Date();
        return new Date(today.getTime() - daysAgo * 24 * 60 * 60 * 1000);
    }

    buildURL(searchTerms, subject, daysAgo = 1, maxResults = 100, sortBy = 'submittedDate', sortOrder = 'descending') {
        const baseURL = 'http://export.arxiv.org/api/query?';
        const fromDate = this.toYYYYMMDD(this.getPastDate(daysAgo));
        const toDate = this.toYYYYMMDD(new Date());
        // Join all conditions with AND so they stay inside the single search_query value.
        const conditions = searchTerms.map(term => `all:${term}`);
        conditions.push(`cat:${subject}`);
        // Restrict results to the date window with arXiv's submittedDate range filter
        // (YYYYMMDDHHMM in GMT).
        conditions.push(`submittedDate:[${fromDate}0000 TO ${toDate}2359]`);
        const queryParams = {
            search_query: conditions.join(" AND "),
            start: 0,
            max_results: maxResults,
            sortBy: sortBy,
            sortOrder: sortOrder
        };
        // Encode keys and values so spaces, colons, and brackets survive in the URL.
        return baseURL + Object.keys(queryParams)
            .map(key => `${encodeURIComponent(key)}=${encodeURIComponent(queryParams[key])}`)
            .join('&');
    }

    fetchEntries(searchTerms, subject, daysAgo, maxResults = 100, sortBy = 'submittedDate', sortOrder = 'descending') {
        const url = this.buildURL(searchTerms, subject, daysAgo, maxResults, sortBy, sortOrder);
        Logger.log(url);
        const response = UrlFetchApp.fetch(url);
        const xml = XmlService.parse(response.getContentText());
        const root = xml.getRootElement();
        var entries = root.getChildren('entry', this.atomNamespace);
        return entries;
    }

    extractEntryData(entry) {
        const id = entry.getChildText('id', this.atomNamespace);
        const title = entry.getChildText('title', this.atomNamespace);
        const authorsElement = entry.getChildren('author', this.atomNamespace);
        const authors = authorsElement.map(author => author.getChildText('name', this.atomNamespace)).join(', ');
        const published = entry.getChildText('published', this.atomNamespace);
        const abstract = entry.getChildText('summary', this.atomNamespace);
        const arxivArticle = new ArxivArticle(id, title, authors, published, abstract);
        return arxivArticle;
    }
}

class GPTRequester {
    constructor(apiKey) {
        this.apiKey = apiKey;
    }
    requestSummary(messages) {
        const url = "https://api.openai.com/v1/chat/completions";
        const options = {
            "method": "post",
            "headers": {
                Authorization: `Bearer ${this.apiKey}`,
                "Content-Type": "application/json",
            },
            "payload": JSON.stringify({
                model: "gpt-3.5-turbo",
                messages,
            }),
        };
        return JSON.parse(UrlFetchApp.fetch(url, options).getContentText());
    }
}


class GPTRequestBuilder {
    constructor(promptFormat) {
        this.promptFormat = promptFormat;
    }
    buildPayload(arxivArticle) {
        const input = "【Title】\n" + arxivArticle.title + "\n" + "【Summary】\n" + arxivArticle.abstract;
        const payload = this.promptFormat + input;
        return payload;
    }
}

class GPTResponseParser {
    constructor(response) {
        this.response = response;
    }
    extractSummary() {
        return this.response.choices.map((c) => c.message.content.trim());
    }
}

class SpreadsheetManager {
    constructor(spreadsheetId) {
        this.sheet = SpreadsheetApp.openById(spreadsheetId).getActiveSheet();
        this.columns = ["ID", "Title", "Authors", "Published", "Abstract", "GPT Summary"];
    }
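    idExists(id) {
        const lastRow = this.sheet.getLastRow();
        // An empty sheet has no rows, and getRange() rejects a zero-row range.
        if (lastRow === 0) return false;
        const idColumn = this.sheet.getRange(1, 1, lastRow).getValues();
        return idColumn.some(row => row[0] === id);
    }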
    saveToSpreadsheet(arxivArticle, gptSummary) {
        const newRow = this.sheet.getLastRow() + 1;
        this.sheet.getRange(newRow, 1).setValue(arxivArticle.id);
        this.sheet.getRange(newRow, 2).setValue(arxivArticle.title);
        this.sheet.getRange(newRow, 3).setValue(arxivArticle.authors);
        this.sheet.getRange(newRow, 4).setValue(arxivArticle.published);
        this.sheet.getRange(newRow, 5).setValue(arxivArticle.abstract);
        this.sheet.getRange(newRow, 6).setValue(gptSummary);
    }
}


class EmailBodyBuilder {
    constructor() {
        this.mailText = "New arXiv papers\n\n";
    }
    appendPaperInfo(arxivArticle, gptSummary) {
      const paperInfoText = Object.entries(arxivArticle)
        .map(([key, value]) => `【${key}】\n  ${value}\n`)
        .join('');
            
      const output = `${paperInfoText}\n\n${gptSummary}\n\n\n`;
      this.mailText += output;
    }
}
class EmailSender {
    constructor(recipient, subject, sender, cc = null, bcc = null) {
        this.recipient = recipient;
        this.subject = subject;
        this.sender = sender;
        this.cc = cc;
        this.bcc = bcc;
    }
    sendEmail(body) {
        const options = { name: this.sender };
        // Add cc/bcc only when addresses are configured (null or an empty list is skipped).
        if (this.cc && this.cc.length > 0) options.cc = this.cc.join(',');
        if (this.bcc && this.bcc.length > 0) options.bcc = this.bcc.join(',');
        GmailApp.sendEmail(this.recipient, this.subject, body, options);
    }
}

function summarizePaper(arxivArticle, gptSummarizer, gptRequestBuilder) {
  const payload = gptRequestBuilder.buildPayload(arxivArticle);
  Logger.log(payload);
  const res = gptSummarizer.requestSummary([
    {
      role: "user",
      content: payload,
    },
  ]);
  Logger.log(res);
  const gptResponseParser = new GPTResponseParser(res);
  return gptResponseParser.extractSummary();
}

function handleArxivEntry(entry, arxivFetcher, gptSummarizer, gptRequestBuilder, spreadsheetManager, emailBodyBuilder) {
  const arxivArticle = arxivFetcher.extractEntryData(entry);
  
  if (spreadsheetManager.idExists(arxivArticle.id)) {
    Logger.log(`${arxivArticle.id} already exists`);
    return false;
  }
  
  // extractSummary() returns one string per choice; join them into a single text.
  const gptSummaryText = summarizePaper(arxivArticle, gptSummarizer, gptRequestBuilder).join("\n");
  
  spreadsheetManager.saveToSpreadsheet(arxivArticle, gptSummaryText);
  emailBodyBuilder.appendPaperInfo(arxivArticle, gptSummaryText);
  
  return true;
}

const CONFIG = {
  apiKey: "{ChatGPT API key}",
  spreadsheetId: "{spreadsheet ID}",
  promptFormat: "You are a researcher specializing in AI.\n" +
                "Please describe the following paper in Japanese, separating the title, abstract, and related keywords.\n" +
                "Please use bullet points for the main points.\n" +
                "出力は日本語でお願いします。\n", // "Please write the output in Japanese."
  email: {
    recipient: "to_email@example.com",
    subject: "New arXiv papers",
    sender: "arXiv summary bot",
    ccRecipients: ["cc_email1@example.com", "cc_email2@example.com"],
    bccRecipients: ["bcc_email1@example.com", "bcc_email2@example.com"]
  },
  search: {
    terms: ["key word"],
    subject: "cs.AI",
    daysAgo: 1,
    maxResults: 100,
    sortBy: "submittedDate",
    sortOrder: "descending",
  },
  summaryRequestLimit: 5 // maximum number of summary requests per run
};

function main() {
  const arxivFetcher = new ArxivFetcher();
  const gptRequestBuilder = new GPTRequestBuilder(CONFIG.promptFormat);
  const gptRequester = new GPTRequester(CONFIG.apiKey);

  const spreadsheetManager = new SpreadsheetManager(CONFIG.spreadsheetId);
  const emailBodyBuilder = new EmailBodyBuilder();
  const emailSender = new EmailSender(
    CONFIG.email.recipient,
    CONFIG.email.subject,
    CONFIG.email.sender,
    CONFIG.email.ccRecipients,
    CONFIG.email.bccRecipients
  );

  const entries = arxivFetcher.fetchEntries(
    CONFIG.search.terms,
    CONFIG.search.subject,
    CONFIG.search.daysAgo,
    CONFIG.search.maxResults,
    CONFIG.search.sortBy,
    CONFIG.search.sortOrder
  );
  let count = 0;

  for (const entry of entries) {
    Utilities.sleep(1000);
    if (handleArxivEntry(entry, arxivFetcher, gptRequester, gptRequestBuilder, spreadsheetManager, emailBodyBuilder)) {
      count++;
    }
    if (count >= CONFIG.summaryRequestLimit) {
      break;
    }
  }

  const mailTexts = emailBodyBuilder.mailText;
  emailSender.sendEmail(mailTexts);
}
