Fetching CSV Data from a URL with Node

Node.js

Posted at 2024-07-18

Downloading a file all over again every time it gets updated is a real hassle, so I wondered whether I could at least fetch a CSV straight from its URL.

So I wrote a small Node program that fetches a CSV from a URL.

urlcsv.js

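export class URLCSV {
  // Streams the response body and yields it one line at a time.
  // charCode is any TextDecoder label, e.g. "utf-8" or "shift-jis".
  async * #makeTextFileLineIterator(fileURL, charCode) {
    const decoder = new TextDecoder(charCode);
    const response = await fetch(fileURL);
    if (!response.ok) {
      throw new Error(`Failed to fetch ${fileURL}: HTTP ${response.status}`);
    }
    const reader = response.body.getReader();
    let { value: chunk, done: readerDone } = await reader.read();
    // stream: true keeps a multibyte character that straddles two chunks intact
    chunk = chunk ? decoder.decode(chunk, { stream: true }) : "";

    const newline = /\r?\n/gm;
    let startIndex = 0;

    while (true) {
      const result = newline.exec(chunk);
      if (!result) {
        if (readerDone) break;
        // No complete line left: carry the remainder over and read the next chunk
        const remainder = chunk.substring(startIndex);
        ({ value: chunk, done: readerDone } = await reader.read());
        // The final read has no value, so decode() just flushes the decoder
        chunk = remainder + (chunk ? decoder.decode(chunk, { stream: true }) : decoder.decode());
        startIndex = newline.lastIndex = 0;
        continue;
      }
      yield chunk.substring(startIndex, result.index);
      startIndex = newline.lastIndex;
    }

    if (startIndex < chunk.length) {
      // Last line didn't end in a newline char
      yield chunk.substring(startIndex);
    }
  }

  // Fetches the file at fileUrl and returns its lines as an array of strings.
  async fetch(fileUrl, charCode) {
    const lines = []
    for await (const line of this.#makeTextFileLineIterator(fileUrl, charCode)) {
      lines.push(line)
    }
    console.table(lines)
    return lines
  }
}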

It fetched the Cabinet Office's national holidays CSV without any trouble.

main.js

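// Assumes urlcsv.js sits next to main.js and both run as ES modules
import { URLCSV } from './urlcsv.js'

const csv = new URLCSV()
const csvTexts = await csv.fetch('https://www8.cao.go.jp/chosei/shukujitsu/syukujitsu.csv', "shift-jis");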
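
What comes back is just an array of raw lines, so the next step is usually splitting each line into fields. Here is a minimal sketch of that, assuming the CSV is simple (no quoted fields, no commas or newlines inside a value). `parseSimpleCsv` and the sample lines are made up for illustration and are not part of the program above.

```javascript
// Hypothetical helper: split simple CSV lines into arrays of fields.
// Assumes no quoted fields, embedded commas, or embedded newlines.
function parseSimpleCsv(lines) {
  return lines
    .filter((line) => line.trim() !== "") // skip blank lines
    .map((line) => line.split(",").map((field) => field.trim()));
}

// Sample input invented for illustration, shaped like the array URLCSV#fetch returns
const sampleLines = ["date,name", "2024/1/1,New Year's Day", ""];
const rows = parseSimpleCsv(sampleLines);
console.log(rows);
```

For anything with quoted fields, a proper CSV parser would be the safer choice over `split(",")`.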

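For the fetch handling, MDN had example code that was a good reference, so I used it more or less as is.
Generator functions are... pretty wild, aren't they.
Apparently, because the response body for text data arrives incrementally, it has to be processed chunk by chunk.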
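
One thing to watch for with chunked text is that a multibyte character can be cut in half at a chunk boundary. The sketch below shows what happens in that case and how the `stream: true` option of `TextDecoder.decode` deals with it. It uses UTF-8 and a hand-split byte array purely for illustration; the same concern should apply to two-byte Shift_JIS characters.

```javascript
// "あ" is three bytes in UTF-8: 0xE3 0x81 0x82.
const bytes = new Uint8Array([0xe3, 0x81, 0x82]);
const firstChunk = bytes.subarray(0, 2); // ends in the middle of the character
const secondChunk = bytes.subarray(2);

// Without stream: true, each call is treated as a complete input,
// so the incomplete byte sequence turns into U+FFFD replacement characters.
const naive = new TextDecoder("utf-8");
const broken = naive.decode(firstChunk) + naive.decode(secondChunk);

// With stream: true, the decoder holds on to the incomplete bytes until the next call.
const streaming = new TextDecoder("utf-8");
const intact =
  streaming.decode(firstChunk, { stream: true }) +
  streaming.decode(secondChunk, { stream: true }) +
  streaming.decode(); // flush anything still buffered

console.log(broken === "あ"); // false
console.log(intact === "あ"); // true
```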

That said, now that I've built it: for data that rarely changes, fetching it on every run is pointless, so it might be better to turn this into a batch job.
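
Short of a real batch job, one lighter option for a long-running process is to keep the fetched lines in memory and only refetch once they are older than some limit. This is a rough sketch of that idea, not something I have run against the real site. `cacheFor`, its parameters, and the one-day limit in the usage comment are all hypothetical.

```javascript
// Hypothetical helper: wrap an async loader so its result is reused
// until maxAgeMs milliseconds have passed since the last successful load.
// `now` is injectable so the timing can be controlled in tests.
function cacheFor(maxAgeMs, loader, now = Date.now) {
  let cached = null; // { value, fetchedAt } once something has been loaded

  return async function load() {
    if (cached && now() - cached.fetchedAt < maxAgeMs) {
      return cached.value; // still fresh: skip the network
    }
    const value = await loader();
    cached = { value, fetchedAt: now() };
    return value;
  };
}

// Usage sketch (assumes the URLCSV class from urlcsv.js is in scope):
// const loadHolidays = cacheFor(24 * 60 * 60 * 1000, () =>
//   new URLCSV().fetch('https://www8.cao.go.jp/chosei/shukujitsu/syukujitsu.csv', "shift-jis"));
// const lines = await loadHolidays();
```

The cache is lost when the process exits, so a script that runs once and quits would still need to save the lines to a file or go the batch route.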
