Building a Serverless Scraping Environment with CDK, Lambda, and Puppeteer

Serverless Puppeteer on AWS Lambda

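I put together a CDK sample for anyone who wants to run scraping jobs in a serverless environment.

Deploying the CDK app creates the layers needed for scraping, and you can then scrape by invoking the Lambda function.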

GitHub: https://github.com/rv-rescala/jp-serverless-puppeteer

  • Tested environment: macOS Monterey
  • Language: TypeScript

Getting Started

Install AWS CDK, TypeScript, and Node.js beforehand.

Installation

git clone https://github.com/rv-rescala/jp-serverless-puppeteer
npx @puppeteer/browsers install chromium@latest --path /tmp/localChromium
npm install @sparticuz/chromium@109 puppeteer-core@19.4
make init

Setting Environment Variables

export IS_LOCAL=true
export BROWSER_PATH=/tmp/localChromium/chromium/mac_arm-1153064/chrome-mac/Chromium.app/Contents/MacOS/Chromium # check your path
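
As a rough illustration of how these two variables could be consumed, the sketch below chooses between a local Chromium binary and the Lambda-provided one. The helper name resolveLaunchMode and the LaunchMode type are hypothetical and not taken from the repository, whose actual logic may differ.

```typescript
// Hypothetical helper, not taken from the repository: decides where Chromium
// should come from based on the IS_LOCAL and BROWSER_PATH variables.
type LaunchMode =
  | { mode: "local"; executablePath: string }
  | { mode: "lambda" };

function resolveLaunchMode(
  env: Record<string, string | undefined>
): LaunchMode {
  // Treat only the literal string "true" (any casing) as local mode.
  const isLocal = (env["IS_LOCAL"] ?? "").trim().toLowerCase() === "true";
  if (!isLocal) {
    // On Lambda the binary is expected to come from the Chromium layer.
    return { mode: "lambda" };
  }
  const executablePath = (env["BROWSER_PATH"] ?? "").trim();
  if (executablePath === "") {
    throw new Error(
      "IS_LOCAL=true requires BROWSER_PATH to point at a Chromium binary"
    );
  }
  return { mode: "local", executablePath };
}
```

Calling resolveLaunchMode(process.env) at startup would fail fast when BROWSER_PATH is missing, which tends to be easier to diagnose than a browser launch error later on.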

Running the Test Locally

npx ts-node test/test-browser.ts
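
The contents of test/test-browser.ts are not reproduced here. A minimal sketch of what such a smoke test might do (open a page, read its title, and always close the browser) is shown below. The names fetchTitle, BrowserLike, and PageLike are hypothetical; the interfaces cover only the calls the sketch makes, and puppeteer-core's Browser and Page objects are expected to satisfy them.

```typescript
// Minimal structural types covering only the calls used below.
// These names are hypothetical and exist only for this sketch.
interface PageLike {
  goto(url: string): Promise<unknown>;
  title(): Promise<string>;
}

interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}

// Opens `url` in a fresh page and returns the document title.
// The launcher is injected so the same code can run against a local
// Chromium, the Lambda layer, or a fake browser in unit tests.
async function fetchTitle(
  launch: () => Promise<BrowserLike>,
  url: string
): Promise<string> {
  const browser = await launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    return await page.title();
  } finally {
    // Always release the browser, even when navigation fails.
    await browser.close();
  }
}
```

For a local run, the launcher could wrap puppeteer-core's launch call with executablePath set to the BROWSER_PATH value. Closing the browser in a finally block matters on Lambda as well, since a leaked Chromium process keeps consuming memory in a reused execution environment.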

Deploying to AWS

cdk synth --all
cdk deploy --all

Invoking the Lambda

Invoke the "jp-serverless-puppeteer-test-puppeteer-handler" function on Lambda.

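The event format the deployed handler expects is not described here. Purely as an assumption, the sketch below shows one shape a scraping handler could take: it reads a url field from the event, rejects anything that is not an http(s) URL, and wraps the result in a statusCode/body response. ScrapeEvent, ScrapeResponse, and makeHandler are hypothetical names, and the scraping function is injected rather than tied to a specific browser setup.

```typescript
// Hypothetical event and response shapes; the repository's handler may differ.
interface ScrapeEvent {
  url?: string;
}

interface ScrapeResponse {
  statusCode: number;
  body: string;
}

// Builds a Lambda-style async handler around an injected scrape function.
function makeHandler(scrape: (url: string) => Promise<string>) {
  return async (event: ScrapeEvent): Promise<ScrapeResponse> => {
    const url = event.url ?? "";
    // Reject missing or non-http(s) URLs before starting a browser.
    if (!/^https?:\/\//.test(url)) {
      return {
        statusCode: 400,
        body: JSON.stringify({ error: "event.url must be an http(s) URL" }),
      };
    }
    try {
      const title = await scrape(url);
      return { statusCode: 200, body: JSON.stringify({ url, title }) };
    } catch (err) {
      // Report scraping failures in the response instead of crashing.
      const message = err instanceof Error ? err.message : String(err);
      return { statusCode: 500, body: JSON.stringify({ error: message }) };
    }
  };
}
```

With a handler of this shape, a test event such as {"url": "https://example.com"} in the Lambda console would return a JSON body containing the page title. Validating the URL first avoids paying the Chromium startup cost for requests that cannot succeed.
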