Examples of getting text and performing actions in Puppeteer using selectors and XPath
Getting text
Selector
let selector = 'li.test:nth-child(2) a';
await page.waitForSelector(selector);
let textList = await page.evaluate((selector) => {
  let textList = [];
  let nodeList = document.querySelectorAll(selector);
  for (let index = 0; index < nodeList.length; index++) {
    let value = nodeList[index].href; // href, innerHTML, innerText, etc.
    if (typeof value !== 'undefined') {
      textList.push(value);
    }
  }
  return textList;
}, selector);
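The same loop can probably be written more compactly with `page.$$eval`, which passes the matched elements to a callback running in the browser. The following is only a sketch: `collectProperty` is a made-up helper name, and `page` is assumed to be an already opened Puppeteer page.

```javascript
// Hypothetical helper: collect one property from every element matching `selector`.
// `page` is assumed to be a Puppeteer Page object.
async function collectProperty(page, selector, propertyName) {
  await page.waitForSelector(selector);
  // $$eval runs the callback in the browser with an array of the matched elements.
  return page.$$eval(
    selector,
    (elements, name) =>
      elements
        .map((el) => el[name])
        .filter((value) => typeof value !== 'undefined'),
    propertyName
  );
}
```

Usage would look like `let textList = await collectProperty(page, 'li.test:nth-child(2) a', 'href');`.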
XPath
let textList = [];
let xpath = '//li[@class="test"][2]/a[1]';
await page.waitForXPath(xpath);
const elementHandleList = await page.$x(xpath);
for (let index = 0; index < elementHandleList.length; index++) {
  let value = await (await elementHandleList[index].getProperty('innerHTML')).jsonValue(); // innerHTML, innerText, etc.
  if (typeof value !== 'undefined') {
    textList.push(value);
  }
}
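One caveat: as far as I know, newer Puppeteer releases deprecated `page.$x` and `page.waitForXPath` and later removed them, in favor of passing an `xpath/` prefixed string to `page.$$` and `page.waitForSelector`. The sketch below tries to work with both; `collectByXPath` is a hypothetical helper name, and the feature check on `page.$x` is my own assumption about how to tell the versions apart.

```javascript
// Hypothetical helper: collect one property from every node matching an XPath expression.
// `page` is assumed to be a Puppeteer Page object.
async function collectByXPath(page, xpath, propertyName) {
  let handles;
  if (typeof page.$x === 'function') {
    // Older Puppeteer: dedicated XPath methods are still available.
    await page.waitForXPath(xpath);
    handles = await page.$x(xpath);
  } else {
    // Newer Puppeteer (assumption): XPath goes through the 'xpath/' selector prefix.
    const prefixed = 'xpath/' + xpath;
    await page.waitForSelector(prefixed);
    handles = await page.$$(prefixed);
  }
  const values = [];
  for (const handle of handles) {
    const value = await (await handle.getProperty(propertyName)).jsonValue();
    if (typeof value !== 'undefined') {
      values.push(value);
    }
  }
  return values;
}
```

For example, `await collectByXPath(page, '//li[@class="test"][2]/a[1]', 'innerHTML')` should give the same list as the loop above.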
Attributes
let textList = [];
await page.waitForXPath('//li[@class="test"][2]/a[1]/@href');
const elementHandleList = await page.$x('//li[@class="test"][2]/a[1]/@href');
for (let index = 0; index < elementHandleList.length; index++) {
  let value = await (await elementHandleList[index].getProperty('value')).jsonValue();
  if (typeof value !== 'undefined') {
    textList.push(value);
  }
}
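The XPath version above reads the attribute node directly. With a CSS selector, something similar can be done through `getAttribute`, which returns the attribute as written in the HTML (for example a relative `href`), whereas the `href` property gives the resolved URL. A minimal sketch, where `collectAttribute` is a hypothetical helper name and `page` is assumed to be a Puppeteer page:

```javascript
// Hypothetical helper: read the raw attribute text from every element matching `selector`.
// `page` is assumed to be a Puppeteer Page object.
async function collectAttribute(page, selector, attributeName) {
  await page.waitForSelector(selector);
  return page.$$eval(
    selector,
    (elements, name) =>
      elements
        .map((el) => el.getAttribute(name)) // null when the attribute is absent
        .filter((value) => value !== null),
    attributeName
  );
}
```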
Actions (click)
If click fails with the following error, the browser's defaultViewport or setViewport setting may not have been applied correctly.
Error: Node is either not visible or not an HTMLElement
'defaultViewport' : { 'width' : 1600, 'height' : 1200 }
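To show where this setting would go, here is a minimal sketch. It assumes the options object is passed to `puppeteer.launch`, and the variable name `launchOptions` is only for illustration; the commented lines are not executed here.

```javascript
// Launch options carrying the viewport size from the snippet above.
const launchOptions = {
  defaultViewport: { width: 1600, height: 1200 },
};

// Assumed usage (requires the puppeteer package, so it is left commented out):
// const browser = await puppeteer.launch(launchOptions);
// const page = await browser.newPage();
// The viewport can also be set per page:
// await page.setViewport(launchOptions.defaultViewport);
```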
Selector
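let selector = 'h2';
await page.waitForSelector(selector);
let elementHandleList = await page.$$(selector);
// Start waiting for the navigation before clicking so a fast navigation is not missed.
await Promise.all([
  page.waitForNavigation(),
  elementHandleList[0].click(),
]);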
XPath
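let xpath = '//h2';
await page.waitForXPath(xpath);
let elementHandleList = await page.$x(xpath);
// Start waiting for the navigation before clicking so a fast navigation is not missed.
await Promise.all([
  page.waitForNavigation(),
  elementHandleList[0].click(),
]);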
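Both click variants index into `elementHandleList[0]`, which throws a rather unhelpful TypeError when nothing matched. One possible way to wrap this, shown as a sketch only: `clickFirstAndWait` is a hypothetical helper name, and it accepts the list returned by either `page.$$` or `page.$x`.

```javascript
// Hypothetical helper: click the first handle in the list and wait for the navigation it triggers.
// `page` is assumed to be a Puppeteer Page, `elementHandleList` an array of ElementHandles.
async function clickFirstAndWait(page, elementHandleList) {
  if (elementHandleList.length === 0) {
    throw new Error('No element matched, nothing to click');
  }
  // waitForNavigation() is called first, so the wait is already registered when the click fires.
  const [response] = await Promise.all([
    page.waitForNavigation(),
    elementHandleList[0].click(),
  ]);
  return response;
}
```

It would be called as `await clickFirstAndWait(page, await page.$$('h2'));`.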
Other
There are plenty of samples available, so they will probably come in handy someday.