
Java: Extracting Text from a PDF

Posted at 2021-09-26, last updated at 2021-09-27

This article shows how to extract text from a PDF using an excellent library called Spire.PDF for Java.

What is Spire.PDF for Java?

Spire.PDF for Java is a PDF API that enables Java applications to read, write and save PDF documents without using Adobe Acrobat.

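In more detail: Spire.PDF for Java is a professional PDF-processing API developed by E-iceblue. It is designed to let developers create, edit, convert, and print PDF documents quickly and with high quality on the Java platform. It comes in a free edition and a commercial edition; this article uses the commercial edition.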

Preparation

1. Download Spire.PDF for Java from the official E-iceblue website.

2. Launch your IDE, create a new project, and add the appropriate Spire.PDF.jar from the downloaded package to the project as a library reference.
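After adding the JAR, it can help to confirm that the library is actually visible to the project before running any PDF code. The following is a minimal sketch, not part of the library: `ClasspathCheck` and `isOnClasspath` are hypothetical names introduced here, and the only Spire-specific detail is the class name `com.spire.pdf.PdfDocument`, taken from the import in this article's sample.

```java
public class ClasspathCheck {

    // Returns true if the named class can be located on the current classpath.
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the import used in this article's sample code.
        String name = "com.spire.pdf.PdfDocument";
        if (isOnClasspath(name)) {
            System.out.println(name + " found");
        } else {
            System.out.println(name + " NOT found; check the JAR reference in the IDE");
        }
    }
}
```

If the check reports the class as missing, the JAR reference in step 2 is the first thing to revisit.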

Original file

```java
import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;

import java.io.FileWriter;
import java.io.IOException;

public class Extract_Text {

    public static void main(String[] args) {

        // Create a PdfDocument object.
        PdfDocument doc = new PdfDocument();
        // Load the PDF file.
        doc.loadFromFile("test.pdf");

        // Create a StringBuilder to collect the extracted text.
        StringBuilder sb = new StringBuilder();

        // Loop through the pages of the PDF and append each page's text.
        for (int i = 0; i < doc.getPages().getCount(); i++) {
            PdfPageBase page = doc.getPages().get(i);
            sb.append(page.extractText(true));
        }

        // Write the collected text to a file; try-with-resources closes the writer.
        try (FileWriter writer = new FileWriter("ExtractText.txt")) {
            writer.write(sb.toString());
        } catch (IOException e) {
            e.printStackTrace();
        }

        doc.close();
    }
}
```
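
The sample writes the extracted text with `FileWriter`, which encodes using the platform default charset; before Java 18 that default depends on the OS and locale, so non-ASCII text such as Japanese may come out garbled on some machines. As a hedged alternative, the sketch below writes the text explicitly as UTF-8 using only the standard library. `Utf8TextWriter` and `writeUtf8` are hypothetical names, and the sample string stands in for `sb.toString()` from the extraction loop.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Utf8TextWriter {

    // Writes the text to the target path as UTF-8, creating or replacing the file.
    public static void writeUtf8(Path target, String text) throws IOException {
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for sb.toString(); the escapes spell a short Japanese word.
        String extracted = "PDF \u30c6\u30ad\u30b9\u30c8";
        Path out = Paths.get("ExtractText.txt");
        writeUtf8(out, extracted);
        System.out.println("Wrote " + Files.size(out) + " bytes to " + out);
    }
}
```

In the extraction sample, the `FileWriter` step could be swapped for a call such as `writeUtf8(Paths.get("ExtractText.txt"), sb.toString())`.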

<h1><strong>Result</strong></h1>
<p><strong><img src="https://cdn-ak.f.st-hatena.com/images/fotolife/l/lendoris/20210926/20210926111112.png" alt="f:id:lendoris:20210926111112p:plain" width="554" height="342" loading="lazy" title="" class="hatena-fotolife" itemprop="image" /></strong></p>
<h1><strong>Related articles</strong></h1>
><p><strong>PDF conversion: <a href="https://www.e-iceblue.com/Tutorials/Java/Spire.PDF-for-Java/Program-Guide/Conversion/Convert-PDF-to-Word-in-Java.html">Convert PDF to Word in Java</a></strong></p>

><p><strong>Creating a PDF: <a href="https://www.e-iceblue.com/Tutorials/Spire.PDF-for-JAVA/Spire.PDF-Program-Guide-JAVA/Document-Operation/Create-a-PDF-Document-in-Java.html">Create a PDF Document in Java</a></strong></p>

><p><strong>Printing a PDF: <a href="https://www.e-iceblue.com/Tutorials/Java/Spire.PDF-for-Java/Program-Guide/Print/How-to-print-PDF-document-in-Java.html">How to Print a PDF Document in Java</a></strong></p>

<h1><strong>Conclusion</strong></h1>
<p><strong><a href="https://www.e-iceblue.com/Introduce/pdf-for-java.html#.YU_XIfl1NmI">Spire.PDF for Java</a> is packed with powerful features, and I find it easy to use and a very convenient tool for developers.</strong></p>