Exporting text created with Editor.js as a Word file

Posted at 2023-11-07, last updated at 2023-11-07

Introduction

This post records what I did to export text written with Editor.js as a Word file.

The document is built by following the docx.js documentation (https://docx.js.org/#/).

Learning the basics of docx.js

According to the official documentation, a document is created like this:

const doc = new Document({
    sections: [
        {
            properties: {},
            children: [
                new Paragraph({
                    children: [
                        new TextRun("Hello World"),
                        new TextRun({
                            text: "Foo Bar",
                            bold: true,
                        }),
                        new TextRun({
                            text: "\tGithub is the best",
                            bold: true,
                        }),
                    ],
                }),
            ],
        },
    ],
});

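That is the basic shape of a document. I read the structure as follows and proceeded on that basis:

sections are the pages
children are the paragraphs
each new Paragraph() creates one block

The plan, then, is to put the text of each Editor.js block into children[] and export the result.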

Getting the Editor.js data

Place a button for text export and run the processing when its click is handled.
The article data is referred to as article.

import { Document, SectionType } from 'docx'

const handleTextButtonExport = async () => {
  const doc = new Document({
    title: article.value.title,
    sections: [
      {
        properties: {
          type: SectionType.CONTINUOUS
        },
        // One entry per Editor.js block, built by createChildren
        children: await createChildren(article.value.detail)
      }
    ]
  })
}

A new Document is created when the button is pressed.
Building children is delegated to the createChildren function.

// Build the paragraphs
const createChildren = async (blocks) => {
  // Process each block
  const children = await Promise.all(blocks.map(async (block) => {
    const type = block.type
    // For a paragraph block
    if (type === 'paragraph') {
      const data = newText(block.data.text)
      const paragraph = new Paragraph({
        children: data,
        spacing: {
          line: 276,
          before: 100,
          after: 100
        }
      })
      return paragraph
      // For a header block
    } else if (type === 'header') {
      const headerLevel = [
        {level: 1, heading: HeadingLevel.HEADING_1},
        {level: 2, heading: HeadingLevel.HEADING_2},
        {level: 3, heading: HeadingLevel.HEADING_3},
        {level: 4, heading: HeadingLevel.HEADING_4},
        {level: 5, heading: HeadingLevel.HEADING_5},
        {level: 6, heading: HeadingLevel.HEADING_6}
      ]
      const header = new Paragraph({
        text: block.data.text,
        heading: headerLevel.find(header => header.level === block.data.level)?.heading,
        alignment: AlignmentType.CENTER
      })
      return header

      // For an image block
    } else if (type === 'image') {
      const image = await imageConvert(block.data.file.url)
      // Load the image once to read its dimensions
      const element: HTMLImageElement = await new Promise(resolve => {
        const blob = new Blob([image], { type: 'image/jpeg' })
        const url = URL.createObjectURL(blob)
        const element = new Image()
        element.src = url
        element.onload = () => {
          resolve(element)
        }
      })
      // Scale the dimensions to fit within the limits
      let width = element.width
      let height = element.height
      const ratio = width / height
      const maxWidth = 590
      const maxHeight = 590
      if (width > maxWidth) {
        width = maxWidth
        height = maxWidth / ratio
      }
      if (height > maxHeight) {
        height = maxHeight
        width = maxHeight * ratio
      }
      const imageRun = new Paragraph({
          children: [
            new ImageRun({
              data: image,
              transformation: {
                  width,
                  height,
              }
            })
          ]
        })
      return imageRun
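    }
  })
  )
  // Unsupported block types make the callback return undefined, so drop those entries
  return children.filter((child) => child !== undefined) as FileChild[]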
}

This implementation only covers three patterns: body text, headings, and images.
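Since only three block types are handled, it can help to make that explicit in the types. The following is a minimal sketch, not part of the original implementation: the type names and the helpers isSupportedBlock and partitionBlocks are hypothetical, and the data shapes assume the standard Editor.js paragraph, header, and image tools.

```typescript
// Shapes of the three Editor.js blocks handled in this article.
// Assumption: the fields follow the standard paragraph, header and image tools.
type ParagraphBlock = { type: 'paragraph'; data: { text: string } }
type HeaderBlock = { type: 'header'; data: { text: string; level: number } }
type ImageBlock = { type: 'image'; data: { file: { url: string } } }
type SupportedBlock = ParagraphBlock | HeaderBlock | ImageBlock

// Any saved block; `data` stays unknown until `type` has been checked.
type AnyBlock = { type: string; data: unknown }

// Hypothetical helper: narrows a saved block to one of the supported shapes.
// It checks only the `type` field, not the contents of `data`.
const isSupportedBlock = (block: AnyBlock): block is SupportedBlock =>
  block.type === 'paragraph' || block.type === 'header' || block.type === 'image'

// Hypothetical helper: separates exportable blocks from the type names
// of blocks that would be skipped.
const partitionBlocks = (blocks: AnyBlock[]) => {
  const supported: SupportedBlock[] = []
  const skippedTypes: string[] = []
  for (const block of blocks) {
    if (isSupportedBlock(block)) supported.push(block)
    else skippedTypes.push(block.type)
  }
  return { supported, skippedTypes }
}
```

Logging skippedTypes before exporting would show which blocks are missing from the Word file, for example when a list or table block is added to an article later.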

paragraph

When a paragraph contains several pieces of text, the text created with new TextRun() is passed as an array.
Each block therefore has to return a TextRun[], so I wrote a newText() function to process the text data.

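const newText = (text: string) => {
  // Remove <a> tags, then split so that <br> and tagged spans become separate items.
  // <br> is listed first so that it is not swallowed by the tag-pair pattern.
  const replace = text.replace(/(<a\s.*?>|<\/a>)/g,'').split(/(<br>|<[^>]*>.*?<[^>]*>)/g)
  const texts = [] as TextRun[]
  replace.forEach((text: string) => {
    if (text === '<br>') {
      // A '\n' inside the text is not rendered as a line break in Word, so use break
      texts.push(new TextRun({
        text: '',
        break: 1
      }))
      return
    }
    // Only bold is detected, since it is the only inline style used in the editor
    const isBoldText = text.match(/<b>(.*?)<\/b>/g)
    // Strip any remaining tags and keep the plain text
    const convertedText = text.replace(/<[^>]*>/g,'')
    const textRun = new TextRun({
      text: convertedText,
      bold: !!isBoldText
    })
    texts.push(textRun)
  })
  return texts
}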

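For Paragraph blocks, Editor.js stores HTML tags inside each block's text, so the tags have to be removed.
Since <a> tags did not need to be rendered as links this time, they are stripped out first.
The string is then split into an array at each point where an HTML tag appears, and each item is turned into a TextRun.
Each split item is checked for its text style.
In this case only bold was applied in Editor.js, so only the presence of bold text is checked.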
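To make the splitting step easier to check in isolation, here is a minimal sketch that separates the HTML handling from docx.js. The InlineSegment type and the toSegments helper are hypothetical names, and the sketch assumes, as above, that only <a>, <b>, and <br> appear in the text. Each segment could then be mapped to a TextRun.

```typescript
// One run of inline text, mirroring the options later passed to TextRun.
type InlineSegment = { text: string; bold: boolean; lineBreak: boolean }

// Hypothetical helper: turns Editor.js paragraph HTML into plain segments.
// Assumption: only <a>, <b> and <br> tags occur in the input.
const toSegments = (html: string): InlineSegment[] => {
  const tokens = html
    .replace(/<\/?a(\s[^>]*)?>/g, '')    // drop <a> tags but keep their text
    .split(/(<br\s*\/?>|<b>.*?<\/b>)/g)  // keep <br> and <b>...</b> as tokens
  const segments: InlineSegment[] = []
  for (const token of tokens) {
    if (token === '') continue           // split() leaves empty strings between adjacent tags
    if (/^<br\s*\/?>$/.test(token)) {
      segments.push({ text: '', bold: false, lineBreak: true })
      continue
    }
    const bold = /^<b>.*<\/b>$/.test(token)
    // Strip the remaining tags so that only plain text is left
    segments.push({ text: token.replace(/<[^>]*>/g, ''), bold, lineBreak: false })
  }
  return segments
}
```

For the input 'Hello <b>bold</b><br>see <a href="https://example.com">link</a>' this yields four segments: plain "Hello ", bold "bold", a line break, and plain "see link".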

header

Define a lookup for the heading-level constants specified in docx.js.

      const headerLevel = [
        {level: 1, heading: HeadingLevel.HEADING_1},
        {level: 2, heading: HeadingLevel.HEADING_2},
        {level: 3, heading: HeadingLevel.HEADING_3},
        {level: 4, heading: HeadingLevel.HEADING_4},
        {level: 5, heading: HeadingLevel.HEADING_5},
        {level: 6, heading: HeadingLevel.HEADING_6}
      ]

Editor.js stores the heading level in block.data.level, so it is converted to the matching constant and set on the Paragraph.
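The array plus find() works, but the same conversion can also be written as a keyed lookup with a fallback for unexpected levels. This is only a sketch: the HeadingLevel object below is a stand-in so that the example runs without the library (its string values are placeholders), and toHeading is a hypothetical helper name. In the real code, HeadingLevel would be imported from docx.

```typescript
// Stand-in for docx's HeadingLevel so that this sketch runs on its own.
// Assumption: the string values are placeholders, not taken from the library.
const HeadingLevel = {
  HEADING_1: 'Heading1',
  HEADING_2: 'Heading2',
  HEADING_3: 'Heading3',
  HEADING_4: 'Heading4',
  HEADING_5: 'Heading5',
  HEADING_6: 'Heading6',
}

// Editor.js level (1 to 6) mapped to the heading constant.
const HEADING_BY_LEVEL: Record<number, string> = {
  1: HeadingLevel.HEADING_1,
  2: HeadingLevel.HEADING_2,
  3: HeadingLevel.HEADING_3,
  4: HeadingLevel.HEADING_4,
  5: HeadingLevel.HEADING_5,
  6: HeadingLevel.HEADING_6,
}

// Hypothetical helper: returns the constant for a level, or the fallback
// when the level is outside 1 to 6.
const toHeading = (level: number, fallback: string = HeadingLevel.HEADING_6): string =>
  HEADING_BY_LEVEL[level] ?? fallback
```

With this, the heading option never ends up undefined, which can happen with find() when block.data.level holds an unexpected value.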

image

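The images in content written with Editor.js are uploaded to S3, so a step is added that fetches each image from its URL.
The docx.js type definition for image data is ArrayBuffer | Buffer | string, so the image has to be converted to one of these.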

// Fetch the image from its URL and convert it to a Buffer
const imageConvert = (imageUrl: string) => {
    return fetch(imageUrl, {
        method: 'GET',
        mode: 'cors',
        cache: 'no-cache',
        credentials: 'same-origin'
    })
    .then(response => {
        const arrayBuffer = response.arrayBuffer()
        // Convert the ArrayBuffer to a Buffer
        return arrayBuffer.then(buffer => Buffer.from(buffer))
    })
    .catch((e) => {
        return new ArrayBuffer(0)
    })
}

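The image also has to be limited to 590px in width or height, with the other side scaled to keep the aspect ratio, so it had to be loaded into an HTMLImageElement once to read its size.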

      const element: HTMLImageElement = await new Promise(resolve => {
        const blob = new Blob([image], { type: 'image/jpeg' })
        const url = URL.createObjectURL(blob)
        const element = new Image()
        element.src = url
        element.onload = () => {
          resolve(element)
        }
      })

The loaded image is now held in the element variable, so it is used for the calculation.

      let width = element.width
      let height = element.height
      const ratio = width / height
      const maxWidth = 590
      const maxHeight = 590
      if (width > maxWidth) {
        width = maxWidth
        height = maxWidth / ratio
      }
      if (height > maxHeight) {
        height = maxHeight
        width = maxHeight * ratio
      }
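The same calculation can be expressed as a small pure function, which makes it easy to test without loading an image. This is a minimal sketch under the 590px limits used above; the Size type and the fitWithin helper are hypothetical names. It rounds the result to whole pixels, which the code above does not do.

```typescript
type Size = { width: number; height: number }

// Hypothetical helper: shrinks a size to fit a bounding box while keeping
// the aspect ratio. Images that already fit are returned unchanged.
const fitWithin = (size: Size, maxWidth = 590, maxHeight = 590): Size => {
  // Single scale factor that satisfies both limits; capped at 1 so images are never enlarged
  const scale = Math.min(1, maxWidth / size.width, maxHeight / size.height)
  return {
    width: Math.round(size.width * scale),
    height: Math.round(size.height * scale),
  }
}
```

For example, a 1180x590 image becomes 590x295, a 300x1180 image becomes 150x590, and a 100x200 image is left as it is.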

All the values needed to render the image are now available, so the image block is created.

const imageRun = new Paragraph({
  children: [
    new ImageRun({
      data: image,
      transformation: {
        width,
        height,
      }
    })
  ]
})

Each block has now been converted into the docx.js format.

Exporting

Use Packer to export. The document is converted to Blob data and saved with saveAs.

  // Export the document
  Packer.toBlob(doc).then((blob) => {
        saveAs(blob, `${article.value.title}.docx`)
  })
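Because the file name is built directly from the article title, a title that contains characters such as / or : may produce an unexpected file name on some systems. The following sketch shows one way to guard against that; toDocxFileName is a hypothetical helper, and the 'document' fallback name is an assumption.

```typescript
// Hypothetical helper: builds a safe .docx file name from an article title.
const toDocxFileName = (title: string, fallback = 'document'): string => {
  const cleaned = title
    .replace(/[\\\/:*?"<>|]/g, '_') // characters that Windows does not allow in file names
    .trim()
  // Use the fallback when the title is empty or only whitespace
  return `${cleaned || fallback}.docx`
}
```

The result could be passed to saveAs in place of the template string, for example saveAs(blob, toDocxFileName(article.value.title)).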

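Conclusion

Because the use case was narrow this time, the solution relies on fairly specific handling.
Making the HTML tag handling more general should allow it to cope with more complex tag structures.
It might also be possible to address this on the Editor.js side. I hope to find a more general and stable way of exporting text.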
