Can Qwen3-VL-32B-Instruct Be Used as OCR for Classical Chinese (漢文)?


This continues my December 26 article: Qwen3-VL-32B-Instruct appears to be able to read even the interlinear notes (夾註). The program looks like this.

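#! /usr/bin/python3
# pip3 install transformers jinja2
# URL of the page image to be read
img="http://kanji.zinbun.kyoto-u.ac.jp/db-machine/toho/L/B0010001.jpg"
from transformers import pipeline
# Load the vision-language model; device_map="auto" places it on the available devices
nlp=pipeline("image-text-to-text","Qwen/Qwen3-VL-32B-Instruct",
  max_new_tokens=2048,device_map="auto")
# Prompt: "Read the vertical classical Chinese text (including interlinear notes) and output it line by line."
d=nlp([{"role":"user","content":[{"type":"image","image":img},
  {"type":"text","text":"閱讀垂直文言文(含夾註),逐行輸出。"}]}])
# generated_text holds the whole chat; element 1 is the model's reply
print(d[0]["generated_text"][1]["content"])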

When I fed it the 『漢書零片』 (a fragment of the Book of Han), the following result was output in my (Koichi Yasuoka's) environment.

以下為圖中垂直書寫之文言文,含夾註,逐行輸出,依從右至左、自上而下之傳統排版順序:

---

**因章事舉直言極諫並見郎從官農蓋其**
**意加於往前以明示四方使天下咸知主**
**上聖明不以言罪下也若此則流言消釋**
**疑惑著明鳳白行其策欽之補過將美皆**
**此類也(師古曰將助也)優游不仕以壽終欽子及**
**昆弟支屬至二千石者且十人欽兄緩前**
**免太常以列侯奉朝請成帝時乃薨子業**
**嗣業有材能以列侯遷復為太常數言**
**失不事權貴與丞相翟方進術尉定陵侯**
**淳于長不平後業坐法免官復為函谷關**
**都尉會定陵侯長有罪當就國長舅紅**
**侯立與業書曰誠哀老姊垂白隨姪伏**
**出關(師古曰垂白者言白髮下垂也無狀猶言不肖)願勿復用前事相**
**侵定陵侯旣出關伏罪復發(蘇林曰長與許)**
**(后書也語在外)威下雒陽獄丞相史搜得紅陽侯書奏業**
**聽請不敬(服虔曰受立屬請爲不敬)坐免就國其春丞相**
**方進薨業上書言方進本與長深結厚更**
**相稱薦(師古曰更晉工衡反)長陷大惡獨得不坐苟**

---

**前漢傳三十**
**士元**

---

**東方文化院藏書所**(右下角紅色印章)

---

**6193**(右上角編號)

---

註:此為古籍書頁,文字為垂直排列,自右向左閱讀。夾註以小字附於正文旁,括號內為現代整理時所加註釋說明,以利理解。原圖中部分字跡因紙張老化略顯模糊,但可辨識。
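In the output above, the model wraps each column in Markdown bold markers and puts the interlinear notes in parentheses. As a minimal sketch of post-processing, assuming that convention holds (it is read off this one sample, not a documented behaviour of the model), the main text and the notes of a single line could be separated as follows. The names `NOTE_RE` and `split_line` are hypothetical helpers introduced only for illustration.

```python
import re

# Matches one parenthesised note, in ASCII or full-width parentheses.
# Assumption: notes are never nested, as in the sample output above.
NOTE_RE = re.compile(r"[((]([^()()]*)[))]")

def split_line(line: str) -> tuple[str, list[str]]:
    """Return (main_text, notes) for one line of the model output."""
    # Drop the Markdown bold markers wrapped around each column.
    text = line.strip().replace("**", "")
    # Collect the contents of every parenthesised note.
    notes = NOTE_RE.findall(text)
    # Remove the notes, leaving only the main text.
    main = NOTE_RE.sub("", text)
    return main, notes

sample = "**此類也(師古曰將助也)優游不仕以壽終欽子及**"
main, notes = split_line(sample)
print(main)
print(notes)
```

This works line by line, so a note that the model split across two columns (as with the 蘇林 note above) comes back as two separate fragments and would need to be rejoined by hand.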

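There are a few misreadings, but the result is quite impressive. However, a 32B-parameter Qwen3-VL model needs at least 128GB of memory, so the runtime requirements are rather demanding. Maybe I should try the GGUF version as well.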
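To put the memory figure in perspective, here is a rough back-of-the-envelope estimate of the storage needed for the weights alone at several precisions. It assumes a nominal 32 billion parameters and ignores activations, the KV cache, and any other overhead, so real usage will be higher. The 128GB figure is consistent with weights held as 32-bit floats, though whether the program above actually loads them at that precision depends on the transformers version and is only an assumption here.

```python
# Nominal parameter count implied by "32B"; the exact figure is an assumption.
PARAMS = 32e9

# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1, "4-bit": 0.5}

def weight_gb(params: float, bytes_per_param: float) -> float:
    """Weight storage in GB (10**9 bytes), weights only."""
    return params * bytes_per_param / 1e9

for name, size in BYTES_PER_PARAM.items():
    print(f"{name:>8}: {weight_gb(PARAMS, size):6.1f} GB")
```

GGUF builds are commonly distributed in quantized form, which is why trying one could make the model fit on much smaller machines, possibly at some cost in reading accuracy.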
