How to Use PyGhidra

Last updated at 2025-05-15, posted at 2025-02-28

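About PyGhidra

If you get stuck, try reading the official documentation.

Starting with Ghidra 11.3, you can use native Python in Ghidra. It may look as if Python support was already there, but that was actually Jython, a Python-like implementation written in Java, and native Python was not available.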

Test environment

Ghidra 11.3
OpenJDK 21 LTS
Windows 11
Python 3.11.11 (pyenv)

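How to launch

Previously you launched Ghidra by running <GHIDRA_INSTALL_DIR>/ghidraRun(.bat), but to use PyGhidra you need to run <GHIDRA_INSTALL_DIR>/support/pyghidraRun(.bat) instead.
When you run it, it asks whether to create a venv for PyGhidra and so on; answer yes to proceed. If you have already activated a venv beforehand, it uses that one. Otherwise a virtual environment is created at AppData/Roaming/ghidra/ghidra_<version>_PUBLIC/venv. On Linux it is at ~/.config/ghidra/ghidra_<version>_PUBLIC/venv.
Once Ghidra is up, import any executable and open CodeBrowser. Selecting Window → PyGhidra from the menu bar should start the interpreter. If Ghidra was not started with pyghidraRun(.bat), PyGhidra will not work even if you select it.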
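To check whether the venv ended up where the paragraph above says, a few lines of Python are enough. This is only a minimal sketch: the helper name `expected_venv_dir` is hypothetical, the directory layout is simply the one described in this article, and every non-Windows platform is treated like Linux.

```python
import sys
from pathlib import Path
from typing import Optional


def expected_venv_dir(version: str, platform: str = sys.platform,
                      home: Optional[Path] = None) -> Path:
    """Return where pyghidraRun is expected to create its venv (per this article)."""
    home = home if home is not None else Path.home()
    if platform.startswith("win"):
        config_root = home / "AppData" / "Roaming" / "ghidra"
    else:
        # Assumption: the Linux layout described above (~/.config/ghidra)
        config_root = home / ".config" / "ghidra"
    return config_root / f"ghidra_{version}_PUBLIC" / "venv"


if __name__ == "__main__":
    venv = expected_venv_dir("11.3")
    print(venv, "exists" if venv.is_dir() else "not found")
```

If the script reports "not found" even though PyGhidra starts, an already activated venv was probably picked up instead.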

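Notes

Ghidra's configuration files live in AppData/Roaming/ghidra/ on Windows and in ~/.config/ghidra/ on Linux. They are not generated until you run ghidraRun once to complete the first launch.
In short, pyghidraRun may not work if Ghidra has never been launched.

pyghidraRun.bat appears to check whether the py command is available, but depending on your environment you may need to change py in the snippet below to python or python3.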

:: Make sure Python3 is installed
set PYTHON=py 
where /q %PYTHON%
if not %ERRORLEVEL% == 0 (
	set PYTHON=python
	where /q !PYTHON!
	if not !ERRORLEVEL! == 0 (
		echo Python 3 is not installed.
		goto exit1
	)
)
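Before editing the batch file, it can help to see which of these command names actually resolves on your machine. The following is a rough Python sketch of the same lookup, not a replacement for the launcher; the function name `find_python_command` and the candidate order are my own assumptions.

```python
import shutil
from typing import Callable, Iterable, Optional


def find_python_command(candidates: Iterable[str] = ("py", "python", "python3"),
                        which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Return the first candidate command found on PATH, or None."""
    for name in candidates:
        # shutil.which returns the full path of the command, or None if absent
        if which(name) is not None:
            return name
    return None


if __name__ == "__main__":
    print(find_python_command() or "Python 3 is not installed.")
```

Whatever name it prints is the one to put after `set PYTHON=` if the default does not work for you.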

Usage

Interpreter

The PyGhidra window opened from CodeBrowser can of course use the Ghidra API, and the FlatAPI functions work without problems as well. For example:

>>> print(getFunctionContaining(currentAddress))
main

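That said, in my experience tab completion for the API is not great in PyGhidra. If you only want to try out the API, Jython may be the better choice. My guess is that the Python side cannot infer the types well.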
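As one more thing to try in the interpreter, here is a small sketch that goes through the regular Ghidra API instead of the FlatAPI helpers. `getFunctionManager()`, `getFunctions(True)` and `getName()` are existing Ghidra methods; the wrapper `list_function_names` and its `limit` parameter are hypothetical names for illustration.

```python
def list_function_names(program, limit=10):
    """Return the names of up to `limit` functions, iterating forward."""
    names = []
    # getFunctions(True) walks the functions in forward address order
    for function in program.getFunctionManager().getFunctions(True):
        if len(names) >= limit:
            break
        names.append(function.getName())
    return names
```

In the PyGhidra window you would call it as `list_function_names(currentProgram)`.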

GhidraScript

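PyGhidra can also be used from GhidraScripts. In the Examples folder under Window → Script Manager there is PyGhidraBasics.py, a sample showing how to run a GhidraScript with PyGhidra.
The @runtime PyGhidra line at the top of the script appears to be what selects PyGhidra. If you want Jython instead, @runtime Jython works.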

# @category: Examples.Python
# @runtime PyGhidra

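Adding the following import also stops VSCode from reporting errors on Ghidra API names. Because the import sits under typing.TYPE_CHECKING it is never executed at runtime; it only lets the editor resolve names such as currentProgram.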

import typing
if typing.TYPE_CHECKING:
    from ghidra.ghidra_builtins import *
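The same guard also works for type annotations in your own helper functions. A minimal sketch, assuming you want to annotate a parameter with Ghidra's `Program` type; the function `describe_program` is a made-up example, while `getName()` and `getImageBase()` are real `Program` methods.

```python
import typing

if typing.TYPE_CHECKING:
    # Only evaluated by type checkers and editors; skipped when the script runs
    from ghidra.program.model.listing import Program


def describe_program(program: "Program") -> str:
    """Build a one-line summary of a program."""
    return f"{program.getName()} (image base {program.getImageBase()})"
```

Inside a GhidraScript, `print(describe_program(currentProgram))` would use it. The quoted annotation keeps the name from being looked up at runtime.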

If the errors persist, running

pip install ghidra-stubs==<version>

fixes them. (The errors disappear, but warning underlines still appear.)
The documentation says something along these lines as well.
https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/PyGhidra/src/main/py/README.md
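If VSCode still complains after installing the stubs, one thing worth checking is whether the package landed in the interpreter the editor is using. A small sketch with the standard library; the helper name `installed_version` is hypothetical, and it only reports on the environment that runs it.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "ghidra-stubs") -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version() or "ghidra-stubs is not installed in this environment")
```

Run it with the same Python that VSCode has selected; a `None` result suggests the stubs went into a different venv.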

Code example

A script that uses LLM4Decompile ( https://github.com/albertan017/LLM4Decompile ) to analyze the current function

Environment and requirements

OS: Ubuntu 22.04 LTS
NVIDIA CUDA driver

llm4dec.py

# Decompile current function using LLM4decompile
# @category LLM4Decompile
# @runtime PyGhidra
# @keybinding F5
# @author blend-tea


# LLM Requirements
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Ghidra Requirements
import typing
if typing.TYPE_CHECKING:
    from ghidra.ghidra_builtins import *

from ghidra.app.decompiler import DecompInterface
from ghidra.util.task import ConsoleTaskMonitor

def load_model():
    model_path = 'LLM4Binary/llm4decompile-1.3b-v2' # V2 Model
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16).cuda()
    return tokenizer, model

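def decompile_function(function):
    # Get Ghidra's own pseudo-C for the function; this text is the LLM input
    decompiler_interface = DecompInterface()
    decompiler_interface.openProgram(currentProgram)
    # The second argument is the timeout in seconds
    results = decompiler_interface.decompileFunction(function, 0, ConsoleTaskMonitor())
    if not results.decompileCompleted():
        raise RuntimeError(results.getErrorMessage())
    return results.getDecompiledFunction().getC()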

def llm4dec(function):
    tokenizer, model = load_model()
    decompiled_function = decompile_function(function)
    inputs = tokenizer(decompiled_function, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=2048)
    return tokenizer.decode(outputs[0][len(inputs[0]):-1])

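current_function = getFunctionContaining(currentAddress)
if current_function is None:
    # getFunctionContaining returns None when the cursor is outside any function
    print("No function at the current address")
else:
    print(llm4dec(current_function))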
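The slice on the last line of `llm4dec` is easy to misread. For a causal language model, `generate` returns the prompt tokens followed by the newly generated ones, so the script skips the first `len(inputs[0])` entries and drops the final one. Here is a toy sketch with plain lists; the token IDs are made up, and treating the last token as an end-of-sequence marker is my assumption, not something the source states.

```python
def strip_prompt_tokens(output_ids, prompt_length):
    """Drop the echoed prompt and the final token from a generated sequence."""
    return output_ids[prompt_length:-1]


# Toy token IDs (made up): 3 prompt tokens, 2 generated tokens, 1 end marker
prompt_ids = [101, 102, 103]
output_ids = prompt_ids + [7, 8, 2]
print(strip_prompt_tokens(output_ids, len(prompt_ids)))  # prints [7, 8]
```

Note that if generation stops because it hit `max_new_tokens` rather than an end marker, the `-1` trims a real token, so very long functions may lose their last token.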
