
Creating plugins to export and import function names and comments from Ghidra analysis results

Posted at 2023-05-09

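Ghidra analysis results

When sharing analysis results in Ghidra, the usual approach is to save them in Ghidra Zip File (gzf) format via Export Program in the menu.
The most important parts of these results are probably:

Comments
Function details (renamed function names and argument names)

(Local variables in functions decompiled by Ghidra are often reused, so even when renamed they rarely serve a single purpose.)

Besides this information, a gzf file contains everything that is generated automatically when a file is loaded, such as the disassembly of the analyzed program and the working state.
That is more information than is useful when keeping the history of an in-progress analysis in Git or a similar tool, so I wrote plugins that export and import only the items listed above.

With the text file the plugin outputs, you can share just the parts of the analysis results that are needed.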

Installing Ghidra

To use Ghidra, you need the executable, which can be downloaded from the URL below, and a Java runtime (JDK).

Installing the JDK (PowerShell)

winget install -e --id Microsoft.OpenJDK.17

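Creating a Ghidra plugin

Ghidra plugins can be written in Java or Python (Python 2).
The APIs available to scripts can be searched at the following.

Within a script,

FlatProgramAPI (ghidra.program.flatapi.FlatProgramAPI)
currentProgram (ghidra.program.model.listing.Program)
are predefined and can be used without an import.
Also, Japanese cannot be used in comments (it causes an error at run time).

I have only verified the plugins on Windows.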

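Export plugin

This plugin writes out the analysis information.
The format is as follows (I made it up for this purpose).
Each argument is a type and a name separated by a space.

@function
address,function name,argument 1,argument 2,...
address,function name,argument 1,argument 2,...
...
@end
@comment
@address,address,comment type
comment text
@address,address,comment type
comment text
...
@end

Output file (example)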

@function
01000000,main
01001000,func1,undefined4 param1,undefined4 param2,undefined4 param3,undefined4 param4,undefined4 param5
@end
@comment
@address,01000000,1
Main function
@address,01001000,3
Function 1
Test
@end
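For reference, a file in this layout can also be read outside Ghidra. The following is a minimal sketch in plain Python (no Ghidra APIs), assuming the layout described above. `parse_export`, the dictionary keys it returns, and the shortened `SAMPLE` text are hypothetical names and data chosen for this illustration; they are not part of the plugin.

```python
def parse_export(text):
    """Parse the export text into function and comment records."""
    functions = []
    comments = []
    section = None   # None, "function" or "comment"
    current = None   # comment record currently being accumulated
    for line in text.splitlines():
        if line == "@function":
            section = "function"
        elif line == "@comment":
            section = "comment"
        elif line == "@end":
            section = None
            current = None
        elif section == "function":
            parts = line.split(",")
            # Each argument is "<type> <name>"; split at the last space so
            # that type names containing spaces stay intact.
            args = [tuple(p.rsplit(" ", 1)) for p in parts[2:]]
            functions.append({"address": int(parts[0], 16),
                              "name": parts[1],
                              "args": args})
        elif section == "comment":
            if line.startswith("@address,"):
                _, addr, kind = line.split(",")
                current = {"address": int(addr, 16),
                           "kind": int(kind),
                           "lines": []}
                comments.append(current)
            elif current is not None:
                # Every other line belongs to the current comment body
                current["lines"].append(line)
    return functions, comments


# Hypothetical sample in the same layout as the example above
SAMPLE = """\
@function
01000000,main
01001000,func1,undefined4 param1,undefined4 param2
@end
@comment
@address,01000000,1
Main function
@address,01001000,3
Function 1
Test
@end
"""

functions, comments = parse_export(SAMPLE)
```

The sketch keeps multi-line comment bodies as a list of lines, so a consumer can join them with whatever line ending it needs.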

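The code below exports the analysis information.
Names generated automatically during analysis, such as FUN_ and thunk_, are excluded (so that only information I edited myself is written out).
Comment text is encoded as UTF-8 when written.

Three comment types are exported:

PRE_COMMENT
POST_COMMENT
PLATE_COMMENT
EOL_COMMENT is excluded because it is often generated automatically.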
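The two selection rules above can be sketched in plain Python without any Ghidra APIs. The helper names `is_auto_generated` and `exportable_comments` are hypothetical, and the input dictionary (comment type code mapped to text) is an assumption made for this illustration. The numeric codes 1, 2 and 3 are the values the export script passes to `getComment`. This sketch returns every matching comment found at an address.

```python
# Comment type codes as used by the export script
PRE_COMMENT = 1
POST_COMMENT = 2
PLATE_COMMENT = 3
EXPORTED_KINDS = (PRE_COMMENT, POST_COMMENT, PLATE_COMMENT)

# Prefixes of names that the analysis generates automatically
AUTO_PREFIXES = ("FUN_", "thunk_")


def is_auto_generated(name):
    """Return True if the function name starts with an auto-generated prefix."""
    return name.startswith(AUTO_PREFIXES)


def exportable_comments(comments_by_kind):
    """Keep only non-empty comments of the exported kinds, in kind order."""
    return [(kind, comments_by_kind[kind])
            for kind in EXPORTED_KINDS
            if comments_by_kind.get(kind)]
```

Keeping the prefixes and type codes in tuples makes it easy to widen or narrow what gets exported without touching the loops that use them.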

Export.py

def export(outSymPath):
    with open(outSymPath, 'w') as outFile:
        export_function(outFile)
        export_comment(outFile)


def export_function(outFile):
    outFile.write("@function\n")
    functions = currentProgram.getFunctionManager().getFunctions(True)
    for func in functions:
        off = func.getEntryPoint().offset
        name = func.getName()
        # Skip names that were generated automatically during analysis
        if name.startswith("FUN_") or name.startswith("thunk_"):
            continue
        # Each argument is written as ",<type> <name>"
        argstr = ""
        for arg in func.getSignature().getArguments():
            argstr += ",%s %s" % (arg.getDataType(), arg.getName())
        outFile.write("%08X,%s%s\n" % (off, name, argstr))

    outFile.write("@end\n")


def export_comment(outFile):
    outFile.write("@comment\n")
    listing = currentProgram.getListing()
    commentIterator = listing.getCommentAddressIterator(
        currentProgram.getMemory(), True)
    while commentIterator.hasNext():
        commentAddr = commentIterator.next()
        cu = listing.getCodeUnitContaining(commentAddr)
        if cu is None:
            continue
        # 1: PRE_COMMENT, 2: POST_COMMENT, 3: PLATE_COMMENT
        # Write every type that is present, so an address carrying more
        # than one comment type does not lose any of them.
        for kind in (1, 2, 3):
            comment = cu.getComment(kind)
            if not comment:
                continue
            outFile.write("@address,%08X,%d\n" %
                          (commentAddr.getUnsignedOffset(), kind))
            outFile.write(comment.encode('utf-8') + "\n")

    outFile.write("@end\n")


# main
sym = askFile("Select output FunctionInfo .txt file", "Select")

# Ask for confirmation only when the file already exists
if not sym.exists() or askYesNo("Warning", "File already exists, overwrite?"):
    export(sym.absolutePath)

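Import plugin

This plugin reads the text file created by the Export plugin and updates the functions and comments.

Some functions analyzed by Ghidra are given parameters automatically and some are not, so each function is deleted and then created again.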

Import.py

import codecs
from ghidra.program.model.data import Undefined, IntegerDataType
from ghidra.program.model.listing import ParameterImpl
from ghidra.program.model.listing.Function import FunctionUpdateType
from ghidra.program.model.symbol import SourceType


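def str2dataType(typeName):
    # Map a type name from the export file to a Ghidra data type.
    # Only the names listed here are handled; anything else gives None.
    if typeName == "undefined":
        return Undefined.getUndefinedDataType(0)
    elif typeName == "undefined2":
        return Undefined.getUndefinedDataType(2)
    elif typeName == "undefined4":
        return Undefined.getUndefinedDataType(4)
    elif typeName == "int":
        return IntegerDataType()
    return None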


def import_function(line):
    parts = line.split(",")
    if len(parts) < 2:
        return

    address = toAddr(int(parts[0], 16))
    name = parts[1]
    # Delete and recreate the function before applying the parameters
    removeFunctionAt(address)
    func = createFunction(address, name)
    if func is None or len(parts) == 2:
        return

    argList = []
    for part in parts[2:]:
        # Split at the last space so type names containing spaces stay intact
        typeName, argName = part.rsplit(" ", 1)
        argList.append(ParameterImpl(
            argName, str2dataType(typeName), currentProgram))
    func.updateFunction(currentProgram.getCompilerSpec().getDefaultCallingConvention().getName(),
                        func.getReturn(),
                        argList,
                        FunctionUpdateType.DYNAMIC_STORAGE_FORMAL_PARAMS, True, SourceType.USER_DEFINED)


def import_comment(lines):
    listing = currentProgram.getListing()
    address = None
    kind = 0
    body = []

    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("@address,"):
            # A new entry starts here, so store the previous one first
            if address is not None:
                listing.setComment(address, kind, "\n".join(body))

            parts = line.split(",")
            address = toAddr(int(parts[1], 16))
            kind = int(parts[2])
            body = []
        else:
            body.append(line)

    # Store the last entry
    if address is not None:
        listing.setComment(address, kind, "\n".join(body))


# main
infoFile = askFile("Select FunctionInfo .txt file", "Select")
with codecs.open(infoFile.absolutePath, "r", "utf-8") as f:
    lines = f.readlines()

section = 0  # 0: none, 1: @function, 2: @comment
comments = []
for line in lines:
    text = line.rstrip("\r\n")
    # Match markers exactly so comment text that merely contains
    # "@end" or "@function" is not taken for a marker
    if text == "@function":
        section = 1
    elif text == "@comment":
        section = 2
    elif text == "@end":
        section = 0
    elif section == 1:
        import_function(text)
    elif section == 2:
        comments.append(line)

import_comment(comments)
