Geminiに、内容が矛盾しているか機械的に判断できる関数を使ってもらう

Last updated at 2025-01-31Posted at 2024-11-18

真理表を作成し、論理状態(トートロジー、整合的、矛盾)を出力する関数をGeminiに呼び出して使ってもらう仕組みを作りました。

目的は、LLMがより論理的に判断できるようになることでした。もう少し詳しく言うと、演繹的推論が妥当か正確に判断できるようになることでした。
ここで説明している仕組みは、完全ではなく、成功もありつつ失敗も多々含んでいます。それを踏まえてご覧いただけると幸いです。

きっかけ

大学での論理学の講義がきっかけです。「内容の論理式を判定する関数を使ってもらえば、LLMは正確に判定できるのでは」と講義中に思いつき、軽率ですがGeminiで実装してみました。
既存のLLM(ChatGPT、Claude、Geminiなど)は基本的にどんな支離滅裂な内容でも、関数を使う以前から論理的に判断できていることを確認しました。正直作らなくてよかったと思ってしまったのですが、一種のアプローチとして作成してみたので、この記事に残させてください。

関数呼び出しにするメリット

以下のようなメリットがあると考えています。

確実性：論理式を解く過程を保証することができる
論理式が複雑で長い時：どんなに長い論理式でも関数に入力してしまえば、確実な計算ができ精度を上げられる　思考過程を残しながら推論するCoTプロンプティングに近いです

関数の流れ

関数は、1つの入力「論理式」、2つの出力「トートロジー or 整合的 or 矛盾」と「真理表」です。おそらくLLMなら真理表見れば前者はいらないかもしれませんが、精度も上がり、人間が直接見た際わかりやすくなるので入れています。

LLMの流れ

ユーザーの内容から原子式(言葉を置き換えたもの)をあらかじめLLM自身が定義し、論理式を作成して関数へ入力します。
関数の出力を参考にして、ユーザーと対話します。

コード

以下のコードを実行するには

Gemini API Keyをあらかじめ発行してください
google-generativeaiを環境にインストールしてください

以下h実際に作ったコードです。

import itertools
import google.generativeai as genai
from google.ai.generativelanguage_v1beta.types import content
import re
from config import GEMINI_API_KEY

logical_expression_solver_explanation = genai.protos.Tool(
    function_declarations = [
    genai.protos.FunctionDeclaration(
        name = "logical_expression_solver",
        description = """Solving the propositions using logical formulas.
Define atomic formulas yourself in advance, create logical formulas, and input the logical formulas only.
Use only ∧(conjunction), ∨(disjunction), ¬(negation), and →(implication).""",
        parameters = content.Schema(
            type = content.Type.OBJECT,
            properties = {
                "logical_expression": content.Schema(type = content.Type.STRING),
            },
        ),
    ),
    ],
)

def evaluate_expression(expression, values): # Evaluate the given logical expression based on truth value assignments
    for atomic_formula, bool_value in values.items():
        expression = expression.replace(atomic_formula, str(bool_value))
    try:
        expression_bool = eval(expression)
        expression_result = None
    except:
        expression_bool = None
        expression_result = "Error: logical_expression is invalid format. Remake logical_expression and retry"
    return expression_bool,expression_result

def logical_expression_solver(logical_expression): 
    edit_logical_expression = re.sub(r'¬([A-Za-z])', r'(¬\1)', logical_expression)
    edit_logical_expression = edit_logical_expression.replace("∧", " and ")
    edit_logical_expression = edit_logical_expression.replace("∨", " or ")
    edit_logical_expression = edit_logical_expression.replace("¬", " not ")
    edit_logical_expression = edit_logical_expression.replace("→", " <= ")
    edit_logical_expression = edit_logical_expression.replace("&", " and ")
    edit_logical_expression = edit_logical_expression.replace("->", " <= ")
    
    # Extract variables contained in the logical expression
    atomic_formula_list = sorted(set(filter(str.isalpha, logical_expression)))
    
    # Generate all combinations of truth values for the variables
    logical_expression_combinations = list(itertools.product([True, False], repeat=len(atomic_formula_list))) #[(True,False,),(True,True,)]
    
    table_data = []
    all_pattern_logical_expression_bool_list = []
    temp = atomic_formula_list.copy()
    temp.append(logical_expression)
    table_data.append(temp)

    for combination in logical_expression_combinations: #[(True,False,),(True,True,)]
        values = dict(zip(atomic_formula_list, combination)) # {'P': True, 'Q': True, 'R': True}
        temp = []
        for bool_value in combination:
            temp.append(str(bool_value))

        combination_bool,expression_error = evaluate_expression(edit_logical_expression, values)
        if expression_error:
            return expression_error, None
        
        temp.append(str(combination_bool))
        table_data.append(temp)
        all_pattern_logical_expression_bool_list.append(combination_bool)

    truth_table = "\n".join([",".join(map(str, row)) for row in table_data])
    
    if all(all_pattern_logical_expression_bool_list):
        logical_status = "Tautology"
    elif any(all_pattern_logical_expression_bool_list):
        logical_status = "Consistent"
    else:
        logical_status = "Contradiction"

    return logical_status, truth_table

def message(user_content,chat_session):

    response = chat_session.send_message(user_content)

    for part in response.parts:
        if part.text:
            print("Gemini: ",part.text)
        
        if fn := part.function_call:
            if fn.name == "logical_expression_solver":
                print("Active: logical_expression_solver")
                logical_expression = fn.args.get("logical_expression")
                print("Gemini generate argument: ",logical_expression)

                logical_status, csv_text = logical_expression_solver(logical_expression)

                system_nitification = f"logical_expression_solver's result:\nIt's {logical_status}\n\ntruth table\n{csv_text}"
                print("-"*40,"\n",system_nitification,"\n","-"*40)

                message(system_nitification,chat_session) #recursive call

def main():
    genai.configure(api_key=GEMINI_API_KEY)
    model = genai.GenerativeModel(model_name="gemini-1.5-flash",tools = [logical_expression_solver_explanation])
    chat_session = model.start_chat(history=[])

    while True:
        user_content = str(input("user: "))

        if user_content == "end":
            break

        message(user_content,chat_session) 

if __name__ == "__main__":
    main()

簡単な問題を試す

関数を媒介してGeminiに推論してもらいました。
これぐらいの問題であれば、Geminiは関数を使わなくてもできていますが、関数を使った場合でも、想定より良い結果が出ていました。
しかし、Geminiが論理式を作成する段階で間違えていることが多いです。やはり式自体が間違えていると意味はないですね...

問題点

時間と処理量が増える：外部から呼び出したり媒介するものが増えるので処理時間が増え、処理する量も増えてしまう
LLMの論理式生成のミス：論理式を生成するのは、結局LLMに依存してしまっており、間違いを増幅させてしまう恐れがある

まとめ

関数を媒介にすることで、論理式自体を判定するのが確実になるので、悪くはないアプローチだと考えました。

以上、Geminiに内容が矛盾しているか機械的に判断できる関数を使ってもらう話でした。

ここまでご覧いただきありがとうございました。

参考

Google Gemini
Google AI Studio

※一部ソースコードの作成にChatGPTを使用しています。

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up