Building a question-answering system with Haystack using Llama3 and ElasticsearchDocumentStore (Part 2)

Last updated at 2024-08-14, posted at 2024-08-14

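Introduction

In the previous article I extracted information from a Wikipedia page. This time, as a more practical exercise, I applied the same setup to analyzing the source code of Ignition Gazebo, whose usage I had found hard to research.
I registered the code downloaded from the gz-sim GitHub repository in ElasticsearchDocumentStore so that questions about the contents of the code can be answered. (Environment: RTX3080)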

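Approach

Source files ('.md', '.txt', '.cc', '.c', '.py', '.hh') downloaded from
https://github.com/gazebosim/gz-sim.git
are registered in ElasticsearchDocumentStore, and questions about Ignition Gazebo are answered from these files. The file path is registered as metadata; for .c and .cc files (and, in the code below, .hh files as well), class and function names are also registered.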

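Structure and usage

・Download ElasticsearchDocumentStore and start the server at http://localhost:9200 (same as last time)
          ↓
・Download Haystack and set up the indexing pipeline for RAG (retrieval-augmented generation), built from the tools used in the following steps
  (document splitting (splitter), vectorization (embedder), data writing (writer))
          ↓
・Parse the source code to be registered and generate the dictionaries to register (changed this time)
          ↓
・Run the dictionaries through the pipeline to vectorize them (same as last time)
          ↓
・Set up the question-answering pipeline, built from the tools used in the following steps
  (question vectorization (TextEmbedder), similar-document (vector) search (retriever), answer generation (generator = Llama3)) (same as last time)
          ↓
・Submit a question to the pipeline
          ↓
・Generate the answer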
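
To make the stages above concrete, here is a minimal toy sketch in plain Python. It is not the Haystack pipeline from this article: the splitter, embedder, writer, and retriever are hypothetical stand-ins (word chunks, bag-of-words counts, an in-memory list, cosine similarity), and the generation step with Llama3 is left out. It only illustrates how the data moves from split to embed to write to retrieve.

```python
import math
import re
from collections import Counter


def split_text(text, words_per_chunk=8):
    """Splitter stand-in: cut text into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + words_per_chunk])
            for i in range(0, len(words), words_per_chunk)]


def embed(text):
    """Embedder stand-in: bag-of-words counts (not a neural embedding)."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_store(texts):
    """Writer stand-in: keep (chunk, vector) pairs in an in-memory list."""
    return [(chunk, embed(chunk)) for text in texts for chunk in split_text(text)]


def retrieve(store, question, top_k=1):
    """Retriever stand-in: rank stored chunks by similarity to the question."""
    query_vector = embed(question)
    ranked = sorted(store, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


# Hypothetical example texts, not taken from the gz-sim repository.
store = build_store([
    "The lidar sensor publishes laser scan messages",
    "SDF files describe worlds and models",
])
top_chunk = retrieve(store, "what does the lidar sensor publish")[0]
print(top_chunk)
```

In the real system the retrieved chunks are placed into the prompt that the generator (Llama3) answers from, which is why the answers later in this article start with "Based on the provided context".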

The index remains in ElasticsearchDocumentStore after the first run, so when running a second time either delete the index with
curl -X DELETE "http://localhost:9200/gz-sim-index"
or comment out the line
indexing_pipeline.run({"splitter":{"documents":raw_docs}})
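
If you would rather clear the index from Python than from the shell, the same HTTP DELETE can be sent with the standard library. This is a sketch under the assumption that the server runs unauthenticated at http://localhost:9200 as above; the helper names `build_delete_index_request` and `delete_index` are hypothetical and not part of Haystack or Elasticsearch.

```python
import urllib.request


def build_delete_index_request(index_name, host="http://localhost:9200"):
    """Build (but do not send) the HTTP DELETE request for an index."""
    return urllib.request.Request(f"{host}/{index_name}", method="DELETE")


def delete_index(index_name, host="http://localhost:9200"):
    """Send the DELETE request and return the HTTP status code."""
    request = build_delete_index_request(index_name, host)
    with urllib.request.urlopen(request) as response:
        return response.status


# Building the request does not contact the server.
request = build_delete_index_request("gz-sim-index")
print(request.get_method(), request.full_url)
```

Calling `delete_index("gz-sim-index")` needs the server to be running. `urllib` raises `urllib.error.HTTPError` when the server answers with an error status, for example when the index does not exist.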

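Changes

・The source files to be registered ('.md', '.txt', '.cc', '.c', '.py', '.hh') are parsed to generate
  the dictionaries to register. The metadata is each file's directory and file name
  (for C and C++ files, class and function names are also stored as metadata).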

Source code (changes from the previous article only)

import os
import hashlib
from haystack.dataclasses import Document
import clang.cindex

# Set the correct path to libclang.so (adjust to the local LLVM installation)
clang.cindex.Config.set_library_file('/usr/lib/llvm-14/lib/libclang.so')

# Create the index used to parse source files
index = clang.cindex.Index.create()


def extract_classes_and_functions(file_path):
    """Extract class and function names from a .cc, .c, or .hh file."""
    classes = []
    functions = []

    try:
        translation_unit = index.parse(file_path)
    except Exception as e:
        print(f"Error parsing {file_path}: {e}")
        return classes, functions

    # Only direct children of the translation unit are visited, so
    # declarations nested inside namespaces or classes are not collected.
    for node in translation_unit.cursor.get_children():
        if node.kind == clang.cindex.CursorKind.CLASS_DECL:
            classes.append(node.spelling)
        elif node.kind == clang.cindex.CursorKind.FUNCTION_DECL:
            functions.append(node.spelling)

    return classes, functions


def index_files(project_dir):
    documents = []

    for root, dirs, files in os.walk(project_dir):
        # Open each matching text file and read its contents
        for file_name in files:
            if file_name.endswith(('.md', '.txt', '.cc', '.c', '.py','.hh')):
                file_path = os.path.join(root, file_name)
                try:
                    with open(file_path, "r", encoding="utf-8") as file:
                        content = file.read()
                    #print(f"name: {file_path}")
                    # Extract classes and functions
                    if file_name.endswith(('.hh', '.cc', '.c')):
                        classes, functions = extract_classes_and_functions(file_path)
                    else:
                        classes = []
                        functions = [] 
                    doc = Document(
                        content=content,
                        meta={
                            "directory_name": file_path,
                            "classes": classes,
                            "functions": functions
                        }
                    )
                    documents.append(doc)
                except Exception as e:
                    # Report which file failed and why (e.g. a non-UTF-8 file)
                    print(f"err name: {file_path}: {e}")
    
    return documents

# Usage example
project_dir = './gz-sim'
raw_docs = index_files(project_dir)
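
One limitation of the extraction above is that `get_children()` only returns direct children, while most gz-sim code lives inside namespaces, and member functions are reported by libclang as `CXX_METHOD` rather than `FUNCTION_DECL`. A possible variation, shown here as a sketch and not as the code used for the results in this article, is to walk the tree recursively. To keep the example runnable without libclang, it uses a hypothetical `FakeNode` class as a stand-in for `clang.cindex.Cursor`; the node names (`Server`, `Run`, `helper`) are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FakeNode:
    """Hypothetical stand-in for clang.cindex.Cursor, for illustration only."""
    kind: str
    spelling: str = ""
    children: list = field(default_factory=list)

    def get_children(self):
        return iter(self.children)


def collect_declarations(node, class_kinds, function_kinds):
    """Recursively collect class and function names below `node`."""
    classes, functions = [], []
    for child in node.get_children():
        if child.kind in class_kinds:
            classes.append(child.spelling)
        elif child.kind in function_kinds:
            functions.append(child.spelling)
        # Recurse so that members of namespaces and classes are also visited.
        nested_classes, nested_functions = collect_declarations(
            child, class_kinds, function_kinds)
        classes.extend(nested_classes)
        functions.extend(nested_functions)
    return classes, functions


# A tiny tree shaped like: namespace gz { class Server { Run(); }; helper(); }
tree = FakeNode("TRANSLATION_UNIT", children=[
    FakeNode("NAMESPACE", "gz", [
        FakeNode("CLASS_DECL", "Server", [FakeNode("CXX_METHOD", "Run")]),
        FakeNode("FUNCTION_DECL", "helper"),
    ]),
])
classes, functions = collect_declarations(
    tree, {"CLASS_DECL"}, {"FUNCTION_DECL", "CXX_METHOD"})
print(classes, functions)
```

With real libclang you would pass `translation_unit.cursor` as `node` and sets of `clang.cindex.CursorKind` members (for example `CursorKind.CLASS_DECL`, `CursorKind.FUNCTION_DECL`, `CursorKind.CXX_METHOD`) as the kinds. A full recursive walk also descends into declarations pulled in from included headers, so filtering by the node's source file may be needed.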

Questions and answers

get_generative_answer("how to use gz-sim?")

Based on the provided context, it appears that Gazebo Sim (gz-sim) is a simulation platform for robotics and other 
fields. It provides a set of libraries and tools for creating and running simulations. The provided code snippets 
are from the Gazebo Sim library, specifically from the `gz-sim` and `gz-sim::systems` namespaces.

To use Gazebo Sim, you would typically follow these steps:

1. Install Gazebo Sim: You can download and install Gazebo Sim from the official website: https://gazebosim.org.
2. Create a world file: Create a world file (e.g., `world.sdf`) that defines the simulation environment, including 
the physical objects, sensors, and actuators.
3. Create a model: Create a model file (e.g., `model.sdf`) that defines the physical properties of the objects in 
your simulation.
4. Run the simulation: Run the simulation using the `gzsim` command-line tool or by using a GUI-based interface.
5. Interact with the simulation: Use the simulation's API to interact with the simulation, such as setting initial 
conditions, applying forces, or reading sensor data.
6. Visualize the simulation: Use a visualization tool (e.g., `gzviz`) to visualize the simulation's output.

Some specific examples of how to use Gazebo Sim include:

* Creating a simple simulation of a robot moving in a 2D space: `gzsim -r my_world.sdf -m my_model.sdf`
* Running a simulation with a custom plugin: `gzsim -r my_world.sdf -m my_model.sdf -p my_plugin.so`
* Visualizing the simulation's output: `gzviz -r my_world.sdf -m my_model.sdf`

Note that the provided code snippets are from the Gazebo Sim library, but they are not a comprehensive guide to 
using Gazebo Sim. For more information, please refer to the official Gazebo Sim documentation and tutorials.

get_generative_answer("how can I use Lider?")

what a complex and convoluted code snippet!

Unfortunately, I must inform you that the provided code snippet is not a comprehensive guide on how to use Lidar. 
It appears to be a part of a larger codebase, likely related to a 3D simulation or rendering engine.

However, I can try to extract some general information about Lidar from the code snippet.

Lidar (Light Detection and Ranging) is a sensing technology used to measure distances and create high-resolution 3D
point clouds. In the context of the code snippet, Lidar seems to be used as a sensor in a 3D simulation or 
rendering environment.

From the code snippet, we can infer that Lidar is used to create a visual representation of the environment, 
possibly for rendering or visualization purposes. The Lidar sensor is configured with various parameters, such as 
scan samples, resolution, and noise characteristics.

To use Lidar in this context, you would likely need to:

1. Create an instance of the `GpuLidar` class, which seems to be a component in the simulation or rendering engine.
2. Configure the Lidar sensor with the desired parameters, such as scan samples, resolution, and noise 
characteristics.
3. Integrate the Lidar sensor with the simulation or rendering engine to create a visual representation of the 
environment.

However, without more context or information about the specific codebase or system you are working with, I cannot 
provide a more detailed guide on how to use Lidar. If you have any specific questions or require further 
assistance, please feel free to ask!

get_generative_answer("What is sdf")

based on the provided context, I can answer the question:

The SDF (Simulation Description Format) is a file format used to describe the simulation environment, including the
world, robots, and plugins, in Gazebo and other simulators. It is used to define the simulation settings, such as 
the world environment, robot characteristics, and plugins to load, and is used to create a simulation that can be 
loaded and executed by both Gazebo classic and Gazebo.

get_generative_answer("What is gzLogInit?")

Based on the provided context, it appears that `gzLogInit` is a function that initializes the log file. It takes 
two parameters: `recordPathMod` and a log file name ("server_console.log" in this case).

In the provided code, `gzLogInit` is called in two scenarios:

1. When the `recordPathMod` is not empty, it means that a log file already exists on disk. In this case, 
`gzLogInit` records the log messages to a different path, specified by `recordPathMod`, instead of overwriting the 
existing log file.
2. When the `recordPathMod` is empty, it means that no log file exists, or an empty record path is specified. In 
this case, `gzLogInit` creates a new log file with the default name "server_console.log".

The function is likely used to initialize the log file before printing any log messages, ensuring that any log 
messages are recorded correctly.
