Verifying vector storage with PostgreSQL@17 (Mac)


Install postgresql@17 and start the service:

brew services stop postgresql@16
brew install postgresql@17
brew services start postgresql@17
brew services list

Name          Status  User File
postgresql@14 none    tn
postgresql@16 none
postgresql@17 started tn   ~/Library/LaunchAgents/homebrew.mxcl.postgresql@17.plist

Create a database:

createdb testdb;
psql -U tn -d testdb

WARNING: psql major version 16, server major version 17.
         Some psql features might not work.

Reinstalling the psql client (via libpq) did not help; it still reported 16.4:

~ $brew reinstall libpq
~ $psql --version
psql (PostgreSQL) 16.4
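
The warning above suggests that the `psql` found first on `PATH` is still a 16.x client. The following is a minimal sketch for checking this from Python. It assumes the Homebrew prefix `/usr/local/opt/postgresql@17/bin` (the same prefix used for `pg_config` later in this article) also contains a `psql`; the helper names `parse_major` and `report` are hypothetical.

```python
import re
import shutil
import subprocess
from pathlib import Path

# Assumed location of the version 17 client; adjust to your Homebrew prefix.
KEG_PSQL = Path("/usr/local/opt/postgresql@17/bin/psql")


def parse_major(version_output: str) -> int:
    """Extract the major version from text such as 'psql (PostgreSQL) 16.4'."""
    match = re.search(r"PostgreSQL\)\s+(\d+)", version_output)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_output!r}")
    return int(match.group(1))


def report(psql_path) -> None:
    """Run '<psql_path> --version' and print the major version it reports."""
    out = subprocess.run(
        [str(psql_path), "--version"],
        capture_output=True, text=True, check=True,
    ).stdout
    print(f"{psql_path}: major version {parse_major(out)}")


if __name__ == "__main__":
    on_path = shutil.which("psql")  # the client the shell would actually run
    if on_path:
        report(on_path)
    if KEG_PSQL.exists():
        report(KEG_PSQL)
```

If the second path reports 17, calling that binary directly, or putting its directory first on `PATH`, should make the warning go away.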

Creating the vector extension fails on @17 as well, as expected:

testdb=# create extension vector;

ERROR:  extension "vector" is not available
DETAIL:  Could not open extension control file "/usr/local/share/postgresql@17/extension/vector.control": No such file or directory.
HINT:  The extension must first be installed on the system where PostgreSQL is running.
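
The DETAIL line shows that PostgreSQL looks for `vector.control` in the `extension` directory under its share directory. As a rough sketch, the same check can be done before running `create extension`. The `pg_config` path is the one used in the build steps in this article; `control_file` and `is_installed` are hypothetical helper names.

```python
import subprocess
from pathlib import Path

# Assumed pg_config location for the Homebrew postgresql@17 keg.
PG_CONFIG = "/usr/local/opt/postgresql@17/bin/pg_config"


def control_file(sharedir: str, extension: str) -> Path:
    """Path where PostgreSQL expects <extension>.control to live."""
    return Path(sharedir) / "extension" / f"{extension}.control"


def is_installed(extension: str, pg_config: str = PG_CONFIG) -> bool:
    """Return True if the extension's control file exists for this install."""
    # `pg_config --sharedir` prints the share directory of that installation.
    sharedir = subprocess.run(
        [pg_config, "--sharedir"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return control_file(sharedir, extension).is_file()


if __name__ == "__main__":
    if Path(PG_CONFIG).exists():
        print("vector installed:", is_installed("vector"))
```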

Installing the vector extension

For a first-time install, follow the same steps as in *1:

git clone --branch v0.7.4 https://github.com/pgvector/pgvector.git
cd pgvector
export PG_CONFIG=/usr/local/opt/postgresql@17/bin/pg_config
make PG_CONFIG=$PG_CONFIG all
make PG_CONFIG=$PG_CONFIG install

If, as in my case this time, you have already run make in this checkout before:

cd ~/pgvector
export PG_CONFIG=/usr/local/opt/postgresql@17/bin/pg_config
make PG_CONFIG=$PG_CONFIG clean
make PG_CONFIG=$PG_CONFIG all 
make PG_CONFIG=$PG_CONFIG install

Try again:

testdb=# create extension vector;
CREATE EXTENSION
testdb=# \dx
List of installed extensions
  Name   | Version |   Schema   |                     Description                      
---------+---------+------------+------------------------------------------------------
 plpgsql | 1.0     | pg_catalog | PL/pgSQL procedural language
 vector  | 0.7.4   | public     | vector data type and ivfflat and hnsw access methods
(2 rows)

The vector extension is now listed.

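Below, following *2, we write to the database and then verify the result.
Install ollama beforehand and run ollama pull for the embedding model "bge-m3". The model choice is arbitrary, and you can of course use an OpenAI model instead of ollama; in that case, adjust the code accordingly.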

!pip install langchain-postgres psycopg langchain-ollama langchain

from langchain_postgres import PGVector
from langchain_core.documents import Document
from langchain_ollama import OllamaEmbeddings

embedding = OllamaEmbeddings(
   model="bge-m3"
)

# No password is set for this user
connection = "postgresql+psycopg://tn@localhost:5432/testdb" 
collection_name = "my_docs"

vectorstore = PGVector(
   embeddings=embedding,
   collection_name=collection_name,
   connection=connection,
   use_jsonb=True,
)
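
The connection string above works because no password is set. If your setup does use a password, one possible way to build the same kind of URL is sketched below; `build_connection` is a hypothetical helper, and the password shown is only an example value.

```python
from urllib.parse import quote


def build_connection(user, database, host="localhost", port=5432, password=None):
    """Build a 'postgresql+psycopg://' URL, percent-encoding user and password."""
    auth = quote(user, safe="")
    if password:
        # Characters such as '@' or '/' would otherwise break URL parsing.
        auth += ":" + quote(password, safe="")
    return f"postgresql+psycopg://{auth}@{host}:{port}/{database}"


print(build_connection("tn", "testdb"))
# postgresql+psycopg://tn@localhost:5432/testdb
print(build_connection("tn", "testdb", password="p@ss/word"))
# postgresql+psycopg://tn:p%40ss%2Fword@localhost:5432/testdb
```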

# Check the tables created by langchain
!psql -d testdb -c "\dt"
               List of relations
Schema |          Name           | Type  | Owner 
--------+-------------------------+-------+-------
public | langchain_pg_collection | table | tn
public | langchain_pg_embedding  | table | tn
(2 rows)
!psql -d testdb -c "\d langchain_pg_embedding"               

Table "public.langchain_pg_embedding"
   Column     |       Type        | Collation | Nullable | Default 
---------------+-------------------+-----------+----------+---------
id            | character varying |           | not null | 
collection_id | uuid              |           |          | 
embedding     | vector            |           |          | 
document      | character varying |           |          | 
cmetadata     | jsonb             |           |          | 
...
# Data from *3, with id 10 modified and id 11 added
docs = [
   Document(page_content='there are cats in the pond', metadata={"id": 1, "location": "pond", "topic": "animals"}),
   Document(page_content='ducks are also found in the pond', metadata={"id": 2, "location": "pond", "topic": "animals"}),
   Document(page_content='fresh apples are available at the market', metadata={"id": 3, "location": "market", "topic": "food"}),
   Document(page_content='the market also sells fresh oranges', metadata={"id": 4, "location": "market", "topic": "food"}),
   Document(page_content='the new art exhibit is fascinating', metadata={"id": 5, "location": "museum", "topic": "art"}),
   Document(page_content='a sculpture exhibit is also at the museum', metadata={"id": 6, "location": "museum", "topic": "art"}),
   Document(page_content='a new coffee shop opened on Main Street', metadata={"id": 7, "location": "Main Street", "topic": "food"}),
   Document(page_content='the book club meets at the library', metadata={"id": 8, "location": "library", "topic": "reading"}),
   Document(page_content='the library hosts a weekly story time for kids', metadata={"id": 9, "location": "library", "topic": "reading"}),
   Document(page_content='there are tigers in the yard', metadata={"id": 10, "location": "zoo", "topic": "animals"}),
   Document(page_content='there are dogs in the backyard', metadata={"id": 11, "location": "my home", "topic": "animals"})
]

# Write to the database
vectorstore.add_documents(docs, ids=[doc.metadata['id'] for doc in docs])

# Bonus: similarity search (the score appears to be a distance, so smaller means closer)
results = vectorstore.similarity_search_with_score(query="lion", k=5)
for doc, score in results:
   print(f"* [SIM={score:.6f}] {doc.page_content} [{doc.metadata}]")

* [SIM=0.457508] there are tigers in the yard [{'id': 10, 'topic': 'animals', 'location': 'zoo'}]
* [SIM=0.494071] there are dogs in the backyard [{'id': 11, 'topic': 'animals', 'location': 'my home'}]
* [SIM=0.540048] ducks are also found in the pond [{'id': 2, 'topic': 'animals', 'location': 'pond'}]
* [SIM=0.541976] there are cats in the pond [{'id': 1, 'topic': 'animals', 'location': 'pond'}]
* [SIM=0.557055] the book club meets at the library [{'id': 8, 'topic': 'reading', 'location': 'library'}]
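
The SIM values above get smaller as the match gets better, which is what a distance does rather than a similarity. Assuming the store uses cosine distance (thought to be the langchain-postgres default; check your version), a score of this kind can be reproduced as in the sketch below. `cosine_distance` is a hypothetical helper, not part of the library, and the two-dimensional vectors are toy inputs.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Same direction gives distance 0, orthogonal vectors give distance 1.
print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0
```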

Retrieving data from the database

To clear memory, restart the Jupyter kernel, then:

!pip install psycopg
import psycopg

conn = psycopg.connect("dbname=testdb user=tn")
cur = conn.cursor()
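# Column order: id, collection_id, embedding, document, cmetadata.
# In the labels below, "uuid" is the collection_id (shared by every row) and
# the first "page_content" is the embedding, truncated to 100 characters.
cur.execute('select * from langchain_pg_embedding')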
for row in cur:
    formatted_output = f"id: {row[0]}\n" \
                    f"uuid: {row[1]}\n" \
                    f"page_content: {row[2][:100]}...\n" \
                    f"page_content(string): {row[3]}\n" \
                    f"metadata: {row[4]}\n"
    print(formatted_output)
cur.close()
conn.close()

id: 1
uuid: 46dd887b-7d08-43f5-b89f-6e6650b8594c
page_content: [-0.041045193,0.009569716,-0.093480445,0.01990515,0.00062993786,-0.057626557,-0.02735376,-0.01263911...
page_content(string): there are cats in the pond
metadata: {'id': 1, 'topic': 'animals', 'location': 'pond'}

id: 2
uuid: 46dd887b-7d08-43f5-b89f-6e6650b8594c
page_content: [-0.0365837,0.0019632874,-0.0848764,-0.007010041,-0.028483586,-0.027577631,-1.8776509e-05,-0.0053111...
page_content(string): ducks are also found in the pond
metadata: {'id': 2, 'topic': 'animals', 'location': 'pond'}

id: 3
uuid: 46dd887b-7d08-43f5-b89f-6e6650b8594c
page_content: [0.02102992,0.006635084,-0.05879326,-0.004522216,-0.0065848893,-0.03556204,0.009028944,0.03192697,0....
page_content(string): fresh apples are available at the market
metadata: {'id': 3, 'topic': 'food', 'location': 'market'}

id: 4
uuid: 46dd887b-7d08-43f5-b89f-6e6650b8594c
page_content: [-0.021443207,0.001050144,-0.07157226,0.009703566,9.951767e-05,0.0027933442,-0.01428185,0.008426342,...
page_content(string): the market also sells fresh oranges
metadata: {'id': 4, 'topic': 'food', 'location': 'market'}

id: 5
...
page_content: [-0.04939314,0.0038462288,-0.08330972,0.017653793,-0.023387564,0.011244474,0.02507997,0.012847753,-0...
page_content(string): there are dogs in the backyard
metadata: {'id': 11, 'topic': 'animals', 'location': 'my home'}
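
In the output above the embedding column arrives as text, which is why slicing it with `[:100]` works. The following is a minimal sketch for turning that text into Python floats, assuming the `[x,y,...]` format shown above; `parse_vector` is a hypothetical helper and the sample string is shortened from the values printed above.

```python
import json


def parse_vector(text: str) -> list[float]:
    """Convert the text form '[0.1,0.2,...]' into a list of floats."""
    values = json.loads(text)  # the bracketed form is also valid JSON
    return [float(v) for v in values]


sample = "[-0.0365837,0.0019632874,-1.8776509e-05]"
vec = parse_vector(sample)
print(len(vec), vec)
```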


The data was read back correctly, so drop the database we created:

!dropdb -f testdb

References:
*1: https://qiita.com/tnagata/items/7e6ae9956bdcaf167d94
*2: https://qiita.com/tnagata/items/c4a08d868b838e3bb3ea
*3: https://github.com/langchain-ai/langchain-postgres/blob/main/examples/vectorstore.ipynb
