nlp4j-llm-embeddings-e5
This component serves Multilingual-E5-large embeddings over HTTP from a Docker container.
0. Prerequisites
It is assumed that a Docker environment is already set up.
An image of about 5GB will be downloaded.
1. Installation
Retrieve the image using the docker pull command:
docker pull oyahiroki/nlp4j-llm-embeddings-e5:1.0.0.0
An example of the output is as follows:
>docker pull oyahiroki/nlp4j-llm-embeddings-e5:1.0.0.0
1.0.0.0: Pulling from oyahiroki/nlp4j-llm-embeddings-e5
e17464c8c9fb: Download complete
f344618db07e: Download complete
...
Digest: sha256:7d21fbdb572e1b99242179b4e99ff7cde6876febeb4f2ce981fccd39afd98f04
Status: Downloaded newer image for oyahiroki/nlp4j-llm-embeddings-e5:1.0.0.0
docker.io/oyahiroki/nlp4j-llm-embeddings-e5:1.0.0.0
For more information about the image, please visit:
2. Execution
On the first startup, the initialization process downloads the Multilingual-E5-large model, which uses about 5GB of additional disk space. This happens only on the first startup, not on subsequent starts.
Start the container with the following command, adding any other docker run options you need:
docker run -d --name nlp4j-llm-embeddings-e5 -p 8888:8888 oyahiroki/nlp4j-llm-embeddings-e5:1.0.0.0
[Screenshot: the container as shown in Docker Desktop]
After about 5 minutes, when the initialization process completes, the CPU usage will decrease.
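CPU usage is only an indirect signal that initialization has finished. Below is a minimal sketch of a more direct check that polls the HTTP endpoint until it answers. It assumes the server starts responding on port 8888 only once the model is loaded, which the source does not confirm. The names `http_probe` and `wait_until_ready`, the `ping` text, and the 10-minute limit are hypothetical choices.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_probe(url="http://127.0.0.1:8888/?text=ping", timeout_s=10.0):
    """Return True if the server answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, reset, timed out, or a non-2xx status.
        return False


def wait_until_ready(probe=http_probe, max_wait_s=600.0, interval_s=10.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call probe() repeatedly until it returns True or max_wait_s elapses."""
    deadline = clock() + max_wait_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Calling `wait_until_ready()` after `docker run` blocks until the endpoint responds or the limit is reached, and returns `True` or `False` accordingly. The probe and sleep functions are parameters so the loop can be exercised without a running container.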
3. Testing
You can vectorize text by sending it to the REST API with the curl command. The URL is quoted so that the shell does not interpret the ? character:
curl "http://127.0.0.1:8888/?text=%E3%81%93%E3%82%8C%E3%81%AF%E3%83%86%E3%82%B9%E3%83%88%E3%81%A7%E3%81%99%E3%80%82"
If you receive a result like the following, it is working correctly:
{"message": "ok", "time": "2025-01-04T05:49:20", "text": "これはテストです。", "embeddings": [...]}
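The percent-encoded query in the curl command is the UTF-8 form of これはテストです。, and the response is a JSON object with `message`, `time`, `text`, and `embeddings` fields. The following is a minimal sketch of building such a URL and checking a response body with the Python standard library. The helper names `build_url` and `extract_embedding` are hypothetical. Treating any `message` other than `"ok"` as a failure, and expecting `embeddings` to be a flat list of numbers, are assumptions, since the source shows only an abbreviated success case.

```python
import json
from urllib.parse import quote

BASE_URL = "http://127.0.0.1:8888/"


def build_url(text, base_url=BASE_URL):
    """Percent-encode text as UTF-8 and attach it as the `text` query parameter."""
    return f"{base_url}?text={quote(text, safe='')}"


def extract_embedding(body):
    """Parse a response body and return the embedding as a list of floats.

    Assumption: a successful response has message == "ok" and a flat,
    non-empty list of numbers under "embeddings".
    """
    data = json.loads(body)
    if data.get("message") != "ok":
        raise ValueError(f"unexpected message: {data.get('message')!r}")
    embedding = data.get("embeddings")
    if not isinstance(embedding, list) or not embedding:
        raise ValueError("response has no embeddings")
    return [float(x) for x in embedding]
```

`build_url("これはテストです。")` reproduces the URL used in the curl command, so the same helper can be used to test other sentences without encoding them by hand.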
4. Usage
If you want to use it with Python, you can do so as follows:
import requests
import json
import time

# URL to send requests
url = "http://127.0.0.1:8888/"
params = {"text": "これはテストです。"}

try:
    # Timestamp before sending request
    start_time = time.time()
    # Send GET request
    response = requests.get(url, params=params)
    # Timestamp after receiving response
    end_time = time.time()
    # Check response status code
    response.raise_for_status()
    # Parse JSON response
    data = response.json()
    # Calculate processing time
    elapsed_time = end_time - start_time
    # Display results
    print("Message:", data["message"])
    print("Time:", data["time"])
    print("Text:", data["text"])
    print("Embeddings (first 5):", data["embeddings"][:5])
    print(f"Time from request to response: {elapsed_time:.4f} seconds")
except requests.exceptions.RequestException as e:
    print(f"Error during HTTP request: {e}")
except json.JSONDecodeError as e:
    print(f"Error parsing JSON: {e}")
Since it's a REST API, it can also be accessed from other languages like Java.
A dedicated library is also available for Java:
String text = "今日はとてもいい天気です。";
String endPoint = "http://localhost:8888/";
EmbeddingServiceViaHttp nlp = new EmbeddingServiceViaHttp(endPoint);
NlpServiceResponse res = nlp.process(text);
EmbeddingResponse r = (new Gson()).fromJson(res.getOriginalResponseBody(), EmbeddingResponse.class);
System.err.println(Arrays.toString(r.getEmbeddings()));