Running Meta's Llama 2 on Databricks

Last updated at 2023-07-24. Posted at 2023-07-20.

Let's try it out right away.

Sample notebooks have been published, so this post only scratches the surface.

The notebooks are stored under llm-models/llamav2/llamav2-7b in this repository.

Preparing Databricks

Add the repository in Repos.

The notebooks are now accessible.

Prepare a GPU cluster running Databricks Runtime 13.2 ML.

Preparing on the Meta side

At https://ai.meta.com/, click Download the Model, enter your information, and agree to the terms of use.

Preparing Hugging Face

Make a note of your Hugging Face token.
Go to https://huggingface.co/meta-llama/Llama-2-7b-chat-hf and request access.
You will receive an email once the request is approved.

Running the notebook

Run the notebook 01_load_inference step by step.

Along the way, you will need to enter your Hugging Face token.

Download the model.

# Load model to text generation pipeline
from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch

# It is suggested to pin the revision commit hash for reproducibility, because the uploader might change the model afterwards. The commit history of llamav2-7b-chat is at https://huggingface.co/meta-llama/Llama-2-7b-chat-hf/commits/main
model = "meta-llama/Llama-2-7b-chat-hf"
revision = "0ede8dd71e923db6258295621d817ca8714516d4"

tokenizer = AutoTokenizer.from_pretrained(model, padding_side="left")
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
    revision=revision,
    return_full_text=False
)

# Required tokenizer setting for batch inference
pipeline.tokenizer.pad_token_id = tokenizer.eos_token_id
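
As a rough sanity check on why a GPU cluster is needed, the memory taken by the weights alone can be estimated from the dtype. The sketch below assumes the nominal 7 billion parameter count implied by the model name (the exact count is slightly lower) and ignores activations and the KV cache, so the real footprint will be higher. The helper name weight_memory_gib is hypothetical and not part of the notebook.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for model weights only, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: nominal parameter count taken from the model name "7b".
NOMINAL_PARAMS = 7e9

bf16_gib = weight_memory_gib(NOMINAL_PARAMS, 2)  # bfloat16 = 2 bytes per parameter
fp32_gib = weight_memory_gib(NOMINAL_PARAMS, 4)  # float32 = 4 bytes per parameter

print(f"bfloat16 weights: about {bf16_gib:.1f} GiB")
print(f"float32 weights:  about {fp32_gib:.1f} GiB")
```

Under these assumptions, loading with torch_dtype=torch.bfloat16 halves the weight memory compared with float32.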

Prepare the prompt.

# Define prompt template, the format below is from: http://fastml.com/how-to-train-your-own-chatgpt-alpaca-style-part-one/

# Prompt templates as follows could guide the model to follow instructions and respond to the input, and empirically it turns out to make Falcon models produce better responses

INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"
INTRO_BLURB = "Below is an instruction that describes a task. Write a response that appropriately completes the request."
PROMPT_FOR_GENERATION_FORMAT = """{intro}
{instruction_key}
{instruction}
{response_key}
""".format(
    intro=INTRO_BLURB,
    instruction_key=INSTRUCTION_KEY,
    instruction="{instruction}",
    response_key=RESPONSE_KEY,
)
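
To see what the model actually receives when the template is used, it can help to render it for one instruction. The following is a minimal sketch that repeats the constants from the notebook so that it runs on its own; render_prompt is a hypothetical helper name, and the sample instruction is only an illustration.

```python
INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"
INTRO_BLURB = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)

# The first format() call fills in the fixed parts and re-inserts a literal
# "{instruction}" placeholder, which is filled in later per prompt.
PROMPT_FOR_GENERATION_FORMAT = """{intro}
{instruction_key}
{instruction}
{response_key}
""".format(
    intro=INTRO_BLURB,
    instruction_key=INSTRUCTION_KEY,
    instruction="{instruction}",
    response_key=RESPONSE_KEY,
)


def render_prompt(instruction: str) -> str:
    """Fill the remaining placeholder with the user's instruction."""
    return PROMPT_FOR_GENERATION_FORMAT.format(instruction=instruction)


rendered = render_prompt("What is a large language model?")
print(rendered)
```

The rendered text ends with the response key followed by a newline, so the model's continuation is read as the response.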

Create the function.

# Define parameters to generate text
def gen_text(prompts, use_template=False, **kwargs):
    if use_template:
        full_prompts = [
            PROMPT_FOR_GENERATION_FORMAT.format(instruction=prompt)
            for prompt in prompts
        ]
    else:
        full_prompts = prompts

    if "batch_size" not in kwargs:
        kwargs["batch_size"] = 1
    
    # the default max length is pretty small (20), which would cut the generated output in the middle, so it's necessary to increase the threshold to the complete response
    if "max_new_tokens" not in kwargs:
        kwargs["max_new_tokens"] = 512

    # configure other text generation arguments, see common configurable args here: https://huggingface.co/docs/transformers/main_classes/text_generation#transformers.GenerationConfig
    kwargs.update(
        {
            "pad_token_id": tokenizer.eos_token_id,  # Hugging Face sets pad_token_id to eos_token_id by default; setting here to not see redundant message
            "eos_token_id": tokenizer.eos_token_id,
        }
    )

    outputs = pipeline(full_prompts, **kwargs)
    outputs = [out[0]["generated_text"] for out in outputs]

    return outputs
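
The default handling in gen_text can be tried without a GPU by swapping in a stand-in for the pipeline. The sketch below is not the notebook's code: fake_pipeline and gen_text_sketch are hypothetical names, and the stand-in only mimics the output shape used above (one list per prompt, each holding a dict with a "generated_text" key).

```python
from typing import Any, Dict, List


def fake_pipeline(prompts: List[str], **kwargs: Any) -> List[List[Dict[str, str]]]:
    """Hypothetical stand-in for the real pipeline: echoes each prompt."""
    fake_pipeline.last_kwargs = kwargs  # remember what the caller passed
    return [[{"generated_text": f"(reply to: {p})"}] for p in prompts]


def gen_text_sketch(prompts: List[str], pipeline=fake_pipeline, **kwargs: Any) -> List[str]:
    """Apply the same defaults as gen_text, then unwrap the generated text."""
    kwargs.setdefault("batch_size", 1)         # one prompt at a time unless overridden
    kwargs.setdefault("max_new_tokens", 512)   # raised from the small library default
    outputs = pipeline(prompts, **kwargs)
    return [out[0]["generated_text"] for out in outputs]


single = gen_text_sketch(["Hello"])
default_kwargs = dict(fake_pipeline.last_kwargs)

batch = gen_text_sketch(["A", "B"], batch_size=2, max_new_tokens=64)
override_kwargs = dict(fake_pipeline.last_kwargs)

print(single, default_kwargs)
print(batch, override_kwargs)
```

With the real function, the same keyword arguments pass straight through, for example gen_text(prompts, use_template=True, batch_size=2). If an answer is cut off at the default limit, a larger max_new_tokens value can be passed.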

Text generation with a single input.

results = gen_text(["What is a large language model?"])
print(results[0])

A large language model is a type of artificial intelligence (AI) model that is trained on a large corpus of text data to generate language outputs that are coherent and natural-sounding. These models are designed to learn the patterns and structures of language by exposure to a wide range of texts, and can be used for a variety of applications such as language translation, text summarization, and language generation.

Some examples of large language models include:

1. BERT (Bidirectional Encoder Representations from Transformers): Developed by Google in 2018, BERT is a powerful language model that has achieved state-of-the-art results on a wide range of natural language processing (NLP) tasks. BERT uses a multi-layer bidirectional transformer encoder to generate contextualized representations of words in a sentence.
2. RoBERTa (Robustly Optimized BERT Pretraining Approach): Developed in 2019, RoBERTa is a variant of BERT that was specifically designed for text classification tasks. RoBERTa uses a modified version of the BERT architecture and training procedure to improve its performance on long-tail and out-of-vocabulary words.
3. DistilBERT (Distilled BERT): Developed in 2019, DistilBERT is a smaller and more efficient variant of BERT that has achieved comparable performance to BERT on a wide range of NLP tasks. DistilBERT uses a distillation technique to compress the knowledge of the full BERT model into a smaller model that requires fewer computational resources.
4. Longformer (Long-range dependence transformer): Developed in 2020, Longformer is a language model that is specifically designed to handle long-range dependencies in text. Longformer uses a novel attention mechanism that allows it to process input sequences of arbitrary length, making it well-suited for tasks such as machine translation and text summarization.

These are just a few examples of the many large language models that have been developed in recent years. Each of these models has achieved state-of-the-art results on a wide range of NLP tasks, and they have all contributed to the rapid advancement of the field of natural language processing.

What about Japanese?

results = gen_text(["大規模言語モデルとは？"])
print(results[0])

Ooh, it gives a reasonably good answer.

大規模言語モデル（Big Language Model、BLM）とは、自然言語processing（NLP）における一種の人工知能（AI）で、大規模なデータセットを用いて、自然言語の表現を処理することができるように設計されている。

BLMは、通常、大量の文書やデータセットを用いて、言語モデルを学習することができる。これらのデータセットには、単語や文、文章などの自然言語の表現が含まれている。BLMは、これらのデータセットを用いて、自然言語の表現を処理するための特徴を学習する。

BLMは、自然言語の処理において、以下のような役割を果たすことができる。

1. 文本生成：BLMを使用することで、新しい文本を生成することができる。これには、単語や文、文章などの自然言語の表現を生成することができる。
2. 文本読み取り：BLMを使用することで、文本を読み取り、単語や文、文章などの自然言語の表現を抽出することができる。
3. 文本理解：BLMを使用することで、文本を理解することができる。これには、文本中の単語や文、文章などの自然言語の表現を理解することができる。
4. 自然言語処理：BLMを使用するこ

How about another query?

results = gen_text(["Databricksとは？"])
print(results[0])

Databricksは、Apache Sparkを基にした、 cloud-based big data analytics platformです。Databricksは、Sparkの強みを生かして、効率的に大規模データの処理を行うことができます。Databricksは、Data Engineers、Data Scientists、およびBusiness Analystsなど、幅広い層のユーザーによって使用されています。

Databricksの主な機能は以下の通りです。

1. Apache Sparkの導入：Databricksは、Apache Sparkを基にしており、Sparkの強みを生かして、効率的に大規模データの処理を行うことができます。
2. Cloud-based Platform：Databricksは、クラウドベースのプラットフォームであり、データの処理を容易にするために、Sparkの強みを生かしています。
3. Collaboration Tools：Databricksには、データの共有や協力を容易にするための強力なコラボレーションツールがあります。
4. Machine Learning：Databricksには、Machine Learningのための強力なツールがあります。
5. Data Governance：Databricksには、データの管理や安全性を確保するための強力なデータ管理機能があります。
6. Integration：Databricksには、Sparkの強みを生かして、各種データソースと統合することができます。

It seems to work fairly well in Japanese too, so I'll try a few more things.

Databricks Quickstart Guide

Databricks free trial
