0
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 1 year has passed since last update.

【Python】GPTブームに乗じて簡易的な機械学習AIを完全自作してみた

Posted at

突然なのですが、実は私は(到底信じてもらえないでしょうが)ChatGPTが流行る前から機械学習をやりたい思いはあったのですが、当時はプログラミングの知識がほとんどなく、結局Webアプリケーションに興味移ってしまったためしばらくやってませんでした。。。

ただ最近GPT絡みでAIブームが来ているので(?)簡易的な機械学習と言語モデル生成をやってみます。

趣味の範疇なので悪しからず。

コード

1.あらかじめ指定した文字列を整理(?)するやつ

import random

# function to generate n-grams from text
def generate_ngrams(text, n):
    ngrams = []
    words = text.split()
    for i in range(len(words)-n+1):
        ngrams.append(' '.join(words[i:i+n]))
    return ngrams

# function to generate a language model from text
def generate_language_model(text, n):
    ngrams = generate_ngrams(text, n)
    model = {}
    for i in range(len(ngrams)-1):
        if ngrams[i] not in model:
            model[ngrams[i]] = []
        model[ngrams[i]].append(ngrams[i+1])
    return model

# function to generate text using language model
def generate_text(model, n, length):
    current_ngram = random.choice(list(model.keys()))
    text = current_ngram
    for i in range(length):
        if current_ngram not in model:
            break
        possible_words = model[current_ngram]
        next_word = random.choice(possible_words)
        text += ' ' + next_word.split()[n-1]
        current_ngram = ' '.join(text.split()[-n:])
    return text

# example usage
text = 'The quick brown fox jumps over the lazy dog'
n = 2
model = generate_language_model(text, n)
generated_text = generate_text(model, n, 10)
print(generated_text)

2.標準入力をもとにJSONでモデル生成するやつ

import random
import json
# function to generate n-grams from text
def generate_ngrams(text, n):
    ngrams = []
    words = text.split()
    for i in range(len(words)-n+1):
        ngrams.append(' '.join(words[i:i+n]))
    return ngrams

# function to generate a language model from text
def generate_language_model(text, n):
    ngrams = generate_ngrams(text, n)
    model = {}
    for i in range(len(ngrams)-1):
        if ngrams[i] not in model:
            model[ngrams[i]] = []
        model[ngrams[i]].append(ngrams[i+1])
    return model

# function to generate text using language model
def generate_text(model, n, length):
    current_ngram = random.choice(list(model.keys()))
    text = current_ngram
    for i in range(length):
        if current_ngram not in model:
            break
        possible_words = model[current_ngram]
        next_word = random.choice(possible_words)
        text += ' ' + next_word.split()[n-1]
        current_ngram = ' '.join(text.split()[-n:])
    return text

# example usage
text1 = str(input())
text2 = str(input())
text = text1 + text2
n = 2
model = generate_language_model(text, n)
generated_text = generate_text(model, n, 10)
print(generated_text)

with open('model.json', 'w') as f:
    json.dump(model, f)

生成したモデル

モデル1

{
    "The quick": [
        "quick brown",
        "quick brown"
    ],
    "quick brown": [
        "brown fox",
        "brown fox",
        "brown fox",
        "brown fox"
    ],
    "brown fox": [
        "fox jumps",
        "fox jumps",
        "fox jumps",
        "fox jumps"
    ],
    "fox jumps": [
        "jumps over",
        "jumps over",
        "jumps over",
        "jumps over"
    ],
    "jumps over": [
        "over the",
        "over the",
        "over the",
        "over the"
    ],
    "over the": [
        "the lazy",
        "the lazy",
        "the lazy",
        "the lazy"
    ],
    "the lazy": [
        "lazy dog",
        "lazy dogThe",
        "lazy dogThe",
        "lazy dog"
    ],
    "lazy dog": [
        "dog The"
    ],
    "dog The": [
        "The quick"
    ],
    "lazy dogThe": [
        "dogThe quick",
        "dogThe quick"
    ],
    "dogThe quick": [
        "quick brown",
        "quick brown"
    ]
}

モデル2

{
    "1 2": [
        "2 3",
        "2 3"
    ],
    "2 3": [
        "3 4",
        "3 4",
        "3 4",
        "3 4"
    ],
    "3 4": [
        "4 5",
        "4 51",
        "4 51",
        "4 5"
    ],
    "4 5": [
        "5 1"
    ],
    "5 1": [
        "1 2"
    ],
    "4 51": [
        "51 2",
        "51 2"
    ],
    "51 2": [
        "2 3",
        "2 3"
    ]
}

モデル3(標準入力から生成するタイプ)

{
    "22233 3332231313131": [
        "3332231313131 daiwduai"
    ]
}

ノリと勢いでやりましたが案外いけますね

タノシイ

0
0
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
0
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?