
[Beginner] Trying Out ChatGPT to Fix Code

Posted at 2023-01-24, last updated at 2023-01-26

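Introduction

I heard that ChatGPT can fix code, so I tried it out.

Code to fix

In my previous code, the part below, which repeats the same processing three times, looked clumsy, and I had been wondering whether it could be written more compactly.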

df_nile = df_drone_data.copy()
df_nile['出願人/権利者'] = df_nile['出願人/権利者'].str.contains('株式会社ナイルワークス')
df_nile = df_nile[~(df_nile['出願人/権利者'] == False)]

df_ntt = df_drone_data.copy()
df_ntt['出願人/権利者'] = df_ntt['出願人/権利者'].str.contains('株式会社ＮＴＴドコモ')
df_ntt = df_ntt[~(df_ntt['出願人/権利者'] == False)]

df_toy = df_drone_data.copy()
df_toy['出願人/権利者'] = df_toy['出願人/権利者'].str.contains('トヨタ自動車株式会社')
df_toy = df_toy[~(df_toy['出願人/権利者'] == False)]

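Method

I pasted the code above into ChatGPT along with the request (in Japanese) "Please combine the three operations into one. Thank you."

Result

I received the following answer: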
Yes, you can simplify the code by combining the three separate operations into a single function and then applying it to the original DataFrame. Here's an example of how you could do this:

def filter_by_string(df, string, source_name):
    df_filtered = df.copy()
    df_filtered['出願人/権利者'] = df_filtered['出願人/権利者'].str.contains(string)
    df_filtered = df_filtered[~(df_filtered['出願人/権利者'] == False)]
    df_filtered['Source'] = source_name
    return df_filtered

df_nile = filter_by_string(df_drone_data, '株式会社ナイルワークス', 'Nile Works')
df_ntt = filter_by_string(df_drone_data, '株式会社ＮＴＴドコモ', 'NTT Docomo')
df_toy = filter_by_string(df_drone_data, 'トヨタ自動車株式会社', 'Toyota')

This way, you can use the same function for filtering the dataframe, and passing the different string you want to filter and source name.

I pasted this code in and it ran without problems, so I think I got a nearly correct answer. (It does include some unnecessary processing, though.)
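One thing worth noting: both my original code and the function above overwrite the 出願人/権利者 column with the True/False result of str.contains, so the filtered DataFrames no longer hold the applicant names. Below is a minimal sketch of one way to avoid that, assuming the goal is only to select the matching rows. The function name filter_by_applicant, its column parameter, and the sample data are hypothetical and made up for illustration.

```python
import pandas as pd


def filter_by_applicant(df, name, column='出願人/権利者'):
    """Return the rows of df whose `column` contains `name`."""
    # regex=False treats `name` as a literal string;
    # na=False counts missing values as "no match".
    mask = df[column].str.contains(name, regex=False, na=False)
    # Indexing with the mask leaves the original column values intact.
    return df[mask].copy()


# Hypothetical sample data standing in for df_drone_data.
df_drone_data = pd.DataFrame({
    '出願人/権利者': ['株式会社ナイルワークス', 'トヨタ自動車株式会社', None],
    '発明の名称': ['A', 'B', 'C'],
})

df_nile = filter_by_applicant(df_drone_data, '株式会社ナイルワークス')
```

With this variant, df_nile still shows the applicant name in the 出願人/権利者 column instead of True.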

Further improvement

The following part also seemed redundant to me, so I asked ChatGPT to deal with it as well:

df_nile = filter_by_string(df_drone_data, '株式会社ナイルワークス', 'Nile Works')
df_ntt = filter_by_string(df_drone_data, '株式会社ＮＴＴドコモ', 'NTT Docomo')
df_toy = filter_by_string(df_drone_data, 'トヨタ自動車株式会社', 'Toyota')

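Method

I pasted the code above into ChatGPT along with the request (in Japanese) "Please combine the three operations using a dictionary or something similar. Thank you."

Result

I received the following answer: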
Here is an example of how you can modify the original code to read the strings from a dictionary and filter the dataframe:

strings_to_filter = {'nile': '株式会社ナイルワークス', 'ntt': '株式会社ＮＴＴドコモ', 'toy': 'トヨタ自動車株式会社'}

def filter_by_string(df, string):
    df_filtered = df.copy()
    df_filtered['出願人/権利者'] = df_filtered['出願人/権利者'].str.contains(string)
    df_filtered = df_filtered[~(df_filtered['出願人/権利者'] == False)]
    return df_filtered

for key, value in strings_to_filter.items():
    df_filtered = filter_by_string(df_drone_data, value)
    globals()['df_' + key] = df_filtered

In this example, I removed the source_name argument from filter_by_string function as it was not used in the function. And I also added the dictionary strings_to_filter with the strings you want to filter by as values and keys as nile, ntt, toy respectively. And then, in the for loop, it iterates over the dictionary, calling the "filter_by_string" function for each value and creating a new variable with the name "df_" + key, where key is the key of the current dictionary item.

It will filter the dataframe using the strings from the dictionary, and create new variables for each filtered dataframe.

ChatGPT appears to explain in English what it improved, but I can hardly follow it. Still, this version also ran without problems, so I think I got a correct answer.
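In short, the loop calls filter_by_string once per dictionary entry and uses globals() to create a variable named 'df_' + key for each result. A common alternative is to keep the results in a dictionary, so that no variables are created dynamically. The following is only a sketch of that idea and not part of ChatGPT's answer; the name dfs and the sample data are hypothetical.

```python
import pandas as pd

strings_to_filter = {
    'nile': '株式会社ナイルワークス',
    'ntt': '株式会社ＮＴＴドコモ',
    'toy': 'トヨタ自動車株式会社',
}

# Hypothetical sample data standing in for df_drone_data.
df_drone_data = pd.DataFrame({
    '出願人/権利者': [
        '株式会社ナイルワークス',
        '株式会社ＮＴＴドコモ',
        'トヨタ自動車株式会社',
        'その他',
    ],
})

# One filtered DataFrame per key, stored in a dict rather than in globals().
dfs = {
    key: df_drone_data[
        df_drone_data['出願人/権利者'].str.contains(name, regex=False, na=False)
    ]
    for key, name in strings_to_filter.items()
}

# Look up a result by its key instead of by a generated variable name.
df_ntt = dfs['ntt']
```

Each result is then reached as dfs['nile'], dfs['ntt'], and so on, and looping over all of them is just a loop over dfs.items().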

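Further improvement

It seemed to me that it would be better if {'nile': '株式会社ナイルワークス', 'ntt': '株式会社ＮＴＴドコモ', 'toy': 'トヨタ自動車株式会社'} could be read from an external CSV file, so I asked (in Japanese): "I want to read the {'nile': '株式会社ナイルワークス', 'ntt': '株式会社ＮＴＴドコモ', 'toy': 'トヨタ自動車株式会社'} part from an external CSV. Thank you."

Result

I received the following answer: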
You can use the pandas library to read a CSV file and convert it into a DataFrame. Here is an example of how you could modify the code to read the dictionary from a CSV file:

import pandas as pd

strings_to_filter = pd.read_csv('strings_to_filter.csv',header=None)
strings_to_filter = dict(zip(strings_to_filter[0], strings_to_filter[1]))

def filter_by_string(df, string):
    df_filtered = df.copy()
    df_filtered['出願人/権利者'] = df_filtered['出願人/権利者'].str.contains(string)
    df_filtered = df_filtered[~(df_filtered['出願人/権利者'] == False)]
    return df_filtered

for key, value in strings_to_filter.items():
    df_filtered = filter_by_string(df_drone_data, value)
    globals()['df_' + key] = df_filtered

It assumes that the file name is "strings_to_filter.csv" and it's located in the same directory as the script. Also it assumes that the CSV file have two columns, first column for the key, second column for the value of the dictionary.
You may also want to add some error handling if the file is not found.

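With this, I can create as many DataFrames as I like just by maintaining a list in an external CSV file. The answer even includes advice on the CSV format and the directory where the file should go, which I find almost too kind.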
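ChatGPT's answer also suggests adding error handling for a missing file. A minimal sketch of what that could look like, assuming the same header-less, two-column CSV layout: the helper name load_filter_strings and the choice to return an empty dictionary when the file is missing are my own assumptions, and the demo writes a throwaway CSV to a temporary directory.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_filter_strings(path):
    """Read a header-less, two-column CSV into a {key: search string} dict."""
    try:
        table = pd.read_csv(path, header=None)
    except FileNotFoundError:
        # Assumption: report the problem and continue with no filters.
        print(f'{path} was not found; no filters loaded.')
        return {}
    # Column 0 holds the keys, column 1 the strings to search for.
    return dict(zip(table[0], table[1]))


# Demo with a throwaway CSV in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / 'strings_to_filter.csv'
    csv_path.write_text(
        'nile,株式会社ナイルワークス\nntt,株式会社ＮＴＴドコモ\n',
        encoding='utf-8',
    )
    loaded = load_filter_strings(csv_path)
    missing = load_filter_strings(Path(tmp_dir) / 'no_such_file.csv')
```

Whether a missing file should stop the script or fall back to an empty list depends on how the script is used; raising the error again would be just as reasonable.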

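Impressions

My impression is that, at a beginner level, studying with ChatGPT as a partner may well be the way to go. Of course, you need to come up with questions that are easy for it to answer, but still, what an amazing era we live in.
That said, a beginner cannot judge whether the code is correct, so I also felt that relying on it too much is a bad idea.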
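One small check that even a beginner can run is to compare the output of the refactored code with the output of the original code on a tiny DataFrame. The sketch below does this with pandas.testing.assert_frame_equal; the sample data is hypothetical, and passing this check only shows that the two versions agree on that sample, not that either one is correct in general.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def filter_by_string(df, string):
    # The function as returned by ChatGPT (dictionary version).
    df_filtered = df.copy()
    df_filtered['出願人/権利者'] = df_filtered['出願人/権利者'].str.contains(string)
    df_filtered = df_filtered[~(df_filtered['出願人/権利者'] == False)]
    return df_filtered


# Hypothetical sample data standing in for df_drone_data.
df_drone_data = pd.DataFrame({
    '出願人/権利者': ['株式会社ナイルワークス', 'トヨタ自動車株式会社', 'その他'],
})

# The original step-by-step version for one company.
df_nile = df_drone_data.copy()
df_nile['出願人/権利者'] = df_nile['出願人/権利者'].str.contains('株式会社ナイルワークス')
df_nile = df_nile[~(df_nile['出願人/権利者'] == False)]

# Raises AssertionError if the two results differ in any way.
assert_frame_equal(df_nile, filter_by_string(df_drone_data, '株式会社ナイルワークス'))
```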
