Introduction to Molecule Generation with VAE and GAN

Posted at 2022-04-13

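MNIST is a common subject when introducing the variational autoencoder (VAE) and the generative adversarial network (GAN). Here I instead use SMILES, a molecular notation widely used in cheminformatics. Because this is only an introduction, I do not consider at all whether the generated molecules have the desired properties.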

The code uses PyTorch; for the very basics of PyTorch, please refer to an introductory article such as the one linked here.

Getting the data

I assume the compound data is available as a pandas DataFrame like the following. Each compound's structure is represented by a string called SMILES.

df_reg[["Open Babel SMILES", "HOMO-LUMO gap"]]

     Open Babel SMILES                       HOMO-LUMO gap
0    CN(CCCN1C(=CC(=C[C@@H](C1=O)C)C)C)C     5.064
1    OCc1c(C)cc(cc1C)C                       6.041
2    OCc1cc(C)cc(c1O)CO                      5.576
3    Oc1ccc(c(c1)C)C(C)C                     5.837
4    C/C/1=C\CC(C)(C)/C=C/C/C(=C\CC1)/C      6.229
...  ...                                     ...
628  Oc1cc(O)c(c(c1O)C)O                     5.124
629  OCc1occ(c(=O)c1)O                       5.154
630  Cc1cc2[nH]cnc2cc1C                      5.625
631  CC(c1cccc(c1O)C)C                       5.984
632  Cc1cc(N)cc(c1N(=O)=O)C                  4.250

633 rows × 2 columns

MLP (Multi-Layer Perceptron)

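Since I consider the multi-layer perceptron to be the most basic building block of deep learning, I first build an MLP regression model that takes the SMILES string as the explanatory variable. The target variable is the HOMO-LUMO gap.

As an example, the training parameters are set as follows.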

import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
SMILES_COL = "Open Babel SMILES"
TARGET_COL = 'HOMO-LUMO gap'
SMILES_MAXLEN = 50
BATCH_SIZE = 10

vocab

The vocabulary is stored in a list called vocab. For SMILES strings, each individual character that makes up a string is treated as one vocabulary entry.

vocab_freq =  {}
word_length_dist = []
for smile in df_reg[SMILES_COL]:
    for s in smile:
        if s not in vocab_freq.keys():
            vocab_freq[s] = 0
        vocab_freq[s] += 1
    word_length_dist.append(len(smile))

vocab = list(vocab_freq.keys())
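
As a minimal sketch of the same character-level counting, the block below uses `collections.Counter` on three SMILES strings copied from the table above. The names `smiles_list`, `char_counts`, and `vocab_sorted` are introduced here for illustration and do not appear in the article. Sorting the characters is an extra step, assumed here only to make the index of each character reproducible; the article's code keeps first-seen order.

```python
from collections import Counter

# Three SMILES strings copied from the table above (illustrative subset).
smiles_list = [
    "OCc1c(C)cc(cc1C)C",
    "Oc1ccc(c(c1)C)C(C)C",
    "Cc1cc2[nH]cnc2cc1C",
]

# Count every character; each distinct character is one vocabulary entry.
char_counts = Counter(ch for smile in smiles_list for ch in smile)

# Sorted order (an assumption of this sketch, not the article's ordering).
vocab_sorted = sorted(char_counts)

print(vocab_sorted)
print(len(vocab_sorted), "distinct characters")
```

Note that character-level splitting breaks bracketed atoms such as `[nH]` into the separate entries `[`, `n`, `H`, and `]`, which is also what the article's loop does.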

Just to be sure, here is the distribution of SMILES string lengths.

import matplotlib.pyplot as plt

plt.hist(word_length_dist)
plt.show()

[Figure: histogram of SMILES string lengths]

One-hot vectors for SMILES

Using vocab, define a function that converts a SMILES string into one-hot vectors.

import numpy as np

def smile2vec(vocab, vecsize, smile):
    vec = []
    for i in range(vecsize):
        v = [0 for _ in range(len(vocab))]
        if i < len(smile):
            v[vocab.index(smile[i])] = 1
        vec += v
    return vec

The SMILES strings are then converted into explanatory variables made of one-hot vectors as follows.

X = []
for smile in list(df_reg[SMILES_COL]):
    X.append(smile2vec(vocab, SMILES_MAXLEN, smile))

X = np.array(X)
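
To make the layout of the flattened vector concrete, here is a small round-trip sketch on a toy vocabulary. `one_hot_encode` mirrors the logic of `smile2vec` above; `one_hot_decode` and `toy_vocab` are hypothetical helpers introduced only for this illustration.

```python
def one_hot_encode(vocab, vecsize, smile):
    """Flat one-hot encoding: vecsize positions, len(vocab) slots per position."""
    vec = []
    for i in range(vecsize):
        row = [0] * len(vocab)
        if i < len(smile):
            row[vocab.index(smile[i])] = 1
        vec.extend(row)
    return vec


def one_hot_decode(vocab, vec):
    """Inverse mapping; all-zero rows are treated as padding and skipped."""
    n = len(vocab)
    chars = []
    for start in range(0, len(vec), n):
        row = vec[start:start + n]
        if 1 in row:
            chars.append(vocab[row.index(1)])
    return "".join(chars)


toy_vocab = ["C", "O", "c", "1", "(", ")"]

encoded = one_hot_encode(toy_vocab, 8, "CCO")
print(len(encoded))                        # 8 positions * 6 characters = 48
print(one_hot_decode(toy_vocab, encoded))  # CCO

# A string longer than vecsize is silently truncated.
print(one_hot_decode(toy_vocab, one_hot_encode(toy_vocab, 2, "CCO")))  # CC
```

The last line shows a property that also holds for `smile2vec`: characters beyond `SMILES_MAXLEN` are dropped, and shorter strings are padded with all-zero rows.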

Create the target variable.

T = list(df_reg[TARGET_COL])
T = np.array(T).reshape(len(T),1)

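Data split

Split the explanatory and target variables into training data and test data.

# Rows with an odd index become training data; rows with an even index become test data.
index = np.arange(T.size)
X_train = X[index[index % 2 != 0], :]  # explanatory variables (training data)
X_test = X[index[index % 2 == 0], :]  # explanatory variables (test data)
T_train = T[index[index % 2 != 0], :]  # target variable (training data)
T_test = T[index[index % 2 == 0], :]  # target variable (test data)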
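
The alternating split can be checked with plain Python slicing. The sketch below assumes the 633 rows shown in the table earlier and reproduces the resulting set sizes. The seeded random split at the end is an alternative that the article does not use; it is included only for comparison.

```python
import random

n_rows = 633  # number of rows in the table shown earlier
indices = list(range(n_rows))

# Alternating split, as in the article: odd indices train, even indices test.
train_idx = indices[1::2]
test_idx = indices[0::2]
print(len(train_idx), len(test_idx))  # 316 317

# Alternative (not from the article): a seeded random split of similar size.
shuffled = indices[:]
random.Random(0).shuffle(shuffled)
half = n_rows // 2
rand_train_idx, rand_test_idx = shuffled[:half], shuffled[half:]
print(len(rand_train_idx), len(rand_test_idx))  # 316 317
```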

Tensors, datasets, and data loaders

Convert the arrays to tensors so that PyTorch can handle them.

X_train_tensor = torch.from_numpy(X_train).float()
X_test_tensor = torch.from_numpy(X_test).float()

T_train_tensor = torch.from_numpy(T_train).float()
T_test_tensor = torch.from_numpy(T_test).float()

Combine the explanatory and target variables into datasets.

from torch.utils.data import TensorDataset

train = TensorDataset(X_train_tensor, T_train_tensor)
test = TensorDataset(X_test_tensor, T_test_tensor)

Convert the datasets into data loaders so that PyTorch can easily draw mini-batches.

from torch.utils.data import DataLoader

train_loader = DataLoader(train, batch_size=BATCH_SIZE, shuffle=True)
test_loader = DataLoader(test, batch_size=BATCH_SIZE, shuffle=True)
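
Conceptually, a shuffled data loader visits every sample once per epoch in random order, in chunks of `batch_size`. The pure-Python sketch below illustrates that idea only; it is not how `DataLoader` is implemented, and `iterate_minibatches` is a hypothetical helper. The sample count 316 assumes the odd-index training split of the 633 rows.

```python
import random


def iterate_minibatches(n_samples, batch_size, shuffle=True, seed=None):
    """Yield lists of sample indices, one list per mini-batch."""
    order = list(range(n_samples))
    if shuffle:
        random.Random(seed).shuffle(order)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]


batches = list(iterate_minibatches(316, 10, seed=0))
print(len(batches))      # 32 batches per epoch
print(len(batches[-1]))  # the last batch holds the remaining 6 samples
```

With `BATCH_SIZE = 10`, one epoch over the training data therefore consists of 32 parameter updates.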

MLP Regressor model

The MLP Regressor is designed as follows.

import torch

class MLPR(torch.nn.Module):
    def __init__(self, n_input, n_hidden, n_output):
        super(MLPR, self).__init__()
        self.l1 = torch.nn.Linear(n_input, n_hidden)
        self.l2 = torch.nn.Linear(n_hidden, n_hidden)
        self.l3 = torch.nn.Linear(n_hidden, n_output)

    def forward(self, x):
        x = torch.sigmoid(self.l1(x))
        x = torch.sigmoid(self.l2(x))
        return self.l3(x)
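
Written as equations, the forward pass of the class above looks as follows. The symbols $W_k$ and $b_k$ (weights and biases of `l1`, `l2`, `l3`) are notation introduced here for illustration; $\sigma$ is the logistic sigmoid applied elementwise.

```latex
\begin{aligned}
h_1 &= \sigma(W_1 x + b_1), \\
h_2 &= \sigma(W_2 h_1 + b_2), \\
\hat{y} &= W_3 h_2 + b_3,
\end{aligned}
\qquad
\sigma(a) = \frac{1}{1 + e^{-a}}
```

The output layer has no activation function, so $\hat{y}$ can take any real value, which suits a regression target such as the HOMO-LUMO gap.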

Create the model and configure the optimization.

model = MLPR(len(vocab)*SMILES_MAXLEN, 10, 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

Start training

losses_train = []
losses_test = []
for epoch in range(500):
    # Training pass: accumulate the per-batch mean MSE over all batches.
    model.train()
    total_loss = 0
    for x_train, y_train in train_loader:
        optimizer.zero_grad()
        y_pred = model(x_train)
        loss = criterion(y_pred, y_train)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()

    # Evaluation pass: gradients are not needed for the test data.
    model.eval()
    total_loss_test = 0
    with torch.no_grad():
        for x_test, y_test in test_loader:
            y_pred = model(x_test)
            loss = criterion(y_pred, y_test)
            total_loss_test += loss.item()

    losses_train.append(total_loss)
    losses_test.append(total_loss_test)
    if (epoch + 1) % 100 == 0:
        print("Epoch: {}, Loss (train): {}, Loss (test): {}".format(epoch + 1, total_loss, total_loss_test))
Epoch: 100, Loss (train): 12.183247983455658, Loss (test): 16.908035904169083
Epoch: 200, Loss (train): 4.680343508720398, Loss (test): 15.275418043136597
Epoch: 300, Loss (train): 2.4838144555687904, Loss (test): 15.400072306394577
Epoch: 400, Loss (train): 1.7306331638246775, Loss (test): 16.064845487475395
Epoch: 500, Loss (train): 1.3097353847697377, Loss (test): 16.5964957177639

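Learning curve

The result of training. The model is clearly overfitting: the training loss keeps falling, while the test loss stays between roughly 15 and 17 and stops improving after about epoch 200.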

%matplotlib inline
import matplotlib.pyplot as plt
plt.plot(losses_train, label="train")
plt.plot(losses_test, label="test")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.grid()
plt.show()

[Figure: learning curves, training and test loss vs. epoch]
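
The overfitting can also be read directly from the logged numbers. The sketch below copies the five logged values from the output above and picks the logged epoch with the lowest test loss, which is the idea behind early stopping (a technique the article does not use). The names `logged`, `best_epoch`, and `ratios` are introduced here for illustration.

```python
# (train loss, test loss) per logged epoch, copied from the training output above.
logged = {
    100: (12.183247983455658, 16.908035904169083),
    200: (4.680343508720398, 15.275418043136597),
    300: (2.4838144555687904, 15.400072306394577),
    400: (1.7306331638246775, 16.064845487475395),
    500: (1.3097353847697377, 16.5964957177639),
}

# Early stopping would keep the model from the epoch with the lowest test loss.
best_epoch = min(logged, key=lambda e: logged[e][1])
print(best_epoch)  # 200

# The test/train ratio is a rough indicator of how far the two losses diverge.
ratios = [logged[e][1] / logged[e][0] for e in sorted(logged)]
for epoch, ratio in zip(sorted(logged), ratios):
    print(epoch, round(ratio, 1))
```

Only every 100th epoch was printed, so the true minimum of the test loss may lie between the logged epochs.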

For reference, here is a y-y plot comparing the predicted values with the observed values.

%matplotlib inline
import matplotlib.pyplot as plt
plt.figure(figsize=(6,6))
plt.scatter(T_train, model.forward(X_train_tensor).data.flatten(), alpha=0.5, label="train")
plt.scatter(T_test, model.forward(X_test_tensor).data.flatten(), alpha=0.5, label="test")
plt.plot([min(T), max(T)], [min(T), max(T)])
plt.grid()
plt.legend()
plt.xlabel('Observed')
plt.ylabel('Predicted')
plt.show()

[Figure: y-y plot of predicted vs. observed HOMO-LUMO gap for training and test data]

Normally I would put more effort into building a better model, but since the goal here is to build a VAE and a GAN, I will not pursue the MLP any further and move on.

RDKit

Up to this point, SMILES were handled as plain strings, but a VAE or GAN generates new strings. Because we need to judge whether a generated string is a valid SMILES, we install and use RDKit, a cheminformatics library.

%%time 
!pip install git+https://github.com/maskot1977/rdkit_installer.git
from rdkit_installer import install
install.from_miniconda(rdkit_version="2020.09.1")
Collecting git+https://github.com/maskot1977/rdkit_installer.git
  Cloning https://github.com/maskot1977/rdkit_installer.git to /tmp/pip-req-build-d9hwjfi9
  Running command git clone -q https://github.com/maskot1977/rdkit_installer.git /tmp/pip-req-build-d9hwjfi9
Building wheels for collected packages: rdkit-installer
  Building wheel for rdkit-installer (setup.py) ... done
  Created wheel for rdkit-installer: filename=rdkit_installer-0.2.0-py3-none-any.whl size=5768 sha256=8a64428c210829465a80f18a206be677b242a7176d4cfd735782e6b015f0943b
  Stored in directory: /tmp/pip-ephem-wheel-cache-xgo2fh_c/wheels/e6/72/a5/218f5f909a3a87c1ec1ccec03ac61298947fb5f1efa517eefa
Successfully built rdkit-installer
Installing collected packages: rdkit-installer
Successfully installed rdkit-installer-0.2.0


add /root/miniconda/lib/python3.7/site-packages to PYTHONPATH
python version: 3.7.13
fetching installer from https://repo.continuum.io/miniconda/Miniconda3-4.7.12-Linux-x86_64.sh
done
installing miniconda to /root/miniconda
done
installing rdkit
done
rdkit-2020.09.1 installation finished!


CPU times: user 761 ms, sys: 219 ms, total: 981 ms
Wall time: 1min 1s

VAE (Variational Autoencoder)

Now, at last, the VAE. The parameters are as follows.

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

LEARNING_RATE = 1e-3
NUM_EPOCHS = 500
BATCH_SIZE = 64
N_INPUT = len(vocab)*SMILES_MAXLEN
N_HIDDEN = 400
N_Z = 20

Tensors, datasets, and data loaders

The sequence of converting to tensors, creating a dataset, and wrapping it in a data loader.

import torch
from torch.utils.data import TensorDataset, DataLoader

X_tensor = torch.from_numpy(X).float()
T_tensor = torch.from_numpy(T).float()

dataset = TensorDataset(X_tensor, T_tensor)
data_loader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)

VAE model

The VAE is designed as follows.

class VAE(torch.nn.Module):
    def __init__(self, n_input=784, n_hidden=400, n_z=20):
        super(VAE, self).__init__()
        self.fc1 = torch.nn.Linear(n_input, n_hidden)
        self.fc2 = torch.nn.Linear(n_hidden, n_z)
        self.fc3 = torch.nn.Linear(n_hidden, n_z)
        self.fc4 = torch.nn.Linear(n_z, n_hidden)
        self.fc5 = torch.nn.Linear(n_hidden, n_input)
        
    def encode(self, x):
        h = torch.nn.functional.relu(self.fc1(x))
        return self.fc2(h), self.fc3(h)
    
    def reparameterize(self, mu, log_var):
        std = torch.exp(log_var/2)
        eps = torch.randn_like(std)
        return mu + eps * std

    def decode(self, z):
        h = torch.nn.functional.relu(self.fc4(z))
        return torch.sigmoid(self.fc5(h))
    
    def forward(self, x):
        mu, log_var = self.encode(x)
        z = self.reparameterize(mu, log_var)
        x_reconst = self.decode(z)
        return x_reconst, mu, log_var
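
The `reparameterize` method draws a sample from a Gaussian with mean `mu` and variance `exp(log_var)` by scaling and shifting standard normal noise. As a minimal scalar sketch using only the standard library (the function `reparameterize_scalar` and the chosen numbers are assumptions for illustration), the sample mean and variance should come out close to the requested values:

```python
import math
import random


def reparameterize_scalar(mu, log_var, rng):
    """z = mu + eps * std with eps ~ N(0, 1) and std = exp(log_var / 2)."""
    std = math.exp(log_var / 2)
    eps = rng.gauss(0.0, 1.0)
    return mu + eps * std


rng = random.Random(0)
target_mu, target_var = 1.5, 0.25
samples = [reparameterize_scalar(target_mu, math.log(target_var), rng)
           for _ in range(20000)]

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)  # both should be close to 1.5 and 0.25
```

Because the randomness enters only through `eps`, the sample is a differentiable function of `mu` and `log_var`, which is what allows gradients to flow back into the encoder during training.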

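Many SMILES-like strings can be produced from the generated tensor. The following function picks, among them, the string that is valid as a SMILES and is the longest. Each candidate is shortened from the end, one character at a time, until RDKit can parse it.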

from rdkit import Chem

def get_best_smile(out_tensor):
    best_smile = ""
    for vec in out_tensor:
        vec = vec.reshape(SMILES_MAXLEN, len(vocab))
        smile = "".join([vocab[torch.argmax(v).item()] for v in vec])
        mol = Chem.MolFromSmiles(smile)
        while not mol:
            if len(smile) == 0: break
            smile = smile[:-1]
            mol = Chem.MolFromSmiles(smile)
        
        if len(best_smile) < len(smile):
            best_smile = smile

    return best_smile
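
The selection logic can be tried without RDKit by swapping in a stand-in validity check. In the sketch below, `toy_is_valid` is a deliberately rough assumption (non-empty, balanced parentheses, ring-closure digits in pairs) and is far weaker than a real SMILES parser; `longest_valid_prefix` and `pick_longest` are hypothetical helpers that mirror the truncate-then-compare structure of `get_best_smile`.

```python
def toy_is_valid(smile):
    """Rough stand-in for an RDKit check (assumption, not real SMILES validation)."""
    if not smile:
        return False
    depth = 0
    for ch in smile:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    if depth != 0:
        return False
    # Every ring-closure digit must occur an even number of times.
    return all(smile.count(d) % 2 == 0 for d in "123456789")


def longest_valid_prefix(smile, is_valid):
    """Drop trailing characters until the validator accepts the string."""
    while smile and not is_valid(smile):
        smile = smile[:-1]
    return smile


def pick_longest(candidates, is_valid):
    """Return the longest validated prefix among all candidates."""
    best = ""
    for cand in candidates:
        prefix = longest_valid_prefix(cand, is_valid)
        if len(prefix) > len(best):
            best = prefix
    return best


candidates = ["Cc1ccc(c(c1)C)C(C", "CCC1ccC", "CC(=O"]
best = pick_longest(candidates, toy_is_valid)
print(best)  # Cc1ccc(c(c1)C)C
```

With RDKit installed, a validator along the lines of `lambda s: Chem.MolFromSmiles(s) is not None` could be passed instead of `toy_is_valid`.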

Create the model and configure the optimization.

model = VAE(n_input=N_INPUT, n_hidden=N_HIDDEN, n_z=N_Z).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

Start training
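
The loop in this section minimizes the sum of a reconstruction term and a KL divergence term. As a sketch in equations, with $\hat{x}$ the decoder output, $\mu$ the first encoder output, and $\log \sigma^2$ corresponding to `log_var` (these symbols are notation introduced here), and with both sums running over all elements in the mini-batch:

```latex
\begin{aligned}
\mathcal{L} &= \mathcal{L}_{\mathrm{rec}} + D_{\mathrm{KL}}, \\
\mathcal{L}_{\mathrm{rec}} &= -\sum_{i} \left[ x_i \log \hat{x}_i + (1 - x_i) \log (1 - \hat{x}_i) \right], \\
D_{\mathrm{KL}} &= -\frac{1}{2} \sum_{j} \left( 1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2 \right).
\end{aligned}
```

The last expression is the closed-form KL divergence between the diagonal Gaussian $\mathcal{N}(\mu, \sigma^2)$ and the standard normal prior $\mathcal{N}(0, I)$. This is why new molecules are generated by decoding samples drawn with `torch.randn`.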

losses = []
reconst_losses = []
kl_divs = []
for epoch in range(NUM_EPOCHS):
    for i, (x, _) in enumerate(data_loader):
        x = x.to(device).view(-1, N_INPUT)
        x_reconst, mu, log_var = model(x)
        
        reconst_loss = torch.nn.functional.binary_cross_entropy(x_reconst, x, reduction='sum')
        kl_div = - 0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
        
        loss = reconst_loss + kl_div
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
        reconst_losses.append(reconst_loss.item())
        kl_divs.append(kl_div.item())
        
        if (i+1) % 100 == 0:
            print ("Epoch[{}/{}], Step [{}/{}], Reconst Loss: {:.4f}, KL Div: {:.4f}" 
                   .format(epoch+1, NUM_EPOCHS, i+1, len(data_loader), reconst_loss.item(), kl_div.item()))
    
    with torch.no_grad():
        z = torch.randn(BATCH_SIZE, N_Z).to(device)
        out = model.decode(z)
        print("Epoch[{}/{}], Generated SMILES: {}".format(epoch+1, NUM_EPOCHS, get_best_smile(out)))

        # out, _, _ = model(x)
        # print("Epoch[{}/{}], Reconstructed SMILES: {}".format(epoch+1, NUM_EPOCHS, get_best_smile(out)))
Epoch[1/500], Generated SMILES: CC1Occ-C1
Epoch[2/500], Generated SMILES: CC1ccccccccccoC/CCOC1
Epoch[3/500], Generated SMILES: CCC
Epoch[4/500], Generated SMILES: CC1cccccccccc1
Epoch[5/500], Generated SMILES: CCC1cccccccccc1CCCCC
Epoch[6/500], Generated SMILES: CC1ccccccccccccCC1C
Epoch[7/500], Generated SMILES: CCC1cccccccccc1
Epoch[8/500], Generated SMILES: CC1ccc(cccc(cCc(c1CCCCC)C))CCC
Epoch[9/500], Generated SMILES: CC1cccccc1
Epoch[10/500], Generated SMILES: CCC1ccC1
Epoch[11/500], Generated SMILES: CCCCC
Epoch[12/500], Generated SMILES: CCCCC
Epoch[13/500], Generated SMILES: CCC1ccCCcccccc1
Epoch[14/500], Generated SMILES: CCCCCCC
Epoch[15/500], Generated SMILES: CCC=CC1cc1C
Epoch[16/500], Generated SMILES: CCC=CC1cC1
Epoch[17/500], Generated SMILES: CCCCCCC
Epoch[18/500], Generated SMILES: CCC1ccCc1
Epoch[19/500], Generated SMILES: CCCCCCC
Epoch[20/500], Generated SMILES: CCCCCCC
Epoch[21/500], Generated SMILES: CCCCCCC
Epoch[22/500], Generated SMILES: CCC1CCc1
Epoch[23/500], Generated SMILES: CCC1CCC1C
Epoch[24/500], Generated SMILES: CCCCC=1cC1
Epoch[25/500], Generated SMILES: Cc1ccc(c(c1)C)C(C)CC
Epoch[26/500], Generated SMILES: Cc1ccc(c(c1)C)C(C(C)CC)
Epoch[27/500], Generated SMILES: CCCCCC\OC1cc(cccc1(CCCCC))
Epoch[28/500], Generated SMILES: Cc1ccc(c(c1)C)CCC
Epoch[29/500], Generated SMILES: Cc1ccc(c(c1)C)C(C)CC
Epoch[30/500], Generated SMILES: Cc1ccc(c(c1)C)CCC
Epoch[31/500], Generated SMILES: Cc1ccc(c(c1)C)C(C)CCCC
Epoch[32/500], Generated SMILES: CCC1cc(c(c1CC)C(C(CCCC)CC)C)C
Epoch[33/500], Generated SMILES: CCC1cc(c(c1)C)C(C(CCCC)CC)CCC
Epoch[34/500], Generated SMILES: CCC1cc(c(c1CC)C(C)CCCC)CC
Epoch[35/500], Generated SMILES: Cc1ccc(c(c1)C)C(C)CCCC
Epoch[36/500], Generated SMILES: CCC1CccCC1CC(CC)C(CCCCCCCCCCCCCC)C
Epoch[37/500], Generated SMILES: CCC1cc(c(c1CC)C(C(CCCC(CC)CCC)CCCC)O)
Epoch[38/500], Generated SMILES: Cc1ccc(c(c1)C)C(C)C
Epoch[39/500], Generated SMILES: Cc1ccc(c(c1CC)CCC)C
Epoch[40/500], Generated SMILES: Cc1ccc(c(c1)CCCCC)CC
Epoch[41/500], Generated SMILES: Cc1ccc(c(c1)C)C(C(CC)C(=C)C)C
Epoch[42/500], Generated SMILES: CCc1cc(C(c1C(CCCCCCCCC)CC)CCC)CCCC
Epoch[43/500], Generated SMILES: C/N=C(\Oc1ccc(c(c1)C(C(C)))C)C
Epoch[44/500], Generated SMILES: Cc1ccc(c(c1)CCCCC)C
Epoch[45/500], Generated SMILES: CCc1c(COc(ccc1)C)C(C(C)C)OOCO
Epoch[46/500], Generated SMILES: Cc1cc(C)c(c(c1)C)C
Epoch[47/500], Generated SMILES: CCC1cc(c(c1CC)C(C(CC)C)CC)C
Epoch[48/500], Generated SMILES: CCc1cccCcccccc1C
Epoch[49/500], Generated SMILES: C=Cc1ccc1
Epoch[50/500], Generated SMILES: CC[N]C1=N[C](N/C(=N/CCCCC)/N1)CCCC
Epoch[51/500], Generated SMILES: C/N=C(\Oc1ccc(c(c1CC)C(C)C)CO)
Epoch[52/500], Generated SMILES: C/N=C(\Oc1ccc(c(c1)CCC)C)CO
Epoch[53/500], Generated SMILES: CC1cc(C(ccc(c1CCCO)C(C)C)(C)CC)C
Epoch[54/500], Generated SMILES: CCc1cc(Ccc(c1CCCC)C(=C)CC)
Epoch[55/500], Generated SMILES: C/N=C(\Oc1ccc(c(c1)C(C)C)C)C
Epoch[56/500], Generated SMILES: Cc1ccc(c(c1)C)COCOCCCO
Epoch[57/500], Generated SMILES: Cc1cc(C)c(c(c1)C)
Epoch[58/500], Generated SMILES: CCc1cccc(c(C1)C(C)CC)C
Epoch[59/500], Generated SMILES: C/N=C(\Oc1ccccccc1)CC(C)
Epoch[60/500], Generated SMILES: Cc1cccCcccc(O)cccccccc1OC
Epoch[61/500], Generated SMILES: Cc1cc(C)ccccc1
Epoch[62/500], Generated SMILES: C=CCNCC(C)N(C(=CCC=C)C(C)CNCCCC)CCCC
Epoch[63/500], Generated SMILES: C/N=C(\Oc1ccc(c(c1)C(C)C)C)CCC
Epoch[64/500], Generated SMILES: C/N=C(\Oc1ccc(c(c1)C)C(C)C)C
Epoch[65/500], Generated SMILES: C/C(CC(OC1cc(c(c1C)C(C)C)C)CCCC)COC
Epoch[66/500], Generated SMILES: Cc1cc(C)c(c(c1)C)C
Epoch[67/500], Generated SMILES: Cc1ccc(c(c1)C)C(=O)C
Epoch[68/500], Generated SMILES: Cc1cc(Ccccc(c1)CC)C
Epoch[69/500], Generated SMILES: Cc1=c(C)cccc1
Epoch[70/500], Generated SMILES: CCc1cC(=c(C1=C)C)C
Epoch[71/500], Generated SMILES: CCc1cc(c(ccc(c1CC)C)COCOC)CCO(CCC)
Epoch[72/500], Generated SMILES: Cc1cc(C)ccccc1CCCOCC(c2C(c2C)C)C
Epoch[73/500], Generated SMILES: CCCCC(C)c1c(c(c(C(cccc1C)O)C)C)CO
Epoch[74/500], Generated SMILES: OCCc1cc(C(OO))ccccc1O
Epoch[75/500], Generated SMILES: COc1cc(cccc(c1)CC)C
Epoch[76/500], Generated SMILES: Cc1cc2c(c1ccccccc2C)C(CC)O
Epoch[77/500], Generated SMILES: CCc1ccc(c(=C)CN=C(C1C)CC)CCCOCOC
Epoch[78/500], Generated SMILES: OC(=O)c1c(C(cc1C)CN)O(O)
Epoch[79/500], Generated SMILES: Cc1ccc(c(c1)O)C(C)C(=C)C
Epoch[80/500], Generated SMILES: OCC1N(C)CN[C]CN/C(=NN1CC)/NC
Epoch[81/500], Generated SMILES: COC1CC(CCC(c1)CCCC(C)C)C
Epoch[82/500], Generated SMILES: OOc1cccc(cccc(c1C)C)C(=O)
Epoch[83/500], Generated SMILES: CC[C@H]1CCCccc(c(C)C(=O)ccC1=O)C
Epoch[84/500], Generated SMILES: CCc1nc(c(c(c1)C)C)C(=O)
Epoch[85/500], Generated SMILES: Cc1ccc(cccc(C1)O)c2ccc2
Epoch[86/500], Generated SMILES: Cc1ccc(c(c1)C)C(C)C(=O)
Epoch[87/500], Generated SMILES: CCc1ccccccc1(C)
Epoch[88/500], Generated SMILES: CCc1cc=Ccc(c1CC)CC(=N)COC
Epoch[89/500], Generated SMILES: COc1cc(c(cc1OCC(C)C(=O)C)CNC)(C)
Epoch[90/500], Generated SMILES: CC1CCCN/c(=N)C(C)C(C)C(=N1)O
Epoch[91/500], Generated SMILES: CCCc1c(COCC1C)CC=O
Epoch[92/500], Generated SMILES: Cc1ccCCcccc(c1CC)O
Epoch[93/500], Generated SMILES: Cc1ccc(ccc11C)CC1
Epoch[94/500], Generated SMILES: C/N=C(COC1cc(c(c1)CCO)CC(O)C)C
Epoch[95/500], Generated SMILES: CCNCC(\Oc1cc(C)c(c(c1)C)NC)/O/OC/N
Epoch[96/500], Generated SMILES: C/N=C(\Oc1(cc(c(c1)C)C)C)CC
Epoch[97/500], Generated SMILES: COc1cC(Cc(c1OC(C))c1=C(CC)CNC)CC1
Epoch[98/500], Generated SMILES: CCc1cc(C)ccc1CC
Epoch[99/500], Generated SMILES: OCNc1ccc(ccc(ccc1C)C)CC
Epoch[100/500], Generated SMILES: C/C1C(\OcCcccccccc1C)C
Epoch[101/500], Generated SMILES: Oc1cc(Cc(c(c1)O)C)
Epoch[102/500], Generated SMILES: CC(C1cCCcnc(c1CC)C)
Epoch[103/500], Generated SMILES: CCC1Cc(Cc(c(c(c1)C)C)CN=O)N(C)
Epoch[104/500], Generated SMILES: CCC1CC(=O)C1=CCCCC
Epoch[105/500], Generated SMILES: Cc1ccc(c(c1)C)C
Epoch[106/500], Generated SMILES: COc1cc(C)(c(c1)O)C(C=O)
Epoch[107/500], Generated SMILES: Oc1cc(cccc(=C)c(CO)1O)C
Epoch[108/500], Generated SMILES: Oc1ccc2ccccccccccc21CC
Epoch[109/500], Generated SMILES: CCc1ncccnc(c1)C
Epoch[110/500], Generated SMILES: C=CC(C(=O)CC=C1C(=O)C1O)
Epoch[111/500], Generated SMILES: CCc1cCc=C(C1CCC)C
Epoch[112/500], Generated SMILES: Cc1cc(C)c(c(cCccc2c1CN(=O)=O)C)Cn2OOOOC
Epoch[113/500], Generated SMILES: Cc1ccc(c(c1[C])N)C
Epoch[114/500], Generated SMILES: Cc1cccc(c1C)
Epoch[115/500], Generated SMILES: CNNC(O)c1ccccccc1
Epoch[116/500], Generated SMILES: Cc1cc(C)cc(c1)NC
Epoch[117/500], Generated SMILES: NCc1ccc(cccccc1)CC
Epoch[118/500], Generated SMILES: OCc1cc(C)ccc1C
Epoch[119/500], Generated SMILES: Cc1cc(C)cc(c1)C(CC(C)C)O
Epoch[120/500], Generated SMILES: C=c1cc(cccc1C)C(C)C
Epoch[121/500], Generated SMILES: OCC1CCc=c(C1)C
Epoch[122/500], Generated SMILES: CCC1Nccccc(C)=C/c(N1)CCCC
Epoch[123/500], Generated SMILES: CC1ccccc(cCc1CC)OCCO
Epoch[124/500], Generated SMILES: CCc1cc(c(cccc1C)C)C
Epoch[125/500], Generated SMILES: CN1C(C)c(C/C)N(C/c1O)C
Epoch[126/500], Generated SMILES: Cc1cc(C)c(cc1)O(C)
Epoch[127/500], Generated SMILES: Cc1cC2c1c1CCc(c1OOCCSO)c(c2)COOC
Epoch[128/500], Generated SMILES: Cc1cccc(c(N1)C=CCNCC)C
Epoch[129/500], Generated SMILES: CCc1ccccc(C1(C))C
Epoch[130/500], Generated SMILES: CCCC(CC)CN=C=CCCCC=CO
Epoch[131/500], Generated SMILES: CC(=CCc=c(c(=CCC)C(C)Cc(c1)CCC)CNOC(O)C)\1
Epoch[132/500], Generated SMILES: COc1cc2c(c1)OCCC(=C)n2
Epoch[133/500], Generated SMILES: Cc1cc(O)cc(c1)C
Epoch[134/500], Generated SMILES: CCCN1C(=O)CC=c1C
Epoch[135/500], Generated SMILES: Cc1ccc(cc(c1c1)OC)C(=O)C1
Epoch[136/500], Generated SMILES: CCc1ccccc(C1C)
Epoch[137/500], Generated SMILES: CCC1CC(=O)C(=C1C)CCC
Epoch[138/500], Generated SMILES: COc1cc(ccc1O)CCCC
Epoch[139/500], Generated SMILES: Cc1cnccccc1C
Epoch[140/500], Generated SMILES: Cc1cc(C)c(c(c1C)C)
Epoch[141/500], Generated SMILES: C/c1cc(Cccccccc1)C
Epoch[142/500], Generated SMILES: CCOc1cc(c(c1C)=C)C(=O)O
Epoch[143/500], Generated SMILES: CCCc1cCCCcccc1C
Epoch[144/500], Generated SMILES: Cc1cc(C)c(c(c1)C)C(CO)C
Epoch[145/500], Generated SMILES: Oc1cc(C)c2c(c1cncncccc2C)C
Epoch[146/500], Generated SMILES: CCc1cc(C(c1OC)CCC)C
Epoch[147/500], Generated SMILES: COc1ccccc(c1(C)CO)C
Epoch[148/500], Generated SMILES: OC(=O)c1c(C)c(c1)C
Epoch[149/500], Generated SMILES: CC(c1cC(ONcc1N)C)=O
Epoch[150/500], Generated SMILES: COc1ccc(c(c1)C)C
Epoch[151/500], Generated SMILES: CCC1cCccN(C=N)(=O)C1CN(CO)\N
Epoch[152/500], Generated SMILES: N=Cc1cc(C)c(c(c1)O)C(=O)OCCOCO
Epoch[153/500], Generated SMILES: O=Cc1ccc(c(c1)C)CC(C)CC
Epoch[154/500], Generated SMILES: Cc1cc(O)c2c(c1Ccccccn2)C
Epoch[155/500], Generated SMILES: CC1CccN(O)CcON(C1=C)CCCS
Epoch[156/500], Generated SMILES: CC(=O)c1c(C)CCC(C)N/C(N1)
Epoch[157/500], Generated SMILES: CCc1nc(=O)C(=c1c1C)Cc1
Epoch[158/500], Generated SMILES: Cc1cc(c(cccc1CCCCCNCN)C)
Epoch[159/500], Generated SMILES: COC1ccC(C)=c1
Epoch[160/500], Generated SMILES: C=N(N(C)C)
Epoch[161/500], Generated SMILES: Cc1ccc(c(c1C)C)C(=O)COC
Epoch[162/500], Generated SMILES: Oc1=c(C)c(c(cc1C(=C)O)CC2)2
Epoch[163/500], Generated SMILES: Cc1cc(ccnc1CCCC1C)cO1
Epoch[164/500], Generated SMILES: Cc1ccccccc1
Epoch[165/500], Generated SMILES: Cc1cccCccnc(C1=C)CC=O
Epoch[166/500], Generated SMILES: O=C(N(C)C1=cCCc(C(C1O)C(=O)C)C)
Epoch[167/500], Generated SMILES: COc1cc(CC=c1c1=N)c1
Epoch[168/500], Generated SMILES: CCCCCCC1CC(=O)C(CC1N)C
Epoch[169/500], Generated SMILES: Cc1cc(O)cnccc1CCCC(=O)OCCOOC
Epoch[170/500], Generated SMILES: CC(NCC[C@H]1CC)c(C1=C)
Epoch[171/500], Generated SMILES: CCCCC[N]CN/C(CN/C)
Epoch[172/500], Generated SMILES: Cc1c(=O)c1
Epoch[173/500], Generated SMILES: COc1cc2c(cccnc1OC)c(cc2OC)CC
Epoch[174/500], Generated SMILES: C/C1C(COC)cccc1CCCC
Epoch[175/500], Generated SMILES: CCc1c(CC)c(c1CCC)COC
Epoch[176/500], Generated SMILES: Cc1ccc(c(n1)C(=O)N)C
Epoch[177/500], Generated SMILES: COc1cc(cc(c2)C(C1=O)COCC2C)CCCCC
Epoch[178/500], Generated SMILES: CC(c1ccc(c(c1)CC)C)CO
Epoch[179/500], Generated SMILES: C=CCOC1cOC(=O)C(Cc1C)C
Epoch[180/500], Generated SMILES: Cc1cc(C)c(c(c1)C)C(=O)
Epoch[181/500], Generated SMILES: CNCCC1c(CC)Ccccccc1
Epoch[182/500], Generated SMILES: Cc1ccc1c(c1)cccc1CCC
Epoch[183/500], Generated SMILES: CCN1CCc1N(C)CC
Epoch[184/500], Generated SMILES: CCc1nc(c(c(c1)C)C)OC
Epoch[185/500], Generated SMILES: CC(c(COnccc(c1C)CC)Cc1C)OOCO
Epoch[186/500], Generated SMILES: CCC1cCc/c(C1)C=N
Epoch[187/500], Generated SMILES: Oc1cc(C)c(c(c1)C)C
Epoch[188/500], Generated SMILES: CCc1cc(C(ccc1)COC)CC
Epoch[189/500], Generated SMILES: CC(=C)c1c(CN)C1C
Epoch[190/500], Generated SMILES: CCc1cc(c(c(c1)C(C)CC)C)O
Epoch[191/500], Generated SMILES: O=Cc1cccCccc(ccc1OCOC)(CC)
Epoch[192/500], Generated SMILES: CC(=C(OOc1cccnccc1C(CC)C)C)C
Epoch[193/500], Generated SMILES: CCc1cc(c(c(c1CC)C)C)
Epoch[194/500], Generated SMILES: C[C]C=C(C)[C@]
Epoch[195/500], Generated SMILES: CCc1nc(COCC=O)(C1CO)O
Epoch[196/500], Generated SMILES: OCc1nc(c(c(c1)C)C)O
Epoch[197/500], Generated SMILES: Cc1cccCc(n1)(NC(C)CO)
Epoch[198/500], Generated SMILES: C/N=C(\Oc1ccc(c(c1)C)C(C)C)
Epoch[199/500], Generated SMILES: CCCCC1(CCCC(Cc1C)C)CO
Epoch[200/500], Generated SMILES: Cc1ccc(c(c1)C)C(C)(C)C
Epoch[201/500], Generated SMILES: OC(=O)c1cc(O)c(c(c1)C=O)C
Epoch[202/500], Generated SMILES: Cc1ccc(c(c1)C)C(=O)
Epoch[203/500], Generated SMILES: OOc1cc(ccc1OC)O
Epoch[204/500], Generated SMILES: Cc1cccCcc(c(c1)C)C(=O)
Epoch[205/500], Generated SMILES: CCc1cccccc(c1)C
Epoch[206/500], Generated SMILES: CCC1Cccccc(=O)C(C)C1C
Epoch[207/500], Generated SMILES: C=CCCc1c(C(=C)C)C1CC
Epoch[208/500], Generated SMILES: O=C(NNC(Cc(c1)c(C)c1OCN=C)(C)O)
Epoch[209/500], Generated SMILES: CC(COC(C(=O)C)=CCCNO)[O]O
Epoch[210/500], Generated SMILES: Cc1cc(C)c(c(c1C(CO)cn2)C)2
Epoch[211/500], Generated SMILES: Cc1ccc(c(c1)C)C
Epoch[212/500], Generated SMILES: CCC1C(COCC(C)C(Cc1OC=O)C)C
Epoch[213/500], Generated SMILES: C/N=CNCO
Epoch[214/500], Generated SMILES: C=C1CCC(=C)C(c(C1CO))C(C2O)CNc(c2)
Epoch[215/500], Generated SMILES: CCc1ccc(ccnc1CC)CC(C)CCOCCCC
Epoch[216/500], Generated SMILES: CNCC(=C)C1ccc(cC1C)COCOC
Epoch[217/500], Generated SMILES: Cc1ccc(c(c1)C)C
Epoch[218/500], Generated SMILES: Cc1ccc(c(c1)C)C(C)C(CO)C
Epoch[219/500], Generated SMILES: COc1cc(CC)cc(cccOcccc2OOCC22C)C21
Epoch[220/500], Generated SMILES: CCc1cc(C(n1)C)CCNCCOO[H]
Epoch[221/500], Generated SMILES: Cc1cc(C)cCc(c1CCOC)O
Epoch[222/500], Generated SMILES: CC1=CC(cc1c1=c2c1cc(CO)ccc2)C
Epoch[223/500], Generated SMILES: C=C1CC(=C(N)C(=C)N)CCCOCO1
Epoch[224/500], Generated SMILES: CCN=C(OCc1cc(C)ccccc1)
Epoch[225/500], Generated SMILES: CC[C]C1=N[C](cC1)O
Epoch[226/500], Generated SMILES: COc1cc(cc(c1O)C(C(CC)C)C)C
Epoch[227/500], Generated SMILES: CCc1cnc(c(cc(c1)C)N)C
Epoch[228/500], Generated SMILES: O=Cc1cc(C)cc(c1C)CO
Epoch[229/500], Generated SMILES: CCc1cC1=C[C]
Epoch[230/500], Generated SMILES: OC[C@H](c[C@]c1O(C))NC(=O)C1=O
Epoch[231/500], Generated SMILES: Cc1cc(C)cc(c1)C
Epoch[232/500], Generated SMILES: Cc1cc(C)c(c(c1CC)O)C(CC)O
Epoch[233/500], Generated SMILES: CCN=C(O)c1(c(Cc(c(c1C)))(C)C)
Epoch[234/500], Generated SMILES: OCC(=O)c1c(=O)c(c(c1O)COC)(C)
Epoch[235/500], Generated SMILES: CC(CO)N1C(C)C(C(C1)O)C
Epoch[236/500], Generated SMILES: COc1cc(c(c(c1)C(C)(C)C)O)C(C)(C)C
Epoch[237/500], Generated SMILES: O=C1C=CC1=C[C](C(=C)C1CC1C)CC
Epoch[238/500], Generated SMILES: Cc1ccc(c(cc1C)=N)
Epoch[239/500], Generated SMILES: CC(=O)c1cc(nc(c1C)C)NC
Epoch[240/500], Generated SMILES: CN1cCc(c2c1ccc2)COC
Epoch[241/500], Generated SMILES: CCc1ncc(cc1O)
Epoch[242/500], Generated SMILES: Cc1cc(C)cc(c1)N(CON(O))
Epoch[243/500], Generated SMILES: C=C1CC(C)N(c1C1O)CcCcc1
Epoch[244/500], Generated SMILES: CC(CO)O
Epoch[245/500], Generated SMILES: Cc1ccc(c(n1)CN)C
Epoch[246/500], Generated SMILES: Cc1ccc(c(c1)C)C(N)C(=N)CC
Epoch[247/500], Generated SMILES: C/C=CN(C1cc(CC(c1O/C(=NCC(C)C)C)C)C)
Epoch[248/500], Generated SMILES: Cc1cc(C)c(cc1C)C
Epoch[249/500], Generated SMILES: CCc1cc(C)c2c(C1O)c(C)C(o2)C
Epoch[250/500], Generated SMILES: CCc1nnc(ccnc1CONCC(C)CC)C
Epoch[251/500], Generated SMILES: COc1cc(C)ccc(C1CC=O)C
Epoch[252/500], Generated SMILES: COc1ccc2c(c1)c(CCNCC=O)c2
Epoch[253/500], Generated SMILES: COc1cc2CC(c1)Onc2
Epoch[254/500], Generated SMILES: CC(=O)c1cc(cc(c1C)N)
Epoch[255/500], Generated SMILES: CCCc1cc(ccccc(C)C(O)C1CC)C
Epoch[256/500], Generated SMILES: CCc1c(CC(c1c1)C(CC1=O)NC)
Epoch[257/500], Generated SMILES: CCc1cccc[cH]CC1CN[C@H][CH]/CC
Epoch[258/500], Generated SMILES: Cc1cccc(c1OCC(C1)N)Ccc1
Epoch[259/500], Generated SMILES: CCCC(CO)C(=c1C)C(=c1OC)C
Epoch[260/500], Generated SMILES: Cc1cc(C)c(c(c1)CC)CC=O
Epoch[261/500], Generated SMILES: COc1cc2(c(=O)ccc2c1=O)
Epoch[262/500], Generated SMILES: Oc1cc(C)c2c(c1)C(=O)cc2O
Epoch[263/500], Generated SMILES: Cc1cC(C(CC(C)CCCNO)CN1)
Epoch[264/500], Generated SMILES: Cc1ccC(cc(c1C)CC)O
Epoch[265/500], Generated SMILES: N#Cc1cc(C)c2c(c1OO)c(c2C)C
Epoch[266/500], Generated SMILES: CC1cncc(OC(C)C(c1OOC)C)
Epoch[267/500], Generated SMILES: Cc1ccc2c(c(cc(O)c1c2COC)NONCOO)
Epoch[268/500], Generated SMILES: Cc1cccccc(c1)C(C)NC
Epoch[269/500], Generated SMILES: CCCCOC1CN=C=cC(C(C1)CC(COCO)C1)C1OC
Epoch[270/500], Generated SMILES: COCc1ccc(CN)C=Ccc1C
Epoch[271/500], Generated SMILES: CCc1cc(C)c(cc1C)CCN
Epoch[272/500], Generated SMILES: Cc1cc(N)ccccc1C=O
Epoch[273/500], Generated SMILES: COc1ccc2c(c(c1)C)C(c2)
Epoch[274/500], Generated SMILES: Cc1nc(CCccccc1CCNC)
Epoch[275/500], Generated SMILES: COc1cc2c(c(=N)C2cc1)O
Epoch[276/500], Generated SMILES: OCc1cc(=O)c(co1)OOCCCC
Epoch[277/500], Generated SMILES: Cc1cc(Cc(c1c1C)O)1
Epoch[278/500], Generated SMILES: CCc1cc(C)(c(c1)OCCO)COCOC
Epoch[279/500], Generated SMILES: OCc1cc(C)cc(c1OCCCN=O)OC
Epoch[280/500], Generated SMILES: CC(Cc1ccc2ccc1CC)Cc(C(c2COCCCO)C)CO(S)
Epoch[281/500], Generated SMILES: CCc1ccc(c(n1)CC)CC
Epoch[282/500], Generated SMILES: CCc1nCc=c(cc(C1CCC(C)CC2))/O2
Epoch[283/500], Generated SMILES: Cc1cc(C)cc(c1)C(C)(C)CC=C/NCC(C)CC
Epoch[284/500], Generated SMILES: CCc1c(C)cnnc1N(CC)
Epoch[285/500], Generated SMILES: Cc1ccC(cc(c1ON)C1)C1C[2C]
Epoch[286/500], Generated SMILES: Cc1ccccc(cc1O)C(C)
Epoch[287/500], Generated SMILES: OCN=CCCN=C(C)CC(CC)CNC(O)O
Epoch[288/500], Generated SMILES: C/N=C(/Oc1ccc(c(c1)N(C)C)ON/C)C
Epoch[289/500], Generated SMILES: CCc1cc(Cccccc1OCCC(=O)C)C
Epoch[290/500], Generated SMILES: C=CCC1cC(CC)C(CC1(C)C)OC
Epoch[291/500], Generated SMILES: Cc1ccc(c(c1)C)NO
Epoch[292/500], Generated SMILES: OCc1c(C)c(c(cc1C)CC=O)CC
Epoch[293/500], Generated SMILES: C/N=C(\Oc1ccc(c=c1)CCSN(/OOC))O
Epoch[294/500], Generated SMILES: CC1=CC(=O)c(=CCc1O)C
Epoch[295/500], Generated SMILES: COc1cc(cc(c1OCc(c(C1)C(C)(C)C)CC1)/[NH])C
Epoch[296/500], Generated SMILES: CCC1c(C)Cc(c1)N
Epoch[297/500], Generated SMILES: CCc1nccccc(c1)
Epoch[298/500], Generated SMILES: COc1cc2cc11CC(=NNN1Cc1)ccc12
Epoch[299/500], Generated SMILES: CC1cN=C(ON[C]=1CN=C)
Epoch[300/500], Generated SMILES: C=Cc1=C(OC(c1cNcccc1)1O)(C)
Epoch[301/500], Generated SMILES: CCc1=C(O)c(c1CCO)C(C)C
Epoch[302/500], Generated SMILES: O=CNCC(=O)C
Epoch[303/500], Generated SMILES: CCN=C(O)c1cc(C)cCC1C(=O)
Epoch[304/500], Generated SMILES: Cc1ccc(c(cC2cN1C(/O))C)2
Epoch[305/500], Generated SMILES: Cc1ccccc(c1ccc2O)C2
Epoch[306/500], Generated SMILES: CCOc1cc(c(coccc1)CC)COCC
Epoch[307/500], Generated SMILES: O=C1Nc(C)Cnc1
Epoch[308/500], Generated SMILES: CC(c1cccCccc(c1)C)C
Epoch[309/500], Generated SMILES: CC(C1cc(C(=c(Oc1CcN1C)O)C)CO)OCnN1
Epoch[310/500], Generated SMILES: Cc(cccccc(c1)N=C)C(=O)N(C1COOC)CN
Epoch[311/500], Generated SMILES: OCCc1cOCCcc(cc1O)C
Epoch[312/500], Generated SMILES: Cc1cc(C)c(c(c1)C)CC=O
Epoch[313/500], Generated SMILES: COc1cc(C)ccc1C(CC)C
Epoch[314/500], Generated SMILES: OCCc1cc(cccc(c1O)CCC)CCCC/CC
Epoch[315/500], Generated SMILES: COc1cc2CCN([C@H](c2cc1O)CCCCC)
Epoch[316/500], Generated SMILES: Cc1ccnc(ccnc(c1)C)
Epoch[317/500], Generated SMILES: CCc1cc(cc(c1O)C(CO)C)C
Epoch[318/500], Generated SMILES: OOc1cccc(c1O)C
Epoch[319/500], Generated SMILES: C=Cc1c(C)ccc(c1C)C
Epoch[320/500], Generated SMILES: CN(C[N]CC=N[C](N/C(CNC(C(C)/N1)CCC)O)OC1)C
Epoch[321/500], Generated SMILES: C/N=C(\Oc1cc(ccc1C)C(CC))
Epoch[322/500], Generated SMILES: Cc1cc(C)c2c(c1)Ccc2C
Epoch[323/500], Generated SMILES: CC(c1cc(C(=C)O)c(cc1C)C)C
Epoch[324/500], Generated SMILES: CCc1ccc(cc(C)CCN=C1C)C
Epoch[325/500], Generated SMILES: Cc1ccCCOCCC(c1C)
Epoch[326/500], Generated SMILES: O=C1cc(cccc1C)CCC
Epoch[327/500], Generated SMILES: C/CCC1NCccccc(c1)O
Epoch[328/500], Generated SMILES: Cc1ccccc(c(c1)C)CCOC
Epoch[329/500], Generated SMILES: COc1cc(cc(c1O))CCCCC
Epoch[330/500], Generated SMILES: CCc1c(C)Cc(c1CC)C
Epoch[331/500], Generated SMILES: Cc1ccccc(c(c1)C(C)CC)CCCO
Epoch[332/500], Generated SMILES: OCc1cc(ccc1)C(OCCOC=O)CCC/OC\C
Epoch[333/500], Generated SMILES: Oc1ccc(c(ccc1)CCC)OCO
Epoch[334/500], Generated SMILES: O[C@@H]C1CC(O)(C1)C
Epoch[335/500], Generated SMILES: C=Cc1ccCCccc(c1CCC)=O
Epoch[336/500], Generated SMILES: Cc1ccc(c(c1)O)C(C)(CCCC)
Epoch[337/500], Generated SMILES: CNc1ccc(Cccc(cCCCcc1O)O)C
Epoch[338/500], Generated SMILES: Cc1ccc(c(c1)O)CN
Epoch[339/500], Generated SMILES: OCc1cc2c(c1c(C1)ccccc2=O)C(C)1
Epoch[340/500], Generated SMILES: Cc1cc2cnocccc1c2CC
Epoch[341/500], Generated SMILES: C=Cc1ccc(cccc(cc1C)OC=O)
Epoch[342/500], Generated SMILES: Cc1cc(C)c(cc1C)
Epoch[343/500], Generated SMILES: O=C1Ncc1C
Epoch[344/500], Generated SMILES: CCc1cnc(c(c1C))CCCCO
Epoch[345/500], Generated SMILES: O/N=C(\Oc1cc(cc(c1)C(CO)=O)C)CO/OOO
Epoch[346/500], Generated SMILES: Cc1=CC(=O)C1=N
Epoch[347/500], Generated SMILES: OC[N]C(=C[C]=C/C(C)/CCC)N
Epoch[348/500], Generated SMILES: CCCc1c(COcccc1C)(CO)
Epoch[349/500], Generated SMILES: C/N=C(\Oc1ccc(cc1C)C)C
Epoch[350/500], Generated SMILES: OCC1CCcOC(=O)C1C(=O)
Epoch[351/500], Generated SMILES: C=c1cc(C)ccncc(c1OC)C
Epoch[352/500], Generated SMILES: Cc1cc(Cc(c(c1)C)C)COC[CH]
Epoch[353/500], Generated SMILES: Cc1ccc(c(c1)C)C(=O)O
Epoch[354/500], Generated SMILES: Nc1ccccOccc(c(c1)C)C
Epoch[355/500], Generated SMILES: CC1ccc(C)nc(c1CC)O
Epoch[356/500], Generated SMILES: CCN=C(\Oc1/c(=C)cc(c1)C)O
Epoch[357/500], Generated SMILES: CCc1cc(c(cc1N(C)C)CCC)
Epoch[358/500], Generated SMILES: O/N=C(COCCOc1(c(n1)C(C)C)CO)OCC
Epoch[359/500], Generated SMILES: Cc1cc(Cncc(=c1c1)C)cNc1
Epoch[360/500], Generated SMILES: Cc1cccc(c1CCCCC)N
Epoch[361/500], Generated SMILES: C/N=C(\Oc1ccc(c(c1)C(C/C)CN/C)C)/OO
Epoch[362/500], Generated SMILES: CCCCOC1C(CC)cc(c1(C)C)O
Epoch[363/500], Generated SMILES: Cc1ccc(c(n1)C(CC)C)C
Epoch[364/500], Generated SMILES: COc1cc(C)ccc(c1=O)(C)
Epoch[365/500], Generated SMILES: CCc1cc(=O)c(=C)C1
Epoch[366/500], Generated SMILES: COc1ccc(cc1C)CC
Epoch[367/500], Generated SMILES: Cc1cc(C)c(c(c1)C)C
Epoch[368/500], Generated SMILES: Cc1cc(cc(c1C)C(=O)CO)NNC
Epoch[369/500], Generated SMILES: CC(CO)OC1CccOnC(C)cccc1CO
Epoch[370/500], Generated SMILES: CCc1cc(cc(c1c1C)ncccccc1)C
Epoch[371/500], Generated SMILES: CC(CCC(=O)C1=CC1=O)O
Epoch[372/500], Generated SMILES: Cc1c(C)c(c1)O
Epoch[373/500], Generated SMILES: Cc1cc(C)cc(c1)C(C)(C)CC
Epoch[374/500], Generated SMILES: Cc1cccc(c1)
Epoch[375/500], Generated SMILES: CCc1ccc1
Epoch[376/500], Generated SMILES: COc1cc(cc(C1C)=O)OC(C)CCOC
Epoch[377/500], Generated SMILES: Cc1ccc(c(c1)C)
Epoch[378/500], Generated SMILES: Cc1cc(C)c(c(c1c(c2CC)C)COC2C)OC
Epoch[379/500], Generated SMILES: CC1=C(\Oc1Cc(cCc1C)c1)O
Epoch[380/500], Generated SMILES: CC(C1c(C(=O)c(C(c1)O)C)C)CC
Epoch[381/500], Generated SMILES: CCc1cnc(c(n1)CC)CCNC
Epoch[382/500], Generated SMILES: CCCC1=c(O)c1(C)CC
Epoch[383/500], Generated SMILES: Cc1ccc(c(c1)CNC=N)CC
Epoch[384/500], Generated SMILES: CCN=C(\Oc1ccccccc1OC)C
Epoch[385/500], Generated SMILES: Oc1cc(C)c(c(c1)O)OO
Epoch[386/500], Generated SMILES: CCC1cc(cc(C1CN=C)CC=O)
Epoch[387/500], Generated SMILES: Cc1cc(C)nc(c1)N(C)CN
Epoch[388/500], Generated SMILES: Cc1cccCcc(c(c1)C)C(=O)
Epoch[389/500], Generated SMILES: Cc1cc(c(cc1O)C(C1cC1=C)C)
Epoch[390/500], Generated SMILES: CC(c1ccc(c(c1O)C(=O)O)O)C
Epoch[391/500], Generated SMILES: CC1=C(c1C(=O))
Epoch[392/500], Generated SMILES: Cc1ccc(cc1CCCNCN)
Epoch[393/500], Generated SMILES: CCC1OCc=c(c1c1C)Ccccc1
Epoch[394/500], Generated SMILES: CCc1ccccc(cc1C)C
Epoch[395/500], Generated SMILES: CC(Cc1ccc2cc(c1c(=C)Cc2)C)CNO
Epoch[396/500], Generated SMILES: Cc1cc(ccc(c(c1)C)C)
Epoch[397/500], Generated SMILES: CCc1cc(=C)c/c(C1)
Epoch[398/500], Generated SMILES: Cc1cc(C)cc(c1)C(C)C
Epoch[399/500], Generated SMILES: Cc1cc(c(c1)2)c(OOOOC=C)C2=O
Epoch[400/500], Generated SMILES: COc1cccC(N)CN(C1C)CC=O
Epoch[401/500], Generated SMILES: Cc1ccc(cc1C)CC
Epoch[402/500], Generated SMILES: CCc1cc(OC)ccc1C
Epoch[403/500], Generated SMILES: Oc1ccnc(c(C1C)C)
Epoch[404/500], Generated SMILES: OCc1ccc(c(c1)C)OO
Epoch[405/500], Generated SMILES: Cc1ccc2c(c1)cc=c(=O)C2
Epoch[406/500], Generated SMILES: C[C@H]1CCC(Cc(c1O)CCC)
Epoch[407/500], Generated SMILES: ONCc12c(ccc2cc1CC)O
Epoch[408/500], Generated SMILES: CC(=O)c1cc(Cccc1)CCC
Epoch[409/500], Generated SMILES: Cc1cc(C)cc(c1)C[C@@]
Epoch[410/500], Generated SMILES: CC(C1=CCCOC)C(c(C1)OC)(C=O)OCCC
Epoch[411/500], Generated SMILES: O=C1c(CC(=O)C(=C)C(C)c1)C
Epoch[412/500], Generated SMILES: Cc1ccccccc1C
Epoch[413/500], Generated SMILES: Cc1cc(O)c(nn1)OCO
Epoch[414/500], Generated SMILES: COc1cc(C)C(c1CC)CCOCc2CO2
Epoch[415/500], Generated SMILES: CC1CCc(ccc1CCCC=C)C(C)
Epoch[416/500], Generated SMILES: OC1cc(C)c(c1C)OC
Epoch[417/500], Generated SMILES: CC(c1cc(c(c(c1CC)C)C(CC)=O)CCC)C
Epoch[418/500], Generated SMILES: Oc1ccc(c(c1)C(CC)O)OOC(CO)
Epoch[419/500], Generated SMILES: Cc1cc(N(=O)=O)ccccc1C
Epoch[420/500], Generated SMILES: COc1cccc(c1C)OCCC
Epoch[421/500], Generated SMILES: OC(=N)c1cc(cc(c1)C)C
Epoch[422/500], Generated SMILES: C=C1cc(CC)cc(cc1C)N
Epoch[423/500], Generated SMILES: CC1ccc(cccc(c1)O)O
Epoch[424/500], Generated SMILES: C=Cc1cc(C)cc(c1)
Epoch[425/500], Generated SMILES: OCCc1cc(cccc(c1O)C)C
Epoch[426/500], Generated SMILES: CC1cC(O)C(=O)C(=O)O1
Epoch[427/500], Generated SMILES: CCc1cc(=c(C1C)OCNC1=cCC1CC1CCC)C1CC
Epoch[428/500], Generated SMILES: OOc1cc(C)cccnc1CC
Epoch[429/500], Generated SMILES: OCc1cc(cc(c1C))CCCC=O
Epoch[430/500], Generated SMILES: CCc1cc(C)cc(c1)C
Epoch[431/500], Generated SMILES: Cc1cc(C)c2c(c1)cc2C
Epoch[432/500], Generated SMILES: CC(=CCCOCc(c1O)O)C(C1)O
Epoch[433/500], Generated SMILES: CCc1cc(=ccc(c1C)C)(CNC)OC
Epoch[434/500], Generated SMILES: CCCC(CO)C1=c(c(c1C)OCCO)OOO[N]C[NH]
Epoch[435/500], Generated SMILES: CCc1cc(cccc(c1C)CO)C
Epoch[436/500], Generated SMILES: COc1ccc2c(C1)cc1C2ccc1
Epoch[437/500], Generated SMILES: Cc1ccc(cc(c(c1)C1C(=O)C)C)CO1
Epoch[438/500], Generated SMILES: C/NCC(\Oc1ccc(c(cc(c1C)C(C))CCC)CC)OCCC
Epoch[439/500], Generated SMILES: CCc1cnc(c(c1C)NC)(CO)
Epoch[440/500], Generated SMILES: CCc1ccc2c(O1)c(C)C(C)CC2C
Epoch[441/500], Generated SMILES: Cc1ccc2c(c1)cnccc(c2C)CCO
Epoch[442/500], Generated SMILES: CCCc1cc(C)c(C)C(=O)O(C1)
Epoch[443/500], Generated SMILES: Cc1cCcOccc(c1)C(C)C
Epoch[444/500], Generated SMILES: Cc1ccccccc1O
Epoch[445/500], Generated SMILES: Cc1ccc2c(c1)nccccc2c1c(=O)s1C
Epoch[446/500], Generated SMILES: CC1=C(C1OC(C=CC=C)CC=C)C
Epoch[447/500], Generated SMILES: O=C(N(C(C)Oc1)c(Cc1C))
Epoch[448/500], Generated SMILES: COc1ccC2c(c1)c(C)Nc2
Epoch[449/500], Generated SMILES: CCc1ccc=c(c(c1))CCC(=O)
Epoch[450/500], Generated SMILES: CCc1cc(Cccccc1C)CC
Epoch[451/500], Generated SMILES: C[C]=C1C=CC(N(C1)CCC)O
Epoch[452/500], Generated SMILES: CCc1cc(COccnc(c1C)CC)C(CO)N(C)CO/C
Epoch[453/500], Generated SMILES: CCc1ccc(c(NC)Cccc1)CCCC
Epoch[454/500], Generated SMILES: CC1cc(C)Cnc(c(c1)C)OOCCCOOOC=O
Epoch[455/500], Generated SMILES: Cc1=Cc(ccc1OC)C(=O)O
Epoch[456/500], Generated SMILES: COc1nc(c(N1)C)C
Epoch[457/500], Generated SMILES: CCCCCCCCCNCCC(=CCC)CC
Epoch[458/500], Generated SMILES: O/N=C(\Oc1cc(C)c(c(c1)CO)C(C)C)CO
Epoch[459/500], Generated SMILES: Cc1c(C(c(c1O)C(C)=O)CCN(=O)=O)CCC
Epoch[460/500], Generated SMILES: Cc1cc(O)nc(c1)C(CO)O
Epoch[461/500], Generated SMILES: C=C1cc(c(c1CC(=O))CC)N[CH]
Epoch[462/500], Generated SMILES: COc1cc(cc(c1OC)CN)
Epoch[463/500], Generated SMILES: C=Cc1ccccC/Cc(c(c1)OC)OCO
Epoch[464/500], Generated SMILES: OCC1c(C(cN(c1)C(C)CCCCNO)/NC)CC
Epoch[465/500], Generated SMILES: Cc1c(C)c(c1C)
Epoch[466/500], Generated SMILES: CNCCC(\Oc1ccc1cc(C)c(c2C)C)2C
Epoch[467/500], Generated SMILES: CCCc1cc(c(cC(=c1)CCOC)ONN)CCC/OC
Epoch[468/500], Generated SMILES: COc1cc(C)cc(c1)CCOCC
Epoch[469/500], Generated SMILES: COC(=O)c1cc(N(=O)=O)c(cc1C)NC
Epoch[470/500], Generated SMILES: Oc1cc(c(cnc(C)c(c(c1CNCCO))O)C)CCCCC
Epoch[471/500], Generated SMILES: CCC1cc(C(=N)c(C(c1)C))OCO
Epoch[472/500], Generated SMILES: COC(=COc1c/C(=c1C)CC)
Epoch[473/500], Generated SMILES: CNCCC(O)c1cccc2cc(c1C)(C2)
Epoch[474/500], Generated SMILES: CC[C@H]([C]c1CCN=C/C)/C1
Epoch[475/500], Generated SMILES: Oc1c(C)cc(c(c1CCO)C)C
Epoch[476/500], Generated SMILES: CC(=O)c1c(C)(c1C(CC)C(C)C)CCC
Epoch[477/500], Generated SMILES: Cc1cc(C)c(c(c1)C)O
Epoch[478/500], Generated SMILES: CCc1ccC(C)c1
Epoch[479/500], Generated SMILES: C=c1cc(C(=N[C]NC)N/CCN(C=C))C1=C
Epoch[480/500], Generated SMILES: OCNc1cc(ccncccc(c(cCC1C)/C)C)/O
Epoch[481/500], Generated SMILES: OCC1c(O)C(=c(C1)C)S
Epoch[482/500], Generated SMILES: CCc1c(cc(c(c1)O)C)CNC
Epoch[483/500], Generated SMILES: CC(c1nc(oC(C)ccnc1CC)=O)=O
Epoch[484/500], Generated SMILES: CCc1cccnc2c(c1)CCcc2
Epoch[485/500], Generated SMILES: CCC1(nc(C(=O)c(CNc(O)1C)CO)CC)CCN
Epoch[486/500], Generated SMILES: C=CCC1c(C)cc(c1CC)O
Epoch[487/500], Generated SMILES: CC1C(=O)C(=C(C)c(=c1CC))O
Epoch[488/500], Generated SMILES: Cc1cc(C)c(c(c1)C)C(=O)C
Epoch[489/500], Generated SMILES: C/N=C(\OC1=C)(c(C1)N(CNC)/O)
Epoch[490/500], Generated SMILES: OCCc1ccc(c(c1)C)CC(N)CO
Epoch[491/500], Generated SMILES: OOc1c(C(c(OO)N(C)1=O)CC)C
Epoch[492/500], Generated SMILES: OCc1ccc(c(=O)C(COCO)CCC2C1)CCO2
Epoch[493/500], Generated SMILES: CCc1cc(cc(c1)C)C
Epoch[494/500], Generated SMILES: CCC1c(C)c(cc1C)CCCC=O
Epoch[495/500], Generated SMILES: CCCC1nc(CC(c(CN1O))O)CO
Epoch[496/500], Generated SMILES: Cc1cc(c(nc(c1)C(C)CC)CC)=C
Epoch[497/500], Generated SMILES: C/C1C/c2c(cccc2C)O(Cc1)
Epoch[498/500], Generated SMILES: CCc1c(C)cc(c(cC1O)CC)C
Epoch[499/500], Generated SMILES: CC(=O)c1c(C)cccc1COOOCC
Epoch[500/500], Generated SMILES: Cc1ccc(c(c1)C(=O)c1)CCcCO1

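Early in training only simple SMILES strings are generated, but in the later epochs the model seems to produce more complex ones... maybe? (This is a subjective impression.)

Just to be clear, no consideration at all is given to whether these are good molecules (molecules with the desired properties).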
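One quick way to look at the generated strings without judging the chemistry is a rough syntax check. The sketch below is not part of the original notebook: the helper name `looks_well_formed` is hypothetical, and it only tests that parentheses balance, that bracket atoms are closed, and that every single-digit ring-closure label is opened and closed. It says nothing about valence or aromaticity, so a cheminformatics toolkit would still be needed for a real validity check.

```python
def looks_well_formed(smiles: str) -> bool:
    """Crude SMILES syntax check (hypothetical helper).

    Checks balanced parentheses, closed [...] atoms, and paired
    single-digit ring-closure labels. Ignores valence, aromaticity,
    and two-digit (%nn) ring labels.
    """
    depth = 0            # current nesting depth of branches "(...)"
    open_rings = set()   # ring-closure digits seen an odd number of times
    in_bracket = False   # inside "[...]", digits are not ring closures
    for ch in smiles:
        if ch == "[":
            if in_bracket:
                return False
            in_bracket = True
        elif ch == "]":
            if not in_bracket:
                return False
            in_bracket = False
        elif in_bracket:
            continue
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        elif ch.isdigit():
            # The first occurrence opens a ring, the second closes it.
            if ch in open_rings:
                open_rings.remove(ch)
            else:
                open_rings.add(ch)
    return depth == 0 and not open_rings and not in_bracket


# Two strings taken from the log above, plus a deliberately truncated one.
for s in ["Cc1ccc(c(c1)O)CN", "C[C@H]1CCC(Cc(c1O)CCC)", "Cc1ccc(C"]:
    print(s, looks_well_formed(s))
```

A string that passes this filter can still be chemically meaningless, so the check is only useful for spotting obviously broken output.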

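Learning curves

The training progress is shown below. The left and right columns plot the same values: the left uses a linear scale and the right a logarithmic scale.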

fig, axes = plt.subplots(nrows=3, ncols=2, figsize=(8,8))
plot_data = [losses, reconst_losses, kl_divs]
legends = ["Loss", "Reconst Loss", "KL div"]
for i in range(3):
    axes[i][0].plot(plot_data[i], label=legends[i], alpha=0.8)
    axes[i][0].grid()
    axes[i][0].legend()
    axes[i][1].plot(plot_data[i], label=legends[i], alpha=0.8)
    axes[i][1].grid()
    axes[i][1].legend()
    axes[i][1].set_yscale('log')
axes[2][0].set_xlabel("Steps")
axes[2][1].set_xlabel("Steps")
plt.show()

VAEとGANで分子生成入門_50_0.png
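As a reading aid for the "KL div" panel, here is a minimal sketch of the quantity such a curve usually tracks. It assumes the standard VAE setup (a diagonal Gaussian posterior and a standard normal prior); the training loop defined earlier in the article may scale or average the term differently, and the function name `gaussian_kl` is hypothetical.

```python
import math


def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions.

    `mu` and `logvar` are plain Python sequences of equal length.
    """
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar)
    )


# The posterior equals the prior, so the divergence is zero.
print(gaussian_kl([0.0, 0.0], [0.0, 0.0]))
# Shifting one mean away from zero makes the divergence positive.
print(gaussian_kl([1.0], [0.0]))
```

Under this assumption, a KL curve that rises while the reconstruction loss falls would mean the encoder is moving away from the prior in exchange for better reconstructions.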

GAN (Generative Adversarial Network)

Next up is the GAN. Let's charge right ahead.

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

LEARNING_RATE = 1e-6
NUM_EPOCHS = 30000
BATCH_SIZE = 256
N_INPUT = len(vocab)*SMILES_MAXLEN
N_HIDDEN = 100
N_LATENT = 100

GAN model

The generator, which produces the forgeries, is designed as follows.

G = torch.nn.Sequential(
    torch.nn.Linear(N_LATENT, N_HIDDEN),
    torch.nn.ReLU(),
    torch.nn.Linear(N_HIDDEN, N_HIDDEN),
    torch.nn.ReLU(),
    torch.nn.Linear(N_HIDDEN, N_INPUT),
    torch.nn.Tanh())

G = G.to(device)

The discriminator, which distinguishes the forgeries produced by the generator from the real samples in the dataset, is designed as follows.

D = torch.nn.Sequential(
    torch.nn.Linear(N_INPUT, N_HIDDEN),
    torch.nn.LeakyReLU(0.2),
    torch.nn.Linear(N_HIDDEN, N_HIDDEN),
    torch.nn.LeakyReLU(0.2),
    torch.nn.Linear(N_HIDDEN, 1),
    torch.nn.Sigmoid())

D = D.to(device)

Next, set up the loss function and the optimizers for both networks.

criterion = torch.nn.BCELoss()
d_optimizer = torch.optim.Adam(D.parameters(), lr=LEARNING_RATE)
g_optimizer = torch.optim.Adam(G.parameters(), lr=LEARNING_RATE)
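To make the loss values easier to interpret, the following sketch evaluates binary cross-entropy by hand for single scalar discriminator outputs. It is plain Python rather than PyTorch, the helper names `bce` and `gan_losses` are hypothetical, and the `eps` clamp is a simplification (`torch.nn.BCELoss` instead clamps its log terms at -100).

```python
import math


def bce(p, target):
    """Binary cross-entropy for one predicted probability p and a 0/1 target."""
    eps = 1e-12  # keep log() finite at p = 0 or p = 1 (simplifying assumption)
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def gan_losses(d_real, d_fake):
    """Return (d_loss, g_loss) for scalar discriminator outputs.

    d_real is D(x) on a real sample, d_fake is D(G(z)) on a generated one.
    """
    d_loss = bce(d_real, 1) + bce(d_fake, 0)  # real -> 1, fake -> 0
    g_loss = bce(d_fake, 1)                   # the generator wants fake -> 1
    return d_loss, g_loss


# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
d_loss, g_loss = gan_losses(0.5, 0.5)
print(d_loss, 2 * math.log(2))
print(g_loss, math.log(2))
```

The undecided case gives a discriminator loss of 2 ln 2 and a generator loss of ln 2, which are convenient reference levels when reading the D loss and G loss curves.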

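Training

Here the discriminator is trained about half as often as the generator: its backward pass runs only on even-numbered batches (`i % 2 == 0`), while the generator is updated on every batch.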

d_losses = []
g_losses = []
real_scores = []
fake_scores = []

total_step = len(data_loader)
for epoch in range(NUM_EPOCHS):
    for i, (data, _) in enumerate(data_loader):

        # Train the discriminator
        outputs = D(data)
        real_labels = torch.ones(outputs.shape[0], 1).to(device)
        d_loss_real = criterion(outputs, real_labels)
        real_score = outputs

        z = torch.randn(BATCH_SIZE, N_LATENT).to(device)
        fake_data = G(z)
        # detach(): the generator needs no gradients from the discriminator update
        outputs = D(fake_data.detach())
        fake_labels = torch.zeros(outputs.shape[0], 1).to(device)
        d_loss_fake = criterion(outputs, fake_labels)
        fake_score = outputs

        d_loss = d_loss_real + d_loss_fake
        if (i%2) == 0:
            # Update D only on even-numbered batches; skipping step() as well
            # keeps Adam's running averages from moving D on the other batches
            d_optimizer.zero_grad()
            d_loss.backward()
            d_optimizer.step()

        # Train the generator
        z = torch.randn(BATCH_SIZE, N_LATENT).to(device)
        fake_data = G(z)
        outputs = D(fake_data)

        # The generator is rewarded when D labels its samples as real
        real_labels = torch.ones(outputs.shape[0], 1).to(device)
        g_loss = criterion(outputs, real_labels)

        d_optimizer.zero_grad()
        g_optimizer.zero_grad()
        g_loss.backward()
        g_optimizer.step()

        # Record the results
        d_losses.append(d_loss.item())
        g_losses.append(g_loss.item())
        real_scores.append(real_score.mean().item())
        fake_scores.append(fake_score.mean().item())
        if (i+1) % 100 == 0 and (epoch+1) % 10 == 0:
            print('Epoch [{}/{}], Step [{}/{}], d_loss: {:.4f}, g_loss: {:.4f}, D(x): {:.4f}, D(G(z)): {:.4f}' 
                  .format(epoch+1, NUM_EPOCHS, i+1, total_step, d_loss.item(), g_loss.item(), 
                          real_score.mean().item(), fake_score.mean().item()))
    
    if (epoch+1) % 100 == 0:
        print("Epoch[{}/{}], Generated SMILES: {}".format(
            epoch+1, NUM_EPOCHS, get_best_smile(fake_data)))
Epoch[100/30000], Generated SMILES: S=CO
Epoch[200/30000], Generated SMILES: CC
Epoch[300/30000], Generated SMILES: S=C
Epoch[400/30000], Generated SMILES: SCC
Epoch[500/30000], Generated SMILES: CC
Epoch[600/30000], Generated SMILES: O=S
Epoch[700/30000], Generated SMILES: CC1-cCP1
Epoch[800/30000], Generated SMILES: CC1oCC1CC
Epoch[900/30000], Generated SMILES: CC1(ccCCc=1)
Epoch[1000/30000], Generated SMILES: CC1(C)nC1=SC
Epoch[1100/30000], Generated SMILES: CC1ccccOCC1
Epoch[1200/30000], Generated SMILES: CC1=ccC(c1CC)
Epoch[1300/30000], Generated SMILES: CC1=CccP1CCP
Epoch[1400/30000], Generated SMILES: CC1=ccccccCPc1
Epoch[1500/30000], Generated SMILES: CC1=ccC(c1CC)
Epoch[1600/30000], Generated SMILES: CCc1cccOcc1
Epoch[1700/30000], Generated SMILES: CC1=cccccc1
Epoch[1800/30000], Generated SMILES: CC1cccc=cc1C
Epoch[1900/30000], Generated SMILES: CC1=ccc/cc1=C
Epoch[2000/30000], Generated SMILES: CCC1ccccccCc1CCC
Epoch[2100/30000], Generated SMILES: CC1cccc/ccNc1
Epoch[2200/30000], Generated SMILES: CC1Ccccccc1
Epoch[2300/30000], Generated SMILES: CCCC
Epoch[2400/30000], Generated SMILES: CCc1cccccc1=C
Epoch[2500/30000], Generated SMILES: CC1OccccccN1CCCCC
Epoch[2600/30000], Generated SMILES: CC1/cccccc1
Epoch[2700/30000], Generated SMILES: CC1ccc(cccNcCCCcCCCcCCc1)O
Epoch[2800/30000], Generated SMILES: CC1ccc(ccc1)S
Epoch[2900/30000], Generated SMILES: CC1ncc(cccccCCc1)
Epoch[3000/30000], Generated SMILES: CC1Cc1
Epoch[3100/30000], Generated SMILES: CC1ccc(cccccC1)
Epoch[3200/30000], Generated SMILES: COC1cc(ccccNCCc1)
Epoch[3300/30000], Generated SMILES: CC1cc1
Epoch[3400/30000], Generated SMILES: CCCC
Epoch[3500/30000], Generated SMILES: C\1ccc(ccccc1)CC
Epoch[3600/30000], Generated SMILES: CCC1ccCcccccc1
Epoch[3700/30000], Generated SMILES: CCPC
Epoch[3800/30000], Generated SMILES: C=1ccccc1C
Epoch[3900/30000], Generated SMILES: CCC1ccCcc1
Epoch[4000/30000], Generated SMILES: CC1ccccO1
Epoch[4100/30000], Generated SMILES: CCCC
Epoch[4200/30000], Generated SMILES: CC1ccc(cS(cc1))
Epoch[4300/30000], Generated SMILES: Cc/1ccccccc1
Epoch[4400/30000], Generated SMILES: CCc1cc\SnCc1
Epoch[4500/30000], Generated SMILES: CC1ccccccCcC1
Epoch[4600/30000], Generated SMILES: Cc/1ccccc1
Epoch[4700/30000], Generated SMILES: CCc1cccc2ccccCC1Oc2
Epoch[4800/30000], Generated SMILES: CC1ccccCc=ccOc1=CC
Epoch[4900/30000], Generated SMILES: CC1ccccccccccCCsCCOc1C
Epoch[5000/30000], Generated SMILES: CC1ccccccCcc1
Epoch[5100/30000], Generated SMILES: CC1cccc\cc1
Epoch[5200/30000], Generated SMILES: CC1cccccccc\1C
Epoch[5300/30000], Generated SMILES: CC1(ccccccc(c\ccCCPC1C))
Epoch[5400/30000], Generated SMILES: Cc1ccccccccc1N
Epoch[5500/30000], Generated SMILES: Cc1ccc2ccc1cccCCC2CO\CCC
Epoch[5600/30000], Generated SMILES: COCC
Epoch[5700/30000], Generated SMILES: CC1ccc(ccc1)
Epoch[5800/30000], Generated SMILES: Cc1ccc(ccnc1)CC
Epoch[5900/30000], Generated SMILES: CC1CccccccccCc1
Epoch[6000/30000], Generated SMILES: CC1=ccccccccC1
Epoch[6100/30000], Generated SMILES: CC1cc-Ccccccc1
Epoch[6200/30000], Generated SMILES: CC1cc(C)cc1
Epoch[6300/30000], Generated SMILES: CC1c=c(cccccC1)
Epoch[6400/30000], Generated SMILES: CCC1\cccc1
Epoch[6500/30000], Generated SMILES: CCC1cccccc1
Epoch[6600/30000], Generated SMILES: CCc1cCcC1
Epoch[6700/30000], Generated SMILES: CCC1cccc1
Epoch[6800/30000], Generated SMILES: CCc1ccccNc1
Epoch[6900/30000], Generated SMILES: CCC
Epoch[7000/30000], Generated SMILES: Cc1cccccSc\c1
Epoch[7100/30000], Generated SMILES: CC1ccccC\c(ccc1C)
Epoch[7200/30000], Generated SMILES: Cc1ccc2c#cNcs21
Epoch[7300/30000], Generated SMILES: CC1cccccccnccnc1C
Epoch[7400/30000], Generated SMILES: CCC1ccCcccnccSC1C
Epoch[7500/30000], Generated SMILES: CCC1ccC=ccOcccc1C
Epoch[7600/30000], Generated SMILES: CCc1ccc=cc1
Epoch[7700/30000], Generated SMILES: CCc1=cccccccc1
Epoch[7800/30000], Generated SMILES: CCc1ccCccc1
Epoch[7900/30000], Generated SMILES: CC1ccccccCcCc1C
Epoch[8000/30000], Generated SMILES: Cc1ccccccc1
Epoch[8100/30000], Generated SMILES: COc1ccccccc1ONO
Epoch[8200/30000], Generated SMILES: C=1ccccccc1=NCC-CCC
Epoch[8300/30000], Generated SMILES: Cc1ccc(c=cc=1)
Epoch[8400/30000], Generated SMILES: Cc1c/cccccc(OC1C)
Epoch[8500/30000], Generated SMILES: CC1cc(=c(-ccccc1cC1CN)C1O)
Epoch[8600/30000], Generated SMILES: CC1Cc(ccccccc=cccc1/C)-S
Epoch[8700/30000], Generated SMILES: CCCCC1cCCcccccc1
Epoch[8800/30000], Generated SMILES: CCC=C(C(c1ccccccc1(C)))
Epoch[8900/30000], Generated SMILES: CC(CC(C))
Epoch[9000/30000], Generated SMILES: CCCCN
Epoch[9100/30000], Generated SMILES: CCC1ccCccCc(C1)
Epoch[9200/30000], Generated SMILES: CCC1ccCccCc(c1COC)
Epoch[9300/30000], Generated SMILES: CCc1cc(cccC(C1))
Epoch[9400/30000], Generated SMILES: CC1cccnnccc(C1)C
Epoch[9500/30000], Generated SMILES: CC1ccc(ccc1(C))
Epoch[9600/30000], Generated SMILES: CCc1cc(c(cc1C)N)C
Epoch[9700/30000], Generated SMILES: CC1ccc(c(ccOCc1))
Epoch[9800/30000], Generated SMILES: C=1ccc(c1)C1CCC1
Epoch[9900/30000], Generated SMILES: CC1cc1c(CcCccc1)1
Epoch[10000/30000], Generated SMILES: C/1Ccc=ccCCcC(=c1)CC
Epoch[10100/30000], Generated SMILES: CCNNC1C#cccccccCs1N=N
Epoch[10200/30000], Generated SMILES: CC2CCPCCCccCc\cCc2
Epoch[10300/30000], Generated SMILES: CCC1ccC=c1
Epoch[10400/30000], Generated SMILES: COC1cn-ccccOc1
Epoch[10500/30000], Generated SMILES: CCC1c\cCc(cOcCCCcCc1S)C
Epoch[10600/30000], Generated SMILES: CCc1ccCcc(ccc1CC)CC
Epoch[10700/30000], Generated SMILES: CC1cccccccc=cCCCCCc1
Epoch[10800/30000], Generated SMILES: CC1cccccccc=c1CC
Epoch[10900/30000], Generated SMILES: CC1ccc(ncc1)O
Epoch[11000/30000], Generated SMILES: CC1ccc(ccc1NO)
Epoch[11100/30000], Generated SMILES: C\1ccc(ccc1)N
Epoch[11200/30000], Generated SMILES: Cc1ccc(C1=C)C
Epoch[11300/30000], Generated SMILES: CC1Ccc(C(c1)C)
Epoch[11400/30000], Generated SMILES: CC1Ccc1P\C
Epoch[11500/30000], Generated SMILES: CCC/1c(C11CcCcc1)
Epoch[11600/30000], Generated SMILES: CCC=Cc\1/cccCcc1O
Epoch[11700/30000], Generated SMILES: CCC1Cc\1#S
Epoch[11800/30000], Generated SMILES: CC31C\c(c1cccC1C)1O3
Epoch[11900/30000], Generated SMILES: CC(=C(C(c1cccC1Cc2)P))sc2
Epoch[12000/30000], Generated SMILES: CCC
Epoch[12100/30000], Generated SMILES: CC1cc(c(ccccc1))N=CNO
Epoch[12200/30000], Generated SMILES: CC1cc(c(ccncc1))CC
Epoch[12300/30000], Generated SMILES: CC1cccc(cc1)
Epoch[12400/30000], Generated SMILES: Cc1ccccccc1(O)
Epoch[12500/30000], Generated SMILES: Cc1ccc(c(c1)N)
Epoch[12600/30000], Generated SMILES: CN1ccccc(c1)OCC/C
Epoch[12700/30000], Generated SMILES: Cc1ccc(c(c1)OCC\S)O
Epoch[12800/30000], Generated SMILES: Cc1ccc(c(c1)O)C-COOO
Epoch[12900/30000], Generated SMILES: Cc1ccc(c(c1)2)CcCO2
Epoch[13000/30000], Generated SMILES: Cc1ccc(c(c1)C)C\C
Epoch[13100/30000], Generated SMILES: Cc1ccc(c3cc=1)CS3NN
Epoch[13200/30000], Generated SMILES: Cc1ccc(ccc1c1)C\n1NC-O
Epoch[13300/30000], Generated SMILES: Cc1ccC(ccc1CC)CC
Epoch[13400/30000], Generated SMILES: Cc1ccc(cccc1)
Epoch[13500/30000], Generated SMILES: OOc1c=sccCC1(NN)C
Epoch[13600/30000], Generated SMILES: O=CC1SC1
Epoch[13700/30000], Generated SMILES: OC1CCCC1C
Epoch[13800/30000], Generated SMILES: OCc1CCc1C
Epoch[13900/30000], Generated SMILES: OCN=CCOCCC
Epoch[14000/30000], Generated SMILES: OCC1C/cc=1
Epoch[14100/30000], Generated SMILES: OCC=C/1CC1
Epoch[14200/30000], Generated SMILES: OCC1C/cCc1
Epoch[14300/30000], Generated SMILES: OCC=C1NCO1
Epoch[14400/30000], Generated SMILES: OCC1c(cCcNcccc1C)
Epoch[14500/30000], Generated SMILES: OCc1cccCc#cccs1
Epoch[14600/30000], Generated SMILES: SCc1c=cCccCccN1CONO
Epoch[14700/30000], Generated SMILES: OCc1cccC/cCcc\1C
Epoch[14800/30000], Generated SMILES: SC
Epoch[14900/30000], Generated SMILES: SCc1cc(c(c=1)\C)
Epoch[15000/30000], Generated SMILES: SC
Epoch[15100/30000], Generated SMILES: O\N
Epoch[15200/30000], Generated SMILES: OCN
Epoch[15300/30000], Generated SMILES: OCN/O
Epoch[15400/30000], Generated SMILES: N/1c(cCc1c1)1
Epoch[15500/30000], Generated SMILES: O=Nc1c(c(c1)C)
Epoch[15600/30000], Generated SMILES: Cc1c\cCccc1
Epoch[15700/30000], Generated SMILES: Cc1cncCcccccc1
Epoch[15800/30000], Generated SMILES: Cc1cc(ccccccc1)
Epoch[15900/30000], Generated SMILES: Cc1cc(c=cc1)
Epoch[16000/30000], Generated SMILES: Cc1ccscc\c1C
Epoch[16100/30000], Generated SMILES: Cc1cc(cC=c11-cC(CC1C)C)
Epoch[16200/30000], Generated SMILES: Cc1cCOcC1/O
Epoch[16300/30000], Generated SMILES: CC1=C=c1
Epoch[16400/30000], Generated SMILES: CC1NCCc1
Epoch[16500/30000], Generated SMILES: CCCNCC
Epoch[16600/30000], Generated SMILES: CCC=C\P(C)C
Epoch[16700/30000], Generated SMILES: CCC1cc2C1nc2C
Epoch[16800/30000], Generated SMILES: CCc1ccn(C)ccCnOc1
Epoch[16900/30000], Generated SMILES: COc1cccCc2ccc21
Epoch[17000/30000], Generated SMILES: CCS
Epoch[17100/30000], Generated SMILES: CCc1cccCcc1
Epoch[17200/30000], Generated SMILES: C\c1cccccoS1
Epoch[17300/30000], Generated SMILES: CCc1cc(ccc1)
Epoch[17400/30000], Generated SMILES: Cc1ccc(c(c1)O)
Epoch[17500/30000], Generated SMILES: Cc1ccccccc(C1)C
Epoch[17600/30000], Generated SMILES: Cc1ccc\c(c1)C
Epoch[17700/30000], Generated SMILES: CC1(c(\ccc1)C)
Epoch[17800/30000], Generated SMILES: CCNCOc=1cc1
Epoch[17900/30000], Generated SMILES: CCN(ONc1cccCCcccc1)
Epoch[18000/30000], Generated SMILES: CCN(ONc=1ccoCcccc1)
Epoch[18100/30000], Generated SMILES: CCNCONc1Cccc(nccc1)
Epoch[18200/30000], Generated SMILES: CCNCCNc1Cccc(cccc1)
Epoch[18300/30000], Generated SMILES: CCNCCCc1Ccccc\ccc1OCCCC
Epoch[18400/30000], Generated SMILES: CCNCCNCCC
Epoch[18500/30000], Generated SMILES: CCCCC-COC/O
Epoch[18600/30000], Generated SMILES: CCCCCCSCCS
Epoch[18700/30000], Generated SMILES: CCCCCC2CCC2C
Epoch[18800/30000], Generated SMILES: CCCCCC(CC)
Epoch[18900/30000], Generated SMILES: CCCCCc2CO-c(CCCC)2
Epoch[19000/30000], Generated SMILES: CC1Ccc(CcN1)
Epoch[19100/30000], Generated SMILES: COC#S
Epoch[19200/30000], Generated SMILES: CC1cnc(ccc1)C(CCC)
Epoch[19300/30000], Generated SMILES: Cc1ccc(c(c1)C)C
Epoch[19400/30000], Generated SMILES: Cc1ccc(c(c1)C)COP
Epoch[19500/30000], Generated SMILES: Cc1ccc(c(c1)C)COC
Epoch[19600/30000], Generated SMILES: Cc1ccc(c(c1)C)C-2cc2
Epoch[19700/30000], Generated SMILES: Cc1ccc(c(c1)S)CN(\No1)N=CNcc1C
Epoch[19800/30000], Generated SMILES: Cc1ccc(c(c1)C)N
Epoch[19900/30000], Generated SMILES: CCc1cccc(c1)C
Epoch[20000/30000], Generated SMILES: CC1ccccccC1C
Epoch[20100/30000], Generated SMILES: CCc1OcCccc1
Epoch[20200/30000], Generated SMILES: CCc1ccCCc1
Epoch[20300/30000], Generated SMILES: CCC1c(CCccccccc(CC)nCOC1)
Epoch[20400/30000], Generated SMILES: CCCCC(C1ccccCcc1C)
Epoch[20500/30000], Generated SMILES: CCCCC
Epoch[20600/30000], Generated SMILES: C2CcC=2
Epoch[20700/30000], Generated SMILES: CCC1C=c(ccccc(ccOC1)N)
Epoch[20800/30000], Generated SMILES: CCOCN
Epoch[20900/30000], Generated SMILES: CC\1Ncc=c1
Epoch[21000/30000], Generated SMILES: CCN1\cccc1C
Epoch[21100/30000], Generated SMILES: CC1cccccc1
Epoch[21200/30000], Generated SMILES: Cc1ccccco1
Epoch[21300/30000], Generated SMILES: Cc1ccccc(N)cc=CcPN1
Epoch[21400/30000], Generated SMILES: Cc1ccccc#c(ccc1)
Epoch[21500/30000], Generated SMILES: Cc1cccccOc1=N
Epoch[21600/30000], Generated SMILES: Cc1ccc1
Epoch[21700/30000], Generated SMILES: Cc1ccc1
Epoch[21800/30000], Generated SMILES: Cc1cccOc/C1-C
Epoch[21900/30000], Generated SMILES: CN-1cc=cN=1PC
Epoch[22000/30000], Generated SMILES: ONc(cC\1C)1
Epoch[22100/30000], Generated SMILES: CCNC\C=PN
Epoch[22200/30000], Generated SMILES: C/NNCC\ON
Epoch[22300/30000], Generated SMILES: O/CCOC
Epoch[22400/30000], Generated SMILES: C/N(C)
Epoch[22500/30000], Generated SMILES: CC(CO)
Epoch[22600/30000], Generated SMILES: CC(=C)
Epoch[22700/30000], Generated SMILES: CC(=O)\Pc1cOc(ccc1)
Epoch[22800/30000], Generated SMILES: CC(=O)\O
Epoch[22900/30000], Generated SMILES: CC(=O)\PS
Epoch[23000/30000], Generated SMILES: CC(=C)\Oc1CccccSc1
Epoch[23100/30000], Generated SMILES: CC(=P)c2cO2
Epoch[23200/30000], Generated SMILES: OC(=P)
Epoch[23300/30000], Generated SMILES: CCC=PC
Epoch[23400/30000], Generated SMILES: CO/O/C
Epoch[23500/30000], Generated SMILES: COc1=C=CCCc(NCC1C)O
Epoch[23600/30000], Generated SMILES: COOSNC
Epoch[23700/30000], Generated SMILES: CNNN
Epoch[23800/30000], Generated SMILES: COc1-C(C)c(C1)
Epoch[23900/30000], Generated SMILES: COc1cc=C(c1)
Epoch[24000/30000], Generated SMILES: COc1(c(c(c1))NCO)
Epoch[24100/30000], Generated SMILES: CCc1cc(C(c1))
Epoch[24200/30000], Generated SMILES: CC1ccccc(ccc\1)C
Epoch[24300/30000], Generated SMILES: CCc1cccc(cccc11C)C1
Epoch[24400/30000], Generated SMILES: Cc1cccccccccO1
Epoch[24500/30000], Generated SMILES: Cc1ccccccc1NC
Epoch[24600/30000], Generated SMILES: Cc1cccccc11ccccC1
Epoch[24700/30000], Generated SMILES: Cc1cccOccC1
Epoch[24800/30000], Generated SMILES: Cc1ccc=cC11nccc1
Epoch[24900/30000], Generated SMILES: Cc1ccc=cCC1=C
Epoch[25000/30000], Generated SMILES: Cc1ccc\cCn1
Epoch[25100/30000], Generated SMILES: Cc1cCc\cCo1
Epoch[25200/30000], Generated SMILES: Cc1cC1
Epoch[25300/30000], Generated SMILES: CCCCC
Epoch[25400/30000], Generated SMILES: CCC1cc\CCcc1
Epoch[25500/30000], Generated SMILES: CCC1cc\CCcc1
Epoch[25600/30000], Generated SMILES: CCCN
Epoch[25700/30000], Generated SMILES: CCC1ccc(cccCcc(OON1))
Epoch[25800/30000], Generated SMILES: COc1cccCccccCcC1CNCC
Epoch[25900/30000], Generated SMILES: COc1cccCccc(c1)C
Epoch[26000/30000], Generated SMILES: CCc1ccccccc(P1)CCC
Epoch[26100/30000], Generated SMILES: CNc1ccccccc(O1)
Epoch[26200/30000], Generated SMILES: CCc1cccc(c1P1)CN1
Epoch[26300/30000], Generated SMILES: C-c1cc=c(Cc1N)CC
Epoch[26400/30000], Generated SMILES: CN
Epoch[26500/30000], Generated SMILES: CC
Epoch[26600/30000], Generated SMILES: CC1ccCN1
Epoch[26700/30000], Generated SMILES: CC1ccCNcco1
Epoch[26800/30000], Generated SMILES: CC1cc(O)C1/C
Epoch[26900/30000], Generated SMILES: CC1CC1OOc1oc1
Epoch[27000/30000], Generated SMILES: C/N=C(O)C1NcC(c(c1)C)COC/O
Epoch[27100/30000], Generated SMILES: C/N=C(OOC1Nc2cc2c1)C
Epoch[27200/30000], Generated SMILES: C/N=CNOOC1Ncccccc1
Epoch[27300/30000], Generated SMILES: C/N(CN\OC)
Epoch[27400/30000], Generated SMILES: C/N(CO\C1)Ccc(c/c1)
Epoch[27500/30000], Generated SMILES: C=NP1O\O1
Epoch[27600/30000], Generated SMILES: C=NC=S\C
Epoch[27700/30000], Generated SMILES: C=NN=O
Epoch[27800/30000], Generated SMILES: C=CC=C\C
Epoch[27900/30000], Generated SMILES: COCC=C\C
Epoch[28000/30000], Generated SMILES: COCC=c2n1CcC/2CC1
Epoch[28100/30000], Generated SMILES: COC1ncN1N(=O)
Epoch[28200/30000], Generated SMILES: COC1=cNcN(C1)
Epoch[28300/30000], Generated SMILES: CCC1=cNcc(C1)
Epoch[28400/30000], Generated SMILES: CCc1Nc(cc(=O)CC1)
Epoch[28500/30000], Generated SMILES: CCc1(c(cc(O1)CC))
Epoch[28600/30000], Generated SMILES: CC
Epoch[28700/30000], Generated SMILES: CC=1OcCOc(=C)C1
Epoch[28800/30000], Generated SMILES: CCc1OcCOc(cccC11)1
Epoch[28900/30000], Generated SMILES: CCc1c(C)c(ccc1)
Epoch[29000/30000], Generated SMILES: CCc1c(C)c(ccc1SC)
Epoch[29100/30000], Generated SMILES: Cc1cc(C)cCCcc1
Epoch[29200/30000], Generated SMILES: Cc1cc(C)c/Ccc1
Epoch[29300/30000], Generated SMILES: Cc1cc(C)c\C#c1NC
Epoch[29400/30000], Generated SMILES: Cc1cc(C)ccCPccNsCC=1
Epoch[29500/30000], Generated SMILES: Cc1cc(C)cccNcnooCC1C
Epoch[29600/30000], Generated SMILES: Cc1cc(C)\cccccC1
Epoch[29700/30000], Generated SMILES: C
Epoch[29800/30000], Generated SMILES: Cc1cc(cc1ccc2)SPC2
Epoch[29900/30000], Generated SMILES: Cc1ccSOc1NSCS
Epoch[30000/30000], Generated SMILES: Cc1ccc=c(OS1C)

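At first only simple SMILES strings are generated, and over time the generator seems to produce more complex ones... maybe? Not consistently, though. (This is a subjective impression.)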
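To put a rough number on that impression, the log lines can be parsed and the average string length compared between early and late epochs. This is only a sketch: string length is a crude proxy for complexity, and the names `LOG_PATTERN`, `parse_generated`, and `mean_length` are hypothetical helpers rather than part of the original notebook. The sample text reuses four lines from the log above.

```python
import re

# Matches lines such as "Epoch[100/30000], Generated SMILES: S=CO"
LOG_PATTERN = re.compile(r"Epoch\[(\d+)/(\d+)\], Generated SMILES: (\S+)")


def parse_generated(log_text):
    """Return a list of (epoch, smiles) pairs found in the log text."""
    return [(int(m.group(1)), m.group(3)) for m in LOG_PATTERN.finditer(log_text)]


def mean_length(records, start, stop):
    """Average SMILES length over epochs in the closed interval [start, stop]."""
    lengths = [len(smiles) for epoch, smiles in records if start <= epoch <= stop]
    return sum(lengths) / len(lengths) if lengths else float("nan")


sample = """Epoch[100/30000], Generated SMILES: S=CO
Epoch[200/30000], Generated SMILES: CC
Epoch[29900/30000], Generated SMILES: Cc1ccSOc1NSCS
Epoch[30000/30000], Generated SMILES: Cc1ccc=c(OS1C)
"""

records = parse_generated(sample)
print("early:", mean_length(records, 1, 1000))
print("late: ", mean_length(records, 29000, 30000))
```

Running the same helpers over the full log gives a trend over training instead of a comparison of a few hand-picked lines.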

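Learning curves

The training progress is shown below. The left and right columns plot the same values: the left uses a linear scale and the right a logarithmic scale.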

fig, axes = plt.subplots(nrows=4, ncols=2, figsize=(8,8))
plot_data = [d_losses, g_losses, real_scores, fake_scores]
legends = ["D Loss", "G Loss", "D(x)", "D(G(z))"]
for i in range(4):
    axes[i][0].plot(plot_data[i], label=legends[i], alpha=0.8)
    axes[i][0].grid()
    axes[i][0].legend()
    axes[i][1].plot(plot_data[i], label=legends[i], alpha=0.8)
    axes[i][1].grid()
    axes[i][1].legend()
    axes[i][1].set_yscale('log')
axes[3][0].set_xlabel("Steps")
axes[3][1].set_xlabel("Steps")
plt.show()

VAEとGANで分子生成入門_62_0.png

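To repeat once more, no consideration at all is given to whether these are good molecules (molecules with the desired properties). This is strictly an introduction.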
