I Tried Implementing Swing Rhythm Conversion

Python

Last updated at 2023-12-25 / Posted at 2023-12-25

Recently a post like this showed up on my timeline.

pic.twitter.com/yjHdz8jsRR
— ハネリズム変換bot (@ConvertToSwing) December 24, 2023

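It converts songs into a bouncy, swung rhythm, the so-called shuffle rhythm, and the result sounds remarkably natural.
That made me curious, so I looked into how it could be implemented.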

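So what is a shuffle rhythm? It is a rhythm in which each beat is divided into three.
If one beat is a quarter note, that quarter note is split 2:1 or 1:1:1.
Once the beat positions are known, splitting each beat into two parts and changing the playback speed of each part so that their lengths become 2:1 should produce a shuffle rhythm.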
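To make the 2:1 split concrete, here is a small sketch of the arithmetic. The helper name `offbeat_time` and the 120 BPM example are my own illustration, not part of the original implementation.

```python
def offbeat_time(beat_start, beat_end, long_part=2, short_part=1):
    """Time of the off-beat note when the beat is split long_part:short_part."""
    beat_len = beat_end - beat_start
    return beat_start + beat_len * long_part / (long_part + short_part)

# One beat at 120 BPM lasts 0.5 s.
straight = offbeat_time(0.0, 0.5, 1, 1)  # even 1:1 split, off-beat in the middle
shuffled = offbeat_time(0.0, 0.5, 2, 1)  # 2:1 split, off-beat two thirds in

print(f"straight off-beat: {straight:.3f} s")
print(f"shuffled off-beat: {shuffled:.3f} s")
```

In other words, the conversion has to push the off-beat note later in each beat, from the halfway point toward the two-thirds point.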

So let's implement it in Python.
First, to get the beat positions, I use madmom's beat processors (RNNBeatProcessor and DBNBeatTrackingProcessor).¹

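import librosa
import numpy as np
import madmom
import pydub
import os

# Placeholder paths (not in the original post); point these at your own files.
input_path = "input.mp3"
wav_file_path = "temp.wav"

# Load at the native sample rate; a stereo file gives y the shape (2, n_samples).
y, sr = librosa.load(input_path, sr=None, mono=False)
# Normalize by the absolute peak so that a large negative peak cannot overflow int16.
y = ((y / np.max(np.abs(y))) * np.iinfo(np.int16).max).astype("int16")
# Write a temporary WAV file for madmom (channels=2 assumes a stereo input).
sound = pydub.AudioSegment(
    y.T.tobytes(),
    frame_rate=sr,
    sample_width=y.dtype.itemsize,
    channels=2)
sound.export(wav_file_path, format="wav")

# The RNN produces a beat activation function; the DBN decodes it into beat times in seconds.
processor = madmom.features.beats.DBNBeatTrackingProcessor(fps=100)
activations = madmom.features.beats.RNNBeatProcessor()(wav_file_path)
os.remove(wav_file_path)

beat_times = processor(activations)
# Treat the start of the audio as a beat so that the time map begins at 0.
beat_times = np.insert(beat_times, 0, 0)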
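Since everything downstream depends on the detected beats, a quick sanity check of the beat list can help before going further. The following is a minimal sketch using only the standard library; `estimate_bpm`, `irregular_beats`, and the 20% tolerance are hypothetical helpers of mine, not part of madmom.

```python
from statistics import median

def estimate_bpm(beat_times):
    """Estimate the tempo from the median interval between consecutive beats."""
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    if not intervals:
        raise ValueError("need at least two beat times")
    return 60.0 / median(intervals)

def irregular_beats(beat_times, tolerance=0.2):
    """Indices of intervals that differ from the median by more than `tolerance` (a fraction)."""
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    typical = median(intervals)
    return [i for i, d in enumerate(intervals) if abs(d - typical) > tolerance * typical]

# Made-up beat times in seconds: a steady 0.5 s pulse with one missed beat.
example_beats = [0.0, 0.5, 1.0, 1.5, 2.5, 3.0]
print(estimate_bpm(example_beats))     # tempo from the median interval
print(irregular_beats(example_beats))  # intervals worth inspecting by ear
```

Note that the beat inserted at 0 may make the first interval shorter than a real beat, so a flag on index 0 is not necessarily a detection error.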

Next, build the time map for the speed change from the detected beat positions.

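# alpha sets the swing strength: 1.0 leaves the timing unchanged, and with 0.5
# the first third of each beat is stretched to fill the first half of the beat.
alpha = 0.5
# Each entry is a (source sample position, target sample position) pair.
time_map = [(0, 0)]
for i in range(0, len(beat_times)-1):
    start = int(beat_times[i] * sr)
    end = int(beat_times[i + 1] * sr)

    beat_len = end - start
    triplet_len = beat_len / 3
    # First third of the source beat, stretched to (2.0 - alpha) triplet lengths.
    time_map.append((time_map[-1][0] + int(triplet_len), time_map[-1][1] + int(triplet_len * (2.0 - alpha))))
    # Remaining two thirds of the source beat, compressed to (1.0 + alpha) triplet lengths.
    time_map.append((time_map[-1][0] + int(triplet_len * 2), time_map[-1][1] + int(triplet_len * (1.0 + alpha))))

# The last entry maps the end of the source audio to the end of the output.
time_map.append((
    y.shape[-1],
    int(y.shape[-1] * (time_map[-1][1] / time_map[-1][0]))
))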
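To see what a given `alpha` does to the rhythm, the sketch below builds the time map for a single beat of length 1 (without the integer rounding used above) and checks where a straight off-beat eighth note, at 0.5 in the source, ends up. It assumes the stretch between neighbouring time-map points is linear; `one_beat_map`, `warp`, and `offbeat_position` are names I made up for this illustration.

```python
def one_beat_map(alpha, beat_len=1.0):
    """(source, target) points for one beat, using the same formula as the loop above."""
    third = beat_len / 3
    split_dst = third * (2.0 - alpha)
    end_dst = split_dst + third * (1.0 + alpha)
    return [(0.0, 0.0), (third, split_dst), (beat_len, end_dst)]

def warp(time_map, t):
    """Map a source position to a target position by linear interpolation (an assumption)."""
    for (s0, d0), (s1, d1) in zip(time_map, time_map[1:]):
        if s0 <= t <= s1:
            return d0 + (t - s0) * (d1 - d0) / (s1 - s0)
    raise ValueError("t is outside the time map")

def offbeat_position(alpha):
    """Where the straight off-beat (0.5 of the beat) lands after warping."""
    return warp(one_beat_map(alpha), 0.5)

for a in (1.0, 0.5, 1.0 / 3.0):
    print(f"alpha={a:.3f} -> off-beat at {offbeat_position(a):.3f} of the beat")
```

Under this linear assumption the off-beat lands at (3 - alpha) / 4 of the beat, so `alpha = 0.5` gives a 5:3 swing, slightly gentler than a strict 2:1 triplet shuffle, which would correspond to `alpha = 1/3`.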

Finally, change the speed and save the result.

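from pyrubberband.pyrb import timemap_stretch

# Placeholder output path (not in the original post).
output_path = "output.mp3"

def stereo_time_stretch(sample, time_map, sr):
    # pyrubberband expects (n_samples, n_channels); librosa returns (n_channels, n_samples).
    stretched_audio = timemap_stretch(np.stack([sample[0], sample[1]], axis=-1), sr=sr, time_map=time_map)
    return np.array([stretched_audio[:, 0], stretched_audio[:, 1]])

y_stretched = stereo_time_stretch(y, time_map=time_map, sr=sr)
# Normalize by the absolute peak so that a large negative peak cannot overflow int16.
y_stretched = ((y_stretched / np.max(np.abs(y_stretched))) * np.iinfo(np.int16).max).astype("int16")
sound = pydub.AudioSegment(
    y_stretched.T.tobytes(),
    frame_rate=sr,
    sample_width=y_stretched.dtype.itemsize,
    channels=2)
# pydub relies on ffmpeg for MP3 export.
sound.export(output_path, format="mp3")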
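Before handing a time map to `timemap_stretch`, it may be worth checking it for obvious problems, for example duplicate points when the detector's first beat is already at 0. The following is a sketch of such a check; `check_time_map` is a hypothetical helper, and requiring strictly increasing positions is my own conservative assumption rather than a documented pyrubberband rule.

```python
def check_time_map(time_map, n_source_samples):
    """Return a list of human-readable problems found in a (source, target) time map."""
    if not time_map:
        return ["time map is empty"]
    problems = []
    for (s0, d0), (s1, d1) in zip(time_map, time_map[1:]):
        # Assumption: both source and target positions should strictly increase.
        if s1 <= s0 or d1 <= d0:
            problems.append(f"not strictly increasing at {(s1, d1)}")
    # The last source position should be the length of the source audio.
    if time_map[-1][0] != n_source_samples:
        problems.append("last source position is not the audio length")
    return problems

good = [(0, 0), (100, 150), (300, 300), (400, 400)]
bad = [(0, 0), (100, 150), (100, 150), (300, 300)]
print(check_time_map(good, 400))
print(check_time_map(bad, 400))
```

With the variables from the article, the call would look like `check_time_map(time_map, y.shape[-1])`.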

The result came out as follows. Surprisingly good, isn't it?

Splitting each beat in half as well gives a half-time shuffle.

# Rebuild the time map from scratch, using the same alpha as before.
time_map = [(0, 0)]
for i in range(0, len(beat_times)-1):
    start = int(beat_times[i] * sr)
    end = int(beat_times[i + 1] * sr)

    beat_len = end - start
    triplet_len = beat_len / 3
    half_triplet_len = triplet_len / 2
    # Apply the same two-segment warp to each half of the beat.
    for _ in range(2):
        time_map.append((time_map[-1][0] + int(half_triplet_len), time_map[-1][1] + int(half_triplet_len * (2.0 - alpha))))
        time_map.append((time_map[-1][0] + int(half_triplet_len * 2), time_map[-1][1] + int(half_triplet_len * (1.0 + alpha))))

# As before, the last entry maps the end of the source audio to the end of the output.
time_map.append((
    y.shape[-1],
    int(y.shape[-1] * (time_map[-1][1] / time_map[-1][0]))
))

The change is a little hard to hear, but it does swing slightly.

So the conversion turned out to work more cleanly than I expected.

¹ It is more reliable to set the beat positions manually here. If the beat processor gets them wrong, the conversion fails as well.
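One simple way to set the beats by hand, assuming the track has a constant tempo, is to generate an evenly spaced grid from a known BPM and the time of the first beat. This is only a sketch: `beat_grid` and its parameters are hypothetical, and it will not follow tempo changes.

```python
def beat_grid(bpm, first_beat, duration):
    """Evenly spaced beat times in seconds from first_beat up to duration."""
    interval = 60.0 / bpm
    beats = []
    k = 0
    # Multiply instead of accumulating to avoid floating-point drift.
    while first_beat + k * interval <= duration:
        beats.append(first_beat + k * interval)
        k += 1
    return beats

# Made-up example: 120 BPM, first beat at 0.1 s, 2 s of audio.
manual_beats = beat_grid(bpm=120, first_beat=0.1, duration=2.0)
print(manual_beats)
```

The resulting list could stand in for the detector output as `beat_times` (for example via `np.array(manual_beats)`), followed by the same insertion of 0 at the start.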
