
Converting Lyric-Embedded MIDI to MusicXML, Part 3: Linking Notes and Lyrics

Last updated at 2022-12-08. Posted at 2022-12-04.

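Overview

This part links the karaoke information contained in the XF format to the note information.

Series index:
Converting Lyric-Embedded MIDI to MusicXML: Link Summary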

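Preparation

Loading libraries

Import the required libraries. (The list is shared with Part 2, so some of them may not be needed here.)
XFMidiFile is the extended version of mido.MidiFile created in Part 1.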

import pandas as pd
import re
import jaconv
import copy
import xml.etree.ElementTree as ET
import itertools
from sudachipy import tokenizer
from sudachipy import dictionary
from typing import TypedDict, Union, List, Tuple, Dict, Optional
import json
from pathlib import Path

# Custom class for reading XF-format MIDI files
from xfmidifile import XFMidiFile

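Loading note information

Following Part 2, obtain the cleaned-up note information.
It is ready to use if every note has a start_time and the rests have been generated.
If you wrote out note_measures at the end of Part 2, load that file.
Below, the data is assumed to be held in a variable named note_measures.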

with open("yorunikakeru_notes.json") as f:
  note_measures = json.load(f)
for notes in note_measures[:5]:
  print(notes)

[{'type': 'rest', 'duration_division': 32, 'start_division': 0}]
[{'type': 'rest', 'duration_division': 32, 'start_division': 0}]
[{'type': 'rest', 'duration_division': 24, 'start_division': 0}, {'type': 'note_on', 'start_time': 5280, 'duration_time': 238, 'note': 79, 'start_division': 24, 'duration_division': 4}, {'type': 'note_on', 'start_time': 5520, 'duration_time': 238, 'note': 82, 'start_division': 28, 'duration_division': 4}]
[{'type': 'note_on', 'start_time': 5760, 'duration_time': 358, 'note': 84, 'start_division': 0, 'duration_division': 6}, {'type': 'note_on', 'start_time': 6120, 'duration_time': 358, 'note': 80, 'start_division': 6, 'duration_division': 6}, {'type': 'note_on', 'start_time': 6480, 'duration_time': 238, 'note': 79, 'start_division': 12, 'duration_division': 4}, {'type': 'note_on', 'start_time': 6720, 'duration_time': 238, 'note': 77, 'start_division': 16, 'duration_division': 4}, {'type': 'note_on', 'start_time': 6960, 'duration_time': 238, 'note': 75, 'start_division': 20, 'duration_division': 4}, {'type': 'note_on', 'start_time': 7200, 'duration_time': 238, 'note': 77, 'start_division': 24, 'duration_division': 4}, {'type': 'note_on', 'start_time': 7440, 'duration_time': 238, 'note': 84, 'start_division': 28, 'duration_division': 4}]
[{'type': 'note_on', 'start_time': 7680, 'duration_time': 238, 'note': 82, 'start_division': 0, 'duration_division': 4}, {'type': 'note_on', 'start_time': 7920, 'duration_time': 118, 'note': 84, 'start_division': 4, 'duration_division': 2}, {'type': 'note_on', 'start_time': 8040, 'duration_time': 238, 'note': 79, 'start_division': 6, 'duration_division': 4}, {'type': 'note_on', 'start_time': 8280, 'duration_time': 358, 'note': 77, 'start_division': 10, 'duration_division': 6}, {'type': 'note_on', 'start_time': 8640, 'duration_time': 838, 'note': 75, 'start_division': 16, 'duration_division': 14}, {'type': 'rest', 'duration_division': 2, 'start_division': 30}]

Loading karaoke information

Use the XFMidiFile created in Part 1 to load the karaoke information.

xfmidi = XFMidiFile("song/yorunikakeru/yorunikakeru.mid", charset="cp932")
print(xfmidi.xfkm)

MidiTrack([
  MetaMessage('cue_marker', text='$Lyrc:1:312:JP', time=0),
  MetaMessage('cue_marker', text='&s', time=5240),
  MetaMessage('lyrics', text='<', time=20),
  MetaMessage('lyrics', text='沈[し', time=20),
  MetaMessage('lyrics', text='ず]', time=240),
  MetaMessage('lyrics', text='む', time=240),
  MetaMessage('lyrics', text='よ', time=360),
  MetaMessage('lyrics', text='う', time=360),
  MetaMessage('lyrics', text='に', time=240),
...

Convert the messages to dicts so that they are easier to work with.

lyrics = [m.dict() for m in xfmidi.xfkm]
for l in lyrics[:10]:
  print(l)

{'type': 'cue_marker', 'text': '$Lyrc:1:312:JP', 'time': 0}
{'type': 'cue_marker', 'text': '&s', 'time': 5240}
{'type': 'lyrics', 'text': '<', 'time': 20}
{'type': 'lyrics', 'text': '沈[し', 'time': 20}
{'type': 'lyrics', 'text': 'ず]', 'time': 240}
{'type': 'lyrics', 'text': 'む', 'time': 240}
{'type': 'lyrics', 'text': 'よ', 'time': 360}
{'type': 'lyrics', 'text': 'う', 'time': 360}
{'type': 'lyrics', 'text': 'に', 'time': 240}
{'type': 'lyrics', 'text': '/', 'time': 220}

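Adding start times

The time of each lyrics entry is a delta from the preceding element, so add an absolute start time measured from the beginning of the song (tick 0).
Use MIDI2MusicXML._add_start_time, created in Part 2.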

lyrics = MIDI2MusicXML._add_start_time(lyrics)
for l in lyrics[:10]:
  print(l)

{'type': 'cue_marker', 'text': '$Lyrc:1:312:JP', 'time': 0, 'start_time': 0}
{'type': 'cue_marker', 'text': '&s', 'time': 5240, 'start_time': 5240}
{'type': 'lyrics', 'text': '<', 'time': 20, 'start_time': 5260}
{'type': 'lyrics', 'text': '沈[し', 'time': 20, 'start_time': 5280}
{'type': 'lyrics', 'text': 'ず]', 'time': 240, 'start_time': 5520}
{'type': 'lyrics', 'text': 'む', 'time': 240, 'start_time': 5760}
{'type': 'lyrics', 'text': 'よ', 'time': 360, 'start_time': 6120}
{'type': 'lyrics', 'text': 'う', 'time': 360, 'start_time': 6480}
{'type': 'lyrics', 'text': 'に', 'time': 240, 'start_time': 6720}
{'type': 'lyrics', 'text': '/', 'time': 220, 'start_time': 6940}
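
For readers who skipped Part 2, the conversion from delta times to absolute start times amounts to a running sum. The following is only a minimal sketch of that idea; add_start_time is a hypothetical stand-in written for this illustration, not the actual MIDI2MusicXML._add_start_time from Part 2.

```python
from itertools import accumulate
from typing import Dict, List


def add_start_time(messages: List[Dict]) -> List[Dict]:
    """Return copies of the messages with an absolute 'start_time' added.

    Hypothetical stand-in for MIDI2MusicXML._add_start_time (Part 2).
    """
    # The running total of the delta times is the absolute tick of each message.
    starts = accumulate(m["time"] for m in messages)
    return [{**m, "start_time": s} for m, s in zip(messages, starts)]


# The first few karaoke messages shown above (header at time 0 omitted)
sample = [
    {"type": "cue_marker", "text": "&s", "time": 5240},
    {"type": "lyrics", "text": "<", "time": 20},
    {"type": "lyrics", "text": "沈[し", "time": 20},
]
for m in add_start_time(sample):
    print(m["text"], m["start_time"])
```

The input dicts are left untouched, so the function can be re-run safely on the same list.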

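Linking lyrics to notes

Each note is linked to the lyric whose start_time matches the note's start_time.
This is a very crude approach: the start_time of an XF karaoke message is not guaranteed to match the start_time of a note exactly. The XF specification recommends that they match, but does not strictly require it. Given that the purpose is karaoke subtitles, millisecond precision is not needed, so I think this is a reasonable specification.

However, MIDI files downloaded from Yamaha's site appear, as far as I can tell, to have matching start_time values, so I trust them and prioritize a low-effort implementation. Even when the times match, open questions remain, such as what to do with harmonies (chords), but for now I assume that the melody channel contains no simultaneous notes.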
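
If a file did not have exactly matching times, one possible fallback would be to accept the nearest lyric within a small tick tolerance. This is only a sketch of that idea and is not used in the rest of this article; find_nearest_time and the tolerance of 10 ticks are assumptions made for illustration.

```python
from bisect import bisect_left
from typing import List, Optional


def find_nearest_time(target: int, sorted_times: List[int], tolerance: int) -> Optional[int]:
    """Return the entry of sorted_times closest to target, or None if it is
    further away than `tolerance` ticks. sorted_times must be sorted ascending."""
    i = bisect_left(sorted_times, target)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = sorted_times[max(i - 1, 0): i + 1]
    if not candidates:
        return None
    best = min(candidates, key=lambda t: abs(t - target))
    return best if abs(best - target) <= tolerance else None


lyric_times = [5280, 5520, 5760]           # start_time values of lyrics
print(find_nearest_time(5283, lyric_times, tolerance=10))
print(find_nearest_time(5400, lyric_times, tolerance=10))
```

A tolerance that is too large would attach a lyric to the wrong note, so the value would need tuning per file.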

Getting the melody channel

First, get the melody channel for the notes from the header of the karaoke message chunk.

# Get the song information (filepath is the path to the MIDI file)
_, melody_channel, _, _ = XFMidiFile.get_xflyricinfo(filepath)
melody_channel = int(melody_channel) - 1 # The melody_channel index is off by one. Is this Yamaha-specific?

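For Yamaha files, melody_channel was 1 in the handful of MIDI files I checked. Judging by ear, however, the melody is actually in the part whose messages have channel=0.
I am not sure whether this should be read as the starting index being off by one or as melody_channel being effectively non-functional. I could not find any note in the XF specification saying that the index base differs. If it did differ, I would expect at least a remark about it, but I could not tell.

If you have not gone through Part 2, determine the melody channel this way and extract the appropriate notes. Here the order is reversed because Part 2 already assumed channel=0 and extracted the notes accordingly.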
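
As a rough illustration of the steps above, the header text can be split on colons and the resulting channel used to filter note messages. This is a sketch only: parse_lyrc_header and select_channel are hypothetical helpers, the field names other than the melody channel are assumptions, and the "minus one" adjustment simply mirrors the observation made in this article.

```python
from typing import Dict, List, Optional


def parse_lyrc_header(text: str) -> Optional[Dict[str, str]]:
    """Split a '$Lyrc:...' cue marker into its colon-separated fields."""
    if not text.startswith("$Lyrc:"):
        return None
    fields = text.split(":")
    # Field names are assumptions made for this illustration.
    keys = ["id", "melody_channel", "offset", "language"]
    return dict(zip(keys, fields))


def select_channel(messages: List[Dict], channel: int) -> List[Dict]:
    """Keep only the messages that carry the given channel number."""
    return [m for m in messages if m.get("channel") == channel]


header = parse_lyrc_header("$Lyrc:1:312:JP")
melody_channel = int(header["melody_channel"]) - 1  # off-by-one as observed above

messages = [
    {"type": "note_on", "channel": 0, "note": 79},
    {"type": "note_on", "channel": 3, "note": 60},
    {"type": "set_tempo", "tempo": 461538},  # meta message without a channel
]
print(select_channel(messages, melody_channel))
```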

Building a dictionary from start_time to lyric

Convert lyrics into a dictionary keyed by start_time.

lyric_dict = {v["start_time"]: v["text"] for v in lyrics}

{0: '$Lyrc:1:312:JP', 5240: '&s', 5260: '<', 5280: '沈[し', 5520: 'ず]', 5760: 'む', 6120: 'よ', 6480: 'う', 6720: 'に', 6940: '/', 6960: '溶[と]', 7200: 'け', 7440: 'て', 7680: 'ゆ', 7920: 'く', 8040: 'よ', 8280: 'う', 8640: 'に', 9480: '/', 12940: '<', 12960: '二人[ふ', 13200: 'た', 13440: 'り]', 13800: 'だ', 14160: 'け', 14400: 'の', 14640: '空[そ', 14880: 'ら]', 15120: 'が', 15340: '/', ...

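There is no guarantee that start_time values are unique across elements, so this is also somewhat risky (a later entry silently overwrites an earlier one with the same key), but I am trusting Yamaha and cutting corners here.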
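
To see whether this shortcut actually loses anything for a given file, the lyrics can be grouped by start_time before building the dictionary. The helper below is a hypothetical check written for this illustration; it assumes each lyric dict has the start_time and text keys shown above.

```python
from collections import defaultdict
from typing import Dict, List


def find_time_collisions(lyrics: List[Dict]) -> Dict[int, List[str]]:
    """Return {start_time: [texts]} for every start_time shared by 2+ lyrics."""
    texts_by_time = defaultdict(list)
    for lyric in lyrics:
        texts_by_time[lyric["start_time"]].append(lyric["text"])
    return {t: texts for t, texts in texts_by_time.items() if len(texts) > 1}


# Made-up sample: two entries deliberately share start_time 5280
sample_lyrics = [
    {"text": "<", "start_time": 5260},
    {"text": "沈[し", "start_time": 5280},
    {"text": "(duplicate)", "start_time": 5280},
]
print(find_time_collisions(sample_lyrics))
```

An empty result means the dictionary comprehension drops nothing.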

Trial linking

As a trial, link notes and lyrics based on matching start_time values.

for notes in note_measures:
  for note in notes:
    if note["type"] == "note_on":
      start_time = note["start_time"]
      if start_time in lyric_dict:
        print(start_time, lyric_dict[start_time])

5280 沈[し
5520 ず]
5760 む
6120 よ
6480 う
6720 に
6960 溶[と]
...

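This looks very promising.
Control symbols such as < are conveniently dropped because their start_time does not coincide with any note, which should make processing easier.
Having readings attached is also very convenient for generating MusicXML.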

Extracting the surface form and pronunciation

Karaoke messages can carry readings (furigana) in square brackets.
For example:

5280 沈[し
5520 ず]

So we write a function that extracts the surface form and the reading from these strings.
From here on, as in Part 2, functions are added as methods of MIDI2MusicXML.

class MIDI2MusicXML:
  ...
  @staticmethod
  def _format_lyric(text):
    # Remove the closing bracket
    text = text.split("]")[0]
    # Convert katakana to hiragana
    text = jaconv.kata2hira(text)
    if "[" in text:
      surface = text.split("[")[0]
      pronunciation = text.split("[")[1]
    else:
      surface, pronunciation = text, text
    # Remove everything except hiragana and the long-vowel mark from pronunciation
    pronunciation = re.sub("[^\u3041-\u309Fー]", "", pronunciation)
    return surface, pronunciation

print(MIDI2MusicXML._format_lyric("沈[し"))
print(MIDI2MusicXML._format_lyric("ず]"))

('沈', 'し')
('ず', 'ず')

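If you do not trust Yamaha, readings are not guaranteed to always be present, so you would need to obtain them by, for example, morphological analysis, and then split a note when its reading spans multiple morae. This is left as future work.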
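
As a starting point for that future work, a hiragana reading can be split into morae with a small regular expression. This is a simplified sketch, not part of the implementation in this article: split_into_morae is a hypothetical helper, and it treats "っ", "ん" and "ー" as morae of their own, which differs from how the Yamaha data attaches "っ" to the preceding character.

```python
import re
from typing import List

# One mora: any character followed by optional small kana that combine with it
# (e.g. "きょ", "ふぁ"). "っ", "ん" and "ー" each form a mora of their own.
_MORA_PATTERN = re.compile(r".[ゃゅょぁぃぅぇぉゎ]*")


def split_into_morae(reading: str) -> List[str]:
    """Split a hiragana reading into a list of morae."""
    return _MORA_PATTERN.findall(reading)


print(split_into_morae("ふたり"))
print(split_into_morae("きょう"))
print(split_into_morae("がっこう"))
```

The number of morae could then be compared with the number of notes covered by a word to decide whether a note needs to be split.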

Linking

Obtain the surface form and reading with _format_lyric, then add them to each note as elements.
First, write a function that adds the reading to a single note.

  @classmethod
  def _get_lyric(cls, start_time, lyric_dict):
    if start_time in lyric_dict:
      raw_text = lyric_dict[start_time]
      surface, pronunciation = cls._format_lyric(raw_text) # clean up the lyric
      return raw_text, surface, pronunciation
    else:
      return None, None, None

  @classmethod
  def _add_lyric(cls, note, lyric_dict, *
                          , note_type = "note_on"
                          , type_key = "type"
                          , start_key = "start_time"
                          , lyric_raw_key = "lyric_raw"
                          , lyric_surface_key = "lyric_surface"
                          , lyric_pronunciation_key = "lyric_pronunciation"
                          ):
    note = copy.deepcopy(note)
    if note[type_key] != note_type:
      pass
    else:
      raw, surface, pronunciation = cls._get_lyric(note[start_key], lyric_dict)
      if raw is not None:
        note[lyric_raw_key], note[lyric_surface_key], note[lyric_pronunciation_key] = raw, surface, pronunciation
    return note

Apply this function to each element of note_measures.

note_measures = [[MIDI2MusicXML._add_lyric(note, lyric_dict) for note in notes] for notes in note_measures]
for notes in note_measures[:5]:
  for note in notes:
    print(note)

{'type': 'rest', 'duration_division': 32, 'start_division': 0}
{'type': 'rest', 'duration_division': 32, 'start_division': 0}
{'type': 'rest', 'duration_division': 24, 'start_division': 0}
{'type': 'note_on', 'start_time': 5280, 'duration_time': 238, 'note': 79, 'start_division': 24, 'duration_division': 4, 'lyric_raw': '沈[し', 'lyric_surface': '沈', 'lyric_pronunciation': 'し'}
{'type': 'note_on', 'start_time': 5520, 'duration_time': 238, 'note': 82, 'start_division': 28, 'duration_division': 4, 'lyric_raw': 'ず]', 'lyric_surface': 'ず', 'lyric_pronunciation': 'ず'}
{'type': 'note_on', 'start_time': 5760, 'duration_time': 358, 'note': 84, 'start_division': 0, 'duration_division': 6, 'lyric_raw': 'む', 'lyric_surface': 'む', 'lyric_pronunciation': 'む'}
{'type': 'note_on', 'start_time': 6120, 'duration_time': 358, 'note': 80, 'start_division': 6, 'duration_division': 6, 'lyric_raw': 'よ', 'lyric_surface': 'よ', 'lyric_pronunciation': 'よ'}
{'type': 'note_on', 'start_time': 6480, 'duration_time': 238, 'note': 79, 'start_division': 12, 'duration_division': 4, 'lyric_raw': 'う', 'lyric_surface': 'う', 'lyric_pronunciation': 'う'}
{'type': 'note_on', 'start_time': 6720, 'duration_time': 238, 'note': 77, 'start_division': 16, 'duration_division': 4, 'lyric_raw': 'に', 'lyric_surface': 'に', 'lyric_pronunciation': 'に'}
{'type': 'note_on', 'start_time': 6960, 'duration_time': 238, 'note': 75, 'start_division': 20, 'duration_division': 4, 'lyric_raw': '溶[と]', 'lyric_surface': '溶', 'lyric_pronunciation': 'と'}
{'type': 'note_on', 'start_time': 7200, 'duration_time': 238, 'note': 77, 'start_division': 24, 'duration_division': 4, 'lyric_raw': 'け', 'lyric_surface': 'け', 'lyric_pronunciation': 'け'}
{'type': 'note_on', 'start_time': 7440, 'duration_time': 238, 'note': 84, 'start_division': 28, 'duration_division': 4, 'lyric_raw': 'て', 'lyric_surface': 'て', 'lyric_pronunciation': 'て'}
{'type': 'note_on', 'start_time': 7680, 'duration_time': 238, 'note': 82, 'start_division': 0, 'duration_division': 4, 'lyric_raw': 'ゆ', 'lyric_surface': 'ゆ', 'lyric_pronunciation': 'ゆ'}
{'type': 'note_on', 'start_time': 7920, 'duration_time': 118, 'note': 84, 'start_division': 4, 'duration_division': 2, 'lyric_raw': 'く', 'lyric_surface': 'く', 'lyric_pronunciation': 'く'}
{'type': 'note_on', 'start_time': 8040, 'duration_time': 238, 'note': 79, 'start_division': 6, 'duration_division': 4, 'lyric_raw': 'よ', 'lyric_surface': 'よ', 'lyric_pronunciation': 'よ'}
{'type': 'note_on', 'start_time': 8280, 'duration_time': 358, 'note': 77, 'start_division': 10, 'duration_division': 6, 'lyric_raw': 'う', 'lyric_surface': 'う', 'lyric_pronunciation': 'う'}
{'type': 'note_on', 'start_time': 8640, 'duration_time': 838, 'note': 75, 'start_division': 16, 'duration_division': 14, 'lyric_raw': 'に', 'lyric_surface': 'に', 'lyric_pronunciation': 'に'}
{'type': 'rest', 'duration_division': 2, 'start_division': 30}

Rests are ignored, and lyrics are attached only to note_on entries.

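Fixing the pronunciation of particles

The particles "は" and "へ" are pronounced "わ" and "え", but the readings still give "は" and "へ", so we fix them. Deciding whether a character is a particle requires morphological analysis of the whole text, so the surface forms are concatenated and analyzed, the positions of the particles "は" and "へ" are recorded, and they are mapped back to notes afterwards.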

  @staticmethod
  def _get_fixed_pronunciations(note_measures, *
                        , surface_key = "lyric_surface"
                        ):
    tokenizer_obj = dictionary.Dictionary(dict="full").create()
    mode = tokenizer.Tokenizer.SplitMode.A

    note_measures = copy.deepcopy(note_measures)
    # Map each character position in the concatenated surface text to its (measure, note) position
    surface_pos_to_note_pos = {}
    text = ""
    for measure_id, measure in enumerate(note_measures):
      for note_id, note in enumerate(measure):
        surface = note.get(surface_key,"")
        if surface == "": continue

        for c in surface:
          surface_pos_to_note_pos[len(text)] = (measure_id, note_id)
          text += c
    pronunciations = []
    tokens = tokenizer_obj.tokenize(text ,mode)
    surface_pos = 0
    for token in tokens:
      surface, pos = token.surface(), token.part_of_speech()[0]
      if surface == "は" and pos == "助詞": # Change the particle "は" to "わ" ("助詞" is the part-of-speech tag for particles)
        measure_id, note_id = surface_pos_to_note_pos[surface_pos]
        pronunciations.append(["わ", measure_id, note_id])
      elif surface == "へ" and pos == "助詞": # Change the particle "へ" to "え"
        measure_id, note_id = surface_pos_to_note_pos[surface_pos]
        pronunciations.append(["え", measure_id, note_id])
      surface_pos += len(surface)
    return pronunciations

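sudachi is used to decide whether a token is a particle.
MeCab gives the pronunciation directly, so it might be the better choice.
(Also because sudachi cannot cope with cases such as "こんにちは", where the particle is part of a single word.)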

fixed_pronunciations = MIDI2MusicXML._get_fixed_pronunciations(note_measures)
print(fixed_pronunciations)

[['わ', 32, 4], ['わ', 78, 6], ['わ', 78, 14], ['わ', 94, 6], ['わ', 103, 0], ['わ', 109, 0], ['わ', 117, 0], ['え', 118, 0]]

The following updates the pronunciations in note_measures.

# Change "は" and "へ" to "わ" and "え"
fixed_pronunciations = MIDI2MusicXML._get_fixed_pronunciations(note_measures)
for pronunciation, measure_id, note_id in fixed_pronunciations:
  note_measures[measure_id][note_id]["lyric_pronunciation"] = pronunciation

# Check the output
for _, measure_id, note_id in fixed_pronunciations:
  print(note_measures[measure_id][note_id])

{'type': 'note_on', 'start_time': 62400, 'duration_time': 238, 'note': 72, 'start_division': 16, 'duration_division': 4, 'lyric_raw': 'は', 'lyric_surface': 'は', 'lyric_pronunciation': 'わ'}
{'type': 'note_on', 'start_time': 150480, 'duration_time': 118, 'note': 82, 'start_division': 12, 'duration_division': 2, 'lyric_raw': 'は', 'lyric_surface': 'は', 'lyric_pronunciation': 'わ'}
{'type': 'note_on', 'start_time': 151440, 'duration_time': 118, 'note': 82, 'start_division': 28, 'duration_division': 2, 'lyric_raw': 'は', 'lyric_surface': 'は', 'lyric_pronunciation': 'わ'}
...

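Filling in pronunciations with rules

Looking through note_measures, some notes (note_on) occasionally have no pronunciation attached. Comparing with the original song, there seem to be two cases: a long sustained lyric whose later notes are separate, and a melody that is voiced but not included in the lyrics, such as a shout. In the former case it is enough to fill in the long-vowel mark "ー" (NEUTRINO can handle it), but the latter has no single right answer. There is also no particular way to tell the two cases apart from the MIDI alone.
So, accepting some compromise, the pronunciations are filled in with the following rules.

If a note_on immediately precedes the note, use the long-vowel mark "ー".
If no note_on immediately precedes the note (for example, the preceding element is a rest), use "あ".

In practice this is imperfect: a note preceded by a note_on may be a shout rather than a sustained vowel, and a shout may be pronounced as something other than "あ". Still, NEUTRINO will sing as long as something is filled in, and the result can be corrected by hand where needed, so I accept the compromise.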

  @staticmethod
  def _add_rule_base_lyric(note_measures, *
                                        , lyric_raw_key = "lyric_raw"
                                        , lyric_surface_key = "lyric_surface"
                                        , lyric_pronunciation_key = "lyric_pronunciation"
                                        , type_key = "type"
                                        , note_type = "note_on"
                                        ):
    note_measures = copy.deepcopy(note_measures)
    notes_flatten = []
    for measure_id, notes in enumerate(note_measures):
      notes_flatten += [(measure_id, note) for note in notes]
    temp = [[] for _ in note_measures]
    for i, (measure_id, note) in enumerate(notes_flatten):
      if lyric_raw_key in note:
        temp[measure_id].append(note)
      elif note[type_key] != note_type:
        temp[measure_id].append(note)
      else:
        if i == 0 or notes_flatten[i-1][1][type_key] != note_type:
          note[lyric_raw_key] = ""
          note[lyric_surface_key] = "あ"
          note[lyric_pronunciation_key] = "あ"
        else:
          note[lyric_raw_key] = ""
          note[lyric_surface_key] = "ー"
          note[lyric_pronunciation_key] = "ー"
        temp[measure_id].append(note)
    return temp

note_measures = MIDI2MusicXML._add_rule_base_lyric(note_measures)
for notes in note_measures:
  for note in notes:
    print(note["type"], note.get("start_time"), note.get("lyric_surface"), note.get("lyric_pronunciation"))

...
note_on 15360 広 ひ
note_on 15600 ろ ろ
note_on 15840 が が
note_on 16080 る る
note_on 16560 夜 よ
note_on 16800 る る
note_on 17040 に に
note_on 17280 ー ー
rest None None None
rest None None None
...
rest None None None
note_on 185520 あ あ
note_on 185760 ほ ほ
note_on 185880 ら ら
note_on 186000 ま ま
note_on 186120 た た
...

The long-vowel mark is inserted as expected, as in "よるにー".
The "あ" in "あほらまた" (that is, "あー、ほらまた") also appears to be one inserted by the rule.

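Open issue

As shown below, a rest is sometimes inserted at the timing of "っ", but for MusicXML, or rather for NEUTRINO, this rest seems unnecessary. It might be good to add a step that judges from the lyric and the rest length together and fills in unnecessary rests.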

...
note_on 42000 かっ かっ
rest None None None
note_on 42480 た た
...
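
One way this could be approached is to extend a note whose pronunciation ends in "っ" over a short rest that directly follows it. The following is only a sketch under several assumptions: absorb_sokuon_rests is a hypothetical helper, the threshold of 2 divisions is arbitrary, it works within a single measure only, it leaves duration_time unchanged, and the durations in the sample are made up because the article does not show them.

```python
import copy
from typing import Dict, List


def absorb_sokuon_rests(notes: List[Dict], *, max_rest_division: int = 2) -> List[Dict]:
    """Merge a short rest into the preceding note when that note ends in 'っ'."""
    merged: List[Dict] = []
    for note in copy.deepcopy(notes):
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and note["type"] == "rest"
            and prev["type"] == "note_on"
            and prev.get("lyric_pronunciation", "").endswith("っ")
            and note["duration_division"] <= max_rest_division
        ):
            # Lengthen the previous note instead of keeping the rest.
            prev["duration_division"] += note["duration_division"]
            continue
        merged.append(note)
    return merged


# Made-up durations for the "かっ / rest / た" pattern shown above
measure = [
    {"type": "note_on", "start_division": 0, "duration_division": 6, "lyric_pronunciation": "かっ"},
    {"type": "rest", "start_division": 6, "duration_division": 2},
    {"type": "note_on", "start_division": 8, "duration_division": 4, "lyric_pronunciation": "た"},
]
for note in absorb_sokuon_rests(measure):
    print(note["type"], note["start_division"], note["duration_division"])
```

Whether NEUTRINO sings this more naturally than the version with the rest would have to be checked by ear.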

Saving

The lyrics and notes are now linked reasonably well, so merge them with measures and save the result as JSON.

messages = [m.dict() for m in xfmidi.tracks[0]]
messages = MIDI2MusicXML._add_start_time(messages)
time_informations = MIDI2MusicXML._get_time_informations(messages)
measures = MIDI2MusicXML._get_measures(time_informations)
for i in range(len(measures)):
  measures[i]["notes"] = note_measures[i]

with open("measures_with_notes.json", "w", encoding="utf-8") as f:
  json.dump(measures, f, indent=2, ensure_ascii=False)

measures_with_notes.json

[
  {
    "measure_id": 0,
    "start_time": 0,
    "time_information": {
      "numerator": 4,
      "denominator": 4,
      "ticks_per_beat": 480,
      "start_time": 0,
      "tempo": 600000,
      "notated_32nd_notes_per_beat": 8,
      "ticks_per_measure": 1920,
      "start_measure_id": 0,
      "division_note": 32,
      "divisions_per_measure": 32,
      "ticks_per_division": 60,
      "duration_time": 1920,
      "measure_num": 1
    },
    "notes": [
      {
        "type": "rest",
        "duration_division": 32,
        "start_division": 0
      }
    ]
  },
  {
    "measure_id": 1,
    "start_time": 1920,
    "time_information": {
      "numerator": 4,
      "denominator": 4,
      "ticks_per_beat": 480,
      "start_time": 1920,
      "tempo": 461538,
      "notated_32nd_notes_per_beat": 8,
      "ticks_per_measure": 1920,
      "start_measure_id": 1,
      "division_note": 32,
      "divisions_per_measure": 32,
      "ticks_per_division": 60,
      "duration_time": 272560,
      "measure_num": 141
    },
    "notes": [
      {
        "type": "rest",
        "duration_division": 32,
        "start_division": 0
      }
    ]
  },
  {
    "measure_id": 2,
    "start_time": 3840,
    "time_information": {
      "numerator": 4,
      "denominator": 4,
      "ticks_per_beat": 480,
      "start_time": 1920,
      "tempo": 461538,
      "notated_32nd_notes_per_beat": 8,
      "ticks_per_measure": 1920,
      "start_measure_id": 1,
      "division_note": 32,
      "divisions_per_measure": 32,
      "ticks_per_division": 60,
      "duration_time": 272560,
      "measure_num": 141
    },
    "notes": [
      {
        "type": "rest",
        "duration_division": 24,
        "start_division": 0
      },
      {
        "type": "note_on",
        "start_time": 5280,
        "duration_time": 238,
        "note": 79,
        "start_division": 24,
        "duration_division": 4,
        "lyric_raw": "沈[し",
        "lyric_surface": "沈",
        "lyric_pronunciation": "し"
      },
      {
        "type": "note_on",
        "start_time": 5520,
        "duration_time": 238,
        "note": 82,
        "start_division": 28,
        "duration_division": 4,
        "lyric_raw": "ず]",
        "lyric_surface": "ず",
        "lyric_pronunciation": "ず"
      }
    ]
  },
...
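
Before moving on, it can be useful to confirm that the notes and rests of every measure add up to the measure length. The check below is a small sketch that relies only on the keys visible in the JSON above; find_incomplete_measures is a hypothetical helper, and the sample is an abbreviated copy of measure 2.

```python
from typing import Dict, List


def find_incomplete_measures(measures: List[Dict]) -> List[int]:
    """Return the measure_id of every measure whose note and rest durations
    do not sum to divisions_per_measure."""
    incomplete = []
    for measure in measures:
        expected = measure["time_information"]["divisions_per_measure"]
        actual = sum(n["duration_division"] for n in measure["notes"])
        if actual != expected:
            incomplete.append(measure["measure_id"])
    return incomplete


# Abbreviated copy of measure 2 from the JSON above: 24 + 4 + 4 = 32
sample_measures = [
    {
        "measure_id": 2,
        "time_information": {"divisions_per_measure": 32},
        "notes": [
            {"type": "rest", "duration_division": 24, "start_division": 0},
            {"type": "note_on", "duration_division": 4, "start_division": 24},
            {"type": "note_on", "duration_division": 4, "start_division": 28},
        ],
    }
]
print(find_incomplete_measures(sample_measures))  # prints []
```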

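Conclusion

With quite a few shortcuts, the lyrics and notes are now linked.
For MIDI files obtained from Yamaha's official site, I expect this implementation to work reasonably well.
Doing this rigorously looks like a deep rabbit hole, so I will try improvements when I feel like it.

Next, at last, is the conversion to MusicXML.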
