
Audio Analysis with Python: LibROSA (1)

Posted at 2018-09-14

Audio Analysis with Python:

What is LibROSA?

LibROSA is a Python package for music and audio analysis. It works well with NumPy/SciPy and can be used together with scikit-learn for machine learning.

Audio analysis involves jargon and math that can look intimidating, so I will follow the manual and study by looking things up as I go.

LibROSA submodules

librosa.beat

Functions for estimating tempo and detecting beat events.
In short: functions that estimate the tempo and the timing of beats.

librosa.core

Core functionality includes functions to load audio from disk, compute various spectrogram representations, and a variety of commonly used tools for music analysis. For convenience, all functionality in this submodule is directly accessible from the top-level librosa.* namespace.
In short: core covers loading audio, the various spectrogram transforms, and tools commonly used in audio analysis. For convenience, everything in core can also be called directly as librosa.*.

librosa.decompose

Functions for harmonic-percussive source separation (HPSS) and generic spectrogram decomposition using matrix decomposition methods implemented in scikit-learn.
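In short: HPSS, which separates the harmonic and percussive components, plus general spectrogram decomposition using the matrix decomposition methods implemented in scikit-learn.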

librosa.display

Visualization and display routines using matplotlib.
In short: visualization and display features built on matplotlib.

librosa.effects

Time-domain audio processing, such as pitch shifting and time stretching. This submodule also provides time-domain wrappers for the decompose submodule.
In short: time-domain processing such as pitch shifting and time stretching. It also provides time-domain wrapper functions for the decompose submodule.

librosa.feature

Feature extraction and manipulation. This includes low-level feature extraction, such as chromagrams, pseudo-constant-Q (log-frequency) transforms, Mel spectrogram, MFCC, and tuning estimation. Also provided are feature manipulation methods, such as delta features, memory embedding, and event-synchronous feature alignment.
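In short: feature extraction and manipulation. It covers low-level feature extraction such as chromagrams, pseudo-CQT, Mel spectrograms, and MFCCs, along with tuning estimation. Feature manipulation methods such as delta features, memory embedding, and event-synchronous feature alignment are also provided.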

librosa.filters

Filter-bank generation (chroma, pseudo-CQT, CQT, etc.). These are primarily internal functions used by other parts of librosa.
In short: generates filter banks. These are mostly basic functions that other parts of librosa call internally.

librosa.onset

Onset detection and onset strength computation.
In short: detects onsets (the points where a sound starts) and computes onset strength.

librosa.output

Text- and wav-file output.
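In short: output to text files or wav files. (This submodule was removed in librosa 0.8; wav files are now usually written with the soundfile package.)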

librosa.segment

Functions useful for structural segmentation, such as recurrence matrix construction, time-lag representation, and sequentially constrained clustering.
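In short: functions useful for segmenting a track by its structure, such as building recurrence matrices, time-lag representations, and clustering constrained to time order.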

librosa.sequence

Functions for sequential modeling. Various forms of Viterbi decoding, and helper functions for constructing transition matrices.
In short: functions for modeling sequences, namely several kinds of Viterbi decoding and helpers for building transition matrices.

librosa.util

Helper utilities (normalization, padding, centering, etc.)
In short: handy helper functions for normalization, padding, centering, and so on.

Mission 1: Type out the librosa example by hand

import librosa
fname = "audio.wav"  # placeholder: replace with the path to your own audio file
data, sample_rate = librosa.load(fname)

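First, load fname (the path to a music file).
data is a one-dimensional NumPy array of floating-point numbers: the audio decoded as a time series.
sample_rate is the number of samples taken per second. An analog waveform cannot be digitized in its entirety, so it is measured at regular intervals instead. The default is 22050 Hz. For comparison, 44.1 kHz is the rate used for CDs and other common audio sources.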
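The relationship between the array length, the sample rate, and the duration is simple arithmetic. The sketch below uses an array of zeros as a stand-in for loaded audio, so no file is needed; the 3-second length is an assumption.

```python
import numpy as np

sample_rate = 22050                                  # librosa's default rate
data = np.zeros(3 * sample_rate, dtype=np.float32)   # stand-in for loaded audio

# Duration in seconds = number of samples / samples per second.
duration = len(data) / sample_rate

# The same 3 seconds at the CD rate needs twice as many samples.
samples_at_cd_rate = int(duration * 44100)

print(duration, len(data), samples_at_cd_rate)
```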

By default, librosa's load also mixes stereo sources down to mono and resamples them to 22.05 kHz.

For more on sample rates, this article is easy to follow:
https://vook.vc/n/118

Audio analysis typically goes on to decompose the signal and extract features from it, so downsampling is generally not a problem.
