数理最適化 Advent Calendar 2025

【GPU】GoogleColabでNVIDIA/cuOptを触ってみた

Last updated at 2025-12-13Posted at 2025-12-13

はじめに

本記事は、数理最適化のアドベントカレンダーの14日目の記事となります。

数理最適化アドベントカレンダーは初投稿となりますので、お手柔らかにお願いします

それは2年前のこと

2年ほど前、NVIDIAの最適化エンジンcuOptが、Pickup and Deliveryのベンチマークの一部で世界記録を樹立したというブログを見かけました。

しかも、GPU上で動くらしく、注目していたのですが、当時はOSSではなく、利用におそらく申請or契約が必要そう？(違ったらすみません)だったこともあり、利用を断念していました。

時は流れ、2025年3月に、NVIDIAよりcuOptがOSSになると発表されました。

OSSになったのは半年以上前ですが、なんだかんだ触れていなかったので、アドベントカレンダーを機に試してみることにしました。

ということで、本記事では NVIDIAのOSS最適化ソルバーcuOptの概要と、簡単な最適化の実行例を紹介します。

本記事で扱うこと

cuOptの概要
Python APIを利用した簡単な最適化の実行 in Google Colab

本記事で扱わないこと

ライトにcuOptを実行してみるという記事なので、以下については触れません。ご了承ください。

アルゴリズムの詳細
ベンチマークによる他ソルバーとの性能比較

cuOptの概要

改めて、cuOptとはNVIDIAにより開発されたソルバーです。

↑の公式サイトでは以下のように称されています。

意思決定の最適化のために設計されたオープンソースの GPU により高速化されたソルバー

このように、GPUを活用しているのが特徴かと思います。また、2025年3月からOSS化されたとのこと¹で、ソースコードは以下のリポジトリに公開されています（ライセンスはApache License 2.0です）。

cuOptのユーザーガイドは以下です

cuOptは現在以下の問題に対応しており、それぞれGPUを活用したアルゴリズム²を採用しているそうです。

混合整数線形計画問題 (MILP: Mixed-integer Linear Programming)
線形計画問題 (LP: Linear Programming)
配送計画問題 (VRP: Vehicle Routing Problems)

また、公式HP³によると、以下の問題・ベンチマークで世界一の記録を得たとのことです。

問題	ベンチマーク
VRPTW	Gehring & Homberger benchmark (15/300 instance)⁴
PDPTW	Li & Lim benchmark (8/354 instance) ⁵
MIP	MIPLIB (bts4-cta instance) ⁶

ソルバー自体はC++で実装されており、ラッパーとして以下がサポートされています。

C
C++
Python
REST API サーバー
サードパーティーのモデリングツール（AMPL、PuLP等）

ただし、READMEによるとC++とPython APIは将来大幅に変更される可能性があるとのことで、利用の際はご注意ください。

なお、cuOptのリポジトリとは別に、様々な問題に対するサンプルコードもGitHubで提供されています。（この後、このサンプルコードの内の一つを紹介します。）

cuOptをGoogle Colabで実行してみる

ということで、cuOptを動かしてみようと思います。ラッパーとしてPython APIを利用します。また、GPU搭載のプライベート端末を持ち合わせていなかったので、Google Colabで動かしてみます。

まず、ノートブックを開いた後、「ランタイムのタイプを変更」からGPUを選択します（以下は無料版のキャプチャです）。

なお、ColabでのPythonのバージョンは3.12で、CUDAは12.4でした。

その後、最初のセルで、cuOpt実行に必要なライブラリをインストールします。CUDAのバージョンが12.4なので、名前にcu12が含まれるライブラリをインストールします（なお、私がインストールした際は、バージョン競合のエラーが出ましたが、検証上は関係なさそうだったので無視しました）。

!pip uninstall -y cuda-python cuda-bindings cuda-core
!pip install --upgrade --extra-index-url=https://pypi.nvidia.com cuopt-cu12 nvidia-nvjitlink-cu12 rapids-logger==0.1.19

#CUDA 13の場合は以下を実行
#!pip install --upgrade --extra-index-url=https://pypi.nvidia.com cuopt-cu13 nvidia-nvjitlink-cu13 rapids-logger==0.1.19

また、必要なライブラリをインストール＆importしておきます。（pandasとnumpyはもともとインストールされていました。）

!pip install -q matplotlib

import os
import pandas as pd
import numpy as np
import cudf
from cuopt import routing
import matplotlib.pyplot as plt
import matplotlib.font_manager as fm

準備が整ったので、内容に入っていきます。ここでは、先ほど紹介したcuopt-examplesリポジトリに含まれるノートブックの一つである、cvrp_daily_deliveries.ipynbを動かしてみます。このノートブックでは、cuOptのroutingモジュールで、簡単なCVRPを解いています。問題の詳細は以下の通りです：

8つの地点
- 拠点（MFC⁷） 1か所
  - demand: [0]
- 配送先7か所
  - demand：[4, 4, 2, 2, 1, 2, 1]
3台の配送車両
- トラック2台
  - capacity：[8, 8]
- バン1台
  - capacity：[4]

これらの情報をPythonコードとして定義します。まず、各地点の名前と位置は以下の通りです。

location_names       = [ "MFC",    "A",    "B",    "C",    "D",    "E",    "F",    "G"  ]
location_coordinates = [ [4, 4], [1, 3], [8, 1], [2, 1], [6, 7], [0, 2], [7, 6], [5, 3] ]

また、各地点のdemandの定義は以下の通りです。注目すべき点として、Pandasではなく、そのGPU版であるcuDFのSeriesとして定義しています。

location_demand = cudf.Series([0, 4, 4, 2, 2, 1, 2, 1], dtype=np.int32)
n_locations = len(location_demand)

距離行列 distance_matrixは以下の通りです。こちらもcuDFのDataFrameとして定義しています。

distance_matrix = [
    [0.0, 3.1, 5.0, 3.6, 3.6, 4.5, 3.6, 1.4],
    [3.1, 0.0, 7.3, 2.2, 6.4, 1.4, 6.7, 4.0],
    [5.0, 7.3, 0.0, 6.0, 6.3, 8.1, 5.1, 3.6],
    [3.6, 2.2, 6.0, 0.0, 7.2, 2.2, 7.1, 3.6],
    [3.6, 6.4, 6.3, 7.2, 0.0, 7.8, 1.4, 4.1],
    [4.5, 1.4, 8.1, 2.2, 7.8, 0.0, 8.1, 5.1],
    [3.6, 6.7, 5.1, 7.1, 1.4, 8.1, 0.0, 3.6],
    [1.4, 4.0, 3.6, 3.6, 4.1, 5.1, 3.6, 0.0]
]

distance_matrix_df = cudf.DataFrame(np.array(distance_matrix).astype(np.float32))

最後に、車両のcapacityであるvehicle_capacityは以下の通りです。location_demandと同様、cuDFのSeriesとして定義しています。

vehicle_capacity = cudf.Series([8, 8, 4], dtype=np.int32)
n_vehicles = len(vehicle_capacity)

次に、routing.DataModelを利用してモデルの初期化をします。第一引数に地点数、第二引数に車両数を指定します。

data_model = routing.DataModel(n_locations, n_vehicles)

その後、変数・制約の設定をしていきます。まず以下が距離行列の設定。

data_model.add_cost_matrix(distance_matrix_df)

次に、以下でdemandとcapacityの設定をします。

data_model.add_capacity_dimension("demand", location_demand, vehicle_capacity)

そして、以下で車両の設定を行います。veh_start_locationsとveh_end_locationsは、それぞれ各車両の出発・到着地点のIDを示しており、ここでは全車両に対し拠点である0を設定しています。

veh_start_locations = cudf.Series([0, 0, 0])
veh_end_locations = cudf.Series([0, 0, 0])
data_model.set_vehicle_locations(veh_start_locations, veh_end_locations)

最後に、以下で計算の制限時間を5秒に設定します。

solver_settings = routing.SolverSettings()
solver_settings.set_time_limit(5)

これで、設定完了です。（個人的に、Google OR-Toolsのルーティングモジュールと使用感が似ているなと思いました。）

設定が完了したら、実際に計算を行います。具体的には、routing.Solve関数を実行することで、計算を実行できます。

solution = routing.Solve(data_model, solver_settings)

計算が正常に終了したら、結果の取得を行います。結果取得用のコードは以下の通りです。

# Process returned data
if solution.get_status() == 0:  # Success
    print("Cost for the routing in distance: ", solution.get_total_objective())
    print("Vehicle count to complete routing: ", solution.get_vehicle_count())
    
    # Get the routes
    routes_df = solution.get_route()
    
    # Display the routes by vehicle
    for vehicle_id in routes_df['truck_id'].unique().to_pandas():
        vehicle_route = routes_df[routes_df['truck_id'] == vehicle_id]
        route_locations = vehicle_route['route'].to_arrow().to_pylist()
        route_names = [location_names[loc] for loc in route_locations]
        print(f"Vehicle {vehicle_id} route: {' → '.join(route_names)}")
else:
    print("NVIDIA cuOpt Failed to find a feasible solution. Status:", solution.get_status())

# 以下が出力されるはず。
# Cost for the routing in distance:  26.799999594688416
# Vehicle count to complete routing:  2
# Vehicle 0 route: MFC → A → E → C → G → MFC
# Vehicle 1 route: MFC → B → F → D → MFC

また、グラフで出力すると以下のようになります。

グラフ出力コード

# Used to plot the Co-ordinates
def gen_plot(df):
    """
    Generate a plot with the locations.
    
    Args:
        df: DataFrame with location coordinates (xcord, ycord)
        
    Returns:
        matplotlib plot object
    """
    plt.figure(figsize=(11, 11))
    
    # Plot depot (first location)
    plt.scatter(
        df["xcord"][:1],
        df["ycord"][:1],
        label="Depot",
        color="Green",
        marker="o",
        s=100,
    )
    
    # Plot other locations
    plt.scatter(
        df["xcord"][1:],
        df["ycord"][1:],
        label="Locations",
        color="Red",
        marker="o",
        s=100,
    )
    
    plt.xlabel("x - axis")
    plt.ylabel("y - axis")
    plt.title("Simplified Map")
    plt.legend()

    # Add location labels
    for i, label in enumerate(df.index.values):
        plt.annotate(
            label,
            (df["xcord"][i], df["ycord"][i]),
            fontproperties=fm.FontProperties(size=16),
        )
    
    return plt


# Used to plot arrows
def add_arrows(df, route, plt, color="green"):
    """
    Add directional arrows to the plot to represent a route.
    
    Args:
        df: DataFrame with location coordinates
        route: List of location indices representing the route
        plt: matplotlib plot object
        color: Color for the arrows
        
    Returns:
        matplotlib plot object with arrows added
    """
    prev_cord = ()
    for i, label in enumerate(route):
        if i > 0:
            arrow_props = dict(
                arrowstyle="simple, head_length=0.5, head_width=0.5, tail_width=0.15",
                connectionstyle="arc3",
                color=color,
                mutation_scale=20,
                ec="black",
            )
            plt.annotate(
                "",
                xy=(df["xcord"][label], df["ycord"][label]),
                xytext=prev_cord,
                arrowprops=arrow_props,
                label=f"vehicle-{i}",
            )
        prev_cord = df["xcord"][label], df["ycord"][label]

    return plt


# Map vehicle routes
def map_vehicle_routes(df, solution, colors):
    """
    Creates a plot visualizing vehicle routes.
    
    Args:
        df: DataFrame with location coordinates
        solution: cuOpt routing.Solution object
        colors: List of colors to use for different vehicles
    
    Returns:
        matplotlib plot object with routes visualized
    """
    # Initialize plot with locations
    plt = gen_plot(df)
    
    # Get routes from solution
    routes_df = solution.get_route()
    
    # Convert to pandas if it's a cudf DataFrame
    if hasattr(routes_df, 'to_pandas'):
        routes_df = routes_df.to_pandas()
        
    # Get unique vehicle IDs
    vehicle_ids = routes_df['truck_id'].unique()
    
    # Create a mapping from vehicle ID to color index
    vid_map = {vid: i % len(colors) for i, vid in enumerate(vehicle_ids)}
    
    # Draw routes for each vehicle
    for v_id in vehicle_ids:
        v_route = routes_df[routes_df['truck_id'] == v_id]
        route_locs = v_route['route'].tolist()
        plt = add_arrows(
            df, 
            route_locs,
            plt, 
            color=colors[vid_map[v_id]]
        )
    
    return plt

location_coordinates_df = pd.DataFrame(location_coordinates, columns=['xcord', 'ycord'], index=location_names)
vehicle_colors = ["red", "green", "blue"]
map_vehicle_routes(location_coordinates_df, solution, vehicle_colors).show()

ノートブックでは、追加の制約として、車両の最小利用数を3に設定する例が紹介されていました。具体的には、以下のようなコードを記述し設定します。

data_model.set_min_vehicles(n_vehicles)

↑を設定後、再計算を行うと、以下のように、無理やり車両3台利用した結果が出力されます。

まとめ

本記事では、数理最適化のアドベントカレンダーの14日目の記事として、cuOptの概要とCVRPのサンプルコードをGoogle Colabで動かしてみた結果を紹介しました。

サンプルには、他にもTime Windows有のVRPの例⁸や、PuLP経由で数独を解く例⁹等、色々ありますので、気になる方は見てみて下さい。

最適化計算へのアプローチとしてGPUを活用することはとても面白いなあと思った一方で、様々なドキュメントを見る中で感じた個人的な懸念点を以下に記しておきます。

cuOptのVRPベンチマーク計測では、GPUとしてH100が利用されていた¹⁰が、実利用でそんなリッチなハードを用意するのは大変
（専門家の方からしたら当たり前の話かと思いますが、）問題の内容によって向き不向きがあるはずで、他のソルバーと比較し必ずしも性能が良いわけではない
- 例えば、cuOptのLPソルバーの場合、「Mittelmanのベンチマークでは、49の公開インスタンスのうち8つで、収束が遅く、1時間後にタイムアウトしました。」と公式ブログで述べられている¹¹
先述の通り、一部APIの破壊的変更がありうると記載されているため、アップデートにより動かなくなる可能性がある

ただし、注目されていることは間違いない（例えば、2025 COIN-OR Cupを受賞した¹²他、Gurobiのサポートサイトでも「Gurobi is excited about the open-source release of cuOpt’s PDLP」と述べられている¹³）とは思うので、個人的にも今後ウォッチしていきたいと思っております。

ということで以上になります。最後までお読みいただきありがとうございました。まだ、勉強中の身であるため、もし誤りなどありましたらコメントでご指摘お願いいたします。

https://blogs.nvidia.com/blog/cuopt-open-source/?ncid=no-ncid や https://www.coin-or.org/2025/03/18/the-coin-or-foundation-bridging-new-frontiers-in-open-source-optimization-with-nvidia/ を参照 ↩
詳細はソースコードや、以下等の各種ブログやセッションを参照ください：
https://developer.nvidia.com/ja-jp/blog/record-breaking-nvidia-cuopt-algorithms-deliver-route-optimization-solutions-100x-faster/
https://developer.nvidia.com/blog/solve-linear-programs-using-the-gpu-accelerated-barrier-method-in-nvidia-cuopt/
https://www.nvidia.com/en-us/on-demand/session/gtc25-s72290/ ↩
公式HPの該当箇所へのリンクはこちら ↩
https://www.sintef.no/projectweb/top/vrptw/homberger-benchmark/ ↩
https://www.sintef.no/projectweb/top/pdptw/li-lim-benchmark/ ↩
https://miplib.zib.de/instance_details_bts4-cta.html ↩
MFCは、Micro Fulfillment Centersの訳。小型の物流拠点のこと。 ↩
https://github.com/NVIDIA/cuopt-examples/blob/main/last_mile_delivery/cvrptw_service_team_routing.ipynb ↩
https://github.com/NVIDIA/cuopt-examples/blob/main/PuLP_integration_example/Sudoku_pulp.ipynb ↩
https://developer.nvidia.com/ja-jp/blog/record-breaking-nvidia-cuopt-algorithms-deliver-route-optimization-solutions-100x-faster/ ↩
https://developer.nvidia.com/ja-jp/blog/accelerate-large-linear-programming-problems-with-nvidia-cuopt ↩
https://www.coin-or.org/2025/10/26/2025-coin-or-cup-award-nvidia-cuopt/ ↩
https://support.gurobi.com/hc/en-us/articles/360012237852-Does-Gurobi-support-GPUs ↩

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up