DRP-AI Series / Kakip Advent Calendar 2024

Day 5

DRP-AI Diary, Part 5: Trying Out OpenBLAS

Posted at 2024-12-28

I tried running OpenBLAS on Kakip.

DRP-AI Series / Kakip

Steps to run Nx with OpenBLAS

sudo apt update
sudo apt install libopenblas-dev
git clone https://github.com/zacky1972/nx_sgemm.git
cd nx_sgemm/
mix deps.get
mix test

Benchmark

mix new nx_sgemm_bench_openblas
mix.exs
defmodule NxSgemmBenchOpenblas.MixProject do
  use Mix.Project

  def project do
    [
      app: :nx_sgemm_bench_openblas,
      version: "0.1.0",
      elixir: "~> 1.18",
      start_permanent: Mix.env() == :prod,
      deps: deps()
    ]
  end

  # Run "mix help compile.app" to learn about applications.
  def application do
    [
      extra_applications: [:logger]
    ]
  end

  # Run "mix help deps" to learn about dependencies.
  defp deps do
    [
      # {:dep_from_hexpm, "~> 0.3.0"},
      # {:dep_from_git, git: "https://github.com/elixir-lang/my_dep.git", tag: "0.1.0"}
      {:nx_sgemm, github: "zacky1972/nx_sgemm", branch: "main"},
      {:benchee, "~> 1.0", only: :dev}
    ]
  end
end
mix deps.get 
mix compile
benchmark.exs
Benchee.run(
  %{
    # Both jobs multiply the input tensor by the scalar 2.0.
    "Nx" => fn input -> Nx.multiply(input, 2.0) end,
    "OpenBLAS" => fn input -> NxSgemm.multiply(input, 2.0) end
  },
  # Nx.iota/1 produces integers; multiplying by 1.0 turns each input into a float tensor.
  inputs: %{
    "Small" => Nx.iota({1_000}) |> Nx.multiply(1.0),
    "Medium" => Nx.iota({10_000}) |> Nx.multiply(1.0),
    "Bigger" => Nx.iota({100_000}) |> Nx.multiply(1.0)
  }
)
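
As a rough sanity check on how long this script should take, here is a small sketch of the arithmetic. It assumes the 2 s warmup and 5 s measurement time that Benchee prints in its run configuration, and that every job runs once per input. The variable names are mine, not part of Benchee.

```elixir
# Values assumed from Benchee's printed run configuration: 2 s warmup, 5 s measurement.
warmup_s = 2
time_s = 5

# Job and input names as they appear in benchmark.exs.
jobs = ["Nx", "OpenBLAS"]
inputs = ["Small", "Medium", "Bigger"]

# Every job runs once per input, and each run pays warmup plus measurement time.
scenarios = length(jobs) * length(inputs)
estimated_s = scenarios * (warmup_s + time_s)

IO.puts("#{scenarios} scenarios, about #{estimated_s} s in total")
```

This prints 6 scenarios and about 42 s, which agrees with the "Estimated total run time: 42 s" line in the Benchee output.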

Results on Kakip

$ mix run -r benchmark.exs
Operating System: Linux
CPU Information: Unrecognized processor
Number of Available Cores: 4
Available memory: 7.02 GB
Elixir 1.18.1
Erlang 27.2
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: Bigger, Medium, Small
Estimated total run time: 42 s

Benchmarking Nx with input Bigger ...
Benchmarking Nx with input Medium ...
Benchmarking Nx with input Small ...
Benchmarking OpenBLAS with input Bigger ...
Benchmarking OpenBLAS with input Medium ...
Benchmarking OpenBLAS with input Small ...
Calculating statistics...
Formatting results...

##### With input Bigger #####
Name               ips        average  deviation         median         99th %
OpenBLAS        2.83 K        0.35 ms     ±8.04%        0.35 ms        0.44 ms
Nx           0.00883 K      113.22 ms     ±5.90%      108.35 ms      122.41 ms

Comparison: 
OpenBLAS        2.83 K
Nx           0.00883 K - 320.41x slower +112.87 ms

##### With input Medium #####
Name               ips        average  deviation         median         99th %
OpenBLAS       26.88 K      0.0372 ms    ±38.05%      0.0351 ms      0.0616 ms
Nx             0.111 K        9.00 ms     ±2.23%        8.90 ms        9.47 ms

Comparison: 
OpenBLAS       26.88 K
Nx             0.111 K - 241.77x slower +8.96 ms

##### With input Small #####
Name               ips        average  deviation         median         99th %
OpenBLAS      136.96 K        7.30 μs   ±337.95%        6.42 μs       65.63 μs
Nx              1.10 K      911.64 μs     ±3.22%      906.96 μs     1017.12 μs

Comparison: 
OpenBLAS      136.96 K
Nx              1.10 K - 124.86x slower +904.34 μs

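When running on the CPU, using OpenBLAS gives roughly the following speedups over plain Nx:

  • Bigger (100,000 elements): 320.41x
  • Medium (10,000 elements): 241.77x
  • Small (1,000 elements): 124.86x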
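
To see where these factors come from, the sketch below recomputes them from the average times in the Benchee output and also derives a rough throughput in millions of elements per second. The averages are copied by hand from the table and are already rounded, so the ratios only approximately match what Benchee reports (Bigger, for example, comes out a little above 320.41x because 0.35 ms is a rounded value). The tuple layout and variable names are my own choices for illustration.

```elixir
# Averages copied from the Benchee output, converted to microseconds.
# Tuple layout (an assumption for this sketch):
# {input name, element count, Nx average in μs, OpenBLAS average in μs}
results = [
  {"Bigger", 100_000, 113_220.0, 350.0},
  {"Medium", 10_000, 9_000.0, 37.2},
  {"Small", 1_000, 911.64, 7.30}
]

speedups =
  for {name, elements, nx_us, openblas_us} <- results do
    # How many times longer Nx takes than OpenBLAS on the same input.
    speedup = Float.round(nx_us / openblas_us, 1)

    # Elements per microsecond is the same as millions of elements per second.
    nx_rate = Float.round(elements / nx_us, 2)
    openblas_rate = Float.round(elements / openblas_us, 1)

    IO.puts("#{name}: #{speedup}x, Nx #{nx_rate} M elem/s, OpenBLAS #{openblas_rate} M elem/s")
    {name, speedup}
  end
```

Looking at throughput rather than the ratio also shows why the gap widens with size: Nx stays at around one million elements per second for all three inputs, while the OpenBLAS rate rises as the input grows.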