Profiling OpenFOAM (perf edition)

Posted at 2017-04-25

Last time I tried to use gprof and it did not work out, so this time I will use perf.

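Because perf is tied directly to the kernel it is easy to use, but for the same reason it may not be available on unusual kernels (on supercomputers, for example). I will accept that limitation here.
As for how to use perf itself, others have already written good summaries, so please refer to those.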

To start with, as in the previous article, I build with -g so that debug symbols are included, and then run the solver (see footnote 1):

$ source ~/OpenFOAM/OpenFOAM-dev/etc/bashrc
$ cd ~/OpenFOAM-BenchmarkTest/channelReTau110/NoBatch-mesh_3M/cases/mpi_0001-method_scotch
$ perf record pimpleFoam
$ perf report

The result was as follows.

  24.15%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam11DICSmoother6smoothERNS_5FieldIdEERKS2_hi
  14.79%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam9lduMatrix8residualERNS_5FieldIdEERKS2_S5_RKNS_10FieldFieldIS1_dEERKN
  11.36%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam9lduMatrix4AmulERNS_5FieldIdEERKNS_3tmpIS2_EERKNS_10FieldFieldIS1_dEE
   3.10%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam10GAMGSolver5scaleERNS_5FieldIdEES3_RKNS_9lduMatrixERKNS_10FieldField
   2.89%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam18DILUPreconditioner13preconditionTERNS_5FieldIdEERKS2_h
   2.87%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam18DILUPreconditioner12preconditionERNS_5FieldIdEERKS2_h
   2.18%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam9lduMatrix4TmulERNS_5FieldIdEERKNS_3tmpIS2_EERKNS_10FieldFieldIS1_dEE
   1.58%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam5PBiCG5solveERNS_5FieldIdEERKS2_h
   1.51%  pimpleFoam  [kernel]                              [k] 0xffffffffb6a3b457
   1.26%  pimpleFoam  libfiniteVolume.so                    [.] _ZN4Foam26surfaceInterpolationSchemeINS_6VectorIdEEE14dotInterpolateINS_14Geom
   1.24%  pimpleFoam  libfiniteVolume.so                    [.] _ZN4Foam2fv9gaussGradIdE5gradfERKNS_14GeometricFieldIdNS_13fvsPatchFieldENS_11
   1.10%  pimpleFoam  libfiniteVolume.so                    [.] _ZN4Foam2fv9gaussGradINS_6VectorIdEEE5gradfERKNS_14GeometricFieldIS3_NS_13fvsP
   1.04%  pimpleFoam  libOpenFOAM.so                        [.] _ZN4Foam8multiplyERNS_5FieldIdEERKNS_5UListIdEES6_
   1.01%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam17GAMGAgglomeration12prolongFieldIdEEvRNS_5FieldIT_EERKS4_ib.constpro
   0.97%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam10GAMGSolver5solveERNS_5FieldIdEERKS2_h
   0.96%  pimpleFoam  pimpleFoam                            [.] _ZN4Foam5FieldIdEaSERKS1_
   0.96%  pimpleFoam  libOpenFOAM.so                        [.] _ZN4Foam17DICPreconditioner15calcReciprocalDERNS_5FieldIdEERKNS_9lduMatrixE
   0.94%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam17GAMGAgglomeration13restrictFieldIdEEvRNS_5FieldIT_EERKS4_ib.constpr
   0.78%  pimpleFoam  pimpleFoam                            [.] _ZNK4Foam8fvMatrixINS_6VectorIdEEE1HEv
   0.72%  pimpleFoam  libfiniteVolume.so                    [.] _ZN4Foam26surfaceInterpolationSchemeIdE14dotInterpolateINS_17geometricOneField
   0.71%  pimpleFoam  libOpenFOAM.so                        [.] _ZN4Foam18DILUPreconditioner15calcReciprocalDERNS_5FieldIdEERKNS_9lduMatrixE
   0.58%  pimpleFoam  [kernel]                              [.] 0xffffffffb6e9b1d7
   0.57%  pimpleFoam  libOpenFOAM.so                        [.] _ZN4Foam6divideERNS_5FieldIdEERKNS_5UListIdEES6_
   0.53%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam9lduMatrix4sumAERNS_5FieldIdEERKNS_10FieldFieldIS1_dEERKNS_8UPtrListI
   0.52%  pimpleFoam  pimpleFoam                            [.] _ZN4Foam8subtractIddNS_13fvsPatchFieldENS_11surfaceMeshEEEvRNS_14GeometricFiel
   0.52%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam10GAMGSolver6VcycleERKNS_7PtrListINS_9lduMatrix8smootherEEERNS_5Field
   0.50%  pimpleFoam  pimpleFoam                            [.] _ZNK4Foam9lduMatrix1HINS_6VectorIdEEEENS_3tmpINS_5FieldIT_EEEERKS7_
   0.47%  pimpleFoam  pimpleFoam                            [.] _ZNK4Foam5FieldINS_6VectorIdEEE9componentEh
   0.46%  pimpleFoam  pimpleFoam                            [.] _ZN4Foam3fvc16surfaceIntegrateIdEEvRNS_5FieldIT_EERKNS_14GeometricFieldIS3_NS_
   0.44%  pimpleFoam  libfiniteVolume.so                    [.] _ZN4Foam26surfaceInterpolationSchemeINS_6TensorIdEEE14dotInterpolateINS_14Geom
   0.44%  pimpleFoam  libOpenFOAM.so                        [.] _ZN4Foam7sumProdIdEEdRKNS_5UListIT_EES5_
   0.41%  pimpleFoam  libfiniteVolume.so                    [.] _ZN4Foam8multiplyINS_6VectorIdEENS_12fvPatchFieldENS_7volMeshEEEvRNS_14Geometr
   0.40%  pimpleFoam  libturbulenceModels.so                [.] _ZN4Foam3fvc16surfaceIntegrateINS_6VectorIdEEEEvRNS_5FieldIT_EERKNS_14Geometri
   0.37%  pimpleFoam  pimpleFoam                            [.] _ZN4Foam5FieldIdEaSERKNS_3tmpIS1_EE
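
As a rough sketch of how to read a list like this at a coarser level, the Overhead column can be summed per shared object. The snippet below assumes the report has been saved as text (for example with `perf report --stdio`); the helper name `sum_by_object` is hypothetical, and the sample input is five rows copied from the listing above.

```shell
# Sum the Overhead column of a flat perf report per shared object.
# sum_by_object is a hypothetical helper name, not part of perf.
sum_by_object() {
    # Only rows whose first column is a percentage are counted.
    # Column 3 of the flat report is the shared object.
    awk '$1 ~ /^[0-9.]+%$/ {
             sub(/%/, "", $1)
             total[$3] += $1
         }
         END {
             for (dso in total) printf "%6.2f%%  %s\n", total[dso], dso
         }' | sort -rn
}

# Five rows taken from the report above
sum_by_object <<'EOF'
  24.15%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam11DICSmoother6smoothERNS_5FieldIdEERKS2_hi
   1.58%  pimpleFoam  libOpenFOAM.so                        [.] _ZNK4Foam5PBiCG5solveERNS_5FieldIdEERKS2_h
   1.26%  pimpleFoam  libfiniteVolume.so                    [.] _ZN4Foam26surfaceInterpolationSchemeINS_6VectorIdEEE14dotInterpolateINS_14Geom
   1.04%  pimpleFoam  libOpenFOAM.so                        [.] _ZN4Foam8multiplyERNS_5FieldIdEERKNS_5UListIdEES6_
   0.96%  pimpleFoam  pimpleFoam                            [.] _ZN4Foam5FieldIdEaSERKS1_
EOF
```

For these five rows it reports 26.77% for libOpenFOAM.so, 1.26% for libfiniteVolume.so and 0.96% for pimpleFoam. Fed the whole report, the same function gives a quick picture of which library dominates.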

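For some reason the symbol names are shown mangled. These are ordinary C++ symbols (extern "C" would have prevented the mangling, not caused it), so presumably this perf build is simply not demangling them.
The names are still readable enough to tell what each function is, so I will leave it at that.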
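
If a readable name is needed, a mangled symbol can be passed through c++filt, which is part of GNU binutils and is also used later in this article. The following is a minimal sketch; the helper name `demangle` is hypothetical, and it assumes c++filt is on the PATH (otherwise it passes the names through unchanged).

```shell
# Demangle C++ symbol names with c++filt (GNU binutils).
# demangle is a hypothetical helper name used only for this sketch.
demangle() {
    if command -v c++filt >/dev/null 2>&1; then
        # c++filt reads names from stdin when none are given as arguments
        printf '%s\n' "$@" | c++filt
    else
        echo "c++filt not found; printing names unchanged" >&2
        printf '%s\n' "$@"
    fi
}

# The top entry of the flat profile above
demangle _ZNK4Foam11DICSmoother6smoothERNS_5FieldIdEERKS2_hi
```

With GNU c++filt this should print `Foam::DICSmoother::smooth(Foam::Field<double>&, Foam::Field<double> const&, unsigned char, int) const`.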

Next, to look at the call graph, I run:

$ perf record -g pimpleFoam
$ perf report -g -G

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 47277
#
# Samples: 3M of event 'cycles:ppp'
# Event count (approx.): 3373946376318
#
# Children      Self  Command     Shared Object                         Symbol                                                                                         
# ........  ........  ..........  ....................................  ............................................................................................................................................................................................................................................................................................
#
    25.64%     0.00%  pimpleFoam  libfiniteVolume.so                    [.] _ZN4Foam8fvMatrixIdE15solveSegregatedERKNS_10dictionaryE
            |
            ---_ZN4Foam8fvMatrixIdE15solveSegregatedERKNS_10dictionaryE
               |          
                --25.56%--_ZNK4Foam10GAMGSolver5solveERNS_5FieldIdEERKS2_h
                          |          
                           --24.60%--_ZNK4Foam10GAMGSolver6VcycleERKNS_7PtrListINS_9lduMatrix8smootherEEERNS_5FieldIdEERKS8_S9_S9_S9_S9_S9_RNS1_IS8_EESD_h
                                     |          
                                      --24.07%--_ZNK4Foam11DICSmoother6smoothERNS_5FieldIdEERKS2_hi

    25.64%     0.00%  pimpleFoam  pimpleFoam                            [.] _ZN4Foam8fvMatrixIdE5solveERKNS_10dictionaryE
            |
            ---_ZN4Foam8fvMatrixIdE5solveERKNS_10dictionaryE
               |          
                --25.64%--_ZN4Foam8fvMatrixIdE15solveSegregatedERKNS_10dictionaryE
                          |          
                           --25.56%--_ZNK4Foam10GAMGSolver5solveERNS_5FieldIdEERKS2_h
                                     |          
                                      --24.60%--_ZNK4Foam10GAMGSolver6VcycleERKNS_7PtrListINS_9lduMatrix8smootherEEERNS_5FieldIdEERKS8_S9_S9_S9_S9_S9_RNS1_IS8_EESD_h
                                                |          
                                                 --24.07%--_ZNK4Foam11DICSmoother6smoothERNS_5FieldIdEERKS2_hi

    25.64%     0.00%  pimpleFoam  [unknown]                             [.] 0x00007ffd70006465
            |
            ---0x7ffd70006465
               _ZN4Foam8fvMatrixIdE5solveERKNS_10dictionaryE
               |          
                --25.64%--_ZN4Foam8fvMatrixIdE15solveSegregatedERKNS_10dictionaryE
                          |          
                           --25.56%--_ZNK4Foam10GAMGSolver5solveERNS_5FieldIdEERKS2_h
                                     |          
                                      --24.60%--_ZNK4Foam10GAMGSolver6VcycleERKNS_7PtrListINS_9lduMatrix8smootherEEERNS_5FieldIdEERKS8_S9_S9_S9_S9_S9_RNS1_IS8_EESD_h
                                                |          
                                                 --24.07%--_ZNK4Foam11DICSmoother6smoothERNS_5FieldIdEERKS2_hi

This output is hard to follow, so let's visualize it.

$ sudo apt install graphviz
$ git clone https://github.com/jrfonseca/gprof2dot.git
$ perf script | c++filt > perf.script
$ gprof2dot/gprof2dot.py perf.script -f perf -n1 -e1 -w | dot -Tsvg -o output.svg
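
The pipeline above can be wrapped in a small function, sketched here under a few assumptions: the name `perf_to_svg` and its argument order are hypothetical, and the paths are whatever was used when recording and cloning. In gprof2dot, `-f perf` selects the perf input format, `-n1` and `-e1` hide nodes and edges below 1% of the total, and `-w` wraps long function names.

```shell
# Convert a perf.data file into an SVG call graph.
# perf_to_svg and its argument order are hypothetical, for illustration only.
# Usage: perf_to_svg perf.data gprof2dot/gprof2dot.py output.svg
perf_to_svg() {
    data=$1
    gprof2dot=$2
    out=$3

    # Fail early with a clear message instead of in the middle of the pipe
    [ -f "$data" ] || { echo "no such file: $data" >&2; return 1; }
    [ -f "$gprof2dot" ] || { echo "no such file: $gprof2dot" >&2; return 1; }
    for tool in perf c++filt dot; do
        command -v "$tool" >/dev/null 2>&1 ||
            { echo "missing tool: $tool" >&2; return 1; }
    done

    # Dump the samples as text, demangle the C++ names,
    # build the graph description, and render it with Graphviz.
    perf script -i "$data" |
        c++filt |
        "$gprof2dot" -f perf -n1 -e1 -w |
        dot -Tsvg -o "$out"
}
```

Checking the inputs first is a design choice: a missing file or tool otherwise shows up only as an empty or broken SVG at the end of the pipe.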

The graph contains a lot of nodes, but the large ones appear to be the following.

(Figure: image.png)

  • 26%: Foam::fvMatrix::solve
  • 15%: Foam::lduMatrix::residual()

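So for now, it is clear that Foam::fvMatrix::solve is the place to start looking.
OpenFOAM is a large application, so getting cleaner, finer-grained information out of a profiler looks difficult.

Now that the target is known, from here on I will dig deeper by reading the source code and inserting timers.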


  1. When I tried it later, it turned out that the result was the same with or without debug symbols.