Posted at 2020-05-02

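1. Introduction

MLIR stands for Multi-Level Intermediate Representation. Roughly speaking, it is an intermediate-representation framework that can be used at any stage of a compiler. It was created because the rise of deep learning produced many new compilers, each built from scratch, so they took a lot of effort to write and varied widely in quality. MLIR was introduced to solve this problem.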


  • Fortran IR (flang)
  • TensorFlow graph

Note that for TensorFlow Graph, the Dialect definition code lives in the tensorflow repository, not under llvm. It can be found by searching for registerDialect.

1.1. Code size


github.com/AlDanial/cloc v 1.85  T=0.03 s (680.4 files/s, 150292.0 lines/s)
Language                     files          blank        comment           code
C++                              8            350            651           1497
C/C++ Header                     7            243            314            865
CMake                            3              6              0             50
SUM:                            18            599            965           2412


github.com/AlDanial/cloc v 1.85  T=0.73 s (1023.3 files/s, 250817.7 lines/s)
Language                             files          blank        comment           code
C++                                    305          15481          23995          80922
C/C++ Header                           250           6675          13209          20129
Markdown                                41           3604              0          14099
CMake                                  136            395             74           2763
SVG                                      2              0              0            960
Python                                   5            263            346            734
Windows Module Definition                2             12              0            148
vim script                               4             30             43            131
JSON                                     1              1              0            112
Bourne Shell                             3             21             43             39
Lisp                                     1             11             31             37
YAML                                     2              0              1             20
SUM:                                   752          26493          37742         120094

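1.2. Environment setup

On Ubuntu 19.10, MLIR can be built with the steps below. Note, however, that the master branch sometimes fails to build. The steps were confirmed to build at the following commits.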

  • llvmorg-11-init   (2020/01/15)
  • f50bc823fe6f4279eb2f426dd54f3151878c0216 (2020/04/23)
  • 35cf2f42dda4d708741e06570b2dbe91cec4dc41 (2020/04/22)
  • fa284e136e1b67e233f445fcf643eeaa10d6835c (2020/04/20)


The following commit failed to build.

  • 1811061c387baeff59446a090890368da3d86d42 (2020/04/21)
    • The build is probably broken by patch 2eda87dfbe63bae43b81b22c8c76a3139147797b.
$ sudo apt update
$ sudo apt -y upgrade
$ sudo apt install -y cmake
$ sudo apt install -y ninja-build
$ sudo apt install -y g++
$ git clone https://github.com/llvm/llvm-project.git
$ cd llvm-project
$ mkdir build; cd build
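# The option list is cut off in the source. The two options below are an assumption: the minimum needed to build MLIR and the toyc-chN examples.
$ cmake -G Ninja ../llvm \
    -DLLVM_ENABLE_PROJECTS=mlir \
    -DLLVM_BUILD_EXAMPLES=ON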
$ cmake --build . --target check-mlir


-- Testing: 453 tests, 8 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..

Testing Time: 2.53s
  Unsupported Tests:  15
  Expected Passes  : 438

The build time on an AWS EC2 t3.2xlarge instance was as follows.

real    16m14.676s
user    122m37.133s
sys     6m5.102s

1.3. Running the tests


$ cd llvm-project/build

simple.mlir is sample code written in MLIR's LLVM Dialect. LLVM Dialect code is executed with mlir-cpu-runner. The default entry function is main, but a different function can be chosen with -e; for example, -e foo runs the function foo.

$ bin/mlir-cpu-runner ../mlir/test/mlir-cpu-runner/simple.mlir

sgemm_naive_codegen.mlir is converted by mlir-opt from the Affine and Linalg Dialects to the LLVM Dialect, and then executed with mlir-cpu-runner.

$ bin/mlir-opt -convert-linalg-to-loops -lower-affine -convert-loop-to-std -convert-std-to-llvm ../mlir/test/mlir-cpu-runner/sgemm_naive_codegen.mlir | bin/mlir-cpu-runner -O3 -e main -entry-point-result=void -shared-libs=lib/libmlir_runner_utils.so

unranked_memref.mlir is written in the Linalg Dialect, so it is converted to the LLVM Dialect and then executed.

$ bin/mlir-opt -convert-linalg-to-loops -convert-loop-to-std -convert-std-to-llvm ../mlir/test/mlir-cpu-runner/unranked_memref.mlir | bin/mlir-cpu-runner -O3 -e main -entry-point-result=void -shared-libs=lib/libmlir_runner_utils.so

utils.mlir is written in the Linalg Dialect, so it is converted to the LLVM Dialect and then executed. The file defines several functions, so -e print_0d or -e print_2d can be used instead of -e print_1d.

$ bin/mlir-opt -convert-linalg-to-loops -convert-linalg-to-llvm -convert-std-to-llvm ../mlir/test/mlir-cpu-runner/utils.mlir | bin/mlir-cpu-runner -O3 -e print_1d -entry-point-result=void -shared-libs=lib/libmlir_runner_utils.so


func @print_0d() {
  %f = constant 2.00000e+00 : f32
  %A = alloc() : memref<f32>
  store %f, %A[]: memref<f32>
  %U = memref_cast %A :  memref<f32> to memref<*xf32>
  call @print_memref_f32(%U): (memref<*xf32>) -> ()
  dealloc %A : memref<f32>
  return
}

2. Tutorial

Within MLIR, a program is converted through several Dialects, for example Toy, Affine, and LLVM IR, and is finally converted to machine code. Optimization is performed by the passes that run on each Dialect (Canonicalize, Inliner).
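
The staged conversion described above can be pictured with a small stand-alone sketch. This is not MLIR code: Module, Pass, makeLowering, and runPipeline are hypothetical names, and op names such as "affine.mul" only model the change of dialect, not real operations.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR module: just the names of the ops it holds.
struct Module {
  std::vector<std::string> ops;
};

// A pass is a named transformation that rewrites the module in place.
struct Pass {
  std::string name;
  std::function<void(Module &)> run;
};

// Build a pass that moves every op of dialect `from` into dialect `to`,
// e.g. "toy.mul" -> "affine.mul". Real lowerings emit different ops; this
// only models the dialect-by-dialect progression.
inline Pass makeLowering(const std::string &from, const std::string &to) {
  std::string name = from + "-to-" + to;
  return {name, [from, to](Module &m) {
            for (std::string &op : m.ops)
              if (op.compare(0, from.size() + 1, from + ".") == 0)
                op = to + op.substr(from.size());
          }};
}

// Run the passes in order and return the names of the passes that ran.
inline std::vector<std::string> runPipeline(Module &m,
                                            const std::vector<Pass> &passes) {
  std::vector<std::string> log;
  for (const Pass &p : passes) {
    p.run(m);
    log.push_back(p.name);
  }
  return log;
}
```

Running makeLowering("toy", "affine") followed by makeLowering("affine", "llvm") over a module holding "toy.mul" leaves "llvm.mul" behind, which mirrors the Toy, Affine, LLVM order of the tutorial.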

2.1. Chapter 1: From the Toy language to the AST


  • Tensors of rank 2 or less
  • The data type is 64-bit floating point


  • transpose
  • print
$ bin/toyc-ch1 ../mlir/test/Examples/Toy/Ch1/ast.toy -emit=ast

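2.2. Chapter 2: From the AST to MLIR

Following the AST of the previous chapter, this chapter converts the program to MLIR. To do so, it defines the Toy Dialect and the Toy operations, and then generates code.
First, dumpMLIR in toyc sets up the Toy Dialect and invokes MLIR generation: it registers the Toy Dialect and then generates MLIR with mlirGen.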

int dumpMLIR() {
  // Register our Dialect with MLIR.
  mlir::registerDialect<mlir::toy::ToyDialect>();

  mlir::MLIRContext context;

  // Handle '.toy' input to the compiler.
  if (inputType != InputType::MLIR &&
      !llvm::StringRef(inputFilename).endswith(".mlir")) {
    auto moduleAST = parseInputFile(inputFilename);
    if (!moduleAST)
      return 6;
    mlir::OwningModuleRef module = mlirGen(context, *moduleAST);
    if (!module)
      return 1;

    module->dump();
    return 0;
  }

  // Otherwise, the input is '.mlir'.
  llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> fileOrErr =
      llvm::MemoryBuffer::getFileOrSTDIN(inputFilename);
  if (std::error_code EC = fileOrErr.getError()) {
    llvm::errs() << "Could not open input file: " << EC.message() << "\n";
    return -1;
  }

  // Parse the input mlir.
  llvm::SourceMgr sourceMgr;
  sourceMgr.AddNewSourceBuffer(std::move(*fileOrErr), llvm::SMLoc());
  mlir::OwningModuleRef module = mlir::parseSourceFile(sourceMgr, &context);
  if (!module) {
    llvm::errs() << "Error can't load file " << inputFilename << "\n";
    return 3;
  }

  module->dump();
  return 0;
}

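Next, as an extension of the Dialect, the operations are defined with the Operation Definition Specification (ODS). In the code they are defined in Ops.td. The listing below shows the file up to the definition of ConstantOp.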

// Defines the operations of the Toy dialect.

#ifndef TOY_OPS
#define TOY_OPS

include "mlir/IR/OpBase.td"
include "mlir/Interfaces/SideEffects.td"

// Provide a definition of the 'toy' dialect in the ODS framework so that we
// can define our operations.
def Toy_Dialect : Dialect {
  let name = "toy";
  let cppNamespace = "toy";
}

// Base class for toy dialect operations. This operation inherits from the base
// `Op` class in OpBase.td, and provides:
//   * The parent dialect of the operation.
//   * The mnemonic for the operation, or the name without the dialect prefix.
//   * A list of traits for the operation.
class Toy_Op<string mnemonic, list<OpTrait> traits = []> :
    Op<Toy_Dialect, mnemonic, traits>;

// Toy Operations

// We define a toy operation by inheriting from our base 'Toy_Op' class above.
// Here we provide the mnemonic and a list of traits for the operation. The
// constant operation is marked as 'NoSideEffect' as it is a pure operation
// and may be removed if dead.
def ConstantOp : Toy_Op<"constant", [NoSideEffect]> {
  // Provide a summary and description for this operation. This can be used to
  // auto-generate documentation of the operations within our dialect.
  let summary = "constant";
  let description = [{
    Constant operation turns a literal into an SSA value. The data is attached
    to the operation as an attribute. For example:

      %0 = toy.constant dense<[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]>
                        : tensor<2x3xf64>
  }];

  // The constant operation takes an attribute as the only input.
  let arguments = (ins F64ElementsAttr:$value);

  // The constant operation returns a single value of TensorType.
  let results = (outs F64Tensor);

  // Specify a parser and printer method.
  let parser = [{ return ::parseConstantOp(parser, result); }];
  let printer = [{ return ::print(p, *this); }];

  // Add custom build methods for the constant operation. These method populates
  // the `state` that MLIR uses to create operations, i.e. these are used when
  // using `builder.create<ConstantOp>(...)`.
  let builders = [
    // Build a constant with a given constant tensor value.
    OpBuilder<"Builder *builder, OperationState &state, "
              "DenseElementsAttr value", [{
      build(builder, state, value.getType(), value);
    }]>,

    // Build a constant with a given constant floating-point value.
    OpBuilder<"Builder *builder, OperationState &state, double value">
  ];

  // Invoke a static verify method to verify this constant operation.
  let verifier = [{ return ::verify(*this); }];
}

#endif // TOY_OPS



$ bin/toyc-ch2 ../mlir/test/Examples/Toy/Ch2/codegen.toy -emit=mlir


The related source code is as follows (diff between Toy Ch1 and Ch2).

 CMakeLists.txt             |   19 +
 include/CMakeLists.txt     |    1
 include/toy/CMakeLists.txt |    4
 include/toy/Dialect.h      |   45 ++++
 include/toy/Lexer.h        |    4
 include/toy/MLIRGen.h      |   32 +++
 include/toy/Ops.td         |  251 ++++++++++++++++++++++++
 mlir/Dialect.cpp           |  254 +++++++++++++++++++++++++
 mlir/MLIRGen.cpp           |  452 +++++++++++++++++++++++++++++++++++++++++++++
 toyc.cpp                   |   87 +++++++-
 10 files changed, 1136 insertions(+), 13 deletions(-)

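2.3. Chapter 3: High-level optimization in MLIR (1)

One graph-level optimization is the fusion of layer operations. This uses MLIR's own Generic DAG Rewriter infrastructure. Rewrites can be written in two ways: in C++, or with declarative rewrite rules (DRR).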

    // Add a run of the canonicalizer to optimize the mlir module.
    pm.addNestedPass<mlir::FuncOp>(mlir::createCanonicalizerPass());




def TransposeOp : Toy_Op<"transpose", [NoSideEffect]> {
  let summary = "transpose operation";

  let arguments = (ins F64Tensor:$input);
  let results = (outs F64Tensor);

  let assemblyFormat = [{
    `(` $input `:` type($input) `)` attr-dict `to` type(results)
  }];

  // Enable registering canonicalization patterns with this operation.
  let hasCanonicalizer = 1;

  // Allow building a TransposeOp from the input operand.
  let builders = [
    OpBuilder<"Builder *b, OperationState &state, Value input">
  ];

  // Invoke a static verify method to verify this transpose operation.
  let verifier = [{ return ::verify(*this); }];
}


/// Register our patterns as "canonicalization" patterns on the TransposeOp so
/// that they can be picked up by the Canonicalization framework.
void TransposeOp::getCanonicalizationPatterns(OwningRewritePatternList &results,
                                              MLIRContext *context) {
  results.insert<SimplifyRedundantTranspose>(context);
}


/// This is an example of a C++ rewrite pattern for the TransposeOp. It
/// optimizes the following scenario: transpose(transpose(x)) -> x
struct SimplifyRedundantTranspose : public mlir::OpRewritePattern<TransposeOp> {
  /// We register this pattern to match every toy.transpose in the IR.
  /// The "benefit" is used by the framework to order the patterns and process
  /// them in order of profitability.
  SimplifyRedundantTranspose(mlir::MLIRContext *context)
      : OpRewritePattern<TransposeOp>(context, /*benefit=*/1) {}

  /// This method attempts to match a pattern and rewrite it. The rewriter
  /// argument is the orchestrator of the sequence of rewrites. The pattern is
  /// expected to interact with it to perform any changes to the IR from here.
  mlir::LogicalResult
  matchAndRewrite(TransposeOp op,
                  mlir::PatternRewriter &rewriter) const override {
    // Look through the input of the current transpose.
    mlir::Value transposeInput = op.getOperand();
    TransposeOp transposeInputOp =
        llvm::dyn_cast_or_null<TransposeOp>(transposeInput.getDefiningOp());

    // Input defined by another transpose? If not, no match.
    if (!transposeInputOp)
      return failure();

    // Otherwise, we have a redundant transpose. Use the rewriter.
    rewriter.replaceOp(op, {transposeInputOp.getOperand()});
    return success();
  }
};



// Basic Pattern-Match and Rewrite

// Reshape(Reshape(x)) = Reshape(x)
def ReshapeReshapeOptPattern : Pat<(ReshapeOp(ReshapeOp $arg)),
                                   (ReshapeOp $arg)>;
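
To see the shape of such a rewrite without the MLIR infrastructure, the following stand-alone sketch applies the same transpose(transpose(x)) -> x rule to a tiny hand-written tree. Node, value, transpose, simplify, and countTransposes are hypothetical names for this illustration and are not part of MLIR.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical miniature IR: a node is either a named value (no input) or a
// "transpose" applied to exactly one input.
struct Node {
  std::string kind;             // "value" or "transpose"
  std::string name;             // only meaningful for "value" nodes
  std::shared_ptr<Node> input;  // only meaningful for "transpose" nodes
};
using NodePtr = std::shared_ptr<Node>;

inline NodePtr value(const std::string &name) {
  return std::make_shared<Node>(Node{"value", name, nullptr});
}
inline NodePtr transpose(NodePtr in) {
  return std::make_shared<Node>(Node{"transpose", "", std::move(in)});
}

// Rewrite transpose(transpose(x)) to x, working bottom-up.
inline NodePtr simplify(NodePtr n) {
  if (n->kind != "transpose")
    return n;                     // nothing to match on a plain value
  n->input = simplify(n->input);  // simplify the operand first
  if (n->input->kind == "transpose")
    return n->input->input;       // the two transposes cancel out
  return n;
}

// Count the transposes left in a chain, to observe the effect of simplify().
inline int countTransposes(const NodePtr &n) {
  return n->kind == "transpose" ? 1 + countTransposes(n->input) : 0;
}
```

As in matchAndRewrite, the match looks at the producer of the operand and, when it is another transpose, replaces the outer operation with the inner operand. An odd number of stacked transposes leaves exactly one behind.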



$ bin/toyc-ch3 ../mlir/test/Examples/Toy/Ch3/trivial_reshape.toy -emit=mlir -opt


The related source code is as follows (diff between Toy Ch2 and Ch3).

 CMakeLists.txt             |   19 +++++++++---
 include/toy/CMakeLists.txt |    2 -
 include/toy/Ops.td         |   15 ++++++---
 mlir/ToyCombine.cpp        |   69 +++++++++++++++++++++++++++++++++++++++++++++
 mlir/ToyCombine.td         |   62 ++++++++++++++++++++++++++++++++++++++++
 toyc.cpp                   |   50 ++++++++++++++++++++++----------
 6 files changed, 191 insertions(+), 26 deletions(-)

2.4. Chapter 4: High-level optimization in MLIR (2)


    // Inline all functions into main and then delete them.
    pm.addPass(mlir::createInlinerPass());


// ToyInlinerInterface

/// This class defines the interface for handling inlining with Toy
/// operations.
struct ToyInlinerInterface : public DialectInlinerInterface {
  using DialectInlinerInterface::DialectInlinerInterface;

  // Analysis Hooks

  /// All operations within toy can be inlined.
  bool isLegalToInline(Operation *, Region *,
                       BlockAndValueMapping &) const final {
    return true;
  }

  // Transformation Hooks

  /// Handle the given inlined terminator(toy.return) by replacing it with a new
  /// operation as necessary.
  void handleTerminator(Operation *op,
                        ArrayRef<Value> valuesToRepl) const final {
    // Only "toy.return" needs to be handled here.
    auto returnOp = cast<ReturnOp>(op);

    // Replace the values directly with the return operands.
    assert(returnOp.getNumOperands() == valuesToRepl.size());
    for (const auto &it : llvm::enumerate(returnOp.getOperands()))
      valuesToRepl[it.index()].replaceAllUsesWith(it.value());
  }

  /// Attempts to materialize a conversion for a type mismatch between a call
  /// from this dialect, and a callable region. This method should generate an
  /// operation that takes 'input' as the only operand, and produces a single
  /// result of 'resultType'. If a conversion can not be generated, nullptr
  /// should be returned.
  Operation *materializeCallConversion(OpBuilder &builder, Value input,
                                       Type resultType,
                                       Location conversionLoc) const final {
    return builder.create<CastOp>(conversionLoc, resultType, input);
  }
};


// ToyDialect

/// Dialect creation, the instance will be owned by the context. This is the
/// point of registration of custom types and operations for the dialect.
ToyDialect::ToyDialect(mlir::MLIRContext *ctx) : mlir::Dialect("toy", ctx) {
  addOperations<
#define GET_OP_LIST
#include "toy/Ops.cpp.inc"
      >();
  addInterfaces<ToyInlinerInterface>();
}


include "mlir/Interfaces/CallInterfaces.td"
def GenericCallOp : Toy_Op<"generic_call",
    [DeclareOpInterfaceMethods<CallOpInterface>]> {
  let summary = "generic call operation";
  let description = [{
    Generic calls represent calls to a user defined function that needs to
    be specialized for the shape of its arguments. The callee name is attached
    as a symbol reference via an attribute. The arguments list must match the
    arguments expected by the callee. For example:

     %4 = toy.generic_call @my_func(%1, %3)
           : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64>

    This is only valid if a function named "my_func" exists and takes two
    arguments.
  }];

  // The generic call operation takes a symbol reference attribute as the
  // callee, and inputs for the call.
  let arguments = (ins FlatSymbolRefAttr:$callee, Variadic<F64Tensor>:$inputs);

  // The generic call operation returns a single value of TensorType.
  let results = (outs F64Tensor);

  // The return operation only emits the input in the format if it is present.
  let assemblyFormat = [{
    $callee `(` $inputs `)` attr-dict `:` functional-type($inputs, results)
  }];

  // Add custom build methods for the generic call operation.
  let builders = [
    OpBuilder<"OpBuilder &builder, OperationState &state, "
              "StringRef callee, ArrayRef<Value> arguments">
  ];
}

/// Return the callee of the generic call operation, this is required by the
/// call interface.
CallInterfaceCallable GenericCallOp::getCallableForCallee() {
  return getAttrOfType<SymbolRefAttr>("callee");
}

/// Get the argument operands to the called function, this is required by the
/// call interface.
Operation::operand_range GenericCallOp::getArgOperands() { return inputs(); }




$ bin/toyc-ch4 ../mlir/test/Examples/Toy/Ch4/codegen.toy -emit=mlir -opt


--- norm.0      2020-04-30 15:20:47.077823958 +0000
+++ opt.0       2020-04-30 15:20:59.721346678 +0000
@@ -1,20 +1,11 @@

 module {
-  func @multiply_transpose(%arg0: tensor<*xf64>, %arg1: tensor<*xf64>) -> tensor<*xf64> attributes {sym_visibility = "private"} {
-    %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64>
-    %1 = toy.transpose(%arg1 : tensor<*xf64>) to tensor<*xf64>
-    %2 = toy.mul %0, %1 : tensor<*xf64>
-    toy.return %2 : tensor<*xf64>
-  }
   func @main() {
     %0 = toy.constant dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64>
-    %1 = toy.reshape(%0 : tensor<2x3xf64>) to tensor<2x3xf64>
-    %2 = toy.constant dense<[1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00]> : tensor<6xf64>
-    %3 = toy.reshape(%2 : tensor<6xf64>) to tensor<2x3xf64>
-    %4 = toy.generic_call @multiply_transpose(%1, %3) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64>
-    %5 = toy.generic_call @multiply_transpose(%3, %1) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64>
-    toy.print %5 : tensor<*xf64>
+    %1 = toy.transpose(%0 : tensor<2x3xf64>) to tensor<3x2xf64>
+    %2 = toy.mul %1, %1 : tensor<3x2xf64>
+    toy.print %2 : tensor<3x2xf64>
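
In the diff above, the unranked tensor<*xf64> results become tensor<3x2xf64> once the calls are inlined and shapes are propagated from the constant. The following is a minimal sketch of that propagation for the two operations involved, assuming 2-D shapes only. Shape, unranked, inferTranspose, and inferMul are hypothetical names, and returning an unknown shape on a mismatch is a simplification made for this sketch.

```cpp
// Hypothetical 2-D shape; `known == false` plays the role of tensor<*xf64>.
struct Shape {
  bool known;
  int rows;
  int cols;
};

inline Shape unranked() { return {false, 0, 0}; }

// transpose swaps the two dimensions of a known shape.
inline Shape inferTranspose(Shape in) {
  if (!in.known)
    return unranked();
  return {true, in.cols, in.rows};
}

// Element-wise mul keeps the operand shape. In this sketch, an unknown or
// mismatched operand simply yields an unknown result.
inline Shape inferMul(Shape lhs, Shape rhs) {
  if (!lhs.known || !rhs.known)
    return unranked();
  if (lhs.rows != rhs.rows || lhs.cols != rhs.cols)
    return unranked();
  return {true, lhs.rows, lhs.cols};
}
```

Starting from a 2x3 constant, inferTranspose gives 3x2 and inferMul of that value with itself stays 3x2, which matches the types printed in the optimized output.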


The related source code is as follows (diff between Toy Ch3 and Ch4).

 CMakeLists.txt                         |   13 ++-
 include/toy/CMakeLists.txt             |    7 +-
 include/toy/Dialect.h                  |    2
 include/toy/Ops.td                     |   35 +++++++++-
 include/toy/Passes.h                   |   26 +++++++
 include/toy/ShapeInferenceInterface.h  |   28 ++++++++
 include/toy/ShapeInferenceInterface.td |   30 ++++++++
 mlir/Dialect.cpp                       |   81 +++++++++++++++++++++++
 mlir/MLIRGen.cpp                       |    4 +
 mlir/ShapeInferencePass.cpp            |  113 +++++++++++++++++++++++++++++++++
 mlir/ToyCombine.cpp                    |    5 +
 toyc.cpp                               |   13 +++
 12 files changed, 345 insertions(+), 12 deletions(-)

2.5. Chapter 5: Partial lowering to lower-level dialects


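  if (isLoweringToAffine) {
    // Partially lower the toy dialect with a few cleanups afterwards.
    pm.addPass(mlir::toy::createLowerToAffinePass());

    mlir::OpPassManager &optPM = pm.nest<mlir::FuncOp>();
    optPM.addPass(mlir::createCanonicalizerPass());
    optPM.addPass(mlir::createCSEPass());

    // Add optimizations if enabled.
    if (enableOpt) {
      optPM.addPass(mlir::createLoopFusionPass());
      optPM.addPass(mlir::createMemRefDataFlowOptPass());
    }
  }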
The effect of the Affine-level optimization can be checked as follows, by comparing the output with and without -opt.

$ bin/toyc-ch5 ../mlir/test/Examples/Toy/Ch5/affine-lowering.mlir -emit=mlir-affine


$ diff -urpN none.0 opt.0
--- none.0      2020-04-30 01:26:01.689305461 +0000
+++ opt.0       2020-04-30 01:25:52.445313589 +0000
@@ -19,31 +19,22 @@ module {
     %cst_3 = constant 5.000000e+00 : f64
     %cst_4 = constant 6.000000e+00 : f64
     %0 = alloc() : memref<3x2xf64>
-    %1 = alloc() : memref<3x2xf64>
-    %2 = alloc() : memref<2x3xf64>
-    affine.store %cst, %2[0, 0] : memref<2x3xf64>
-    affine.store %cst_0, %2[0, 1] : memref<2x3xf64>
-    affine.store %cst_1, %2[0, 2] : memref<2x3xf64>
-    affine.store %cst_2, %2[1, 0] : memref<2x3xf64>
-    affine.store %cst_3, %2[1, 1] : memref<2x3xf64>
-    affine.store %cst_4, %2[1, 2] : memref<2x3xf64>
+    %1 = alloc() : memref<2x3xf64>
+    affine.store %cst, %1[0, 0] : memref<2x3xf64>
+    affine.store %cst_0, %1[0, 1] : memref<2x3xf64>
+    affine.store %cst_1, %1[0, 2] : memref<2x3xf64>
+    affine.store %cst_2, %1[1, 0] : memref<2x3xf64>
+    affine.store %cst_3, %1[1, 1] : memref<2x3xf64>
+    affine.store %cst_4, %1[1, 2] : memref<2x3xf64>
     affine.for %arg0 = 0 to 3 {
       affine.for %arg1 = 0 to 2 {
-        %3 = affine.load %2[%arg1, %arg0] : memref<2x3xf64>
-        affine.store %3, %1[%arg0, %arg1] : memref<3x2xf64>
-      }
-    }
-    affine.for %arg0 = 0 to 3 {
-      affine.for %arg1 = 0 to 2 {
-        %3 = affine.load %1[%arg0, %arg1] : memref<3x2xf64>
-        %4 = affine.load %1[%arg0, %arg1] : memref<3x2xf64>
-        %5 = mulf %3, %4 : f64
-        affine.store %5, %0[%arg0, %arg1] : memref<3x2xf64>
+        %2 = affine.load %1[%arg1, %arg0] : memref<2x3xf64>
+        %3 = mulf %2, %2 : f64
+        affine.store %3, %0[%arg0, %arg1] : memref<3x2xf64>
     toy.print %0 : memref<3x2xf64>
-    dealloc %2 : memref<2x3xf64>
-    dealloc %1 : memref<3x2xf64>
+    dealloc %1 : memref<2x3xf64>
     dealloc %0 : memref<3x2xf64>
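
The diff above shows two loop nests and a temporary buffer collapsing into one loop nest. As a plain C++ sketch of the same computation (not generated by MLIR; In, Out, unfused, and fused are hypothetical names), the two variants look like this:

```cpp
#include <array>

using In = std::array<std::array<double, 3>, 2>;   // stands for memref<2x3xf64>
using Out = std::array<std::array<double, 2>, 3>;  // stands for memref<3x2xf64>

// Without -opt: transpose into a temporary, then multiply in a second nest.
inline Out unfused(const In &in) {
  Out tmp{}, out{};
  for (int i = 0; i < 3; ++i)
    for (int j = 0; j < 2; ++j)
      tmp[i][j] = in[j][i];
  for (int i = 0; i < 3; ++i)
    for (int j = 0; j < 2; ++j)
      out[i][j] = tmp[i][j] * tmp[i][j];
  return out;
}

// With -opt: a single loop nest, one load per element, no temporary buffer.
inline Out fused(const In &in) {
  Out out{};
  for (int i = 0; i < 3; ++i)
    for (int j = 0; j < 2; ++j) {
      double v = in[j][i];
      out[i][j] = v * v;
    }
  return out;
}
```

Both functions return the same values. The fused form drops one allocation and one of the two loads per element, which corresponds to the removed alloc/dealloc pair and the removed second affine.load in the diff.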


The related source code is as follows (diff between Toy Ch4 and Ch5).

 CMakeLists.txt              |   17 +-
 include/toy/CMakeLists.txt  |    4
 include/toy/Ops.td          |    3
 include/toy/Passes.h        |    5
 mlir/LowerToAffineLoops.cpp |  316 ++++++++++++++++++++++++++++++++++++++++++++
 toyc.cpp                    |   39 ++++-
 6 files changed, 366 insertions(+), 18 deletions(-)

2.6. Chapter 6: Lowering to LLVM IR

The previous chapter showed an example of converting between Dialects. This chapter shows an example of converting to LLVM IR.

  if (isLoweringToLLVM) {
    // Finish lowering the toy IR to the LLVM dialect.
    pm.addPass(mlir::toy::createLowerToLLVMPass());
  }


// ToyToLLVMLoweringPass

namespace {
struct ToyToLLVMLoweringPass
    : public PassWrapper<ToyToLLVMLoweringPass, OperationPass<ModuleOp>> {
  void runOnOperation() final;
};
} // end anonymous namespace

void ToyToLLVMLoweringPass::runOnOperation() {
  // The first thing to define is the conversion target. This will define the
  // final target for this lowering. For this lowering, we are only targeting
  // the LLVM dialect.
  LLVMConversionTarget target(getContext());
  target.addLegalOp<ModuleOp, ModuleTerminatorOp>();

  // During this lowering, we will also be lowering the MemRef types, that are
  // currently being operated on, to a representation in LLVM. To perform this
  // conversion we use a TypeConverter as part of the lowering. This converter
  // details how one type maps to another. This is necessary now that we will be
  // doing more complicated lowerings, involving loop region arguments.
  LLVMTypeConverter typeConverter(&getContext());

  // Now that the conversion target has been defined, we need to provide the
  // patterns used for lowering. At this point of the compilation process, we
  // have a combination of `toy`, `affine`, and `std` operations. Luckily, there
  // already exists a set of patterns to transform `affine` and `std`
  // dialects. These patterns lower in multiple stages, relying on transitive
  // lowerings. Transitive lowering, or A->B->C lowering, is when multiple
  // patterns must be applied to fully transform an illegal operation into a
  // set of legal ones.
  OwningRewritePatternList patterns;
  populateAffineToStdConversionPatterns(patterns, &getContext());
  populateLoopToStdConversionPatterns(patterns, &getContext());
  populateStdToLLVMConversionPatterns(typeConverter, patterns);

  // The only remaining operation to lower from the `toy` dialect, is the
  // PrintOp.
  patterns.insert<PrintOpLowering>(&getContext());

  // We want to completely lower to LLVM, so we use a `FullConversion`. This
  // ensures that only legal operations will remain after the conversion.
  auto module = getOperation();
  if (failed(applyFullConversion(module, target, patterns, &typeConverter)))
    signalPassFailure();
}

/// Create a pass for lowering the remaining `Toy` operations, as
/// well as `Affine` and `Std`, to the LLVM dialect for codegen.
std::unique_ptr<mlir::Pass> mlir::toy::createLowerToLLVMPass() {
  return std::make_unique<ToyToLLVMLoweringPass>();
}
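
The comments in the listing mention transitive (A->B->C) lowering and a full conversion that only succeeds when no illegal operation remains. A minimal stand-alone sketch of that idea follows. isLegal and fullConversion are hypothetical names, and the one-to-one op-name mapping is a deliberate simplification of what real conversion patterns do.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// An op is legal when it belongs to the target dialect, e.g. "llvm.br".
inline bool isLegal(const std::string &op, const std::string &dialect) {
  return op.compare(0, dialect.size() + 1, dialect + ".") == 0;
}

// Rewrite each op through `patterns` until it is legal. Returns false as soon
// as an illegal op has no pattern, which models a failed full conversion.
// Ops visited before the failure keep their rewritten names.
inline bool fullConversion(std::vector<std::string> &ops,
                           const std::map<std::string, std::string> &patterns,
                           const std::string &legalDialect) {
  for (std::string &op : ops) {
    // Follow A->B->C chains; the step bound guards against cyclic patterns.
    for (std::size_t step = 0; step <= patterns.size(); ++step) {
      if (isLegal(op, legalDialect))
        break;
      auto it = patterns.find(op);
      if (it == patterns.end())
        return false;
      op = it->second;
    }
    if (!isLegal(op, legalDialect))
      return false;
  }
  return true;
}
```

With patterns for "affine.for" -> "loop.for" -> "std.br" -> "llvm.br", the first op is legalized in three steps. Leaving out a pattern for "toy.print" makes the whole conversion fail, which is why the pass has to add a lowering for PrintOp as well.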

Note that the toyc command can emit LLVM IR or run the program through the JIT. The example below uses the JIT, but -emit=mlir, -emit=mlir-affine, -emit=mlir-llvm, and -emit=llvm can also be specified.

$ echo 'def main() { print([[1, 2], [3, 4]]); }' | ./bin/toyc-ch6 -emit=jit
1.000000 2.000000
3.000000 4.000000
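
The -emit values form an ordered list of stages, and the isLoweringToAffine and isLoweringToLLVM conditions seen earlier can be derived from that order. The sketch below is modeled on that idea and is not copied from toyc.cpp: parseEmit is a hypothetical helper, and the real tool parses its options with LLVM's command-line library.

```cpp
#include <string>

// Output stages, ordered from least to most lowered. The order matters:
// a later stage implies every earlier lowering step.
enum class Action {
  None, DumpAST, DumpMLIR, DumpMLIRAffine, DumpMLIRLLVM, DumpLLVMIR, RunJIT
};

// Map the value of -emit=<value> to an Action; unknown values give None.
inline Action parseEmit(const std::string &value) {
  if (value == "ast") return Action::DumpAST;
  if (value == "mlir") return Action::DumpMLIR;
  if (value == "mlir-affine") return Action::DumpMLIRAffine;
  if (value == "mlir-llvm") return Action::DumpMLIRLLVM;
  if (value == "llvm") return Action::DumpLLVMIR;
  if (value == "jit") return Action::RunJIT;
  return Action::None;
}

// Every stage from mlir-affine onward needs the Affine lowering.
inline bool isLoweringToAffine(Action a) { return a >= Action::DumpMLIRAffine; }

// Every stage from mlir-llvm onward also needs the LLVM dialect lowering.
inline bool isLoweringToLLVM(Action a) { return a >= Action::DumpMLIRLLVM; }
```

Under this ordering, -emit=jit enables both lowerings, -emit=mlir-affine enables only the first, and -emit=mlir enables neither.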


The related source code is as follows (diff between Toy Ch5 and Ch6).

 CMakeLists.txt              |   22 +++-
 include/toy/CMakeLists.txt  |    4
 include/toy/Ops.td          |    6 -
 include/toy/Passes.h        |    4
 mlir/LowerToAffineLoops.cpp |    1
 mlir/LowerToLLVM.cpp        |  204 ++++++++++++++++++++++++++++++++++++++++++++
 toyc.cpp                    |  132 +++++++++++++++++++++++-----
 7 files changed, 337 insertions(+), 36 deletions(-)

2.7. Chapter 7: Extending the language


The related source code is as follows (diff between Toy Ch6 and Ch7).

 CMakeLists.txt              |   14 --
 include/toy/AST.h           |   90 +++++++++++--
 include/toy/CMakeLists.txt  |    4
 include/toy/Dialect.h       |   54 +++++++
 include/toy/Lexer.h         |   11 +
 include/toy/Ops.td          |   73 +++++++++-
 include/toy/Parser.h        |  258 +++++++++++++++++++++++++++++++++-----
 mlir/Dialect.cpp            |  264 +++++++++++++++++++++++++++++++++++---
 mlir/LowerToAffineLoops.cpp |    1
 mlir/MLIRGen.cpp            |  298 ++++++++++++++++++++++++++++++++++++++------
 mlir/ToyCombine.cpp         |   18 ++
 parser/AST.cpp              |   49 ++++++-
 toyc.cpp                    |    1
 13 files changed, 1002 insertions(+), 133 deletions(-)


2.8. Source code layout


  • CMakeLists.txt
  • toyc.cpp (Ch1-7) compiler driver
  • include/
    • toy/
      • CMakeLists.txt
      • AST.h (Ch1-7) header for the Toy language AST
      • Dialect.h (Ch2-7) header for the Dialect
      • Lexer.h (Ch1-7) header for the lexer
      • MLIRGen.h (Ch2-7) header for MLIR generation
      • Parser.h (Ch1-7) header for the parser
      • Passes.h (Ch4-7) header for the passes
      • Ops.td (Ch2-7) TableGen definition file for the operations
      • ShapeInferenceInterface.h (Ch4-7) header for ShapeInferenceInterface
      • ShapeInferenceInterface.td (Ch4-7) TableGen definition of ShapeInferenceInterface
  • parser/
    • AST.cpp (Ch1-7) Toy language AST code
  • mlir/
    • Dialect.cpp (Ch2-7) Dialect code
    • LowerToLLVM.cpp (Ch6-7) lowering to LLVM
    • ShapeInferencePass.cpp (Ch4-7) ShapeInferencePass code
    • ToyCombine.td (Ch3-7) TableGen definition file for ToyCombine
    • LowerToAffineLoops.cpp (Ch5-7) lowering to Affine loops
    • MLIRGen.cpp (Ch2-7) MLIR generation code
    • ToyCombine.cpp (Ch3-7) ToyCombine optimization code
