
Converting floating-point values to integers without making demons fly out of your nose (C++)

Last updated at 2020-06-13, posted at 2020-06-13

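What is this?

I previously wrote an article titled "Checking whether a floating-point value is an integer (C++, Ruby, JS, Go)", but what it said about C/C++ was flawed. This article fills in the correction.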

A bad example

Let's start with a bad example.

c++

#include <cstdio>
#include <ios>
#include <iostream>

int main() {
  float f = 0xffffffff;
  int n = (int)f;
  printf("(int)%.1f == %d, %d\n", f, n, (int)f);
  std::cout << std::fixed             //
            << "(int)" << f << " == " //
            << n << ", " << (int)f    //
            << std::endl;
}

What do you think happens when you run this code?

Output

$ clang++ -std=c++17 -O0 main.cpp && ./a.out
(int)4294967296.0 == -2147483648, -2147483648
(int)4294967296.000000 == -2147483648, -2147483648

$ clang++ -std=c++17 -O2 main.cpp && ./a.out
(int)4294967296.0 == -437462840, -437462824
(int)4294967296.000000 == 73896, 73896

$ g++-9 -std=c++17 -O0 main.cpp && ./a.out
(int)4294967296.0 == -2147483648, -2147483648
(int)4294967296.000000 == -2147483648, -2147483648

$ g++-9 -std=c++17 -O2 main.cpp && ./a.out
(int)4294967296.0 == 2147483647, 2147483647
(int)4294967296.000000 == 2147483647, 2147483647

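It is a mess.

The worst offender is clang++ with -O2:

The result changes on every run
The value stored in a variable differs from the value of the inline cast
printf and cout print different results

Amazing, in its own way.

This fantastic behavior is, of course, what you get when undefined behavior gets the compiler excited and it starts generating code creatively.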

What is wrong

Looking at the standard

N4659 (the C++17 working draft) says:

7.10 Floating-integral conversions
(略)
The behavior is undefined if the truncated value cannot be represented
in the destination type.

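In other words, if the value left after discarding the fractional part cannot be represented in the destination type, the behavior is undefined. The program is allowed to make demons fly out of your nose.

So

Because this is undefined behavior, the following code from the earlier article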

c++

#include <cstdint>
bool is_int32( double x ){
  return static_cast<std::int32_t>(x)==x;
}

is broken as well.
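For this particular pair of types, a range check in front of the cast is enough, because both bounds of std::int32_t fit exactly in a double. The following is only a minimal sketch, assuming double has at least 32 significand bits (true for IEEE 754 binary64); it does not generalize to narrower floating-point types or wider integer types, which is what the rest of this article deals with.

```cpp
#include <cstdint>
#include <limits>

// Sketch of a repaired is_int32 (C++14 or later).
// Assumption: double can hold every std::int32_t value exactly.
constexpr bool is_int32(double x) {
  static_assert(std::numeric_limits<double>::digits >= 32,
                "double must represent every int32_t exactly");
  constexpr double lo = std::numeric_limits<std::int32_t>::lowest();
  constexpr double hi = std::numeric_limits<std::int32_t>::max();
  // Negated form: NaN fails every comparison, so it is rejected here.
  if (!(lo <= x && x <= hi)) {
    return false;
  }
  // x is inside [lo, hi], so the truncated value is representable.
  return static_cast<std::int32_t>(x) == x;
}

static_assert(is_int32(2147483647.0), "INT32_MAX is an int32 value");
static_assert(!is_int32(2147483648.0), "one past INT32_MAX is rejected");
```

The range check runs before the cast, so the cast is only evaluated for values whose truncation is guaranteed to fit.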

So what should we do instead?

The goal is to implement the following:

/** Checks whether v, a value of from_type, can be converted exactly to to_type */
template< typename to_type, typename from_type >
constexpr bool //
can_represent_by(from_type v);

from_type is one of float, double, or long double.
to_type is a built-in integer type (enum and bool are not supported).

Try 1: an attempt using numeric_limits max and friends

numeric_limits::max looks like it should do the job, right?

c++17

#include <limits>
#include <type_traits>

template <typename to_type, typename from_type>
constexpr bool //
can_represent_by(from_type v) {
  using to_lim = std::numeric_limits<to_type>;
  if (v < to_lim::lowest() || to_lim::max() < v) {
    return false;
  }
  return static_cast<to_type>(v) == v;
}

But this does not work. For example,

c++

can_represent_by<std::uint32_t>((float)(1ULL << 32));

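is undefined behavior.

It is undefined because to_lim::max() < v does not behave as intended.
to_lim::max() is 0xffffffff. Since v is a float, the operands are converted to a common type before the comparison; under the usual arithmetic conversions both sides become float.
0xffffffff cannot be represented exactly as a float and, with round-to-nearest, becomes (float)(1ULL << 32). So to_lim::max() < v is false and execution reaches the static_cast. Cue the nasal demons.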
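The rounding can be checked at compile time. This is a small illustration, assuming float is IEEE 754 binary32 and that the compiler rounds to nearest when it converts the integer; the names u32_max and v are only for this example.

```cpp
#include <cstdint>
#include <limits>

// Assumption: float is IEEE 754 binary32 (24 significand bits).
constexpr std::uint32_t u32_max = std::numeric_limits<std::uint32_t>::max();
constexpr float v = static_cast<float>(1ULL << 32);  // 4294967296.0f

// 0xffffffff needs 32 significant bits, so it rounds up to 2^32 as a float.
static_assert(static_cast<float>(u32_max) == 4294967296.0f,
              "UINT32_MAX rounds to 2^32 when converted to float");

// The comparison from Try 1 converts u32_max to float first,
// so it compares 2^32 < 2^32 and the out-of-range value slips through.
static_assert(!(u32_max < v), "the range check fails to reject v");
```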

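Try 2: doing it properly

Converting v to long double before the comparison would make the check correct (as long as long double has at least as many significand bits as the integer type, which rules out 128-bit integers). But some implementations are painfully slow at floating-point arithmetic on anything other than float, so I would rather avoid that.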
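For reference, the long double variant mentioned above could look roughly like this. It is a sketch, not the approach this article ends up taking; the name can_represent_by_ld is made up for the illustration, and the static_assert encodes the assumption that long double can hold every value of to_type exactly.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical name; compares in long double instead of from_type.
template <typename to_type, typename from_type>
constexpr bool can_represent_by_ld(from_type v) {
  using to_lim = std::numeric_limits<to_type>;
  static_assert(to_lim::is_integer, "to_type should be integer");
  // Assumption: long double is wide enough for to_type.
  // This fails, for example, for 64-bit integers where long double == double.
  static_assert(std::numeric_limits<long double>::digits >= to_lim::digits,
                "long double cannot hold every to_type value exactly");
  const long double w = v;  // widening a floating-point value is exact
  // Negated form so that NaN is rejected as well.
  if (!(to_lim::lowest() <= w && w <= to_lim::max())) {
    return false;
  }
  return static_cast<to_type>(v) == v;
}

static_assert(!can_represent_by_ld<std::uint32_t>(4294967296.0f),
              "2^32 is now rejected before the cast");
```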

So I wrote it out properly.

c++17

#include <limits>
#include <type_traits>

template <typename to_type, typename from_type>
constexpr bool //
can_represent_by(from_type v) {
  using from_lim = std::numeric_limits<from_type>;
  using to_lim = std::numeric_limits<to_type>;
  static_assert(from_lim::is_iec559, "from_type should be IEEE754 type");
  static_assert(from_lim::radix == 2, "radix should be 2");
  static_assert(to_lim::is_integer, "to_type should be integer");
  static_assert(to_lim::radix == 2, "radix should be 2");
  auto diff_digits = to_lim::digits - from_lim::digits;
  auto raw_lo = to_lim::lowest();
  auto raw_hi = to_lim::max();
  auto mask = 0 < diff_digits //
                  ? ~((to_type(1) << diff_digits) - 1)
                  : ~to_type(0);
  // 'lo' is the smallest value of from_type that can be represented by to_type
  auto lo = raw_lo & mask;
  // 'hi' is the largest value of from_type that can be represented by to_type
  auto hi = raw_hi & mask;
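  // Written in negated form so that NaN, for which every comparison is false,
  // is rejected here instead of reaching the static_cast below.
  if (!(lo <= v && v <= hi)) {
    return false;
  }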
  return v == static_cast<to_type>(v);
}

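Long.

diff_digits is the difference between the number of binary digits the destination type and the source type can represent (to_lim::digits - from_lim::digits). When it is zero or negative, to_lim::max() converts to from_type exactly, so the naive to_lim::max() < v is fine. When it is positive, as with float to int32_t, to_lim::max() is rounded during the conversion and to_lim::max() < v no longer means what it appears to mean.

What we need instead are the largest and smallest members of the set of values that both from_type and to_type can represent exactly. Those are hi and lo.
The code assumes that signed integers use two's complement, so it probably will not work on implementations where they do not. I have never seen such an implementation, so I am not too worried.

Any value between lo and hi (inclusive) can be cast naively without undefined behavior.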
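To make the numbers concrete, here is the same calculation worked through for float to std::int32_t, checked at compile time. It assumes float is IEEE 754 binary32; the constants are at namespace scope only for the sake of the example.

```cpp
#include <cstdint>
#include <limits>

// Assumption: float is IEEE 754 binary32 (24 significand bits).
constexpr int diff_digits = std::numeric_limits<std::int32_t>::digits  // 31
                            - std::numeric_limits<float>::digits;      // 24
static_assert(diff_digits == 7, "int32_t has 7 more digits than float");

// Clear the low 7 bits, which a float cannot hold at this magnitude.
constexpr std::int32_t mask = ~((std::int32_t(1) << diff_digits) - 1);
constexpr std::int32_t lo = std::numeric_limits<std::int32_t>::lowest() & mask;
constexpr std::int32_t hi = std::numeric_limits<std::int32_t>::max() & mask;

static_assert(lo == std::numeric_limits<std::int32_t>::lowest(),
              "-2^31 already has its low bits clear");
static_assert(hi == 0x7fffff80, "hi is 2147483520");

// hi converts to float exactly, whereas INT32_MAX rounds up to 2^31.
static_assert(static_cast<float>(hi) == 2147483520.0f, "hi is exact");
static_assert(static_cast<float>(std::numeric_limits<std::int32_t>::max()) ==
                  2147483648.0f,
              "INT32_MAX rounds to 2^31");
```

The next float above hi is 2^31 itself, which is already outside int32_t, so comparing against hi loses nothing.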

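Try 3: making it compile as C++11 too

The previous version defines local variables and uses an if statement inside a constexpr function, which is an error in C++11.

I still use C++11 from time to time, so I made a version that supports it.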

c++11

#include <limits>
#include <type_traits>

namespace can_represent_by_impl {
template <typename to_type_, typename from_type_> //
struct T {
  using to_type = to_type_;
  using from_type = from_type_;
  using from_lim = std::numeric_limits<from_type>;
  using to_lim = std::numeric_limits<to_type>;
  static_assert(from_lim::is_iec559, "from_type should be IEEE754 type");
  static_assert(from_lim::radix == 2, "radix should be 2");
  static_assert(to_lim::is_integer, "to_type should be integer");
  static_assert(to_lim::radix == 2, "radix should be 2");
  static constexpr int diff_digits() {
    return to_lim::digits - from_lim::digits;
  }
  static constexpr to_type raw_lo() { return to_lim::lowest(); }
  static constexpr to_type raw_hi() { return to_lim::max(); }
  static constexpr to_type mask() {
    return 0 < diff_digits() //
               ? ~((to_type(1) << diff_digits()) - 1)
               : ~to_type(0);
  }
  static constexpr to_type lo() { return raw_lo() & mask(); }
  static constexpr to_type hi() { return raw_hi() & mask(); }
};

} // namespace can_represent_by_impl

template <typename to_type, typename from_type>
constexpr bool //
can_represent_by(from_type v) {
  using t = can_represent_by_impl::T<to_type, from_type>;
  return t::lo() <= v && v <= t::hi() && v == static_cast<to_type>(v);
}
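As a usage sketch, the same bounds can back a conversion helper that returns the integer only when the conversion is exact. The name checked_cast and the use of std::optional (C++17) are choices made for this illustration, not part of the code above; the bounds logic is a condensed restatement of lo and hi, with the same IEEE 754 and two's complement assumptions.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: the converted value, or std::nullopt when v is not
// exactly representable as to_type.
template <typename to_type, typename from_type>
constexpr std::optional<to_type> checked_cast(from_type v) {
  using from_lim = std::numeric_limits<from_type>;
  using to_lim = std::numeric_limits<to_type>;
  static_assert(from_lim::is_iec559 && from_lim::radix == 2,
                "from_type should be a binary IEEE754 type");
  static_assert(to_lim::is_integer && to_lim::radix == 2,
                "to_type should be a binary integer type");
  constexpr int diff_digits = to_lim::digits - from_lim::digits;
  // A shift of 0 yields an all-ones mask, so no separate branch is needed.
  constexpr int shift = diff_digits > 0 ? diff_digits : 0;
  constexpr to_type mask =
      static_cast<to_type>(~((to_type(1) << shift) - 1));
  constexpr to_type lo = static_cast<to_type>(to_lim::lowest() & mask);
  constexpr to_type hi = static_cast<to_type>(to_lim::max() & mask);
  if (!(lo <= v && v <= hi)) {
    return std::nullopt;  // out of range, or NaN
  }
  const to_type n = static_cast<to_type>(v);  // in range, so well defined
  if (n != v) {
    return std::nullopt;  // v had a fractional part
  }
  return n;
}

static_assert(*checked_cast<std::int32_t>(3.0f) == 3, "exact conversion");
static_assert(!checked_cast<std::int32_t>(3.5f).has_value(), "fraction");
static_assert(!checked_cast<std::uint32_t>(4294967296.0f).has_value(),
              "2^32 does not fit in uint32_t");
```

Returning std::optional keeps the check and the cast in one place, so a caller cannot perform the cast without the check.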

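Supplement

g++ offers types such as __uint128_t, but there is no usable numeric_limits<__uint128_t>, so the code above does not work for them.
To make it work, you need to provide some other way of obtaining digits, lowest, and so on.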
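One possible way to supply those values is a small traits class that forwards to numeric_limits by default and is specialized by hand for __uint128_t. This is only a sketch: int_traits is a made-up name, __uint128_t is a GCC/Clang extension, and can_represent_by would still have to be changed to use the traits instead of numeric_limits.

```cpp
#include <limits>

// Hypothetical traits class: defaults to std::numeric_limits.
template <typename T>
struct int_traits {
  static constexpr int digits = std::numeric_limits<T>::digits;
  static constexpr T lowest() { return std::numeric_limits<T>::lowest(); }
  static constexpr T max() { return std::numeric_limits<T>::max(); }
};

// Hand-written specialization for the compiler extension type.
template <>
struct int_traits<__uint128_t> {
  static constexpr int digits = 128;
  static constexpr __uint128_t lowest() { return 0; }
  static constexpr __uint128_t max() { return ~static_cast<__uint128_t>(0); }
};

// The same hi calculation as before, for float -> __uint128_t.
// Assumption: float is IEEE 754 binary32, so diff_digits_u128 is 104.
constexpr int diff_digits_u128 =
    int_traits<__uint128_t>::digits - std::numeric_limits<float>::digits;
constexpr __uint128_t hi_u128 =
    int_traits<__uint128_t>::max() &
    ~((static_cast<__uint128_t>(1) << diff_digits_u128) - 1);

// Only the top 24 bits of hi_u128 are set.
static_assert((hi_u128 >> 104) == 0xffffff, "24 significant bits remain");
```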

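Summary

Casting a floating-point value outside the range of int to int is undefined behavior, and demons may fly out of your nose.
Comparing against numeric_limits::max() alone is not enough to avoid the undefined behavior.
If you can assume IEEE754 and two's complement, you can implement the check with numeric_limits.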
