char型とint型の関係

Posted at 2024-11-08

はじめに

文字列探索のアルゴリズムのひとつであるBoyer-Moore法(BM法)を学習しているときに、とある疑問が浮かんだ。

まずは、BM法のコードを示す。(BM法の詳細は割愛)

BM法

package org.example;

public class StringSearch {
  public static int bmSearch(String text, String pattern) {
    int textLength = text.length();
    int patternLength = pattern.length();

    int[] skipTable = new int[256];  // 1byteの文字に対応
    for (int i = 0; i < 256; i++) {
      skipTable[i] = patternLength;
    }
    for (int i = 0; i < patternLength - 1; i++) {
      skipTable[pattern.charAt(i)] = patternLength - i - 1;
    }

    int i = patternLength - 1;
    while (i < textLength) {
      int j = patternLength - 1;
      while (j >= 0 && text.charAt(i - (patternLength - 1 - j)) == pattern.charAt(j)) { 
        j--;
      }
      if (j < 0) {
        return i - (patternLength - 1);
      }
      i += skipTable[text.charAt(i)];
    }
    return -1;
  }
}

疑問

skipTable[pattern.charAt(i)]やskipTable[text.charAt(i)]の部分でインデックスの指定にint型ではなくchar型の値が指定されている。
なぜ？int型じゃないとエラーが起きるのでは？
※ 文字列.charAt(i) は、文字列のi番目の文字を返すメソッド。

疑問に思い調べてみると、char型は整数のUnicode値を持っており、Javaでは暗黙的に int型へと自動変換されるため、配列のインデックスとして利用可能とのこと。
例えば、'A'はUnicodeで65、'g'はUnicodeで103のように変換される。

char → int

package org.example;

public class TypeCast {
  public static void char2int(char c) {
    int n = c;
    System.out.printf("「%c」 はUNICODEで %d%n", c, n);
  }

  public static void main(String[] args) {
    char2int('A');  // 「A」 はUNICODEで 65
    char2int('g');  // 「g」 はUNICODEで 103
  }
}

逆にint型をchar型にキャストしてみるとどうなるだろうか？
結果としては、65は'A'に、103は'g'に変換された。

int → char

package org.example;

public class TypeCast {
  public static void int2char(int n) {
    char c = (char)n;
    System.out.printf("「%d」 のchar型は %c%n", n, c);
  }

  public static void main(String[] args) {
    int2char(65);  // 「65」 のchar型は A
    int2char(103);  // 「103」 のchar型は g
  }
}

まとめ

char型とint型を相互に変換することは可能。

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up