vpternlogd

Posted at 2014-12-06

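vpternlogd performs a three-input (ternary) logic operation.

A lookup table (LUT) with 3 inputs and 1 output is enough to implement any 3-input logic function. (An FPGA can be seen as an application of this idea.)

For example, the truth table for (A | B) & C is: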

A B C (A | B) & C
0 0 0 0
0 0 1 0
0 1 0 0
0 1 1 1
1 0 0 0
1 0 1 1
1 1 0 0
1 1 1 1

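So the function is fully described by these eight output bits.

vpternlogd (intrinsic: _mm512_ternarylogic_epi32) and its related instructions implement exactly this.

The instruction takes three vector registers and an 8-bit immediate as input.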

For each bit position, it takes one bit from each of the three vector registers to form a 3-bit index, then outputs the bit of the 8-bit immediate selected by that index.
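The per-bit lookup described above can be sketched in plain C for a single 32-bit lane. This is only a scalar model for illustration, not how the hardware is implemented; the function name ternlog32 is hypothetical. It assumes the first operand supplies the most significant index bit and the third operand the least significant one, which matches the output shown later in this article.

```c
#include <stdint.h>

/* Scalar model of one 32-bit lane of vpternlogd (illustrative sketch).
 * Assumption: a supplies index bit 2, b index bit 1, c index bit 0. */
static uint32_t ternlog32(uint32_t a, uint32_t b, uint32_t c, uint8_t imm8)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        /* Build the 3-bit index from bit i of each input. */
        unsigned idx = (((a >> i) & 1u) << 2)
                     | (((b >> i) & 1u) << 1)
                     |  ((c >> i) & 1u);
        /* Look up the output bit in the 8-bit table and place it at bit i. */
        r |= (uint32_t)((imm8 >> idx) & 1u) << i;
    }
    return r;
}
```

Calling ternlog32(0x00000fff, 0x0000faaa, 0x000000ff, 0x55) applies the table 0b01010101 to the same inputs that the test program in this article uses for element 0.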

With this, a combination of several logic operations can potentially be done in a single instruction.
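To fold an expression into one instruction, you need its 8-bit table. The sketch below builds that immediate by evaluating a 3-input function on all eight input combinations. The names logic3_fn, make_imm8, and or_and are hypothetical helpers, and the index layout (A as the most significant bit, C as the least) is the same assumption as in the truth table above.

```c
#include <stdint.h>

/* Hypothetical helper type: a 3-input logic function on single bits. */
typedef unsigned (*logic3_fn)(unsigned a, unsigned b, unsigned c);

/* Build the 8-bit immediate by evaluating f for every 3-bit index.
 * Assumption: index = (a << 2) | (b << 1) | c. */
static uint8_t make_imm8(logic3_fn f)
{
    uint8_t imm = 0;
    for (unsigned idx = 0; idx < 8; idx++) {
        unsigned a = (idx >> 2) & 1u;
        unsigned b = (idx >> 1) & 1u;
        unsigned c = idx & 1u;
        if (f(a, b, c) & 1u)
            imm |= (uint8_t)(1u << idx);
    }
    return imm;
}

/* The example function from the truth table: (A | B) & C. */
static unsigned or_and(unsigned a, unsigned b, unsigned c)
{
    return (a | b) & c;
}
```

For the truth table of (A | B) & C, the rows with output 1 are 011, 101, and 111, so make_imm8(or_and) sets bits 3, 5, and 7 of the immediate.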

#include <immintrin.h>
#include <stdio.h>

unsigned int data1[16] = {0x00000fff};
unsigned int data2[16] = {0x0000faaa};
unsigned int data3[16] = {0x000000ff};
                   
unsigned int out[16];

static inline void __attribute__((always_inline))
test(const unsigned int table)
{
    __m512i a = _mm512_loadu_si512(data1);
    __m512i b = _mm512_loadu_si512(data2);
    __m512i c = _mm512_loadu_si512(data3);
    __m512i d;

    d = _mm512_ternarylogic_epi32(a,b,c,table); // the 4th argument must be a compile-time constant, so build with optimization (e.g. -O2) so that test() gets inlined
    _mm512_storeu_si512(out, d);

    for (int i=0; i<8; i++) {
        printf("table[%d] = %d\n", i, (table>>i)&1);
    }

    for (int i=0; i<16; i++) {
        unsigned int bit1 = (data1[0]>>i)&1;
        unsigned int bit2 = (data2[0]>>i)&1;
        unsigned int bit3 = (data3[0]>>i)&1;

        unsigned int bit4 = (out[0]>>i)&1;

        printf("0b%d%d%d = %d\n", bit1, bit2, bit3, bit4);
    }

}

int
main()
{
    test(0b01010101);
    puts("--");
    test(0b00001111);
}
$ ./sde -- ./a.out
table[0] = 1
table[1] = 0
table[2] = 1
table[3] = 0
table[4] = 1
table[5] = 0
table[6] = 1
table[7] = 0
0b101 = 0
0b111 = 0
0b101 = 0
0b111 = 0
0b101 = 0
0b111 = 0
0b101 = 0
0b111 = 0
0b100 = 1
0b110 = 1
0b100 = 1
0b110 = 1
0b010 = 1
0b010 = 1
0b010 = 1
0b010 = 1
--
table[0] = 1
table[1] = 1
table[2] = 1
table[3] = 1
table[4] = 0
table[5] = 0
table[6] = 0
table[7] = 0
0b101 = 0
0b111 = 0
0b101 = 0
0b111 = 0
0b101 = 0
0b111 = 0
0b101 = 0
0b111 = 0
0b100 = 0
0b110 = 0
0b100 = 0
0b110 = 0
0b010 = 1
0b010 = 1
0b010 = 1
0b010 = 1
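One way to read the two tables used in this run: if each input is represented by its own truth-table column (0xF0 for the first operand, 0xCC for the second, 0xAA for the third, assuming the first operand is the most significant index bit), then an immediate can be computed by applying the desired expression to those constants. Under that assumption, 0b01010101 is the bitwise NOT of the third operand and 0b00001111 is the bitwise NOT of the first, which agrees with the output above. The macro and function names below are hypothetical.

```c
#include <stdint.h>

/* Truth-table columns of the three inputs (assumed layout: A is index bit 2). */
#define TL_A 0xF0u
#define TL_B 0xCCu
#define TL_C 0xAAu

/* Immediate for ~C: the table 0b01010101 used in the first run. */
static uint8_t imm_not_c(void) { return (uint8_t)~TL_C; }

/* Immediate for ~A: the table 0b00001111 used in the second run. */
static uint8_t imm_not_a(void) { return (uint8_t)~TL_A; }
```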

Starting tomorrow, @tanakmura may go through everyone's favorite shuffle instructions one at a time.
