
More than 5 years have passed since last update.

vpermt2d


vpermt2d shuffles the 32 32-bit integer elements you get by concatenating two registers.

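This one can cross not just 128-bit boundaries but 256-bit and 512-bit boundaries as well. Nice. With this, do we even need anything else? (Well, there will probably be cases where getting the values loaded into registers is the painful part.)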

#include <immintrin.h>
#include <stdio.h>

int in0[16] = {100,
               101,
               102,
               103,
               104,
               105,
               106,
               107,
               108,
               109,
               110,
               111,
               112,
               113,
               114,
               115};

int in1[16] = {116,
               117,
               118,
               119,
               120,
               121,
               122,
               123,
               124,
               125,
               126,
               127,
               128,
               129,
               130,
               131};


/* Indices 0-15 pick from in0, indices 16-31 pick from in1. */
int table[16] = { 8,   8, 25, 31,
                  10, 10,  9, 15,
                  0,   0, 16, 16,
                  24, 24, 28, 28};

int out[16];

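int
main(void)
{
    /* Unaligned loads: the global arrays are not guaranteed to be 64-byte aligned. */
    __m512i v0 = _mm512_loadu_si512(in0);
    __m512i v1 = _mm512_loadu_si512(in1);
    __m512i t = _mm512_loadu_si512(table);

    /* For each lane, bit 4 of the index selects the source (0: v0, 1: v1)
       and bits 3:0 select the element within that source. */
    __m512i v = _mm512_permutex2var_epi32(v0, t, v1);

    _mm512_storeu_si512(out, v);

    for (int i=0; i<16; i++) {
        printf("%2d:%3d\n", i, out[i]);
    }

    return 0;
}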

 $ sde -- ./a.out
 0:108
 1:108
 2:125
 3:131
 4:110
 5:110
 6:109
 7:115
 8:100
 9:100
10:116
11:116
12:124
13:124
14:128
15:128
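
To see why the table produces this output, it may help to write the selection rule as plain scalar code. The following is only a rough sketch of the documented behavior of `_mm512_permutex2var_epi32` without masking; the function name `permutex2var_epi32_ref` is made up for this illustration and is not part of any library.

```c
/* Scalar sketch of the two-source 32-bit permute (hypothetical helper,
   not a real intrinsic). dst must not overlap a, idx, or b. */
void
permutex2var_epi32_ref(int dst[16], const int a[16],
                       const int idx[16], const int b[16])
{
    for (int i = 0; i < 16; i++) {
        /* Only the low 5 bits of each index are used. */
        unsigned sel = (unsigned)idx[i] & 0x1f;

        /* Bit 4 chooses the source (0: a, 1: b),
           bits 3:0 choose the element within that source. */
        dst[i] = (sel & 0x10) ? b[sel & 0x0f] : a[sel & 0x0f];
    }
}
```

For example, index 25 is `0x19`: bit 4 is set, so the value comes from the second source, element 9, which is `in1[9]` = 125 and matches lane 2 in the output above. Calling `permutex2var_epi32_ref(out, in0, table, in1)` with the arrays from the example should give the same 16 values.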

Tomorrow, @tanakmura will write about vpshufb.
