
More than 5 years have passed since last update.

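_mm512_shuffle_epi32 shuffles the four 32-bit values inside each 128-bit lane of a 512-bit register. Following x86 tradition, elements cannot cross a 128-bit boundary.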
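The lane-local behavior can be sketched with a scalar reference model. This is only an illustration, not Intel's definition: `shuffle_epi32_ref` is a hypothetical helper name, and it assumes the usual vpshufd layout in which output element j of every lane is selected by bits [2j+1:2j] of the immediate.

```c
#include <assert.h>

/* Scalar model of a lane-local 32-bit shuffle.
 * Hypothetical helper for illustration, not part of any Intel API.
 * imm packs four 2-bit selectors; selector j sits in bits [2j+1:2j]. */
static void shuffle_epi32_ref(const int in[16], int out[16], int imm)
{
    assert(imm >= 0 && imm <= 0xFF);
    for (int lane = 0; lane < 4; lane++) {      /* four 128-bit lanes */
        for (int j = 0; j < 4; j++) {           /* four 32-bit elements per lane */
            int sel = (imm >> (2 * j)) & 3;     /* source index within the same lane */
            out[4 * lane + j] = in[4 * lane + sel];
        }
    }
}
```

Because `sel` is always added to `4 * lane`, a source element can never come from a different lane, and the same four selectors are applied to all four lanes.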


#include <immintrin.h>
#include <stdio.h>

int in0[16] = {100,
               101,
               102,
               103,
               104,
               105,
               106,
               107,
               108,
               109,
               110,
               111,
               112,
               113,
               114,
               115};


int out[16];

/* The shuffle control must be a compile-time constant, hence always_inline. */
static inline void __attribute__((always_inline))
test(const int imm)
{
    __m512i v = _mm512_loadu_si512(in0);
    v = _mm512_shuffle_epi32(v, imm);

    _mm512_storeu_si512(out, v);

    for (int i=0; i<16; i++) {
        printf("%2d:%3d\n", i, out[i]);
    }
}



int
main(void)
{
    test(_MM_SHUFFLE(3,3,3,3));
    puts("--");
    test(_MM_SHUFFLE(3,2,1,0));
    puts("--");
    test(_MM_SHUFFLE(0,0,2,2));
}
$ sde -- ./a.out 
 0:103
 1:103
 2:103
 3:103
 4:107
 5:107
 6:107
 7:107
 8:111
 9:111
10:111
11:111
12:115
13:115
14:115
15:115
--
 0:100
 1:101
 2:102
 3:103
 4:104
 5:105
 6:106
 7:107
 8:108
 9:109
10:110
11:111
12:112
13:113
14:114
15:115
--
 0:102
 1:102
 2:100
 3:100
 4:106
 5:106
 6:104
 7:104
 8:110
 9:110
10:108
11:108
12:114
13:114
14:112
15:112
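The argument order of _MM_SHUFFLE is easy to misread, because the macro lists selectors from the highest element down to the lowest. The sketch below unpacks an immediate into per-element source indices. `SHUFFLE_IMM` and `decode_shuffle_imm` are hypothetical names introduced here; `SHUFFLE_IMM` is assumed to pack its arguments the same way _MM_SHUFFLE does.

```c
#include <assert.h>

/* Hypothetical stand-in that mirrors how _MM_SHUFFLE packs its arguments:
 * the first argument selects for element 3, the last for element 0. */
#define SHUFFLE_IMM(e3, e2, e1, e0) \
    (((e3) << 6) | ((e2) << 4) | ((e1) << 2) | (e0))

/* Unpack imm so that sel[j] is the source index for output element j. */
static void decode_shuffle_imm(int imm, int sel[4])
{
    assert(imm >= 0 && imm <= 0xFF);
    for (int j = 0; j < 4; j++) {
        sel[j] = (imm >> (2 * j)) & 3;
    }
}
```

For `SHUFFLE_IMM(0,0,2,2)` the immediate is 0x0A and the decoded selectors are {2, 2, 0, 0}, which agrees with the third block of output above: 102, 102, 100, 100 in the first lane.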

Tomorrow, @tanakmura will write about vpermt2d.
