
vpshufd


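vpshufd shuffles the four 32-bit values inside each 128-bit lane of a 512-bit register. The same 8-bit immediate is applied to all four lanes. In keeping with x86 tradition, elements cannot cross a 128-bit boundary.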

#include <immintrin.h>
#include <stdio.h>

int in0[16] = {100,
               101,
               102,
               103,
               104,
               105,
               106,
               107,
               108,
               109,
               110,
               111,
               112,
               113,
               114,
               115};


int out[16];

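/* Returns nothing, so it is declared void. always_inline keeps imm a
   compile-time constant, which the intrinsic requires. */
static inline void __attribute__((always_inline))
test(const int imm)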
{
    __m512i v = _mm512_loadu_si512(in0);
    v = _mm512_shuffle_epi32(v, imm);

    _mm512_storeu_si512(out, v);

    for (int i=0; i<16; i++) {
        printf("%2d:%3d\n", i, out[i]);
    }
}



int
main(void)
{
    test(_MM_SHUFFLE(3,3,3,3));
    puts("--");
    test(_MM_SHUFFLE(3,2,1,0));
    puts("--");
    test(_MM_SHUFFLE(0,0,2,2));
}

$ sde -- ./a.out
 0:103
 1:103
 2:103
 3:103
 4:107
 5:107
 6:107
 7:107
 8:111
 9:111
10:111
11:111
12:115
13:115
14:115
15:115
--
 0:100
 1:101
 2:102
 3:103
 4:104
 5:105
 6:106
 7:107
 8:108
 9:109
10:110
11:111
12:112
13:113
14:114
15:115
--
 0:102
 1:102
 2:100
 3:100
 4:106
 5:106
 6:104
 7:104
 8:110
 9:110
10:108
11:108
12:114
13:114
14:112
15:112
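
The output above is consistent with the following scalar model of the instruction. This is only a rough sketch for reading the immediate, not the author's code: `shuffle_epi32_model` and `SHUFFLE_IMM` are hypothetical names introduced here, and `SHUFFLE_IMM` is a local stand-in that packs its arguments the same way `_MM_SHUFFLE` does, so the sketch builds without `immintrin.h` or AVX-512 hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for _MM_SHUFFLE: packs four 2-bit selectors,
   with the last argument in the lowest two bits. */
#define SHUFFLE_IMM(z, y, x, w) (((z) << 6) | ((y) << 4) | ((x) << 2) | (w))

/* Scalar model of a 512-bit vpshufd (assumed name, for illustration only).
   dst and src must not overlap. */
void shuffle_epi32_model(int32_t dst[16], const int32_t src[16], unsigned imm)
{
    assert((imm & ~0xFFu) == 0);            /* the immediate is 8 bits wide */

    for (int lane = 0; lane < 4; lane++) {  /* four 128-bit lanes */
        for (int j = 0; j < 4; j++) {       /* four 32-bit elements per lane */
            /* Bits [2j+1:2j] of imm pick the source element for slot j. */
            unsigned sel = (imm >> (2 * j)) & 3u;
            /* The source index stays inside the same lane. */
            dst[4 * lane + j] = src[4 * lane + (int)sel];
        }
    }
}
```

Calling it with the `in0` values and `SHUFFLE_IMM(0,0,2,2)` (that is, 0x0A) selects elements 2, 2, 0, 0 within every lane, which corresponds to the third block of output. Because the selector only ever indexes within `4 * lane`, no element can move to another 128-bit lane.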

Tomorrow, @tanakmura will write about vpermt2d.
