
New posts in avx

Is there an inverse instruction to the movemask instruction in Intel AVX2?

x86 intrinsics avx avx2 icc

Bitwise xor of two 256-bit integers

sse simd avx

Fastest Implementation of Exponential Function Using AVX

x86 simd avx exponential avx2

Get sum of values stored in __m256d with SSE/AVX

c++ optimization sse avx avx2

Why is GCC's AVX slower while LLVM's faster?

gcc assembly llvm julia avx

What's the fastest way to perform an arbitrary 128/256/512 bit permutation using SIMD instructions?

c++ assembly sse avx avx2

8 bit shift operation in AVX2 with shifting in zeros

c sse simd avx avx2

Disabling AVX2 in CPU for testing purposes

Does the Linux kernel have its own SSE/AVX context?

Fastest way to expand bits in a field to all (overlapping + adjacent) set bits in a mask?

c assembly x86 sse avx

What's the difference between vextracti128 and vextractf128?

x86 simd avx avx2

Horizontal minimum and maximum using SSE

c++ max sse minimum avx

Using SIMD on amd64, when is it better to use more instructions vs. loading from memory?

Half-precision floating-point arithmetic on Intel chips

Unexpectedly good performance with OpenMP parallel for loop

Aligned and unaligned memory access with AVX/AVX2 intrinsics

gcc avx avx2

Efficiently find least significant set bit in a large array?

Difference between the AVX instructions vxorpd and vpxor

vectorization intel xor simd avx

Which versions of Windows support/require which CPU multimedia extensions? (How to check if SSE or AVX are fully usable?)

windows assembly sse avx avx512

Are older SIMD-versions available when using newer ones?

c++ c sse simd avx