New posts in avx2

Is it really efficient to use Karatsuba algorithm in 64-bit x 64-bit multiplication?

What is the reason for AVX floating-point bitwise logical operations?

c++ simd avx avx2

gdb reverse debugging avx2

c gdb glibc avx2

uint32_t * uint32_t = uint64_t vector multiplication with gcc

c gcc vectorization avx2 gcc9

Getting GCC to generate a PTEST instruction when using vector extensions

c gcc vectorization sse avx2

How to do _mm256_maskstore_epi8() in C/C++?

c++ simd intrinsics avx avx2

AVX2 byte gather with uint16 indices, into a __m256i

c intrinsics avx pack avx2

Efficient (on Ryzen) way to extract the odd elements of a __m256 into a __m128?

What is the floating-point (__m256d) version of the non-temporal streaming load intrinsic (_mm256_stream_load_si256)?

c++ x86 simd intrinsics avx2

Find the first instance of a character using simd

x86 sse simd avx avx2

AVX2 instructions latency and throughput

performance x86 x86-64 simd avx2

Intel IACA analyzer alters assembly?

assembly simd avx2 iaca

How to verify that the operating system supports AVX2 instructions

AVX2 sparse matrix multiplication

_mm256_slli_si256: error "last argument must be an 8-bit immediate"

c gcc simd avx avx2

Why doesn't Intel design its SIMD ISAs in a more compatible or universal way?

intel simd avx avx2 avx512

How to store lower or higher values from AVX/AVX2(YMM) register to memory like the SSE movlps/movhps does?

x86 sse simd avx avx2

How to use this macro to test if memory is aligned?