New posts in avx

Websocket data unmasking / multi byte xor

c x86 sse simd avx

Does VS2010 SP1 support only part of the AVX instruction set?

Difference between _mm256_xor_si256() and _mm256_xor_ps()

intrinsics avx avx2

C++ AVX2 Intrinsic function Non-Standard Size

c++ simd intrinsics avx avx2

Different semantic of comparison intrinsic instructions in avx512?

c++ sse intrinsics avx avx512

Integer dot product using SSE/AVX?

c++ vectorization sse simd avx

Unpack 12-bit data quickly (where the nibbles aren't contiguous; how to shuffle nibbles?)

c# c++ avx avx2 pixelformat

Intel vector instruction to zero-extend 8 4-bit values packed in a 32-bit int to a __m256i?

sse avx avx2

How to implement 16 and 32 bit integer insert and extract operations with AVX-512?

intrinsics avx avx512

How abundant is hardware support for the FMA instruction set?

x86 hardware sse simd avx

AVX equivalent for _mm_movelh_ps

c++ sse intrinsics avx

Add saturate 32-bit signed ints intrinsics?

Mixing SSE with AVX128 for shorter instructions?

Is there a more efficient way to broadcast 4 contiguous doubles into 4 YMM registers?

gcc intel simd intrinsics avx

Best way to mask a single bit in AVX2?

c x86 simd avx avx2

Simple AVX512 dot-product loop only 10.6x faster, expected 16x

AVX2: U8 absolute difference

sse simd neon avx avx2

How can I do efficiently bitwise majority voting on 3, 5, 7, 9 inputs with SSE/SSE2/AVX/...?

assembly sse avx neon avx512

AVX three operands for sqrt?

Convention for displaying vector registers

x86 sse simd avx