New posts in sse

SSE3 intrinsics: How to find the maximum of a large array of floats

c++ sse intrinsics

Setting __m256i to the value of two __m128i values

c sse simd avx

Loading 8 chars from memory into an __m256 variable as packed single precision floats

c++ sse simd avx avx2

Shuffling by mask with Intel AVX

c++ sse simd intrinsics avx

Control flow divergence in SIMT and SIMD

cuda sse simd

Are there SIMD (SSE/AVX) instructions in the x86-compatible accelerators Intel Xeon Phi?

intel sse simd avx intel-mic

Faster lookup tables using AVX2

Does using mix of pxor and xorps affect performance?

assembly x86 sse simd

What is the minimum supported SSE flag that can be enabled on macOS?

Is casting to simd-type undefined behaviour in C++? [duplicate]

GCC - How to realign stack?

c gcc stack pthreads sse

What's the most efficient way to load and extract 32 bit integer values from a 128 bit SSE vector?

c gcc sse simd

Saturated subtraction - AVX or SSE4.2

c gcc optimization sse avx

Writing a portable SSE/AVX version of std::copysign

c++ x86-64 sse simd avx

How to convert byte array of image pixels data to grayscale using vector SSE operation

Get GCC to preserve an SSE register throughout a function that uses inline asm

SSE _mm_movemask_epi8 equivalent method for ARM NEON

arm sse neon

How to reverse an __m128 type variable?

c++ c x86 sse simd

Why is the compiler generating a push/pop instruction pair?

c assembly x86 sse

XMM Registers Total or Per Core