New posts in sse

For XMM/YMM FP operation on Intel Haswell, can FMA be used in place of ADD?

sse avx throughput flops fma

What is the difference between these 128bit SIMD xor operations

simd sse intrinsics sse2

Determine cause of segfault when using -O3?

c++ gdb sse gcc4.9

access violation _mm_store_si128 SSE Intrinsics

intel c++ x86 simd sse intrinsics

AVX scalar operations are much faster

intel c memory x86 sse avx

Most efficient way to convert vector of uint32 to vector of float?

SSE2 instruction to typecast an integer register to short register and vice-versa

x86 sse simd sse2

Is there a way to utilize all XMM registers?

Implement a near real-time CPU capability like glAlphaFunc(GL_GREATER) with RGB source and RGBA overlay

c++ opengl assembly sse rgba

Setting last or first n bits in SSE register

c++ x86 sse simd intrinsics

Translating SSE to Neon: How to pack and then extract 32bit result

c++ arm sse neon intrinsics

AVX/SSE round floats down and return vector of ints?

c++ intel sse intrinsics avx

Shuffle AVX 256 Vector elements by 1 position left/right - C intrinsics

c sse hpc intrinsics avx

Why does AES in SSE not provide full function?

glibc and SSE functionality

c performance sse

Storing individual doubles from a packed double vector using Intel AVX

x86 x86-64 sse avx

bool judgement is so slow? [closed]

c++ c optimization sse

Why movlps and movhps SSE instructions are faster than movups for transferring misaligned data?

optimization assembly sse

how invert __m128 into ints

c++ sse

AVX 256-bit code performing slightly worse than equivalent 128-bit SSSE3 code

c++ performance sse avx2