New posts in sse

SSE intrinsics - comparison if/else optimization

c++ sse intrinsics

Fastest way to compare one byte array with many others?

c algorithm assembly x86-64 sse

Fast transposition of an image and Sobel Filter optimization in C (SIMD)

c optimization sse simd

SSE: unaligned load and store that crosses page boundary

"Safe" SIMD arithmetic on aligned vectors of odd size?

Loading non contiguous values with Intel SIMD SSE

assembly x86 intel sse simd

SSE with doubles, not worth it?

Shifting SSE/AVX registers 32 bits left and right while shifting in zeros

x86 sse simd avx avx2

Efficient way of rotating a byte inside an AVX register

c sse simd avx avx2

How compilers treat SSE (or any) intrinsic functions?

SSE: reciprocal if not zero

c normalization sse

_mm_shuffle_ps() equivalent for integer vectors (__m128i)?

c sse

Have different optimizations (plain, SSE, AVX) in the same executable with C/C++

How to optimize C-code with SSE-intrinsics for packed 32x32 => 64-bit multiplies, and unpacking the halves of those results for (Galois Fields)

c optimization x86 sse simd

SSE multiplication of 2 64-bit integers

x86 sse simd multiplication sse2

What does this x86 assembly instruction do (addsd xmm0, ds:__xmm@41f00000000000000000000000000000[edx*8])?

assembly x86 sse

Profiling SIMD Code

c++ c sse simd

How can I set __m128i without using any SSE instruction?

c++ constants sse simd sse2

sqrt of uint64_t vs. int64_t

SSE2 code optimization

c++ sse simd intrinsics sse2