New posts in sse

SSE vectorization of math 'pow' function gcc

How do I declare a memory range as uncacheable using gcc on an x86 platform?

gcc assembly x86 sse

How can I add together two SSE registers

c++ c intel sse avx2

SSE2: Double precision log function

c++ c optimization sse simd

Check XMM register for all zeroes

c++ sse simd intrinsics

Vectorizing Dot Product Calculation using SSE4

c performance sse dot-product

How to efficiently combine comparisons in SSE?

c optimization assembly sse avx

Do all CPUs which support AVX2 also support SSE4.2 and AVX?

sse simd avx avx2

C++ use SSE instructions for comparing huge vectors of ints

c++ vector sse

Storing two x86 32 bit registers into 128 bit xmm register

assembly x86 simd sse

SSE optimized emulation of 64-bit integers

c++ optimization x86 64-bit sse

Bitwise cast from __m128 to __m128i on MSVC

visual-studio sse

What are the 128-bit to 512-bit registers used for?

Most efficient way to check if all __m128i components are 0 [using <= SSE4.1 intrinsics]

c++ integer sse simd intrinsics

AVX2 slower than SSE on Haswell

c++ x86 sse simd avx2

How to work with a 128-bit C variable and 128-bit xmm asm?

c sse simd

SSE micro-optimization instruction order

Initializing an __m128 type from a 64-bit unsigned int

c++ sse intrinsics

How to optimize "u[0]*v[0] + u[2]*v[2]" code line with SSE or GLSL

c++ c optimization sse glm-math

Unable to detect why the following piece of code was not vectorized

c sse vectorization icc stencils