New posts in sse

Using std::atomic with aligned classes

c++ c++11 sse

Why does gcc/clang use two 128bit xmm registers to pass a single value?

c++ c assembly clang sse

When will a program benefit from prefetch & non-temporal load/store?

c sse prefetch temporal

Am I breaking strict aliasing rules?

c++ c++11 sse strict-aliasing

8 bit shift operation in AVX2 with shifting in zeros

c sse simd avx avx2

G++ SSE memory alignment on the stack

Does the Linux kernel have its own SSE/AVX context?

Optimizing variable-length encoding

c++ c assembly sse

Does the compiler use SSE instructions for regular C code?

Fastest way to expand bits in a field to all (overlapping + adjacent) set bits in a mask?

c assembly x86 sse avx

Is an __m128i variable zero?

c++ c intel sse simd

SIMD signed with unsigned multiplication for 64-bit * 64-bit to 128-bit

Strict aliasing, -ffast-math and SSE

Compute the absolute difference between unsigned integers using SSE

c++ unsigned sse

Horizontal minimum and maximum using SSE

c++ max sse minimum avx

Using SIMD on amd64, when is it better to use more instructions vs. loading from memory?

How do you populate an x86 XMM register with 4 identical floats from another XMM register entry?

c++ c x86 inline-assembly sse

How to allocate 16byte memory aligned data

c memory sse icc

What is the fastest way to test if a double is an integer (on modern Intel x86 processors)?

c optimization assembly x86 sse

Fast counting the number of set bits in __m128i register

c sse simd sse2 hammingweight