
New posts in sse

G++ SSE memory alignment on the stack

Does the Linux kernel have its own SSE/AVX context?

Optimizing variable-length encoding

c++ c assembly sse

Does compiler use SSE instructions for a regular C code?

Fastest way to expand bits in a field to all (overlapping + adjacent) set bits in a mask?

c assembly x86 sse avx

Is an __m128i variable zero?

c++ c intel sse simd

SIMD signed with unsigned multiplication for 64-bit * 64-bit to 128-bit

Strict aliasing, -ffast-math and SSE

Compute the absolute difference between unsigned integers using SSE

c++ unsigned sse

Horizontal minimum and maximum using SSE

c++ max sse minimum avx

Using SIMD on amd64, when is it better to use more instructions vs. loading from memory?

How do you populate an x86 XMM register with 4 identical floats from another XMM register entry?

c++ c x86 inline-assembly sse

How to allocate 16-byte memory-aligned data

c memory sse icc

What is the fastest way to test if a double number is integer (in modern intel X86 processors)

c optimization assembly x86 sse

Fast counting the number of set bits in __m128i register

c sse simd sse2 hammingweight

Using SSE instructions with gcc without inline assembly

c x86-64 sse simd intrinsics

Can CUDA use SIMD extensions?

cuda gpu sse simd vectorization

Intel SSE: Why does `_mm_extract_ps` return `int` instead of `float`?

c sse simd

How to negate (change sign) of the floating point elements in a __m128 type variable?

c x86 vectorization sse simd

How to divide a 16-bit integer by 255 using SSE?