New posts in sse

AVX/SSE version of xorshift128+

c performance sse avx

SSE and C++ containers

128-bit values - From XMM registers to General Purpose

assembly x86 sse

Bitwise xor of two 256-bit integers

sse simd avx

A better 8x8 bytes matrix transpose with SSE?

c matrix optimization sse simd

Why don't GCC and Clang use cvtss2sd [memory]?

Get sum of values stored in __m256d with SSE/AVX

c++ optimization sse avx avx2

SIMD programming languages

How to load a pixel struct into an SSE register?

c pixel x86-64 sse intrinsics intel

Testing equality between two __m128i variables

c x86 sse simd

How can I check if my installed numpy is compiled with SSE/SSE2 instruction set?

python numpy sse

How to properly use prefetch instructions?

Complex Mul and Div using sse Instructions

x86 sse simd complex-numbers

Proper way to enable SSE4 on a per-function / per-block of code basis?

xcode clang llvm sse

SSE: convert short integer to float

x86 sse simd

How to get GCC to use more than two SIMD registers when using intrinsics?

gcc assembly x86 sse simd

byte array permute SSE optimization

c++ gcc x86-64 sse simd

NEON vs Intel SSE - equivalence of certain operations

c++ c sse simd neon

indexing into an array with SSE

c sse simd

What's the fastest way to perform an arbitrary 128/256/512 bit permutation using SIMD instructions?

c++ assembly sse avx avx2