New posts in avx

Memory argument of VMOVDQU partially out of allocated range

How to convert int 64 to int 32 with avx (but without avx-512)

simd sse avx

AVX2 integer comparison for smaller equal

c integer compare avx avx2

Macro for generating immediates for AVX shuffle intrinsics

c macros intel intrinsics avx

Optimising column-wise maximum with SIMD

c++ sse simd intrinsics avx

Find Absolute in AVX

Force compiler to use memory operand from Intrinsics

c memory intrinsics avx operands

AVX-512 Instruction Encoding - {er} Meaning

assembly x86 avx avx512

Improving a recursive hadamard transformation

c simd avx

No insert and extract for float/double in SSE and AVX?

c++ floating-point sse simd avx

Why won't simple code get auto-vectorized with SSE and AVX in modern compilers?

Why gcc is so much worse at std::vector<float> vectorization of a conditional multiply than clang?

Penalty for switching from SSE to AVX?

c++ sse avx sse2

Getting wrong results with using AVX instructions and -O3 compiling option

c compiler-optimization avx

Testing whether AVX register contains some equal integer numbers

c++ x86 simd avx avx2

Why is this code using VMULPD to write registers that will be overwritten by VFMADD? Isn't that useless?

assembly avx fma

Vectorization of modulo multiplication

c++ algorithm sse simd avx

How to run bitwise OR on big vectors of u64 in the most performant manner?

c++ performance assembly cpu avx

_mm256_fmadd_ps is slower than _mm256_mul_ps + _mm256_add_ps?

Why is (V)SHUFPS not in Intel's constant time instruction list?