New posts in avx

What is the most efficient way to clear a single or a few ZMM registers on Knights Landing?

Feb 22, 2022

assembly avx xeon-phi avx512 knights-landing

Packing and de-interleaving two __m256 registers

Apr 18, 2022

c++ x86 simd avx avx2

How to do an indirect load (gather-scatter) in AVX or SSE instructions?

Aug 19, 2022

c vector intel sse avx

Why both? vperm2f128 (avx) vs vperm2i128 (avx2)

Nov 15, 2022

intel simd avx avx2

Is it useful to use VZEROUPPER if your program+libraries contain no SSE instructions?

Nov 03, 2022

performance assembly x86 avx micro-optimization

Is it okay to mix legacy SSE encoded instructions and VEX encoded ones in the same code path?

Nov 08, 2014

assembly x86 sse avx intel

Is it possible to use SIMD instructions in Rust?

Feb 07, 2022

rust simd avx avx2

When using a mask register with AVX-512 load and stores, is a fault raised for invalid accesses to masked out elements?

Mar 31, 2022

x86 avx avx512

Is vxorps-zeroing on AMD Jaguar/Bulldozer/Zen faster with xmm registers than ymm?

Jan 06, 2022

assembly x86 avx micro-optimization amd-processor

What's the difference between _mm256_lddqu_si256 and _mm256_loadu_si256?

Feb 04, 2022

x86 simd intrinsics avx micro-optimization

Using AVX with GCC - avxintrin.h missing

Mar 08, 2022

c++ gcc avx

AVX/SSE version of xorshift128+

Apr 12, 2022

c performance sse avx

L1 memory bandwidth: 50% drop in efficiency using addresses which differ by 4096+64 bytes

May 02, 2022

c caching memory x86 avx

Is there an inverse instruction to the movemask instruction in Intel AVX2?

Dec 05, 2021

x86 intrinsics avx avx2 icc

Bitwise xor of two 256-bit integers

Nov 17, 2022

sse simd avx

Fastest Implementation of Exponential Function Using AVX

Sep 14, 2019

x86 simd avx exponential avx2

Get sum of values stored in __m256d with SSE/AVX

Feb 09, 2022

c++ optimization sse avx avx2

Why is GCC's AVX slower while LLVM's faster?

Mar 28, 2022

gcc assembly llvm julia avx

What's the fastest way to perform an arbitrary 128/256/512 bit permutation using SIMD instructions?

Mar 07, 2021

c++ assembly sse avx avx2

8 bit shift operation in AVX2 with shifting in zeros

Jan 20, 2018

c sse simd avx avx2
