New posts in avx2

Sparse array compression using SIMD (AVX2)

Jun 01, 2018

c++ optimization simd sparse-array avx2

perf report shows this function "__memset_avx2_unaligned_erms" has overhead. Does this mean memory is unaligned?

Oct 17, 2020

c++ profiling avx perf avx2

GCC auto-vectorization of control flow in a loop

Feb 07, 2022

c gcc avx2 auto-vectorization

Can AVX2 be used to implement faster LZCNT processing on a word array?

Oct 05, 2020

x86 simd avx micro-optimization avx2

AVX2, How to Efficiently Load Four Integers to Even Indices of a 256 Bit Register and Copy to Odd Indices?

Oct 07, 2018

x86 sse simd avx avx2

How to convert 32-bit float to 8-bit signed char? (4:1 packing of int32 to int8 __m256i)

Jan 24, 2022

c x86 simd intrinsics avx2

_mm_alignr_epi8 (PALIGNR) equivalent in AVX2

Sep 01, 2020

x86 simd intrinsics avx avx2

Loading 8 chars from memory into an __m256 variable as packed single precision floats

Jun 17, 2021

c++ sse simd avx avx2

Using a variable to index a simd vector with _mm256_extract_epi32() intrinsic

Feb 26, 2022

simd intrinsics avx avx2

Why do processors with only AVX out-perform AVX2 processors for many SIMD algorithms?

Sep 17, 2019

c# c++ simd avx avx2

Does /arch:AVX enable AVX2?

May 08, 2019

c++ visual-c++ visual-studio-2012 vectorization avx2

Best way to load/store from/to general purpose registers to/from xmm/ymm register

Nov 03, 2022

assembly x86 simd sse2 avx2

Fully utilizing pipelines on Kaby Lake

Nov 16, 2022

performance assembly x86-64 micro-optimization avx2

How to concatenate two vectors efficiently using AVX2? (a lane-crossing version of VPALIGNR)

Mar 08, 2022

c simd intrinsics avx avx2

Counting 1 bits (population count) on large data using AVX-512 or AVX-2

Mar 19, 2022

assembly avx2 avx512 bitcount population-count

Shifting SSE/AVX registers 32 bits left and right while shifting in zeros

Nov 27, 2018

x86 sse simd avx avx2

Efficient way of rotating a byte inside an AVX register

Mar 06, 2022

c sse simd avx avx2

Count leading zero bits for each element in AVX2 vector, emulate _mm256_lzcnt_epi32

Mar 17, 2022

bit-manipulation simd avx avx2 avx512

Optimal SIMD algorithm to rotate or transpose an array

Nov 26, 2020

assembly intel simd transpose avx2
