New posts in simd
How to optimise this 8-bit positional popcount using assembly?
May 16, 2022
go
assembly
x86
simd
avx
No speedup when summing uint16 vs uint64 arrays with NumPy?
Sep 05, 2022
python
numpy
performance
compiler-optimization
simd
SSE SIMD Optimization For Loop
Nov 09, 2022
visual-c++
sse
simd
OpenCL distribution
Nov 12, 2022
installation
cross-platform
distribution
opencl
simd
NEON float multiplication is slower than expected
Aug 10, 2022
c++
gcc
arm
simd
neon
Implicit SIMD (SSE/AVX) broadcasts with GCC
Dec 31, 2021
gcc
sse
simd
avx
Fast SSE threshold algorithm
May 07, 2021
performance
algorithm
optimization
sse
simd
What is the floating-point (__m256d) version of the non-temporal streaming load intrinsic (_mm256_stream_load_si256)?
Nov 18, 2022
c++
x86
simd
intrinsics
avx2
How to speed up calculation of integral image?
Sep 07, 2022
c++
image-processing
sse
simd
avx
Best way to shuffle across AVX lanes?
Dec 01, 2021
c++
x86
sse
simd
avx
GEMM kernel implemented using AVX2 is faster than AVX2/FMA on a Zen 2 CPU
May 11, 2022
assembly
matrix-multiplication
simd
avx
micro-optimization
SIMD C++ library
Sep 17, 2022
c++
gcc
simd
How to store the contents of a __m128d SIMD vector as doubles without accessing it as a union?
Jun 18, 2022
c
x86
simd
intrinsics
sse2
For an SSE vector that has all the same components, generate on the fly or precompute?
Jul 01, 2022
c++
sse
simd
avx
How to write C++ code that the compiler can efficiently compile to SSE or AVX?
Jan 17, 2020
visual-c++
sse
simd
avx
auto-vectorization
Find the first instance of a character using SIMD
Mar 13, 2020
x86
sse
simd
avx
avx2
AVX2 instructions latency and throughput
Mar 22, 2022
performance
x86
x86-64
simd
avx2
Intel IACA analyzer alters assembly?
Sep 15, 2022
assembly
simd
avx2
iaca
Bitwise-AND Slower with SIMD than Scalar
Oct 19, 2022
performance
gcc
bit-manipulation
simd
scalar
What is the fastest way to do a SIMD gather without AVX(2)?
Aug 22, 2018
x86
sse
simd
sse4