New posts in neon

ARM GCC bug? Uses chains of vldr instead of one vldmia…

gcc assembly arm neon

Sum all elements in a quadword vector in ARM assembly with NEON

math assembly arm neon

Loop takes more cycles to execute than expected in an ARM Cortex-A72 CPU

Efficient floating point comparison (Cortex-A8)

c++ c neon cortex-a8 arm7

LSB to MSB bit reversal on ARM

arm bit-manipulation neon

ARM Neon: How to convert from uint8x16_t to uint8x8x2_t?

c++ c arm vectorization neon

How can I optimize a looped 4D matrix-vector-multiplication with ARM NEON?

android c android-ndk arm neon

Compacting data in buffer from 16 bit per element to 12 bits

c arm simd neon

How to convert _mm_shuffle_ps SSE intrinsic to NEON intrinsic?

arm sse simd neon

On iOS how to quickly convert RGB24 to BGR24?

Summing 3 lanes in a NEON float32x4_t

ios arm simd neon intrinsics

Is there an advantage of specifying "-mfpu=neon-vfpv3" over "-mfpu=neon" for ARMs with separate pipelines?

gcc assembly arm neon armv7

Fastest way to test a 128 bit NEON register for a value of 0 using intrinsics?

neon

128-bit rotation using ARM Neon intrinsics

c rotation intrinsics neon

ARM and NEON can work in parallel?

SSE _mm_movemask_epi8 equivalent method for ARM NEON

arm sse neon

128bit hash comparison with SSE

Which one is better, gcc or armcc for NEON optimizations?

embedded arm simd neon cortex-a8

Detect ARM NEON availability in the preprocessor?

Arm NEON and poly8_t and poly16_t

c++ c arm neon intrinsics