New posts in micro-optimization

C++ Adding 2 arrays together quickly

Fast Euclidean division in C

Avoiding AVX-SSE (VEX) Transition Penalties

Why is using structure Vector3I instead of three ints much slower in C#?

For loop performance: counters with same value vs. different values

Are there any performance test results for usage of likely/unlikely hints?

Using bools in calculations to avoid branches

Two's complement of long integer

Why are these 8 byte-writes not optimized into a MOV?

How to force NASM to encode [1 + rax*2] as disp32 + index*2 instead of disp8 + base + index?

Most efficient popcount on `__uint128_t`?

What's the easiest way to determine if a register's value is equal to zero or not?

Difference between "or eax,eax" and "test eax,eax" [duplicate]

Which Intel microarchitecture introduced the ADC reg,0 single-uop special case?

Why are bitwise operators slower than multiplication/division/modulo?

Is thread time spent in synchronization too high?

Does calling the constructor of an empty class actually use any memory?

Faster implementation of Math.round?

Java: micro-optimizing array manipulation