 

Huge OpenGL performance difference in Linux versus macOS, same hardware [closed]

I'm building an OpenGL application. The only demanding thing it does with OpenGL is use a few (5 or more) rather large (2000x2000 and larger) textures. The rest is fairly standard modern OpenGL 3.3 (FBOs, VBOs, IBOs, VAOs, shaders, etc.). Because these textures are so large and need more than 8 bits of depth per channel, I use the GL_R11F_G11F_B10F internal format to reduce memory use (changing this to something simpler does not help, though; see the tests at the bottom).

Now, here is the thing: the exact same code runs on Windows, Linux and macOS (I'm using SDL as the abstraction layer). The performance difference between Linux and macOS on the same hardware (my late-2011 13" MacBook Pro, Intel HD Graphics 3000 @ 1280x800) with the same compiler (clang -O3 -mavx) is huge. On macOS, my frame time is about 30ms to 80ms. On Linux, however, it is a stunning 1ms to 4ms. Again, this is the same laptop, just rebooted into a different OS. Shrinking the application window to about 600x400 lowers the frame time to 13ms on macOS. So it seems that the pixel shader/rasterisation is the bottleneck (my shader is indeed pretty complex).

I must say that I used to get better frame times on macOS (around 13ms to 20ms). After discovering this, I'm getting really suspicious that Apple might be "degrading" the graphics driver for the Intel HD Graphics 3000 on purpose through system updates, to push customers to buy new products. I had been thinking about buying a new laptop, but since discovering this I feel a sudden disgust.

Now the question: what do you think might be happening here? A buggy driver? Apple intentionally making things slower? An unoptimised GLSL compiler included in the driver? Or maybe some bad practice in my application's OpenGL code? Is it common for drivers to have bad support for non-8-bit texture formats?

I just hate that the application is fantastic to use on Linux and unpleasant on macOS. The hardware is capable of better.


Some tests, as requested by @BDL:

  • Reducing the textures by a factor of 4 in each dimension (thus 16 times less memory, leaving a texture of approximately 500x500) does not influence frame time.
  • Using GL_RGB8 or GL_SRGB8 as internal format doesn't influence frame time.
  • Greatly reducing shader complexity does help: I can get it down to 8ms on average by dropping a lot of computations in the fragment shader.
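
To put the memory figures in the first test in perspective, here is a rough sketch of the arithmetic. It assumes tightly packed texels, no mipmaps and no driver-side padding, and `texture_bytes` is a hypothetical helper that is not part of the application. GL_R11F_G11F_B10F packs 11 + 11 + 10 = 32 bits into each texel.

```cpp
#include <cstdint>

// Hypothetical helper (not from the application): bytes needed for one
// 2D texture level, assuming tightly packed texels, no mipmaps and no
// driver-side padding or alignment.
constexpr std::uint64_t texture_bytes(std::uint64_t width,
                                      std::uint64_t height,
                                      std::uint64_t bits_per_texel) {
    return width * height * bits_per_texel / 8;
}

// GL_R11F_G11F_B10F stores 11 + 11 + 10 = 32 bits per texel.
constexpr std::uint64_t kPackedFloatBits = 32;

// One 2000x2000 texture versus the same texture at a quarter of the
// size in each dimension.
constexpr std::uint64_t full_size_bytes =
    texture_bytes(2000, 2000, kPackedFloatBits);
constexpr std::uint64_t quarter_size_bytes =
    texture_bytes(500, 500, kPackedFloatBits);

// Shrinking by 4 in each dimension shrinks the footprint by 4 * 4 = 16.
static_assert(full_size_bytes == 16 * quarter_size_bytes,
              "a 4x reduction per dimension is 16x less memory");
```

Under these assumptions one full-size texture is about 16 MB and five of them about 80 MB, although the driver may pad or align allocations differently.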

I'll try a GLSL shader optimizer (https://github.com/aras-p/glsl-optimizer) tomorrow. Hopefully this helps a little.

Martijn Courteaux asked Dec 06 '25 20:12



1 Answer

What method exactly are you using to measure the frame rendering times? In my experiments with the timing behaviour of various OpenGL implementations, the Mesa / Intel HD drivers had about the most difficult-to-explain timing behaviour.

The Intel HD Graphics drivers for MacOS X are a completely different codebase (zero source code overlap!), written by a whole different development team (mostly Apple folks AFAIK).

Keep in mind that OpenGL employs an asynchronous execution model and that there is no hard specification of the exact timing of the buffer swap call. On Linux, both the AMD and NVIDIA OpenGL implementations pretty much have …SwapBuffers block until V-Sync (if V-Sync is enabled). However, I found that the Mesa / Intel implementation treats …SwapBuffers as just another queued command; the real block happens only once the command queue has filled up and a call is made that can ultimately be executed only after the buffer swap (like clearing the back buffer).

To cut a long story short, I found that the only reliable way to actually measure frame render-till-presentation¹ times is to place the glClear call right after …SwapBuffers (i.e. clearing for the next frame, which is rendered in the next iteration) and to measure the time from the start of rendering until after that unusually placed glClear call.
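
The measurement described above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the code I actually used: `time_frame_ms` and its three callbacks are hypothetical names, and in an SDL 2 application the `swap` callback would presumably wrap `SDL_GL_SwapWindow`.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: times one frame from the start of rendering until
// the glClear issued right after the buffer swap has returned.
// In a real application the callbacks would wrap, for example:
//   render -> the draw calls of the current frame
//   swap   -> the platform's buffer swap (SDL_GL_SwapWindow in SDL 2)
//   clear  -> glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)
double time_frame_ms(const std::function<void()>& render,
                     const std::function<void()>& swap,
                     const std::function<void()>& clear) {
    assert(render && swap && clear);  // all three callbacks must be set
    using clock = std::chrono::steady_clock;

    const auto start = clock::now();
    render();  // queue this frame's drawing commands
    swap();    // may return immediately if the driver only queues the swap
    clear();   // clears for the NEXT frame; this can only execute after the
               // swap, so a driver that queues the swap blocks here instead
    const auto end = clock::now();

    // Wall-clock time in milliseconds, including presentation.
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Since the clear for the next frame has already been issued here, the render callback should not clear again at the top of the frame.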

Pure rendering times (without the presentation part) are better measured through query objects, anyway.

datenwolf answered Dec 08 '25 15:12



