3D explorer and WebWorkers

Update December 1st: the workers are now live and… seem to be working quite well! :-)

I have had some recent feedback from several people about something that had been nagging me for a little while: during a star's approach, the browser/graphics stutter. This is noticeable on almost any machine and, while it never lasts for very long (between 50 and 150 milliseconds depending on the hardware), it is obvious enough to degrade the immersion/fluidity/whatever. It does not feel nice.

That is why I am currently working on another series of improvements to the 3D explorer. As a reminder, this 3D tool is only an experiment with various aspects of what will hopefully become part of a browser-based, server-authoritative game:

  • It should provide a sense of scale
  • It should offer simple, yet acceptable graphics (i.e., you won't need dual RTX 4090s to run it)
  • It will be reasonably slow-paced, with hopefully intense sequences of pure adrenaline and fear of loss

I am afraid that these improvements are not linked to pure eye candy: I have not made enough progress on procedural textures and the like yet… However, they are an attempt at laying the groundwork for a more easily extensible framework/game.

WebWorkers on Adlumens

The reason for the stuttering mentioned above was the succession of the following events in a (very) short period of time:

  1. The client would receive some data from the server
  2. It would unpack the message(s) into JavaScript data structures (arrays, mostly)
  3. It would then insert a fair amount of meshes into the scene and start rendering them. These meshes include: the planets and their satellites, each with a three-level visual LOD; the orbits; and the asteroid belts and planetary rings
  4. It would then populate the DOM with various elements. These are extremely dynamic: they follow the bounding sphere projections of most meshes in the scene (within the frustum), so they are updated very frequently (and DOM access is somewhat expensive when done every frame)

All of this is still happening, and the up-front load was, frankly, a little bit overwhelming … for the main thread. Well, main as in the sole thread.

The solution

I can say with a reasonable amount of confidence that WebWorkers are the solution, as local development seems to confirm it. I am about 80% of the way there as of writing this.
The stuttering has disappeared on my personal machine and on the others I tested. Hopefully the same will happen for you! :-)

Here is a very high level architecture of how the work previously handled in one thread is now spread across several workers:

Worker: Network
Roles: Opens the websocket connection to the server, receives and sends messages, and also packs and unpacks them. In the vast majority of cases, it forwards messages to the main thread via postMessage.
Load: For short messages, the time taken is negligible. However, larger payloads (mostly incoming from the server, such as the initial LOD_SYSTEM data) can reach 500KB (yes…), and then the time taken is no longer negligible. I have seen spikes of around 50ms, although the largest payloads are usually unpacked in around 15-20 ms. Either way, offloading that work always saves frame CPU time on the main thread…

Worker: Render
Roles: Does the heavy lifting: mesh management, rendering, etc., all in an offscreen canvas, but always under the control of the main thread (itself at least partly under the control of the server).
Load: Heavy, although I am constantly trying to tighten loops, flatten lists and generally deal with smaller payloads and/or lighter geometries.

Worker: Main
Roles: Spawns the two workers and acts as an orchestrator between them:
  • Forwards messages when and if needed, potentially enriching them before passing them on
  • Acts as the origin and recipient of client-side-only messages (mostly between it and the render thread, but this also happens with the network thread)
Of course, it also handles all the DOM interactions, event listening, etc.
Load: Reasonable, as a lot is no longer done here but delegated to the workers.
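
To make the orchestrator role of the main thread a bit more concrete, here is a minimal sketch of how such routing could look. It is not the actual Adlumens code: the Envelope shape, the receivedAt enrichment field, the SERVER_ prefix and the routing rule itself are all assumptions made for illustration (only LOD_SYSTEM is a name taken from this post). PostTarget is a small stand-in for a Worker so that the sketch runs outside a browser.

```typescript
// Hypothetical message envelope; only LOD_SYSTEM is a name taken from the post.
type Origin = "network" | "render" | "main";

interface Envelope {
  type: string;
  origin: Origin;
  payload: unknown;
  receivedAt?: number; // assumed example of "enrichment" done by the main thread
}

// Minimal structural stand-in for a Worker (only postMessage is needed here).
interface PostTarget {
  postMessage(message: Envelope): void;
}

// Assumed routing rule, for illustration only:
//  - anything coming from the network worker is enriched and sent to the render worker
//  - render-worker messages whose type starts with "SERVER_" go to the network worker
//  - everything else is a client-side-only message and stays on the main thread
function routeFromMain(
  msg: Envelope,
  network: PostTarget,
  render: PostTarget,
  now: () => number = Date.now,
): "render" | "network" | "main" {
  if (msg.origin === "network") {
    render.postMessage({ ...msg, receivedAt: now() });
    return "render";
  }
  if (msg.origin === "render" && msg.type.startsWith("SERVER_")) {
    network.postMessage(msg);
    return "network";
  }
  return "main";
}
```

In a browser, the onmessage handlers of the two workers would call something like routeFromMain with the real Worker objects; keeping the routing decision in a plain function makes it easy to test without any worker at all.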

But there is a trick: BabylonJS also runs on the main thread. I, too, initially thought that all was going to seamlessly move from the sole thread world to that of workers.
But it did not work like that. As the relevant BJS documentation puts it:

The main caveat of using Babylon.js with an offscreen canvas in a worker thread is that you will need to do special work to communicate with the engine from the main thread. You will have to use the same messaging API we used before to communicate between threads.

Furthermore, Babylon.js will not be able to handle inputs for you and so APIs like camera.attachControls() will not work and you will have to message inputs to workers.

When I read this I thought "Well f*** me twice and upside down, how am I going to do this?" So I searched around a little bit and actually found some pretty cool tricks and quite awesome implementations such as this (read the whole thread (haha), it is worth it).

But I didn't feel like implementing something that felt way too hackish (even for me) and so prone to breaking at the slightest change in API support or other types of evolution.

A little trick

Instead of messaging inputs to the workers, I chose to let them be handled locally in the main thread, and to pass only the outcome to the render worker.

I hear you saying "what is the point of having a worker if you are going to run BJS in the main thread anyway?" Yes, it sounds inefficient and borderline stupid, but check this:

Step: Initialise an empty engine, canvas & scene, parallel to the main offscreen canvas
Comments: The engine is set up with all the options that make it as light as possible. I think. 8-)
The "proxy" canvas has the same dimensions as the offscreen canvas; this is ensured via message passing between the main thread and the render thread.

Step: Initialise a FlyCamera
Comments: … on the main thread and linked to the "proxy" canvas. In this way, the camera's controls, etc., are fully leveraged. This is actually the only BJS element/entity running on the scene/canvas hosted by the main thread.
Another advantage is that all custom key mappings can be managed and applied from here. The rendering thread does not have to know whether you use W or T for forward movement.

Step: Synchronise
Comments: Every once in a while, the main thread communicates the following to the render thread:
  • local FlyCamera position
  • local FlyCamera rotation
And the render thread automatically applies the relevant changes. Given that the cameras in both threads have the same FOV, the rendering is in sync.
The main thread's camera position is also sent periodically to the server via the network thread, which replies accordingly.
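
The synchronisation step can be sketched roughly as below. Again, this is an illustration under assumptions, not the real implementation: the CAMERA_SYNC message name, the throttling interval and the "only send when something changed" rule are mine. CameraLike only models the position and rotation vectors (x, y, z) that the post says are sent across, so the sketch does not depend on Babylon.js itself.

```typescript
// Plain numbers only, so the state can cross the worker boundary via postMessage.
interface CameraState {
  position: [number, number, number];
  rotation: [number, number, number];
}

interface CameraSyncMessage extends CameraState {
  type: "CAMERA_SYNC"; // hypothetical message name
}

// Structural stand-ins for the camera parts used here.
interface Vec3Like { x: number; y: number; z: number; }
interface CameraLike { position: Vec3Like; rotation: Vec3Like; }

function readState(camera: CameraLike): CameraState {
  return {
    position: [camera.position.x, camera.position.y, camera.position.z],
    rotation: [camera.rotation.x, camera.rotation.y, camera.rotation.z],
  };
}

function sameState(a: CameraState, b: CameraState): boolean {
  return a.position.every((v, i) => v === b.position[i]) &&
    a.rotation.every((v, i) => v === b.rotation[i]);
}

// Main thread side: returns a message only if at least intervalMs has passed
// since the last one AND the camera actually moved; otherwise returns null.
function createCameraSync(
  intervalMs: number,
): (camera: CameraLike, nowMs: number) => CameraSyncMessage | null {
  let lastSent = -Infinity;
  let lastState: CameraState | null = null;
  return (camera, nowMs) => {
    if (nowMs - lastSent < intervalMs) return null;
    const state = readState(camera);
    if (lastState !== null && sameState(lastState, state)) return null;
    lastSent = nowMs;
    lastState = state;
    return { type: "CAMERA_SYNC", ...state };
  };
}

// Render worker side: copy the received values onto the worker's own camera.
function applyCameraSync(camera: CameraLike, msg: CameraSyncMessage): void {
  camera.position.x = msg.position[0];
  camera.position.y = msg.position[1];
  camera.position.z = msg.position[2];
  camera.rotation.x = msg.rotation[0];
  camera.rotation.y = msg.rotation[1];
  camera.rotation.z = msg.rotation[2];
}
```

On the main thread, the function returned by createCameraSync would be called once per frame with the proxy FlyCamera, and any non-null result posted to the render worker, which then calls applyCameraSync on its own camera.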

The WebGL context in the main thread is extremely light: it consumes very little memory and CPU and I do not think the GPU is used by it at all as it does not render any mesh; and if it does, it must be in an extremely minor way.

Why is my fan making so much noise?

… I asked myself after offloading the rendering to the dedicated worker.

The reason was that when a page is not "visible", the browser automatically pauses main-thread activity triggered by requestAnimationFrame and the like. But, to my knowledge at least, it does not do so for code being executed in a worker.

The solution I found was to listen to the visibilitychange event on the main thread and notify the render worker, whose engine render loop is gated by a global flag that is very quick and easy to toggle between true and false :-)
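
For the curious, here is a minimal sketch of that idea. The VISIBILITY message name and the function names are hypothetical, and the small interfaces are stand-ins for document and Worker so that the sketch runs outside a browser; the visibilitychange event and document.visibilityState are the standard browser APIs.

```typescript
// Structural stand-ins for the browser objects involved.
interface VisibilitySource {
  visibilityState: string; // "visible" or "hidden" on a real document
  addEventListener(type: "visibilitychange", listener: () => void): void;
}

interface VisibilityMessage {
  type: "VISIBILITY"; // hypothetical message name
  visible: boolean;
}

interface WorkerLike {
  postMessage(message: VisibilityMessage): void;
}

// Main thread: forward every visibility change to the render worker.
function forwardVisibility(doc: VisibilitySource, renderWorker: WorkerLike): void {
  doc.addEventListener("visibilitychange", () => {
    renderWorker.postMessage({
      type: "VISIBILITY",
      visible: doc.visibilityState === "visible",
    });
  });
}

// Render worker: the global flag that gates the render loop.
let renderingEnabled = true;

// Render worker: called from its message handler when a VISIBILITY message arrives.
function onVisibilityMessage(msg: VisibilityMessage): void {
  renderingEnabled = msg.visible;
}

// Render worker: body of the render loop. With Babylon.js this would be the
// callback given to engine.runRenderLoop, with render calling scene.render().
function renderFrame(render: () => void): void {
  if (renderingEnabled) render();
}
```

The loop itself keeps running in the worker; it simply does nothing while the flag is false, which is what makes toggling it so cheap.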

I will release this new version in the next few days, and hope everyone's experience will be all the better for it!
