[GSoC] WebAssembly for collisions idea

mo7sen · March 9, 2020, 12:46am

 -------------------------------------------
|  # of objects  |     1  |     2  |     0  |
|----------------|--------|--------|--------|
|  # times called|  1000  |   500  |  100000|
|----------------|--------|--------|--------|
|  Time          | 1.4ms  | 1.04ms |  72ms  |
 -------------------------------------------

So it seems that simply calling the function with an empty array takes about 800ns.

4ian · March 10, 2020, 10:38pm

Alright
We could go ahead with measurements by taking a GDevelop exported game and try to include this “fake library”, calling it at each frame. Would be great to see the impact on the game and leans toward a more realistic implementation…

What do you think?

So it seems that simply calling the function with an empty array takes about 800ns .

We really need measurements that can be shown with a precision that is more than a milliseconds if possible. There is no way for the measurements to be exactly 3ms

This still means that calling the wasm function 1000 times will take roughly 3ms, so 3/16 = 19% of the time we have to render a frame… can be a lot.

That’s where we have to be smart: in my implementation so far, I tried to minize the calls to the update method and the functions to check the collisions.
This being said, if somehow someone is checking the collision between 1 object and another, and this for 1000 objects, there will be 1000 calls.

Hence another question: should we have an implementation in wasm and another in JavaScript (basically the one I wrote), and change which one we use according to the number of objects or number of calls? Ideally, we would NOT do this because it’s using the double of memory - just throwing this idea to see if this can generate other ideas.

mo7sen · March 11, 2020, 12:46pm

Most of the time is wasted because of deserializing the data into a structure (GDObjectPosition). If we can ensure that all data passed are of numeric types, the deserializing should not be needed and will lead to a speedup of at least 10x-15x when 1 object is passed. There are probably a few more ways that we can use to get rid of the deserialization’s overhead.

I also found a couple of RTree crates for rust. They seem to be well built and they have bulk_load-ing. There are still no guarantees on the time it would take for the "load’ operation and it will almost always be slower than what it currently is. Nonetheless, if the deserializing overhead is taken care of, it should be quite performant.

4ian · March 12, 2020, 1:22pm

Interesting! Hopefully we could limit all data to be numeric types.
See GDevelop/objectpositionsmanager.js at 44e10cc9cb15510a96eea956fa3c6c9a7f13eb0b · 4ian/GDevelop · GitHub

What I think is needed (at least in my PR) is:

the object id,
the object name id, (identifier corresponding to the object name)
x, y position
center x and y position
the hitboxes, which is a list of polygons. A polygon is described by a list of vertices (x,y)
the AABB of the object, which is 4 numbers (min X, min Y, max X, max Y).

In theory, that’s all you need to manage the collisions in a wasm (or JS) library. It should be all numeric data. The most complex thing is the hitboxes, which is a list of polygon, and a polygon a list of vertices, which are two numbers.

The rest is not dynamic data, just numbers… so we could imagine passing these structured numbers directly to Wasm, so that the serialization overhead will be minimal?

All the operations are:

Load object positions (a list of what I described)
Remove object position (a list of object ids)
Update object positions (possibly can be a remove and then load, that’s how it’s done here)
Searching in the RTree, given an AABB search area (4 numbers) and a list of object ids (list of numbers): GDevelop/objectpositionsmanager.js at 44e10cc9cb15510a96eea956fa3c6c9a7f13eb0b · 4ian/GDevelop · GitHub - this should return a list of ids.

Basically, I described how we rewrite gdjs.RTreeObjectPositionsContainer to wasm: GDevelop/objectpositionsmanager.js at 44e10cc9cb15510a96eea956fa3c6c9a7f13eb0b · 4ian/GDevelop · GitHub

Note that this might still do a lot of calls to wasm. In which case we might want to put more things in Wasm (more methods of gdjs.ObjectPositionsManager).

To continue, we could look at making a wasm version of gdjs.ListObjectPositionsContainer or gdjs.RTreeObjectPositionsContainer - the first is a container of objects positions coded in JS but without any spatial data structure (like in old versions of GD).

You can replace which container is used here: GDevelop/objectpositionsmanager.js at 44e10cc9cb15510a96eea956fa3c6c9a7f13eb0b · 4ian/GDevelop · GitHub

So you could easily swap your Rust made container

mo7sen · March 23, 2020, 2:51pm

I just updated some of the benchmarks to get more precision with measurements.

Also, I created a function for adding a single object to the position manager that is a lot faster as it doesn’t have to deal with a list.

 ------------------------------------------------
|  # times called|  1,000  |   10,000  |  100,000|
|----------------|---------|-----------|---------|
|  Time          |  350us  |   4.53ms  |  49ms   |
 ------------------------------------------------

Having zero deserialization needed when passing a single object beats whatever little deserialization that happens when passing a list of objects. We can use this function internally when the amount of objects in a list is less than those needed for the bulk_loading method to have a better performance.

This can still be thrown away if I manage to think of a better method.

4ian · March 24, 2020, 8:23am

Nice! It would be interesting to compare multiple calls with a single object VS single call with list of objects.
As you said we want to reduce the serialization/call overhead cost to almost nothing as possible

arthuro555 · November 9, 2020, 8:57am

This topic seems dead It’s a shame, collisions with web assembly would be really cool. I found this if it can be useful in any way: https://ncollide.org/, though I doubt it since the major problem seems to be converting data from js to rust data structures and back.

If I can in any way try to help with that I would be happy to do so.

4ian · November 9, 2020, 5:34pm

The real difficulty is integrating the current game engine with the potential wasm library - there is a careful balance to be found between not giving enough information to the wasm library (that would result in having to transfer too much data at every call, making the overhead too big) and giving it too much (that would result in having to rewrite most of the game engine in this library).

I’ve made this fairly large Pull Request a while ago: Experimental object collision and positions handling improvements by 4ian · Pull Request #1393 · 4ian/GDevelop · GitHub

It introduce a gdjs.ObjectPositionsManager, which is a class responsible for holding the position of objects, their hitboxes, and giving operations like compute collisions, compute distances, etc…
Once we have this abstraction we could:

improve the efficiency of the “naive” collision/raycast/etc by storing object position in a spatial partioning structure (much like what is done in the platformer engine and in some other extensions), which is in the case of this PR a RTree. This would still be done in JavaScript as a first step.
Then, we could remove this JS implementation of a gdjs.ObjectPositionsManager and instead implement it in Rust/C++/whatever compiled to wasm. This would be possible because gdjs.ObjectPositionsManager was designed with an API that only use simple datastructures and deal with objects using their ids (so it means that for example if you want to test the collision between objects, you just have to pass their ids, which are numbers. You don’t have to serialize the full object).

Unfortunately I stopped because the performance improved in a lot of cases, but performance was also worse in some other cases for example the Asteroids example with 1500 asteroids.
This is because in some situations, it becomes more time consuming to update the gdjs.ObjectPositionsManager every time an object moves, rather than doing what we do now (which is to not done anything, and let the collision conditions iterate on all objects).

Of course it’s something that we can expect: there will be always a situation where the performance, even with a smart spatial partitioning, will be worse than a naive approach. So I also added an implementation of gdjs.ObjectPositionsManager that would do nothing (i.e: it would just register object position, and the collision conditions would just iterate on all objects).
But even with this, some examples were still having a decreased performance, because of the “bookeeping” work of calling a function like this._runtimeScene.getObjectPositionsManager().markObjectAsDirty(this); every time an object is moved.

If you want to take a look and help, you can fetch the branch (expect some merge conflicts to happen and to be solved). You’ll see that gdjs.ListObjectPositionsContainer is the “naive” class doing no special bookeeping of objects. I’d like this class to be as performant as now in the case of lots of objects being moved (like the Asteroids example with 1500 asteroids). If we can find a way to do so, we would then be able to offer an option to activate spatial partitioning (gdjs.RTreeObjectPositionsContainer: it’s a RTree)
and then even replace this last class by a wasm powered library, which would offer predictable performance

(for example, funnily enough, when I was doing benchmarks between gdjs.RTreeObjectPositionsContainer and gdjs.ListObjectPositionsContainer, it was important to reload the preview multiple time, to give the time to the JS engine to Just-In-Time compile the “hot” functions).

4ian · November 9, 2020, 5:44pm

There are also smaller improvements (TypeScript annotations) that could be ported for the PR in smaller PRs to be merged safely in master.
Overall it would be an interesting project to dig this out, confirm the performance in various examples (i made quite a few benchmark games, and you can see the list in my PR) and see if we can find a solution so that performance do not decrease too much in the worst cases.

Performance is largely better when testing collision between a lot of objects against a lot of objects (example 1000 objects against 1000 others). This is because the spatial partitioning will help a lot to decrease the number of tests and iterations.
We could also think of another approach that could use wasm, as long as the overhead of calling wasm functions is not killing the benefits.

arthuro555 · November 9, 2020, 10:02pm

Tbh I am not even able to correctly benchmark a game so this might be above my skill level Is there a way in GDevelop to run benchmarks (I think I saw a benchmarks folder in the GDJS tests) or should the JavaScript profiler be used?

4ian · November 18, 2020, 8:42pm

Change this._profiler = null in runtimescene.js to this._profiler = new gdjs.Profiler() and uncomment // this._renderProfileText(); //Uncomment to display profiling times in runtimescene-pixi-renderer.js and you should have the frame times shown (average on 600 frames)

Should be documented somewhere.