Try three: Python + Rust
A new language
Let's start with a few fundamental statements:
- I am not a Rust expert, just a guy who wanted to try it out
- The experiment has made enormous progress thanks to Rust
- At the time of this writing, I am very unlikely to switch from Rust to anything else
Let's delve somewhat into the details of the above, and more.
Why Rust?
Naturally, having spent some time on the C version, I did ponder a while about C++. In a nutshell, I needed something that:
- Was compiled and as a consequence, (usually **cough**) faster than Python
- Was at least partially compatible with my "desire" for object-oriented design
- Was well-documented
- Enjoyed an active & friendly community and as a consequence, benefited from a rich ecosystem (modules/packages and all that)
It wasn't long before I read good things about Rust. I remember spending a very long train journey reading the official documentation and thinking: "Ok, this is good - and if all their so-called crates are documented with such quality, this is going to be a joy to use".
Now even though the Rust base documentation is absolutely great, especially when starting out, I found that many crates' documentation is lacking in examples, explanations, etc. Maybe it's just me, though :-)
Anyway, Rust it was.
The porting and Rust expertise
I am a Rust "noobie". No illusion about it:
- 90% of the code is `struct`, `impl` and `fn`
- traits are hardly used because objects are quite specialised (maybe that's the wrong reason!)
And the main culprit: I see Rust as a means to an end, a (very) useful tool to make the experiment 1) do what I want 2) at an acceptable speed. Now if the Rust implementation leads me to become better at the language, great (and I'm sure it will). But unless I am confronted with new technical problems and have to learn and implement new things, I pretty much stick to "cookie cutter solutions".
Structure
To summarise: the experiment's Rust code is very basic: crates, sub crates, structs and functions. That's it. Nothing fancy, as dictated by the objects' very linear generation and lifetimes.
Something nice?
A bit of context: for obvious reasons, the experiment tries to be as realistic as performance allows when it simulates stars, planets, moons, etc. One of the consequences is that the "most granular" (at this time) entities in the code are quite low-level:
- Elements: from Hydrogen to Oxygen to Cerium and Dysprosium.
- Compounds: from Oxygen and Methane to Olivine and Argentite
Both these entity types manifest as structs (naturally)… and there are quite a lot of them:
- Elements: 98
- Compounds: 308 and counting
The natural consequence of this is huge files; for example the compound.rs file contains around 13,000 lines. And I hate long files (who doesn't?): they are hard to read, a pain to maintain, error-prone and generally frustrating to update. But in my case, they were necessary.
So I became a heretic and did this: generated Rust code using Python.
I know, but this was the situation and the requirements to speed up/ease the work:
- Hundreds of entities to manage
- Each with fields to be filled in (duh) but sometimes also to be updated
- Ability to keep track of which entities were done, which were still pending updates/additions, etc
- Ability to search, filter, etc.
- Ability to track usage statistics: which Elements are used, how often, etc., across the Compound population
- and more
All the above is not a joy to do in the context of long files. Really.
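As a rough illustration of the usage-statistics requirement, here is a minimal sketch in plain Python. The `COMPOUNDS` mapping and both helper functions are hypothetical stand-ins, not the experiment's actual data model; the real data lives in a database and would be queried there.

```python
from collections import Counter

# Illustrative data only: compound name -> {element symbol: atom count}.
COMPOUNDS = {
    "Methane": {"C": 1, "H": 4},
    "Oxygen": {"O": 2},
    "Water": {"H": 2, "O": 1},
    "Argentite": {"Ag": 2, "S": 1},
}


def element_usage(compounds):
    """Count in how many compounds each element appears."""
    usage = Counter()
    for composition in compounds.values():
        usage.update(composition.keys())
    return usage


def unused_elements(known_elements, compounds):
    """List known elements that no compound references yet."""
    used = element_usage(compounds)
    return sorted(e for e in known_elements if e not in used)


if __name__ == "__main__":
    for symbol, count in element_usage(COMPOUNDS).most_common():
        print(f"{symbol}: used in {count} compound(s)")
    known = {"H", "C", "O", "S", "Ag", "Ce", "Dy"}
    print("unused:", unused_elements(known, COMPOUNDS))
```

Answering this kind of question against one 13,000-line source file means parsing it first; against structured records it is a few lines.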
How it works
Quite well technically, but also philosophically: the compound/element Rust entities are pretty much a lazy database (`once_cell::sync::Lazy`) used at runtime to determine the aggregated nature/state of their parent "containers". It turns out this aligns perfectly with storing the raw data in an actual database, in this case good old MySQL.
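To make the idea concrete, here is a minimal sketch of what a Python-to-Rust generator could look like. Everything in it is an assumption made for illustration: the `Compound` struct layout, the `CompoundRow` fields and the helper names are hypothetical, and the real generator reads its rows from the database rather than from a hard-coded list. The emitted Rust expects the `once_cell` crate to be available.

```python
from dataclasses import dataclass

# Hypothetical struct layout; the experiment's real struct has other fields.
HEADER = """// Generated file: edit the source data, not this file.
use once_cell::sync::Lazy;

pub struct Compound {
    pub name: &'static str,
    pub formula: &'static str,
    pub elements: Vec<(&'static str, u32)>,
}
"""


@dataclass(frozen=True)
class CompoundRow:
    """One record as it might come out of the database (assumed shape)."""
    name: str
    formula: str
    elements: tuple  # pairs of (element symbol, atom count)


def rust_str(text):
    """Quote text as a Rust string literal (escapes backslash and quote only)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def rust_ident(name):
    """Derive an upper-case static name; assumes it does not start with a digit."""
    return "".join(c if c.isalnum() else "_" for c in name).upper()


def render_compound(row):
    """Render one row as a lazily initialised Rust static."""
    pairs = ", ".join(f"({rust_str(sym)}, {count})" for sym, count in row.elements)
    return (
        f"pub static {rust_ident(row.name)}: Lazy<Compound> = Lazy::new(|| Compound {{\n"
        f"    name: {rust_str(row.name)},\n"
        f"    formula: {rust_str(row.formula)},\n"
        f"    elements: vec![{pairs}],\n"
        f"}});\n"
    )


def render_module(rows):
    """Render the whole file, sorted by name so regenerated diffs stay stable."""
    body = "\n".join(render_compound(r) for r in sorted(rows, key=lambda r: r.name))
    return HEADER + "\n" + body


ROWS = [
    CompoundRow("Methane", "CH4", (("C", 1), ("H", 4))),
    CompoundRow("Argentite", "Ag2S", (("Ag", 2), ("S", 1))),
]

if __name__ == "__main__":
    print(render_module(ROWS))
```

Sorting the rows before rendering is a small design choice: it keeps the generated file deterministic, so a re-generation only changes the lines whose data actually changed.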
Now, working on the Django admin's UI has its drawbacks. But it is good enough, and extensible enough, to handle my needs. In fact, working as described below has increased QoL by an order of magnitude:
1. Update, create and delete as needed in the Django admin
2. Perform regular `dumpdata` backups
3. On demand, re-generate the Rust files from the data
4. Compile Rust
The "compile Rust" step is wrapped inside a management command able to handle compilation on both Windows and Linux; this management command also uninstalls and re-installs the resulting wheel.
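A build, uninstall and re-install cycle of this kind could be sketched roughly as follows. This is not the actual management command: the use of `maturin` to produce the wheel is purely an assumption (the post only says a wheel is built), and the function names and the package name passed in are hypothetical. Passing commands as argument lists and using `sys.executable` avoids most shell differences between Windows and Linux.

```python
import subprocess
import sys
from pathlib import Path


def plan_rebuild(package, wheel_dir):
    """Return the build and uninstall commands as argument lists."""
    return [
        # Assumption: the wheel is produced by maturin.
        ["maturin", "build", "--release", "--out", str(wheel_dir)],
        [sys.executable, "-m", "pip", "uninstall", "--yes", package],
    ]


def newest_wheel(wheel_dir):
    """Pick the most recently modified wheel in the output directory."""
    wheels = sorted(Path(wheel_dir).glob("*.whl"), key=lambda p: p.stat().st_mtime)
    if not wheels:
        raise FileNotFoundError(f"no wheel found in {wheel_dir}")
    return wheels[-1]


def rebuild(package, wheel_dir, run=subprocess.run):
    """Build, uninstall the old package, then install the fresh wheel."""
    for command in plan_rebuild(package, wheel_dir):
        run(command, check=True)  # check=True stops at the first failing step
    wheel = newest_wheel(wheel_dir)
    run([sys.executable, "-m", "pip", "install", str(wheel)], check=True)
```

Inside a Django management command, `rebuild(...)` would simply be called from `handle()`; the `run` parameter exists so the sequence can be exercised without actually compiling anything.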
A typical view in the Django admin makes it extremely easy to know what is done and what remains to be done, and to sort/filter by pretty much anything you like, given a few reasonably cleverly annotated querysets (Case, When and the like).
This worked so well for me that I applied it to several other entity types. Which means step 3 above is actually composed of several sub-steps of Rust file generation.
On multithreading
None is implemented at the moment. I will probably have to come round to doing it but wait & see.
Edit: some multithreading is gradually being integrated.
Versus pure Python
Remember the table from this post? Well, thanks to Rust, the numbers are now similar to the following:
| Cube density | Number of cubes | Number of stars | Total time | ~Time per star |
|---|---|---|---|---|
| 10 | 100 | 1000 | 5ms | ~0.005ms |
| 20 | 100 | 2000 | 10ms | ~0.005ms |
| 10 | 1000 | 10000 | 50ms | ~0.005ms |
…which represents a speedup of around 20x, even after the math was made massively more complex and more flexible.
Now this may feel disappointing when compared with the speedup obtained thanks to C; however, the C code at the time was doing about 15% of the work done by the Rust code. Hence the final performance gains are not really comparable. The same holds true against the initial pure Python version.
Functionally, the Rust version is now far ahead of anything that came before it. From now on, only this version will be covered, with potentially a few nostalgia(?) driven exceptions here and there :-)
In closing
And that concludes the "which language" series! As mentioned in the about page, further posts will come at a chaotic pace. As may be guessed by now, the data explorer is still in a constant state of change, both from a functional perspective and from a data generation one. In any case, yours truly hopes you will enjoy at least a little walk across the proposed procedural galaxies!

