Optimizing a Factorio Clone - Memory Bound Optimizations

BloodStainedCrow | Tim Aschhoff

I made a Factorio Clone, and optimized it to be able to simulate 60 times bigger factories, than Factorio can simulate. Note, that this article is as yet unfinished. In this article I will explore how I got there, soon-ish.

Introduction

NOTE: This article is not yet finished.

Factorio is a game about building and designing factories. These factories are simulated by the game in real time.

Among Factorio players, there is a subset obsessed with Megabasing, a playstyle focused around building the largest factories possible, which Factorio can still process in real time.

While I was building my megabase in Dyson Sphere Program and Factorio respectively, I ran into performance issues. I simply could not expand my base any further, while still running at 60 UPS. While I definitively could have improved my factories, in my hubris I declared: "How hard can it be, to built something which runs faster". And thus, here we are, two years later.

I have based the mechanics on Factorio v1.1 (with the simpler 2.0 fluid mechanics), since this were the mechanics either implemented or announced when I started this project. In the next chapter I will explain the basic mechanics of factorio, which are needed to understand the subsequent chapters. If you already know Factorio, you can skip ahead.

If you only want to see how big of a factory we can simulate with this engine, you can skip to the Results chapter. For instruction for trying out the clone yourself, go here.

What is Factorio

The basic unit of resources in Factorio is an Item. There are lots of different kind of items:

  • Iron/Copper Ore
  • Coal
  • Electronic Circuits
  • etc.

Additionally, some resources exists in Fluids like Water, Crude Oil and Petroleum Gas.

Assembling Machines take some items according to a Recipe and convert these ingredient items into other different output items. Here the term Assembling Machine is an umbrella term for all machines, which transform items like this. This includes furnaces1 smelting items, oil refineries processing oil and petroleum, as well as the typical Assembling Machine crafting Gears from Iron Plates.

Modules are special items, which can be inserted into Assembling Machines to improve their crafting speed and productivity.

Beacons are buildings, which transmit the effect of Modules placed inside of them to nearby Assembling Machines as long as both are powered.

Conveyor Belts move solid items along its length. The move along, until they bump into the end of the belt, where they stop (and do not tumble off the end), or bump into another item, which is stuck. It is (rather obviously) not possible to put fluids onto conveyor belts.

Inserters move solid items from either the output of an Assembling Machine or a Belt, into the input of another Assembling Machine or onto another Belt. They cannot interact with fluids.

Internally, Factorio runs on Updates or Ticks2 Each tick is a fixed timestep update, corresponding to a 60th of a second. So if Factorio is running at 60 UPS(Updates per Second), the world is running in real time, and the player does not experience lag. If the UPS drops below 60, the game starts running in slow motion, in order to still allow all calculations to take place.

Settings some goals

When I started this project, I had a couple of design requirements, that I wanted my clone to fullfill.

  • I do not intend to have the exact same behaviour of Factorio. While most things should work so similar that anyone not intimately knowlegable about Factorio mechanics should be able to see this as "That works the same as far as I can tell".
  • The simulation has to be fully deterministic. This allows Multiplayer using a technique called Deterministic Lockstep and Replays which only rely on the inputs to the simulation.
  • I am optimizing for what I call Simulation Performance. This is a metric I came up with. It means the running speed of a Factory in steady state, without any buildings being placed or removed. All performance claims, or comparisons are made using Simulation Performance as the metric. While I generally did try to keep performance of things like placing buildings as good as possible (mostly to keep my own sanity intact as you will see later), some operations on huge factories tend to be slower than a "proper"/"commercial" game might find acceptable3. I will outline what I mean later, and you can judge for yourself it that is unreasonable.
  • The clone should be moddable. While I do not strife for the (frankly ridiculous4) amount of moddibility of Factorio itself, the clone should support mods adding new ores, items, recipes, machines and such changes. This importantly means not (overly) restricting the number of possible different items/recipes/machines etc, by e.g. storing an ID in a u8.

The less ambitous goal in terms of moddibility, is likely the main opportunity for optimizations. As such, you should not see this project as a "Factorio, do this and be fast", but rather a "If Factorio was redesigned and rewritten from scratch today, with restrictions on mods and no concern for backwards compatibility, this is the kind of performance a Factory Builder with similar mechanics could achieve".

Approximating a Performance Target

All good Performance-Aware-Programming5 starts with estimating the potential performance, based on the hardware capabilities available.

I will assert here, that simulating a Factorio style factory is (incredibly) memory bound6. The Factorio developers have mentioned this multiple times on their development blog. Check it out, it is very interesting! I will prove this fact empirically later on, or I could theoretically calculate the Operational intensity of the simulation, but that is nerd shit :P.

Thus, the amount of resources we have available (at least on my machine) is 40GB/s.

With a goal of 60UPS, we have a budget of 667MB per Tick.

Using this very simplistic model of what needs to be done every tick, let's assume 600 bytes of memory access per tick for each Assembling Machine. With this assumption we can support ~1_000_000 Assembling Machines. That is an insane amount. The biggest Factorio bases have around 100_000.

In order to achieve the full memory bandwidth of 40 GB/s, my PC needs to use at least 4 cores. This effect, of requiring multiple cores to fully saturate memory bandwidth, is enhanced if some parts of the simulation do not fully use the available bandwidth. Additionally this effect is more pronounced in modern machines, with higher memory bandwidths.

This necessitates being able to update parts of the simulation in parallel. In particular, just updating the power system and belt system in two independent threads, would not be enough parallelism!

The Rest of the Owl

Your have reached the point, after which I need to finalize the write-up still. Annoyingly these sections keep on changing whenever I find a new way to restructure the engine to squeeze out a couple more UPS.

After here, there is the results section talking about the largest factories possible and the instruction to try out the clone yourself.

Results

Now that we have established, how the engine works, let's put some numbers on how fast it can run:

For benchmarking I have recreated this 40k SPM Factorio 1.1 megabase by smurphy1 in my engine.

In Factorio, this megabase reaches ~40 UPS, or uses around 25 mspt. In my engine, this megabase runs at a solid 60 UPS, using ~2mspt7.

A very zoomed out picture of the megabase with its solar panels. On the side is a litte window showing 60UPS and 2.6 mspt.

In order to stress my engine further, I filled a save file with more and more clones of smurphy1's megabase.

The engine was nicely able to run the save file, containing 60 clones of the megabase, at 60 UPS, using most of the available ~16.6 mspt8. An even further zoomed out picture of 60 megabases in a grid. The UPS window shows 60 UPS and 15.2 mspt. Half of the screen is taken up by a graph, showing the item usage per minute, reading 1.9 million of each science pack used per minute

Fun Facts about the Gigabase

The base produces ~2_400_000 SPM continously (note, this is in Factorio 1.1 scale, not Space Age nor Quality). It contains ~6_000_000 Assembling Machines and Furnaces.

In total, the factory uses ~20 TW continously. To produce this much power, 1_000_000_000 Solar Panels and Accumulators are needed. This much power is pretty much exactly the amount of energy humanity is using, although humanity only uses around 3TW of it as electric energy9.

The base produces around 3_000_000 copper wire every second! It combines ~45_000 blue belts of iron ore and ~35_000 blue belts of copper ore into 740 blue belts of each science pack.

The base stretches over an area ~30_000 tiles wide and ~80_000 tiles tall. This is equivalent to ~2_500 square kilometers, or about half the size of the Ruhr Valley in Germany. If I included the solar panels to power this gigabase, it would span ~10_000 square kilometers, about the size of a third of Belgium.

Trying it out

There is an online version available. Due to limitations when running in the browser, the performance is significantly worse. It is able to run the 40k SPM megabase at 60 UPS though. Importantly it is also does NOT allow for saving the game. This is a limitation of WebAssembly. You can find the online version here.

The best performance, and most tested version is running on native. Instructions can be found on the github page.

1

Furnaces in Factorio actually work a bit differently, they set their recipe automatically based on their contents. I decided that not recreating this detail was acceptable, since Krastorio 2 a popular Factorio mod also disables this.

2

You might know this tick based system from Minecraft, which runs on ticks corresponding to a 20th of a second.

3

Though Factorio itself also has a bunch of situations, where placing or removing buildings results in lag spikes.

5

Here is a very good series on programming with performance in mind, made by Casey Muratori. It was my first introduction into what modern CPUs actually do, and the information within should IMO be mandatory for all CS students.

6

Pushes up glasses. AFAIK, Factorio is actually Memory-Latency-Bound, while the clone I am building is almost exclusively Memory-Thoughput-Bound. This means that the back of the napkin calculations are a bit off for Factorio, but from testing, Factorio is still able to saturate about half of my DRAM bandwith, so the estimates are only about off by a factor of 2x.

8

The save file containing mutiple clones of the megabase uses Infinity Batteries for powering the factory. As previously discussed, solar panels and accumulators do not require processing power to be updated, but they do require RAM keeping track of them and my 32GB are stretched thin on the other ~30_000_000 entities already.

7

All measurements, which do not specify otherwise, were made on my machine using an Intel 12400F running DDR4 3600 MT/s memory.

9

https://en.wikipedia.org/wiki/World_energy_supply_and_consumption