Retrofitting data

We have spent the last couple of months preparing for what looks to be a busy fall. We will have more to say about that in coming updates, but for now I am going to stick to the somewhat dry topic of data inheritance.

Something Mike Acton has said about architecting systems is that the first thing you should think about is how the data flows through them; once that is established, the code essentially writes itself. Great advice to be sure, but with a project like Backworlds, which started out as a quick hack and has been mostly playable since about one week into development, the way we think about what our data defines has changed massively. While we probably should have started over and deprecated old content when the structure of the game became clear, it is usually safer to make incremental changes, so we have maintained support for most old maps.

Most of the data changes stem from the original prototype defining essentially all game data per level: level properties, graphics, game object settings and even the player avatar were stored in individual XML nodes in each level. This works great in a prototyping environment, where you don’t have to worry about where the data comes from and you can change whatever you want in a single level without risking breaking something else. It is, however, a really inefficient use of disk space and XML parse time, as objects get duplicated a lot, and it makes sweeping changes to properties that should apply across the entire game difficult and time-consuming. We have made changes over time to address this; here are a few of them:

World settings


Those who have been with us long enough to try the Backworlds prototype, or who have read our blog posts about scrapped mechanics, know that we have tried different mechanical changes in the different worlds, ranging from changes to the player’s speed and movement options to how time space functions to changes to the painting mechanic itself. All of these settings were controlled individually per level, and the only consistency we had was that new levels would default to the same settings as their closest neighbor.

As we decided to go deeper on fewer mechanics, we picked out the ones we liked best. One of the first things we did in the way of data inheritance was to create a file of world archetypes: levels now refer to one of these types by name rather than carrying a block with all of the individual settings. For backwards compatibility, levels that still specify individual settings are matched against the archetypes on load, and the settings are replaced with a type reference if we find a match; if not, we simply use the custom data. This is a very quick process compared to the rest of the work we do when loading a level, and we rarely even need it, as we typically just add new archetypes when we know that we are going to use them.
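To make the matching step concrete, here is a rough sketch in Python. It is not the actual Backworlds loader: the element names (`archetypes`, `world`), the `type` attribute and the example settings are all made up for illustration, and settings are compared as raw attribute strings.

```python
import xml.etree.ElementTree as ET

# Hypothetical archetype file; names and settings are illustrative only.
ARCHETYPES_XML = """
<archetypes>
  <world name="standard" speed="1.0" gravity="1.0" />
  <world name="slow" speed="0.5" gravity="1.0" />
</archetypes>
"""

def load_archetypes(xml_text):
    """Map archetype name -> dict of settings (every attribute except the name)."""
    archetypes = {}
    for node in ET.fromstring(xml_text).findall("world"):
        settings = dict(node.attrib)
        name = settings.pop("name")
        archetypes[name] = settings
    return archetypes

def resolve_world_settings(level, archetypes):
    """Return (archetype name or None, settings) for a level element."""
    world = level.find("world")
    ref = world.get("type")
    if ref is not None:
        # New-style level: it only names the archetype it uses.
        return ref, archetypes[ref]
    # Legacy level: inline settings. Look for an archetype with identical values.
    inline = dict(world.attrib)
    for name, settings in archetypes.items():
        if settings == inline:
            return name, settings
    # No match: fall back to the level's custom data.
    return None, inline
```

A legacy level such as `<level><world speed="0.5" gravity="1.0" /></level>` would resolve to the `slow` archetype here, while a level with settings that match nothing keeps its own values. Because the comparison is on strings, `"1"` and `"1.0"` would not match; a real implementation would likely parse the values first.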

Render data

Raw texture data always resided in image files on disk, at least in development builds, but information about how those textures should be created, as well as the size, anchor points and animation information of sprites, was also stored directly in the level data. We had a prefab system early on that would simply copy XML nodes into levels, and that’s how we shared both sprites and game objects, but this made changing things like HSLE settings across levels really painful. Enter render libraries.

Around the time we started working on production art, we built a system for specifying render data in a separate XML file and then simply parsing this together with the level’s render data on load. This way, a level only needed to specify which libraries it was using in order to share sprite information with other levels, and any changes would propagate to all of them. The only system we had for replacing duplicates in old levels was that name conflicts would always favor the library data over the level data. That is a counter-intuitive stance, since you can imagine wanting to override specific information in a specific level, but it made the system a lot less error-prone, and new sprites could always be created if you needed specific data.
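The conflict rule is easy to sketch if we pretend the parsed render data is just a dictionary keyed by sprite name. This is an illustration in Python rather than our loader; the function name and the data layout are assumptions, and the behavior when two libraries define the same name (here the later one wins) is a guess that the text above does not settle.

```python
def merge_render_data(level_sprites, libraries):
    """Combine a level's own sprite definitions with its render libraries.

    level_sprites: dict of sprite name -> definition stored in the level
    libraries: list of dicts of the same shape, one per referenced library
    """
    merged = dict(level_sprites)
    for library in libraries:
        # On a name conflict the library definition replaces the level's copy.
        merged.update(library)
    return merged

# Hypothetical data: an old level still carries a stale copy of "box".
level_sprites = {"box": {"width": 16}, "secret_door": {"width": 32}}
common_library = {"box": {"width": 24}, "lamp": {"width": 8}}

merged = merge_render_data(level_sprites, [common_library])
```

In this example `merged["box"]` is the library version, while `secret_door`, which only exists in the level, is left alone.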

We had to change very little in existing levels, since this was put in before we had a lot of production art. A drawback of this system is that libraries tend to err on the side of loading too much for a level, which is why library information is discarded when we make playtest builds: we simply load all the graphics data, discard what we don’t need and save the remainder as part of the level. Since we still want to share data, and need a better way to reuse render data between adjacent levels, this will probably be rewritten again.
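The playtest-build step boils down to a filter. A minimal Python sketch, assuming the merged render data is a dictionary keyed by sprite name and that we can collect the set of sprite names a level actually references (both are assumptions, as is the function name):

```python
def bake_level_sprites(merged_sprites, used_names):
    """Keep only the sprite definitions that the level actually references."""
    return {
        name: data
        for name, data in merged_sprites.items()
        if name in used_names
    }

# Hypothetical data: the library supplied "lamp", but this level never uses it.
merged_sprites = {"box": {"width": 24}, "lamp": {"width": 8}}
baked = bake_level_sprites(merged_sprites, used_names={"box"})
```

The baked result is what would be written back into the level, so the playtest build no longer needs the library at all.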

Game objects

Game object settings were the most complex and arguably the most important data to share. Like render data, they are mostly shared between levels, but unlike render data, game objects typically have information (position, for instance) that is specific to each instance. Again, we could use the prefab system to create new instances, but if we wanted to, say, change the pushing speed of a standard box across all levels, we’d have to find all instances and change each one individually. It should be mentioned that a well-thought-out data plan, like Acton describes, would foresee this and structure instance-specific and permanent data separately at the very least, but that’s not the system we had to work with. Game objects controlled their own XML parsing and structured their data in different ways.

Having a way to allow for generic XML inheritance seemed like the most versatile and safest option: nodes would be able to inherit all their properties and children from a base node, and if any of the information had changed we would write the new information instead and let that override the old. YAML, which I probably would have used instead of XML had I started this today, allows for a similar thing as a feature of the language, and at first it seemed easy enough. The big problem comes when considering child nodes. We do not consider the order in which child nodes appear to be important, so if a child node does not match any child node in the base object, we do not know which node changed. In addition, removing a child node may be a deliberate action, and if we just assume that no child nodes means we want to use the base settings, we cannot cover that case.

<!-- The object we will inherit from -->
<baseobject>
  <subobject />
</baseobject>

<!-- It is not clear whether we want to remove the subobject or use the settings from the baseobject -->
<object _base="baseobject" />

Ultimately, we decided to just go through the unique names of all subnodes and, for each name, discard all information if the nodes matched those of the base, or keep all information if they did not. This still creates redundancy in cases where small pieces of information in the child nodes change, and it does not handle the removal issue at all, but looking at our data it solves the vast majority of the problems we had. While we tried to avoid retrofitting too much data, the cases where we actually had to make changes were very few, and fixing them was a lot faster than coming up with a more robust solution.
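Here is a rough Python sketch of that rule, using `xml.etree.ElementTree`. It is an illustration under assumptions rather than our actual code: the function names are invented, the base node is passed in directly instead of being looked up through `_base`, and "matching" is taken to mean that all children with a given name are identical to the base’s, ignoring order.

```python
import copy
import xml.etree.ElementTree as ET

def canon(el):
    """Order-independent fingerprint of an element and everything below it."""
    return (
        el.tag,
        tuple(sorted(el.attrib.items())),
        (el.text or "").strip(),
        tuple(sorted(canon(child) for child in el)),
    )

def group_key(elements):
    return sorted(canon(e) for e in elements)

def strip_inherited(node, base):
    """Save time: drop everything in node that the base already provides."""
    out = ET.Element(node.tag)
    for key, value in node.attrib.items():
        if key == "_base" or base.get(key) != value:
            out.set(key, value)
    # Visit each unique child name once, in order of first appearance.
    for name in dict.fromkeys(child.tag for child in node):
        mine = node.findall(name)
        if group_key(mine) != group_key(base.findall(name)):
            # Any difference: keep every child with this name.
            out.extend([copy.deepcopy(child) for child in mine])
    return out

def resolve(node, base):
    """Load time: fill in whatever node does not specify from the base."""
    out = ET.Element(node.tag, {**base.attrib, **node.attrib})
    own_names = {child.tag for child in node}
    for child in base:
        if child.tag not in own_names:
            out.append(copy.deepcopy(child))
    for child in node:
        out.append(copy.deepcopy(child))
    return out
```

Both weaknesses mentioned above show up directly in the sketch. Changing one attribute on one child keeps every child with that name, which is the redundancy. And an instance with no children at all gets the base’s children back from `resolve`, so a deliberate removal cannot be expressed.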

While I always advocate for the cleanest and quickest solution regardless of whether it means changing code or data, having worked on released games with large amounts of user-generated content, I know painfully well that this is not always a choice you have.

This post is a bit late because I went to visit SGDQ last week! Here’s a picture of the Backworlds avatar hanging out with Cuphead while watching Beckski93’s Tomb Raider run:

… And here’s another one while watching Oatsngoats fighting Spore Spawn – something you rarely see in Super Metroid speed runs, so that was something of a treat.
