Dec 3, 2020

Map Editor - Line of Sight / Performance

Part 5 of infinity.

Introduction

As I mentioned in my previous post on light and walls, I was planning on adding some line-of-sight to the map editor as well, so players can’t magically see everything just because it happens to be light out. I did that, and quickly ran into an issue where the whole editor got really, really laggy. I thought it might be nice to do a quick writeup of why, and use it to discuss some things I encounter frequently when I try to performance-tune stuff.

This is more about general software engineering practices and less about particular bits of code you might find interesting. If you’ve worked with me and been subjected to a lengthy rant about performance, there’s probably nothing new in it for you.

How Line-Of-Sight Worked

Ultimately I just reused the logic I already built for light and darkness, with some slight caveats. First, instead of each light source carving out an arc around it, each player did. Second, instead of making the stuff you don’t see a big blob of darkness, I made it a big blob of static grey, except if there was also darkness, in which case I made them both darkness. (Otherwise in some places there would be a distinction between blocked light and blocked vision that would reveal info to the players.)

Also, in the process of doing both of these, I added logic so that walls won’t be drawn unless the players have line-of-sight to them and they are lit. (Otherwise, even in darkness, you can see the interruptions of the grid that indicate the shape of buildings and hallways, which again leaks information.) This ended up being the main problem.

Light blocked by walls

Example of light blocked by walls. Note that the higher wall is actually one uninterrupted line, but only some of the gridlines are covered by it.

Performance Issues

The first session I ran with this logic enabled, I was pretty happy with the visuals, but the whole system started crawling.

To briefly review some stuff I think I communicated in an earlier post, the system uses SignalR to keep all clients in sync. Whenever anything changes, it actually wraps up the whole map in an update message (which is obviously inefficient given that most changes involve a few bytes of information). One consequence of this - besides sending relatively large blocks of network traffic - is that the editor does a full redraw every time it receives an update.

This is probably slightly worse than it sounds for at least one reason that might not be obvious. One of the most common things that happens during a DnD session is combat (and in fact, the non-combat portions of the sessions often don’t require showing the map at all - I run sessions on a videoconference and switch between screen-sharing for combat and just the video for roleplaying).

In combat, mostly the updates are to people’s hitpoint totals as they get hit - somebody might hit a monster for, I dunno, 15 damage, and then the monster hits a party member for 23 damage, and then somebody else hits the monster for 7 damage, and so forth. The players and I are all just rolling dice to determine these amounts and then I’m updating the values in the editor. Sometimes this means I type in a new number, but other times I just click on the number-input and press “down” 15 times. Which changes the value 15 times. Which sends 15 updates. Which redraws the scene 15 times. And, now that the drawing logic bogs everything down, it makes the page lock up for several seconds.
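To make that chain concrete, here is a toy simulation of the update flow in Python. It is only a sketch: the class and function names are invented for illustration, there is no real SignalR or drawing code in it, and the "message" is just the whole board serialized as JSON.

```python
import json


class EditorClient:
    """Hypothetical stand-in for one connected editor (no real networking)."""

    def __init__(self):
        self.board = None
        self.redraws = 0

    def on_update(self, message):
        # Every update carries the whole board, not just the changed value.
        self.board = json.loads(message)
        self.redraw()

    def redraw(self):
        # Stand-in for the full (expensive) redraw; here we only count calls.
        self.redraws += 1


def press_down(board, entity_name, clients):
    """Simulate one press of 'down' on an entity's hitpoint input."""
    board["entities"][entity_name]["hp"] -= 1
    message = json.dumps(board)  # the entire board is sent for a one-number change
    for client in clients:
        client.on_update(message)
    return len(message)


board = {
    "walls": [[x, 0] for x in range(200)],
    "entities": {"goblin": {"hp": 40}},
}
client = EditorClient()
for _ in range(15):
    message_size = press_down(board, "goblin", [client])

print(client.redraws, "redraws,", message_size, "characters per message")
```

Fifteen presses produce fifteen full-board messages and fifteen full redraws, which is harmless while a redraw is cheap and painful once it isn’t.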

A (somewhat) brief aside about compute times

Back when I worked at Quicken Loans, I spent a lot of time worrying about performance-tuning. In fact, in a weird way, I think QL was just the right size to be the most concerned about performance-tuning - Google would just throw more compute resources at the problem, usually, because compute time was cheap and software engineering time was expensive. QL had pretty large workloads but didn’t have the same near-infinite pool of compute resources (and also Google paid more for software engineers, although both were great jobs), so we had to actually write software efficiently, or at least occasionally come through and tune things up.

This would often lead to conversations where I would ask why something was so slow and people would try to explain to me that it wasn’t slow, it only took a couple of seconds, and I would respond that “seconds” is a measure of time that should apply to people, not computers. If I ask you to do something and it’s done five seconds later, you’re an amazing person. If I ask a computer to do something and it’s done five seconds later, it had better involve hundreds of MB of data, or the kind of process you try to offload to a GPU (Machine Learning, mostly).

So, the fact that redrawing the screen was now taking somewhere in the neighborhood of a second was not great.

Another aside, this one about profiling

If you reach a point where you need to care about performance, profile your code. Even if you think you see why it’s inefficient, profile it. Don’t bother performance-tuning code without profiling it.

The reason for this is that even when code is really inefficient, a lot of times it doesn’t matter. Modern computers are insanely fast. Maybe some particular part of your process is 10x slower than it needs to be. Or maybe 100x slower. But it still might take 1ms. Or 0.1ms. Or 0.001ms. Sometimes there’s fifteen things wrong with your process and it’s taking 5 seconds to run, but the first wrong thing is taking 4.999 seconds, so you don’t really need to care about the next one until you fix it (and probably you don’t need to care about it ever, because 1ms is plenty fast).
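As a minimal illustration of what that looks like, here is a sketch using Python’s built-in cProfile and pstats modules. The two step functions are made up for the example (they are not from the editor); one is deliberately wasteful and the other is cheap, and the profile shows which one is actually worth anyone’s time.

```python
import cProfile
import pstats


def slow_step(n):
    # Deliberately quadratic: this is the "first wrong thing".
    total = 0
    for i in range(n):
        for j in range(n):
            total += i * j
    return total


def fast_step(n):
    # Maybe not optimal either, but far too cheap to matter.
    return sum(range(n))


def process():
    slow_step(300)
    fast_step(300)


profiler = cProfile.Profile()
profiler.enable()
process()
profiler.disable()

stats = pstats.Stats(profiler)
# Print the five entries with the most time spent in the function itself.
stats.sort_stats("tottime").print_stats(5)

# stats.stats maps (file, line, function name) to
# (call count, primitive call count, own time, cumulative time, callers).
own_time = {key[2]: value[2] for key, value in stats.stats.items()}
```

In the printed table, slow_step should sit at the top by a wide margin, which is the cue to leave fast_step alone.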

Even though I knew that the performance issues started with the addition of the line-of-sight calculation, I still profiled it to make sure it wasn’t just some weird idiosyncratic piece of that process rather than the main logic of it. Although, in this case the answer was extremely straightforward.

Why redraw was so slow

Whenever I was checking whether I had line-of-sight (or lighting) on a particular entity or wall, I would build the sight-arcs from the entities or the light sources and see if they reached that entity/wall. In fact, I was literally doing this once per wall segment (i.e., five feet of wall) and once per entity. Which meant I did a complete rebuild of the sight-arcs dozens or hundreds of times. Building the sight-arcs involved getting each wall segment and seeing if the arc reached it, and for what angles (if any) that wall segment was the closest thing to the center of the arc. So that’s a really straightforwardly O(n^2) process (or really, O(n^2 + n*m), where n is the number of wall segments and m is the number of entities). And I draw a lot of wall segments - generally I’m viewing a 20x20 map and possibly also scrolling around in it, so several hundred segments might be drawn on a given level. Naturally it took a long time.
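To put rough numbers on that, here is a small counting sketch. It does no geometry at all; it only tallies how many wall-segment checks happen when the arcs are rebuilt for every single visibility test. The function name and the sample counts (400 segments, 10 entities, 5 viewers) are assumptions picked for illustration, not measurements from the editor.

```python
def naive_wall_checks(n_walls, n_entities, n_viewers):
    """Count wall-segment checks when arcs are rebuilt for every visibility test."""
    checks = 0
    # One visibility test per wall segment and one per entity...
    for _ in range(n_walls + n_entities):
        # ...each of which rebuilds the arcs for every viewer (player or light)...
        for _ in range(n_viewers):
            # ...and building one set of arcs scans every wall segment.
            checks += n_walls
    return checks


print(naive_wall_checks(400, 10, 5))
print(naive_wall_checks(800, 10, 5))
```

With these made-up numbers the first call comes to 820,000 checks per redraw, and doubling the wall count to 800 roughly quadruples it to 3,240,000, which is the n^2 term showing up.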

Now, you might rightly ask, “Why do such a stupidly inefficient thing in the first place?” And my answer would be, “Because it might not matter at all.” Again, modern computers are really fast. I was actually already doing this once per entity (so, O(n*m) as above) when I added the light logic and it hadn’t made a meaningful difference.

It’s generally a good idea to just not worry about performance issues unless you’re running some process forty billion times (like Google search) or if the time it takes to run causes some detectable issue (like this). And like the Google philosophy above, it’s not worth spending your time as a software engineer on performance-tuning unless you can recover some measurable value from that time.

Having said all of that, this is obviously a case where I needed to fix it, because it caused me physical pain to run DnD sessions, which is really not what I had in mind when I started this project. So I rewrote the constructor for the main map object (technically “Board” because “map” is an overloaded term but I still think of it as “map” in my head) so it builds all of the sight-arcs for the lights and player-entities once, and then references those arcs repeatedly in the place where it was recalculating them before. There’s really nothing exciting about this code so I won’t bother including it.
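For readers who want to see the shape of the change anyway, here is a rough Python sketch of the build-once idea. It is not the editor’s actual code: the geometry is a crude stand-in (wall segments are treated as points, and each "arc" is just the nearest wall distance per 10-degree bucket), and every name other than Board is made up for the example.

```python
import math

BUCKETS = 36  # 10-degree angle buckets; a coarse stand-in for real arc geometry


def bucket_and_distance(origin, target):
    """Return the angle bucket and distance of target as seen from origin."""
    dx, dy = target[0] - origin[0], target[1] - origin[1]
    angle = math.atan2(dy, dx) % (2 * math.pi)
    bucket = int(angle / (2 * math.pi) * BUCKETS) % BUCKETS
    return bucket, math.hypot(dx, dy)


class Board:
    def __init__(self, walls, viewers):
        self.walls = walls
        self.viewers = viewers
        self.arc_builds = 0
        # Build each viewer's arcs exactly once, when the board is constructed.
        self.arcs = [self._build_arcs(viewer) for viewer in viewers]

    def _build_arcs(self, viewer):
        """Record, per angle bucket, the distance to the nearest wall."""
        self.arc_builds += 1
        nearest = {}
        for wall in self.walls:
            bucket, dist = bucket_and_distance(viewer, wall)
            if dist < nearest.get(bucket, math.inf):
                nearest[bucket] = dist
        return nearest

    def is_visible(self, target):
        """Look up the prebuilt arcs instead of rebuilding them per query."""
        for viewer, nearest in zip(self.viewers, self.arcs):
            bucket, dist = bucket_and_distance(viewer, target)
            if dist <= nearest.get(bucket, math.inf):
                return True
        return False


board = Board(walls=[(5, 0)], viewers=[(0, 0)])
for target in [(3, 0), (8, 0), (0, 8), (5, 0)]:
    print(target, board.is_visible(target))
print("arc builds:", board.arc_builds)
```

However many visibility checks the drawing code makes, arc_builds stays at one per viewer, which is the entire fix.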

I do want to note one other thing - I’m still recalculating all of the sight-arcs each time I update an entity’s hitpoints, which means it still happens repeatedly in cases where the map hasn’t changed at all. This goes back to what I was saying above - it doesn’t matter. At this point the editor is back to being totally fluid, at least to human perception. I don’t know how much time is wasted on redraws - probably on the order of a few milliseconds per redraw - but I know I don’t care, and it’s not worth my time to go through and tinker with my constructor logic to make it preserve whatever sight-arcs were calculated before, which would probably involve making some kind of hybrid double-copy constructor or something, which is messier and harder to understand… better to just let it “wastefully” recalculate everything once per update.

Useful Lessons

Computers are really fast, which means you shouldn’t care about performance unless it actually makes your software less valuable
Computers are really fast, which means anything operating at the level of human perception is really slow
Use profiling

Remaining Issues

As noted, there’s still a ton of waste here, mostly in the form of any update to the game state that doesn’t change the line-of-sight or lighting conditions. If/when that ends up being relevant, I will definitely alter the update logic to copy the previous light and line-of-sight arcs for those