Friday, August 30, 2019

Networking Code Gets Complicated

Disclaimer: You'll only find this interesting if you're curious about network game programming.  Otherwise you might as well skip this one.  

A couple of months ago I noticed a problem with the positioning of ships far away from the player.  Their positions were not updating properly, or not at all, until the player got within about 2/3 of the distance to the edge of the minimap from them.  Then they would suddenly snap to the correct position.  That distance is as seen on the server, not the client.  The biggest problem on the client is that once ships have moved far from their starting positions, the player can approach what they believe to be a nearby ship and sail right on top of it, when that ship is actually very far away and the client is simply not getting updates.

This was confusing to me because it had not always been the case.  When I originally wrote and tested the network API this appeared to work fine, and I've seen it working fine in Broadside previously.  But I have made some updates to the network API since then, and changes to the ships themselves, so I was not sure when this broke or why.  I did know, though, that it had something to do with being subscribed or unsubscribed.

Object Subscriptions


To reduce network traffic and support larger numbers of networked objects, I built a system into the network API that I call Subscriptions.  At a set interval the zone server checks the range from every connected player to every other networked object.  The player's connection is subscribed to any object within that range and unsubscribed from any object beyond it.  Networked objects are anything the server controls and syncs with the clients.  Ships and cannonballs are the best examples, but there's actually even more stuff.
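As a rough illustration of the range check described above, here is a minimal Python sketch.  It is not the actual network API: the class names, the 2D positions, and the `SUBSCRIBE_RANGE` value are all assumptions made up for the example.

```python
import math
from dataclasses import dataclass, field

SUBSCRIBE_RANGE = 500.0  # hypothetical range in world units


@dataclass
class NetObject:
    object_id: int
    x: float
    y: float


@dataclass
class PlayerConnection:
    player_id: int
    x: float
    y: float
    subscriptions: set = field(default_factory=set)


def refresh_subscriptions(players, objects, max_range=SUBSCRIBE_RANGE):
    """Meant to run at a set interval: subscribe in-range objects, unsubscribe the rest."""
    for player in players:
        for obj in objects:
            distance = math.hypot(obj.x - player.x, obj.y - player.y)
            if distance <= max_range:
                player.subscriptions.add(obj.object_id)
            else:
                player.subscriptions.discard(obj.object_id)


player = PlayerConnection(player_id=1, x=0.0, y=0.0)
ships = [NetObject(10, 100.0, 0.0), NetObject(11, 2000.0, 0.0)]
refresh_subscriptions([player], ships)
print(sorted(player.subscriptions))  # only ship 10 is in range
```

Checking every player against every object is the simplest form of this; it only runs at the subscription interval, not every frame.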

Currently I'm running the zone servers at 25 frames per second and sending position updates for ships every 1/25 second, or about once per server frame.  That is the rate for subscribed clients.  If the client is unsubscribed from the object, I was sending an update once every 2 seconds.  This significantly reduces the amount of data the server needs to send to each client, and it shouldn't look that odd since the ships getting slow updates are far away.

There's more to subscriptions than just that.  Specific variables, such as how much flooding a ship has or how many crew it has, are synced only to subscribers.  When a client gets subscribed, all of these "SyncVars" are automatically synced at once to the newly subscribed client, so nothing is lost when the information is only needed close to the object (those two specifically are just displayed above the ship at close range).
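A small Python sketch of that behavior, under assumptions: the `NetObject` class, its `outbox` list, and the method names are hypothetical stand-ins, not the real SyncVar implementation.

```python
class NetObject:
    """Hypothetical networked object that syncs variables only to subscribers."""

    def __init__(self, object_id, **sync_vars):
        self.object_id = object_id
        self.sync_vars = dict(sync_vars)
        self.subscribers = set()
        self.outbox = []  # queued (client_id, variable name, value) messages

    def subscribe(self, client_id):
        if client_id in self.subscribers:
            return
        self.subscribers.add(client_id)
        # Bring the new subscriber fully up to date in one go.
        for name, value in self.sync_vars.items():
            self.outbox.append((client_id, name, value))

    def unsubscribe(self, client_id):
        self.subscribers.discard(client_id)

    def set_var(self, name, value):
        self.sync_vars[name] = value
        # Only current subscribers hear about the change.
        for client_id in sorted(self.subscribers):
            self.outbox.append((client_id, name, value))


ship = NetObject(10, flooding=0.0, crew=120)
ship.set_var("crew", 115)       # nobody is subscribed yet, so nothing is queued
ship.subscribe(client_id=1)     # client 1 gets flooding and crew together
ship.set_var("flooding", 0.25)  # client 1 gets just this change
```

The point of the full sync on subscribe is that changes made while a client was out of range never need to be replayed; the client just receives the current values when it gets close.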

Investigation


I determined the positioning problem was only affecting objects the player was unsubscribed from, primarily ships.  What is going on with the unsubscribed ships?  Looking at the code I didn't pick up on anything obvious.  I first tried increasing the unsubscribed update frequency from once every 2 seconds to once a second.  No change.  I found a bug that in certain circumstances could cause a network object which snapped to a new position on the client to be moved back to a previous position, which could make it appear that it hadn't actually moved.  That's the bug I fixed for 0.7.12 mentioned in the dev log.  That turned out not to be the cause though.

I investigated if sending messages targeting unsubscribed clients in general was broken (I don't use this in many places in the code, so it was possible), but no dice.  The network API uses a channel system, where each channel has different settings.  Settings include whether to send with encryption, to send reliably, to hold short messages for a short time to allow for other messages on the same channel to be combined, etc.  The JCGTransformSync component that handles syncing positions actually is the only thing which uses channels 2 and 3.  Channel 2 is used for subscribed clients, and channel 3 is used for unsubscribed.  Almost everything else in Broadside is either on channel 0 or channel 1.  Ah ha!  Something might be set up wrong with channel 3!  

Nope, nothing wrong with channel 3.  
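For reference, the channel layout described above could be sketched like this in Python.  The channel numbers and the kinds of settings come from the description; every setting value, the `ChannelSettings` type, and the `position_channel` helper are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelSettings:
    encrypted: bool        # send with encryption
    reliable: bool         # send reliably
    combine_delay_ms: int  # how long to hold short messages so they can be combined


# All values below are hypothetical; only the channel numbers are from the post.
CHANNELS = {
    0: ChannelSettings(encrypted=True, reliable=True, combine_delay_ms=0),
    1: ChannelSettings(encrypted=False, reliable=True, combine_delay_ms=20),
    2: ChannelSettings(encrypted=False, reliable=False, combine_delay_ms=0),  # positions, subscribed
    3: ChannelSettings(encrypted=False, reliable=False, combine_delay_ms=0),  # positions, unsubscribed
}


def position_channel(is_subscribed):
    """Pick the channel a position update goes out on."""
    return 2 if is_subscribed else 3


print(position_channel(True), position_channel(False))
```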

I looked through a lot more areas of the code, and today I finally found the issue while walking through all the code involved.

The Cause


The issue ended up being in JCGTransformSync itself.  Even though I send position updates 25 times per second to subscribers and now once per second to unsubscribed clients, I only do so if the object has moved since the last time an update was sent.  I send the update and save those values.  The next time I try to send an update I check whether the object has moved more than a certain (very small) amount, and only then do I send the update.  This means that a large fleet, even if you are subscribed, generates next to no network traffic if the ships are not moving.

The problem was that I send the update to all subscribed clients and save the position just sent.  Then I try to send to all unsubscribed clients, first comparing the current position against the last sent position.  But that comparison is against the position sent to the subscribed clients a moment earlier, which is of course exactly the same as the current position.  So no unsubscribed updates are sent to anyone.
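The bug is easy to reproduce in a stripped-down form.  The Python sketch below is not the real JCGTransformSync; it uses a 1D position, a made-up `MOVE_THRESHOLD`, and counters in place of actual network sends, but it shares one "last sent" value between both paths the same way.

```python
MOVE_THRESHOLD = 0.01  # hypothetical "very small amount"


class TransformSync:
    """Simplified stand-in for a position sync component (1D position)."""

    def __init__(self, position):
        self.position = position
        self.last_sent = position  # one record shared by both paths: the bug
        self.sent_to_subscribed = 0
        self.sent_to_unsubscribed = 0

    def _has_moved(self):
        return abs(self.position - self.last_sent) > MOVE_THRESHOLD

    def send_subscribed(self):
        if self._has_moved():
            self.sent_to_subscribed += 1
            self.last_sent = self.position

    def send_unsubscribed(self):
        # Compares against the position the subscribed path just saved.
        if self._has_moved():
            self.sent_to_unsubscribed += 1
            self.last_sent = self.position


def simulate(frames=250, unsub_every=25):
    """Ship moves every frame; subscribers every frame, unsubscribed every 25th."""
    ship = TransformSync(0.0)
    for frame in range(1, frames + 1):
        ship.position += 1.0
        ship.send_subscribed()
        if frame % unsub_every == 0:
            ship.send_unsubscribed()
    return ship


ship = simulate()
print(ship.sent_to_subscribed, ship.sent_to_unsubscribed)  # 250 0
```

Even though the ship moves every single frame, the unsubscribed path never sees any movement, because the subscribed path always runs first and resets the comparison point.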

This bug has been there since I originally wrote JCGTransformSync, though, so why didn't I see it back then?  I've only noticed the issue for a few months at most.

Ahh, the bug has existed since Broadside 0.6.0, back in September 2018, but it only became really noticeable starting in 0.7.2 when I changed the update frequency from every 1/10 second to every 1/25 second.  You see, at a 25 FPS frame rate with subscriber updates every 1/10 second, there are some frames when I don't try to send an update to subscribers but I do try to send one to the unsubscribed clients.  The objects are all really distant, so I probably didn't notice the unsubscribed updates weren't happening as often as they should, but the positions wouldn't have ended up as wildly off as they have been lately.  Now that I'm sending subscriber updates every frame (even when no clients are actually subscribed, it still runs through the code as if it were going to send, it just doesn't send anything out), there is never a frame in which I attempt to send to the unsubscribed without having already attempted to send to the subscribed.
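The timing argument can be checked with a small frame-by-frame count.  This Python sketch assumes each timer fires on the first frame where at least its interval has elapsed since it last fired; the real scheduling may differ, and the function name is made up.

```python
FRAME_MS = 40  # 25 frames per second


def count_free_unsubscribed_frames(sub_interval_ms, unsub_interval_ms=2000, frames=250):
    """Count frames where an unsubscribed send is attempted with no
    subscriber send attempted in that same frame."""
    last_sub = 0
    last_unsub = 0
    free = 0
    for frame in range(1, frames + 1):
        now = frame * FRAME_MS
        sub_fires = now - last_sub >= sub_interval_ms
        if sub_fires:
            last_sub = now
        unsub_fires = now - last_unsub >= unsub_interval_ms
        if unsub_fires:
            last_unsub = now
            if not sub_fires:
                free += 1
    return free


old = count_free_unsubscribed_frames(sub_interval_ms=100)  # every 1/10 second
new = count_free_unsubscribed_frames(sub_interval_ms=40)   # every 1/25 second
print(old, new)
```

Under these assumptions the 1/10 second setting leaves some frames where the unsubscribed send runs on its own, while the 1/25 second setting leaves none, which matches why the problem only became obvious after the change.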

So I changed the code so that whenever it tries to update the unsubscribed clients it ignores whether the object has moved.  It just sends the update.  Sending an update every second or two for each object isn't a lot of traffic, but I may revisit this later to track the last sent subscribed and unsubscribed positions separately.  For now this feature should be working.
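If I do revisit it, tracking the two last-sent positions separately might look something like the Python sketch below.  This is the possible follow-up, not the change that was actually made (which simply always sends to the unsubscribed), and the names, 1D position, and threshold are all hypothetical.

```python
MOVE_THRESHOLD = 0.01  # hypothetical "very small amount"


class TransformSync:
    """Simplified sync component that keeps one last-sent position per audience."""

    def __init__(self, position):
        self.position = position
        self.last_sent_subscribed = position
        self.last_sent_unsubscribed = position
        self.sent_to_subscribed = 0
        self.sent_to_unsubscribed = 0

    def send_subscribed(self):
        if abs(self.position - self.last_sent_subscribed) > MOVE_THRESHOLD:
            self.sent_to_subscribed += 1
            self.last_sent_subscribed = self.position

    def send_unsubscribed(self):
        # Compared only against what unsubscribed clients last received.
        if abs(self.position - self.last_sent_unsubscribed) > MOVE_THRESHOLD:
            self.sent_to_unsubscribed += 1
            self.last_sent_unsubscribed = self.position


def simulate(speed, frames=250, unsub_every=25):
    """Subscribers are updated every frame, unsubscribed clients every 25th."""
    ship = TransformSync(0.0)
    for frame in range(1, frames + 1):
        ship.position += speed
        ship.send_subscribed()
        if frame % unsub_every == 0:
            ship.send_unsubscribed()
    return ship


moving = simulate(speed=1.0)
anchored = simulate(speed=0.0)
print(moving.sent_to_unsubscribed, anchored.sent_to_unsubscribed)  # 10 0
```

The benefit over always sending is that stationary ships go back to generating no unsubscribed traffic at all, at the cost of one more stored position per object.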

Monday, August 19, 2019

Adding Zones

So I've been slow with the current work.  Last month I finished the feature of the master server spinning up and down the various zone servers.  This makes it possible to fill out additional zones, as previously any zones in the game had to be started by the master server and couldn't be spun down.  This put a hardware limitation on the number of zones which could be running at the same time (around 5 on the relatively low spec test server, even though the current live server can support more than 3x that many).

Now I've been adding additional zones.  Currently that means the Netherlands, the North Sea, the Baltic Sea, the Shetlands, and some ocean west of Scotland and Ireland.  The plan is over the next few months to add all of Europe and go out into the Atlantic towards North America.  I ran into a problem though with how I was triggering a zone transfer for the player.  What I had been doing, and what was working fine up until now, was manually placing trigger colliders onto the map.  When the player entered a trigger collider for an adjacent zone, the zone server would attempt to send the player to that zone.

The problem though was in placing those trigger colliders by hand.  This was easy around the UK since the land provides some easy reference points.  When getting into borders between zones that are out to sea though there are no such reference points.  After struggling with this for about a week, I decided to redesign the system.

Now in code each zone is a list of bounding boxes, each with a northwest latitude/longitude point and one to the south.  The zone server already recalculates the player's lat/long position every second or two.  Now that position will be checked against all zone bounding boxes, and if the location falls within another zone then the zone server will attempt to transfer the player.
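A minimal Python sketch of that lookup follows.  The zone names are real, but the coordinates are invented, and treating the second point as the southeast corner, the half-open edges, and the function names are all assumptions for the example.  It also ignores boxes that would cross the 180 degree meridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    north: float  # latitude of the northwest point
    west: float   # longitude of the northwest point
    south: float  # latitude of the southern point (assumed southeast corner)
    east: float   # longitude of the southern point (assumed southeast corner)

    def contains(self, lat, lon):
        # Half-open edges so boxes that share a border never both match.
        return self.south < lat <= self.north and self.west <= lon < self.east


# Hypothetical coordinates, for illustration only.
ZONES = {
    "North Sea": [BoundingBox(61.0, -2.0, 54.0, 8.0)],
    "Netherlands": [BoundingBox(54.0, 3.0, 51.0, 8.0)],
    "Baltic Sea": [BoundingBox(60.0, 10.0, 54.0, 24.0)],
}


def zone_for(lat, lon):
    """Return the name of the zone containing the point, or None."""
    for name, boxes in ZONES.items():
        if any(box.contains(lat, lon) for box in boxes):
            return name
    return None


def check_transfer(current_zone, lat, lon):
    """Return the zone to transfer to, or None if the player should stay put."""
    target = zone_for(lat, lon)
    if target is not None and target != current_zone:
        return target
    return None


print(check_transfer("North Sea", 52.5, 4.5))  # Netherlands
print(check_transfer("North Sea", 56.0, 3.0))  # None
```

Because the check runs on the server against the same boxes every time, a point can only ever belong to one zone, so there is nothing to misplace by hand.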

This should eliminate the problems with placing trigger colliders, which when misplaced can result in the player transferring to another zone and then immediately getting transferred back.  It will also eliminate the overlapping area between zone servers, where two players at the edge of a zone can actually occupy the same location while connected to two different zone servers.

So I'm optimistic I will have this all completed within the next day or two and pushed out to the live server after testing.  After that I'll start working on the French, Spanish, and Portuguese coasts, and then the Mediterranean.  This will make for a much more expansive game compared to now, letting the player travel for hours discovering new lands.  :)

Edit:
The new zone transfer system and new zones are now undergoing testing in 0.7.11 and it has been very successful.  After additional testing is done, and some relatively minor bugs with the new zones are resolved, the build will be pushed to the Perilous cluster. 

Tuesday, August 13, 2019

Some gameplay footage

Here's just a bit of raw gameplay combat.  Click below to play the video.  This is a fight between British and French NPCs outside of London that I sailed into and joined.  Nothing special.


Sometime soon I'll post some better videos with commentary.  But just a taste here for now.