Wednesday, November 14, 2018

Continued progress, networking API success, market updates, current development

Networking API Updates


The new in-house networking API has been very successful.  Only relatively minor bug fixes and improvements have been needed so far after putting it through testing in the project.  The biggest issue found was that some testers were unable to receive certain packets but were not disconnecting.  I tracked down the most likely cause: some ISPs were dropping larger UDP packets instead of fragmenting them, even though the packets should not have been sent with the don't fragment flag set (I was using packet sizes near the maximum MTU).

When JCGNetwork was first integrated into Broadside, the sizes of different messages and the overall UDP payload size were hard-coded all over the place, with ample buffer room so nothing would accidentally overflow.  I went back and completed the packet size feature: on one of the main scripts I can set the exact maximum UDP payload size (packet size equals UDP payload + UDP header + IP header), and all other scripts maximize their message sizes to the exact number of bytes that will fit.  I then set the default UDP payload size to 508 bytes, which from my research is a good safe maximum at which ISPs won't drop (or even fragment) UDP packets.  Testing by the testers on the problem ISPs was successful.
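As a rough illustration of the packet size arithmetic, here is a minimal Python sketch.  The 508 byte figure is from this post; the header sizes are the standard IPv4 and UDP values, and 576 bytes is the datagram size every IPv4 host is required to accept, which is the usual derivation of 508.  The function names and the 12 byte message header in the usage note are hypothetical and are not JCGNetwork's actual API.

```python
# Standard IPv4/UDP sizes, in bytes.
IPV4_MIN_DATAGRAM = 576  # every IPv4 host must be able to accept this size
IPV4_MAX_HEADER = 60     # IPv4 header carrying the maximum amount of options
IPV4_MIN_HEADER = 20     # IPv4 header with no options
UDP_HEADER = 8


def safe_udp_payload() -> int:
    """Largest UDP payload that fits in a 576 byte datagram in the worst case."""
    return IPV4_MIN_DATAGRAM - IPV4_MAX_HEADER - UDP_HEADER


def packet_size(udp_payload: int, ip_header: int = IPV4_MIN_HEADER) -> int:
    """Packet size = UDP payload + UDP header + IP header."""
    return udp_payload + UDP_HEADER + ip_header


def max_message_bytes(udp_payload: int, message_header: int) -> int:
    """Bytes left for message data after a (hypothetical) per-message header."""
    if message_header >= udp_payload:
        raise ValueError("message header does not fit in the UDP payload")
    return udp_payload - message_header
```

With these definitions, safe_udp_payload() gives 508, packet_size(508) gives 536 for a typical header without options, and a hypothetical 12 byte message header would leave max_message_bytes(508, 12) = 496 bytes for data.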

I'm seriously considering releasing JCGNetwork to other developers for free, either to use directly or as example code showing how they can use the Socket class in their own projects.

City Markets

Buying/selling at city markets is now complete, as is the supply/demand system.  The lists of all items, as well as what is available from NPCs at various markets, are still being worked on, but the core mechanics and the UI are all complete and successfully tested.  It was kind of exciting making the first trade runs while trying to avoid NPC pirates.  Default pricing of various ships and other ship-related items, which had previously been set at 1 real, has been changed to more realistic pricing.  The game is starting to feel like a real MMO.

Still to do are player-created buy/sell orders.  The back end for this is complete, but the UI still needs to be created.

Server Performance

I did a small redesign of how server instances are launched to support creating a separate development mode build just for the zone servers, to which I can attach the profiler and examine what is burning so much CPU.  CPU performance on the server has been a bubbling issue: the current CPU usage on the main thread would prevent hitting the target of 60 to 100 concurrent users per zone.  As it is, AI ships produce the same CPU load as player ships, so this was easy to test by increasing the number of AI ship spawns.

Zone servers are multithreaded, but the main thread was using approximately 2/3 of the CPU cycles.  I'm using a 5-year-old AMD 8-core CPU on the test server, so a newer CPU with better single-core performance would raise the top end.  On the test server, though, I'd see performance issues cropping up at around 40 ships and significant issues at around 60 ships, and the server would become completely unresponsive well beyond that.

To my surprise, the profiler showed that the physics system was using the majority of the CPU.  I'm going to experiment with disabling colliders on AI ships that aren't near any players, but in the meantime I found that the default fixed time step of 0.02 seconds (a 50 FPS physics frame rate) is unnecessary for this game.  Going to a 0.04 fixed time step cut the physics CPU usage by roughly half with no noticeable negative change for the user.
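To show why doubling the fixed time step roughly halves the physics work, here is a minimal Python sketch of a generic fixed time step accumulator loop.  It is not the engine's actual loop; the function name is hypothetical, and times are kept in whole milliseconds so the arithmetic is exact.

```python
def count_physics_steps(frame_times_ms, fixed_step_ms):
    """Count how many physics steps a fixed time step loop would run.

    frame_times_ms: duration of each rendered frame, in milliseconds.
    fixed_step_ms:  the fixed physics time step, in milliseconds.
    """
    accumulator = 0  # simulated time not yet consumed by a physics step
    steps = 0
    for frame_ms in frame_times_ms:
        accumulator += frame_ms
        # Run as many whole physics steps as the accumulated time allows.
        while accumulator >= fixed_step_ms:
            accumulator -= fixed_step_ms
            steps += 1
    return steps


# One second of frames at 25 FPS (40 ms each).
one_second = [40] * 25
steps_at_20ms = count_physics_steps(one_second, 20)  # 0.02 s fixed time step
steps_at_40ms = count_physics_steps(one_second, 40)  # 0.04 s fixed time step
```

Over the same second, the 20 ms step runs 50 physics updates and the 40 ms step runs 25, so the number of physics updates (and roughly the physics CPU cost) is cut in half.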

In addition, I had previously assumed that my various systems that update every ship on every frame were the primary cause of high CPU usage, but my previous optimization passes were apparently more successful than I had thought.  These systems used very little CPU.  JCGNetwork also used almost no CPU (granted, I was testing with low user counts), and GC (garbage collection, the automatic memory management of managed code) accounted for an insignificant amount of CPU usage.  Still, once the physics system's CPU usage was reduced, my scripts were taking up a little over half of the CPU cycles used.  I tracked the issue to a third-party script related to the water system I am using, specifically the ship wake feature that draws wakes on the surface of the water as ships move.  Even when ships were idle this script was burning CPU, and I don't even activate the wakes on the server.

I added a simple "if (JCGNetworkManager.IsClient())" line so none of that code runs on the server anymore, and the bug was resolved, dropping script CPU usage by well over half.  And because I had dropped the physics frame rate down to 25, I didn't see a reason to keep the overall frame rate at 60, so I went ahead and dropped that to 25 as well for now.  The result is CPU usage that may very well support 100 clients per zone as it is now, with several avenues for further improvement still to be investigated.

Current Development

My volunteer testers are getting excited about the game, I'm getting excited about the game, and things are generally going very well.  One issue that has bugged the testers is that I am currently just killing the server processes, which leaves the database in an inconsistent state (not data corruption, but certain updates from the zone servers get written to disk by the database and conflict with other data that hasn't been updated yet).  The biggest issue is ships.  For example, if a player switches to a different ship and then logs off, that player is immediately updated in the database with the ship they are sailing, but the ships in the city are not.  If I kill the server processes before the cities update to the database, the same ship could be recorded as both being sailed by the player and stored at the dock in a city, causing problems in the game because these situations are unexpected and aren't explicitly handled.  So for now I've just been wiping the database frequently, annoying the testers who put work into their characters.  Some of this is unavoidable, but once a week is an unnecessary irritation.
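To make the ship conflict concrete, here is a small Python sketch that looks for ships recorded as both sailed by a player and docked in a city.  The data layout, function name, and example names are all hypothetical and are not Broadside's actual database schema.

```python
def find_conflicting_ships(sailed_by_player, docked_in_city):
    """Return {ship_id: city} for ships recorded as both sailed and docked.

    sailed_by_player: dict of player name -> ship id being sailed (or None).
    docked_in_city:   dict of city name -> set of ship ids stored at the dock.
    """
    sailed = {ship for ship in sailed_by_player.values() if ship is not None}
    conflicts = {}
    for city, ships in docked_in_city.items():
        # Any ship in both sets is in the inconsistent state described above.
        for ship in ships & sailed:
            conflicts[ship] = city
    return conflicts


# The player record was saved, but the city record still lists ship 7 as docked.
players = {"alice": 7, "bob": None}
docks = {"London": {7, 9}, "Calais": {3}}
conflicts = find_conflicting_ships(players, docks)
```

Here conflicts would contain ship 7 mapped to "London", which is the kind of record a hard kill of the server processes can leave behind.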

So I'm currently working on the Command Console, which is a separate application for controlling the server cluster.  The first feature added is the server cluster spin down command, where the Command Console tells the Master Server to switch to auto spin down mode.  The Master Server then manages a safe power down of the entire cluster.  It starts by shutting down the Login Server, followed by all the Zone Servers.  Before shutting down, the Zone Servers disconnect all players and update all player characters, ships, city storage, and city markets to the database.  Next the Tracker Server is shut down, followed by the Database Server, which writes to disk before exiting.  After that the Master Server powers off.  The server cluster can then safely receive a build update.
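The spin down order described above can be sketched as a simple ordered plan.  This is only an illustration in Python of the sequence in this post; the function name and step strings are hypothetical and are not the Command Console's or Master Server's actual code.

```python
# What each zone server writes to the database before it exits (from the post).
ZONE_FLUSH_ITEMS = ("player characters", "ships", "city storage", "city markets")


def plan_spin_down(zone_names):
    """Return the ordered list of steps for a safe cluster spin down."""
    steps = ["shut down Login Server"]  # first, so no new players can log in
    for zone in zone_names:
        steps.append(f"{zone}: disconnect all players")
        for item in ZONE_FLUSH_ITEMS:
            steps.append(f"{zone}: update {item} to database")
        steps.append(f"{zone}: shut down")
    steps.append("shut down Tracker Server")
    # The database goes down only after every zone has saved its state.
    steps.append("Database Server: write to disk")
    steps.append("shut down Database Server")
    steps.append("power off Master Server")
    return steps


plan = plan_spin_down(["zone 0"])
```

The important property is the ordering: every zone finishes its database updates before the Database Server writes to disk and exits, so nothing is left half saved.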

After this is added, I'll implement automatic spin down of empty zones, which is just a single Zone Server going through the same process as above.  Next will be having the Master Server power on zones when requested by the Login Server or a Zone Server because a player needs to enter the zone, and finally transitions between zones as the player travels the map.  That will be exciting because a lot of the fun in Broadside is just sailing and exploring, but you're currently limited to zone 0's English Channel area map.  I'll then work on implementing the rest of Northern Europe, followed by the Mediterranean and West Africa, eventually getting to India and East Asia.  This will allow a lot of the major trade routes to be tested and bring in more feedback on the game.  It will also allow a lot of exploration, with hours of sailing possible without hitting the edge of the implemented world, even though this will be only approximately 1/3 or less of the planned game world.