
Re: Lag and server status

Posted: Mon Jul 27, 2009 9:37 pm
by Jack Nickelson
I logged on in Jhelom about 45 minutes ago to try to gain some swords and related skills. After walking to the bank I realized I was facing imminent death due to the lag, so I logged out. At the same time, though, I played a lot during the day today and had no problems.

Jack Nickelson

Re: Lag and server status

Posted: Mon Jul 27, 2009 10:30 pm
by Arcott Ramathorn
Seems to have been fixed after the restart (so far). Any ideas as to what the cause was?

Re: Lag and server status

Posted: Tue Jul 28, 2009 11:10 am
by Derrick
Yes, I believe the problem has been found and corrected as of this morning's restart.

We've had a few separate yet somewhat related problems over the last week or so which have contributed to the problems we've been seeing.

Before we moved the server, system resources were getting short pretty fast. The number of items and players in game has been growing at an astonishing rate, up to 2.4 million items now, 23124 active accounts, 67801 mobiles. The memory requirements of the shard exceeded the capacity of the current operating system. On Friday 7/24/2009 at 15:34 ET, the old shard crashed for the last time with an Out Of Memory error when attempting to create some blood splatter (noted only for irony).

We were already in the process of spinning up a new server, and moved over to it somewhat in a hurry. Immediately after the move we had problems that resulted in two shard hangs; these are believed to have been network related and have not recurred since some reconfiguration was done.

Due to the improved speed of the new server and additional processor cores, it's able to do more at once than our previous box. This surprisingly led to further problems, which took me some time to find, possibly due to a severe lack of sleep, which I finally made up for on Sunday. The three incidents of prolonged, unplayable lag we've seen over the last weekend and yesterday were the result of a resource warning being generated and logged to disk. The system was inappropriately flagging certain internal resource consumption (pool sizes) as dangerous. This was corrected last night, and the problem is not likely to recur.

At this point, I have enormous confidence that all the problems we've seen in the last week have been mitigated, and with the new server we should easily be able to support upwards of 1500 connections, likely much more than that.

Re: Lag and server status

Posted: Tue Jul 28, 2009 11:19 am
by Faust
Problems should always be expected among the player base when shifting to a new server.

I am a bit curious how many items, active accounts, and mobiles existed 6 months ago.

Great news that the problems should be fixed, and good work as always, Derrick.

Re: Lag and server status

Posted: Tue Jul 28, 2009 12:02 pm
by Ronk
Derrick wrote: Yes, I believe the problem has been found and corrected as of this morning's restart.

We've had a few separate yet somewhat related problems over the last week or so which have contributed to the problems we've been seeing.

Before we moved the server, system resources were getting short pretty fast. The number of items and players in game has been growing at an astonishing rate, up to 2.4 million items now, 23124 active accounts, 67801 mobiles. The memory requirements of the shard exceeded the capacity of the current operating system. On Friday 7/24/2009 at 15:34 ET, the old shard crashed for the last time with an Out Of Memory error when attempting to create some blood splatter (noted only for irony).

We were already in the process of spinning up a new server, and moved over to it somewhat in a hurry. Immediately after the move we had problems that resulted in two shard hangs; these are believed to have been network related and have not recurred since some reconfiguration was done.

Due to the improved speed of the new server and additional processor cores, it's able to do more at once than our previous box. This surprisingly led to further problems, which took me some time to find, possibly due to a severe lack of sleep, which I finally made up for on Sunday. The three incidents of prolonged, unplayable lag we've seen over the last weekend and yesterday were the result of a resource warning being generated and logged to disk. The system was inappropriately flagging certain internal resource consumption (pool sizes) as dangerous. This was corrected last night, and the problem is not likely to recur.

At this point, I have enormous confidence that all the problems we've seen in the last week have been mitigated, and with the new server we should easily be able to support upwards of 1500 connections, likely much more than that.

I love tech talk :-) And hey... if we manage to get up to 1500 connections, then maybe the next step can be to implement era-accurate server lines. Lol.