Some strange behavior - I don't know if these are errors

trungus · 02-06-2018, 02:58 AM

Hi guys,

I'm seeing some strange behavior, but I can't confirm whether these are bugs or not.

1- When I expand the world by changing the "Pregenerate Distance" from 20 to 200, I notice two things:

First: the amount of memory used by the server (with no users) rises from 190MB to 435MB on the first start and rises even more on the following starts, up to 1069MB.
Second: once the world has already been expanded, if I restart the server I get the message "Preparing spawn (world): 2.05% (820/40000; 817.55 chunks / sec)", even if I set the spawn point in a specific place.

2- When I move the world from my laptop to a t2.micro instance on Amazon, I always receive an error message when I stop the server, but only on Amazon. The error is:

[13:57:29] DeadlockDetect: Some CS objects (4) haven't been removed from tracking
[13:57:29]   CS 0x7f0f7dc72900 / World freakworld2 clients
[13:57:29]   CS 0x7f0f7d5211d8 / World freakworld2 players
[13:57:29]   CS 0x7f0f7dc728b0 / World freakworld2 tasks

Can someone please guide me?

Best Regards
Trungus

NiLSPACE · 02-06-2018, 05:06 AM

Remember that the number of chunks the server will have to generate is the pregenerated distance squared. So with a distance of 20 the server has to load/generate 400 chunks. With a distance of 200, however, the number of chunks needed is 40000. All these chunks contain block IDs and metadata, which is why the memory usage is much higher. @xoft has a brilliant post about how memory usage works with large numbers of chunks: https://forum.cuberite.org/thread-1888-p...l#pid19670
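
As a quick check of the arithmetic above, here is a minimal Python sketch. The function name is made up for illustration; it only applies the "distance squared" rule described in this post and is not taken from the Cuberite source.

```python
def pregenerated_chunk_count(distance: int) -> int:
    """Chunks to load/generate at startup, using the 'distance squared' rule from this post."""
    return distance * distance


# Compare the default distance with the one used in this thread.
for distance in (20, 200):
    print(distance, pregenerated_chunk_count(distance))
```

Going from 20 to 200 multiplies the distance by 10 but the chunk count by 100, which matches the 40000 shown in the "Preparing spawn" message.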

The second point you make is because the world is taking so long to load/generate that the server isn't sure if something bad happened to the world. You can disable the check in the settings.ini by setting [DeadlockDetect].Enabled to 0.
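
For reference, the setting mentioned above would look roughly like this in settings.ini. This is only a fragment; the rest of the file stays as it is.

```ini
[DeadlockDetect]
Enabled=0
```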

If I were you I'd set the pregenerated distance back to something sensible like the default 20.

trungus · 02-06-2018, 06:02 AM

Hi NiLSPACE

I'll read the post about memory tonight, but I have a question: if no users are connected to the server, do the unused chunks stay in memory?

About my second point, my problem occurs when I shut down the server, just at that moment.

And one more question: once the world has been generated and all the chunks have been written to disk, will it help me to reduce the pregenerated distance?

Thanks for the fast answer.

Best Regards
Trungus

xoft · 02-07-2018, 06:14 AM

The pre-generate distance is only used upon server startup. It simply means that the server will try to have all chunks within that distance present in RAM - either generated, or loaded. Hence you get the memory usage, you're basically telling the server to hold 40k chunks in RAM. If I remember correctly, the logic is pretty dumb here and the server simply uses a cChunkStay to pregenerate chunks, meaning it really tries to fit all those chunks into RAM at the same time. Once all the chunks are prepared, the server will unload them, possibly in batches, and the memory usage will drop back to normal.
The pre-generation was not meant for such high numbers. Get the ChunkWorx plugin instead; it can generate (and re-generate) chunks in rectangular areas and it does so intelligently, holding only a few chunks' worth in memory each time. It's perfectly fine to run it while actually playing on the server.
The message you get is normal, the server is just telling you that it's trying to load all those 40k chunks.
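
To illustrate the difference between holding every chunk at once and working through an area a few chunks at a time, here is a minimal Python sketch. It is not ChunkWorx's or Cuberite's actual code; `chunk_batches`, `batch_size` and the coordinate ranges are hypothetical names and values chosen for the example.

```python
from itertools import islice, product
from typing import Iterator, List, Tuple

ChunkCoord = Tuple[int, int]


def chunk_batches(min_x: int, max_x: int, min_z: int, max_z: int,
                  batch_size: int) -> Iterator[List[ChunkCoord]]:
    """Yield the chunk coordinates of a rectangular area in small batches.

    Only one batch is materialized at a time, so peak memory depends on
    batch_size rather than on the total size of the area.
    """
    coords = product(range(min_x, max_x + 1), range(min_z, max_z + 1))
    while True:
        batch = list(islice(coords, batch_size))
        if not batch:
            return
        yield batch


# A 200 x 200 area (40000 chunks) processed 64 chunks at a time.
total = 0
largest = 0
for batch in chunk_batches(-100, 99, -100, 99, batch_size=64):
    # A real server would generate and save these chunks here, then release them.
    total += len(batch)
    largest = max(largest, len(batch))

print(total, largest)
```

The whole area is still covered, but no more than `batch_size` chunks are ever held at once, which is the behavior described for the plugin, as opposed to keeping all 40k chunks in RAM together.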

As for the second point, that is interesting, it would indicate some problem with the code, but if it happens only on an Amazon VPS, it's difficult to fix. However, it shouldn't be anything critical, it's just about some house-keeping. Is there any relevant message from the server just before these? I'd only expect to see these messages when the server is aborting its shutdown due to some other error, possibly crashing upon shutdown.

Are you running a binary version downloaded off our main page, or did you compile from the source? Perhaps if you could compile from source in Debug mode and try that, it would tell us more.

trungus · 02-07-2018, 08:41 PM

Hi xoft

I understand now. My mistake was thinking that the pregenerated distance was used for world generation, and you are right (xoft & NiLSPACE): when I change the value of the pregenerated distance back to 20, the memory consumption goes back to normal.

NiLSPACE, thanks for the link, it was very useful.

About my executable: I always compile it from source. Right now I have the source pushed on January 22.

About the error message, I have a theory but I have no idea how to prove it. Let me explain. As I mentioned before, I have Cuberite running on an Amazon free account. The instance for this account has only one Xeon E5 core, just one, and only 1GB of RAM, so with a pregenerated distance of 200 Cuberite needs a lot of memory, forcing the OS to swap out everything else, even Cuberite's own pages if they aren't locked in memory. When Cuberite needs some of this data, the CPU can't bring it back before the deadlock detection raises an alert. For this reason the error message only appears on the Amazon instance and not on my laptop with 4 cores and 8GB of RAM.

What do you think?

Regards
Trungus

xoft · 02-07-2018, 09:31 PM

That's very unlikely to be the cause. The error messages are not from the deadlock detector doing its work, they are from the cleanup phase, when the deadlock detector is being terminated as part of the server shutdown. Let me explain in more detail:
Our deadlock detector works by checking whether each world's clock (game tick) gets incremented at least once during the pre-set interval (20 seconds by default). If it doesn't, the detector considers this a deadlock and aborts the server. At the time of the abort, it also writes some useful information to the console, part of which is which mutex (CriticalSection / CS in our Windows-originated terminology) is locked by which thread - this usually points directly at the deadlock loop. Because there's no generic way to enumerate all mutexes used by a program, we use manual registration - hand-selected mutexes get registered with the deadlock detector upon their creation, and unregistered upon their destruction. And the messages you get mean that the detector is being destroyed while some of the mutexes haven't unregistered yet. Since the deadlock detector is destroyed very late in the server shutdown sequence, I believe it has no negative impact on the data (nothing should get lost in the world saves or player saves), even if it means some problem with the code. Also it means I have no idea what could be causing this, besides some really wild theories about exceptions.
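
The two mechanisms described above (the tick check and the manual mutex registration) can be modeled roughly as follows. This is a toy Python sketch, not Cuberite's C++ implementation: the class, its method names and the demo mutex names are all invented for illustration, and only the 20-second default and the shape of the message come from this thread.

```python
from typing import Dict, List


class DeadlockDetectSketch:
    """Toy model of the tick check and mutex tracking (not Cuberite's code)."""

    def __init__(self, interval_sec: float = 20.0) -> None:
        self.interval_sec = interval_sec
        self._tracked: Dict[int, str] = {}        # mutex id -> human-readable name
        self._last_age: Dict[str, int] = {}       # world name -> last seen tick count
        self._stalled_for: Dict[str, float] = {}  # world name -> seconds without progress

    def track(self, cs_id: int, name: str) -> None:
        """Called when a hand-selected mutex is created."""
        self._tracked[cs_id] = name

    def untrack(self, cs_id: int) -> None:
        """Called when that mutex is destroyed."""
        self._tracked.pop(cs_id, None)

    def check_world(self, world: str, age: int, elapsed_sec: float) -> bool:
        """Return True if the world's tick counter has not advanced for a full interval."""
        if self._last_age.get(world) != age:
            # The clock moved (or this is the first check): reset the stall timer.
            self._last_age[world] = age
            self._stalled_for[world] = 0.0
            return False
        self._stalled_for[world] += elapsed_sec
        return self._stalled_for[world] >= self.interval_sec

    def shutdown_report(self) -> List[str]:
        """Mutexes still registered when the detector itself is destroyed."""
        return [f"CS {cs_id:#x} / {name}" for cs_id, name in self._tracked.items()]


detector = DeadlockDetectSketch()
detector.track(0x1, "World demo clients")
detector.track(0x2, "World demo players")
detector.untrack(0x1)              # destroyed properly before shutdown
print(detector.shutdown_report())  # the second mutex was never unregistered
```

In this model the shutdown report is independent of the tick check: a mutex can be left registered even if no world ever stalled, which is why the messages at shutdown do not by themselves indicate that a deadlock was detected.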

Since you're building your own executable, you could probably help us out by compiling in Debug mode (-DCMAKE_BUILD_TYPE=Debug passed to cmake) and running the server under gdb within the Amazon instance, setting a breakpoint to the code that prints the error messages (src/DeadlockDetect.cpp, line 42) and getting a stacktrace of all the threads when that breakpoint is hit.

trungus · 02-13-2018, 11:46 PM

Hi xoft, I'm trying to set the breakpoint, but when I use gdb's "l" command to go to cDeadlockDetect::~cDeadlockDetect(), I only get this message:

37 in ../../../src/libgcc/config/i386/crtfastmath.c

Do you know what the problem is?

xoft · 02-14-2018, 12:25 AM

Try "bt" and then "thread apply all bt", and possibly "info threads". (Taken from the "How to report crashes on Linux" thread, https://forum.cuberite.org/thread-631.html . I'm not too skilled at gdb.)
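
Put together with the earlier suggestion, a gdb session could look roughly like the following. Treat it as a sketch: it assumes the server was built in Debug mode and started under gdb, and the line number is the one quoted earlier in this thread, which may differ in other revisions of the source.

```
(gdb) break src/DeadlockDetect.cpp:42
(gdb) run
    ... stop the server as usual and wait for the breakpoint to be hit ...
(gdb) bt
(gdb) thread apply all bt
(gdb) info threads
```

The output of "thread apply all bt" is the part worth posting, since it shows what every thread was doing at the moment the messages were about to be printed.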