Cuberite Forum
[FIXED] Found a deadlock - Printable Version

+- Cuberite Forum (https://forum.cuberite.org)
+-- Forum: Cuberite (https://forum.cuberite.org/forum-4.html)
+--- Forum: Development (https://forum.cuberite.org/forum-13.html)
+--- Thread: [FIXED] Found a deadlock (/thread-404.html)



[FIXED] Found a deadlock - FakeTruth - 03-23-2012

Just writing down what locks what..

Code:
cSocketThread
Waits for cChunkMap::m_CSLayers (owned by ChunkSender) in cChunkMap::RemoveClientFromChunks
Owns cClientHandle::m_CSChunkLists

ChunkSender
Waits for cClientHandle::m_CSChunkLists (owned by cSocketThread) in cClientHandle::Send
Owns cChunkMap::m_CSLayers

ServerTickThread
Waits for cChunkMap::m_CSLayers  (owned by ChunkSender) in cChunkMap::Tick

The deadlock is between cSocketThread and ChunkSender

ChunkSender stack:
[Image: Screenshot-2012-03-22_16.18.14.png]

cSocketThread stack:
[Image: Screenshot-2012-03-22_16.19.02.png]

Would a dump work here as well?


RE: Found a deadlock - xoft - 03-23-2012

This one will be tricky to fix Smile I pretty much know what the issue is, so no dump is necessary.
I think I've found the trick. In the client's RemoveFromAllChunks() I attempt remove the client from all chunks in the world, instead of having to lock the list of loaded chunks and using that.
The plus side is, removing the client doesn't need the m_CSChunkList locked anymore.
The minus side is that this solution makes a performance hit by traversing all chunks in memory, instead of only the client's ones.

I think the plus side outweighs the minus side for this solution.

Another solution I was considering was locking the list, making a copy, then removing clients from the chunks in the copy list; and cycling this until the chunklist was empty (in the case that someone somehow added the client to another chunk while the copy list was being traversed). That would incur too many locking and unlocking and I think that my final solution outperforms this one.
Note about the mentioned outperforming - code that removes a client from a list of chunks still needs to find a chunklayer for each chunk, which means traversing the list of chunklayers for each chunk in the list. That could mean an O(n * m) efficiency (n = number of chunks, m = number of chunklayers) which is actually O(n^2) because n and m are correlated. My fix is an O(n) solution (and actually an Theta(n) Wink