Thread ownership of a World

FakeTruth · (This post was last modified: 06-09-2013, 10:09 PM by FakeTruth.)

There's something that is bothering me in MCServer: There is no thread that is the real owner of the world data. There are CriticalSections locking things all over the place. Any thread can just access the blocks in a world or manipulate entities and players whenever it likes.

I suggest making it so that the world data can only be accessed by a single thread and get rid of all the CriticalSections. The most logical thread that comes to my mind is the Tick thread.

In order to still access the world data from other threads we can add a function to cWorld, RunOnTickThread(). It has a callback as a parameter and that callback will be executed on the Tick thread while it is idle (after it has ticked the world, entities, etc.). The Tick thread will have a queue of these callbacks that will be executed in the order they were added. This way we will only need a single CriticalSection. This CriticalSection will protect the queue of callbacks.

There should be no performance penalty by doing this. The way it is right now the world data can only be accessed by one thread at a time as well. The only thing this solution does is make sure the data is always accessed by the same thread instead of several different threads.

As far as I know this solution should work fine and will make data ownership much clearer. There is however one problem I know of right now.. sigh.. plugins.

Lua plugins (like the world data) can only be accessed by a single thread at a time. A single plugin cannot be accessed by multiple threads so the callback method will not work the way it is now. The callback will always be executed by the Tick thread, while the OnChat function for example might be called from the Socket threads. The Socket thread will lock the plugin, the plugin requests execution on the Tick thread and you got a deadlock.

A solution to the plugin problem is using yield functions (https://forum.cuberite.org/showthread.php?tid=768) which allows a thread to give up ownership of the plugin while another thread is accessing it. This solution works, but in my opinion this is a bad solution for the same reason I created this thread. A plugin can only be accessed by one thread at a time, why not make a thread owner of the plugin?

The better solution for the plugins is in my opinion to also make sure it is always executed from the same thread and get rid of the CriticalSections surrounding them. Plugins tend to change/enhance gameplay and interact with the world and players so the most logical thread to give ownership of the plugins to is the Tick thread as well.

Any thoughts on this?

xoft · 06-09-2013, 10:18 PM

With this attitude, we'll soon be going back to single-threaded model Tongue

I'll think about this for a moment before writing an irrational fast answer Smile

FakeTruth · 06-09-2013, 10:27 PM

I know, but the problem is that stuff is only accessed by a single thread at a time, so there's no point in having multiple threads, at least the way it is right now.

xoft · 06-10-2013, 12:18 AM

That's not true. The chunkmap *is* accessed from multiple threads - the tick thread, the generator thread, the storage thread and the socket threads. Therefore, it needs locking.

I'm failing to see what exactly is your point with this. It feels like quite a big change, yet I cannot see any advantage that it would bring, aside from being "academically more elegant". Also, why do you want an object to be owned by a thread? I think a much better approach is to have objects own other objects, with the owned objects possibly representing threads. That makes much more sense, since usually you have one main thread in an app, and then a lot of local worker threads, that operate locally on several objects.

One thing that I've noted about your proposed design is that it's prone to cache contention. You're effectively shuffling data between threads; those threads may run in parallel on different cores, so the data that they access must be synchronized across each core's caches. That slows operations quite considerably, if the code does that, it can run orders of magnitude slower. Since that is quite a delicate and intricate thing, I'd rather not mess with it and rather go around it.

That said, it's true that there are deficiencies in the current object model that we're using. I'm just too lazy to rewrite stuff. I think the cRoot should own the cServer object and cWorld objects; each cWorld should own its own tick thread etc. Also, the main app thread should call cRoot:Initialize() and the method should create threads and then return; the main app thread should be doing what now the InputThread does (so when we decide to wrap some kind of a GUI around it, it's easy to do so).

FakeTruth · (This post was last modified: 06-10-2013, 12:26 AM by FakeTruth.)

Yes I know the chunkmap is accessed from multiple threads, but is it being accessed in parallel?
I could be mistaken and that's why I'm posting this but AFAIK the chunkmap can only be accessed by one thread at the same time so there's not actually any performance gained by allowing all threads into the chunkmap. It just creates confusion.

xoft · 06-10-2013, 01:06 AM

Yes, it's accessed sequentially, but how will your proposal change that? Can you sketch out some pseudocode?
The performance is gained when the threads don't access the chunkmap - the generator for example, it chomps away happily at noise calculations and voronoi diagrams, and the, once it has produced a chunk, puts it into the chunkmap. It is only logical that it doesn't need to lock the chunkmap unless it actually needs it. So the tickthread may be running at the same time, processing the fluid simulator, and then once they switch, suddenly there's a new chunk, but hey, it's almost instantaneous.

FakeTruth · (This post was last modified: 06-10-2013, 01:55 AM by FakeTruth.)

My proposal does not change that it is accessed sequentially, it simply makes it more clear. If you run a command through a function called RunOnTickThread() you know exactly what is going to happen, it will be executed sequentially on the Tick thread. Right now a person might think setting blocks is done in parallel while in reality threads are stalling.

My solution to this confusion simply makes things much easier to comprehend. The chunkmap will only be accessed by the same thread, always, so we can get rid of all the CriticalSections surrounding it.

Alright, some pseudocode. Imagine that worlds have their own thread already.

class cWorld
{
public:
    // ...
    void RunOnTickThread(cCallback* a_pCommand)
    {
        if (GetCurrentThreadId() == GetTickThreadId())
        {
            a_pCommand->Run();
            return;
        }
        m_CommandQueueCS.Lock();
        m_CommandQueue.push_back(a_pCommand);
        m_CommandQueueEvent.Set();
        m_CommandQueueCS.Unlock();
        a_pCommand->GetDoneEvent().Wait();
    }

    cCallback* GetNextCommand()
    {
        cCSLock(m_CommandQueueCS);
        if (m_CommandQueue.empty()) return NULL;
        //
        CCommand* Command = m_CommandQueue.front();
        m_CommandQueue.pop_front();
        return Command;
    }
    // ...
private:
    cEvent                  m_CommandQueueEvent;
    cCriticalSection        m_CommandQueueCS;
    std::vector<cCallback*> m_CommandQueue;
};



void cWorld::TickThread()
{
	cTimer Timer;
    long long msPerTick = 50;
    long long LastTime = Timer.GetNowTime();

    while (...)
    {
        // Update world
        long long NowTime = Timer.GetNowTime();
        float DeltaTime = (float)(NowTime-LastTime);
        TickWorld(DeltaTime);
        long long TickTime = Timer.GetNowTime() - NowTime;


        // Handle command queue
        long long CommandTimeLeft = msPerTick - TickTime;
        do 
        {
            long long BeforeCommandTime = Timer.GetNowTime();
            //
            while (cCallback* c = GetNextCommand())
            {
                c->Run();
                c->GetDoneEvent().Set();
            }
            //
            long long CommandDuration = Timer.GetNowTime() - BeforeCommandTime;
            CommandTimeLeft -= CommandDuration;
        }
        while(m_CommandQueueEvent.Wait(MAX(CommandTimeLeft,0)) != TIMED_OUT);

        LastTime = NowTime;
    }
}

xoft · 06-10-2013, 03:25 AM

And you want to do this for each access to the chunkmap? That's an enormous overhead. You'd need to write a special callback for everything, each and every action.

It might be useful to have this mechanism for when we need to offload something to the tick thread, but I don't think it should be the main interface.

FakeTruth · (This post was last modified: 06-10-2013, 03:31 AM by FakeTruth.)

No, only use this when you need something from the chunkmap while in a different thread.
For example the simulators don't need to lock anything because they are executed in the Tick thread already.

xoft · 06-10-2013, 03:31 AM

Which is almost always