Posts: 254
Threads: 16
Joined: Oct 2013
Thanks: 3
Given 20 thank(s) in 18 post(s)
Ok, so here is a thought that just came right out of the blue for me. I don't know if it's viable or not; let me know what you think:
The thing that takes up the most space on the hard drive for Minecraft is the world files. They contain all of the chunk data. However, terrain generation is deterministic: given the same seed, we can generate the same terrain over and over without a single block out of place, so storing data we can reproduce is redundant, except that it saves processor time.
What if instead of saving the entire chunk we only stored block changes? When a chunk was needed, we could generate it and then apply the changes necessary to transform it into what it was when the chunk was saved last. And we could still store data about entities and such no problem. This could allow for extremely large worlds with small data files.
Chunk generation could be farmed out to multiple threads to decrease load times.
Let me know what y'all think.
Posts: 1,469
Threads: 57
Joined: Jul 2012
Thanks: 66
Given 127 thank(s) in 108 post(s)
That would be really cool, but probably a bit buggy. I'd still go with it as an option, as it could work and would really reduce disk usage. (It shouldn't be the default, because it makes the saves incompatible with vanilla.)
Posts: 6,485
Threads: 176
Joined: Jan 2012
Thanks: 131
Given 1075 thank(s) in 852 post(s)
This would make the server very CPU-heavy, and it already is.
The storage is not only for the block types, but also for the lighting. I did some profiling just a few days ago and found that generating uses about 20% CPU and lighting uses about 60% CPU. By removing the storage for these, you'd make the server generate and light chunks an order of magnitude more often.
If we went with this scheme, each chunk save would need to generate the original chunk data, compare it with the current contents, and save the differences. To load, we'd need to generate the original chunk data, then load and apply the differences. That's two extra generations just to make a slight saving in savefile size.
This gets even worse when the chunk needs lighting, because lighting requires the blocktypes of all 3x3 neighbors. So you need to load 9 chunks (including generating them) just to light one chunk. Sure, when neighboring chunks are loaded the cost gets distributed among them, approaching a 1:1 ratio of generating to lighting, but typical loading patterns show that chunks aren't usually loaded in neighboring groups.
Also, consider that there's already compression at work while saving. You might not believe it, but the compression actually helps *a lot*. And there's no telling whether the differences would compress as well as the original data; I have a feeling they wouldn't.
Posts: 783
Threads: 12
Joined: Jan 2014
Thanks: 2
Given 73 thank(s) in 61 post(s)
For the CPU load, I've got an algorithm for GPU lighting that could seriously reduce the CPU cost of lighting. As for generating the differences, that's easy on a GPU.
So for systems without GPU support it's not worth it, but it might be worth trying as an option on builds with GPU offloading.
Posts: 783
Threads: 12
Joined: Jan 2014
Thanks: 2
Given 73 thank(s) in 61 post(s)
One possibility, if we can separate out the generator, would be to implement this as a separate tool that could be used to compress chunks that aren't in use. Then it's only a matter of generating on load.
Posts: 953
Threads: 16
Joined: May 2013
Thanks: 67
Given 106 thank(s) in 90 post(s)
Save only the chunks that someone changed; unsaved ones will regenerate. For lighting, assign everything a light level of 15. If the client doesn't calculate lighting itself, nobody will really notice anyway; if it does, players will just attribute it to their brightness setting.
Posts: 254
Threads: 16
Joined: Oct 2013
Thanks: 3
Given 20 thank(s) in 18 post(s)
Technically, you don't need to regenerate the chunk to compare differences. You could just keep a running tally of block changes. For instance, when a block in a chunk is changed, we flag it with a bool, and on save we only write the flagged blocks. Or we store the index of each changed block somewhere and grab the block info when we save.
As for compression, I don't know how the current compression works, but how much further can you compress a block index plus a block type? That's all you would need to store.
Posts: 783
Threads: 12
Joined: Jan 2014
Thanks: 2
Given 73 thank(s) in 61 post(s)
A bool per block is 64 KiB of RAM per chunk (65536 blocks at one byte each).
Posts: 254
Threads: 16
Joined: Oct 2013
Thanks: 3
Given 20 thank(s) in 18 post(s)
Or an eighth of that if we used a bitfield: one bit per block instead of one byte.