
August Film Roundup: Another month full of major progress on major projects, but I managed to squeeze in four features:

The Minecraft Geologic Survey: I've been waiting for all the pieces to go into place before writing about this on NYCB, and now the pieces are in place. The lightning strikes my castle laboratory and the Minecraft Geologic Survey rises! (See Fig. 1.)


Fig. 1

Back in May I announced that I'd downloaded 65,000 Minecraft maps from the official Minecraft forum, and used the data to make my @MinecraftSigns bot. Later I took over Allison Parrish's defunct @minecraftebooks and revitalized it with _ebooks-style quotes from the books found in Minecraft worlds. (Plus, as of a few days ago, command block outputs that incorporate the names of followers, Exosaurs-style.)

But all the while, in the background, I was downloading. Worlds, screenshots, mods, player skins, texture packs... everything with a URL. I ended up with about two terabytes of data, an amount that here in 2014 is not difficult for me to store but is very difficult to transfer or process.

To get the signs and the books for my bots, I had to load every Minecraft world into Python and go through every chunk looking for entities. I ended up with about 180,000 worlds, and iterating over them all was a very time-consuming process. Fortunately, I had two more projects that would amortize all that computer time.
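To give a sense of what "go through every chunk" involves, here is a minimal sketch using only the Python standard library. It is not the code used for the survey, and the post doesn't say which library was involved. It assumes worlds stored in the region file format (`.mca` or `.mcr`), and the function name `iter_region_chunks` is made up for illustration. It only unpacks each chunk's raw NBT bytes; finding signs and books would still require an NBT parser on top.

```python
import gzip
import struct
import zlib

SECTOR = 4096  # region files are laid out in 4 KiB sectors


def iter_region_chunks(data: bytes):
    """Yield (x, z, nbt_bytes) for every chunk stored in one region file.

    `data` is the full contents of a region file. x and z are the chunk's
    position inside the region (0-31). Unreadable chunks are skipped.
    """
    if len(data) < 2 * SECTOR:
        return  # too short to contain the 8 KiB header

    for index in range(1024):
        # Each location entry: 3-byte sector offset + 1-byte sector count.
        entry = struct.unpack_from(">I", data, index * 4)[0]
        offset, count = entry >> 8, entry & 0xFF
        if offset == 0 or count == 0:
            continue  # this chunk was never generated

        start = offset * SECTOR
        if start + 5 > len(data):
            continue  # header points past the end of the file

        # Chunk header: 4-byte length (includes the type byte), 1-byte type.
        length, compression = struct.unpack_from(">IB", data, start)
        payload = data[start + 5 : start + 4 + length]

        try:
            if compression == 1:
                raw = gzip.decompress(payload)
            elif compression == 2:
                raw = zlib.decompress(payload)
            else:
                continue  # unknown compression scheme
        except (OSError, EOFError, zlib.error):
            continue  # corrupt chunk; keep going

        yield index % 32, index // 32, raw
```

Running this over every region file in every world is where the computer time goes: each world can have many region files, and each region file holds up to 1,024 compressed chunks.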

Both projects required that I take "core samples" of each world, extracting individual chunks that were likely to be interesting and forming a new world (like the one pictured above) containing only those chunks. The resulting dataset is representative of the full more-than-a-terabyte package of original worlds, but because it's just a very tiny sample, the whole thing weighs in at a comparatively slim 12 gigabytes.

That's small enough to go on the Internet Archive, and small enough for you to download it and use it in your own project. I wrote a detailed guide to the data, which includes not only 170,000 synthetic Minecraft worlds but a big JSON file (also available on its own) containing all the metadata and sign text and other things you'd need to do a text-based project.

The other project is The Reef, a series of Minecraft maps that combine the chunks obtained from the survey into mashup maps that incorporate designs from many different authors. For instance, you've got The Reef #1, which sticks spawn chunks from 10,000 different maps together to form a (mostly) naturally-sprawling terrain. Or maybe you'd prefer the Skyburbs, a thousand Skyblock maps jammed next to each other.
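As a rough illustration of the "stick chunks together" step, the sketch below assigns each sample a position on a near-square grid of chunk coordinates. This is only the simplest possible placement, not how The Reef maps are actually laid out (the post doesn't describe that). The function name `grid_offsets` and the `sample_width` parameter are hypothetical.

```python
import math


def grid_offsets(num_samples: int, sample_width: int = 1):
    """Yield (sample_index, chunk_x, chunk_z) for a near-square grid.

    `sample_width` is the assumed width, in chunks, of each square sample.
    """
    if num_samples <= 0:
        return
    # Smallest grid side that fits all samples: ceil(sqrt(num_samples)).
    side = math.isqrt(num_samples - 1) + 1
    for i in range(num_samples):
        row, col = divmod(i, side)
        yield i, col * sample_width, row * sample_width
```

With 10,000 single-chunk samples this gives a 100 by 100 grid of chunks. Copying each sample's blocks to its assigned position is a separate job for whatever world-editing library you use.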

I've got plenty more ideas for Reef maps, but now that the data is available I think this is a good point to put the project on pause for a while. I will be publishing the code I use to make my Twitter bots and the Reef maps, to encourage you to play with the data and do your own thing.

I'm concerned about the Minecraft servers that have been shutting down since Mojang changed their EULA to include strict rules on monetization. People have been giving a lot of attention to the Microsoft buyout, but the EULA change is what's affecting servers right now. I would really like to offer an archive service for Minecraft servers that are being shut down (plus any original worlds that people have lying around on their hard drives), but I don't see a good way to get the word out. It's not like the typical Archive Team project where you can go into a server that's shutting down and download everything. The server owner has to take the initiative. Bandwidth and storage would also become a problem for me at that scale. So this is more of an open question than something I know how to solve. It may not get solved.



Unless otherwise noted, all content licensed by Leonard Richardson
under a Creative Commons License.