Persistence - what would you use it or need it for?

Discussion in 'Plugin Development' started by EvilSeph, Feb 14, 2011.

Thread Status:
Not open for further replies.
  1. Offline

    NathanWolf

    One thing I want to point out, because it is a very fine point, is that an object persistence engine is a separate entity from a database.

    That being said, I completely understand what you mean- this is akin to HQL/SQL. Devs use HQL and Hibernate; Hibernate generates database-specific SQL from it depending on what you're connected to.

    I will say, though, that writing a parser for this kind of thing is a lot more work than is really necessary. We're all devs here, we can use a programmatic API :) It's still basically a parser, it's just a parser that gets to deal in atomic units instead of text.

    It's kind of like the part of the compiler that sits behind the parser, if that helps you visualize what I'm talking about.

    Hibernate also has an API like this that you can use, and avoid HQL (as well as SQL, of course) altogether if you want. It's basically an OO representation of queries, you have a Query object that you add Filters to, request Fields from, that sort of thing.
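A minimal sketch of what such an object-oriented query could look like in Java. Every name here (`SketchQuery`, `whereEquals`, `select`, `run`) is hypothetical and invented for illustration; this is not Hibernate's API and not anything that exists in Persistence. It only shows the idea of adding filters and requesting fields as objects instead of parsing query text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical programmatic query: filters and requested fields are plain
// objects, so no query text ever has to be parsed.
final class SketchQuery {
    private final List<Predicate<Map<String, Object>>> filters = new ArrayList<>();
    private final List<String> fields = new ArrayList<>();

    // Add a filter: keep only rows whose `field` equals `value`.
    SketchQuery whereEquals(String field, Object value) {
        filters.add(row -> value.equals(row.get(field)));
        return this;
    }

    // Request a field to be included in each result row.
    SketchQuery select(String field) {
        fields.add(field);
        return this;
    }

    // Evaluate against in-memory rows. A real engine would instead translate
    // the same filter/field objects into whatever its backend understands.
    List<Map<String, Object>> run(List<Map<String, Object>> rows) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            boolean keep = true;
            for (Predicate<Map<String, Object>> filter : filters) {
                if (!filter.test(row)) {
                    keep = false;
                    break;
                }
            }
            if (!keep) {
                continue;
            }
            // Project only the requested fields into the result row.
            Map<String, Object> projected = new LinkedHashMap<>();
            for (String field : fields) {
                projected.put(field, row.get(field));
            }
            out.add(projected);
        }
        return out;
    }
}
```

Usage would read something like `new SketchQuery().whereEquals("world", "nether").select("name").run(rows)`.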

    Personally, I much prefer working with that sort of system. The point of having an abstract data store, really, is to shield you from having to do things like write XQL code.

    My 2c- for what it's worth, I have not done much work in this area in Persistence. Very little, actually- so in terms of "what's already out there", other than the few very basic SQL interface plugins that exist now, this is pretty much a blank slate.... so keep the ideas coming! :D
    --- merged: Feb 19, 2011 1:23 AM ---
    This is the model I've been using as well- for instance, Persistence itself manages some global data classes, such as PlayerData, which "mirrors" Player.

    Then, in NetherGate, I've got "NetherPlayer". It has a reference to a PlayerData as its id. I'm toying with the idea of allowing inherited persistence, so I could just do "NetherPlayer extends PlayerData" and not have an additional id .... this is not implemented yet, though.

    I do the same thing with WorldData/NetherWorld and so forth.

    Anyway, this lets me "tack on" player-specific data. Since PlayerData is its id, it's always a 1:1 relationship, and you can always easily get at the PlayerData instance from a NetherPlayer (there's some handy stuff in there- last recorded Location, last login/logout times, etc).
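The shape of that model, sketched in plain Java. The class names and fields below are stand-ins made up for illustration; the real PlayerData and NetherPlayer classes are not shown in this thread, and no persistence annotations or engine calls are included.

```java
// Hypothetical stand-in for the globally managed class that mirrors Player.
final class PlayerDataSketch {
    final String name;      // the player's name
    long lastLogin;         // example of shared, engine-managed data

    PlayerDataSketch(String name) {
        this.name = name;
    }
}

// Hypothetical stand-in for a plugin-specific class keyed by PlayerData.
final class NetherPlayerSketch {
    // The PlayerData reference doubles as this object's id, so the
    // relationship is always 1:1.
    private final PlayerDataSketch id;

    int portalsUsed;        // plugin-specific data "tacked on" to the player

    NetherPlayerSketch(PlayerDataSketch id) {
        this.id = id;
    }

    // The shared data is always reachable from the plugin-specific object.
    PlayerDataSketch getPlayerData() {
        return id;
    }
}
```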

    Seems to work pretty well so far!
     
  2. Offline

    amkeyte

    Well from a more use-case perspective, I plan to use persistence for my primary data storage system. I'm not a database programmer, but I've got a slight clue.

    Things that I don't want to do are save data in random files in my plugin folder. I want a neat and tidy place to save anything I can edit as an admin in-game -> player objects including permissions, loads of blocks for back-up purposes (a-la BorderGuard). I would use it to save the region coordinates of a map that define player permissions for areas, setting up PvP teams, chat options, pre-built automated chat messages, and pretty much anything else that I didn't want hard-coded.

    I would like to see a base interface with methods that make this easy for commonly used bukkit classes:
    Block.persist()
    Block.persistDelete()
    (as examples)

    At the very least I would like to see something that allows this to work very simply if not as part of the actual objects:
    server.persist(<native types>|Block|Player|Location|World|SomeOtherPersistableClass|etc...)

    In fact this may be more appropriate, and less bloated, in the case that someone wants to use their own method of saving data.

    It would also be useful to have a base persistable class that standardized the process of saving and reproducing persisted bukkit objects (for instance needing a reference to a valid world instance to recreate a Location, a server instance to recreate a World object, etc...)
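A rough sketch of why restoring needs context, using stand-in types rather than the real Bukkit classes. `WorldStub`, `LocationStub`, `RestoreContext` and `LocationCodec` are all hypothetical names; the point is only that a saved Location holds a world name, and something must supply the live world instance when the object is rebuilt.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for Bukkit's World and Location.
final class WorldStub {
    final String name;

    WorldStub(String name) {
        this.name = name;
    }
}

final class LocationStub {
    final WorldStub world;
    final int x, y, z;

    LocationStub(WorldStub world, int x, int y, int z) {
        this.world = world;
        this.x = x;
        this.y = y;
        this.z = z;
    }
}

// Hypothetical context that hands out the live objects needed to rebuild
// persisted ones (here: a world looked up by name).
interface RestoreContext {
    WorldStub worldByName(String name);
}

final class LocationCodec {
    // Save only simple values; the world is reduced to its name.
    static Map<String, Object> save(LocationStub loc) {
        Map<String, Object> out = new HashMap<>();
        out.put("world", loc.world.name);
        out.put("x", loc.x);
        out.put("y", loc.y);
        out.put("z", loc.z);
        return out;
    }

    // Rebuilding needs the context to turn the name back into a live world.
    static LocationStub restore(Map<String, Object> data, RestoreContext ctx) {
        WorldStub world = ctx.worldByName((String) data.get("world"));
        if (world == null) {
            throw new IllegalStateException("world not loaded: " + data.get("world"));
        }
        return new LocationStub(world,
                (Integer) data.get("x"), (Integer) data.get("y"), (Integer) data.get("z"));
    }
}
```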

    ok.. break time's over :)
     
  3. Offline

    Don Redhorse

    well I guess most of the discussion atm focuses around the nuts and bolts (storage engine, implementation etc) but not about what EvilSeph asked about:

    I think we all know that persistence will store data and keep it somewhere, how it does store this and where it is kept is not the issue atm.

    But what do you guys need? And what should the Persistence NOT be used for?

    Need:

    General Default Objects and Values, for example

    Player:
    GUID (ServerSpecific)
    PlayerID (Minecraft Specific)
    Playername (Minecraft Specific)
    NickName (ServerSpecific?)
    WorldGuid (ServerSpecific)

    World:
    GUID (ServerSpecific)
    Name (ServerSpecific)
    Location (ServerSpecific)

    Inventory: (Yeah I think you thought why it is not in the Player Object) :rolleyes:
    GUID (ServerSpecific)
    Player (PlayerGuid)
    World (WorldGuid) (yeah, for every world a unique Inventory)
    InventorySlot1 etc

    Home:
    GUID (ServerSpecific)
    Owner (PlayerGuid/GroupGuid)
    World (WorldGuid)
    Location (x,y,z, etc)

    Groups:
    GUID (ServerSpecific)
    Groupname (ServerSpecific)
    Members (PlayerGuid / GroupGuid)
    World (WorldGuid) yeah different groups per world

    Objects / Classes:
    GUID (ServerSpecific)
    Object / Classname (ServerSpecific)
    PluginGUID (PluginSpecific)
    Obit

    Schema:
    GUID (ServerSpecific)
    Name (ServerSpecific)
    Plugin (PluginGUID)
    Object / Class (Object / Class GUID)
    Extension (PluginSpecific)
    Relies on (PluginGUID)
    Required by (PluginGUID)

    So do we need more Objects which should be STANDARD? Perhaps Money? So that several different plugins could reference it? Yeah I know that there are plugins which don't have money as a currency so does this make sense?

    Of course these Objects could be enhanced by other plugins, and the Values could be enhanced too.

    But I think a default set should be defined and that all changes should be discussed, so that a plugin developer who wants to enhance the default classes needs to get an OK from other developers first. Every developer can of course create his own Objects and Classes, for example a chat plugin could create

    ChatStore:
    GUID (ServerSpecific)
    Owner (PlayerGuid)
    Receiver (PlayerGuid / GroupGuid / WORLD) (WORLD = everybody)
    Timestamp (Time and Date)
    Text (Text goes here)



    he would just need to make sure that the object / class name isn't already taken; again, a central storage for this would be cool, solving issues before they arise.

    What Persistence shouldn't be used for is MASS storage of information, for example BigBrother. Stuff like this should go into its own storage, preferably via its own storage API. It would be cool if a plugin or bukkit, for example, would supply one connection to a MySQL database that all other plugins could use.

    Persistence should be used to store core functionality data and configuration. It should contain an API for storing and retrieving information, for deleting information, for changing the schema, and for compression and garbage collection.

    For example if I delete a plugin, the Obit of that Plugin in the Object / Classes will be set so that the garbage collection can delete the now unnecessary information first before removing the Plugin.
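One way the Obit idea could work, as a small Java sketch. `ClassRegistry` and its methods are hypothetical names invented here; nothing like this is confirmed to exist in Bukkit. It assumes the registry tracks which plugin owns each class name, rejects names that are already taken, and lets a later garbage-collection pass drop everything marked with an obit.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry of stored object/class names and their owners.
final class ClassRegistry {
    private static final class Entry {
        final String pluginGuid;
        boolean obit;           // set once the owning plugin has been deleted

        Entry(String pluginGuid) {
            this.pluginGuid = pluginGuid;
        }
    }

    private final Map<String, Entry> classes = new HashMap<>();

    // Central name check: a class name can only be claimed once.
    void register(String className, String pluginGuid) {
        if (classes.containsKey(className)) {
            throw new IllegalArgumentException("class name already taken: " + className);
        }
        classes.put(className, new Entry(pluginGuid));
    }

    // Deleting a plugin only marks its classes; nothing is removed yet.
    void markPluginRemoved(String pluginGuid) {
        for (Entry e : classes.values()) {
            if (e.pluginGuid.equals(pluginGuid)) {
                e.obit = true;
            }
        }
    }

    // Garbage collection drops every marked class and reports how many.
    int collectGarbage() {
        int before = classes.size();
        classes.values().removeIf(e -> e.obit);
        return before - classes.size();
    }

    boolean isRegistered(String className) {
        return classes.containsKey(className);
    }
}
```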

    I think something like this (even if it is also going a little bit into the nuts and bolts, because that is needed to explain the concept a little) is more the information the bukkit team was looking for, right?

    Why did I choose such a structure?

    To make the information as easily accessible as possible without limiting it in the beginning or requiring too many extensions to the schema and plugin cross-interaction.

    For example the Inventory:

    There is a plugin which allows a different inventory per world, so we would need the structure I have described above. If we had attached the inventory to the player, the plugin developer would have needed to create his own object / class for it, which then would cause issues with any other plugin which directly accesses the inventory. The setup mentioned above could also be used by the gate / teleport / warp systems to only allow certain objects into the different worlds; they would only need to store the information about which objects were removed somewhere else, but all other plugins could still access the inventory for every world directly.

    Was this a good example? I don't know, didn't write those plugins... the plugin developers should know and that is what evilseph asked for.

    Information what should be stored how...

    just my 2 mio cents...
    --- merged: Feb 21, 2011 7:12 PM ---
    I forgot one thing I think:

    Schema Description.
    GUID (Serverspecific)
    SchemaObject (SchemaGuid)
    SchemaDefinition

    which would be needed to create the correct objects with types and links etc.

    no comments?
     
  4. Offline

    deltahat

    I'd use persistence to save stat data for each player. In my case, that data is key/value pairs.
     
  5. Offline

    hash

    I'm strongly opposed to any suggestion of trying to implement encryption in any part of the persistence scheme, and I feel that none of the suggestions on that topic have been thought through.

    What does hiding information from the server administrator entail? Such a system would require players to keep cryptographic keys on their own machines (which entails a looong train of key management problems with how those players are still able to login from multiple machines, how data persists if someone needs to change a key, etc), and would never allow the server software to operate on the data other than to store it. Think about it -- if the encryption key is ever on the server, how do you expect to ever do more than be mildly inconvenient to the server administrator if he wants to get it? And y'all realize that pretty much every single thing you do in a multiplayer game of minecraft is completely in-the-clear while it's on the wires between you and the server anyway, right?

    Long story short, security requires planning from end to end and it's never going to happen without serious client-side modifications. It's an extremely high investment to make and requires a great deal of specialized education, tons of testing and peer review, and should simply never be undertaken lightly.
     
  6. Offline

    eltorqiro

    I don't think we need any bukkit-included persistence to have any knowledge of anything other than plugin isolation. The persistence engine shouldn't be used to query for common properties of concepts such as Players and Blocks, which have been attached by different plugins. If PluginA wants to query the properties of a Block that PluginB has attached, it should use PluginB's API, not try to query the data directly. Plugins should have knowledge of other plugins' APIs, not their data stores. I would go as far as to say that the plugin itself should be the atomic feature of the persistence engine, not any other entity.

    Overlaying an object model for common entities such as Player, Block, Region etc, probably helps from an administrative perspective because interrogating the information offline becomes more straightforward. However, I would argue against that being the fundamental purpose of the persistence engine. Offline administration can still be performed without needing this kind of perspective. Any online administration would be done via specific plugin APIs, which negates any need for common entities in that situation.

    I also don't see avoiding redundancy as being an actual issue here, either. Who cares if two plugins have stored information like Player.Location? And more importantly, why should we be trying to enforce that information as being the same for both plugins? Let's not force plugin developers to try and second guess what other plugins may do with a normalised store.

    Someone mentioned plugin compatibility and sociability earlier in the thread, giving an example of competing inventory systems. I can accept that case as a genuine issue. However, I don't think it's a problem to do with persistence, but rather shared objects in general, as the problem occurs even if persistence did not exist at all.

    Therefore, I would expect that the persistence engine for bukkit could be quite simple, and would be better off for being that way. Allow us to persist data in the store explicitly from a plugin perspective, and completely isolated from other data in the store. As long as each individual plugin can use the store in its own way, that's all that counts.

    In terms of implementation, because we really don't know what plugin developers are going to want to store, something based around NoSQL would be ideal. Someone earlier mentioned mongodb, and I'm in favour of that type of concept, or at least something similarly JSON. I'm not referring to the backend engine (multiple implementations of the interface would be ideal), but rather the persistence API itself. If developers can set up a new store for their plugin, stuff in whatever data they want, in whatever structure they want, and then retrieve it at some point in the future, that is all that is needed (apart from a specific store removal, of course). If it can be queried with a format that exists elsewhere, such as LINQ, Json Query etc, all the better.
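A bare-bones sketch of that plugin-isolated API in Java, backed by in-memory maps. `IsolatedStores`, `storeFor` and `removeStore` are hypothetical names; a real implementation would sit on top of whichever backend is chosen and would not hand out raw maps.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical plugin-isolated persistence: each plugin gets its own store
// and can never see another plugin's data through this API.
final class IsolatedStores {
    private final Map<String, Map<String, Object>> byPlugin = new HashMap<>();

    // Set up (or reopen) the store belonging to one plugin.
    Map<String, Object> storeFor(String pluginName) {
        return byPlugin.computeIfAbsent(pluginName, k -> new HashMap<>());
    }

    // Explicit store removal, e.g. when a plugin is uninstalled.
    boolean removeStore(String pluginName) {
        return byPlugin.remove(pluginName) != null;
    }
}
```

Two plugins can both write a `"Player.Location"` key here without either one affecting the other, which is the redundancy argued for above.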

    Tying up this idea with grand concepts from the EE world seems pointless as even attempting to solve complex ORM issues in something as cool-but-small as bukkit would just make it devolve into endless bickering and edge cases.
     
  7. Offline

    hash

    Strong agree. If the persistence system starts thinking of itself as trying to be a magic wand to make plugins communicate even when their designers are too lazy to design APIs intelligently, then things are going to get messy.

    I think the thing that is hardest (read: impossible) to provide per plugin is synchronizing saves so that a backup system can do the equivalent of minecraft's existing save-off save-all save-on things. Some sort of coordination is needed there. Translation from in-memory objects to serial storage can, as others have said, be a holy war (and there's actually a good reason for that I think, since there really isn't any single good one-size-fits-all solution), but synchronization is a problem that needs to be solved universally, and right now there's just no way to do it without completely shutting down a server.
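One possible shape for that coordination, sketched in Java with a read/write lock. `SaveCoordinator` and its methods are hypothetical, and this assumes every plugin routes its disk writes through the coordinator, which nothing in Bukkit enforces. Plugin writes share the read lock; a backup takes the write lock, which plays the role of save-off, asks everyone to flush (save-all), copies, and releases (save-on).

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical server-wide coordinator for plugin saves and backups.
final class SaveCoordinator {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<Runnable> flushers = new CopyOnWriteArrayList<>();

    // Each plugin registers a callback that flushes its pending data to disk.
    void registerFlusher(Runnable flushToDisk) {
        flushers.add(flushToDisk);
    }

    // Plugins wrap every disk write in this. Many writes may run at once,
    // but none can start while a backup holds the write lock.
    void write(Runnable diskWrite) {
        lock.readLock().lock();
        try {
            diskWrite.run();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Backup: wait for in-flight writes and block new ones ("save-off"),
    // flush every plugin ("save-all"), copy the files, then release ("save-on").
    void backup(Runnable copyFiles) {
        lock.writeLock().lock();
        try {
            for (Runnable flusher : flushers) {
                flusher.run();
            }
            copyFiles.run();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

A flusher may itself call `write`, because the thread holding the write lock of a `ReentrantReadWriteLock` is allowed to acquire the read lock as well.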
     
  8. Offline

    4am

    How about allowing a plugin to define a scope? Public (open to all plugins), Protected (read-only), Private (only accessible to the plugin which created the data)
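The three scopes could be expressed as a small access check. This is only a sketch: `Scope` and `ScopedEntry` are made-up names, and it reads "Protected" as "other plugins may read, only the owner may write".

```java
// Hypothetical visibility levels for stored data.
enum Scope { PUBLIC, PROTECTED, PRIVATE }

// Hypothetical piece of stored data tagged with its owner and scope.
final class ScopedEntry {
    final String ownerPlugin;
    final Scope scope;

    ScopedEntry(String ownerPlugin, Scope scope) {
        this.ownerPlugin = ownerPlugin;
        this.scope = scope;
    }

    // Everyone can read PUBLIC and PROTECTED data; PRIVATE is owner-only.
    boolean canRead(String plugin) {
        return scope != Scope.PRIVATE || ownerPlugin.equals(plugin);
    }

    // Only PUBLIC data is writable by plugins other than the owner.
    boolean canWrite(String plugin) {
        return scope == Scope.PUBLIC || ownerPlugin.equals(plugin);
    }
}
```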
     
  9. Offline

    RustyDagger

    What I think would be nice is if it stored all the id=name mappings in one central location where every plugin can access them. I have so many different item.db files on my server that it's annoying. Plugins could benefit from it a whole lot. Simple things like that are what plugin devs sometimes overlook.
     
  10. Offline

    eltorqiro

    For what purpose? IPC should be done via the API, rather than the datastore. I cannot think of a situation where communication between plugins through the datastore is advantageous (unless of course, that *is* the API).
     
  11. Offline

    4am

    Well, perhaps most situations for "public" would be a bad idea - plugins would fight over changed data and you'd end up with loops (and a headache). Being able to read values from the data store that other plugins make readable (protected) could simplify some things. Then again, if you are creating something so involved it should have an API to allow access by other plugins, perhaps you should just create the API...

    Unfortunately I can't think of a use case for read-only other than laziness on the part of the originator of the data (instead of an API, they just store read-only and there you go)
     
  12. Offline

    narrowtux

    Some ideas come to my mind:
    • Saving per-plugin-settings that the server admin can easily change via text/gui-interface
    • Enabling data-sharing for plugins (e.g. Plugin A saves that Player X has VIP-Status and Plugin B wants to know this)
      • This would include that a persistence object could be linked to an entity or block
     
  13. In a general sense, my wishlist for a persistence api would offer the following:


    A store of persistence, indexed by string, eg "org.bukkit.ExampleCollection"

    collection = get_persistent_store("org.bukkit.ExampleCollection")


    From that collection, I would like to be able to fetch values by a simple key

    data = collection.get("ExampleData")


    The types of data that are stored in the collection should be SIMPLE, and should be VALUES and not references. It should be the role of the plugin author to consider the data that they want to save and to figure out how to represent it. It should not require additional class references to be available to unpack the data.

    PersistentType could be any variety of the following:
    int
    double
    string
    byte[] for opaque data such as worldedit saves
    list of PersistentType values
    map of PersistentType key: PersistentType value (very optional)

    Optionally it would be nice to have a simplified Location type, primarily for keying off of in a map. Just world,x,y,z as string,int,int,int. It isn't necessary though, as you could make that work with a list. There may be one or two other types considered integral to the server that could benefit from having formalized complex representations, but for the most part things should be able to be represented simply with the above. Heck, even the map storage type could be represented as a list of lists, and it would be up to the plugin to marshall those into whatever variety of HashMap, TreeMap, etc that it wanted to work with.

    From the above it should be possible to construct almost any complex data.

    It should be possible to iterate over the collections. It should be possible to iterate over the keys in a collection, and to check the type of the data stored there. This could simply be an enum or the Class reference.

    for key,val in collection {
    dtype = collection.getType(key);
    print "key:", key, "is a", str(dtype);
    }

    From a convenience perspective, the API should offer a way to obtain and typecheck, for example while

    data = collection.get("ExampleList")

    would work to get you a value, you could instead

    data = collection.getList("ExampleList")

    which would check that the "ExampleList" key exists in the collection and that its value is a list, raising an Exception if either check failed, and

    data = collection.getList("ExampleList", null)

    would check for the ExampleList value in the collection, and return null if it either did not exist or was not a list
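The collection described so far, sketched in Java over an in-memory map. `PersistentCollection` is a hypothetical name, as is every method on it; the type check in `put` is shallow (it does not inspect list or map contents), and the choice of exception types is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical simple-value collection with typed convenience getters.
final class PersistentCollection {
    private final Map<String, Object> values = new LinkedHashMap<>();

    // Accept only the simple value types from the wishlist (shallow check).
    void put(String key, Object value) {
        boolean ok = value instanceof Integer || value instanceof Double
                || value instanceof String || value instanceof byte[]
                || value instanceof List || value instanceof Map;
        if (!ok) {
            throw new IllegalArgumentException("unsupported type for key " + key);
        }
        values.put(key, value);
    }

    // Untyped fetch by key; null if the key is absent.
    Object get(String key) {
        return values.get(key);
    }

    // Keys can be iterated, and the stored type checked per key.
    Set<String> keys() {
        return values.keySet();
    }

    Class<?> getType(String key) {
        Object v = values.get(key);
        return v == null ? null : v.getClass();
    }

    // Strict variant: throws if the key is missing or is not a list.
    List<?> getList(String key) {
        Object v = values.get(key);
        if (!(v instanceof List)) {
            throw new NoSuchElementException(key + " is missing or not a list");
        }
        return (List<?>) v;
    }

    // Lenient variant: returns the fallback instead of throwing.
    List<?> getList(String key, List<?> fallback) {
        Object v = values.get(key);
        return (v instanceof List) ? (List<?>) v : fallback;
    }

    // Remove a single mapping; returns the old value or null.
    Object remove(String key) {
        return values.remove(key);
    }
}
```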


    I would prefer if the persistence data were there for long-term storage, and that it not be used as the "working set". For instance, a plugin would ideally only want to read from the storage once at load, but it may write it out whenever it feels the need.

    HOWEVER, I would like it if I could register an event watcher for the persistence data to be notified that it has changed. While I would not prefer to use the storage as an inter-plugin communication interface, I ABSOLUTELY foresee this happening anyway, such as with economy addons, and it would be better if the plugin could ask to be notified when those values change rather than polling them repeatedly to make sure it's got the right value.

    add_persistence_watcher("org.bukkit.ExampleCollection", my_watcher_instance)
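A sketch of the watcher idea in Java. `StoreWatcher`, `WatchedStores` and the callback signature are assumptions made for illustration; watchers are called synchronously on every `put`, which a real implementation might not want.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical callback invoked whenever a value in a watched store changes.
interface StoreWatcher {
    void onChange(String store, String key, Object newValue);
}

// Hypothetical set of named stores with change notification.
final class WatchedStores {
    private final Map<String, Map<String, Object>> stores = new HashMap<>();
    private final Map<String, List<StoreWatcher>> watchers = new HashMap<>();

    // Ask to be notified about changes to one named store.
    void addWatcher(String store, StoreWatcher watcher) {
        watchers.computeIfAbsent(store, k -> new ArrayList<>()).add(watcher);
    }

    // Write a value, then notify only the watchers of that store.
    void put(String store, String key, Object value) {
        stores.computeIfAbsent(store, k -> new HashMap<>()).put(key, value);
        for (StoreWatcher w : watchers.getOrDefault(store, Collections.emptyList())) {
            w.onChange(store, key, value);
        }
    }

    Object get(String store, String key) {
        Map<String, Object> s = stores.get(store);
        return s == null ? null : s.get(key);
    }
}
```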


    And finally, I would like to be able to destroy both the mappings inside of a store, and the store itself.

    collection.remove("ExampleData")
    destroy_persistent_store("org.bukkit.ExampleCollection")

    So the above describes (not as coherently as I would like) the simplest use-case data storage. I could envision a layer that is available that sits on top of the simplified value storage that would enable more complex mappings to be transparent to plugin authors. For example, a Location map for associating data with blocks. As that concept is fairly general, it should be perfectly acceptable to write a LocationMap that would marshal itself back and forth from the simpler underlying persistent types, but would provide a way to associate data directly to a world,x,y,z coordinate, very much the same as if you'd stored the persistent data upon the block at those coordinates. Likewise for player or entity mappings. The underlying data would simply be strings, but a layer above that would perform the lookups and translations to associate the data with a Player instance, without the plugin author having to do the legwork.
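A sketch of such a LocationMap layer in Java, flattening world,x,y,z into a plain string key. `LocationMapSketch` and its key format are assumptions for illustration; the world name is placed last so that a name containing commas cannot be confused with the integer coordinates.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical convenience layer: callers think in world,x,y,z while the
// underlying storage only ever sees simple string keys.
final class LocationMapSketch<V> {
    private final Map<String, V> backing = new HashMap<>();

    // Coordinates first, world name last: the integers never contain commas,
    // so the key stays unambiguous whatever the world is called.
    static String key(String world, int x, int y, int z) {
        return x + "," + y + "," + z + "," + world;
    }

    V put(String world, int x, int y, int z, V value) {
        return backing.put(key(world, x, y, z), value);
    }

    V get(String world, int x, int y, int z) {
        return backing.get(key(world, x, y, z));
    }

    V remove(String world, int x, int y, int z) {
        return backing.remove(key(world, x, y, z));
    }

    // The flattened form that would be handed to the simple value store.
    Map<String, V> raw() {
        return Collections.unmodifiableMap(backing);
    }
}
```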

    Having two layers available would permit both the simplified use cases that I envision for myself, and the more complex and interesting cases such as the warp associations mentioned earlier in the thread.

    EDIT by Moderator: merged posts, please use the edit button instead of double posting.
     
    Last edited by a moderator: May 9, 2016
  14. Offline

    OvermindDL1

    Having been programming for more decades than I would prefer to admit, I have used a huge variety of persistence schemes, ranging from LISP being able to persist and change its own code, to about 20 methods in C++, including some that involved some rather fascinating meta-templates (think LISP-style changing code), to Java serialization (which I personally still think is a joke, but then again that is my opinion of Java in general for a huge multitude of reasons) and third-party utilities, to a variety of schemes in Erlang ranging from Mnesia to Riak and others, to a variety in Python and many more.

    Personally, for an 'integrated' system, I think the serialization abilities of Boost.Serialization (for C++) are among the most powerful for saving and loading chunks. It lets you define a variety of formats to export to, from text to XML to raw binary and platform-agnostic binary, with an easy interface to integrate whatever else you want (even a remote KV store), and it handles circular references and all. However, it is nowhere near good when you need just 'pieces' of data, as you must unserialize the whole structure that contains the data. This is a problem that Java serialization and others have as well.

    To work around that problem you can move to an atomic-unit type of store. These can range from a hashmap to a full KV store (like HyperTable, Riak, Dynamo, Mnesia, etc...) to pretending a SQL table is a KV store (with a speed hit, of course, due to SQL inefficiencies). However, this 'design' has a problem: although you can grab individual pieces of data, you usually cannot grab several in a single transaction, so the data can change in between. About all of these have ways to work around that, though: SQL itself has queries, and just about every type of KV store has some variation of map-reduce that effectively does the same thing.

    So between these two extremes, it should be rather obvious that a 'chunk' saving scheme is not appropriate, for reasons including, but not limited to: they tend to be slower, they create more massive memory allocations, and you generally get more data than you actually want. They do, however, tend to be 'simpler' to use. A KV store still has abilities for grabbing multiple data values at once, but that usually requires quite a bit more code than grabbing a chunk using the first type. There are a variety of ways to minimize that code, however; the most popular are DSELs, which are easily done in C++ (using Boost.Proto for example), or LISP, or Erlang (both using their own native constructs). Although Java is capable of that, it requires pre-processing steps or a nasty amount of back-end code to mutate its own data structures, and the lack of operator overloading and other such niceties in Java makes those DSELs all the more difficult to use.

    An example of a persistence platform that I think works very well (although it has its own limitations, due to the back-end, not its design) is Mnesia in the Erlang language. Since Erlang is a naturally distributed language and Mnesia is built into it, it can replicate transparently; that feature is obviously not useful here since the Minecraft server is completely incapable of scaling. Mnesia works like this: first you create a table:
    Code:
    mnesia:create_table(test, []).
    The created table 'test' can then be used: if it is a new table then it is created, if it existed in the past its data is loaded. The [] is where you can pass a variety of options saying how it replicates and whether it is RAM-only, DISK-only, or stored in RAM and DISK. This command does the loading of existing tables in the back-end, and the database returns a still_loading error if accessed before it is finished. There is a function call that returns whether it is up or not, and one that blocks until it is up (you see a lot more blocking things in Erlang since it uses a much higher abstraction of threading than just 'Threads' as C++/Java/Python/etc know it, although in reality *nothing* in Erlang blocks).

    There are a couple of ways to look up data (beyond the variety of match types): you can use "dirty_" commands to access data outside of a transaction (which is completely safe as long as only one thread/process accesses that table; if concurrent access is even possible at the time, use transactions), or the non-dirty ones inside a transaction. A transaction in Erlang is done by passing a functor with the direct access code into the transact function, which then executes the passed-in functor/function on the mnesia database itself; this lets you do anything from a simple key lookup/write to map/reduce kind of functionality.
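The same "pass a function into the transaction" pattern, sketched in Java rather than Erlang. This is not how Mnesia is implemented; `TxStore`, `transact` and `dirtyRead` are hypothetical names, and the copy-on-commit approach is just the simplest way to show commit-or-discard semantics in a few lines.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical in-memory store with function-based transactions.
final class TxStore {
    // Replaced wholesale on commit and never mutated after publication, so
    // dirty reads always see some complete, committed snapshot.
    private volatile Map<String, Object> data = new HashMap<>();

    // Run `body` against a private working copy. If it returns normally the
    // copy becomes the new state; if it throws, the old state is untouched.
    synchronized <R> R transact(Function<Map<String, Object>, R> body) {
        Map<String, Object> working = new HashMap<>(data);
        R result = body.apply(working);
        data = working;
        return result;
    }

    // Read outside any transaction: no locking, sees the last commit.
    Object dirtyRead(String key) {
        return data.get(key);
    }
}
```

Because the whole body runs under one lock against one snapshot, a read-modify-write such as "add 5 coins" cannot interleave with another transaction.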

    Mnesia works well because it is built into the language, and it is *very* fast (a simple mnesia or riak database on my home servers *vastly* outperforms the heavily optimized SQL servers at work, tested with 10k entries with mnesia and riak and 10 million entries totaling over a terabyte with riak). A KV store should theoretically always outperform a SQL table except for potentially the most complex of 100+ join queries (which KV should still outperform if you used a complex enough map/reduce for that as well).

    So in short, SQL is too complex and requires external dependencies. A simple KV store with at least map/reduce semantics (or more powerful types that are more simple to program, like how Riak or Sector/Sphere can do) would be the best.

    Potentially, a Riak type design would work very well. Riak has a Bucket, inside Buckets are tables, inside tables is a KV store (although in Riak different buckets can have different types of KV stores, some optimized for text searches, etc...). A lookup would be like this in Java'ish style code:
    Code:
    Store.get("BucketName", "TableName", "KeyName");
    Which would return a Byte chunk, or perhaps could even do something map/reduce'ish for more complicated things.
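A minimal in-memory sketch of that three-level lookup in Java. `BucketStore` is a hypothetical name and this is not Riak's client API; it only mirrors the bucket / table / key shape from the line above and returns raw bytes, leaving decoding to the plugin.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical bucket -> table -> key store holding opaque byte chunks.
final class BucketStore {
    private final Map<String, Map<String, Map<String, byte[]>>> buckets = new HashMap<>();

    // Store a copy so later changes to the caller's array don't leak in.
    void put(String bucket, String table, String key, byte[] value) {
        buckets.computeIfAbsent(bucket, b -> new HashMap<>())
               .computeIfAbsent(table, t -> new HashMap<>())
               .put(key, value.clone());
    }

    // Returns a copy of the stored bytes, or null if any level is missing.
    byte[] get(String bucket, String table, String key) {
        Map<String, Map<String, byte[]>> tables = buckets.get(bucket);
        if (tables == null) {
            return null;
        }
        Map<String, byte[]> kv = tables.get(table);
        if (kv == null) {
            return null;
        }
        byte[] value = kv.get(key);
        return value == null ? null : value.clone();
    }
}
```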

    Personally, I think a KV store would be best, great mix of speed and usability, and since it is all local it would be faster to do your 'joins' in your code instead of a SQL query. However, if you wanted to use a remote store, MySQL or PostgreSQL would be best.

    Regardless of what is used, anything would be nice, as long as it is performant for masses of quick access (maybe optimize it to handle 128 gets or 64 puts per tick at least?), and still capable of saving massive amounts of data (and I do mean massive, think BigBrother kind of massive, my BigBrother DB is over 50 gigs right now) *without* causing massive slowdowns for gets/puts (Riak slows from, say, 8us per insert to 12us per insert between 10k and 10 million entries, and reads stay rather constant between 4-6us per key depending on access patterns, and that is an external database program; compare that to your optimized SQL... Do note that is microseconds, not milliseconds, and HyperTable vastly outperforms even that).

    Try not to require an external database for efficiency though, although it would be nice for it to save to MySQL so we can access it easily externally, do not make the internal one slow as heck (and I know how difficult that can be to pull off with Java's weird and somewhat random memory access patterns).

    In short: A KV store would be perfect and is easy to create and does not require any external programs, just make sure that it does not require it all to be in memory at once. And having it be able to save to an external database (things like Riak are trivially easy to access, MySQL is good too) as an alternative would be great, but if possible have the option for tables/buckets to use the internal version only for speed reasons.
     
  15. Offline

    nickguletskii

    • Groups and permissions;
    • Object persistence;
    • Configs
    That is pretty much everything.
     
  16. Offline

    matejdro

    Any news on that one?
     
  17. Offline

    Mixcoatl

    Persistence is in. Not sure on groups and permissions.
     
  18. Offline

    matejdro

    :O

    Is there any documentation around?
     
  19. Offline

    Mixcoatl

    I never saw a public release notification indicating it was available. But my head may have been in places better left unmentioned. In any case, I've participated in several discussions on these forums relating to persistence. @Sammy put up a nice YouTube tutorial on it, too.
     
  20. Offline

    matejdro

    Thanks, i will check that out.

    @Bukkit Team why was nothing mentioned about this? Last thing i saw was post where you said that it's in the works.
     
  21. Offline

    Sammy

    @matejdro Bukkit Developers don't really like to be the center of attentions loool
    They should be more communicative, if the update doesn't break anything they won't announce it ^^
    That's why so many times people bash them for "not working"
     
  22. Offline

    matejdro

    They made a lot of announcements about how Persistence is in the works, but they won't announce that it is completed? That does not sound really logical to me.
     
  23. Offline

    phondeux

    Just to speculate here - maybe they thought it wasn't ready for release? The entire bukkit project is in beta for an application in beta, etc,....
     
  24. Offline

    Daniel Heppner

    I need documentation. Seriously, I'd love to use persistence, but I don't know how. Anyone want to make a tut? Is there some Javadocs page I'm missing or something?
     
  25. Offline

    Dreadreaver

  26. Offline

    Daniel Heppner

  27. Offline

    Huns

    It would be nice to see Bukkit come with SQLite v3 support built in. Right now, a number of plugin developers either include it or automatically download it, and then you run into runtime linking problems if you have two plugins that both try to load it (because Java can't have two classes in the same namespace with the same name.)
     
  28. Offline

    Dreadreaver

    just use native persistence instead of sqlite .. problem solved
     
  29. Offline

    Huns

    That would solve the persistence issue, but that's all it would solve. Java's built-in persistence isn't really the same as sqlite. You can't run queries against it, which means you have to hard code that functionality into your plugins.
     
  30. Offline

    Dreadreaver

    word but for me thats no biggie
     