| « Bento | CentOS » |
Memento Offload
February 13th, 2007Fragments
Memento is an XML database. It organizes XML documents as document fragments.
These fragments are kept in memory using soft references so they can be discarded when not in use.
The fragments can be used to compose a very large document. An application can navigate a document that is too large to fit into memory using an in memory document object model.
The fragments can be extracted from the database using queries to form XML result sets. Fragments can be stored and indexed so they can be retrieved as a document composed of the found fragments.
Document manipulation is performed using the document object model of your choice. Fragments are XML documents. They are created by writing to an output stream, so you can create fragments the same way you write XML to file. You create fragments using the DOM of your choice, or XSLT transforms, or else copying XML files from disk.
Fragments reference other fragments by virtue of elements in a specific namespace. When they are encountered, the application should substitute the fragment referenced by the fragment element.
Document navigation is performed by making this jump when a fragment element is detected.
The fragments can be interpreted as a single, seamless document when using a navigation API like the ones that back Jaxen and Saxon, streaming APIs such as SAX or STaX, or interface based document APIs such as Dom4J, and W3 DOM.
With these APIs the fragment jump can be implemented behind the interfaces of the API, they can skip over the boundaries between fragments to present their contents as a single XML document.
Big Files
If you are using Saxon to transform an XML document that will no longer fit into memory, you can read it into a Memento document using a SAX filter that will break it up into fragments at intervals of your choosing. Then, you can then navigate the document, using a Saxon NodeInfo wrapper.
Lot’s of Little Files
If you were to collect a lot of syndicate feeds, you could store the individual entries as single documents and index them on the date of the entry, to create an aggregate feed that is ordered by time.
(You get includes out of this don’t you?)
ACID
Memento uses Multi-Version Concurrency Control so that changes to a Memento document are atomic, consistant, isolated and durable.
Memento journals changes to the database, so that changes can be rolled back, or in the event of a shutdown, rolled back for forward, to ensure that the database is repaired.
Memento uses opportunistic locking, so that you can write new versions of the database, and so long as the fragments you’ve changed have not been changed by another process, they are committed.
The opportunistic locking makes Memento a very lively database. While a thread changes the database, writing out new versions of fragments, other threads can navigate an unchanged version of that fragment.
Indices
How would an index be used?
I can’t think of how an index would be used. The only way that occurs to me, is to do this.
Define an index. Populate the index using an XPath query. Then query the index using string values.
It could be queried from with XPath and XSLT using an extension function.
That is a very simple implementation of in index that could be used to prove the concept. One could then store atom entries using the UUID. Search for the UUID and obtain a navigator.
That is an implementation that I could use to store atom feeds.
Nodes
There are a number of DOMs that could be candidates for use with Memento. JDOM and DOM4J appear to be serializable, so they can be stored directly to a managed file, and everything could be serialized as XML and parsed to retrieve.
Navigating all these DOMs is a trick. There is no good solution. There are two navigation abstractions of which I’m aware, one in Saxon and one in Jaxen.
I’m already using Saxon and almost dependent on it, however, it is very hard to understand, and very difficult to use outside of Saxon.
W3C DOM itself provides an abstraction layer that is almost universal. Creating a W3 DOM wrapper might be a good bet, and it it would allow for pointer surgery, and easy wrapping.
Jaxen navigation is a nice universal navigation.
Serialization
There are two routes for serialization, write XML using proper serialization, or serialize the objects using Java serialization.
Additionally, one could persue the option of writing out XML in a format that is fixed width, or otherwise easy to read.
Java Serialization
One would hope that serialization using Java DOM is fast
Easy Storage DOM
References
References into the nodes start with references to the fragments and offsets into the fragment in document order. Finding the offset into the fragment may be a time consuming operation, if the document fragment is unduly large.
Keeping a skip list in a hash table might expedite this search for some document object models.
Memento is Fragments
Memento allocates a new fragment. You drop your XML into the fragment. The fragment may contain references to other fragments.
You can always navigate the entire Memento document as one big document. This huge document is rooted at a memento node.
Fragments can be visible or invisible.
Fragments are stored in the DOM using a node in the memento namespace called fragment that has a fragment attribute that references the root node of the fragment that should replace this fragment.
Namespace Prefixes
Because memento is a bunch of fragments, namespace prefixes need a declaration strategy. Three pop to mine.
- vivify, make one up as needed,
- external so that you look it up out of a map,
- internal so that every fragment is itself a correct document with correct prefix declarations.
Traversing the parent axis for a namespace declaration might be time consuming, and some error might mean that the declaration is not there, or rather, we need to ensure that it is there at insert. External might be missing the namespace. Vivify is goofy looking, but will work and work fast.
Node Surgery and References
References into a fragment are perverted when you add nodes to the document. Are the discernable?
If you insert nodes, on a child axis, you push forward all the following nodes, thus…
If you have a version number lying around, is it possible to adjust references between versions?
Then when do you vacuum this bit of housekeeping?
Couple of things…
References include the version. A container of references includes the version. Reference counting?
Concurrency
Can concurrency be optimistic?
If not, how do we enter into the document, without having to lock down the root?
Because of indices, we can navigate the document with a query, and lock only the root fragments that interest us. There is no root, except for the virtual root.
Navigating the document from the virtual root means that we’ll lock shared in document order.
There are some rules to follow…
You can only lock downward.
Thus, if you jump into a fragment that is not a root fragment, so long as you do not navigate back up the tree, to the parent, you are cool.
If you perform a query and get a bunch of random junk, then you can’t lock in document order. You need to lock in some order though, and it needs to be consistent.
Allocation order will be the same as document order for root fragments. The fragment id will correspond to the order of the virtual document.
Can you read the individual fragment housekeeping? That’s different from locking the fragment for reading, isn’t it?
Thus, I can select a fragment, grab the latest version that is not mutating, read it, because updating that housekeeping ought to be atomic. Passing through a fragment ought to be quick.
The only time to lock the thing down, really, is when you are cleaning up the older version. There is no need to lock anything, is there?
After I’ve perverted all my fragments, I can then commit the changes, and if someone else has perverted one of my fragments since I started, then I can throw an exception.
That opportunistic.
Then it’s on the application to deal with problems where a fragment keeps bombing out over and over again, like a header document or a summary document.
And the locking occurs only on those fragments that have been mutated, so that you can simply lock them in allocation order, to lock them for update. Check that there are no newer versions. Then write your changes. Bada-bing!
It doesn’t even matter about the root anymore.
Versions
The version number can be the time down to millisecond, or hey, why not down to the second? They chances that two threads will update the same record at the same time had better be low, or else you’ll have a lot of exceptions.
Using time makes for less housekeeping, also more concurrency, since there is no version number header to maintain.
Roots and Indices
There is more than one root meaning that a Memento database can have multiple root documents, or none at all, since you can find fragments using an index.
Reindexing
Reindexing is still an annoyance. Must remove the item from the index, but won’t know how. Could keep the keys that find the fragment in the fragment, so that the fragment can remove itself when it changes. It is difficult to know if the fragment is still a part of the index.
Indices are going to have to be fragment based, for now then. There is something that determines if a fragment belongs, so you remove the previous version of the fragment from the index, then there is a list of indices that reference the fragment at least once.
What’s the big deal. If you index the fragment, and retain the logic that indexed the fragment, you can then run the logic on the fragment and iterate through the results to find your matches. That’s fast enough, isn’t it?
Store as XML
Why not? Then you can read it back in as XML. You can back up a lot easier too, can’t you?
You can deserialize into the DOM of your choice. You can use node surgery, using copy on write, then you can save that, that’s the opportuntsitic lock.
Okay, but to navigate, I can use an abstraction, like the Navigator, enter into a new fragment and use any available DOM to navigate, then if I need to perform surgery, convert the DOM, and find my way back to the place where I’m at.
Thus, I can use any old DOM I want, and there are many to chose from.
Rather than versioning each node, as I’d done prior, fragments are versioned.
Memento clusters a huge XML file into document fragments that are managed using MVCC. Each fragment can have mutiple versions.
Nodes are kept in an array in the file. The text referenced by text nodes and attirbutes are stored in a heap. Each node in the array has a pointer to a child.
It can get very compact.
It can be a list of node objects, they do not have to be of fixed length, since we’re going to navigate them in order anyway, just however many fit on a page.
An element has a namespace and an name. The namespace can be interned, as can the name, but not all strings, that is too complicated.
Thus, it doesn’t matter how big the elements are, the fragments are versioned. Perhaps they are numbered and indexed in a B+Tree?
Indices are tricky. Aren’t they always. Does the indice reference the node directly? Or does it identifiy a fragment and offset into the fragment?
comments



