CentOS
February 16th, 2007I purchased two servers at GoDaddy.com, one for myself and one for Think New Orleans, since they are not that expensive, I thought I’d purchase new servers as Think New Orleans took on more projects.
GoDaddy.com was chosen because they supported Fedora Core 4, which is what I’m currently running at ljubljana.blogometer.com. The server I purchased for Think New Orleans got off to an inauspcious start. Messages were popping up on the console informing me that the CPU was out of its temperature range. After filing a report, and waiting quite a while, I got a call telling me that the CPU fan was out. It was replaced.
Then I spend a couple hours wrestling with Plesk, trying to get it to host a WordPress blog. There were security features of PHP that prevented WordPress from running. I found myself trolling web bulletin boards reading about PHP configurations. Life it too short for this.
Why bother? I thought it would be nice to have managed hosting, and that Plesk would make that easier, since the GoDaddy.com managers would be familiar with it. That only creeped into my head as I was fixing Plesk, though. Something is broken, must fix, must justify loss three hours of life. Obsessive compulsive server adminsitration.
Plesk is rubbish. The application is virtual hosting of a lot of pet stores and such. Easy install guest book script, oh boy!
After filing a request to rebuild the server without it, I picked off the dependencies with rpm -e while watching the Daily Show, so I hope they don’t come by and zap my server now that it is Plesk free.
The upshot is that I’m going to lease a third dedicated server from LunarPages or SingleHop and start building out a CentOS distribution. It seems to be popular and there are more hosting options with it.
I’m pleased to see that there is finally a place to go after the end of Red Hat.
comments
Memento Offload
February 13th, 2007Fragments
Memento is an XML database. It organizes XML documents as document fragments.
These fragments are kept in memory using soft references so they can be discarded when not in use.
The fragments can be used to compose a very large document. An application can navigate a document that is too large to fit into memory using an in memory document object model.
The fragments can be extracted from the database using queries to form XML result sets. Fragments can be stored and indexed so they can be retrieved as a document composed of the found fragments.
Document manipulation is performed using the document object model of your choice. Fragments are XML documents. They are created by writing to an output stream, so you can create fragments the same way you write XML to file. You create fragments using the DOM of your choice, or XSLT transforms, or else copying XML files from disk.
Fragments reference other fragments by virtue of elements in a specific namespace. When they are encountered, the application should substitute the fragment referenced by the fragment element.
Document navigation is performed by making this jump when a fragment element is detected.
The fragments can be interpreted as a single, seamless document when using a navigation API like the ones that back Jaxen and Saxon, streaming APIs such as SAX or STaX, or interface based document APIs such as Dom4J, and W3 DOM.
With these APIs the fragment jump can be implemented behind the interfaces of the API, they can skip over the boundaries between fragments to present their contents as a single XML document.
Big Files
If you are using Saxon to transform an XML document that will no longer fit into memory, you can read it into a Memento document using a SAX filter that will break it up into fragments at intervals of your choosing. Then, you can then navigate the document, using a Saxon NodeInfo wrapper.
Lot’s of Little Files
If you were to collect a lot of syndicate feeds, you could store the individual entries as single documents and index them on the date of the entry, to create an aggregate feed that is ordered by time.
(You get includes out of this don’t you?)
ACID
Memento uses Multi-Version Concurrency Control so that changes to a Memento document are atomic, consistant, isolated and durable.
Memento journals changes to the database, so that changes can be rolled back, or in the event of a shutdown, rolled back for forward, to ensure that the database is repaired.
Memento uses opportunistic locking, so that you can write new versions of the database, and so long as the fragments you’ve changed have not been changed by another process, they are committed.
The opportunistic locking makes Memento a very lively database. While a thread changes the database, writing out new versions of fragments, other threads can navigate an unchanged version of that fragment.
Indices
How would an index be used?
I can’t think of how an index would be used. The only way that occurs to me, is to do this.
Define an index. Populate the index using an XPath query. Then query the index using string values.
It could be queried from with XPath and XSLT using an extension function.
That is a very simple implementation of in index that could be used to prove the concept. One could then store atom entries using the UUID. Search for the UUID and obtain a navigator.
That is an implementation that I could use to store atom feeds.
Nodes
There are a number of DOMs that could be candidates for use with Memento. JDOM and DOM4J appear to be serializable, so they can be stored directly to a managed file, and everything could be serialized as XML and parsed to retrieve.
Navigating all these DOMs is a trick. There is no good solution. There are two navigation abstractions of which I’m aware, one in Saxon and one in Jaxen.
I’m already using Saxon and almost dependent on it, however, it is very hard to understand, and very difficult to use outside of Saxon.
W3C DOM itself provides an abstraction layer that is almost universal. Creating a W3 DOM wrapper might be a good bet, and it it would allow for pointer surgery, and easy wrapping.
Jaxen navigation is a nice universal navigation.
Serialization
There are two routes for serialization, write XML using proper serialization, or serialize the objects using Java serialization.
Additionally, one could persue the option of writing out XML in a format that is fixed width, or otherwise easy to read.
Java Serialization
One would hope that serialization using Java DOM is fast
Easy Storage DOM
References
References into the nodes start with references to the fragments and offsets into the fragment in document order. Finding the offset into the fragment may be a time consuming operation, if the document fragment is unduly large.
Keeping a skip list in a hash table might expedite this search for some document object models.
Memento is Fragments
Memento allocates a new fragment. You drop your XML into the fragment. The fragment may contain references to other fragments.
You can always navigate the entire Memento document as one big document. This huge document is rooted at a memento node.
Fragments can be visible or invisible.
Fragments are stored in the DOM using a node in the memento namespace called fragment that has a fragment attribute that references the root node of the fragment that should replace this fragment.
Namespace Prefixes
Because memento is a bunch of fragments, namespace prefixes need a declaration strategy. Three pop to mine.
- vivify, make one up as needed,
- external so that you look it up out of a map,
- internal so that every fragment is itself a correct document with correct prefix declarations.
Traversing the parent axis for a namespace declaration might be time consuming, and some error might mean that the declaration is not there, or rather, we need to ensure that it is there at insert. External might be missing the namespace. Vivify is goofy looking, but will work and work fast.
Node Surgery and References
References into a fragment are perverted when you add nodes to the document. Are the discernable?
If you insert nodes, on a child axis, you push forward all the following nodes, thus…
If you have a version number lying around, is it possible to adjust references between versions?
Then when do you vacuum this bit of housekeeping?
Couple of things…
References include the version. A container of references includes the version. Reference counting?
Concurrency
Can concurrency be optimistic?
If not, how do we enter into the document, without having to lock down the root?
Because of indices, we can navigate the document with a query, and lock only the root fragments that interest us. There is no root, except for the virtual root.
Navigating the document from the virtual root means that we’ll lock shared in document order.
There are some rules to follow…
You can only lock downward.
Thus, if you jump into a fragment that is not a root fragment, so long as you do not navigate back up the tree, to the parent, you are cool.
If you perform a query and get a bunch of random junk, then you can’t lock in document order. You need to lock in some order though, and it needs to be consistent.
Allocation order will be the same as document order for root fragments. The fragment id will correspond to the order of the virtual document.
Can you read the individual fragment housekeeping? That’s different from locking the fragment for reading, isn’t it?
Thus, I can select a fragment, grab the latest version that is not mutating, read it, because updating that housekeeping ought to be atomic. Passing through a fragment ought to be quick.
The only time to lock the thing down, really, is when you are cleaning up the older version. There is no need to lock anything, is there?
After I’ve perverted all my fragments, I can then commit the changes, and if someone else has perverted one of my fragments since I started, then I can throw an exception.
That opportunistic.
Then it’s on the application to deal with problems where a fragment keeps bombing out over and over again, like a header document or a summary document.
And the locking occurs only on those fragments that have been mutated, so that you can simply lock them in allocation order, to lock them for update. Check that there are no newer versions. Then write your changes. Bada-bing!
It doesn’t even matter about the root anymore.
Versions
The version number can be the time down to millisecond, or hey, why not down to the second? They chances that two threads will update the same record at the same time had better be low, or else you’ll have a lot of exceptions.
Using time makes for less housekeeping, also more concurrency, since there is no version number header to maintain.
Roots and Indices
There is more than one root meaning that a Memento database can have multiple root documents, or none at all, since you can find fragments using an index.
Reindexing
Reindexing is still an annoyance. Must remove the item from the index, but won’t know how. Could keep the keys that find the fragment in the fragment, so that the fragment can remove itself when it changes. It is difficult to know if the fragment is still a part of the index.
Indices are going to have to be fragment based, for now then. There is something that determines if a fragment belongs, so you remove the previous version of the fragment from the index, then there is a list of indices that reference the fragment at least once.
What’s the big deal. If you index the fragment, and retain the logic that indexed the fragment, you can then run the logic on the fragment and iterate through the results to find your matches. That’s fast enough, isn’t it?
Store as XML
Why not? Then you can read it back in as XML. You can back up a lot easier too, can’t you?
You can deserialize into the DOM of your choice. You can use node surgery, using copy on write, then you can save that, that’s the opportuntsitic lock.
Okay, but to navigate, I can use an abstraction, like the Navigator, enter into a new fragment and use any available DOM to navigate, then if I need to perform surgery, convert the DOM, and find my way back to the place where I’m at.
Thus, I can use any old DOM I want, and there are many to chose from.
Rather than versioning each node, as I’d done prior, fragments are versioned.
Memento clusters a huge XML file into document fragments that are managed using MVCC. Each fragment can have mutiple versions.
Nodes are kept in an array in the file. The text referenced by text nodes and attirbutes are stored in a heap. Each node in the array has a pointer to a child.
It can get very compact.
It can be a list of node objects, they do not have to be of fixed length, since we’re going to navigate them in order anyway, just however many fit on a page.
An element has a namespace and an name. The namespace can be interned, as can the name, but not all strings, that is too complicated.
Thus, it doesn’t matter how big the elements are, the fragments are versioned. Perhaps they are numbered and indexed in a B+Tree?
Indices are tricky. Aren’t they always. Does the indice reference the node directly? Or does it identifiy a fragment and offset into the fragment?
Bento
February 9th, 2007
Authentic c-store Japanese bento by Ramil Sagum.
Malloc
The idea behind Bento it to take a file and divide it up using the malloc algorithm as described by Doug Lea.
Bento is written in Java for JDK 1.4.2 using NIO.
You can read more about Bento on my Wiki and you can paruse the source in my Subversion repsitory and you can use it under LGPL.
Design Goals
Bento is a file format for writing out discrete blocks of data. It is designed for the storage of objects. Bento is not supposed to be a database in itself.
Bento is a robust file format that provides
- storage of blocks of arbitrary size,
- random access to stored blocks,
- atomic writes of multiple blocks,
- thread-safe allocation and deallocation of blocks,
- and thread-safe reads and writes.
Funamentally, Bento strives to be a simple data structure that you can use to persist objects in your applications. You can use Bento as the basis for more complicated data structures or databases.
Usage
Bento can be used to serialize objects using Java serialization, to write out objects as Java primitives, or to store Java primitives.
A single Bento file can be organized into a database of different Java types. For desktop applications, this means that a single file can be the container for many different objects of many different classes.
Multiple Bento files can be organized into a database, spread across different file system, to reduce disk contention.
You can design resuable Bento based data structures and mix them into a single Bento file.
Quick Start
An example program that creates a Bento file with a static header and a single allocated block.
Bento.Creator creator = new Bento.Creator();
creator.addStaticBlock(URI.create("flag://agtrz.com/header"), 512);
Bento bento = creator.create(new File("data.bento"));
Bento.Mutator mutator = bento.mutate();
try {
Bento.Block block = mutator.allocate(512);
ByteBuffer bytes = block.toByteBuffer();
for (int i = 0; i < 512 / 4; i++) {
bytes.putInt(i);
}
block.write();
Bento.Block header = mutator.load(URI.create("flag://agtrz.com/header"));
ByteBuffer bytes = header.toByteBuffer();
block.getAddress().write(bytes);
header.write();
mutator.commit();
}
finally {
mutator.rollback();
}
bento.close();
Bento.Opener opener = new Bento.Opener();
bento = opener.open(new File("bento.data"));
ByteBuffer header = bento.read(URI.create("http://agtrz.com/header"));
ByteBuffer bytes = bento.read(new Bento.Address(header));
for (int i = 0; i < bytes.capacity() / 4; i++) {
if (i != bytes.getInt()) {
throw new IllegalStateException();
}
}
bento.close();
Mutation
A Bento file is written using a Bento.Mutator. All of the changes performed by a mutator are atomic. The changes are not made perminent until the commit method of mutator is called. The changes can be discarded by calling the rollback method of the mutator.
After calling commit or rollback the mutator can be reused. Calling commit or rollback when there are no uncommited changes held by the mutator does nothing. Thus, we use a try/finally block to recover the state of a Bento file when a failure occurs.
public void storeObjects(Bento bento, Bento.Address address) {
Bento.Mutator mutator = bento.mutate();
try {
Bento.Block block = mutator.load(address);
block.toByteBuffer().putInt(this.magicNumber);
block.write();
mutator.commit();
}
finally {
mutator.rollback();
}
}
Concurrency
Bento synchronizes the list of free blocks, but does not synchronize the blocks themselves. Blocks can be allocated and freed in a thread-safe manner without further consideration by the application developer.
If you read and write blocks without some syncrhonization strategy, you’ll turn your Bento file into garbage in short order.
Multiple threads can read the same blocks, but blocks that are being written should not be written or read by other threads. Reads can be shared. Writes are exclusive.
A simple strategy is to use a read/write lock from the concurrency library to gaurd the file during updates.
Bento strives for liveness, however. It places as few locks as possible on the management of list of free blocks.
A more sophisticated strategy would place locks on the blocks themselves. A composite object, such as a tree, could lock a tree root element. Different threads can then operate on different trees in the same file. They would not run afoul of each other.
Atomic Writes
A Bento file is edited using a mutator the object. The mutator object keeps all block writes, allocations, and deallocations are kept in memory until the user commits or rolls back the changes made by the mutator.
Journaling
All reads, writes, allocations and frees are performed on buffers in memory. Only when a mutation is committed or rollback, is the underlying file written.
Journaling is implemented by writing out the list of operations into a journal prior to applying the changes to the file.
The journal can be implemented using blocks within the file, or it can be implemented using external temporary files. The former is good for desktop applications, where the user may move a corrupted Bento file out of reach of the external journals, or unwittingly delete the external journals. The latter is good for a server application, where multiple Bento files and journals can exist in a single directory, watched over by an administrator.



