Alan Gutierrez

Alan Gutierrez blogs on software, social networks, and himself.


Indexing Fragments Versus Indexing Nodes

Fragments by K Tucker.

Memento does not index nodes. It indexes fragments based on their contents.

XPath can be used to extract the fields of an index, but XPath is not otherwise in play.

With a document, such as…


<person>
<first-name>Alan</first-name>
<last-name>Gutierrez</last-name>
</person>

(Uh, oh! WordPress cannot render example XML? Man this is broken!)

You can create an index using the XPath results from the query /person/last-name. I plan on writing a nice implementation that will select index participants using those XPath statements.
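As a rough illustration of the idea, here is a minimal sketch that uses the JDK's built-in javax.xml.xpath rather than XOM or Saxon. The class FragmentIndexSketch and its add and find methods are hypothetical names made up for this example, not Memento's API. The XPath expression only picks out the key values; what gets stored under each key is a fragment identifier, not a node.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: an index from extracted field values to fragment identifiers.
public class FragmentIndexSketch {
    private final SortedMap<String, List<Long>> index = new TreeMap<>();
    private final String fieldPath;

    public FragmentIndexSketch(String fieldPath) {
        this.fieldPath = fieldPath;
    }

    // Run the XPath expression against one fragment and return the text of each match.
    public static List<String> extractFields(String xml, String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    // Index the fragment under every value the XPath expression selects.
    public void add(long fragmentId, String xml) throws Exception {
        for (String key : extractFields(xml, fieldPath)) {
            index.computeIfAbsent(key, k -> new ArrayList<>()).add(fragmentId);
        }
    }

    // A lookup returns fragment identifiers, never nodes.
    public List<Long> find(String key) {
        return index.getOrDefault(key, Collections.emptyList());
    }
}
```

Constructed with /person/last-name and given the document above under some identifier, find("Gutierrez") would return that identifier, while find("Alan") would return nothing, since first-name is not part of the index.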

This doesn’t mean that when you query the bin that contains the fragment, you will receive an element with the name last-name. You’ll receive a list of fragments. If you want to extract nodes from those fragments, such as the element with the name last-name, then you’ll have to navigate the fragments by another means.

In fact, the query will return fragment identifiers, so that you can run the query without deserializing the fragments. The fragments could be indexed using XOM and XPath, for example, but deserialized using JiBX into a Java “Person” object.
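To make the separation concrete, here is a small sketch in which the query touches only the index and the caller decides later how to deserialize. Everything in it is an assumption for illustration: LazyFragmentLookup, its fields, and the toy parsePerson method (plain string slicing standing in for a real binding tool such as JiBX) are hypothetical. The reads counter is there only to show that querying does not load any fragment.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: queries yield identifiers; deserialization is a separate, pluggable step.
public class LazyFragmentLookup {
    // Stand-in for the object a binding tool might produce.
    public static final class Person {
        public final String lastName;
        public Person(String lastName) {
            this.lastName = lastName;
        }
    }

    // Serialized fragments keyed by fragment identifier.
    public final Map<Long, String> store = new HashMap<>();
    // Index from last name to fragment identifiers.
    public final Map<String, List<Long>> byLastName = new HashMap<>();
    // Counts how many fragments have actually been read and deserialized.
    public int reads = 0;

    // Consults only the index; no fragment is read.
    public List<Long> query(String lastName) {
        return byLastName.getOrDefault(lastName, new ArrayList<>());
    }

    // Reads one fragment and hands it to whatever deserializer the caller prefers.
    public <T> T load(long id, Function<String, T> deserializer) {
        reads++;
        return deserializer.apply(store.get(id));
    }

    // Toy deserializer: slices out the last-name text. A real one would use a binding library.
    public static Person parsePerson(String xml) {
        int start = xml.indexOf("<last-name>") + "<last-name>".length();
        int end = xml.indexOf("</last-name>");
        return new Person(xml.substring(start, end));
    }
}
```

A caller would run query("Gutierrez"), get back a list of identifiers, and only then call load(id, LazyFragmentLookup::parsePerson) for the fragments it actually wants as objects.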

At first, because I imagined a seamless integration with Saxon, it seemed necessary to index individual nodes.

Question: Is it possible to create an index of individual nodes by running the query used to create the fragment index against the returned fragments? Is it desirable?

Question: What happens when a fragment is duplicated in an index because a list property is indexed by individual element? Let’s say, a list of favorite colors indexes a person by both blue and red.
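One way to see what is at stake is a toy multi-valued index. This is not an answer to the question, only a sketch of the situation under the assumption that the index is a sorted map from key to fragment identifiers. MultiValueIndexSketch and its methods are hypothetical names. Walking the whole index reports a fragment once per key it was filed under, and collapsing the repeats is a separate decision.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch: one fragment filed under several keys of the same index.
public class MultiValueIndexSketch {
    public final SortedMap<String, List<Long>> index = new TreeMap<>();

    // File the fragment under every key in the list, e.g. each favorite color.
    public void add(long fragmentId, List<String> keys) {
        for (String key : keys) {
            index.computeIfAbsent(key, k -> new ArrayList<>()).add(fragmentId);
        }
    }

    // Walk every key in sorted order; a fragment shows up once per key it was filed under.
    public List<Long> scan() {
        List<Long> ids = new ArrayList<>();
        for (List<Long> bucket : index.values()) {
            ids.addAll(bucket);
        }
        return ids;
    }

    // Same walk, with repeats collapsed and first-seen order kept.
    public List<Long> scanDistinct() {
        return new ArrayList<>(new LinkedHashSet<>(scan()));
    }
}
```

With a person indexed under both blue and red, scan() returns that person's identifier twice and scanDistinct() returns it once. Which of the two a query should do is exactly the open question.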
