Introduction to Quarks

Definition

Quarks are basically representations of existents, be they entities, relationships, attributes, states, actions, etc. Most classification schemes and computer disciplines so far, especially the database community, divide their domain into entities and either attributes or relationships. Note that in a sense, the choice of representation is arbitrary. If you want, you can model, say, a blue box as:

an entity of type “box”, with a “blue” attribute of type “color”
an entity of type “box”, with a “has attribute” relationship to the “blue” singleton entity of type “color”
an entity of type “box”, with a “color” relationship to the “blue” singleton entity
an entity with a “type” relationship to “box” and a “color” relationship to the “blue” singleton entity
an entity plus the “type” entity plus the “box-type” entity, and
the same entity plus the “has-single-color” entity plus the “blue” entity
etc.

We call the entities in the last example “quarks,” and an assertion of fact expressed by three quarks a “triplet.”
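As a minimal sketch of that last representation, assume a quark can be modeled as an opaque named token and a triplet as an ordered group of three quarks. The names `Quark`, `Triplet`, `box-1` and `objects_of` are illustrative assumptions, not part of any QuarkSpace interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quark:
    """An existent; the name is only a human-readable label (hypothetical)."""
    name: str


@dataclass(frozen=True)
class Triplet:
    """An assertion of fact that relates two quarks by means of a third."""
    subject: Quark
    verb: Quark
    obj: Quark


# The blue box, expressed as two triplets about the same entity.
box_1 = Quark("box-1")  # hypothetical label for "an entity"
facts = {
    Triplet(box_1, Quark("type"), Quark("box-type")),
    Triplet(box_1, Quark("has-single-color"), Quark("blue")),
}


def objects_of(facts, subject, verb):
    """Return every quark that `subject` is related to through `verb`."""
    return {t.obj for t in facts if t.subject == subject and t.verb == verb}
```

Note that the verbs are ordinary quarks too, so nothing in this sketch treats "type" or "has-single-color" differently from "blue".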

Aristotle defined ten basic categories of thought. These include substance (entities), quantity, quality, relation, and so on. Most computational systems take some subset of these and use them as primitives. We’ve gone to the extreme, and only have one: the “quark,” or “existent.” Thus, Aristotle's Categories are implemented within QuarkSpace, not as primitives of the basic QuarkSpace Kernel.

Another crucial starting point is that to make any statement of any kind, you need two parts: a subject and a predicate. The predicate says something about the subject, i.e., it describes it. To do that, it needs to establish a relationship between the subject and some other existent. This is done with a verb and a predicate object. Hence the “triplet” construct: it relates two quarks using a third quark.

We could have chosen to have a two-quark or “couplet” model, or allow more than three quarks to be strung together. However, in a “couplet” model, one of the quarks would have had to be a composite quark consisting of a verb-object combination. More than three quarks could have been used, to provide more direct support for such constructs as context, quantity, ordinal, storage device, version, modification time, etc., but they are not universally applicable, and we chose the most generic approach possible. This way, we get a simple model which can be generically optimized, without lots of artificial boundaries between what is contained in the Kernel and what is layered on top of it.

Quarks will take a while to get used to, but we hope you’ll come to see the power of their simplicity and generality.

Example

As a first example, let’s expand on the example presented in Why Quarks? It is often desirable to use multiple “orthogonal,” i.e., unrelated, classifications. One way to achieve this is to model attributes and relationships “raw,” rather than using a fixed categorization as a storage scheme. For example, most automated knowledge systems store data in a containment hierarchy, much like the kingdom-phylum-class-order-family-genus-species structure of biology. This provides efficient storage of and access to this view of the data. When dealing with other views of the data, however, such programs have to jump through hoops to simulate alternative hierarchical views of the same data. In QuarkSpace, we treat data as essentially flat or arbitrarily nested, however you like to see it. The hierarchies used in interfaces, such as a file and directory hierarchy, become simply “views” onto the underlying data, which supports any number of different views. These views are not simply a way to format data hierarchically, but also a way to enter hierarchical data and transform it into quarks. Typical low-level views will convert XML data, for example, which is hierarchical, to and from QuarkSpace, or to and from Entity-Attribute-like databases, such as address books. Views involve code that does the transformation, but very generic view handlers will be able to take “schema” descriptions, expressed in Quark, that allow you to produce specific views simply by describing them in data, rather than coding them yourself.
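To illustrate the idea of a hierarchy as a view over flat data, here is a small sketch. It assumes triplets can be held as plain (subject, verb, object) tuples and that the chosen verb forms no cycles. The data, the verb names and the `hierarchy_view` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical flat data; file names and verbs are illustrative only.
TRIPLETS = [
    ("notes.txt", "is-contained-in", "docs"),
    ("todo.txt", "is-contained-in", "docs"),
    ("docs", "is-contained-in", "home"),
    ("notes.txt", "is-about", "dolphins"),
    ("todo.txt", "is-about", "chores"),
]


def hierarchy_view(triplets, verb, root):
    """Nest each quark under the quark it is related to by `verb`.

    Assumes the relation expressed by `verb` is acyclic.
    """
    children = defaultdict(list)
    for subject, v, obj in triplets:
        if v == verb:
            children[obj].append(subject)

    def build(node):
        # Sort children so the resulting view is deterministic.
        return {child: build(child) for child in sorted(children.get(node, []))}

    return {root: build(root)}


# Two different views onto the same underlying triplets.
by_folder = hierarchy_view(TRIPLETS, "is-contained-in", "home")
by_topic = hierarchy_view(TRIPLETS, "is-about", "dolphins")
```

The stored triplets never change; only the verb chosen as the nesting relation does, which is what lets several unrelated hierarchies coexist.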

To show you more concretely what quarks, triplets and views look like, let’s look at the dolphin example from Why Quarks? in some detail. Here are some triplets that describe dolphins as swimmers and mammals. The full context, which would be quite extensive, is not presented here. Only the essential building blocks are listed; in principle, reapplying them would allow a complete QuarkSpace describing the animal kingdom to be built.

<dolphins-class>   <swim-in>             <water>
<dolphin>          <is-subtype-of>       <mammal>

<dolphins-class>   <is-the-class-of>     <dolphin> 
<mammals-class>    <is-the-class-of>     <mammal> 
<dolphins-class>   <is-subset-of>        <mammals-class>
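The triplets above can be queried with a simple pattern match. The following is only a sketch: it assumes triplets are stored as tuples of quark names, and `match` is a hypothetical helper, not a QuarkSpace function.

```python
# The dolphin triplets as (subject, verb, object) tuples of quark names.
# The class of mammals is spelled "mammals-class" throughout.
TRIPLETS = [
    ("dolphins-class", "swim-in", "water"),
    ("dolphin", "is-subtype-of", "mammal"),
    ("dolphins-class", "is-the-class-of", "dolphin"),
    ("mammals-class", "is-the-class-of", "mammal"),
    ("dolphins-class", "is-subset-of", "mammals-class"),
]


def match(triplets, subject=None, verb=None, obj=None):
    """Return the triplets that fit the pattern; None acts as a wildcard."""
    pattern = (subject, verb, obj)
    return [
        t for t in triplets
        if all(p is None or p == x for p, x in zip(pattern, t))
    ]


# Everything asserted about the class of dolphins.
about_dolphins_class = match(TRIPLETS, subject="dolphins-class")

# Every pairing of a class with the type it is the class of.
class_pairs = match(TRIPLETS, verb="is-the-class-of")
```

Because any of the three positions can be left open, the same function answers "what do we know about X?", "what is related by verb V?" and "what points at Y?".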

Note that in normal thought and language, we speak of classes of things as if what we say about them applies to all instances of the type without exception. In fact, however, what we say usually applies to those instances “always or for the most part,” in Aristotle’s words. Exceptions of a certain type are allowed. For example, all human beings have two hands with five fingers each. This is part of our genetic heritage. Exceptions happen, however, due to mutations and physical accidents during life. In Quark, we therefore make a distinction between <all-x> and <the-x-class>: <the-human-being-class> has two hands with five fingers each, but it is not true that <all-human-beings> do.
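One possible reading of this distinction, sketched below, is that assertions about the class act as defaults that individual instances may override. This is an assumption made for illustration; the quark names, the dictionaries and both functions are hypothetical.

```python
# Hypothetical quark names throughout.
# What is asserted of the class "for the most part".
CLASS_FACTS = {("the-human-being-class", "has-finger-count"): 10}

# Which class each instance belongs to.
INSTANCE_OF = {
    "alice": "the-human-being-class",
    "bob": "the-human-being-class",
}

# Exceptions recorded for individual instances.
INSTANCE_FACTS = {("bob", "has-finger-count"): 9}


def lookup(instance, verb):
    """Prefer a fact about the instance; otherwise fall back to its class."""
    if (instance, verb) in INSTANCE_FACTS:
        return INSTANCE_FACTS[(instance, verb)]
    return CLASS_FACTS.get((INSTANCE_OF[instance], verb))


def holds_for_all(cls, verb):
    """True only if no instance of `cls` deviates from the class-level fact."""
    expected = CLASS_FACTS[(cls, verb)]
    return all(
        lookup(instance, verb) == expected
        for instance, c in INSTANCE_OF.items()
        if c == cls
    )
```

In this sketch the class-level statement stays true and useful even though the corresponding statement about all instances is false.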

Note that the concepts involved in this example belong to a rather advanced stage of knowledge. This is essential to a highly sophisticated knowledge base. But, it is equally important to note that QuarkSpace is designed explicitly to not require sophisticated classification, but to allow the evolution of classification schemes.

A child, while growing up, will start with very crude classification schemes. He may even use somewhat “silly” classification schemes as mnemonics, or even because he doesn’t know any better. For example, it would not be totally unreasonable to suppose a child had the following classification scheme:

<flipper>          <swim-in>             <water>
<mommy>            <swim-in>             <water>
<flipper>          <does>                <sing>
<mommy>            <does>                <sing>
<singer-swimmer>   <is-subtype-of>       <animals>
<flipper>          <is-instance-of>      <singer-swimmer>
<daddy>            <is-instance-of>      <bearded-animal>

Having such a scheme doesn’t prevent a more sophisticated one from evolving on top of it. But if the child grew into an adult without revising or suppressing his earlier classification, he would have definite problems. One way that the human mind deals with evolving knowledge is by going through spirals of learning, where old knowledge is revisited from a new, broader context. We often learn a new way to look at the world, and it feels like we’ve lost the knowledge we had in an earlier perspective. We then go back through the old subject matter and relate it to our new context, and only then do we have ready access to the old knowledge. We don’t exactly lose the old knowledge. It is still there, although retrieving it from memory has become complex, because it does not have associations to our new context. By forcing ourselves to go over the old knowledge, we connect it to our new context, and maybe revise it, or simply integrate it with new knowledge. Similarly, QuarkSpace will often be used to create new “versions” of abstract quarks, and old quarks that were connected to the old versions are still retrievable, only not as easily. When retrieval is accomplished, there is an opportunity to make new connections to the new versions.

Another feature of QuarkSpace is that it will use statistics of usage to optimize storage and retrieval schemes. If quarks and triplets are accessed often, they’ll be prioritized in future searches. The search paths will burn trails, so to speak, making them easier to traverse with time. Less used information will be migrated to storage that is slower to access. Clients will be able to specify the amount of effort that goes into a search, and a search can be made as an iteration incorporating effort feedback.
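A toy sketch of usage-driven search might look like the following. It is not the QuarkSpace design: the `TrailStore` class, the meaning of `effort` (the number of triplets a search may examine) and the ranking rule are all assumptions made for illustration.

```python
from collections import Counter


class TrailStore:
    """Toy triplet store that favors frequently retrieved triplets."""

    def __init__(self, triplets):
        self.triplets = list(triplets)
        self.hits = Counter()  # how often each triplet has been returned

    def search(self, verb, effort):
        """Examine at most `effort` triplets, most-used first."""
        # sorted() is stable, so equally used triplets keep insertion order.
        ordered = sorted(self.triplets, key=lambda t: -self.hits[t])
        found = [t for t in ordered[:effort] if t[1] == verb]
        self.hits.update(found)  # burn the trail for next time
        return found


store = TrailStore([
    ("a", "x", "b"),
    ("c", "y", "d"),
    ("e", "y", "f"),
])

first = store.search("y", effort=1)   # low effort: only one triplet examined
second = store.search("y", effort=3)  # client retries with more effort
third = store.search("y", effort=1)   # low effort again, after the trail is burned
```

The retry with a larger `effort` stands in for the iterative search with effort feedback; after it, the previously found triplets rank first, so even a low-effort search reaches one of them.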

I mentioned using “silly” classification schemes as mnemonics. This is a practice most of us use, child and adult alike. Ayn Rand called it thinking in headlines. I think of it as classification by example combined with the mnemonic benefit of exaggeration (although I use the “thinking in headlines” expression as a headline for this concept…). For example, we’ve all heard of good-guy bad-guy analyses of stories. We know it’s sometimes an exaggeration, but it’s a short way of representing a more complex classification. Or, we think of the football team playing our home team as “fools,” “geriatrics,” or whatnot. These are akin to metaphors used as classification. Mentally, we know that it’s not exactly what we mean, but we use these concepts in our internal thinking because they make it easier to think in terms of essentials and to remember things clearly, because it’s fun, to relieve frustration, etc. Traditional classification schemes require a fuddy-duddy, anal-retentive attitude of absolutely correct terminology and precise classification according to the one correct hierarchy. Quarks, on the other hand, let you work with classification schemes that are in transition, have various degrees of clarity in different parts, use metaphors, etc. Such things can be dealt with through the use of views. You can even take data that encodes a strict hierarchy and view it in your own private metaphorical classification scheme. If someone developed a view that encoded natural language into underlying quarks, you could create another view that expresses the same message using your own personal classification scheme. If a news article, say, were encoded by the author using quarks, it could be interpreted using your own set of views.