Your Data Model is defined in a plain Scala file that you can have anywhere in your project. Molecule interprets this file and creates Molecule boilerplate code based on your Data Model so that you can write molecule queries with your domain terms.
For Molecule to recognize your Data Model, you import a small Data Model definition DSL. Your definitions go in a Scala object named after your domain with “DataModel” appended:
package path.to.your.project
import molecule.core.data.model._ // Data Model DSL

object YourDomainDataModel { // Name ending in "DataModel"
  // ... Attribute definitions within Namespaces
}
Next, attributes need to be defined within Namespaces.
An attribute is the smallest unit of information of a Data Model. It’s like a field in a database, a property, or, well, an “attribute” of something.
Attributes are organised within “namespaces” that semantically group related attributes. In SQL terms, a namespace would correspond to a table and its attributes to the fields of that table. With Datomic, however, a namespace acts more like a “prefix” to the attribute name.
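To make the “prefix” idea concrete, here is a minimal sketch in plain Scala. The `AttributeIdent` helper is hypothetical and not part of Molecule. It assumes the usual Datomic keyword form `:namespace/attribute`; the exact casing Molecule emits in its generated schema is an assumption here.

```scala
// Hypothetical helper, not part of Molecule: shows how a namespace
// acts as a prefix to an attribute name in a Datomic-style ident.
object AttributeIdent {
  /** Builds a keyword-style ident such as ":Community/name". */
  def ident(namespace: String, attribute: String): String =
    s":$namespace/$attribute"
}
```

For example, `AttributeIdent.ident("Community", "name")` gives `":Community/name"`, whereas in SQL the same piece of data would be addressed as a `name` field in a `Community` table.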
Let’s look at an example Data Model from the Seattle tutorial:
package path.to.your.project
import molecule.core.data.model._ // import Data Model DSL
@InOut(2, 8)
object SeattleDataModel {

  trait Community {
    val name = oneString.fulltext.doc("A community's name") // optional doc text
    val url = oneString
    val category = manyString.fulltext
    val orgtype = oneEnum("community", "commercial", "nonprofit", "personal")
    val tpe = oneEnum("email_list", "twitter", "facebook_page" /* more..*/).alias("type")
    val neighborhood = one[Neighborhood]
  }

  trait Neighborhood {
    val name = oneString
    val district = one[District]
  }

  trait District {
    val name = oneString
    val region = oneEnum("n", "ne", "e", "se", "s", "sw", "w", "nw")
  }
}
Since `type` is a reserved word in Scala, we choose another name (`tpe`) and make an alias called “type”. This allows for accessing imported data that uses `type` as an attribute name.
The `@InOut(2, 8)` arity annotation at the top instructs the generated boilerplate code to support molecules with up to 2 input attributes and up to 8 “output” attributes.
While developing your Data Model, you might set the first arity parameter (for input attributes) to `0`. Later, when your model stabilizes, you can enable input molecules by setting it to 1, 2 or 3 (the maximum). Parameterized input attributes can be a performance optimization, since using input values in queries allows Datomic to cache the query.
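As a sketch of that early-development setting, the fragment below uses only the DSL elements shown in the Seattle example, with the first arity parameter set to `0`. It is a Data Model definition rather than a standalone program, so it only compiles inside a project that has Molecule set up.

```scala
package path.to.your.project
import molecule.core.data.model._ // Data Model DSL

// Early development: no input molecules (0), up to 8 output attributes.
// Raise the first parameter to 1, 2 or 3 later to enable input molecules.
@InOut(0, 8)
object SeattleDataModel {
  trait Community {
    val name = oneString
    val url  = oneString
  }
}
```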
The second arity annotation parameter sets the maximum number of attributes a molecule can have. It doesn’t affect how many attributes you can define in each namespace in the Data Model. The maximum value for this parameter, and so the maximum arity of a molecule, is 22, the same as for tuples.
If you at some point need to make molecules with more than 22 attributes you can use composite molecules.
If your Data Model gets big, you can use an extra layer of organization with “Partitions” that encapsulate multiple related Namespaces.
A “Partition” in Molecule is used as a conceptual term and not in the traditional sense as a physical database partition. The whole Data Model should be regarded as a conceptual model that can be projected later onto various databases. A database admin might choose to partition a database according to our conceptual partitions and in that case the terms would correlate.
We could for instance group some generic namespaces (and their respective attributes) in a `gen` partition, and some namespaces about literature in a `lit` partition:
@InOut(3, 22)
object BookstoreDataModel {

  object gen {
    trait Person {
      val name = oneString
      val gender = oneEnum("male", "female")
    }
    // ..more namespaces in the `gen` partition
  }

  object lit {
    trait Book {
      val title = oneString
      val author = one[gen.Person] // ref to namespace in other partition
      val publisher = one[Publisher] // ref to namespace in this partition
      val cat = oneEnum("good", "bad")
    }
    trait Publisher {
      val name = oneString
    }
    // ..more namespaces in the `lit` partition
  }
}
Each partition can contain as many namespaces as you want.
Partition names have to be in lowercase. Each partition name is prepended to the namespaces it contains, with an underscore in between:
lit_Book.title.cat.Author.name.gender.get.map(_ ==> ...)
Since `Author` is already defined as a related namespace, we don’t need to prepend the partition name there.
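The naming rule can be sketched in plain Scala as follows. `PartitionNaming` is a hypothetical helper written for illustration only; it is not how Molecule itself generates names.

```scala
// Hypothetical helper, not part of Molecule: applies the naming rule
// "lowercase partition name + underscore + namespace name".
object PartitionNaming {
  def prefixed(partition: String, namespace: String): String = {
    // Partition names have to be lowercase
    require(
      partition == partition.toLowerCase,
      s"Partition name must be lowercase: $partition"
    )
    s"${partition}_$namespace"
  }
}
```

Here `PartitionNaming.prefixed("lit", "Book")` gives `"lit_Book"`, the name a molecule starts from, while `PartitionNaming.prefixed("Lit", "Book")` throws an `IllegalArgumentException`.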
In the Seattle example we saw how attributes are defined by assigning various DSL settings to a named variable:
- `oneString`, `manyString` etc. define the cardinality and type of an attribute.
- `oneEnum`/`manyEnum` define enumerated values (pre-defined words).
- `one[<ReferencedNamespace>]` defines a reference to another namespace.

We can define the following types of attributes:
Cardinality-one             Cardinality-many                  Mapped cardinality-many
-------------------------   -------------------------------   --------------------------------------
oneString    : String       manyString    : Set[String]       mapString    : Map[String, String]
oneInt       : Int          manyInt       : Set[Int]          mapInt       : Map[String, Int]
oneLong      : Long         manyLong      : Set[Long]         mapLong      : Map[String, Long]
oneDouble    : Double       manyDouble    : Set[Double]       mapDouble    : Map[String, Double]
oneBoolean   : Boolean      manyBoolean   : Set[Boolean]      mapBoolean   : Map[String, Boolean]
oneDate      : Date         manyDate      : Set[Date]         mapDate      : Map[String, Date]
oneUUID      : UUID         manyUUID      : Set[UUID]         mapUUID      : Map[String, UUID]
oneURI       : URI          manyURI       : Set[URI]          mapURI       : Map[String, URI]
oneBigInt    : BigInt       manyBigInt    : Set[BigInt]       mapBigInt    : Map[String, BigInt]
oneBigDecimal: BigDecimal   manyBigDecimal: Set[BigDecimal]   mapBigDecimal: Map[String, BigDecimal]
oneEnum      : String       manyEnum      : Set[String]
Due to limitations in JavaScript, some `Float` precision is lost on the js platform. Please use `Double` instead to ensure safe double precision.
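The note does not say exactly where the precision is lost. As a general illustration only, the sketch below shows that a 32-bit `Float` and a 64-bit `Double` written with the same decimal literal are different numbers, which matters on a platform like JavaScript where numbers are 64-bit doubles. The `FloatPrecision` object is hypothetical.

```scala
// Illustration only: a Float widened to Double is not the same number
// as the Double written with the same decimal literal.
object FloatPrecision {
  val asFloat: Float   = 0.1f
  val widened: Double  = asFloat.toDouble // the Float's value seen as a Double
  val asDouble: Double = 0.1

  val differs: Boolean = widened != asDouble
}
```

`FloatPrecision.differs` is `true`, which is one reason to model such values with `oneDouble` rather than a `Float`-based type.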
Cardinality-one attributes can have one value per entity.
Cardinality-many attributes can have a `Set` of unique values per entity. Often we instead choose to model many-values as a many-reference to another entity that could have more than one attribute.
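A plain-Scala picture of the difference, using the `name` (`oneString`) and `category` (`manyString`) attributes of the Seattle `Community` namespace. The case class and the sample values are hypothetical and only mirror the Scala types from the table above.

```scala
// Hypothetical in-memory picture of one Community entity.
object CardinalityDemo {
  final case class Community(
    name: String,          // cardinality-one: a single value
    category: Set[String]  // cardinality-many: a Set of unique values
  )

  // The duplicate "news" collapses because Set keeps unique values only.
  val community: Community =
    Community("Sample community", Set("news", "events", "news"))
}
```

`CardinalityDemo.community.category` holds two values, `"news"` and `"events"`.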
Mapped cardinality-many attributes are a special Molecule variation based on cardinality-many attributes. Read more here…
References are also treated like attributes. A reference points to one or many entities. We define such a relationship by supplying the referenced namespace as the type parameter to `one`/`many`:
Cardinality one        Cardinality many
--------------------   ---------------------
one[<Ref-namespace>]   many[<Ref-namespace>]
In the example above we saw a reference from Community to Neighborhood defined as `one[Neighborhood]`. We would, for instance, likely define an Order/OrderLine relationship in an Order namespace as `many[OrderLine]`.
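The Order/OrderLine relationship could be sketched as a Data Model like the one below. The object name `OrderDataModel` and every attribute name are hypothetical; only `many[OrderLine]` comes from the text. As a Data Model definition, it only compiles inside a project that has Molecule set up.

```scala
package path.to.your.project
import molecule.core.data.model._ // Data Model DSL

@InOut(0, 8)
object OrderDataModel { // hypothetical domain name
  trait Order {
    val orderNumber = oneString        // hypothetical attribute
    val lines       = many[OrderLine]  // cardinality-many reference
  }
  trait OrderLine {
    val product  = oneString           // hypothetical attribute
    val quantity = oneInt              // hypothetical attribute
  }
}
```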
Some specialized reference definitions for bidirectional graphs are explained in Bidirectional relationships.
In Datomic, each attribute can have some extra options:
| Option | Indexes | Description |
|---|---|---|
| doc | | Attribute description. |
| uniqueValue | ✔ | Attribute value is unique to each entity. Attempts to insert a duplicate value for a different entity id will fail. |
| uniqueIdentity | ✔ | Attribute value is unique to each entity and “upsert” is enabled. Attempts to insert a duplicate value for a temporary entity id will cause all attributes associated with that temporary id to be merged with the entity already in the database. |
| indexed | ✔ | Generated index for this attribute. By default all attributes are set with the indexed option automatically by Molecule, so you don’t need to set this. |
| fulltext | ✔ | Generate eventually consistent fulltext search index for this attribute. |
| isComponent | ✔ | Specifies that an attribute whose type is :db.type/ref is a component. Referenced entities become subcomponents of the entity to which the attribute is applied. When you retract an entity with :db.fn/retractEntity, all subcomponents are also retracted. When you touch an entity, all its subcomponent entities are touched recursively. |
| noHistory | | Whether past values of an attribute should not be retained. |
Datomic indexes the values of all attributes having an option except for the `doc` and `noHistory` options.
We saw examples of adding options when we added `fulltext` to some of the attributes in the Seattle definition above. Molecule’s schema definition DSL lets you choose only the options allowed for a given attribute type.
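A sketch of how several options might be combined in one Data Model. It assumes each option in the table is applied as a DSL method in the same way `fulltext` and `doc` are in the Seattle example. The `AccountDataModel` object and all attribute names are hypothetical, and the fragment only compiles inside a project that has Molecule set up.

```scala
package path.to.your.project
import molecule.core.data.model._ // Data Model DSL

@InOut(0, 8)
object AccountDataModel { // hypothetical domain name
  trait Account {
    // Unique with upsert semantics, plus a description
    val email    = oneString.uniqueIdentity.doc("Login email")
    // Past values are not retained
    val password = oneString.noHistory
    // Fulltext search index
    val bio      = oneString.fulltext
    // Address is retracted together with its Account
    val address  = one[Address].isComponent
  }
  trait Address {
    val street = oneString
  }
}
```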