Molecules are built by chaining attributes together with the builder pattern. Molecule then transforms the constructed molecule into a query string at compile time. Molecules can be constructed explicitly with the m method, but generally the implicit call is used.
We could for instance build a molecule representing the data structure of Persons with name, age and gender Attributes:
// Explicit `m` macro call
m(Person.name.age.gender).get
// Implicit `m` macro call
Person.name.age.gender.get
The fundamental building blocks are Namespaces like Person and Attributes like name, age and gender. Namespaces are simply prefixes to Attribute names to avoid name clashes and to group our Attributes in meaningful ways according to our domain.
As you see we start our molecule from some Namespace and then build on Attribute by Attribute.
The size of a molecule is limited to Scala’s arity limit of 22 for tuples.
But we can create a composite molecule with up to 22 x 22 = 484 attributes!
The attributes name, age and gender that we saw above are typical cardinality-one attributes, each with one value.
Datomic also has cardinality-many attributes that hold a Set of values. This means that the same value cannot be saved multiple times; only unique values are saved. An example could be a cardinality-many attribute hobbies of a Person:
Person.name.hobbies.get.head === ("John", Set("Trains", "Chess"))
In the Update section of CRUD we will see how multiple values are managed with Molecule.
attr
When we use a molecule to query the Datomic database we ask for entities having all our Attributes associated with them.
Note that this is different from selecting rows from a SQL table, where you can also get null values back!
If for instance we have entities representing Persons in our data set that haven’t got any age Attribute associated with them, then this query will not return those entities:
val persons = Person.name.age.get
Basically we look for matches to our molecule data structure.
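The matching rule can be pictured with a minimal plain-Scala sketch. This is not Molecule code and not how Datomic stores data; the PersonEntity class and the sample values are hypothetical and only model "an entity is returned when every attribute in the molecule is asserted":

```scala
// Hypothetical in-memory model: None means "attribute not asserted".
final case class PersonEntity(name: Option[String], age: Option[Int])

val personEntities = List(
  PersonEntity(Some("John"), Some(24)),
  PersonEntity(Some("Lisa"), None), // no age asserted
  PersonEntity(Some("Ben"), Some(40))
)

// Models Person.name.age.get: only entities with BOTH attributes match.
val matchedPersons: List[(String, Int)] =
  personEntities.collect { case PersonEntity(Some(n), Some(a)) => (n, a) }
```

In this model Lisa is simply absent from the result, whereas a SQL row for her would come back with a null age.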
attr_
Sometimes we want to grab entities that we know have certain attributes, but without returning those values. We call the un-returning attributes “tacit attributes”.
If for instance we wanted to find all names of Persons that have an age attribute set but we don’t need to return those age values, then we can add an underscore _ after the age Attribute:
val names = Person.name.age_.get
This will return names of person entities having both a name and an age Attribute set. Note from the type signatures how the age values are no longer returned:
val persons: List[(String, Int)] = Person.name.age.get
val names : List[String] = Person.name.age_.get
This way we can switch on and off individual attributes from the result set without affecting the data structures we look for.
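As a rough plain-Scala picture of a tacit attribute (again a hypothetical model, not Molecule's implementation; TacitPerson and its data are made up for illustration), the attribute still takes part in the match but is dropped from the returned value:

```scala
// Hypothetical in-memory model: None means "attribute not asserted".
final case class TacitPerson(name: Option[String], age: Option[Int])

val tacitData = List(
  TacitPerson(Some("John"), Some(24)),
  TacitPerson(Some("Lisa"), None), // no age asserted
  TacitPerson(Some("Ben"), Some(40))
)

// Models Person.name.age_.get: age must be present, but is not returned.
val namesWithAge: List[String] =
  tacitData.collect { case TacitPerson(Some(n), Some(_)) => n }
```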
attr$
If an attribute value is only sometimes set, we can ask for its optional value by adding a dollar sign $ after the attribute:
val names: List[(String, Option[String], String)] = Person.firstName.middleName$.lastName.get
That way we can get all person names with or without middleNames. As you can see from the return type, the middle name is wrapped in an Option.
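The effect of an optional attribute can be sketched in plain Scala. The NameEntity class and the sample names below are assumptions made for illustration; the point is only that the optional attribute never excludes an entity, while the mandatory ones still do:

```scala
// Hypothetical in-memory model: None means "attribute not asserted".
final case class NameEntity(
  firstName: Option[String],
  middleName: Option[String],
  lastName: Option[String]
)

val nameData = List(
  NameEntity(Some("John"), Some("Quincy"), Some("Adams")),
  NameEntity(Some("Lisa"), None, Some("Jones")),   // no middle name: still returned
  NameEntity(None, Some("Lee"), Some("Smith"))     // no first name: not returned
)

// Models Person.firstName.middleName$.lastName.get
val fullNames: List[(String, Option[String], String)] =
  nameData.collect { case NameEntity(Some(f), m, Some(l)) => (f, m, l) }
```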
Mapped values can be saved with mapped attributes in Molecule. It’s a special Molecule construct that makes it easy to save for instance multi-lingual data without having to create language-variations of each attribute. But they can also be used for any other key-value indexed data.
Say you want to save the names of famous Persons in multiple languages. Then you could use a mapString:
// In definition file
val name = mapString
// Insert mapped data
Person.id.name.insert(
1,
Map(
"en" -> "Dmitri Shostakovich",
"de" -> "Dmitri Schostakowitsch",
"fr" -> "Dmitri Chostakovitch",
"es" -> "Dmitri Shostakóvich"
)
)
// Retrieve mapped data
Person.id.name.get.head === (1,
Map(
"en" -> "Dmitri Shostakovich",
"de" -> "Dmitri Schostakowitsch",
"fr" -> "Dmitri Chostakovitch",
"es" -> "Dmitri Shostakóvich"
)
)
Molecule concatenates the key and value of each pair to one of several values of an underlying cardinality-many attribute. When data is then retrieved Molecule splits the concatenated string into a typed pair. This all happens automatically and lets us focus on their use in our code.
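The concatenate-and-split idea can be sketched in a few lines of plain Scala. The "@" separator and the encode/decode helper names are assumptions chosen for illustration, not a statement of Molecule's internal format:

```scala
// Assumed separator between key and value; keys must not contain it.
val separator = "@"

// Turn each key/value pair into one string of a cardinality-many Set.
def encode(pairs: Map[String, String]): Set[String] =
  pairs.map { case (k, v) => k + separator + v }.toSet

// Split each stored string at the first separator to recover the pair.
def decode(values: Set[String]): Map[String, String] =
  values.map { s =>
    val idx = s.indexOf(separator)
    s.take(idx) -> s.drop(idx + separator.length)
  }.toMap
```

Because the split happens at the first separator, values may themselves contain the separator character in this sketch, while keys may not.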
We can apply values to Attributes in order to filter the data structures we are looking for. We could for instance find people who like pizza:
Person.likes.apply("pizza")
or simply
Person.likes("pizza")
Since the applied value “pizza” ensures that every returned likes value is “pizza”, we get redundant information back for the likes attribute (“pizza” is returned for all persons):
Person.name.likes("pizza").get === List(
("John", "pizza"),
("Ben", "pizza")
)
This is an idiomatic place to use a tacit attribute likes_ to say “Give me names of persons that like pizza” without returning the likes value “pizza” over and over again. Then we get a nice list of only the pizza likers:
Person.name.likes_("pizza").get === List(
"John", "Ben"
)
Note that since we get an arity-1 result back it is simply a list of those values.
We can apply OR-logic to find a selection of alternatives:
Person.age(40 or 41 or 42)
// .. same as
Person.age(40, 41, 42)
// .. same as
Person.age(List(40, 41, 42))
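All three forms describe the same set of accepted values. A minimal plain-Scala sketch of that equivalence (not Molecule code; ageValues and wantedAges are illustrative names):

```scala
val ageValues = List(39, 40, 41, 42, 43)

// "40 or 41 or 42" spelled out as boolean alternatives ...
val viaOr = ageValues.filter(a => a == 40 || a == 41 || a == 42)

// ... is the same as membership in a set of accepted values.
val wantedAges = Set(40, 41, 42)
val viaSet = ageValues.filter(wantedAges)
```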
If we add the fulltext option to a String attribute definition Datomic will index the text strings saved so that we can do fulltext searches across all values. We could for instance search for Community names containing the word “Town” in their name:
Community.name.contains("Town")
Note that only full words are considered, so “Tow” won’t match. Searches are case-insensitive.
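The whole-word, case-insensitive behaviour can be approximated in plain Scala as follows. This is a simplified model with hypothetical helper names (words, containsWord); the real index is built by Datomic and its tokenization may differ in detail:

```scala
// Split a text into lower-cased words on anything that is not a letter or digit.
def words(text: String): Set[String] = {
  val tokens: Set[String] =
    text.toLowerCase.split("[^\\p{L}\\p{N}]+").filter(_.nonEmpty).toSet
  tokens
}

// A search word matches only if it equals a whole word, ignoring case.
def containsWord(text: String, word: String): Boolean =
  words(text).contains(word.toLowerCase)
```

With this model, containsWord("Downtown Town Hall", "town") matches on the word "Town", while "Tow" matches nothing because it is only a fragment of a word.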
Also, the following common words are not considered:
"a", "an", "and", "are", "as", "at", "be", "but", "by",
"for", "if", "in", "into", "is", "it",
"no", "not", "of", "on", "or", "such",
"that", "the", "their", "then", "there", "these",
"they", "this", "to", "was", "will", "with"
We can exclude a certain attribute value like in “Persons that are not 42 years old”:
Person.age.!=(42)
// or
Person.age.not(42)
We can also negate multiple values:
Person.age.!=(40 or 41 or 42)
Person.age.!=(40, 41, 42)
Person.age.!=(List(40, 41, 42))
We can filter attribute values that satisfy comparison expressions:
Person.age.<(42)
Person.age.>(42)
Person.age.<=(42)
Person.age.>=(42)
Comparisons of all types are performed with Java’s compareTo method. Text strings can, for instance, also be compared against a letter:
Community.name.<("C").get(3) === List(
"ArtsWest", "All About South Park", "Ballard Neighbor Connection")
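What a compareTo-based string comparison means can be checked in plain Scala. The list below is sample data for illustration only (the last two names are hypothetical additions), and the filter mimics the "<" expression rather than calling Molecule:

```scala
val communityNames = List(
  "ArtsWest",
  "All About South Park",
  "Ballard Neighbor Connection",
  "Capitol Hill Community Council", // hypothetical extra entry
  "Delridge Produce Cooperative"    // hypothetical extra entry
)

// A name is "less than C" when compareTo returns a negative number.
val beforeC = communityNames.filter(_.compareTo("C") < 0)

// compareTo is case-sensitive: lower-case letters sort after upper-case ones.
val lowerCaseSortsAfter = "alki".compareTo("C") > 0
```

Note that a name starting with a lower-case letter would not be considered less than "C" under this ordering.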
We can look for non-asserted attributes (Null values) as in “Persons that have no age asserted” by applying an empty value to an attribute:
Person.name.age_().get // names of all persons where age hasn't been asserted
Note that the age_ attribute has to be tacit (with an underscore) since we naturally can’t return missing values.
Even though Molecule introspects molecule constructions at compile time we can still use (runtime) variables for our expressions:
val youngAge = 25
val goodAge = 42
Person.age(goodAge)
Person.age.>(goodAge)
Person.age.<=(goodAge)
Person.age.>=(goodAge)
Person.age.!=(goodAge)
Person.age.!=(youngAge or goodAge)
Technically, Molecule saves the TermName of the variable (like ‘goodAge’) for later resolution at runtime so that we can freely use variables in our expressions.
For now, though, Molecule can’t evaluate arbitrary applied expressions like this one:
Person.birthday(new java.util.Date("2017-05-10"))
In this case we could instead assign the expression result to a variable and use that in the molecule:
val date = new java.util.Date("2017-05-10")
Person.birthday(date)
Molecule wraps Datomic’s native aggregate functions by applying special aggregate keyword objects to the attribute we want to aggregate on.
Aggregate functions either return a single value or a collection of values:
Applying the min or max aggregate keyword object as a value to the age attribute returns the lowest/highest age.
Person.age(min) // lowest age
Person.age(max) // highest age
min/max support all attribute types.
Person.age(sum) // sum of all ages
Count the total number of entities with an asserted age value (not to be confused with sum).
Person.age(count) // count of all persons with an age (not the sum of ages)
Count the total number of entities with asserted unique age values (not to be confused with sum).
Person.age(countDistinct) // count of unique ages
Person.age(avg) // average of all ages
Person.age(median) // median of all ages
Person.age(variance) // variance of all ages
Person.age(stddev) // standard deviation of all ages
Person.age(distinct) // distinct ages
Person.age(min(3)) // 3 lowest ages
Person.age(max(3)) // 3 highest ages
Person.age(rand(3)) // 3 random ages (with potential for duplicates)
Person.age(sample(3)) // 3 sample ages (without duplicates)
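To make the difference between these aggregates concrete, here is what the single-value ones compute over a small sample, written with plain Scala collections rather than Molecule. The sample ages are made up, and population variance and the conventional even-count median are assumed; Datomic's exact definitions are not confirmed here:

```scala
val sampleAges = List(20, 24, 24, 40)

val lowestAge      = sampleAges.min           // 20
val highestAge     = sampleAges.max           // 40
val sumOfAges      = sampleAges.sum           // 108
val countOfAges    = sampleAges.size          // 4
val countOfUniques = sampleAges.distinct.size // 3
val averageAge     = sampleAges.sum.toDouble / sampleAges.size // 27.0

// Median: middle value of the sorted ages (mean of the two middle values
// for an even count, which is an assumption of this sketch).
val sortedAges = sampleAges.sorted
val mid = sortedAges.size / 2
val medianAge: Double =
  if (sortedAges.size % 2 == 1) sortedAges(mid).toDouble
  else (sortedAges(mid - 1) + sortedAges(mid)) / 2.0

// Population variance and standard deviation (assumed definition).
val varianceOfAges = sampleAges.map(a => math.pow(a - averageAge, 2)).sum / sampleAges.size
val stddevOfAges   = math.sqrt(varianceOfAges)
```

The sample makes the count/sum distinction visible: four persons have an age, three distinct ages exist, and the ages add up to 108.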
Molecules can be parameterized by applying the input placeholder ? as a value to an attribute. The molecule then expects input for that attribute at runtime.
By assigning parameterized “Input-molecules” to variables we can re-use those variables to query for similar data structures where only some data part varies:
// 1 input parameter
val person = m(Person.name(?))
val john = person("John").get.head
val lisa = person("Lisa").get.head
Of course more complex molecules would benefit even more from this approach.
Datomic caches and optimizes queries from input molecules so performance-wise it’s a good idea to use them.
val personName = m(Person.name(?))
val johnOrLisas = personName("John" or "Lisa").get // OR
Molecules can have up to 3 ? placeholder parameters.
val person = m(Person.name(?).age(?))
val john = person("John" and 24).get.head // AND
val johnOrLisa = person(("John" and 24) or ("Lisa" and 20)).get // AND/OR
val americansYoungerThan = m(Person.name.age.<(?).Country.name("USA"))
val americanKids = americansYoungerThan(13).get
val americanBabies = americansYoungerThan(1).get
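Conceptually, an input molecule behaves like a function from the missing value to a result list. A plain-Scala analogy of the last example, with a hypothetical Resident class and made-up data (this models the idea only and does not use Molecule or Datomic):

```scala
// Hypothetical flattened view of a person joined with a country name.
final case class Resident(name: String, age: Int, country: String)

val residents = List(
  Resident("Ann", 0, "USA"),
  Resident("Bob", 9, "USA"),
  Resident("Cleo", 5, "Canada"),
  Resident("Dan", 30, "USA")
)

// Models m(Person.name.age.<(?).Country.name("USA")): the query shape is fixed
// once, and only the age limit is supplied later.
val americansYoungerThanFn: Int => List[(String, Int, String)] =
  limit => residents.collect {
    case Resident(n, a, "USA") if a < limit => (n, a, "USA")
  }

val kids   = americansYoungerThanFn(13)
val babies = americansYoungerThanFn(1)
```

The function value is defined once and applied with different inputs, which mirrors how an input molecule is assigned to a variable and reused.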