---
last_review: "2025-01-01"
last_reviewer: "-"
documented_code: [ ]
---
```{tags} developer, explanation
```
(mysql-backend)=
# MariaDB Backend #
:::{note}
This page has been migrated from the old documentation, and has not yet been fully revised.
There might be inconsistencies or errors when using with current LinkAhead versions.
:::
% TODO: Issue: https://gitlab.indiscale.com/caosdb/src/linkahead-docs/-/issues/114
The MariaDB backend in CaosDB may be substituted by other backends, but at the time of writing this
documentation, only MariaDB (often still named MySQL in the sources) is implemented. There are the
following main packages which handle the backend:
[backend.interfaces](/reference/server/api/org/caosdb/server/database/backend/interfaces/package-index.rst)
: Interfaces which backends may implement. The main method for most interfaces is `execute(...)`
with arguments depending on the specific interface, and benchmarking methods (`getBenchmark()` and
`setTransactionBenchmark(b)` may also be required.
[backend.implementation.MySQL](/reference/server/api/org/caosdb/server/database/backend/implementation/MySQL/package-index.rst)
: MySQL implementations of the interfaces. Typical "simple" implementations create a prepared SQL
statement from the arguments to `execute(...)` and send it to the SQL server. They may also have
methods for undoing and cleanup, using an [UndoHandler](/reference/server/api/org/caosdb/server/utils/UndoHandler.rst).
[backend.transaction](/reference/server/api/org/caosdb/server/database/backend/transaction/package-index.rst)
: Subclasses of the abstract [BackendTransaction](/reference/server/api/org/caosdb/server/database/BackendTransaction.rst)
which implement the `execute()` method. These classes may use specific backend implementations
(like for example the MySQL implementations) to interact with the backend database.
For example, the structure when getting an Entity ID by name looks like this:
```{eval-rst}
.. uml::
@startuml
together {
abstract BackendTransaction {
HashMap impl // stores all implementations
{abstract} execute()
}
note left of BackendTransaction::impl
Stores the
implementation
for each
interface."
end note
package "backend.interfaces" {
interface GetIDByNameImpl {
{abstract} execute(String name, String role, String limit)
}
}
}
together {
package "backend.transaction" {
class GetIDByName extends BackendTransaction {
execute()
}
}
package "backend.implementation.MySQL" {
class MySQLGetIDByName implements GetIDByNameImpl {
execute(String name, String role, String limit)
}
}
}
GetIDByName::execute --r-> MySQLGetIDByName
@enduml
```
## Mapping of Objects to Database Tables ##
### Mapping Entities with deep Properties ###
#### Understanding the problem ####
In LinkAhead, Properties of Entities can have subproperties, or they may overwrite default datatypes
of abstract Properties. The SQL representation of simple properties is a row in a data table. For
example (simplified):
| domain_id | entity_id | property_id | value |
| --------- | --------- | ----------- | ----- |
| 0 | 413 | 230 | V1 |
But how can we store a subproperty of the Property `230` with value `V1`? For example, how to store
that the datatype of this Property is different?
To understand LinkAhead's approach, it makes sense to imagine entities and their properties as a
tree: At the root of the tree, there is the base entity. Each "simple" property is a branch
connected to the entity, and sub-properties are connected to their base property. Properties that do
not have any sub-properties on their own are *leaf nodes* in the tree.
Now the SQL representation of this property tree consists of two parts:
- The `property_id` and `value` columns describe the kind of property and of course the value.
- All the other columns describe *where* this property can be found:
- "Direct" properties, i.e. those at the base of the tree (level 1), have `domain_id=0` and the
`entity_id` of their corresponding entity.
- Sub-properties of direct properties (level 2) use the entity's ID as `domain_id`.
- Everything below level 2 needs a new way of denoting their place in the property tree, and
that is where *replacement properties* come into play.
#### Simple examples ####
Two simple examples, which can be represented *without* replacement properties:
This entity with a text property (id=230)
```xml
V1
```
has no subproperties and thus will be represented in the `text_data` table as
| domain_id | entity_id | property_id | value | status |
| --------- | --------- | ----------- | ----- | ------ |
| 0 | 413 | 230 | V1 | FIX |
If the property has a subproperty 231:
```xml
V1
V2
```
the following will be stored in:
| domain_id | entity_id | property_id | value | status |
| --------- | --------- | ----------- | ----- | ------ |
| 0 | 413 | 230 | V1 | FIX |
| 413 | 230 | 231 | V2 | FIX |
This shows that the subproperty is stored using the identifiers (413,230) as domain and entity ID.
This allows to identify the property to which this subproperty belongs when retrieving data.
#### Representing deep Properties with replacements ####
In order to represent Properties that have no simple row-like representation in the `*_data` SQL
tables (due to subproperties or an override), replacement helper objects are used
(`ReplacedProperty` class in the Java code).
Whenever a Property cannot be represented directly, this property's value is replaced by a
*replacement*, which is represented similar to an Entity in SQL. The Property's value is now stored
in the `reference_data` table as a reference to the replacement and all the actual data is attached
to the replacement.
Every replacement (non-leaf node in the graph model above) gets an ID that is within-graph-unique,
i.e. no two replacements within a given entity tree have the same ID.
Each replacement results in two rows in the data tables:
1. ID in the `reference_data` table: The triple `entity_id` - `property_id` -
`id of the replacement property` is added here.
2. Value in the corresponding `*_data` table: The actual property value is added in a scope specific
to this replacement:
- `domain_id`: The base entity's ID.
- `entity_id`: The replacement ID.
- `value`: The original property value.
:::{figure} /.assets/images/explanation/server/replacements.png
:width: 100%
:::
#### Sub-property example in Python ####
To reproduce the examples above, here's some Python code:
```python
import linkahead as db; db.Info()
# Create the properties
sub_p = db.Property("subprop", datatype=db.TEXT)
test_p = db.Property("testprop", datatype=db.TEXT)
sub_p.insert()
test_p.insert()
# Create a recordtype
TestRT = db.RecordType("TestRT").add_property(test_p)
TestRT.insert()
# Create the entity and add property
test_rec = db.Record().add_parent(TestRT)
test_rec.add_property(test_p, value="V1")
prop = test_rec.properties[0].add_property(sub_p, value="V2")
my_subp = prop.properties[0]
my_subp.add_property(sub_p, value="V3")
test_rec.insert()
# Look at the data
print(test_rec)
# The output looks similar to this (printed in reduced form and reformatted here):
#
#
#
# V1
#
# V2
#
# V3
#
#
#
#
print(test_rec.id)
# In the browser, look at https://localhost:10443/Entity/123 if "123" is the ID.
#
# or run this to look at the SQL database directly:
# > docker exec -ti linkahead mariadb -h sqldb -u caosdb -prandom1234 caosdb
# > SELECT * FROM text_data WHERE value LIKE 'V%';
# +-----------+-----------+-------------+------------------+--------+------+
# | domain_id | entity_id | property_id | value | status | pidx |
# +-----------+-----------+-------------+------------------+--------+------+
# | 0 | 123 | 122 | V1 | FIX | 0 |
# | 0 | 123 | 113 | V2 | FIX | 0 |
# | 123 | 113 | 121 | V3 | FIX | 0 |
# +-----------+-----------+-------------+------------------+--------+------+
# > SELECT * FROM reference_data ORDER BY entity_id DESC LIMIT 10;
# +-----------+-----------+-------------+-------+-------------+------+----------------+
# | domain_id | entity_id | property_id | value | status | pidx | value_iversion |
# +-----------+-----------+-------------+-------+-------------+------+----------------+
# | 123 | 122 | 121 | 113 | REPLACEMENT | 0 | NULL |
# +-----------+-----------+-------------+-------+-------------+------+----------------+
```
#### A more complex example, with replacement properties ####
This more complex example requires replacement properties:
```xml
V1
V2
V3
V4
V3b
V2b
V1b
```
The tables will have the following rows:
`text_data` table:
| domain_id | entity_id | property_id | value | status |
| --------- | --------- | ----------- | ----- | ------ |
| 0 | 413 | 113 | V1 | FIX |
| 0 | 413 | 114 | V2 | FIX |
| 0 | 413 | 138 | V3 | FIX |
| 0 | 413 | 230 | V1b | FIX |
| 413 | 113 | 230 | V2b | FIX |
| 413 | 114 | 230 | V3b | FIX |
| 413 | 138 | 230 | V4 | FIX |
`reference_data` table:
| domain_id | entity_id | property_id | value | status |
| --------- | --------- | ----------- | ----- | ----------- |
| 0 | 413 | 230 | 113 | REPLACEMENT |
| 413 | 113 | 230 | 114 | REPLACEMENT |
| 413 | 114 | 230 | 138 | REPLACEMENT |
You can see that all non-leaf nodes/properties are added directly to the entity scope (0,413). In
case the property was replaced, the property_id is the replacement ID (see first row,
property_id=113). This means that the corresponding value of the property (e.g. V2 in second row)
is added to the replacement property when reading the data. I.e. the entity with ID 413 does not
have a property with value V2, but one of the subproperties does. The subproperties are stored in
the scope of their replacement parent (e.g. (413,113) for V2b).
This replacement strategy is also used in other cases where now row representation is possible, e.g.
when the datatype of a property is overwritten.
#### The real_property_id column ####
Because the property_id column may contain the ID of a replacement, it is not obvious, what the ID
of the Property at the Record is when looking at such a row in the data table. Therefore, there is
an additional column with redundant information to simplify things and to speed up lookups: The
`real_property_id` column. It always contains the ID of the Property of the Record, never the ID of
a potential replacement entity.
In the example above, the first rows of the `text_data` table with the `real_property_id` column:
| domain_id | entity_id | property_id | real_property_id | value | status |
|-----------|-----------|-------------|------------------|-------|--------|
| 0 | 413 | 113 | 230 | V1 | FIX |
| 0 | 413 | 114 | 230 | V2 | FIX |
| 0 | 413 | 138 | 230 | V3 | FIX |
| 0 | 413 | 230 | 230 | V1b | FIX |
The IDs 113, 114 and 138 belong to replacements, thus the real_property_id shows a different ID. In
the fourth line, `real_property_id` is simply the same as `property_id`.