---
last_review: "2026-05-06"
last_reviewer: "-"
documented_code: [ ]
---

```{tags} tutorial, advanced-user, schema
```

# Define a LinkAhead Schema with YAML

:::{note}
This page has been migrated from the old documentation, and has not yet been fully revised. There
might be inconsistencies or errors when it is used with current LinkAhead versions.
:::

% TODO: Issue: https://gitlab.indiscale.com/caosdb/src/linkahead-docs/-/issues/78

The `caosadvancedtools` library can create and update LinkAhead {term}`Schemas ` from a YAML file.

Let's start with an example taken from
[model.yml](https://gitlab.indiscale.com/caosdb/src/caosdb-advanced-user-tools/-/blob/6fd1d10dea144547553289f8bc71b38e48de782d/unittests/models/model.yml)
in the library sources.

```yaml
Project:
  obligatory_properties:
    projectId:
      datatype: INTEGER
      description: 'UID of this project'
Person:
  recommended_properties:
    firstName:
      datatype: TEXT
      description: 'first name'
    lastName:
      datatype: TEXT
      description: 'last name'
LabbookEntry:
  recommended_properties:
    Project:
    entryId:
      datatype: INTEGER
      description: 'UID of this entry'
    responsible:
      datatype: Person
      description: 'the person responsible for these notes'
    textElement:
      datatype: TEXT
      description: 'a text element of a labbook recording'
    associatedFile:
      datatype: FILE
      description: 'A file associated with this recording'
    table:
      datatype: FILE
      description: 'A table document associated with this recording'
```

This example defines 3 {term}`RecordTypes `:

- A `Project` with one obligatory property `projectId`
- A `Person` with a `firstName` and a `lastName` (as recommended properties)
- A `LabbookEntry` with multiple recommended properties of different data types

One major advantage of using this interface (in contrast to the standard Python interface) is that
properties can be defined and added to RecordTypes "on-the-fly". For example,
the three lines for `firstName` as sub-entries of `Person` have two effects on LinkAhead:

- A new property with name `firstName`, datatype `TEXT` and description `first name` is inserted
  (or updated, if already present) into LinkAhead.
- The new property is added as a recommended property to RecordType `Person`.

Any further occurrences of `firstName` in the YAML file will reuse the definition provided for
`Person`.

Note the difference between the following three property declarations:

- `Project` (in `LabbookEntry`): This RecordType is added directly as a property of
  `LabbookEntry`. Therefore, it does not specify any further attributes. Compare to the original
  declaration of RecordType `Project`.
- `responsible` (in `LabbookEntry`): This defines and adds a property with name "responsible" to
  `LabbookEntry`, which has the datatype `Person`. `Person` is defined above.
- `firstName` (in `Person`): This defines and adds a property with the standard data type `TEXT`
  to RecordType `Person`.

If the Schema depends on RecordTypes or properties which already exist in LinkAhead, those can be
added using the `extern` keyword: `extern` takes a list of previously defined names of Properties
and/or RecordTypes. Note that if you use an already existing `REFERENCE` property that has an
already existing RecordType as datatype, you also need to add that RecordType's name to the
`extern` list, e.g.,

```yaml
extern:
  # Let's assume the following is a reference property with datatype Person
  - Author
  # We need Person (since it's the datatype of Author) even though we might
  # not use it explicitly
  - Person

Dataset:
  recommended_properties:
    Author:
```

## Reusing Properties

Properties defined once (either as a property of a RecordType or as a separate Property) can be
reused later in the YAML file. On every occurrence after the first one, the property's attributes
must be left empty. Otherwise, the reuse of the property would conflict with its original
definition.
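
The reuse rule can be made concrete with a minimal sketch. It works on plain dictionaries that stand in for the parsed YAML file, and it only looks at the three `*_properties` sections. The helper `find_conflicting_reuses` is hypothetical and not part of `caosadvancedtools`; the real parser may detect and report such conflicts differently.

```python
# Hypothetical helper for illustration only; it is NOT part of caosadvancedtools.
PROPERTY_SECTIONS = (
    "obligatory_properties",
    "recommended_properties",
    "suggested_properties",
)


def find_conflicting_reuses(model):
    """Return 'RecordType.property' for each property that carries
    attributes again after its first occurrence."""
    seen = set()
    conflicts = []
    for record_type, body in model.items():
        for section in PROPERTY_SECTIONS:
            properties = (body or {}).get(section) or {}
            for name, attributes in properties.items():
                # A reused property must come without attributes (None or empty).
                if name in seen and attributes:
                    conflicts.append(f"{record_type}.{name}")
                seen.add(name)
    return conflicts


# Plain dicts standing in for parsed YAML; an empty YAML entry parses to None.
valid_model = {
    "Project": {"obligatory_properties": {"date": {"datatype": "DATETIME"}}},
    "Experiment": {"obligatory_properties": {"date": None}},
}
conflicting_model = {
    "Project": {"obligatory_properties": {"date": {"datatype": "DATETIME"}}},
    "Experiment": {"obligatory_properties": {"date": {"datatype": "TEXT"}}},
}

print(find_conflicting_reuses(valid_model))        # []
print(find_conflicting_reuses(conflicting_model))  # ['Experiment.date']
```

The second model is flagged because `date` is given attributes again in `Experiment`, although it was already defined in `Project`.
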
### Example

```yaml
Project:
  obligatory_properties:
    projectId:
      datatype: INTEGER
      description: 'UID of this project'
    date:
      datatype: DATETIME
      description: Date of a project or an experiment

Experiment:
  obligatory_properties:
    experimentId:
      datatype: INTEGER
      description: 'UID of this experiment'
    date:  # no further attributes here, since property was defined above in 'Project'!
```

The above example defines two RecordTypes: `Project` and `Experiment`.
The property `date` is defined upon its first occurrence as a property of `Project`.
Later, the same property is also added to `Experiment`, where no additional attributes may be
specified.

## Datatypes

You can use any data type understood by LinkAhead as the `datatype` attribute in the Schema YAML.

List datatypes are a bit special:

```yaml
datatype: LIST<DOUBLE>
```

declares a list datatype of DOUBLE elements.

```yaml
datatype: LIST<Project>
```

declares a list of elements with datatype Project.

## Keywords

- **importance**: Importance of this entity. Possible values: "recommended", "obligatory",
  "suggested"
- **datatype**: The datatype of this property, e.g. TEXT, INTEGER or Project.
- **unit**: The unit of the property, e.g. "m/s".
- **description**: A description for this entity.
- **enum-names**: List of possible values for a RecordType that is used for enumeration only.
  These are created as Records without any properties, and the Records' names are set to the
  values. The values (and thus enum names) do not have to be unique across RecordTypes: for
  example, there may be ``Other`` enum values for different RecordTypes.
- **recommended_properties**: Add properties to this entity with importance "recommended".
- **obligatory_properties**: Add properties to this entity with importance "obligatory".
- **suggested_properties**: Add properties to this entity with importance "suggested".
- **inherit_from_XXX**: This keyword accepts a list of other RecordTypes.
  Those RecordTypes are added as parents, and all Properties with at least the importance `XXX` are
  inherited. For example, `inherit_from_recommended` will inherit all Properties of importance
  `recommended` and `obligatory`, but not `suggested`.

## Usage

You can use the YAML parser directly in Python as follows:

```python
from caosadvancedtools.models import parser

# Parse the YAML file into a DataModel object.
schema = parser.parse_model_from_yaml("model.yml")
```

This creates a DataModel object containing all entities defined in the YAML file.

If the parsed Schema shall be appended to a pre-existing Schema, the optional `existing_model`
argument can be used:

```python
# old_schema is a previously parsed model that the new definitions extend.
new_schema = parser.parse_model_from_yaml("schema.yml", existing_model=old_schema)
```

You can now use the functions from `DataModel` to synchronize the Schema with a LinkAhead instance:

```python
schema.sync_data_model()
```

% LocalWords:  yml projectId UID firstName lastName LabbookEntry entryId textElement labbook
% LocalWords:  associatedFile extern Textfile DataModel
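
Returning to the `inherit_from_XXX` keyword from the Keywords section: the "at least the importance `XXX`" rule can be illustrated with a small sketch. It assumes the ordering suggested < recommended < obligatory implied by the description above. The function `inherited_properties` and the property name `comment` are hypothetical and only serve this illustration; they are not part of `caosadvancedtools`.

```python
# Importance levels from weakest to strongest (assumed ordering for this sketch).
IMPORTANCE_ORDER = ["suggested", "recommended", "obligatory"]


def inherited_properties(parent_properties, threshold):
    """Return the names of all parent properties whose importance is
    at least `threshold` (hypothetical helper, for illustration only)."""
    minimum = IMPORTANCE_ORDER.index(threshold)
    return [
        name
        for name, importance in parent_properties.items()
        if IMPORTANCE_ORDER.index(importance) >= minimum
    ]


# Properties of a hypothetical parent RecordType, mapped to their importance.
parent = {
    "projectId": "obligatory",
    "date": "recommended",
    "comment": "suggested",
}

# Corresponds to `inherit_from_recommended`: obligatory and recommended only.
print(inherited_properties(parent, "recommended"))  # ['projectId', 'date']
```

With the threshold `suggested`, all three properties would be inherited; with `obligatory`, only `projectId`.
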