Other utilities in LinkAhead Advanced User Tools#
Note
This page has been migrated from the old documentation and has not yet been fully revised. There might be inconsistencies or errors when it is used with current LinkAhead versions.
The table file importer#
The LinkAhead Advanced User Tools provide a generic
TableImporter class which reads different table file formats (at
the time of writing of this documentation, .xls(x), .csv, and .tsv) and converts them into
pandas.DataFrame objects. It provides helper functions for converting column values (e.g.,
converting the string values “yes” or “no” to True or False), for checking that obligatory
columns are present in a table and have no missing values, and for checking datatypes.
The base class TableImporter provides the general
verification methods, while each subclass like
XLSXImporter or
CSVImporter implements its own read_file function that is used
to convert a given table file into a pandas.DataFrame.
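The kinds of checks described above can be illustrated with plain pandas. The following is only a minimal sketch of the idea, not the TableImporter API: the helper names yes_no_to_bool and check_obligatory_columns, the column names, and the table content are all hypothetical and chosen for illustration.

```python
import io

import pandas as pd

# Hypothetical table content; the column names are made up for this sketch.
CSV_TEXT = """name,approved,count
alpha,yes,1
beta,no,2
"""


def yes_no_to_bool(value):
    """Convert the strings "yes"/"no" to True/False (hypothetical helper)."""
    mapping = {"yes": True, "no": False}
    key = value.strip().lower()
    if key not in mapping:
        raise ValueError(f"Expected 'yes' or 'no', got {value!r}")
    return mapping[key]


def check_obligatory_columns(df, obligatory_columns):
    """Raise if an obligatory column is absent or contains missing values."""
    for col in obligatory_columns:
        if col not in df.columns:
            raise ValueError(f"Missing obligatory column: {col}")
        if df[col].isnull().any():
            raise ValueError(f"Obligatory column {col!r} has missing values")


# Read the table and apply the value converter to one column.
df = pd.read_csv(io.StringIO(CSV_TEXT), converters={"approved": yes_no_to_bool})

# Verify that the required columns exist and are fully populated.
check_obligatory_columns(df, ["name", "approved"])
```

In the actual library, conversion and verification are bundled in the importer classes, so the sketch only shows what such checks amount to on a pandas.DataFrame.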
Empty fields in integer columns#
Reading in table files that have integer-valued columns with missing data can result in datatype
contradictions (see the Pandas documentation
on nullable integers) since the default
value for missing fields, numpy.nan, is a float. For this reason, as of version 0.11, the
TableImporter uses pandas.Int64Dtype as the default datatype for all integer columns,
which allows for empty fields while keeping all actual data integer-valued. This behavior can be
changed by initializing the TableImporter with convert_int_to_nullable_int=False, in which case a
DataInconsistencyError is raised when an empty
field is encountered in a column with a non-nullable integer datatype.
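The underlying pandas behavior can be reproduced without the TableImporter. The following sketch uses only pandas and a made-up table (the column names sample and count are assumptions for illustration); it shows how a missing value turns an integer column into a float column by default, and how pandas.Int64Dtype avoids that.

```python
import io

import pandas as pd

# Hypothetical table with one empty field in the integer column "count".
CSV_TEXT = "sample,count\na,1\nb,\nc,3\n"

# Default behavior: the empty field becomes numpy.nan, which is a float,
# so pandas stores the whole column as float64.
df_default = pd.read_csv(io.StringIO(CSV_TEXT))
print(df_default["count"].dtype)

# With the nullable integer dtype, the empty field becomes pandas.NA
# and the remaining values stay integer-valued.
df_nullable = pd.read_csv(
    io.StringIO(CSV_TEXT), dtype={"count": pd.Int64Dtype()}
)
print(df_nullable["count"].dtype)
print(df_nullable["count"].isna().sum())
```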
The loadfiles module and executable#
To make files available to the LinkAhead server as File entities (see also the server’s
file server documentation), the LinkAhead Advanced User Tools
provide the loadFiles module and a linkahead-loadfiles executable.
Both operate on a path as seen by the LinkAhead server (i.e., a path within the Docker container in
the typical LinkAhead Control setup) and can be configured to include or exclude specific
files. In the typical setup, where a directory is mounted as an extroot into the Docker container
by LinkAhead Control, running
linkahead-loadfiles /opt/caosdb/mnt/extroot
makes all files available. Execute
linkahead-loadfiles --help
for more information and examples.