---
last_review: "2025-01-01"
last_reviewer: "-"
documented_code: [ ]
---

```{tags} explanation, administrator
```

# Fileserver

:::{note}
This page has been migrated from the old documentation, and has not yet been fully revised. There might be inconsistencies or errors when using it with current LinkAhead versions.
:::

% TODO: Issue: https://gitlab.indiscale.com/caosdb/src/linkahead-docs/-/issues/90

## Info

There are several ways to use the file server component of LinkAhead. A file, or a whole folder including its subfolders, can be uploaded via HTTP, either directly or via the `InsertFilesInDir` flag. A file can also be downloaded via HTTP, identified by its ID or by its path in the internal file system. Furthermore, a file's metadata can be retrieved via HTTP as XML.

## File upload

### HTTP upload stream

#### Files

An example of a file upload using cURL is described in detail in [the curl section of this documentation](/tutorial/examples/curl-access.md).

File upload via HTTP is implemented in a way that is consistent with [RFC 1867](http://www.ietf.org/rfc/rfc1867.txt). This is a de-facto standard that defines a file upload as part of an HTML form submission. The concept is not elaborated here. Note, however, that this protocol is not designed for uploading complete structured folders. Therefore, the LinkAhead file components have to impose that structure on the upload protocol.

LinkAhead's file upload resource accepts only POST requests of MIME media type `multipart/form-data`. The first part of each POST body is expected to be a form-data text field containing information about the files to be uploaded. It has to meet the following requirements:

* `Content-type: text/plain; charset=UTF-8`
* `Content-disposition: form-data; name="FileRepresentation"`

If the content type of the first part is not `text/plain; charset=UTF-8`, the server will return error 418.
If the body is not actually encoded in UTF-8, the server's behaviour is undefined. If the field name of the first part is not `FileRepresentation`, the server will return error 419.

The body of that first part must be an XML document of the following form:

```xml
...
```

where

* `$temporary_identifier` is an arbitrary name, which is used to identify this XML tag with an uploaded file in the other form-data parts,
* `$path_filesystem` is the designated relative path of that object in the internal file system,
* `$description` is a description of the file to be uploaded,
* `$size` is the file's size in bytes,
* `$checksum` is the SHA-512 hash of the file.

The remaining parts (there must be at least one) may have any appropriate media type. `application/octet-stream` is a good choice, since it is the default for any uploaded file according to [RFC 1867](http://www.ietf.org/rfc/rfc1867.txt). Their field names may be any names meeting the requirements of [RFC 1867](http://www.ietf.org/rfc/rfc1867.txt) (most notably, they must be unique within this POST). However, in order to identify the corresponding XML file representation of each file, the `filename` parameter of the content-disposition header has to be set to the proper `$temporary_identifier`. The Content-disposition type must be `form-data`:

* `Content-disposition: form-data; name="$any_name"; filename="$temporary_identifier"`

Finally, the body of each of these parts has to contain the file, encoded in the proper `Content-Transfer-Encoding`.

If a file part has a `filename` parameter which does not occur in the XML file representation, the server will return error 420. The file will not be stored anywhere. If an XML file representation has no corresponding file to be uploaded (i.e. there is no part with the same `filename`), the server will return error 421. Other errors may occur if the checksum, the size, the destination, etc. are corrupted.

#### Folders

Uploading folders works in a similar way.
The first part of the `multipart/form-data` document must be the representation of the folders:

```xml
...
```

The root folder is represented by a part which has a header of the form:

* `Content-disposition: form-data; name="$any_name"; filename="$temporary_identifier/"`

The slash at the end of the `filename` indicates that this is a folder, not a file. Consequently, the body of this part will be ignored and should be empty.

Any file with the name `$filename` in the root folder is represented by a part which has a header of the form:

* `Content-disposition: form-data; name="$any_name"; filename="$temporary_identifier/$filename"`

Any subfolder with the name `$subfolder` is represented by a part which has a header of the form:

* `Content-disposition: form-data; name="$any_name"; filename="$temporary_identifier/$subfolder/"`

Likewise, a complete directory tree can be transferred by appending the structure to the `filename` parameter.

**Example**: Given the structure

    rootfolder/
    rootfolder/file1
    rootfolder/subfolder/
    rootfolder/subfolder/file2

an upload document would have the following form:

    ... (HTTP Header)
    Content-type: multipart/form-data, boundary=AaB03x

    --AaB03x
    content-disposition: form-data; name="FileRepresentation"

    --AaB03x
    content-disposition: form-data; name="random_name1"; filename="temp1234/"

    --AaB03x
    content-disposition: form-data; name="random_name2"; filename="temp1234/file1"

    Hello, world! This is file1.

    --AaB03x
    content-disposition: form-data; name="random_name3"; filename="temp1234/subfolder/"

    --AaB03x
    content-disposition: form-data; name="random_name4"; filename="temp1234/subfolder/file2"

    Hello, world! This is file2.

    --AaB03x--

% TODO: Timm 2014-06-17: to be continued

## Consistency checks

To start a consistency check on either the complete file system or a subdirectory, add the `fileStorageConsistency` flag to a retrieve query. In a `GET` request, simply append `...?fileStorageConsistency=`, followed by the options, to the URL.
Possible options are:

% TODO: (currently, only one of them?):

- `-t ` :: The timeout for the query (in seconds?)
- `-c ` :: To trigger internal test cases.
- a path, given without an option letter (as in the example below) :: The path in the file system where the search for files should start. If omitted or `\`, the full file system will be checked.

One example, using curl and an existing cookie:

`curl -X GET -G -b cookie.txt -d "fileStorageConsistency=Analysis/VideoAnalysis/masks/" --insecure "https:///Entity/12345"`
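
The URL that the curl call above requests can also be assembled programmatically. The following is a minimal sketch, not part of LinkAhead: the function name `consistency_check_url` and the host `https://linkahead.example.org` are hypothetical placeholders, and authentication (the cookie in the curl example) is left out. It assumes that slashes in the path may stay unescaped, as they do in the curl call.

```python
from urllib.parse import quote, urlencode


def consistency_check_url(base_url: str, entity_id: int, flag_value: str) -> str:
    """Build a retrieve URL carrying the fileStorageConsistency flag.

    base_url   -- server root (hypothetical, supply your own server)
    entity_id  -- ID of the entity to retrieve, as in the curl example
    flag_value -- value of the flag, e.g. the path where the check starts
    """
    # Percent-encode the value but keep "/" literal, mirroring what
    # `curl -G -d` sends for a plain path.
    query = urlencode(
        {"fileStorageConsistency": flag_value}, quote_via=quote, safe="/"
    )
    return f"{base_url.rstrip('/')}/Entity/{entity_id}?{query}"


if __name__ == "__main__":
    # Hypothetical host; replace it with the address of your own server.
    print(
        consistency_check_url(
            "https://linkahead.example.org",
            12345,
            "Analysis/VideoAnalysis/masks/",
        )
    )
```

The resulting string can be passed to any HTTP client that sends the session cookie along with the `GET` request.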