Tags: explanation, administrator

Fileserver#

Note

This page has been migrated from the old documentation and has not yet been fully revised. There might be inconsistencies or errors when using it with current LinkAhead versions.

Info#

There are several ways to use the file server component of LinkAhead. A file, or a whole folder including its subfolders, can be uploaded via HTTP, either directly or via the InsertFilesInDir flag. A file can also be downloaded via HTTP, identified either by its ID or by its path in the internal file system. Furthermore, a file's metadata can be retrieved via HTTP as XML.

File upload#

HTTP upload stream#

Files#

There is an example on file upload using cURL described in detail in the curl section of this documentation.

File upload via HTTP is implemented in an RFC 1867 compliant way. RFC 1867 is a de-facto standard that defines a file upload as part of an HTML form submission. The concept is not explained further here. Note, however, that this protocol is not designed for uploading complete structured folders. Therefore, the LinkAhead file components have to impose that structure on the upload protocol.

LinkAhead’s file upload resource accepts only POST requests of MIME media type multipart/form-data. The first part of each POST body is expected to be a form-data text field containing information about the files to be uploaded. It has to meet the following requirements:

  • Content-type: text/plain; charset=UTF-8

  • Content-disposition: form-data; name="FileRepresentation"

If the content type of the first part is not text/plain; charset=UTF-8, the server will return error 418. If the body is not actually encoded in UTF-8, the server's behaviour is undefined. If the field name of the first part is not FileRepresentation, the server will return error 419.

The body of that first part must be an XML document of the following form:

        <Post>
          <File upload="$temporary_identifier" destination="$path_filesystem" description="$description" checksum="$checksum" size="$size"/>
          ...
        </Post>

where

  • $temporary_identifier is an arbitrary name, which is used to associate this <File> tag with an uploaded file in one of the other form-data parts,

  • $path_filesystem is the designated relative path of that object in the internal file system,

  • $description is a description of the file to be uploaded,

  • $size is the file's size in bytes,

  • $checksum is a SHA-512 hash of the file.
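The attributes described above can be computed from the file content. The following is a minimal sketch, not the server's or any client library's implementation. The function name `file_representation` and the sample values are hypothetical, and encoding the checksum as a lowercase hex digest is an assumption, since the text only specifies "SHA-512".

```python
import hashlib
import xml.etree.ElementTree as ET


def file_representation(entries):
    """Build the <Post> XML document for a list of files.

    `entries` is a list of (temporary_identifier, destination, description,
    data) tuples, where `data` holds the file content as bytes.
    """
    post = ET.Element("Post")
    for upload_id, destination, description, data in entries:
        ET.SubElement(post, "File", {
            "upload": upload_id,
            "destination": destination,
            "description": description,
            # Assumption: the checksum is sent as a hex digest.
            "checksum": hashlib.sha512(data).hexdigest(),
            "size": str(len(data)),
        })
    # ElementTree escapes special characters in attribute values.
    return ET.tostring(post, encoding="unicode")


xml_text = file_representation(
    [("tmp1", "testfiles/hello.txt", "A small test file", b"Hello, world!")]
)
print(xml_text)
```

Building the document with an XML library rather than string formatting keeps descriptions containing quotes or ampersands well-formed.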

There must be at least one further part. These parts may have any appropriate media type; application/octet-stream is a good choice because it is the default for any uploaded file according to RFC 1867. Their field names may be any names meeting the requirements of RFC 1867 (most notably, they must be unique within this POST). To identify the corresponding XML file representation of each file, the filename parameter of the Content-disposition header has to be set to the proper $temporary_identifier. The Content-disposition type must be form-data:

  • Content-disposition: form-data; name="$any_name"; filename="$temporary_identifier"

Finally, the body of each of these parts has to contain the file, encoded in the proper Content-Transfer-Encoding.

If a file part has a filename parameter which does not occur in the XML file representation, the server will return error 420, and the file will not be stored anywhere. If an XML file representation has no corresponding file to be uploaded (i.e. there is no part with the same filename), the server will return error 421. Other errors might occur if the checksum, the size, the destination etc. are corrupted.
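To make the layout of such a request concrete, here is a hedged sketch that assembles the multipart/form-data body for a single file using only the Python standard library. The function name `build_upload_body`, the field name `file1` and the sample values are hypothetical, and the hex-encoded checksum is an assumption. The sketch stops before sending, because the upload URL and authentication are not covered in this section.

```python
import hashlib
import uuid
from xml.sax.saxutils import quoteattr


def build_upload_body(upload_id, destination, description, data, boundary=None):
    """Return (content_type_header, body_bytes) for a single-file upload."""
    boundary = boundary or uuid.uuid4().hex
    # quoteattr() escapes the value and adds the surrounding quotes.
    representation = (
        "<Post>"
        f"<File upload={quoteattr(upload_id)} "
        f"destination={quoteattr(destination)} "
        f"description={quoteattr(description)} "
        f'checksum="{hashlib.sha512(data).hexdigest()}" '  # assumed hex digest
        f'size="{len(data)}"/>'
        "</Post>"
    )
    delimiter = b"--" + boundary.encode("ascii")
    lines = [
        # First part: the XML description, named FileRepresentation.
        delimiter,
        b'Content-Disposition: form-data; name="FileRepresentation"',
        b"Content-Type: text/plain; charset=UTF-8",
        b"",
        representation.encode("utf-8"),
        # Second part: the file; its filename equals the upload identifier.
        delimiter,
        ('Content-Disposition: form-data; name="file1"; '
         f'filename="{upload_id}"').encode("utf-8"),
        b"Content-Type: application/octet-stream",
        b"",
        data,
        # Closing delimiter.
        delimiter + b"--",
        b"",
    ]
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(lines)


content_type, body = build_upload_body(
    "tmp1", "testfiles/hello.txt", "A small test file",
    b"Hello, world!", boundary="AaB03x",
)
print(content_type)
print(body.decode("utf-8"))
```

The returned header value and body could then be passed to any HTTP client as the Content-Type header and payload of a POST request.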

Folders#

Uploading folders works in a similar way. The first part of the multipart/form-data document is to be the representation of the folders:

        <Post>
          <File upload="$temporary_identifier" destination="$path_filesystem" description="$description" checksum="$checksum" size="$size"/>
          ...
        </Post>

The root folder is represented by a part which has a header of the form:

  • Content-disposition: form-data; name="$any_name"; filename="$temporary_identifier/"

The slash at the end of the filename indicates that this is a folder, not a file. Consequently, the body of this part will be ignored and should be empty. Any file with the name $filename in the root folder is represented by a part which has a header of the form:

  • Content-disposition: form-data; name="$any_name"; filename="$temporary_identifier/$filename"

Any sub folder with the name $subfolder is represented by a part which has a header of the form:

  • Content-disposition: form-data; name="$any_name"; filename="$temporary_identifier/$subfolder/"

Likewise, a complete directory tree can be transferred by appending the structure to the filename header field.

Example: Given the structure

    rootfolder/
    rootfolder/file1
    rootfolder/subfolder/
    rootfolder/subfolder/file2

an upload document would have the following form:

    ... (HTTP Header)
    Content-type: multipart/form-data; boundary=AaB03x
    
    --AaB03x
    content-disposition: form-data; name="FileRepresentation"
    content-type: text/plain; charset=UTF-8
    
    <Post>
      <File upload="temp1234" destination="$path_filesystem" description="$description" checksum="$checksum" size="$size"/>
    </Post>
    
    --AaB03x
    content-disposition: form-data; name="random_name1"; filename="temp1234/"
    
    --AaB03x
    content-disposition: form-data; name="random_name2"; filename="temp1234/file1"
    
    Hello, world! This is file1.
    
    --AaB03x
    content-disposition: form-data; name="random_name3"; filename="temp1234/subfolder/"
    
    --AaB03x
    content-disposition: form-data; name="random_name4"; filename="temp1234/subfolder/file2"
    
    Hello, world! This is file2.
    
    --AaB03x--
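The mapping from a directory tree to the filename parameters of the individual parts can be sketched as follows. This is an illustration only; the function name `folder_part_filenames` is hypothetical, and the script recreates the example structure in a temporary directory so that it can be run as-is.

```python
import os
import tempfile


def folder_part_filenames(root_dir, upload_id):
    """Return the filename parameter for every part of a folder upload.

    Folder entries end with a slash, file entries do not. All paths are
    relative to `root_dir` and prefixed with `upload_id`.
    """
    names = []
    for current, subdirs, files in os.walk(root_dir):
        subdirs.sort()  # make the traversal order deterministic
        rel = os.path.relpath(current, root_dir)
        if rel == ".":
            prefix = upload_id
        else:
            prefix = upload_id + "/" + rel.replace(os.sep, "/")
        names.append(prefix + "/")  # the folder itself
        for name in sorted(files):
            names.append(prefix + "/" + name)
    return names


# Recreate the example structure in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "rootfolder")
    os.makedirs(os.path.join(root, "subfolder"))
    for rel_path in ("file1", os.path.join("subfolder", "file2")):
        with open(os.path.join(root, rel_path), "w") as handle:
            handle.write("Hello, world!")
    filenames = folder_part_filenames(root, "temp1234")

for filename in filenames:
    print(filename)
```

Each returned string would become the filename parameter of one form-data part; parts whose filename ends with a slash carry an empty body.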

Consistency checks#

To start a consistency check on either the complete file system or a subdirectory, add the fileStorageConsistency flag to a retrieve query. In a GET request, simply add ...?fileStorageConsistency=<OPTIONS> to the URL. Possible options are:

  • -t <TIMEOUT> :: The timeout for the query (in seconds?)

  • -c <TESTCASE> :: To trigger internal test cases.

  • <PATH> :: The path in the file system where searching files should start. If omitted or \, the full file system will be checked.

One example, using curl and an existing cookie:

    curl -X GET -G -b cookie.txt -d "fileStorageConsistency=Analysis/VideoAnalysis/masks/" --insecure "https://<SERVER>/Entity/12345"
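The same request URL can be assembled programmatically. The sketch below only builds the URL with the fileStorageConsistency flag. The function name `consistency_check_url` and the server address are hypothetical, and how several options are combined in one flag value is not specified here, so only the path option is shown.

```python
from urllib.parse import urlencode


def consistency_check_url(base_url, entity_path, options):
    """Append the fileStorageConsistency flag to a retrieve URL."""
    # Keep slashes readable; other special characters are percent-encoded.
    query = urlencode({"fileStorageConsistency": options}, safe="/")
    return f"{base_url.rstrip('/')}/{entity_path.lstrip('/')}?{query}"


url = consistency_check_url(
    "https://linkahead.example.org",  # hypothetical server address
    "Entity/12345",
    "Analysis/VideoAnalysis/masks/",
)
print(url)
```

The resulting URL can be used in a GET request with any HTTP client, together with the session cookie.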