CMIS Server


Archivematica CMIS Server User Manual

This manual describes how to access AIPs and AIP files

that were ingested and stored in the Archivematica Storage Service.

Links and References:

Features Implemented:

  • Segregated user access to AIPs per Location.

    Each user can access the AIPs from a particular Location.

  • Browse all ingested AIPs in a Location.

  • Browse the file and folder structure of an AIP.

  • Download a whole AIP.

  • Download a particular file from an AIP.

Connecting to CMIS

The current implementation of the CMIS Server supports only Browser Binding connections,

so CMIS clients must use the Browser Binding.

Before connecting to the CMIS Server, make sure you have your credentials (username and password)

and the CMIS Endpoint URL.

Example:

For testing purposes it is recommended to use CMIS Workbench (see link above).

It helps in understanding the overall CMIS protocol and its data structures.

If you are developing your own CMIS client, follow the steps in the next section.

Browser Binding API

All requests to CMIS Server must include HTTP Basic Authentication

with your Username and Password.

Example HTTP Request:

<source lang="http"> GET / HTTP/1.1

Host: pic-devel-cmis.edepot.picturae.com

Authorization: Basic ZGVtbzpkZW1v </source> The Authorization header contains the Base64-encoded string for the credentials demo:demo.
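As a minimal sketch of how a client could build this header value, the following Python snippet encodes a username and password the way HTTP Basic Authentication expects. The function name basic_auth_header is only illustrative and is not part of the CMIS Server.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of the Authorization header for HTTP Basic Authentication."""
    # Credentials are joined with a colon and then Base64-encoded.
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


# The demo:demo credentials from the example request above.
print(basic_auth_header("demo", "demo"))
```

Note that Base64 is an encoding, not encryption, so the header should only be sent over HTTPS.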

In the following sections we assume that every request includes the Authorization header

and is sent to the CMIS Endpoint URL. For example, GET / translates to

the following HTTP request:

<source lang="http"> GET /

Host: your-cmis-server-endpoint.nl

Authorization: Basic <YOUR ENCODED CREDENTIALS> </source>

Getting CMIS Repository

Request:

<source lang="http"> GET / </source> Response:

<source lang="json"> {

   "ea550466-7dff-4b57-844e-c1163b3a7168": {
       "repositoryId": "ea550466-7dff-4b57-844e-c1163b3a7168",
       "repositoryName": "Archivematica Repository",
       "repositoryDescription": "Repository for Accessing Archivematica AIPs",
       "vendorName": "Picturae",
       "productName": "Picturae CMIS Server",
       "productVersion": "0.0.1",
       "rootFolderId": "root",
       "latestChangeLogToken": null,
       "capabilities": {
           "capabilityACL": "none",
           "capabilityAllVersionsSearchable": false,
           "capabilityChanges": "none",
           "capabilityContentStreamUpdatability": "none",
           "capabilityGetDescendants": false,
           "capabilityGetFolderTree": false,
           "capabilityOrderBy": "none",
           "capabilityMultifiling": false,
           "capabilityPWCSearchable": false,
           "capabilityPWCUpdatable": false,
           "capabilityQuery": "none",
           "capabilityRenditions": "read",
           "capabilityUnfiling": false,
           "capabilityVersionSpecificFiling": false,
           "capabilityJoin": "none",
           "capabilityCreatablePropertyTypes": null,
           "capabilityNewTypeSettableAttributes": null
       },
       "aclCapability": null,
       "cmisVersionSupported": "1.1",
       "thinClientURI": null,
       "changesIncomplete": null,
       "changesOnType": [],
       "principalAnonymous": null,
       "principalAnyone": null,
       "extendedFeatures": [],
       "rootFolderUrl": "https://pic-devel-cmis.edepot.picturae.com/ea550466-7dff-4b57-844e-c1163b3a7168/root",
       "repositoryUrl": "https://pic-devel-cmis.edepot.picturae.com/ea550466-7dff-4b57-844e-c1163b3a7168"
   }

} </source> The response contains a list with the single Repository that the user can access

(multiple repositories are not yet supported).

Repository Info contains a few key fields that are used in the following sections:

  • repositoryId – the unique ID of the Repository.

  • rootFolderId – the CMIS Object ID of the Repository's root folder.

    In the current implementation it is always root.

  • repositoryUrl and rootFolderUrl – these URLs MUST be used as entry points for all further

    communication with the Repository (Get Object, Get Children, Get Content, etc.).

    Both URLs are built from the current CMIS Server hostname and the Repository ID.
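The fields above can be read from the response with a few lines of code. The sketch below parses an abbreviated copy of the example response; the function name first_repository is an assumption made for illustration, and a real client would pass in the body returned by GET / instead of the sample string.

```python
import json

# Abbreviated copy of the repository response shown above.
SAMPLE_RESPONSE = """
{
  "ea550466-7dff-4b57-844e-c1163b3a7168": {
    "repositoryId": "ea550466-7dff-4b57-844e-c1163b3a7168",
    "rootFolderId": "root",
    "rootFolderUrl": "https://pic-devel-cmis.edepot.picturae.com/ea550466-7dff-4b57-844e-c1163b3a7168/root",
    "repositoryUrl": "https://pic-devel-cmis.edepot.picturae.com/ea550466-7dff-4b57-844e-c1163b3a7168"
  }
}
"""


def first_repository(response_text: str) -> dict:
    """Return the entry points of the single repository in a GET / response."""
    repositories = json.loads(response_text)
    # The response is keyed by repository ID; only one repository is expected.
    info = next(iter(repositories.values()))
    return {
        "repositoryId": info["repositoryId"],
        "rootFolderId": info["rootFolderId"],
        "rootFolderUrl": info["rootFolderUrl"],
        "repositoryUrl": info["repositoryUrl"],
    }


repo = first_repository(SAMPLE_RESPONSE)
print(repo["rootFolderUrl"])
```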

Getting List of Ingested AIPs

To obtain the list of your ingested AIPs, send the following request:

<source lang="http"> GET {rootFolderUrl}/?cmisselector=children&succinct=true </source> Response Example:

<source lang="json"> {

   "numItems": 1,
   "moreItems": false,
   "objects": [
       {
           "object": {
               "id": "aba7175b-856e-4b03-9201-e9ce551c0f9f",
               "relationship": [],
               "changeEventInfo": null,
               "acl": null,
               "exactACL": null,
               "policyIds": null,
               "renditions": [
                   {
                       "streamId": "aba7175b-856e-4b03-9201-e9ce551c0f9f",
                       "mimeType": "application/x-tar",
                       "length": 17408095,
                       "kind": "package",
                       "title": "Archived Package",
                       "height": null,
                       "width": null,
                       "renditionDocumentId": null
                   }
               ],
               "succinctProperties": {
                   "cmis:name": "Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f",
                   "cmis:description": "Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f",
                   "cmis:objectId": "aba7175b-856e-4b03-9201-e9ce551c0f9f",
                   "cmis:baseTypeId": "cmis:folder",
                   "cmis:objectTypeId": "cmis:folder",
                   "cmis:secondaryObjectTypeIds": null,
                   "cmis:createdBy": "Admin",
                   "cmis:creationDate": 946681200,
                   "cmis:lastModifiedBy": "Admin",
                   "cmis:lastModificationDate": 946681200,
                   "cmis:changeToken": "946681200",
                   "cmis:parentId": "root",
                   "cmis:path": "/Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f",
                   "cmis:allowedChildObjectTypeIds": null
               }
           },
           "pathSegment": "Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f"
       }
   ]

} </source> The response contains the list of ingested AIPs, each named in the format Name + UUID.

The full list of fields is described in CMIS Object.

Let’s focus on the key fields that are used in the following sections.

  • object.id – contains the AIP UUID.
  • object.succinctProperties.cmis:name – contains the AIP Name + UUID. This name is displayed in CMIS Workbench.
  • object.succinctProperties.cmis:path – contains the path through which this object (AIP) can be accessed.
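A minimal sketch of reading these three fields from a children response follows. The sample string is a shortened copy of the response above, and the function name list_aips is hypothetical.

```python
import json

# Shortened copy of the children response shown above.
SAMPLE_CHILDREN = """
{
  "numItems": 1,
  "moreItems": false,
  "objects": [
    {
      "object": {
        "id": "aba7175b-856e-4b03-9201-e9ce551c0f9f",
        "succinctProperties": {
          "cmis:name": "Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f",
          "cmis:path": "/Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f"
        }
      },
      "pathSegment": "Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f"
    }
  ]
}
"""


def list_aips(response_text: str) -> list:
    """Extract UUID, name and path of every AIP in a children response."""
    data = json.loads(response_text)
    aips = []
    for entry in data["objects"]:
        obj = entry["object"]
        props = obj["succinctProperties"]
        aips.append({
            "uuid": obj["id"],
            "name": props["cmis:name"],
            "path": props["cmis:path"],
        })
    return aips


for aip in list_aips(SAMPLE_CHILDREN):
    print(aip["uuid"], aip["path"])
```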

With this information we can browse the AIP’s file structure.

Browse File and Folder Structure of AIP

From the previous step we have the AIP Object ID (UUID) and the AIP access path.

Either of them can be used to browse the AIP’s internal structure. Request:

<source lang="http"> GET {rootFolderUrl}/{AIP_Path}?cmisselector=children&succinct=true </source> Or:

<source lang="http"> GET {repositoryUrl}?objectId={AIP UUID}&cmisselector=children&succinct=true </source> According to the CMIS Specification, an object can be accessed either through its path URL or by passing its Object ID to the Repository URL. Read more about it in the

CMIS Specification.

Response:

<source lang="json"> {

   "numItems": 5,
   "moreItems": false,
   "objects": [
       {
           "object": {
               "id": "/Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e/objects/GDBA-7774-01.pdf",
               "relationship": [],
               "changeEventInfo": null,
               "acl": null,
               "exactACL": null,
               "policyIds": null,
               "renditions": [],
               "succinctProperties": {
                   "cmis:name": "GDBA-7774-01.pdf",
                   "cmis:description": "GDBA-7774-01.pdf",
                   "cmis:objectId": "/Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e/objects/GDBA-7774-01.pdf",
                   "cmis:baseTypeId": "cmis:document",
                   "cmis:objectTypeId": "cmis:document",
                   "cmis:secondaryObjectTypeIds": null,
                   "cmis:createdBy": "Admin",
                   "cmis:creationDate": 946681200,
                   "cmis:lastModifiedBy": "Admin",
                   "cmis:lastModificationDate": 946681200,
                   "cmis:changeToken": "946681200",
                   "cmis:isImmutable": true,
                   "cmis:isLatestVersion": null,
                   "cmis:isMajorVersion": null,
                   "cmis:isLatestMajorVersion": null,
                   "cmis:isPrivateWorkingCopy": null,
                   "cmis:versionLabel": null,
                   "cmis:versionSeriesId": null,
                   "cmis:isVersionSeriesCheckedOut": null,
                   "cmis:versionSeriesCheckedOutBy": null,
                   "cmis:versionSeriesCheckedOutId": null,
                   "cmis:checkinComment": null,
                   "cmis:contentStreamLength": 0,
                   "cmis:contentStreamMimeType": "text/plain",
                   "cmis:contentStreamFileName": "GDBA-7774-01.pdf",
                   "cmis:contentStreamId": "/Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e/objects/GDBA-7774-01.pdf"
               }
           },
           "pathSegment": "GDBA-7774-01.pdf"
       },
       ...
   ]

} </source> The result has the same format as the list of AIPs,

because in terms of the CMIS protocol it is the same action: Get Children of an Object.

In this particular example we can see an ingested file from the AIP.

All files have the cmis:objectTypeId property set to cmis:document.

For folders it is cmis:folder.

In the current implementation of the CMIS Server, Object IDs of files and folders inside an AIP are equal to their paths.

This makes it possible to access an object either via its Object ID or via its path.
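The two ways of addressing an object can be sketched as two small URL builders. This is only an illustration of the URL patterns given above; the function names are assumptions, and the percent-encoding is added as a precaution for names that contain spaces or other special characters.

```python
from urllib.parse import quote, urlencode


def children_url_by_path(root_folder_url: str, object_path: str) -> str:
    """Build a Get Children URL from rootFolderUrl and an object path."""
    # Percent-encode the path but keep the "/" separators.
    path = quote(object_path.strip("/"), safe="/")
    query = urlencode({"cmisselector": "children", "succinct": "true"})
    return f"{root_folder_url.rstrip('/')}/{path}?{query}"


def children_url_by_id(repository_url: str, object_id: str) -> str:
    """Build a Get Children URL from repositoryUrl and an Object ID."""
    query = urlencode({
        "objectId": object_id,
        "cmisselector": "children",
        "succinct": "true",
    })
    return f"{repository_url}?{query}"


REPOSITORY_URL = "https://pic-devel-cmis.edepot.picturae.com/ea550466-7dff-4b57-844e-c1163b3a7168"
ROOT_FOLDER_URL = REPOSITORY_URL + "/root"

print(children_url_by_path(
    ROOT_FOLDER_URL,
    "/Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f",
))
print(children_url_by_id(REPOSITORY_URL, "aba7175b-856e-4b03-9201-e9ce551c0f9f"))
```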

The difference between files and folders is that a file can have a Content Stream to download.

The response contains the following properties that can be used to download a particular file:

  • object.succinctProperties.cmis:contentStreamId

    Contains the Content Stream ID that should be used to download the file content.

    In the current implementation the Content Stream ID is equal to the Object ID and thus to the object's path.

  • object.succinctProperties.cmis:contentStreamFileName

    Contains the file name that can be used as the name of the downloaded file.

The following Content Stream properties are not yet implemented:

  • cmis:contentStreamLength – should provide the file size.

    Right now it is always 0.

  • cmis:contentStreamMimeType – should provide the MIME type of the file content, so that clients can discover the file type.

    Right now it is always text/plain.
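As a rough sketch, a client could separate files from folders in a children response by checking cmis:objectTypeId, and collect the download-related properties for the files. The sample data below is shortened from the response above; the folder entry and the function name split_children are made up for illustration.

```python
AIP_PATH = "/Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e"

# Shortened children response; the "metadata" folder entry is hypothetical.
SAMPLE = {
    "objects": [
        {"object": {"succinctProperties": {
            "cmis:objectId": AIP_PATH + "/objects/GDBA-7774-01.pdf",
            "cmis:objectTypeId": "cmis:document",
            "cmis:contentStreamFileName": "GDBA-7774-01.pdf",
            "cmis:contentStreamId": AIP_PATH + "/objects/GDBA-7774-01.pdf",
        }}},
        {"object": {"succinctProperties": {
            "cmis:objectId": AIP_PATH + "/objects/metadata",
            "cmis:objectTypeId": "cmis:folder",
        }}},
    ]
}


def split_children(children: dict):
    """Return (files, folders) from a parsed children response."""
    files, folders = [], []
    for entry in children["objects"]:
        props = entry["object"]["succinctProperties"]
        if props["cmis:objectTypeId"] == "cmis:document":
            # Only documents carry a Content Stream that can be downloaded.
            files.append({
                "file_name": props["cmis:contentStreamFileName"],
                "stream_id": props["cmis:contentStreamId"],
            })
        elif props["cmis:objectTypeId"] == "cmis:folder":
            folders.append(props["cmis:objectId"])
    return files, folders


files, folders = split_children(SAMPLE)
print(files[0]["file_name"], folders)
```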

Downloading File from AIP

As described above, to download an AIP file you have to use its Content Stream ID, which is the same

as its Object ID or object path. The request to download a file is as follows:

<source lang="http"> GET {rootFolderUrl}/{File Path}?cmisselector=content </source> The response redirects to the file location with HTTP status 302 Found.

<source lang="http"> HTTP/1.1 302 Found

...

Location: /ss-content-proxy/5def6da9-9c70-4800-84e9-5e501bc6687e/Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e/data/objects/GDBA-7774-02.pdf </source> Your CMIS client therefore has to follow the HTTP redirect in order to download the file.
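The Location header in this example is a path without a hostname, so a client that handles the redirect itself has to resolve it against the URL it originally requested. A minimal sketch using Python's standard library is shown below; the function name resolve_redirect and the request URL are illustrative assumptions. Many HTTP clients perform this step automatically when they follow redirects.

```python
from urllib.parse import urljoin


def resolve_redirect(request_url: str, location: str) -> str:
    """Turn the Location header of a 302 response into an absolute URL."""
    # urljoin keeps the scheme and host of the request URL when the
    # Location value is only a path.
    return urljoin(request_url, location)


request_url = (
    "https://pic-devel-cmis.edepot.picturae.com/"
    "ea550466-7dff-4b57-844e-c1163b3a7168/root/"
    "Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e/"
    "objects/GDBA-7774-02.pdf?cmisselector=content"
)
location = (
    "/ss-content-proxy/5def6da9-9c70-4800-84e9-5e501bc6687e/"
    "Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e/"
    "data/objects/GDBA-7774-02.pdf"
)
print(resolve_redirect(request_url, location))
```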

Accessing files without CMIS

As shown in the section above, the CMIS Server redirects an HTTP client to a specific file location to

download a file. It is possible to construct this URL yourself and download the file

without using CMIS at all.

However, there are a few requirements:

  • You have to know the exact Repository ID, which is the Archivematica Location UUID assigned to your user.
  • You have to know the exact AIP Name (including name and UUID).
  • You have to know the exact path of the file inside the AIP package.

If the requirements above are met, you can build a URL to download the file.

The URL is built as follows:

{CMIS Endpoint} + /ss-content-proxy/ + {Repository ID} + /{AIP Name} + /{File Path inside AIP}

For example, take this URL:

https://pic-devel-cmis.edepot.picturae.com/ss-content-proxy/5def6da9-9c70-4800-84e9-5e501bc6687e/Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e/data/objects/GDBA-7774-02.pdf

It consists of the following parts:

  • CMIS Endpoint – https://pic-devel-cmis.edepot.picturae.com

  • Repository ID – 5def6da9-9c70-4800-84e9-5e501bc6687e

  • AIP Name – Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e

  • File Path – data/objects/GDBA-7774-02.pdf

    Please note that the file location contains the root folder data, which is not shown in the CMIS AIP structure, so you have to include it explicitly.

Now you can download the file without using CMIS.
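The URL rule above can be sketched as a small helper. The function name direct_file_url is hypothetical, and the percent-encoding is an added precaution for names with special characters; the values are those of the example URL.

```python
from urllib.parse import quote


def direct_file_url(endpoint: str, repository_id: str,
                    aip_name: str, file_path: str) -> str:
    """Build a direct download URL following the rule described above."""
    # file_path must start with the "data" root folder of the AIP.
    parts = [repository_id, aip_name, file_path.strip("/")]
    path = "/".join(quote(part, safe="/") for part in parts)
    return f"{endpoint.rstrip('/')}/ss-content-proxy/{path}"


url = direct_file_url(
    "https://pic-devel-cmis.edepot.picturae.com",
    "5def6da9-9c70-4800-84e9-5e501bc6687e",
    "Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e",
    "data/objects/GDBA-7774-02.pdf",
)
print(url)
```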

Note: Downloading large files may be slow, because it takes some time to extract a file from an AIP archive.

CMIS Workbench

Installation

To connect via CMIS Workbench you need to have Java installed.

Follow the instructions on the official Java website to download and install it on your machine (Windows, macOS, Linux, etc.).

Once Java is installed, check that it is accessible from your terminal by typing

java --version

Example:

<source lang="bash"> $ java --version

java 10.0.1 2018-04-17

Java(TM) SE Runtime Environment 18.3 (build 10.0.1+10)

Java HotSpot(TM) 64-Bit Server VM 18.3 (build 10.0.1+10, mixed mode) </source> Now you can download CMIS Workbench from the official mirror.

Unpack it to a location of your choice. Example (macOS, Linux):

<source lang="bash"> curl -o cmis-workbench.zip http://www-eu.apache.org/dist/chemistry/opencmis/1.1.0/chemistry-opencmis-workbench-1.1.0-full.zip

mkdir cmis-workbench

unzip cmis-workbench.zip -d cmis-workbench/

cd cmis-workbench/ </source> After unzipping the package, you can start the Workbench by running ./workbench.command

Usage

After starting CMIS Workbench you should see the following window:

CMIS

Now you need to log in.

  • URL – your CMIS Server endpoint (Example: https://pic-devel-cmis.edepot.picturae.com)
  • Binding – Browser
  • Username – your username (Example: demo)
  • Password – your password (Example: demo)
  • Authentication – Standard
  • Leave the remaining parameters as they are.

Then click the Load Repositories button and, once the list of Repositories has loaded, click Login.

Now you should see the following window:

CMIS Browse

On the left side you can see the list of ingested AIPs, and on the right side the CMIS Object details.

You can double-click a folder (AIP package) to browse its file and folder structure.

Once you have found the file you need, right-click it and select Download.

CMIS File Download

Ignore the MIME error message. This is an issue with CMIS Workbench.

After you click the Download button, it may take some time to extract the file from the AIP and start downloading it.