CMIS Server
Archivematica CMIS Server User Manual
This manual describes how to access AIPs and AIP files
that were ingested and stored in the Archivematica Storage Service.
Links and References:
- [CMIS Specification](http://docs.oasis-open.org/cmis/CMIS/v1.1/errata01/os/CMIS-v1.1-errata01-os-complete.html)
- [Apache Chemistry Open CMIS](https://chemistry.apache.org/java/opencmis.html)
These resources gather all the information needed by developers and users of the CMIS protocol.
Apache Chemistry also includes CMIS Workbench, a useful tool for browsing and testing a CMIS Server.
Features Implemented:
- Segregated User Access to AIPs per Location.
A user can access AIPs only from the location assigned to that user.
- Browse all ingested AIPs in a Location.
- Browse the file and folder structure of an AIP.
- Download a whole AIP.
- Download a particular file from an AIP.
Connecting to CMIS
The current implementation of CMIS allows only
[Browser Binding](http://docs.oasis-open.org/cmis/CMIS/v1.1/errata01/os/CMIS-v1.1-errata01-os-complete.html#x1-5360005) connections, so CMIS clients must use the Browser Binding.
Before connecting to the CMIS Server, make sure you have your credentials (username and password)
and the CMIS Endpoint URL.
Example:
- Username -- `demo`
- Password -- `demo`
For testing purposes it is recommended to use CMIS Workbench (see the Apache Chemistry link above).
It can help you understand the overall CMIS protocol and its data structures.
To develop a bare CMIS client, follow the steps in the next section.
Browser Binding API
All requests to the CMIS Server must include [HTTP Basic Authentication](https://tools.ietf.org/html/rfc7617)
credentials built from your username and password.
Example HTTP Request:
```http
GET / HTTP/1.1
Host: pic-devel-cmis.edepot.picturae.com
Authorization: Basic ZGVtbzpkZW1v
```
The Authorization header contains the Base64-encoded credentials string `demo:demo`.
In the following sections we assume that every request includes the Authorization header
and is sent to the CMIS Endpoint URL. For example, `GET /` translates to
the following HTTP request:
```http
GET / HTTP/1.1
Host: your-cmis-server-endpoint.nl
Authorization: Basic <YOUR ENCODED CREDENTIALS>
```
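The header construction described above can be sketched in Python with the standard library only. The helper names `basic_auth_header` and `build_request` are illustrative and not part of any CMIS library, and the endpoint is the placeholder host from the example above.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Return the Authorization header value for HTTP Basic Authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request that carries the Authorization header."""
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", basic_auth_header(username, password))
    return request


# Placeholder endpoint; replace it with your own CMIS Endpoint URL.
request = build_request("https://your-cmis-server-endpoint.nl/", "demo", "demo")
print(request.get_header("Authorization"))  # Basic ZGVtbzpkZW1v
```

Passing the request to `urllib.request.urlopen(request)` would send it to the server.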
Getting CMIS Repository
Request
```http
GET /
```
Response:
```json
{
"ea550466-7dff-4b57-844e-c1163b3a7168": {
"repositoryId": "ea550466-7dff-4b57-844e-c1163b3a7168",
"repositoryName": "Archivematica Repository",
"repositoryDescription": "Repository for Accessing Archivematica AIPs",
"vendorName": "Picturae",
"productName": "Picturae CMIS Server",
"productVersion": "0.0.1",
"rootFolderId": "root",
"latestChangeLogToken": null,
"capabilities": {
"capabilityACL": "none",
"capabilityAllVersionsSearchable": false,
"capabilityChanges": "none",
"capabilityContentStreamUpdatability": "none",
"capabilityGetDescendants": false,
"capabilityGetFolderTree": false,
"capabilityOrderBy": "none",
"capabilityMultifiling": false,
"capabilityPWCSearchable": false,
"capabilityPWCUpdatable": false,
"capabilityQuery": "none",
"capabilityRenditions": "read",
"capabilityUnfiling": false,
"capabilityVersionSpecificFiling": false,
"capabilityJoin": "none",
"capabilityCreatablePropertyTypes": null,
"capabilityNewTypeSettableAttributes": null
},
"aclCapability": null,
"cmisVersionSupported": "1.1",
"thinClientURI": null,
"changesIncomplete": null,
"changesOnType": [],
"principalAnonymous": null,
"principalAnyone": null,
"extendedFeatures": [],
"rootFolderUrl": "https://pic-devel-cmis.edepot.picturae.com/ea550466-7dff-4b57-844e-c1163b3a7168/root",
"repositoryUrl": "https://pic-devel-cmis.edepot.picturae.com/ea550466-7dff-4b57-844e-c1163b3a7168"
}
}
```
The response contains a map with the single Repository that the user can access
(multiple repositories are not yet supported).
The Repository Info contains a few major parts that are used in the next sections:
- `repositoryId` -- the unique ID of the repository. It is also the key of the repository entry in the response.
- `rootFolderId` -- the CMIS Object ID of the repository's root folder.
In the current implementation it is always `root`.
- `repositoryUrl` and `rootFolderUrl` -- these URLs MUST be used as entry points for further
communication with the Repository (Get Object, Get Children, Get Content, etc.).
The Repository URL and Root Folder URL are built from the current CMIS Server hostname and the Repository ID.
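As a minimal sketch, a client could pick these entry points out of the parsed response like this. The function name `repository_entrypoints` is an assumption made for illustration, and the sample body is a trimmed copy of the response above.

```python
import json


def repository_entrypoints(repositories: dict) -> dict:
    """Return the entry points of the single repository in a `GET /` response."""
    if len(repositories) != 1:
        # The manual states that exactly one repository is returned.
        raise ValueError("expected exactly one repository in the response")
    info = next(iter(repositories.values()))
    return {
        "repositoryId": info["repositoryId"],
        "rootFolderId": info["rootFolderId"],
        "repositoryUrl": info["repositoryUrl"],
        "rootFolderUrl": info["rootFolderUrl"],
    }


# Trimmed copy of the example response; unused fields are left out.
body = """
{
  "ea550466-7dff-4b57-844e-c1163b3a7168": {
    "repositoryId": "ea550466-7dff-4b57-844e-c1163b3a7168",
    "rootFolderId": "root",
    "rootFolderUrl": "https://pic-devel-cmis.edepot.picturae.com/ea550466-7dff-4b57-844e-c1163b3a7168/root",
    "repositoryUrl": "https://pic-devel-cmis.edepot.picturae.com/ea550466-7dff-4b57-844e-c1163b3a7168"
  }
}
"""
entry = repository_entrypoints(json.loads(body))
print(entry["rootFolderUrl"])
```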
Getting List of Ingested AIPs
To obtain the list of your ingested AIPs, use the following request:
```http
GET {rootFolderUrl}/?cmisselector=children&succinct=true
```
Response Example:
```json
{
"numItems": 1,
"moreItems": false,
"objects": [
{
"object": {
"id": "aba7175b-856e-4b03-9201-e9ce551c0f9f",
"relationship": [],
"changeEventInfo": null,
"acl": null,
"exactACL": null,
"policyIds": null,
"renditions": [
{
"streamId": "aba7175b-856e-4b03-9201-e9ce551c0f9f",
"mimeType": "application/x-tar",
"length": 17408095,
"kind": "package",
"title": "Archived Package",
"height": null,
"width": null,
"renditionDocumentId": null
}
],
"succinctProperties": {
"cmis:name": "Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f",
"cmis:description": "Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f",
"cmis:objectId": "aba7175b-856e-4b03-9201-e9ce551c0f9f",
"cmis:baseTypeId": "cmis:folder",
"cmis:objectTypeId": "cmis:folder",
"cmis:secondaryObjectTypeIds": null,
"cmis:createdBy": "Admin",
"cmis:creationDate": 946681200,
"cmis:lastModifiedBy": "Admin",
"cmis:lastModificationDate": 946681200,
"cmis:changeToken": "946681200",
"cmis:parentId": "root",
"cmis:path": "/Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f",
"cmis:allowedChildObjectTypeIds": null
}
},
"pathSegment": "Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f"
}
]
}
```
The response contains the list of ingested AIPs, each named in the format Name + UUID.
The full list of fields is described in [CMIS Object](http://docs.oasis-open.org/cmis/CMIS/v1.1/errata01/os/CMIS-v1.1-errata01-os-complete.html#x1-220002).
Let's focus on the major parts that are used in the following sections:
- `object.id` -- contains the AIP UUID.
- `object.succinctProperties.cmis:name` -- contains the AIP Name + UUID. This name is displayed in CMIS Workbench.
- `object.succinctProperties.cmis:path` -- contains the path through which this object (AIP) can be accessed.
With these values we can browse the AIP's file structure.
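The following Python sketch shows one way to build the Get Children URL and to read the three fields listed above from a parsed response. The names `children_url` and `list_aips` are hypothetical helpers, and the sample dictionary is a trimmed copy of the example response.

```python
def children_url(folder_url: str) -> str:
    """Build the Get Children URL for a folder URL such as rootFolderUrl."""
    return folder_url.rstrip("/") + "/?cmisselector=children&succinct=true"


def list_aips(children: dict) -> list:
    """Return (uuid, name, path) for every object in a children response."""
    aips = []
    for item in children.get("objects", []):
        obj = item["object"]
        props = obj["succinctProperties"]
        aips.append((obj["id"], props["cmis:name"], props["cmis:path"]))
    return aips


# Trimmed copy of the example response; unused fields are left out.
sample = {
    "numItems": 1,
    "moreItems": False,
    "objects": [
        {
            "object": {
                "id": "aba7175b-856e-4b03-9201-e9ce551c0f9f",
                "succinctProperties": {
                    "cmis:name": "Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f",
                    "cmis:path": "/Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f",
                },
            }
        }
    ],
}

for uuid, name, path in list_aips(sample):
    print(uuid, name, path)
```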
Browse File and Folder Structure of AIP
From the previous step we have the AIP Object ID (UUID) and the AIP access path,
so we can browse the AIP's internal structure. Request:
```http
GET {rootFolderUrl}/{AIP_Path}?cmisselector=children&succinct=true
```
Or:
```http
GET {repositoryUrl}?objectId={AIP UUID}&cmisselector=children&succinct=true
```
According to the CMIS Specification, an object can be accessed either through its path URL or by passing its Object ID to the Repository URL. Read more about it in the
[CMIS Specification](http://docs.oasis-open.org/cmis/CMIS/v1.1/errata01/os/CMIS-v1.1-errata01-os-complete.html#x1-5580004).
Response:
```json
{
"numItems": 5,
"moreItems": false,
"objects": [
{
"object": {
"id": "/Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e/objects/GDBA-7774-01.pdf",
"relationship": [],
"changeEventInfo": null,
"acl": null,
"exactACL": null,
"policyIds": null,
"renditions": [],
"succinctProperties": {
"cmis:name": "GDBA-7774-01.pdf",
"cmis:description": "GDBA-7774-01.pdf",
"cmis:objectId": "/Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e/objects/GDBA-7774-01.pdf",
"cmis:baseTypeId": "cmis:document",
"cmis:objectTypeId": "cmis:document",
"cmis:secondaryObjectTypeIds": null,
"cmis:createdBy": "Admin",
"cmis:creationDate": 946681200,
"cmis:lastModifiedBy": "Admin",
"cmis:lastModificationDate": 946681200,
"cmis:changeToken": "946681200",
"cmis:isImmutable": true,
"cmis:isLatestVersion": null,
"cmis:isMajorVersion": null,
"cmis:isLatestMajorVersion": null,
"cmis:isPrivateWorkingCopy": null,
"cmis:versionLabel": null,
"cmis:versionSeriesId": null,
"cmis:isVersionSeriesCheckedOut": null,
"cmis:versionSeriesCheckedOutBy": null,
"cmis:versionSeriesCheckedOutId": null,
"cmis:checkinComment": null,
"cmis:contentStreamLength": 0,
"cmis:contentStreamMimeType": "text/plain",
"cmis:contentStreamFileName": "GDBA-7774-01.pdf",
"cmis:contentStreamId": "/Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e/objects/GDBA-7774-01.pdf"
}
},
"pathSegment": "GDBA-7774-01.pdf"
},
...
]
}
```
The result has the same format as the list of AIPs,
because in terms of the CMIS protocol it is the same action -- Get Children of an Object.
In this particular example we can see a file from the ingested AIP.
All files have the `cmis:objectTypeId` property set to `cmis:document`.
For folders it is `cmis:folder`.
In the current implementation of the CMIS Server, Object IDs are equal to their paths, as the `cmis:objectId` in the example shows.
This makes it possible to access an object either via its Object ID or via its path.
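The two ways of addressing an object can be sketched in Python as follows. Both helper names are assumptions made for illustration, and percent-encoding the path and the Object ID is a general HTTP precaution rather than something this manual prescribes.

```python
from urllib.parse import quote, urlencode


def children_url_by_path(root_folder_url: str, object_path: str) -> str:
    """Address the object by its path below the Root Folder URL."""
    # Percent-encode the path but keep the "/" separators.
    path = quote(object_path.strip("/"), safe="/")
    return f"{root_folder_url.rstrip('/')}/{path}?cmisselector=children&succinct=true"


def children_url_by_id(repository_url: str, object_id: str) -> str:
    """Address the object by passing its Object ID to the Repository URL."""
    query = urlencode(
        {"objectId": object_id, "cmisselector": "children", "succinct": "true"}
    )
    return f"{repository_url.rstrip('/')}?{query}"


repository_url = "https://pic-devel-cmis.edepot.picturae.com/ea550466-7dff-4b57-844e-c1163b3a7168"
by_path = children_url_by_path(
    repository_url + "/root",
    "/Test_dubbellebestanden2_20180727_2-aba7175b-856e-4b03-9201-e9ce551c0f9f",
)
by_id = children_url_by_id(repository_url, "aba7175b-856e-4b03-9201-e9ce551c0f9f")
print(by_path)
print(by_id)
```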
The difference between files and folders is that a file can have a Content Stream to download.
The response contains the following properties that can be used to download a particular file:
- `object.succinctProperties.cmis:contentStreamId` --
contains the Content Stream ID that should be used to download the file content.
In the current implementation the Content Stream ID is equal to the Object ID and thus to the object's path.
- `object.succinctProperties.cmis:contentStreamFileName` --
contains the file name that can be used as the name of the downloaded file.
The following Content Stream related properties still need to be implemented:
- `cmis:contentStreamLength` -- should provide the file size.
Right now it is always `0`.
- `cmis:contentStreamMimeType` -- should provide the MIME type of the file content, so that the file type can be discovered.
Right now it is always `text/plain`.
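A client can use `cmis:objectTypeId` to tell files from folders and collect the download-related properties. The sketch below assumes a parsed children response. `split_children` is a hypothetical helper, and the folder entry in the sample is invented for illustration (only the file entry comes from the example response).

```python
def split_children(children: dict):
    """Split a children response into folder IDs and downloadable files."""
    folders, files = [], []
    for item in children.get("objects", []):
        props = item["object"]["succinctProperties"]
        if props["cmis:objectTypeId"] == "cmis:document":
            files.append(
                {
                    "stream_id": props["cmis:contentStreamId"],
                    "file_name": props["cmis:contentStreamFileName"],
                }
            )
        elif props["cmis:objectTypeId"] == "cmis:folder":
            folders.append(props["cmis:objectId"])
    return folders, files


aip = "/Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e"
sample = {
    "objects": [
        {
            "object": {
                "succinctProperties": {
                    "cmis:objectId": aip + "/objects/GDBA-7774-01.pdf",
                    "cmis:objectTypeId": "cmis:document",
                    "cmis:contentStreamFileName": "GDBA-7774-01.pdf",
                    "cmis:contentStreamId": aip + "/objects/GDBA-7774-01.pdf",
                }
            }
        },
        {
            # Hypothetical folder entry, not taken from the example response.
            "object": {
                "succinctProperties": {
                    "cmis:objectId": aip + "/example-folder",
                    "cmis:objectTypeId": "cmis:folder",
                }
            }
        },
    ]
}

folders, files = split_children(sample)
print(folders)
print(files)
```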
Downloading File from AIP
As described above, to download an AIP file you have to use its Stream ID, which is the same
as its Object ID or Object Path. The request to download a file is the following:
```http
GET {rootFolderUrl}/{File Path}?cmisselector=content
```
The response redirects to the file location with HTTP status `302 Found`.
```http
HTTP/1.1 302 Found
...
Location: /ss-content-proxy/5def6da9-9c70-4800-84e9-5e501bc6687e/Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e/data/objects/GDBA-7774-02.pdf
```
Your CMIS client has to follow the HTTP redirect in order to download the file.
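A minimal Python sketch of such a download, using only the standard library. `urllib.request.urlopen` follows HTTP redirects by default, so no extra redirect handling is written here. The helper names are illustrative, and the sketch assumes the redirect target accepts the same request headers.

```python
import shutil
import urllib.request
from urllib.parse import quote


def content_url(root_folder_url: str, file_path: str) -> str:
    """Build the Get Content URL for a file path inside the repository."""
    path = quote(file_path.strip("/"), safe="/")
    return f"{root_folder_url.rstrip('/')}/{path}?cmisselector=content"


def download(url: str, authorization: str, target: str) -> None:
    """Stream the file content to `target`, following the redirect."""
    request = urllib.request.Request(url, headers={"Authorization": authorization})
    with urllib.request.urlopen(request) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)


url = content_url(
    "https://pic-devel-cmis.edepot.picturae.com/ea550466-7dff-4b57-844e-c1163b3a7168/root",
    "/Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e/objects/GDBA-7774-01.pdf",
)
print(url)
```

A call such as `download(url, "Basic <YOUR ENCODED CREDENTIALS>", "GDBA-7774-01.pdf")` would then save the file locally.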
Accessing files without CMIS
As shown in the section above, CMIS redirects the HTTP client to a specific file location to
download the file. It is possible to construct this URL yourself and download the needed file without
using CMIS at all.
However, there are a few requirements:
- You have to know the exact Repository ID, which is the Archivematica Location UUID assigned to your user.
- You have to know the exact AIP Name (including name and UUID).
- You have to know the exact path of the needed file inside the AIP package.
If the requirements above are met, you can build a URL to download the needed file.
The rule is the following:
`{CMIS Endpoint}` + `/ss-content-proxy/` + `{Repository ID}` + `/{AIP Name}` + `/{File Path inside AIP}`
For example, such a URL consists of the following parts:
- CMIS Endpoint -- `https://pic-devel-cmis.edepot.picturae.com`
- Repository ID -- `5def6da9-9c70-4800-84e9-5e501bc6687e`
- AIP Name -- `Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e`
- File Path -- `data/objects/GDBA-7774-02.pdf`
Please note that the file location contains the root folder `data`, which is not present in the CMIS AIP structure, so you have to include it explicitly.
Now you can download the needed file without using CMIS.
**Note** Downloading big files may be slow, because it takes some time to extract the file from the AIP archive.
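The URL rule above can be sketched in Python as follows. Both function names are hypothetical. `file_path_in_aip` assumes that every file shown in the CMIS structure lives under the `data` folder of the package, which is an inference from the example rather than a documented guarantee.

```python
from urllib.parse import quote


def file_path_in_aip(cmis_object_path: str):
    """Split a CMIS object path into (AIP name, file path including `data/`)."""
    aip_name, rest = cmis_object_path.strip("/").split("/", 1)
    # The `data` root folder is not part of the CMIS structure, so add it here.
    return aip_name, "data/" + rest


def direct_file_url(endpoint: str, repository_id: str, aip_name: str, file_path: str) -> str:
    """Join the parts according to the rule: endpoint + /ss-content-proxy/ + ..."""
    parts = [repository_id, aip_name, file_path.strip("/")]
    path = "/".join(quote(part, safe="/") for part in parts)
    return f"{endpoint.rstrip('/')}/ss-content-proxy/{path}"


aip_name, file_path = file_path_in_aip(
    "/Test_grootbestand_20180725_1-5def6da9-9c70-4800-84e9-5e501bc6687e/objects/GDBA-7774-02.pdf"
)
url = direct_file_url(
    "https://pic-devel-cmis.edepot.picturae.com",
    "5def6da9-9c70-4800-84e9-5e501bc6687e",
    aip_name,
    file_path,
)
print(url)
```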
CMIS Workbench
Installation
To connect via CMIS Workbench you need to install [Java](https://java.com/en/download/).
Follow the instructions on the official Java website to download and install it on your machine (Windows, macOS, Linux, etc.).
Once you have Java installed, check that it is accessible from your terminal. Type
`java --version` to see that it is working.
Example:
```bash
$ java --version
java 10.0.1 2018-04-17
Java(TM) SE Runtime Environment 18.3 (build 10.0.1+10)
Java HotSpot(TM) 64-Bit Server VM 18.3 (build 10.0.1+10, mixed mode)
```
Now you can download CMIS Workbench from the [official mirror](http://www-eu.apache.org/dist/chemistry/opencmis/1.1.0/chemistry-opencmis-workbench-1.1.0-full.zip).
Unpack it to your favorite location. Example (macOS, Linux):
```bash
curl -o cmis-workbench.zip http://www-eu.apache.org/dist/chemistry/opencmis/1.1.0/chemistry-opencmis-workbench-1.1.0-full.zip
mkdir cmis-workbench
unzip cmis-workbench.zip -d cmis-workbench/
cd cmis-workbench/
```
After unzipping the package, you can run it with `./workbench.command`.
Usage
After starting CMIS Workbench you should see the following window:
![CMIS](assets/cmis-open.png)
Now you need to log in:
- URL -- this is your CMIS Server endpoint. (Example: `https://pic-devel-cmis.edepot.picturae.com`)
- Binding -- Browser
- Username -- Your username (Example: `demo`)
- Password -- Your password (Example: `demo`)
- Authentication -- Standard
- Leave the rest of the parameters as they are.
Then click the `Load Repositories` button and, after the list of repositories loads, click Login.
Now you should see the following window:
![CMIS Browse](assets/cmis-browse.png)
On the left side you can see the list of ingested AIPs and on the right side the CMIS Object details.
You can double-click a folder (AIP package) to browse its file and folder structure.
Once you have reached the needed file, right-click it and select Download.
![CMIS File Download](assets/cmis-download.png)
Ignore the MIME error message. This is an issue with CMIS Workbench.
After you click the Download button, it may take some time to extract the file from the AIP and start downloading it.