# OpenText Scanner

## Introduction

The OpenText Scanner extracts objects such as documents, folders, and compound documents and saves this data to migration-center for further processing. The key features of the OpenText Scanner are:

* Supports OTCS versions: 9.7.1, 10.0, 10.5, 16.0, 16.2, 16.2.4, 20.4, 21.4, 22.4
* Export documents content and their metadata (versions, ACLs, categories, attribute sets, classification trees, records management classifications, renditions)
* Export folders and their metadata (categories, attribute sets, classifications)
* Export projects and their metadata
* Export shortcuts of documents and folders
* Scale the performance by using multiple threads for scanning data

## Known issues & limitations

* A "*MalformedURLException: no protocol*" error is thrown when the **rmWebserviceUrl** parameter has no value. (#66471)\
  Install **MC23.1\_hotfix1** to fix this issue.
* User fields from a category that contain deleted users are scanned as IDs only with a warning in the log (#52982)
* The scanner might extract the ID instead of the username when scanning attributes that reference a user. This happens when the internal id of the user cannot be resolved by the Content Server. When this happens, a warning is logged in the report log. (#52847)
* Email folders can only be extracted from Content Server 16 and later.
* Wildcards feature is supported, but it has some limitations:
  * Folders which contain the character “?” in the name will not be scanned using “\*” wildcard. Example: "/\*" will not scan "/test?name"
  * The wildcards in the middle of the path do not work: “/\*/Level2” will scan no documents under the “Level2” folder
  * “/Level1/\*exactname\*” will scan no documents which are located in “exactname” folder
* When using **scanFolderPaths** or **rootFolderIds**, if two or more paths overlap, the documents located under them will be **scanned twice** or more. Content is not affected, as each duplicate document will have a separate content location.
  * Example: if "folder1" contains "folder2" and scanFolderPaths is set to “/folder1, /folder1/folder2”, the contents of folder2 will be scanned twice.
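
Overlapping paths can be caught before a scan is started. The following is a minimal sketch, not part of migration-center: the helper names `split_path` and `find_overlaps` are hypothetical, the paths are assumed to be plain folder paths without wildcards, and overlaps introduced through **rootFolderIds** are not covered because resolving IDs to paths requires a Content Server lookup.

```python
from itertools import combinations


def split_path(path: str) -> tuple[str, ...]:
    """Split a folder path such as "/folder1/folder2" into its segments."""
    return tuple(segment for segment in path.strip().split("/") if segment)


def find_overlaps(paths: list[str]) -> list[tuple[str, str]]:
    """Return (outer, inner) pairs where inner equals or lies under outer."""
    overlaps = []
    for a, b in combinations(paths, 2):
        seg_a, seg_b = split_path(a), split_path(b)
        # Compare whole segments so "/folder1" does not match "/folder10".
        if seg_b[:len(seg_a)] == seg_a:
            overlaps.append((a, b))
        elif seg_a[:len(seg_b)] == seg_b:
            overlaps.append((b, a))
    return overlaps


print(find_overlaps(["/folder1", "/folder1/folder2", "/folder10"]))
# → [('/folder1', '/folder1/folder2')]
```

Removing the inner path of each reported pair leaves a set of paths in which every document is reached only once.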

## Exporting from OpenText

### Exporting objects

The OpenText scanner connects to the OpenText Content Server via the “webserviceUrl” set in the scanner properties and can export folders, documents or compound documents. The account used to connect to the Content Server must have System Administration Rights to extract the content. All subfolders of the specified folder(s) will automatically be processed as well; an option for excluding selected subfolders from scanning is also available. See [OpenText Scanner Properties](#opentext-scanner-properties) below for more information about the features and configuration parameters available in the OpenText scanner. Email folders are supported, so the documents and emails located in email folders are scanned as well.

In addition to the documents themselves, renditions and properties can also be extracted. The content and renditions are exported as files on disk, while the properties are stored in the migration-center database, as is standard for all migration-center scanners.

After a scan has completed, the newly scanned documents and their properties are available for further processing in migration-center.

Below is a list of object types, which are currently supported by the OpenText scanner:

* Containers: Folder, Email Folder, Project, Binder
* Document (ID: 144)
* Compound Document (ID: 136)
* Email (ID: 749)
* Shortcut
* Generation
* CAD Document (ID: 736) - *scanned as a regular document*
* Url
* Physical item

> The objects above are collected from other types of containers like Business Workspaces.

Any objects with different types will be ignored during initialization. A list of all node types of scanned containers is provided in the run log as well as all node types in the scope of scanning that were not exported because they are not supported.

### Exporting the ACL metadata for folders and documents

The ACLs of the scanned folders and documents will be exported automatically as source attribute *ACLs*.

The *ACLs* attribute can have multiple values, each with the following format:

*`<ACLType#RightName#Permission-1|Permission-2|Permission-n>`*

The following table describes all valid values for defining a correct ACLs value:

| **Element** | **Possible Values**                                                                                                                                                                                                                  | **Description**                                                                                                                            |
| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------ |
| ACLType     | <ul><li>Owner</li><li>OwnerGroup</li><li>Public</li><li>ACL</li></ul> | This refers to the default and assigned access. |
| RightName   | <ul><li>Content Server User Login Name</li><li>Content Server Group Name</li><li>-1</li></ul> | The owner name for owner access, the username or group name for assigned access, and “-1” for public access. |
| Permissions | <ul><li>See</li><li>SeeContents</li><li>Modify</li><li>EditAtts</li><li>CreateNode</li><li>Checkout</li><li>DeleteVersions</li><li>Delete</li><li>EditPerms</li></ul> | The granted permissions, separated by \|. |

Ex:

`ACL#csuser#See|SeeContents`\
`Owner#manager1#See|SeeContents|Modify`
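
When the *ACLs* values are processed outside migration-center, for example in a custom report, the format above can be split into its three elements. The following is a minimal sketch: the `AclEntry` class and the `parse_acl` function are hypothetical names, and the sketch assumes that the RightName never contains a "#" character.

```python
from dataclasses import dataclass

# Valid values taken from the table above.
ACL_TYPES = {"Owner", "OwnerGroup", "Public", "ACL"}
PERMISSIONS = {
    "See", "SeeContents", "Modify", "EditAtts", "CreateNode",
    "Checkout", "DeleteVersions", "Delete", "EditPerms",
}


@dataclass(frozen=True)
class AclEntry:
    acl_type: str
    right_name: str
    permissions: tuple[str, ...]


def parse_acl(value: str) -> AclEntry:
    """Parse one value of the form ACLType#RightName#Perm-1|Perm-2|Perm-n."""
    parts = value.split("#")
    if len(parts) != 3:
        raise ValueError(f"expected 3 '#'-separated elements: {value!r}")
    acl_type, right_name, raw_permissions = parts
    if acl_type not in ACL_TYPES:
        raise ValueError(f"unknown ACLType: {acl_type!r}")
    permissions = tuple(p for p in raw_permissions.split("|") if p)
    unknown = [p for p in permissions if p not in PERMISSIONS]
    if unknown:
        raise ValueError(f"unknown permissions: {unknown}")
    return AclEntry(acl_type, right_name, permissions)


entry = parse_acl("Owner#manager1#See|SeeContents|Modify")
print(entry.right_name, entry.permissions)
# → manager1 ('See', 'SeeContents', 'Modify')
```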

## OpenText Scanner Properties

To create a new OpenText Scanner job, select OpenText as the adapter type from the list of available connectors in the Scanner Properties window. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case the OpenText parameters.

The Properties window of a scanner can be accessed by double-clicking a scanner in the list or selecting the \[Properties] button or entry from the toolbar or context menu.

### Common scanner parameters

| **Configuration parameters** | **Values**                                                                                                                                                                                                                                                                                |
| ---------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Name\*                       | <p>Enter a unique name for this scanner</p><p><strong>Mandatory</strong></p>                                                                                                                                                                                                              |
| Adapter type\*               | <p>Select the “OpenText” connector from the list of available connectors</p><p><strong>Mandatory</strong></p>                                                                                                                                                                             |
| Location\*                   | <p>Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server is defined, migration-center will prompt the user to define a Job Server location when saving the scanner.</p><p><strong>Mandatory</strong></p> |
| Description                  | Enter a description for this job (optional)                                                                                                                                                                                                                                               |

### OpenText scanner parameters

| **Configuration parameters**  | **Values**                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| ----------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| username\*                    | <p>The OpenText Content Server user with "System Administration Rights". This privilege is required so the scanner can access all objects in the scanning scope.</p><p><strong>Mandatory</strong></p>                                                                                                                                                                                                                                                                                                                                                                  |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| password\*                    | <p>The user password</p><p><strong>Mandatory</strong></p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| webserviceUrl\*               | <p>The URL to the .NET Content Web Services.</p><p>Ex: <code>http://server:port/cws/Authentication.svc</code></p><p><strong>Mandatory</strong></p> | |
| authenticationWebserviceUrl\* | <p>The URL to a valid authentication webservice. Currently, CWS and OTDS authentication webservices are accepted.</p><p><strong>Mandatory</strong></p>                                                                                                                                                                                                                                                                                                                                                                                                                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| authenticationType\*          | <p>The authentication type. Valid values are CWS for standard content server authentication and OTDS for OpenText Directory Services authentication.</p><p><strong>Mandatory</strong></p>                                                                                                                                                                                                                                                                                                                                                                              |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| classificationsWebserviceUrl  | <p>The URL to a valid classification webservice or similar. Use only for Content Server 10.0.0 or later.</p><p>Ex: <code>http://server:port/les-services-classifications/Classifications.svc</code></p> | |
| rmWebserviceUrl               | <p>The URL of the Records Management webservice. This is necessary for scanning objects with Records Management classifications.</p><p>Ex: <code>http://server:port/les-recman/services/Classifications</code></p> | |
| sessionLength\*               | <p>The length of the session as set in Content Server. The length is given in minutes and must be greater than or equal to 16 minutes. This is required for refreshing the authentication token.</p><p><strong>Mandatory</strong></p> | |
| rootFolderIds\*               | <p>The IDs of the nodes whose containers and documents will be scanned. The IDs can be provided either as a list of node IDs separated by commas or as the path of a CSV file that contains one node ID per row. The CSV path must start with "@". By default, the value is 2000. Any folder IDs can be used.</p><p><strong>Mandatory</strong></p> | |
| scanFolderPaths               | <p>The list of folder paths where the scanner looks for objects (documents and folders), relative to the specified root folders. Each of the specified paths must be in at least one of the root folders. The paths must start with "/". Multiple paths can be separated using the “\|” character. If empty, the scanner will scan the entire folder structure under each specified root folder.</p><p>The following wildcards are allowed in the folder path:</p><ul><li>\* - replaces zero, one or multiple characters</li><li>? - replaces a single character</li></ul><p>Examples of using wildcards:</p><p>/Shelf/Drugs/Drug No. ?</p><p>/Shelf/Drugs/\*</p><p>/Shelf/Drugs/Drug No. ?/Test</p><p>/Shelf/Drugs/\*end/Ultimate</p> | |
| excludeFolderPaths            | The list of folder paths to be excluded from the scan. Paths must start with "/" and must be relative to at least one ID specified in **rootFolderIds.**                                                                                                                                                                                                                                                                                                                                                                                                               |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| exportDocuments               | Flag indicating if the documents will be exported. When **exportDocuments** is enabled, the scanner will scan all the documents and their versions linked under the folders specified in **rootFolderIds** and **scanFolderPaths**. The documents are exported as *OTCS(document)* objects to the MC database. | |
| exportLatestVersions          | A number specifying how many versions from every version tree will be exported starting from the latest version to the older versions. If it is empty, 0 or negative, all versions will be exported.                                                                                                                                                                                                                                                                                                                                                                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| exportCompoundDocuments       | <p>Flag indicating if the compound documents will be exported. When enabled the scanner will scan the latest version of the compound document together with its children. There will be no migration-center relations between the compound document and its children. The children are related to the compound document through the parent folder attribute.</p><p>The parameter <strong>exportCompoundDocuments</strong> can be enabled only when <strong>exportDocuments</strong> is enabled.</p>                                                                    |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| exportClassifications         | <p>Flag indicating if the classifications will be exported. When <strong>exportClassifications</strong> is enabled, the scanner will export the classifications of the scanned folders and documents in the source attribute "Classifications". Each classification will be a distinct value of the attribute "Classifications". The classification values will be saved as paths. The export of classifications will be available only for CS 10, 10.5 and 16 since CS 9.7.1 does not provide this functionality.</p><p>Ex: Corporate Information/News/Newsletter</p> |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| exportShortcuts               | Flag indicating if the shortcuts of scanned documents and folders will be exported. The full path of every shortcut pointing to the current object (document or folder) is scanned in the source attribute “Shortcuts”.                                                                                                                                                                                                                                                                                                                                                |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| exportShortcutsAsObjects      | Flag indicating if the shortcuts of scanned documents and folders will be exported as independent objects. The shortcuts and generations are exported as *OTCS(shortcut)* objects to MC database.                                                                                                                                                                                                                                                                                                                                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| exportURLs                    | Flag indicating if the URLs will be exported as *OTCS(object)* to the MC database. If the flag is not set, the URLs will not be exported. | |
| exportRenditions              | Flag indicating if the renditions will be exported.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| exportFolderStructure         | Flag indicating if the structure of folders will be exported. When enabled the scanner will scan entire folder structure under the folders configured by parameters **rootFolderIds** and **scanFolderPaths**. The scanner will export all folders as *OTCS(container)* objects to MC database.                                                                                                                                                                                                                                                                        |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| skipContent                   | Flag indicating if the export of document content from the repository will be skipped. If checked, only the metadata will be exported. | |
| exportLocation\*              | <p>The location where the exported object content should be saved. It can be a job server local folder or a shared folder. It must exist and it should be writable.</p><p><strong>Mandatory</strong></p>                                                                                                                                                                                                                                                                                                                                                               |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| computeChecksum | <p>When checked, the checksum of each scanned file is computed. This is useful for determining whether files with different names and from different locations in fact have the same content, as frequently happens when common documents are copied and stored by several users in a file share environment.</p><p>Do not enable this option unless necessary. The performance impact is significant because the scanner has to read the full content of each file to compute its checksum.</p> | |
| hashAlgorithm | <p>Specifies the algorithm used to compute the checksum of the scanned objects.</p><p>Possible values are "MD2", "MD5", "SHA-1", "SHA-224", "SHA-256", "SHA-384" and "SHA-512". The default value is MD5.</p> | |
| hashEncoding | <p>Specifies the encoding used to represent the computed checksum of the scanned objects.</p><p>Possible values are "HEX", "Base32" and "Base64". The default value is HEX.</p> | |
| numberOfThreads               | The number of threads that will be used for scanning the documents. Maximum allowed is 20.                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| loggingLevel\*                | <p>The scanner logging level to use: 1-Error, 2-Warn, 3-Info, 4-Debug.</p><p><strong>Mandatory</strong></p>                                                                                                                                                                                                                                                                                                                                                                                                                                                            |                                                                                                                                                                                                                                                                                                                                                                                                                                                |
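
To illustrate how **computeChecksum**, **hashAlgorithm** and **hashEncoding** relate to each other, the following is a minimal sketch, not the scanner's actual implementation. It assumes the checksum is produced with the standard Java `MessageDigest` API, whose algorithm names match the values listed for **hashAlgorithm**. The class `ChecksumSketch` and its methods are hypothetical names introduced for illustration. "Base32" is omitted because the Java standard library has no Base32 encoder, and the lowercase HEX output is an assumption about formatting.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Illustrative sketch only: class and method names are hypothetical
// and are not part of migration-center.
final class ChecksumSketch {

    // Reads the full content in chunks and feeds it to the digest.
    // This full read is why computing checksums slows down a scan.
    static String checksum(InputStream content, String hashAlgorithm, String hashEncoding) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance(hashAlgorithm); // e.g. "MD5", "SHA-256"
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported hashAlgorithm: " + hashAlgorithm, e);
        }
        byte[] buffer = new byte[8192];
        try {
            int read;
            while ((read = content.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return encode(digest.digest(), hashEncoding);
    }

    // Turns the raw hash bytes into text according to the chosen encoding.
    static String encode(byte[] hash, String hashEncoding) {
        switch (hashEncoding) {
            case "HEX":
                StringBuilder hex = new StringBuilder(hash.length * 2);
                for (byte b : hash) {
                    hex.append(String.format("%02x", b)); // two lowercase hex digits per byte
                }
                return hex.toString();
            case "Base64":
                return Base64.getEncoder().encodeToString(hash);
            default:
                // "Base32" is intentionally not covered by this sketch.
                throw new IllegalArgumentException("Unsupported hashEncoding in this sketch: " + hashEncoding);
        }
    }

    public static void main(String[] args) {
        InputStream content = new ByteArrayInputStream("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(checksum(content, "MD5", "HEX"));
    }
}
```

Two files with identical content yield the same checksum regardless of their names or locations, which is what makes the value usable for duplicate detection. The encoding only changes how the same hash bytes are written out as text.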
