OpenText Scanner

Introduction

The OpenText Scanner extracts objects such as documents, folders and compound documents from OpenText Content Server (OTCS) and saves this data to migration-center for further processing. The key features of the OpenText Scanner are:

  • Supports OTCS versions: 9.7.1, 10.0, 10.5, 16.0, 16.2, 16.2.4, 20.4, 21.4

  • Export document content and metadata (versions, ACLs, categories, attribute sets, classifications, renditions)

  • Export folders and their metadata (categories, attribute sets, classifications)

  • Export projects and their metadata

  • Export shortcuts of documents and folders

  • Scale the performance by using multiple threads for scanning data

Known issues & limitations

  • User fields from a category that contain deleted users are scanned as IDs only with a warning in the log (#52982)

  • Classifications not supported for version 21.4 (#63081)

Exporting from OpenText

Exporting objects

The OpenText scanner connects to the OpenText Content Server through the “webserviceURL” set in the scanner properties and can export folders, documents and compound documents. The account used to connect to the Content Server must have System Administration Rights in order to extract the content. All subfolders of the specified folder(s) are processed automatically; an option for excluding selected subfolders from scanning is also available. Email folders are supported, so the documents and emails located in them are scanned as well. See OpenText Scanner Properties below for more information about the features and configuration parameters available in the OpenText scanner.

In addition to the documents themselves, renditions and properties can also be extracted. The content and renditions are exported as files on disk, while the properties are stored in the migration-center database, as is standard for all migration-center scanners.

After a scan has completed, the newly scanned documents and their properties are available for further processing in migration-center.

The following object types are currently supported by the OpenText scanner:

  • Any type of container node (the following types have been tested successfully so far: Folder, Email Folder, Project, Binder, EcmWorkspace)

  • Document (ID: 144)

  • Compound Document (ID: 136)

  • Email (ID: 749)

  • Shortcut

  • Generation

  • CAD Document (ID: 736) - scanned as a regular document

Objects of any other type are ignored during initialization. The run log lists all node types of the scanned containers, as well as all node types within the scan scope that were not exported because they are not supported.

Exporting the ACL metadata for folders and documents

The ACLs of the scanned folders and documents are exported automatically as the source attribute ACLs.

The ACLs attribute can have multiple values. Each value has the following format:

<ACLType#RightName#Permission-1|Permission-2|Permission-n>

The following table describes all valid values for defining a correct ACLs value:

Example:

ACL#csuser#See|SeeContents

Owner#manager1#See|SeeContents|Modify
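When post-processing scanned data outside migration-center, a single ACLs value can be split into its three parts. The following is a minimal Python sketch based only on the format shown above; the names `AclEntry` and `parse_acl_value` are hypothetical and not part of migration-center, and the sketch assumes that neither the type nor the right name contains a “#” character.

```python
from typing import NamedTuple


class AclEntry(NamedTuple):
    acl_type: str      # first field, e.g. "ACL" or "Owner"
    right_name: str    # second field, the user or group the entry applies to
    permissions: list  # third field split on "|", e.g. ["See", "SeeContents"]


def parse_acl_value(value):
    """Split one ACLs value of the form Type#Name#Perm-1|Perm-2|Perm-n.

    Hypothetical helper: assumes '#' does not occur inside the type
    or the right name.
    """
    parts = value.strip().split("#")
    if len(parts) != 3:
        raise ValueError(
            "expected 3 '#'-separated fields, got %d: %r" % (len(parts), value)
        )
    acl_type, right_name, raw_permissions = parts
    # Drop empty items so a trailing "|" does not produce a blank permission.
    permissions = [p for p in raw_permissions.split("|") if p]
    return AclEntry(acl_type, right_name, permissions)


if __name__ == "__main__":
    for value in ("ACL#csuser#See|SeeContents",
                  "Owner#manager1#See|SeeContents|Modify"):
        print(parse_acl_value(value))
```

Because the attribute is multi-valued, the helper is meant to be called once per value rather than on the whole attribute at once.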

OpenText Scanner Properties

To create a new OpenText Scanner job, select OpenText as the adapter type from the list of available connectors in the Scanner Properties window. Once the adapter type has been selected, the Parameters list is populated with the parameters specific to that adapter type, in this case the OpenText parameters.

The Properties window of a scanner can be accessed by double-clicking a scanner in the list or selecting the [Properties] button or entry from the toolbar or context menu.

Common scanner parameters

OpenText scanner parameters

Known issues

When extracting the Owner, Version Created By, Created By or any other attribute that references a user, the Content Server may be unable to resolve the internal ID of the user. In that case the scanner extracts the user ID instead of the username and logs a warning in the report log.

Email folders can only be extracted from Content Server 16 and later.

The wildcard feature works properly in most cases, but it has some limitations:

  • Folders whose names contain the character “?” are not scanned when the “*” wildcard is used. Example: "/*" will not scan "/test?name"

  • Wildcards in the middle of a path do not work: “/*/Level2” will not scan any documents under the “Level2” folder

  • “/Level1/*exactname*” will not scan any documents located in the “exactname” folder

When scanning documents by using “scanFolderPaths”, if two or more paths overlap, the documents located under the overlapping paths are scanned two or more times. This does not affect the content integrity of the documents: each duplicate document has a separate content location.

Example:

/folder1/folder2/doc1

/folder1/folder3/doc2

/folder1/folder3/doc3

scanFolderPaths: “/folder1 | /folder1/folder3”

In the above scenario, the documents “doc2” and “doc3” will be scanned twice.

The same applies to “rootFolderIds”: if one of the specified folders is a descendant of another specified folder, the documents under it are scanned two or more times.
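Overlapping entries can be spotted before the scanner is started. The sketch below is a hypothetical pre-check in Python, not a migration-center feature; it assumes the paths are separated by “|” as in the example, are plain paths without wildcards, and that one path overlaps another when it lies underneath it.

```python
def split_scan_folder_paths(raw):
    """Split a scanFolderPaths value on '|' and normalise each path."""
    paths = []
    for item in raw.split("|"):
        item = item.strip()
        if item:
            # Remove a trailing slash, but keep the root path "/" intact.
            paths.append(item.rstrip("/") or "/")
    return paths


def is_under(inner, outer):
    """Return True if 'inner' is a strict descendant of 'outer'."""
    prefix = "/" if outer == "/" else outer + "/"
    return inner != outer and inner.startswith(prefix)


def find_overlaps(paths):
    """Return (outer, inner) pairs where 'inner' lies under 'outer'."""
    return [(outer, inner)
            for outer in paths
            for inner in paths
            if is_under(inner, outer)]


if __name__ == "__main__":
    configured = split_scan_folder_paths("/folder1 | /folder1/folder3")
    for outer, inner in find_overlaps(configured):
        print("%s is already covered by %s" % (inner, outer))
```

Comparing against the prefix with a trailing slash avoids reporting sibling folders such as “/folder1” and “/folder10” as overlapping.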

Example:

rootFolderIds* : 2000, 2001

If the folder with ID 2001 is a child of the folder with ID 2000, all objects under the folder 2001 will be scanned twice.
