OpenText Importer
Introduction
The OpenText Importer takes the objects processed in migration-center and imports them into an OpenText Content Server.
Known issues & limitations
Items that are not allowed in a physical item container can still be imported into it during delta migration (#50978)
RM classifications for physical objects are not removed during delta migration (#50979)
Physical object properties of type date are not updated during delta migration (#50980)
Installation and supported versions
The OpenText Importer is compatible with versions 10.5, 16.0, 16.4, 20.2 and 20.4 of OpenText Content Server. Version 10.0 is no longer supported.
It requires les-services (for v10.5) or Content Web Services (for v10.5+) to be installed on the Content Server. To set classifications on the imported files or folders, the Classification Webservice must be installed on the Content Server. To support Record Management Classifications, the Record Management Webservice is required.
Some importer features require installing some of the provided patches on the Content Server. The required patches are delivered with the MC kit in the folder “..\ServerComponents\Jobserver\lib\mc-otcs-importer\cspatches”.
For deployment, copy the provided patches to the folder .\patch on the Content Server and restart the Content Server.
Functional Description of Content Server Patch pat10000001
This patch extends the OpenText DOCMANSERVICE.Service.DocumentManagement.CreateSimpleFolder method.
The patch allows setting custom CreateDate, ModifyDate, FileCreateDate and FileModifyDate values for nodes and versions.
Working with OpenText Importer
To create a new OpenText Importer job, specify the respective adapter type in the importer’s properties window: from the list of available adapters, “OpenText” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case OpenText.
The -Properties window- of an importer can be accessed by double-clicking an importer in the list, by selecting the [Properties] button from the toolbar or from the context menu.
Common importer parameters
OpenText Importer parameters
Working with OpenText specific features
Working with folders
There are two ways to create folders in the OpenText repository with the importer:
On the fly when importing documents: When creating a new migration set, choose the “<source type>ToOpenText(document)” type. This will create a migration set containing documents targeted at OpenText. Use the “autoCreateFolders” parameter from the OpenText Importer configuration to generate the folder structure based on the values extracted in the system rule “ParentFolder”. No categories, classifications or permissions can be set on folders created this way.
Using a dedicated migration set for folders: When creating a new migration set, choose the “<source type>ToOpenText(container)” type. This will create a migration set containing folders targeted at OpenText.
Now only the scanner runs containing folder objects will be displayed on the |Filescan Selection| tab. Note that the number of objects shown for each scanner run now counts folders and not documents, which is why the number on display (folders) can differ from the total number of objects processed by the scan (if it contains other types of objects besides folders). When creating transformation rules for the migration set, keep in mind that folder-only migration sets have folder-specific attributes to work with, in this case attributes specifically targeted at OpenText folder objects. You can set permissions, categories and classifications on the imported folders.
When importing a folder migration set into a location where a folder structure is already in place, an error will be thrown for the folder objects that already exist. This behavior can only be avoided by skipping those objects manually, either by removing them from the migset or by putting them in a state that is invalid for import.
The importer parameter “autoCreateFolders” applies to both documents and folders migration sets.
Working with documents
The OpenText Importer allows importing documents from any supported source system to OpenText. For that, a migset of type “<source type>ToOpenText(document)” has to be created.
Importing documents with multiple versions is supported by the OpenText Importer. The structure of the version tree is generated by the scanners of the systems that support this feature and provide means to extract it. Although the version tree is immutable (i.e. the ordering of the objects relative to their antecedents cannot be changed), the type of each version (linear, minor or major) can be set in the system attribute “VersionControl” (see the following sections for more details).
All objects from a version structure must be imported, since each of them references its antecedent, going back to the very first version. Therefore, it is advised not to drop versions of an object between the scan and the import, as this will generate inconsistencies and errors. If an object is intended to be migrated without versions (i.e. only the current version of the object needs to be migrated), then the affected objects should be scanned without enabling the respective scanner’s versioning option.
Content validation
The Content Validation functionality is based on checksums computed for the document’s contents during scan (by the Scanner itself) and after import (by the Importer). The two checksums are then compared - for any given piece of content its checksums from before and after the migration should be identical.
If the two checksums differ in any way, this is an indication that the content has been corrupted/modified during/after migration. Modifications can happen on purpose or simply by user or environment error. Actual data corruption rarely happens and is usually due to software and/or hardware errors on the systems involved during content transfer or storage.
Content validation is always performed after the content has been imported to the target repository, thus adding another step to the migration process. Accordingly, using this feature may significantly increase import time, because every piece of content for every document has to be read back and its checksum computed in order to compare it against the initial checksum computed during scan.
This feature is controlled through the checkContentIntegrity parameter in the OpenText Importer (disabled by default).
This feature works only in tandem with a Scanner that supports it: Documentum Scanner, Filesystem Scanner, Database Scanner and SharePoint Scanner.
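The comparison described above can be illustrated with a minimal sketch. It is not the importer's implementation: the hash algorithm used by migration-center is not stated in this section, so SHA-256 and the helper names `file_checksum` and `content_is_intact` are assumptions chosen for illustration only.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to bound memory use.

    The algorithm is an assumption; the source does not name the one actually used.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def content_is_intact(checksum_at_scan, path_after_import, algorithm="sha256"):
    """Compare the checksum recorded at scan time with one computed after import."""
    return checksum_at_scan == file_checksum(path_after_import, algorithm)
```

Reading the content back in full for every document is what makes this step expensive, which is consistent with the feature being disabled by default.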
Working with compound documents
The current version of the importer allows importing virtual documents (VDs) scanned from Documentum as compound documents in OpenText. To create a compound document, the first value of the system attribute “target_type” must be “opentext_compound”. Setting categories and classifications on compound documents is supported in the same way as for normal documents. The children of the VD will be imported inside the compound documents in a “box-in-box” manner, similar to the way VDs, children, content, and VDs in VDs are represented in Documentum Webtop.
Importing multiple versions of compound documents is not supported. Therefore, the option for scanning only the last version of the virtual documents from Documentum should be activated. For more details please check the Documentum Scanner user guide.
Importing virtual documents as compound documents is done by the OpenText Importer in two steps:
Import all virtual documents as empty compound documents (CDs) and all non-virtual documents as normal documents. They are all imported into the folder specified in the "ParentFolder" attribute.
Add the children of the virtual documents to the compound documents, based on the VDRelation relations scanned from Documentum, in the following way:
if the child is linked in the folder where it was imported in step 1 and the importer parameter “moveDocumentsToCD” is checked, then the child document is moved to the compound document. If “moveDocumentsToCD” is not checked, then a shortcut to the child is added inside the compound document.
if the child is already linked in a compound document, then a shortcut is created in the current compound document (the case when a document is a child of multiple VDs)
If the virtual documents being imported as compound documents have content, the content is imported as the first child of the compound document having the same name as the compound document itself.
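The decision made for each child in step 2 can be sketched as follows. This is only an illustration of the rules listed above, not the importer's code; the function name `plan_children` and the representation of VDRelation entries as `(compound_document, child)` pairs are hypothetical.

```python
def plan_children(relations, move_documents_to_cd):
    """Decide, per relation, whether the child is moved or referenced by a shortcut.

    relations: iterable of (compound_document, child) pairs in processing order
               (a hypothetical stand-in for the scanned VDRelation relations).
    move_documents_to_cd: mirrors the importer parameter "moveDocumentsToCD".
    """
    moved = set()  # children that already live inside a compound document
    plan = []
    for parent, child in relations:
        if child not in moved and move_documents_to_cd:
            # Child is still in the folder from step 1: move it into the CD.
            action = "move"
            moved.add(child)
        else:
            # Either moving is disabled or the child already sits in another CD.
            action = "shortcut"
        plan.append((parent, child, action))
    return plan
```

With a child shared by two virtual documents, the sketch moves it into the first compound document and creates a shortcut in the second.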
Working with emails
The OpenText Importer allows importing emails scanned as MSG files from Outlook, Exchange or other supported systems. In order to be imported as emails, the objects in migration-center need to be associated with the target type “opentext_email”.
If the importer parameter “extractAttachmentsFromEmail” is checked, the importer will extract the email attachments and import them as cross references for the imported email. In this case the parameter “crossRefCode” should be set to the value of Email Cross-Reference in the RM settings (Records Management -> Records Management Administration -> System Settings -> RM Settings -> Email Cross-Reference).
When MSG files are scanned from a source system other than Outlook or Exchange, the importer can extract the email properties (subject, from, to, cc, body, etc.) from the email file. This is done during the import.
Working with physical objects
Starting with version 3.7 the importer allows users to import the scanned documents and folders as Physical Items. All predefined types of physical items (Non Container, Container and Box) are supported. For importing physical objects, the user needs to configure the webservice URL in the “physicalObjectsWebserviceURL” parameter.
Physical Items can be imported with migsets having the appropriate processing type: <SourceType>toOpentext(physicalObj). Physical object migration sets have a predefined set of rules for system attributes listed under -Rules for system attributes- in the -Transformation rules- window. The system attributes have a special meaning for the importer, so they must be set to valid values as described below.
Assigning Physical Items in Box containers
Until version 3.12 of migration-center assigning physical items into physical box items was done by setting the ParentFolder system rule to point to the box item. This would import the physical item in the parent folder of the box item and assign it to the box item.
Starting from version 3.13 a new parameter was added, PhysicalBoxPath, for specifying which box item the object should be assigned to. This allows setting the ParentFolder to any other location, in order to better replicate the OTCS functionality.
Rule for System Attribute - target_type
The first value of the target type should be set to a valid Physical Object Type that is defined in the section Manage/Object types…. The accepted values should start with one of the following values: “opentext_physical_item”, “opentext_physical_box”, “opentext_physical_container”. To distinguish between different item types you can create new types such as “opentext_physical_item_notebook” or “opentext_physical_container_records”. The provided object types can be extended with the custom physical properties that are defined in the Content Server for every Physical Item Type (Physical Properties).
Starting with the second value of “target_type” rule the user is allowed to optionally set one or multiple categories and records management classifications. See the dedicated chapters for more details regarding these features.
Working with Records Management Classifications
Starting with version 3.7 the Records Management Classifications are no longer handled as system rules; they are handled internally as a dedicated and predefined object type: “opentext_rm_classification”.
To assign a records management classification to the imported objects (Documents, Containers and Physical Items), the “target_type” system rule should have one value (after the first value) equal to “opentext_rm_classification”. To set the specific attributes of the classification, the rules must be associated with the predefined type “opentext_rm_classification” and its attributes, as shown in the screenshots below.
Only the attributes of “opentext_rm_classification” object type provided with the installation are accepted by the importer.
Categories
The OpenText Importer allows assigning categories to the imported documents and folders. A category is handled internally by the migration-center client as a target object type, and therefore the categories have to be defined in the migration-center client as object types (menu Manage/Object types…):
Since multiple categories with the same name can exist in an OpenText repository the category name must be always followed by its internal id. Ex: BankCustomer-44632.
The sets defined in the OpenText categories are supported by migration-center. The set attributes will be defined within the corresponding object type using the pattern <Set Name>#<Attribute Name>. The importer will recognize attributes containing the separator “#” as attributes belonging to the named Set and will import them accordingly.
Only the categories specified in the system rule “target_type” will be assigned to the imported objects:
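The two naming conventions above (category name followed by its internal id, and set attributes written as <Set Name>#<Attribute Name>) can be illustrated with a small parsing sketch. The helper names are hypothetical, and the set and attribute names used in the usage note ("Address", "Street") are invented samples, not values from a real category.

```python
def split_category(value):
    """Split a category object type name such as "BankCustomer-44632".

    The id is taken from the text after the LAST hyphen, so category names
    that themselves contain hyphens are still handled.
    """
    name, _, node_id = value.rpartition("-")
    if not name or not node_id.isdigit():
        raise ValueError(f"expected <name>-<id>, got {value!r}")
    return name, int(node_id)


def split_set_attribute(attribute):
    """Split "<Set Name>#<Attribute Name>"; plain attributes yield (None, name)."""
    set_name, separator, attribute_name = attribute.partition("#")
    if not separator:
        return None, attribute
    return set_name, attribute_name
```

For instance, `split_category("BankCustomer-44632")` separates the name from the id, and `split_set_attribute("Address#Street")` separates the set name from the attribute name.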
For setting the category attributes the rules must be associated with the category attributes in the migration set’s |Associations| tab:
Since version 3.2.9 table key lookup attributes are supported in the categories. These attributes should be defined in migration-center in the same way as the other category attributes. Supported types of table key lookup attributes are Varchar, Number and Date. The only limitation is that Date type attributes must be defined with type String in the migration-center object types.
OpenText documents and folders features handled by rules for system attributes
Both folder and document migration sets have a predefined set of rules for system attributes listed under -Rules for system attributes- in the -Transformation rules- window. The system attributes have a special meaning for the importer, so they must be set to valid values as described in the next sections.
The following table lists all provided system attributes available for documents and folders.
Rule for System Attribute – ACLs
The system attribute ACLs (Access Control List) is responsible for optionally assigning permissions based on the source data to the target object.
Each value to be assigned consists of three main elements separated by #. The third element lists the individual permissions, which are themselves separated by |.
<ACLType#RightName#Permission-1|Permission-2|Permission-n>
Sample ACLs value string: ACL#csuser#See|SeeContents
The following table describes all valid values for defining a correct ACLs value:
This MC System Rule can hold as many individual entries as are needed to express the permissions required on the OpenText Content Server.
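As an illustration of the value format, the following sketch builds and splits an ACLs value string. The helper names are hypothetical, and the sketch checks only the structure (three elements separated by #, permissions separated by |); it does not validate the ACL type, right name or permission names against the list of valid values.

```python
def build_acl_value(acl_type, right_name, permissions):
    """Build one ACLs value of the form ACLType#RightName#Perm-1|Perm-2|Perm-n."""
    if not permissions:
        raise ValueError("at least one permission is required")
    return f"{acl_type}#{right_name}#" + "|".join(permissions)


def parse_acl_value(value):
    """Split one ACLs value into (acl_type, right_name, [permissions])."""
    parts = value.split("#")
    if len(parts) != 3:
        raise ValueError(f"expected three '#'-separated elements, got {value!r}")
    acl_type, right_name, permission_text = parts
    return acl_type, right_name, permission_text.split("|")
```

Calling `build_acl_value("ACL", "csuser", ["See", "SeeContents"])` reproduces the sample string shown earlier, `ACL#csuser#See|SeeContents`.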
Example:
Rule for System Attribute – Classifications
The system attribute Classifications is responsible for optionally assigning one or more Classifications to the target object.
Each value for a Classification to be assigned must be an existing Content Server Classification path, divided with forward slashes (/), located under the node specified in the importer parameter "classRootFolder" (see OpenText Importer parameters).
Rule for System Attribute – ContentName
The system attribute “ContentName” is responsible for assigning the internal target version filename for the source document to be uploaded.
OpenText Content Server uses this internal version filename also for its mimetype recognition, so it is required to always build “ContentName” together with a valid extension.
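A transformation that guarantees an extension could follow the logic sketched below. This is an illustrative helper, not part of migration-center: the name `ensure_extension` and the idea of passing a fallback extension (for example one derived from the source format) are assumptions.

```python
import os.path


def ensure_extension(content_name, fallback_extension):
    """Return content_name unchanged if it has an extension, else append the fallback."""
    _, extension = os.path.splitext(content_name)
    if extension:
        return content_name
    return f"{content_name}.{fallback_extension.lstrip('.')}"
```

The sketch only checks that some extension is present; whether the Content Server maps that extension to the intended mimetype still depends on the server configuration.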
Rule for System Attribute – ImpersonateUser
The system attribute ImpersonateUser is responsible for assigning the correct owner of the source object to the imported target object.
Notes: If authentication is done via RCS, add the appropriate domain to the user.
Since the object is created in the context of this assigned user, the user needs at least "Add Items" permission on the target location defined in the MC System Rule 'ParentFolder'.
Rule for System Attribute – mc_content_location
This attribute can be used to import the content of the document from another place than the location where the scanner exported the document. It should be set with a valid file path. If it’s not set the content will be picked up from the original location.
Rule for System Attribute – Name
The system attribute Name is responsible for assigning a Name to the target document or folder. If a node with the same name already exists, the object will be skipped and the importer will throw an error.
Rule for System Attribute – ParentFolder
The system attribute ParentFolder will be set with the full path of the container where the current object will be imported.
Notes: The adapter internally uses the forward slash (/) as the path delimiter. Make sure to consider this in your rules.
If a folder name in the path contains a forward slash (/), it must be escaped with the character sequence %2F.
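The escaping rule can be sketched as a small helper that joins folder names into a ParentFolder value. The function name and the sample folder names are hypothetical, and whether the path needs a particular root element or a leading delimiter is not covered here.

```python
def build_parent_folder(folder_names):
    """Join folder names with "/" and escape literal slashes inside names as %2F."""
    return "/".join(name.replace("/", "%2F") for name in folder_names)
```

For example, `build_parent_folder(["Enterprise", "Invoices 2020/2021", "Q1"])` yields `Enterprise/Invoices 2020%2F2021/Q1`, so the slash inside the second folder name is not mistaken for a path delimiter.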
Rule for System Attribute – RenditionTypes and RenditionPaths
The system rules RenditionPaths and RenditionTypes can be used for importing renditions that had been exported from the source system.
RenditionTypes is a multi-value rule that is set with the rendition types to be imported to the Content Server (e.g. “pdf”, “png”). If no values are set for this attribute, no renditions will be imported.
RenditionPaths is a multi-value rule that is set with the paths where the renditions exported from the source system are located. If the rendition paths are not set, the importer will ask the Content Server to generate the renditions based on the document content.
RenditionTypes and RenditionPaths work in pairs in the following way:
when RenditionTypes has one or more values and the corresponding rendition paths are readable, the renditions are imported
when a RenditionTypes value is missing but a rendition path value is present, the rendition is ignored
if a RenditionTypes value is duplicated, only the first rendition type/rendition path pair is taken into consideration; the second pair is ignored
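The pairing rules above can be sketched as follows. This is an interpretation rather than the importer's code: it assumes the two multi-value rules are matched by position, and the function name `pair_renditions` is hypothetical. A path of `None` stands for "no path supplied", the case in which the Content Server is asked to generate the rendition.

```python
from itertools import zip_longest


def pair_renditions(rendition_types, rendition_paths):
    """Pair rendition types with paths by position (positional matching is an assumption).

    Returns {rendition_type: path_or_None}.
    """
    pairs = {}
    for rendition_type, rendition_path in zip_longest(rendition_types, rendition_paths):
        if not rendition_type:
            continue  # a path without a type is ignored
        if rendition_type in pairs:
            continue  # duplicated type: only the first pair is considered
        pairs[rendition_type] = rendition_path or None
    return pairs
```

The sketch does not check that the paths are readable; in the importer that check decides whether a rendition is actually imported.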
Rule for System Attribute – Shortcuts
The system attribute Shortcuts is responsible for optionally creating shortcuts to the imported document or folder in the folder specified as value of this attribute. One or more folder paths can be specified. If the specified folder does not exist, it will be created automatically if “autoCreateFolders” is enabled in the importer.
If a shortcut cannot be created when importing a document an appropriate error will be reported by the importer and the document will be skipped.
If a shortcut cannot be created when importing a folder, the folder will be created but its status in migration-center will be set to partially imported.
The name of the shortcut is taken from the system rule “Name”.
Rule for System Attribute – target_type
The first value of this attribute must be the type of object to be imported. For document migsets the value must be “opentext_document”; for folder migsets the value must be “opentext_folder”.
The subsequent values of this attribute are reserved for the categories that will be assigned to the imported document or folder.
Rule for System Attribute – VersionControl
The system attribute VersionControl is used for controlling the creation of versions at Content Server side with the required versioning method (Linear Versioning or Advanced Versioning with Minor or Major Versions).
The valid values are:
“0” - for creating a linear version at Content Server side
“1” - for creating a minor version at Content Server side
“2” - for creating a major version at Content Server side
You cannot mix linear versioning with advanced versioning (minor or major) for the versions belonging to the same version tree.
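A pre-import check of this constraint could look like the sketch below. It is an illustration only: the function name is hypothetical, and it assumes the VersionControl values of all versions in one version tree have already been collected into a list.

```python
LINEAR, MINOR, MAJOR = "0", "1", "2"


def check_version_tree(version_control_values):
    """Raise ValueError if a version tree uses unknown values or mixes versioning modes."""
    values = set(version_control_values)
    unknown = values - {LINEAR, MINOR, MAJOR}
    if unknown:
        raise ValueError(f"invalid VersionControl values: {sorted(unknown)}")
    if LINEAR in values and values & {MINOR, MAJOR}:
        raise ValueError("linear and advanced versioning cannot be mixed in one version tree")
```

A tree with values `["1", "2", "1"]` passes, while `["0", "1"]` is rejected because it combines linear with advanced versioning.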
Considerations regarding delta migration
Objects that have been changed in the source system since the last scan are scanned as update objects. Whether an object in a migration set is an update or not can be seen by checking the value of the Is_update column – if it’s 1, the current object is an update to a previously scanned object (the base object). An update object cannot be imported unless its base object has been imported previously.
Depending on the source system the object comes from, the method of obtaining the update information will differ, but the object's behavior stays the same once scanned. See the documentation of the scanners if you need more information about the supported updates and how they are detected.
In order for delta migration to work properly it is essential to not reset the migration sets (objects) after they have been imported.
When updating documents or versions, the importer may need to delete some documents or versions that were imported previously. This is because of a limitation of the Content Webservice, which does not allow updating the content of existing objects. If Records Manager is installed on the Content Server, importing document updates may not work, since the Content Server does not allow deleting documents and versions.