Documentum Importer
Introduction
The Documentum Importer takes the objects processed in migration-center and imports them into a Documentum repository. As a change in migration-center 3.2, the Documentum scanner and importer are no longer tied to one another – any other importer can now import data scanned with the Documentum Scanner and the Documentum Importer can import data scanned by any other scanner. Starting from version 3.2.9, it supports objects derived from dm_sysobjects.
Importer is the term used for an output adapter used as the last step of the migration process. It takes care of importing the objects processed in migration-center into the target system (such as a Documentum repository).
An importer works as a job that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created. It is defined by a unique name, a set of configuration parameters and an optional description.
Documentum Importers can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed by migration-center Job Server.
Supported Documentum Content Server versions
The supported Documentum Content Server versions are 5.3 – 20.2, including service packs. For accessing a Documentum repository Documentum Foundation Classes 5.3 or newer is required. Any combinations of DFC versions and Content Server versions supported by OpenText Documentum are also supported by migration-center’s Documentum Importer, but it is recommended to use the DFC version matching the version of the Content Server targeted for import. The DFC must be installed and configured on every machine where migration-center Server Components is deployed.
Starting with version 3.2.8, migration center supports Documentum ECS (Elastic Cloud Storage). Nevertheless, the documents cannot be imported to ECS if retention policy is configured in the CA store.
DFC (Documentum Foundation Classes) configuration
Starting from version 3.9 of migration-center additional configurations need to be made for the Documentum adapter to be able to locate Documentum Foundation Classes. This is done by modifying the dfc.conf file, located in the Job Server installation folder.
There are two settings inside the file that by default match the paths of a standard DFC install. One needs to have the path for the config folder of DFC and the other needs the path to the dctm.jar.
See below example:
wrapper.java.classpath.dfcConfig=C:/Documentum/config
wrapper.java.classpath.dfcDctmJar=C:/Program Files/Documentum/dctm.jar
The dfcConfig
parameter must point to the configuration folder. The dfcDctmJar
parameter must point to the dctm.jar file!
Information on folder migration
There are two ways to create Documentum folders with the importer:
Documents Migration set: When creating a new migration set choose the <source type>ToDctm(document) type – this will create migration set containing documents targeted at Documentum. Use the “autoCreateFolders” setting (checked) from the Documentum Importer configuration to generate the folder structure based on the “dctm_obj_link” values assigned by the transformation rules. No attributes or permissions can be set on the created folders.
Folder Migration set: When creating a new migration set choose the <source type>ToDctm(folder) type – this will create migration set containing folders targeted at Documentum. Now only the scanner runs containing folder objects will be displayed on the |Filescan Selection| tab. Note that the number of objects contained in the displayed scanner runs now indicates folders and not documents, which is why the number on display (folders) can be different from the total number of objects processed by the scan (if it contains other types of objects besides folders). When creating transformation rules for the migration set, keep in mind that folder-only migration sets have folder-specific attributes to work with, in this case attributes specifically targeted at Documentum folder objects. You can set permissions and attributes for the imported folders.
Important aspects to consider when importing folder migration set:
The attributes “object_name” and “r_folder_path” are key attributes for folders in Documentum. If these attributes are transformed without taking into consideration how these objects build into a tree structure, it may no longer be possible to reconstruct the folder tree. This is not due to migration-center, but rather because of the nature of folders being arranged in a tree structure which does create dependencies between the individual objects.In Documentum the “r_folder_path” attribute contains the path(s) to the current folder, as well as the folder itself at the end of the path (e.g. /cabinet/folder_one/folder_two/current_folder), while “object_name” contains only the folder name (e.g. current_folder). To make it easier for the user to change a folder name, migration-center prioritizes the “object_name” attribute over the “r_folder_path” attribute; therefore changing “object_name” from current_folder to folder_three for example will propagate this change to the objects’ “r_folder_path” attribute and create the folder /cabinet/folder_one/folder_two/folder_three without the user having to change the “r_folder_path” attribute to match. This only applies to the last part of the path, which represents the current folder, and not to other parts of the path. Those can also be modified using the provided transformation functions, but migration-center does not provide any automations to make sure the information generated in that case is correct.
The importer parameter “autoCreateFolders” applies to both documents migration set and folder migration set.
When importing folder migration set, in case an existing folder structure is already in place an error will be thrown for the folder objects that exist already. It is not possible to avoid this behavior unless you skip them manually by removing them from the migset or putting them in an invalid state for import.
Working with features specific to Documentum
Versions
Versions (and branches) are supported by the Documentum Importer, including custom version labels. The structure of the versions tree is generated by the scanners of the systems that support this feature and provide means to extract it. Although the version tree is immutable (i.e. the ordering of the objects relative to their antecedents cannot be changed) the version description (essentially the Documentum “r_version_label”) can be changed using the transformation rules prior to import.
All objects from a version structure must be imported since each of them reference its antecedent, going back to the very first version. Therefore, we advise not to drop the versions of an object between the scan and the import processes, as this will generate inconsistencies and errors. If an object is intended to be migrated without versions (i.e. the current version of the object needs to be migrated only) than the affected objects should be scanned without enabling the respective scanner’s versioning option.
Permissions
Permissions or ACLs can be set to documents and folders by creating transformation rules for the specific attributes. The attributes group_permit, world_permit and owner_permit can be used to set granulated permissions. For setting ACLs, the attributes acl_domain and acl_name must be used. The user must set either *_permit attributes or acl_* attributes. If both*_permit attributes or acl_* attributes are configured to be migrated together the *_permit attributes will override the permissions set by the acl_* attributes. Because Documentum will not throw an error in such a case migration-center will not be able to tell that the acl_* attributes have been overridden and as such it will not report an error either considering that all attributes have been set correctly.
If the group_permit, world_permit, owner_permit AND acl_domain, acl_name attributes are configured to be migrated together the *_permit attributes will override the permissions set by the acl_* attributes. This is due to Documentum’s inner workings and not migration-center. Also, Documentum will not throw an error in such a case, which makes it impossible for migrationcenter to tell that the acl_* attributes have been overridden and as such it will not report an error either, considering that all attributes have been set correctly. It is advised to use either the *_permit attributes OR the acl_* attributes in the same rule set in order to set permissions.
Renditions
Renditions are supported by the Documentum Importer. The needed information for this process is typically generated during scan by the Documentum Scanner or other scanners supporting systems where rendition information or other similar features can be extracted. Should rendition information not be obtainable using a particular scanner, or if the respective source system doesn’t have renditions at all, it is still possible to add files as renditions to objects during the transformation process. The renditions of an object can be controlled through the migration-center system attribute called “dctm_obj_rendition”. This attribute appears in the “Rules for system attributes” area of the Transformation Rules window. If the object contains at least one rendition in addition to the main content, the source attribute “dctm_obj_rendition” will be available for use in transformation rules. To keep the renditions for an object and migrate them as they are, the system attribute “dctm_obj_rendition” must be set to contain one single transformation function: GetValue(dctm_obj_rendition[all]). This will reference the path of the files where the corresponding renditions have been exported to; the Importer will pick up the content from this location and add them as renditions to the respective documents.
It is possible to add/remove individual renditions to/from objects by using the provided transformation functions. This can prove useful if renditions generated by third party software need to be appended during migration. These renditions can be saved to files in any location which is accessible to the Job Server where the import will be run from. The paths to these files can be specified as values for the “dctm_obj_rendition” attribute. A good practice for being able to assign such third-party renditions to the respective objects is to name the files with the objects’ id and use the format name as the extension. This way the “dctm_obj_rendition” attributes’ values can be built easily to match external rendition files to respective Documentum documents.
Other properties of renditions are also available to be set by the user through a series of rendition-related system attribute which are available automatically in any Migration set targeting a Documentum system:
All dctm_obj_rendition* system attributes are repeating attributes and as such accept multiple values, allowing multiple renditions to be added to the same object. Normally the number of values for all dctm_obj_rendition* attributes should be the same and equal the maximum number of renditions one would like to set for an object. E.g. if three renditions should be set, then each of the dctm_obj_rendition* attributes should have three values for each of the three renditions. More values will be ignored, missing values will be filled in with whatever Documentum would use as default in place of that missing value.
Relations
Relations are supported by Documentum Importer. They can currently be generated only by the Documentum Scanner, but technically it is possible to customize any scanner to compute the necessary information if similar data or close to Documentum relations can be extracted from their source systems.
Relations cannot be altered using transformation rules; migration-center will manage them automatically if the appropriate options in the scanner and importer have been selected. A connection between the parent object and its relations must always exist and can be viewed in migration-center by right-clicking on the object in any view of a migration set and selecting <View Relations> from the context menu. A grid with all the relations of the selected object will be displayed together with their associated metadata, such as relation name, child object, etc.
In order for the importer to use this feature the option importRelations from its configuration must be checked. Based on the settings of the importer it is possible to import objects and relations together or separately. This feature enables you to attach relations to already imported objects.
Importing a relation will fail if its child object is not present at the import location. This is not to be considered a fatal error. Since the relation is connected to the parent object, the parent object itself will be imported successfully and marked as “Partially Imported”, indicating it has one or more relations which could not be imported (because the respective child object could not be found). After the child object gets imported, the import for the parent object can be repeated. The already imported parent object will not be touched, but its missing relation(s) will now be created and connected to the child object. Once all relations have been successfully created, the parent object’s status will change from “Partially imported” to “Imported”, indicating a fully migrated object, including all its relations. Should some objects remain in a “Partially imported” state because the child objects the relation depends on are not migrated for a reason, then the objects can remain in this state and this state can be considered a final state equivalent to “imported”. The “Partially imported” state does not have any adverse effects on the current or future migrations even if these depend on the respective objects.
migration-center’s Documentum Importer supports relations between folders and/or documents only (i.e. “dm_folder” and “dm_document” objects, as well as their respective subtypes) “dm_subscription” type objects for example, although supports relations from a technical point of view, will be ignored by the scanner because they are relations involving a “dm_user” object. Custom relation objects (i.e. relation-type objects which are subtypes of “dm_relation”) are also supported, including any custom attributes they may have. The restrictions mentioned above regarding the types of objects connected by a relation also apply to custom relation objects.
Virtual Documents
The virtual documents data is stored in the migration-center database as a special type of relation. Please view the above chapter for more details about the behavior of this type of data.
In order to rebuild the structure of the virtual documents the importRelations setting must be checked in the Documentum Importer configuration.
This special type of relation is based on the “dmr_containment” information; in order for the data to be compatible with the importer you will need specific information as you see below to be created by the source scanner.
The Snapshot feature of virtual documents is not supported by migration-center.
Audit Trails
Audit Trails can be scanned using the Documentum Scanner (see the Documentum Scanner user guide for more information about scanning audit trails), added to a Documentum Audit Trail migration set and imported using the Documentum Importer. This type of migration sets is subject to the exact same migration procedure as Documentum documents and folders. They can be imported together with the document and folder migration sets with which they are related to or on their own after the required document and folder objects have been imported.
It is not possible to import any audit trail if the actual object it belongs to hasn’t been migrated.
Importing audit trails is controlled in the Documentum Importer via the importAuditTrails parameter (disabled by default).
A typical workflow for migrating audit trails consists of the following main steps:
Scan folders/documents with Documentum Scanner by having “exportAuditTrail” flag activated. The scanning process will create tree kind of distinct objects in migration-center: Documentum(folder), Documentum(document) and Documentum(audittrail).
Assign the Documentum(audittrail) objects to a DctmToDctm(audittrail) migset(s) and follow the regular migration-center workflow to promote the objects through transformation rules to a “Validated” state required for the objects to be imported.
Import audit trails objects using Documentum Importer by assigning the respective migset(s) to a Documentum Importer job with the importAuditTrails parameter enabled (checked)
Preparing a Migration Set Containing Audit Trails
In order to prepare audit trail for import, first create a migset containing audit trail objects (more than one migset containing audit trails can be created, just like for documents or folders). For a migset to work with audit trails, the type of object must be set “DctmToDctm(audittrail)” accordingly. After setting the migset type to “DctmToDctm(audittrail)” the | Filescan | tab will display only scans which contain Documentum audit trail objects. Any of these can be added/removed to/from the migration set as usual.
Transformation Rules for a Migration Set Containing Audit Trails
Transformation rules allow setting values for the attributes audit trail entries on import to the target system. Values can be simply carried over unchanged from the source attributes, or they can be transformed using the available transformation functions. All attributes that can be set and associated are defined in the target type “dm_audittrail”.
As with other migration-center objects, audit trail objects have some predefined system attributes:
audited_object_id this should be filled with the corresponding value that comes from source system. No transformation or mapping is necessary because the importer will translate that id into the corresponding id in target repository.
r_object_type must be set to a valid Documentum audit trail object type. This is normally “dm_audittrail” but custom audit trails object types are supported as well.
The following audit trail attributes don’t need to be set through the transformation rules because they are automatically taken from the corresponding audited object in target system: chronicle_id, version_label, object_type.
All other “dm_audittrail” attributesthat refer to the audited object (acl_domain, acl_name, audited_obj_vstamp, controlling_app, time_stamp, etc) can be set either to the values that come from the source repository or not set at all, case in which the importer will set the corresponding values by taking them from the audited object located in the target repository.
The source attributes “attribute_list” and “attribute_list_old” may appear as multi-value in migration-center. This is because their values may exceed the maximum size of a value allowed in migration-center (4000 bytes), case in which migration-center handles such an attribute as a multi-value attribute internally. The user doesn’t need to take any action to handle such attributes; the importer knows how to process and set these values correctly in Documentum.
Not setting an attribute means not defining a rule for it or not associating an existing rule with any object type definition or attribute.
Working with rules and associations is core product functionality and is described in detail in the Client User Guide.
Aspects
Assigning Documentum aspects to document and folder objects is supported by Documentum Importer. One or multiple aspects can be assigned to any document or folder object during transformation.
Documentum aspects are handled just like regular Documentum object types in migration-center. This means aspects need to be defined as a migration-center object type first before being available for use during transformation
During transformation, aspects are assigned just like actual object types to Documentum objects using the r_object_type system rule; the r_object_type system rule has been modified to accept multiple values for Documentum documents and folders, thus allowing multiple aspects to be specified
Note: the first value of r_object_type needs to reference an actual object type (i.e. dm_document or a subtype thereof), while any other values of r_object_type will be interpreted as references to aspects which need to be attached to the current object.
As with actual object types, all references to aspects will need to be valid in order for the object to be imported successfully. Trying to set invalid aspects or aspect attributes will cause the object to fail on import.
As with an actual object type, transformation rules used to generate values for attributes defined through aspects will need to be associated with the respective aspect types and attributes – again, just use aspects as if they were actual object types while on the Associations page of the Transformation Rules window.
Important note on updating previously imported objects with aspects:
New aspects are added during a delta migration, but existing aspects are not removed – the reason is that any aspects not present in the current update being migrated may have been added on purpose after the migration and may not necessarily be meant to be removed.
Attach Lifecycle
The Documentum importer has the ability to configure the behavior to attach the imported objects to a specific lifecycle. This ability refers only to attaching a lifecycle to a state that allows attaching to it and does not include the Promote, Demote, Suspend or Resume actions of the lifecycle.
When the adapter is configured to attach the lifecycle the transformation rules should contain the r_policy_id attribute that should have as value a valid lifecycle id from the target repository and also the state attribute r_current_state that should have as value a valid lifecycle state number (a state that allows attaching the lifecycle to it).
To be able to overwrite some attribute changes made by lifecycle when attaching the document, the adapter allows to configure a list of attributes that will be set again with the values set in the transformation rules after the lifecycle is attached, to make sure the values of those attributes are the ones coming from the migration set.
PDF Annotation
For importing annotations that have been scanned by Documentum Scanner the “importRelations” parameter must be enabled.
The corresponding “dm_note” objects have been imported prior or together with the parent documents in order for the DM_ANNOTATE relations to be imported.
Comments
The Documentum importer has the ability to import the comments on documents and folders (dmc_comment) by enabling “importComments” (default is unchecked). When enabled, the comments scanned by the Documentum scanner will be create and attached to the imported documents and folders in the target repository.
The pre-condition of a correct comments migration is that the users that created the comment objects and their group names from the source system need to exist in the target repository as well. This is necessary to be able to import the same permission access to the comment owners as in the source system.
The importer is also able to import updates of comments (see below Note on pre-condition) in a delta migration import.
To be able to import updates of a document’s comments it is necessary that the DFC used to have a valid Global Registry configured, because the adapter is using the “CommentManager” BOF service to read them. The behavior implemented in the framework inside the “CommentsManager” service requires that the DFC session used by the Job Server to be an approved privileged DFC session (this can be configured in DA – see DA documentation for privileged DFC clients).
If creating a comment fails for any reason, the error will be reported as warning in the importer log. The affected document or folder is set to the status “Partially Imported” in the migset. Since it is not possible to resume importing failed comments, after fixing the root cause of the error, the affected documents need to be reset in the migset, destroyed from repository (only if they are not updates) and imported again.
Migrating Updates (“Update” or “Delta” Migration)
Objects that have changed in the source system since the last scan are scanned as update objects. Whether an object in a migration set is an update or not can be seen by checking the value of the Is_update column – if it’s 1, the current object is an update to a previously scanned object (the base object). An update object cannot be imported unless its base object has been imported previously.
Depending on the source systems the object come from, the method of obtaining the update information will differ but the objects behavior will stay the same once scanned. See the documentation of the scanners in case you need more information about the supported updates and how they are detected.
Extracting an object type from Documentum and preparing it for migration-center 3.x
migration-center Client, which is used to set up transformation and validation rules does not connect directly to any source or target system to extract this information. Object type definitions can be exported from the respective systems to a CSV file which in turn can be imported to migration-center.
One tool to easily accomplish this for Documentum object types is dqMan, which is used in the following steps to illustrate the process. dqMan is an administration Tool for EMC Documentum supporting DQL queries and API commands and much more. dqMan is free and can be downloaded at http://www.fme.de. Other comparable administration tools can also be used, provided they can output a compatible CSV file or generate some similar output which can be processed to match the required format using other tools.
Start dqMan and connect to the target DOCUMENTUM repository. dqMan normally starts with the interface for working with DQL selected by default. Press the [DQL] button in the toolbar if not already selected.
In the “DQL Query” -box paste the following command and replace dm_document with the targeted object type: select distinct t.attr_name, t.attr_type, '0' as min_length, t.attr_length, t.attr_repeating, a.not_null as mandatory from dm_type t, dmi_dd_attr_info a where t.name=a.type_name and t.attr_name=a.attr_name and t.name='dm_document' enable(row_based); Press the [Run] button.
Click somewhere in the “Results” box. Use {CTRL+A} to select all. Right-click to open the context menu and choose <Export to> <CSV>.
The extracted object type template is now ready to be imported to migration-center 3.x as described in the chapter Object Types (or object type template definitions) in the migration-center Client User Guide
Content Validation
Content Validation
The Content Validation functionality is based on checksums (an MD5 hash) computed for the document’s contents during scan (by the Scanner itself) and after import (by the Importer). The two checksums are then compared - for any given piece of content its checksums from before and after the migration should be identical.
If the two checksums differ in any way, this is an indication that the content has been corrupted/modified during/after migration. Modifications can happen on purpose or simply by user error and can usually be traced back based on the r_modifier and r_modify_date attributes of the affected object. Actual data corruption rarely happens and is usually due to software and/or hardware errors on the systems involved during content transfer or storage.
Validating content is always performed after the content has been imported to the target repository, thus adding another step to the migration process. Accordingly, using this feature may significantly increase import time due to having to read back every piece of content for every document and compute its checksum in order to compare it against the initial checksum computer during scan.
This feature is controlled through the checkContentIntegrity parameter in the Documentum Importer (disabled by default).
This feature works only in tandem with a Scanner that supports it: Documentum Scanner, Filesystem Scanner and SharePoint Scanner.
Primary content
The mc_content_location attribute can be used to import the content of the document from another place than the location where the scanner exported the document. It should be set with a valid file path. If it is not set, the content will be picked up from the original location. Useful when the content was moved or copied to other location than the initial one. If its value is set to “nocontent”, contentless documents will be imported. For importing primary content with multiple pages, the attribute mc_content_location can be set with multiple content paths so the importer will create a page for every given content.
Renditions
The Documentum Content Validation also supports renditions in addition to a document’s main content. Renditions are processed automatically if found and do not require further configuration by the user.
There is one limitation that applies to renditions though: since the Content Validation functionality is based on a checksum computed initially during scan (before the migration), renditions are supported only if scanned from a Documentum repository using the Documentum Scanner. Currently this is the only scanner aware of calculating the required checksums for renditions. Other scanners, even though they may provide metadata pointing to other content files, which may become renditions during import, do not handle the content directly and therefore do not compute the checksum at scan time needed by the Content Validation to be compared against the imported content’s checksum.
When the rendition has different pages, those will be also imported on specific rendition page.
Documentum Importer Properties
To create a new Documentum Importer job, specify the respective adapter type in the importer’s Properties window – from the list of available adapters “Documentum” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case the Documentum.
The Properties window of an importer can be accessed by double-clicking an importer in the list, by selecting the Properties button from the toolbar or from the context menu.
Common importer parameters
Documentum importer parameters
On the | Migsets | tab, the user can select the migration sets to be imported with this importer. Depending on the chosen Adapter Type only the migration sets compatible with this type of importer will be displayed and can be selected for import. Also, only migration sets containing at least one object in a validated state will be displayed (since objects which haven’t been validated cannot be imported). Available migration sets can be moved between the two columns by double clicking or using the arrows buttons.
Log files
A complete history is available for any Documentum Scanner or Importer job from the respective item’s History window. It is accessible through the History button/menu entry on the toolbar/context menu. The list of all runs for the selected job together with additional information, such as the number of processed objects, the starting time, the ending time and the status are displayed in a grid format.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the Documentum Scanner can be found in the chosen “logs” folder at the installation of the Server Components of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs\ Dctm-Importer
the amount of information written to the log files depends on the setting specified in the loggingLevel start parameter for the respective job.
Rollback script
Each job run of the importer generates along with its log a rollback script that can be used to remove all the imported data from the target system. This feature can be very useful in the testing phase to clear the resulting items or even in production in case the user wants to correct the imported data and redo the import.
The name of the rollback script is build based on the following formula:
<Importer name>(<run number>)_<script generation date time>_ rollback_script.api
Its location is the same as the logs location:
<Server components installation folder>/logs/DCTM-Importer/
Composed by a series of Documentum API commands that will remove in the proper order the items created in the import process, the script should look similar to the following example:
//Links:
//Virtual Document Components:
//Relations:
//Documents:
destroy,c,090000138001ab99,1
destroy,c,090000138001ab96,1
destroy,c,090000138001ab77,1
destroy,c,090000138001ab94,1
//Folders:
destroy,c,0c0000138001ab5b,1
You can run it with any applications that supports Documentum API scripting this includes the fme dqMan application and the IAPI tool from Documentum.
The rollback script is created at the end of the import process. This means that it will not be created if the job run stops before it gets to this stage, this doesn’t include manual stops done directly from the client.
“Rule-to-Attribute” associations supported for dm_document and subtypes
This list displays which Documentum attributes can be associated with a migration-center transformation rule.
Last updated