Copyright © 2020 fme AG. All Rights Reserved.
fme AG believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." FME AG MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any fme AG software described in this publication requires an applicable software license.
All trademarks used herein are the property of their respective owners.
This online documentation describes how to use the migration-center software in order to migrate content from a source system into a target system (or into the same system in case of an in-place migration).
Please see the Release Notes for a summary of new and changed features as well as known issues in the migration-center releases.
Also, please make sure that you have read System Requirements before you install and use migration-center in order to achieve the best performance and results for your migration projects.
The supported source and target systems and their versions can be found in our knowledge base article List of supported versions.
The migration-center is a modular system that uses so-called adapters to connect to the various source and target systems. The adapters that connect to a source system are called scanners, and the adapters that connect to a target system are called importers. The capabilities and the configuration parameters of each adapter are described in the corresponding manual.
You can find the explanations of the terms and concepts used in this manual in the Glossary.
And last but not least, Legal Notice contains some important legal information.
Customized versions of adapters or entirely new ones (for other types of source and target systems) can be created and added to migration-center thanks to the open, API-based and documented architecture of the product. If such customized adapters are deployed in your migration project, the documentation provided with them should be consulted instead, since the features, behavior and parameters are likely to be different from the adapters described in these guides.
For additional tips and tricks, latest blog posts, FAQs, and helpful how-to articles, please visit our user forum.
In case you have technical questions or suggestions for improving migration-center, please send an email to our product support at support@migration-center.com.
A content migration with migration-center always involves a source system, a target system, and of course the migration-center itself.
This section provides information about the requirements for using migration-center in a content migration project.
For best results and performance, it is recommended that every component be installed on a different system. Hence, the system requirements for each component are listed separately.
The required capacities depend on the number of documents to be processed. The values given below are minimal requirements for a system processing about 1,000,000 documents.
Category | Requirements
RAM | 4 GB of memory for the migration-center database instance
Hard disk storage space | Depends on the number of documents to migrate; roughly 300 MB for every 100,000 documents
CPU | min. 2.5 GHz (corresponding to the requirements of the Oracle database)
Oracle version | Oracle 10g Release 2 - Oracle 19c
Oracle instance | Character set: AL32UTF8; necessary database components: Oracle XML DB
Operating system | see Oracle documentation
For a typical Oracle database creation, please refer to the Installation Guide. For more advanced topics regarding Oracle database configuration and administration, please refer to the Database Administrator’s Guide.
Category | Requirements
RAM | 4 GB
Hard disk storage space (logs) | 20 MB + storage space for log files (~200 MB for 1,000,000 files)
Hard disk storage space (objects scanned from source system) | Variable. Refers to temporary storage space required for scanned objects. Can be allocated on a different machine in the LAN.
CPU | min. 2.5 GHz
Operating system | Microsoft Windows Server 2003, Windows 7 Professional, Windows 8 / 8.1, Windows Server 2008, Windows Server 2012, Windows Server 2016, Linux
Java runtime | Oracle/OpenJDK JRE 8, Oracle/OpenJDK JRE 11
Note that not all adapters are available on the Linux version of the Job Server.
Category | Requirements
RAM | 1 GB
Hard disk storage space | 10 MB
CPU | min. 800 MHz
Operating system | Microsoft Windows Server 2003, Windows 7 Professional, Windows Server 2008 (32 bit or 64 bit), Windows 8/8.1, Windows Server 2012, Windows 10, Windows Server 2016
Database software | 32 bit client of Oracle 11g Release 2 (or above), or 32 bit Oracle Instant Client for 11g Release 2 (or above)
This guide describes the installation process of migration-center. The following main steps of the installation process are subsequently described:
Preparing the Oracle database for migration-center
Installing the Client components of migration-center
Installing the database of migration-center
Installing the Server Components of migration-center
The following section gives an overview of the system requirements, and paragraph 2.2 describes the preparation of an Oracle Standard Edition database for migration-center.
For best results and performance, it is recommended that every component be installed on a different system. Hence, the system requirements for each module are listed separately.
The required capacities depend on the number of documents to be processed. The values given below are minimal requirements for a system processing about 1,000,000 documents. Please see the Sizing Guide for more details.
Database Server
Component | Requirements
RAM | 4 - 8 GB for Oracle
Hard disk storage | Depends on the number of documents to migrate; roughly 2 GB for every 100,000 documents
CPU | 2 - 4 cores
Oracle | Supported versions: 11g Release 2, 12c Release 1, 12c Release 2, 18c, 19c; character set: AL32UTF8; required components: Oracle XML DB
Operating system | All OS supported by Oracle
For a typical Oracle database creation, please refer to the section Preparation of the Oracle database. For more advanced topics regarding Oracle database configuration and administration, please refer to the Database Administrator’s Guide.
Server Components (Job Server)
Component | Requirements
RAM | By default, the Job Server is configured to use 1 GB of memory. This will be enough in most cases. In some special cases (multiple big scanner/importer batches) it can be configured to use more RAM.
Hard disk storage (log files) | 300 MB + storage space for log files (~50 MB for every 100,000 processed files)
Hard disk storage (content files) | Variable. Refers to temporary storage space required for the content of scanned objects. Should be a network share if you plan to install multiple Job Servers.
CPU | 2 - 4 cores
Operating system | Windows Server 2008, 2012, or 2016; Windows 7, 8, or 10; Linux
Java | Oracle JRE 1.8; OpenJDK 8, 11, 13; 32 or 64 bit
Note that not all adapters are available on the Linux version of the Job Server.
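The sizing rules of thumb given above (roughly 2 GB of database storage per 100,000 documents, and 300 MB plus roughly 50 MB of Job Server log space per 100,000 processed files) can be turned into a quick estimate. The following is a minimal sketch only; the function names are hypothetical and the linear scaling is an assumption based on the figures in the tables, not a formula provided by migration-center.

```python
# Rules of thumb quoted in the requirement tables above.
DB_GB_PER_100K_DOCS = 2        # database storage: ~2 GB per 100,000 documents
LOG_BASE_MB = 300              # Job Server: fixed 300 MB ...
LOG_MB_PER_100K_FILES = 50     # ... plus ~50 MB per 100,000 processed files


def estimate_db_storage_gb(documents):
    """Rough database storage estimate in GB (assumes linear scaling)."""
    return documents / 100_000 * DB_GB_PER_100K_DOCS


def estimate_log_storage_mb(processed_files):
    """Rough Job Server log storage estimate in MB (assumes linear scaling)."""
    return LOG_BASE_MB + processed_files / 100_000 * LOG_MB_PER_100K_FILES


if __name__ == "__main__":
    docs = 1_000_000
    print(estimate_db_storage_gb(docs))    # 20.0
    print(estimate_log_storage_mb(docs))   # 800.0
```

Treat the result as a lower bound for planning; the Sizing Guide remains the authoritative source.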
Client
Component | Requirements
RAM | 1 GB for MC client
Hard disk space | 10 MB
CPU | 1 core
Operating system | Windows Server 2008, 2012, or 2016; Windows 7, 8, or 10
Oracle client | Oracle Client / Instant Client 11g Release 2, 12c, 18c, 19c. Important: 32 bit version
For the communication between the individual components, it must be possible to have access to the database from the Job Server and the Client. In addition, the Client must be able to access the distributed Job Server. The figure below outlines the communication of the individual components.
Migration-center uses an Oracle database to store information about documents, migration sets, users, jobs, settings and others.
The steps in the creation of a new database for migration-center are described next. Even though the description is very detailed, the installing administrator is expected to have a good degree of experience in configuring databases.
It is also possible to use Oracle Express Edition, but this version is not recommended for performance reasons and scalability due to its built-in limitations.
Oracle 11g offers the Database Configuration Assistant for database creation. In Microsoft Windows, the assistant is in the Oracle program group in the Start menu. For all other operating systems, please refer to the corresponding Oracle documentation to start the assistant.
The assistant will provide the user with a step-by-step guide through all the stages necessary for creating and configuring a database. With the [Back] and [Next] buttons, the user can navigate between the individual steps of the database configuration process, to change options.
The screenshots in this chapter were created on a Microsoft Windows system with Oracle 11.2.0.1.0. For other operating systems, the menus and dialogs may be different.
After the start, the assistant will display a welcome note which can be skipped with [Next].
The first step determines the operation to be performed with the Configuration Assistant. “Create a database” is the appropriate choice here.
The next menu shows all available database templates. “General Purpose” is appropriate for creating a database instance for migration-center.
Define the database name here. The name must be unique, meaning that there should not be another database with this name on the network. The SID is the name of a database instance and can differ, if desired, from the global database name. To make the database easy to identify, a descriptive name should be selected, for instance “migration”.
To manage the database, Oracle Enterprise Manager is recommended. In Oracle 11g, databases can be managed centrally using the Grid Control or locally using the Database Control. Since the choice of database management has no influence on the operability of migration-center, this can be configured according to personal preferences or company policies. For the purposes of this guide the local Database Control will not be selected here. E-mail notification and backup can be configured separately at a later time, if necessary.
Once the database has been created, the passwords for the Oracle system accounts “SYS”, “SYSTEM”, “DBSNMP” and “SYSMAN” must be entered and confirmed. These accounts with their passwords are needed for the installation of the migration-center database schema.
Using the [Exit] button, the user can exit the database configuration and close the assistant.
These accounts/passwords grant full rights for the new Oracle database and should therefore be chosen and safeguarded carefully. The allocation of different passwords is recommended.
This step defines the storage method for the database. For use with migration-center the “File System” option is sufficient. For more information on the “ASM” method, please refer to the Oracle documentation.
In this step the paths in which the initialization parameters and the trace files are to be stored are determined. The preset values from the template can be retained.
In this screen the initialization parameters for the database’s archive and recovery mode are specified. The default settings can be applied.
In this step pre-defined example schemata can be added to the database. For migration-center no example schemata are required.
In this step of the assistant there are several tabs on which the initialization parameters for the new database must be specified.
The memory values of the database need to be specified here. For migration-center, user-defined settings are required.
Please review the requirements in chapter System Requirements with regard to the recommended amount of memory allocated to the database instance.
When processing mass data, it may be necessary to change the parameters after some time. This can be done retrospectively in the Oracle Enterprise Manager. To do so, select <Administration / Memory Parameters> in the menu. In the dialog box displayed, the settings for the “SGA” and “PGA” values can be changed and saved.
At this point the block sizes and maximum number of user processes are defined. The default block size can be applied. The maximum number of processes should be set to 150.
The Unicode character set “AL32UTF8” should be selected here. This character set supports all languages specified in Unicode 3.1.
The figure below illustrates the selection.
Here the mode in which client applications access the database is specified. For migration-center, the “Dedicated Server Mode” connection type is appropriate and should be selected.
No modifications for the database storage are necessary for migration-center.
Optionally, at this point a script can be generated which contains all the settings for the creation of this database. With this script, the setup can be repeated later if necessary. Furthermore, this script can be used as documentation for the settings used.
At this point, the user has the last possibility to perform changes on previous pages by using the [Back] button. After confirming this step by selecting [Finish], the database will be created.
After the user selects [Finish], the assistant opens a window showing a summary of the previous settings. This summary can be stored for documentation purposes in an HTML file. By pressing the [OK] button, the database will be created and installed. This process can take some time.
The Oracle Database Configuration Assistant generates the following tablespaces (assuming that unused tablespaces are marked for deletion; refer to Step 4: Database features):
INDX
SYSTEM
TEMP
TOOLS
UNDOTBS
USERS
Oracle components:
Oracle Database
Oracle XML DB
Oracle Net Listener
Oracle Call Interface
General options:
Oracle JVM
Enterprise Manager Repository (optional)
Initialization parameters:
Dependent on server
Character sets:
Database character set (AL32UTF8)
Data files:
Use standard parameters
Control files:
Use standard parameters
Redo log groups:
Use standard parameters
An installation assistant will help the user install migration-center. This assistant directs the user through all the steps necessary for the installation of migration-center by installing and configuring the components in the following order:
migration-center Client
migration-center database
migration-center Server Components
The three individual components of migration-center can also be distributed and installed with separate assistants. The following table shows the setup assistants on the installation medium:
Main installation assistant | setup.exe
migration-center Client | MigClient/MigClientSetup.exe
migration-center database | Database/InstallDataBase.exe
migration-center Server Components (Job Server) | ServerComponents/SPsetup.exe
This installation guide describes the installation by means of the main installation assistant, which initiates the setup routines of each individual component.
The installation is started with setupMigCenter.exe, which is to be found in the root directory of the installation medium/package.
The installation can be aborted before the component selection step by pressing the [Cancel] button.
By pressing [Back], the user can navigate to the previous pages in order to change or edit options.
The migration-center installation package should not be placed in a folder/path containing special characters like “( ) ;” as this can interfere with the installation process.
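Before starting the installer, the package path can be checked for the characters named above. This is a minimal sketch; the function name is hypothetical, and because the text says "like", the character set below is illustrative rather than a complete list of what the installer rejects.

```python
# Characters the guide names as problematic in the installation path.
# The guide says "like", so this set is illustrative, not exhaustive.
PROBLEM_CHARS = set("();")


def find_problem_chars(path):
    """Return the problematic characters found in `path`, sorted."""
    return sorted(set(path) & PROBLEM_CHARS)


if __name__ == "__main__":
    # Hypothetical example path, used only for illustration.
    print(find_problem_chars(r"C:\Temp\mc (new)\setup"))
```

An empty list means none of the listed characters occur in the path.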
At the following step, the user will choose the components of migration-center to install.
The following installation variants are available:
Full installation: All components of migration-center are installed.
Compact installation: This variant installs the migration-center Client and database only.
Custom installation: The user-defined installation allows the user to select specific components.
The full installation is recommended as a first installation option, or a custom installation can be performed by selecting only components that need to be installed.
Although the components can be installed individually, for full functionality at least one instance of each component must be deployed and the components must be able to connect to one another.
By confirming the choice by selecting [Next], all chosen components will be installed one after another. For each of the components there will be an individual installation assistant.
Installation of the migration-center Client starts with the welcome screen. Clicking [Next] opens the next page, while [Back] returns to the previous page in case the entries there have to be corrected.
The installation can be cancelled by pressing [Cancel]. Only the installation of the current migration-center component will be interrupted. The setup assistant will continue installing the next component.
Select destination location
To install the Client, select the installation path for the application. The default installation path proposed by the installer is recommended but can be changed if needed.
Select start menu folder options
During the installation shortcuts in the Windows Start Menu will be created for migration-center. The name of the folder for these shortcuts can be set here.
Select shortcut placing options
During the installation shortcuts to the Desktop or to the Quick Launch Bar can be created.
Installation summary
Before starting the installation process, all previously set options are displayed. In order to change settings, the user can navigate to the previous pages by clicking [Back]. By clicking [Install], the installation is started.
Officially, Oracle Client 11gR2 32 bit is not supported on Windows 2012. By ignoring the warning during installation, it can be successfully installed and used properly on Windows 2012.
This installation prepares the database for use with migration-center by creating a user and tables, installing the packages containing migration-center functionality, and setting default configuration options. All of these objects will be grouped in a schema called FMEMC. Currently it is not possible to change the schema’s name or install multiple schemata on the same database instance.
The migration-center database installation program requires Oracle 11g/12c/18c/19c Client or Instant Client to connect to the Oracle server remotely or can be run locally on the Oracle server if the Oracle server machine is Windows-based.
Database connection
Next, the name of the Oracle database instance for migration-center, as well as the user account for accessing it, must be entered. The specified database user must be the predefined SYS user or an Oracle administrative user with sufficient privileges.
By selecting [Test Connection], the credentials and the connection to the selected database instance can be verified prior to starting the installation. A message box will confirm whether the connection could be established or return the error message from the server if not.
When [Next] is selected, the connection test is performed; if the test is successful, the next window appears.
Enter your migration-center license key by copying and pasting it from the email message received from fme product support containing your licensing information.
Selecting the tablespaces path
At this point, the path for the tablespaces of the database is set. The default path provided by the selected database instance is recommended.
Also, the location for the database installation log file MigrationCenter.log can be set.
The migration-center database schema will be installed by clicking [Install].
The progress bar in the lower area of the screen informs the user about the progress of the database installation steps.
The log file records several database actions run by the setup routine. Therefore, this log data file is very useful for support in case problems occur during the installation, such as error messages due to insufficient privileges. The location for the log file should therefore be chosen carefully.
Selecting Listener port and date/time pattern
Listener port – is the port where the Oracle Listener is listening for incoming database connections. This is necessary for the scanner and importer to connect to the database instance. Configure this according to the local Oracle configuration. By default, this is port 1521, but another port number can be entered here if the Oracle Listener is configured differently.
Date-time pattern – is the pattern used to display time and date values across the various migration-center components, such as the Client and adapters. Also, the scanner and importer will use this pattern for exporting/importing date-time attributes. Since migration-center is a distributed application and the various components can be installed on different machines with different regional settings, the definition of a common date/time format makes sense.
The dropdown menu includes four of the most widely used date/time formats.
If the migration-center database schema was successfully installed, the [Finish] button appears. Clicking [Finish] closes the installer.
During the installation, the following Oracle Users will be created for migration-center.
User
Authorizations
FMEMC
This Oracle User is the owner of all migration-center objects and tablespaces
Default password is migration123. Password can be changed after authorization.
GRANT CONNECT TO FMEMC
GRANT RESOURCE TO FMEMC
GRANT CREATE JOB TO FMEMC
GRANT SELECT ON SYS.V_$INSTANCE TO FMEMC
GRANT SELECT ON SYS.DBA_DATA_FILES TO FMEMC
GRANT SELECT ON SYS.DBA_TABLESPACES TO FMEMC
GRANT CREATE VIEW TO FMEMC
Additionally, the user FMEMC will be granted permission to use the following Oracle packages:
SYS.UTL_TCP
SYS.UTL_SMTP
SYS.DBMS_LOCK
SYS.UTL_RAW
SYS.UTL_ENCODE
SYS.DBMS_SCHEDULER
SYS.DBMS_OBFUSCATION_TOOLKIT
Furthermore, during the installation, the following tablespaces will be created for migration-center:
FMEMC_DATA (40MB)
FMEMC_INDEX (20MB)
These tablespaces are set to expand in steps of 10 MB according to the usage and needs of migration-center.
The user role FMEMC_USER_ROLE will be created and granted the required privileges on FMEMC schema objects. This role must be granted to any database user that needs to use migration-center.
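Granting the role to an additional database user is a single standard Oracle GRANT statement. The sketch below only builds that statement as text so it can be reviewed before an administrator runs it; the helper name and the example user name are hypothetical, and the identifier check assumes a simple unquoted Oracle user name.

```python
import re

ROLE = "FMEMC_USER_ROLE"  # role name taken from the text above

# Conservative pattern for an unquoted Oracle identifier (assumption:
# the user name needs no quoting).
_IDENT = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def grant_role_statement(db_user):
    """Build the GRANT statement that gives `db_user` the migration-center role."""
    if not _IDENT.match(db_user):
        raise ValueError(f"unexpected database user name: {db_user!r}")
    return f"GRANT {ROLE} TO {db_user.upper()}"


if __name__ == "__main__":
    # "mc_user1" is a made-up user name for illustration only.
    print(grant_role_statement("mc_user1"))
```

The statement would then be executed by a user with sufficient privileges, for example in SQL*Plus.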
If the default settings for the user FMEMC and the tablespaces do not meet the requirements of the customer or conflict with internal company policies and guidelines, the migration-center database installation can be customized. In this case, the user FMEMC and the required tablespaces can be created manually before running the provided database installer by using the scripts located in <MIGRATION-CENTER package>\Database\Util.
The tablespaces can be separated or merged into any number of individual datafiles, but the tablespace names (FMEMC_DATA and FMEMC_INDEX) must not be changed.
These tablespaces must be created prior to the creation of the FMEMC user.
After creating the tablespaces and the FMEMC user, the migration-center database installer can be run connected as user FMEMC instead of SYS.
Before starting the update process it is always advised to back up the existing data.
The migration-center database installer supports updating migration-center databases starting with version 3.0. The process will be performed automatically by the installer and does not require any user intervention.
An update can take in excess of 1 hour to perform, depending on the amount of data and database system performance. This is normal and does not mean the installer has stopped responding. Please do not try to close the installer application or otherwise intervene in the process.
Prerequisites for upgrading from version 3.0.x to 3.13
The following conditions need to be fulfilled for the update procedure to work (these are checked by the installer):
The old database’s version must be one of the following: 3.0.x, 3.0.1.985, 3.1.0.1517, 3.1.0.1549, 3.1.1.1732, 3.1.2.1894, 3.2.0.2378, 3.2.1.2512, 3.2.2.2584, 3.2.2.2688, 3.2.3.2808, 3.2.4.3124, 3.2.4.3187, 3.2.4.3214, 3.2.4.3355, 3.2.5.3609, 3.2.5.3768, 3.2.6.3899, 3.2.6.4131, 3.2.7.4348, 3.2.7.7701, 3.2.7.7831, 3.2.7.7919, 3.2.8.7977, 3.2.8.8184, 3.2.8.8235, 3.2.8.8315, 3.2.9.8452, 3.3.8573, 3.4.8700, 3.5.8952, 3.6.8970, 3.7.1219, 3.8.0125, 3.9.0513, 3.9.0606, 3.9.0614, 3.9.0704, 3.10.0823, 3.10.0905, 3.11.1002. You can check your current database version in the Client’s “About” window. If your version is not one of the above, please contact our technical support at support@migration-center.com.
Any running jobs (scanners, importers, schedulers, transformations, validations) must be finished or stopped.
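The installer performs these checks itself, but the version condition can also be checked beforehand against the list above. The following is a minimal sketch; the function name is hypothetical, and treating "3.0.x" as "any version starting with 3.0." is an assumption about how that entry is meant.

```python
# Versions listed above as eligible for update (duplicates removed).
SUPPORTED_VERSIONS = set("""
3.0.1.985 3.1.0.1517 3.1.0.1549 3.1.1.1732 3.1.2.1894 3.2.0.2378
3.2.1.2512 3.2.2.2584 3.2.2.2688 3.2.3.2808 3.2.4.3124 3.2.4.3187
3.2.4.3214 3.2.4.3355 3.2.5.3609 3.2.5.3768 3.2.6.3899 3.2.6.4131
3.2.7.4348 3.2.7.7701 3.2.7.7831 3.2.7.7919 3.2.8.7977 3.2.8.8184
3.2.8.8235 3.2.8.8315 3.2.9.8452 3.3.8573 3.4.8700 3.5.8952
3.6.8970 3.7.1219 3.8.0125 3.9.0513 3.9.0606 3.9.0614 3.9.0704
3.10.0823 3.10.0905 3.11.1002
""".split())


def is_updatable(version):
    """Return True if `version` appears in the list of updatable versions."""
    v = version.strip()
    # Assumption: the "3.0.x" entry covers every 3.0.* build.
    return v.startswith("3.0.") or v in SUPPORTED_VERSIONS


if __name__ == "__main__":
    print(is_updatable("3.9.0614"))  # True
    print(is_updatable("2.5.0"))     # False
```

The version string to check is the one shown in the Client’s “About” window.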
Due to the new update feature released with migration-center 3.1.0, a new instance of an existing scanner may detect updates for objects scanned with previous versions even though the objects haven’t changed from the previous scan. This behavior always applies to virtual documents or documents that have dm_relations and occurs due to the information used by the new features in migration-center 3.1 not being available in the previous release. For this reason, a new scan will recreate this information on its first run.
Transformation rules created with a version older than 3.1.0 which use the system attribute r_folder_path might need to be reconfigured. This is because migration-center 3.1 now stores absolute folder paths instead of paths relative to “scanFolderPaths” as was the case with the previous versions.
Database update from previous versions older than 3.1.2: as a result of the update process increasing the default size for attribute fields to 4000 bytes (up from the previous 2000 bytes), Oracle’s internal data alignment structures may fragment. This can result in a performance drop of up to 20% when working with updated data (apparent when transforming / validating / resetting). A clean installation is not affected, neither is new data which is added to an updated database, because in these cases the new data will be properly aligned to the new 4000 byte sized fields as it is added to the database.
If the database contains Virtual Documents and objects that are part of scanned Dctm Relations (this does not apply to the FileSystem adapter), some additional checks are done by the installer. If any of these checks fails, the installer raises an error. In this case, stop the installation and contact our technical support at support@migration-center.com.
Due to changes and new functionalities implemented in migration-center version 3.2 that rely on additional information not present in older releases, the following points should be considered when updating a database to version 3.2 or later:
The concatenate function will only have 5 parameters for newly added transformation rules. Reason: the “Concatenate” transformation function has been extended to 5 parameters for version 3.2. Transformation rules using the “Concatenate” function and created with a previous version of migration-center will retain their original 3 parameters only.
The system rule “mc_content_location”, which allows the user to define or override the location where the objects’ content files are stored, will be available for use only in migration sets created after the database has been updated to version 3.2. Reason: the “mc_content_location” system rule is new to migration-center version 3.2 and did not exist in previous versions.
The Filesystem scanner won’t create the “dctm_obj_link” attribute anymore. Reason: with version 3.2 of migration-center, scanners and importers are no longer paired together. The “dctm_obj_link” attribute was created automatically by the Filesystem scanner in previous iterations because it was a Filesystem-Documentum adapter. Since this no longer applies to the current Filesystem scanner, which is just a generic scanner for filesystem objects and has no connection to any specific target system, it won’t create any attributes specific to particular target systems either. If objects scanned with a Filesystem scanner are intended to be migrated to a Documentum system, the “dctm_obj_link” rule must be created and populated with transformation rules in the same way as any other user-defined rule.
The Filesystem scanner won’t detect changes in a file’s metadata during subsequent scans for files which have been scanned prior to updating to version 3.2. Reason: detecting changes in a file’s external metadata (in addition to the file’s content) is a new feature of the Filesystem scanner in migration-center 3.2; previous versions of migration-center did not store the information which would be required by the current Filesystem scanner to detect such changes. A change to a file’s modification date would trigger the update mechanism though and would also enable full 3.2 functionality on subsequent scans.
A previous version of the migration-center database can be updated to the current version. The installer will detect supported database versions eligible for update and offer to update the database structure, migration data and stored procedures to the current version.
Running the database installer in update mode
The first step is connecting to the database instance as user SYS. If an old database version is detected on that database instance, the following screen appears:
Before clicking [Next], make sure you have backed up your old database!
Enter a location for saving the database installation/update log file. In case of any errors this log file will be requested by our technical support staff.
After clicking [Next], the appropriate database scripts will be executed. Some of them might take several minutes to execute; this depends on the amount of existing data in the database being upgraded.
Backing up a migration-center database
To back up a database used by migration-center, it is sufficient to back up only the data within the FMEMC schema. The easiest way to do this is with Oracle’s “exp” utility. See the screenshot below for the basic steps required to back up a schema. For more information, consult the documentation provided by Oracle.
Starting with Oracle 11g Release 2, empty tables might not be exported at all. This happens when the database parameter DEFERRED_SEGMENT_CREATION is set to TRUE, which is its default value. To force exporting all tables from the FMEMC schema, the following commands should be run connected as user FMEMC:
ALTER TABLE SCHEDULER_RUNS ALLOCATE EXTENT;
ALTER TABLE SCHEDULERS ALLOCATE EXTENT;
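If further empty tables turn up in a particular installation, the same statement can be generated for each of them. This is a minimal sketch with a hypothetical helper name; it only produces the SQL text and assumes the list of table names is supplied by the administrator.

```python
def allocate_extent_statements(tables):
    """Build one ALTER TABLE ... ALLOCATE EXTENT statement per table name."""
    return [f"ALTER TABLE {name} ALLOCATE EXTENT;" for name in tables]


if __name__ == "__main__":
    # The two tables named in this guide.
    for stmt in allocate_extent_statements(["SCHEDULER_RUNS", "SCHEDULERS"]):
        print(stmt)
```

The generated statements are run connected as user FMEMC, exactly like the two commands above.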
Restoring a migration-center database
To restore the backup, follow the steps below:
If the database instance where the backup should be restored does not contain an FMEMC schema and FMEMC user, please create them first.
Use Oracle’s “imp” utility for importing the dump file previously created by the “exp” utility. See screen shot below for the basic steps required to restore a database schema from a dump file. For more information consult the documentation provided by Oracle.
Note: The same character sets in the Oracle Client should be used when exporting and importing the data.
The migration-center Server Components include the adapters which connect to the various source and target systems, as well as the Job Server service which runs the jobs employing those adapters.
Before starting the installation, note that JAVA_HOME or JRE_HOME needs to be set appropriately.
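A quick way to confirm this prerequisite is to inspect the environment before launching the installer. The sketch below is illustrative only; the function name is hypothetical, and checking JAVA_HOME before JRE_HOME is an assumption, not documented installer behavior.

```python
import os


def find_java_home(env=None):
    """Return (variable name, value) for the first Java variable that is set, else None."""
    env = os.environ if env is None else env
    # Assumption: JAVA_HOME is checked first, then JRE_HOME.
    for name in ("JAVA_HOME", "JRE_HOME"):
        value = env.get(name, "").strip()
        if value:
            return name, value
    return None


if __name__ == "__main__":
    found = find_java_home()
    if found is None:
        print("Neither JAVA_HOME nor JRE_HOME is set.")
    else:
        print(f"{found[0]} = {found[1]}")
```

This only checks that a variable is set; it does not verify that the path points to a supported Java version.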
After starting the installation, a welcome screen appears. Click [Next] to move on to the next step.
The installation of the Job Server can be cancelled at any time by clicking [Cancel].
Select destination location
Next, the user will select the installation location for the Server Components. The default installation path proposed by the installer is recommended but can be changed if needed.
Server components folder in the start menu
During the installation, shortcuts are created for migration-center Server Components in the Windows start menu. The location of the shortcuts can be configured here. Furthermore, an entry is created in the Windows Start Menu Startup folder, in order to allow the automatic start of the Job Server.
Select Job Server port
At this point the TCP listening port of the Job Server is configured. The port specified here is used by the migration-center client to control the Job Server. The default port is 9700 and can be changed if needed.
Job Servers on different machines can all use the default port, all use the same non-default port, or each use an entirely different port.
The port specified for each Job Server during installation must be specified when adding a new Job Server location in the client pointing towards that respective Job Server.
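Before adding a Job Server location in the client, it can be useful to confirm that the configured port is actually reachable from the client machine. The following is a minimal sketch, not part of migration-center: the function name job_server_reachable is hypothetical, and it only tests whether a TCP connection can be opened, not whether a Job Server is the process answering on that port.

```python
import socket


def job_server_reachable(host: str, port: int = 9700, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: 9700 is the default Job Server port named in the
    text; pass the port chosen during installation if it was changed.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, job_server_reachable("jobserver-host", 9700) would be called with the host name of the Job Server machine; "jobserver-host" is a placeholder. A result of False usually points to a stopped service, a different port, or a firewall between client and Job Server.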
Select the folder where the log files will be saved
Default value is the logs folder in the Server Components installation folder, but it can be set to any valid local path or UNC network share. Note the location set here must allow write access for the account used to run the migration-center Job Server service, which is the component writing the log files.
Since version 3.14 the log location is only set for the adapter logs. The server logs location is manually set in the mc-core/logback.xml file, using the FOLDER_LOG property.
Completion of the installation
The Job Server is installed on the system by selecting [Install].
Multiple installations of the Server Components
Depending on the requirements or possibilities of each migration project, it can make sense to install multiple Job Servers across the environment, either to share the workload, or to exploit performance advantages due to the geographical location of the various source and/or target system, i.e. installing the Job Server on a node as close as possible to the source or target system it is supposed to communicate with.
In terms of throughput, local installations will provide benefits over remote installations and local networks will always be faster than remote networks.
Besides throughput, latency needs to be considered as well, since increased latency can affect performance just as much, even leading to timeouts or connection breakdowns in severe cases.
Ending the installation process
Once all selected components are installed, end the installation by selecting [Finish].
Uninstalling previous versions of the mc Server Components
A clean installation of the mc Server Components may require uninstalling the previously installed version first and would be recommended if upgrading while skipping over several intermediate versions.
To uninstall the mc Server Components follow the steps below:
Stop the mc Job Server service if it is currently running; this can be done
using the Windows Services application
using the “Stop service” shortcut in Start Menu/All Programs/fme AG/migration-center Components <currently installed version>
Uninstall the mc Server Components; this can be done
using the Programs and Features applet in Windows Control Panel
using the “Uninstall” shortcut in Start Menu/All Programs/fme AG/migration-center Components <currently installed version>
Uninstalling will not delete the log files generated in the mc Server Components installation folder, as the user may want to keep log files for future reference. Thus, it is up to the user whether to delete or archive leftover log files after the uninstaller has finished removing the application’s files.
Not all adapters are available on the Linux version of Server Components. The ones available are:
Scanners: Documentum, Filesystem, Database, eRoom, Exchange
Importers: Documentum, Filesystem, InfoArchive
The Linux version of the migration-center Server Components can be found in the folder ServerComponents_Linux.
In order to install the Migration Center Job Server, extract the archive Jobserver.tar.gz in the desired location using the command:
tar -zxvf Jobserver.tar.gz
All necessary Job Server files will be extracted in the “Jobserver” folder.
To install the Job Server as a service / daemon, follow these steps:
1. Switch to the “bin” folder of the “Jobserver” folder
2. Run the command sudo ./installDaemon.sh
Run the installDaemon.sh as a user that has administrative permissions (sudo).
Instead of installing the Job Server as a service/daemon, you can run it in the terminal by executing the script ./runConsole.sh in the bin folder.
The default TCP listening port is 9701 and can be changed in the “server-config.properties” file located in the “lib/mc-core” folder.
For running Documentum Scanner or Importer on Linux the DFC (Documentum Foundation Classes) needs to be configured in “conf/dfc.conf” as it is described in the Scanner and Importer user guides.
During installation of the migration-center Client, a desktop icon (optional) and a Start Menu entry are created. By default the Start Menu entry can be found in Start-> (All) Programs-> fme AG-> migration-center Client <Version>. The migration-center Client can be started by using the shortcuts from these locations.
The Job Server is set up as a Windows service during the installation and can be managed using the regular Services Microsoft Management Console snap-in.
By selecting the “Migration Center Job Server” entry, the service can be manually started, stopped or restarted if needed. If the service is required to run using a specific user account, this can be configured on the Log On tab in the services’ properties.
In addition, the Job Server has shortcuts in the Start menu by which the service can be stopped or started directly without having to access the Services console. By default these can be found in Start-> (All) Programs-> fme AG-> migration-center Server Components <Version>.
The Job Server is set up as a service (daemon) and can be started or stopped by running the following scripts inside the “Jobserver/bin” folder:
./startDaemon.sh for starting the daemon
./stopDaemon.sh for stopping the daemon
Instead of installing the Job Server as a service/daemon, you can run it in the terminal by executing the script ./runConsole.sh in the bin folder.
The following paragraphs describe how to uninstall the individual components of migration-center.
The migration-center Client can be uninstalled by using the “Add or Remove Programs” or “Programs and Features” item in the Control Panel (depending on the version of Windows used). The Client is listed under the name “migration-center Client <Version>”; it can be uninstalled by selecting the entry and clicking [Remove].
Uninstall links are also provided in the Start Menu program group created during installation. By default, this is Start-> (All) Programs-> fme AG-> migration-center Client <Version>.
Uninstalling the migration-center database schema will delete all data within that schema. It is no longer possible to recover the information after that. Please back up the database schema before uninstalling it.
An uninstall script is provided with migration-center; it can be found in database/util/drop_fmemc_schema.sql. Run this script against the Oracle database instance the FMEMC schema should be removed from using the Oracle administration tool of your choice.
Make sure no connections of the user FMEMC exist, otherwise the script will fail to execute properly. Should this happen, the script can be re-run after closing all connections of the user FMEMC.
The migration-center Server Components can be uninstalled by using the “Add or Remove Programs” or “Programs and Features” item in the Control Panel (depending on the version of Windows used). The Server Components are listed under the name “migration-center Server Components <Version>”; they can be uninstalled by selecting the entry and clicking [Remove].
Uninstall links are also provided in the Start Menu program group created during installation. By default, this is Start-> (All) Programs-> fme AG-> migration-center Server Components <Version>.
To uninstall the Job Server as a service/daemon, follow these steps:
Go to the “bin” folder inside “Jobserver” folder
Run the command ./uninstallDaemon.sh
Run uninstallDaemon.sh as a user that has administrative permissions (sudo).
The present document offers a detailed description of migration-center requirements specifically related to the Oracle database management system it uses as the backend.
The Installation Guide contains a full description of the system requirements and installation steps for all migration-center components and should be consulted first. This document does not replace the Installation Guide, but complements it with information that might be of importance for Oracle database administrators, such as the various packages, privileges and other Oracle-specific features required and/or used by migration-center.
This document is targeted specifically at Oracle database administrators about to deploy migration-center in corporate environments where databases need to adhere to strict internal guidelines and policies, and where the default database installation procedure used by migration-center may need to be reviewed first, or possibly adapted to meet specific requirements before deploying the solution.
The migration-center database stores all information generated during a migration project. This includes job configurations, object metadata, transformation rules, status and history information, etc.
One exception here is the actual document’s/file’s content, which is not stored in the database and is of no relevance to database deployment or configuration.
Please consult the migration-center Installation Guide for complete information regarding installation steps, recommended settings and configuration options for Oracle. Below is just an excerpt of the essential requirements, not a full overview of the installation process.
2-4GB of RAM should be assigned to the Oracle database instance (not the entire Oracle server) where the migration-center schema will be deployed. The Oracle specific memory allocation settings can be left at their defaults.
For storage requirements and sizing of Oracle datafiles see section below.
There is no differentiation between physical or virtual hardware in terms of requirements; faster physical hosts may be required to compensate for losses in I/O performance or latency if running the database server on virtual machines.
Oracle RDBMS version: 10g R2 - 19c
Architecture: both 32-bit and 64-bit are supported
Edition: Developer or Enterprise Edition
Oracle Express Edition (Oracle XE) is not supported in production use due to its limitations!
Ideally a separate Oracle database instance should be designated for use by migration-center alone, rather than an instance shared with other database applications. This would help performance and related troubleshooting if necessary as there wouldn’t be multiple applications and thus multiple potential problem sources to investigate. From a purely technical perspective sharing a database instance with other database applications poses no problems.
The operating system the Oracle Server is running on is of no relevance to migration-center, as all communication with the database happens at a higher level, independently of the underlying OS or hardware architecture. All operating systems officially supported by the Oracle versions mentioned above are also supported by migration-center.
Please consult the appropriate Oracle documents for more information on operating systems supported by Oracle databases.
The migration-center database schema can be deployed on any Oracle database instance meeting the requirements and settings described in the migration-center Installation Guide.
The schema is named FMEMC. The schema name is fixed and cannot be changed; as a consequence, it is also not possible to have more than one schema installed on the same database instance.
The tablespace meant for storing user data is called FMEMC_DATA and will store information such as job configurations, object metadata, transformation rules, status and history information, etc.
A separate tablespace called FMEMC_INDEX is used for storing indices of indexed fields.
Much like the schema name, the tablespace names are fixed and cannot be changed.
By default each of the two tablespaces mentioned above stores information in 2 data files.
The data files are set to autoextend by default.
The data files can be customized in terms of count, size, storage location, and whether autoextend is allowed or not. With the autoextend option disabled the free space within the data files needs to be monitored and extended accordingly during the migration process to prevent the tables from filling up and stalling the migration process.
The above must be changed in the installation scripts and cannot be set through the regular setup program’s GUI.
The primary factor for sizing the Oracle data files is the number of objects planned to be migrated using the respective database instance, as well as the number of attributes stored per object, and the length of the respective attribute values; these factors can (and do) of course vary from case to case.
This chapter details the privileges required by migration-center for accessing the Oracle packages and various functionalities it needs to work properly. By default these privileges are granted to the FMEMC user during installation. After the database installation has completed successfully, it is possible to log in with the user FMEMC and connect to the migration-center database in order to start using the solution.
The regular installation process using the setup program with GUI requires logging on as SYS. Since this might not be possible in some environments, there is the alternative of customizing the critical part of the setup and having an administrator execute the appropriate scripts.
This would generally involve having an administrator customize and run the scripts for creating the user FMEMC, tablespaces, data files, etc., and then run the rest of the setup by running the setup program using the previously created user FMEMC for connecting to the database instead of SYS. In this case the permissions below must be granted to user FMEMC in order for the setup program to determine whether migration-center tablespaces and/or data files already exist on the instance where the database is supposed to be installed.
GRANT SELECT ON SYS.DBA_DATA_FILES TO FMEMC;
GRANT SELECT ON SYS.DBA_TABLESPACES TO FMEMC;
Encrypting passwords saved by migration-center
Read current database configuration from view v$instance (host name and database instance name)
Execute core migration-center job-based functionalities such as transformation, validation, or scheduled migration as native Oracle Jobs
migration-center Transformation Engine (process migration-center data by applying transformation rules and altering object metadata)
GRANT CONNECT, RESOURCE TO FMEMC;
GRANT CREATE JOB TO FMEMC;
GRANT SELECT ON SYS.V_$INSTANCE TO FMEMC;
GRANT CREATE VIEW TO FMEMC;
GRANT EXECUTE ON SYS.DBMS_LOCK TO FMEMC;
GRANT EXECUTE ON SYS.UTL_RAW TO FMEMC;
GRANT EXECUTE ON SYS.UTL_ENCODE TO FMEMC;
GRANT EXECUTE ON SYS.DBMS_OBFUSCATION_TOOLKIT TO FMEMC;
These privileges must be granted for migration-center to operate normally. It is not possible for the solution to offer even the core functionalities without having been granted the above privileges!
Start scheduled migration jobs at the specified time using Oracle’s built-in scheduler
GRANT EXECUTE ON SYS.DBMS_SCHEDULER TO FMEMC;
Establish network communication from scheduler to Job Server over TCP
GRANT EXECUTE ON SYS.UTL_TCP TO FMEMC;
Allow the scheduler to send notification emails to configured users about the outcome of scheduled migration jobs
GRANT EXECUTE ON SYS.UTL_SMTP TO FMEMC;
Privileges on packages SYS.UTL_TCP and SYS.DBMS_SCHEDULER must be granted for the scheduler functionality to work. The SYS.UTL_SMTP package is required for sending notification emails to configured users summarizing the outcome of scheduled migration jobs. Since this can also be checked directly in the mc Client application at any time, sending notification emails is an optional feature, therefore access to the SYS.UTL_SMTP package is not mandatory.
As mentioned above, parts of the installation process can be controlled via installation scripts. An administrator can use these to prepare the database manually and then run the regular setup program connected as the migration-center user FMEMC instead of SYS, whose use may not be allowed.
The script files used during setup are located in the Database\util folder of the migration-center installation package. These files can be used to adapt the setup procedure to some degree to specific requirements of the customer’s environment. The individual files and their role in the migration-center database installation process are described in the table below.
Decide whether to use a custom or the default location for the data files. Pick create_tablespaces_custom_location.sql or create_tablespaces_default_location.sql accordingly.
Review/adapt the selected tablespace creation script
Execute the tablespace creation script
Review/adapt the user creation script (do not change user name!)
Execute the user creation script
Run the migration-center database installation program (Database\InstallDataBase.exe) and proceed as described in the installation guide.
Exception to the Installation Guide: log in as the previously created user FMEMC instead of user SYS!
Verify the database installation log after installation
Log on to migration-center using the migration-center Client and verify basic functionalities
Use the drop_fmemc_schema script to drop the entire schema in case of an unsuccessful installation, or if the schema needs to be removed for other reasons. WARNING! This will delete all migration center data from the database instance! Make sure the schema has been backed up before dropping a schema that already contains user data created by migration-center.
The present document offers a guideline for sizing and configuring migration-center components in different migration scenarios. Due to the complexity of the migration projects, enterprise environments and applications migration-center interacts with, this guide does not provide exhaustive information, but it offers best practice configurations accumulated in more than 100 migration projects.
The installation documentation contains a full description of the system requirements and installation steps for all migration-center components and should be consulted first. This document does not replace that documentation but complements it with information that might be of importance for getting the best migration-center performance.
This document is aimed specifically at the technical staff responsible for the installation and configuration of migration-center components.
This chapter describes the configurations and sizing guidelines that apply to any kind of migration project.
Client
The following hardware and software resources are required by MC client in any of the migration scenarios described in the next chapters.
Server components
Oracle instance
The migration-center database stores all information generated during a migration project. This includes job configurations, object metadata, transformation rules, status and history information, etc.
Sizing and configuration tips
No special hardware or software requirements
All MC components can be run on the same machine
Deployment overview
For small migration projects, where up to 500,000 objects need to be migrated, there are no special hardware or software requirements. Basically, all three MC components can be run on a single desktop computer or on an adequately sized virtual machine. In this case the machine should have enough physical memory for the OS, the Oracle server, the MC components and any other applications that may run on that machine. The recommended processor is a dual or quad core with a clock rate of at least 2.2 GHz.
Oracle instance
Standard database installation can be followed (as it is described in “migration-center installation guide”)
RAM: 4 GB of total memory for the instance.
Use “Automatic memory management”. This can be chosen when creating the Oracle instance.
Sizing and configuration tips
Dedicated Oracle Server machine
Two MC Job Servers for a better scalability
Deployment overview
It is recommended to use a dedicated machine for the Oracle instance needed by MC.
For better scalability of scanning and importing jobs, or when the migration timeframe is short, it is recommended to deploy the Server Components (Job Server) on two machines. In this way you can run multiple scanners and importers in parallel, speeding up the entire migration process. The performance of scanners and importers depends on the source and target systems, so if one of those systems performs slowly, the migration process can be sped up by running multiple scanners and importers in parallel. If necessary, the number of deployed Job Servers can be increased.
Oracle instance
The host does not necessarily need to be a server machine. A well-sized desktop machine is sufficient.
CPU: Quad Core, min. 2.5 GHz
8 GB of memory allocated for the instance
Use “Automatic memory management” in order for the instance to tune its target memory size, redistributing memory as needed between the system global area (SGA) and the instance program global area (instance PGA). If that is not possible, the recommendation is to allocate as much as possible for SGA memory (especially for buffer cache) and keep PGA memory in the range of 200 MB.
It is strongly recommended that the instance has at least 3 redo log files with a total size of at least 1 GB. This is important when multiple big transformation and validation jobs are run by MC, because those jobs update a large number of database rows and therefore require enough redo log space. Information about the existing redo log files can be obtained by running the query: SELECT * FROM v$log;
Sizing and configuration tips
Dedicated Oracle Instance running on a Server machine.
Three or more MC Job Servers for a better scalability
Deployment overview
It is recommended to use a dedicated Oracle Instance running on server hardware.
For better scalability of scanning and importing jobs, three or more instances of the Server Components (Job Server) need to be deployed. The performance of scanners and importers depends on the source and target systems, so if one of those systems performs slowly, the migration process can be sped up by running multiple scanners and importers in parallel. If necessary, the number of deployed Job Servers can be increased.
Oracle instance
A dedicated server machine is required. It is recommended to use a dedicated Oracle instance for running only Migration Center but no other applications.
CPU: 4-8 Cores, min. 2.5 GHz
16-32 GB of memory allocated for the instance
Use “Automatic memory management” in order for the instance to tune its target memory size, redistributing memory as needed between the system global area (SGA) and the instance program global area (instance PGA). If that is not possible the recommendation is to allocate as much as possible for SGA memory (especially for buffer cache) and keep PGA memory in the range of 400 MB.
Make sure that the instance has at least 4 redo log files with a total size of at least 1.5 GB. This is important when multiple big transformation and validation jobs are run by MC, because those jobs update a large number of database rows and therefore require enough redo log space. Information about the existing redo log files can be obtained by running the query: SELECT * FROM v$log;
Sizing and configuration tips
Multiple Dedicated Oracle Instances running on a Server machine or use an instance on a high-performance Oracle cluster
Four or more Job Servers.
A file server or a storage server used for temporary storing the content.
Deployment overview
The deployment for the migration of a very large number of objects should be planned carefully. In this case it is recommended to use multiple MC databases. In most cases it is not advised to migrate more than 10 million objects with a single Oracle instance even if the hardware is sized accordingly. There are several reasons for using multiple database instances:
The data from different sources is not mixed up in a single instance, which makes it easier for the user to handle errors that appear during migration
The transformations and validations will be better scaled on multiple database instances
Facilitate the work of multiple migration teams
A dedicated file/storage server should be shared by all Job Servers for storing and accessing the content during migration. This way, a set of objects scanned with any scanner can be imported with any importer.
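The 10 million objects per instance guideline above translates into a simple lower bound on the number of MC databases to plan for. The sketch below is illustrative only: the function name min_database_instances is hypothetical, and the other reasons listed above (separating sources, multiple migration teams) may justify more instances than this count.

```python
# Guideline from the text: in most cases no more than about 10 million
# objects should be migrated with a single Oracle instance.
MAX_OBJECTS_PER_INSTANCE = 10_000_000


def min_database_instances(total_objects: int,
                           per_instance: int = MAX_OBJECTS_PER_INSTANCE) -> int:
    """Return the smallest number of instances that keeps each one at or
    below per_instance objects (hypothetical planning helper)."""
    if total_objects < 0 or per_instance <= 0:
        raise ValueError("total_objects must be >= 0 and per_instance > 0")
    # Integer ceiling division; at least one instance is always needed.
    return max(1, (total_objects + per_instance - 1) // per_instance)
```

For a project of 35 million objects, min_database_instances(35_000_000) gives 4.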
Oracle instance
Several dedicated server machines are required. For each machine it is recommended to use a dedicated Oracle instance for running only Migration Center but no other applications.
CPU: 4-8 Cores, min. 3.0 GHz
Minimum 16 GB of memory allocated for each instance.
Use “Automatic memory management” in order for the instance to tune its target memory size, redistributing memory as needed between the system global area (SGA) and the instance program global area (instance PGA). If that is not possible the recommendation is to allocate as much as possible for SGA memory (especially for buffer cache) and keep PGA memory in the range of 600 MB.
It is recommended to split the FMEMC_DATA tablespace across multiple physical data files stored on different physical disks in order to maximize the performance for a specific hardware configuration.
Make sure that the instance has at least 5 redo log files with a total size of at least 2 GB. This is important when multiple big batch jobs (transformations, validations, scanners, importers) are run by MC, because those jobs update a large number of database rows in quite a short time; the redo log space should therefore be sized accordingly in order to prevent redo log wait events. Information about the existing redo log files can be obtained by running the query: SELECT * FROM v$log;
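The redo log recommendations in this guide all have the same shape: a minimum number of redo logs and a minimum total size. A small sketch of that check follows. It assumes the sizes are taken from the BYTES column of v$log (one value per redo log group), treats each group as one redo log file, and interprets 1 GB as 1024^3 bytes; the function name redo_logs_sufficient is hypothetical.

```python
from typing import Sequence

BYTES_PER_GB = 1024 ** 3  # assumption: binary gigabytes


def redo_logs_sufficient(group_sizes_bytes: Sequence[int],
                         min_groups: int,
                         min_total_gb: float) -> bool:
    """Check a list of redo log sizes against a recommendation.

    group_sizes_bytes: one size in bytes per redo log, e.g. the BYTES
    values returned by SELECT * FROM v$log.
    """
    enough_groups = len(group_sizes_bytes) >= min_groups
    enough_space = sum(group_sizes_bytes) >= min_total_gb * BYTES_PER_GB
    return enough_groups and enough_space
```

With five redo logs of 512 MB each, redo_logs_sufficient([512 * 1024**2] * 5, min_groups=5, min_total_gb=2) holds, while three redo logs of 200 MB each do not meet the "3 files, 1 GB" recommendation.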
The schema is home to all database objects created and used by migration-center, such as tables, stored procedures, Java code, etc. As part of the schema the user FMEMC is created and granted access to all necessary packages, objects and privileges (see also the section on privileges). For storing data and indices two tablespaces are also created for use by FMEMC (see also the section on tablespaces).
For more comprehensive information on typical migration project sizes (from small to very large) and the resulting storage space requirements, as well as general recommendations for sizing the database in preparation for an upcoming migration project, please consult the sizing guide.
Script file name (located in Database\util) | Description | Executable | Editable
create_tablespaces_custom_location.sql | Tablespace and data file creation and configuration script | Y | Y
create_tablespaces_default_location.sql | Tablespace and data file creation and configuration script using the database instance’s default location for data files | Y | Y
create_user_fmemc.sql | User creation script; this is where the user is created and privileges are granted to the user. WARNING! Do not change the user name! | Y | Y
drop_fmemc_schema.sql | Drops the user/schema FMEMC and all objects owned by FMEMC. WARNING! This will delete all migration-center data from the database instance! | Y | N
Operating system:
Windows Server 2003, 2008, 2012, or 2016
Windows 7, 8, or 10
Required software:
Java 1.8 or OpenJDK 8, 32-bit or 64-bit
For migration to/from Documentum, DFC 5.3 or later needs to be installed
CPU:
Dual or Quad core processor min. 2.5 GHz
RAM:
The Job Server is configured by default to use 1 GB of memory. This will be enough in most of the cases. In some special cases (multiple big scanner/importer batches) it can be extended to 1.5 GB through the configuration.
HDD:
For installation, 300 MB of disk space is required. Additional disk space is required for the logs. In most production migrations, when debugging is not activated, about 50 MB should be reserved for every 100,000 migrated objects.
Content storage:
The scanners running in Job Server extract the content of documents from the DMS and store them in a file system folder. That folder might be located on the Job Server machine (in case of small and medium migration projects) or on dedicated storage machine, NAS or fileserver in case of large and very large projects. How much disk space is needed to be reserved for exporting the content depends on the number and the size of documents that have to be migrated.
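The log-space rule of thumb given under HDD above (about 50 MB per 100,000 migrated objects, debugging not activated) can be turned into a quick estimate. This is a rough sketch; the function name estimated_log_space_mb is hypothetical, and it does not cover the 300 MB installation footprint or the content storage, which depends on the number and size of the documents.

```python
def estimated_log_space_mb(migrated_objects: int,
                           mb_per_100k_objects: float = 50.0) -> float:
    """Estimate the disk space in MB to reserve for Job Server logs.

    The default of 50 MB per 100,000 objects is the figure from the text
    and applies when debugging is not activated.
    """
    if migrated_objects < 0:
        raise ValueError("migrated_objects must be >= 0")
    return migrated_objects / 100_000 * mb_per_100k_objects
```

For 800,000 migrated objects this yields 400 MB of log space.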
Operating system:
See Oracle system requirements
Required software:
Oracle 10g R2, Oracle 11g R2, or Oracle 12c R1 and R2, Oracle 18c
CPU:
Depends on the migration size (see next chapters)
RAM:
Depends on the migration size (see next chapters).
Data storage:
Generally, the sizing of the data files is determined by the number of objects expected to be migrated using the respective database instance. By default, the tablespaces created by the MC installer total 60 MB. They are set to autoextend when more storage space is required by MC components. There is no upper limit set. The required storage depends on the number of objects that will be migrated and the amount of metadata that will be scanned and imported. Roughly 75% of the necessary storage is occupied by the objects’ metadata (source and target metadata). The remaining 25% is required by the configurations and other internal data.
To estimate the necessary storage for a migration set the following formula might be used:
Storage size = 5 * (NR_OF_OBJECTS) * (AVG_METADATA_SIZE)
NR_OF_OBJECTS – the number of objects and relations to be migrated
AVG_METADATA_SIZE – the average size in bytes of an object metadata (attribute names size + attribute values size)
Example: Migration of 800,000 documents and folders from Documentum into Documentum. The average number of attributes was 70 and the estimated AVG_METADATA_SIZE was 5,000 bytes.
Storage size = 5 * 800,000 * 5,000 = ~20 GB.
Note: In order to be accessible from any MC client, the scanner/importer report log files are stored in the MC database as CLOB fields. If big scanners or importers are run with “debugLevel=4”, the size of the report logs increases significantly and therefore additional tablespace should be taken into consideration.
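The storage formula above can be evaluated directly. The sketch below reproduces the documented example; the function name estimated_storage_bytes is hypothetical, and the result is converted using decimal gigabytes (10^9 bytes), which is what makes the example come out at about 20 GB. It does not account for the additional space needed by report logs at high debug levels.

```python
def estimated_storage_bytes(nr_of_objects: int, avg_metadata_size: int) -> int:
    """Storage size = 5 * NR_OF_OBJECTS * AVG_METADATA_SIZE (in bytes).

    nr_of_objects: number of objects and relations to be migrated.
    avg_metadata_size: average metadata size per object in bytes
    (attribute names plus attribute values).
    """
    return 5 * nr_of_objects * avg_metadata_size


# Documented example: 800,000 objects with 5,000 bytes of metadata each.
example_gb = estimated_storage_bytes(800_000, 5_000) / 10 ** 9
```

Here example_gb evaluates to 20.0, matching the figure in the example.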
Character set
Using the Unicode (AL32UTF8) character set is recommended in order to correctly handle any kind of character that may appear in the metadata.
Other recommendations
MC inserts and updates rows intensively and therefore fast disks help the performance.
Even though the rows are inserted and updated intensively the MC transactions are generally small. There are no situations when more than 5,000 table rows are inserted, updated or deleted in a single transaction.
Operating system:
Windows Server 2003, 2008, 2012, or 2016
Windows 7, 8, or 10
Required software:
Oracle Client 10g, 11g, or 12c 32-bit
CPU:
Any CPU that is supported by the OS is supported by the MC client
RAM:
Usually the MC client does not consume more than 200 MB of RAM. This threshold might be exceeded when loading more than 100,000 objects in a grid.
HDD:
10MB
CSV - Excel Scanner:
Added support for scanning repeating attributes (#54993)
InfoArchive Importer:
Added support for multiple references to the same content (#55529)
OTCS Scanner:
Added support for scanning CAD documents as regular documents (#55900)
SharePoint Online Batch Importer:
Added support for setting approval status on documents (#54990)
Added support for setting version numbering (#55505)
Added support for setting Lookup field type (#55521)
Veeva Importer:
Added support for delta migration of Veeva objects (#54928)
Added support for importing attachments for Veeva Objects (#55535)
Core Database:
Added additional multi-value transformation functions (#55424)
OTCS Importer:
Fixed record date not being set in opentext_rm_classification (#55431)
SharePoint Online Batch Importer:
Fixed import failing with invalid XML characters in attributes (#55007)
Fixed import failing verification (#55574)
Fixed import failing into library with deep path (#55745)
SharePoint Online Importer:
Fixed import failing into site collection URL with spaces (#55571)
Fixed not being able to assign AD group (#55796)
Fixed error on folder update (#55870)
Alfresco Scanner:
Scanning a document whose versions were edited online while autoVersion was false, and which had autoVersion switched to true afterwards, returns wrong version content (#55983)
CSV - Excel Scanner:
Content_location is not scanned as multi-value (#55635)
Veeva Importer:
Objects are not rolled back if attachment was set on objects that do not support attachments (#55939)
Documentum NCC Importer:
Delta migration for the multi-page content does not work properly when a new page is added to the primary content (#55739)
AIP - Archival Information Package
AIU - Archival Information Unit
BLOB - Binary large object
DB - Database
CSV - Comma separated values
CLOB - Character large object
DCTM - Documentum
DFC - Documentum Foundation Classes
DMS - Document Management System
DSS - Data Submission Session
GB - Gigabyte
GHz - Gigahertz
IA - InfoArchive
JDBC - Java database connectivity
JRE - Java Runtime Environment
JVM - Java Virtual Machine
KB - Kilobyte
MB - Megabyte
MS - Microsoft
MHz - Megahertz
RAM - Random Access Memory
regex - Regular expression
SIP - Submission Information Package
SPx - Service Pack x
SQL - Structured Query Language
XML - Extensible Markup Language
XSD - File that contains a XML Schema Definition
XSL - File that contains Extensible Stylesheet Language Transformation rules
Veeva Importer
New feature: Import documents using existing content from FTP server (#55303)
New feature: Add support for importing relations (#54991)
SharePoint Online Importer (Batch)
New feature: Allow assignment of any valid SP user (#55219)
Support for OneDrive (#54262)
Filesystem Scanner
New feature: Perform transformation of source object XML files before processing (#54776)
SharePoint Online Importer (Batch)
Update objects not marked as processed/imported (#55058)
Veeva Importer
Import version tree with renditions fails (#55396)
SharePoint Importer
Folder and document get created although a column cannot be set (#55001)
SPO importer fails setting lookup column with a NULL value (#55140)
CSOM Processor fails with 401 Unauthorized error if a job ran longer than 24 hours (#55323)
Opentext Scanner
Keywords system rule is limited to 255 characters for Physical Objects migset (#55341)
General / logging
The installer does not update the location of the Job Server's log files. If you do not want to use the default location for log files, which is <Job Server Home>/logs, you need to manually update the log file location in the <Job Server Home>/lib/mc-core/logback.xml configuration file (#54732)
Starting with version 3.13 Update 2, the SharePoint Online Importer only supports app-only principal authentication and no longer supports user name / password authentication. Please ensure that this will work for you before you upgrade your existing installation!
Alfresco Scanner
New feature: Scan only last "n" versions (#54289)
New feature: Scan multiple sites (#54291)
D2 Importer
Support for D2 20.2
Documentum Scanner / Importer
Support for Documentum Server 20.2
OpenText Importer
Support for OpenText Content Server 20.2
OpenText Scanner
New feature: Scan shortcuts as distinct MC objects (#53710)
New feature: Scan any folder subtype objects (#53711)
SharePoint Online Importer (Classic)
New feature: Add user-agent traffic decoration (#52837)
SharePoint Online Importer (Batch)
New feature: Importing role assignments (#54440)
Tools
New tool to export document type definitions from Veeva available (#51539)
OpenText Scanner
NPE occurs using excludeFolderPaths (#54326)
Wildcards do not work in the root folder (#54329)
Set attributes are not scanned if the name is longer than 100 chars (#54589)
SharePoint Online Importer (Batch)
SPO batch importer checks wrong content location (#54568)
SPO batch importer throws error when importing large folder hierarchy (#54574)
NPE for objects with NULL value in levelInVersionTree in SPO batch importer (#54637)
SharePoint Importer
Wrong max length limit for parentFolder system attribute when importing to SP 2019 (#54592)
General / logging
The installer does not update the location of the Job Server's log files. If you do not want to use the default location for log files, which is <Job Server Home>/logs, you need to manually update the log file location in the <Job Server Home>/lib/mc-core/logback.xml configuration file (#54732)
D2 Importer
Support for importing to D2 4.1, 4.5, and 4.6 was removed
OpenText Scanner
Using binders in the parameters "scanFolderPaths" and "excludeFolderPaths" is not supported (#54963)
PowerShell tools:
Support multiple scan run IDs when creating migration sets (#54263)
SharePoint Online importer:
Support for app-only principal authentication with SharePoint permissions (#54277)
SharePoint Online Bulk importer:
Add mc_content_location in migset rules (#54182)
Support for app-only principal authentication with SharePoint permissions (#54277)
SharePoint Online scanner:
Add scanLatestVersionOnly feature (#5615)
Add includeFolders and excludeFolders parameters (#53706)
OpenText importer: Wrong values for date attribute during physical objects import (#54255)
OpenText scanner:
NPE occurs using excludeFolderPaths (#54326)
Wildcards do not work in the root folder (#54329)
SharePoint (Online) importer: Automatically added content type not added to cache (#54420)
SharePoint scanner: Permissions & User fields contain ambiguous username instead of login name (#54240)
New SharePoint Online Batch Importer (#52665)
Add “removeDuplicate” transformation function (#53528)
Add support for Oracle 19c (#52930)
Add support for Oracle JDK 11 & OpenJDK 11 (#53313)
Add support for Oracle JDK 13 & OpenJDK 13 (#52492)
Add support for IA 16.7 (#52910)
Added support for Alfresco 6.2.0 for Alfresco scanner and importer (#54108)
Alfresco scanner and importer now require Java 1.8 or later
Documentum Scanner
Scan complete document audit trail and save it as rendition to the current document version (#52846)
OpenText Importer
Assign objects to Physical Object Box with a dedicated system rule (#52748)
SharePoint importer
Support valid filename characters on SharePoint on-prem (#53304)
Documentum In-Place
Implement move content feature in Documentum In-Place adapter (#53518)
Fix typo in transformation function “ConverDateTimezones” (#53136)
“GetDateFromString” transformation function returns null in some cases (#53164)
Job server installation on Linux outdated in Installation Guide (#53714)
OpenText Importer became laggy during impersonation after a certain number of requests (#53180)
No error or warning when importing two renditions of the same type with OpenText Importer (#53124)
The underlying connection was closed: An unexpected error occurred on a send in SharePoint Importer (#53099)
Could not get SharePoint X-RequestDigest error message when space character in site collection name (#53190)
SharePoint Importer should log error immediately if required fields are missing (#53844)
HTTP 503 Server Unavailable Error when downloading large content files with SharePoint Online Scanner (#53673)
Fix “Member ID is not valid” in OpenText scanner (#53482)
Fix setting empty Vault Object reference in Veeva importer (#53573)
OTCS Scanner
Scanning Project objects as OTCS(container) with metadata (#52932)
Save version information as source attributes (#52292)
Support wildcards in folder paths (#52360)
Show number of warnings in job run report summary (#52845)
Scan user field with username instead of user id (#52847)
Objects scanned from inside projects now contain full parentFolder path (#52960)
Revised OpenText Adapter User Manuals (#52550)
Veeva Importer
Re-authentication mechanism in case of session timeout (#52822)
Pause and retry mechanism to handle burst and daily API limits (#52823)
Support version binding when importing Veeva Vault binders (#52091)
Delta Migration of objects in Veeva Importer (#53037)
Added support for Alfresco 6.1.1 in Alfresco Importer (#53008)
Added support for annotations in D2 Importer (#53018)
Documentum scanner now supports dqlExtendedString returning repeating values (#52587)
Added Support for SP 2019 in SP Importer (#52870)
Auto-Classification Module
Functionality to split dataset into training and test data (#52660)
Removing of library under GPL license (#52853)
Classification Process Logging (#52855)
Unresolved document file path in ScanRunExtractor (#52876)
Filesystem importer writes invalid XML characters to metadata XML file (#52909)
Missing dependency packages in AC module installer (#52935)
Error validating classification when subtypes not unique (#52945)
In Veeva Importer setting Binder version labels as major/minor does not work (#53160)
In Veeva Importer, document updates with empty values for attributes that were not empty do not delete the existing value (#53229)
In OTCS Scanner user fields that were deleted from a category are scanned as IDs (#52982)
New SharePoint Online Scanner (#52409)
Made OTCS scanner more robust to errors (#52361)
Improved OTCS scanner job run log output (#52365)
FileSystem Importer now applies XSL transformation on whole unified metadata XML file (#52394)
Veeva Importer now supports importing Documentum documents as renditions of a specific type to a Veeva Vault document (#52090)
Fixed bug in Filesystem Importer where attributes with no value were not being exported in XML metadata file (#52479)
Fixed OTCS Scanner not scanning documents under folder inside nested projects (#52809)
Fixed SPOnline importer bug when importing folders with a “%” in the name (#52857)
Fixed SPOnline importer not refreshing token when all objects of a large import fail (#52775)
SharePoint Online Scanner might receive timeout error from SharePoint Online when scanning libraries with more than 5000 documents (#52865)
Added support for Oracle database version 18c (#51451)
Added support for migrating Veeva Vault Objects (#52136)
Added support for version 16.x for the OpenText scanner (#50340)
Added support for D2 versions 16.4 and 16.5 (#50972, #51530)
Added switch for lifecycle error handling in D2 Importer (#50991)
Updated the IBM Domino Scanner (#49173)
Improved database security by changing default password (#51112)
Improved performance of filesystem importer copy operation (#52320)
Enhanced the RepeatingToSingleValue function to not throw exception when length exceeds 4000 bytes (#51798)
Added missing file and folder name checks in the SP and SPO importer user guides (#52076)
Updated Linux Jobserver with new YAJSW wrapper (#52347)
Box.net Importer, added the following features:
Import custom metadata (#50312)
Import tags (#50314)
Import comments (#50315)
Import tasks (#50316)
Add collaborators to the imported documents (#50602)
Fixed bug in DCTM adapters when calculating multi page content (#50442)
Fixed bug in SharePoint scanner when scanning SP 2013+ on Windows Server (#51088)
Fixed error messages bigger than 2000 characters not being saved in the database (#52357)
SharePoint Importer, fixed the following bugs:
import taking a long time when content_location points to folder (#52115)
Importer not starting if adfsBaseURL contains trailing slash (#51941)
DEBUG messages getting logged despite log level ERROR in log4j (#52082)
Added debug log messages for queue building missing in SP importer (#52084)
Exception not caught when reverting import action (#52216)
Importer hanging when encountering an exception in RefreshDigest method (#52238)
Improved Retry handling (#52270)
Error when importing into site with %20 in the name (#52272)
autoCreateFolders always enabled (#52305)
NullPointerException in file upload when SPOnline throttling occurs (#52325)
Some new features and enhancements in Domino Scanner
New scanner parameter "excludedAttributeTypes"
Default for scanner parameter "exportCompositeItems" now "0"
Scanner parameter "selectionFormula" set to be editable by default
64bit support (based on IBM Domino); 32bit support remains unchanged (based on IBM Notes)
InfoArchive Importer: provide a better way for importing email attachments scanned from IBM Domino
Domino Scanner: Extracted floating point numbers truncated/rounded to integer value
Domino Scanner: Attributes of type "NumberRange" were not exported
OpenText CS scanner: some logging improvements
OpenText CS scanner: Fix scanning subprojects
Veeva Vault Importer: Fix setting status_v for the root version
Veeva Vault Importer: Fix importing VD relations that have order_no with decimals.
Veeva Vault Importer: Fix a null pointer exception when logging some results
CSV/Excel Scanner: Fix setting the internal processing type of the scanner
Sharepoint Importer: Fix importing documents having # and % in the name
Sharepoint Importer: Fix importing folder with two consecutive dots (..) in the name
Sharepoint Importer: Fix token expiration after 24 hours.
This hotfix requires a reinstall of the Jobserver and Client components, as well as an update of the Database component. Please refer to the Installation Guide for details regarding the update.
Add support for importing attachments to Veeva Vault.
This hotfix requires a reinstall of the Jobserver and Client components, as well as an update of the Database component. Please refer to the Installation Guide for details regarding the update.
Added support for scanning nested compound documents in OTCS Scanner
Added support for scanning the Nickname attribute for folders and documents in OTCS Scanner
Added Proxy support for ADFS Authentication in SharePoint Importer
Added support for filtering Scanners and Importers in MC Client
Fixed OTCS Scanner bug not scanning dates properly sometimes
Fixed OTCS Scanner "Member Type not valid" error message
Fixed OTCS Scanner nullPointerException when trying to get the owner name in specific cases
Fixed OTCS Scanner nullPointerException when trying to scan certain Workspaces
Fixed SharePoint Importer bug autoCreateFolders not working for values with leading slash
Fixed SharePoint failing authentication with invalid XML chars in user name or password
This hotfix requires a reinstall of the Jobserver and Client components, as well as an update of the Database component. Please refer to the Installation Guide for details regarding the update.
Replaced Tanuki service wrapper with YAJSW (#51159)
Added support for Java 64-bit (#51163)
Added support for OpenJDK 8 (#51156)
New CSV / Excel Scanner (#50343)
Added Content Hash Calculation feature to OTCS scanner (#50736)
Added support for java web service in OTCS Scanner (#51389)
Added support for importing RIM submissions to Veeva Vault (#51208)
Added support for importing Documentum Virtual Documents as Binders in Veeva Vault (#51188)
Added support for setting references to existing master data objects in Veeva Vault (#51151)
Added support for ADFS authentication in SPO Importer (#51414)
Added new Transformation function for time-zone conversions (#51480)
Fixed bug when pulling content for updates with the Database scanner (#51142)
Fixed error message in OTCS log (#51619)
Fixed SharePoint Online importer too short path length limitation (#51491)
Fixed SharePoint not resuming properly after pausing job for long time (#51492)
Changed behavior of setting the file extension for the SharePoint Importer (#51503)
Documentum Scanner: Added option to scan only the latest version of a VD (#51019)
OTCS Importer: Added support for importing Virtual Documents as Compound Documents (#51008)
New transformation functions: GetDataFromSQL() and Length() (#50997, #50999)
Database Scanner: New Delta migration feature (#50477)
Documentum Importer: Extended waiting time for operations queue to finish (#50995)
Fixed error in D2 Importer when property pages contained tabs with visibility conditions (#50987)
Removed unneeded jar file from D2 importer folder (#51073)
The Jobserver now requires Java 8. Java 7 is no longer supported.
New Veeva Vault importer
Extended OpenText importer to import physical items (#50282)
Changed the way of setting RM Classifications in OpenText importer (#50842).
Delta migration issue after upgrading from 3.2.5 or older (#50571)
Importing not allowed items in the physical item container is permitted (#50978)
RM classifications for physical objects are not removed during delta (#50979)
Physical objects properties of type date are not updated during delta migration (#50980)
Veeva importer may fail to start, if DFC is installed on the same machine (#50981)
No error reported when setting values to Inactive Fields in Veeva Importer (#50880)
No error reported when setting values to fields that are not in the Type / Subtype / Classification used in Veeva Importer (#50871)
No error reported when setting permissions not in the selected lifecycle in Veeva Importer (#50939)
New OpenText In-Place Adapter (#49994)
Updated Tika library to version 1.17 for Filesystem scanner (#49258)
OTCS scanner is now able to scan Business Workspaces (#49994)
SharePoint Importer is now able to change the extension of files (#50142)
D2 Importer validates the Enabled, Visible and Mandatory conditions in a Property Page (#50235)
D2 Importer marks documents as Partially Imported when failing to apply a lifecycle action (#50274)
Fixed issue with saving migsets when plsql_optimize_level was set to 3 (#49160)
Fixed issue with exporting BLOB content from a Firebird database (#49197)
Fixed error in Documentum In-Place adapter with linking into inexistent folderpath (#49217)
Fixed nullPointerException in SharePoint Importer when using proxy (#49232)
Fixed bug in Documentum audit trail rollback scripts (#49943)
Fixed nullPointerException in Alfresco Importer (#50027)
Fixed error when scanning documents in “Projects” with OTCS Scanner (#50130)
Fixed individual version creator not being scanned in OTCS scanner (#50273)
Fixed missing supported versions for Oracle and DFC in the Installation Guide (#50275)
Fixed DQL Validation for $repeatingvalue placeholder in D2 Importer (#50278)
D2 Importer does not validate values for Editable ComboBox if they are not in the dictionary (#49115)
D2 Importer does not validate Enabled and Visibility properties if they have no condition attached to them (#50327)
Running IA Importer with values “16.x” for the InfoArchiveVersion parameter throws error (#50388)
Alfresco scanner now supports Alfresco 5.2 (#49977)
Database scanner now supports multi-content Blob/Clob (#49260)
Added support for scanning and importing xCP comments for folders (#49245)
Added support for Documentum Content Server 16.4 (#49243)
Added support for InfoArchive 16.3 (#49244)
OTCS importer now supports folders of type email_folder in OpenText Content Server (#49257)
OTCS scanner now supports scanning documents and emails from Email folders (#49995)
OTCS scanner now supports OTDS authentication (#50134)
Added checksum verification feature for OpenText Content Server Importer (#49250)
Added support for Oracle 12c Release 2 (#49259)
Fixed bug in the Alfresco importer rollback scripts (#49251)
Fixed delta scan issue when only main object metadata changed (#50052)
Fixed issue with description length in OTCS importer (#50118)
Fixed error when setting subterm under non available term in SharePoint Online taxonomies (#49978)
Fixed job getting stuck when connection to SharePoint server is lost during import (#50026)
Fixed error when importing filename with apostrophe or plus sign in SharePoint (#49087, #49151)
Fixed error when importing SharePoint lists with certain settings (#49975)
Fixed rollback not being performed when importing certain versions fails in SharePoint (#49974)
Fixed bug where document gets overwritten when old version moved to folder with a different document of the same name in SharePoint (#4661)
Added comprehensive content hashing with different hashing algorithms and encodings for the following Scanners Documentum, SharePoint, Database and Filesystem (#11636)
Filesystem Importer: XML Metadata file now contains elements for all null and blank attributes (#11606)
InfoArchive Importer: migrating Documentum AuditTrails is now supported (#11610)
InfoArchive Importer: migrating Documentum Virtual Documents with children relations is now supported (#11683)
OpenText Importer: migrating opentext_email objects is now supported (#11670)
SharePoint Online Importer: added content integrity checking feature for imported Documents (#8296)
SharePoint Scanner: scanning entire version trees as a single migration-center object is now supported (#11641)
New system attributes are now available in transformation rules for all adapters (#11941)
Fixed objects not being scanned when reading permissions failed in Filesystem Scanner (#11779)
Fixed table key lookup query being case sensitive in OpenText Importer (#11772)
Fixed length limitation for Table Key Lookup attributes in OpenText Importer (#11777)
Fixed setting attribute values from a different object to a category that has multiple rows when using multiple threads in OpenText Importer (#11797)
Improved logging for OpenText Importer (#11781)
Fixed issue where the initializer thread was still running in the background when a job was being stopped manually (#11739)
Fixed not being able to delete certain Scheduler runs that took more than 2 seconds to delete (#11889)
Sharepoint Scanner:
Added support for SharePoint 2016 (#10999)
Added support for CAML queries (#11629)
Added support for scanning SharePoint internal attributes (#11653)
Added support for scanning only current version in eRoom scanner (#11649)
Filesystem Scanner:
mc_content_location and original_location are now available in transformation rules when using moveFilesToFolder (#11474)
Added parameter for specifying the date format for extracting extended metadata with unusual format (#11673)
Added possibility of scanning permissions (#11720)
Upgraded to latest version of Tika Libraries (#11693)
Added support for InfoArchive 4.1 and 4.2 (#10325)
Added support for migrating SharePoint folders and List Items into InfoArchive (#11639)
OpenText importer:
now supports inheriting folder categories when importing folders (#11674)
now supports inheriting permissions (#11631)
now supports setting null values for popup / dropdown attributes (#11550)
Fixed applyD2RulesByOwner error message when using multiple threads in D2 (#11597)
Fixed some renditions being scanned twice in Documentum scanner (#11740)
Fixed wrong file path in log for Filesystem importer (#11469)
Fixed date values not being properly converted to XML format in some cases for InfoArchive importer (#11672)
Fixed contentless documents failing when ‘nocontent’ is specified in InfoArchive importer (#11656)
Fixed missing parent_object_id when scanning delta versions in OTCS Scanner (#11750)
Fixed importing documents that have date table key lookup attribute failing in OTCS Importer (#11576)
Fixed error when scanning uncommon managed metadata in SharePoint scanner (#11621)
Fixed only latest content exported for each version of a document in SharePoint scanner (#11657)
Fixed exportLocation parameter description when using local path in SharePoint scanner (#11568)
Fixed error when scanning documents with non-mandatory but empty taxonomy field in SharePoint scanner (#11652)
Fixed lookup attributes not being scanned at all in SharePoint scanner (#11655)
Improved OTCS scanner documentation (#11667)
Corrected version labels in Installation Guide (#11622)
Fixed Linux Jobserver classpath values (#11690)
New Documentum In-Place Adapter (#11490)
Alfresco Importer now supports Alfresco 5.2 (#10772)
Documentum Importer - added support for CS 7.3 (#9927)
D2 Importer
Added support for D2 version 4.7 (#10777)
Changed D2 DQL and Taxonomy validation to be done by the D2 API (#10317)
Added information regarding folder migrating options in the D2 Importer User Guide (#10754)
InfoArchive Importer
Added support for multiple content files per AIU (#8877)
Added support for multiple Object Types (#8877)
Added support for Custom Attributes inside the eas_sip.xml file (#9405)
Added support for calculating the PDI file checksum with SHA-256 and base64 encoding (#11418)
OTCS Scanner is now able to export rendition types (#11419)
OTCS Importer now supports Extended Attributes Model (#11424)
SharePoint Importer now does extra check to avoid overwriting existing content with the same List Item ID when trying to move content to different libraries (#11465)
Added warning regarding using multiple Jobservers with Sharepoint Importer (#11459)
OpenText Importer: fixed java heap space error when importing large size renditions. (#11426)
OpenText Importer: fixed the authentication cookie refreshment for impersonated user (#11396)
OpenText Scanner: fixed out of memory error when scanning objects with size larger than 20MB (#11393)
OpenText Scanner: fixed out of memory error when scanning objects with size larger than 8GB (#11434)
Sharepoint Importer: fixed updates to intermediate version being applied to the latest version (#11458)
Sharepoint Importer: fixed a case that caused the importer to not finish (#10338)
D2 Importer: Fixed validation for certain DQLs containing placeholders (#10318)
D2 Importer: Fixed validation of certain taxonomies (#11549)
Fixed a jar conflict between OpenText Importer and Sharepoint Scanner (#11487)
Multi-threaded import using applyD2RulesByOwner fails for some objects in D2 Importer
Remaining issues from the main release 3.2 to 3.2.8 Update 2 also apply to release 3.2.8 Update 3 unless noted otherwise.
Add support for migrating to DCM 6.7 (Documentum Compliance Manager).
Remaining issues from the main release 3.2 to 3.2.8 Update 2 also apply to release 3.2.8 Update 3 unless noted otherwise.
Added support for xCP comments in Documentum Scanner and Documentum Importer (#10886)
Added support for “dm_note” objects in Documentum Scanner and Documentum Importer (#10887)
Added support for multi-page content in Documentum Scanner, Documentum Importer and D2 Importer (#11271)
Added support for extracting content from BLOB / CLOB column types in the Database Scanner (#10899)
Fixed logging the version of the OpenText Importer in the run log (#11322)
Fixed setting the version description attribute in OpenText Importer (#11270)
Documentum scanner/importer: The scanned/imported “dm_note” objects are not counted in the job run report (#11391)
DQLs that have enable(row_based) are not processed correctly in the D2 Importer (#11390)
Remaining issues from the main release 3.2 to 3.2.8 Update 1 also apply to release 3.2.8 Update 2 unless noted otherwise
New OpenText scanner: supports scanning documents, compound documents and folders from OpenText repositories version 9.7.1, 10.0 and 10.5. Specific OpenText features are supported like scanning categories, classification, permissions and shortcuts.
Several bug fixes regarding validating DQL queries and Taxonomies when using property pages in D2 Importer.
Added support for D2 LSS (Life Sciences Solution) version 4.1 (#9990)
Added multi-threading capabilities to the D2 importer (#10107)
Added support for Alfresco version 5.1 (#10354)
Added support for OpenText Content Server version 16.0 (#9937)
Added support for SharePoint 2016 (#9925)
Improved SharePoint Online Importer performance when applying internal attributes and taxonomies (#10314)
Added support for InfoArchive 4.0 (#9926)
Added support for Documentum ECS (Elastic Cloud Storage) (#9783)
Alfresco Scanner: fixed scanning duplicates on delta scanning a new version for documents created directly in the Alfresco Share interface (#10013)
InfoArchive Importer: fixed nullPointer exception when importing objects without content (#10746)
Filesystem Importer: fixed logging message when setting owner fails (#10326)
SharePoint Importer: fixed setting author and editor attributes in a multi-domain environment (#10171)
SharePoint Scanner: fixed job finishing successfully with incorrect credentials (#9939)
MC Client: fixed description for RepeatingToSingle transformation function (#10291)
D2 Importer:
importing version updates with CURRENT not set to latest version fails (#10784)
the CURRENT label is always set to the latest version when applying Lifecycle (#10686)
importing branched versions with multiple and no r_version_label associated fails (#10460)
D2 importer does not work with D2 4.5 or 4.1 when Jobserver runs on Java version 1.8 (#10384)
Improved Sharepoint and Sharepoint Online Importers performance when setting Taxonomies and internal attributes. (#10423)
Improved Sharepoint Importer Error Reporting in certain cases. (#10299)
Remaining issues from the main release 3.2 to 3.2.7 Update 3 also apply to release 3.2.7 Update 4 unless noted otherwise
Documentum Importer can now attach documents to a given lifecycle.
FirstDoc importer: Fix a java.lang.ClassCastException error when importing to FDQM module
FirstDoc importer: Fix importing versions when there is an attribute count difference in the repeating group between create SDL and document definition SDL.
Remaining issues from the main release 3.2 to 3.2.7 Update 2 also apply to release 3.2.7 Update 3 unless noted otherwise.
D2 Importer now supports only D2 version 4.6. The older D2 versions are only supported by the previous versions of migration-center
OpenText Importer now supports creating OpenText Compound Documents
OpenText Importer now supports importing renditions
IBM Domino Scanner:
Large attribute values can now be split into chunks of maximum 4000 bytes
Added custom attributes: "attachmentItemName" - name of the item that an attachment was attached to in the original Domino document, "attachmentName" - name of the attachment in the original document and "$dominoNoteID$" - NoteID of the document in the database that it originates from.
Documentum Scanner: Scanning version trees having versions of multiple object types
OpenText Importer: Error messages concerning a single attribute do not have the attribute name
OpenText Importer: Categories not being set if no associations are made for it
Sharepoint Scanner: Documents with paths longer than 260 characters failed to be scanned
Running an SP scanner with incorrect credentials on a SharePoint with anonymous authentication enabled does not show errors.
Remaining issues from the main release 3.2 to 3.2.7 Update 1 also apply to release 3.2.7 Update 2 unless noted otherwise.
Alfresco scanner now supports scanning version check-in comments
Alfresco importer now supports importing version check-in comments
Documentum adapter supports scanning/importing version trees with different object types in them
InfoArchive importer officially supported on the Linux jobserver
InfoArchive importer now supports ingestion via InfoArchive Webservices
OpenText Importer now supports inheriting categories from folders
OpenText Importer now supports setting values for attributes grouped within a Set attribute
InfoArchive Importer:
fixed text inside SIP xml files generated without indent
fixed ZIP files not being rolled back when there is not enough space on disk
OpenText Importer:
fixed impersonate user not working as expected
fixed importer not working when DFC not installed
DocumentumNCC scanner: null pointer exception when scanning documents without content
InfoArchive importer: SIPs are imported successfully but marked as failed when moveFilesToFolder does not have write permissions
OpenText Importer: Error messages concerning a single attribute do not include the attribute name
OpenText Importer: Categories are not set if no associations are made for it
Sharepoint Scanner: Documents with paths longer than 260 characters fail to be scanned
Remaining issues from the main release 3.2 to 3.2.7 also apply to release 3.2.7 Update 1 unless noted otherwise.
New OpenText importer: supports importing documents and folders to OpenText repositories version 10.0 and 10.5. Specific OpenText features are supported like setting categories, classification, permissions, shortcuts to the imported documents and folders.
New SharePoint Scanner: supports the scan of SharePoint 2007, 2010 and 2013. Supports scanning documents, folders, listitems, lists and document libraries. Replaces the old scanner which has been retired.
New SharePoint Importer: supports SharePoint 2010 and 2013 and uses the REST API. Supports the import of documents, folders, listitems, lists and document libraries.
Retired the previous SharePoint Importer as SharePoint Legacy: new installations will not have it. Updated installations will have it under the name of SharePoint Legacy.
SharePoint Online Importer now supports documents, folders, listitems, lists and document libraries. Relations can now be imported as attachments for listitems.
D2 importer now supports creating folders based on the paths provided to documents.
Alfresco importer now supports Alfresco 5.0.
Alfresco scanner and importer: categories and tags are exported and imported using the values displayed in the interface instead of using internal ids.
Alfresco scanner and importer:
Fixed a NullPointerException when scanning documents with aspects from Alfresco.
Fixed the value of the content location attribute when scanning documents with skipping content.
The content location is now available in the attribute “mc_content_location” that is accessible to the transformation engine.
Fixed importing a new document version when the first link was modified compared with the previous version.
Improved the descriptions of several error messages that may occur when importing updates and versions.
Documentum scanner:
Fixed a query error when scanning a repository having DB2 as database.
Fixed scanning version tree having multiple types.
Fixed the content location for the documents without format.
Box importer:
Fixed setting “content_created_at” and “content_modified_at” to the imported files.
Filesystem scanner:
Fixed an out-of-memory error when scanning more than 1 million files
Fixed the case when scanning multiple folders like: <some folder path>\FName|<some folder path>\FName with Space. In the previous version the second folder was ignored without any error message.
Invalid paths are now reported properly when scanning multiple folders.
Fixed a possible performance issue when scanning extended metadata.
Filesystem importer:
Fixed a NullPointerException when setting “modified_date” without setting the “creation_date”
Fixed the case when the target file was overwritten by the metadata file when they were having the same name.
SharePoint Online importer:
Fixed updates imported as new versions.
Fixed the import failing when only major versions are enabled.
Fixed null pointer exception when setting empty “Group or Person” or “Hyperlink” column.
Items created for failed jobs are now rolled back where possible.
Fixed running with invalid credentials not setting the error status.
Proper log messages are now set for multiple attribute setting errors.
Eroom scanner
The “er:Path” attribute no longer contains the file name.
Database
Allow installing migration-center on a non-CDB Oracle 12c instance.
Fixed two errors thrown when upgrading from version 3.2.5 or older. This occurred when the Filesystem importer was not licensed.
Client
Fixed installing the client on Windows 8 and Windows Server 2012. No manual registry configuration is required.
OpenText Importer: The modified date of created folders (set by the importer) is overwritten by the content server when new documents are added to the folder.
Remaining issues from the main release 3.2 to 3.2.6 also apply to release 3.2.7 unless noted otherwise.
SharePoint Legacy Importer: The logging system on the SharePoint side is not reusable after a forced restart. The SharePoint Importer uses much more memory than the previous version. Document sets cannot be created in a library with versioning disabled.
SharePoint and SharePoint Online Importer: A null reference is not handled when importing large documents. The author is not set unless the editor attribute is set as well.
Added support for Oracle Database 12c.
Added support for Documentum 7.2.
Added support for Windows Server 2012, Windows 8 / 8.1.
New InfoArchive Importer: supports the import of data generated by any scanner to InfoArchive enterprise archiving platform.
New Alfresco Scanner: supports scanning documents, folders, custom lists and list items from an Alfresco repository.
New Microsoft Exchange scanner: supports scanning emails from different accounts using an account that has delegate access to the other ones.
New Exchange Removal adapter: supports deleting emails in the source system that have been imported from Microsoft Exchange.
Documentum Importer supports multithreaded processing for improved performance.
Filesystem Importer supports setting the creator name, creation date and modify date to the imported file.
Enhance the flexibility of generating metadata files in Filesystem Importer.
Support of Jobserver components installation on Linux. Note that not all adapters are available on Linux
The Oracle package SYS.SQLJUTL is not needed anymore by migration-center.
Sharepoint importer:
Fixed updating minor versions of documents.
Fixed updating documents when “ForceCheckout” flag is set to true in the library.
Fixed dummy versions not being deleted after the failed import of a large file.
Fixed importer trying to delete nonexistent dummy versions.
Fixed the reporting of some incorrect error messages.
Sharepoint Online importer:
Fixed setting the attribute “Title” for contact items.
Fixed the reporting of some incorrect error messages.
Filesystem Importer:
Fixed a null pointer exception when setting an invalid path for “unifiedMetadataPath”.
Fixed moving renditions when “moveFiles” is enabled in the importer.
migration-center 3.2.6 now requires Java Runtime Environment 1.7.0_09 or later. It will not work with older Java versions.
In Documentum 7.x the max allowed length for owner_name, r_modifier, r_creator_name was changed from 32 to 255 char. In MC the default provided types (dm_document, dm_folder, dm_audit_trail) still have the old limit of 32 char.
Filesystem Importer does not allow more than 255 characters for the “content_target_file_path” system rule.
SharePoint Online Importer: When importing large files into a SharePoint Online document library the import might fail due to a 30 minutes timeout defined by Microsoft. (Bug# 8043)
SharePoint Online Importer: Documents that have the attribute is update set to true are not overwritten during import, but a new version is created on import. Document Sets cannot be created in a library with versioning disabled. (Bug #7904)
Sharepoint Online Importer: The importer does not set the “Author” field of a document, if only the “Author” attribute has a rule associated to it. The user must associate rules to both the “Author” and “Editor” attribute for them to be set. (Bug #8629)
Sharepoint on Premise Importer: Importing documents without any specified version label results in an error. (Bug #8186)
Documentum Importer: Importing multiple updates of the same document that is part of a version tree containing branches may fail if using multiple threads. That is an extreme case since in the process of delta migration the object updates are migrated incrementally. (Bug #8175)
InfoArchive Importer: When the “targetDirectory” runs out of disk space during the zip creation, the importer fails as expected, but the zip file is not rolled back (deleted). (Bug #8758)
Alfresco Importer: Certain errors that may occur during import are not descriptive enough. (Bug #8715)
Alfresco Importer: Updating a document will result in an error if the first link of the base object needs to be removed by the update. (Bug #8717)
Alfresco Scanner: Running an Alfresco Scanner with the parameter “exportContent” unchecked results in invalid values set for the “content_location” attribute. (Bug# 8603)
Alfresco Scanner: Running two scanners in parallel on the same location can result in duplicate objects being created. (Bug #8365)
Filesystem Scanner: Folders will be ignored if multiple folder paths that begin with the same letters are used as a multivalue for the scanFolderPath (Bug #8601)
Remaining issues from the main release 3.2 to 3.2.5 also apply to release 3.2.6 unless noted otherwise.
SharePoint on Premise supports: new object types Sites, Document Sets; custom sites for individual imported objects; non-consecutive versioning for documents.
SharePoint Online Importer: internal properties like ‘Modified’, ‘Modified by’, ‘Created’ and ‘Created by’ are now updated properly.
SharePoint Online Importer: removed limitations of the number of items supported during import in the target library.
SharePoint Online Importer: the internal names of taxonomy terms are now used correctly.
IBM Domino Scanner:
The scanner requires that the temporary directory for the user running the MC Job Server service exists and that the user can write to this directory. If the directory does not exist or the user does not have write permission for it, the creation of temporary files during document and attachment extraction will fail. The log file will show error messages like
“INFO | jvm 1 | 2014/10/02 12:06:26 | 12:06:26,850 ERROR [Job 1351] com.think_e_solutions.application.documentdirectory… - java.io.IOException: The system cannot find the path specified”.
To work around this issue, make sure the temporary folder exists and the user has write permission for this folder. If the MC Job Server is started manually as a normal user, the “Temp” folder should be C:\Users\Username\AppData\Local\Temp. If the MC Job Server is run as a service by the Local System account, the folder is one of the following:
For the 32-bit version of Windows: C:\Windows\System32\config\systemprofile\AppData\Local\Temp
For the 64-bit version of Windows: C:\Windows\SysWOW64\config\systemprofile\AppData\Local\Temp
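The workaround above comes down to two checks: the temporary folder must exist, and the account running the Job Server must be able to create files in it. A minimal sketch of such a check follows. It is not part of migration-center, the function name `check_temp_directory` is hypothetical, and the result is only meaningful when the script runs under the same account as the Job Server.

```python
import os
import tempfile


def check_temp_directory(path=None):
    """Return a list of problems found with the temp directory (empty if none).

    Hypothetical helper for illustration. If no path is given, the
    directory Python itself would use for temporary files is checked.
    """
    path = path or tempfile.gettempdir()
    if not os.path.isdir(path):
        return ["Directory does not exist: {}".format(path)]
    try:
        # Creating and removing a real file tests write permission directly.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError as exc:
        return ["Cannot write to {}: {}".format(path, exc)]
    return []
```

Passing the folder explicitly, for example one of the system profile paths listed above, avoids relying on whatever default the current environment resolves to.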
New IBM Domino Scanner: supports scanning emails and documents from IBM Domino/Notes
New Sharepoint Online Importer: supports importing Lists, List Items, Documents and folders in Sharepoint Online
Eroom scanner: Multiple erooms can be scanned now with a single scanner configuration
Scheduler: Fix a special case when the scheduler run hangs in status “Scanner running”
Documentum Importer: Fix a “NullPointerException” when importing VDRelation updates
IBM Domino Scanner:
The scanner requires that the temporary directory for the user running the MC Job Server service exists and that the user can write to this directory. If the directory does not exist or the user does not have write permission for it, the creation of temporary files during document and attachment extraction will fail. The log file will show error messages like
“INFO | jvm 1 | 2014/10/02 12:06:26 | 12:06:26,850 ERROR [Job 1351] com.think_e_solutions.application.documentdirectory… - java.io.IOException: The system cannot find the path specified”.
To work around this issue, make sure the temporary folder exists and the user has write permission for this folder. If the MC Job Server is started manually as a normal user, the “Temp” folder should be C:\Users\Username\AppData\Local\Temp. If the MC Job Server is run as a service by the Local System account, the folder is one of the following:
For the 32-bit version of Windows: C:\Windows\System32\config\systemprofile\AppData\Local\Temp
For the 64-bit version of Windows: C:\Windows\SysWOW64\config\systemprofile\AppData\Local\Temp
Outlook scanner:
Extract the email address of the following recipients: To, CC and BCC
Extract the number of attachments in the source attribute: AttachmentsCount
Documentum Scanner and Importer: now support the migration of aspects attached to documents and/or folders. Documentum CS and DFC 6.x or higher required.
New Documentum No Content Copy (“NCC”) Scanner and Importer: these customized variants of the regular Documentum adapters allow for fast metadata-only migration between Documentum, while preserving the references to content files. You can move or copy the content files separately at any point during or after the migration.
Documentation: fixed several errata in SharePoint Scanner and Importer User Guides in filenames, paths and URLs used throughout the respective documents
eRoom Scanner: fixed the mc_content_location system rule not appearing for documents scanned from eRoom, which prevented changes to the content file’s location from being made for import if needed
migration-center Client: fixed issue with import of mapping lists containing keys differing only in character case. Now these are imported and treated as distinct values correctly.
Jobserver: fixed issue with establishing SSL connections from Jobserver with certain combination of adapters
Database Scanner: fixed the “Unexpected error occurred during content integrity checking” message when migrating data scanned with the Database Scanner to target systems with importers supporting the content integrity check feature, which the Database Scanner itself does not support.
File System Scanner: Extended metadata containing values longer than 4000 bytes no longer errors out. Instead, the values are truncated to the max 4000 bytes supported by mc.
File System Scanner: fixed extraction of creation date and owner for files with paths longer than 260 characters
Scheduler: fixed Oracle error when interval for a scheduler was set to “Month”
SharePoint Importer: The ID that is stored after a successful import of an object in id_in_target_system changed from target list specific ID to global unique ID (GUID). This change maintains compatibility with data scanned using mc 3.2.4 U1 and U2 that still uses the list specific ID
MC Client: Importing a mapping list containing identical keys will silently skip the existing key. Solution: verify mapping lists before importing them to migration-center in order to remove or rename identical keys. Mapping lists must always have unique keys.
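Verifying a mapping list for identical keys before import can be done with a few lines of script. The sketch below is an illustration only: the function name `find_duplicate_keys` is hypothetical, and it assumes the mapping list has already been read into (key, value) pairs, since the file format of mapping lists is not described here. Keys are compared exactly, so keys that differ only in character case count as distinct.

```python
from collections import Counter


def find_duplicate_keys(pairs):
    """Return the keys that occur more than once, compared case-sensitively.

    Hypothetical helper; `pairs` is any iterable of (key, value) tuples.
    """
    counts = Counter(key for key, _value in pairs)
    return sorted(key for key, count in counts.items() if count > 1)


# Example data, invented for illustration.
mapping = [
    ("DE", "Germany"),
    ("de", "Deutschland"),
    ("FR", "France"),
    ("DE", "Allemagne"),
]
duplicates = find_duplicate_keys(mapping)
```

Any key reported by the check needs to be removed or renamed before the list is imported.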
File System Scanner: added new option to move successfully scanned files to a local path or UNC network path specified by the user. See the migration-center File System Scanner User Guide for more information on the new moveFilesToFolder parameter.
Installer: added option to specify location of log files during installation of migration-center Server Components. Log files were stored in the installation folder’s logs subfolder by default, now any valid local or UNC network path can be specified for storing the log files generated by migration-center during runtime. See the migration-center Installation Guide for more information about installing migration-center Server Components.
Database: removed dependency on Oracle SYS.UTL_TCP and SYS.UTL_SMTP packages for performing manual/regular migration activities. Note the scheduler still requires these packages in order to function, but it is now possible to use all other features of migration-center except the scheduler without the need to have access granted to these Oracle system packages.
SharePoint Importer: Support for importing folder objects (content type Folder and custom subtypes inheriting from Folder)
SharePoint Importer: Support for creating link objects (content type Link to a Document and custom subtypes inheriting from Link to a Document) (see the SharePoint Importer User Guide for more information)
Documentum Scanner: added “exportLatestVersions” to control the number of latest versions to scan (see the Documentum Scanner User Guide for more information)
SharePoint Scanner: fixed an issue causing a "Object reference not set to an instance of an object." error during scan with certain configurations of Date and Time and Yes/No columns
New Documentum D2 Importer*: currently supports D2 4.1, including D2 specific features such as autonaming, autolinking, rule based security, applying rules based on the document’s owner, etc.
New SharePoint Scanner*: currently supports SharePoint versions 2007/2010/2013, extraction of Document Libraries, exclusion of selected Content Types or file types, checksum generation for verifying content integrity after import, etc.
New Documentum DCM Importer*: currently supports DCM 5.3, including DCM specific features such as document classes, creating controlled documents, applying autonaming rules, setting specific lifecycle states and associated attributes for imported documents, etc.
Updated SharePoint Importer: now implemented as a SharePoint Solution (just like the new SharePoint Scanner), simplifying deployment and improving performance, reliability and error resilience
Updated SharePoint Importer with content integrity check feature (comparison of checksums computed during scan and import)
Updated Documentum Importer: now supports importing to Documentum 7 repositories
Updated box Importer: now works with the box API version 2.0 for improved performance and reliability; verification of content integrity can now be performed during import based on checksums computed by the importer and box respectively
Updated Documentum Scanner: now supports scanning Documentum 4i repositories (via DFC 5.3)
New parameter for Documentum Importer to toggle whether errors affecting individual renditions should be treated as warnings or errors affecting the entire document, thus preventing it from being imported (set to ignore errors and treat as warnings by default for compatibility with previous versions)
New parameter for Filesystem Scanner to ignore hidden files (false by default)
Documentation: added new document “migration-center Database Administrator’s Guide” detailing database installation requirements (privileges, packages, etc.) and procedures for deploying mc in environments where the regular database setup cannot be run, and the database must be prepared manually by a DBA for installing the mc schema
*New adapters must be purchased separately.
Scheduler: fixed an issue where a 24-hour time set was converted to 12-hour time
Scheduler: fixed an issue where the hourly interval was not taken into account if the scheduler was configured to run on a minutely or hourly basis
Documentum Scanner: fixed issue where the “scanNullAttributes” parameter’s functionality was reversed with regard to the parameter’s actual setting in the UI
Documentum/D2/DCM Importer: fixed an issue with the importer not properly handling certain null attribute values (in case the null value of an attribute resulted from null values returned by the If transformation function)
Filesystem Importer: fixed an issue causing errors when importing renditions for a selection of documents if some objects had renditions and others didn’t
FirstDoc Importer: fixed an issue with setting version labels in a particular case (Superseded 1.0 – Obsolete 1.1)
FirstDoc Importer: fixed an issue trying to delete already deleted relations
FirstDoc Importer: fixed objects not being rolled back in case errors occur during saving content
FirstDoc Importer: fixed objects not being rolled back in case errors occur during updating of ACLs and links
Documentation: minor corrections and additions in various documents
SharePoint Importer now integrates with SharePoint as a SharePoint Solution instead of the separate service component it used up to mc 3.2.3. This changes the requirements and deployment procedures for this adapter. These are described in the mc 3.2.4 SharePoint Importer User Guide
Database: Privilege “CREATE ANY JOB” is no longer required, “CREATE JOB” is now sufficient
SharePoint Scanner only scans complete Document Library items and cannot scan individual folders currently
SharePoint Importer only supports versioning using contiguous numbering
Scheduler: setting a time interval (the hours during which the scheduler is allowed to run) for a scheduler configured to run on a minutely or hourly basis will not be reflected correctly in the “Next run date” displayed. The scheduler will run as configured, the issue affects only the display of next run date.
Remaining issues from the main release 3.2 to 3.2.3 also apply to release 3.2.4 unless noted otherwise.
New Microsoft Outlook Scanner – scan specified folders of a user’s mailbox from Outlook 2007-2010, including messages and their properties, choose to exclude particular subfolders or properties, etc. See the migration-center 3.2.3 – Outlook Scanner User Guide document for more information about this new adapter. As with all adapters, the Microsoft Outlook Scanner is available for purchase separately.
FirstDoc Importer now officially supports FirstDoc versions 6.3 and 6.4 (R&D)
Documentum Importer now supports setting rendition page modifiers and rendition file storage locations through transformation rules.
eRoom Scanner: fixed an issue leading to out of memory errors when scanning large files (several hundred megabytes each)
eRoom Scanner: fixed an issue with the skipContent parameter not working properly in some cases
Filesystem Scanner: fixed an issue where the scanner could fail to scan files equal to or larger than 4GB in size
FirstDoc Importer: fixed a memory leak issue related to session management
Documentum Scanner & Importer: Changed handling of rendition page modifiers (now exposed as a dedicated system attribute that can be set using transformation rules). See the “migration-center 3.2.3 – Documentum Importer User Guide” document for more information about this feature and how it may affect renditions scanned using older versions of the Documentum Scanner.
eRoom Scanner: changed naming of files extracted by the scanner to the mc storage location. This has been done in order to line up the eRoom Scanner’s behavior with the rest of mc’s scanners, which all use an object ID rather than actual filenames for the extracted content. This should not affect the regular workflow of a migration from eRoom, as all changes are handled internally by mc and are not exposed to the user (nor does or did the user need to interact with said files at any time).
The FirstDoc Importer cannot set Documentum system attributes such as creation and modify dates, creator and modifier user names, the i_is_deleted attribute, and so on. Setting these attributes is only supported using the Documentum Importer.
If a source attribute has the same name as one of migration-center’s internally used columns, the Client will display the value of the migration-center internal attribute for the source attribute as well. This occurs in Source Objects view. The issue also persists in the data exported to a CSV file if the CSV export is performed from Source Objects view. Other views or the View Attributes window will display the correct value of the respective source attribute.
Remaining issues for the main release 3.2 also apply to release 3.2.3 unless noted otherwise.
Documentum Content Validation – checks content integrity of documents migrated to a Documentum repository by verifying an MD5 checksum of content files before and after the import. From a Documentum source both primary content and renditions can be checked, for other sources only the primary content is supported. The feature is available as an option in the Documentum Importer. See the migration-center 3.2.2 – Documentum Importer User Guide document for more information about this feature.
Documentum Audit Trails – the Documentum adapters now support the migration of audit trails. The Documentum Scanner can scan the audit trails of any folders and documents within the scanner’s scope. Audit trail objects can then be added to an Audit Trail migset and processed just like documents and folders. Finally, audit trails can be imported using the Documentum Importer. The new options available in the Documentum adapters for controlling and configuring the migration of audit trails are described in the migration-center 3.2.2 – Documentum Scanner User Guide and migration-center 3.2.2 – Documentum Importer User Guide respectively.
The Documentum Scanner has a new option, allowing an additional DQL statement to be executed for every object within the scope of the scan. This allows additional information to be collected for each object from sources other than the document’s properties. Other database tables (registered tables) for example can be queried using this option. See the migration-center 3.2.2 – Documentum Scanner User Guide for more information.
The Filesystem Scanner can now build versions from additional attributes (as long as these attributes can supply the required information). Two attributes per object are required for the feature to work, which can be named arbitrarily and can be defined in the fme metadata files associated with each content file. The attributes to be used as version identifier and version number by the scanner can be specified through additional options available in the Filesystem Scanner. See the migration-center 3.2.2 – Filesystem Scanner User Guide for more information.
The SharePoint Importer can now update previously imported objects (if the objects had been modified in the source and scanned as updates)
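The idea behind the Documentum Content Validation feature listed above, comparing an MD5 checksum of a content file before and after import, can be sketched as follows. This is not the importer's implementation: the function names `md5_of_file` and `content_matches` are hypothetical, and both files are assumed to be readable from the local file system, whereas retrieving content from a repository is out of scope here.

```python
import hashlib


def md5_of_file(path, block_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of the file at path.

    The file is read in blocks so that large content files do not
    have to fit into memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def content_matches(source_path, target_path):
    """Return True if both files have the same MD5 checksum."""
    return md5_of_file(source_path) == md5_of_file(target_path)
```

A mismatch between the two digests indicates that the content changed somewhere between scan and import.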
Documentum Importer: Fixed an issue where importing an update to a Microsoft Word 8.0-2003 document would set the wrong format for the updated document (a_content_type was set to “doc” instead of “msw8” previously)
Documentum Importer: Fixed an issue with importing Filesystem objects to Documentum when user defined version labels were set in transformation rules. Depending on how the labels were set, documents could end up all having CURRENT labels set, or having no CURRENT labels at all. Now the most recently imported version is always set as CURRENT, regardless of the CURRENT label set in Transformation Rules.
Transformation Engine: Fixed an issue where validating a migration set would unnecessarily re-validate objects that had already been validated.
Scheduler: fixed issues where a Scheduler would stop working after several hundred runs and required to be stopped and restarted manually, or report a run as having finished with errors although there were none.
SharePoint Importer: Fixed an issue with the SharePoint Importer where it would set values of multiline columns as literal HTML code sometimes.
SharePoint Importer: Fixed an issue where during import the Comment column was set to value “Version checked in by migration-center” automatically.
Box Importer: improved performance and reliability when importing large files (several hundred MB)
Box Importer: fixed issue where the progress would indicate 100% for the entire duration of a large file being uploaded, with no actual progress being visible to the user
Box Importer: fixed an issue where the max number of threads allowed is set for the importer and wrong credentials are entered – this caused connections to the database to be opened but not closed, exceeding the maximum number of connections allowed by the database after some time.
FirstDoc Importer: Changed handling of application number from regulatory dictionary to work around issue where duplicate application numbers are used. See the migration-center 3.2.2 – FirstDoc Importer User Guide document for more information about this feature.
Documentum Content Validation compares the checksum it computes after import against a checksum computed during scan. Currently only the Documentum and Filesystem Scanner have the option of computing the required checksum during scan (Documentum Scanner: main content and renditions; Filesystem Scanner: only main content). Scanners for other source systems could be extended to also compute a checksum during scan that the Documentum Content Validator could use.
The SharePoint Importer and the FirstDoc Importer cannot function together due to some conflicting Java methods. Workaround: use only one of the adapters on a given machine by deleting the other adapter’s folder from the migration-center Server Components libs subfolder.
The same issues as for the main release 3.2 also apply to release 3.2.2.
As a major feature, migration-center 3.2 provides several new adapters for various source and target systems, listed below. The interaction between scanners and importers has also been reworked, so now scanners and importers are no longer required to exist in pairs in order to work together. Instead, any scanner/importer exists as a generic, standalone adapter that can be used in combination with any other available scanner/importer, allowing for any combination of source and target systems possible using the available adapters.
The adapters available for migration-center 3.2 are:
Documentum Scanner
eRoom Scanner
Filesystem Scanner
Database Scanner
Documentum Importer
Filesystem Importer
Box Importer
Alfresco Importer
SharePoint Importer
FirstDoc Importer
Filesystem-Documentum adapters have been discontinued and replaced by generic Filesystem and Documentum adapters respectively. The new adapters perform the same while at the same time allowing interaction between any scanner and importer.
Consult the individual adapters’ User Guide documents for detailed information regarding supported functionalities and systems.
migration-center 3.2 now requires the Java Runtime Environment 1.6. It will not work with older versions.
Filesystem-Documentum Scanner has been replaced by generic Filesystem Scanner which generates content able to be migrated using Documentum Importer or other importers to the respective target systems
Documentum Scanner is no longer connected to the Documentum Importer and generates content able to be migrated using other importers as well
Filesystem-Documentum Importer has been replaced by generic Documentum Importer which can accept and import content from any scanner, including filesystem or Documentum sources
Documentum Importer: improved performance for importing large numbers of Virtual Documents by up to 15% by skipping some unneeded processing
Documentum Scanner: content files are now written to the export location by using the Windows/DOS extension corresponding to the respective formats instead of the Documentum format name used previously. This is necessary to facilitate interaction between the Documentum Scanner and various importers targeted at other content management systems that rely on the Windows/DOS extension to work properly.
Documentum Scanner has a new “skipContent” parameter - enabling it skips extraction of the actual content, thus saving time during scan. The feature is intended to be used for testing or other special purposes; not recommended to enable for production use.
Filesystem Scanner can now detect changes to external metadata files as well instead of just the main content file and can add the updated metadata as an update to the object. This is useful for scenarios where the content is expected to change less frequently than the external metadata files.
Filesystem Scanner now also supports reading external metadata for folders, providing the same functionalities as for file metadata.
Filesystem Scanner has a new “ignoreAttributes” parameter – this allows defining a list of unwanted attributes that should be ignored during scan. This mostly applies to attributes coming from external metadata files or from extended metadata from the files’ contents
Filesystem Scanner has a new “ignoreWarnings” parameter – this allows scanning documents even if some of their metadata (owner or creation date) cannot be extracted or if the external metadata file is missing. Use only if that kind of information is not critical for the migration, and/or would otherwise cause too many files to be skipped.
Client/Transformation Engine: system attributes are no longer available to be associated on the “Associations” page of the “Transformation Rules” window. These are used automatically by the respective importers and did not need to be associated anyway
Client/Transformation Engine: set default rule to be pre-selected in the “Get type name from” drop-down list on the “Associations” page for all the different object types (user can of course change the selection if desired)
Documentum Scanner: fixed an isolated issue where a particular combination of scanner parameters, source attribute values and objects without content would prevent such objects from being scanned
Documentum Importer: fixed an issue where adding a new version of a document with fewer values for a repeating attribute than the previous version would retain the previous version’s surplus values for that attribute
Documentum Importer: fixed an issue where an underscore character (“_”) in a rendition file’s path was interpreted as a page modifier separator, leading to unexpected errors. Now the “_” character’s role as a page modifier separator correctly applies to a rendition’s filename only, not its entire path
Documentum Importer: fixed an issue where the CURRENT version label was not updated correctly in case an object and updates of that object were imported together
Documentum Importer: fixed an issue where updating folder objects during import would not clear empty attributes in case those attributes had values different from null before the update.
Filesystem Scanner: fixed an issue where the modification date for folders would not be scanned
Filesystem Scanner: removed “content location” value for folder objects added previously by scanner since folders do not have content of their own. This does not influence functionality.
Filesystem Scanner: fixed error extracting extended metadata from PDF files
Filesystem Scanner: fixed some discrepancies with object counts in log files when scanning updated objects
Transformation engine: a multi-value rule is now validated against the multi-value setting of the attribute it is associated with, so trying to set multiple values for an attribute defined as single value for example will now throw a validation error.
Client: fixed an issue where the Object History feature did not return results at all under certain circumstances
Client: fixed multiple minor issues related to focusing and scrolling in various grids/lists
Database: installation of the Oracle database schema can only be performed with the “sys” user
Database: multiple mc database schemata are not supported on the same database instance
Database installer: avoid running the database installer from folder locations containing “(“ or “)” characters in the path
Documentum Features: Cabinets: dm_cabinet objects currently cannot be represented as individual objects in mc, hence mc cannot store or set specific attributes for cabinets either
Documentum Features: Cabinets: mc cannot migrate empty cabinets (i.e. dm_cabinet objects with no other objects linked to it) for reasons stated above
Documentum Features: Relations: support for migrating dm_relation type objects is limited to relations between dm_document and dm_folder type objects (and their respective subtypes)
Documentum Features: Virtual documents: the snapshot feature of virtual documents is not supported
Documentum Scanner: for the DQL query to be accepted by the scanner it must conform to the following template: “select r_object_id from <dm_document|subtype of dm_document> (where …)”. Other data returned by the query and object types other than dm_document (or its subtypes) are not supported.
Documentum Scanner: the ExportVersions option needs to be checked for scanning Virtual Documents (i.e. if the ExportVirtualDocuments option is checked) even if the virtual documents themselves do not have multiple versions, otherwise the virtual documents export might produce unexpected results. This is because the VD parents may still reference child objects which are not current versions of those respective objects. This is not an actual product limitation, but rather an issue caused by this particular combination of Scanner options and Documentum’s VD features
Documentum Scanner: scanning dm_folder type objects using the DQL option is not supported currently.
Documentum Importer: the importer does not support importing multiple updates of the same object (or the base object together with any of its updates) in the same run. Each update of an object must be imported in a separate run (this is how it would happen under normal conditions anyway)
Update migration: Objects deleted from the source will not be detected and the corresponding target object will not be deleted (if already imported)
Update migration: whether a source object has been updated is determined by checking the i_vstamp and r_modify_date attributes; source objects changed by third party code/applications which do not touch these attributes might not be detected by mc
Client: the client application does not support multi-byte characters, such as characters encoded with Unicode. This does not affect the migration of such information, but merely limits the Client’s ability to display such values on screen; multi-byte characters are supported and processed accordingly by migration-center’s Database and Server components, which are the components involved in the actual data processing
eRoom Scanner: scanning updates of eRoom objects is not supported currently; this also applies to newly added versions to a previously migrated object. Only newly created objects will be detected and migrated. This is due to the way eRoom handles object IDs internally, which prevents mc from correctly detecting changes to versioned objects. The full functionality may be implemented in a future release.
Filesystem Importer: does not import folders as standalone objects. Folders will be created as a result of the path information attached to the documents though, so folder structures are not lost. The full functionality may be implemented in a future release.
Database Scanner: the database adapter does not extract content from a database, only metadata. Content can be specified by the user during the transformation process via the mc_content_location system attribute. It may be necessary to extract the content to the filesystem first by other means before migration-center can process it. Content extraction functionality may be implemented in a future release.
Attributes: The maximum length of an attribute name is 100 bytes
Attributes: The maximum length of an attribute value is 4000 bytes
Attributes: The maximum length of a path for a file system object is 512 bytes
Note: all maximum supported string lengths are specified in bytes. This equals characters as long as the characters are single-byte characters (i.e. Latin characters). For multi-byte characters (as used by most languages and scripts having other than the basic Latin characters) it might result in less than the equivalent number of characters, depending on the number and byte length of multi-byte characters within the string (as used in UTF-8 encoding)
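The difference between byte length and character count can be checked before a migration. The following is a minimal sketch in Python, assuming values are encoded as UTF-8; the helper names utf8_length and fits are hypothetical and not part of migration-center.

```python
# Limits quoted in the list above, in bytes.
MAX_ATTRIBUTE_NAME_BYTES = 100
MAX_ATTRIBUTE_VALUE_BYTES = 4000
MAX_PATH_BYTES = 512


def utf8_length(text: str) -> int:
    """Return the number of bytes the string occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits(text: str, limit_bytes: int) -> bool:
    """Return True if the UTF-8 encoding of text fits within limit_bytes."""
    return utf8_length(text) <= limit_bytes


latin = "a" * 100    # 100 characters, 1 byte each in UTF-8
umlauts = "ä" * 100  # 100 characters, 2 bytes each in UTF-8

print(utf8_length(latin), fits(latin, MAX_ATTRIBUTE_NAME_BYTES))      # 100 True
print(utf8_length(umlauts), fits(umlauts, MAX_ATTRIBUTE_NAME_BYTES))  # 200 False
```

Both strings contain 100 characters, but only the first one stays within the 100 byte limit for attribute names.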
Documentum Importer: During runtime, the importer creates a registered table for internal use; this table will not be deleted after the import process has finished because it might be used by other importers running concurrently on other Jobservers. Since an importer job running on one Jobserver does not know about any importers that may be running on other Jobservers, it cannot tell whether it is safe to delete the table, which is why it is left in place. This registered table does not store actual data; it acts as a view to data already stored by Documentum. It is safe to remove this registered table once the migration project is finished. The following query is used to create the registered table: register table dm_dbo.dm_sysobject_s (r_object_id char(16), r_modify_date DATE, r_modifier char(32), r_creator_name char(32))
Documentum Scanner: if several scanners are running concurrently and are scanning overlapping locations (e.g. due to objects being linked into multiple locations), a scanner might detect an object scanned earlier by another scanner as an update, although nothing has changed about that object. This has been observed in combination with relations attached to the objects. This will result in some redundant objects appearing as updates although they are in fact unchanged, but apart from this the final result of the migration is not affected in any way. The redundant objects will be handled like regular objects/updates and imported accordingly.
Documentum: If the group_permit, world_permit, owner_permit AND acl_domain, acl_name attributes are configured to be migrated together the *_permit attributes will override the permissions set by the acl_* attributes. This is due to Documentum’s inner workings and not migration-center. Also, Documentum will not throw an error in such a case, which makes it impossible for migration-center to tell that the acl_* attributes have been overridden and as such it will not report an error either, considering that all attributes have been set correctly. It is advised to use either the *_permit attributes OR the acl_* attributes in the same rule set in order to set permissions.
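Since neither Documentum nor migration-center reports this conflict, it can be useful to check the list of target attributes of a rule set by hand or with a small script. The following Python sketch only illustrates the check; the function name and the idea of passing the attribute names as a plain list are assumptions, not a migration-center feature.

```python
# Attribute groups named in the note above.
PERMIT_ATTRIBUTES = {"group_permit", "world_permit", "owner_permit"}
ACL_ATTRIBUTES = {"acl_domain", "acl_name"}


def mixes_permit_and_acl(target_attributes) -> bool:
    """Return True if both *_permit and acl_* attributes are present."""
    names = set(target_attributes)
    return bool(names & PERMIT_ATTRIBUTES) and bool(names & ACL_ATTRIBUTES)


# Hypothetical attribute lists of two rule sets.
print(mixes_permit_and_acl(["object_name", "acl_name", "acl_domain"]))    # False
print(mixes_permit_and_acl(["object_name", "acl_name", "world_permit"]))  # True
```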
Transformation engine: Transformation Rules window/Associations page: if object types are added and no attributes are associated for these object types all objects matching the respective types will be migrated with no attribute values set (except for system attributes handled automatically by Documentum). Avoid adding object types without associating all required attributes for the respective types.
Client: on rare occasions, some windows or parts of a window will flicker or fail to refresh their contents. To work around issues like these, use the window’s Refresh button, scroll the window left/right, or close and re-open the affected window if nothing else helps.
Client: when editing an object type’s definition and trying to change an attribute’s type the drop-down list will not appear on Microsoft Windows 7 systems unless miglient.exe is configured to run with “Disable visual themes” (the option can be set on the Compatibility page in the executable’s properties)
Client: trying to import a mapping list from a file will not work on Microsoft Windows 7 systems because the context menu containing the command will not appear unless miglient.exe is configured to run in Windows XP compatibility mode (the option can be set on the Compatibility page in the executable’s properties)
Client: After using "Export to CSV" the folder where the CSV file has been saved is still in use by migration-center Client
Scheduler: the scheduler may report runs that finished with warnings during import as having finished with errors instead. Make sure to check any scheduler history entries listed as “Finished with errors” to determine whether the cause is an actual error condition or merely a warning, which is not a critical condition.
Installer: The migration-client installer does not work with User Account Control enabled in Windows 7. Please either disable UAC for the duration of the installation, or if installation needs to be performed using UAC enabled, manually grant full access permissions for the required users on the installation folders afterwards
Database update from previous versions: as a result of the update process increasing the default size for attribute fields to 4000 bytes (up from the previous 2000 bytes), Oracle’s internal data alignment structures may fragment. This can result in a performance drop of up to 20% when working with updated data (apparent when transforming/validating/resetting). A clean installation is not affected, neither is new data which is added to an updated database, because in these cases the new data will be properly aligned to the new 4000 byte sized fields as it is added to the database.
Migration Set A migration set comprises a selection of objects (documents, folders) and set of rules for migrating these objects. A migration set represents the work unit of migration-center. Objects can be added or excluded based on various filtering criteria. Individual transformation rules, mapping lists and validations can be defined for a migration set. Transformation rules generate values for attributes, which are in turn validated by validation rules. Objects failing to pass either transformation or validation rules will be reported as errors, requiring the user to review and fix these errors before being allowed to import such objects.
Attribute A piece of metadata belonging to an object (e.g. name, author, creation date, etc.). Can also refer to the attribute’s value, depending on context.
Transformation Rules A set of rules used for editing, transforming and generating attribute values. A set of transformation rules is always unique to a migration set. A single transformation rule consists of one or several different steps, where each step calls exactly one transformation function. Transformation rules can be exported/imported to/from files or copied between migration sets containing the same type of objects.
Transformation Function Transformation functions compute attribute values for a transformation rule. Multiple transformation functions can be used in a single transformation rule.
Job Server The migration-center component listening to incoming job requests, and running jobs by executing the code behind the adapters referred by those jobs. Starting a scanner or importer which uses the Documentum adapter will send a request to the Jobserver set for that scanner, and tell that Jobserver to execute the specified job with its parameters and the corresponding adapter code.
Transformation The transformation process transforms a set of objects according to the set of rules to generate or extract attribute values.
Validation Validation checks the attribute values resulting from the Transformation step against the definitions of the object types these attributes are associated with. It checks to make sure the values meet basic properties such as data type, length, repeating or mandatory properties of the attributes they are associated with. Only if an object passes validation for every one of its attributes will it be allowed for import. Objects which do not pass validation are not permitted for import since they would fail anyway.
Mapping list A mapping list is a list of key-value pairs used to match a value from the source data (the key) directly to a specified target value.
The present document offers a detailed description of the Documentum security enhancements introduced with version 3.2.4 of migration-center.
The additional security features provide means for system administrators to define an additional layer of security between migration-center and Documentum repositories.
Some of the features offered are:
delegating access to users without having to give them superuser privileges and credentials directly on the respective repositories
controlling the type of operation a user is allowed to perform on a given repository (import and/or export)
controlling which paths a user can access to perform the above actions on a given repository
controlling where a user is allowed to export scanned documents to
limit the validity or duration of a security configuration by setting a date until which the configuration is considered valid
encrypting any passwords saved in the security configuration file
The implementation and configuration of the security enhancement features is transparent to end-users working with migration-center. End-users working with migration-center will be notified through on-screen messages if they trigger actions or use configurations which conflict with any enhanced security settings.
Usage of the enhanced security features is also optional. Removing or renaming the main security configuration file will disable the feature and will revert migration-center to its standard behavior when working with Documentum repositories (as described in the Documentum Scanner and Documentum Importer user guides).
The security features as well as this document are targeted specifically at system administrators managing Documentum Content Servers and access to the respective repositories, especially through migration-center.
System requirements are unchanged for the Documentum security enhancements feature. The same requirements as for using migration-center with the Documentum Scanner and Documentum Importer apply. For more information about general system requirements as well as supported Documentum versions and requirements please see the Installation Guide, Documentum Scanner user guide, and Documentum Importer user guide.
The Documentum security enhancements features are implemented as an additional, optional module which integrates with migration-center’s Documentum Scanner and Documentum Importer.
The presence of the Documentum security enhancements module will be detected by migration-center automatically, and if a valid configuration exists the settings within will apply every time a Documentum Scanner or Documentum Importer job is executed.
The Documentum security enhancements module is located in the <migration-center Server Components installation folder>/lib/mc-dctm-adaptor/security-config folder. This folder contains the code package, sample configuration file, and tools used by the feature.
There is one tool for encrypting passwords to be stored in the configuration file, and another tool for validating the configuration file’s structure. The configuration file itself is a human-readable and editable XML file. The tools and configuration file will be described in detail in the following chapters.
The Documentum security enhancements feature is disabled by default (after installation). Since the configuration of the features depends on the customer’s preferences and requirements, this configuration must be created first before the feature can work. Configuration is described in section Configuration.
The Documentum security enhancements feature is disabled by default and will become active only after a correct configuration file has been created and provided by the system administrator.
Should it be required to disable this feature after it has been configured and is in use, this can be achieved easily using either one of the below approaches:
Rename, move, or delete the <migration-center Server Components installation folder>/lib/mc-dctm-adaptor/security-config folder
Rename, move, or delete the mc-dctm-security-config.xml file located in the <migration-center Server Components installation folder>/lib/mc-dctm-adaptor/security-config folder
As always, consider backing up any files before making changes to them.
Changes to the security configuration file, deleting, renaming or moving it, will not affect currently running Documentum Scanner or Documentum Importer jobs. Changes will only take effect for jobs started afterwards
Since the Documentum security enhancements feature’s folder (<migration-center Server Components installation folder>/lib/mc-dctm-adaptor/security-config) can contain sensitive information, it is advised to secure this folder using the file system and network security and access policies applicable to such information for the environment where migration-center is deployed in order to prevent unauthorized access.
At the same time the user account used for running the migration-center Jobserver service must be allowed access to this folder, as any Documentum Scanner or Documentum Importer will run in the context of the migration-center Jobserver, thus requiring access to the <migration-center Server Components installation folder>/lib/mc-dctm-adaptor/security-config folder and its contents.
The configuration file controls all aspects of the Documentum security enhancement features. A valid configuration file needs to be provided by the system administrator configuring migration-center for the feature to work. As long as no valid configuration file exists, the feature will be disabled and will have no effect.
The configuration file must be named mc-dctm-security-config.xml and must be located in the <migration-center Server Components installation folder>/lib/mc-dctm-adaptor/security-config folder.
Since the exact configuration depends on the customer’s environment, a preconfigured file cannot be delivered with migration-center. Instead a sample file illustrating the structure and parameters available for configuration is delivered and can be used as a starting point for creating a valid configuration. The sample file is named mc-dctm-security-config-sample.xml and is located in the <migration-center Server Components installation folder>/lib/mc-dctm-adaptor/security-config folder.
The sample configuration file can be copied and renamed to mc-dctm-security-config.xml to create the base for the actual configuration file, which will be edited by a system administrator.
The sample configuration file is also listed in section Sample configuration file.
The configuration file is a human-readable XML file and can be edited with any text editor, or application providing XML editing capabilities. In order for the configuration file to be valid it will have to conform to XML standards in terms of syntax and encoding, something that should be considered when editing the file using a text editor. Migration-center will validate the file and refuse operation if the file structure or its contents are found to be invalid.
Special characters
Certain special characters need to be escaped according to the XML syntax in order to be entered and interpreted correctly as values. See below a list of these characters:
Character, followed by its escape in XML:
" (quote): &quot;
' (apostrophe): &apos;
< (less than): &lt;
> (greater than): &gt;
& (ampersand): &amp;
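When values such as paths or user names are generated by a script rather than typed by hand, the escaping can be left to a library. The following is a minimal Python sketch using the standard library function xml.sax.saxutils.escape; the helper name escape_xml_value and the sample value are only illustrations.

```python
from xml.sax.saxutils import escape

# escape() replaces &, < and > by default; the two quote characters
# are passed explicitly as additional entities.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def escape_xml_value(value: str) -> str:
    """Return value with the five special XML characters escaped."""
    return escape(value, QUOTE_ENTITIES)


print(escape_xml_value("/Public/R&D <drafts>"))
# /Public/R&amp;D &lt;drafts&gt;
```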
Encoding
The XML standard allows many different encodings to be used in XML files. Encodings affect how international characters are interpreted and displayed. It is recommended to use the UTF-8 encoding, which can represent all characters in the Unicode character set.
Note that the (mandatory) XML header tag <?xml version="1.0" encoding="UTF-8"?> does not set the encoding, it merely specifies that the current file is supposed to be encoded using UTF-8; the actual encoding used is not directly visible to the user and depends on how the file is saved by the application used to save it after editing. Please consult your text editor’s documentation about saving files using specific encodings, such as UTF-8.
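If the file is written by a script instead of a text editor, the encoding can be requested explicitly when saving. The following Python sketch shows one way to do this and to check afterwards that the stored bytes decode as UTF-8; the function names are hypothetical. Note that a clean decode only shows the bytes are valid UTF-8 (a plain ASCII file also passes), not which encoding the author intended.

```python
from pathlib import Path


def save_as_utf8(path: Path, xml_text: str) -> None:
    """Write xml_text to path, explicitly encoded as UTF-8."""
    path.write_text(xml_text, encoding="utf-8")


def is_valid_utf8(path: Path) -> bool:
    """Return True if the bytes stored in path decode cleanly as UTF-8."""
    try:
        path.read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```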
See the official source http://www.w3.org/standards/xml/ for more information about XML.
This section describes the structure of the Documentum Enhanced Security configuration file. It lists all available configuration elements, their allowed values, and whether the respective parameter is mandatory.
A <configuration> block defines an entire configuration block controlling access of one user to a repository. All other parameters which apply to that repository and user must be contained in the same <configuration> block.
Multiple repositories and users can be configured by creating multiple <configuration> blocks in the configuration file.
configuration Properties
Mandatory: yes
Can occur multiple times: yes
Attributes: repository_name, migration_user, action_allowed
Allowed values: None (contains other elements, not value)
Example:
<configuration repository_name="repo1" migration_user="external.user" action_allowed="scan">
…
…
…
</configuration>
repository_name is an attribute of a <configuration> element and defines the Documentum repository the current configuration block applies to.
repository_name Properties
Mandatory: yes
Can occur multiple times: no
Attributes: None (is an attribute)
Allowed values: String representing one valid Documentum repository name
Example:
<configuration repository_name="repo1" migration_user="external.user" action_allowed="scan">
…
…
…
</configuration>
migration_user is an attribute of a <configuration> element and defines the migration user who will be granted access to the repository. This user will need to be configured in any Documentum job (scanner or importer) meant to access the repository defined in repository_name.
migration_user Properties
Mandatory: yes
Can occur multiple times: no
Attributes: None (is an attribute)
Allowed values: String identifying a migration user
Example:
<configuration repository_name="repo1" migration_user="external.user" action_allowed="scan">
…
…
…
</configuration>
action_allowed is an attribute of a <configuration> element and defines what type of action a user is allowed to perform on the repository defined in repository_name. A user may be allowed to scan, import, or perform both actions on the repository.
action_allowed Properties
Mandatory: yes
Can occur multiple times: no
Attributes: None (is an attribute)
Allowed values: scan, import, both
Example:
<configuration repository_name="repo1" migration_user="external.user" action_allowed="scan">
…
…
…
</configuration>
effective_date is a child element of configuration. effective_date sets a date until which the current configuration is allowed to be executed. If the set date is exceeded when a job is started, the job will not be permitted to execute and the user will be notified about the configuration having expired.
effective_date Properties
Mandatory: yes
Can occur multiple times: no
Attributes: None
Allowed values: Valid date using the format YYYY-MM-DD
Example:
<configuration …>
…
<effective_date>2013-12-22</effective_date>
…
</configuration>
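The expiry rule described for effective_date can be illustrated with a short Python sketch. It assumes that the configuration is still valid on the effective date itself and expires on the following day; the source only states that the job is refused once the date is exceeded, so the exact boundary is an assumption, as is the function name.

```python
from datetime import date


def is_configuration_expired(effective_date: str, job_start: date) -> bool:
    """Return True if job_start lies after the YYYY-MM-DD effective_date.

    Assumption: the configuration is still valid on the effective date itself.
    """
    return job_start > date.fromisoformat(effective_date)


print(is_configuration_expired("2013-12-22", date(2013, 12, 22)))  # False
print(is_configuration_expired("2013-12-22", date(2013, 12, 23)))  # True
```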
super_user_name is a child element of configuration. super_user_name defines a Documentum superuser which will be used to connect to the repository defined in repository_name. The migration-center user running the job will not need to know the Documentum superuser’s name or password, this will only be known and used internally by migration-center.
super_user_name Properties
Mandatory: yes
Can occur multiple times: no
Attributes: None
Allowed values: String identifying a valid Documentum superuser
Example:
<configuration …>
…
<super_user_name>dmadmin</super_user_name>
…
</configuration>
super_user_password is a child element of configuration. super_user_password defines the password for the Documentum superuser which will be used to connect to the repository defined in repository_name. The migration-center user running the job will not need to know the Documentum superuser’s name or password, this will only be known and used internally by migration-center.
The password needs to be stored in encrypted form; see Password encryption tool for information about how to encrypt the password.
super_user_password Properties
Mandatory: yes
Can occur multiple times: no
Attributes: None
Allowed values: String defining the Documentum superuser’s password (in encrypted form)
Example:
<configuration …>
…
<super_user_password>DSzl1hrzj+yYMLOtxR5jlg==</super_user_password>
…
</configuration>
migration_user_password is a child element of configuration. migration_user_password defines the password for the migration user who will be granted access to the repository. This password will need to be configured in any Documentum job (scanner or importer) meant to access the repository defined in repository_name.
The password needs to be stored in encrypted form; see Password encryption tool for information about how to encrypt the password.
migration_user_password Properties
Mandatory: yes
Can occur multiple times: no
Attributes: None
Allowed values: String defining the migration user’s password (in encrypted form)
Example:
<configuration …>
…
<migration_user_password>DSzl1hrzj+yYMLOtxR5jlg==</migration_user_password>
…
</configuration>
allowed_paths is a child element of configuration. allowed_paths defines paths the migration user will be allowed to access in the specified Documentum repositories. The allowed_paths element is optional; if this element is omitted, access to all folders will be granted (whether access will actually succeed depends on the respective Documentum repositories permissions of course).
allowed_paths Properties
Mandatory: no
Can occur multiple times: no
Attributes: None
Allowed values: Elements of type <path>
Example:
<configuration …>
…
<allowed_paths>
<path>/Import</path>
<path>/Public/Documents</path>
<path>/User Cabinets</path>
</allowed_paths>
…
</configuration>
allowed_export_locations is a child element of configuration. allowed_export_locations defines local paths or UNC paths (network shares) where the migration user will be allowed to export content from Documentum repositories. This (these) path(s) will need to be used in the Documentum scanner’s configuration as exportLocation (please see the Documentum Scanner User Guide for more information about the Documentum Scanner and the exportLocation parameter).
allowed_export_locations Properties
Mandatory: yes when running a Documentum Scanner; not relevant when running a Documentum Importer
Can occur multiple times: no
Attributes: None
Allowed values: Elements of type <path>
Example:
<configuration …>
…
<allowed_export_locations>
<path>d:\mc\export</path>
<path>\\server\share\mc export folder</path>
</allowed_export_locations>
…
</configuration>
path is a child element of allowed_paths or allowed_export_locations. It specifies individual Documentum paths (when used with allowed_paths) or local file system paths or UNC paths (network shares) when used with allowed_export_locations.
path Properties
Mandatory: Depends on whether it is used with allowed_paths or allowed_export_locations. Please see the respective elements’ descriptions
Can occur multiple times: yes
Attributes: None
Allowed values: One string per <path> element representing a valid Documentum path (when used with allowed_paths) or local file system path or UNC path (when used with allowed_export_locations)
Example:
<configuration …>
…
<allowed_paths>
<path>/Import</path>
<path>/Public/Documents</path>
<path>/User Cabinets</path>
</allowed_paths>
<allowed_export_locations>
<path>d:\mc\export</path>
<path>\\server\share\mc export folder</path>
</allowed_export_locations>
…
</configuration>
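The mandatory attributes and elements described in this section can be summarized as a simple pre-check of a single <configuration> block. The following Python sketch is only an illustration and is not the verification tool shipped with migration-center. It assumes that the elements are direct children of <configuration> in any order, and that allowed_export_locations is needed whenever action_allowed is scan or both; the source only says it is mandatory when running a Documentum Scanner. All function and constant names are hypothetical.

```python
import xml.etree.ElementTree as ET

REQUIRED_ATTRIBUTES = ("repository_name", "migration_user", "action_allowed")
REQUIRED_CHILDREN = (
    "effective_date",
    "super_user_name",
    "super_user_password",
    "migration_user_password",
)
ALLOWED_ACTIONS = {"scan", "import", "both"}


def check_configuration_block(xml_text: str) -> list:
    """Return a list of problems found in one <configuration> block."""
    block = ET.fromstring(xml_text)
    problems = []
    for name in REQUIRED_ATTRIBUTES:
        if not block.get(name):
            problems.append(f"missing attribute: {name}")
    action = block.get("action_allowed")
    if action and action not in ALLOWED_ACTIONS:
        problems.append(f"invalid action_allowed: {action}")
    for name in REQUIRED_CHILDREN:
        if block.find(name) is None:
            problems.append(f"missing element: {name}")
    # Assumption: export locations are required whenever scanning is allowed.
    if action in ("scan", "both") and not block.findall("allowed_export_locations/path"):
        problems.append("no <path> in allowed_export_locations")
    return problems


# Sample block assembled from the example values used in this section.
SAMPLE = r"""
<configuration repository_name="repo1" migration_user="external.user" action_allowed="scan">
  <effective_date>2013-12-22</effective_date>
  <super_user_name>dmadmin</super_user_name>
  <super_user_password>DSzl1hrzj+yYMLOtxR5jlg==</super_user_password>
  <migration_user_password>DSzl1hrzj+yYMLOtxR5jlg==</migration_user_password>
  <allowed_paths>
    <path>/Import</path>
  </allowed_paths>
  <allowed_export_locations>
    <path>d:\mc\export</path>
  </allowed_export_locations>
</configuration>
"""

print(check_configuration_block(SAMPLE))  # []
```

A script like this can catch missing elements early, but the supplied verification tool and migration-center’s internal validation remain the authoritative checks.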
Passwords must be encrypted when saved to the configuration file. Encryption is performed using the supplied password encryption tool. The password encryption tool is located in the <migration-center Server Components installation folder>/lib/mc-dctm-adaptor/security-config folder and can be executed using encrypt-tool.
Usage:
Enter password
Confirm password
Press [Encrypt] button (password and confirmed password must match in order to proceed)
Password is encrypted and displayed
Copy encrypted password
Paste encrypted password in migration_user_password or super_user_password element of configuration file
The configuration file must be a valid XML file. For convenience a verification tool is provided to check the validity of the configuration file against an XML Schema Definition file.
Migration-center will also validate the file internally and refuse operation if the file structure or its contents are found to be invalid.
The configuration file verification tool is located in the <migration-center Server Components installation folder>/lib/mc-dctm-adaptor/security-config folder and can be executed using check-configuration.
Sample output for a valid configuration file
C:\Program Files (x86)\fme AG\migration-center Server Components 3.13\lib\mc-dctm-adaptor\security-config>check-configuration.cmd
********************************
* Configuration OK *
********************************
Sample output for an invalid (corrupted) configuration file
C:\Program Files (x86)\fme AG\migration-center Server Components 3.13\lib\mc-dctm-adaptor\security-config>check-configuration.cmd
********************************
* Configuration FAILED *
********************************
de.fme.mc.common.MCException: An error has occured when loading security configuration: org.xml.sax.SAXParseException; systemId: file:///C:/Program%20Files%20(x86)/fme%20AG/migration-center%20Server%20Components%203.13/lib/mc-dctm-adaptor/security-config/mc-dctm-security-config.xml; lineNumber: 5; columnNumber: 29;
The element type "effective_date" must be terminated by the matching end-tag "</effective_date>".
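The failure shown above is a well-formedness error (an element without its matching end tag). As a rough illustration of that class of error, the sketch below uses Python's standard XML parser to report the position of the first problem. It is not the check-configuration tool and does not validate against the XML Schema Definition; the fragments and the date value are made up for illustration.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        return False, f"line {line}, column {column}: {err}"
    return True, None


# Hypothetical fragments; the element content is made up for illustration.
GOOD = "<configuration><effective_date>2020-01-01</effective_date></configuration>"
BAD = "<configuration><effective_date>2020-01-01</configuration>"

print(check_well_formed(GOOD))
print(check_well_formed(BAD))
```

A file that passes this check can still be rejected by migration-center, since the supplied tool and the internal validation also check structure and contents.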
A sample configuration file is provided in <migration-center Server Components installation folder>/lib/mc-dctm-adaptor/security-config/mc-dctm-security-config-sample.xml.
The contents of this file are also listed below for reference.
The focus with migration-center 3.x is on providing a lightweight, yet highly functional, usable and extensible migration solution which can grow to match the requirements of present and future migration projects through customization and connectivity to other document and content management systems.
The current release is primarily a release that consolidates the features and adapters developed in the versions 3.2.x, which itself is a significant update to the initial version 3.0 with major new features, usability, performance, and reliability improvements. Besides the incremental improvements and fixes, migration-center also brings along an entire suite of new adapters available directly from fme; currently the full list of adapters for migration-center available from fme is:
Documentum Scanner
eRoom Scanner
Filesystem Scanner
Database Scanner
Outlook Scanner
SharePoint Scanner
Documentum NCC Scanner
Alfresco Scanner
Domino Scanner
Exchange Scanner
OpenText Scanner
Documentum Importer
DCTM Audit trail Importer
Alfresco Importer
Filesystem Importer
FirstDoc Importer
D2 Importer
DCM Importer
box Importer
SharePoint Legacy Importer
SharePoint Online Importer
SharePoint 2013 Importer
Documentum NCC Importer
InfoArchive Importer
OpenText Importer
A migration-center client 3.13 cannot be used to connect to a migration-center database 3.4, and a migration-center client 3.4 cannot be used to connect to a migration-center database 3.13.
Other adapters can of course be developed according to customer or project requirements.
Using migration-center will show why migration-center is “first in migration technology”.
As stated in the introduction, migration-center 3.x is a complete rebuild. The general concept and useful features from the previous version have been retained functionality-wise and improved in the current version. migration-center’s strengths remain:
The transformation of metadata through intelligent, rule-based mapping of objects
Incremental delta migrations for newly created and changed documents and folders
Identification of duplicates
One-time or regular batch migration of large volumes of documents and folders
Complete auditability, error reporting and error handling
1:1 migration of folder structures and mapping of objects within, including metadata
1:1 migration of relations (i.e. Virtual Documents, etc.)
The possibility to export and import type templates, transformation rules, object grids (export only) to CSV files.
Features that were never/rarely used have been removed, resulting in a cleaner, more user-friendly and productivity-oriented GUI.
Migration-center 3.x is not compatible with migration-center 2.x. Migrations initially started with migration-center 2.x should be completed before upgrading to the new version; a version upgrade halfway through a project is not possible.
From a technical point of view, migration-center is a distributed, client-server type application using a database management system as a backend. All these components interact to allow for a highly configurable, automated mass import of very large amounts of data from a source document/content management system into another content management system.
The solution centers on an Oracle database used for storing object related information like metadata, migration rules, and all configuration and status information necessary for managing and tracking the steps involved in the migration process.
All of the aforementioned information is defined and controlled via the migration-center Client, a GUI application which acts as the command and control center for migration-center.
Finally, there are the adapters which connect migration-center to various source and target systems via migration-center’s own API on one end and the respective system’s API(s) on the other end. These adapters which interact directly with the source and target systems are controlled by a service, which is also managed by the migration-center Client application (configuring and launching the appropriate jobs). There can be any number of adapters installed and managed with migration-center.
Multiple instances of the above components can be deployed across the network, even across geographical boundaries, and can work together on the same or multiple projects.
The actual content moved during a migration (files, documents, emails) is stored on the file system (local or remote) and does not require a specific migration-center component for managing.
Depending on the combination of adapters deployed in a particular migration project, the source and target systems supported can be same/different versions of the same solution or entirely heterogeneous systems sharing nothing but basic concepts of content management, like a content object with associated metadata.
Not all adapters illustrated in the above diagram are part of the standard product package.
Migration-center Client is a GUI application and represents the interface between the user and the migration-center core features and functionalities. All migration-center functionalities like creating, monitoring and managing jobs, tasks, and migration sets are accessible through the client.
The client does not perform any tasks by itself; it is merely a management interface. The Client application can therefore be disconnected or shut down safely at any time without affecting any tasks that might be running in migration-center at that time.
The client is a standard Windows application that can be installed anywhere, provided it has a network connection to migration-center’s central database, and the prerequisites installed (as described in the Installation Guide). Any number of Clients can be deployed across a network, and connect to the same, or different migration-center databases.
After starting the migration-center Client application, the database login window opens. The Client application is not operational by itself, since all information required to operate migration-center is stored in a migration-center database, which is why a connection to such a database is required at all times to use the Client. Type the username and the password, select the migration database from the list of available database connections and click [OK]. migration-center Client will remember the last database it connected to and will automatically reconnect to this database the next time it is started.
During installation a default user is created. If no other users have been created, use the default user account shown below for logging on to the migration-center database.
Username: fmemc
Password: migration123
The Job Server is a migration-center component installed with the migration-center Server Components; it runs as a Windows service and does not interact with the user. Its purpose is to execute various tasks in the background, like scanning or importing documents. The tasks running on a given Job Server can be monitored and managed by the user through the Client. A migration-center Job Server can be installed anywhere as long as there is a network connection to the migration-center database; several Job Servers can be deployed across the network and managed using migration-center Clients. Since the Job Server is the process actually accessing the source and target systems, the user account this service runs with should have the required permissions for proper access to the respective source and target systems.
The migration-center database contains most of the data processing logic and stores all the migration sets, including their objects, metadata and transformation rules.
As with the other components, at least one database needs to be installed for migration-center to work, but several databases can be deployed in the same environment, depending on the requirements.
Both the Client and the Job Server need to connect to a migration-center database to function.
This chapter details the features and functionalities available to the user through migration-center Client.
migration-center’s main application window follows the user interface conventions of typical Windows applications, offering a main menu bar and a toolbar for accessing general, frequently used features.
Major features and functionalities each open in their own child windows, providing toolbars and context menus that apply to the items displayed in that window.
Any combination of child windows can be opened at the same time; child windows can be tiled or cascaded.
Some windows (e.g. windows displaying properties for items, or message boxes informing the user or requesting input from the user) always display on top of other windows and stay there until they are closed by the user.
All the important main menu commands are represented in the toolbar of the main window, are self-explanatory and will be described in the next chapter - Main window toolbar. The menu commands which do not have a correspondent in the toolbar menu will be described here.
<migration-center> <Connect to…> Opens a dialog window where the user can connect to another migration-center database.
<migration-center> <Reconnect> Reconnects client to currently used migration-center DB
<migration-center> <Renew License> Allows entering a new license key
<migration-center> <About> Displays information about licensing and product versions
<migration-center> <Exit> Exits migration-center
<Manage> <Scanners…> opens the -Scanners- window for creating and managing scanner jobs
<Manage> <Migration Sets…> opens the -Migration Sets- window for creating and managing migration sets, setting up and testing transformation rules and validating the transformed objects
<Manage> <Importers…> opens the -Importers- window for creating and managing importer jobs
<Manage> <Jobs…> opens the -Jobs- window, which displays currently running jobs and allows the user to pause or stop running jobs if needed
<Manage> <Jobservers…> opens the -Jobservers- window for creating, editing, and deleting Job Server locations and monitoring their availability
<Manage> <Schedulers…> opens the -Schedulers- window, where fully automated, time scheduled migration runs can be configured and managed
<Manage> <Object Types…> opens the -Object Types- window, where the object type definitions can be imported and edited (these are required for associating transformation rules with the attributes of the target object types)
<Manage> <Global Mapping Lists> opens the -Global Mapping Lists- window, where global mapping lists can be created or imported. A global mapping is automatically available for use in any migration set.
<Manage> <Object History…> opens the -History- window, where it is possible to look up any object that underwent processing in migration-center. Key steps of the object’s lifecycle during the migration can be viewed, such as the object’s status, metadata and the exact timestamp at that moment.
The main toolbar provides quick access to frequently used functionalities. These correspond with the same functionalities available from the main menu.
[Scanners] opens the -Scanners- window for creating and managing scanner jobs
[Migration Sets] opens the -Migration Sets- window for creating and managing migration sets, setting up and testing transformation rules and validating the transformed objects
[Importers] opens the -Importers- window for creating and managing importer jobs
[Jobs] opens the -Jobs- window, which displays currently running jobs and allows the user to pause or stop running jobs if needed
[Jobservers] opens the -Jobservers- window for creating, editing, and deleting Job Server locations and monitoring their availability
[Schedulers] opens the -Schedulers- window, where fully automated, time scheduled migration runs can be configured and managed
[Cascade] Arranges all open child windows in a cascade
[Tile Horizontally] Tiles all open child windows horizontally
[Tile Vertically] Tiles all open child windows vertically
To input/output any data from/to migration-center there needs to be at least one location with a running Job Server defined.
The -Jobservers- window has its own toolbar and right-click context menu containing items that allow the user to create, edit, delete, and monitor the availability of the Job Servers running on the specified Job Server locations.
The location for a Job Server can be specified as the machine’s hostname or fully qualified domain name, if required, or simply as an IP address.
Note: Do not use loopback addresses or the “localhost” name.
The toolbar and the context menu of the -Jobservers- window contain the same commands.
[Properties] is used to change the configuration of an item in the list. The -Properties- window can also be opened by double-clicking an item in the list
[Refresh] is useful for rechecking the status of the locations listed
[New] opens a blank -Jobserver- properties window
[Copy] makes a copy (duplicate) of an item in the list
[Delete] deletes a location from the list
To create a new Job Server location:
Select <New> from the context menu or from the toolbar.
Enter the IP address or hostname of the machine the Job Server is running on.
Enter the port on which the Job Server on that machine was configured to listen.
The new location is displayed in the -Jobservers- window. Check the Status column to make sure the required Job Server is listed as Available and thus capable of accepting incoming connections. The Job Server on that particular location is now ready to be used by any input/output adapter.
The default port number is 9700, unless changed during setup.
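If a location is not listed as Available, a basic network check can help narrow down the cause. The sketch below, in Python, only tests whether a TCP connection to the given host and port can be opened. It is a generic reachability test under the assumption that the Job Server listens on a plain TCP port; it does not speak the Job Server's protocol and is not equivalent to the status shown in the Client. The function name is hypothetical.

```python
import socket

DEFAULT_JOB_SERVER_PORT = 9700  # default stated above; may differ per setup


def port_is_reachable(host, port=DEFAULT_JOB_SERVER_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `port_is_reachable("jobserver01.example.com")` would test the default port on a placeholder hostname. A result of `False` can point to a stopped service, a different port chosen during setup, or a firewall between the two machines.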
Creating a scanner and trying to save it without having any Job Servers defined will prompt the user to define one Job Server location and save the scanner with the newly created Job Server location.
Scanners are responsible for filtering and extracting metadata and content from the source system and inputting this information into the migration-center database. Different types of scanners can be available in a migration-center installation; different types of scanners also offer different configuration parameters, depending on the source system they connect to migration-center. All scanners provide the same basic functionalities, such as extracting documents and folders from the source system, extracting the metadata associated with these objects and storing it in the migration-center database, and adding system-specific information that allows migration-center to understand and process these objects correctly.
The toolbar and the context menu of the -Scanners- window contain the same commands.
[Properties] is used to change the configuration of an item in the list. The -Properties- window can also be opened by double-clicking an item in the list
[History] opens the -History- window where more information about all scans performed by the selected scanner can be viewed. The log files from these scans can also be accessed here.
[View Scanned Objects] will display a grid with the scanned objects in a new window, where these can be viewed, filtered, and exported to a CSV (comma separated values) file. This functionality does not allow any editing to be performed on the objects.
[Run] initiates the run of a scanner with the currently saved configuration
[Refresh] is useful for rechecking the “last run status” of the scanners listed
[New] opens a blank -Scanner- properties window
[Copy] makes a copy (duplicate) of an item in the list
[Delete] deletes a configured scanner from the list
Items in columns can be sorted (increasing/decreasing) by clicking on the column labels.
The -History- window displays various information about the selected scanner, such as the number of processed objects, the start and end time and the status of each run.
Double clicking a history entry or clicking the [Open] button on the toolbar with a selected history entry opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
The log files generated by a scanner are always saved to disk and can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components 3.13\logs
All scanners, regardless of type, share some common parameters:
Name: must be a unique name for the current scanner
Adapter Type: displays all adapters available with the current migration-center installation. The configuration parameters for the scanner depend on the adapter selected here.
Location: drop-down list for selecting one of the Job Server locations defined in the -Jobservers- window
Description: provides space to enter an optional description of what this scanner does or add some comments related to the scanner
In the History of a scan, or after the objects have been assigned to a Migration set the metadata of these objects can be viewed in the -Scanned Objects- window which can be opened with the [View Scanned Objects] button.
Choose which columns to display using the [Show/Hide Columns] button in the toolbar.
The [Export to CSV] button allows the user to export the entire grid in its current configuration (i.e. including any applied filters and visible columns) into a comma separated file. The CSV field separator, and the repeating attribute separator can be specified by the user before exporting the file.
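Because both separators are chosen at export time, any script that post-processes the exported file needs to be told which ones were used. The following Python sketch shows one way to read such a file and split repeating attributes into lists. The separators `;` and `|`, the column names, and the sample content are assumptions for illustration; the actual layout of an exported file depends on the visible columns and the chosen separators.

```python
import csv
import io

# Assumed separators; both are chosen by the user in the export dialog.
FIELD_SEPARATOR = ";"
REPEATING_SEPARATOR = "|"

# Hypothetical export content; column names are illustrative only.
SAMPLE_EXPORT = "object_name;keywords\nreport.doc;finance|2019\nmemo.doc;\n"


def read_export(text, multi_value_columns):
    """Parse exported rows, splitting repeating attributes into lists."""
    rows = []
    reader = csv.DictReader(io.StringIO(text), delimiter=FIELD_SEPARATOR)
    for row in reader:
        for column in multi_value_columns:
            raw = row.get(column) or ""
            row[column] = raw.split(REPEATING_SEPARATOR) if raw else []
        rows.append(row)
    return rows


rows = read_export(SAMPLE_EXPORT, ["keywords"])
print(rows)
```

Choosing a repeating attribute separator that does not occur in the attribute values themselves keeps the split unambiguous.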
Filtering attributes and (de)selecting columns is neither permanent, nor does it affect the actual data in any way. It only affects the current display and serves to analyze the data on display.
Migration sets are sets of individual objects but with a common set of transformation and validation rules applied to them. Operations like transformation, validation and import are always performed on such a set of objects and cannot be performed on individual objects that are not assigned to any migration set. Assigning the objects resulting from a scan to a migration set is therefore the first step towards making these objects operable for migration-center’s features.
It is possible to add multiple scans to one migration set, just as it is possible to split a scan containing a large number of objects across several migration sets.
It is also possible to filter the selection of objects before adding it to a migration set based on user specified selection criteria and conditions; objects which do not meet these criteria will not be added to the current migration set and will continue to be available and can be filtered further, added to another migration set, or left unassigned if they are not supposed to be migrated.
Buttons and context menu items:
[Edit Transformation Rules] opens the window where transformation rules are managed and rules are applied
[Apply Transformation] (press drop-down list/button combo) checks attribute model rules and applies the transformation rules to objects. After this step, misconfigured attributes and transformation rules will be pointed out.
[Apply Validation] (select from drop-down list/button combo) when validation is applied, all the attributes of all objects individually are compared against the limitations of the object type they belong to. For details on object type definitions, please see chapter Object Type definitions.
[Transform and Validate] (select from drop-down list/button combo) will perform both actions in sequence, automatically. Use this option once the transformation and validation rules are finalized and should be applied to migration-sets without requiring user intervention to launch the validation process once transformation has completed.
[Reset All] resets all objects in the selected migration set to their initial attribute values, essentially removing any processing performed during the transformation process and resetting status changes resulting from validation and import. The [Reset All] button is available in all views (-Source Objects-, -Processed Objects-, and -Error Objects-).
Imported objects will NOT be reset by default and require an extra checkbox to be checked prior to executing the reset command. This is for security reasons, because resetting imported objects will erase all information linking the source objects to the objects created by migration-center during import, therefore rendering migration-center unaware of having imported the affected objects at all.
Resetting imported objects would typically be used during testing phases of a project and not during production.
It is strongly advised NOT to reset imported objects in cases where dependencies on these objects may exist in other, even future, migration sets, such as newer versions of these objects or relations between these and other objects which haven’t been migrated yet; this also applies if it is intended to update these objects during future migrations.
[Stop] appears as a button only when the selected migration set is busy being transformed or validated, and only if these processes take longer than 3 seconds for the selected migration set.
[Refresh] refreshes data in all info-columns
[View All Objects] (press drop-down list/button combo) opens 3 windows: Source Documents, Processed Documents, and Error Documents.
[Source Objects] (select from drop-down list/button combo) displays all scanned objects attributed to the respective migration set. Original attributes and attribute values are displayed.
[Processed Objects] (select from drop-down list/button combo) displays the objects after they have been processed by applying the transformation rules.
[Error Objects] (select from drop-down list/button combo) displays objects that failed transformation/validation criteria, or failed to import, indicating the cause
[New] opens a blank -Migration Set- properties window
[Copy] makes a copy (duplicate) of an item in the list
[Delete] deletes a configured migration set from the list
Items in columns can be sorted by clicking on the column headers. Clicking the same column header repeatedly switches between ascending/descending sorting order.
In the lower part of the window, migration sets can be filtered by name. This can prove useful if there are a large number of migration sets, such as the migration sets generated automatically by repeated scheduled migration runs.
The migration set’s properties window opens when creating a new migration set and can be accessed for an existing migration set by double-clicking it or selecting the [Properties] button or menu item from the -Migration Sets- window’s toolbar or context menu.
The -Migration Set Properties- window also determines the selection of objects assigned to the migration set. The |Filescan Selection| page allows scans to be selected and added to the migration set, the |Exclude objects by values| page offers a simple method of excluding objects from the migration set based on selected metadata values, and the |Advanced Filters| page allows creating complex, linked filter expressions to filter the selection of objects using any attribute, value and various conditional operators.
The resulting set of objects can be previewed by clicking the [Preview objects] button on the right side to open the -Preview- window. The -Preview- window offers an overview of the objects resulting from the selection, and some basic functionality for analyzing these such as viewing attributes for individual objects, selecting the attribute to display or exporting this list to a CSV file.
Common controls
The buttons on the right side of the –Migration Set Properties- window are available on all tabbed pages of this window.
The [OK], [Apply] and [Cancel] buttons are self-explanatory and work as expected.
Trying to save a migration set using the [OK] or [Apply] buttons without having assigned any objects to it will display a warning message to the user. Although usually it doesn’t make sense to save a migration-set without having objects assigned to it, this can be used to define migration sets which will be used as templates, and create new migration sets by copying these. In such a case ignore the warning and save the “empty” migration set.
Otherwise define a selection of objects as described in the following chapters before saving the migration set.
The [Preview objects] button and preview functionality is available on all pages of the –Migration Set Properties- window.
If the selection is ok, the selected scans can be assigned to the migration set using the [Select objects] button in the upper right corner.
Note: Assigning objects to a migration set triggers various status changes and updates to the objects’ internal metadata, as well as evaluating any additional filter criteria the user may have specified. This can take some time, especially with large numbers of objects. Therefore, the user will be informed that the selection process is about to run in the background, and the Properties window will close. The selection progress can be monitored in the main -Migration Sets- window. While a selection process is running, the user can continue working with other migration sets or other parts of the application. Selection is a one-time-only operation and does not need to be performed again afterwards.
For a migration set that already has a selection of objects assigned, there will be a [Deselect objects] button instead of the [Select objects] button. This will free any objects assigned to the migration set but also remove any metadata generated through transformation rules and status information for objects. The information lost when deselecting objects from a migration set cannot be recovered. Use with care!
Deselecting objects from a migration set also involves various updates to the objects’ status and internal metadata, which can be a relatively time-consuming operation for large numbers of objects. The process will therefore be run in the background, similar to the procedure described above for selecting objects.
The |Properties| page
A migration set’s basic properties, including user specified properties like the migration set’s name, the type of object it contains, and an optional description can be set here. Other information managed by the system, such as creation or modification dates or the last operation performed on the migration set are also displayed.
The available object types depend on the adapters available with the respective migration-center installation and may vary. Always make sure to select the correct object type since the selection here determines what is displayed on the following pages of the properties window.
The |Filescan selection| page
This page allows the user to select and assign any available scan to the current migration set.
The upper list shows all scans that still have available objects (i.e. scans that haven’t been fully assigned to other migration sets), along with some relevant information. Move selected scans to the bottom list to assign them to the migration set by clicking the up/down arrow buttons or double-clicking a scan run. Multiple scans can be selected and assigned to the current migration set.
Only scans matching the object type set on the previous page (|Properties|) will be listed here. Check the object type selection if the desired scans are not displayed here.
The |Exclude objects by values| page
This page offers a simple way of excluding some objects from the selection based on their values for one or more attributes.
In previous versions of migration-center this page was called |Filters|. The functionality provided is the same in all product versions.
The list on the left displays all available attributes. Selecting an attribute displays a list of its distinct values in the Include list to the right. Moving any of these values to the Exclude list on the far right will exclude all objects with that value from the migration set. Move items between the Include/Exclude lists by double clicking or using the arrow buttons between the lists.
Attributes that have been used to exclude some of their values will turn bold to indicate their status. Remove any excluded values for such an attribute to revert it to its default (no exclusion) state.
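The effect of the Exclude list can be pictured as a simple value-based filter. The sketch below, in Python, is only a conceptual illustration and not how migration-center implements the feature; the attribute names and values are hypothetical.

```python
def exclude_by_values(objects, excluded):
    """Drop every object that has an excluded value for any attribute.

    objects:  list of dicts mapping attribute name -> value
    excluded: dict mapping attribute name -> set of excluded values
    """
    kept = []
    for obj in objects:
        if any(obj.get(attr) in values for attr, values in excluded.items()):
            continue  # at least one attribute carries an excluded value
        kept.append(obj)
    return kept


# Hypothetical scanned objects and exclusions.
objects = [
    {"name": "a.doc", "doc_type": "report", "owner": "alice"},
    {"name": "b.doc", "doc_type": "draft", "owner": "bob"},
    {"name": "c.doc", "doc_type": "report", "owner": "bob"},
]
excluded = {"doc_type": {"draft"}}

remaining = exclude_by_values(objects, excluded)
print([obj["name"] for obj in remaining])
```

With an empty `excluded` mapping every object is kept, which corresponds to the default (no exclusion) state described above.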
The |Advanced Filters| page
This page allows creating complex, linked filter expressions to filter the selection of objects using any attribute, value and various conditional operators. Use advanced filter expressions to specify exactly which objects should be assigned to the current migration set.
Note: Objects matching a filter expression will be added to the migration set, not excluded!
The feature works by adding filter expressions and grouping them with AND/OR operators. The list of expressions will be evaluated from top to bottom to determine the remaining subset of objects which can be assigned to the migration set.
An expression always consists of a source attribute, a logical operator or condition, and a user specified value. Individual expressions can be created, added to the list below (click the [Add] button to the right), and then moved up or down in the list or deleted using the respective [Up], [Down] and [Delete] buttons. Any number of expressions can be added.
Conditional operators for working with string, numeric, and date/time type values are provided.
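The top-to-bottom evaluation of linked expressions can be sketched as follows in Python. This is a conceptual model only: the operator names, the attribute names, and in particular the strict left-to-right combination of AND/OR (without AND-before-OR precedence) are assumptions of this sketch, not a description of migration-center's actual evaluation logic.

```python
import operator

# Hypothetical operator names; the Client's actual operator list may differ.
OPERATORS = {
    "equals": operator.eq,
    "greater than": operator.gt,
    "contains": operator.contains,  # contains(value, needle) == needle in value
}


def matches(obj, expressions):
    """Evaluate (connector, attribute, operator, value) tuples top to bottom.

    The connector of the first expression is ignored. Results are folded
    strictly left to right, which is an assumption of this sketch.
    """
    if not expressions:
        return True  # no filter defined: every object qualifies
    result = None
    for connector, attribute, op_name, value in expressions:
        current = OPERATORS[op_name](obj[attribute], value)
        if result is None:
            result = current
        elif connector == "AND":
            result = result and current
        else:  # "OR"
            result = result or current
    return bool(result)


# Hypothetical filter: (doc_type = report AND size > 100) OR title contains "urgent"
expressions = [
    (None, "doc_type", "equals", "report"),
    ("AND", "size", "greater than", 100),
    ("OR", "title", "contains", "urgent"),
]
objects = [
    {"name": "a", "doc_type": "report", "size": 500, "title": "q1"},
    {"name": "b", "doc_type": "memo", "size": 50, "title": "urgent memo"},
    {"name": "c", "doc_type": "memo", "size": 500, "title": "notes"},
]
selected = [obj["name"] for obj in objects if matches(obj, expressions)]
print(selected)
```

Objects for which `matches` returns `True` are the ones that would be added to the migration set; the others stay unassigned.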
One, two, or all three of these views (Source Objects, Processed Objects, and Error Objects) can be opened at any time for the current migration set.
The filter bar at the bottom of the window is common to all views and offers following functionalities:
Other features common to all views are the following toolbar items:
Choose which columns to display using the [Show/Hide Columns] button in the toolbar.
The [Export to CSV] button allows the user to export the entire grid in its current configuration (i.e. including any applied filters and visible columns) into a comma separated file. The CSV field separator, and the repeating attribute separator can be specified by the user before exporting the file.
Click [Refresh] to reload the content of the current view
Source Objects View
The Source Objects view always displays the unmodified state of the objects in the migration set, i.e. the state of the objects as they were scanned from their respective source system. This view does not provide any editing possibilities for the objects displayed, it serves as a permanent reference for checking the original metadata of these objects.
The Source Objects view also allows identifying and removing duplicate objects. Pressing the [View Duplicate Objects] button will display only the duplicates. Duplicates are identified by means of a content checksum and the objects’ creation date: of all the objects with an identical checksum, the one with the oldest creation date is considered the original, and all others are considered duplicates. This view can be used just to display the duplicates, or to select and remove them from the migration set using the <Remove documents from migration set> entry on the context menu.
The checksum is computed during scan and must be activated in the scanner. See the respective adapter documentation for more information about using the checksum option in the scanner.
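The duplicate rule described above (same checksum, oldest creation date wins) can be sketched in a few lines of Python. This is an illustration of the rule only, not migration-center's implementation; the field names and sample values are hypothetical, and how ties between identical creation dates are resolved is an assumption (here, the first object encountered is kept as the original).

```python
from collections import defaultdict
from datetime import datetime


def split_duplicates(objects):
    """Return (originals, duplicates) based on checksum and creation date."""
    by_checksum = defaultdict(list)
    for obj in objects:
        by_checksum[obj["checksum"]].append(obj)

    originals, duplicates = [], []
    for group in by_checksum.values():
        # Oldest first; the sort is stable, so ties keep their input order.
        ordered = sorted(group, key=lambda o: o["created"])
        originals.append(ordered[0])
        duplicates.extend(ordered[1:])
    return originals, duplicates


# Hypothetical scanned objects.
scanned = [
    {"id": 1, "checksum": "abc", "created": datetime(2019, 5, 1)},
    {"id": 2, "checksum": "abc", "created": datetime(2018, 1, 15)},
    {"id": 3, "checksum": "def", "created": datetime(2020, 2, 2)},
    {"id": 4, "checksum": "abc", "created": datetime(2019, 12, 31)},
]

originals, duplicates = split_duplicates(scanned)
print([o["id"] for o in originals])
print([o["id"] for o in duplicates])
```

Objects scanned without a checksum cannot be grouped this way, which is why the checksum option has to be activated in the scanner beforehand.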
Toolbar items
[View Attributes] displays all available (scanned) attributes of a single object. The previous and next objects can also be browsed using the appropriate buttons.
[Edit Transformation Rules] opens the -Transformation Rules- window.
Items in columns can be arranged (increasing/decreasing) by clicking on the column labels.
[View Duplicate Objects] identifies and displays duplicate objects.
Context menu items
<View Source Attributes> will display all attributes of that object in a small new window and allows browsing through the list object by object (back and forward).
<View Relations> will display the relations (if any) between the selected object and other objects. It is also possible to see the full details for each relation, including the objects’ IDs it related to, the status and type of that relation, etc.
<View Value> displays the selected cell’s value in a new window. This is particularly helpful when viewing details of a long value (like a folder path, for example), or a multi-value attribute with a large number of individual values. It is also possible to copy the value to the clipboard, or toggle wrapping of the displayed value in the -View Value- window.
The currently selected value can also be copied to the clipboard directly using the <Copy Value> context menu entry or the {Ctrl+C} keyboard shortcut.
<Filter by Value> will immediately filter the view using the selected attribute value. Use the [Clear Filter] button in the filter bar at the bottom to remove a filter and return to the full list.
Objects can be selected and removed from the migration set by selecting <Remove documents from migration set> from the context menu. Objects removed from a migration set become unassigned and are available to be added to another migration set or ignored if these objects are not supposed to be migrated.
Processed Objects View
In contrast to the Source Objects view, the Processed Objects view displays the processed state of the objects in the selected migration set. An object is considered processed if it has undergone transformation (and any operation following transformation, such as validation or import). This view displays the newly generated attribute values for any objects that have been successfully transformed, and the status the processed objects are currently in (i.e. Transformed, Validated, or (Partially) Imported).
If no transformed objects exist in the currently selected migration set (which implies that no validated or imported objects can exist either), this view will be empty. The view will also be empty if transformation rules are applied but all objects fail transformation (in this particular case all objects are displayed in the third view, Error Objects).
Compared to the -Source Documents- view, the -Processed Documents- window features a few more functions.
Transforming, validating, and resetting objects can be performed directly from this window using the appropriate toolbar buttons.
[Apply Transformation] (press drop-down list/button combo) checks attribute model rules and applies the transformation rules to objects. After this step, misconfigured attributes and transformation rules will be pointed out.
[Apply Validation] (select from drop-down list/button combo) compares all attributes of each object against the limitations of the object type the object belongs to. For details on object type definitions, please see chapter Object Type definitions.
[Apply Transformation & Validation] (select from drop-down list/button combo) will perform both actions in sequence, automatically. Use this option once the transformation and validation rules are finalized and should be applied to migration-sets without requiring user intervention to launch the validation process once transformation has completed.
For quick access to the source attributes of any object the -Source Attributes- window can be also accessed from the context menu.
Individual objects can also be reset to their default, unprocessed state using the <Reset selected documents> item from the context menu. Resetting an object removes any metadata generated through transformation rules, as well as processing and status information, from that object. This context menu entry differs from the [Reset all documents] button on the toolbar, which, as the name implies, operates on the entire set of objects instead of an individual selection of objects.
Imported objects will NOT be reset by default and require an extra checkbox to be checked prior to executing the reset command. This is for security reasons, because resetting imported objects will erase all information linking the source objects to the objects created by migration-center during import, therefore rendering migration-center unaware of having imported the affected objects at all.
Resetting imported objects would typically be used during testing phases of a project and not during production.
It is strongly advised NOT to reset imported objects in cases where dependencies on these objects may exist in other, even future, migration sets, such as newer versions of these objects or relations between these and other objects which have not been migrated yet; this also applies if it is intended to update these objects during future migrations.
The Processed Objects view offers some additional quick filter options below the common filter bar. Processed objects include any objects with the Transformed, Validated, Imported or Partially imported status; objects can be filtered quickly for any of these states using the appropriate checkboxes. Pressing the [Apply Filter] button is not needed; just (de)select the checkboxes corresponding to the required status. These quick filter options also work together with the common filter bar, acting as an additional filter criterion.
In the Processed Objects view it is also possible to edit the metadata of individual objects. The <Edit Attributes> item from the context menu opens a new window with the currently selected object’s attributes ordered as a list instead of columns. In the -Edit Attributes- window it is possible to edit any of the current object’s attributes manually. The -Edit Attributes- window also allows applying the changes, skipping the current document, or moving to the next document in the list without closing the window.
Any manually edited object will be reset to the Transformed status. This makes a difference for objects which have been validated already; since the user has edited these objects manually, the information the object has been validated with has changed and the object will be reset to the Transformed status and require validation using the new attribute values.
Error Objects View
The Error Objects view displays all objects that failed at some point during the migration. All states an object goes through during migration have a corresponding error state, such as Transformation Error, Validation Error, etc. The Error Objects view collects all objects in any state considered an error and provides a centralized location for tracking, identifying and correcting such objects. In addition to listing the error objects and their associated attributes, the view shows for every object the error message that caused the object to fail; if the error is related to particular attributes, those attributes will be colored red, and hovering over such an attribute will display an additional message about the cause of the error.
All this information makes it easy to identify and correct the error and transfer the object into the proper states allowing it to be migrated.
The Error Objects view provides the same functionalities as the Processed Objects window, since both views serve the same goal: to prepare objects to be migrated and successfully meet the required conditions.
Errors are not a final state in migration-center. If the cause of an error can be fixed, the affected objects can be corrected and a new attempt at migrating them can be started. This procedure can be repeated until all objects with errors have been fixed one way or another.
There are five basic ways of handling errors; one of them should always be able to solve typical errors encountered during migration:
If the error is caused by a transformation rule (objects in Transformation Error status), adjust the transformation rule to cover the faulty objects and re-apply it to those objects. Any objects which have already been transformed, validated or imported successfully will not be affected and will maintain their current status and metadata.
If the error cannot be fixed by adjusting transformation rules and the object is in the Transformation Error status, it may be necessary to edit objects manually. Use the <Edit Attributes> window to browse through the objects and edit the attributes manually, skipping any objects where manual editing does not apply.
If the error occurs during import due to external factors (objects in Import Error status), such as software or hardware issues, network issues, target system configuration problems, insufficient access permissions, etc., try to remedy the cause first, then add the migration set back to the importer and import it again. Migration-center will attempt to import all objects in the Import Error state automatically and will succeed if the cause of the error has been fixed.
If objects have failed Validation (objects in Validation Error status) check the Object Type definitions and Associations in the Transformation Rules window, make the necessary corrections and re-run the Validation process to validate these objects successfully.
Finally, if none of the above helps, or the failed objects do not need to be migrated at all, they can either be removed from the migration set or left in their respective error states (temporarily or permanently). Unless other objects depend on them, leaving unneeded objects in an error state or removing them from the migration set will not affect other objects.
The transformation rules are a set of rules that apply to all objects of a migration set, transforming the attributes of the objects inside. Transformation rules determine how the metadata provided with the source objects is going to be altered to generate new or modified attribute values, and how attributes from the source and target object models will be connected. A transformation rule consists of one or several transformation steps, where each step references one transformation function. The final transformation step will return the result for a rule.
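The relationship between a rule and its steps can be pictured with a small sketch. It is a simplified model under stated assumptions, not the actual transformation engine: `run_rule`, the step signature and the three example steps are hypothetical, and each step here is a plain Python callable that may read the source attributes or the results of earlier steps.

```python
def run_rule(steps, source_attributes):
    """Run the steps in order and return the result of the final step."""
    if not steps:
        raise ValueError("a rule needs at least one transformation step")
    results = []
    for step in steps:
        # Each step sees the source attributes and all previous step results.
        results.append(step(source_attributes, results))
    return results[-1]


# Hypothetical three-step rule: read an attribute, upper-case it, add a suffix.
rule = [
    lambda src, prev: src["object_name"],
    lambda src, prev: prev[0].upper(),
    lambda src, prev: prev[1] + ".pdf",
]
print(run_rule(rule, {"object_name": "report"}))  # REPORT.pdf
```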
The -Transformation rules- window can be opened by pressing the [Edit transformation] button in the -Migration Sets- window, or the [Edit rules] button in both the -View Processed documents- and -View Error documents- windows. The window can also be opened using the context menu item <Edit Transformation rules> in the three windows where the button is present.
The -Transformation rules- window has three tabs for managing different aspects of transformation and validation, and a set of common buttons accessible from all tabs. These buttons are:
[Generate attributes] will automatically generate the simplest transformation rules based on all source attributes found for all objects in the migration set. These simple rules would enable a 1:1 migration of the objects in the current migration set. Rules generated this way will behave just like any user-created rule afterwards, meaning they can be edited, renamed or deleted as needed. If rules with the same name already exist, they will not be affected. This function will be further discussed in the next chapter (Attributes).
[Export] will export the current migration set’s transformation rules to an XML file for the purpose of backing up or transferring the model to a migration set located on another DB.
[Import] imports transformation rules from an XML file created with the export function above. Warning! Importing rules from a file will replace all rules configured previously!
[Ok] saves changes and closes window.
[Cancel] cancels all changes that have not been applied yet and closes window.
[Apply] saves changes without closing the window.
The |Rules| page consists of four areas, namely Rules, Rules for system attributes, Rule properties and Transformation methods. The Rules and the Rules for system attributes areas work the same way: selecting a rule allows its properties and transformation methods to be edited using the controls on the right side of the window. The former allows the user to create, edit, and delete any rule. The latter does not allow rules to be added or deleted by the user; it only allows editing the existing rules, which are predefined system rules determined by the object type of the migration set (the object type selected when creating the migration set).
Attributes initially obtained by scanning (Documentum repository or file system) are called source attributes. They are used as the source for direct transformation or for building more complex transformation rules. Source attributes are accessible in the Transformation methods dynamic area.
Rules list
Rules are not immediately connected to any attribute or property of the target system. In the most general sense, rules have arbitrary names and use various source attributes as input, process these using the specified transformation functions and finally output the resulting value. An actual connection between these rules and the object types and their respective attributes is created on the Associations page, which will be discussed later. This generic approach allows migration-center’s Transformation engine to adapt to a variety of content management systems which use generic concepts such as type of object and attributes/properties.
The [Generate rules] button can be useful for starting work on transformation rules. This feature will generate rules based on any source attributes it can find among the objects from the current migration set. The rules will be named using the respective attribute names and will contain only a basic transformation function. Rules generated this way will behave just like any user created rule afterwards, meaning they can be edited, renamed or deleted as needed. If rules with the same name already exist, they will not be affected.
These simple transformation rules will extract the original attribute values without performing any processing on them and would be suitable for a 1:1 migration. For most real-world migration projects at least some of these transformation rules will require changes to make them suit the requirements though. It is of course possible to ignore this feature altogether and add rules one by one, according to the requirements and possibilities of the project.
Rules for system attributes
The system attribute rules work the same way as user-defined transformation rules. System rules are predefined rules and usually refer to key information that needs to be specified for the objects to be migrated and to work correctly in the respective target system. System attributes depend on the type of object used in the migration set (specified during creation of the migration set and displayed in the -Transformation Rules- window’s title bar). System rules cannot be renamed or deleted, and new system rules cannot be created by the user; only the transformation rules for the predefined system rules can be edited. The full functionality of transformation functions is available here, as for the user-defined rules.
The migration-center adapters targeted at Documentum provide the following system attribute rules:
“dctm_obj_link” specifies the path(s) where a document should be linked to in the target repository. “dctm_obj_link” is also an example for a system attribute which is defined by and used internally by migration-center and does not require association with an actual target attribute. This means there is no need for an actual attribute named “dctm_obj_link” to exist in either the source or target system.
The system attribute “dctm_obj_rendition” specifies the file system locations of the files which represent renditions. Zero or more renditions can be specified for any object and will be added by migration-center during import. This attribute is also an internal attribute used by migration-center, similar to “dctm_obj_link”.
“r_folder_path” and “r_object_type” can get their default information automatically (see [Generate attributes]) only inside a Documentum-to-Documentum migration set. In the migration set of a file system scan, this information must be entered by the user.
Rule properties
The Rule properties area allows specifying the name for a newly created rule or renaming an existing rule. Whether a rule can work with multiple values can also be set using the Multi-Value checkbox. These controls are disabled if a system attribute is selected, since system attributes cannot have their properties changed by the user.
Transformation methods
The Transformation methods area manages the transformation steps and functions used to perform the actual processing for the selected rule. Each step corresponds to a function. Any number of steps can be added to obtain the desired result for the current rule. At least one step must be added for a transformation rule to work.
The Function drop-down lists all available transformation functions. Select a function and add it as a step by clicking the [Insert] button. Functions will always be inserted last in the list. After inserting a function, it is possible to edit, reorder, or delete steps using the respective buttons.
Inserting a new function as a transformation step will immediately open the function’s properties for editing. Configure the function’s parameters accordingly and click [OK] to save the function and add it to the list.
Function properties can be edited at any time by selecting the corresponding transformation step and clicking the [Edit] button above. This will open the same properties window mentioned above and allows the user to make the required changes.
Select a function from the “Function” drop-down menu and hover the mouse pointer over the help icon (“?”) next to [Insert]; a tooltip will display more information about the selected function.
Error and Warning indicators for rules:
Migration-center always checks rules when saving and looks for invalid references, functions, or unused transformation steps. Rules will be colored red for “Error” or yellow for “Warning” as a result of this verification. Migration-center cannot correct such inconsistencies because the intended purpose of the affected rules is not known to migration-center. It is up to the user to check and correct any issues with rules marked red or yellow.
In the example below, the rule called “extension” has been marked yellow by migration-center’s rule auto verification feature.
The user can select that rule to check the cause of the warning. Once a rule marked as an Error or Warning has been selected the transformation rules will also be displayed in the rule’s properties. The actual transformation step causing the error/warning will also be marked red/yellow. In the example below, migration-center warns because there is an unused function in the list. This information is shown in a tooltip when hovering the cursor over the respective transformation step.
Warnings can usually be ignored and will not affect the result of the transformation rule. However, warnings should only be ignored if it really makes sense; e.g. in the example above the user may want to leave the unused step in there, and switch between the two steps for testing purposes. For production use though it is not advised to leave any unused transformation steps because these will be computed, needlessly adding to the time required for transforming the objects.
Rules marked red, indicating errors, are rarely encountered during normal usage. The highest chance for getting Errors is when importing or copying transformation rules into the current migration set; since the newly imported transformation rules may have been created based on another migration set, they may reference source attributes which do not exist among the current objects and this may raise an error because of the invalid reference. Errors can not only result from references to invalid source attributes, but also from invalid references to previously deleted transformation steps, or wrong parameter combinations for some functions.
Marking rules and transformation steps as errors works exactly the same as described above for warnings.
Errors should never be ignored because they will definitely affect, or even prevent transformation rules from working correctly.
Transformation functions provide the functionality behind transformation rules.
Custom functions can be implemented and used along the standard functions provided by migration-center.
Functions typically have multiple input parameters, with one of them usually being the value that needs processing, and output the processed value which may be entirely different from the source value, depending on what the function is designed to do and what parameters it has specified.
Input parameters can range from a reference to a source attribute or to the output of a previous transformation step, through a value typed by the user, to various function-specific parameters which determine how that function should process its input or output. The number of parameters varies by function. Some functions (like Sysdate) do not have any parameters at all, while other functions have three or four parameters.
A function’s parameter always has two properties: a type and a value. The parameter’s value is determined by the parameter’s type. Parameters can be of the following types:
Source Attribute: the value will be extracted for each object's attribute individually during the transformation process. For objects which do not have the specified source attribute no transformation will take place.
User Value: acts as a constant value and will be set for all objects to which the current rule applies.
Previous step: references the value returned by a previous transformation function from the current transformation rule and uses it as input. If an input value is multi-value, the index of the value to be processed can be selected or typed into the drop-down list on the far right. By default, all values are processed for a multi-value input
The Index selection (the small dropdown list to the far right) is for working with multi-value (repeating) attributes. By default, any function will operate on all values of a multi-value attribute, but it is possible to specify a particular value by setting its index (i.e. the n-th value of a multi-value attribute) in the Index dropdown list.
In case the parameter in question is referring to a single value attribute selecting an Index will have no effect.
The Index selection list offers the following options:
All is the default option and can be used only for the first parameter and not the subsequent parameters.
A specific index ranging from 1 to n (where n is the maximum number of values for the currently referenced multi-value attribute) can also be specified. In this case the current transformation function will only process the requested value and ignore all other values, effectively returning a single value output from a multi-value input.
The Index selection offers some more options when used with the If transformation function:
It is also possible to specify All as an option for the Return if true or Return if false parameters.
When All is used in the first parameter, the If function is called once for each value of the first parameter (as long as it has multiple values). The value at the corresponding index from Return if true or Return if false is returned after every call. This means the result will have the same number of values as the first parameter. This is the same behavior as in all other transformation functions when used with index All.
The value Any is exclusive to the If transformation function and can be used to define a condition based on a value whose index within a multi-value attribute is not fixed or is simply unknown to the user, and therefore cannot be specified using the All or 1 – n settings.
When Any or a specific index 1 – n is used for the first parameter, the function is called only once and will return all values specified in Return if true or Return if false. If All is used in a return parameter all values will be returned, otherwise only the value at the specified index will be returned.
Transformation function reference
All standard transformation functions provided with migration-center are described below. Full context-sensitive information is also available in the migration-center Client when inserting or editing any transformation function.
Function
Description
CalculateNewDate
CalculateNewDate() computes a new date by adding Years, Months and Days to a valid source date. The input value must follow the database’s datetime pattern. Negative integer parameters are accepted.
Example: CalculateNewDate('01.01.2001 12:32:05', 1,2,3) returns '04.03.2002 12:32:05'
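The arithmetic behind the example can be reproduced with a short sketch. It is not the function's real implementation: the pattern `DD.MM.YYYY HH24:MI:SS` is assumed from the example only, and clamping the day to the end of a shorter target month is an assumption made here, since the source does not describe that case.

```python
import calendar
from datetime import datetime, timedelta

# Assumed pattern, taken from the example value above.
FMT = "%d.%m.%Y %H:%M:%S"


def calculate_new_date(value, years, months, days):
    """Add years, months and days (negative values allowed) to a date string."""
    date = datetime.strptime(value, FMT)
    # Work in "months since year 0" so that month overflow carries into the year.
    total_months = date.year * 12 + (date.month - 1) + years * 12 + months
    year, month_index = divmod(total_months, 12)
    month = month_index + 1
    # Assumption: clamp the day if the target month is shorter (e.g. 31st -> 28th).
    last_day = calendar.monthrange(year, month)[1]
    shifted = date.replace(year=year, month=month, day=min(date.day, last_day))
    return (shifted + timedelta(days=days)).strftime(FMT)


print(calculate_new_date("01.01.2001 12:32:05", 1, 2, 3))  # 04.03.2002 12:32:05
```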
Concatenate
Concatenate() concatenates up to three string values into one and returns it as a single value. Example: Concatenate('AAA',' and ','BBB') returns "AAA and BBB".
IMPORTANT: The function returns only the first 4000 bytes of the resulting string since this is the maximum length allowed for an attribute value.
ConcatenateEach
Concatenates every value of the first input parameter with every value of a second parameter, resulting in a list with Count(Source Values 1) x Count(Source Values 2) elements.
Example: If Source Values 1 has the values "A", "B", "C" and Source Values 2 has the values "X", "Y" the result will be "AX", "AY", "BX", "BY", "CX", "CY".
If the result of any concatenation exceeds 4000 bytes, the function will throw an error.
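The result described above is the Cartesian product of the two value lists. A minimal sketch, assuming that the 4000-byte limit is measured on the UTF-8 encoding (the source does not state the encoding) and using a hypothetical `concatenate_each` helper:

```python
from itertools import product

MAX_BYTES = 4000  # limit stated in the text; UTF-8 measurement is an assumption


def concatenate_each(values1, values2):
    """Concatenate every value of values1 with every value of values2."""
    result = []
    for left, right in product(values1, values2):
        combined = left + right
        if len(combined.encode("utf-8")) > MAX_BYTES:
            raise ValueError("concatenated value exceeds 4000 bytes")
        result.append(combined)
    return result


print(concatenate_each(["A", "B", "C"], ["X", "Y"]))
# ['AX', 'AY', 'BX', 'BY', 'CX', 'CY']
```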
ConvertDateTimezones
ConvertDateTimezones converts a given date from one timezone to another. The accepted timezones are the ones provided by Oracle, so you can see them with the following query:
SELECT DISTINCT tzname FROM v$timezone_names;
Count Values
Counts the number of values of a repeating attribute. If the input is null or no values are provided, it returns 0.
GetDataFromSql
Maps one or multiple source attributes to data extracted from an external table. The external table must be located in the migration-center database in any schema that is accessible by the FMEMC user.
The user FMEMC must have "select" permission on that table. The query must return a single column and must have at least one and at most three parameters, which will be replaced at runtime with the values of parameter1, parameter2 and parameter3.
A parameter name in the query is any string that starts with : (colon). The number of parameters in the query must match the number of parameters set in the function.
If the query returns multiple values, only the first one will be taken into consideration by the transformation engine.
Example:
select user_id from mcextra.users_data where username = :username
Important Note: Since the SQL query is executed for each object in the migset, you should ensure that it is executed fast, i.e. the columns used in the where condition should be indexed.
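The lookup pattern (one named parameter, a single result column, only the first row used, an index on the filter column) can be tried out with the sketch below. It uses Python's built-in sqlite3 module and an in-memory table as a stand-in for the Oracle schema, so the schema prefix from the example query is omitted; the table contents, the index name and the `get_data_from_sql` helper are hypothetical, and returning `None` when no row matches is an assumption.

```python
import sqlite3

# In-memory stand-in for the external lookup table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users_data (username TEXT, user_id TEXT)")
# Index the column used in the WHERE condition, as the note above recommends.
conn.execute("CREATE INDEX idx_users_data_username ON users_data (username)")
conn.executemany(
    "INSERT INTO users_data VALUES (?, ?)",
    [("jdoe", "U-1001"), ("asmith", "U-1002")],
)

QUERY = "SELECT user_id FROM users_data WHERE username = :username"


def get_data_from_sql(connection, query, **params):
    """Run the query with named parameters and return only the first value."""
    row = connection.execute(query, params).fetchone()
    return row[0] if row else None  # None for "no match" is an assumption


print(get_data_from_sql(conn, QUERY, username="jdoe"))  # U-1001
```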
GetDateFromString
GetDateFromString() extracts a date type expression from any string if it matches the date expression specified by the user.
Use this function to extract non-standard or partial dates from source attributes like a filename.
Example: GetDateFromString('filename 2007-Dec 14. 16:11','YYYY-MON DD. HH24:MI') will identify and extract the non-standard date format contained within the input string
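A comparable extraction can be sketched outside the product as follows. The regular expression and the `strptime` format are hand-written equivalents of the pattern 'YYYY-MON DD. HH24:MI' and are assumptions of this example, as is the `get_date_from_string` helper; month abbreviations are parsed according to the current locale (English names in the default locale).

```python
import re
from datetime import datetime


def get_date_from_string(text, date_regex, strptime_format):
    """Find the first substring matching date_regex and parse it as a date."""
    match = re.search(date_regex, text)
    if match is None:
        return None
    return datetime.strptime(match.group(0), strptime_format)


found = get_date_from_string(
    "filename 2007-Dec 14. 16:11",
    r"\d{4}-[A-Za-z]{3} \d{1,2}\. \d{2}:\d{2}",  # shape of 'YYYY-MON DD. HH24:MI'
    "%Y-%b %d. %H:%M",
)
print(found)  # 2007-12-14 16:11:00
```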
GetPathLevel
GetPathLevel() understands strings representing a path and can extract specific levels from that path. The path separator character, the path level to start from and the path level up to which the function should extract the subpath can be specified as input parameters. The function will also strip leading and ending path separators from the result.
Example: GetPathLevel('/this/is/the/folder/structure','/','2','4') will parse the input string looking for "/" as the path separator, and return path levels 2-4, i.e. "is/the/folder"
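The level extraction in the example can be reproduced with a few lines. This is a sketch under the assumption that empty path segments (from leading, trailing or doubled separators) are simply dropped; `get_path_level` is a hypothetical helper name.

```python
def get_path_level(path, separator, start_level, end_level):
    """Return path levels start_level..end_level (1-based, inclusive)."""
    # Dropping empty segments also strips leading and trailing separators.
    levels = [part for part in path.split(separator) if part]
    return separator.join(levels[int(start_level) - 1:int(end_level)])


print(get_path_level("/this/is/the/folder/structure", "/", "2", "4"))
# is/the/folder
```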
GetValue
GetValue() is used to migrate attributes where the attribute's value is not supposed to be changed or to generate and set user defined values which are not present in the source data and cannot or have not been generated by other transformation functions. The GetValue() function always returns the exact same value it gets as input without altering it in any way.
Examples:
GetValue('user') outputs the string value "user" for all objects to which the current rule applies.
GetValue(filename[1]) outputs the value of the source attribute filename for all objects to which the current rule applies. For each object, the value will vary according to the actual value of that object's source attribute named filename.
GetValueAt
It gets the value at specific index from a multi-value attribute. Index counting starts with 1. If the provided index is out of range the function returns null.
Example: GetValueAt('a,b,c', 2) returns 'b' and GetValueAt('a,b,c', 4) returns null.
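The 1-based indexing with a null result for out-of-range indexes can be sketched as below. The multi-value attribute is modeled as a Python list and `get_value_at` is a hypothetical name; `None` stands in for null.

```python
def get_value_at(values, index):
    """Return the value at a 1-based index, or None if the index is out of range."""
    if 1 <= index <= len(values):
        return values[index - 1]
    return None


print(get_value_at(["a", "b", "c"], 2))  # b
print(get_value_at(["a", "b", "c"], 4))  # None
```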
GetValueIndex
Get the first index number of a given value for a multi-value attribute. If no value was found 0 is returned.
The parameter "ExactMatch" specifies if exact match will be used for comparing the values. Use '1' or 'T' for exact match and '0' or 'F' for "contains" search. In any case the search is case sensitive.
Example: GetValueIndex('abc,def,ghi', 'de', 'F') returns 2
GetValueIndex('abc,def,ghi', 'de', 'T') returns 0
GetValueIndex('abc,def,ghi', 'DE', 'T') returns 0
GetValueIndex('a,b,c,b', 'b', 'F') returns 2
If
If() evaluates a logical condition and returns different outputs depending on whether the condition is found to be true or false.
A previous transformation step from the current rule, a source attribute or a user specified string are all valid arguments for both input and output values as well as for the logical condition.
The If() function can correctly evaluate conditions based on various types of data such as strings, numbers, dates, null values, etc. and offers a number of predefined conditional operators.
Length
Calculates the length of the string using Unicode characters.
Ltrim
The Ltrim function removes characters from the left of the given Source String. It scans the Source String from its first character and removes all characters that appear in Characters to trim until it reaches a character that is not in the trim expression, and then returns the result. If the second parameter is empty, leading spaces are removed. Example: Ltrim('babcde','ab') will remove the first 3 characters, so the result will be 'cde'.
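Python's built-in `str.lstrip` treats its argument as a set of characters in the same way, so the behavior can be checked quickly. The `ltrim` wrapper is a hypothetical helper; mapping an empty second parameter to a single space character is this sketch's reading of the sentence about leading spaces.

```python
def ltrim(source, characters_to_trim=""):
    """Strip leading characters that occur in characters_to_trim."""
    # An empty trim expression removes leading spaces, as described above.
    return source.lstrip(characters_to_trim if characters_to_trim else " ")


print(ltrim("babcde", "ab"))  # cde
print(ltrim("   padded"))     # padded
```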
MapValue
MapValue() considers the input value a key, looks for a row with a matching key in a specified mapping list and returns the value corresponding to that key if a suitable match is found. Keys with no match can be reported as transformation errors (optional).
A mapping list must be defined before using a MapValue function. Mapping lists can be defined either on the Mapping lists tab in the Transformation Rules window of a migration set (case in which they would be available only to that particular migration set), or as a global mapping list (available to all migration sets) from the Manage menu in the main application window. Use the MapValue() function to define direct mappings of source attribute values to target attribute values based on simple key-value lists.
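Conceptually, a mapping list behaves like a key-value dictionary. The sketch below illustrates the lookup and the optional error reporting; the mapping contents, the `map_value` helper and its `report_missing` flag are hypothetical, and returning `None` for an unreported miss is an assumption since the source does not say what is returned in that case.

```python
def map_value(key, mapping_list, report_missing=True):
    """Return the mapped value for key; optionally report unmatched keys."""
    if key in mapping_list:
        return mapping_list[key]
    if report_missing:
        raise KeyError(f"no mapping found for key {key!r}")
    return None  # assumption: unmatched keys yield no value when not reported


# Hypothetical mapping list from source to target document types.
doc_types = {"memo": "internal_memo", "inv": "invoice"}
print(map_value("inv", doc_types))                          # invoice
print(map_value("xyz", doc_types, report_missing=False))    # None
```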
MultiValue_ReplaceNulls
MultiValue_ReplaceNulls() can replace null values in a multi-value attribute with a user defined value.
Example: MultiValue_ReplaceNulls(multi_value_input[all],'default') will replace all null values from the multi-value source attribute named "multi_value_input" with "default".
The function can also remove null values from a multi-value attribute if no replacement string is defined.
To use this function in a rule, the rule must be a multi-value rule. Example: MultiValue_ReplaceNulls(multi_value_input[all],'') will remove all null values from the multi-value source attribute named "multi_value_input", thereby reducing the total number of values for the multi-value attribute by the number of null values found.
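The two behaviours described above (replace nulls, or drop them when no replacement is given) can be sketched in Python as follows. This is only an illustration of the described semantics; the function name and the use of None for null values are assumptions.

```python
from typing import List, Optional


def replace_nulls(values: List[Optional[str]], replacement: str = "") -> List[str]:
    """Replace None entries, or drop them when the replacement is empty."""
    if replacement == "":
        # No replacement defined: remove null values, shrinking the list.
        return [v for v in values if v is not None]
    # Replacement defined: keep the number of values unchanged.
    return [replacement if v is None else v for v in values]


print(replace_nulls(["a", None, "b", None], "default"))
print(replace_nulls(["a", None, "b", None], ""))
```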
Multivalue_RemoveDuplicates
Remove duplicates from a multi-value attribute. If the input values are 'a', 'b', 'b', 'c', 'a' the result will be 'a', 'b', 'c'.
The function is available only for multi-value transformation rules.
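A minimal Python sketch of the described result, assuming the first occurrence of each value is the one that is kept (as the 'a', 'b', 'c' example suggests). The function name is hypothetical.

```python
def remove_duplicate_values(values):
    """Drop repeated values while keeping the first occurrence of each."""
    # dict.fromkeys preserves insertion order, so the original order survives.
    return list(dict.fromkeys(values))


print(remove_duplicate_values(["a", "b", "b", "c", "a"]))
```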
RemoveDuplicates
RemoveDuplicates() removes duplicate values from a delimited string.
Example:
RemoveDuplicates('DE|RO|IT|DE|P','|') will remove duplicates from the first string using the delimiter '|', so it will return "DE|RO|IT|P".
The function can be used in combination with RepeatingToSingleValue and SingleToRepeatingValues for removing duplicated values from a repeating source attribute.
Example:
#1 RepeatingToSingleValue(countries[all], '|')
#2 RemoveDuplicates(#1, '|')
#3 SingleToRepeatingValues(#2, '|')
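The three-step chain above can be mimicked in Python to see what each step produces. The helper functions below are stand-ins written for this illustration, not the product's functions, and the sample values are invented.

```python
def repeating_to_single_value(values, delimiter):
    """Step #1: join all values of a repeating attribute into one string."""
    return delimiter.join(values)


def remove_duplicates(text, delimiter):
    """Step #2: drop repeated items from a delimited string, keeping order."""
    return delimiter.join(dict.fromkeys(text.split(delimiter)))


def single_to_repeating_values(text, delimiter):
    """Step #3: split the string back into individual values."""
    return text.split(delimiter)


countries = ["DE", "RO", "IT", "DE", "P"]  # hypothetical repeating attribute
step1 = repeating_to_single_value(countries, "|")
step2 = remove_duplicates(step1, "|")
step3 = single_to_repeating_values(step2, "|")
print(step1, step2, step3, sep="\n")
```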
RepeatingToSingleValue
RepeatingToSingleValue() concatenates all values of the source attribute into one single string.
Optional parameters include the delimiter to be used (can be zero, one or multiple characters), the range of values which should be concatenated and a replacement string to be used in place of any NULL values the source may contain.
It is recommended to use a multi-value (repeating) attribute or previous step as source for this function.
Example:
RepeatingToSingleValue(keywords[all],'|') will return value1|value2|value3.
RepeatingToSingleValue(keywords[all],'|', 2, 3) will return value2|value3.
IMPORTANT: The function returns only the first 4000 bytes of the resulting string, since this is the maximum length allowed for an attribute value.
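The optional range, the NULL replacement and the 4000-byte limit can be sketched together in Python. This is an approximation under stated assumptions: the range is taken as 1-based and inclusive (matching the 2, 3 example), NULL values without a replacement are skipped, and bytes are counted in UTF-8. None of these details are confirmed by the product documentation.

```python
MAX_BYTES = 4000  # limit stated above


def repeating_to_single_value(values, delimiter="", start=None, end=None,
                              null_replacement=None):
    """Join a range of values into one string, truncated to MAX_BYTES."""
    first = 1 if start is None else start          # assumed 1-based
    last = len(values) if end is None else end     # assumed inclusive
    parts = []
    for value in values[first - 1:last]:
        if value is None:
            if null_replacement is None:
                continue  # assumption: skip NULLs when no replacement is set
            value = null_replacement
        parts.append(value)
    joined = delimiter.join(parts)
    # Cut at the byte limit and drop any incomplete trailing character.
    return joined.encode("utf-8")[:MAX_BYTES].decode("utf-8", errors="ignore")


keywords = ["value1", "value2", "value3"]
print(repeating_to_single_value(keywords, "|"))
print(repeating_to_single_value(keywords, "|", 2, 3))
```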
ReplaceStringRegex
ReplaceStringRegex() replaces the parts of the input value that match the regular expression specified by the user with a user-defined value. Example: ReplaceStringRegex('AAAAA-CX-9234-BBBBB','\w{2}-\d{4}','AB-0000') will parse the input string looking for a match; according to the regex this would be a sequence of two word characters followed by a dash and four digits. Since the input does contain a matching part, it will be replaced with "AB-0000", and the final output of the function will be "AAAAA-AB-0000-BBBBB".
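The example can be reproduced with Python's re module to check a pattern before using it in a rule. Note that this uses Python's regex dialect, which may differ in details from the engine used by migration-center; the wrapper function is hypothetical.

```python
import re


def replace_string_regex(value: str, pattern: str, replacement: str) -> str:
    """Replace every part of value that matches pattern."""
    return re.sub(pattern, replacement, value)


print(replace_string_regex("AAAAA-CX-9234-BBBBB", r"\w{2}-\d{4}", "AB-0000"))
```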
Rtrim
The Rtrim function removes characters from the right of the given Source String. Starting at the last character, it removes every character that appears in the Characters to trim, stops at the first character that is not in the trim expression, and returns the result. For example, Rtrim('cdebab','ab') will remove the last 3 characters, so the result will be 'cde'. If the second parameter is empty, the trailing spaces will be removed.
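The trimming behaviour described for Ltrim and Rtrim (a set of characters, not a fixed prefix or suffix) corresponds to Python's str.lstrip and str.rstrip. The sketch below is an analogy only; the wrapper names are hypothetical.

```python
def ltrim(source: str, chars: str = "") -> str:
    """Strip any of chars from the left; strip spaces if chars is empty."""
    return source.lstrip(chars if chars else " ")


def rtrim(source: str, chars: str = "") -> str:
    """Strip any of chars from the right; strip spaces if chars is empty."""
    return source.rstrip(chars if chars else " ")


print(ltrim("babcde", "ab"))
print(rtrim("cdebab", "ab"))
```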
SingleToRepeatingValues
SingleToRepeatingValues() separates a string value based on a user-specified separator character and returns all resulting values as a multi-value result. Use this function to transform a string of comma-separated values into a multi-value attribute with multiple individual values. To use this function in a rule, the rule must be a multi-value rule. Example: SingleToRepeatingValues(comma_separated[all],',') will parse the source attribute named "comma_separated" looking for commas (","), strip the commas and create a multi-value list from the resulting values.
SplitStringRegex
SplitStringRegex() is an advanced function for splitting up a string value by specifying the separator as a regular expression rather than a single character (the regex can represent a single character as well). Depending on the number of matches for the specified separator, multiple substrings can result from the function; which one of the resulting substrings the function should return can also be specified by the user.
Example: SplitStringRegex('one-(two)-three','(-\()|(\)-)','2') will split the string into substrings based on the separators described by the regex and return the second substring which is "two"
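The example can be traced in Python as follows. The sketch collects the text between separator matches by hand, because Python's re.split would also return the captured groups of this particular regex. The function name and the 1-based index are assumptions drawn from the example above.

```python
import re


def split_string_regex(value: str, separator_regex: str, index) -> str:
    """Split value at every separator match and return the index-th part."""
    parts = []
    position = 0
    for match in re.finditer(separator_regex, value):
        parts.append(value[position:match.start()])
        position = match.end()
    parts.append(value[position:])
    return parts[int(index) - 1]  # assumed 1-based, as in the example


print(split_string_regex("one-(two)-three", r"(-\()|(\)-)", "2"))
```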
Substring
Substring() returns part of the input string. Which part of the string should be returned can be specified as a number of characters starting from a given index within the input string.
Example: Substring('teststring','3','5') returns 5 characters starting with the 3rd character, which is "ststr"
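In Python terms, the 1-based start index and the length translate into a slice. The function below is a hypothetical stand-in used to verify the example.

```python
def substring(value: str, start, length) -> str:
    """Return length characters starting at the 1-based position start."""
    begin = int(start) - 1
    return value[begin:begin + int(length)]


print(substring("teststring", "3", "5"))
```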
SubStringRegex
SubstringRegex() is an advanced transformation function for extracting a substring from the input value. A regular expression can be used to extract a complex substring from the input string, such as a particular name, a formatted number sequence, a custom date expression, an email address, etc. SubstringRegex('0123abc 4567 ',' \d{4} ') will return " 4567 " according to the regex defining the substring.
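A quick way to try out such a pattern is Python's re.search, shown below. Returning None when nothing matches is an assumption of this sketch, and Python's regex dialect may differ in details from the one used by migration-center.

```python
import re


def substring_regex(value: str, pattern: str):
    """Return the first part of value that matches pattern, or None."""
    match = re.search(pattern, value)
    return match.group(0) if match else None


print(repr(substring_regex("0123abc 4567 ", r" \d{4} ")))
```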
Sysdate
Sysdate() outputs the current system date as a value. Use this function to track the date when a document underwent transformation in migration-center. This function does not have any properties.
ToLowerCase
ToLowerCase() transforms all characters from the input string value to lowercase characters.
ToUpperCase
ToUpperCase() transforms all characters from the input string value to uppercase characters.
Mapping lists can be used as an alternative means of transformation in case the desired result cannot be obtained using transformation functions. Usually mapping user names or permissions falls under this type of data, since there is no way to create a transformation rule which maps a list of several hundred users or permissions onto their corresponding new values individually; mapping lists can of course be applied to any attribute, not just the ones mentioned in the example.
Mapping lists are specific to one migration set and are only available for use within the migration set where they are defined. It is also possible to define so called global mapping lists which are available in all migration sets.
Mapping lists are simple 2-column lists and work based on a key-value concept, where the key is specified in the first column, and the associated value in the second column. If the MapValue transformation function is used in a transformation rule with a given mapping list and matches a key from the mapping list to its input value, it will output the corresponding value from the mapping list as the result.
A mapping list must be referenced by a “MapValue()” transformation function in at least one transformation rule on the |Rules| page to have any effect on the objects of the migration set. A mapping list will not do anything by itself.
“Exact Match” will require an exact match with the specified key if checked and will accept any string as a match as long as it contains the key if unchecked. It is recommended to check this option; uncheck it only if needed.
“Case sensitive” will consider a match valid only if the case of the input value matches the case of the key. Usually this option is not needed; enable it only if required.
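The effect of the two options can be sketched with a small Python lookup. This is an illustration of the behaviour as described, not the product's code; the function name, the mapped values and the choice to return None for unmatched keys are assumptions.

```python
def map_value(value, mapping, exact_match=True, case_sensitive=False):
    """Return the mapped value for the first matching key, else None."""
    for key, mapped in mapping.items():
        k, v = (key, value) if case_sensitive else (key.lower(), value.lower())
        # Exact match compares whole strings; otherwise the input only
        # needs to contain the key somewhere.
        if (v == k) if exact_match else (k in v):
            return mapped
    return None


# Hypothetical mapping list: key in the first column, value in the second.
opened_by = {"mp3": "media player", "doc": "word processor", "pdf": "PDF reader"}

print(map_value("pdf", opened_by))
print(map_value("PDF", opened_by, case_sensitive=True))
print(map_value("report.pdf", opened_by, exact_match=False))
```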
Mapping lists can be edited in place if needed, but this is practical only for short lists or for making minor corrections to an existing list. Usually a mapping list would be prepared using an appropriate application such as a spreadsheet, then copied and pasted into migration-center via the context menu available when right-clicking in the list area.
The example above illustrates using a mapping list with the “MapValue()” function. Here the mapping list “opened by” is used through the “MapValue()” function in the transformation rule “extension to OPENED BY”. The window below shows the properties for this function.
Migration-center will interpret this transformation rule as follows:
In all objects of the migration set look up the source attribute “extension” and try to match its value to any of the following “mp3, doc, pdf, ppt”; if a match is found, replace the source value with the value corresponding to the matching key specified in the mapping list named “opened by” and output this value as the result.
Global Mapping Lists
Global Mapping Lists are mapping lists independent of any particular migration set. They are thus globally available to all migration sets across the entire application. Use global mapping lists for information which will be used by multiple/all migration sets. Global Mapping Lists can be managed from the main menu <Manage>, <Global Mapping Lists>. The functionalities and properties for creating, copying, deleting, and editing global mapping lists are identical to the ones provided for regular mapping lists, described above.
When using the “MapValue()” function, both the “local” (migration set specific) and the “global” mapping lists are available for use. Global mapping lists can be identified by the word “(global)” appended to their name.
On the |Associations| page, existing transformation rules can be associated with actual object types and attributes from the target system. These associations provide a means to validate the values generated by the transformation process and tell the importer how to assign these values during import.
Types
To create a connection from a transformation rule to a target object type there needs to be one transformation rule which outputs the object type’s name. Select this rule from the “Get type name from attribute” drop-down list. For Documentum, “r_object_type” is a predefined system rule which serves this purpose, but there is no restriction to select a particular rule; any rule will work if it results in something which matches object type names and is not null.
From the “Available type definitions” drop-down list, add the required object type definitions below by pressing [Add]. The object type definitions displayed depend on what is available in the -Object Type Definitions- window. If a required object type definition is not available for creating an association here, click [OK], add the object type definition, and come back to this page to create an association with it (see chapter 7.8 Object type definitions for details on adding object type definitions to migration-center).
Associations
Selecting an object type definition will populate the “Target attribute” drop-down list with the attributes of the selected type. The “Rule” drop-down list shows the rules defined on the |Rules| tab. By matching pairs from these two controls and clicking [Add Association], rules can be associated with corresponding attributes. A rule and a target attribute can be associated only once per type.
In case the rules already have names that match attributes of the targeted object types, there is a much quicker way for creating these associations: just click [Auto-Associate]. This button will automatically match rules and attributes by name and add them to the list of associations. Any unneeded rules created this way can be deleted afterwards, and any rules that could not be associated automatically can still be added using the regular method described above.
Note: An association always applies to the currently selected object type in the list, not all of them! If there are multiple object types targeted for a migration, the manual or automatic association procedure described above needs to be done for each object type.
Import refers to the process of transferring content and metadata obtained using the transformation rules to the target system. The importer is the job that actually performs this task of outputting data from migration-center or importing data into the target system.
Importers migrate one or more migration sets. Only migration sets with objects that have successfully passed both transformation and validation can be imported.
Similar to a scanner, an importer is a job with a unique name and a set of configuration options. It is started by the user, and the corresponding adapter code is executed on the specified Job Server.
Note: for details about the importers provided with the standard product consult the appropriate documents listed in 1.3 References. For details about customized importers or importers provided by third parties please consult the documentation delivered with the respective adapters.
The toolbar and the context menu of the -Importers- window contain the same commands.
[Properties] opens the configuration details of that particular importer. The -Importer- properties window can also be opened by double-clicking an item in the list.
[History] opens the -History- window where more detailed data about all imports performed by the importer in question can be accessed. The logs of the imports can be accessed here.
[Run] initiates the run of an importer with the currently saved configuration.
[Refresh] is useful for rechecking the “last run status” of the importers listed.
[New] opens a blank -Importer- properties window.
[Copy] makes a copy (duplicate) of an item in the list.
[Delete] deletes a configured importer from the list.
Items in columns can be sorted (increasing/decreasing) by clicking on the column labels.
The -History- window displays various information about the selected importer, such as the number of processed objects, the start and end time and the status of each run.
Double clicking a history entry or clicking the [Open] button on the toolbar with a selected history entry opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
The log files generated by an importer are always saved to disk and can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <version>\logs
All importers, regardless of type, share some common parameters:
Name: must be a unique name for the current importer
Adapter Type: displays all adapters available with the current migration-center installation. The configuration parameters for the importer depend on the adapter selected here.
Location: Job Server locations defined in the -Jobservers- window can be selected from this drop-down list
Description: provides space to enter an optional description of what this importer does or add some comments related to the importer
A Scheduler acts as a master process which manages and runs an end-to-end migration fully automated and according to a set schedule.
A scheduler is also a job, and will use preset scanners, transformation rules and importers to extract, process, and migrate objects to the target system automatically. A full history is available for each run, listing both objects that have been imported successfully and any objects which may have failed during the process.
There are a number of configuration options regarding the times when such an automated migration should run, ranging from minutes to months, days of the week, or hours of the day. This makes it possible to tailor a schedule according to the needs and possibilities of the project and avoid running performance intensive migration during working hours for example.
The toolbar and the context menu of the -Schedulers- window contain the familiar items: New, Properties, Delete, Run, Refresh.
Double-clicking a scheduler in the list has the same function as the <Properties> menu command.
Items in columns can be sorted (increasing/decreasing) by clicking on the column labels.
A scheduler can be run immediately if needed, regardless of its actual schedule, through the <Run> command found in the toolbar and the context menu.
An email report can be scheduled to be sent to a certain email address in case of success or failure of the scheduler processes. For this a valid SMTP-Server and email address must be provided, and at least one of the two checkboxes flagged.
Migration-center needs to know the properties of the object types targeted during migration to correctly associate transformation rules with the attributes from these object types and also for validating the values generated by transformation functions. These target object types can be managed in migration-center Client in the -Object Types- window. The window can be opened from the main menu <Manage> - <Object Types…>.
All target object types in use should be defined before attempting to create associations or validate the objects in a migration set. For creating and testing transformation rules it is not necessary to have object types defined at that point.
Out-of-the-box the product comes with basic object type template definitions for the adapters the customer has acquired licenses for.
Additional user/custom object types can be added to migration-center by the user.
To create a new object type template, click the [New] button in the Object Types toolbar or <New> in the context menu inside the same area. Enter the template’s name in the “Name” field; the name should be exactly the same as the actual name of the object type. The [Delete] button
In the list below the individual attributes and their properties can be defined. Since the process of defining all types and attributes by hand would most likely be too time-consuming it is possible to import an object type directly from a properly formatted CSV file. Object type definitions can be imported using the [Import] button located in the Object templates dynamic area. The imported file needs to be a CSV file (comma separated value) similar to the following example, with each row containing attribute name, data type, minimum length, maximum length, repeating flag, mandatory flag, and an optional regular expression:
id,2,0,0,0,0
first_name,2,0,0,0,0
last_name,2,0,0,0,0
email,2,0,0,0,0
gender,2,0,0,0,0
ip_address,2,0,0,0,0
animal_name,2,0,0,0,0
Valid values for data type are: 0 - Boolean, 1 - Integer, 2 - String, 3 - ID, 4 - Date and Time, 5 - Double.
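To check such a file before importing it, the rows can be parsed with a few lines of Python. This is a sketch under assumptions: the flags are taken to be 0 or 1, the regular expression is an optional seventh column, and the class and function names are invented for the example.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional

# Data type codes as listed above.
DATA_TYPES = {0: "Boolean", 1: "Integer", 2: "String",
              3: "ID", 4: "Date and Time", 5: "Double"}


@dataclass
class AttributeDefinition:
    name: str
    data_type: str
    min_length: int
    max_length: int
    repeating: bool
    mandatory: bool
    regex: Optional[str] = None


def parse_object_type_csv(text: str) -> List[AttributeDefinition]:
    """Parse rows of: name, type, min, max, repeating, mandatory[, regex]."""
    definitions = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # ignore blank lines
        name, type_code, min_len, max_len, repeating, mandatory = row[:6]
        regex = row[6] if len(row) > 6 and row[6] else None
        definitions.append(AttributeDefinition(
            name, DATA_TYPES[int(type_code)], int(min_len), int(max_len),
            repeating == "1", mandatory == "1", regex))  # assumed 0/1 flags
    return definitions


sample = "id,2,0,0,0,0\nfirst_name,2,0,0,0,0\nemail,2,0,0,0,0\n"
for definition in parse_object_type_csv(sample):
    print(definition)
```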
Some importers provide best practices for extracting the target types from the source systems using common tools. Documentum Importer User Guide and SharePoint Importer User Guide can help you with such suggestions.
The optional regular expression that you can provide for an attribute will be used to validate the attribute values during object transformation and validation. Please provide a rule that matches all valid characters for that attribute. If the attribute value contains any characters that are not matched by the regular expression, its validation will fail. For example, the following regular expression will match all characters that are valid for a SharePoint 2013 file name:
^[^\~\"\#\%\&\*\:\<\>\?\/\\\{\|\}]+$
It will match one or more characters between begin and end of the string that are not in the list of the following characters: ~"#%&*:<>?/\{|}
You can also use the regular expression to enforce a certain data schema for the attribute values. For example, to enforce a schema with three digits followed by a minus followed by five digits you can use the following regex:
^[0-9]{3}-[0-9]{5}$
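Both expressions can be tried out against sample values before adding them to an object type definition. The Python sketch below uses Python's regex dialect, which may differ in details from the engine migration-center uses; the sample values and the helper name are made up.

```python
import re

# The two regular expressions quoted above.
SHAREPOINT_FILE_NAME = r'^[^\~\"\#\%\&\*\:\<\>\?\/\\\{\|\}]+$'
THREE_DASH_FIVE = r'^[0-9]{3}-[0-9]{5}$'


def is_valid(value: str, pattern: str) -> bool:
    """Return True if the attribute value satisfies the validation regex."""
    return re.search(pattern, value) is not None


for candidate in ["report.docx", "a*b.docx", "plan/final.docx"]:
    print(candidate, is_valid(candidate, SHAREPOINT_FILE_NAME))

for candidate in ["123-45678", "1234-5678"]:
    print(candidate, is_valid(candidate, THREE_DASH_FIVE))
```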
Migration-center tracks key stages throughout the lifecycle of each object being migrated (such as scan, import, deletion, etc.) and stores information about the exact state of that object at that point in time. The information stored about the objects includes properties such as the status, timestamp, set of metadata, or the ID the object was imported with in the target system. The Object History serves as an internal reference to all objects that have been migrated.
The -History- window can be accessed from migration-center’s main menu <Manage>, menu item <Object History…>
A complete history is available for any job (scanner, importer, scheduler) from the respective items’ History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated for each job can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job
Log files generated by the migration-center Client application in the event of errors can be found in the Client application’s installation folder, e.g. …\fme AG\migration-center Client <version>. These logs are not meant for the user but should be sent to our technical product support staff on request if issues prevent the Client from working normally.
The Alfresco Scanner extracts objects such as documents, folders and lists and saves this data to migration-center for further processing. The key features of the Alfresco Scanner are:
Extract documents, folders, custom lists and list items
Extract content, metadata
Extract document versions
Scanner is the term used in migration-center for an input adapter. Using a scanner such as the Alfresco Scanner to extract data that needs processing in migration-center is the first step in a migration project, thus scan also refers to the process used to input data to migration-center.
Scanners and importers work as jobs that can be run at any time, and can even be executed repeatedly. For every run a detailed history and log file are created. Multiple scanner and import jobs can be created or run at a time, each being defined by a unique name, a set of configuration parameters and a description (optional).
The Alfresco Scanner is not included in the standard installer of the migration-center Server Components; instead it is delivered as an Alfresco Module Package (amp). This is because the Alfresco Scanner has to be installed within the Alfresco Repository Server. The following versions of Alfresco are supported (on Windows or Linux): 4.0, 4.1, 4.2, 5.2, 6.1.1, 6.2.0.
Java 1.8 is required for the installation of Alfresco Scanner.
To install the other adapters you need for your migration, install the Server Components as described in the Installation Guide. It is recommended to install the Server Components on another machine, but it is also possible to install them on the Alfresco Server. If you use the Alfresco Scanner in combination with an importer running on another machine, the scanner should export the files to a network share that is accessible from the Server Components.
The first step of the installation is to copy the mc-alfresco-adaptor-<version>.amp file into the “amps-folder” of the Alfresco installation.
The last step is to finish the installation by installing the mc-alfresco-adaptor-<version>.amp file as described in the Alfresco wiki guide at http://wiki.alfresco.com/wiki/Module_Management_Tool
Before doing this, please back up your original alfresco.war and share.war files so that you can uninstall the migration-center Jobserver after a successful migration. This is currently the only way to uninstall, as long as the Module Management Tool of Alfresco does not support removing a module from an existing WAR file.
The Alfresco Server should be stopped when applying the amp files. Note that Alfresco provides scripts for installing the amp files, e.g.:
C:\Alfresco34\apply_amps.bat (Windows)
/opt/alfresco/commands/apply_amps.sh (Linux)
Due to a bug in the Alfresco installer under Windows, please verify that installing the amp file via the apply_amps script works correctly. Under Alfresco 3.4, the file apply_amps.bat must be located in the Alfresco installation folder and not in the bin subfolder.
The Alfresco Scanner can be uninstalled by following these steps:
Stop the Alfresco Server.
Restore the original alfresco.war and share.war files which were backed up before the Alfresco Scanner installation.
Remove the file mc-alfresco-adaptor-<version>.amp from the “amps-folder”.
To create a new Alfresco Scanner, create a new scanner and select Alfresco from the Adapter Type drop-down. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type. Mandatory parameters are marked with an *.
The Properties of an existing scanner can be accessed after creating the scanner by double-clicking the scanner in the list, or selecting the Properties button/menu item from the toolbar/context menu. A description is always displayed at the bottom of the window for the selected parameter.
Multiple scanners can be created for scanning different locations, provided each scanner has a unique name.
Configuration parameters
Values
Name
Enter a unique name for this scanner
Mandatory
Adapter type
Select Alfresco from the list of available adapters
Mandatory
Location
Select the Jobserver location where this job should be run. Jobservers are defined in the Jobserver window. If no Jobserver has been created by the user to this point, migration-center will prompt the user to define a Jobserver Location when saving the Scanner.
Note that the Alfresco Server must be used as a Jobserver Location with default port 9701.
Mandatory
Description
Enter a description for this job (optional)
The configuration parameters available for the Alfresco Scanner are described below:
Configuration parameters
Values
username*
User name for connecting to the source repository. A user account with admin privileges must be used to support the full Alfresco functionality offered by migration-center.
Example: Alfresco.corporate.domain\spadmin
Mandatory
password*
Password of the user specified above
Mandatory
scanLocations
The entry point(s) in the Alfresco repository where the scan starts.
Multiple values can be entered by separating them with the “|” character.
Note that the value(s) must follow the Alfresco repository folder structure, e.g.:
/Sites/SomeSite/documentLibrary/Folder/AnotherFolder
/Sites/SomeSite/dataLists/02496772-2e2b-4e5b-a966-6a725fae727a
The scanner accepts the following as a scan location: an entire site, a specific library, a specific folder in a library, or a specific data list.
If any location is invalid, the scanner will report an appropriate error to the user and will not start.
contentLocation*
Folder path. The location where the exported object content should be temporarily saved. It can be a local folder on the same machine as the Jobserver or a shared folder on the network. This folder must exist prior to launching the scanner and must have write permissions; migration-center will not create this folder automatically. If the folder cannot be found, an appropriate error will be raised and logged. This path must be accessible by both scanner and importer, so if they are running on different machines, it should be a shared folder.
Mandatory
exportLatestVersions
This parameter specifies how many versions from every version tree will be exported, counting from the latest version back to older versions. If the value is empty, not a valid number, zero or negative, or greater than the number of versions in the tree, all versions will be exported.
exportContent
Setting this parameter to true will extract the actual content of the documents during the scan and save it in the contentLocation specified earlier.
This setting should always be checked in a production environment.
dissolveGroups
Setting this parameter to true will cause every group permission to be scanned as the separate users that make up the group.
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during scan
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
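The fallback rules described for exportLatestVersions can be summarised in a short Python sketch. It only illustrates the selection logic as documented; the function name and the assumption that a version tree is an ordered list from oldest to latest are hypothetical.

```python
def select_versions_to_export(version_tree, export_latest_versions):
    """Pick the latest n versions, or all of them if n is not usable.

    version_tree is assumed to be ordered from oldest to latest.
    """
    try:
        n = int(str(export_latest_versions).strip())
    except ValueError:
        return list(version_tree)      # empty or not a valid number
    if n <= 0 or n > len(version_tree):
        return list(version_tree)      # zero, negative, or more than available
    return list(version_tree[-n:])     # the n most recent versions


tree = ["1.0", "1.1", "2.0"]  # hypothetical version labels
print(select_versions_to_export(tree, "2"))
print(select_versions_to_export(tree, ""))
print(select_versions_to_export(tree, "5"))
```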
A complete history is available for any Alfresco Scanner job from the respective items’ -History- window. It is accessible through the [History] button/menu entry on the toolbar/context menu. The -History- window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the Alfresco Scanner can be found in the Alfresco log folder on the machine where the job was run, e.g. C:\Alfresco\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The CSV & Excel Scanner is one of the source adapters available in migration-center starting with version 3.9. It scans data from CSV and MS Excel files and creates corresponding objects in the migration-center database. If the scanned data also contains links to content files, the CSV & Excel scanner can link those files to the created objects as well.
Scanner is the term used in migration-center for an input adapter. Using a scanner module to read the data that needs processing into migration-center is the first step in a migration project, thus scan also refers to the process used to input data to migration-center.
The scanner module works as a job that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created.
A scanner is defined by a unique name, a set of configuration parameters and an optional description.
CSV & Excel Scanners can be created, configured, started, and monitored through migration-center Client, but the corresponding processes are executed by migration-center Job Server.
When scanning Excel files, the “Write Attributes” permission is required; otherwise the scanner will throw an “Access is denied” error.
When attempting to scan Excel files with a large number of rows and/or columns the UI might freeze until the following error is thrown:
ERROR executing job: Error was: java.lang.OutOfMemoryError: GC overhead limit exceeded
This is a limitation of the Apache POI API; the recommendation is to convert the Excel file into a CSV file.
To create a new CSV & Excel Scanner job click on the New Scanner button and select “CSV/Excel” from the adapter type dropdown list. Once the adapter type has been selected, the parameters list will be populated with the CSV & Excel Scanner parameters.
The Properties window of a scanner can be accessed by double-clicking the scanner in the list or selecting the Properties button or entry from the toolbar or context menu.
Configuration parameters
Values
Name
Enter a unique name for this scanner
Mandatory
Adapter type
Select the “CSV/Excel” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should run. Job Servers are defined in the Jobserver window. If no Job Server was selected, migration-center will prompt the user to define a Job Server Location when saving the scanner.
Mandatory
Description
Enter a description for this scanner (optional)
Configuration parameters
Values
filePath*
The full path to the CSV or MS Excel file to scan.
Since the job will be executed on the job server machine, you must provide a valid path on that machine.
Mandatory
sourceIdColumn*
The name of the column in the CSV or Excel file that contains the source ID of the object.
Note that the values in this column must be unique.
Mandatory
contentPathColumn
The name of the column in the CSV or Excel file that contains the path to the corresponding content file of the object.
versionIdentifierColumn
The name of the column in the CSV or Excel file that identifies all objects that belong to a specific version tree.
Mandatory when scanning versions.
versionLevelColumn
The name of the column in the CSV or Excel file that specifies the position of an object in the version tree.
Mandatory when scanning versions.
enrichMetadataForScannerRun
The run number of the scan run that you want to enrich.
Mandatory when using enrichment mode.
enrichMetadataPrefix
Optional prefix for the columns that will be added to the objects when running in enrichment mode.
multivalueFields
The names of the columns that contain multiple values, separated by commas.
multivalueDelimiter
The delimiter used to separate the values within a multi-value column.
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during scan
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
The CSV & Excel scanner supports two operation modes: normal and enhancement mode. In normal mode, you can scan objects from a CSV or Excel file and those objects are saved in the corresponding scan run as new, distinct objects in the MC database. In enhancement mode, you can scan data from a CSV or Excel file that is added to an existing scan run, i.e. it enhances an existing scan run with additional data.
In normal operation mode, you can scan objects without content, objects with content, and objects with versions from a CSV or Excel file.
Scan objects without content
In order to scan objects without content, you just specify the path to the CSV or Excel file and the name of the column that contains the unique source ID of the objects. For example:
The screenshot below shows an excerpt of a CSV file you would like to scan.
You would then enter the path to the CSV file in the “filePath” parameter and enter “id” as value in the “sourceIdColumn” parameter (because the column named “id” contains the unique IDs of the objects in the CSV file).
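Because the values in the source ID column must be unique, it can be useful to check a CSV file before scanning it. The following is a minimal sketch of such a pre-check, not part of migration-center; the function name, the sample data, and the assumption of a comma-separated file with a header row are all illustrative.

```python
import csv
import io
from collections import Counter


def find_duplicate_source_ids(csv_text, source_id_column):
    """Return the source ID values that occur more than once, in first-seen order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if source_id_column not in (reader.fieldnames or []):
        raise ValueError(f"column {source_id_column!r} not found in header")
    counts = Counter(row[source_id_column] for row in reader)
    return [value for value, count in counts.items() if count > 1]


# Hypothetical sample data; only the "id" column name comes from the example above.
SAMPLE = "id,first_name,last_name\n1,Ada,Lovelace\n2,Alan,Turing\n2,Grace,Hopper\n"

print(find_duplicate_source_ids(SAMPLE, "id"))
```

An empty result means that the chosen column can be used as “sourceIdColumn”. For a real file, read its content with the encoding and delimiter the file actually uses.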
Scan objects with content
If your source file contains a column with a file path, as seen in the next screenshot, you can enter that column name (“profile_picture” in the example) in the “contentPathColumn” parameter in order to scan that content file along with the metadata of the object.
Scan objects with versions
In order to scan objects with versions, your source file needs to contain two additional columns: one column that identifies all objects that belong to a specific version tree (“versionIdentifierColumn”) and another column that specifies the position of an object in the version tree (“versionLevelColumn”).
In the example below, the source file contains metadata of documents that consist of several versions each. Each document has a “title” and a “document_id”. Each version has a “version_id” and a “content_file”. The combination of “document_id” and “version_id” must be unique. Since the content file path is unique for each version in this example, you could use that column as “sourceIdColumn”.
A valid configuration to scan the example source file could look like this:
There are many document management applications, which reference data in third-party-systems. If you want, for example, to archive documents from such an application, it might be useful to store that referenced data also in the archive records, in order to have independent, self-contained archive records. To achieve this, you could (1) scan your document management system with the appropriate scanner, (2) export the data from the third-party-system into a CSV file, and (3) enhance the scan run from step (1) with the data from the CSV file using the CSV & Excel scanner in enhancement mode.
The source file for an enhancement mode scan needs a column that stores the source ID of the corresponding object, i.e. the object that will be enhanced with the data in the source file, in the specified scan run.
For example, the following table shows an excerpt of an existing scan run. The source ID of the objects is stored in the column “Id_in_source_system”. In this case, the ID is the full path of the files that were scanned by the scan run.
The source CSV or Excel file for the enhancement scan should contain all the data that you would like to add to the previously scanned objects and a column with the source ID of the corresponding object, as shown in the following table:
In order to run the scanner in enhancement mode, you need to enter a valid scan run ID in the parameter “enrichMetadataForScanRun”. You can find that ID in the -History- view of a scanner:
The “sourceIdColumn” parameter should contain the name of the column with the source ID. That would be “source_id” in the example above.
You can provide an optional prefix for the columns that will be added to the scan run in the “enrichMetadataPrefix” parameter. This is helpful in the case when your source file contains columns with names that already exist in the specified scan run.
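To decide whether a prefix is needed, you can compare the column names of the source file with the attribute names that already exist in the scan run. The sketch below only illustrates this naming check; the function and the sample names are hypothetical, and it assumes that the prefix is simply prepended to each column name.

```python
def colliding_columns(existing_names, new_names, prefix=""):
    """Return the (prefixed) new column names that already exist in the scan run."""
    existing = set(existing_names)
    return sorted(prefix + name for name in new_names if prefix + name in existing)


# Hypothetical attribute names of an existing scan run and of an enhancement file.
scan_run_attributes = ["title", "owner"]
enhancement_columns = ["title", "region"]

print(colliding_columns(scan_run_attributes, enhancement_columns))
print(colliding_columns(scan_run_attributes, enhancement_columns, prefix="crm_"))
```

Without a prefix the column “title” collides with an existing attribute; with the hypothetical prefix “crm_” no collision remains.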
Of course you can enhance a certain scan run several times with different source files.
The scanner allows splitting the values of one or more columns into multiple distinct values based on a delimiter.
For example, in this particular case, if you want to split the values of columns Record_Series and Classification, all you need to do is to add the column names in the multivalueFields parameter and set the "|" delimiter in multivalueDelimiter parameter:
If the delimiter between the values does not match the one entered in the multivalueDelimiter parameter, the cell value is not treated as multi-value and is stored as a single value.
Limitations
The columns set in sourceIdColumn, versionIdentifierColumn, or versionLevelColumn cannot be listed in multivalueFields because those are single-value fields.
If the parameter multivalueFields is set then parameter multivalueDelimiter must be set as well.
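The splitting behavior and the two limitations can be sketched as follows. This is an illustration under stated assumptions, not the scanner's implementation: the function names are hypothetical, the parameters are passed as a plain dictionary, and values are split without trimming whitespace.

```python
def split_cell(value, delimiter):
    """Split a cell value; a value without the delimiter stays a single value."""
    return value.split(delimiter)


def validate_multivalue_config(params):
    """Return a list of violations of the two limitations described above."""
    fields = [f.strip() for f in params.get("multivalueFields", "").split(",") if f.strip()]
    errors = []
    if fields and not params.get("multivalueDelimiter"):
        errors.append("multivalueDelimiter must be set when multivalueFields is set")
    for key in ("sourceIdColumn", "versionIdentifierColumn", "versionLevelColumn"):
        column = params.get(key)
        if column and column in fields:
            errors.append(f"{column} is used as {key} and cannot be multi-value")
    return errors


print(split_cell("HR|Finance", "|"))   # two values
print(split_cell("HR;Finance", "|"))   # delimiter does not match: one value
print(validate_multivalue_config({
    "sourceIdColumn": "id",
    "multivalueFields": "Record_Series,Classification",
    "multivalueDelimiter": "|",
}))
```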
A complete history for any CSV & Excel Scanner job is available from the respective items’ History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the CSV & Excel Scanner can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The Documentum Scanner extracts objects such as files, folders, relations, etc. from a source Documentum repository and saves this data to migration-center for further processing. As of migration-center 3.2, the Documentum Scanner and Importer are no longer tied to one another: data scanned with the Documentum Scanner can now be imported by any other importer, including of course the Documentum Importer. Starting with version 3.2.9, objects derived from dm_sysobject are supported.
Scanner is the term used in migration-center for an input adapter. It is used to read the data that needs processing into migration-center and is the first step in a migration project, thus scan also refers to the process used to input data to migration-center in general.
A scanner works as a job that can be run at any time and can be executed repeatedly. For every run a detailed history and log file are created.
A scanner is defined by a unique name, a set of configuration parameters and an optional description.
Documentum Scanners can be created, configured, started and monitored through migration-center Client but the corresponding processes are executed by migration-center Job Server.
The Documentum Scanner currently supports Documentum Content Server versions 4i, 5.2.5 to 20.2, including service packs.
For accessing a Documentum repository Documentum Foundation Classes 5.3 or newer is required. Any combinations of DFC versions and Content Server versions supported by EMC Documentum are also supported by migration-center’s Documentum Scanner, but it is recommended to use the DFC version matching the version of the Content Server being scanned. The DFC must be installed and configured on every machine where migration-center Server Components is deployed.
For scanning a Documentum 4i or 5.2.5 source repository, DFC version 5.3 must be used since newer DFC versions do not support accessing older Documentum repositories properly. At the same time, migration-center does not support DFC versions older than 5.3, therefore DFC 5.3 is the only option in this case.
Starting from version 3.9 of migration-center additional configurations need to be made for the Documentum adapter to be able to locate Documentum Foundation Classes. This is done by modifying the dfc.conf file, located in the Job Server installation folder.
There are two settings inside the file that by default match the paths of a standard DFC install. One needs to have the path for the config folder of DFC and the other needs the path to the dctm.jar.
See below example:
wrapper.java.classpath.dfcConfig=C:/Documentum/config
wrapper.java.classpath.dfcDctmJar=C:/Program Files/Documentum/dctm.jar
The dfcConfig parameter must point to the DFC configuration folder. The dfcDctmJar parameter must point to the dctm.jar file.
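Before starting the Job Server, the two settings can be checked with a small script. The following is a minimal sketch, assuming that dfc.conf contains plain key=value lines and that lines starting with # are comments; the function names are hypothetical.

```python
from pathlib import Path

CONFIG_KEY = "wrapper.java.classpath.dfcConfig"
JAR_KEY = "wrapper.java.classpath.dfcDctmJar"


def read_dfc_paths(conf_text):
    """Extract the two DFC classpath settings from the text of dfc.conf."""
    found = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() in (CONFIG_KEY, JAR_KEY):
            found[key.strip()] = value.strip()
    return found


def check_dfc_paths(found):
    """Return a list of problems; an empty list means both paths exist."""
    problems = []
    config = found.get(CONFIG_KEY)
    jar = found.get(JAR_KEY)
    if config is None or not Path(config).is_dir():
        problems.append("dfcConfig does not point to an existing folder")
    if jar is None or not Path(jar).is_file():
        problems.append("dfcDctmJar does not point to an existing file")
    return problems


EXAMPLE = (
    "wrapper.java.classpath.dfcConfig=C:/Documentum/config\n"
    "wrapper.java.classpath.dfcDctmJar=C:/Program Files/Documentum/dctm.jar\n"
)
print(check_dfc_paths(read_dfc_paths(EXAMPLE)))
```

Run the check on the Job Server machine itself, since the paths must be valid there.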
When scanning Documentum documents, their folder paths are also scanned, and the folder structure can be automatically re-created by migration-center in the target system. This procedure will not keep any of the metadata attached to the folder objects, such as owners, permissions, specific object types, or any custom attributes. Depending on project requirements, it may be required to do a “folder-only migration” first, e.g. for migrating a complete folder structure including custom folder object types, permissions and other attributes first, and then populate this folder structure with documents afterwards. In order to execute a folder-only migration the following steps should be performed to configure the migration process accordingly:
Scanner: on the scanner’s |Parameters| tab, check the exportFolderStructure option. Set scanFolderPaths (mandatory in this case) and, optionally, excludeFolderPaths. Leave the parameter documentTypes empty to scan only the folder structure without any documents the folders may contain; list document types as well if both folders and documents should be scanned. Note: Scanning folders is not possible via the dqlString option in the scanner.
Migration set: When creating a new migration set choose a <source type to target type>(folder) object type. Now only the scanner runs containing folder objects will be displayed on the |Filescan Selection| tab. Note that the number of objects contained in the displayed scanner runs now indicates folders and not documents, which is why the number on display (folders) can be different from the total number of objects processed by the scan (if it contains other types of objects besides folders, such as documents).
IMPORTANT: It is necessary to take the approach described above when migrating folder structures with complex folder objects containing custom object types, permissions, attributes, relations, etc. This information will be lost if exportFolderStructure is not selected during the scan. If the exportFolderStructure parameter was not set during a scan, it is of course possible to re-run the scan after setting this option, or to copy/create a new scanner and scan the missing folder information with that one.
Versions (and branches) are supported by the Documentum Scanner, including custom version labels. The exportVersions parameter in the scanner’s configuration parameters determines if all versions (checked) or only current versions of documents (not checked, default setting) are scanned.
It is important to know that a consistency check of the version tree is performed by migration-center before scanning. A version tree containing invalid or missing references will not be exported at all and the operation will be reported as an error in the scanner’s log. It is not possible for migration-center to process or repair such a version structure because of the missing references.
For documents with versions, the version label extracted by the scanner from the Documentum attribute r_version_label can be changed by the means of the transformation rules during processing. The version structure (i.e. the ordering of the objects relative to their antecedents) cannot be changed using migration-center.
If objects are scanned with the exportVersions option checked, all versions must be imported as well since each object references its antecedent, going back to the very first version. Therefore, it is advised not to drop the versions of an object between the scan and the import processes since this will most likely generate inconsistencies and errors. If an object is intended to be migrated without versions (i.e. only the current version of the object needs to be migrated), then the affected objects should be scanned without enabling the exportVersions option.
Scanning large version trees
Processing a version tree is based on a recursive algorithm, which implies that all objects which are part of a version tree must be loaded into memory together. This can be problematic with very large version trees (several thousand versions). By default, the Documentum Scanner can load and process version trees up to around 2,000 versions in size. For even larger version trees to be processed, the Java thread stack size (the -Xss setting) for the Job Server must be increased according to the following steps:
Stop the Job Server
Open the wrapper.conf file located in the migration-center Server Components installation folder (by default it is %programfiles%\fme AG\migration-center Server Components <Version>)
Search for
# Java Additional Parameters
# Increase the value of this parameter it if your documentum scanner needs
# to scan a large number of versions per document. Alocate 256k for every
# 1000 versions/document.
Edit the line wrapper.java.additional.1=-Xss512k, incrementing the default 512k by 256k for every additional 1,000 versions mc should be able to process.
E.g. for enabling processing of version trees containing up to 4,000 versions (2,000+1,000+1,000 versions), set the value to 1024k (512k+256k+256k)
Save the file
Start the Job Server
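The rule from the steps above (512k for up to 2,000 versions, plus 256k for every additional 1,000 versions) can be written as a small helper. This is only a sketch of the arithmetic; the function name is hypothetical, and rounding partial blocks of 1,000 versions up is an assumption.

```python
import math


def required_stack_size_kb(max_versions, base_versions=2000, base_kb=512,
                           step_versions=1000, step_kb=256):
    """Return the -Xss value in KB for version trees of up to max_versions versions."""
    extra_versions = max(0, max_versions - base_versions)
    steps = math.ceil(extra_versions / step_versions)  # partial blocks round up
    return base_kb + steps * step_kb


# Reproduces the example above: 4,000 versions -> 512k + 256k + 256k = 1024k
print(f"wrapper.java.additional.1=-Xss{required_stack_size_kb(4000)}k")
```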
The scanner exports the primary content of all documents unless the skipContent parameter is set to true. The locations where the content was exported can be seen in the column Content location and in the source attribute mc_content_location. If a primary content has multiple pages, the column Content location stores the location where page 0 was exported, while mc_content_location stores the locations of all pages.
Renditions are supported by the Documentum Scanner. The “exportRenditions” parameter in the scanner’s configuration parameters determines if renditions are scanned. Renditions of an object will not count as individual objects, since they are different instances of content belonging to one and the same object. The scanner extracts each rendition’s content, format, page modifier, page number, and storage location. This information is exposed to the user via migration-center source object attributes starting with dctm_obj_rendition* in any documents migration set that has Documentum or FirstDoc as source system.
Documentum 4i does not have the page modifier attribute/feature for renditions, therefore such information will not be extracted from a Documentum 4i repository.
Relations are supported by the Documentum Scanner. The option named exportRelations in the scanner’s configuration determines if they are scanned and added to the migration-center database. Information about them cannot be altered using transformation. Migration-center will manage relations automatically if the appropriate options in the scanner and importer have been selected. They will always be connected to their parent object and can be viewed in migration-center by right-clicking on an object in any view of a migration set and selecting <View Relations> from the context menu. The resulting dialog will list all relations of the selected object with their associated metadata, such as relation name, child object, etc.
IMPORTANT: The children of the scanned relations are not scanned automatically if they are not in the scope of the scanner. The user must ensure the documents and folders that are children in the scanned relations are included in the scope of the scanner (they are linked under the scanned path or they are returned by dqlString).
migration-center’s Documentum Scanner supports relations between folders and/or documents only (i.e. “dm_folder” and “dm_document” objects, as well as their respective subtypes). “dm_subscription” type objects, for example, although they are relations from a technical point of view, will be ignored by the scanner because they are relations involving a “dm_user” object. Custom relation objects (i.e. relation-type objects which are subtypes of “dm_relation”) are also supported, including any custom attributes they may have. The restrictions mentioned above regarding the types of objects connected by a relation also apply to custom relation objects.
As an alternative to scanning relations as they are in Documentum, the scanner offers the possibility to scan the child related documents as renditions of the parent document. For that, the parameter “exportRelationsAsRendtions” should be checked. This requires “scanRelations” to be checked as well. You can filter the relations that will be scanned as renditions by setting the relation names in the parameter “relationsAsRenditionNames”. If this is not set, all relations to documents will be processed as renditions.
Documentum Virtual Documents are supported by the Documentum Scanner. The option named exportVirtualDocs in the configuration of the scanner determines if virtual documents are scanned and exported to migration-center.
There is a second parameter related to virtual documents, named maintainVirtualDocsIntegrity. This option will allow the scanner to include children of VDs which may be outside the scope of the scanner (paths to scan or dqlString) in order to maintain the integrity of the VD. If this parameter is disabled, any children in the VD that are out of scope (they are not linked under the scanned path or they are not returned by dqlString) will not be scanned and the VD may be incomplete.
The VD binding information (dmr_containment objects) are always scanned and attached to the root object of a VD regardless of the maintainVirtualDocsIntegrity option. In this way it is possible to scan any missing child objects later on and still be able to restore the correct VD structure based on the information stored with the root object.
The exportVDVersions option allows exporting only the latest version of the VD documents. This option applies only to virtual documents, since the exportVersions option applies only to normal documents.
The exportVersions option needs to be checked for scanning Virtual Documents (i.e. if the exportVirtualDocs option is checked) even if the virtual documents themselves do not have multiple versions, otherwise the virtual documents export might produce unexpected results. This is because the VD parents may still reference child objects that are not current versions of those respective objects. This is not an actual product limitation, but rather an issue caused by this particular combination of scanner options and Documentum’s VD features, which rely on information related to versioning.
The Snapshot feature of virtual documents is not supported by migration-center.
The Documentum Scanner also supports audit trail entries for documents and folders. To enable scanning audit trails, the scanner parameter exportAuditTrail must be checked; in this case the audit trail entries of all documents and folders within the scope of the scan will be scanned as Documentum(audittrail) type objects, similarly to Documentum(document) or Documentum(folder) type objects.
There are some additional parameters used for fine tuning the selection and type of audit trail entries the scanner should consider:
auditTrailType – is the Documentum audit trail object type. By default, this is dm_audittrail but custom audit trail types (derived from dm_audittrail) are supported as well
auditTrailSelection – is used for narrowing down the selection of audit trail records since the number of audit trails can grow large especially in old systems, but not necessarily all audit trail entries may be relevant for a migration to a new system. This option accepts a DQL conformant WHERE clause as would be used in a SELECT statement. If this returns no results, all audit trail objects of scanned documents and folders will be scanned. Example 1: event_name in ('dm_save', 'dm_checkin') Example 2: event_name = 'dm_checkin' and time_stamp >= DATE('01.01.2012', 'DD.MM.YYYY')
auditTrailIgnoreAttributes – contains a comma separated list of dm_audittrail attributes the scanner should ignore. Again, this option can be used to eliminate audit trail information that is not needed for migration right from the scan.
Because there are target systems that don’t allow importing audit trail objects, the Documentum Scanner allows exporting audit trail objects to PDF renditions of the scanned documents. Exporting audit trail objects as PDF renditions applies only to documents.
The following scanner parameters are used for applying this feature:
exportAuditTrailAsRendition – when checked, the audit trail entries are written to PDF files that are saved as renditions of the documents. This parameter can be checked only when exportAuditTrail is checked and skipContent is not checked. If not checked, the audit trail entries are exported as Documentum(audittrail) objects.
auditTrailPerVersionTree – this applies only when exportAuditTrailAsRendition is checked. When it is checked, one PDF is generated for all audit trail entries of all versions of the document. The audit trail entries related to deleted versions are exported as well. The rendition is assigned to the latest version in the tree. When not checked, one PDF rendition is generated for every version in the tree. In this case the audit trail entries related to deleted versions are not exported, because those versions no longer exist in the repository and are therefore not exported by the scanner.
Exporting audit trail per version tree may have a big impact on the scanner performance. That’s because audit trail entries for documents are queried by the attribute dm_audittrail.chronicle_id. The performance might be dramatically improved by adding an index in the underlying table DM_AUDITTRAIL_S for the column CHRONICLE_ID.
Scanning aspects is supported with the latest update of the migration-center Documentum Scanner. Attributes resulting from aspects are scanned automatically for any document or folder type object within scope of the scan.
The notation used by the Documentum Scanner to identify attributes which result from aspects appended to the objects being scanned is the same as used by Documentum, namely <aspect_name>.<aspect_attribute>.
Any number of aspects per document/folder, as well as any number of attributes per aspect are supported.
After a scan has finished, attributes scanned from aspects are available for further processing just like any other source attribute and can be used normally in any transformation rule.
Aspects are supported only for document and folder type objects!
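When post-processing scanned attributes outside of migration-center, the <aspect_name>.<aspect_attribute> notation makes it easy to tell aspect attributes apart from regular ones. The sketch below is only an illustration; the function and the sample attribute names are hypothetical, and it assumes that regular attribute names never contain a dot.

```python
from collections import defaultdict


def group_by_aspect(attributes):
    """Split attributes into regular ones and a per-aspect dictionary."""
    regular = {}
    by_aspect = defaultdict(dict)
    for name, value in attributes.items():
        aspect_name, dot, aspect_attribute = name.partition(".")
        if dot:
            by_aspect[aspect_name][aspect_attribute] = value
        else:
            regular[name] = value
    return regular, dict(by_aspect)


# Hypothetical scanned attributes; "retention_aspect" is an invented aspect name.
scanned = {
    "object_name": "Contract.pdf",
    "retention_aspect.retain_until": "2030-01-01",
    "retention_aspect.policy": "legal",
}
print(group_by_aspect(scanned))
```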
Starting with version 3.2.8 Update 2 of migration-center, the possibility of scanning PDF annotations has been added to the Documentum Scanner. When “exportAnnotations” is activated, the scanner will scan the related “dm_note” objects together with the DM_ANNOTATE relations. The “dm_note” objects are scanned as normal objects, while the DM_ANNOTATE relations are exported as MC relations having the relation_type = “DctmAnnotationRelation”.
During delta migration, the scanner is able to identify the annotation changes and scan them accordingly.
Scanning the comments related to documents and folders is possible and can be activated (default is deactivated) by changing the scanner parameter “exportComments” to true.
The best-known use case for documents and folders comments is within xCP (xCelerated Composition Platform) application, but it can also be used in custom WDK Documentum solutions.
The comment objects will be scanned as MC relation objects and can be seen in the MC client by opening the relations view of a scanned object. They will have the value of RELATION_TYPE as “CommentRelation”. All comment related attributes that have values will be scanned as attributes of these relations.
For performance reasons, when a document has multiple versions, the comment relations will be attached only to the first document version that had comments (since all document versions share the same comments).
During delta migration, the scanner is able to identify comment changes, based on modifications of “i_vstamp” so it will rescan the corresponding document with all its comments (the first version document that had comments – see paragraph above) even if the document did not change.
To be able to scan a document’s comments, the DFC in use must have a valid Global Registry configured, because the adapter uses the “CommentManager” BOF service to read them.
Objects that have changed in the source system since the last scan are scanned as update objects. Whether an object in a migration set is an update or not can be seen by checking the value of the Is_update column – if it’s 1, the current object is an update to a previously scanned object (the base object). There are some things to consider when working with the update migration feature:
Updated objects are detected based on the r_modify_date and i_vstamp attributes. If one of these attributes has changed, the object itself is considered to have changed and will be scanned and added as an update. Typically any action performed in Documentum changes at least one if not both of these attributes, offering a reliable way to detect whether an object has changed since the last scan or not; on the other hand, objects changed by third party code/applications without touching these attributes might not be detected by migration-center as having changed.
Objects deleted from the source after having been migrated are not detected and will not be deleted in the target system. This is by design (due to the added overhead, complexity and risk involved in deleting customer data).
Updates/changes to primary content, renditions, metadata, VD structures, and relations of objects will be detected and updated accordingly.
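The detection rule described above can be summarized in a few lines. This is a minimal sketch of the rule only, not the scanner's implementation; the function name and the dictionary representation of an object are hypothetical.

```python
def is_update(previous_scan, current_state):
    """An object counts as changed if r_modify_date or i_vstamp differs."""
    return (previous_scan["r_modify_date"] != current_state["r_modify_date"]
            or previous_scan["i_vstamp"] != current_state["i_vstamp"])


# Hypothetical attribute values as stored by an earlier scan and as found now.
before = {"r_modify_date": "2020-03-01 10:15:00", "i_vstamp": 4}
after_edit = {"r_modify_date": "2020-03-02 08:00:00", "i_vstamp": 5}

print(is_update(before, before))      # unchanged object
print(is_update(before, after_edit))  # changed object, scanned as an update
```

The sketch also shows the limitation mentioned above: a change that touches neither attribute leaves both comparisons equal and therefore goes undetected.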
To create a new Documentum Scanner job, specify the respective adapter type in the Scanner Properties window – from the list of available adapters “Documentum” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type.
The Properties window of a scanner can be accessed by double-clicking a scanner in the list or selecting the Properties button/menu item from the toolbar/context menu.
A detailed description is always displayed at the bottom of the window for the currently selected parameter.
When editing a field in the |Parameters| tab (e.g. loggingLevel), a description/help/hint appears in the lower part of the window.
A complete history is available for any Documentum Scanner job from the respective items’ History window. It is accessible through the History button/menu entry on the toolbar/context menu. The list of all runs for the selected job together with additional information, such as the number of processed objects, the starting time, the ending time and the status are displayed in a grid format.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the Documentum Scanner can be found in the “logs” folder chosen during the installation of the Server Components on the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs\Dctm-Scanner
The amount of information written to the log files depends on the setting specified in the loggingLevel start parameter for the respective job.
The eRoom Adapter consists of one module: the eRoom Scanner, which scans and exports objects from an eRoom site and inputs them into migration-center.
The eRoom scanner supports eRoom 7 or higher. It uses the eRoom XML Query Language (EXQL) to access eRoom data and methods through Simple Object Access Protocol (SOAP) requests, which is only available starting with eRoom 7.
Scanner is the term used in migration-center for an input adapter. Using the eRoom Scanner Module to read the data that needs processing into migration-center is the first step in a migration project, thus scan also refers to the process used to input data to migration-center.
The scanner module works as a job that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created.
A Scanner is defined by a unique name, a set of configuration parameters and an optional description.
eRoom Scanners can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed by migration-center Job Server.
The eRoom scanner is designed to scan and export objects from eRoom. Currently the eRoom scanner supports scanning items of type file (including versions); folder and room information is also appended to the files, thus making it possible to restore an existing hierarchy later.
The eRoom scanner can be targeted at a facility and/or a room. From there it will start scanning for objects recursively, meaning that in the case of a facility it will scan all rooms, and from each room it will scan all objects (folders or files) contained in that room.
The eRoom scanner adheres to migration-center standard behavior, meaning it runs as a job configured by the user via parameters, and saves all extracted objects/items’ content on disk and the corresponding metadata to the migration-center database in the process. After a scan has completed, the newly scanned objects are available for further processing in migration-center.
The eRoom scanner does not work with OpenJDK 8; it only works with the Oracle version of Java SE 8.
To create a new eRoom Scanner job, specify the respective adapter type in the Scanner Properties window – from the list of available adapters, “eRoom” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case the eRoom adapter.
The Properties window of a scanner can be accessed by double-clicking a scanner in the list, or selecting the Properties button or entry from the toolbar or context menu.
A complete history is available for any eRoom Scanner job from the respective items’ History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the eRoom Adapter can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
You can use our Filesystem Scanner in several use cases, e.g. to scan files from file repositories or to scan files exported into a filesystem from a DMS or other third-party system. Scanner is the term used in migration-center for an input adapter. Using a scanner module to read the data that needs processing into migration-center is the first step in a migration project, thus scan also refers to the process used to input data to migration-center.
The scanner module works as a job that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created.
A scanner is defined by a unique name, a set of configuration parameters and an optional description.
Filesystem Scanners can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed by migration-center Job Server.
A basic scanner configuration requires specifying a list of folders to be scanned. Local paths and UNC paths are supported. The scanner will scan all files located inside the given folders and their subfolders. The common Windows attributes like filename, file path, creation date, modify date, content size etc. are extracted and saved in the migration-center database as metadata.
To scan folders as distinct objects in migration-center the flag scanFolders needs to be checked. In this case all subfolders of the given folder list will be saved in migration-center database together with their metadata.
Additional metadata stored in external files can be used to enrich the files and folders originating from the file system. These files need to follow the XML format used by migration-center and adhere to the naming convention expected by migration-center. The format for such a file is described below.
Although the files’ contents are XML, the extension of such metadata files can be arbitrary. It is NOT recommended to use XML as the extension, in order to prevent potential conflicts with actual files using the XML extension. The file extension that migration-center should treat as identifying metadata files can be specified in the mcMetadataFileExtension parameter of the Filesystem scanner. If the option has been set and the metadata file for a file or folder cannot be found, an appropriate warning will be logged.
If metadata files are not required for files or folders in the first place, clear the mcMetadataFileExtension parameter to disable additional metadata processing entirely. If some files require additional metadata and others don’t, configure the mcMetadataFileExtension parameter as if all files had metadata. In this case it is safe to ignore the warnings related to missing metadata files for the documents where metadata files are not required or available.
One metadata file should be available in the source system for each file and folder which is supposed to have additional metadata provided by such means. The naming for the metadata file has to follow a simple rule:
for files: filename.extension.metadataextension
for folders: .foldername.metadataextension
E.g.: If the document is named report.pdf, and the extension for the metadata files is determined to be fme, then the metadata file for this document needs to be called report.pdf.fme and fme has to be entered as the value for the mcMetadataFileExtension parameter.
If the folder is Migration the metadata file for it must be .Migration.fme.
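The naming rule above can be sketched in a few lines of Python. This is only an illustration of the convention described here; the function name and its parameters are hypothetical and not part of migration-center.

```python
def metadata_file_name(name: str, metadata_extension: str, is_folder: bool = False) -> str:
    """Return the metadata file name expected for a file or folder name.

    Hypothetical helper: it only mirrors the naming rule from the text
    (files: filename.extension.metadataextension,
     folders: .foldername.metadataextension).
    """
    extension = metadata_extension.lstrip(".")  # accept "fme" as well as ".fme"
    if is_folder:
        return f".{name}.{extension}"
    return f"{name}.{extension}"


print(metadata_file_name("report.pdf", "fme"))                 # file example from the text
print(metadata_file_name("Migration", "fme", is_folder=True))  # folder example from the text
```

With fme as the value of mcMetadataFileExtension, the two calls produce the names used in the examples above: report.pdf.fme and .Migration.fme.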
A sample metadata file’s XML structure is illustrated below. The sample content could belong to the report.pdf.fme file mentioned above. In this case the report.pdf file has 4 attributes, each attribute being defined as a name-value pair. There are five lines because one of the attributes is a multi-value attribute. Multi-value attributes are represented by repeating the attribute element with the same name, but different value attribute (i.e. the keywords attribute is listed twice, but with different values)
The number, name and values of attributes defined in such a file are not subject to any restrictions and can be chosen freely. The value of the name attribute will appear accordingly as a source attribute in migration-center.
Multi-value attributes can be defined by repeating the attribute element with the same name, but different value attribute.
Once the document and any additional metadata have been scanned, migration-center no longer differentiates between attributes originating from different sources. Attributes resulting from metadata files will appear alongside the regular attributes extracted from the file system properties but they are prefixed with “xml_“. The full transformation functionality is available for these attributes.
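As a rough illustration of how repeated attribute elements turn into multi-value source attributes with the “xml_” prefix, the following Python sketch parses a small sample. The root element name (metadata), the sample attribute names and the function itself are assumptions made for this example; only the repeated attribute element with a name and a value attribute is taken from the description above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List

# Hypothetical sample: the root element name and the attribute names are assumptions.
SAMPLE = """<metadata>
  <attribute name="author" value="Jane Doe"/>
  <attribute name="keywords" value="migration"/>
  <attribute name="keywords" value="report"/>
</metadata>"""


def read_metadata(xml_text: str, prefix: str = "xml_") -> Dict[str, List[str]]:
    """Collect name/value pairs; repeated names become multi-value lists."""
    root = ET.fromstring(xml_text)
    values = defaultdict(list)
    for element in root.iter("attribute"):
        values[prefix + element.get("name")].append(element.get("value"))
    return dict(values)


print(read_metadata(SAMPLE))
```

In this sketch the keywords attribute ends up with two values because it is listed twice, while author has a single value.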
In addition to metadata obtained from the standard file system properties and metadata added via external metadata files, the Filesystem Scanner can also extract metadata from supported document formats. This type of metadata is called extended metadata. The corresponding functionality can be toggled in the Filesystem Scanner via the “scanExtendedMetadata” parameter.
The scanner can parse the following file formats for metadata:
HyperText Markup Language
XML and derived formats
Microsoft Office document formats
OpenDocument Format
Portable Document Format
Electronic Publication Format
Rich Text Format
Compression and packaging formats
Text formats
Audio formats
Image formats
Video formats
Java class files and archives
The mbox format
Metadata extracted from these files will be added to the respective document’s metadata. Extended metadata source attributes will be prefixed with “ext_“ to indicate their source. Apart from their naming, these attributes are not handled differently by migration-center. Just as with attributes provided via metadata files, extended attributes will appear and work alongside the standard file system attributes. The full transformation functionality is available for these attributes.
Although a standard file system has no versioning features, the Filesystem Scanner can detect when source objects in the file system have changed and can version these upon import to Documentum. This behavior is controlled by the "scanChangedFilesBehaviour" parameter of the scanner, which can take the following values:
1 (default) - the changed file will be added as an update object, meaning that the existing object in migration-center will be updated (i.e. overwritten) with the new attributes of the modified object.
2 - the changed file will be added as a new version of the existing object. This means that a new version of the document will be created, its parent will be set to the previous version and the level in the version tree will be incremented by 1.
3 - the changed file will be added as a new object that is not related in any way to the previously existing object in migration-center. If the user does not change the object’s name in migration-center, the document is imported in the target repository with the same name and linked under the same folder as the original object.
A file is detected as changed if either its content or its metadata file has been modified since the previous scan.
A folder is detected as changed if its metadata file has been modified since the previous scan. In this case it is saved as an update.
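The three values of scanChangedFilesBehaviour can be pictured with the simplified Python model below. The ScannedObject class, its fields and the handle_changed_file function are hypothetical and do not reflect the actual migration-center data model; the sketch only mirrors the update, new version and new object semantics described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScannedObject:
    # Hypothetical, simplified stand-in for an object stored by the scanner.
    object_id: int
    name: str
    level: int = 0                   # position in the version tree
    parent_id: Optional[int] = None  # previous version, if any
    is_update: bool = False


def handle_changed_file(existing: ScannedObject, next_id: int, behaviour: int = 1) -> ScannedObject:
    """Model what happens to a file detected as changed since the previous scan."""
    if behaviour == 1:
        # Update: the existing object is overwritten with the new attributes.
        return ScannedObject(existing.object_id, existing.name,
                             existing.level, existing.parent_id, is_update=True)
    if behaviour == 2:
        # New version: parent is the previous version, level is incremented by 1.
        return ScannedObject(next_id, existing.name,
                             existing.level + 1, parent_id=existing.object_id)
    if behaviour == 3:
        # New object: unrelated to the previous object, same name unless changed later.
        return ScannedObject(next_id, existing.name)
    raise ValueError(f"unsupported scanChangedFilesBehaviour value: {behaviour}")


original = ScannedObject(object_id=10, name="report.pdf")
print(handle_changed_file(original, next_id=11, behaviour=2))
```

A changed folder would always follow the first branch, since folders are saved as updates.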
Starting with migration-center 3.2.3, the Filesystem Scanner can also generate versioning information from attribute values provided through additional metadata (e.g. via the fme metadata files).
The Filesystem Scanner offers two new parameters which can be set to the names of the source attributes containing the explicit versioning information the scanner should use.
The two parameters are versionIdentifierAttribute and versionLevelAttribute. They should be used together, and work as follows:
versionIdentifierAttribute specifies the name of the source attribute which identifies a version group/tree. Setting this parameter will activate the versioning based on metadata. Must be used together with versionLevelAttribute. The specified source attribute’s value must be the same for all objects that are part of a version group/tree. Any string value is permitted as long as it fulfills the previous requirement.
versionLevelAttribute specifies the name of the source attribute which identifies the order of objects within a group of versions. Must be used together with versionIdentifierAttribute. The specified source attribute’s values must be distinct for all objects within the same version group/tree, i.e. with the same versionIdentifierAttribute value. The specified source attribute’s values must be positive numbers. A decimal point followed by one or more digits is also permitted, as long as the value makes sense as a number.
Setting these parameters to attributes containing valid information will allow the Filesystem Scanner to build internal information describing how the scanned objects should be linked together to form versions, similar to how the scanners targeting dedicated DMS can extract the native version information from there. This information can then be understood and processed by migration-center importers which support versioning for their respective target systems.
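The rules for the two parameters can be sketched as follows in Python. This is not the scanner's implementation; the function name, the dictionary representation of scanned objects and the attribute names doc_id and version_no are assumptions for illustration. The sketch groups objects by the identifier value, checks that levels are positive and distinct within a group, and orders each group by level.

```python
from collections import defaultdict
from decimal import Decimal, InvalidOperation


def build_version_trees(objects, identifier_attr, level_attr):
    """Group objects into version trees and order each tree by its level value."""
    groups = defaultdict(list)
    for obj in objects:
        try:
            level = Decimal(str(obj[level_attr]))
        except InvalidOperation:
            raise ValueError(f"level is not a number: {obj[level_attr]!r}")
        if not level.is_finite() or level <= 0:
            raise ValueError(f"level must be a positive number: {obj[level_attr]!r}")
        groups[obj[identifier_attr]].append((level, obj))

    trees = {}
    for identifier, members in groups.items():
        levels = [level for level, _ in members]
        if len(set(levels)) != len(levels):
            raise ValueError(f"duplicate level within version group {identifier!r}")
        members.sort(key=lambda pair: pair[0])
        trees[identifier] = [obj for _, obj in members]
    return trees


scanned = [
    {"name": "b.pdf", "doc_id": "D1", "version_no": "2"},
    {"name": "a.pdf", "doc_id": "D1", "version_no": "1.5"},
    {"name": "c.pdf", "doc_id": "D2", "version_no": "1"},
]
for group, members in build_version_trees(scanned, "doc_id", "version_no").items():
    print(group, [m["name"] for m in members])
```

Using Decimal rather than float keeps values such as 1.10 and 1.1 numerically equal, so they are reported as duplicates instead of silently forming two versions.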
To create a new Filesystem Scanner job, specify the respective adapter type in the Scanner Properties window – from the list of available adapters, “Filesystem” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case the Filesystem adapter.
The Properties window of a scanner can be accessed by double-clicking a scanner in the list, or selecting the Properties button/menu item from the toolbar/context menu.
A detailed description is always displayed at the bottom of the window for the currently selected parameter.
The maximum length of a path for a file system object is now 512 bytes, up from 255 bytes in previous versions of migration-center. All maximum supported string lengths are specified in bytes. This equals the number of characters as long as all characters are single-byte characters (i.e. basic Latin characters). For multi-byte characters (as used by most languages and scripts other than basic Latin), the limit may correspond to fewer characters, depending on the number and byte length of the multi-byte characters within the string (as used in UTF-8 encoding).
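The difference between character count and byte length can be checked with a short Python sketch. It assumes the limit is measured on the UTF-8 encoding of the path, as the text suggests; the helper names and the sample paths are hypothetical.

```python
MAX_PATH_BYTES = 512  # limit stated above


def path_length_in_bytes(path: str) -> int:
    """Length of the path when encoded as UTF-8 (assumed encoding)."""
    return len(path.encode("utf-8"))


def fits_path_limit(path: str, limit: int = MAX_PATH_BYTES) -> bool:
    return path_length_in_bytes(path) <= limit


for sample in ("report.pdf", "Übersicht.pdf"):
    print(sample, len(sample), "characters,", path_length_in_bytes(sample), "bytes")
```

The second sample has 13 characters but occupies 14 bytes, because the umlaut is encoded as two bytes in UTF-8.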
A complete history is available for any Filesystem Scanner job from the respective items’ History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the Filesystem Adapter can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The SharePoint Online Scanner allows extracting documents, folders and their related information from Microsoft SharePoint Online libraries.
SharePoint Online Scanners can be created, configured, started and monitored through migration-center Client, while the corresponding processes are executed by migration-center Job Server and the migration-center SharePoint Scanner respectively.
Scanner is the term used in migration-center for an input adapter. Using a scanner such as the SharePoint Scanner to extract data that needs processing in migration-center is the first step in a migration project, thus scan also refers to the process used to input data to migration-center.
Scanners and importers work as jobs that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created. Multiple scanner and import jobs can be created or run at a time, each being defined by a unique name, a set of configuration parameters and a description (optional).
To install the main product components, consult the migration-center Installation Guide document.
The migration-center SharePoint Online scanner requires installing an additional component besides the main product components.
This additional component requires .NET Framework 4.7.2, is designed to run as a Windows service, and must be installed on all machines where a Job Server is installed.
To install this additional component, run the installation file located within the SharePoint folder of your Job Server install location, which is by default C:\Program Files (x86)\fme AG\migration-center Server Components <Version>\lib\mc-sharepointonline-scanner\CSOM_Service\install.
To install the service, run the install.bat file using administrative privileges. You will need to start it manually the first time; afterwards the service is configured to start automatically at system startup.
To uninstall the service run the uninstall.bat file using administrative privileges.
The CSOM service must run under the same user account as the Job Server service so that it has the same access to the export location.
To create a new SharePoint Online Scanner, create a new scanner and select SharePoint Online from the Adapter Type drop-down. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type. Mandatory parameters are marked with an *.
The Properties of an existing scanner can be accessed after creating the scanner by double-clicking the scanner in the list or by selecting the Properties button/menu item from the toolbar/context menu. A description is always displayed at the bottom of the window for the selected parameter.
Multiple scanners can be created for scanning different locations, provided each scanner has a unique name.
The configuration parameters available for the SharePoint Scanner are described below:
The SharePoint Online scanner can use SharePoint CAML queries for filtering which objects are to be scanned. Based on the entered query, the scanner scans documents and folders in the lists/libraries.
For details on how to form CAML queries for each version of SharePoint please consult the official Microsoft MSDN documentation.
When using the CAML query parameter “query”, the parameters "excludeListAndLibraries", "includeListAndLibraries", "scanSubsites", "excludeSubsites", "excludeFolders", "includeFolders" must not be set. Otherwise the scanner will fail to start with an error message.
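The restriction above can be expressed as a small pre-flight check in Python. The scanner performs its own validation; this sketch only illustrates the rule. The function name, the dictionary representation of the parameters and the sample CAML fragment are assumptions, and treating empty strings, False and missing keys as "not set" is an assumption as well.

```python
# Parameters that must not be set together with "query" (taken from the text).
EXCLUSIVE_WITH_QUERY = (
    "excludeListAndLibraries",
    "includeListAndLibraries",
    "scanSubsites",
    "excludeSubsites",
    "excludeFolders",
    "includeFolders",
)


def conflicting_parameters(params: dict) -> list:
    """Return the names of parameters that conflict with a configured CAML query."""
    if not params.get("query"):
        return []
    return [name for name in EXCLUSIVE_WITH_QUERY if params.get(name)]


# Illustrative CAML fragment only; the exact form expected by the scanner is not shown here.
config = {
    "query": "<Where><Eq><FieldRef Name='Title'/><Value Type='Text'>Report</Value></Eq></Where>",
    "scanSubsites": True,
    "includeFolders": "",
}
print(conflicting_parameters(config))
```

For the sample configuration only scanSubsites is reported, since includeFolders is empty.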
The SharePoint Online scanner can extract permission information for documents and folders. Note that only unique permissions are extracted. Permissions inherited from parent objects are not extracted by the scanner.
There is a configuration file for additional settings regarding the SharePoint Online Scanner. Located under the …/lib/mc-sharepointonline-scanner/ folder in the Job Server install location it has the following properties that can be set:
A complete history is available for any SharePoint Scanner job from the respective item’s History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the SharePoint Scanner can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
An additional log file is generated by the SharePoint Online Scanner.
This log file is located in the same folder as the regular SharePoint Online scanner log files and is named mc-sharepointonline-scanner.log.
Scanner is the term used in migration-center for an input adapter. Using the IBM Domino scanner module to read the data that needs processing into migration-center is the first step in a migration project, thus “scan” also refers to the process used to input data to migration-center.
The IBM Domino Scanner is available since migration-center 3.2.5. It extracts documents, metadata and attachments from IBM Domino/Notes applications and uses them as input for migration-center. After the scan, the generated data can be processed and migrated to other systems supported by the various migration-center importers.
The currently supported export formats for documents are Domino XML (dxl), Hypertext Markup Language (html), ARPA Internet Text Message (rfc 822/eml) and HTML from the EML. In addition, the scanner is capable of generating a Portable Document Format (pdf) rendition based on a DXL file of that document.
The IBM Domino Scanner currently supports all IBM Notes/Domino versions 6.x and above. Documents from applications that have been built with older IBM Notes/Domino versions can be extracted without any limitation.
The module works as a job that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created.
A Scanner is defined by a unique name, a set of configuration parameters and an optional description.
IBM Domino scanners can be created, configured, started and monitored through migration-center client, but the corresponding processes are executed by migration-center Job Server.
To be able to use the IBM Domino scanner additional software must be installed on the migration-center job server that executes IBM Domino scanner operations.
The scanner is available in a 64-bit and 32-bit version, each of which has different requirements.
The 32-bit version of the scanner relies on IBM Notes whereas the 64-bit version of the scanner uses IBM Domino. Because IBM Domino lacks some libraries used by the scanner to generate specific document formats (e.g. “EML”, “HTML” and “RTF”), the 64-bit version of the scanner currently cannot generate any formats other than “DXL” and “PDF”.
The 32-bit scanner requires:
Microsoft Windows based 32-bit/64-bit Operating System.
Java Runtime Environment (JRE) 1.8.x (32-bit)
IBM Notes 9.0.1 or later
IBM Notes 9.0.1 or later must be installed on the migration-center job server. Please refer to chapter “9. IBM Notes/Domino installation and configuration” for detailed instructions about installing and configuring IBM Notes.
Microsoft Visual C++ 2017 Redistributable Package (x86). The package can be downloaded from: https://aka.ms/vs/16/release/vc_redist.x86.exe
The 64-bit scanner requires:
Microsoft Windows based 64-bit Operating System.
Java Runtime Environment (JRE) 1.8.x (64-bit)
IBM Domino 9.0.1 or later. IBM Domino version 9.0.1 must be installed on the migration-center job server. The community version may be used. Please refer to chapter “9. IBM Notes/Domino installation and configuration” for detailed instructions about installing and configuring IBM Domino.
Microsoft Visual C++ 2017 Redistributable Package (x64). The package can be downloaded from: https://aka.ms/vs/16/release/vc_redist.x64.exe
If the scanner is to be used for Domino documents containing Object Linking and Embedding (OLE) objects, Apache OpenOffice 4.1.5 or later must be installed. Please refer to the section “Exporting OLE objects” for details.
If documents extracted from IBM Domino/Notes applications should be transformed into PDF (PDF, PDF/a-1a, PDF/a-1b) by the scanner, a second system, a “rendition server” is required. The rendition server must have the optional PDF Generation Module installed. For details about setting up a rendition server based on PDF Generation Module refer to the PDF Generation Module manual.
IBM Domino stores all date and time information based on GMT/UTC internally.
If a date and time value is converted into a text value for display purposes in an IBM Domino API based software solution, the date and time value is always displayed using the client’s current timezone settings.
As the scanner is an IBM Domino API based software product, the migration-center job server’s timezone setting will be used to extract all date and time values from IBM Domino documents, i.e. they will be available in the migration-center database and always be related to the migration-center Job Server’s timezone.
If you require date and time values to be shown based on a specific timezone inside the migration-center database, set migration-center Job Server’s timezone accordingly.
If you require “normalized” date and time values in migration-center, set the migration-center Job Server’s timezone to GMT/UTC.
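The effect of the Job Server's timezone on extracted values can be illustrated with plain Python. The sketch does not use any Domino API; it only shows how one instant stored in GMT/UTC appears as different text values depending on the timezone used for conversion. The sample instant and the fixed offsets are arbitrary assumptions.

```python
from datetime import datetime, timedelta, timezone

# One instant as stored internally (GMT/UTC); the value itself is an arbitrary example.
stored = datetime(2020, 3, 1, 23, 30, tzinfo=timezone.utc)


def render(instant: datetime, utc_offset_hours: int) -> str:
    """Convert the instant to a fixed-offset timezone and format it as text."""
    local = instant.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.strftime("%Y-%m-%d %H:%M")


print("Job Server on GMT/UTC:", render(stored, 0))
print("Job Server on UTC+1:  ", render(stored, 1))
print("Job Server on UTC-5:  ", render(stored, -5))
```

The same stored instant yields 2020-03-01 23:30, 2020-03-02 00:30 and 2020-03-01 18:30 respectively, which is why setting the Job Server to GMT/UTC gives “normalized” values.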
If the installer is run separately from the migration-center installer, it must be run with administrative privileges. If it’s run as a normal user, the installer cannot update configuration files and set environment variables as required.
After a scan has completed, the newly scanned documents along with their metadata, attachments and the content of the richtext fields they contain are available for further processing in migration-center.
To create a new IBM Domino Scanner job, specify the respective adapter type in the scanner properties window – from the list of available adapters, “Domino” must be selected. Once the adapter type has been selected, the list of parameters will be populated with the parameters specific to the selected adapter type.
The properties window of a scanner can be accessed by double-clicking a scanner in the list or by selecting the [Properties] button for the corresponding selected entry on the toolbar or context menu.
The IBM Domino Scanner for fme migration-center supports the generation of different output formats for a Domino document. Each of the formats has its advantages and disadvantages. Which one best suits your needs can be determined by closely looking at your requirements, i.e. how users should work with the documents once migration into the new target system has been completed.
The formats currently supported will be described in detail in the following sections.
Creating the MSG and EML2HTML formats requires an additional license.
The Domino XML (DXL) format is an XML format that has been defined by IBM/Lotus. It has been around for a while (at least since Domino version 6). A DXL file represents an entire Domino document including all its metadata, richtext elements and attachments.
The generation of DXL files from Domino documents relies on core functionality of Domino’s C-API as provided by IBM.
DXL files can be used to extract any document information from Domino applications. Based on special helper applications that are not part of Domino/Notes, a DXL file can be re-imported back into the original Domino application in order to read its content or otherwise work with the document at a later point in time.
DXL is especially useful whenever Domino documents should be transformed into PDF. The “PDF Generation Module” which is available as an add-on product for the IBM Domino Scanner makes use of the DXL format for PDF generation.
The ARPA Internet Text Message format (RFC 822) describes the syntax for messages that are sent among computer users (e-mail). The EML file format adheres to RFC 822.
Any Domino document – not only e-mails – can be transformed into EML format based on core functionality of Domino’s C-API as provided by IBM. An EML file contains the document’s content, its metadata as well as its attachments.
The EML format does not guarantee preservation of the Domino document’s integrity. Information from the document may be lost or changed during conversion into EML (see Domino C-API documentation).
The major benefit of EML is that – since version 8 of Notes – an EML file can be opened in Notes again without the need for special helper applications.
Hypertext Markup Language (HTML) files can be generated for Domino documents based on two different approaches both of which will now be described.
Hypertext Markup Language (HTML) – direct approach
The Domino C-API offers the ability to directly transform a Domino document into an HTML file.
As with the EML file format, the direct HTML generation based on the Domino C-API has some issues regarding completeness of information. For example, images that have been embedded into richtext fields will not be visible in the generated HTML file.
EML to Hypertext Markup Language (EML2HTML) – indirect approach
Besides the direct approach described in the previous section, HTML can also be created from the EML format.
In most scenarios that the Domino scanner has been tested on, the result of the indirect approach had a much higher quality than that of the direct approach.
Generating EML2HTML requires a third-party library that needs to be purchased separately. Please contact your fme sales representative for details.
The MSG format is the format that is used by Microsoft Outlook to store e-mails on the filesystem. It’s a container format that includes the e-mail and all its attachments.
Generating MSG requires a third-party library that needs to be purchased separately. Please contact your fme sales representative for details.
The Domino scanner can extract the entire Domino document (not just the document’s richtext fields) as a single RTF file. This functionality is provided by the Domino C-API.
All the PDF formats preserve the Domino document in a read-only form that looks like the document opened in Notes.
The PDF generation module takes care of collapsible sections, fixed-width images and tables and other Domino specific features that PDF printing might interfere with.
If required, all the Domino document’s attachments can be re-attached to the PDF file that was generated (see parameter “embedAttachmentsIntoPDF”). Thereby, the entire e-mail will be preserved in a read-only format that can be viewed anywhere at any time requiring a standard PDF reader only.
If the IBM Domino documents contain OLE embedded objects, Apache OpenOffice 4.1.5 or later must be installed and configured on the migration-center job server in order to properly extract the OLE objects.
Install Apache OpenOffice 4.1.5 on the migration-center job server.
Add the folder containing the “soffice.exe” file to the system’s search path. This folder is typically:
<Apache OpenOffice installation folder>/program
Add the following entry to the file “wrapper.conf” inside the migration-center server components installation folder and replace <#> with an appropriate value for your installation:
wrapper.java.classpath.<#>=<Apache OpenOffice installation folder>/program/classes/*.jar
Open the configuration file “documentDirectoryRuntimeConfiguration.xml”, located in the subfolder “lib/mc-domino-scanner/conf” of your migration-center server components’ installation folder, in your favorite editor for XML files.
Go to line 83 of the file which looks like:
<parameter name="exportOLEObjects">false</parameter>
and replace “false” with “true”.
The entry inside the configuration file should look like:
<parameter name="exportOLEObjects">true</parameter>
If you want to use a different port for the Apache OpenOffice server than the default port (8100), go to line 84 of the file:
<!--<parameter name="apacheOpenOfficePort">8100</parameter>-->
Uncomment it and replace “8100” with the port number to use, e.g. “1234”.
The entry inside the configuration file should look like:
<parameter name="apacheOpenOfficePort">1234</parameter>
Save the configuration file.
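For the wrapper.conf step above, one plausible way to pick a value for <#> is to use the next unused classpath index. The Python sketch below shows that idea only; whether the next free index is the appropriate value for a given installation is an assumption, and the sample file content and the OpenOffice path are hypothetical.

```python
import re

# Matches active (uncommented) entries such as "wrapper.java.classpath.3=...".
_CLASSPATH_KEY = re.compile(r"^\s*wrapper\.java\.classpath\.(\d+)\s*=")


def next_classpath_index(wrapper_conf_text: str) -> int:
    """Return the highest classpath index found in the text plus one."""
    indices = []
    for line in wrapper_conf_text.splitlines():
        match = _CLASSPATH_KEY.match(line)
        if match:
            indices.append(int(match.group(1)))
    return max(indices, default=0) + 1


# Hypothetical excerpt of a wrapper.conf file.
SAMPLE_CONF = """wrapper.java.classpath.1=./lib/wrapper.jar
wrapper.java.classpath.2=./lib/*.jar
#wrapper.java.classpath.9=./unused
"""

office_dir = "C:/OpenOffice"  # hypothetical installation folder
index = next_classpath_index(SAMPLE_CONF)
new_entry = f"wrapper.java.classpath.{index}={office_dir}/program/classes/*.jar"
print(new_entry)
```

Commented-out entries are ignored, so the sample yields index 3 rather than 10.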
While PDF generation can be activated in the scanner’s configuration (parameters “primaryDocumentFormat”, “secondaryDocumentFormats” and “embedAttachmentsIntoPDF”), the setup of PDF generation requires the additional “PDF Generation Module”.
The “PDF Generation Module” is licensed separately.
From a technical perspective, the “PDF Generation Module” requires an additional system (“rendition server”). This system will be used to print any IBM Notes document using a PDF printer driver based on IBM Notes’ standard print functionality. The process for PDF generation is as follows:
The scanner submits a request to create a PDF rendition for an existing Domino document or a DXL file to PDF Generation Module on the rendition server.
PDF Generation Module creates a PDF rendition of the document.
If PDF generation was successful, PDF Generation Module will save the PDF to a shared network drive.
PDF Generation Module will signal success or failure to the scanner.
Setting up the rendition server requires additional configuration steps. For each IBM Domino application/database template that was used to create documents, an empty database needs to be created based on this template and made available either locally on the rendition server or on the IBM Domino server.
Each of these empty databases needs to be prepared for PDF printing. As necessary configuration steps vary depending on the application that is being worked on, they cannot be described here.
Please contact your fme representative should you wish to implement PDF generation for migration of an IBM Domino application/database.
A complete history is available for any IBM Domino Scanner job from the respective item’s history window. It is accessible through the [History] button/menu entry on the toolbar/context menu.
The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the [Open] button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
version information of the migration-center Server Components the job was run with
the parameters the job was run with
the execution summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime
Log files generated by the IBM Domino Scanner can be found in the server components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The following issues with the MC Domino Scanner are known to exist and will be fixed in later releases:
The scanner requires that the temporary directory for the user running MC Job Server Service exists and that the user can write to this directory. If the directory either does not exist or the user does not have write permission to the directory, the creation of temporary files during document and attachment extraction will fail. The log file will show error messages like
“INFO | jvm 1 | 2014/10/02 12:06:26 | 12:06:26,850 ERROR [Job 1351] com.think_e_solutions.application.documentdirectory… - java.io.IOException: The system cannot find the path specified”.
To work around this issue, make sure the temporary folder exists and the user has write permission for this folder. If the MC Job Server is started manually as a normal user, the Temp folder should be C:\Users\Username\AppData\Local\Temp. If the MC Job Server is run as a service by the Local System account, the folder is one of the following:
For the 32-bit version of Windows:
C:\Windows\System32\config\systemprofile\AppData\Local\Temp
For the 64-bit version of Windows:
C:\Windows\SysWOW64\config\systemprofile\AppData\Local\Temp
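A quick way to verify the workaround is to try creating a file in the temporary folder. The sketch below is an illustration in Python and not part of migration-center; the function name temp_dir_is_usable is hypothetical. It only gives a meaningful answer when run under the same user account that runs the MC Job Server, because the temporary folder and its permissions are per user.

```python
import os
import tempfile

def temp_dir_is_usable(path=None):
    """Return True if the directory exists and a file can be created in it.

    If no path is given, the current user's temporary directory is checked.
    """
    path = path or tempfile.gettempdir()
    if not os.path.isdir(path):
        return False
    try:
        # Create (and automatically delete) a real file instead of only
        # inspecting permission flags.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError:
        return False
    return True

print(tempfile.gettempdir(), temp_dir_is_usable())
```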
If a document is exported from IBM Domino but the related entries in the mc database cannot be created (e.g. because an attribute’s value exceeds the maximum number of characters allowed for a field in the mc database), the related files can be found in the filesystem (inside the export directory). If this document is scanned again, it will be treated as a new document, not as an update.
If the scanner parameter “relationType” is set to “relation”, relations will be automatically deleted by migration-center if they do not exist anymore. If the scanner parameter “relationType” is set to “object”, objects representing relationships cannot be deleted if the relation is invalidated.
Example: If a document had one attachment when scanned in scanner run #1 and that attachment was removed from the document before scanner run #2, the scanner cannot remove the object representing the “attachment” relation between document and attachment (created in scanner run #1) in scanner run #2.
If a PDF rendition is requested and DXLUtility receives the request to generate the rendition but isn’t able to import the DXL file into the appropriate IBM Domino database on the rendition server, it’s likely that the shared folder used to transfer DXL and PDF files between the scanner and PDF Generation Module cannot be read by the user running PDF Generation Module on the rendition server.
The scanner will crash the Java VM if the parameter “exportCompositeItems” is set to “true” and the log level in log4j.xml (located in subdirectory “conf” of the scanner installation directory) is set to “ERROR”.
The 64-bit version of the scanner relies on IBM Domino. As Domino lacks the required libraries to export “EML”, “HTML” or “RTF”, the 64-bit version of the scanner cannot export documents in any other format than “DXL” or “PDF”. If other formats are required, the scanner’s 32-bit version needs to be run based on IBM Notes instead.
The following table lists all (relevant) Domino attribute types.
The scanner parameter “excludedAttributeTypes” is a logical “OR” of all types that should be excluded from the scan.
The Database scanner can access SQL compliant databases and extract information via user specified SQL SELECT statements.
Note that the Database Scanner can extract content from a database or use the content files which are specified by the user during the transformation process via the mc_content_location system attribute.
Scanner is the term used in migration-center for an input adapter. Using a scanner module to read the data that needs processing into migration-center is the first step in a migration project, thus scan also refers to the process used to input data to migration-center.
The scanner module works as a job that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created.
A scanner is defined by a unique name, a set of configuration parameters and an optional description.
Database Scanners can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed by migration-center Job Server.
The Database scanner can access SQL compliant databases and extract information via user specified SQL SELECT statements. Since many content management systems rely on some type of SQL compliant database to manage their data, the Database scanner can also be used as a generic interface for extracting information from unsupported/obsolete/custom built content management systems. The types of database management systems supported are not limited to any particular vendor or brand, as access happens via JDBC, hence any SQL compliant database having a (compatible) JDBC adapter available can be accessed and queried.
Some common content management system features supported by the migration-center Database scanner are metadata, including system metadata such as permission information, owner, creator, content path, etc., as well as version information, as long as these types of information can be obtained from the respective system’s SQL database. The information extracted by the database scanner is stored in the migration-center database and can be processed, transformed, validated and imported just like any other type of scanned information.
Note that the Database Scanner can extract content from a database stored in BLOB/CLOB fields. Alternatively, the content files corresponding to the objects can be specified by the user during the transformation process via the mc_content_location system attribute.
Depending on the way the content is stored, it may be necessary to extract the content to the filesystem first by other means before migration-center can process it. For the mc_content_location system attribute any of the available transformation functions can be used, so it is easy to generate a value resembling a path pointing to the location of the object’s content file, which a migration-center importer can use to import the content. A good practice is to export content files to the filesystem using the object’s unique identifier as the filename, and then build the path information based on the known path and the object’s unique identifier. This location needs to be accessible to the Job Server running the import which will migrate the content to the new target system.
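The naming convention described above can be illustrated with a few lines of Python. This is only a sketch of the idea: in migration-center the value is built with transformation functions, not with Python, and the share name, identifiers and the function name content_location are hypothetical.

```python
from pathlib import PureWindowsPath

def content_location(export_root, object_id, extension=""):
    """Build the path of an exported content file from the known export
    folder and the object's unique identifier."""
    filename = str(object_id) + ("." + extension if extension else "")
    return str(PureWindowsPath(export_root) / filename)

# Hypothetical export share and object identifier.
print(content_location(r"\\fileserver\export", "4711", "pdf"))
```

Because the filename is derived from the unique identifier, two objects can never point to the same content file, and the path can be reconstructed during transformation without a lookup table.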
Before using the database scanner, the appropriate JDBC driver needs to be installed and configured for the Job Server. To configure a JDBC driver, all required Java jar files and Java library paths need to be added to the "...\migration-center Server Components\jdbc.conf" file. The file can be edited with any text editor and contains all information needed for configuration, as each setting is described in the file itself.
The JDBC driver for Oracle 11g is delivered with the Database scanner already preconfigured and ready to use.
To create a new database Scanner job, specify the respective adapter type in the Scanner Properties window – from the list of available adapters, “Database” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case the Database Scanner.
The Properties window of a scanner can be accessed by double-clicking a scanner in the list or selecting the Properties button or entry from the toolbar or context menu.
The “queryFile” is an xml file containing a series of SQL queries that will be run sequentially by the scanner for extracting information from the database.
At least two queries must be configured in this file: the first selects the unique identifiers of the rows/objects that need to be exported, and the second extracts the metadata for each row/object returned by the first query.
If objects have multiple versions and the related information also needs to be extracted (i.e. a row/object is a version of another row/object), some additional queries are required.
The number of queries for extracting metadata for objects or versions is not limited. Multiple queries can be specified to gather and extract various information related to objects.
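The two mandatory queries follow a common pattern: first collect the identifiers, then fetch the metadata for each identifier. The sketch below shows this pattern in Python with an in-memory SQLite database. It is an illustration under assumptions only: the scanner itself accesses the database through JDBC and reads its queries from the query file, and the table docs, its columns and the function scan are hypothetical.

```python
import sqlite3

# Hypothetical source table standing in for the database to be scanned.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT, owner TEXT);
    INSERT INTO docs VALUES (1, 'Contract', 'alice'), (2, 'Invoice', 'bob');
""")

def scan(conn):
    """Run the ID query once, then the metadata query once per returned ID."""
    results = {}
    # Query 1: unique identifiers of the rows/objects to export.
    ids = [row[0] for row in conn.execute("SELECT id FROM docs ORDER BY id")]
    # Query 2: metadata for each row/object returned by query 1.
    for obj_id in ids:
        cur = conn.execute("SELECT title, owner FROM docs WHERE id = ?", (obj_id,))
        columns = [col[0] for col in cur.description]
        results[obj_id] = dict(zip(columns, cur.fetchone()))
    return results

print(scan(conn))
```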
Example 1: a simple query file
The xml file is composed of <query> elements that contain the queries that will be run by the scanner for extracting objects and metadata. The <query> element may have the following attributes (some of them mandatory):
Example 2: a more complex query file
The scanner will validate the queries configuration file against the following xml schema:
If the parameter “scanUpdates” is not checked, the scanner will only scan new objects or new versions of existing objects and it will ignore all objects that were previously scanned, i.e. that already exist in the MC database (based on their ID in source system).
If the parameter “scanUpdates” is checked, the scanner will scan new objects or new versions of existing objects and it will detect whether existing objects should be scanned as updates. Whether an object is scanned as an update depends on the following:
If there are no attributes specified in the “deltaFields” parameter, the scanner will scan every object that already exists in the MC database as an update.
If one or multiple attributes are specified in the “deltaFields” parameter, the scanner will scan an object that was previously scanned as an update only if a value of a delta field in the source database is different than the corresponding value in the MC database. If all values (of all fields defined in “deltaFields”) in the source database match the values in the MC database, then the object will not be scanned as an update, it will just be ignored.
The field names in the “deltaFields” parameter are case sensitive, so define them exactly as they are scanned by the scanner.
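The rules for “scanUpdates” and “deltaFields” can be summarized as a small decision function. The following Python sketch restates the rules described above and is not the scanner's actual code; the function name, the dictionaries and the attribute names are hypothetical, and the handling of new versions of existing objects is left out for brevity.

```python
def scan_decision(source_obj, stored_obj, scan_updates, delta_fields):
    """Return 'new', 'update' or 'ignore' for one source object.

    stored_obj is None when the object's source ID is not yet in the MC database.
    """
    if stored_obj is None:
        return "new"        # never scanned before
    if not scan_updates:
        return "ignore"     # previously scanned objects are skipped
    if not delta_fields:
        return "update"     # no delta fields: every existing object is an update
    # Field names are compared exactly as given (case sensitive).
    changed = any(source_obj.get(f) != stored_obj.get(f) for f in delta_fields)
    return "update" if changed else "ignore"

stored = {"title": "Contract", "modified": "2020-01-01"}
source = {"title": "Contract", "modified": "2020-02-15"}
print(scan_decision(source, stored, scan_updates=True, delta_fields=["modified"]))
```

In this example the object is scanned as an update because “modified” differs; with delta_fields=["title"] it would be ignored.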
Since version 3.2.8 Update 2 of migration-center the database adapter supports extracting BLOB or CLOB content from databases. In order for this feature to work the main-content or version-content query type must be specified in the query file.
The structure of the queries is as follows:
The column for BLOB/CLOB is mandatory and it must have the alias BLOB_CONTENT or CLOB_CONTENT so the scanner is able to process the value correctly and extract the content.
The columns [nameOfFile] and [nameOfExtenstion] are optional. If they are set, the scanner will use the values to build the name of the file to which the content will be exported. The scanner avoids overwriting existing files by adapting the file name to be unique in the export folder. Also, characters that are not allowed in file names will be replaced. Filename and extension are also saved as source attributes so they can be used in the transformation engine. To avoid name conflicts with other attributes, the user can set aliases for these columns. If the columns [nameOfFile] and [nameOfExtenstion] are missing, the scanner will use the value of the id in the source system as the filename.
Multiple contents for a single object are allowed. If the query returns multiple rows the content returned by every row is exported. All the paths will be stored in the source attribute “mc_content_location” and the path from the last returned row is set as primary content in the column “content_location”.
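The file naming and content location behavior described above could be sketched as follows. This is an illustration in Python under assumptions: the exact replacement character and the way the scanner makes names unique are not documented here, so the underscore and the numeric suffix are hypothetical choices, as are the function names.

```python
import re

# Characters that Windows does not allow in file names.
INVALID_CHARS = r'[<>:"/\\|?*]'

def export_name(name, extension, existing):
    """Return a sanitized file name that does not collide with names in `existing`."""
    base = re.sub(INVALID_CHARS, "_", name)  # assumption: replace with underscore
    candidate = f"{base}.{extension}" if extension else base
    counter = 1
    while candidate in existing:
        # Assumption: a numeric suffix makes the name unique.
        suffixed = f"{base}_{counter}"
        candidate = f"{suffixed}.{extension}" if extension else suffixed
        counter += 1
    existing.add(candidate)
    return candidate

def content_attributes(paths):
    """All exported paths go to mc_content_location; the last row is the primary content."""
    return {"mc_content_location": list(paths), "content_location": paths[-1]}

used = set()
first = export_name("report: Q1", "pdf", used)
second = export_name("report: Q1", "pdf", used)
print(first, second, content_attributes([first, second]))
```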
Here is an example query file:
Although it is still possible to use the database scanner to scan Excel files, we recommend using our CSV & Excel scanner instead. The CSV & Excel scanner is faster and no tweaks are necessary to get the JDBC-ODBC bridge working in Java versions later than 1.7.
To scan Excel files, MS Office or at least the Excel ODBC drivers need to be installed on the Job Server machine. One of the easiest ways to install the Excel ODBC driver is to download and install the 32-bit version of the Microsoft Access Database Engine 2010 Redistributable, which includes the necessary drivers and can be downloaded from Microsoft’s website.
The connectionURL should have the following format:
jdbc:odbc:DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=PATH-TO-EXCEL-FILE;ReadOnly=1
Note that all extensions (*.xls, *.xlsx, *.xlsm, *.xlsb) are required.
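Since the driver name must be written exactly, including all four extensions, it can help to generate the connection URL instead of typing it. The following Python sketch only assembles the string in the format given above; the function name and the sample file path are hypothetical.

```python
# Driver name exactly as required, including all four extensions.
EXCEL_DRIVER = "Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)"

def excel_connection_url(excel_path):
    """Build the connectionURL for an Excel file in the documented format."""
    return "jdbc:odbc:DRIVER={%s};DBQ=%s;ReadOnly=1" % (EXCEL_DRIVER, excel_path)

# Hypothetical path to the Excel file on the Job Server machine.
print(excel_connection_url(r"C:\data\objects.xlsx"))
```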
Here is an example XML Query File:
When a column in the sheet contains mixed data (numbers and strings) the ODBC driver uses the first few rows to detect the column type. If the first values are numbers, the ODBC driver tries to handle all the subsequent values as numbers as well and therefore the values that are strings (not numbers) will be ignored, resulting in a “null” value being scanned by migration-center.
To avoid this problem, the parameter “IMEX=1” must be set in the metadata query:
A complete history is available for any Database Scanner job from the respective item’s History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the Database Scanner can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The SharePoint Scanner allows extracting documents, list items, folders, list/libraries and their related information from Microsoft SharePoint sites.
Supports Microsoft SharePoint 2007/2010/2013/2016 documents, list items, folders, list/libraries
Extract content (documents) and metadata
Extract versions
Exclude specified content types
Exclude specified file types
Exclude specified columns (attributes)
Calculate checksum during scan to be used later for validating the imported content (in combination with importers supporting this feature)
The SharePoint Scanner is implemented mainly as SharePoint Solution running on the SharePoint Server, with the Job Server part managing communication between migration-center and the SharePoint component.
SharePoint Scanners can be created, configured, started and monitored through migration-center Client, while the corresponding processes are executed by migration-center Job Server and the migration-center SharePoint Scanner respectively.
Scanner is the term used in migration-center for an input adapter. Using a scanner such as the SharePoint Scanner to extract data that needs processing in migration-center is the first step in a migration project, thus scan also refers to the process used to input data to migration-center.
Scanners and importers work as jobs that can be run at any time, and can even be executed repeatedly. For every run a detailed history and log file are created. Multiple scanner and import jobs can be created or run at a time, each being defined by a unique name, a set of configuration parameters and a description (optional).
The migration-center SharePoint Scanner requires installing an additional, separate component from the main product components. The migration-center SharePoint Scanner is a SharePoint Solution which manages the scan (extraction) process from Microsoft SharePoint Server. This component will need to be installed and deployed manually on the machine hosting the Microsoft SharePoint Server. The required steps are detailed in this chapter.
To install the main product components consult the migration-center Installation Guide document.
To install the migration-center SharePoint Scanner, read on.
The migration-center SharePoint Scanner is implemented as a SharePoint Solution, a functionality supported only with Microsoft SharePoint Server 2007 or newer.
Since the migration-center SharePoint Scanner Solution must be installed on the same machine as Microsoft SharePoint Server, the range of Windows operating systems supported is the same as those supported by Microsoft SharePoint Server 2007-2016 respectively. Please consult the documentation for Microsoft SharePoint Server 2007-2016 for more information regarding supported operating systems and system requirements.
Administrative rights are required for performing the required uninstallation, installation and deployment procedures described in this chapter.
Connect to the SharePoint Server (log in directly or via Remote Desktop); in a farm, any server should be suited for this purpose.
Copy the McScanner.wsp file from <migration-center Server Components installation folder>/mc-sharepoint-scanner/Realease_SPVersion to a location on the SharePoint Server
Open an administrative Command Prompt
Navigate to C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\<Hive Folder>\BIN
Use the STSADM tool to install the SharePoint Solution part of the SharePoint Scanner:
STSADM -o addsolution -filename <path to the file copied at step 2>\McScanner.wsp
For SharePoint 2010, 2013 and 2016 an alternative installation using PowerShell is possible and can be used if preferred:
Connect to the SharePoint Server (log in directly or via Remote Desktop); in a farm, any server should be suited for this purpose.
Copy the McScanner.wsp file from <migration-center Server Components installation folder>/mc-sharepoint-scanner to a location on the SharePoint Server
Open the SharePoint Management Shell from the Start menu
Use the following PowerShell command to install the SharePoint Solution part of the SharePoint Scanner:
Add-SPSolution <path to the file copied at step 2>\McScanner.wsp
The output should look similar to the following:
Having installed the SharePoint Solution it is now time to deploy it. Due to differences in the various SharePoint versions’ management interfaces, the procedure differs slightly depending on the version used. Follow the steps below corresponding to the targeted SharePoint version:
SharePoint 2007:
Open SharePoint Central Administration
Go to Operations
Under Global Configuration, click Solution Management
Click McScanner.wsp and follow instructions to deploy the solution.
SharePoint 2010, 2013 and 2016:
Open SharePoint Central Administration
Go to System Settings
Under Farm Management, click Manage Farm Solutions
Click McScanner.wsp and follow the instructions to deploy the solution
Verify the solution works correctly after deployment by calling the following URL in a web browser:
http://<your sharepoint farm>/_vti_bin/McScanner.asmx?wsdl
If the output looks like the picture below, deployment was successful and the SharePoint Scanner is working.
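The same check can be automated once the response has been fetched. The Python sketch below builds the URL in the format shown above and tests whether a response body is a WSDL document. It assumes the service returns a WSDL 1.1 document, whose root element is <definitions> in the namespace http://schemas.xmlsoap.org/wsdl/; the function names and the host name are hypothetical. Fetching the page itself is left out, since SharePoint sites usually require authentication.

```python
import xml.etree.ElementTree as ET

# Root element of a WSDL 1.1 document (assumption: the service returns WSDL 1.1).
WSDL_ROOT = "{http://schemas.xmlsoap.org/wsdl/}definitions"

def wsdl_url(farm_host):
    """Build the verification URL for a SharePoint farm host name."""
    return f"http://{farm_host}/_vti_bin/McScanner.asmx?wsdl"

def looks_like_wsdl(body):
    """Return True if body parses as XML with a WSDL <definitions> root element."""
    try:
        return ET.fromstring(body).tag == WSDL_ROOT
    except ET.ParseError:
        return False

# Hypothetical host name.
print(wsdl_url("sp2013.example.com"))
```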
Since the SharePoint Scanner is split between two components (the mc Job Server part running in Java, and the SharePoint Solution part running on IIS), these two components need some additional configuration to communicate between one another in case the SharePoint site is configured to run over HTTPS using the SSL protocol.
In this case the issuer of the server’s SSL certificate must be set as a trusted certification authority on the JVM used by the Job Server to allow the Job Server component of the SharePoint Scanner to trust and connect to the secure SharePoint site.
Follow the steps below to register the certification authority with the JVM:
Export the certificate as a .cer file
Transfer the file to the machine running the Job Server
Open a command prompt
Import the certificate file to the Java keystore using the following command (use the actual path corresponding to JAVA_HOME instead of the placeholder; the command must be entered as one single line):
JAVA_HOME\bin\keytool -import -alias <set an alias of your choice, e.g. SP2013> -keystore ..\lib\security\cacerts -file <full path and name of certificate file from step 2>
Enter “changeit” when asked for the password to the keystore
The information contained in the certificate is displayed. Verify the information is correct and describes the certification authority used to issue the SSL certificate used by the secure SharePoint connection
Type “y” when prompted “Trust this certificate?”
“Certificate was added to keystore” is displayed, confirming the addition of the CA from the certificate as a certification authority now trusted by Java.
Restart the Job Server
Repeat the above steps for all machines if you have multiple Job Servers with the SharePoint Scanner running.
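When the certificate has to be imported on several Job Server machines, assembling the keytool command programmatically avoids typing mistakes. The Python sketch below only builds the argument list using the options named above. The keystore location JAVA_HOME\lib\security\cacerts is an assumption that depends on the Java installation, and the function name, paths and alias are hypothetical.

```python
import ntpath  # Windows path rules, independent of where this script runs

def keytool_import_command(java_home, alias, cert_file):
    """Return the keytool command line as an argument list."""
    keytool = ntpath.join(java_home, "bin", "keytool")
    # Assumption: the cacerts keystore is located under JAVA_HOME\lib\security.
    keystore = ntpath.join(java_home, "lib", "security", "cacerts")
    return [keytool, "-import", "-alias", alias,
            "-keystore", keystore, "-file", cert_file]

# Hypothetical installation path, alias and certificate file.
cmd = keytool_import_command(r"C:\Java\jre", "SP2013", r"C:\certs\sp2013-ca.cer")
print(" ".join(cmd))
```

The list can be passed to subprocess.run(cmd) from a console session; keytool still asks for the keystore password and for confirmation as described in the steps above.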
To create a new SharePoint Scanner, create a new scanner and select SharePoint from the Adapter Type drop-down. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type. Mandatory parameters are marked with an *.
The Properties of an existing scanner can be accessed after creating the scanner by double-clicking the scanner in the list or selecting the Properties button/menu item from the toolbar/context menu. A description is always displayed at the bottom of the window for the selected parameter.
Multiple scanners can be created for scanning different locations, provided each scanner has a unique name.
When scanning from SharePoint, the “exportLocation” parameter must be set.
For ensuring proper functionality of the content export there are a few considerations to keep in mind:
Regarding the path: The export location should (ideally) be a UNC path that points to a shared location (e.g., \\server\fileshare). If a local system path is used (D:\Temp), this path will be relative to the SharePoint Server where the WSP solution is running, and NOT to the Job Server machine.
Regarding the credentials: For accessing the specified file share the SharePoint scanner will use the credentials provided in the scanner configuration. Therefore, the same user used to do the migration (e.g., sharepoint\mcuser) must have write permissions on the file share.
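Whether a configured export location is a UNC path can be checked before a scanner run. The following is a minimal Python sketch, not part of migration-center; the function name is hypothetical. It relies on the fact that a UNC path, unlike a local path, has a “drive” that starts with two backslashes.

```python
from pathlib import PureWindowsPath

def is_unc_path(path):
    """Return True for UNC paths such as \\\\server\\fileshare, False for local paths."""
    # For a UNC path the drive part is "\\server\share"; for a local path it is e.g. "D:".
    return PureWindowsPath(path).drive.startswith("\\\\")

print(is_unc_path(r"\\server\fileshare"), is_unc_path(r"D:\Temp"))
```

A result of False does not make the location invalid, but it indicates that the path will be resolved on the SharePoint Server rather than on the Job Server machine.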
Starting from version 3.3 of migration-center the SharePoint scanner is able to use SharePoint CAML queries for filtering which objects are to be scanned. Based on the entered query, the scanner scans documents, folders and list items in the lists/libraries, which are specified in the parameter “includeLibraries”. If the parameter “includeLibraries” contains *, the query applies to all lists/libraries within the site.
For details on how to form CAML queries for each version of SharePoint please consult the official Microsoft MSDN documentation.
When using the CAML query parameter “query” the “excludeContentTypes” parameter must be empty. Otherwise the scanner will fail to start with an error message.
A complete history is available for any SharePoint Scanner job from the respective item’s History window. It is accessible through the [History] button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the SharePoint Scanner can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components 3.6\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
Additional logs are generated by the SharePoint Solution part of the SharePoint Scanner on the server side. The location of this log file can be configured through the file C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\15\CONFIG\Migration_Center_SP_Scanner.config.
Open the file with a text editor and edit the line below to configure the path to the log file
<file type="log4net.Util.PatternString" value="C:\MC\logs\%property{LogFileName}" />
Only change the path to the file, do not change the name! (the %property{LogFileName} part).
This document is a guide on the auto classification module for migration-center 3.
Creating transformation rules for vast amounts of documents can be a laborious task. This module can support the user by automatically classifying documents into predefined categories. Configuring this module might require basic knowledge about machine learning principles.
The installation process can be performed in a few minutes. But before installing the module, two mandatory prerequisites must be installed.
The module requires a Java runtime environment of version 8 or higher and Python v3.7. Make sure to configure the environment variables PATH and PYTHONPATH for Python during the installation process in the interactive installation dialog. To do this, you need to check the box “Add Python 3.7 to PATH” at the first installation window. If you are unsure, read the official installation instructions.
You can configure the environment variables later, but it is mandatory for the module. More information can be found in the official Python documentation. If the variables are not configured, the module might not start or work as intended.
Along with the Python language, the installer installs the package manager “pip”. “pip” is a command-line tool and needs to be executed in the Windows CMD shell.
If the use of OCR (optical character recognition) is intended, Tesseract version 4 or 3.05 must be installed (see Installing TesseractOCR). Two other use cases require the installation of the graphical library Graphviz (see Installing Graphviz) or the Oracle Instant Client for database access.
In addition, it is possible to use a dedicated Apache Tika instance. To do so, please refer to section Installing Apache Tika.
The operation of the Auto Classification Module requires configuration files. The samples subdirectory contains a set of example files.
The installation of the auto classification module can be summarized in seven steps.
Download and extract the software code from the zip archive onto your hard drive. Copy the files to a directory of your choice.
Open a CMD window and navigate to the path where the software was copied to or open a command window from the Windows Explorer.
Install all Python package requirements by executing the following statement in the CMD:
pip install -r requirements.txt
(optional) Install Oracle Instant Client (see here)
(optional) Install TesseractOCR (see Installing TesseractOCR)
Create the “config.ini” file, as described in section Deploying the auto classification module.
Provide a classifier configuration file. The configuration schema of the file is specified in Classifier configuration.
Start the module by executing the init file in the CMD:
python __init__.py {path to config.ini}
The script accepts the file path to the “config.ini” file as the only parameter. If the file path is not provided, the script assumes that the “config.ini” file is in the same directory.
Open the GUI in a web browser:
http://{server ip}:{WSGI port}/app
After executing the script, the service boots. Please wait until the service has started successfully before executing any further operations. The CMD window needs to stay open.
If additional resources are available or the classification is time critical, it is advised to install the AC-Module and Apache Tika with TesseractOCR on two different servers, although all programs can be executed on the same machine.
TesseractOCR is a software for Optical Character Recognition. It extracts text from image files and images embedded in PDF and Office documents. Tesseract should be used if any scanned PDF files are to be classified or documents contain images or diagrams with valuable information.
If scanned PDF files are to be classified but OCR is not installed, the Auto Classification Module will not be able to extract any text from those files. Likewise, if it is known that documents contain images with valuable text for the classification process, OCR should be used as well. Valuable text or valuable words in a document are those words that might be unique to a small set of documents and make the prediction of the correct category easier.
First, download the Windows installer from the official GitHub repository. When the download is completed, execute the installer. When asked what language packs should be downloaded, select the languages that are going to be classified.
Also, you can add additional languages later on.
Add the installation path into your environment Path variable.
First, download the Linux installer from the official GitHub repository or install Tesseract via apt. Installation instructions can be found in the project’s wiki.
By default, the module is shipped with the capability to download Apache Tika from the official download page on first use. Tika will be started on the same machine as the module and no changes of installation procedures are generally required.
If no connection to the internet can be established, the same procedure as starting Apache Tika from a dedicated server can be applied.
To use Apache Tika on a dedicated server, please download the tika-server.jar file from the official repository first.
Afterwards, the tika-server.jar file must be executed and the port must be open for outside communication.
The connection details must be provided to the module via “config.ini”. The configuration file “config.ini” holds three variables that need to be edited. The variables can be found in the Tika section.
“tika_client” needs to be set to true.
“tika_client_port” needs to be set to the correct port number.
“host” needs to be set to the correct name / IP address.
If Tika runs on port 9000 in a dedicated process on the same machine, the configuration looks as follows:
[TIKA]
tika_client: true
tika_client_port: 9000
host: 127.0.0.1
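The syntax of this section can be checked with Python's standard configparser module before the service is started. The following is a minimal sketch only: how the module itself reads “config.ini” is not described here, and the helper name read_tika_settings is hypothetical.

```python
import configparser

# Sample content mirroring the [TIKA] section shown above.
SAMPLE = """\
[TIKA]
tika_client: true
tika_client_port: 9000
host: 127.0.0.1
"""


def read_tika_settings(text):
    """Parse the TIKA section and return typed values (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["TIKA"]
    return {
        "tika_client": section.getboolean("tika_client"),
        "tika_client_port": section.getint("tika_client_port"),
        "host": section.get("host"),
    }


settings = read_tika_settings(SAMPLE)
print(settings)
```

A parsing error at this stage usually points to a typo in the section name or in one of the three keys.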
Graphviz is an additional library for creating charts and plots. It is not mandatory and is only required if the attribute structure of a document repository is investigated and a chart should be created. If Graphviz is properly set up, the script cli_show_distribution.py will automatically create a chart of the attribute structure.
First, Graphviz needs to be downloaded and installed.
Second, the Python wrapper for Graphviz needs to be installed in the Python environment.
On Linux, pygraphviz can be installed with pip in the terminal:
pip install pygraphviz
For Windows OS, an installation file is provided. Usually, the installation file is already used when installing all library requirements from the “requirements.txt” file. The installation file is a whl file and can be found in the directory “install_packages/”. The whl file can be installed with the CMD command:
pip install pygraphviz-1.5-cp37-cp37m-win_amd64.whl
The supplied whl file is compiled for Windows OS with 64 bit and Python 3.7 installed. Other compiled whl files for Windows can be downloaded from the GitHub repository.
The auto classification module offers powerful functionality to enhance migration projects. However, using the module can be tricky. Therefore, this document provides in-depth information for using and configuring the module in a production environment.
The module needs to be deployed as a service. Auto classification is a memory consuming task and it is advised to run the module on a dedicated server with access to as much memory as possible. If OCR is performed, it is advised to use a GPU, because character recognition on a CPU is very time consuming.
To deploy the auto classification module, a configuration file must be prepared. The file can be placed in any accessible UNC file path. If no configuration file is provided during the initial startup, the module will look for a configuration file in the working directory. In this case, the file must be named config.ini.
The following two chapters provide information on how to set up SSL encryption and a valid path to an algorithm configuration file.
Template files are provided in the sample directory of the module.
The configuration file contains two port properties. One property is defined in the FLASK_SERVER section and the other in the WSGI_SERVER section.
The Flask server is only used for development.
Therefore, if you would like to change the port number of the AC module, set the port property under the WSGI_SERVER section. The definition can look like this:
port: 443
By default, the port is set to 3000. This means that every interaction with the module goes through port 3000. This includes API requests and interacting with the web app.
The module can encrypt HTTP traffic with SSL. It is strongly advised to use this feature to prevent data interception. Please keep in mind that either HTTPS or HTTP can be used. Using both modes at the same time is not supported.
SSL encryption can be activated by defining the following line in the SSL section of the configuration file:
ssl.enabled: true
To use SSL encryption, it is mandatory to provide a valid SSL certificate. The required properties are:
ssl.certificate_file_path
ssl.key_file_path
If you do not have access to a valid certificate, you can create one on your own. You can use the program keytool to create a self-signed certificate. The program is shipped with the Java runtime and can be found in the ${JAVA_HOME}/bin folder.
To be able to use the auto classification module, a classifier configuration file must be provided. The file can be specified with the property file_path in the config.ini file under the section ALGORITHM_CONFIG.
The file’s schema and configuration options are described thoroughly in the section Classifier configuration below.
The module is a Python program. To start it, enter the following command:
python __init__.py {file path to config.ini}
A classifier is a combination of an algorithm with a transformation list. Before a classifier can be used, it must be configured and trained.
No algorithm is optimal for every situation, however sophisticated and powerful it may be. Each algorithm has advantages and disadvantages, and one algorithm can solve a given problem better than another. Therefore, the most significant boost is often achieved by providing only the most meaningful data to an algorithm. This is what a transformation list is used for.
A transformation list aims to select the most meaningful words (feature selection) and to create additional data from files (feature extraction).
A transformation list can be interpreted as a pipeline for the words of a document before they are submitted to an algorithm.
To explain the value of transformation lists and their relationship to algorithms, a simple example is depicted. The figure below displays the relationship between algorithms and transformation lists. Each entity has a unique id within its type. The list with id 1 enables three transformations:
lower-case
tf-idf
stopwords
The second list (id:2) uses the same transformations and adds the usage of OCR. It is not important in which order the transformations are listed as transformation lists are only used to enable transformations and not to define an ordered sequence.
Next to the transformation lists are three algorithms. Each algorithm points to exactly one transformation list, but one transformation list can be referenced by many algorithms. Based on these relationships, one can reason that three classifiers are defined. If each classifier is tested with the same documents, each of them will compute different results.
A second example displays how transformation lists are applied to documents. During the stages of training, test and prediction, raw documents are supplied. These unprocessed documents need to be transformed by the pipeline of the transformation list.
First, the stop words of the documents are removed. Second, all upper-case letters are replaced by their lower-case equivalents. Third, the tf-idf values are computed. At this point the documents are fully processed and can be delivered to the SVM. The order of the pipeline transformations may change depending on which transformations are selected.
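The pipeline described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the module's implementation: the stop word list, the whitespace tokenization and the unsmoothed tf-idf formula are assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical stop word list, for illustration only.
STOP_WORDS = {"the", "a", "of", "is"}


def remove_stop_words(tokens):
    # Compare case-insensitively, because this step runs before lower-casing.
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def lower_case(tokens):
    return [t.lower() for t in tokens]


def tf_idf(docs):
    """Return one {term: weight} dict per document (plain, unsmoothed tf-idf)."""
    n = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        weights.append(
            {t: (c / total) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return weights


raw = ["The Invoice of March", "The Contract is signed", "Invoice number"]
tokenized = [text.split() for text in raw]
processed = [lower_case(remove_stop_words(doc)) for doc in tokenized]
weights = tf_idf(processed)
print(processed[0], weights[0])
```

In this toy dataset the word “march” receives a higher weight than “invoice”, because “invoice” also occurs in another document and is therefore less distinctive.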
All classifiers of the auto classification module are configured in a single XML file. The main XML elements are
algorithms
transformation-lists
common-config
The following chapters aim to explain the configuration possibilities in detail.
Also, a template and sample XML file are provided.
Inside the “algorithms” element the individual algorithms are specified. The supported algorithms are:
SVM (element name: “svm”)
Naïve Bayes Multinomial (element name: “naive-bayes-multinomial”)
Logistic regression (element name: “logistic-regression”)
All algorithms must have an id attribute (attribute name: “id”) with a unique value and an attribute that references the corresponding transformation list (attribute name: “transformationListId”). Some algorithms support the prediction of multiple attributes (called multi-output). If multi-output behaviour is desired, the attribute “multi-output” must be set to true.
Algorithm | multi-output support
SVM | Yes
Naïve Bayes Multinomial | No
Logistic regression | Yes
Additionally, every algorithm offers attributes to configure its behavior.
The SVM supports the following attributes:
kernel (value of “linear”, “poly” or “rbf”)
degree (positive integer, affects only poly-kernel)
gamma (positive double, affects only rbf-kernel)
c (positive float)
Naïve Bayes Multinomial offers the following attributes:
alpha (positive float, affects only Naïve Bayes)
A transformation list is a group of transformation functions. It must have a unique id value. Each transformation function operates on the content of documents and transforms the content to a specific output. They can be used to optimize the process of feature selection. The order in which they are specified in the XML file does not matter. The functions are applied before an algorithm is trained. Not every function is supported by every algorithm:
Transformation function | SVM | Naïve Bayes Multinomial
lower-case | Yes | Yes
tf-idf | Yes | Yes
stopwords | Yes | Yes
length-filter | Yes | Yes
n-gram | Yes | Yes
ocr | Yes | Yes
pdf-parser | Yes | Yes
token-pattern-replace-filter | Yes | Yes
document-frequency | Yes | Yes
Every transformation function is described below.
XML attribute name
Description
Parameters
Example code
lower-case
Transforms every upper-case letter to its lower-case equivalent.
/
<lower-case></lower-case>
tf-idf
Uses the TF-IDF algorithm on the document content.
/
<tf-idf></tf-idf>
document-frequency
Filters the words by their frequency in a document. If a word has a frequency lower than the min-value or greater than the max-value, it is not selected for further processing. At least one value must be provided when using this function.
min-value (positive integer)
max-value (positive integer)
<document-frequency min-value="2" max-value="50">
</document-frequency>
n-gram
Splits the document content into n-grams of n = length. The default is unigrams (length of 1).
length (positive integer)
<n-gram length="1"></n-gram>
stopwords
Analyses the document content for stop words and removes them from the content.
Child element of “languages” (please consult the example code)
<stopwords> <languages> <language name="german">
</language> <language name="english">
</language> </languages> </stopwords>
length-filter
Ignores all words that do not have the specified minimum length.
min-length (positive integer)
<length-filter min-length="4">
</length-filter>
ocr
Performs OCR (optical character recognition) analysis on images in a document. The library TesseractOCR in version 3.04 is used for this purpose. The analysis uses a lot of CPU power and can require several minutes for a single document (depending on the number and size of images in the document).
language (three letter language code as defined by TesseractOCR)
enable-image-processing (boolean)
density (Integer)
<ocr
language="deu"
enable-image-processing="true"
density="300">
</ocr>
pdf-parser
Extracts the text content from PDF files. If used in combination with OCR, extracted images are automatically analyzed.
extract-inline-images (Boolean)
extract-unique-inline-images-only (Boolean)
sort-by-position (Boolean)
ocr-dpi (Integer)
ocr-strategy (String)
<pdf-parser
extract-inline-images="true"
extract-unique-inline-images-only="false"
sort-by-position="true"
ocr-dpi="300"
ocr-strategy="ocr_and_text">
</pdf-parser>
token-pattern-replace-filter
Replaces words that match a regex pattern with a defined string.
pattern (regex)
replacement (characters)
<token-pattern-replace-filter
pattern="(\w)\2+"
replacement=" ">
</token-pattern-replace-filter>
With the “common-config” element, global properties can be configured. These properties are accessed by all classifiers. Currently, the element does not offer any configuration parameters.
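To make the relationship between the three main elements more concrete, the following sketch builds a small configuration with Python's xml.etree.ElementTree and checks that every algorithm references an existing transformation list. The element names “algorithms”, “svm”, “transformation-lists”, “common-config” and the attribute names are taken from the descriptions above. The root element name “configuration” and the child element name “transformation-list” are assumptions made for this example; please consult the provided template file for the actual schema.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: the real root element name is not documented in this section.
root = ET.Element("configuration")

algorithms = ET.SubElement(root, "algorithms")
ET.SubElement(
    algorithms,
    "svm",
    {"id": "1", "transformationListId": "1", "kernel": "linear", "c": "1.0"},
)

lists = ET.SubElement(root, "transformation-lists")
# ASSUMPTION: "transformation-list" as the name of a single list element.
tlist = ET.SubElement(lists, "transformation-list", {"id": "1"})
ET.SubElement(tlist, "lower-case")
ET.SubElement(tlist, "tf-idf")
ET.SubElement(tlist, "length-filter", {"min-length": "4"})

ET.SubElement(root, "common-config")


def dangling_references(config_root):
    """Return ids of algorithms whose transformationListId matches no list."""
    list_ids = {el.get("id") for el in config_root.find("transformation-lists")}
    return [
        algorithm.get("id")
        for algorithm in config_root.find("algorithms")
        if algorithm.get("transformationListId") not in list_ids
    ]


xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
print(dangling_references(root))
```

A check like dangling_references can catch a mistyped transformation list id before the module is started.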
The module offers functionality to analyze a dataset and to perform classification tasks with migration-center. The functionality can be consumed through a web service with an HTTP/HTTPS interface, through a range of provided Python scripts, or from within selected adapters.
The API interface will always respond with a body encoded in JSON format.
In addition, a Web-UI is provided with the Auto Classification module. The UI offers a clean and simple interface to interact with the module and it offers the same functionality as the scripts. The UI can be reached with the following URL:
https://{server-ip}:{WSGI port}/app
The following chapters explain how to analyze a dataset and how to train, validate, test and reinforce a classifier. Furthermore, they describe how to make predictions for unclassified documents and how to query the status of a training or reinforcement process.
This document only describes how to use the API and not the UI. However, all parameters described in this chapter can be used in the UI.
The following chapters explain the module usage by using scan run ids from the migration-center database. Accessing the database is not always desirable. As an alternative, a unified metadata file can be used. A unified metadata file can be created when using the Filesystem Importer of the migration-center. The metadata file contains all necessary information for the Auto-Classification module.
To use a unified metadata file, the user needs to change the following things in an HTTP request:
Replace the “scan-run” part of the URL with “unified-metadata-file”.
Send the request with the mimetype “form-data”.
Send the metadata file with the form key “file”.
(optional) Remove the properties of database connection and scan-run-id.
Before validating and training a classifier, the training dataset needs to be investigated. The module offers two different reports.
The distribution report creates a bar plot for every attribute with the frequencies of the values on the y axis.
A distribution report can be created with an HTTP POST request to the URL:
https://{server-ip}:{port}/api/distribution-report/scan-run
It is required to supply three parameters with the request:
id
dbCon
classAttrs
The id is the scan run id, the database connection must be a JDBC connection string, and the class attributes must be a string of attribute names separated by commas.
The report is returned as a PDF file.
The hierarchy report is much more detailed. It allows you to specify an attribute hierarchy and creates a graph that gives insight into the frequency of every attribute value per hierarchy layer.
A hierarchy report can be created with an HTTP POST request to the URL:
https://{server-ip}:{port}/api/hierarchy-report/scan-run
It is required to supply three parameters with the request:
id
dbCon
classAttrs
The id and database connection parameters are the same as with the distribution report. The class attributes parameter needs to be a JSON object serialized as a string. It can look like the following example:
{
  "type": {
    "subtype": {}
  }
}
In the example, type is the most general attribute and subtype is dependent on the individual values of the type attribute.
The hierarchy report is returned as a png file.
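The serialized class attributes string for the hierarchy report can be produced with Python's json module. The sketch below uses the two attribute names from the example above; the helper depth is hypothetical and only illustrates that each nesting level corresponds to one hierarchy layer.

```python
import json

# Nested dict: "type" is the most general attribute, "subtype" depends on it.
hierarchy = {"type": {"subtype": {}}}

# The API expects the JSON object as a string, so it is serialized first.
class_attrs = json.dumps(hierarchy)


def depth(node):
    """Number of hierarchy layers below this node (hypothetical helper)."""
    if not node:
        return 0
    return 1 + max(depth(child) for child in node.values())


print(class_attrs)
print(depth(hierarchy))
```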
After analyzing a dataset, one might want to remove certain documents or attribute combinations from the dataset. The dataset splitter functionality supports this process.
The dataset splitter can split a single unified metadata file into two separate files. Furthermore, it can filter documents by attribute values.
It can be used either through the web service API or through the web app.
The API function is available through an HTTP POST request to the following URL:
https://{server-ip}:{port}/api/dataset-splitter/unified-metadata-file
With the request, a metadata file (key: file) must be supplied.
Additionally, the user must define a splitting percentage by which the file is split into two files. The key for the payload property is “trainingPercentage” and it must be between 0 and 100. If the value is 0, all data will be stored in the test file. If the value is 50, the data is split in half and stored in both files. If the value is 100, all data is stored in the training file.
An optional third parameter is “exclusions”. This parameter defines attribute values that are prohibited in the newly created metadata files. The dataset splitter excludes documents based on this parameter. The parameter must be a JSON list with the following structure:
[ { "attribute": "document_type", "values": ["concept", "protocol"] } ]
“document_type” is an example of an attribute name, and “concept” and “protocol” are examples of values.
The asterisk (*) is a wildcard for the values property. If the asterisk is defined, all documents that hold the attribute are ignored, no matter what attribute value they actually have.
The response object is a zip archive which contains two xml files. The two xml files are valid unified metadata files. One file starts with the keyword “TRAINING”, the other with “TEST”.
A short example explains when the exclusion parameter can be used:
A user analyzed a dataset and found out that for the attribute “document_type”, the values “concept” and “protocol” have a very low frequency and are not classifiable. The user wants to remove them from the dataset in order to build a classifier on “document_type”, because the concept and protocol values increase the noise.
To do so, the user runs the dataset splitter and supplies the raw metadata file. The “trainingPercentage” is set to 100. This saves the metadata of all documents into the training file and leaves the test file empty. The “exclusions” parameter is defined just as seen above. With this configuration, every document that has either “concept” or “protocol” as its “document_type” value is ignored. These documents will not be saved in the new training file.
Therefore, the returned zip archive contains a full training file and an empty test file. The expected outcome is that the new metadata training file has less noise and that the classification of “document_type” performs better.
As the exclusion parameter is a list, the user can specify more than one exclusion entry. The entries are not connected. This means that not all entries must be true. If one entry is true for a document, the document is automatically ignored.
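The exclusion and splitting rules described above can be sketched as follows. This is an illustration of the documented semantics, not the module's code: documents are modelled as plain dicts with one value per attribute, and the way documents are assigned to the training or test file (here simply by position) is an assumption.

```python
def is_excluded(document, exclusions):
    """A document is ignored as soon as ONE exclusion entry matches it."""
    for entry in exclusions:
        attribute = entry["attribute"]
        if attribute not in document:
            continue
        # "*" is the wildcard: any value of the attribute matches.
        if "*" in entry["values"] or document[attribute] in entry["values"]:
            return True
    return False


def split_dataset(documents, training_percentage, exclusions=()):
    """Return (training, test) after dropping excluded documents."""
    if not 0 <= training_percentage <= 100:
        raise ValueError("trainingPercentage must be between 0 and 100")
    kept = [d for d in documents if not is_excluded(d, exclusions)]
    cut = len(kept) * training_percentage // 100
    return kept[:cut], kept[cut:]


documents = [
    {"document_type": "concept"},
    {"document_type": "report"},
    {"document_type": "protocol"},
    {"document_type": "invoice", "department": "hr"},
]
exclusions = [{"attribute": "document_type", "values": ["concept", "protocol"]}]

training, test = split_dataset(documents, 100, exclusions)
print(training)  # the "report" and "invoice" documents
print(test)      # empty, because trainingPercentage is 100
```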
The following chapters only mention specific parameters of the processes.
Two parameters are supported for the training, validation, testing and reinforcement process without being mentioned explicitly:
use-serialized-documents
description
The “use-serialized-documents” parameter takes a Boolean value.
If it is true, the module serializes and stores documents after their content has been extracted. This happens before any transformation functions are applied. If a document has already been stored on the filesystem, the stored file is loaded instead of re-extracting the content. This is helpful if OCR needs to be performed while the classifier parameters are being tuned.
To use this parameter, the “document_storage_path” setting must be defined in the config.ini file under the TIKA section.
The “description” parameter assigns a description to a process. This is helpful to identify a process later.
The training of a classifier is an essential part of auto classification. The module can be trained by performing an HTTP POST request to the URL
https://{server-ip}:{port}/api/train
It is required to supply five parameters with the request:
scan-run-id
db-con
class-attr
algorithm-id
model-target-path
The scan run id is the id of a completed scan run inside migration-center. The documents within the scan run are used to train the classifier. If the scan run does not contain all documents for the training process, the classifier can later be reinforced (see chapter 4.9).
The parameter “db-con” is a connection string to the Oracle database of migration-center. The string has the following structure:
{account name}/{password}@{host}:{port}/{instance}
The third parameter specifies the class attribute of the documents. If the classifier should predict multiple attributes, the attribute names must be separated by a semicolon.
The algorithm id references the configured algorithm inside the classification configuration file from section Classifier configuration.
The last parameter is the file path where the trained classifier is stored. This location can be any UNC accessible path. The same file path is later used for testing, reinforcement and prediction.
Depending on the number of documents, the length of the documents and the embedded images, the training requires a significant amount of time. For this reason, the initial request only validates the parameters and returns a process id. The actual training process starts after the process id has been returned. The id can be used to query the module for the status of the training process. Please refer to section Status on how to query the process status.
It is advised to use the provided batch script train.cmd to train a classifier. The script accepts six parameters, the last of which is optional:
Scan run id
Connection string to migration center database
Class attribute name (of the documents in the scan run id)
Algorithm id (as defined in the XML file)
Model path (the trained model will be saved in the location)
IP of the auto classification module service (optional)
If the IP is not defined, the script assumes that the service runs on localhost.
An example usage of the script can look like this:
train.cmd 432 fmemc/fmemc@localhost:1521/xe ac_class 1 C:\Temp\ac-module-dev.model
The module will use the documents within scan run “432” to train the classifier with algorithm id “1” so that it can predict the categories of the attribute “ac_class”.
A second example shows how to train a classifier to predict two attributes:
train.cmd 432 fmemc/fmemc@localhost:1521/xe ac_class;department 1 C:\Temp\ac-module-dev.model
This classifier will predict the attributes "ac_class" and "department".
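As an alternative to the batch script, the training request can also be prepared directly from Python. The following sketch uses only the standard library and the five parameter names listed above. It assumes that the parameters are sent as a form-encoded request body, which is not stated in this document; the function name build_train_request is hypothetical. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_train_request(server, port, scan_run_id, db_con, class_attrs,
                        algorithm_id, model_target_path):
    """Build (but do not send) a POST request for /api/train."""
    payload = {
        "scan-run-id": scan_run_id,
        "db-con": db_con,
        # Multiple class attributes are separated by a semicolon.
        "class-attr": ";".join(class_attrs),
        "algorithm-id": algorithm_id,
        "model-target-path": model_target_path,
    }
    # ASSUMPTION: form-encoded body; adjust if the service expects another format.
    body = urlencode(payload).encode("utf-8")
    return Request(f"https://{server}:{port}/api/train", data=body, method="POST")


request = build_train_request(
    "localhost", 3000, 432, "fmemc/fmemc@localhost:1521/xe",
    ["ac_class", "department"], 1, r"C:\Temp\ac-module-dev.model",
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request) and read the JSON response,
# which should contain the process id.
```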
Testing a classifier requires an already trained classifier.
A test process can be started by performing a POST request to the URL
https://{server-ip}:{port}/api/test
It is mandatory to supply four parameters:
scan-run-id
db-con
class-attr
model-path
The scan run id is the identifier of a completed scan run inside of migration center. The documents within the scan run are used to test the classifier.
The parameter “db-con” is a connection string to the Oracle database of migration center.
The third parameter specifies the class attributes of the documents, separated by a semicolon.
The last parameter specifies a UNC file path to a trained model file.
The module responds with an overall precision value and a list of tested documents. For every document, the actual and predicted classes and confidence values are provided.
A validation process combines a training and a test process. The process splits the provided dataset into k packages and performs k training batches. The user defines the value for the k parameter (common values are 10 or 5).
In every training batch, one package is used as a validation package and the remaining packages are bundled into a training dataset. The validation package works like a test dataset. Because the number of packages and batches is equal, each package is used as a validation package exactly once.
This process makes it possible to tune algorithm parameters without using the actual test dataset.
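The package and batch logic of the validation process can be sketched as follows. This is a generic k-fold split written for illustration; how the module actually assigns documents to packages (for example, whether it shuffles them first) is not documented here and is an assumption.

```python
def k_fold_batches(documents, k):
    """Yield (training, validation) pairs, one per package."""
    if not 2 <= k <= len(documents):
        raise ValueError("k must be between 2 and the number of documents")
    # Distribute the documents round-robin into k packages.
    packages = [documents[i::k] for i in range(k)]
    for i, validation in enumerate(packages):
        training = [
            doc
            for j, package in enumerate(packages)
            if j != i
            for doc in package
        ]
        yield training, validation


documents = list(range(10))  # stand-ins for ten documents
batches = list(k_fold_batches(documents, 5))
for training, validation in batches:
    print(len(training), validation)
```

With ten documents and k = 5, every batch trains on eight documents and validates on the remaining two.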
You can start a validation process with a POST request to
https://{server-ip}:{port}/api/validate
It is mandatory to provide six parameters:
scan-run-id
db-con
class-attr
model-path
algorithm-id
k
After the parameters are validated, the module responds with a process id before starting the validation. The process id can be used to query the current state of the process. Please refer to section Status on how to query a process state.
The grid search validation process automates the tuning of algorithm parameters. It works similarly to the general validation process, but uses a fixed k value of 5 and does not create a process report.
In exchange, the user can define multiple values for each algorithm parameter. The module runs a validation process for each combination in the cross product of the values and returns the precision and standard deviation of the predictions.
A grid search validation process can be started with a POST request to
https://{server-ip}:{port}/api/validate-grid-search-cv
It is mandatory to provide six parameters:
scan-run-id
db-con
class-attr
model-path
algorithm-id
grid-search-config
The “grid-search-config” parameter is a string and must have the following schema:
Parameters must be separated by an ampersand (&).
Parameter values must be separated by a semicolon (;).
Therefore a sample parameter looks like the following:
C=0.5;1;2;5&degree=1;2;3;4
Because the process will apply every element of the cross product of the parameter values, a total of 16 validation processes are started.
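The count of 16 follows from 4 values for C multiplied by 4 values for degree. The following sketch parses a string in the schema described above (parameters separated by an ampersand, values separated by a semicolon) and enumerates the cross product. The function names are hypothetical, and the values are kept as strings because the module's own parsing is not documented here.

```python
from itertools import product


def parse_grid_search_config(config):
    """Split 'name=v1;v2&name2=v3;v4' into {name: [values]}."""
    grid = {}
    for part in config.split("&"):
        name, _, values = part.partition("=")
        grid[name] = values.split(";")
    return grid


def combinations(grid):
    """Return one dict per element of the cross product of all values."""
    names = list(grid)
    return [
        dict(zip(names, combo))
        for combo in product(*(grid[name] for name in names))
    ]


grid = parse_grid_search_config("C=0.5;1;2;5&degree=1;2;3;4")
runs = combinations(grid)
print(len(runs))
print(runs[0])
```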
The process of reinforcement is similar to the train process. The only difference is that it uses an already trained model file and retrains the algorithm with the newly provided documents.
You can start reinforcement with a POST request to
https://{server-ip}:{port}/api/reinforce
It is mandatory to provide four parameters:
scan-run-id
db-con
class-attr
model-path
The scan run id is the id of a completed scan run inside of migration center. The documents within the scan run are used to reinforce the classifier.
The parameter “db-con” is a connection string to the Oracle database of migration center.
The third parameter specifies the class attribute of the documents.
The last parameter specifies a UNC file path to a trained model file.
After the parameters are validated, the module responds with a process id before starting the reinforcement. The process id can be used to query the current state of the process. Please refer to section Status on how to query a process state.
The class of a document can be predicted in two ways: with an HTTP API request or via the File System Scanner in migration-center.
The class of an unclassified document can be predicted with a GET request to
https://{server-ip}:{port}/api/predict
It is mandatory to provide two parameters:
document-path
model-path
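A prediction URL with these two parameters can be assembled with Python's urllib.parse so that backslashes and other special characters in the paths are escaped correctly. The sketch assumes that the parameters are passed in the query string of the GET request, which this document does not state explicitly; the function name and the example paths are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_predict_url(server, port, document_path, model_path):
    """Build the GET URL for /api/predict with escaped query parameters."""
    # ASSUMPTION: parameters are sent as query string parameters.
    query = urlencode({"document-path": document_path, "model-path": model_path})
    return f"https://{server}:{port}/api/predict?{query}"


url = build_predict_url(
    "localhost", 3000,
    r"\\fileserver\docs\report.pdf",   # hypothetical document location
    r"C:\Temp\ac-module-dev.model",
)
print(url)
# Decoding the query string returns the original, unescaped paths.
print(parse_qs(urlsplit(url).query))
```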
The module responds with a classification and a confidence value.
To be able to use the file system scanner for document prediction, it is necessary to train a classifier beforehand. Also, three file system scanner parameters must be configured.
The parameters are:
acModelFilePath
acServiceLocation
acUseClassification
The model file path points to a UNC file path of the trained classifier model file. The service location is the URI address of the deployed auto classification module. Finally, the usage of classification must be enabled by ticking the parameter “acUseClassification”.
Now, a file system can be scanned by the scanner and predictions are automatically performed. The results can be accessed by viewing the attributes of the scanned objects (see figure below). In this case, the attribute “ai_class” holds the value of the predicted class. The classifier expresses its confidence in its own prediction with a value between 0 and 100. The higher the value, the greater the confidence.
The columns follow a simple naming schema that makes it easy to find the predicted attribute and its confidence value: the predicted values are displayed in columns starting with “ai_”, followed by the original attribute name. The confidence values column is named “confidence_”, again followed by the original attribute name.
When a training or reinforcement process is initialized, the module returns a process id. The id can be used to query the status of the process at any time until the module is turned off.
The current state of a process can be retrieved by performing a GET request to the URL
https://{server-ip}:{port}/api/processes/{process_id}
One parameter must be supplied:
process-id
The response contains a status, message, number of processed documents, number of provided documents and list of document results.
The status can have any of the following options:
STARTED
READING_DOCUMENTS
TRAINING
REINFORCING
FINISHED
BAD_REQUEST_ERROR
INTERNAL_SERVER_ERROR
ABORTED
If the status is an error, the message property always offers an explanation.
The “number of processed documents” indicates how many documents have already been transformed by the functions in the transformation list. As soon as all documents have been transformed, the module changes the status to TRAINING.
The “number of provided documents” shows how many documents are in the scan run.
If the process is of type reinforcement, the “number of provided documents” is the sum of documents from the already trained model and the provided scan run.
The document results indicate whether an error occurred while processing a particular document.
Please keep in mind that a process status is not stored forever. As soon as the module is turned off or rebooted, the status of every process is lost.
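A client that waits for a process to finish can poll the status endpoint in a loop. The sketch below keeps the HTTP call outside the loop by accepting a fetch_status callable, so it can be tried without a running service. It assumes that the JSON response exposes the status under the key “status” and that FINISHED, ABORTED and the two error states are final; both points are assumptions, not documented behaviour.

```python
import time

# ASSUMPTION: these states are final and will not change any more.
TERMINAL_STATES = {"FINISHED", "BAD_REQUEST_ERROR", "INTERNAL_SERVER_ERROR", "ABORTED"}


def wait_for_process(fetch_status, interval_seconds=5.0, max_polls=1000,
                     sleep=time.sleep):
    """Poll fetch_status() until a terminal state is reported."""
    for _ in range(max_polls):
        response = fetch_status()
        if response["status"] in TERMINAL_STATES:
            return response
        sleep(interval_seconds)
    raise TimeoutError("process did not reach a terminal status")


# Demo with canned responses instead of real HTTP requests.
canned = iter([
    {"status": "STARTED"},
    {"status": "READING_DOCUMENTS"},
    {"status": "TRAINING"},
    {"status": "FINISHED", "message": ""},
])
result = wait_for_process(lambda: next(canned), sleep=lambda seconds: None)
print(result["status"])
```

In a real client, fetch_status would perform the GET request to the processes URL shown above and decode the JSON body.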
This chapter explains common and known issues that one might run into.
If the module does not extract embedded images from PDF files, there are several possible causes:
Check if the PdfParser property “extract-inline-images” is set to true.
Check if the PdfParser property “ocr-strategy” is set to “ocr_only” or “ocr_and_text”. If the module has been configured through the WebApp, the strategy is set automatically to “ocr_only” if “extract-inline-images” is true.
Check if the OCR property “enable-image-processing” is set to 1.
If only a subset of the embedded images is extracted, the cause may be missing support for the JPG file type. Apache Tika uses the software “PdfBox” to process PDF files. PdfBox does not support the JPG file type out of the box, because the license of the required code is not compatible with the Apache license and prohibits usage in commercial software projects.
In some circumstances Apache Tika does not start although it is properly configured in the config.ini file. Start an investigation as follows:
Check that the configured ports are not already in use.
Check if the tika-server.jar file can be downloaded by the script over the internet (internet connection required). If not, download the newest tika-server.jar file and save it in a directory of your choice (e.g. C:\Temp). If necessary, rename the file to “tika-server.jar” explicitly. Then define the Apache Tika environment variable TIKA_PATH with the path to the directory of the tika-server.jar file:
SET TIKA_PATH=C:\Temp\
python __init__.py samples\config
Start a dedicated Tika server from the Windows CMD terminal. Download the newest tika-server.jar file from the official website and execute this command:
java -jar tika-server.jar --config={path to tika config xml file} --port=9000
If java cannot be started in general, try to use the absolute path to java.exe. If this fixes the issue, you must set the Apache Tika environment variable TIKA_JAVA with the absolute path to the java.exe file before starting the module like so:
SET TIKA_JAVA=C:\Program Files\Java\{jre version}\bin
python __init__.py samples\config
If you are confronted with an exception that is not listed here, please get in contact with our technical product support at support@migration-center.com.
The Exchange Removal post processing adapter allows deletion of Exchange mails that were previously imported.
The term post processing job is used in migration-center for describing a component which executes a set of operations on objects that were previously imported.
Post processing jobs work as importer jobs that can be run at any time, and can even be executed repeatedly. For every run a detailed history and log file are created. Multiple post processing jobs can be created or run at a time, each being defined by a unique name, a set of configuration parameters and a description (optional).
The Exchange Removal post-processing adapter was created to delete emails from Exchange Server after they have been imported with a migration-center importer. The adapter will accept any migset that contains imported documents. All imported objects will be processed by the adapter in the following way:
If the Exchange email was previously deleted by the Documentum importer or by a previous run of the Exchange Removal adapter, the object will be ignored.
If the Exchange email was not deleted previously, the adapter will connect to the Exchange Server from which the email was scanned and delete it. If the object could not be deleted for any reason, an appropriate error will be logged in the run report log so that the user can see it and fix the errors manually.
Before actually deleting a mail from Exchange server, a backup is made at the location specified by the backupLocation parameter. The mail is saved as .eml file and its metadata is saved as an xml file that is recognizable by the filesystem scanner. If the backupLocation parameter is empty, the emails will not be backed up.
To create a new Exchange Removal job you must create a new importer and specify the respective adapter type in the importer’s Properties window – from the list of available adapters “ExchangeRemoval” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case the ExchangeRemoval adapter’s properties.
The Properties window of an importer can be accessed by double-clicking an importer in the list, or selecting the Properties button or entry from the toolbar or context menu.
The configuration parameters available for the Exchange Removal adapter are described below:
A complete history is available for any Exchange Removal job from the respective items’ History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the Exchange Removal can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The Microsoft Exchange Scanner is a new adapter available since migration-center 3.2.6. The Exchange scanner can extract messages (emails) from Exchange mailboxes and use them as input for migration-center, from where they can be processed and migrated to other systems supported by the various mc importers.
The Microsoft Exchange Scanner currently supports Microsoft Exchange 2010. It uses the Independentsoft JWebServices for Exchange Java API to access an Exchange mailbox and extract emails including attachments and properties.
Scanner is the term used in migration-center for an input adapter. Using the Exchange Scanner Module to read the data that needs processing into migration-center is the first step in a migration project, thus scan also refers to the process used to input data to migration-center.
The scanner module works as a job that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created.
A Scanner is defined by a unique name, a set of configuration parameters and an optional description.
Exchange Scanners can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed by migration-center Job Server.
The Exchange scanner connects to the Exchange Server with a specified Exchange mail account and can extract messages from one (or multiple) folder(s) within the current user’s or other users’ mailboxes. The account used to connect to Exchange must have delegate access permission to the other accounts from which mails will be scanned. All subfolders of the specified folder(s) will automatically be processed as well; an option for excluding selected subfolders from scanning is also available. See the chapter below for more information about the features and configuration parameters available in the Exchange scanner.
In addition to the emails themselves, attachments and properties of the respective messages are also extracted. The messages and included attachments are stored as .eml files on disk, while the properties are written to the mc database, as is the standard with all migration-center scanners.
After a scan has completed, the newly scanned email messages and their properties are available for further processing in migration-center.
To create a new Exchange Scanner job, specify the respective adapter type in the Scanner Properties window – from the list of available adapters, “Exchange” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case the Exchange adapter’s parameters.
The Properties window of a scanner can be accessed by double-clicking a scanner in the list, or selecting the Properties button or entry from the toolbar or context menu.
Note: Configuration parameters ending with (*) are mandatory.
A complete history is available for any Exchange Scanner job from the respective items’ History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the Exchange Adapter can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The Documentum No Content Copy (NCC) Scanner is a special variant of the regular Documentum Scanner. It offers the same features as the regular scanner, with the difference that the content of the documents is not exported from Documentum during migration. The content files themselves can be attached to the migrated documents in the target repository by using one of the following methods:
copy files from the source storage to the target storage outside of migration-center
attach the source storage to the target repository so the content will be accessed from the original storage.
The scenario for such migrations usually involves migrations with very large numbers of documents (>>10.000.000), where extracting and transferring the content between the source and target systems would take too much time; thus the approach where only the metadata and content references are migrated is preferred. The actual content is then transferred independently using fast, low overhead file system level access directly in the SAN without having to pass through the API of the source system, migration-center, and then again through the target system’s API as would be the case during a standard Documentum migration. Since the references to the content files are preserved, simply dropping the actual content in place in the respective filestore(s) completes the migration and requires no additional tasks to be performed to link the content to the recently migrated objects. This approach is of course not universally applicable to any Documentum migration project, and needs to be considered and planned beforehand if intended to be used for a given migration.
Time savings
A Documentum “NCC” migration can take as little as one third of the time of a “classic” Documentum migration (when comparing the duration of the scan and import phases, which involve content transfer).
The Documentum Scanner currently supports Documentum Content Server versions 6.5 to 20.2, including service packs.
For accessing a Documentum repository Documentum Foundation Classes 6.5 or newer is required. Any combinations of DFC versions and Content Server versions supported by EMC Documentum are also supported by migration-center’s Documentum Scanner, but it is recommended to use the DFC version matching the version of the Content Server being scanned. The DFC must be installed and configured on every machine where migration-center Server Components is deployed.
When scanning Documentum documents with the Documentum No Content Copy (NCC) scanner, content files are no longer exported to the file system, nor are they imported to the target system. Instead, the scanner exports information about each dmr_content object from the source system and saves it as additional, internal information related to the document objects in the migration-center database. The Documentum NCC importer is able to process the content-related information and restore the references to the content files within the specified filestore(s) upon import. Copying, moving, or linking the folders containing the content files to the new filestore(s) of the target system is all that’s required to restore the connection between the migrated objects and their content files. Please consult the Documentum Content Server Administration Guide for your Content Server version to learn about filestores and their structure in order to understand how the content files are to be moved between source and target systems.
During a Documentum NCC migration the content files are not checked for their availability and/or validity; the migration will be performed with or without the content files being available to the target system. The content files themselves can be migrated after the migration using regular file or storage management tools operating on the content files directly at the file system level, independently of migration-center. The NCC feature does not have an impact on folder and audit trail migration. With the Documentum NCC scanner you can scan folders and audit trails as you would do with the standard Documentum Scanner.
The OpenText Scanner allows extracting objects such as documents, folders and compound documents and saving this data to migration-center for further processing. The key features of the OpenText Scanner are:
Supports OTCS versions: 9.7.1, 10.0, 10.5, 16.0, 16.2, 16.2.4
Export documents content and their metadata (versions, ACLs, categories, attribute sets, classifications, renditions)
Export folders and their metadata (categories, attribute sets, classifications)
Export shortcuts of documents and folders
Scale the performance by using multiple threads for scanning data
Scanner is the term used in migration-center for an input adapter. Using a scanner such as the OpenText Scanner to extract data that needs processing in migration-center is the first step in a migration project, thus scan also refers to the process used to input data to migration-center.
Scanners and importers work as jobs that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created. Multiple scanner and import jobs can be created or run at a time, each being defined by a unique name, a set of configuration parameters and a description (optional).
The OpenText scanner connects to the OpenText Content Server via the specified “webserviceURL” set in the scanner properties and can export folders, documents or compound documents. The account used to connect to the content server must have System Administration Rights to extract the content. All subfolders of the specified folder(s) will automatically be processed as well; an option for excluding select subfolders from scanning is also available. See below for more information about the features and configuration parameters available in the OpenText scanner. Email folders are supported so the documents and emails located in email folders are scanned as well.
In addition to the documents themselves, renditions and properties can also be extracted. The content and renditions are exported as files on disk, while the properties are stored to the mc database, as it is the standard with all migration-center scanners.
After a scan has completed, the newly scanned documents and their properties are available for further processing in migration-center.
Below is a list of object types, which are currently supported by the OpenText scanner:
Any type of container nodes: Folder, Email Folder, Project, Binder, EcmWorkspace, etc
Document (ID: 144)
Compound Document (ID: 136)
Email (ID: 749)
Shortcut
Generation
CAD Document (ID: 736) - scanned as a regular document
Any objects with different types will be ignored during initialization. A list of all node types of scanned containers is provided in the run log as well as all node types in the scope of scanning that were not exported because they are not supported.
The ACLs of the scanned folders and documents will be exported automatically as source attribute ACLs.
The ACLs attribute can have multiple values, each of which has the following format:
<ACLType#RightName#Permission-1|Permission-2|Permission-n>
The following table describes all valid values for defining a correct ACLs value:
Example:
ACL#csuser#See|SeeContents
Owner#manager1#See|SeeContents|Modify
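When the exported ACLs values are evaluated outside of migration-center, for example in a custom check script, the format above can be split into its three parts. The following Python sketch is only an illustration under stated assumptions: the names AclEntry and parse_acl_value are hypothetical, and it assumes that neither the type nor the right name contains the “#” character.

```python
from typing import List, NamedTuple


class AclEntry(NamedTuple):
    acl_type: str           # first field, e.g. "ACL" or "Owner"
    right_name: str         # second field, e.g. a user or group name
    permissions: List[str]  # third field, split at "|"


def parse_acl_value(value: str) -> AclEntry:
    """Split one ACLs value of the form Type#Name#Perm-1|Perm-2|Perm-n."""
    parts = value.strip().split("#")
    if len(parts) != 3:
        raise ValueError(
            f"expected 3 '#'-separated fields, got {len(parts)}: {value!r}"
        )
    acl_type, right_name, perms = parts
    # Empty segments (e.g. a trailing "|") are dropped.
    permissions = [p for p in perms.split("|") if p]
    return AclEntry(acl_type, right_name, permissions)
```

Applied to the second example value above, parse_acl_value yields the type “Owner”, the right name “manager1” and the three permissions See, SeeContents and Modify.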
To create a new OpenText Scanner job, specify the respective adapter type in the Scanner Properties window: from the list of available adapters, “OpenText” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case the OpenText parameters.
The Properties window of a scanner can be accessed by double-clicking a scanner in the list or selecting the [Properties] button or entry from the toolbar or context menu.
When extracting the Owner, Version Created By, Created By or any other attribute that references a user, it may happen that the internal ID of the user cannot be resolved by the Content Server; in that case the scanner extracts the user’s ID instead of the username. When this happens, a warning is logged in the report log.
Email folders can only be extracted from Content Server 16 and later.
The wildcards feature works properly in most cases, but it has some limitations:
Folders which contain the character “?” in their name will not be scanned when using the “*” wildcard. Example: /* will not scan /test?name
Wildcards in the middle of the path do not work: “/*/Level2” will scan no documents under the “Level2” folder
“/Level1/*exactname*” will scan no documents which are located in the “exactname” folder
When scanning documents by using “scanFolderPaths”, if two or more paths overlap, the documents which are located under them will be scanned twice or more. However, this issue will not affect the content integrity of the documents. Each duplicate document will have a separate content location.
Example:
/folder1/folder2/doc1
/folder1/folder3/doc2
/folder1/folder3/doc3
scanFolderPaths: “/folder1 | /folder1/folder3”
Considering the above scenario, the documents “doc2” & “doc3” will be scanned twice.
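Overlapping paths of this kind can be spotted before a scanner is started. The following Python sketch is a hypothetical helper, not part of migration-center: it splits a scanFolderPaths value at the “|” character and reports every pair in which one path lies below another. It assumes plain, case-sensitive paths without wildcards.

```python
def split_paths(value):
    """Split a '|'-separated parameter value into trimmed paths."""
    return [p.strip().rstrip("/") or "/" for p in value.split("|") if p.strip()]


def find_overlaps(scan_folder_paths):
    """Return (outer, inner) pairs where inner equals or lies below outer."""
    paths = split_paths(scan_folder_paths)
    overlaps = []
    for i, outer in enumerate(paths):
        prefix = outer if outer.endswith("/") else outer + "/"
        for j, inner in enumerate(paths):
            if i == j:
                continue
            if inner == outer:
                # Identical entries are reported once, not in both directions.
                if i < j:
                    overlaps.append((outer, inner))
            elif inner.startswith(prefix):
                overlaps.append((outer, inner))
    return overlaps
```

For the value “/folder1 | /folder1/folder3” from the example above, the function returns the single pair ("/folder1", "/folder1/folder3"), which corresponds to the documents that would be scanned twice.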
The Outlook scanner can extract messages from an Outlook mailbox and use them as input for migration-center, from where they can be processed and migrated to other systems supported by the various mc importers.
The Microsoft Outlook Scanner currently supports Microsoft Outlook 2007 and 2010 and uses the Moyosoft Java Outlook Connector API to access an Outlook mailbox and extract emails including attachments and properties.
Scanner is the term used in migration-center for an input adapter. Using the Outlook Scanner Module to read the data that needs processing into migration-center is the first step in a migration project, thus scan also refers to the process used to input data to migration-center.
The scanner module works as a job that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created.
A Scanner is defined by a unique name, a set of configuration parameters and an optional description.
Outlook Scanners can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed by migration-center Job Server.
The Outlook scanner connects to a specified Outlook mail account and can extract messages from one (or multiple) folder(s) existing within that account’s mailbox. All subfolders of the specified folder(s) will automatically be processed as well; an option for excluding selected subfolders from scanning is also available. See the chapter below for more information about the features and configuration parameters available in the Outlook scanner.
In addition to the emails themselves, attachments and properties of the respective messages are also extracted. The messages and included attachments are stored as .msg files on disk, while the properties are written to the mc database, as is the standard with all migration-center scanners.
After a scan has completed, the newly scanned email messages and their properties are available for further processing in migration-center.
To create a new Outlook Scanner job, specify the respective adapter type in the Scanner Properties window – from the list of available adapters, “Outlook” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case the Outlook adapter’s parameters.
The Properties window of a scanner can be accessed by double-clicking a scanner in the list, or selecting the Properties button or entry from the toolbar or context menu.
A complete history is available for any Outlook Scanner job from the respective items’ History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the Outlook Adapter can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
In the lower part of the window, the view can be filtered based on the available source attributes. Select an attribute from the list, enter a filter value, and press ENTER or [Apply filter]. To remove any applied filters, press the [Clear filter] button. Enter “(null)” to list objects for which the selected filter attribute has no value.
[Properties] is used to change the configuration of an item in the list. The -Migration Set- properties window can also be opened by double-clicking an item in the list.
Three views are available for a migration set: Source Objects, Processed Objects, and Errors. These views can be accessed in the -Migration Sets- window by context menu or toolbar. Each window displays a different view of the current migration set objects, including their metadata, current status, associated migration-center internal information, messages related to the various processing steps, etc.
Select an attribute from the list, enter a filter value, and press ENTER or [Apply filter]. To remove any applied filters, press the [Clear filter] button. Enter “(null)” to list objects for which the selected filter attribute has no value. Use “_” (underscore) as a placeholder for one character, and “%” (percent) for zero or more characters.
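The effect of the two placeholders can be illustrated with a small Python sketch. It is not the implementation used by migration-center; it merely translates a filter value into a regular expression, assuming that the filter must match the whole attribute value, that matching is case-sensitive, and that there is no escape character. The function names are hypothetical.

```python
import re


def filter_to_regex(pattern):
    """Translate a filter value using "_" and "%" into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "_":
            parts.append(".")    # exactly one character
        elif ch == "%":
            parts.append(".*")   # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def matches_filter(pattern, value):
    """Return True if value matches the filter; None stands for 'no value'."""
    if pattern == "(null)":
        return value is None
    if value is None:
        return False
    return filter_to_regex(pattern).fullmatch(value) is not None
```

Under these assumptions, the filter “doc_” matches “doc1” but not “doc12”, while “%.pdf” matches any value ending in “.pdf”.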
[Copy Rules from...] allows copying a set of transformation rules from another migration set and applying it to the current migration set, thereby replacing any existing rules. Rules can only be copied from migration sets of the same type as the current migration set. If no migration sets of the same type exist, the list will be empty. Otherwise, the migration sets of the same type are displayed in a list, allowing the user to select one and click [Ok] to copy and apply its transformation rules to the current migration set.
The |Mapping List| page allows the user to create, duplicate, delete and edit entries for mapping lists using the buttons provided in the toolbar. Mapping lists also have two properties:
The |General settings| tab contains only three directly editable fields: “Name”, “Description” and a checkbox called “Active”. The checkbox changes the status (Active: true/false) of a configured scheduler. If the checkbox is not set (status: false), a configured scheduler will not run at the configured time and date. The Info dynamic area displays various information about the scheduler.
In the |Configuration| tab, the appropriate (previously configured) scanners, importers and attribute models are selected. Caution should be exercised when choosing an attribute model: choose one whose purpose and type match those of the scanner and importer.
The |Interval| tab allows configuring the time and date at which the scheduler will start and stop running. The frequency of the runs can also be set up here.
The |History| tab provides information about various configuration aspects and run dates of the scheduler, its current state and the last operation undertaken, and gives access to the logs.
It is possible to search Object History by type of object or attribute values, including internal system attributes used by migration-center. The search parameters also support wildcards such as “_” (underscore) for one character, and “%” (percent) for zero or more characters. The button displays the full set of attributes associated with the selected object at that respective point in time. This can prove useful in case some attributes are wrong or missing after import to the target system: by verifying the attributes assigned to the object by migration-center at the time of import, it is possible to determine whether any of the inconsistencies encountered in the target system are due to migration-center or not.
The number of latest versions to be scanned can be limited through the exportLatestVersions parameter. See more about using this parameter in .
Filename and extension are self-explanatory and refer to the filename and extension of the actual document, while metadataextension should be the custom extension chosen to identify metadata files and must be specified as the value for the mcMetadataFileExtension parameter, as described in the paragraph above.
In case date/time type values are included in the metadata file, the date/time formats used must comply with the date/time pattern defined for migration-center during installation. For more information see the .
The IBM Domino Scanner connects to a specified IBM Domino/Notes application and can extract documents, content of richtext fields (composite items), metadata and attachments out of this application based on user-defined criteria. See chapter below for more information about the features and configuration parameters available in the IBM Domino Scanner.
Based on the add-on “PDF Generation Module” (see ), the Domino scanner is capable of generating PDF, PDF/a-1a or PDF/a-1b files for any type of Domino document – independent of the application it originates from.
Follow the standard installation procedure described in the Installation Guide to install the migration-center Server Components containing the Job Server and the corresponding part of the SharePoint Scanner.
The Documentum specific features supported by the Documentum NCC scanner are fully described in the Documentum Scanner section.
Configuration parameters
Values
Name
Enter a unique name for this scanner
Mandatory
Adapter type
Select the “Documentum” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If there is no Job Server defined, migration-center will prompt the user to define a Job Server Location when saving the Scanner.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
username*
Username for connecting to the source repository.
This user should have the permissions and privileges required to read all required objects. Though not mandatory, a user account with super user privileges is recommended.
Mandatory
password*
The user’s password.
Mandatory
repository*
Name of the source repository. The source repository must be accessible from the machine where the selected Job Server is running.
Mandatory
scanFolderPaths
Enter folder paths to be scanned. Paths must always be absolute paths (including the starting “/”, cabinet name, etc.)
Multiple values can be entered by separating them with the “|” character.
Using the dqlString parameter overrides the scanFolderPaths, excludeFolderPaths and documentTypes parameters
excludeFolderPaths
The list of folder paths to be excluded from the scan.
Multiple values can be entered by separating them with the “|” character.
Using the dqlString parameter overrides the scanFolderPaths, excludeFolderPaths and documentTypes parameters
documentTypes
List of document types to be scanned. When this parameter is set, it also requires scanFolderPaths to be set. Only documents having the types specified here will be scanned. If scanFolderPaths is set but documentTypes is empty, only the folder structure will be exported.
Multiple values can be entered by separating them with the “|” character.
Using the dqlString parameter overrides the scanFolderPaths, excludeFolderPaths and documentTypes parameters
dqlString
The DQL statement that will be used to retrieve the r_object_id of the objects that will be scanned. The query must select only r_object_id of current versions.
Example:
Select r_object_id from dm_document where Folder('/Finance/Invoices', descend)
Using the dqlString parameter overrides the scanFolderPaths, excludeFolderPaths and documentTypes parameters.
NOTE: do not use the (ALL) option in this DQL statement (e.g. “select … from dm_document(all)”) in an attempt to extract all versions of an object. To scan versions, use the exportVersions parameter instead (described below).
dqlExtendedString
Specify one or more additional DQL statements to be executed for each document within the scope of the scan. This can be used to access database tables other than the standard dm_document table and extract additional information from there. Any valid DQL statement is accepted, but it must include the “{id}” string as a placeholder for the current object’s r_object_id.
If a query returns multiple rows or multiple values for an attribute, all values will be added to the corresponding source attribute. In order to guard against a wrong DQL configuration, the maximum number of values for an extended attribute is limited to 1,000. A warning will be logged when the number of values for a single attribute exceeds 1,000.
Example:
select set_client, set_file, set_time from dmr_content WHERE rendition = 0 and any parent_id = '{id}';
This query will return information from the dmr_content table for each dm_document type object within the scanner’s scope.
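To illustrate how the “{id}” placeholder described for dqlExtendedString is meant to work, the following Python sketch replaces it with an object ID before the statement would be executed. The function name and the sample r_object_id are hypothetical; how the scanner performs the substitution internally is not documented here.

```python
def expand_extended_dql(statement, object_id):
    """Replace the {id} placeholder with the r_object_id of one object."""
    if "{id}" not in statement:
        # The parameter description requires the placeholder to be present.
        raise ValueError("statement must contain the {id} placeholder")
    return statement.replace("{id}", object_id)


# Statement taken from the example above; the object ID is made up.
TEMPLATE = (
    "select set_client, set_file, set_time from dmr_content "
    "WHERE rendition = 0 and any parent_id = '{id}'"
)
expanded = expand_extended_dql(TEMPLATE, "0900000180001234")
```

The resulting statement ends with any parent_id = '0900000180001234' and would be run once for every document in the scope of the scan.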
exportRenditions
Boolean. Flag indicating if the document renditions will be exported.
exportVersions
Boolean. Flag indicating if the entire document version tree should be exported. If not checked, only current version will be exported.
exportLatestVersions
Specifies the number of latest versions to scan.
Valid values are positive integers >=1. If the number set is larger than the total number of versions for an object, all versions will be scanned.
Values <=0 will have no effect and cause all versions to be scanned.
This option does not work with branches due to the inconsistencies it would introduce. If version structures with branches are found, this parameter will be ignored.
This option does not work with Virtual Documents due to the inconsistencies it would introduce. If both exportLatestVersions and exportVirtualDocs are set in the scanner, an error will be raised, forcing the user to take action and decide on using only one or the other feature, but not both.
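The rules for the value of exportLatestVersions can be summarised in a short Python sketch. It is purely illustrative: the function name is hypothetical, the version list is assumed to be linear (no branches) and ordered from oldest to newest, and the sketch does not cover the branch and Virtual Document restrictions described above.

```python
def select_latest_versions(versions, export_latest_versions):
    """Return the versions that would be scanned for a linear version tree.

    versions               -- list ordered from oldest to newest (assumption)
    export_latest_versions -- parameter value; None or <= 0 means no limit
    """
    if export_latest_versions is None or export_latest_versions <= 0:
        return list(versions)
    # Slicing with a number larger than the list length returns all items.
    return list(versions[-export_latest_versions:])
```

With the versions 1.0, 1.1 and 1.2 and a value of 2, the sketch returns 1.1 and 1.2; with a value of 0 or 5, it returns all three versions.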
exportVirtualDocs
Boolean. Flag indicating if the virtual documents should be exported. If not checked, virtual documents will be scanned as normal documents, so VD structure will be lost.
exportVDversions
Boolean. Flag indicating if all versions of virtual documents should be exported. If not checked, only the latest version will be exported.
maintainVirtualDocsIntegrity
Boolean. Flag indicating if virtual document’s children which are out of scope (objects not included in the scan paths or dqlString) should also be exported. If set to false, the children which are out of scope will not be exported but the virtual document’s structure information will be exported (the dmr_containment objects of a VD will be all exported as relations).
computeChecksum
When checked, the checksum of scanned files will be computed. This is useful for determining whether files with different names and from different locations in fact have the same content, as can frequently happen with common documents copied and stored by several users in a file share environment.
Do not enable this option unless necessary, since the performance impact is significant due to the scanner having to read the full content of each file and compute its checksum.
hashAlgorithm
Specifies the algorithm that will be used to compute the Checksum of the scanned objects.
Possible values are "MD2", "MD5", "SHA-1", "SHA-224", "SHA-256", "SHA-384" and "SHA-512". Default value is MD5.
hashEncoding
Specifies the encoding that will be used to compute the Checksum of the scanned objects.
Possible values are "HEX", "Base32" and "Base64". Default value is HEX.
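To verify a checksum outside of migration-center, the combination of hashAlgorithm and hashEncoding can be reproduced with standard tools. The following Python sketch is an assumption-laden illustration, not the scanner’s implementation: the function name is hypothetical, the letter case of the HEX output may differ from what the scanner writes, and MD2 is only available if the underlying OpenSSL build provides it.

```python
import base64
import hashlib


def compute_checksum(path, hash_algorithm="MD5", hash_encoding="HEX",
                     chunk_size=1024 * 1024):
    """Compute the checksum of a file and encode it as HEX, Base32 or Base64."""
    # Map names such as "SHA-256" to the hashlib spelling "sha256".
    digest = hashlib.new(hash_algorithm.replace("-", "").lower())
    with open(path, "rb") as handle:
        # Read in chunks so that large content files do not fill the memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    raw = digest.digest()

    encoding = hash_encoding.upper()
    if encoding == "HEX":
        return raw.hex()
    if encoding == "BASE32":
        return base64.b32encode(raw).decode("ascii")
    if encoding == "BASE64":
        return base64.b64encode(raw).decode("ascii")
    raise ValueError(f"unsupported hashEncoding: {hash_encoding!r}")
```

Because the whole file has to be read, the sketch also shows why enabling computeChecksum has a noticeable performance cost.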
ignoredAttributesList
List of attribute names ignored by the scanner. Note that the delimiter is a comma “,” in this case. Also note the default list of attributes to be ignored. This list refers to attributes that will be set automatically by the Documentum system the files are migrated to, so there is no need to waste resources on scanning them. Still, check carefully whether the list matches your criteria! *see notes
scanNullAttributes
Boolean. Flag indicating whether attributes with null values should be scanned. By default, the option is off so attributes with null values are not scanned in order to reduce clutter in the database.
exportFolderStructure
Boolean. Flag indicating if the folder structure should be exported. If checked, each folder from target will be exported as a migration-center object. This works in conjunction with the scanFolderPaths and excludeFolderPaths parameters. The folder structure cannot be exported based on the dqlString parameter’s value. This feature is most helpful for recreating a folder structure which contains no objects.
exportAnnotations
Boolean. Flag indicating if the document's annotations will be exported.
exportComments
Boolean. Flag indicating if the document's comments (dmc_comment) will be exported. This will work only when exportVersions is checked and exportLatestVersions is empty.
exportRelations
Boolean. Flag indicating if the relations (dm_relation objects) between the exported objects (folders, documents) must be exported. Leave this flag unchecked unless relations are needed, since exporting them reduces performance.
exportRelationsAsRenditions
Boolean. Flag indicating if the child documents of relations will be exported as renditions of the parent document. If checked, the scanner will only export the children of valid relations and add the rendition path to the source attribute “dctm_obj_rendition”.
relationsAsRenditionsNames
The name of the relations that will be exported as renditions. Multiple relation names can be specified, separated by ",". Works in combination with "exportRelationsAsRenditions". If set, only the relations with the specified names will be exported. If not set, all relations that have a document as a child will be exported as renditions.
mapRelationNames
Boolean. Check to change names of scanned relations to user specified names (since relation information cannot be edited using transformation rules, there is no other way to change the name of dm_relation between a source and target system).
For the feature to work a text file named relation_config.properties must be created in the mc Server Components installation folder. Its contents must be relation name mappings such as:
oldRelationName=newRelation
source_relation=target_relation
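To make the mapping file format concrete, here is a minimal Python sketch that parses key=value lines like the ones above and applies them to a relation name. The helper names are hypothetical, and both the handling of blank or comment lines and the fallback to the original name are assumptions, not documented scanner behavior.

```python
def load_relation_mappings(text):
    """Parse 'old=new' lines into a dict, skipping blank lines and # comments."""
    mappings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        old, sep, new = line.partition("=")
        if sep:
            mappings[old.strip()] = new.strip()
    return mappings

def map_relation_name(name, mappings):
    # Assumption: names without a mapping are kept unchanged.
    return mappings.get(name, name)

config = "oldRelationName=newRelation\nsource_relation=target_relation\n"
mappings = load_relation_mappings(config)
print(map_relation_name("source_relation", mappings))
```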
skipContent
Boolean. Flag indicating if the documents' content will be exported from the repository. If activated, all documents will be scanned as contentless objects. This should be used for testing purposes or for special cases when the content is not needed.
exportLocation*
Folder path. The location where the exported object content should be temporarily saved. It can be a local folder on the same machine as the Job Server or a shared folder on the network. This folder must exist prior to launching the scanner and must have write permissions. migration-center will not create this folder automatically. If the folder cannot be found, an appropriate error will be raised and logged. This path must be accessible by both scanner and importer, so if they are running on different machines, it should be a shared folder.
Mandatory
exportAuditTrail
Boolean. Check to enable scanning of existing audit trails for any documents and folders resulting from the other scanner parameters.
auditTrailType
Specify the Documentum audit trail object type (dm_audittrail or a custom subtype).
auditTrailSelection
The DQL where condition for selecting the audit trail objects. Leave it empty to export all audit trail entries.
Ex: event_name in ('dm_save', 'dm_checkin')
auditTrailIgnoreAttributes
List of audit trail attributes that will be ignored by the scanner
exportAuditTrailAsRendition
Boolean. Flag indicating if audit trails entries will be exported to PDF renditions. See 5.6.1 for more details.
auditTrailPerVersionTree
Boolean. Flag indicating if one PDF audit trail rendition is generated per entire version tree. See 5.6.1 for more details.
loggingLevel*
Sets the verbosity of the log file.
Possible values:
1 - logs only errors during scan
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
Configuration parameters
Values
Name
Enter a unique name for this importer
Mandatory
Adapter type
Select ExchangeRemoval from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server has been created by the user to this point, migration-center will prompt the user to define a Job Server Location when saving the Importer.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
backupLocation
The location on the filesystem where the mails and their metadata will be saved before they are deleted from the Exchange server. The metadata will be saved in xml files that will be recognized by filesystem scanner. If the parameter is empty the mails will not be backed up.
numberOfThreads
The number of concurrent threads that will be used for deleting emails from Exchange server.
Mandatory
loggingLevel
Sets the verbosity of the log file.
Values:
1 - logs only errors during scan
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
Configuration parameters
Values
Name
Enter a unique name for this scanner
Mandatory
Adapter type
Select the “eRoom” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server has been created by the user to this point, migration-center will prompt the user to define a Job Server Location when saving the Scanner.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
eRoomSite*
URL of the eRoom site.
Mandatory
relativePaths*
List of relative paths of the rooms and/or facilities to be scanned. The list can be provided inline, with the relative paths separated by commas (,), or in a file, in which case the parameter value is the full path of the file prefixed with an at sign (@) and each line of the file contains one relative path. Example: @C:\pathToFile\file
facility1/room1
facility2
Mandatory
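The two ways of supplying relativePaths can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name parse_relative_paths is hypothetical, the file is assumed to be UTF-8 text, and blank entries are assumed to be ignored.

```python
from pathlib import Path

def parse_relative_paths(value):
    """Return the relative paths from an inline list or an @file reference."""
    value = value.strip()
    if value.startswith("@"):
        # File form: one relative path per line (encoding is an assumption).
        entries = Path(value[1:]).read_text(encoding="utf-8").splitlines()
    else:
        # Inline form: comma-separated relative paths.
        entries = value.split(",")
    return [entry.strip() for entry in entries if entry.strip()]

print(parse_relative_paths("facility1/room1, facility2"))
```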
eRoomWebservice*
URL of the eRoom web service (SOAP) to connect to.
The eRoom site holds a representation of its entire content in a virtual XML document. Data from this document can be accessed through SOAP requests.
Mandatory
username*
User name for connecting to eRoom site.
Mandatory
password*
The user’s password.
Mandatory
exportVersions
If enabled, all versions of the documents will be exported; otherwise only the latest version will be scanned.
skipContent
Setting this parameter to true will skip extracting the actual content during scan, meaning only the metadata will be scanned. Used mostly for testing purposes.
exportLocation*
Folder path. The location where the exported object content should be temporarily saved. It can be a local folder on the same machine as the Job Server or a shared folder on the network. This folder must exist prior to launching the scanner and must have write permissions. migration-center will not create this folder automatically. If the folder cannot be found, an appropriate error will be raised and logged. This path must be accessible by both scanner and importer, so if they are running on different machines, it should be a shared network folder.
Mandatory
loggingLevel*
Sets the verbosity of the log file.
Possible values:
1 - logs only errors during scan
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
Configuration parameters
Values
Name
Enter a unique name for this scanner
Mandatory
Adapter type
Select the “Exchange” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server has been created by the user to this point, migration-center will prompt the user to define a Job Server Location when saving the Scanner.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
username*
The username that will be used to connect to the Exchange server. This user should have delegate access to all accounts that will be scanned.
Mandatory
password*
The password that will be used to connect to the Exchange server.
Mandatory
domain
The domain against which the user will be authenticated. Leave it empty to authenticate against the Exchange server domain.
useHttps
Specify if the connection between Job Server and Exchange server will be established over a secure SSL channel.
exchangeServer*
The host name or IP address of the Exchange server.
Mandatory
scanFolders*
Exchange folder paths to scan.
The syntax is \\<accountname>[\folder path] or \folderPath. If only the account is given (ex: \\john.doe@vw.de) then the scan location will be considered to be the "Top of Information Store" folder of the user. If no account is specified, the path is considered to be in the account specified in the “username” property. Multiple paths can be entered by separating them with the “|” character.
Example:
\\user\Inbox would scan the Inbox of user (including subfolders)
\Inbox\sales is equivalent to \\“username”\Inbox\sales
Mandatory
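The scanFolders syntax can be illustrated with a small parser. This is a sketch only, written in Python with a hypothetical function name; it assumes that an empty folder part stands for the account's "Top of Information Store" folder and that entries without an account fall back to the configured username, as described above.

```python
def parse_scan_folders(value, username):
    """Split a scanFolders value into (account, folder_path) pairs."""
    result = []
    for entry in value.split("|"):
        entry = entry.strip()
        if not entry:
            continue
        if entry.startswith("\\\\"):
            # \\<accountname>[\folder path]
            account, _, folder = entry[2:].partition("\\")
        else:
            # \folderPath: resolved against the configured username
            account, folder = username, entry.lstrip("\\")
        # An empty folder means the top folder of the account (assumption).
        result.append((account, folder))
    return result

print(parse_scan_folders(r"\\user\Inbox|\Inbox\sales", "scan.user"))
```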
excludeFolders
Exchange folder paths to exclude from scanning. Follows the same syntax as scanFolders above.
Example: \\user\Inbox\Personal would exclude user’s personal mails stored in the Personal subfolder of the Inbox if used in conjunction with the above example for scanFolders.
ignoredAttributesList
A comma separated list of Exchange properties to be ignored by the scanner.
At least Body,HTMLBody,RTFBody,PermissionTemplateGuid should be always excluded as these significantly increase the size of the information retrieved from Exchange but don’t provide any information useful for migration purposes in return.
exportLocation*
Folder path. The location where the exported object content should be temporarily saved. It can be a local folder on the same machine as the Job Server or a shared folder on the network. This folder must exist prior to launching the scanner and must have write permissions. migration-center will not create this folder automatically. If the folder cannot be found, an appropriate error will be raised and logged. This path must be accessible by both scanner and importer, so if they are running on different machines, it should be a shared network folder.
Mandatory
exportMailAsMsg
Boolean - If true, the mails will be exported as .msg files, otherwise they will be exported as .eml files.
Mandatory
numberOfThreads
Number - The number of concurrent threads that will be used for scanning the emails from the configured locations.
Mandatory
loggingLevel*
Sets the verbosity of the log file.
Possible values:
1 - logs only errors during scan
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 – logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
Configuration parameters
Values
Name
Enter a unique name for this scanner
Mandatory
Adapter type
Select the “Filesystem” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server has been created by the user to this point, migration-center will prompt the user to define a Job Server Location when saving the Scanner.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
scanFolderPaths*
The folder paths to be scanned.
Can be local paths or network file shares (SMB/Samba)
Multiple values can be entered by separating them with the “|” character.
Example of scanning a network share and a local path:
\\share1\testfolder|c:\documents
Note: To scan network file shares the Job Server running the respective Scanner must be configured to run using a domain account which has full read permission for the given network share.
For information about configuring the Job Server to run using a specific account, see Windows Help for configuring services to run using a different user account, since the Job Server runs as a regular Windows service.
excludeFolderPaths
The folders that need to be excluded from the scan. The folder paths to be excluded from scan must be subpaths of the “scanFolderPaths” parameter.
Multiple values can be entered by separating them with the “|” character.
Note: If the list of excluded folders contains folders which are subfolders of other folders in the same list, these are removed from the list since they are redundant.
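The removal of redundant subfolders described in the note can be sketched as follows. This Python example is an illustration, not the scanner's code: the function name is hypothetical, and Windows path semantics (case-insensitive comparison) are assumed.

```python
from pathlib import PureWindowsPath

def remove_redundant_excludes(exclude_folder_paths):
    """Drop excluded folders that lie below another excluded folder."""
    paths = [PureWindowsPath(p.strip())
             for p in exclude_folder_paths.split("|") if p.strip()]
    kept = []
    for path in paths:
        # A path is redundant if one of its ancestors is also in the list.
        if not any(other != path and other in path.parents for other in paths):
            kept.append(path)
    return [str(p) for p in kept]

print(remove_redundant_excludes(
    r"c:\documents\a|c:\documents\a\b|c:\documents\c"))
```

Here c:\documents\a\b is dropped because excluding c:\documents\a already covers it.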
excludeFiles
Filename pattern used to exclude certain types of files from scanning. This parameter uses regular expressions.
For example to exclude all documents that have the extension “txt”, use this regular expression: (.)+\.txt
Use “|” as delimiter if you want to enter multiple exclusion patterns.
Note:
The regular expressions use syntax similar to Perl. For more technical details please read the specific javadocs page at:
http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html
For more information about regular expressions, please visit http://www.regular-expressions.info!
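The exclusion pattern above can be tried out with a short script. The sketch below uses Python's re module rather than Java's java.util.regex; the two agree for simple patterns like this one. The function name is hypothetical, and matching the pattern against the whole file name is an assumption about how the scanner applies it.

```python
import re

def is_excluded(filename, exclude_files):
    """True if the file name fully matches any '|'-separated pattern."""
    patterns = [p for p in exclude_files.split("|") if p]
    return any(re.fullmatch(p, filename) for p in patterns)

print(is_excluded("notes.txt", r"(.)+\.txt"))
print(is_excluded("report.doc", r"(.)+\.txt|(.)+\.tmp"))
```

Because "|" is used as the delimiter between patterns, this sketch cannot express alternation inside a single pattern.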
ignoreHiddenFiles
Specifies whether files marked as “Hidden” in the file system should be scanned or not.
scanChangedFilesBehaviour
Specifies the behavior of the scanner when a file update is detected. Accepted values are:
1 – (default) – the changed file will be added as update object
2 – the changed file will be added as a new version
3 – the changed file will be added as a new object
Format: String
For more details please consult chapter 2.4 Working with Versions
moveFilesToFolder
Set a valid local file system path or UNC path folder where to move scanned files. All files which have been scanned successfully will be moved to a cloned folder structure under the configured path.
Example:
scanFolderPath = c:\source\documents
moveFilesToFolder = c:\moved\documents
The source file c:\source\documents\folderA\document.doc will be moved to c:\moved\documents\<scanRunId>\folderA\document.doc
If this parameter is set and a file is moved by the scanner its contentPath attribute will reference the new location. The importer will use the moved location instead of the source while processing the files.
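The path mapping in the example above can be expressed as a small function. This Python sketch only reproduces the documented example; the function name and the scan run id value 42 are placeholders chosen for illustration.

```python
from pathlib import PureWindowsPath

def moved_path(source_file, scan_folder_path, move_files_to_folder, scan_run_id):
    """Compute where a scanned file ends up below moveFilesToFolder."""
    # Path of the file relative to the scanned folder, e.g. folderA\document.doc
    relative = PureWindowsPath(source_file).relative_to(
        PureWindowsPath(scan_folder_path))
    # The folder structure is cloned below <moveFilesToFolder>\<scanRunId>
    return str(PureWindowsPath(move_files_to_folder) / str(scan_run_id) / relative)

print(moved_path(r"c:\source\documents\folderA\document.doc",
                 r"c:\source\documents", r"c:\moved\documents", 42))
```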
scanFolders
Boolean. If flag is checked folders will be scanned as fully editable and transformable objects. This way custom attributes, object types, owner and permissions can be defined for the folders. Otherwise folders will be retained only as path references from the documents and will be created using default folder settings in Documentum.
mcMetadataFileExtension
The file-extension of the XML files which contains extra metadata for the scanned files and folders.
For more details please consult chapter 2.2 Enriching the content using metadata from external XML files
scanExtendedMetadata
Flag indicating if extended metadata will be scanned for common document formats such as MS Office documents, PDF, etc. Extended metadata is extracted using the Apache Tika library. For more information about all supported formats, please refer to the Apache Tika documentation: http://tika.apache.org/0.9/formats.html
extendedMetadataDateFormats
Can be used to set one or more Java date formats that the scanner will use to detect date attributes in the document content. If empty, the default list of patterns will be used.
ignoredAttributesList
Comma-delimited list of attributes that will be ignored. These attributes are skipped during scanning, which improves performance and saves database storage.
computeChecksum
When checked, the checksum of each scanned file is computed. This is useful for determining whether files with different names and from different locations in fact have the same content, as frequently happens when common documents are copied and stored by several users in a file share environment.
Do not enable this option unless necessary: the performance impact is significant because the scanner has to read the full content of each file to compute its checksum.
hashAlgorithm
Specifies the algorithm that will be used to compute the Checksum of the scanned objects.
Possible values are "MD2", "MD5", "SHA-1", "SHA-224", "SHA-256", "SHA-384" and "SHA-512". Default value is MD5.
hashEncoding
Specifies the encoding that will be used to compute the Checksum of the scanned objects.
Possible values are "HEX", "Base32" and "Base64". Default value is HEX.
ignoreWarnings
When it's checked the following warnings are ignored so the affected objects will be scanned:
Warning when an xml-metadata file is missing or cannot be read;
Warning when "owner name" or "creation date" cannot be extracted;
Warning when checksum cannot be computed;
Warning when extended metadata cannot be extracted;
versionIdentifierAttribute
Name of the source attribute which identifies a version tree. Setting this parameter will activate the versioning based on metadata. Must be used together with versionLevelAttribute.
The specified source attribute’s value must be the same for all objects that are part of a version group/tree.
The attribute name must be prefixed with xml_, e.g. xml_vid if the attribute containing the value in the external metadata file is called vid.
versionLevelAttribute
Name of the source attribute which identifies the order of objects within a group of versions. Must be used together with versionIdentifierAttribute.
The specified source attribute’s values must be distinct for all objects within the same version group/tree, i.e. with the same versionIdentifierAttribute value.
The specified source attribute’s values must be positive numbers. A decimal point followed by one or more digits is also permitted, as long as the value makes sense as a number.
The attribute name must be prefixed with xml_, e.g. xml_version if the attribute containing the version in the external metadata file is called version.
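The rules for versionIdentifierAttribute and versionLevelAttribute can be sketched as a grouping and ordering step. The Python example below is an illustration under stated assumptions: the objects are plain dictionaries, the attribute names xml_vid and xml_version are taken from the examples above, and the function name is hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

def build_version_trees(objects, id_attr="xml_vid", level_attr="xml_version"):
    """Group objects by version identifier and order each group by level."""
    trees = defaultdict(list)
    for obj in objects:
        trees[obj[id_attr]].append(obj)
    for tree_id, members in trees.items():
        levels = [Decimal(m[level_attr]) for m in members]
        if any(level <= 0 for level in levels):
            raise ValueError(f"non-positive version level in tree {tree_id}")
        if len(set(levels)) != len(levels):
            raise ValueError(f"duplicate version level in tree {tree_id}")
        # Lowest level first, i.e. the oldest version leads the tree.
        members.sort(key=lambda m: Decimal(m[level_attr]))
    return dict(trees)

scanned = [
    {"name": "spec_v2.doc", "xml_vid": "A", "xml_version": "2.0"},
    {"name": "spec_v1.doc", "xml_vid": "A", "xml_version": "1.0"},
    {"name": "memo.doc", "xml_vid": "B", "xml_version": "1"},
]
trees = build_version_trees(scanned)
print([o["name"] for o in trees["A"]])
```

Decimal is used instead of float so that values such as 1.10 and 1.1 are compared as exact numbers.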
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during scan
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Configuration parameters
Values
Name
Enter a unique name for this scanner
Mandatory
Adapter type
Select SharePoint Online from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server has been created by the user to this point, migration-center will prompt the user to define a Job Server Location when saving the Scanner.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
siteURL*
This is the URL to the SharePoint site that will be scanned.
Example: https://<sharepointonlinefarm>/<site>/
Mandatory
username*
The SharePoint Online user on whose behalf the scan process will be executed.
Should be a SharePoint Online Administrator.
Example: admin@company.onmicrosoft.com
Mandatory
password*
Password of the user specified above
Mandatory
proxyServer
The name or IP of the proxy server.
proxyPort
The port of the proxy server.
proxyUsername
The username if required by the proxy server.
proxyPassword
The password for the proxy username.
camlQuery
CAML statement that will be used to retrieve the ids of objects that will be scanned.
If this parameter is set, the parameters excludeListsAndLibraries, includeListsAndLibraries, scanSubsites and excludeSubsites must not be set.
excludeListsAndLibraries
The list of library and list paths to be excluded from scanning.
includeListsAndLibraries
List of Lists and Libraries the adapter should scan. Multiple values can be entered and separated with the “,” character.
excludeSubsites
The list of subsites path to be excluded from scanning.
Multiple values can be entered and separated with the “,” character.
excludeContentTypes
The list of content types to be excluded from scanning.
Multiple values can be entered and separated with the “,” character.
excludeFolders
The list of folders to be excluded from scanning. All folders with the specified name in the site/subsite/library/list, depending on the scanner configuration, will be ignored by the scanner. To exclude a specific folder, specify its full path.
Multiple values can be entered and separated with the “,” character.
Example: with folder1, all folders named folder1 in the site/subsites/library/list will be excluded.
With <Some_Library>/<Test_folder>/folder1, the scanner will exclude only the folder1 that is in Test_folder.
includeFolders
List of folders the adapter should scan. All folders with the specified name in the site/subsite/library/list, depending on the scanner configuration, will be scanned. To scan a specific folder, specify its full path.
The values of the parameter “excludeFolders” will be ignored if this parameter contains values.
Multiple values can be entered and separated with the “,” character.
Example: with folder1, all folders named folder1 in the site/subsites/library/list will be scanned.
With <Some_Library>/<Test_folder>/folder1, the scanner will scan only the folder1 that is in Test_folder.
scanSubsites
Flag indicating if the objects from subsites will be scanned.
scanDocuments
Flag indicating if the scanned documents will be added as migration-center objects.
scanFolders
Flag indicating if the scanned folders will be added as migration-center objects.
includedAttributes
The internal attributes that will be scanned even if the value is null
scanLatestVersionOnly
Flag indicating if just the latest version of a document will be scanned.
computeChecksum
If enabled the scanner calculates a checksum for every content it scans. These checksums can be used during import to compare against a second checksum computed during import of the documents. If the checksums differ, it means the content has been corrupted or otherwise altered, causing the affected document to be rolled back and transitioned to the “import error” status in migration-center.
hashAlgorithm
Specifies the algorithm to be used if the computeChecksum parameter is checked. Supported algorithms: MD5, SHA-1, SHA-224, SHA-256, SHA-384 and SHA-512.
Default algorithm is MD5.
hashEncoding
The encoding type which will be used for checksum computation. Supported encoding types are HEX, Base32, Base64.
Default encoding is HEX.
exportLocation*
Folder path. The location where the exported object content should be temporarily saved. It can be a local folder on the same machine as the Job Server or a shared folder on the network. This folder must exist prior to launching the scanner and must have write permissions. migration-center will not create this folder automatically. If the folder cannot be found, an appropriate error will be raised and logged. This path must be accessible by both scanner and importer, so if they are running on different machines, it should be a shared folder.
Mandatory
numberOfThreads
Maximum number of concurrent threads.
Default is 10 and maximum allowed is 20.
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during scan
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
Configuration property
Values
excluded_file_extensions
List of file extensions that will be ignored by the scanner.
Default: .aspx|.webpart|.dwp|.master|.preview.
excluded_attributes
List of internal attributes that will be ignored by the scanner.
initialization_timeout
Amount of time in milliseconds to wait before the scanner throws a timeout error during the initialization phase.
Default: 21600000 ms (6 hours)
Configuration parameters
Values
Name
Enter a unique name for this scanner
Mandatory
Adapter type
Select the “LotusDomino” adapter from the list of available adapters
Mandatory
Location
Select the job server location where this job should be run. Jobservers are defined in the Jobserver window. If no job server is selected, migration-center will prompt the user to define a job server location when saving the scanner.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
dominoServer
The IBM Domino server used to connect to the application. If the application (”.nsf” file) is stored and accessed locally without using an IBM Domino server, leave this field empty.
dominoDatabase*
The filename of the “.nsf” file that holds the application’s documents. If the “.nsf” file is stored inside the IBM Domino/Notes data directory, the path of the “.nsf” file relative to the IBM Domino/Notes data directory is sufficient, otherwise specify the fully qualified filename of the “.nsf” file.
If PDF is used as either the primary format or one of the secondary formats and PDF is to be generated based on existing documents (s.a.), the value for “dominoServer” and “dominoDatabase” will be passed to PDF generation module. Therefore, the database filename should be specified relative to the IBM Domino/Notes data directory.
Mandatory
idFilename*
The filename of the ID file used to access the application.
This ID must have full permissions for all documents that are due to be scanned.
Mandatory
password
The password for the ID file referenced in parameter “idFilename”.
selectionFormula*
An IBM Notes formula to select the documents that should be processed by the scanner.
The default is “select @all” which will process all documents
Mandatory
profileName*
The name of the profile used to extract information out of the IBM Domino/Notes application.
The default value for this parameter is “mcProfile”, which will cause the scanner to process the application according to the other scanner configuration parameters, e.g. extract document metadata, document contents and attachments etc.
By changing the value to “mcStatistics”, the scanner will ignore most of the other scanner configuration parameters and, instead of processing each document (extracting metadata, document contents and attachments), generate a text file with statistical information about the application (forms, documents, attributes). The generated file will be placed inside the folder specified by the scanner parameter “exportLocation” and named “<jobID>_statistics.txt”. The profile “mcStatistics” will not generate any objects in the migration-center database.
This parameter’s value must not be changed to any other value than “mcProfile” or “mcStatistics” unless a customized profile has been developed to fulfill specific customer needs.
Mandatory
primaryDocumentFormat*
The primary format used to extract the document. The resulting file will be treated as the primary document content in mc database. Valid values are “dxl”, “html”, “eml”, “eml2html” and “pdf”.
Details regarding the different formats can be found in chapter Document Formats.
The default value is “dxl”.
The 64-bit version of the scanner can only generate “DXL” and “PDF”. Configuring any other format will cause the scanner to fail.
Mandatory
secondaryDocumentFormats
A list of all document formats that should be generated in addition to the primary document format (see “primaryDocumentFormat” above). Multiple values must be separated by “|” (pipe). Valid values are “dxl”, “html”, “eml”, “eml2html” and “pdf”.
The resulting files will be associated with the mc object as secondary formats. Their (fully-qualified) filenames are made available using the mc object’s “secondaryFormats” attribute which is a multi-value attribute.
Details regarding the different formats can be found in chapter Document Formats.
The 64-bit version of the scanner can only generate “DXL” and “PDF”. Configuring any other format will cause the scanner to fail.
includeAttributes
A list of all document attributes (metadata) that should be extracted from the IBM Domino/Notes application and made available inside the MC database. If all attributes should be extracted, leave this field empty
excludedAttributeTypes
A filter specifying Domino data types that should not be exported from the IBM Domino/Notes application.
Please refer to chapter Domino attribute types for details.
Default value is “1” which will exclude all composite items from being exported to the migration-center database
attributeSplitterMaxChunkSizeBytes
Large attribute values are split into chunks of at most the specified number of bytes, with correct handling of multi-byte characters, to avoid SQL exceptions.
migration-center uses Oracle’s “varchar2” datatype, which has a maximum of 4,000 bytes.
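Splitting by bytes rather than by characters matters because a multi-byte character must never be cut in half. The following Python sketch shows one way to do this; it is not the scanner's implementation. The function name is hypothetical and UTF-8 is assumed as the database encoding.

```python
def split_by_bytes(value, max_chunk_size_bytes=4000, encoding="utf-8"):
    """Split text into chunks whose encoded size stays within the limit."""
    chunks, current, size = [], [], 0
    for ch in value:
        n = len(ch.encode(encoding))
        # Start a new chunk before the limit would be exceeded, so that
        # a multi-byte character always stays in one piece.
        if current and size + n > max_chunk_size_bytes:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(ch)
        size += n
    if current:
        chunks.append("".join(current))
    return chunks

print(split_by_bytes("abcdef", 4))
print(split_by_bytes("ääää", 3))
```

In the second call each "ä" occupies two bytes in UTF-8, so a 3-byte limit leaves room for only one character per chunk.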
exportCompositeItems*
Specifies whether composite items (i.e. richtext fields) contained in an IBM Domino/Notes document (e.g. an e-mail’s “Body” element) should be extracted from the document and made available as separate richtext files (RTF format). Valid values are “false” and “true” as well as “0” and “1”.
If this option is chosen, the scanner will generate one RTF file for each of an IBM Domino/Notes document’s composite items. The name of the file will be created as <document’s NoteID>_<item’s name>.rtf.
This option is especially useful if the document’s contents (typically contained in richtext fields) should be editable once the document has been migrated into the target system.
This feature is not supported with the 64-bit version of the scanner.
Mandatory
includedCompositeItems
A list of names of composite items in a document (e.g. “Body”) that should be extracted into separate richtext files. Multiple values must be separated by “|” (pipe). If all composite items should be extracted, leave this field empty.
If you want to exclude specific attributes, prefix each attribute name with a “!”.
It is not possible to mix include and exclude operations. If one composite item’s name in the list is prefixed with “!”, then only those composite item names starting with “!” will be considered and the corresponding items will be excluded
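The include and exclude rules for includedCompositeItems can be summarized in a few lines of code. This Python sketch is an illustration with a hypothetical function name; it assumes item names are compared case-sensitively.

```python
def select_composite_items(available, included_composite_items):
    """Apply the include/exclude rules to a document's composite item names."""
    names = [n.strip() for n in included_composite_items.split("|") if n.strip()]
    if not names:
        # Empty parameter: extract all composite items.
        return list(available)
    excluded = {n[1:] for n in names if n.startswith("!")}
    if excluded:
        # As soon as one name is prefixed with "!", only the prefixed
        # names count, and those items are excluded.
        return [item for item in available if item not in excluded]
    return [item for item in available if item in names]

items = ["Body", "Comment", "Notes"]
print(select_composite_items(items, "Body|Notes"))
print(select_composite_items(items, "!Body"))
```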
exportAttachments*
Specifies whether attachments contained in the IBM Domino/Notes documents should be extracted from the document in their native format and made available as separate MC objects. Valid values are “false” and “true” as well as “0” and “1”.
Mandatory
embedAttachmentsIntoPDF*
Determines whether the Domino documents’ attachments are extracted and embedded into a PDF rendition of the Domino document. If this parameter is set to true:
- all attachments will automatically be extracted from the document independent of “exportAttachments” parameter’s value,
- a PDF rendition will automatically be created even if it has not been requested according to the values of parameters “primaryDocumentFormat” or “secondaryDocumentFormats”.
Mandatory
embedLinksIntoPDF
If a PDF rendition is requested and this parameter is set to true, links (Domino document links and URL links) contained in the original Domino document will be added as bookmarks to the PDF file.
The default value is “false”.
exportLocation*
The location where the exported object content should be temporarily saved. It can be a local folder on the machine that runs the job server or a shared folder on the network.
This folder must exist prior to launching the scanner and the MC user must have write permission for it. MC will not create this folder automatically. If the folder cannot be found, an appropriate error will be raised and logged.
This path must be accessible by both scanner and importer. Therefore, if scanner and importer are running on different machines, using a shared network folder is advisable.
Mandatory
loggingLevel*
Sets the verbosity of the log file.
Valid values are:
1 - logs only errors during scan
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
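The include/exclude rule described for the composite item list at the top of this parameter table can be sketched as follows. This is a minimal illustration and not the scanner's implementation; the function names are hypothetical, and case-sensitive name matching is an assumption.

```python
def parse_item_filter(value):
    """Return (mode, names) for a pipe-separated composite item list."""
    names = [n.strip() for n in value.split("|") if n.strip()]
    if not names:
        # Empty field: all composite items are extracted.
        return ("all", set())
    excluded = {n[1:] for n in names if n.startswith("!")}
    if excluded:
        # Include and exclude cannot be mixed: once any name carries "!",
        # only the prefixed names count, and those items are excluded.
        return ("exclude", excluded)
    return ("include", set(names))


def should_extract(item_name, value):
    """Decide whether a composite item would be extracted (assumed exact match)."""
    mode, names = parse_item_filter(value)
    if mode == "all":
        return True
    if mode == "include":
        return item_name in names
    return item_name not in names
```

For example, with the value “!Body|Comments” only “Body” is excluded, because the unprefixed name “Comments” is ignored once an exclusion is present.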
Type
Numeric value
TYPE_ACTION
16
TYPE_ASSISTANT_INFO
17
TYPE_CALENDAR_FORMAT
24
TYPE_COLLATION
2
TYPE_COMPOSITE
1
TYPE_ERROR
0
TYPE_FORMULA
1536
TYPE_HIGHLIGHTS
12
TYPE_HTML
21
TYPE_ICON
6
TYPE_INVALID_OR_UNKNOWN
0
TYPE_LS_OBJECT
20
TYPE_MIME_PART
25
TYPE_NOTELINK_LIST
7
TYPE_NOTEREF_LIST
4
TYPE_NUMBER
768
TYPE_NUMBER_RANGE
769
TYPE_OBJECT
3
TYPE_QUERY
15
TYPE_RFC822_TEXT
1282
TYPE_SCHED_LIST
22
TYPE_SEAL
9
TYPE_SEAL_LIST
11
TYPE_SEALDATA
10
TYPE_SIGNATURE
8
TYPE_TEXT
1280
TYPE_TEXT_LIST
1281
TYPE_TIME
1024
TYPE_TIME_RANGE
1025
TYPE_UNAVAILABLE
512
TYPE_USER_DATA
14
TYPE_USERID
1792
TYPE_VIEW_FORMAT
5
TYPE_VIEWMAP_DATASET
18
TYPE_VIEWMAP_LAYOUT
19
TYPE_WORKSHEET_DATA
13
Configuration parameters
Values
Name
Enter a unique name for this scanner
Mandatory
Adapter type
Select the “Database” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server has been defined, migration-center will prompt the user to define a Job Server Location when saving the Scanner.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
connectionURL*
The database connection URL, i.e. the string that your DBMS JDBC driver uses to connect to a database. It can contain information such as where to search for the database, the name of the database to connect to, and configuration properties. The exact syntax of a database connection URL is specified by your DBMS.
Example connection strings for some common databases:
jdbc:oracle:thin:@[host][:port]:SID
jdbc:sqlserver://[serverName[\instanceName][:portNumber]][;property=value[;property=value]]
jdbc:mysql://host_name:port/dbname
Example (Excel):
jdbc:odbc:DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=PATH-TO-EXCEL-FILE;ReadOnly=1
Mandatory
driverClass*
The JDBC driver entry class, i.e. the class that implements the interface java.sql.Driver.
Examples:
oracle.jdbc.OracleDriver
com.microsoft.sqlserver.jdbc.SQLServerDriver
sun.jdbc.odbc.JdbcOdbcDriver
Mandatory
username*
Database username used for jdbc connection.
Mandatory
password*
Password used for jdbc connection.
Mandatory
queryFile*
The path to the XML file that contains the SQL queries the scanner uses to extract objects and metadata from the database.
See chapter 3.2.3 for more details about configuring queries.
Mandatory
scanUpdates*
Enables the scanner to update previously scanned objects in the mc database. If unticked, previously scanned objects will be skipped.
Mandatory
deltaFields
Contains the fields that will be used for detecting whether an object needs to be scanned as an update. If the value of one of the attributes specified here was changed in the source database after the previous scan was executed, the scanner will scan the object as an update; otherwise it will ignore the object. Delta fields are taken into consideration only when “scanUpdates” is checked. See section 3.3 Delta migration for more details.
computeChecksum
When it's checked the checksum of scanned files will be computed. Useful for determining whether files with different names and from different locations have in fact the same content, as can frequently happen with common documents copied and stored by several users in a file share environment.
Do not enable this option unless necessary, since the performance impact is significant: the scanner has to read the full content of each file in order to compute its checksum.
hashAlgorithm
Specifies the algorithm that will be used to compute the checksum of the scanned objects.
Possible values are "MD2", "MD5", "SHA-1", "SHA-224", "SHA-256", "SHA-384" and "SHA-512". Default value is MD5.
hashEncoding
Specifies the encoding that will be used to compute the Checksum of the scanned objects.
Possible values are "HEX", "Base32" and "Base64". Default value is HEX.
exportLocation
The location where the exported object content should be saved. It can be a local folder on the job server or a shared folder. It must exist and must be writable.
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during scan
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
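The interplay of computeChecksum, hashAlgorithm and hashEncoding can be illustrated with a short sketch. It is not the scanner's code: the function name is hypothetical, MD2 is left out because Python's hashlib does not guarantee it, and whether the scanner emits upper or lower case HEX digits is not stated here, so lower case is an assumption.

```python
import base64
import hashlib

# Maps the documented algorithm names to hashlib names (MD2 omitted, see above).
_ALGORITHMS = {
    "MD5": "md5",
    "SHA-1": "sha1",
    "SHA-224": "sha224",
    "SHA-256": "sha256",
    "SHA-384": "sha384",
    "SHA-512": "sha512",
}


def compute_checksum(data, hash_algorithm="MD5", hash_encoding="HEX"):
    """Hash the full content, then encode the raw digest as text."""
    digest = hashlib.new(_ALGORITHMS[hash_algorithm], data).digest()
    if hash_encoding == "HEX":
        return digest.hex()
    if hash_encoding == "Base32":
        return base64.b32encode(digest).decode("ascii")
    if hash_encoding == "Base64":
        return base64.b64encode(digest).decode("ascii")
    raise ValueError("unsupported hashEncoding: " + hash_encoding)
```

Two files with identical content yield the same checksum regardless of name or location, which is what makes the value usable for duplicate detection. The whole content has to be read to produce it, hence the performance warning above.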
Attribute name
Description
type
Defines the type of the query. The following values are possible: main, versions, main-metadata, version-metadata, main-content, version-content.
main - the query that returns the unique identifier of every object that will be scanned. A valid configuration must contain exactly one main query definition. If versions need to be handled, this query should also return the identifier that will be passed to the versions query.
versions - the query that returns the unique identifier of every version of the objects returned by the main query. It takes as parameter the value of the column specified in the attribute versionid of the main query. This query is run once for every row returned by the main query.
main-metadata - the queries that extract the metadata for main objects. They take as parameter the value returned by the main query in the column specified by the attribute key. These queries are run once for every row returned by the main query. You can define an unlimited number of such queries.
version-metadata - the queries that extract the metadata for versions. They take as parameter the value returned by the versions query in the column specified by the attribute key. These queries are run once for every row returned by the versions query. You can define an unlimited number of such queries.
main-content - the query that extracts the content for main objects. It takes as parameter the value returned by the main query in the column specified by the attribute key. The query is run once for every row returned by the main query.
version-content - the query that extracts the content for version objects. It takes as parameter the value returned by the versions query in the column specified by the attribute key. The query is run once for every row returned by the versions query.
Note: If a versions query is present, only the objects returned by this query will be extracted. The main query will be used only for identifying the version trees that will be scanned. In this case the main-metadata queries will be ignored.
Mandatory
name
Defines the name of the query and it will be used for logging purpose.
Optional
key
Defines the name of the column whose value will be stored in the MC column id_in_source_system. The value of this column will be passed as parameter to the main-metadata or version-metadata queries.
Mandatory for main and versions queries.
versionid
Defines the name of the column whose value will be passed as parameter to the versions query. It can be defined only for the main query.
Optional
parentid
Defines the column name that contains the id of the parent version. This might be used only in case of branches. When not used, the versions will be scanned in the order they are returned by the query.
Optional for versions query.
contentpath
Defines the column name that contains the path where object content is stored. It will be used to populate the MC column "content_location".
Note: The content will not be exported by the scanner in the location specified by this attribute. The value should point to an existing file.
Optional for main-metadata and version-metadata queries.
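The relationship between the main query and the main-metadata queries (one run per row of the main query, with the key column value passed as parameter) can be sketched with an in-memory SQLite database. This only illustrates the flow; the tables, columns and query texts are invented for the example, and the real scanner reads its queries from the XML query file and connects through JDBC.

```python
import sqlite3

# Hypothetical source schema, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE docs (doc_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE doc_props (doc_id INTEGER, author TEXT);
INSERT INTO docs VALUES (1, 'Spec'), (2, 'Report');
INSERT INTO doc_props VALUES (1, 'alice'), (2, 'bob');
""")

MAIN_QUERY = "SELECT doc_id FROM docs ORDER BY doc_id"  # key column: doc_id
MAIN_METADATA_QUERIES = [
    "SELECT title FROM docs WHERE doc_id = ?",
    "SELECT author FROM doc_props WHERE doc_id = ?",
]


def scan(connection, key_column="doc_id"):
    """Run every main-metadata query once per row of the main query."""
    connection.row_factory = sqlite3.Row
    objects = {}
    for main_row in connection.execute(MAIN_QUERY).fetchall():
        key = main_row[key_column]  # would become id_in_source_system
        metadata = {}
        for query in MAIN_METADATA_QUERIES:
            for row in connection.execute(query, (key,)):
                metadata.update({name: row[name] for name in row.keys()})
        objects[key] = metadata
    return objects


scanned = scan(conn)
```

A versions query would follow the same pattern one level deeper: it is run once per main row with the versionid column value, and the version-metadata queries are then run once per version row.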
Element
Possible Values
Description
ACLType
· Owner
· OwnerGroup
· Public
· ACL
This refers to the default and assigned access.
RightName
· Content Server User Login Name
· Content Server Group Name
· -1
It’s set with the owner name or owner access, with the username or group name for the assigned access and with “-1” for the public access.
Permissions
· See
· SeeContents
· Modify
· EditAtts
· CreateNode
· Checkout
· DeleteVersions
· Delete
· EditPerms
The granted permissions separated by |.
Configuration parameters
Values
Name*
Enter a unique name for this scanner
Mandatory
Adapter type*
Select the “OpenText” adapter from the list of available adapters
Mandatory
Location*
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server is defined, migration-center will prompt the user to define a Job Server Location when saving the Scanner.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
username*
The OpenText Content Server user with "System Administration Rights". This privilege is required so the scanner can access all objects in the scanning scope.
Mandatory
password*
The user password
Mandatory
webserviceUrl*
The URL to the .NET Content Web Services.
Ex: http://server:port/cws/Authentication.svc
Mandatory
authenticationWebserviceUrl*
The URL to a valid authentication webservice. Currently, CWS and OTDS authentication webservices are accepted.
Mandatory
authenticationType*
The authentication type. Valid values are CWS for standard content server authentication and OTDS for OpenText Directory Services authentication.
Mandatory
classificationsWebserviceUrl
The URL to a valid classification webservice or similar. Use only for Content Server 10.0.0 or later.
Ex: http://server:port/les-services-classifications/Classifications.svc
sessionLength*
The length of the session that is set in Content Server. The length is given in minutes and must be at least 16 minutes. This is required for refreshing the authentication token.
Mandatory
rootFolder*
The id of the node under which the containers and documents will be scanned. By default, the value is 2000. It can be set to any folder id.
Mandatory
scanFolderPaths
The list of folder paths where the scanner looks for objects (documents and folders) relative to the root folder specified. The paths must start with "/". Multiple paths can be separated using the “|” character. If empty the scanner will scan the entire folder structure under the root folder.
The following wildcards are allowed in the folder path:
* - replace zero, one or multiple characters
? - replace a single character
Examples of using wildcards:
/Shelf/Drugs/Drug No. ?
/Shelf/Drugs/*
/Shelf/Drugs/Drug No. ?/Test
/Shelf/Drugs/*end/Ultimate
excludeFolderPaths
The list of folder paths to be excluded from the scan. Paths must start with "/" and must be relative to the rootFolder.
exportDocuments
Flag indicating if the documents will be exported. When exportDocuments is enabled, the scanner will scan all the documents and their versions linked under the folders specified in rootFolder and scanFolderPaths. The documents are exported as OTCS(document) objects to the MC database.
exportLatestVersions
A number specifying how many versions from every version tree will be exported starting from the latest version to the older versions. If it is empty, 0 or negative, all versions will be exported.
exportCompoundDocuments
Flag indicating if the compound documents will be exported. When enabled the scanner will scan the latest version of the compound document together with its children. There will be no migration-center relations between the compound document and its children. The children are related to the compound document through the parent folder attribute.
The parameter exportCompoundDocuments can be enabled only when exportDocuments is enabled.
exportClassifications
Flag indicating if the classifications will be exported. When exportClassifications is enabled, the scanner will export the classifications of the scanned folders and documents in the source attribute "Classifications". Each classification will be a distinct value of the attribute "Classifications". The classification values will be saved as paths. The export of classifications will be available only for CS 10, 10.5 and 16 since CS 9.7.1 does not provide this functionality.
Ex: Corporate Information/News/Newsletter
exportShortcuts
Flag indicating if the shortcuts of scanned documents and folders will be exported. The full path of every shortcut pointing to the current object (document or folder) is scanned in the source attribute “Shortcuts”.
exportShortcutsAsObjects
Flag indicating if the shortcuts of scanned documents and folders will be exported as independent objects. The shortcuts and generations are exported as OTCS(shortcut) objects to MC database.
exportRenditions
Flag indicating if the renditions will be exported.
exportFolderStructure
Flag indicating if the structure of folders will be exported. When enabled the scanner will scan entire folder structure under the folders configured by parameters rootFolder and scanFolderPaths. The scanner will export all folders as OTCS(container) objects to MC database.
exportLocation*
The location where the exported object content should be saved. It can be a job server local folder or a shared folder. It must exist and it should be writable.
Mandatory
computeChecksum
When it's checked the checksum of scanned files will be computed. Useful for determining whether files with different names and from different locations have in fact the same content, as can frequently happen with common documents copied and stored by several users in a file share environment.
Do not enable this option unless necessary, since the performance impact is significant: the scanner has to read the full content of each file in order to compute its checksum.
hashAlgorithm
Specifies the algorithm that will be used to compute the Checksum of the scanned objects.
Possible values are "MD2", "MD5", "SHA-1", "SHA-224", "SHA-256", "SHA-384" and "SHA-512". Default value is MD5.
hashEncoding
Specifies the encoding that will be used to compute the Checksum of the scanned objects.
Possible values are "HEX", "Base32" and "Base64". Default value is HEX.
numberOfThreads
The number of threads that will be used for scanning the documents. Maximum allowed is 20.
loggingLevel*
The scanner logging level to use: 1-Error, 2-Warn, 3-Info, 4-Debug.
Mandatory
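The wildcard rules for scanFolderPaths (“*” for zero, one or multiple characters, “?” for a single character, “|” between paths) can be sketched as a small matcher. The function names are hypothetical, and the sketch assumes that a wildcard never crosses a “/” folder separator and that a pattern is compared against the complete folder path; the documentation above does not state either point.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a folder path pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")   # zero, one or multiple characters
        elif ch == "?":
            parts.append("[^/]")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matches_any(folder_path, scan_folder_paths):
    """Check a folder path against a '|'-separated list of patterns."""
    patterns = [p for p in scan_folder_paths.split("|") if p]
    if not patterns:
        # Empty parameter: the entire structure under the root folder is scanned.
        return True
    return any(wildcard_to_regex(p).fullmatch(folder_path) for p in patterns)
```

Under these assumptions, “/Shelf/Drugs/Drug No. ?” matches “/Shelf/Drugs/Drug No. 5” but not “/Shelf/Drugs/Drug No. 12”.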
Name
SolutionId
Deployed
mcscanner.wsp
f905025e-3de7-44c9-828a-f7b12f726bc1
False
Configuration parameters
Values
Name
Enter a unique name for this scanner
Mandatory
Adapter type
Select SharePoint from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server has been created by the user to this point, migration-center will prompt the user to define a Job Server Location when saving the Scanner.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
webserviceUrl*
This is the URL to the SharePoint Scanner component installed on the SharePoint Server
Also see chapter 3 Configuration for more information.
Example: http://<sharepointfarm>/<site>/
Mandatory
Username*
The SharePoint user on whose behalf the scan process will be executed.
This user also needs to be able to access the temporary storage location where the scanned objects will be saved to (see parameter exportLocation below).
Should be a SharePoint Administrator.
Example: sharepoint.corporate.domain\spadmin
Mandatory
Password*
Password of the user specified above
Mandatory
includeLibraries*
List of Document Libraries the adapter should scan. Multiple values can be entered and separated with the “|” character. At least one valid Document Library must be specified
Mandatory
query
SharePoint CAML query for a detailed selection of the documents to scan.
Does not work with excludeContentTypes.
excludeContentTypes
Exclude unneeded content types when scanning the document libraries specified above.
Multiple values can be entered and separated with the “|” character.
excludeFileExtensions
Exclude unneeded file types when scanning the document libraries specified above.
Multiple values can be entered and separated with the “|” character.
excludeAttributes
Exclude unneeded columns (attributes) when scanning the document libraries specified above.
Multiple values can be entered and separated with the “|” character.
includeInternalAttributes
List of internal SharePoint attributes to be scanned
scanDocuments
If enabled, the scanner will process all the Documents it encounters for the configured valid path.
scanListItems
If enabled, the scanner will process all the List Items it encounters for the configured valid path.
scanFolders
If enabled, the scanner will process all the Folders it encounters for the configured valid path.
scanLists
If enabled, the scanner will process all the Lists/Libraries it encounters for the configured valid path.
scanSubsites
If enabled, the scanner will process the data of the subsites.
scanPermissions
If enabled, the permissions of each created item, regardless of its type, will be scanned into the migration-center database for further use in the migration process.
scanVersionsAsSingleObjects
If enabled the scanner will process each version tree as a single object in MC, which contains the content and metadata of the latest version and also the link to the contents of the other versions in the mc_previous_version_content_paths attribute
computeChecksum
If enabled the scanner calculates a checksum for every content it scans. These checksums can be used during import to compare against a second checksum computed during import of the documents. If the checksums differ, it means the content has been corrupted or otherwise altered, causing the affected document to be rolled back and transitioned to the “import error” status in migration-center.
hashAlgorithm
Specifies the algorithm to be used if the computeChecksum parameter is checked.
Supported algorithms: MD5, SHA-1, SHA-224, SHA-256, SHA-384 and SHA-512. Default algorithm is MD5.
Note that SHA-224
hashEncoding
Specifies the encoding to be used if the computeChecksum parameter is checked.
Supported encodings: HEX and Base64. Default is HEX.
exportLocation*
Folder path. The location where the exported object content should be temporarily saved. This location is relative to the SharePoint server, thus the account specified above needs access to it. The export location can be either a local folder on the SharePoint server or a network share (recommended).
Mandatory
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during scan
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
Configuration parameters
Values
Name
Enter a unique name for this scanner
Mandatory
Adapter type
Select the “Outlook” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server has been defined, migration-center will prompt the user to define a Job Server Location when saving the Scanner.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
scanFolderPaths*
Outlook folder paths to scan.
The syntax is \\<accountname>[\folder path]. The account name at least must be specified. Folders are optional (specifying nothing but an account name would scan the entire mailbox, including all subfolders). Multiple paths can be entered by separating them with the “|” character.
Example: \\user@domain\Inbox would scan the Inbox of user@domain (including subfolders)
Mandatory
excludeFolderPaths
Outlook folder paths to exclude from scanning. Follows the same syntax as scanFolderPaths above.
Example: \\user@domain\Inbox\Personal would exclude user@domain’s personal mails stored in the Personal subfolder of the Inbox if used in conjunction with the above example for scanFolderPaths.
ignoredAttributesList
A comma separated list of Outlook properties to be ignored by the scanner.
At least Body,HTMLBody,RTFBody,PermissionTemplateGuid should be always excluded as these significantly increase the size of the information retrieved from Outlook but don’t provide any information useful for migration purposes in return
exportLocation*
Folder path. The location where the exported object content should be temporarily saved. It can be a local folder on the same machine as the Job Server or a shared folder on the network. This folder must exist prior to launching the scanner and must be writable. migration-center will not create this folder automatically. If the folder cannot be found, an appropriate error will be raised and logged. This path must be accessible by both scanner and importer, so if they are running on different machines, it should be a shared network folder.
Mandatory
loggingLevel*
Sets the verbosity of the log file.
Possible values:
1 - logs only errors during scan
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
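The path syntax shared by scanFolderPaths and excludeFolderPaths can be sketched as follows. The function names are hypothetical, and the sketch assumes case-sensitive comparison and that excluding a folder also excludes its subfolders; neither is stated above.

```python
def parse_outlook_paths(value):
    r"""Split a '|'-separated list of \\<accountname>[\folder path] entries."""
    entries = []
    for raw in value.split("|"):
        raw = raw.strip()
        if not raw:
            continue
        if not raw.startswith("\\\\"):
            raise ValueError("path must start with two backslashes: " + raw)
        account, _, folder = raw[2:].partition("\\")
        if not account:
            raise ValueError("the account name must be specified: " + raw)
        entries.append((account, folder))
    return entries


def _covers(entry, target):
    """True if target is the entry's folder or one of its subfolders."""
    account, folder = entry
    target_account, target_folder = target
    if account != target_account:
        return False
    # An entry without a folder covers the entire mailbox.
    return (folder == "" or target_folder == folder
            or target_folder.startswith(folder + "\\"))


def is_in_scope(folder_path, scan_folder_paths, exclude_folder_paths=""):
    """Decide whether a folder would be scanned under the given parameters."""
    target = parse_outlook_paths(folder_path)[0]
    included = any(_covers(e, target) for e in parse_outlook_paths(scan_folder_paths))
    excluded = any(_covers(e, target) for e in parse_outlook_paths(exclude_folder_paths))
    return included and not excluded
```

With the examples from the table, \\user@domain\Inbox\Personal is out of scope while the other subfolders of the Inbox remain in scope.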
The Documentum No Content Copy (NCC) Importer is a special variant of the regular Documentum Importer. It offers the same features as the regular Documentum Importer, with the difference that the content of the documents is not imported to Documentum during migration. The content files themselves can be attached to the migrated documents in the target repository by using one of the following methods:
copy files from the source storage to the target storage outside of migration-center
attach the source storage to the target repository so the content will be accessed from the original storage.
The Documentum Scanner currently supports Documentum Content Server versions 6.5 to 20.2, including service packs.
For accessing a Documentum repository Documentum Foundation Classes 6.5 or newer is required. Any combinations of DFC versions and Content Server versions supported by EMC Documentum are also supported by migration-center’s Documentum Scanner, but it is recommended to use the DFC version matching the version of the Content Server being scanned. The DFC must be installed and configured on every machine where migration-center Server Components is deployed.
When documents to be migrated are located in a Content Addressable Storage (CAS) like Centera or ECS some additional steps for deployment and configurations are required.
Create the centera.config file in the folder .\lib\mc-dctm-adaptor. The file must contain the following line:
PEA_CONFIG = cas.ecstestdrive.com?path=C:/centera/ecs_testdrive.pea
Note: Set the storage IP or machine name and the local path to the PEA file.
Copy Centera SDK jar files in .\lib\mc-dctm-adaptor folder
Copy Centera SDK dlls in folder that is set in the path variable. Ex: C:\Program Files\Documentum\Shared
Restart the jobserver
If the 32-bit Centera SDK is used, then the Jobserver must be run with 32-bit Java.
The importer was tested with Centera SDK version 3.2.
The Documentum NCC importer works in combination with the Documentum No Content Copy (NCC) Scanner. The scanner does not export the content from the source repository; instead it exports the dmr_content objects associated with the documents and stores them in the migration-center database as relations of type "ContentRelation". The importer uses the information in the exported dmr_content objects to create the corresponding dmr_content objects in the target repository in such a way that they point to the content in the original filestore.
The Documentum NCC importer supports all the features of the standard Documentum Importer, but the system rules related to the content behave differently, as described below. Also, some dm_document attributes are now mandatory.
Primary content attribute
Name
Description
a_content_type
It must be set with the format of the content in the target repository. Leave it empty for the documents that don't have a content.
a_storage_type
It must be set to the name of the storage in the target repository where the document will be imported. The target storage must be a storage that points to the filestore where the document was located in the source repository. Leave it empty for documents that don't have content.
Rendition system attribute
Name
Description
dctm_obj_rendition
It has to be set with the r_object_id of the dmr_content objects scanned from source repository. The required values are provided in the source attribute with the same name.
dctm_obj_rendition_format
For every value in dctm_obj_rendition, a rendition format must be specified in this rule. The formats specified in this attribute must be valid formats in the target repository.
dctm_obj_rendition_modifier
Specify a page modifier to be set for the rendition. Any string can be set (must conform to Documentum’s page_modifier attribute, as that’s where the value would end up)
Leave empty if you don’t want to set any page modifiers for renditions.
If not set, the importer will not set any page modifier
dctm_obj_rendition_page
Specify the page number of every rendition in rule dctm_obj_rendition.
If not set, the page number 0 will be set for all renditions
dctm_obj_rendition_storage
Specify a valid Documentum filestore for every rendition set in the rule dctm_obj_rendition. It must have the same number of values as the dctm_obj_rendition attribute.
All renditions scanned from the source repository must be imported in the target repository. If dctm_obj_rendition will be set with fewer or more values than the renditions scanned from the source repository the object will fail to import.
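The value-count constraints on the rendition rules can be checked before an import with a small helper such as the one below. It is only a sketch of the constraints stated in this table; the function name and the sample values in the usage lines are hypothetical, and the importer performs its own validation.

```python
def check_rendition_rules(scanned_renditions, rendition, formats, storages, pages=None):
    """Return (errors, effective_pages) for one object's rendition rules."""
    errors = []
    if len(rendition) != len(scanned_renditions):
        errors.append("dctm_obj_rendition must have exactly one value per "
                      "rendition scanned from the source repository")
    if len(formats) != len(rendition):
        errors.append("dctm_obj_rendition_format needs one value per rendition")
    if len(storages) != len(rendition):
        errors.append("dctm_obj_rendition_storage needs one value per rendition")
    # If dctm_obj_rendition_page is not set, page number 0 applies to all renditions.
    effective_pages = list(pages) if pages else [0] * len(rendition)
    return errors, effective_pages


# Hypothetical values for illustration only.
errors, pages = check_rendition_rules(
    scanned_renditions=["id_a", "id_b"],
    rendition=["id_a", "id_b"],
    formats=["pdf", "crtext"],
    storages=["filestore_01", "filestore_01"],
)
```

An empty error list means the counts are consistent; it does not verify that the formats and filestores actually exist in the target repository.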
In addition to standard parameters inherited from Documentum Importer some specific parameters are provided in the Documentum NCC importer.
Name
Description
maxSizeEmbeddedFileKB
Content Addressable Storage (CAS) allows small content to be saved embedded. Set the maximum size of the content that can be stored embedded. The maximum allowed value is 100 (KB). If 0 is set, no embedded content is saved.
tempStorageName
The name of the temporary storage where a dummy content will be created during the migration. For creating dmr_content objects the importer needs to create a temporary dummy content (files of 0 KB) that will be stored in the storage specified in this parameter.
This storage should be deleted after the migration is done.
This kind of migration requires some settings to be done on the target repositories at the end of the migration.
As it was described above, the importer creates temporary content during migration. This content is not required to be kept after the migration and therefore the storage set in the parameter "tempStorageName" can be deleted from the target repository.
This section applies only when content is located in a file storage. When a Content Addressable Storage is used, updating the data ticket sequence is not applicable.
After documents are imported to your new Documentum there will be a mismatch of the data ticket offset between your new Documentum file store and the Documentum Content Server Cache. The Documentum NCC importer is delivered with a tool fixing this offset. You can find this tool in your Server Components Installations folder under "\Tools\mc-fix-data-ticket-sequence". Running this start.bat-file opens a graphical user interface with a dialog to select the repository, which contains imported documents, and provide credentials to connect to it. After a successful login the tool lists available Documentum file stores that may require an action to fix the data sequence. If the value “offset” of a file store is less than 0 your action is required to update the data ticket sequence of this file store. The column “action” informs you to take action as well. If you want to update the data ticket sequence of a file store, you must select the file store and press the button “Update data ticket sequence” at the bottom of the tool. After a successful update of the data ticket sequence you have to restart your repository via the Documentum Server Manager.
Delta migration for the multipage content does not work properly when a new page is added to the primary content.
The box importer takes files processed in migration-center and imports them into the target box site.
Starting with migration-center 3.2.6, the box Importer uses the box API 2.0 to communicate with the box cloud through the respective box web services.
Importer is the term used for an output adapter which is most likely used at the last step of the migration process. An importer works as a job that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created. An importer is defined by a unique name, a set of configuration parameters and an optional description.
Box Importers can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed by migration-center Job Server.
Before being able to import into BOX there are some extra steps that need to be done:
Install the provided certificate
Authorize the Job Server machine from BOX
In the Job Server install location, under ../lib/mc-box-importer there is a file called myKeystore.p12. This file needs to be installed under the Trusted Root Certification Authorities in Windows. The certificate password is fmefme.
Depending on which browser you use for step 2, it might also be necessary to install this certificate in the browser you are using as well.
Please restart your Job Server after installing the certificate.
When you try to perform a BOX import for the first time on a specific Job Server, you will receive an error with a Box URL:
On the machine with the Job Server, access the provided link in the browser of your choice and give authorization to the Job Server to import into Box.
After the user successfully authenticated migration-center, the importer job can be started successfully.
If the Job Server is to be restarted then the importer will give this error again with a new link, and the machine needs to be provided authorization again.
The box importer can import files to box.net and can also create folders. All folders and files imported to box will have their own permissions, so only the box user whose credentials were used for the import can access the imported folders and files on the box site.
Documents targeted at a box site will have to be added to a migration set. Create a new migration set and select the <source object type>ToBox(document) object type in the Type drop-down. This is set in the -Migration Set Properties- window which appears when creating a new migration set (the type of object can no longer be changed after a migration set has been created).
The migration set is now set to work with box documents.
As with other target systems, migration-center objects targeted at box have some predefined system rules available in Transformation Rules. For box, these are “file_name”, “folder_path” and “target_type”. Additional transformation rules for other box document attributes can of course be defined by the user.
If rules for specific attributes are defined, the values need to be associated with the target attributes of a box_document. This is done on the Associations tab of the –Transformation Rules- window by selecting the box_document object type and pointing the target_type rule at it. In this case the value output by the target type rule should be “box_document”, as it must match the object type selected below.
Working with rules and associations is a core product functionality and is described in detail in the Client User Guide.
If the object type “box_document” mentioned above is not available, it can be created from <Manage>, <Object Types…> with specific box attributes:
Collaborators can be set by creating a transformation rule whose values have the format <emailaddress>;<role>,
e.g. j.doe@fme.ro;Editor or j.doe@fme.ro;Viewer.
This transformation rule needs to be associated to the collaborators attribute of the box_document object type.
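The collaborator format described above can be illustrated with a minimal Python sketch. The helper names format_collaborator and parse_collaborator are hypothetical and not part of migration-center; the sketch only shows how a value in the <emailaddress>;<role> format is composed and split, and it does not check whether the role is one that Box accepts.

```python
from typing import Tuple


def format_collaborator(email: str, role: str) -> str:
    """Build one collaborators value in the <emailaddress>;<role> format."""
    email, role = email.strip(), role.strip()
    if not email or not role or ";" in email or ";" in role:
        raise ValueError("email and role must be non-empty and must not contain ';'")
    return f"{email};{role}"


def parse_collaborator(value: str) -> Tuple[str, str]:
    """Split a collaborators value back into (email, role)."""
    email, separator, role = value.partition(";")
    if not separator or not email.strip() or not role.strip():
        raise ValueError(f"expected <emailaddress>;<role>, got {value!r}")
    return email.strip(), role.strip()


if __name__ == "__main__":
    print(format_collaborator("j.doe@fme.ro", "Editor"))
    print(parse_collaborator("j.doe@fme.ro;Viewer"))
```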
Tasks can be set by associating valid rules for the following target attributes:
task_users: needs to contain the valid email of a user
task_comment: needs to contain a valid text comment
task_date: (optional) a valid future date for when the task should finish
Multiple tasks can be set by having repeating values on the 3 attributes. All 3 attributes will need to contain the same number of repeating values for the import to succeed.
Note that if the task_users attribute is set, then task_comment needs to be set as well for the task to be successfully imported.
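The two task rules above (matching value counts, and task_comment being required whenever task_users is set) can be checked before import with a small script. The following Python sketch is only an illustration; check_task_values is a hypothetical helper and does not reflect how the importer itself validates tasks.

```python
from typing import List, Optional, Sequence


def check_task_values(
    task_users: Sequence[str],
    task_comment: Sequence[str],
    task_date: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return a list of problems found in the repeating task values."""
    problems: List[str] = []
    if not task_users:
        return problems  # no tasks requested, nothing to check
    if not task_comment:
        problems.append("task_comment must be set when task_users is set")
        return problems
    counts = {"task_users": len(task_users), "task_comment": len(task_comment)}
    if task_date is not None:  # task_date is optional
        counts["task_date"] = len(task_date)
    if len(set(counts.values())) > 1:
        problems.append(f"attributes have different numbers of values: {counts}")
    return problems


if __name__ == "__main__":
    print(check_task_values(["j.doe@fme.ro"], ["Please review"]))
    print(check_task_values(["a@fme.ro", "b@fme.ro"], ["Please review"]))
```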
Objects that have changed in the source system since the last scan are (re-)scanned as update objects. Whether an object in a migration set is an update or not can be seen by checking the value of the Is_update column: if it is 1, the current object is an update to a previously scanned object (the base object). There are some things to consider when working with the update migration feature:
An update object cannot be imported unless its base object has been imported previously.
Objects deleted from the source after having been migrated are not detected and will not be deleted in the target system. This is by design (due to the added overhead, complexity and risk involved in deleting customer data).
Updates/changes to primary content will be detected and updated accordingly.
To create a new box Importer job, specify the respective adapter type in the Importer Properties window – from the list of available adapters “Box” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type.
The Properties window of an importer can be accessed by double-clicking an importer in the list or by selecting the Properties button or entry from the toolbar or context menu.
Configuration parameters and their values:
Name (mandatory): Enter a unique name for this importer.
Adapter type (mandatory): Select the “Box” adapter from the list of available adapters.
Location (mandatory): Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server is defined, migration-center will prompt the user to define a Job Server Location when saving the Importer.
Description: Enter a description for this job (optional).
Configuration parameters and their values:
box_username* (mandatory): User name for connecting to the target box site.
box_password* (mandatory): The user’s password.
box_import_location*: The path inside the target box site where objects are to be imported. It must be a valid box path. If both the box_import_location parameter and the folder_path attribute value are set, this path is prepended to each folder_path before the documents are imported.
number_of_threads: Number of documents that can be imported simultaneously (optional). In order to optimize performance, the Box importer supports multi-threading, which ensures that a fixed number of documents are imported at the same time. If no value, a non-integer value or a value less than 1 is provided, the importer will use its default value (5). Although there is no upper limit for the number of threads that can be configured, setting a high number can overload the system’s CPU and memory.
check_content_hash: Requests a checksum to be computed by box upon import and compares this checksum against the checksum computed by mc before import. Should the two checksums not match, box returns an appropriate error, causing mc to transition the affected documents to the “Import error” state.
loggingLevel* (mandatory): Sets the verbosity of the log file. Values:
1 - logs only errors during import
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only; use only if instructed by fme product support, since it generates a very large amount of output. Do not use in production)
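The fallback rule for number_of_threads and the way box_import_location is combined with folder_path can be sketched as follows. This is a Python illustration under stated assumptions, not the importer’s actual code: both function names are hypothetical, and the handling of leading and trailing slashes when joining the two paths is an assumption.

```python
DEFAULT_THREADS = 5  # default stated in the parameter description


def resolve_number_of_threads(raw) -> int:
    """Return the configured thread count, or 5 for empty, non-integer or < 1 values."""
    try:
        value = int(str(raw).strip())
    except (TypeError, ValueError):
        return DEFAULT_THREADS
    return value if value >= 1 else DEFAULT_THREADS


def target_folder(box_import_location: str, folder_path: str) -> str:
    """Prefix folder_path with box_import_location when both are set.

    Assumption: exactly one '/' is kept between the two parts.
    """
    if box_import_location and folder_path:
        return box_import_location.rstrip("/") + "/" + folder_path.lstrip("/")
    return box_import_location or folder_path or ""


if __name__ == "__main__":
    print(resolve_number_of_threads("8"))
    print(resolve_number_of_threads("2.5"))
    print(target_folder("/Migration", "/HR/2020"))
```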
On the | Migsets | tab, the user can select the migration sets to be imported with this importer. Depending on the chosen Adapter Type only the migration sets compatible with this type of importer will be displayed and can be selected for import. Also, only migration sets containing at least one object in a validated state will be displayed (since objects which haven’t been validated cannot be imported).
Available migration sets can be moved between the two columns by double-clicking or by using the arrow buttons.
A complete history is available for any box Importer job from the respective items’ History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the box Adapter can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The Documentum Importer takes the objects processed in migration-center and imports them into a Documentum repository. As of migration-center 3.2, the Documentum Scanner and Importer are no longer tied to one another: any other importer can now import data scanned with the Documentum Scanner, and the Documentum Importer can import data scanned by any other scanner. Starting from version 3.2.9, it supports objects derived from dm_sysobject.
Importer is the term used for an output adapter used as the last step of the migration process. It takes care of importing the objects processed in migration-center into the target system (such as a Documentum repository).
An importer works as a job that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created. It is defined by a unique name, a set of configuration parameters and an optional description.
Documentum Importers can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed by migration-center Job Server.
The supported Documentum Content Server versions are 5.3 – 20.2, including service packs. To access a Documentum repository, Documentum Foundation Classes (DFC) 5.3 or newer is required. Any combination of DFC version and Content Server version supported by OpenText Documentum is also supported by migration-center’s Documentum Importer, but it is recommended to use the DFC version matching the version of the Content Server targeted for import. The DFC must be installed and configured on every machine where migration-center Server Components is deployed.
Starting with version 3.2.8, migration-center supports Documentum ECS (Elastic Cloud Storage). However, documents cannot be imported to ECS if a retention policy is configured in the CA store.
Starting from version 3.9 of migration-center additional configurations need to be made for the Documentum adapter to be able to locate Documentum Foundation Classes. This is done by modifying the dfc.conf file, located in the Job Server installation folder.
There are two settings inside the file that by default match the paths of a standard DFC install. One needs to have the path for the config folder of DFC and the other needs the path to the dctm.jar.
See below example:
wrapper.java.classpath.dfcConfig=C:/Documentum/config
wrapper.java.classpath.dfcDctmJar=C:/Program Files/Documentum/dctm.jar
The dfcConfig parameter must point to the configuration folder. The dfcDctmJar parameter must point to the dctm.jar file.
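Before restarting the Job Server, it can be useful to verify that both entries point to existing locations. The following Python sketch is only an illustration: check_dfc_conf is a hypothetical helper, and treating lines starting with # as comments is an assumption about the file format.

```python
import os
from typing import Dict, List


def read_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines (assumption: '#' starts a comment line)."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_dfc_conf(text: str) -> List[str]:
    """Return a list of problems with the two DFC paths in dfc.conf."""
    props = read_properties(text)
    problems: List[str] = []
    config_dir = props.get("wrapper.java.classpath.dfcConfig")
    dctm_jar = props.get("wrapper.java.classpath.dfcDctmJar")
    if not config_dir or not os.path.isdir(config_dir):
        problems.append(f"dfcConfig is not an existing folder: {config_dir!r}")
    if not dctm_jar or not os.path.isfile(dctm_jar):
        problems.append(f"dfcDctmJar is not an existing file: {dctm_jar!r}")
    return problems


if __name__ == "__main__":
    sample = (
        "wrapper.java.classpath.dfcConfig=C:/Documentum/config\n"
        "wrapper.java.classpath.dfcDctmJar=C:/Program Files/Documentum/dctm.jar\n"
    )
    for problem in check_dfc_conf(sample):
        print(problem)
```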
There are two ways to create Documentum folders with the importer:
Documents Migration set: When creating a new migration set, choose the <source type>ToDctm(document) type; this will create a migration set containing documents targeted at Documentum. Use the “autoCreateFolders” setting (checked) from the Documentum Importer configuration to generate the folder structure based on the “dctm_obj_link” values assigned by the transformation rules. No attributes or permissions can be set on the created folders.
Folder Migration set: When creating a new migration set, choose the <source type>ToDctm(folder) type; this will create a migration set containing folders targeted at Documentum. Now only the scanner runs containing folder objects will be displayed on the |Filescan Selection| tab. Note that the number of objects shown for each scanner run now counts folders and not documents, which is why the number on display (folders) can differ from the total number of objects processed by the scan (if it contains other types of objects besides folders). When creating transformation rules for the migration set, keep in mind that folder-only migration sets have folder-specific attributes to work with, in this case attributes specifically targeted at Documentum folder objects. You can set permissions and attributes for the imported folders.
Important aspects to consider when importing a folder migration set:
The attributes “object_name” and “r_folder_path” are key attributes for folders in Documentum. If these attributes are transformed without taking into consideration how these objects build into a tree structure, it may no longer be possible to reconstruct the folder tree. This is not due to migration-center, but rather because of the nature of folders being arranged in a tree structure, which does create dependencies between the individual objects.
In Documentum the “r_folder_path” attribute contains the path(s) to the current folder, as well as the folder itself at the end of the path (e.g. /cabinet/folder_one/folder_two/current_folder), while “object_name” contains only the folder name (e.g. current_folder). To make it easier for the user to change a folder name, migration-center prioritizes the “object_name” attribute over the “r_folder_path” attribute; therefore changing “object_name” from current_folder to folder_three, for example, will propagate this change to the object’s “r_folder_path” attribute and create the folder /cabinet/folder_one/folder_two/folder_three without the user having to change the “r_folder_path” attribute to match.
This only applies to the last part of the path, which represents the current folder, and not to other parts of the path. Those can also be modified using the provided transformation functions, but migration-center does not provide any automation to make sure the information generated in that case is correct.
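The effect of “object_name” taking priority over the last segment of “r_folder_path” can be pictured with the following minimal Python sketch. The function effective_folder_path is a hypothetical illustration of the described behavior and not the importer’s implementation.

```python
def effective_folder_path(r_folder_path: str, object_name: str) -> str:
    """Replace the last path segment of r_folder_path with object_name.

    Only the final segment (the current folder) changes; all parent
    segments are kept exactly as they are.
    """
    parent, _, _current = r_folder_path.rstrip("/").rpartition("/")
    return f"{parent}/{object_name}"


if __name__ == "__main__":
    print(effective_folder_path(
        "/cabinet/folder_one/folder_two/current_folder", "folder_three"))
    # /cabinet/folder_one/folder_two/folder_three
```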
The importer parameter “autoCreateFolders” applies to both document migration sets and folder migration sets.
When importing a folder migration set into a location where a folder structure is already in place, an error will be thrown for each folder object that already exists. This behavior can only be avoided by skipping such folders manually, either by removing them from the migset or by putting them in a state that is not valid for import.
Versions (and branches) are supported by the Documentum Importer, including custom version labels. The structure of the versions tree is generated by the scanners of the systems that support this feature and provide means to extract it. Although the version tree is immutable (i.e. the ordering of the objects relative to their antecedents cannot be changed) the version description (essentially the Documentum “r_version_label”) can be changed using the transformation rules prior to import.
All objects from a version structure must be imported, since each of them references its antecedent, going back to the very first version. Therefore, we advise not to drop versions of an object between the scan and the import processes, as this will generate inconsistencies and errors. If an object is intended to be migrated without versions (i.e. only the current version of the object needs to be migrated), then the affected objects should be scanned without enabling the respective scanner’s versioning option.
Permissions or ACLs can be set on documents and folders by creating transformation rules for the specific attributes. The attributes group_permit, world_permit and owner_permit can be used to set granular permissions. For setting ACLs, the attributes acl_domain and acl_name must be used. The user should set either the *_permit attributes or the acl_* attributes, not both.
If the group_permit, world_permit, owner_permit AND acl_domain, acl_name attributes are configured to be migrated together, the *_permit attributes will override the permissions set by the acl_* attributes. This is due to Documentum’s inner workings and not migration-center. Documentum will not throw an error in such a case, which makes it impossible for migration-center to tell that the acl_* attributes have been overridden; it will therefore not report an error either and will consider all attributes to have been set correctly. It is advised to use either the *_permit attributes OR the acl_* attributes in the same rule set in order to set permissions.
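Since neither Documentum nor migration-center reports this situation, a rule set can be checked up front. The Python sketch below is an illustration only; permission_conflict is a hypothetical helper that receives the names of the target attributes associated in a rule set.

```python
from typing import Iterable

PERMIT_ATTRS = {"group_permit", "world_permit", "owner_permit"}
ACL_ATTRS = {"acl_domain", "acl_name"}


def permission_conflict(associated_attributes: Iterable[str]) -> bool:
    """Return True if both *_permit and acl_* attributes are associated."""
    names = set(associated_attributes)
    return bool(names & PERMIT_ATTRS) and bool(names & ACL_ATTRS)


if __name__ == "__main__":
    print(permission_conflict(["object_name", "acl_domain", "acl_name"]))
    print(permission_conflict(["acl_name", "world_permit"]))
```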
Renditions are supported by the Documentum Importer. The needed information for this process is typically generated during scan by the Documentum Scanner or other scanners supporting systems where rendition information or other similar features can be extracted. Should rendition information not be obtainable using a particular scanner, or if the respective source system doesn’t have renditions at all, it is still possible to add files as renditions to objects during the transformation process. The renditions of an object can be controlled through the migration-center system attribute called “dctm_obj_rendition”. This attribute appears in the “Rules for system attributes” area of the Transformation Rules window. If the object contains at least one rendition in addition to the main content, the source attribute “dctm_obj_rendition” will be available for use in transformation rules. To keep the renditions for an object and migrate them as they are, the system attribute “dctm_obj_rendition” must be set to contain one single transformation function: GetValue(dctm_obj_rendition[all]). This will reference the path of the files where the corresponding renditions have been exported to; the Importer will pick up the content from this location and add them as renditions to the respective documents.
It is possible to add/remove individual renditions to/from objects by using the provided transformation functions. This can prove useful if renditions generated by third party software need to be appended during migration. These renditions can be saved to files in any location which is accessible to the Job Server where the import will be run from. The paths to these files can be specified as values for the “dctm_obj_rendition” attribute. A good practice for being able to assign such third-party renditions to the respective objects is to name the files with the objects’ id and use the format name as the extension. This way the “dctm_obj_rendition” attributes’ values can be built easily to match external rendition files to respective Documentum documents.
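The naming convention suggested above (object id as the file name, format name as the extension) makes it easy to collect the rendition files for one object. The following Python sketch assumes that convention; rendition_paths and the folder layout are hypothetical, and the resulting paths would still have to be supplied as values for “dctm_obj_rendition” through a transformation rule.

```python
from pathlib import Path
from typing import List


def rendition_paths(rendition_dir: str, object_id: str) -> List[str]:
    """List files named <object_id>.<format> in rendition_dir, sorted by path."""
    folder = Path(rendition_dir)
    return sorted(
        str(path) for path in folder.glob(f"{object_id}.*") if path.is_file()
    )


def format_from_path(path: str) -> str:
    """Derive the format name from the file extension (without the dot)."""
    return Path(path).suffix.lstrip(".")


if __name__ == "__main__":
    # The folder and the object id below are placeholders.
    for file_path in rendition_paths(".", "0900000180001234"):
        print(file_path, format_from_path(file_path))
```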
Other properties of renditions are also available to be set by the user through a series of rendition-related system attribute which are available automatically in any Migration set targeting a Documentum system:
Rendition system attributes and their descriptions:
dctm_obj_rendition: Specify a file on disk (full path) to be used as the content of that particular rendition.
dctm_obj_rendition_format: Specify a valid Documentum format to be set for the rendition. Leave empty to let Documentum decide the format automatically based on the extension of the file specified in dctm_obj_rendition (if the extension is not known to Documentum, the rendition’s format will be unknown). Will be ignored if dctm_obj_rendition is not set.
dctm_obj_rendition_modifier: Specify a page modifier to be set for the rendition. Any string can be set (it must conform to Documentum’s page_modifier attribute, as that is where the value ends up). Leave empty if you don’t want to set any page modifiers for renditions. Will be ignored if dctm_obj_rendition is not set.
dctm_obj_rendition_page: The page number where the rendition is linked. A repeating attribute to allow multiple values. Used for multiple page content. Leave empty if you don’t want to set any page for renditions; the importer will then set the default value 0. Will be ignored if dctm_obj_rendition is not set.
dctm_obj_rendition_storage: Specify a valid Documentum filestore where the rendition’s content file will be stored. Leave empty to store the rendition file in Documentum’s default filestore. Will be ignored if dctm_obj_rendition is not set.
All dctm_obj_rendition* system attributes are repeating attributes and as such accept multiple values, allowing multiple renditions to be added to the same object. Normally the number of values for all dctm_obj_rendition* attributes should be the same and equal the maximum number of renditions one would like to set for an object. E.g. if three renditions should be set, then each of the dctm_obj_rendition* attributes should have three values for each of the three renditions. More values will be ignored, missing values will be filled in with whatever Documentum would use as default in place of that missing value.
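How the repeating values line up per rendition can be pictured with the following Python sketch. It rests on an assumption: the number of renditions is taken from dctm_obj_rendition, surplus values of the other attributes are dropped, and missing values are represented as None (meaning “let Documentum use its default”). The function align_renditions is hypothetical and is not the importer’s code.

```python
from typing import Dict, List, Optional

OPTIONAL_ATTRS = (
    "dctm_obj_rendition_format",
    "dctm_obj_rendition_modifier",
    "dctm_obj_rendition_page",
    "dctm_obj_rendition_storage",
)


def align_renditions(attrs: Dict[str, List[str]]) -> List[Dict[str, Optional[str]]]:
    """Group the i-th value of every rendition attribute into one record."""
    records: List[Dict[str, Optional[str]]] = []
    for index, file_path in enumerate(attrs.get("dctm_obj_rendition", [])):
        record: Dict[str, Optional[str]] = {"dctm_obj_rendition": file_path}
        for name in OPTIONAL_ATTRS:
            values = attrs.get(name, [])
            # Missing or empty value: leave it to the Documentum default.
            has_value = index < len(values) and values[index] != ""
            record[name] = values[index] if has_value else None
        records.append(record)
    return records


if __name__ == "__main__":
    sample = {
        "dctm_obj_rendition": ["/renditions/a.pdf", "/renditions/a.txt"],
        "dctm_obj_rendition_format": ["pdf"],
        "dctm_obj_rendition_page": ["0", "0", "0"],
    }
    for record in align_renditions(sample):
        print(record)
```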
Relations are supported by Documentum Importer. They can currently be generated only by the Documentum Scanner, but technically it is possible to customize any scanner to compute the necessary information if similar data or close to Documentum relations can be extracted from their source systems.
Relations cannot be altered using transformation rules; migration-center will manage them automatically if the appropriate options in the scanner and importer have been selected. A connection between the parent object and its relations must always exist and can be viewed in migration-center by right-clicking on the object in any view of a migration set and selecting <View Relations> from the context menu. A grid with all the relations of the selected object will be displayed together with their associated metadata, such as relation name, child object, etc.
In order for the importer to use this feature the option importRelations from its configuration must be checked. Based on the settings of the importer it is possible to import objects and relations together or separately. This feature enables you to attach relations to already imported objects.
Importing a relation will fail if its child object is not present at the import location. This is not to be considered a fatal error. Since the relation is connected to the parent object, the parent object itself will be imported successfully and marked as “Partially Imported”, indicating it has one or more relations which could not be imported (because the respective child object could not be found). After the child object gets imported, the import for the parent object can be repeated. The already imported parent object will not be touched, but its missing relation(s) will now be created and connected to the child object. Once all relations have been successfully created, the parent object’s status will change from “Partially imported” to “Imported”, indicating a fully migrated object, including all its relations. Should some objects remain in a “Partially imported” state because the child objects the relation depends on are not migrated for a reason, then the objects can remain in this state and this state can be considered a final state equivalent to “imported”. The “Partially imported” state does not have any adverse effects on the current or future migrations even if these depend on the respective objects.
migration-center’s Documentum Importer supports relations between folders and/or documents only (i.e. “dm_folder” and “dm_document” objects, as well as their respective subtypes). “dm_subscription” type objects, for example, although they are relations from a technical point of view, will be ignored by the scanner because they are relations involving a “dm_user” object. Custom relation objects (i.e. relation-type objects which are subtypes of “dm_relation”) are also supported, including any custom attributes they may have. The restrictions mentioned above regarding the types of objects connected by a relation also apply to custom relation objects.
The virtual documents data is stored in the migration-center database as a special type of relation. Please view the above chapter for more details about the behavior of this type of data.
In order to rebuild the structure of the virtual documents the importRelations setting must be checked in the Documentum Importer configuration.
This special type of relation is based on the “dmr_containment” information; in order for the data to be compatible with the importer you will need specific information as you see below to be created by the source scanner.
The Snapshot feature of virtual documents is not supported by migration-center.
Audit Trails can be scanned using the Documentum Scanner (see the Documentum Scanner user guide for more information about scanning audit trails), added to a Documentum Audit Trail migration set and imported using the Documentum Importer. This type of migration set is subject to the exact same migration procedure as Documentum documents and folders. Audit trails can be imported together with the document and folder migration sets they are related to, or on their own after the required document and folder objects have been imported.
It is not possible to import any audit trail if the actual object it belongs to hasn’t been migrated.
Importing audit trails is controlled in the Documentum Importer via the importAuditTrails parameter (disabled by default).
A typical workflow for migrating audit trails consists of the following main steps:
Scan folders/documents with the Documentum Scanner with the “exportAuditTrail” flag activated. The scanning process will create three distinct kinds of objects in migration-center: Documentum(folder), Documentum(document) and Documentum(audittrail).
Assign the Documentum(audittrail) objects to a DctmToDctm(audittrail) migset(s) and follow the regular migration-center workflow to promote the objects through transformation rules to a “Validated” state required for the objects to be imported.
Import audit trails objects using Documentum Importer by assigning the respective migset(s) to a Documentum Importer job with the importAuditTrails parameter enabled (checked)
In order to prepare audit trails for import, first create a migset containing audit trail objects (more than one migset containing audit trails can be created, just like for documents or folders). For a migset to work with audit trails, the type of object must be set to “DctmToDctm(audittrail)”. After setting the migset type to “DctmToDctm(audittrail)”, the | Filescan | tab will display only scans which contain Documentum audit trail objects. Any of these can be added/removed to/from the migration set as usual.
Transformation rules allow setting values for the attributes of audit trail entries on import to the target system. Values can be simply carried over unchanged from the source attributes, or they can be transformed using the available transformation functions. All attributes that can be set and associated are defined in the target type “dm_audittrail”.
As with other migration-center objects, audit trail objects have some predefined system attributes:
audited_object_id this should be filled with the corresponding value that comes from source system. No transformation or mapping is necessary because the importer will translate that id into the corresponding id in target repository.
r_object_type must be set to a valid Documentum audit trail object type. This is normally “dm_audittrail” but custom audit trails object types are supported as well.
The following audit trail attributes don’t need to be set through the transformation rules because they are automatically taken from the corresponding audited object in target system: chronicle_id, version_label, object_type.
All other “dm_audittrail” attributes that refer to the audited object (acl_domain, acl_name, audited_obj_vstamp, controlling_app, time_stamp, etc.) can either be set to the values that come from the source repository or not be set at all, in which case the importer will set the corresponding values by taking them from the audited object located in the target repository.
The source attributes “attribute_list” and “attribute_list_old” may appear as multi-value in migration-center. This is because their values may exceed the maximum size of a value allowed in migration-center (4000 bytes), in which case migration-center handles such an attribute as a multi-value attribute internally. The user doesn’t need to take any action to handle such attributes; the importer knows how to process and set these values correctly in Documentum.
Not setting an attribute means not defining a rule for it or not associating an existing rule with any object type definition or attribute.
Working with rules and associations is core product functionality and is described in detail in the Client User Guide.
Assigning Documentum aspects to document and folder objects is supported by Documentum Importer. One or multiple aspects can be assigned to any document or folder object during transformation.
Documentum aspects are handled just like regular Documentum object types in migration-center. This means aspects need to be defined as a migration-center object type first before being available for use during transformation.
During transformation, aspects are assigned to Documentum objects just like actual object types, using the r_object_type system rule; the r_object_type system rule has been modified to accept multiple values for Documentum documents and folders, thus allowing multiple aspects to be specified.
Note: the first value of r_object_type needs to reference an actual object type (i.e. dm_document or a subtype thereof), while any other values of r_object_type will be interpreted as references to aspects which need to be attached to the current object.
As with actual object types, all references to aspects will need to be valid in order for the object to be imported successfully. Trying to set invalid aspects or aspect attributes will cause the object to fail on import.
As with an actual object type, transformation rules used to generate values for attributes defined through aspects will need to be associated with the respective aspect types and attributes – again, just use aspects as if they were actual object types while on the Associations page of the Transformation Rules window.
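The way the values of r_object_type are interpreted can be sketched in Python as follows. The function split_type_and_aspects and the aspect names in the example are hypothetical; the sketch only restates the rule that the first value is the object type and all further values are aspects.

```python
from typing import List, Sequence, Tuple


def split_type_and_aspects(r_object_type: Sequence[str]) -> Tuple[str, List[str]]:
    """Split the multi-value r_object_type into (object type, aspects)."""
    if not r_object_type:
        raise ValueError("r_object_type needs at least one value (the object type)")
    return r_object_type[0], list(r_object_type[1:])


if __name__ == "__main__":
    # "my_aspect_a" and "my_aspect_b" are made-up aspect names.
    print(split_type_and_aspects(["dm_document", "my_aspect_a", "my_aspect_b"]))
```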
Important note on updating previously imported objects with aspects:
New aspects are added during a delta migration, but existing aspects are not removed – the reason is that any aspects not present in the current update being migrated may have been added on purpose after the migration and may not necessarily be meant to be removed.
The Documentum importer has the ability to configure the behavior to attach the imported objects to a specific lifecycle. This ability refers only to attaching a lifecycle to a state that allows attaching to it and does not include the Promote, Demote, Suspend or Resume actions of the lifecycle.
When the adapter is configured to attach the lifecycle the transformation rules should contain the r_policy_id attribute that should have as value a valid lifecycle id from the target repository and also the state attribute r_current_state that should have as value a valid lifecycle state number (a state that allows attaching the lifecycle to it).
To be able to overwrite some attribute changes made by lifecycle when attaching the document, the adapter allows to configure a list of attributes that will be set again with the values set in the transformation rules after the lifecycle is attached, to make sure the values of those attributes are the ones coming from the migration set.
For importing annotations that have been scanned by Documentum Scanner the “importRelations” parameter must be enabled.
The corresponding “dm_note” objects must have been imported prior to or together with the parent documents in order for the DM_ANNOTATE relations to be imported.
The Documentum importer can import the comments on documents and folders (dmc_comment) when “importComments” is enabled (default is unchecked). When enabled, the comments scanned by the Documentum scanner will be created and attached to the imported documents and folders in the target repository.
The pre-condition of a correct comments migration is that the users that created the comment objects and their group names from the source system need to exist in the target repository as well. This is necessary to be able to import the same permission access to the comment owners as in the source system.
The importer is also able to import updates of comments (see below Note on pre-condition) in a delta migration import.
To be able to import updates of a document’s comments, the DFC in use must have a valid Global Registry configured, because the adapter uses the “CommentManager” BOF service to read them. The behavior implemented in the framework inside the “CommentsManager” service requires the DFC session used by the Job Server to be an approved privileged DFC session (this can be configured in DA; see the DA documentation for privileged DFC clients).
If creating a comment fails for any reason, the error will be reported as warning in the importer log. The affected document or folder is set to the status “Partially Imported” in the migset. Since it is not possible to resume importing failed comments, after fixing the root cause of the error, the affected documents need to be reset in the migset, destroyed from repository (only if they are not updates) and imported again.
Objects that have changed in the source system since the last scan are scanned as update objects. Whether an object in a migration set is an update or not can be seen by checking the value of the Is_update column – if it’s 1, the current object is an update to a previously scanned object (the base object). An update object cannot be imported unless its base object has been imported previously.
Depending on the source system the objects come from, the method of obtaining the update information will differ, but the objects’ behavior will stay the same once scanned. See the documentation of the scanners in case you need more information about the supported updates and how they are detected.
migration-center Client, which is used to set up transformation and validation rules, does not connect directly to any source or target system to extract this information. Object type definitions can be exported from the respective systems to a CSV file, which in turn can be imported to migration-center.
One tool to easily accomplish this for Documentum object types is dqMan, which is used in the following steps to illustrate the process. dqMan is an administration tool for EMC Documentum supporting DQL queries, API commands and much more. dqMan is free and can be downloaded at http://www.fme.de. Other comparable administration tools can also be used, provided they can output a compatible CSV file or generate some similar output which can be processed to match the required format using other tools.
Start dqMan and connect to the target Documentum repository. dqMan normally starts with the interface for working with DQL selected by default. Press the [DQL] button in the toolbar if it is not already selected.
In the “DQL Query” box, paste the following command and replace dm_document with the targeted object type:
select distinct t.attr_name, t.attr_type, '0' as min_length, t.attr_length, t.attr_repeating, a.not_null as mandatory from dm_type t, dmi_dd_attr_info a where t.name=a.type_name and t.attr_name=a.attr_name and t.name='dm_document' enable(row_based);
Then press the [Run] button.
Click somewhere in the “Results” box. Use {CTRL+A} to select all. Right-click to open the context menu and choose <Export to> <CSV>.
The extracted object type template is now ready to be imported into migration-center 3.x as described in the chapter Object Types (or object type template definitions) in the migration-center Client User Guide.
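When templates for several object types are needed, the query from the steps above changes only in the type name, so its text can be generated. The following is a minimal Python sketch under that assumption; the helper name build_type_template_query is hypothetical and is not part of dqMan or migration-center. It only produces the text to paste into the “DQL Query” box.

```python
import re

# Query text taken from the step above; {type_name} marks the only part that changes.
_QUERY_TEMPLATE = (
    "select distinct t.attr_name, t.attr_type, '0' as min_length, "
    "t.attr_length, t.attr_repeating, a.not_null as mandatory "
    "from dm_type t, dmi_dd_attr_info a "
    "where t.name=a.type_name and t.attr_name=a.attr_name "
    "and t.name='{type_name}' enable(row_based);"
)


def build_type_template_query(type_name: str) -> str:
    """Return the DQL text for one object type (hypothetical helper)."""
    # Assumption: type names consist of letters, digits and underscores only.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", type_name):
        raise ValueError(f"unexpected object type name: {type_name!r}")
    return _QUERY_TEMPLATE.format(type_name=type_name)


if __name__ == "__main__":
    # Print one query per type; each one is run and exported separately in dqMan.
    for name in ("dm_document", "dm_folder"):
        print(build_type_template_query(name))
```

The name check is a deliberate restriction of this sketch so that a mistyped value cannot break out of the quoted string; it is not a rule stated by the product.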
The Content Validation functionality is based on checksums (an MD5 hash) computed for the document’s contents during scan (by the Scanner itself) and after import (by the Importer). The two checksums are then compared - for any given piece of content its checksums from before and after the migration should be identical.
If the two checksums differ in any way, this is an indication that the content has been corrupted/modified during/after migration. Modifications can happen on purpose or simply by user error and can usually be traced back based on the r_modifier and r_modify_date attributes of the affected object. Actual data corruption rarely happens and is usually due to software and/or hardware errors on the systems involved during content transfer or storage.
Content is always validated after it has been imported to the target repository, which adds another step to the migration process. Accordingly, using this feature may significantly increase import time, because every piece of content for every document has to be read back and its checksum computed in order to compare it against the initial checksum computed during scan.
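The comparison described above can be illustrated with a short Python sketch. It is not the adapters’ implementation; the function names md5_of_file and content_is_intact are made up for this example, and it simply compares the MD5 hash of a file before and after a copy.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large content files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def content_is_intact(source_path, imported_path):
    """Return True if both files have the same MD5 checksum."""
    return md5_of_file(source_path) == md5_of_file(imported_path)
```

Reading back every file is what makes the check expensive: the cost grows with the total size of the imported content, which matches the note on import time above.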
This feature is controlled through the checkContentIntegrity parameter in the Documentum Importer (disabled by default).
This feature works only in tandem with a Scanner that supports it: Documentum Scanner, Filesystem Scanner and SharePoint Scanner.
The mc_content_location attribute can be used to import the content of a document from a place other than the location where the scanner exported the document. It should be set to a valid file path. If it is not set, the content is picked up from the original location. This is useful when the content was moved or copied to a location other than the initial one. If its value is set to “nocontent”, contentless documents are imported. For importing primary content with multiple pages, mc_content_location can be set to multiple content paths, and the importer will create a page for each given content.
The Documentum Content Validation also supports renditions in addition to a document’s main content. Renditions are processed automatically if found and do not require further configuration by the user.
There is one limitation that applies to renditions though: since the Content Validation functionality is based on a checksum computed initially during scan (before the migration), renditions are supported only if scanned from a Documentum repository using the Documentum Scanner. Currently this is the only scanner aware of calculating the required checksums for renditions. Other scanners, even though they may provide metadata pointing to other content files, which may become renditions during import, do not handle the content directly and therefore do not compute the checksum at scan time needed by the Content Validation to be compared against the imported content’s checksum.
When a rendition has multiple pages, each of them is imported to the corresponding rendition page as well.
To create a new Documentum Importer job, specify the respective adapter type in the importer’s Properties window by selecting “Documentum” from the list of available adapters. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case the Documentum adapter’s parameters.
The Properties window of an importer can be accessed by double-clicking an importer in the list, by selecting the Properties button from the toolbar or from the context menu.
Configuration parameters
Values
Name
Enter a unique name for this importer
Mandatory
Adapter type
Select the “Documentum” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server Location is configured, migration-center will prompt the user to define a Job Server Location when saving the importer.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
username*
Username for connecting to the target repository.
A user account with super user privileges must be used to support the full Documentum functionality offered by migration-center.
Mandatory
password*
The user’s password.
Mandatory
repository*
Name of the target repository. The target repository must be accessible from the machine where the selected Job Server is running.
Mandatory
importObjects
Boolean. Selected by default; if NOT checked, documents and folders will NOT be imported. The reason for NOT importing folders and documents is to allow importing only relations between already imported folders/documents.
importRelations
Boolean. Determines whether to import relations between objects. In this case a relation means both “dm_relation” and “dmr_containment” objects.
Hint: Depending on project requirements and possibilities, it can make sense to import folders and documents first and add relations between these objects in a second migration step. For such a two-step approach the importer can be configured accordingly using the importObjects and importRelations parameters.
It is also possible to select both options at the same time and import everything in a single step if the migration data is organized suitably.
It doesn’t make sense, however, to deselect both options.
importComments
Boolean. Flag indicating if the document's comments will be imported.
importAuditTrails
Boolean. Determines whether “Documentum(audittrail)” migsets are imported or not. If the setting is false, any such migsets added to the importer will be ignored (but can be imported later, after enabling this option).
numberOfThreads
The number of threads that will be used for importing objects. The maximum allowed is 20.
importLocation
The path inside the target repository where objects are to be imported. It must be a valid Documentum path. If both the importLocation parameter and the “dctm_obj_link”/“r_folder_path” attribute values are set, this path is prepended to each “dctm_obj_link” (for documents) and “r_folder_path” (for folders) before objects are linked. If these attributes already contain a full path, the importLocation parameter does not need to be filled.
autoCreateFolders
Boolean. This option will be used for letting the importer automatically create any missing folders that are part of “dctm_obj_link” or “r_folder_path”.
Use this option to have migration-center re-create a folder structure at the target repository during import. If the target repository already has a fixed/predefined folder structure and creating new folders is not desired, deselect this option.
checkContentIntegrity
Boolean. If enabled, the importer compares the checksums of imported objects with the checksums computed at scan time. Objects with a different checksum will be treated as errors.
May significantly increase import time due to having to read back every document and compute its checksum after import.
ignoreRenditionErrors
Determines whether errors affecting individual renditions will also trigger an error for the object the rendition belongs to or not
Checked - renditions with errors will be reported as warnings; the objects and other renditions will be imported normally
Unchecked - renditions with errors will cause the entire object, including other renditions, to fail on import. The object will be transitioned to the "Import error" status and will not be imported.
defaultFolderType
String. The Documentum folder type name used when automatically creating the missing object links. If left empty, “dm_folder” will be used as default type.
attachLifecycle
Boolean. This option instructs the importer to apply a lifecycle (use in combination with the transformation attributes r_policy_id and r_current_state).
setAttributesAfterLifecycle
Repeating String. Optional configuration used when attachLifecycle is checked to specify the comma separated list of attributes which will be set again after attaching the document to the lifecycle (to override possible changes).
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
On the | Migsets | tab, the user can select the migration sets to be imported with this importer. Depending on the chosen Adapter Type, only the migration sets compatible with this type of importer will be displayed and can be selected for import. Also, only migration sets containing at least one object in a validated state will be displayed (since objects which haven’t been validated cannot be imported). Available migration sets can be moved between the two columns by double clicking or using the arrow buttons.
A complete history is available for any Documentum Scanner or Importer job from the respective item’s History window. It is accessible through the History button/menu entry on the toolbar/context menu. The list of all runs for the selected job together with additional information, such as the number of processed objects, the starting time, the ending time and the status are displayed in a grid format.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the Documentum Importer can be found in the “logs” folder chosen during the installation of the Server Components on the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs\Dctm-Importer
The amount of information written to the log files depends on the setting specified in the loggingLevel start parameter for the respective job.
Along with its log, each job run of the importer generates a rollback script that can be used to remove all the imported data from the target system. This feature can be very useful in the testing phase to clear the resulting items, or even in production in case the user wants to correct the imported data and redo the import.
The name of the rollback script is built according to the following pattern:
<Importer name>(<run number>)_<script generation date time>_rollback_script.api
Its location is the same as the logs location:
<Server components installation folder>/logs/DCTM-Importer/
The script is composed of a series of Documentum API commands that remove the items created in the import process in the proper order. It should look similar to the following example:
//Links:
//Virtual Document Components:
//Relations:
//Documents:
destroy,c,090000138001ab99,1
destroy,c,090000138001ab96,1
destroy,c,090000138001ab77,1
destroy,c,090000138001ab94,1
//Folders:
destroy,c,0c0000138001ab5b,1
You can run it with any application that supports Documentum API scripting, including the fme dqMan application and the IAPI tool from Documentum.
The rollback script is created at the end of the import process. This means that it will not be created if the job run stops before it gets to this stage. This does not include manual stops done directly from the client.
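Before running a rollback script it can be useful to see how many objects it would remove. The following Python sketch assumes the script has the layout of the example above (comment lines starting with // that name a section, followed by destroy commands); the function name summarize_rollback_script is hypothetical, and lines of any other kind are ignored.

```python
def summarize_rollback_script(text):
    """Count destroy commands per '//Section:' comment line."""
    counts = {}
    section = None  # destroy lines before the first comment are counted under None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("//"):
            # "//Documents:" becomes the section name "Documents"
            section = line[2:].rstrip(":").strip()
            counts.setdefault(section, 0)
        elif line.startswith("destroy,"):
            counts[section] = counts.get(section, 0) + 1
    return counts


if __name__ == "__main__":
    sample = """//Links:
//Virtual Document Components:
//Relations:
//Documents:
destroy,c,090000138001ab99,1
destroy,c,090000138001ab96,1
//Folders:
destroy,c,0c0000138001ab5b,1
"""
    for name, count in summarize_rollback_script(sample).items():
        print(f"{name}: {count}")
```

The sketch only reads the script text; it does not connect to a repository or execute any API command.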
This list displays which Documentum attributes can be associated with a migration-center transformation rule.
dm_document
| Attribute Name | Type | Length | Is Repeating | Association Possible |
| --- | --- | --- | --- | --- |
| a_application_type | String | 32 | No | Yes |
| a_archive | Boolean | 0 | No | No |
| a_category | String | 64 | No | Yes |
| a_compound_architecture | String | 16 | No | No |
| a_content_type | String | 32 | No | Yes |
| a_controlling_app | String | 32 | No | No |
| a_effective_date | DateTime | 0 | Yes | No |
| a_effective_flag | String | 8 | Yes | No |
| a_effective_label | String | 32 | Yes | No |
| a_expiration_date | DateTime | 0 | Yes | No |
| a_extended_properties | String | 32 | Yes | No |
| a_full_text | Boolean | 0 | No | No |
| a_is_hidden | Boolean | 0 | No | Yes |
| a_is_signed | Boolean | 0 | No | No |
| a_is_template | Boolean | 0 | No | Yes |
| a_last_review_date | DateTime | 0 | No | No |
| a_link_resolved | Boolean | 0 | No | No |
| a_publish_formats | String | 32 | Yes | No |
| a_retention_date | DateTime | 0 | No | No |
| a_special_app | String | 32 | No | No |
| a_status | String | 16 | No | Yes |
| a_storage_type | String | 32 | No | No |
| acl_domain | String | 32 | No | Yes |
| acl_name | String | 32 | No | Yes |
| authors | String | 48 | Yes | Yes |
| group_name | String | 32 | No | Yes |
| group_permit | Number | 0 | No | Yes |
| i_antecedent_id | ID | 0 | No | No |
| i_branch_cnt | Number | 0 | No | No |
| i_cabinet_id | ID | 0 | No | No |
| i_chronicle_id | ID | 0 | No | No |
| i_contents_id | ID | 0 | No | No |
| i_direct_dsc | Boolean | 0 | No | No |
| i_folder_id | ID | 0 | Yes | No |
| i_has_folder | Boolean | 0 | No | No |
| i_is_deleted | Boolean | 0 | No | No |
| i_is_reference | Boolean | 0 | No | No |
| i_is_replica | Boolean | 0 | No | No |
| i_latest_flag | Boolean | 0 | No | No |
| i_partition | Number | 0 | No | No |
| i_reference_cnt | Number | 0 | No | No |
| i_retain_until | DateTime | 0 | No | No |
| i_retainer_id | ID | 0 | Yes | No |
| i_vstamp | Number | 0 | No | No |
| keywords | String | 48 | Yes | Yes |
| language_code | String | 5 | No | Yes |
| log_entry | String | 120 | No | No |
| object_name | String | 255 | No | Yes |
| owner_name | String | 32 | No | Yes |
| owner_permit | Number | 0 | No | Yes |
| r_access_date | DateTime | 0 | No | No |
| r_alias_set_id | ID | 0 | No | No |
| r_aspect_name | String | 64 | Yes | No |
| r_assembled_from_id | ID | 0 | No | No |
| r_component_label | String | 32 | Yes | No |
| r_composite_id | ID | 0 | Yes | No |
| r_composite_label | String | 32 | Yes | No |
| r_content_size | Number | 0 | No | No |
| r_creation_date | DateTime | 0 | No | Yes |
| r_creator_name | String | 32 | No | Yes |
| r_current_state | Number | 0 | No | Yes |
| r_frozen_flag | Boolean | 0 | No | No |
| r_frzn_assembly_cnt | Number | 0 | No | No |
| r_full_content_size | Double | 0 | No | No |
| r_has_events | Boolean | 0 | No | No |
| r_has_frzn_assembly | Boolean | 0 | No | No |
| r_immutable_flag | Boolean | 0 | No | No |
| r_is_public | Boolean | 0 | No | Yes |
| r_is_virtual_doc | Number | 0 | No | Yes |
| r_link_cnt | Number | 0 | No | No |
| r_link_high_cnt | Number | 0 | No | No |
| r_lock_date | DateTime | 0 | No | No |
| r_lock_machine | String | 80 | No | No |
| r_lock_owner | String | 32 | No | No |
| r_modifier | String | 32 | No | Yes |
| r_modify_date | DateTime | 0 | No | Yes |
| r_object_type | String | 32 | No | Yes |
| r_order_no | Number | 0 | Yes | No |
| r_page_cnt | Number | 0 | No | No |
| r_policy_id | ID | 0 | No | No |
| r_resume_state | Number | 0 | No | No |
| r_version_label | String | 32 | Yes | Yes |
| resolution_label | String | 32 | No | Yes |
| subject | String | 192 | No | Yes |
| title | String | 400 | No | Yes |
| world_permit | Number | 0 | No | Yes |
Custom object types
| Attribute Name | Type | Length | Is Repeating | Association Possible |
| --- | --- | --- | --- | --- |
| <custom_attribute_number> | Number | - | - | Yes |
| <custom_attribute_string> | String | - | - | Yes |
| <custom_attribute_dateTime> | DateTime | - | - | Yes |
| <custom_attribute_double> | Double | - | - | Yes |
| <custom_attribute_ID> | ID | - | - | No |
| <custom_attribute_boolean> | Boolean | - | - | Yes |
The Alfresco Importer takes care of importing the documents and folders processed in migration-center into a target Alfresco repository.
Importer is the term used for an output adapter which is most likely used at the last step of the migration process. An importer works as a job that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created. An importer is defined by a unique name, a set of configuration parameters and an optional description.
Alfresco importers can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed inside the Alfresco Repository Server.
The Alfresco Importer is not included in the standard installer of the migration-center Server Components; it is delivered packaged as an Alfresco Module Package (amp), because the Alfresco Importer has to be installed within the Alfresco Repository Server. The following versions of Alfresco are supported (on Windows or Linux): 3.4, 4.0, 4.1, 4.2, 5.0.2, 5.1, 5.2, 6.1.1, 6.2.0.
Java 1.8 is required for the installation of Alfresco Importer.
To install the other adapters you need during your migration process, please install the Server Components as described in the Installation Guide. It is recommended to install the Server Components on another machine, but it is also possible to install them on the Alfresco Server. If you use the Alfresco Importer in combination with a scanner running on another machine, the scanner should export the files to a network share that is accessible from the Alfresco Server.
The first step of the installation is to copy the mc-alfresco-adaptor-<Version>.amp file into the “amps-folder” of the Alfresco installation.
The last step is to finish the installation by installing the mc-alfresco-adaptor-<Version>.amp file as described in the Alfresco wiki guide at http://wiki.alfresco.com/wiki/Module_Management_Tool
Before doing this, please back up your original alfresco.war and share.war files to ensure that you can uninstall the migration-center Job Server after a successful migration. This is currently the only way, as long as the Module Management Tool of Alfresco does not support removing a module from an existing WAR file.
The Alfresco Server should be stopped when applying the amp files. Please note that Alfresco provides scripts for installing the amp files, e.g.:
C:\Alfresco34\apply_amps.bat (Windows)
/opt/alfresco/commands/apply_amps.sh (Linux)
Due to a bug in the Alfresco installer under Windows, please check carefully that the amp installation via the apply_amps script works correctly. Under Alfresco 3.4, the file apply_amps.bat must be located in the Alfresco installation folder and not in the bin subfolder.
The Alfresco Importer can be uninstalled with the following steps:
Stop the Alfresco Server.
Restore the original alfresco.war and share.war files that were backed up before the Alfresco Importer installation.
Remove the file mc-alfresco-adaptor-<Version>.amp from the “amps-folder”.
For migration in Alfresco systems, there are some system attributes available for transformation which need further explanation because they have a special syntax or meaning.
name of system attribute
description
examples
folderpath
Each object in Alfresco must be created under a parent object (parent-child association). This parent object must be an Alfresco folder or a subtype of the Alfresco folder object (cm:folder).
For defining the parent object for an object which should be imported into Alfresco, use the system attribute folder.
Currently, only one folder path is supported and the starting point for imports is the company_home-node of the Spacesstore. Only imports below this path are currently possible.
The format of the value must be a valid child node path (without using prefixes for the namespace).
If the import location of the Importer Module (see section 5.2) is set to /Sites/demosite
and the folderpath value is set to
/documentLibrary/Alfresco Demo
The object will be created under the documentLibrary of the demosite
Full path is:
/Sites/demosite/documentLibrary/Alfresco Demo
mc_content_location
This rule offers the possibility to overwrite the location of the object’s content.
Leave empty in case the scanned content_location should be used.
Initial content location:
\\server\data\alfresco\testdocument.txt
Set a new location in order to access it from a linux mount of the above share:
/server/data/alfresco/testdocument.txt
permissions
In Alfresco each object (document or folder) can have several permissions, where each permission consists of an authority (person or group) that has a certain right (a role like Consumer or a single right like Read). The access status (Allow/Deny) defines whether the authority has the permission or not.
User and groups can be configured via Alfresco Explorer or Alfresco Share (Admin functions). Roles can be defined via permissions.xml (look at the Alfresco wiki to find more information).
To configure permissions for an object, use the system attribute permissions.
It can be multivalue and each value must have the following format:
AUTHORITY###PERMISSION###ACCESSSTATUS (ALLOWED|DENIED)
### is used as separator.
You can omit the last part of the value; the access status then defaults to ALLOWED.
You can configure the permissions for all object types (folder and documents).
ROLE_ADMINISTRATOR###READ
Each user in the administrator role will get read permission on the object for which this permission is set.
GROUP_EVERYONE###CONSUMER###ALLOWED
Each user of the system (because each user is in the group everyone) will have the permissions of the consumer role.
types / aspects
Alfresco as an ECM-System provides built-in object types which can be used for documents and folders. It is also possible to define your own custom types.
Alfresco also provides the functionality to attach and detach aspects on nodes. Additionally, Alfresco has built-in aspects like a “cm:titled” or “cm:author” aspect. More information is provided here: http://wiki.alfresco.com/wiki/Aspect
To configure the object type and aspects for the object to be imported, use the system attribute types / aspects.
The attribute is multi-value, so it is possible to define exactly one object type and zero, one or more aspects for one object to be imported.
Please note that any custom object type derived from cm:content or cm:folder can be used.
Important: The first value in this attribute must be the content type.
cm:content
cm:auditable
cm:titled
The imported object will be of type cm:content (the Alfresco standard document type) and will get the aspects cm:auditable and cm:titled.
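The value format of the permissions system attribute described in the table above can be checked before import with a small parser. The following Python sketch is only an illustration: the names Permission and parse_permission are hypothetical, and trimming whitespace and requiring upper-case status values are assumptions of this sketch, not documented behavior of the importer.

```python
from typing import NamedTuple

SEPARATOR = "###"
VALID_STATUSES = ("ALLOWED", "DENIED")


class Permission(NamedTuple):
    authority: str
    permission: str
    access_status: str


def parse_permission(value: str) -> Permission:
    """Split one AUTHORITY###PERMISSION###ACCESSSTATUS value into its parts."""
    parts = [part.strip() for part in value.split(SEPARATOR)]
    if len(parts) == 2:
        # The access status may be omitted and then defaults to ALLOWED.
        parts.append("ALLOWED")
    if len(parts) != 3 or not parts[0] or not parts[1]:
        raise ValueError(f"malformed permission value: {value!r}")
    if parts[2] not in VALID_STATUSES:
        raise ValueError(f"unknown access status in: {value!r}")
    return Permission(*parts)


if __name__ == "__main__":
    print(parse_permission("ROLE_ADMINISTRATOR###READ"))
    print(parse_permission("GROUP_EVERYONE###CONSUMER###ALLOWED"))
```

Because the attribute is multi-value, such a check would be applied to each value separately.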
The importer imports the versions in the same order as they are scanned from the source system. If the “cm:versionable” aspect is not set in the system rule “types / aspects”, the importer assigns it automatically when importing the second version. If “cm:versionLabel” was not set in the transformation rules, the importer will create major versions preserving the order of versions in the source system. By setting “cm:versionLabel” in the “cm:versionable” aspect in transformation rules, the importer will create minor or major versions following these rules: if the version label ends with “.0”, a major version is created, otherwise a minor version is created. The imported object’s version-label is determined automatically by Alfresco.
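The rule for “cm:versionLabel” stated above (a label ending in “.0” yields a major version, anything else a minor version) can be written as a one-line check, which is handy for previewing what a set of transformed labels would produce. This is a sketch of the stated rule only; the function name version_kind is hypothetical.

```python
def version_kind(version_label: str) -> str:
    """Return 'major' if the label ends with '.0', otherwise 'minor'."""
    return "major" if version_label.strip().endswith(".0") else "minor"


if __name__ == "__main__":
    for label in ("1.0", "1.1", "1.10", "2.0"):
        print(label, "->", version_kind(label))
```

Note that a label such as “1.10” ends with “0” but not with “.0”, so under this rule it counts as a minor version.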
To create a new Alfresco Importer job, specify the respective adapter type in the properties window – from the list of available adapters “Alfresco” must be selected. Once the adapter type has been selected, the parameters list will be populated with the parameters specific to the selected adapter type, in this case the Alfresco adapter’s parameters.
The properties window of an importer can be accessed by double-clicking an importer in the list or selecting the Properties button or entry from the toolbar or context menu.
Configuration parameter
description
Name
Enter a unique name for this importer
Mandatory
Adapter type
Select the “Alfresco” adapter from the list of available adapters
Mandatory
Location
Select the Alfresco Importer location where this job should be run. Locations are defined in the Jobserver window. If there is no Job Server Location configured, migration-center will prompt the user to define a Job Server Location when saving the importer.
Mandatory
Note that the Alfresco server must be used as a Job Server Location with default port 9701.
Description
Enter a description for this job (optional)
Configuration parameters
Description
username*
User name for connecting to the target repository.
A user account with admin privileges must be used to support the full Alfresco functionality offered by migration-center.
Mandatory
password*
The user’s password.
Mandatory
importLocation
The path inside the target repository where objects are to be imported. It must be a valid Alfresco path and must start below the “company_home” node inside the spacesStore.
Examples for valid values of importLocation:
“/sites/demosite/documentLibrary/” for import into the document library of a share site with internal name demosite
“/demofolder” for import into a folder named demofolder below the company home folder.
If both the importLocation parameter and the folder attribute value are set, this path is prepended to each folder value (for documents and folders) before the parent-child association for the object is created.
If the attributes mentioned previously already contain a full path, the importLocation parameter does not need to be filled.
Example:
importLocation = “/sites/demosite/documentLibrary/” and a folder value of “/test” for a document result in the complete path “/sites/demosite/documentLibrary/test”
autoCreateFolders
This option will be used for letting the importer automatically create any missing folders (spaces) that are part of the folderpath for any object (folder or documents).
Use this option to have migration-center re-create a folder structure at the target repository during import. If the target repository already has a fixed/predefined folder structure and creating new folders is not desired, deselect this option.
defaultFolderType
Specifies the default folder type when creating folders using the autoCreateFolders option described above. If the parameter is empty, “cm:folder” will be used by default.
Examples for valid values:
“cm:folder” for standard folder type
“fme:folder” for your own folder type
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
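How importLocation and an object’s folder value combine into a complete path, as described for the importLocation parameter above, can be sketched in a few lines of Python. The function name full_import_path is hypothetical, and the way slashes are normalized here is an assumption made for the sketch; the importer’s actual handling of leading and trailing slashes is not specified in this guide.

```python
def full_import_path(import_location: str, folder: str) -> str:
    """Prepend import_location to folder and return one absolute path."""
    # Split both values on "/" and drop empty segments so that missing or
    # doubled slashes at the join do not matter.
    segments = [
        segment
        for part in (import_location, folder)
        for segment in part.split("/")
        if segment
    ]
    return "/" + "/".join(segments)


if __name__ == "__main__":
    print(full_import_path("/sites/demosite/documentLibrary/", "/test"))
    print(full_import_path("/Sites/demosite", "/documentLibrary/Alfresco Demo"))
```

If importLocation is left empty, the sketch returns the folder value unchanged apart from the normalized slashes, which corresponds to the case where the attribute already contains a full path.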
On the | Migsets | tab, the user can select the migration sets to be imported with this importer. Depending on the chosen Adapter Type only the migration sets compatible with this type of importer will be displayed and can be selected for import. Also, only migration sets containing at least one object in a validated state will be displayed (since objects which haven’t been validated cannot be imported).
Available migration sets can be moved between the two columns by double clicking or using the arrow buttons.
When scanning and importing documents, their folder paths are also scanned and the folder structure can be automatically created by migration-center in the target repository. This procedure will not keep any of the metadata attached to the folder objects, such as owners, permissions, specific object types, or any custom attributes. Depending on project requirements, it may be required to do a “folder-only migration”, e.g. for migrating a complete folder structure including custom folder types, permissions and other attributes first, and then populate this folder structure with documents afterwards. In order to execute a folder-only migration the following steps should be performed to configure the migration process accordingly:
Scanner: Folder migration works only in conjunction with scanners that support scanning folders as distinct migration-center objects. For more information about scanning folders, please read the user guide of the specific scanner.
Migration set: When creating a new migration set choose the <SourceType>ToAlfresco(folder) type. Now only the scanner runs containing folder objects will be displayed on the |Filescan Selection| tab. Note that the number of objects contained in the displayed scanner runs now indicates folders and not documents, which is why the number on display (folders) can be different from the total number of objects processed by the scan (if it contains other types of objects besides folders).
Transformation rules: When creating transformation rules for the migration set, keep in mind that folder-only migration sets have folder-specific attributes to work with.
“name” and “folder” are key attributes for folders in Alfresco. If these attributes are transformed without taking into consideration how these objects build into a tree structure, it may no longer be possible to reconstruct the folder tree. This is not due to migration-center, but rather because of the nature of folders being arranged in a tree structure which does create dependencies between the individual objects.
In Alfresco the “folder” attribute contains the path(s) to the current folder, as well as the folder itself at the end of the path (e.g. /folder_null/folder_one/folder_two/current_folder), while “foldername” contains only the folder name (e.g. current_folder).
Importer: When configuring the importer on the |Parameters| tab, the “importObjects” and “autoCreateFolders” options are not used for the folder-only migration. Note that these parameters will still be in effect if other migration set(s) containing <SourceType>ToAlfresco(document) objects will be imported together with the folder migration set.
A folder-only migration as described above is necessary when migrating folder structures with complex folder objects containing custom object types, permissions, attributes, relations, etc.
A complete history is available for the Filesystem Scanner or Alfresco Importer job from the respective items’ History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the Alfresco Importer can be found in subfolder .\logs on the
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The Documentum InPlace importer takes the objects processed in migration-center and imports them back into a Documentum repository. The Documentum InPlace importer works only together with the Documentum scanner.
The Documentum InPlace adapter supports a limited set of Documentum features, such as changing the object types of documents, changing the links of documents and changing attributes. Changing an object’s relations is not supported, and neither is changing virtual documents or audit trails.
Importer is the term used for an output adapter used as the last step of the migration process. It takes care of importing the objects processed in migration-center into the target system (such as a Documentum repository).
An importer works as a job that can be run at any time and can even be executed repeatedly. For every run, a detailed history and log file are created. It is defined by a unique name, a set of configuration parameters and an optional description.
Documentum InPlace can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed by migration-center Job Server.
The supported Documentum Content Server versions are 5.3 – 20.2, including service packs. For accessing a Documentum repository Documentum Foundation Classes 5.3 or newer is required. Any combinations of DFC versions and Content Server versions supported by EMC Documentum are also supported by migration-center’s Documentum InPlace Importer, but it is recommended to use the DFC version matching the version of the Content Server targeted for import. The DFC must be installed and configured on every machine where migration-center Server Components is deployed.
Starting from version 3.9 of migration-center additional configurations need to be made for the Documentum adapter to be able to locate Documentum Foundation Classes. This is done by modifying the dfc.conf file, located in the Job Server installation folder.
There are two settings inside the file that by default match the paths of a standard DFC install. One needs to have the path for the config folder of DFC and the other needs the path to the dctm.jar.
See below example:
wrapper.java.classpath.dfcConfig=C:/Documentum/config
wrapper.java.classpath.dfcDctmJar=C:/Program Files/Documentum/dctm.jar
The dfcConfig parameter must point to the configuration folder. The dfcDctmJar parameter must point to the dctm.jar file!
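The two settings can be sanity-checked before the Job Server is started. The following is a minimal sketch in Python and is not part of migration-center: the function names are hypothetical, and it assumes that dfc.conf uses plain key=value lines as in the example above. Only the two property keys are taken from the example.

```python
from pathlib import Path

DFC_CONFIG_KEY = "wrapper.java.classpath.dfcConfig"
DFC_JAR_KEY = "wrapper.java.classpath.dfcDctmJar"


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def check_dfc_settings(settings, exists=lambda p: Path(p).exists()):
    """Return a list of problems found; an empty list means both settings look plausible."""
    problems = []
    for key in (DFC_CONFIG_KEY, DFC_JAR_KEY):
        if key not in settings:
            problems.append(f"missing setting: {key}")
    jar = settings.get(DFC_JAR_KEY, "")
    # The jar setting must name the dctm.jar file itself, not its folder.
    if jar and jar.replace("\\", "/").rsplit("/", 1)[-1] != "dctm.jar":
        problems.append(f"{DFC_JAR_KEY} must point to the dctm.jar file")
    for key in (DFC_CONFIG_KEY, DFC_JAR_KEY):
        value = settings.get(key)
        if value and not exists(value):
            problems.append(f"path does not exist: {value}")
    return problems
```

Calling check_dfc_settings(parse_properties(open(path).read())) on the machine where the Job Server is installed would list missing keys or paths; the exists argument is only there so that the check can be exercised without a DFC installation.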
For scanning the data from the Documentum repository, Documentum Scanner is used. As Documentum InPlace is not altering the content, only the metadata of the files, the parameter "skipContent" should be checked in the scanner’s configuration. For more details regarding the configuration of the scanner please check Documentum Scanner user guide.
Changing the types of the documents is possible with the Documentum InPlace Importer, by setting the value of the r_object_type attribute to a new type. In order to update the target type of the document, the new target type needs to be created beforehand in migration-center in Manage > Object types section and the attributes need to be associated in the Associations tab of the Transformation rules window.
migration-center Client does not connect directly to any source or target system to extract information about r_object_type. Instead, object type definitions can be exported from Documentum to a CSV file, which in turn can be imported in the migration-center Object Types definition window.
dqMan is recommended for connecting to Documentum to extract the object type definition. dqMan is an administration tool for EMC Documentum supporting DQL queries, API commands and much more. dqMan is free and can be downloaded at http://www.fme.de. Other comparable administration tools can also be used, provided they can output a compatible CSV file or generate some similar output, which can be processed to match the required format using other tools.
Start dqMan and connect to the target Documentum repository. dqMan normally starts with the interface for working with DQL selected by default. Press the [DQL] button in the toolbar if not already selected.
In the “DQL Query” box, paste the following command and replace dm_document with the targeted object type:
select distinct t.attr_name, t.attr_type, '0' as min_length, t.attr_length, t.attr_repeating, a.not_null as mandatory from dm_type t, dmi_dd_attr_info a where t.name=a.type_name and t.attr_name=a.attr_name and t.name='dm_document' enable(row_based);
Press the [Run] button.
Click somewhere in the “Results” box. Use {CTRL+A} to select all. Right-click to open the context menu and choose <Export to> <CSV>.
The extracted object type template is now ready to be imported to migration-center 3.x as described in the chapter Object Types (or object type template definitions) in the migration-center Client User Guide.
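When several object types need to be exported, the query text can be generated instead of edited by hand. The sketch below is an illustration in Python only: the function name and the name check are assumptions, and the query text itself is the one given in the steps above with the type name substituted.

```python
import re

# Query text from the steps above; only the type name is a placeholder.
QUERY_TEMPLATE = (
    "select distinct t.attr_name, t.attr_type, '0' as min_length, "
    "t.attr_length, t.attr_repeating, a.not_null as mandatory "
    "from dm_type t, dmi_dd_attr_info a "
    "where t.name=a.type_name and t.attr_name=a.attr_name "
    "and t.name='{type_name}' enable(row_based);"
)


def build_type_query(type_name):
    """Return the object type export query for one type name."""
    # Accept only letters, digits and underscores so the name cannot
    # break out of the quoted literal in the query.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", type_name):
        raise ValueError(f"not a plausible type name: {type_name!r}")
    return QUERY_TEMPLATE.format(type_name=type_name)


if __name__ == "__main__":
    # "my_custom_doc" is a made-up type name used for illustration.
    for name in ("dm_document", "my_custom_doc"):
        print(build_type_query(name))
```

Each generated query can then be pasted into the “DQL Query” box and exported to CSV as described above.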
The attributes of documents can be updated, with the values provided by the user, through the associations tab.
Removing a value for an attribute is possible by not providing a value for that attribute and associating it.
The attributes r_creation_date and r_creator_name cannot be modified, however r_modify_date and r_modifier can.
In order to set the r_modify_date and r_modifier attributes they need to have the values associated in Associations section. If the attributes r_modify_date and r_modifier are not set, the current date and current user will be set to the documents.
Permissions can be assigned to documents by setting and associating the attributes group_permit, world_permit and owner_permit. For setting ACLs, the attributes acl_domain and acl_name must be used. The user must set either the *_permit attributes or the acl_* attributes. If both the *_permit attributes and the acl_* attributes are configured to be migrated together, the *_permit attributes will override the permissions set by the acl_* attributes. Because Documentum will not throw an error in such a case, migration-center will not be able to tell that the acl_* attributes have been overridden, and as such it will not report an error either, considering that all attributes have been set correctly.
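Since neither Documentum nor migration-center reports this situation, it can be worth checking the list of associated attributes before the import. The following Python sketch shows one way to do so; the function name is hypothetical and the list of associated attribute names is assumed to be collected by hand from the Associations tab.

```python
PERMIT_ATTRS = {"owner_permit", "group_permit", "world_permit"}
ACL_ATTRS = {"acl_domain", "acl_name"}


def permission_conflict(associated_attributes):
    """Return (permit_attrs, acl_attrs) if both families are associated, else None."""
    attrs = set(associated_attributes)
    permits = sorted(attrs & PERMIT_ATTRS)
    acls = sorted(attrs & ACL_ATTRS)
    if permits and acls:
        # Both families are set: the *_permit values would silently win.
        return permits, acls
    return None


if __name__ == "__main__":
    conflict = permission_conflict(["object_name", "acl_name", "world_permit"])
    if conflict:
        print("Both permission families are associated:", conflict)
```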
For changing the links of the documents, the dctm_obj_link rule for system attributes is used. The rule is multi-value, so a document can be linked in multiple locations.
If the dctm_obj_link attribute is set, the old links of the documents will be replaced with the new links.
If the dctm_obj_link attribute is not set, the links will not be updated, and the document will remain linked in the original location.
Like other migration-center paths, the InPlace Importer has some predefined system attributes:
dctm_obj_link: must be filled with the links where the objects should be placed.
r_object_type: must be set to a valid Documentum object type. This is normally “dm_document” but custom object types are supported as well.
The importer allows moving the content from the actual storage to another storage. This can be done by setting the target storage name in the attribute “a_storage_type”. When this attribute is set, the importer will use the MIGRATE_CONTENT server method for moving the content to the specified storage. The importer parameters allow you to specify if the renditions or checked out content will be moved and if the content will be removed from the original storage. For more details regarding these configurations see Documentum InPlace importer parameters.
In case of any error that may occur during the content movement a generic error is logged in the importer run log but another log with the root cause of the error is created on the content server in the location specified in the importer parameter “moveContentLogFile”.
If the storage name specified in the rule “a_storage_type” is the same as the storage where content is already stored, the importer will just mark the object as being successfully processed, so no error or warning is logged in this case.
To create a new Documentum InPlace Importer job, specify the respective adapter type in the importer’s Properties window – from the list of available adapters, “DocumentumInPlace” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case DocumentumInPlace.
The Properties window of an importer can be accessed by double-clicking an importer in the list, by selecting the Properties button from the toolbar or from the context menu.
| Configuration parameters | Values |
| --- | --- |
| Name | Enter a unique name for this importer. Mandatory |
| Adapter type | Select the “DocumentumInPlace” adapter from the list of available adapters. Mandatory |
| Location | Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server has been defined, migration-center will prompt the user to define a Job Server location when saving the importer. Mandatory |
| Description | Enter a description for this job (optional) |
| Configuration parameters | Values |
| --- | --- |
| username* | Username for connecting to the target repository. A user account with super user privileges must be used to support the full Documentum functionality offered by migration-center. Mandatory |
| password* | The user’s password. Mandatory |
| repository* | Name of the target repository. The target repository must be accessible from the machine where the selected Job Server is running. Mandatory |
| moveContentOnly | Flag indicating that only the content should be moved and the metadata will not be updated. This saves some processing when there is no need to update any metadata. |
| autoCreateFolders | Lets the importer automatically create any missing folders that are part of “dctm_obj_link” or “r_folder_path”. Use this option to have migration-center re-create a folder structure at the target repository during import. If the target repository already has a fixed/predefined folder structure and creating new folders is not desired, deselect this option. |
| defaultFolderType | String. The Documentum folder type name used when automatically creating the missing object links. If left empty, “dm_folder” will be used as default type. |
| moveRenditionContent | Flag indicating if renditions will be moved to the new storage. If checked, all renditions and the primary content are moved; otherwise only the primary content is moved. |
| moveCheckoutContent | Flag indicating if checked-out documents will be moved to the new storage. If not checked, the importer will throw an error if a document is checked out. |
| removeOriginalContent | Flag indicating if the content will be removed from the original storage. If checked, the content is removed from the original storage; otherwise the content remains there. |
| moveContentLogFile | The file path on the content server where the log related to move content operations will be saved. The folder must exist on the content server. If it does not exist, the log will not be created at all. A value must be set when the move content feature is activated by setting the attribute “a_storage_type”. |
| numberOfThreads | The number of threads that will be used for importing objects. Maximum allowed is 20. |
| loggingLevel* | Sets the verbosity of the log file. Values: 1 - logs only errors during import; 2 - is the default value, reporting all warnings and errors; 3 - logs all successfully performed operations in addition to any warnings or errors; 4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production). Mandatory |
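The constraints listed in the table can be checked in one place before a job is started. The Python sketch below is an illustration only and not a migration-center API: the dictionary layout and the function name are assumptions, and the lower bound of 1 for numberOfThreads is assumed as well (the table only states the maximum of 20).

```python
MANDATORY = ("username", "password", "repository", "loggingLevel")


def validate_importer_parameters(params, uses_storage_rule=False):
    """Return a list of problems for a dict of importer parameters.

    uses_storage_rule: True if the migration set associates a_storage_type,
    i.e. the move content feature is active.
    """
    problems = []
    for name in MANDATORY:
        if params.get(name) in (None, ""):
            problems.append(f"{name} is mandatory")
    level = params.get("loggingLevel")
    if level not in (None, "") and level not in (1, 2, 3, 4):
        problems.append("loggingLevel must be 1, 2, 3 or 4")
    threads = params.get("numberOfThreads")
    # Lower bound of 1 is an assumption; the documented limit is the maximum of 20.
    if threads is not None and not (1 <= threads <= 20):
        problems.append("numberOfThreads must be between 1 and 20")
    if uses_storage_rule and not params.get("moveContentLogFile"):
        problems.append("moveContentLogFile must be set when a_storage_type is used")
    return problems
```

An empty result means the parameter set satisfies the rules stated in the table; it does not verify that the repository is reachable or that the credentials are valid.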
On the | Migsets | tab, the user can select the migration sets to be imported with this importer. Depending on the chosen adapter type, only the migration sets compatible with this type of importer will be displayed and can be selected for import. In addition, only migration sets containing at least one object in a validated state will be displayed (since objects that have not been validated cannot be imported). Available migration sets can be moved between the two columns by double-clicking or by using the arrow buttons.
A complete history is available for any Documentum Scanner or Importer job from the respective item’s History window. It is accessible through the History button/menu entry on the toolbar/context menu. The list of all runs for the selected job together with additional information, such as the number of processed objects, the starting time, the ending time and the status are displayed in a grid format.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the Documentum Scanner can be found in the “logs” folder chosen during the installation of the Server Components, on the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs\Dctm-Importer
The amount of information written to the log files depends on the setting specified in the loggingLevel start parameter for the respective job.
This list displays which Documentum attributes can be associated with a migration-center transformation rule.
dm_document

| Attribute Name | Type | Length | Is Repeating | Association Possible |
| --- | --- | --- | --- | --- |
| a_application_type | String | 32 | No | Yes |
| a_archive | Boolean | 0 | No | No |
| a_category | String | 64 | No | Yes |
| a_compound_architecture | String | 16 | No | No |
| a_content_type | String | 32 | No | Yes |
| a_controlling_app | String | 32 | No | No |
| a_effective_date | DateTime | 0 | Yes | No |
| a_effective_flag | String | 8 | Yes | No |
| a_effective_label | String | 32 | Yes | No |
| a_expiration_date | DateTime | 0 | Yes | No |
| a_extended_properties | String | 32 | Yes | No |
| a_full_text | Boolean | 0 | No | No |
| a_is_hidden | Boolean | 0 | No | Yes |
| a_is_signed | Boolean | 0 | No | No |
| a_is_template | Boolean | 0 | No | Yes |
| a_last_review_date | DateTime | 0 | No | No |
| a_link_resolved | Boolean | 0 | No | No |
| a_publish_formats | String | 32 | Yes | No |
| a_retention_date | DateTime | 0 | No | No |
| a_special_app | String | 32 | No | No |
| a_status | String | 16 | No | Yes |
| a_storage_type | String | 32 | No | No |
| acl_domain | String | 32 | No | Yes |
| acl_name | String | 32 | No | Yes |
| authors | String | 48 | Yes | Yes |
| group_name | String | 32 | No | Yes |
| group_permit | Number | 0 | No | Yes |
| i_antecedent_id | ID | 0 | No | No |
| i_branch_cnt | Number | 0 | No | No |
| i_cabinet_id | ID | 0 | No | No |
| i_chronicle_id | ID | 0 | No | No |
| i_contents_id | ID | 0 | No | No |
| i_direct_dsc | Boolean | 0 | No | No |
| i_folder_id | ID | 0 | Yes | No |
| i_has_folder | Boolean | 0 | No | No |
| i_is_deleted | Boolean | 0 | No | No |
| i_is_reference | Boolean | 0 | No | No |
| i_is_replica | Boolean | 0 | No | No |
| i_latest_flag | Boolean | 0 | No | No |
| i_partition | Number | 0 | No | No |
| i_reference_cnt | Number | 0 | No | No |
| i_retain_until | DateTime | 0 | No | No |
| i_retainer_id | ID | 0 | Yes | No |
| i_vstamp | Number | 0 | No | No |
| keywords | String | 48 | Yes | Yes |
| language_code | String | 5 | No | Yes |
| log_entry | String | 120 | No | No |
| object_name | String | 255 | No | Yes |
| owner_name | String | 32 | No | Yes |
| owner_permit | Number | 0 | No | Yes |
| r_access_date | DateTime | 0 | No | No |
| r_alias_set_id | ID | 0 | No | No |
| r_aspect_name | String | 64 | Yes | No |
| r_assembled_from_id | ID | 0 | No | No |
| r_component_label | String | 32 | Yes | No |
| r_composite_id | ID | 0 | Yes | No |
| r_composite_label | String | 32 | Yes | No |
| r_content_size | Number | 0 | No | No |
| r_creation_date | DateTime | 0 | No | Yes |
| r_creator_name | String | 32 | No | Yes |
| r_current_state | Number | 0 | No | Yes |
| r_frozen_flag | Boolean | 0 | No | No |
| r_frzn_assembly_cnt | Number | 0 | No | No |
| r_full_content_size | Double | 0 | No | No |
| r_has_events | Boolean | 0 | No | No |
| r_has_frzn_assembly | Boolean | 0 | No | No |
| r_immutable_flag | Boolean | 0 | No | No |
| r_is_public | Boolean | 0 | No | Yes |
| r_is_virtual_doc | Number | 0 | No | Yes |
| r_link_cnt | Number | 0 | No | No |
| r_link_high_cnt | Number | 0 | No | No |
| r_lock_date | DateTime | 0 | No | No |
| r_lock_machine | String | 80 | No | No |
| r_lock_owner | String | 32 | No | No |
| r_modifier | String | 32 | No | Yes |
| r_modify_date | DateTime | 0 | No | Yes |
| r_object_type | String | 32 | No | Yes |
| r_order_no | Number | 0 | Yes | No |
| r_page_cnt | Number | 0 | No | No |
| r_policy_id | ID | 0 | No | No |
| r_resume_state | Number | 0 | No | No |
| r_version_label | String | 32 | Yes | Yes |
| resolution_label | String | 32 | No | Yes |
| subject | String | 192 | No | Yes |
| title | String | 400 | No | Yes |
| world_permit | Number | 0 | No | Yes |
Custom object types

| Attribute Name | Type | Length | Is Repeating | Association Possible |
| --- | --- | --- | --- | --- |
| <custom_attribute_number> | Number | - | - | Yes |
| <custom_attribute_string> | String | - | - | Yes |
| <custom_attribute_dateTime> | DateTime | - | - | Yes |
| <custom_attribute_double> | Double | - | - | Yes |
| <custom_attribute_ID> | ID | - | - | No |
| <custom_attribute_boolean> | Boolean | - | - | Yes |
The D2 Importer currently supports the following D2 versions: 4.7, 16.5, 16.6, 20.2. For using the importer with a D2 version older than 20.2, some additional configurations are required (see chapter D2 Configuration for older versions).
Since D2 itself is still based on the Documentum platform, D2’s requirements regarding Documentum products also apply to the migration-center D2 Importer, namely Documentum Content Server 7.1 or higher as well as DFC version 7.1 or higher. When migrating between different Documentum and Documentum-based systems, it is recommended to use the DFC version matching the Content Server being accessed. Should different versions of Documentum Content Server be involved in the same migration project at the same time, it is recommended to use the DFC version matching the latest Content Server in use.
Importer is the term used for an output adapter which is most likely used at the last step of the migration process. An Importer (such as the D2 Importer) takes care of importing the objects processed in migration-center into the target system (such as a D2 repository).
An importer works as a job that can be run at any time, and can even be executed repeatedly. For every run a detailed history and log file are created.
An importer is defined by a unique name, a set of configuration parameters and an optional description. D2 Importers can be created, configured, started and monitored through migration-center Client, while the corresponding processes are executed in the background by migration-center Job Server.
Starting from version 3.9 of migration-center additional configurations need to be made for the Documentum adapter to be able to locate Documentum Foundation Classes. This is done by modifying the dfc.conf file, located in the Job Server installation folder.
There are two settings inside the file that by default match the paths of a standard DFC install. One needs to have the path for the config folder of DFC and the other needs the path to the dctm.jar.
See below example:
wrapper.java.classpath.dfcConfig=C:/Documentum/config
wrapper.java.classpath.dfcDctmJar=C:/Program Files/Documentum/dctm.jar
The dfcConfig parameter must point to the configuration folder. The dfcDctmJar parameter must point to the dctm.jar file!
For using the D2 Importer with a D2 Content Server older than 20.2 some additional steps must be performed:
Ensure the Job Server is stopped
Go to the ...\lib\mc-d2-importer folder
Remove all existing libraries (either by moving them outside the Job Server folder, or by deleting them)
Unzip all files from ...\D2-4.7\d2-4.7-importer into the ...\lib\mc-d2-importer folder
Start the Job Server service again
If your D2 environment has a lockbox configured additional steps need to be performed for the D2 Importer to work properly. The D2 lockbox files must be configured on the machine where the Job Server will perform the import job.
Before running the D2 installer please make sure that Microsoft Visual C++ 2010 Service Pack 1 Redistributable Package MFC Security Update - 32 bit is installed.
Run the D2 installer according to the D2 Installation Guide, using the same java version as on the D2 environment:
select Configure Lockbox
select Lockbox for – Other Application Server
set the install location to …\lib\mc-d2-importer\Lockbox of the Job Server folder location.
set the correct password and passphrase as it was configured on the D2 Server
restart the Job Server
Note that if a different location is selected for the Lockbox installation, the wrapper.conf file must be changed to reflect the new location:
wrapper.java.classpath.14=./lib/mc-d2-importer/LockBox/lib/*.jar
wrapper.java.classpath.15=./lib/mc-d2-importer/LockBox
wrapper.java.additional.3=-Dclb.library.path=./lib/mc-d2-importer/LockBox/lib/native/win_vc100_ia32
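If the Lockbox is installed elsewhere, the three entries above have to be rewritten consistently. The sketch below generates them for a given location; it is written in Python for illustration, the function name is hypothetical, and the property indexes (14, 15 and 3) as well as the native library subfolder are simply copied from the example above and may differ in an actual wrapper.conf.

```python
def lockbox_wrapper_lines(lockbox_dir):
    """Return the three wrapper.conf entries for a Lockbox installed in lockbox_dir."""
    # Use forward slashes, as in the example entries, and drop a trailing slash.
    base = lockbox_dir.replace("\\", "/").rstrip("/")
    return [
        f"wrapper.java.classpath.14={base}/lib/*.jar",
        f"wrapper.java.classpath.15={base}",
        f"wrapper.java.additional.3=-Dclb.library.path={base}/lib/native/win_vc100_ia32",
    ]


if __name__ == "__main__":
    # "D:/d2/LockBox" is a made-up location used for illustration.
    print("\n".join(lockbox_wrapper_lines("D:/d2/LockBox")))
```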
Many of the new features available through D2 have been implemented in the migration-center D2 adapter in addition to the basic functionalities of importing documents and setting attributes. Features such as Autolinking, Autonaming and Security are all available. In addition, more features such as validating attribute values obtained from transformation rules using D2 dictionaries and taxonomies, using D2 templates for setting predefined default values, or applying D2 rules based on a document’s owner rather than the user the import process is configured to run with are provided.
The D2 autonaming feature is fully supported by the D2 Importer. The feature can be toggled on or off through the applyD2Autonaming parameter present in a D2 Importer’s properties.
If the applyD2Autonaming parameter is checked, D2’s autonaming rules will take effect as documents are imported; this also means any value set in Transformation Rules for the object_name attribute will be ignored and overridden by D2’s autonaming rules.
If the applyD2Autonaming parameter is unchecked (which is the default state the parameter is set to), D2’s autonaming rules will not be used. Instead the value set for object_name will be set for the imported documents, as is the case currently when using the standard Documentum Importer
The D2 autolinking feature is fully supported by the D2 Importer. The feature can be toggled on or off through the applyD2Autolinking parameter present in a D2 Importer’s properties.
If the applyD2Autolinking parameter is checked, D2’s autolinking rules will take effect as documents are imported, with the documents being arranged in the folder structure imposed by the rules applicable to the imported document(s).
If the applyD2Autolinking parameter is unchecked (which is the default state the parameter is set to), D2’s autolinking rules will not be used. Instead the dctm_obj_link system rule will be used as the target path for the imported documents, as is the case currently when using the standard Documentum Importer
Even when using D2 autolinking in the importer, a valid path must be provided as a value for the dctm_obj_link system rule. This is because of the way migration-center works – documents will be linked to the path specified by the dctm_obj_link first and re-linked to the proper (autolink) paths later if the applyD2Autolinking parameter is enabled.
Migrating folder objects using the D2 Importer is currently not supported. The only possible ways of creating folders in D2 are using the Autolinking feature or checking the AutoCreateFolders parameter in the D2 Importer properties.
The D2 Security feature is fully supported by the D2 Importer. The feature can be toggled on or off through the applyD2Security parameter present in a D2 Importer’s properties.
If the applyD2Security parameter is checked, D2’s Security rules will take effect as documents are imported; this also means any value set in Transformation Rules for the acl_name attribute will be ignored and overridden by D2’s Security rules which will set the appropriate ACLs.
If the applyD2Security parameter is unchecked (which is the default state the parameter is set to), D2’s Security rules will not be used. Instead the value set for acl_name will be used to set the ACL for the imported documents, as is the case currently when using the standard Documentum Importer. If neither the acl_name rule nor the applyD2Security parameter has been set, then the documents will fall back to the Content Server’s configuration for setting a default ACL; depending on the Content Server configuration this may or may not be appropriate, therefore please set permissions explicitly by either using the acl_name rule or enabling the applyD2Security parameter.
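The order of precedence described for the Security feature can be summarized in a few lines. The Python sketch below only restates that decision order; the function name and the returned labels are hypothetical and do not correspond to anything in migration-center.

```python
def acl_source(apply_d2_security, acl_name):
    """Return which mechanism determines the ACL of an imported document."""
    if apply_d2_security:
        # D2 Security rules win; any acl_name value is ignored.
        return "d2_security_rules"
    if acl_name:
        return "acl_name_rule"
    # Neither is set: the Content Server's default ACL configuration applies.
    return "content_server_default"


if __name__ == "__main__":
    print(acl_source(True, "my_acl"))
    print(acl_source(False, "my_acl"))
    print(acl_source(False, None))
```

The last case is the one to avoid, since the resulting permissions then depend entirely on the Content Server configuration.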
The D2 Importer can set D2 Lifecycle states and run the state actions on documents. Lifecycles can be applied on documents by checking the applyD2Lifecycle property in the D2 Importer configuration. The lifecycle to be used can either be specified in the d2_lifecycle_name system rule, or not specified at all. Specifying it should increase performance as the importer does not need to search for it.
If no state was specified in the a_status system rule, the importer will apply the initial state as defined in D2-Config. The importer is able to set states that are only on the first 2 or 3 levels in the lifecycle. The third level is available only if the first state automatically transitions to the second one (see (init) -> Draft states). Note that state entry conditions are not checked upon setting the state.
When applying a lifecycle some of the attributes set in migration-center may get overwritten by the state actions. In order to work around this issue, the reapplyAttrsAfterD2Lifecycle property can be set. Here you can specify which attribute values to be reapplied after setting the lifecycle state.
The throwD2LifecycleErrors parameter can be used to specify whether the objects should be set to Error or Partially Imported when an error occurs during the application of lifecycle actions.
The D2 Importer can use D2 templates to set default values for attributes. The template to be used can be specified through the d2_default_template system rule. If a valid template has been set through the d2_default_template system rule, all attributes configured with a default value in the respective template will get the default value set during import.
It is possible to override the default value even if the d2_default_template system rule has been set to a valid D2 template by creating a transformation rule for the respective attributes. Thus if both a transformation rule and a default value via D2 template apply to a given attribute, the value resulting from the transformation rule will override the default value resulting from the template.
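The override behavior amounts to a simple merge in which transformation rule results take precedence over template defaults. A minimal sketch in Python, with hypothetical names and made-up attribute values, assuming both sides are available as dictionaries keyed by attribute name (how a rule that yields an empty value is treated is not stated here and is not modeled):

```python
def effective_attributes(template_defaults, rule_values):
    """Combine D2 template defaults with transformation rule results.

    Values from transformation rules override template defaults for the
    same attribute; attributes without a rule keep the template default.
    """
    merged = dict(template_defaults)
    merged.update(rule_values)
    return merged


if __name__ == "__main__":
    defaults = {"language_code": "en", "a_status": "Draft"}
    rules = {"language_code": "de", "title": "Quarterly report"}
    print(effective_attributes(defaults, rules))
```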
Certain attributes associated with D2 dictionaries or taxonomies can be validated to make sure the value set is among the permissible values defined in the associated D2 dictionary/taxonomy.
D2 creates such an association through a property page. Migration-center can be configured to read a D2 property page, identify attributes associated with dictionaries and taxonomies and validate the values resulting from the transformation rules of the respective attributes against the values defined in the associated dictionary/taxonomy.
One property page can be defined per migration set through the d2_property system rule; the resulting value must be the name of a valid D2 property page.
Failure to validate an attribute value in accordance with its associated dictionary/taxonomy will cause the entire document to fail during import and transition to the Import error status.
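Because a single invalid value fails the whole document, it can be useful to run the same kind of check on the transformed values before starting the import. The sketch below is a simplified illustration in Python, not the importer’s implementation: the function name is hypothetical, and the permissible values per attribute are assumed to have been exported from the D2 dictionary or taxonomy by hand.

```python
def invalid_values(attributes, allowed_by_attribute):
    """Return {attribute: [values not permitted]} for dictionary-bound attributes.

    attributes: attribute name -> single value or list of values (repeating).
    allowed_by_attribute: attribute name -> set of permissible values.
    """
    failures = {}
    for name, allowed in allowed_by_attribute.items():
        values = attributes.get(name, [])
        if not isinstance(values, (list, tuple)):
            values = [values]
        bad = [value for value in values if value not in allowed]
        if bad:
            failures[name] = bad
    return failures


if __name__ == "__main__":
    # Attribute names and values below are made up for illustration.
    allowed = {"department": {"HR", "Finance"}, "keywords": {"contract", "invoice"}}
    doc = {"department": "Legal", "keywords": ["contract", "memo"], "title": "NDA"}
    print(invalid_values(doc, allowed))
```

An empty result means every checked attribute only carries permissible values; attributes that are not bound to a dictionary or taxonomy are ignored.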
The D2 importer allows D2 rules to be applied to objects during import based on the object’s owner, rather than the user running the import process. This makes sense, as the import process would typically be running using the same user all the time, while real-life usage scenarios with D2 would involve different users connecting using their own accounts, based on which D2 rules would then apply. Migration-center can simulate this behavior by passing on the document owner as the user based on which rules should be applied instead of the currently connected user the import process is running with.
The feature can be enabled through the ApplyD2RulesbyOwner parameter in the D2 Importer. In order for the feature to work, a rule for the owner_name attribute must also be defined and reference a valid user. Should the parameter be enabled without an owner having been set, it will have no effect.
This feature will not work if the D2 variable $USER is used to reference users in D2 rules. D2 will always resolve this variable to the currently connected user’s name, i.e. the user running the import process, and migration-center cannot override this behavior.
In addition to the dedicated D2 features presented in the previous chapter, the D2 Importer also supports basic Documentum features used by D2, such as versions, renditions, virtual documents, audit trails, and so on. These features generally work the same as they do in the migration-center Documentum Importer and should be instantly familiar to users of the Documentum adapter.
Versions (including branches) are supported by the D2 Importer, including custom version labels. Version information is generated during scan, be it a Documentum Scanner or other scanners supporting systems where version information can be extracted from. This version information (essentially the Documentum r_version_label) can be changed using the transformation rules prior to import. The version structure (i.e. the ordering of the objects relative to their antecedents) cannot be changed using migration-center.
If objects have been scanned with version information, all versions must be imported as well, since each object references its antecedent, going back to the very first version. Therefore it is advised not to drop the versions of an object between the scan and the import processes since this will most likely generate inconsistencies and errors. If an object is intended to be migrated without versions (i.e. only the current version of the object needs to be migrated), then the affected objects should be scanned without enabling the respective scanner’s versioning option.
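Whether a set of objects still contains every referenced antecedent can be checked with a few lines of code. The following Python sketch assumes, purely for illustration, that the objects selected for import are available as a mapping from object id to antecedent id; the function name and the ids used are made up.

```python
def missing_antecedents(antecedent_by_id):
    """Return the ids that are referenced as antecedents but are not in the set.

    antecedent_by_id: object id -> antecedent id (None for a first version).
    """
    present = set(antecedent_by_id)
    return sorted(
        {
            antecedent
            for antecedent in antecedent_by_id.values()
            if antecedent is not None and antecedent not in present
        }
    )


if __name__ == "__main__":
    complete = {"v1": None, "v2": "v1", "v3": "v2"}
    print(missing_antecedents(complete))
    # Dropping the middle version breaks the chain for "v3".
    broken = {"v1": None, "v3": "v2"}
    print(missing_antecedents(broken))
```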
Renditions are supported by the D2 Importer. Rendition information is typically generated during scan, be it by the Documentum Scanner or other scanners supporting systems where rendition information or other similar features can be extracted. Should rendition information not be obtainable using a particular scanner, or if the respective source system doesn’t have renditions at all, it is still possible to add files as renditions to objects during the transformation process. The renditions of an object can be controlled through the migration-center system attribute called dctm_obj_rendition. The dctm_obj_rendition attribute appears in the “System attributes” area of the Transformation Rules window. If the object contains at least one rendition in addition to the main content, the source attribute dctm_obj_rendition will be available for use in transformation rules. To keep the renditions for an object and migrate them as they are, the system attribute dctm_obj_rendition must be set to contain one single transformation function: GetValue(dctm_obj_rendition[all]). This will reference the path of the files where the corresponding renditions have been exported to; the Importer will pick up the content from this location and add them as renditions to the respective documents.
It is possible to add/remove individual renditions to/from objects by using the provided transformation functions. This can prove useful if renditions generated by third party software need to be appended during migration. These renditions can be saved to files in any location which is accessible to the Job Server where the import will be run from. The paths to these files can be specified as values for the dctm_obj_rendition attribute. A good practice for being able to assign such third party renditions to the respective objects is to name the files with the objects’ id and use the format name as the extension. This way the dctm_obj_rendition attributes’ values can be built easily to match external rendition files to respective D2 documents.
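The naming convention suggested above makes it straightforward to match rendition files to objects. The Python sketch below groups the files of one folder by object id and derives the format from the extension; the function name and the folder layout are assumptions, and the resulting paths would still have to be supplied to dctm_obj_rendition through transformation rules.

```python
from pathlib import Path


def renditions_by_object(folder):
    """Group files named <object_id>.<format> by object id.

    Returns {object_id: [(full_path, format_name), ...]}.
    """
    result = {}
    for path in sorted(Path(folder).iterdir()):
        # Skip sub-folders and files without an extension.
        if path.is_file() and path.suffix:
            result.setdefault(path.stem, []).append((str(path), path.suffix[1:]))
    return result
```

For a folder containing, for example, two files for the same object id with the extensions pdf and txt, the function returns one entry for that id with both paths, which can then be used as the values of dctm_obj_rendition for that object.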
Other properties of renditions can also be set by the user through a series of rendition-related system attributes, which are available automatically in any migset targeting a Documentum, D2 or FirstDoc system:
| Rendition system attribute | Description |
| --- | --- |
| dctm_obj_rendition | Specify a file on disk (full path) to be used as the content of that particular rendition |
| dctm_obj_rendition_format | Specify a valid Documentum format to be set for the rendition. Leave empty to let Documentum decide the format automatically based on the extension of the file specified in dctm_obj_rendition (if the extension is not known to Documentum the rendition’s format will be unknown). Will be ignored if dctm_obj_rendition is not set |
| dctm_obj_rendition_modifier | Specify a page modifier to be set for the rendition. Any string can be set (must conform to Documentum’s page_modifier attribute, as that’s where the value would end up). Leave empty if you don’t want to set any page modifiers for renditions. Will be ignored if dctm_obj_rendition is not set |
| dctm_obj_rendition_page | The page number where the rendition is linked. A repeating attribute to allow multiple values. Used for multiple page content. Leave empty if you don’t want to set any page for renditions. Will be ignored if dctm_obj_rendition is not set |
| dctm_obj_rendition_storage | Specify a valid Documentum filestore where the rendition’s content file will be stored. Leave empty to store the rendition file in Documentum’s default filestore. Will be ignored if dctm_obj_rendition is not set |
All dctm_obj_rendition* system attributes are repeating attributes and as such accept multiple values, allowing multiple renditions to be added to the same object. Normally the number of values for all dctm_obj_rendition* attributes should be the same and equal the maximum number of renditions one would like to set for an object. E.g. if three renditions should be set, then each of the dctm_obj_rendition* attributes should have three values for each of the three renditions. More values will be ignored, missing values will be filled in with whatever Documentum would use as default in place of that missing value.
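The pairing of values across the repeating attributes can be pictured with the following Python sketch. It only models the rule described above and is not importer code. It assumes that the number of dctm_obj_rendition values determines the number of renditions; the function name is hypothetical, and None stands in for "let Documentum apply its default".

```python
def align_rendition_values(renditions, formats, modifiers):
    """Pair each rendition path with its format and page modifier by position.

    Extra format/modifier values are dropped; missing ones become None,
    standing in for "let Documentum apply its default" (an assumption made
    for illustration only).
    """
    rows = []
    for index, path in enumerate(renditions):
        fmt = formats[index] if index < len(formats) else None
        modifier = modifiers[index] if index < len(modifiers) else None
        rows.append((path, fmt, modifier))
    return rows
```

With three rendition paths, two formats and four modifiers, the sketch yields three rows: the third row has no format, and the fourth modifier is dropped.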
Relations are supported by the D2 Importer. Relations for import to D2 can currently be generated only by the Documentum Scanner, but it is technically possible to customize any scanner to generate relation information that can be processed by the D2 Importer if information similar or close to Documentum relations needs to be extracted from non-Documentum systems. The option named exportRelations in the Documentum Scanner’s configuration determines if relations are scanned and exported to migration-center.
Relations cannot be altered using transformation rules; migration-center will manage relations automatically if the appropriate options in the scanner and importer have been selected. Relations will always be connected to the parent object of the relation and can be viewed in migration-center by right-clicking on an object in any view of a migration set and selecting <View Relations> from the context menu. All relations with the selected object as the parent object are listed with their associated metadata, such as relation name, child object, etc.
An option corresponding to the scanner’s must be selected in the importer as well to restore the relations between objects on import; the option is labeled importRelations. The importer can be configured to import objects and relations together or independently of one another. This can be used to migrate only the objects first and attach relations to the imported objects later.
Relations will always be connected to the parent object of the relation, which is why importing a relation will always be attempted when importing its parent object, provided the importRelations option mentioned above is selected as well. Importing a relation will fail if its child object is not present at the import location. This is not to be considered a fatal error. Since the relation is connected to the parent object, the parent object itself will be imported successfully and marked as “Partially imported”, indicating it has one or more relations which could not be imported (because the respective child object could not be found). After the child object gets imported, the import for the parent object can be repeated. The already imported parent object will not be touched, but its missing relation(s) will now be created and connected to the child object. Once all relations have been successfully created, the parent object’s status will change from “Partially imported” to “Imported”, indicating a fully migrated object, including all its relations. Should some objects remain in a “Partially imported” state because the child objects their relations depend on are not migrated for whatever reason, the objects can remain in this state, which can then be considered a final state equivalent to “Imported”. The “Partially imported” state does not have any adverse effects on the current or future migrations, even if these depend on the respective objects.
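The status transitions described above can be modelled with a small Python sketch. It is a simplified illustration rather than the importer's actual logic; the function name and the object ids are hypothetical.

```python
def import_parent(parent_id, relations, imported_ids):
    """Return the status a parent object ends up in after an import run.

    relations: list of child object ids the parent is related to.
    imported_ids: set of object ids already present in the target; updated in place.
    """
    imported_ids.add(parent_id)  # the parent itself is imported in any case
    missing = [child for child in relations if child not in imported_ids]
    return "Partially imported" if missing else "Imported"


target = set()
first_run = import_parent("doc_a", ["doc_b"], target)   # child not yet in target
target.add("doc_b")                                      # child gets imported later
second_run = import_parent("doc_a", ["doc_b"], target)  # relation can now be created
print(first_run, "->", second_run)  # prints "Partially imported -> Imported"
```

Repeating the import for the parent after its child has arrived is what moves the object to its final status in this model.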
migration-center’s Documentum adapters support relations between folders and/or documents only (i.e. dm_folder and dm_document objects, as well as their respective subtypes). dm_subscription type objects, for example, although relations from a technical point of view, will be ignored by the scanner because they are relations involving a dm_user object.
Custom relation objects (i.e. relation-type objects which are subtypes of dm_relation) are also supported, including any custom attributes they may have. The restriction mentioned above regarding the types of objects connected by such a relation also applies to custom relation objects.
Documentum Virtual Documents are supported by the D2 Importer. The option named exportVirtualDocs in the configuration of the scanner determines if virtual documents are scanned and exported to migration-center.
Another option related to virtual documents, named maintainVirtualDocsIntegrity is recommended when scanning VDs. This option will allow the scanner to include children of VDs which may be outside the scope of the scanner (paths to scan or dqlString) in order to maintain the integrity of the VD. If this option is not turned on, the children in the VD that are out of scope (they are not linked under the scanned path or they are not returned by dqlString) will not be scanned and the VD may be incomplete. This option can be enabled/disabled based on whichever setting makes sense for the migration project.
The VD binding information (dmr_containment objects) is always scanned and attached to the root object of a VD regardless of the maintainVirtualDocsIntegrity option. This way it is possible to scan any missing child objects later on and still be able to restore the correct VD structure based on the information stored with the root object.
The ExportVersions option needs to be checked for scanning Virtual Documents (i.e. if the ExportVirtualDocuments option is checked) even if the virtual documents themselves do not have multiple versions, otherwise the virtual documents export might produce unexpected results. This is because the VD parents may still reference child objects which are not current versions of those respective objects. This is not an actual product limitation, but rather an issue caused by this particular combination of Scanner options and Documentum’s VD features, which rely on information related to versioning.
The Snapshot feature of virtual documents is not supported by migration-center.
The D2 Importer also supports migrating audit trails of documents and folders. Audit Trails can be scanned using the Documentum Scanner (see the Documentum Scanner user guide for more information about scanning audit trails), added to a DctmToD2(audittrail) migset and imported using the D2 Importer.
DctmToD2(audittrail) migsets are subject to the exact same migration procedure as Documentum documents and folders. DctmToD2(audittrail) migsets can be imported together with the document and folder migsets they are related to, or on their own at any time after the required document and folder objects have been migrated. It is not possible to import audit trails if the object they belong to has not been migrated.
Importing audit trails is controlled in the D2 Importer via the importAuditTrails parameter (disabled by default).
A typical workflow for migrating audit trails consists of the following main steps:
Scan folders/documents with the Documentum Scanner by enabling the “exportAuditTrail” parameter. The scanning process will create three distinct kinds of objects in migration-center: Documentum(folder), Documentum(document) and Documentum(audittrail).
Assign audit trail objects to (a) DctmToD2(audittrail) migset(s) and follow the regular migration-center workflow to promote the objects through transformation rules to a “Validated” state required for the objects to be imported.
Import audit trail objects using the D2 Importer by assigning the respective migset(s) to a D2 Importer job with the importAuditTrails parameter enabled (checked).
In order to prepare audit trails for import, first create a migset containing audit trail objects (more than one migset containing audit trails can be created, just like for documents or folders). For a migset to work with audit trails, the type of object must be set to “DctmToD2(audittrail)” accordingly. After setting the migset type to “DctmToD2(audittrail)”, the | Filescan | tab will display only scans which contain Documentum audit trail objects. Any of these can be added/removed to/from the migration set as usual.
Transformation rules allow setting values for the attributes of audit trail entries on import to the target system. Values can be simply carried over unchanged from the source attributes, or they can be transformed using the available transformation functions. All attributes that can be set and associated are defined in the target type “dm_audittrail”.
As with other migration-center objects, audit trail objects have some predefined system attributes:
audited_object_id: this should be filled with the corresponding value that comes from the source system. No transformation or mapping is necessary because the importer will translate that id into the corresponding id in the target repository.
r_object_type: must be set to a valid Documentum audit trail object type. This is normally “dm_audittrail”, but custom audit trail object types are supported as well.
The following audit trail attributes don’t need to be set through the transformation rules because they are automatically taken from the corresponding audited object in target system: chronicle_id, version_label, object_type.
All other “dm_audittrail” attributes that refer to the audited object (acl_domain, acl_name, audited_obj_vstamp, controlling_app, time_stamp, etc.) can be set either to the values that come from the source repository or not set at all, in which case the importer will set the corresponding values by taking them from the audited object located in the target repository.
The source attributes “attribute_list” and “attribute_list_old” may appear as multi-value in migration-center. This is because their values may exceed the maximum size of a value allowed in migration-center (4000 bytes), in which case migration-center handles such an attribute as a multi-value attribute internally. The user doesn’t need to take any action to handle such attributes; the importer knows how to process and set these values correctly in Documentum.
Not setting an attribute means not defining a rule for it or not associating an existing rule with any object type definition or attribute.
Working with rules and associations is core product functionality and is described in detail in the Client User Guide.
The D2 Importer also supports “Update” or “Delta” migration if the source data is provided through a scanner which also supports the feature, such as the Documentum Scanner or File System Scanner.
Objects that have changed in the source system since the last scan are scanned as update objects. Whether an object in a migration set is an update or not can be seen by checking the value of the Is_update column – if it’s 1, the current object is an update to a previously scanned object (the base object). Some things that need to be considered when working with the update migration feature in combination with the Documentum Scanner will be illustrated next:
An update object cannot be imported unless its base object has been imported previously.
Updated objects are detected based on the r_modify_date and i_vstamp attributes. If one of these attributes has changed, the object is considered to have been updated and will be registered accordingly. By default actions performed in Documentum change at least one if not both of these attributes, offering a reliable way to detect whether an object has changed since the last scan; on the other hand, objects changed by third party code/applications which do not touch these attributes might not be detected by migration-center as having changed.
Objects deleted from the source after having been migrated are not detected and will not be deleted in the target system. This is by design (due to the added overhead, complexity and risk involved in deleting customer data).
Updates/changes to primary content, renditions, metadata, VD structures, and relations of objects will be detected and updated accordingly.
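The detection rule based on r_modify_date and i_vstamp can be sketched in Python as follows. This is a simplified model, not scanner code; it assumes each scanned object is represented as a dictionary holding the two attribute values, and the function name is hypothetical.

```python
from datetime import datetime


def is_update(previous: dict, current: dict) -> bool:
    """Flag an object as updated if r_modify_date or i_vstamp changed since the last scan."""
    return (
        previous["r_modify_date"] != current["r_modify_date"]
        or previous["i_vstamp"] != current["i_vstamp"]
    )


base = {"r_modify_date": datetime(2020, 1, 15, 10, 0), "i_vstamp": 3}
rescanned = {"r_modify_date": datetime(2020, 2, 1, 9, 30), "i_vstamp": 4}
print(is_update(base, rescanned))  # prints True
```

The sketch also shows the limitation mentioned above: a change that leaves both attributes untouched is invisible to this comparison.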
migration-center Client, which is used to set up transformation and validation rules, does not connect directly to any source or target system to extract this information. Object type definitions can be exported from the respective systems to a CSV file, which in turn can be imported to migration-center.
One tool to easily accomplish this for Documentum object types is dqMan, which is used in the following steps to illustrate the process. dqMan is an administration tool for EMC Documentum supporting DQL queries, API commands and much more. dqMan is free and can be downloaded at http://www.fme.de. Other comparable administration tools can also be used, provided they can output a compatible CSV file or generate some similar output which can be processed to match the required format using other tools.
Start dqMan and connect to the target Documentum repository. dqMan normally starts with the interface for working with DQL selected by default. Press the [DQL] button in the toolbar if it is not already selected.
In the “DQL Query” box, paste the following command and replace dm_document with the targeted object type:
select distinct t.attr_name, t.attr_type, '0' as min_length, t.attr_length, t.attr_repeating, a.not_null as mandatory from dm_type t, dmi_dd_attr_info a where t.name=a.type_name and t.attr_name=a.attr_name and t.name='dm_document' enable(row_based);
Press the [Run] button.
Click somewhere in the “Results” box. Use {CTRL+A} to select all. Right-click to open the context menu and choose <Export to> <CSV>.
The extracted object type template is now ready to be imported to migration-center 3.x as described in the Client User Guide.
The Content Validation functionality is based on checksums (an MD5 hash) computed for the document’s contents during scan (by the Scanner itself) and after import (by the Importer). The two checksums are then compared - for any given piece of content its checksums from before and after the migration should be identical.
If the two checksums differ in any way, this is an indication that the content has been corrupted/modified during/after migration. Modifications can happen on purpose or simply by user error and can usually be traced back based on the r_modifier and r_modify_date attributes of the affected object. Actual data corruption rarely happens and is usually due to software and/or hardware errors on the systems involved during content transfer or storage.
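The principle of comparing an MD5 checksum taken before migration with one computed after import can be sketched in Python. This is only an illustration of the technique using the standard hashlib module; the function names are hypothetical, and the way the Scanner and Importer actually compute and store checksums is not shown here.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def content_is_intact(scan_checksum: str, imported_path: str) -> bool:
    """Compare the checksum recorded at scan time with the content read back after import."""
    return scan_checksum.lower() == md5_of_file(imported_path)
```

Reading the file in chunks keeps memory usage flat for large content files, although every byte still has to be read back, which is where the additional import time comes from.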
Validating content is always performed after the content has been imported to the target repository, thus adding another step to the migration process. Accordingly, using this feature may significantly increase import time, because every piece of content for every document has to be read back and its checksum computed in order to compare it against the initial checksum computed during scan.
This feature is controlled through the checkContentIntegrity parameter in the D2 Importer (disabled by default).
Note: Currently only documents scanned by Documentum Scanner or Filesystem Scanner can be used in a migration workflow involving Documentum Content Validation. The D2 Importer does support Content Validation when used with the previously mentioned Scanners.
The Documentum Content Validation also supports renditions in addition to a document’s main content. Renditions are processed automatically if found and do not require further configuration by the user.
There is one limitation that applies to renditions though: since the Content Validation functionality is based on a checksum computed initially during scan (before the migration), renditions are supported only if scanned from a Documentum repository using the Documentum Scanner. Currently this is the only scanner that calculates the required checksums for renditions. Other scanners, even though they may provide metadata pointing to other content files which may become renditions during import, do not handle the content directly and therefore do not compute the scan-time checksum that Content Validation needs for comparison against the imported content’s checksum.
To create a new D2 Importer job specify the respective adapter type in the New Importer Properties window – from the list of available adapters “D2” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case the D2 adapter’s parameters.
The Properties window of an importer can be accessed by double-clicking an importer in the list, or selecting the Properties button or entry from the toolbar or context menu.
Configuration parameters
Values
Name
Enter a unique name for this importer
Mandatory
Adapter type
Select the “D2” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server has been defined, migration-center will prompt the user to define a Job Server location when saving the importer.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
username*
User name for connecting to the target repository.
A user account with superuser privileges must be used to support the full Documentum functionality offered by migration-center.
Mandatory
password*
The user’s password.
Mandatory
repository*
Name of the target repository. The target repository must be accessible from the machine where the selected Job Server is running.
Mandatory
importObjects
Boolean. Selected by default; if NOT checked, documents and folders will NOT be imported. The reason for NOT importing folders and documents is to allow importing only relations between already imported folders/documents.
importRelations
Boolean. Determines whether to import relations between objects. In this case a relation means both dm_relations and dmr_containment objects.
Hint: Depending on project requirements and possibilities, it can make sense to import folders and documents first, and add relations between these objects in a second migration step. For such a two-step approach the importer can be configured accordingly using the importObjects and importRelations parameters.
It is also possible to select both options at the same time and import everything in a single step if the migration data is organized suitably.
It doesn’t make sense, however, to deselect both options.
importAuditTrails
Boolean. Determines whether Documentum(audit trail) migsets are imported or not. If the setting is false, any Documentum(audit trail) migsets added to the Importer will be ignored (but can be imported later, after enabling this option)
importLocation
The path inside the target repository where objects are to be imported. It must be a valid Documentum path. This path will be prepended to each dctm_obj_link (for documents) and r_folder_path (for folders) before linking objects if both the importLocation parameter and the dctm_obj_link/r_folder_path attribute values are set. If the attributes mentioned previously already contain a full path, the importLocation parameter does not need to be filled.
autoCreateFolders
Boolean. Flag indicating if the folder paths defined in "dctm_obj_link" attribute should be automatically created using the default folder type.
defaultFolderType
String. The Documentum folder type name used when automatically creating the missing object links. If left empty, dm_folder will be used as default type.
applyD2Autonaming
Enable or disable D2’s autonaming rules on import.
applyD2Autolinking
Enable or disable D2’s autolinking rules on import.
applyD2Security
Enable or disable D2’s Security rules on import.
applyD2RulesByOwner
Apply D2 rules based on the current document’s owner, rather than the user configured to run the import process.
throwD2LifecycleErrors
Specifies the behaviour when an error occurs during the application of D2 lifecycle actions:
- if unchecked, the error is reported as a warning and the object is set as partially imported, so that the next time it is imported only the lifecycle actions will be performed.
- if checked the lifecycle error is reported and the affected object is moved to the error state
applyD2Lifecycle
Apply the appropriate D2 Lifecycle, or the one specified in the d2_lifecycle_name system rule.
reapplyAttrsAfterD2Lifecycle
Comma separated list of attributes existing in the transformation rules to be reapplied after attaching lifecycle. This is useful when some attributes are overwritten by the lifecycle state actions.
numberOfThreads
The number of threads on which the import should be done. Each document will be imported as a separate session on its own thread.
Max. no. is 20
checkContentIntegrity
Boolean. If enabled, compares the checksums of imported objects with the checksums computed at scan time. Objects with a different checksum will be treated as errors.
May significantly increase import time due to having to read back every document and compute its checksum after import.
ignoreRenditionErrors
Boolean. Determines whether errors affecting individual renditions will also trigger an error for the entire object.
-Checked: renditions with errors will be reported as warnings; the objects and other renditions will be imported normally
-Unchecked: renditions with errors will cause the entire object, including other renditions to fail on import. The object will be transitioned to the "Import error" status and will not be imported.
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during import
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
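The way the importLocation parameter combines with dctm_obj_link or r_folder_path values, as described in the table above, can be pictured with a short Python sketch. It is a model for illustration only, with a hypothetical function name. The handling of an empty attribute value is an assumption, since the text only describes the case where both values are set.

```python
import posixpath


def effective_link(import_location: str, link: str) -> str:
    """Combine importLocation with a dctm_obj_link or r_folder_path value."""
    if not import_location:
        return link  # the attribute already carries the full path
    if not link:
        return import_location  # assumption: not covered by the description
    # Both values are set: importLocation goes in front of the attribute value.
    return posixpath.join(import_location, link.lstrip("/"))


print(effective_link("/Migration/Import", "/cabinet/folder_one"))
# prints "/Migration/Import/cabinet/folder_one"
```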
The autoCreateFolders importer parameter familiar from the Documentum Importer has been dropped from the D2 Importer. The reason is D2’s Autolinking feature, which is the preferred way of creating folders in D2.
On the | Migsets | tab, the user can select the migration sets to be imported with this importer. Depending on the chosen Adapter Type only the migration sets compatible with this type of importer will be displayed and can be selected for import. Also, only migration sets containing at least one object in a validated state will be displayed (since objects which haven’t been validated cannot be imported).
Available migration sets can be moved between the two columns by double clicking or using the arrows buttons.
A complete history for any D2 Importer job is available from the respective item’s History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and end time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with.
The parameters the job was run with.
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the Documentum Adapter can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
Each job run of the importer generates along with its log a rollback script that can be used to remove all the imported data from the target system. This feature can be very useful in the testing phase to clear the resulting items or even in production in case the user wants to correct the imported data and redo the import.
The name of the rollback script is built according to the following pattern:
<Importer name>(<run number>)_<script generation date time>_ rollback_script.api
Its location is the same as the logs location:
<Server components installation folder>/logs/DCTM-Importer/
The script consists of a series of Documentum API commands that remove, in the proper order, the items created during the import. It should look similar to the following example:
//Links:
//Virtual Document Components:
//Relations:
//Documents:
destroy,c,090000138001ab99,1
destroy,c,090000138001ab96,1
destroy,c,090000138001ab77,1
destroy,c,090000138001ab94,1
//Folders:
destroy,c,0c0000138001ab5b,1
You can run it with any application that supports Documentum API scripting, including the fme dqMan application and the IAPI tool from Documentum.
The rollback script is created at the end of the import process. This means that it will not be created if the job run stops before it gets to this stage. Manual stops done directly from the client are an exception: in that case the script is still created.
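Before running a rollback script it can be useful to review what it would remove. The following Python sketch counts the destroy commands per commented section. It assumes the script follows the layout of the example above (section headers starting with // and one destroy command per line); the function name is hypothetical and the sketch does not execute anything against a repository.

```python
from collections import Counter


def summarize_rollback(script_text: str) -> Counter:
    """Count destroy commands per section of a rollback script."""
    counts = Counter()
    section = None
    for line in script_text.splitlines():
        line = line.strip()
        if line.startswith("//"):
            # Section header such as "//Documents:" becomes "Documents".
            section = line[2:].rstrip(":").strip()
        elif line.startswith("destroy,"):
            counts[section] += 1
    return counts
```

Applied to the example script above, the summary reports four commands in the Documents section and one in the Folders section.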
The DCM Importer takes the objects processed in migration-center and imports them to a DCM repository. It currently supports DCM version 6.7. For accessing the DCM repository DFC 6.7 or later is required.
Importer is the term used for an output adapter which is most likely used at the last step of the migration process. An Importer (such as the DCM Importer) takes care of importing the objects processed in migration-center into the target system (such as a DCM repository).
An importer works as a job that can be run at any time, and can even be executed repeatedly. For every run a detailed history and log file are created.
An importer is defined by a unique name, a set of configuration parameters and an optional description.
DCM Importers can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed by migration-center Job Server.
Starting from version 3.9 of migration-center additional configurations need to be made for the Documentum adapter to be able to locate Documentum Foundation Classes. This is done by modifying the dfc.conf file, located in the Job Server installation folder.
There are two settings inside the file that by default match the paths of a standard DFC install. One needs to have the path for the config folder of DFC and the other needs the path to the dctm.jar.
See below example:
wrapper.java.classpath.dfcConfig=C:/Documentum/config
wrapper.java.classpath.dfcDctmJar=C:/Program Files/Documentum/dctm.jar
The dfcConfig parameter must point to the configuration folder. The dfcDctmJar parameter must point to the dctm.jar file itself.
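A quick way to verify the two settings is to check that the configured paths exist on the Job Server machine. The Python sketch below is a hypothetical helper, not something shipped with migration-center. It assumes dfc.conf contains simple key=value lines like the example above, with optional comment lines starting with #.

```python
from pathlib import Path


def check_dfc_conf(conf_path: str) -> dict[str, bool]:
    """Report whether the two DFC paths configured in dfc.conf exist on disk."""
    values = {}
    for line in Path(conf_path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            values[key.strip()] = value.strip()
    config_dir = values.get("wrapper.java.classpath.dfcConfig", "")
    dctm_jar = values.get("wrapper.java.classpath.dfcDctmJar", "")
    return {
        # dfcConfig must be a folder, dfcDctmJar must be the jar file itself.
        "dfcConfig": bool(config_dir) and Path(config_dir).is_dir(),
        "dfcDctmJar": bool(dctm_jar) and Path(dctm_jar).is_file(),
    }
```

A result of False for either key suggests that the corresponding path should be corrected before starting the Job Server.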
When importing DCM documents, their folder paths are also scanned and the folder structure can be automatically re-created by migration-center in the target repository. This procedure will not keep any of the metadata attached to the folder objects, such as owners, permissions, specific object types, or any custom attributes. Depending on project requirements, it may be required to do a “folder-only migration” first, e.g. for migrating a complete folder structure including custom folder object types, permissions and other attributes first, and then populate this folder structure with documents afterwards. In order to execute a folder-only migration the following steps should be performed to configure the migration process accordingly:
Scanner: Scan the source data using a scanner which can generate folder metadata to be used later for import to a DCM repository. See the appropriate scanner’s documentation.
Migration set: When creating a new migration set, choose the <source type>ToDCM(folder) type – this will create a migration set containing folders targeted at DCM. Now only the scanner runs containing folder objects will be displayed on the |Filescan Selection| tab. Note that the number of objects shown for each scanner run now counts folders and not documents, which is why it can differ from the total number of objects processed by the scan (if the scan contains other types of objects besides folders).
Transformation rules: When creating transformation rules for the migration set, keep in mind that folder-only migration sets have folder-specific attributes to work with, in this case attributes specifically targeted at the DCM folder objects.
“object_name” and “r_folder_path” are key attributes for folders in Documentum/DCM. If these attributes are transformed without taking into consideration how these objects build into a tree structure, it may no longer be possible to reconstruct the folder tree. This is not due to migration-center, but rather because of the nature of folders being arranged in a tree structure which does create dependencies between the individual objects.
In Documentum/DCM the “r_folder_path” attribute contains the path(s) to the current folder, as well as the folder itself at the end of the path (e.g. /cabinet/folder_one/folder_two/current_folder), while “object_name” contains only the folder name (e.g. current_folder). To make it easier for the user to change a folder name, migration-center prioritizes the “object_name” attribute over the “r_folder_path” attribute; therefore changing “object_name” from current_folder to folder_three for example will propagate this change to the objects’ “r_folder_path” attribute and create the folder /cabinet/folder_one/folder_two/folder_three without the user having to change the “r_folder_path” attribute to match. This only applies to the last part of the path, which represents the current folder, and not to other parts of the path. Those can also be modified using the provided transformation functions, but migration-center does not provide any automations to make sure the information generated in that case is correct.
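The effect of prioritizing object_name over the last part of r_folder_path can be modelled with a few lines of Python. This is an illustrative sketch with a hypothetical function name, not the logic migration-center uses internally.

```python
import posixpath


def apply_object_name(r_folder_path: str, object_name: str) -> str:
    """Replace the last segment of r_folder_path with object_name."""
    parent = posixpath.dirname(r_folder_path.rstrip("/"))
    return posixpath.join(parent, object_name)


print(apply_object_name("/cabinet/folder_one/folder_two/current_folder", "folder_three"))
# prints "/cabinet/folder_one/folder_two/folder_three"
```

Only the final segment is replaced; the parent segments are left exactly as they were, mirroring the statement that no automation is provided for other parts of the path.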
For more information about these attributes and folders in Documentum/DCM see the appropriate documentation about the EMC Documentum Object Model and Content Server.
Note that the “importObjects” and “autoCreateFolders” options on the DCM Importer’s |Parameters| tab are not used for the folder-only migration. These parameters will still be in effect, however, if other migration set(s) containing DctmToDctm(document) objects are imported together with the folder migration set.
Folder migration is important. It is necessary to take the approach described above when migrating folder structures with complex folder objects containing custom object types, permissions, attributes, relations, etc. This information will be lost if “exportFolderStructure” is not switched on during scan.
The key concept around which DCM revolves is that of document classes and controlled documents. These concepts are also represented in migration-center when working with a migration set targeted at DCM. A valid document class and type must be specified through specific system rules provided for migration sets targeted at DCM in order for the importer to create a proper controlled document upon import to DCM. It is not mandatory to import documents as controlled documents to DCM, as omitting the document class is possible, thus creating a regular document in the DCM target repository.
Other attributes can be set by defining transformation rules for them in the same way as for an import to a regular Documentum repository.
The document class to be set is controlled via the predefined system rule controlled_document_class. This one is created automatically for migsets targeted at DCM and appears alongside the other system rules in the bottom left corner of the Transformation Rules window.
A valid DCM document class existing on the target DCM system must be specified for the importer to be able to create controlled documents.
Specifying an invalid or nonexistent document class will cause the import to fail with an appropriate error message.
Omitting setting a value for the “controlled_document_class” rule will result in the documents being imported as regular documents as opposed to controlled documents.
Currently the importer supports importing controlled documents and change notices. Change notices can be imported by associating the “r_object_type” rule with the type “dmc_change_notice”. The following sections in this chapter refer to both controlled documents and change notices.
When setting a document class through the controlled_document_class rule, the object type set through the r_object_type rule must be a valid object type defined for the document class.
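The rules for controlled_document_class and r_object_type can be summarized as a small pre-import check. This is only a sketch: the function `import_kind`, the `target_classes` dictionary and the sample class and type names are hypothetical, and the real validation is performed by the importer and DCM themselves.

```python
def import_kind(controlled_document_class, r_object_type, target_classes):
    """Classify how an object would be imported.

    target_classes is a hypothetical snapshot of the target DCM
    configuration: document class name -> set of valid object types.
    """
    if not controlled_document_class:
        # No class set: the object is imported as a regular document.
        return "regular document"
    if controlled_document_class not in target_classes:
        raise ValueError(
            f"document class {controlled_document_class!r} does not exist on the target")
    if r_object_type not in target_classes[controlled_document_class]:
        raise ValueError(
            f"{r_object_type!r} is not a valid type for class {controlled_document_class!r}")
    return "controlled document"


# Hypothetical class and type names, for illustration only.
classes = {"sop": {"sop_doc"}}
print(import_kind("", "dm_document", classes))
print(import_kind("sop", "sop_doc", classes))
```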
DCM autonaming rules can be applied to the imported documents simply by omitting the object_name rule, or at least not associating the object_name rule with the object_name attribute. This will result in the documents being named according to the autonaming rules applicable to the specified document class.
Defining and associating a rule for object_name will override DCM autonaming rules. Due to DCM behavior, object_name must also be listed in the importer parameter “preserveMCAttributes” for the value set in migration-center to be applied to the first version.
Setting the object_name through transformation rules is recommended if no autonaming rules are available or intended to be used for the specified document class.
Another key feature of DCM is the lifecycles attached to controlled documents. Lifecycle states can be set for DCM objects through migration-center by creating rules and associating them with the attributes handling lifecycle status and associations, such as a_application_type, a_status, r_current_state, r_resume_state.
If the Documentum scanner is used to obtain source objects, remove “a_application_type” from the scanner’s “ignoredAttributesList” parameter.
For more information about these attributes in Documentum/DCM see the appropriate documentation about the EMC Documentum Object Model and Content Server.
If any of the attributes acl_name, acl_domain, group_permit, world_permit and owner_permit is not set in the transformation rules, the imported documents will be assigned permissions according to the DCM configuration.
The default DCM permissions on imported objects can be overwritten by creating transformation rules for acl_name and acl_domain, or for group_permit, world_permit and owner_permit. The attributes group_permit, world_permit and owner_permit can be used to set granular permissions. For setting ACLs the attributes acl_domain and acl_name must be used.
If the group_permit, world_permit, owner_permit AND acl_domain, acl_name attributes are configured to be migrated together, the *_permit attributes will override the permissions set by the acl_* attributes. This is due to Documentum’s inner workings and not migration-center. Also, Documentum will not throw an error in such a case, which makes it impossible for migration-center to tell that the acl_* attributes have been overridden; as such it will not report an error either, considering that all attributes have been set correctly. It is advised to use either the *_permit attributes OR the acl_* attributes in the same rule set in order to set permissions.
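Because neither Documentum nor migration-center reports the override, it can be useful to check a rule set for this combination before import. The following Python sketch assumes the list of associated target attributes is available as plain strings; the function name `mixes_permit_and_acl` is hypothetical.

```python
PERMIT_ATTRS = {"group_permit", "world_permit", "owner_permit"}
ACL_ATTRS = {"acl_name", "acl_domain"}


def mixes_permit_and_acl(associated_attributes):
    """Return True if a rule set associates both *_permit and acl_*
    attributes, in which case the *_permit values would silently
    override the ACL on import."""
    attrs = set(associated_attributes)
    return bool(attrs & PERMIT_ATTRS) and bool(attrs & ACL_ATTRS)


print(mixes_permit_and_acl(["object_name", "acl_name", "acl_domain"]))
print(mixes_permit_and_acl(["acl_name", "acl_domain", "world_permit"]))
```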
Standard Documentum features such as versions, renditions, relations, virtual documents and audit trails are also supported by the DCM Importer, which, just like DCM, is based on Documentum.
Versions (including branches) are supported by the DCM Importer, including custom version labels. Version information is generated during scan, be it a Documentum Scanner or other scanners supporting systems where version information can be extracted from. This version information (essentially the Documentum r_version_label) can be changed using the transformation rules prior to import. The version structure (i.e. the ordering of the objects relative to their antecedents) cannot be changed using migration-center.
If objects have been scanned with version information, all versions must be imported as well since each object references its antecedent, going back to the very first version. Therefore it is advised not to drop the versions of an object between the scan and the import processes since this will most likely generate inconsistencies and errors. If an object is intended to be migrated without versions (i.e. only the current version of the object needs to be migrated), then the affected objects should be scanned without enabling the respective scanner’s versioning option.
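The requirement that every version can be traced back to the first one can be checked with a short sketch like the one below. The data structure (a dictionary mapping each object id to the id of its antecedent) and the function name are assumptions made for illustration; they do not reflect how migration-center stores version information.

```python
def missing_antecedents(versions):
    """Return the ids of objects whose antecedent is not part of the set.

    versions maps object id -> antecedent id, with None marking the
    very first version of a version tree.
    """
    return sorted(
        obj_id
        for obj_id, antecedent in versions.items()
        if antecedent is not None and antecedent not in versions
    )


# v1.1 was dropped between scan and import, so v1.2 has a dangling reference.
chain = {"v1.0": None, "v1.2": "v1.1"}
print(missing_antecedents(chain))
```

An empty result means the version chain is complete for the given set of objects.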
Renditions are supported by the DCM Importer. Rendition information is typically generated during scan, be it by the Documentum Scanner or other scanners supporting systems where rendition information or other similar features can be extracted. Should rendition information not be obtainable using a particular scanner, or if the respective source system doesn’t have renditions at all, it is still possible to add files as renditions to objects during the transformation process. The renditions of an object can be controlled through the migration-center system attribute called dctm_obj_rendition. The dctm_obj_rendition attribute appears in the “System attributes” area of the Transformation Rules window. If the object contains at least one rendition in addition to the main content, the source attribute dctm_obj_rendition will be available for use in transformation rules. To keep the renditions for an object and migrate them as they are, the system attribute dctm_obj_rendition must be set to contain one single transformation function: GetValue(dctm_obj_rendition[all]). This will reference the path of the files where the corresponding renditions have been exported to; the Importer will pick up the content from this location and add them as renditions to the respective documents.
It is possible to add/remove individual renditions to/from objects by using the provided transformation functions. This can prove useful if renditions generated by third party software need to be appended during migration. These renditions can be saved to files in any location which is accessible to the Job Server where the import will be run from. The paths to these files can be specified as values for the dctm_obj_rendition attribute. A good practice for being able to assign such third party renditions to the respective objects is to name the files with the objects’ id and use the format name as the extension. This way the dctm_obj_rendition attributes’ values can be built easily to match external rendition files to respective DCM documents.
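The naming practice described above (object id as the file name, format name as the extension) makes it straightforward to derive the values for dctm_obj_rendition. The sketch below is illustrative only: the function name, the base directory and the sample object id are assumptions.

```python
from pathlib import PurePosixPath


def rendition_paths(object_id, formats, base_dir):
    """Build one rendition file path per format, following the
    '<object id>.<format name>' naming practice."""
    return [str(PurePosixPath(base_dir) / f"{object_id}.{fmt}") for fmt in formats]


# Hypothetical object id and export directory.
for path in rendition_paths("0900000180001234", ["pdf", "txt"], "/exports/renditions"):
    print(path)
```

The resulting list could be used as the multi-value content of dctm_obj_rendition, provided the files are accessible to the Job Server running the import.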
Other properties of renditions can also be set by the user through a series of rendition-related system attributes, which are available automatically in any migset targeting a Documentum based system:
| Rendition system attribute | Description |
| --- | --- |
| dctm_obj_rendition | Specify a file on disk (full path) to be used as the content of that particular rendition. |
| dctm_obj_rendition_format | Specify a valid Documentum format to be set for the rendition. Leave empty to let Documentum decide the format automatically based on the extension of the file specified in dctm_obj_rendition (if the extension is not known to Documentum the rendition’s format will be unknown). Will be ignored if dctm_obj_rendition is not set. |
| dctm_obj_rendition_modifier | Specify a page modifier to be set for the rendition. Any string can be set (must conform to Documentum’s page_modifier attribute, as that’s where the value would end up). Leave empty if you don’t want to set any page modifiers for renditions. Will be ignored if dctm_obj_rendition is not set. |
| dctm_obj_rendition_storage | Specify a valid Documentum filestore where the rendition’s content file will be stored. Leave empty to store the rendition file in Documentum’s default filestore. Will be ignored if dctm_obj_rendition is not set. |
All dctm_obj_rendition* system attributes are repeating attributes and as such accept multiple values, allowing multiple renditions to be added to the same object. Normally the number of values for all dctm_obj_rendition* attributes should be the same and equal the maximum number of renditions one would like to set for an object. E.g. if three renditions should be set, then each of the dctm_obj_rendition* attributes should have three values, one for each of the three renditions. More values will be ignored, missing values will be filled in with whatever Documentum would use as default in place of that missing value.
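How the repeating values line up by position can be sketched as follows. This is a simplified model, not the importer's code: it assumes the values of dctm_obj_rendition drive the number of renditions, uses `None` to stand for "let Documentum apply its default", and the function name is hypothetical.

```python
def align_renditions(paths, formats=(), modifiers=(), storages=()):
    """Pair each rendition path with the values found at the same index
    in the other repeating attributes.

    Surplus values are ignored; missing values become None, standing in
    for whatever default Documentum would apply.
    """
    def value_at(values, index):
        return values[index] if index < len(values) else None

    return [
        {
            "path": path,
            "format": value_at(formats, i),
            "modifier": value_at(modifiers, i),
            "storage": value_at(storages, i),
        }
        for i, path in enumerate(paths)
    ]


# Two renditions, three formats (the third is ignored), one modifier.
for rendition in align_renditions(
        ["/r/doc.pdf", "/r/doc.txt"], ["pdf", "crtext", "tiff"], ["web"]):
    print(rendition)
```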
Relations are supported by the DCM Importer. Relations for import to DCM can currently be generated only by the Documentum Scanner, but technically it is possible to customize any scanner to generate relation information that can be processed by the DCM Importer if information similar or close to Documentum relations needs to be extracted from a non-Documentum system. The option named exportRelations in the Documentum scanner’s configuration determines if relations are scanned and exported to migration-center.
Important: relations cannot be altered using transformation rules; migration-center will manage relations automatically if the appropriate options in the scanner and importer have been selected. Relations will always be connected to the parent object of the relation and can be viewed in migration-center by right-clicking on an object in any view of a migration set and selecting <View Relations> from the context menu. All relations with the selected object as the parent object are listed with their associated metadata, such as relation name, child object, etc.
An option corresponding to the scanner’s exportRelations option must be selected in the importer as well to restore the relations between objects on import; the option is labeled importRelations. The importer can be configured to import objects and relations together or independently of one another. This can be used to migrate only the objects first and attach relations to the imported objects later.
Relations will always be connected to the parent object of the relation, which is why importing a relation will always be attempted when importing its parent object, provided the importRelations option mentioned above is selected. Importing a relation will fail if its child object is not present at the import location. This is not to be considered a fatal error. Since the relation is connected to the parent object, the parent object itself will be imported successfully and marked as “Partially imported”, indicating it has one or more relations which could not be imported (because the respective child object could not be found). After the child object has been imported, the import for the parent object can be repeated. The already imported parent object will not be touched, but its missing relation(s) will now be created and connected to the child object. Once all relations have been successfully created, the parent object’s status will change from “Partially imported” to “Imported”, indicating a fully migrated object, including all its relations. Should some objects remain in a “Partially imported” state because the child objects the relation depends on are not migrated for some reason, the objects can remain in this state, which can then be considered a final state equivalent to “Imported”. The “Partially imported” state does not have any adverse effects on the current or future migrations even if these depend on the respective objects.
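The status transitions described above can be modeled in a few lines. The sketch below is a simplified illustration with hypothetical object ids; it does not reproduce the importer's internal logic.

```python
def parent_status(child_ids, imported_ids):
    """Return the status of an already imported parent object together
    with the child ids whose relations could not be created yet."""
    missing = [child for child in child_ids if child not in imported_ids]
    if missing:
        return "Partially imported", missing
    return "Imported", []


imported = {"parent_1", "child_a"}

# First run: child_b is not at the import location yet.
print(parent_status(["child_a", "child_b"], imported))

# After child_b has been imported, repeating the import completes the parent.
imported.add("child_b")
print(parent_status(["child_a", "child_b"], imported))
```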
migration-center’s Documentum based adapters support relations between folders and/or documents only (i.e. dm_folder and dm_document objects, as well as their respective subtypes). dm_subscription type objects for example, although relations from a technical point of view, will be ignored by the scanner because they are relations involving a dm_user object. Custom relation objects (i.e. relation-type objects which are subtypes of dm_relation) are also supported, including any custom attributes they may have. The restriction mentioned above regarding the types of objects connected by such a relation also applies to custom relation objects.
Documentum Virtual Documents are supported by migration-center. The option named exportVirtualDocs in the configuration of the scanner determines if virtual documents are scanned and exported to migration-center.
Another option related to virtual documents, named maintainVirtualDocsIntegrity is recommended when scanning VDs. This option will allow the scanner to include children of VDs which may be outside the scope of the scanner (paths to scan or dqlString) in order to maintain the integrity of the VD. If this option is not turned on, the children in the VD that are out of scope (they are not linked under the scanned path or they are not returned by dqlString) will not be scanned and the VD may be incomplete. This option can be enabled/disabled based on whichever setting makes sense for the migration project.
The VD binding information (dmr_containment objects) is always scanned and attached to the root object of a VD regardless of the maintainVirtualDocsIntegrity option. This way it is possible to scan any missing child objects later on and still be able to restore the correct VD structure based on the information stored with the root object.
The exportVersions option needs to be checked for scanning Virtual Documents (i.e. if the exportVirtualDocs option is checked) even if the virtual documents themselves do not have multiple versions, otherwise the virtual documents export might produce unexpected results. This is because the VD parents may still reference child objects which are not current versions of those respective objects. This is not an actual product limitation, but rather an issue caused by this particular combination of Scanner options and Documentum’s VD features, which rely on information related to versioning.
The Snapshot feature of virtual documents is not supported by migration-center.
The DCM Importer also supports migrating audit trails of documents and folders. Audit Trails can be scanned using the Documentum Scanner (see the Documentum Scanner user guide for more information about scanning audit trails), added to a DCM Audit Trail migset and imported using the DCM Importer.
Audit Trail migsets are subject to the exact same migration procedure as documents and folders. Audit Trail migsets can be imported together with the document and folder migsets they are related to, or on their own at any time after the required document and folder objects have been migrated. It is of course not possible to import any audit trails if the actual object the audit trails belong to hasn’t been migrated.
Importing audit trails is controlled in the DCM Importer via the importAuditTrails parameter (disabled by default).
A typical workflow for migrating audit trails consists of the following main steps:
Scan folders/documents with the Documentum Scanner with the “exportAuditTrail” flag activated. The scanning process will create three distinct kinds of objects in migration-center: Documentum(folder), Documentum(document) and Documentum(audittrail).
Assign the Documentum(audittrail) objects to one or more DctmToDCM(audittrail) migsets and follow the regular migration-center workflow to promote the objects through transformation rules to the “Validated” state required for the objects to be imported.
Import the audit trail objects by assigning the respective migset(s) to a DCM Importer job with the importAuditTrails parameter enabled (checked).
In order to prepare audit trails for import, first create a migset containing audit trail objects (more than one migset containing audit trails can be created, just like for documents or folders). For a migset to work with audit trails, the type of object must be set to “DctmToDCM(audittrail)”. After setting the migset type to “DctmToDCM(audittrail)” the | Filescan | tab will display only scans which contain Documentum audit trail objects. Any of these can be added/removed to/from the migration set as usual.
Transformation rules allow setting values for the attributes of audit trail entries on import to the target system. Values can be simply carried over unchanged from the source attributes, or they can be transformed using the available transformation functions. All attributes that can be set and associated are defined in the target type “dm_audittrail”.
As with other migration-center objects, audit trail objects have some predefined system attributes:
audited_object_id: this should be filled with the corresponding value that comes from the source system. No transformation or mapping is necessary because the importer will translate that id into the corresponding id in the target repository.
r_object_type: must be set to a valid Documentum audit trail object type. This is normally “dm_audittrail”, but custom audit trail object types are supported as well.
The following audit trail attributes don’t need to be set through the transformation rules because they are automatically taken from the corresponding audited object in the target system: chronicle_id, version_label, object_type.
All other “dm_audittrail” attributes that refer to the audited object (acl_domain, acl_name, audited_obj_vstamp, controlling_app, time_stamp, etc.) can be set either to the values that come from the source repository or not set at all, in which case the importer will set the corresponding values by taking them from the audited object located in the target repository.
The source attributes “attribute_list” and “attribute_list_old” may appear as multi-value in migration-center. This is because their values may exceed the maximum size of a value allowed in migration-center (4000 bytes), in which case migration-center handles such an attribute as a multi-value attribute internally. The user doesn’t need to take any action to handle such attributes; the importer knows how to process and set these values correctly in Documentum.
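To illustrate why a long “attribute_list” value can show up as several values, the sketch below splits a string into pieces of at most 4000 bytes and joins them again. How migration-center actually splits and reassembles such values is not documented here; the UTF-8 encoding, the chunking strategy and the function name are assumptions made purely for illustration.

```python
MAX_VALUE_BYTES = 4000  # maximum size of a single value, as stated in the text


def split_value(text, limit=MAX_VALUE_BYTES):
    """Split text into chunks whose UTF-8 encoding fits within limit bytes,
    never cutting a character in half."""
    chunks, current, size = [], [], 0
    for char in text:
        char_size = len(char.encode("utf-8"))
        if current and size + char_size > limit:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(char)
        size += char_size
    if current:
        chunks.append("".join(current))
    return chunks


long_value = "a" * 9000
parts = split_value(long_value)
print([len(p) for p in parts])
print("".join(parts) == long_value)
```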
Not setting an attribute means not defining a rule for it or not associating an existing rule with any object type definition or attribute.
Working with rules and associations is core product functionality and is described in detail in the Client User Guide.
Objects that have changed in the source system since the last scan are scanned as update objects. Whether an object in a migration set is an update or not can be seen by checking the value of the Is_update column – if it’s 1, the current object is an update to a previously scanned object (the base object). There are some things to consider when working with the update migration feature:
An update object cannot be imported unless its base object has been imported previously.
Updated objects are detected based on the r_modify_date and i_vstamp attributes. If one of these attributes has changed, the object is considered to have been updated and will be registered accordingly. By default actions performed in Documentum change at least one if not both of these attributes, offering a reliable way to detect whether an object has changed since the last scan; on the other hand, objects changed by third party code/applications which do not touch these attributes might not be detected by migration-center as having changed.
Objects deleted from the source after having been migrated are not detected and will not be deleted in the target system. This is by design (due to the added overhead, complexity and risk involved in deleting customer data).
Updates/changes to primary content, renditions, metadata, VD structures, and relations of objects will be detected and updated accordingly.
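The update detection based on r_modify_date and i_vstamp can be sketched as a simple comparison between the values recorded at the last scan and the current values. The dictionary representation and the function name are assumptions; the scanner's real implementation is not shown in this guide.

```python
def is_update(previous, current):
    """Return True if either r_modify_date or i_vstamp differs between
    the previously scanned state and the current state of an object."""
    return (
        previous["r_modify_date"] != current["r_modify_date"]
        or previous["i_vstamp"] != current["i_vstamp"]
    )


last_scan = {"r_modify_date": "2020-03-01 10:15:00", "i_vstamp": 4}
now = {"r_modify_date": "2020-03-01 10:15:00", "i_vstamp": 5}
print(is_update(last_scan, now))
```

An object changed by third party code that leaves both attributes untouched would compare as unchanged here, which mirrors the limitation mentioned above.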
migration-center Client, which is used to set up transformation and validation rules, does not connect directly to any source or target system to extract this information. Object type definitions can be exported from the respective systems to a CSV file, which in turn can be imported to migration-center.
One tool to easily accomplish this for Documentum object types is dqMan, which is used in the following steps to illustrate the process. dqMan is an administration tool for EMC Documentum supporting DQL queries, API commands and much more. dqMan is free and can be downloaded at http://www.fme.de. Other comparable administration tools can also be used, provided they can output a compatible CSV file or generate some similar output which can be processed to match the required format using other tools.
Start dqMan and connect to the target DCM repository. dqMan normally starts with the interface for working with DQL selected by default. Press the [DQL] button in the toolbar if not already selected.
In the “DQL Query” box paste the following command and replace dm_document with the targeted object type:
select distinct t.attr_name, t.attr_type, '0' as min_length, t.attr_length, t.attr_repeating, a.not_null as mandatory from dm_type t, dmi_dd_attr_info a where t.name=a.type_name and t.attr_name=a.attr_name and t.name='dm_document' enable(row_based);
Press the [Run] button.
Click somewhere in the “Results” box. Use {CTRL+A} to select all. Right-click to open the context menu and choose <Export to> <CSV>.
The extracted object type template is now ready to be imported to migration-center 3.x as described in the chapter Object Types (or object type template definitions) in the migration-center Client User Guide.
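When several object types need to be exported, the DQL statement from the steps above can be generated per type. The following Python sketch only builds the query text; the helper name `object_type_query` and the type name check are assumptions, and the query still has to be run in dqMan or a comparable tool.

```python
import re

# The statement from the steps above, with the type name as a placeholder.
QUERY_TEMPLATE = (
    "select distinct t.attr_name, t.attr_type, '0' as min_length, "
    "t.attr_length, t.attr_repeating, a.not_null as mandatory "
    "from dm_type t, dmi_dd_attr_info a "
    "where t.name=a.type_name and t.attr_name=a.attr_name "
    "and t.name='{type_name}' enable(row_based);"
)


def object_type_query(type_name):
    """Return the DQL statement for the given object type name."""
    # Assumption: only plain identifiers are accepted, to avoid breaking the
    # quoted literal in the statement.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", type_name):
        raise ValueError(f"unexpected object type name: {type_name!r}")
    return QUERY_TEMPLATE.format(type_name=type_name)


print(object_type_query("dmc_change_notice"))
```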
The Content Validation functionality is based on checksums (an MD5 hash) computed for the document’s contents during scan (by the Scanner itself) and after import (by the Importer). The two checksums are then compared - for any given piece of content its checksums from before and after the migration should be identical.
If the two checksums differ in any way, this is an indication that the content has been corrupted/modified during/after migration. Modifications can happen on purpose or simply by user error and can usually be traced back based on the r_modifier and r_modify_date attributes of the affected object. Actual data corruption rarely happens and is usually due to software and/or hardware errors on the systems involved during content transfer or storage.
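The principle behind the comparison can be shown with Python's standard hashlib module. This is a generic sketch of computing and comparing MD5 checksums, not the code used by the Scanner or Importer; the function names are hypothetical.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a binary stream, reading it in chunks
    so that large content files do not have to fit into memory."""
    digest = hashlib.md5()
    for block in iter(lambda: stream.read(chunk_size), b""):
        digest.update(block)
    return digest.hexdigest()


def content_is_intact(scan_checksum, imported_stream):
    """Compare the checksum recorded during scan with the checksum of the
    content read back after import."""
    return md5_of_stream(imported_stream) == scan_checksum.lower()


scan_checksum = md5_of_stream(io.BytesIO(b"hello"))
print(content_is_intact(scan_checksum, io.BytesIO(b"hello")))
print(content_is_intact(scan_checksum, io.BytesIO(b"hello!")))
```

Reading every piece of content back to compute its checksum is what makes this kind of validation expensive on large imports.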
Validating content is always performed after the content has been imported to the target repository, thus adding another step to the migration process. Accordingly, using this feature may significantly increase import time due to having to read back every piece of content for every document and compute its checksum in order to compare it against the initial checksum computed during scan.
This feature is controlled through the checkContentIntegrity parameter in the DCM Importer (disabled by default).
Currently only documents scanned by Documentum Scanner or Filesystem Scanner can be used in a migration workflow involving Documentum Content Validation.
The Documentum Content Validation also supports renditions in addition to a document’s main content. Renditions are processed automatically if found and do not require further configuration by the user.
There is one limitation that applies to renditions though: since the Content Validation functionality is based on a checksum computed initially during scan (before the migration), renditions are supported only if scanned from a Documentum repository using the Documentum Scanner. Currently this is the only scanner aware of calculating the required checksums for renditions; other scanners, even though they may provide metadata pointing to other content files which may become renditions during import, do not handle the content directly and therefore do not compute the checksum during scan time needed by the Content Validation to compare the imported content’s checksum against.
To create a new DCM Importer job specify the respective adapter type in the importer’s Properties window – from the list of available adapters “DCM” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case the DCM adapter’s.
The Properties window of an importer can be accessed by double-clicking an importer in the list, or selecting the Properties button or entry from the toolbar or context menu.
Configuration parameters
Values
Name
Enter a unique name for this importer
Mandatory
Adapter type
Select the “DCM” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server has been defined, migration-center will prompt the user to define a Job Server location when saving the importer.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
username*
User name for connecting to the target repository.
A user account with superuser privileges must be used to support the full Documentum/DCM functionality offered by migration-center.
Mandatory
password*
The user’s password.
Mandatory
repository*
Name of the target repository. The target repository must be accessible from the machine where the selected Job Server is running.
Mandatory
importObjects
Boolean. Selected by default; if NOT checked, documents and folders will NOT be imported. The reason for NOT importing folders and documents is to allow importing only relations between already imported folders/documents.
importRelations
Boolean. Determines whether to import relations between objects. In this case a relation means both dm_relations and dmr_containment objects.
Hint: Depending on project requirements and possibilities, it can make sense to import folders and documents first, and add relations between these objects in a second migration step. For such a two-step approach the importer can be configured accordingly using the importObjects and importRelations parameters.
It is also possible to select both options at the same time and import everything in a single step if the migration data is organized suitably.
It doesn’t make sense, however, to deselect both options.
importAuditTrails
Boolean. Determines whether Documentum(audit trail) migsets are imported or not. If the setting is false, any Documentum(audit trail) migsets added to the Importer will be ignored (but can be imported later, after enabling this option)
importLocation
The path inside the target repository where objects are to be imported. It must be a valid Documentum path. This path will be prepended to each dctm_obj_link (for documents) and r_folder_path (for folders) before linking objects if both the importLocation parameter and the dctm_obj_link/r_folder_path attribute values are set. If the attributes mentioned previously already contain a full path, the importLocation parameter does not need to be filled.
autoCreateFolders
Boolean. This option will be used for letting the importer automatically create any missing folders that are part of “dctm_obj_link” or “r_folder_path”.
Use this option to have migration-center re-create a folder structure at the target repository during import. If the target repository already has a fixed/predefined folder structure and creating new folders is not desired, deselect this option.
checkContentIntegrity
Boolean. If enabled, the importer will compare the checksums of imported objects with the checksums computed during scan. Objects with a different checksum will be treated as errors.
May significantly increase import time due to having to read back every document and compute its checksum after import.
ignoreRenditionErrors
Determines whether errors affecting individual renditions will also trigger an error for the object the rendition belongs to or not
Checked - renditions with errors will be reported as warnings; the objects and other renditions will be imported normally
Unchecked - renditions with errors will cause the entire object, including other renditions, to fail on import. The object will be transitioned to the "Import error" status and will not be imported.
defaultFolderType
The Documentum folder type name used when automatically creating the missing object links. If left empty, dm_folder will be used as default type.
preserveMCAttributes
Specifies a list of attributes that should be preserved upon import as set in migration-center, rather than set to the values DCM’s rules and lifecycle may set.
The default selection of attributes listed here is a good choice for allowing controlled documents to be imported into any lifecycle state; the list can of course be edited depending on requirements.
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during import
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
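The way the importLocation parameter combines with dctm_obj_link or r_folder_path, as described in the parameter list above, can be sketched as a simple path join. The function name and the fallback when only one of the two values is set are assumptions made for this illustration.

```python
def linked_path(import_location, object_path):
    """Model the path an object would be linked to.

    When both values are set, importLocation is placed in front of the
    object's own path. Otherwise (assumption) whichever value is set is
    used unchanged.
    """
    if import_location and object_path:
        return import_location.rstrip("/") + "/" + object_path.lstrip("/")
    return object_path or import_location


print(linked_path("/Migration", "/cabinet/folder_one"))
print(linked_path("", "/cabinet/folder_one"))
```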
On the | Migsets | tab, the user can select the migration sets to be imported with this importer. Depending on the chosen Adapter Type only the migration sets compatible with this type of importer will be displayed and can be selected for import. Also, only migration sets containing at least one object in a validated state will be displayed (since objects which haven’t been validated cannot be imported).
Available migration sets can be moved between the two columns by double clicking or using the arrows buttons.
A complete history is available for any DCM Scanner or Importer job from the respective items’ History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the DCM Adapter can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The Filesystem Importer can save objects from migration-center to the file system. It can also write metadata for those objects into either separate XML files or a single unified XML file. The folder structure (if any) can also be created in the filesystem during import. The filesystem can be either a local filesystem or a share accessible via a UNC path.
Importer is the term used for an output adapter and is used at the last step of the migration process. In the context of the Filesystem Importer the filesystem itself is considered to be the target location for migrated data, hence the designation “importer”. The Filesystem Importer imports data sourced from other systems and processed with migration-center to the filesystem.
This module works as a job that can be run at any time, and can even be executed repeatedly. For every run a detailed history and log file are created. An importer is defined by a unique name, a set of configuration parameters and an optional description.
Filesystem Importers can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed by migration-center Job Server.
In addition to the actual content files, metadata files containing the objects’ attributes can be created when outputting files from migration-center. These files use a simple XML schema and usually should be placed next to the objects they are related to. It is also possible to collect metadata for all objects imported in a given run in a single metadata file, rather than in separate files.
Starting with version 3.2.6 the way of creating object metadata has become more flexible. The following options are available:
Generate the metadata for each object to an individual xml file. The name and the location of the individual metadata file is now configurable through the system rule “metadata_file_path”. If left empty no individual metadata files will be generated.
Generate the metadata of the imported objects in a single xml file. The name and the location of the unified metadata file will be set in the importer parameter “unifiedMetadataPath”. In this case the system rule “metadata_file_path” must be empty.
Generate the metadata for each object to an individual xml file and also create the unified metadata file. The individual metadata file will be set through the system rule “metadata_file_path” and the unified metadata file through the importer parameter “unifiedMetadataPath”.
Import only the content of files without generating any metadata file. In this case the system rule “metadata_file_path” and the importer parameter “unifiedMetadataPath” should be left empty.
If the imported files and metadata are meant to be scanned later with the Filesystem Scanner, the individual metadata file names should comply with the Filesystem Scanner naming convention. The individual metadata file must be located in the folder where the content is exported, and its name should be composed of the name and the extension of the content file plus the extension of the metadata file.
For example: If one content file is exported to “d:\export\file1.pdf” the generated individual metadata should be “d:\export\file1.pdf.xml” where “.xml” is the extension you chose for the metadata file.
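The naming convention above can be sketched in a few lines of Python. This is only an illustration of how the metadata path relates to the content path; in migration-center the value is built with transformation functions in the “metadata_file_path” system rule. The function name `metadata_path_for` is hypothetical.

```python
from pathlib import PureWindowsPath


def metadata_path_for(content_path: str, metadata_ext: str = ".xml") -> str:
    """Return the metadata file path that follows the naming convention.

    The metadata file sits in the same folder as the content file and is
    named <content file name><content extension><metadata extension>.
    """
    content = PureWindowsPath(content_path)
    # Append the metadata extension to the complete file name (name + extension).
    return str(content.with_name(content.name + metadata_ext))


print(metadata_path_for(r"d:\export\file1.pdf"))  # d:\export\file1.pdf.xml
```

The same helper covers other metadata extensions by passing a different `metadata_ext` value.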
A sample metadata file’s XML structure is illustrated below. The sample content could belong to the report.pdf.fme file mentioned above. In this case the report.pdf file has 4 attributes, each attribute being defined as a name-value pair. There are five lines because one of the attributes is a multi-value attribute. Multi-value attributes are represented by repeating the attribute element with the same name, but with a different value attribute (i.e. the keywords attribute is listed twice, but with different values).
To generate metadata files in a different format than the one above, an XSL template can be used to transform the above XML into another output. To use this functionality a corresponding XSL file needs to be built and its location specified in the importer’s parameters. This way it is possible to obtain XML files in a format that could be processed further by other software if needed. The specified XSL template will apply to both metadata files: individual and unified.
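As a rough illustration of the structure described here, the following Python sketch builds a metadata document with four attributes, one of them multi-valued. The element and attribute names (`contentattributes`, `attribute`, `name`, `value`) are assumptions derived from the description in this chapter, and the attribute values are invented for the example; check a file generated by the importer for the exact format.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute values for report.pdf; "keywords" is multi-valued,
# so it appears twice with different values (4 attributes, 5 lines).
attributes = [
    ("filename", "report.pdf"),
    ("author", "jdoe"),
    ("title", "Quarterly report"),
    ("keywords", "finance"),
    ("keywords", "quarterly"),
]

# Assumed layout: one <attribute name="..." value="..."/> element per value,
# enclosed in a <contentattributes> node.
root = ET.Element("contentattributes")
for name, value in attributes:
    ET.SubElement(root, "attribute", {"name": name, "value": value})

print(ET.tostring(root, encoding="unicode"))
```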
For a unified metadata file it is also possible to specify the name of the root node (through an importer parameter) that will be used to enclose the individual objects’ <contentattributes> nodes.
Filesystem attributes like created, modified and owner can not only be set in the metadata file but they are also set on the created content file in the operating system. Any source attribute can be used and mapped to one of these attributes in the migset system rules.
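To make clear what setting an attribute on the content file itself means, as opposed to only recording it in the metadata file, here is a minimal Python sketch that sets the “Modified” timestamp of a file. It is not the importer's implementation; the helper name is hypothetical, and the creation date and owner are left out because they require platform-specific calls.

```python
import os
import tempfile
from datetime import datetime, timezone


def set_modified_date(path: str, modified: datetime) -> None:
    """Set the last-modified (and last-access) timestamp of a file."""
    timestamp = modified.timestamp()
    os.utime(path, (timestamp, timestamp))


# Demonstrate on a temporary file, using a hypothetical source value.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    sample_path = tmp.name

source_modified = datetime(2020, 7, 1, 12, 0, tzinfo=timezone.utc)
set_modified_date(sample_path, source_modified)

# Read the timestamp back from the operating system, then clean up.
restored = datetime.fromtimestamp(os.path.getmtime(sample_path), tz=timezone.utc)
os.remove(sample_path)
```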
Even though the filesystem does not explicitly support “renditions”, i.e. representations of the same file in different formats, the Filesystem importer can work with multiple files which represent different formats of the same content. The Filesystem Importer does not and cannot generate these files – “Renditions” would typically come from an external source such as PDF representations of editable Office file formats or technical drawings created using one of the many PDF generation applications available, or renditions extracted by a migration-center scanner from a system which supports such a feature. If files intended to be used as renditions exist, the Filesystem Importer can be configured to get these files from their current location and move them to the import location together with the migrated documents. The “renditions” can then be renamed for example in order to match the name of the main document they relate to; any other transformation is of course also possible. “Renditions” are an optional feature and can be managed through dedicated system rules during the migration. See for more.
The source data imported with the Filesystem Importer can originate from various content management systems which typically also support multiple versions of the same object.
The Filesystem Importer does not support outputting versioned objects to the filesystem (multiple versions of the same document for example).
This is due to the filesystem’s design which does not support versioning an object, nor creating multiple files in the same folder with the same name. If versions need to be imported using the Filesystem Importer the version information should be appended to the filename attribute to generate unique filenames. This way there will be no conflicting names and the importer will be able to write all files correctly to the location specified by the user.
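The following Python sketch shows one possible way to derive unique file names from version information. In a migration set this would be done with transformation functions in the “content_target_file_path” rule; the `_v` separator and the function name are assumptions chosen for the example.

```python
from pathlib import PureWindowsPath


def versioned_target_path(target_path: str, version_label: str) -> str:
    """Insert the version label between the file name and its extension."""
    path = PureWindowsPath(target_path)
    return str(path.with_name(f"{path.stem}_v{version_label}{path.suffix}"))


# Three versions of the same document end up with three distinct file names.
versions = ["1.0", "1.1", "2.0"]
paths = [versioned_target_path(r"d:\export\report.pdf", v) for v in versions]
for p in paths:
    print(p)
```

Keeping the extension at the end of the name means the files still open with their associated application.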
The source data imported with the Filesystem Importer can originate from various content management systems which can support multiple links for the same object, i.e. one and the same object being accessible in multiple locations.
The Filesystem Importer does not support creating multiple links for objects in the filesystem (the same folder linked to multiple different parent folders for example). If the object to be imported with the Filesystem Importer has had multiple links originally, only the first link will be preserved and used by the Filesystem Importer for creating the respective object. This may put some objects in unexpected locations, depending on how the objects were linked or arranged originally.
Using scanner configuration parameters and/or transformation rules it should be possible to filter out any unneeded links, leaving only the required links to be used by the Filesystem Importer.
Objects meant to be migrated to the filesystem using the Filesystem Importer have their own type in migration-center. This allows migration-center and the user to target aspects and properties specific to the filesystem.
Documents targeted at the filesystem will have to be added to a migration set first. This migration set must be configured to accept objects of type <source object type>ToFilesystem(document).
Create a new migration set and set the <source object type>ToFilesystem(document) object type in the Type drop-down. This is set in the –Migration Set Properties- window which appears when creating a new migration set. The type of object can no longer be changed after a migration set has been created.
The migset is now set to work with Filesystem documents.
<source object type>ToFilesystem(document)-type objects have a number of predefined rules listed under Rules for system attributes in the –Transformation Rules- window. These rules are described in the table below.
Configuration parameters
Values
content_target_file_path
Sets the full path, including filename and extension, where the current document should be saved on import.
Use the available transformation methods to build a string representing a valid file system path.
Example: d:\Migration\Files\My Documents\Report for 12-11.xls
Mandatory
rendition_source_file_paths
Sets the full path, including filename and extension, where a “rendition” file for the current document is located.
Use the available transformation methods to build a string representing a valid file system path.
This is a multi-value rule, allowing multiple paths to be specified if more than one rendition exists (PDF, plain text, and XML for example)
rendition_target_file_paths
Sets the full path, including filename and extension, where a “rendition” file for the current document should be saved to. Typically this would be somewhere near the main document, but any valid path is acceptable.
Use the available transformation methods to build a string representing a valid file system path.
Example:
d:\Migration\Files\My Documents\Renditions\Report for 12-11.pdf
This is a multi-value rule, allowing multiple paths to be specified if more than one rendition exists (PDF, plain text, and XML for example)
metadata_file_path
The path to the individual metadata file that will be generated for current object.
created_date
Sets the “Created” date attribute in the filesystem.
(Leave empty to skip)
modified_date
Sets the “Modified” date attribute in the filesystem.
(Leave empty to skip)
file_owner
Sets the “Owner” attribute in the filesystem. The user, e.g. “domain\user” or “jdoe”, must exist either among the computer’s local users or in the LDAP directory.
(Leave empty to skip)
Working with rules and associations is a core product functionality and is described in detail in the Client User Guide.
Since the Filesystem doesn’t use different object types for files, the Filesystem Importer doesn’t need this information either. But due to migration-center’s workflow an association with at least one object type needs to exist in order to validate and prepare objects in a migration set for import. To work around this, any existing migration-center object type definition can be used with the Filesystem Importer. A good practice would be to create a new object type definition containing the attribute values used with the Filesystem Importer, and to use this object type definition for association and validation.
Working with object type definitions and defining attributes is a core product functionality and is described in detail in the Client User Guide.
To create a new Filesystem Importer job, specify the respective adapter type in the Importer Properties window – from the list of available adapters “Filesystem” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type.
The Properties window of an importer can be accessed by double-clicking an importer in the list, or selecting the Properties button/menu item from the toolbar/context menu.
A detailed description is always displayed at the bottom of the window for a selected parameter.
Configuration parameters
Values
Name
Enter a unique name for this importer
Mandatory
Adapter type
Select the “Filesystem” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server has been defined, migration-center will prompt the user to define a Job Server location when saving the importer.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
xsltPath
The path to the XSL file used for transformation of the meta-data (leave empty for default metadata XML output)
unifiedMetadataPath
The path and filename where the unified metadata file should be saved; the parent folder must exist, otherwise the import will stop with an error
Leave empty to create individual XML metadata files for each object
unifiedMetadataRootNodes
The list of XML root nodes to be inserted in the unified meta-data file which will contain the document and folder metadata nodes; the default value is “root”, which will create a <root>…</root> element.
Multiple values are also supported, separated by “;”, e.g. “root;metadata”, which would create a <root><metadata>…</metadata></root> structure in the XML file for storing the object’s metadata.
moveFiles
Flag for moving content files; if false the content files will be just copied, if true they will be moved (copied and then deleted from original location) - default value: false
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during import
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
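The nesting produced by the “unifiedMetadataRootNodes” parameter can be pictured with a short Python sketch. It only mimics the documented behaviour (“root;metadata” yields <root><metadata>…</metadata></root>); the function name, the fallback to “root” for an empty value, and the `contentattributes` placeholder node are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def build_root_nodes(root_nodes: str = "root"):
    """Create nested wrapper elements from a ';'-separated list of names.

    Returns (outermost, innermost); the objects' metadata nodes would be
    appended to the innermost element.
    """
    names = [n.strip() for n in root_nodes.split(";") if n.strip()] or ["root"]
    outermost = ET.Element(names[0])
    innermost = outermost
    for name in names[1:]:
        innermost = ET.SubElement(innermost, name)
    return outermost, innermost


outer, inner = build_root_nodes("root;metadata")
ET.SubElement(inner, "contentattributes")  # placeholder for one object's metadata
print(ET.tostring(outer, encoding="unicode"))
```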
A complete history is available for any Filesystem-Importer job from the respective items’ History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, and the count of warnings and errors that occurred during runtime.
Log files generated by the Filesystem-Importer can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The SharePoint Online Batch importer allows migrating documents and folders to SharePoint Online and OneDrive. Since OneDrive is based on SharePoint Online, you can use the SharePoint Online Batch importer to import documents and folders to OneDrive as well. When we refer to SharePoint Online in the following, this applies to OneDrive as well in most cases. We will add appropriate notes where the behavior of the importer differs for OneDrive.
The SharePoint Online Batch importer offers the following features:
Leverages Microsoft SharePoint Online Import Migration API (bulk import API)
Import documents to Microsoft SharePoint Online Document Library items
Import folders to Microsoft SharePoint Online Document Library items
Set values for any columns in SharePoint Online, including user defined columns
Set values for SharePoint Online specific internal field values, i.e. author, editor, time created, time last modified
Set the folder path and create folders if they don’t exist
Set role assignments (permissions) on documents and folders
Import versions
Import files with a size up to 15GB
Automatic or manual/custom version numbering
Set moderation/approval status on documents
The SharePoint Online Batch importer is implemented mainly as a Job Server component but comes with a separate component for communicating with SharePoint Online (we refer to this component as CSOM Service because it uses the Microsoft CSOM API).
SharePoint Online Batch importers can be created, configured, started and monitored through migration-center Client, while the corresponding processes are executed by migration-center Job Server and the migration-center SharePoint Online Batch Importer respectively.
The term importer is used in migration-center for describing a component which imports processed data from migration-center to another system.
Scanners and importers work as jobs that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created. Multiple scanner and import jobs can be created or run at a time, each being defined by a unique name, a set of configuration parameters and a description (optional).
The SharePoint Online Batch importer comes with the following features, limitations, and known issues:
The importer uses the Azure storage containers provided by SharePoint Online.
The importer only supports
import of documents with metadata (incl. role assignments) and versions and
import of folders with metadata (incl. role assignments).
You can only assign site collection groups to user/group fields and role assignments.
The following column / field types are currently supported by the importer:
User
Text
Integer
Number
Choice
Note
DateTime
Boolean
TaxonomyFieldType
Any invalid XML characters in Text and Note field values will be automatically replaced by a '_' character during import. Otherwise the import would fail with an error.
The target of a migration job must be a Folder in a Document Library of the specified Web (Site). All folders and documents will go inside or below that Folder.
When importing documents or folders, the importer will automatically create any subfolders (of default type) needed below the configured base folder if parameter "autoCreateFolder" is set to "true".
The importer will not respect the version numbers of the objects. Instead it will create ascending major version numbers, i.e. "1.0" for version 1, "2.0" for version 2 etc.
The importer only supports setting managed metadata terms (taxonomies) by their ID in the format "{9ca5bcac-b5b5-445c-8dab-3d1c2181bcf2}". Setting multiple terms on a target attribute is currently not supported.
Delta migration is not supported by the SharePoint Online Migration API and thus this feature is not available with the SharePoint Online Batch Importer. If you need the delta migration feature, please use our SharePoint Online Importer, which supports this feature.
If you set the system rule “declareAsRecord” to “true” for an object, the importer will declare the object as a record if the target library is a Records Library.
Due to the asynchronous nature of the SharePoint Migration API:
The progress indicator in the migration-center client works currently only on a per batch level, i.e. if you import only one batch, the progress jumps from 0% to 100% when that batch was imported. Future versions of the importer may provide a more granular progress indicator.
The migration-center database might not reflect the actual import state of the objects in the migset. For example, it might happen that objects get imported but are still marked as Validated or Error in the migration-center database. For technical details please see Chapter 5 - Technical Details on the Importer
If your site has the Document ID feature enabled: The document ID will be populated for the files imported via the migration API as well, but this happens asynchronously. The DocId might therefore be missing right after the import, but it will be populated within 24 hours (when the backend job runs).
Import into OneDrive was only tested with Azure AD app-only principal authentication.
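Some of the rules listed above lend themselves to a quick check of source values before an import. The Python sketch below is not part of migration-center; it illustrates three of the documented behaviours under stated assumptions: invalid XML characters are taken to be those outside the XML 1.0 character ranges and are replaced one-for-one by “_”, term IDs are checked against the brace-enclosed GUID format, and version labels are numbered as ascending major versions. All function names are hypothetical.

```python
import re

# Characters outside the XML 1.0 "Char" production are not allowed in XML.
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

# Term ID format as documented: "{9ca5bcac-b5b5-445c-8dab-3d1c2181bcf2}".
_TERM_ID = re.compile(r"\{[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\}")


def replace_invalid_xml_chars(value: str, replacement: str = "_") -> str:
    """Replace every character that is not valid in XML 1.0."""
    return _INVALID_XML_CHARS.sub(replacement, value)


def is_term_id(value: str) -> bool:
    """Check whether a value has the brace-enclosed GUID format."""
    return _TERM_ID.fullmatch(value) is not None


def major_version_labels(version_count: int) -> list:
    """Return the labels "1.0", "2.0", ... for a version chain."""
    return [f"{number}.0" for number in range(1, version_count + 1)]


print(replace_invalid_xml_chars("Report\x0b2020"))
print(is_term_id("{9ca5bcac-b5b5-445c-8dab-3d1c2181bcf2}"))
print(major_version_labels(3))
```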
To install the main product components, consult the migration-center Installation Guide document.
The migration-center SharePoint Online Batch Importer requires installing an additional, separate component besides the main product components. In the following we refer to this component as “SPOnline Batch CSOM Service”.
The SPOnline Batch CSOM Service is necessary to use the SharePoint Online Migration API with migration-center. Its installation and uninstallation are described below.
The SPOnline Batch CSOM Service is designed to run as a Windows service and needs the .NET Framework 4.7.2 (or later) installed on the migration-center Job Server machine.
You must install the SPOnline Batch CSOM Service on all machines where you have installed the migration-center server components.
To install the SPOnline Batch CSOM Service, run the installation file located in the SharePoint Online Batch Importer component folder of your migration-center Job Server installation, which is by default C:\Program Files (x86)\fme AG\migration-center Server Components <Version>\lib\mc-spo-batch-importer\CSOM_Service\install. This folder contains the file install.bat, which must be executed with administrative privileges (i.e. “Run as Administrator”).
After the service is installed, you need to start it manually the first time. After that, the service is configured to start automatically as soon as the computer’s operating system is loaded.
In case you need to uninstall the SPOnline Batch CSOM Service, please run the file uninstall.bat in C:\Program Files (x86)\fme AG\migration-center Server Components <Version>\lib\mc-spo-batch-importer\CSOM_Service\install with administrative privileges (i.e. “Run as Administrator”).
In this chapter you will learn how to work with the SharePoint Online Batch importer. Specifically, you will see how to import documents with metadata and versions and how to import folders with metadata.
The importer supports app-principal authentication for connecting to SharePoint Online. The app-principal authentication comes in two flavors: Azure AD app-only principal authentication and SharePoint app-only principal authentication.
Azure AD app-only authentication requires full control access for the migration-center application on your SharePoint Online tenant. This includes full control on ALL site collections of your tenant.
If you want to restrict the access of the migration-center application to certain site collections or sites, you can use SharePoint app-only authentication.
The migration-center SharePoint Online Batch Importer supports Azure AD app-only authentication. This is the authentication method for background processes accessing SharePoint Online recommended by Microsoft. When using SharePoint Online you can define applications in Azure AD and these applications can be granted permissions to your SharePoint Online tenant.
Please follow these steps in order to set up your migration-center application in your Azure AD.
Step 1: Create a self-signed certificate for your migration-center Azure AD application
In Azure AD when doing App-Only you typically use a certificate to request access: anyone having the certificate and its private key can use the app and the permissions granted to the app. The below steps walk you through the setup of this model.
You are now ready to configure the Azure AD Application for invoking SharePoint Online with an App-Only access token. To do that, you must create and configure a self-signed X.509 certificate, which will be used to authenticate your migration-center Application against Azure AD, while requesting the App-Only access token. First you must create the self-signed X.509 Certificate, which can be created using the makecert.exe tool that is available in the Windows SDK or through a provided PowerShell script which does not have a dependency to makecert. Using the PowerShell script is the preferred method and is explained in this chapter.
It's important that you run the below scripts with Administrator privileges.
To create a self-signed certificate, run the following script, which you can find in the <job server folder>\lib\mc-spo-batch-importer\scripts folder:
.\Create-SelfSignedCertificate.ps1 -CommonName "MyCompanyName" -StartDate 2020-07-01 -EndDate 2022-06-30
The dates are provided in ISO date format: YYYY-MM-dd
You will be asked to give a password to encrypt your private key, and both the .PFX file and .CER file will be exported to the current folder.
Save the password of the private key as you’ll need it later.
Step 2: Register the migration-center Azure AD application
In the "App registrations" tab you will find the list of Azure AD applications registered in your tenant. Click the "New registration" button in the upper left part of the blade. Next, provide a name for your application, e.g. “migration-center” and click on "Register" at the bottom of the blade.
Once the application has been created copy the "Application (client) ID" as you’ll need it later.
Step 3: Configure necessary permissions for the migration-center application
Now click on "API permissions" in the left menu bar and click on the "Add a permission" button. A new blade will appear. Here you choose the permissions that are required by migration-center. Choose the following:
Microsoft APIs
SharePoint
Application permissions
Sites
Sites.FullControl.All
TermStore
TermStore.Read.All
User
User.Read.All
Graph
Application permissions
Sites
Sites.FullControl.All
Step 4: Uploading the self-signed certificate
Next step is “connecting” the certificate you created earlier to the application. Click on "Certificates & secrets" in the left menu bar. Click on the "Upload certificate" button, select the .CER file you generated earlier and click on "Add" to upload it.
Step 5: Grant admin consent
The “Sites.FullControl.All” application permission requires admin consent in a tenant before it can be used. In order to do this, click on "API permissions" in the left menu again. At the bottom you will see a section "Grant consent". Click on the "Grant admin consent for" button and confirm the action by clicking on the "Yes" button that appears at the top.
Final Step: Setting the necessary parameters in the importer
In order to use Azure AD app-only principal authentication with the SharePoint Online Batch importer you need to fill in the following importer parameters with the information you gathered in the steps above:
SharePoint app-only authentication allows you to grant fine-grained access permissions on your SharePoint Online tenant for the migration-center application.
The information in this chapter is based on the following guidelines from Microsoft:
Step 1: Create a self-signed certificate for your migration-center Azure AD application
In Azure AD when doing App-Only you typically use a certificate to request access: anyone having the certificate and its private key can use the app and the permissions granted to the app. The below steps walk you through the setup of this model.
You are now ready to configure the Azure AD Application for invoking SharePoint Online with an App-Only access token. To do that, you must create and configure a self-signed X.509 certificate, which will be used to authenticate your migration-center Application against Azure AD, while requesting the App-Only access token. First you must create the self-signed X.509 Certificate, which can be created by using the makecert.exe tool that is available in the Windows SDK or through a provided PowerShell script which does not have a dependency to makecert. Using the PowerShell script is the preferred method and is explained in this chapter.
It's important that you run the below scripts with Administrator privileges.
To create a self-signed certificate, run the following script, which you can find in the <job server folder>\lib\mc-sharepointonline-importer\scripts folder:
.\Create-SelfSignedCertificate.ps1 -CommonName "MyCompanyName" -StartDate 2020-07-01 -EndDate 2022-06-30
The dates are provided in ISO date format: YYYY-MM-dd
You will be asked to give a password to encrypt your private key, and both the .PFX file and .CER file will be exported to the current folder.
Save the password of the private key as you’ll need it later.
Step 2: Register the migration-center Azure AD application
In the "App registrations" tab you will find the list of Azure AD applications registered in your tenant. Click the "New registration" button in the upper left part of the blade. Next, provide a name for your application, e.g. “migration-center” and click on "Register" at the bottom of the blade.
Once the application has been created copy the "Application (client) ID" as you’ll need it later.
Step 3: Uploading the self-signed certificate and generating a secret key
Next step is “connecting” the certificate you created earlier to the application. Click on "Certificates & secrets" in the left menu bar. Click on the "Upload certificate" button, select the .CER file you generated earlier and click on "Add" to upload it.
After that, you need to create a secret key. Click on “New client secret” to generate a new secret key. Give it an appropriate description, e.g. “migration-center” and choose an expiration period that matches your migration project time frame. Click on “Add” to create the key.
Store the retrieved information (client id and client secret) since you'll need it later! Please safeguard the created client id/secret combination as if it were your administrator account. Using this client id/secret one can read/update all data in your SharePoint Online environment!
Step 4: Granting permissions to the app-only principal
Next step is granting permissions to the newly created principal in SharePoint Online.
If you want to grant tenant scoped permissions, this can only be done via the “appinv.aspx” page on the tenant administration site. If your tenant administration URL is https://contoso-admin.sharepoint.com, you can reach this page via https://contoso-admin.sharepoint.com/_layouts/15/appinv.aspx.
If you want to grant site collection scoped permissions, open the “appinv.aspx” on the specific site collection, e.g. https://contoso.sharepoint.com/sites/mysite/_layouts/15/appinv.aspx.
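Both addresses follow the same pattern: the site URL followed by “/_layouts/15/appinv.aspx”. A minimal Python sketch of that pattern is shown below; the function name is hypothetical and the contoso URLs are the examples used above.

```python
def appinv_url(site_url: str) -> str:
    """Return the address of the appinv.aspx page for a site URL."""
    # Strip a trailing slash so the result never contains "//_layouts".
    return site_url.rstrip("/") + "/_layouts/15/appinv.aspx"


# Tenant scoped permissions: use the tenant administration site.
print(appinv_url("https://contoso-admin.sharepoint.com"))
# Site collection scoped permissions: use the specific site collection.
print(appinv_url("https://contoso.sharepoint.com/sites/mysite/"))
```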
Once the page is loaded add your client id and look up the created principal:
Please enter “www.migration-center.com” in field “App Domain” and “https://www.migration-center.com” in field “Redirect URL”.
To grant permissions, you'll need to provide the permission XML that describes the needed permissions. The migration-center application will always need the “FullControl” permission. Use the following permission XML for granting tenant scoped permissions:
<AppPermissionRequests AllowAppOnlyPolicy="true">
<AppPermissionRequest Scope="http://sharepoint/content/tenant" Right="FullControl" />
<AppPermissionRequest Scope="http://sharepoint/taxonomy" Right="Read" />
</AppPermissionRequests>
Use this permission XML for granting site collection scoped permissions:
<AppPermissionRequests AllowAppOnlyPolicy="true">
<AppPermissionRequest Scope="http://sharepoint/content/sitecollection" Right="FullControl" />
<AppPermissionRequest Scope="http://sharepoint/taxonomy" Right="Read" />
</AppPermissionRequests>
When you click on “Create” you'll be presented with a permission consent dialog. Press “Trust It” to grant the permissions:
Please safeguard the created client id/secret combination as if it were your administrator account. Using this client id/secret one can read/update all data in your SharePoint Online environment!
Final Step: Setting the necessary parameters in the importer
In order to use SharePoint app-only principal authentication with the SharePoint Online Batch importer you need to fill in the following importer parameters with the information you gathered in the steps above:
In order to import documents, you need to create a migration set with a processing type of “<Source>ToSPOnlineBatch(document)” as shown in the screenshot below:
After you have selected the objects to import from the file scans, you need to configure the migration set’s transformation rules. A migration set with a target type of “SPOnlineBatch(document)” has the following system attributes, which primarily determine the content type, name and location of the documents.
You can find a detailed description of the available system rules in the table below:
You also need to select the system rule “s__contentType” in the “Get type name from rule” dropdown box on the | Associations | tab and add all required type definitions in the “Types” list box.
Finally, after you have transformed and validated your migration set, you should be ready to import it using the SharePoint Online Batch Importer.
In order to import folders, you need to create a migration set with a processing type of “<Source>ToSPOnlineBatch(folder)” as shown in the screenshot below:
After you have selected the objects to import from the file scans, you need to configure the migration set’s transformation rules.
A migration set with a target type of “SPOnlineBatch(folder)” has the following system attributes, which primarily determine the content type, name and location of the folders.
You can find a detailed description of the available system rules in the table below:
You also need to select the system rule “s__contentType” in the “Get type name from rule” dropdown box on the | Associations | tab and add all required type definitions in the “Types” list box.
Finally, after you have transformed and validated your migration set, you should be ready to import it using the SharePoint Online Batch Importer.
To create a new SharePoint Online Batch Importer, go to -Importers- and press the [New] button. Then select as
An importer of type SPOnline Batch has the following parameters, which you need to set with the appropriate values:
This chapter will give you an overview of the importer’s operating principle.
Each SharePoint Online Batch Importer job will go through the following steps:
Split the list of objects to import into batches of approximately 250 objects.
For each batch, repeat:
Generate the XML files (e.g. Manifest.xml etc.) necessary for submitting the batch to the SharePoint Migration API.
Upload XML files and content files of the batch to an Azure BLOB container storage.
Submit the import batch to the SharePoint platform as a migration job.
Monitor the progress of the migration job.
When the migration job has finished, verify that all objects submitted for import were successfully imported by retrieving them by ID and save the result of the verification back to the migration-center database.
Depending on your job configuration, several batches are imported in parallel (see “numberOfThreads” parameter in the import job configuration).
The importer will split the list of documents or folders to import into batches of approximately 250 items using the following rules:
All versions of an item must be contained in the same batch.
A batch should contain a maximum of 250 items.
The content size of the items in a batch should not exceed 250 MB.
The first rule can lead to batches with more than 250 objects. The last rule can lead to batches with fewer than 250 objects.
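The batching rules above can be illustrated with a minimal Python sketch. It is not the importer's actual implementation: the `Item` class, the `version_tree_id` field and the greedy grouping strategy are assumptions made for illustration, and 250 MB is taken as 250 * 1024 * 1024 bytes.

```python
from dataclasses import dataclass
from itertools import groupby

MAX_ITEMS = 250                  # target number of items per batch
MAX_BYTES = 250 * 1024 * 1024    # target content size per batch (assumed 250 MiB)


@dataclass
class Item:
    version_tree_id: str  # hypothetical key shared by all versions of one item
    size_bytes: int


def split_into_batches(items):
    """Greedily group items into batches without ever splitting a version tree."""
    batches, current, count, size = [], [], 0, 0
    # sorted() is stable, so the order of versions inside a tree is preserved.
    ordered = sorted(items, key=lambda i: i.version_tree_id)
    for _, versions in groupby(ordered, key=lambda i: i.version_tree_id):
        tree = list(versions)
        tree_size = sum(v.size_bytes for v in tree)
        # Close the current batch if adding this tree would exceed a limit.
        if current and (count + len(tree) > MAX_ITEMS or size + tree_size > MAX_BYTES):
            batches.append(current)
            current, count, size = [], 0, 0
        # A single tree larger than the limits still goes into one batch (rule 1).
        current.extend(tree)
        count += len(tree)
        size += tree_size
    if current:
        batches.append(current)
    return batches
```

In this sketch, a version tree with 300 versions ends up in one batch of 300 objects, and a batch is closed early as soon as the next tree would push its content size over the limit.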
The SharePoint Import Migration API requires the following XML files for each import batch:
ExportSettings.XML
LookupListMap.XML
Manifest.XML
Requirements.XML
RootObjectMap.XML
SystemData.XML
UserGroupMap.XML
ViewFormsList.XML
A complete history is available for any SharePoint Online Batch Importer job from the respective items’ –History- window. It is accessible through the [History] button/menu entry on the toolbar/context menu. The -History- window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry in the -History- window or clicking the Open button on the toolbar opens the job report file created by that run. The report file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the SharePoint Online Batch Importer can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs\SPOnline Batch Importer\<job run id>
You can find the following files in the <job run id> folder:
The import-job.log contains detailed information about the job run and can be used to investigate import issues.
The generated manifest files for each batch in the <batch number>/manifest sub-folder.
The log files generated by SharePoint Online for each batch in the <batch-number>/spo-logs sub-folder.
The amount of information written to the report and log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The OpenText Importer takes the objects processed in migration-center and imports them into an OpenText Content Server.
Importer is the term used for an output adapter used as the last step of the migration process. It takes care of importing the objects processed in migration-center into the target system (such as an OpenText Content Server).
An importer works as a job that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created. It is defined by a unique name, a set of configuration parameters and an optional description.
OpenText Importers can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed by migration-center Job Server.
The OpenText Importer is compatible with versions 10.5, 16.0, 16.4 and 20.2 of OpenText Content Server. Version 10.0 is no longer supported.
It requires les-services (for v10.5) or Content Web Services (for v10.5+) to be installed on the Content Server. To set classifications on the imported files or folders, the Classification Webservice must be installed on the Content Server. To support Record Management Classifications, the Record Management Webservice is required.
Some specific importer features require the installation of some of the provided patches on the Content Server. The required patches are delivered with the MC kit within the folder “..\ServerComponents\Jobserver\lib\mc-otcs-importer\cspatches”.
For deployment, copy the provided patches to the folder .\patch on the Content Server and restart the Content Server.
This patch extends the OpenText DOCMANSERVICE.Service.DocumentManagement.CreateSimpleFolder method.
The patch allows setting custom CreateDate, ModifyDate, FileCreateDate and FileModifyDate values for nodes and versions.
To create a new OpenText Importer job, specify the respective adapter type in the importer’s properties window: from the list of available adapters, “OpenText” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case OpenText.
The -Properties window- of an importer can be accessed by double-clicking an importer in the list, by selecting the [Properties] button from the toolbar or from the context menu.
There are two ways to create folders in the OpenText repository with the importer:
On the fly when importing documents: When creating a new migration set, choose the “<source type>ToOpenText(document)” type. This will create a migration set containing documents targeted at OpenText. Use the “autoCreateFolders” parameter from the OpenText Importer configuration to generate the folder structure based on the values extracted by the system rule “ParentFolder”. No categories, classifications or permissions can be set on the created folders.
Using a dedicated migration set for folders: When creating a new migration set, choose the “<source type>ToOpenText(container)” type. This will create a migration set containing folders targeted at OpenText.
Now only the scanner runs containing folder objects will be displayed on the |Filescan Selection| tab. Note that the number of objects contained in the displayed scanner runs now indicates folders and not documents, which is why the number on display (folders) can be different from the total number of objects processed by the scan (if it contains other types of objects besides folders). When creating transformation rules for the migration set, keep in mind that folder-only migration sets have folder-specific attributes to work with, in this case attributes specifically targeted at OpenText folder objects. You can set permissions, categories and classifications to the imported folders.
When importing a folder migration set while a folder structure is already in place, an error will be thrown for the folder objects that already exist. It is not possible to avoid this behavior unless you skip them manually by removing them from the migset or putting them in an invalid state for import.
The importer parameter “autoCreateFolders” applies to both documents and folders migration sets.
OpenText importer allows importing documents from any supported source system to OpenText. For that a migset of type “<Source>toOpenTextDocument” has to be created:
Importing documents with multiple versions is supported by the OpenText importer. The structure of the version tree is generated by the scanners of the systems that support this feature and provide means to extract it. Although the version tree is immutable (i.e. the ordering of the objects relative to their antecedents cannot be changed), the type of the versions (linear, minor or major) can be set in the system attribute “VersionControl” (see the next sections for more details).
All objects from a version structure must be imported since each of them references its antecedent, going back to the very first version. Therefore, it is advised not to drop versions of an object between the scan and the import processes, as this will generate inconsistencies and errors. If an object is intended to be migrated without versions (i.e. only the current version of the object needs to be migrated), then the affected objects should be scanned without enabling the respective scanner’s versioning option.
The Content Validation functionality is based on checksums computed for the document’s contents during scan (by the Scanner itself) and after import (by the Importer). The two checksums are then compared - for any given piece of content its checksums from before and after the migration should be identical.
If the two checksums differ in any way, this is an indication that the content has been corrupted/modified during/after migration. Modifications can happen on purpose or simply by user or environment error. Actual data corruption rarely happens and is usually due to software and/or hardware errors on the systems involved during content transfer or storage.
Content validation is always performed after the content has been imported to the target repository, thus adding another step to the migration process. Accordingly, using this feature may significantly increase import time because every piece of content for every document has to be read back and its checksum computed in order to compare it against the initial checksum computed during scan.
This feature is controlled through the checkContentIntegrity parameter in the OpenText Importer (disabled by default).
This feature works only in tandem with a Scanner that supports it: Documentum Scanner, Filesystem Scanner, Database Scanner and SharePoint Scanner.
The current version of the importer allows importing virtual documents scanned from Documentum as compound documents in OpenText. For creating a compound document, the first value of the system attribute “target_type” must be “opentext_compound”. Setting categories and classifications on compound documents is supported in the same way as for normal documents. The children of the VD will be imported inside the compound documents in a “box-in-box” manner, similar to the way VDs, children, content, and VDs in VDs are represented in Documentum Webtop.
Importing virtual documents as compound documents is done by the OpenText Importer in two steps:
Import all virtual documents as empty compound documents (CDs) and all non-virtual documents as normal documents. They are all imported within the folder specified in the "ParentFolder" attribute.
Add the virtual documents children to the compound documents based on the VDRelation relations scanned from Documentum in the following way:
If the child is linked in the folder where it was imported in step 1 and the importer parameter “moveDocumentsToCD” is checked, then the child document is moved to the compound document. If “moveDocumentsToCD” is not checked, then a shortcut to the child is added inside the compound document.
If the child is already linked in a compound document, then a shortcut is created in the current compound document (the case when a document is a child of multiple VDs).
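The decision made in the second step can be sketched as follows. This is only an illustration of the rules described above, not the importer's code: the function name, the tuple layout of the relations and the action labels "move" and "shortcut" are assumptions.

```python
def plan_child_actions(relations, move_documents_to_cd):
    """Decide, per (compound document, child) relation, whether to move or link.

    relations: list of (parent_cd, child) pairs in processing order.
    Returns a list of (parent_cd, child, action) tuples.
    """
    already_in_cd = set()  # children that were already moved into a compound document
    actions = []
    for parent_cd, child in relations:
        if child not in already_in_cd and move_documents_to_cd:
            # Child still sits in the folder from step 1: move it into the CD.
            actions.append((parent_cd, child, "move"))
            already_in_cd.add(child)
        else:
            # Either moving is disabled or the child already lives in another CD.
            actions.append((parent_cd, child, "shortcut"))
    return actions
```

For a document that is a child of two virtual documents, this sketch moves it into the first compound document and adds a shortcut to the second one when “moveDocumentsToCD” is checked.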
If the virtual documents being imported as compound documents have content, the content is imported as the first child of the compound document having the same name as the compound document itself.
OpenText importer allows importing emails scanned as MSG files from Outlook, Exchange or from other supported systems. In order to be imported as emails, the objects in migration center need to be associated with the target type “opentext_email”.
If the importer parameter “extractAttachmentsFromEmail” is checked, the importer will extract the email attachments and import them as cross references for the imported email. In this case the parameter “crossRefCode” should be set to the value of Email Cross-Reference in the RM settings (Records Management -> Records Management Administration -> System Settings -> RM Settings -> Email Cross-Reference).
When MSG files are scanned from source systems other than Outlook or Exchange, the importer allows extracting the email properties (subject, from, to, cc, body, etc.) from the email file. This is done during the import.
Starting with version 3.7 the importer allows users to import the scanned documents and folders as Physical Items. All predefined types of physical items (Non Container, Container and Box) are supported. For importing physical objects, the user needs to configure the webservice URL in the “physicalObjectsWebserviceURL”.
Physical Items can be imported with migsets having the appropriate processing type: <SourceType>toOpentext(physicalObj). Physical objects migration sets have a predefined set of rules for system attributes listed under - Rules for system attributes- in the -Transformation rules- window. The system attributes have a special meaning for the importer, so they have to be set with valid values as it is described below.
Until version 3.12 of migration-center assigning physical items into physical box items was done by setting the ParentFolder system rule to point to the box item. This would import the physical item in the parent folder of the box item and assign it to the box item.
Starting from version 3.13 a new parameter was added, PhysicalBoxPath, for specifying which box item the object should be assigned to. This allows setting the ParentFolder to any other location, in order to better replicate the OTCS functionality.
The first value of the target type should be set to a valid Physical Object Type that is defined in the section Manage/Object types…. The accepted values should start with one of the following values: “opentext_physical_item”, “opentext_physical_box”, “opentext_physical_container”. To distinguish between different item types, you can create new types such as “opentext_physical_item_notebook” or “opentext_physical_container_records”. The provided object types can be extended with the custom physical properties that are defined in the Content Server for every Physical Item Type (Physical Properties).
Starting with the second value of “target_type” rule the user is allowed to optionally set one or multiple categories and records management classifications. See the dedicated chapters for more details regarding these features.
Starting with version 3.7 the Records Management Classifications are not handled anymore as system rules, but they are handled internally as a dedicated and predefined object type: “opentext_rm_classification”.
For assigning the record management classification to the imported objects (Documents, Containers and Physical Items), the “target_type” system rule should have one value (after the first value) equal to “opentext_rm_classification”. To set the specific attributes of the classification, associate the rules with the predefined type “opentext_rm_classification” and its attributes, as you can see in the screenshots below.
Only the attributes of “opentext_rm_classification” object type provided with the installation are accepted by the importer.
The OpenText importer allows assigning categories to the imported documents and folders. A category is handled internally by the migration-center client as a target object type, and therefore the categories have to be defined in the migration-center client as object types (menu Manage/Object types…):
Since multiple categories with the same name can exist in an OpenText repository the category name must be always followed by its internal id. Ex: BankCustomer-44632.
The sets defined in the OpenText categories are supported by migration-center. The set attributes will be defined within the corresponding object type using the pattern <Set Name>#<Attribute Name> . The importer will recognize the attributes containing the separator “#” to be attributes belonging to the named Set and it will import them accordingly.
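The two naming conventions described above (category name followed by its internal id, and set attributes written as <Set Name>#<Attribute Name>) can be illustrated with a short Python sketch. The helper names and the sample attribute “Address#City” are hypothetical, and the sketch assumes the internal id is the numeric part after the last hyphen.

```python
def split_category_type(type_name):
    """Split e.g. 'BankCustomer-44632' into ('BankCustomer', 44632)."""
    name, sep, internal_id = type_name.rpartition("-")
    if not sep or not name or not internal_id.isdigit():
        raise ValueError(f"expected <category name>-<internal id>, got {type_name!r}")
    return name, int(internal_id)


def split_set_attribute(attribute_name):
    """Split '<Set Name>#<Attribute Name>'; plain attributes return (None, name)."""
    set_name, sep, attr = attribute_name.partition("#")
    if not sep:
        return None, attribute_name
    return set_name, attr


# Usage with the category name from the text and a hypothetical set attribute.
category = split_category_type("BankCustomer-44632")
set_attribute = split_set_attribute("Address#City")
```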
Only the categories specified in the system rules “target_type” will be assigned to the imported objects:
For setting the category attributes the rules must be associated with the category attributes in the migration set’s |Associations| tab:
Since version 3.2.9, table key lookup attributes are supported in categories. These attributes should be defined in migration-center in the same way the other category attributes are defined. Supported types of table key lookup attributes are Varchar, Number and Date. The only limitation is that Date type attributes should be of type String in the migration-center Object types.
Both folders and documents migration sets have a predefined set of rules for system attributes listed under -Rules for system attributes- in the -Transformation rules- window. The system attributes have a special meaning for the importer, so they have to be set with valid values as it is described in the next sections.
The following table lists all provided system attributes available for documents and folders.
The system attribute ACLs (Access Control List) is responsible for optionally assigning permissions based on the source data to the target object.
Each value to be assigned consists of three main elements separated by #. The third value for individual permissions itself is separated by |.
<ACLType#RightName#Permission-1|Permission-2|Permission-n>
Sample ACLs value string: ACL#csuser#See|SeeContents
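As a minimal sketch of the value format described above, the following Python helper splits one ACLs value into its three elements. The class and function names are made up for illustration, and the sketch assumes that neither the right name nor the permission names contain the separators # or |.

```python
from typing import List, NamedTuple


class AclEntry(NamedTuple):
    acl_type: str
    right_name: str
    permissions: List[str]


def parse_acl_value(value):
    """Parse 'ACLType#RightName#Permission-1|Permission-2' into an AclEntry."""
    parts = value.split("#")
    if len(parts) != 3:
        raise ValueError(f"expected three elements separated by '#': {value!r}")
    acl_type, right_name, permission_part = parts
    permissions = [p for p in permission_part.split("|") if p]
    if not acl_type or not right_name or not permissions:
        raise ValueError(f"empty element in ACLs value: {value!r}")
    return AclEntry(acl_type, right_name, permissions)


def format_acl_value(entry):
    """Build the value string back from an AclEntry."""
    return "#".join([entry.acl_type, entry.right_name, "|".join(entry.permissions)])


# Usage with the sample value from the text.
entry = parse_acl_value("ACL#csuser#See|SeeContents")
```

Validating such strings in a transformation test before the import can catch malformed values early. Which ACL types and permission names are accepted is determined by the importer, not by this sketch.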
The following table describes all valid values for defining a correct ACLs value:
You may specify as many individual entries for this MC System Rule as there are individual permissions required at OpenText Content Server.
Example:
The system attribute Classifications is responsible for optionally assigning one or more Classifications to the target object.
The system attribute “ContentName” is responsible for assigning the internal target version filename for the source document to be uploaded.
OpenText Content Server uses this internal version filename also for its mimetype recognition, so it is required to always build “ContentName” together with a valid extension.
The system attribute ImpersonateUser is responsible for assigning the correct owner of the source object to the imported target object.
Notes: If authenticated via RCS, add the appropriate domain to the user.
Since the object to be imported is created in the context of this assigned user, this user needs at least "Add Items" permissions for the target location defined in the MC System Rule 'ParentFolder'.
This attribute can be used to import the content of the document from another place than the location where the scanner exported the document. It should be set with a valid file path. If it’s not set the content will be picked up from the original location.
The system attribute Name is responsible for assigning a Name to the target document or folder. If a node with the same name already exists, the object will be skipped and the importer will throw an error.
The system attribute ParentFolder will be set with the full path of the container where the current object will be imported.
Notes: The adapter internally uses forward slash (/) as the path delimiter. Make sure to consider this in your rules.
If a folder name in the path contains a forward slash (/) that should be escaped with the character sequence: %2F
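The escaping rule can be illustrated with a small Python sketch that joins folder names into a path value. The function name and the sample folder names are hypothetical. Whether the path needs a leading slash or a specific root is not covered by this sketch.

```python
def join_folder_path(folder_names):
    """Join folder names with '/', escaping any '/' inside a name as '%2F'."""
    escaped = [name.replace("/", "%2F") for name in folder_names]
    return "/".join(escaped)


# Hypothetical folder names; the second one contains a forward slash.
path = join_folder_path(["Projects", "Input/Output", "2020"])
```

Here the folder named “Input/Output” is written as “Input%2FOutput”, so that it is not treated as two nested folders.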
The system rules RenditionPaths and RenditionTypes can be used for importing renditions that had been exported from the source system.
RenditionTypes is a multi-value rule that will be set with the rendition types that will be imported to the Content Server (Ex: “pdf”, “png”). If no values are set for this attribute, no renditions will be imported.
RenditionPaths is a multi-value rule that will be used to set the paths where the renditions exported from the source system are located. If the rendition paths are not set, the importer will ask the Content Server to generate the rendition based on the document content.
RenditionTypes and RenditionPaths work in pairs in the following way:
When RenditionTypes has one or more values and the corresponding rendition paths are readable, the renditions are imported.
When a RenditionTypes value is missing but a rendition path value is present, the rendition is ignored.
If a RenditionTypes value is duplicated, only the first pair of rendition type and rendition path is taken into consideration; the second pair is ignored.
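The pairing rules above can be sketched in Python as follows. This is an illustration only: it assumes that the values of the two rules are matched by position, and the function name and sample paths are hypothetical.

```python
def pair_renditions(rendition_types, rendition_paths):
    """Return {rendition type: path or None}, pairing the two rules by position.

    None means no path was supplied for that type.
    """
    pairs = {}
    for index, rendition_type in enumerate(rendition_types):
        if not rendition_type:
            continue  # missing type: the path at this position is ignored
        if rendition_type in pairs:
            continue  # duplicated type: only the first pair is kept
        path = rendition_paths[index] if index < len(rendition_paths) else None
        pairs[rendition_type] = path or None
    return pairs


# Hypothetical values: "pdf" appears twice, so its second path is ignored.
pairs = pair_renditions(
    ["pdf", "png", "pdf"],
    ["/export/doc1.pdf", "/export/doc1.png", "/export/doc1_b.pdf"],
)
```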
The system attribute Shortcuts is responsible for optionally creating shortcuts to the imported document or folder in the folder specified as value of this attribute. One or more folder paths can be specified. If the specified folder does not exist, it will be created automatically if “autoCreateFolders” is enabled in the importer.
If a shortcut cannot be created when importing a document an appropriate error will be reported by the importer and the document will be skipped.
If a shortcut cannot be created when importing a folder, the folder will be created but its status in migration center will be partially imported.
The name of the shortcut is taken from the system rule “Name”.
The first value of this attribute must be the type of object to be imported. For documents migsets the value must be “opentext_document”, for folders migsets the value must be “opentext_folder”.
The next values of this attribute are reserved to the categories that will be assigned to the imported document or folder.
The system attribute VersionControl is used for controlling the creation of versions at Content Server side with the required versioning method (Linear Versioning or Advanced Versioning with Minor or Major Versions).
The valid values are:
“0” - for creating a linear version at Content Server side
“1” - for creating a minor version at Content Server side
“2” - for creating a major version at Content Server side
You cannot mix linear versioning with advanced versioning (minor or major) for the versions belonging to the same version tree.
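A simple pre-import check for this restriction could look like the following Python sketch. It only illustrates the rule; the function name is hypothetical and the importer performs its own validation.

```python
LINEAR, MINOR, MAJOR = "0", "1", "2"
VALID_VALUES = {LINEAR, MINOR, MAJOR}


def check_version_control(values):
    """Validate the VersionControl values of all versions in one version tree."""
    invalid = [v for v in values if v not in VALID_VALUES]
    if invalid:
        raise ValueError(f"invalid VersionControl values: {invalid}")
    kinds = set(values)
    # Linear ("0") must not be combined with minor ("1") or major ("2").
    if LINEAR in kinds and len(kinds) > 1:
        raise ValueError("linear and advanced versioning are mixed in one version tree")


# A tree using only minor and major versions passes the check.
check_version_control(["1", "1", "2"])
```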
Objects that have been changed in the source system since the last scan are scanned as update objects. Whether an object in a migration set is an update or not can be seen by checking the value of the Is_update column – if it’s 1, the current object is an update to a previously scanned object (the base object). An update object cannot be imported unless its base object has been imported previously.
Depending on the source systems the object comes from, the method of obtaining the update information will differ but the objects behavior will stay the same once scanned. See the documentation of the scanners in case you need more information about the supported updates and how they are detected.
In order for delta migration to work properly it is essential to not reset the migration sets (objects) after they have been imported.
When updating documents or versions, the importer may need to delete some documents or versions that were imported previously. This is because of a limitation of the Content Webservice that does not allow updating the content of existing objects. If Records Manager is installed on the Content Server, importing document updates may not work since the Content Server does not allow deleting documents and versions.
A complete history is available for any OpenText Importer job from the respective items’ History window. It is accessible through the [History] button/menu entry on the toolbar/context menu. The -History window- displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the [Open] button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the OpenText Importer can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the “loggingLevel” start parameter for the respective job.
The Microsoft SharePoint Importer allows migrating documents, folders, list items and lists/libraries to SharePoint 2013, 2016, and 2019, offering the following features:
Import documents to Microsoft SharePoint Document Library items
Import folders to Microsoft SharePoint Document Library items
Import list items to Microsoft SharePoint Document Library items
Import lists/libraries to Microsoft SharePoint Sites
Set values for any columns in SharePoint, including user defined columns
Set values for SharePoint specific internal field values
Create documents, folders and list items using standard SharePoint content types or custom content types
Set permissions for individual documents, folders, list items and lists/libraries
Set the folder path and create folders if they don’t exist
Apply Content Types automatically to lists/libraries if they are not applied yet
Delta migration
Import versions (minor or major, automatic enabling of versioning in the targeted document library)
Import files with a size up to 15 GB (depending on the target SP version)
To install the main product components, consult the migration-center Installation Guide document.
The migration-center SharePoint Importer requires installing an additional, separate component besides the main product components. This additional component is able to set system values (such as creation date, modified date, author and editor) as well as taxonomy values for your objects. It is designed to run as a Windows service and requires .NET Framework 4.7.2 to be installed on the computer that runs this service as well as the migration-center Job Server.
This component must be installed on all machines where the migration-center Server Components are installed.
To install this additional component, run the installation file located in the SharePoint component folder of your migration-center Job Server installation location, which is by default C:\Program Files (x86)\fme AG\migration-center Server Components <version>\lib\mc-sharepoint-online-importer\CSOM_Service\install. This folder contains the file install.bat, which must be executed with administrative privileges.
After the service is installed, you need to start it manually the first time. After that, the service is configured to start automatically as soon as the computer’s operating system is loaded.
In case it is necessary to uninstall this component, the file uninstall.bat must be executed.
The migration-center SharePoint Importer can import objects generated by any of the available (and compatible) scanners. Most scanners can store the data they extract from the source systems they access in either a local path, or a UNC network path.
As is the case with all importers, the SharePoint Importer needs to be able to access the files extracted by a scanner in order to import them.
See the respective scanner’s user guide for more information on configuration parameters if necessary.
The SharePoint on-premise importer supports only the following authentication types:
NTLM Windows authentication (authenticationMethod = direct)
AD FS SAML token-based authentication (authenticationMethod = adfs)
Kerberos Windows authentication is currently NOT supported.
Due to restrictions in SharePoint, documents cannot be moved from one Library to another using migration-center once they have been imported. This applies to Version and Update objects.
Even though some other systems such as Documentum allow editing of older versions, either by replacing metadata or by creating branches, this is not supported by SharePoint. If you have updates to intermediate versions of a version tree that is already imported, the importer will return an error upon trying to import them. The only way to import them is to reset the original version tree and re-import it in the same run with the updates.
Running multiple Job Servers for importing into SharePoint must be done with great care, and each of the Job Servers must NOT import in the same library at the same time. If this occurs, because the job will change library settings concurrently, the versioning of the objects being imported in that library will not be correct.
The SharePoint system has some limitations regarding file names, folder names, and file size. Our SharePoint importer will perform the following validations before a file gets imported to SharePoint (in order to fail fast and avoid unnecessary uploads):
SharePoint 2013:
Max. length of a file name: 128 characters
Max. length of a folder name: 128 characters
Invalid leading chars for file name: SPACE, PERIOD
Invalid leading chars for folder name: SPACE, PERIOD
Invalid trailing chars for folder name: SPACE, PERIOD
Invalid file or folder names: "AUX", "PRN", "NUL", "CON", "COM0", "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9", "LPT0", "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
Consecutive PERIOD characters are not allowed in file or folder names
The following characters are not allowed in file or folder names: ~ # % & * { } \ : < > ? / | "
Max. length of a file path: 260 characters
Max. size of a file: 2 GB
Max. size of an attachment: 250 MB
SharePoint 2016:
Max. length of a file name: 128 characters
Max. length of a folder name: 128 characters
Invalid leading chars for file name: SPACE, PERIOD
Invalid leading chars for folder name: SPACE, PERIOD, ~
Invalid trailing chars for folder name: SPACE, PERIOD
Invalid file or folder names: "AUX", "PRN", "NUL", "CON", "COM0", "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9", "LPT0", "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
The following characters are not allowed in file or folder names: " # % * : < > ? / \ |
Max. length of a file path: 260 characters
Max. size of a file: 10 GB
Max. size of an attachment: 250 MB
SharePoint 2019:
Max. length of a file name: 400 characters
Max. length of a folder name: 400 characters
Invalid leading chars for file name: SPACE, PERIOD
Invalid leading chars for folder name: SPACE, PERIOD, ~
Invalid trailing chars for folder name: SPACE, PERIOD
Invalid file or folder names: "AUX", "PRN", "NUL", "CON", "COM0", "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9", "LPT0", "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
The following characters are not allowed in file or folder names: " * : < > ? / \ |
Max. length of a file path: 400 characters
Max. size of a file: 15 GB
Max. size of an attachment: 250 MB
To create a new SharePoint Importer, create a new importer and select SharePoint from the Adapter Type drop-down. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type. Mandatory parameters are marked with an *.
The Properties of an existing importer can be accessed after creating the importer by double-clicking the importer in the list or selecting the Properties button/menu item from the toolbar/context menu. A description is always displayed at the bottom of the window for the selected parameter.
Multiple importers can be created for importing to different target locations, provided each importer has a unique name.
The configuration parameters available for the SharePoint Importer are described below:
Starting from migration-center 3.5 the SharePoint importer has the option of checking the integrity of each document’s content after it has been imported. This will be done if the “checkContentIntegrity” parameter is checked and it will verify only documents that have a generated checksum in the Source Attributes.
Currently the only supported checksum algorithm is MD5 with HEX encoding.
For certain types of documents such as Office documents and also .MSG documents, “Document Information Panel” is activated and SharePoint changes the content slightly upon upload. This will cause the integrity check to fail for those documents and there is no workaround so far, other than importing without the content integrity check or finding a way to disable this feature directly in SharePoint.
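The checksum comparison described above can be sketched in a few lines of Python. This is only an illustration of "MD5 with HEX encoding", not the importer's actual implementation; the function names md5_hex and content_matches are hypothetical, and the case-insensitive comparison is an assumption.

```python
import hashlib


def md5_hex(path, chunk_size=1024 * 1024):
    """Return the MD5 digest of the file at `path` as a lowercase hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def content_matches(path, expected_hex):
    """Compare a file against a stored checksum.

    Assumption: the stored checksum may be upper or lower case, so the
    comparison ignores case and surrounding whitespace.
    """
    return md5_hex(path) == expected_hex.strip().lower()
```

A check of this kind fails as soon as a single byte of the content differs, which is why documents whose content SharePoint modifies on upload cannot pass it.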
Documents targeted at a Microsoft SharePoint Document library will have to be added to a migration set. This migration set must be configured to accept objects of type <source object type>ToSharePoint (document).
Create a new migration set and set the <source object type>ToSharePoint(document) in the Type drop-down. This is set in the -Migration Set Properties- window which appears when creating a new migration set. The type of object can no longer be changed after a migration set has been created.
The migration set is now set to work with SharePoint documents.
The same procedure as for documents also applies to folders about to be imported to SharePoint. For folders the object type to select for a migration set would be <source object type>ToSharePoint (folder).
The same procedure as for documents also applies to lists or libraries about to be imported to SharePoint. For lists or libraries the object type to select for a migration set would be <source object type>ToSharePoint (list).
The same procedure as for documents also applies to list items about to be imported to SharePoint. For list items the object type to select for a migration set would be <source object type>ToSharePoint (listItem).
<source object type>ToSharePoint (document) type objects have a number of predefined rules listed under Rules for system attributes in the -Transformation Rules- window. These rules are described in the table below.
Other rules can be defined by the user to set various SharePoint column values, such as Name, Title, checkin_comment to name a few of the more frequently used. Any SharePoint column which exists in the document library targeted by the contentType rule can be set by creating a rule and associating it with the corresponding column in the associations tab.
Folders have different rules than documents. The exact system rules applicable to folders are listed below.
List items have different rules than documents. The exact system rules applicable to list items are listed below.
Lists have different rules than documents. The exact system rules applicable to lists are listed below.
In order to associate transformation rules with SharePoint columns, a migration-center object type definition for the respective content type needs to be created. Object type definitions are created in the migration-center client. To create an object type definition, go to Manage/Object Types and create or import a new object type definition. In case your content type contains two or more columns with the same display name, you need to specify the columns' internal names as attribute names.
A Microsoft SharePoint content type corresponds to a migration-center object type definition. For the Microsoft SharePoint adapter, an object type definition can be specified in four ways, depending on whether a particular SharePoint content type is to be used and on whether different content types with the same name exist across site collections:
My Content Type describes explicitly a SharePoint content Type named My Content Type defined either in the SharePoint document library or at Site Collection level
@My Document Library describes the SharePoint document library named My Document Library using only columns, which are defined explicitly in the SharePoint document library My Document Library
My Content Type;#Any Custom Value describes a SharePoint content type named My Content Type. Everything after the delimiter ;# will be cut off on importing
@My Document Library;#Any Custom Value describes the SharePoint document library named My Document library using only columns, which are defined explicitly in the SharePoint document library My Document Library. Everything after the delimiter ;# will be cut off on importing.
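The four naming forms above follow two simple rules: a leading @ denotes a document library instead of a content type, and everything after the delimiter ;# is discarded. The following Python sketch shows one way to interpret such a name. The function and type names are hypothetical, and it illustrates the naming convention only, not the importer's code.

```python
from typing import NamedTuple


class TargetType(NamedTuple):
    kind: str  # "content_type" or "library"
    name: str  # name of the SharePoint content type or document library


def parse_object_type_name(object_type_name: str) -> TargetType:
    """Interpret a migration-center object type name for SharePoint."""
    # Everything after the first ";#" is a free-form suffix and is cut off.
    base = object_type_name.split(";#", 1)[0].strip()
    if base.startswith("@"):
        # A leading "@" refers to the columns of a document library.
        return TargetType("library", base[1:].strip())
    return TargetType("content_type", base)


for example in (
    "My Content Type",
    "@My Document Library",
    "My Content Type;#Any Custom Value",
    "@My Document Library;#Any Custom Value",
):
    print(example, "->", parse_object_type_name(example))
```

Because the suffix is ignored on import, it can be used to keep several object type definitions apart in migration-center even though they refer to the same content type name.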
The migration-center SharePoint importer is able to modify the following SharePoint specific internal field values for documents, list items and folders: Created By, Modified By, Created and Modified. To set these internal field values it is necessary to create attributes named Author, Editor, Created and Modified and associate them in the transformation rules appropriately.
The SharePoint Importer is able to read values from files. This can be necessary if the length of a string exceeds the maximum length of an Oracle database column, which is 4096 bytes.
To make the SharePoint Importer read a string from a text file, place the path of the file containing the value within the markup <@MCFILE>filepath</@MCFILE>.
The File Reader does not read values from files for the following attributes: Check-In Comment, Author, Editor, Creation Date, Modified Date.
Example:
Assuming you have a file at path \\scanlocation\temp\1\4a9daccf-5ace-4237-856c-76f3bd3e3165.txt you must type the following string in a rule:
<@MCFILE>\\scanlocation\temp\1\4a9daccf-5ace-4237-856c-76f3bd3e3165.txt</@MCFILE>
On import the SharePoint importer extracts the contents of this file and adds them to the associated target attribute.
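The behaviour described above can be pictured with the following Python sketch. It is a simplified model, not the importer's implementation: the function name resolve_rule_value is hypothetical, and it assumes that the markup spans the whole rule value and that the file is UTF-8 encoded.

```python
import re

# Assumption: the markup must enclose the complete rule value.
MCFILE_PATTERN = re.compile(r"<@MCFILE>(.+)</@MCFILE>")


def resolve_rule_value(value, encoding="utf-8"):
    """Return the file contents for an <@MCFILE> value, else the value itself."""
    match = MCFILE_PATTERN.fullmatch(value.strip())
    if match is None:
        # Ordinary rule value: use it unchanged.
        return value
    # The markup contains a file path; the file content becomes the value.
    with open(match.group(1), "r", encoding=encoding) as handle:
        return handle.read()
```

With this model, a rule value without the markup is passed through as-is, while a value such as the one shown in the example above is replaced by the contents of the referenced file.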
The SharePoint Importer is able to modify the filename of a file, which is imported to SharePoint. To set the filename, you must create a target attribute named Name in the object type definition. You can associate any filename (without an extension) for a document. The importer automatically picks the extension of the original file or of the file specified in the rule mc_content_location. If the extension needs to be changed as well, use the system rule fileExtension in order to specify a new file extension. You can also change only the extension of a filename by setting fileExtension and not setting a new filename in the Name attribute.
Example:
The SharePoint Importer can set the check-in comment for documents. To set the check-in comment, you must create a target attribute named checkin_comment in the object type definition. You can associate any string with it to set the check-in comment.
The SharePoint Importer can set the author for list items, folders and documents. To set the author, you must create a target attribute named author in the object type definition. The value of this attribute must be the login name of the user who is to be set as author.
The SharePoint Importer can set the editor for list items, folders and documents. To set the editor, you must create a target attribute named editor in the object type definition. The value of this attribute must be the login name of the user who is to be set as editor.
The SharePoint Importer can set the creation date for list items, folders and documents. To set the creation date, you must create a target attribute named created in the object type definition. The value of this attribute must be any valid date.
The SharePoint Importer can set the modified date for list items, folders and documents. To set the modified date, you must create a target attribute named modified in the object type definition. The value of this attribute must be any valid date.
The SharePoint Importer can set lookup values. If you want to set a lookup value, you can either set the ID of the related item or the title. If you set the title of the document and there is more than one item with the same title in the lookup list, the SharePoint Importer marks your import object as an error, because the lookup item could not be identified unequivocally.
The importer will treat any numeric value provided as the ID of the lookup value. In case you want to look up the value by title and the title is numeric, please surround the numeric value with " characters. The importer will automatically remove any " characters in the provided value before trying to match it with the title of the lookup field.
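The rule for distinguishing IDs from titles can be illustrated with a short Python sketch. The function name interpret_lookup_value is hypothetical, and treating only plain ASCII digit strings as numeric is an assumption made for this example.

```python
import re


def interpret_lookup_value(raw):
    """Classify a lookup value as an item ID or an item title."""
    value = raw.strip()
    # Assumption: "numeric" means a string consisting only of the digits 0-9.
    if re.fullmatch(r"[0-9]+", value):
        return ("id", int(value))
    # All double quotes are removed before the value is matched against titles.
    return ("title", value.replace('"', ""))


print(interpret_lookup_value("42"))        # numeric: looked up by ID
print(interpret_lookup_value('"42"'))      # quoted: looked up by title
print(interpret_lookup_value("Contract"))  # non-numeric: looked up by title
```

Quoting is therefore only needed when a title happens to consist solely of digits.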
The SharePoint Importer can set URL values. You can define a friendly name as well as the actual URL of the link by concatenating the friendly name with ;# and the actual URL.
Example: migration-center;#http://www.migration-center.com
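If such values are generated outside of migration-center, for example when preparing source data, the concatenation can be expressed as a small helper. The Python function below is a hypothetical sketch that only reproduces the format shown in the example above.

```python
def url_field_value(friendly_name, url):
    """Build a URL column value of the form <friendly name>;#<url>."""
    return f"{friendly_name};#{url}"


print(url_field_value("migration-center", "http://www.migration-center.com"))
```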
The SharePoint Importer can set taxonomy values and migration-center provides two ways to do it. The first possibility is to use the name of the Taxonomy Service Application, a term group name, a term set name and the actual term. Those four values must be concatenated like the following representation:
TaxonomyServiceApplicationName>TermGroupName>TermSetName>TermName
Example: Taxonomy_+poj2l53kslma4==>Group1>TermSet 1>Value
The second possibility is to use the unique identifier of the term within curly brackets {}.
Example: {2e9b53f9-04fe-4abd-bc33-e1410b1a062a}
Multiple values can also be set, but the attribute must then be defined as a multi-value (repeating) attribute.
In case you receive an error on setting taxonomy values make sure the TaxonomyServiceApplicationName is up to date and valid since Microsoft changes the identifier at certain times.
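Both taxonomy notations described above can be produced with a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, and rejecting the > character inside the individual names is a precaution added for this example because > serves as the separator.

```python
import uuid


def taxonomy_value_by_path(service_application, term_group, term_set, term):
    """Build TaxonomyServiceApplicationName>TermGroupName>TermSetName>TermName."""
    parts = [service_application, term_group, term_set, term]
    for part in parts:
        # Assumption: a ">" inside a name would make the value ambiguous.
        if not part or ">" in part:
            raise ValueError(f"invalid taxonomy path element: {part!r}")
    return ">".join(parts)


def taxonomy_value_by_id(term_id):
    """Build the {GUID} notation; raises ValueError for a malformed GUID."""
    return "{" + str(uuid.UUID(str(term_id))) + "}"


print(taxonomy_value_by_path("Taxonomy_+poj2l53kslma4==", "Group1", "TermSet 1", "Value"))
print(taxonomy_value_by_id("2e9b53f9-04fe-4abd-bc33-e1410b1a062a"))
```

The GUID notation does not depend on the name of the Taxonomy Service Application, so it is not affected when that identifier changes.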
The performance of the import process is heavily impacted by some specific features, namely autoAdjustVersioning, autoAdjustAttachments, autoAddContentTypes, autoCreateFolders, setting taxonomy values, and setting system attributes like Author, Editor, Created, and Modified.
If you are importing an update for a major version, processing can take up to three times as long as a normal document import. Combining all of the above-mentioned features on the same document can increase the processing time up to fourfold. Take this into consideration when planning an import, as the duration varies with the features in use.
You will achieve the highest import performance if you configure your SharePoint system appropriately before you start the import and disable the above-mentioned features in the SharePoint importer.
A complete history is available for any SharePoint Importer job from the respective items’ –History- window. It is accessible through the [History] button/menu entry on the toolbar/context menu. The -History- window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the SharePoint Importer can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The Microsoft SharePoint Online Importer allows migrating documents, folders, list items and lists/libraries to SharePoint Online, offering the following features:
Import documents to Microsoft SharePoint Online Document Library items
Import folders to Microsoft SharePoint Online Document Library items
Import list items to Microsoft SharePoint Online Document Library items
Import lists/libraries to Microsoft SharePoint Online Sites
Set values for any columns in SharePoint Online, including user defined columns
Set values for SharePoint Online specific internal field values
Create documents, folders and list items using standard SharePoint Online content types or custom content types
Set permissions for individual documents, folders, list items and lists/libraries
Set the folder path and create folders if they don’t exist
Apply Content Types automatically to lists/libraries if they are not applied yet
Delta migration
Import versions (minor or major, automatic enabling of versioning in the targeted document library)
Import files with a size up to 15 GB
The SharePoint Importer is implemented mainly as a Job Server component but comes with a separate component for setting SharePoint Online specific internal field values, which can be installed optionally and if necessary.
Starting with migration-center version 3.13 Update 2 the SharePoint Online importer only supports app-only principal authentication. It is not possible to use user name / password authentication with this and later versions. Please consider this before you upgrade your existing installation.
SharePoint Online Importers can be created, configured, started and monitored through migration-center Client, while the corresponding processes are executed by migration-center Job Server and the migration-center SharePoint Online Importer respectively.
The term importer is used in migration-center for describing a component which imports processed data from migration-center to another system.
Scanners and importers work as jobs that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created. Multiple scanner and import jobs can be created or run at a time, each being defined by a unique name, a set of configuration parameters and a description (optional).
To install the main product components, consult the migration-center Installation Guide document.
The migration-center SharePoint Importer requires installing an additional, separate component besides the main product components. This additional component is able to set system values (such as creation date, modified date, author and editor) as well as taxonomy values for your objects. It is designed to run as a Windows service and requires .NET Framework 4.7.2 on the computer that runs this service and the migration-center Job Server.
This component must be installed on all machines where the migration-center server components are installed.
To install this additional component, run the installation file located in the SharePoint component folder of your migration-center Job Server installation location, which is by default C:\Program Files (x86)\fme AG\migration-center Server Components <Version>\lib\mc-sharepoint-online-importer\CSOM_Service\install. This folder contains the file install.bat, which must be executed with administrative privileges.
After the service is installed, you need to start it manually the first time. After that, the service is configured to start automatically as soon as the computer's operating system is loaded.
In case it is necessary to uninstall this component, the file uninstall.bat must be executed.
The migration-center SharePoint Online Importer can import objects generated by any of the available (and compatible) scanners. Most scanners can store the data they extract from the source systems they access in either a local path, or a UNC network path.
As is the case with all importers, the SharePoint Online Importer needs to be able to access the files extracted by a scanner in order to import them.
See the respective scanner’s user guide for more information on configuration parameters if necessary.
The importer supports only app-principal authentication for connecting to SharePoint Online. The app-principal authentication comes in two flavors: Azure AD app-only principal authentication and SharePoint app-only principal authentication.
Azure AD app-only authentication requires full control access for the migration-center application on your SharePoint Online tenant. This includes full control on ALL site collections of your tenant.
If you want to restrict the access of the migration-center application to certain site collections or sites, you can use SharePoint app-only authentication.
Take great care when running multiple Job Servers to import into SharePoint: no two Job Servers may import into the same library at the same time. Otherwise the jobs change the library settings concurrently, and the versioning of the objects imported into that library will be incorrect.
The migration-center SharePoint Online Batch Importer supports Azure AD app-only authentication. This is the authentication method for background processes accessing SharePoint Online recommended by Microsoft. When using SharePoint Online you can define applications in Azure AD and these applications can be granted permissions to your SharePoint Online tenant.
Please follow these steps in order to setup your migration-center application in your Azure AD.
In Azure AD when doing App-Only you typically use a certificate to request access: anyone having the certificate and its private key can use the app and the permissions granted to the app. The below steps walk you through the setup of this model.
You are now ready to configure the Azure AD application for invoking SharePoint Online with an App-Only access token. To do that, you must create and configure a self-signed X.509 certificate, which will be used to authenticate your migration-center application against Azure AD when requesting the App-Only access token. The certificate can be created either with the makecert.exe tool that is available in the Windows SDK or with a provided PowerShell script that has no dependency on makecert. Using the PowerShell script is the preferred method and is explained in this chapter.
It's important that you run the below scripts with Administrator privileges.
To create a self-signed certificate, run the following script, which you can find in the <job server folder>\lib\mc-spo-batch-importer\scripts folder:
.\Create-SelfSignedCertificate.ps1 -CommonName "MyCompanyName" -StartDate 2020-07-01 -EndDate 2022-06-30
The dates are provided in ISO date format: YYYY-MM-dd
You will be asked to give a password to encrypt your private key, and both the .PFX file and .CER file will be exported to the current folder.
Save the password of the private key as you’ll need it later.
Once the application has been created copy the "Application (client) ID" as you’ll need it later.
Now click on "API permissions" in the left menu bar and click on the "Add a permission" button. A new blade will appear. Here you choose the permissions that are required by migration-center. Choose the following:
Microsoft APIs
  SharePoint
    Application permissions
      Sites
        Sites.FullControl.All
      TermStore
        TermStore.Read.All
      User
        User.Read.All
  Graph
    Application permissions
      Sites
        Sites.FullControl.All
Click on the blue "Add permissions" button at the bottom to add the permissions to your application. The "Application permissions" are those granted to the migration-center application when running as App Only.
Next step is “connecting” the certificate you created earlier to the application. Click on "Certificates & secrets" in the left menu bar. Click on the "Upload certificate" button, select the .CER file you generated earlier and click on "Add" to upload it.
The “Sites.FullControl.All” application permission requires admin consent in a tenant before it can be used. In order to do this, click on "API permissions" in the left menu again. At the bottom you will see a section "Grant consent". Click on the "Grant admin consent for" button and confirm the action by clicking on the "Yes" button that appears at the top.
In order to use Azure AD app-only principal authentication with the SharePoint Online Batch importer you need to fill in the following importer parameters with the information you gathered in the steps above:
SharePoint app-only authentication allows you to grant fine-grained access permissions on your SharePoint Online tenant to the migration-center application.
The information in this chapter is based on the following guidelines from Microsoft:
In Azure AD when doing App-Only you typically use a certificate to request access: anyone having the certificate and its private key can use the app and the permissions granted to the app. The below steps walk you through the setup of this model.
You are now ready to configure the Azure AD application for invoking SharePoint Online with an App-Only access token. To do that, you must create and configure a self-signed X.509 certificate, which will be used to authenticate your migration-center application against Azure AD when requesting the App-Only access token. The certificate can be created either with the makecert.exe tool that is available in the Windows SDK or with a provided PowerShell script that has no dependency on makecert. Using the PowerShell script is the preferred method and is explained in this chapter.
It's important that you run the below scripts with Administrator privileges.
To create a self-signed certificate, run the following script, which you can find in the <job server folder>\lib\mc-sharepointonline-importer\scripts folder:
.\Create-SelfSignedCertificate.ps1 -CommonName "MyCompanyName" -StartDate 2020-07-01 -EndDate 2022-06-30
The dates are provided in ISO date format: YYYY-MM-dd
You will be asked to give a password to encrypt your private key, and both the .PFX file and .CER file will be exported to the current folder.
Save the password of the private key as you’ll need it later.
In the "App registrations" tab you will find the list of Azure AD applications registered in your tenant. Click the "New registration" button in the upper left part of the blade. Next, provide a name for your application, e.g. “migration-center” and click on "Register" at the bottom of the blade.
Once the application has been created copy the "Application (client) ID" as you’ll need it later.
Next step is “connecting” the certificate you created earlier to the application. Click on "Certificates & secrets" in the left menu bar. Click on the "Upload certificate" button, select the .CER file you generated earlier and click on "Add" to upload it.
After that, you need to create a secret key. Click on “New client secret” to generate a new secret key. Give it an appropriate description, e.g. “migration-center” and choose an expiration period that matches your migration project time frame. Click on “Add” to create the key.
Store the retrieved information (client id and client secret) since you'll need it later! Please safeguard the created client id/secret combination as if it were your administrator account. Using this client id/secret, one can read and update all data in your SharePoint Online environment!
Next step is granting permissions to the newly created principal in SharePoint Online.
If you want to grant tenant scoped permissions, this can only be done via the “appinv.aspx” page on the tenant administration site. If your tenant administration URL is https://contoso-admin.sharepoint.com, you can reach this page via https://contoso-admin.sharepoint.com/_layouts/15/appinv.aspx.
If you want to grant site collection scoped permissions, open the “appinv.aspx” on the specific site collection, e.g. https://contoso.sharepoint.com/sites/mysite/_layouts/15/appinv.aspx.
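In both cases the page address is the site URL followed by the same relative path. A minimal Python sketch of this rule, with a hypothetical helper name and the example URLs used above:

```python
def appinv_url(site_url):
    """Return the address of the appinv.aspx page for a site or admin URL."""
    # Strip a trailing slash so that the result contains no double slash.
    return site_url.rstrip("/") + "/_layouts/15/appinv.aspx"


print(appinv_url("https://contoso-admin.sharepoint.com"))
print(appinv_url("https://contoso.sharepoint.com/sites/mysite/"))
```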
Once the page is loaded add your client id and look up the created principal:
Please enter “www.migration-center.com” in field “App Domain” and “https://www.migration-center.com” in field “Redirect URL”.
To grant permissions, you'll need to provide the permission XML that describes the needed permissions. The migration-center application will always need the “FullControl” permission. Use the following permission XML for granting tenant scoped permissions:
<AppPermissionRequests AllowAppOnlyPolicy="true">
<AppPermissionRequest Scope="http://sharepoint/content/tenant" Right="FullControl" />
<AppPermissionRequest Scope="http://sharepoint/taxonomy" Right="Read" />
</AppPermissionRequests>
Use this permission XML for granting site collection scoped permissions:
<AppPermissionRequests AllowAppOnlyPolicy="true">
<AppPermissionRequest Scope="http://sharepoint/content/sitecollection" Right="FullControl" />
<AppPermissionRequest Scope="http://sharepoint/taxonomy" Right="Read" />
</AppPermissionRequests>
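The two permission XML documents differ only in the scope of the first request. If you need to prepare them for several environments, they can be generated, for instance with the following Python sketch. The function name and the level keys "tenant" and "sitecollection" are hypothetical; the scopes and rights are the ones shown above.

```python
import xml.etree.ElementTree as ET

# Content scopes as used in the two permission XML documents above.
CONTENT_SCOPES = {
    "tenant": "http://sharepoint/content/tenant",
    "sitecollection": "http://sharepoint/content/sitecollection",
}


def build_permission_xml(level):
    """Return the permission XML for the given level as a string."""
    root = ET.Element("AppPermissionRequests", {"AllowAppOnlyPolicy": "true"})
    # migration-center always needs FullControl on the content scope.
    ET.SubElement(root, "AppPermissionRequest",
                  {"Scope": CONTENT_SCOPES[level], "Right": "FullControl"})
    # Read access to the term store, as in both documents above.
    ET.SubElement(root, "AppPermissionRequest",
                  {"Scope": "http://sharepoint/taxonomy", "Right": "Read"})
    return ET.tostring(root, encoding="unicode")


print(build_permission_xml("sitecollection"))
```

The generated string is pasted into the permission request field of the “appinv.aspx” page in the same way as the hand-written XML.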
When you click on “Create” you'll be presented with a permission consent dialog. Press “Trust It” to grant the permissions:
Please safeguard the created client id/secret combination as if it were your administrator account. Using this client id/secret, one can read and update all data in your SharePoint Online environment!
In order to use SharePoint app-only principal authentication with the SharePoint Online importer you need to fill in the following importer parameters with the information you gathered in the steps above:
Due to restrictions in SharePoint, documents cannot be moved from one Library to another using migration-center once they have been imported. This applies to Version and Update objects.
Even though some other systems such as Documentum allow editing of older versions, either by replacing metadata or by creating branches, this is not supported by SharePoint. If you have updates to intermediate versions of a version tree that is already imported, the importer will return an error when trying to import them. The only way to import them is to reset the original version tree and re-import it in the same run as the updates.
Take great care when running multiple Job Servers to import into SharePoint: no two Job Servers may import into the same library at the same time. Otherwise the jobs change the library settings concurrently, and the versioning of the objects imported into that library will be incorrect.
The SharePoint Online system has some limitations regarding file names, folder names, and file size. Our SharePoint Online importer will perform the following validations before a file gets imported to SharePoint Online (in order to fail fast and avoid unnecessary uploads):
Max. length of a file name: 400 characters
Max. length of a folder name: 400 characters
Invalid leading chars for file or folder name: SPACE
Invalid trailing chars for file or folder name: SPACE, PERIOD
Invalid file or folder names: "AUX", "PRN", "NUL", "CON", "COM0", "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9", "LPT0", "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
The following characters are not allowed in file or folder names: " * : < > ? / \ |
Max. length of a file path: 400 characters
Max. size of a file: 15 GB
Max. size of an attachment: 250 MB
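To check names before a migration run, the name rules above can be reproduced in a small script. The following Python sketch is an illustration only, not the importer's validation code: the function and constant names are hypothetical, matching reserved names against the whole name regardless of case is an assumption, and the path length and size limits are not covered.

```python
# Limits taken from the list above; all names in this sketch are hypothetical.
MAX_NAME_LENGTH = 400
FORBIDDEN_CHARS = set('"*:<>?/\\|')
RESERVED_NAMES = (
    {"AUX", "PRN", "NUL", "CON"}
    | {f"COM{i}" for i in range(10)}
    | {f"LPT{i}" for i in range(10)}
)


def validate_name(name):
    """Return a list of problems found in a file or folder name (empty if none)."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"longer than {MAX_NAME_LENGTH} characters")
    if name.startswith(" "):
        problems.append("leading SPACE")
    if name.endswith((" ", ".")):
        problems.append("trailing SPACE or PERIOD")
    # Assumption: reserved names are compared with the whole name, ignoring case.
    if name.upper() in RESERVED_NAMES:
        problems.append("reserved name")
    bad = sorted(FORBIDDEN_CHARS.intersection(name))
    if bad:
        problems.append("forbidden characters: " + " ".join(bad))
    return problems


for candidate in ("report.docx", " draft.", "a:b?"):
    print(repr(candidate), validate_name(candidate))
```

Running such a check on the source data before the import makes it possible to correct offending names in the transformation rules instead of discovering them as import errors.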
To create a new SharePoint Online Importer, create a new importer and select SharePoint Online from the Adapter Type drop-down. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type. Mandatory parameters are marked with an *.
The Properties of an existing importer can be accessed after creating the importer by double-clicking the importer in the list or selecting the Properties button/menu item from the toolbar/context menu. A description is always displayed at the bottom of the window for the selected parameter.
Multiple importers can be created for importing to different target locations, provided each importer has a unique name.
The configuration parameters available for the SharePoint Online Importer are described below:
Starting from migration-center 3.5 the SharePoint Online importer has the option of checking the integrity of each document’s content after it has been imported. This will be done if the “checkContentIntegrity” parameter is checked and it will verify only documents that have a generated checksum in the Source Attributes.
Currently the only supported checksum algorithm is MD5 with HEX encoding.
For certain types of documents such as Office documents and also .MSG documents, “Document Information Panel” is activated and SharePoint changes the content slightly upon upload. This will cause the integrity check to fail for those documents and there is no workaround so far, other than importing without the content integrity check or finding a way to disable this feature directly in SharePoint.
Documents targeted at a Microsoft SharePoint Online Document library will have to be added to a migration set. This migration set must be configured to accept objects of type <source object type>ToSPOnline(document).
Create a new migration set and set the <source object type>ToSPOnline(document) in the Type drop-down. This is set in the -Migration Set Properties- window which appears when creating a new migration set. The type of object can no longer be changed after a migration set has been created.
The migration set is now set to work with SharePoint Online documents.
The same procedure as for documents also applies to folders about to be imported to SharePoint Online. For folders the object type to select for a migration set would be <source object type>ToSPOnline(folder).
The same procedure as for documents also applies to lists or libraries about to be imported to SharePoint Online. For lists or libraries, the object type to select for a migration set would be <source object type>ToSPOnline(list).
The migration-center SharePoint Online Importer supports only the creation of lists or libraries!
The same procedure as for documents also applies to list items about to be imported to SharePoint Online. For list items the object type to select for a migration set would be <source object type>ToSPOnline(listItem).
<source object type>ToSPOnline(document) type objects have a number of predefined rules listed under Rules for system attributes in the -Transformation Rules- window. These rules are described in the table below.
Other rules can be defined by the user to set various SharePoint column values, such as Name, Title, checkin_comment to name a few of the more frequently used. Any SharePoint column which exists in the document library targeted by the contentType rule can be set by creating a rule and associating it with the corresponding column in the associations tab.
Folders have different rules than documents. The exact system rules applicable to folders are listed below.
List items have different rules than documents. The exact system rules applicable to list items are listed below.
Lists have different rules than documents. The exact system rules applicable to lists are listed below.
In order to associate transformation rules with SharePoint Online columns, a migration-center object type definition for the respective content type needs to be created. Object type definitions are created in the migration-center client. To create an object type definition, go to Manage/Object Types and create or import a new object type definition. If your content type contains two or more columns with the same display name, you need to specify the columns' internal names as attribute names.
A Microsoft SharePoint Online content type corresponds to a migration-center object type definition. For the Microsoft SharePoint Online adapter, an object type definition can be specified in four ways, depending on whether a particular SharePoint Online content type is to be used and on whether different content types share the same name across site collections:
My Content Type explicitly describes a SharePoint Online content type named My Content Type, defined either in the SharePoint document library or at Site Collection level.
@My Document Library describes the SharePoint Online document library named My Document Library, using only columns that are defined explicitly in the SharePoint Online document library My Document Library.
My Content Type;#Any Custom Value describes a SharePoint Online content type named My Content Type. Everything after the delimiter ;# will be cut off on import.
@My Document Library;#Any Custom Value describes the SharePoint Online document library named My Document Library, using only columns that are defined explicitly in the SharePoint Online document library My Document Library. Everything after the delimiter ;# will be cut off on import.
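The four naming forms above can be illustrated with a small Python sketch. It is not part of migration-center; the function and class names are hypothetical and only mirror the rules stated in this section (a leading @ addresses a document library, and everything after ;# is discarded).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetDefinition:
    kind: str  # "content_type" or "library"
    name: str  # SharePoint content type name or document library name


def parse_object_type_name(object_type_name: str) -> TargetDefinition:
    """Interpret an object type definition name (illustrative only)."""
    # Everything after the first ";#" delimiter is cut off.
    base = object_type_name.split(";#", 1)[0].strip()
    if base.startswith("@"):
        # A leading "@" refers to the columns of a document library.
        return TargetDefinition("library", base[1:])
    return TargetDefinition("content_type", base)


print(parse_object_type_name("My Content Type;#Any Custom Value"))
print(parse_object_type_name("@My Document Library"))
```

The custom suffix after ;# makes it possible to keep several migration-center object types apart even though they resolve to the same SharePoint name.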
The migration-center SharePoint Online importer is able to modify the following SharePoint Online specific internal field values for documents, list items and folders: Created By, Modified By, Created and Modified. To set these internal field values it is necessary to create attributes named Author, Editor, Created and Modified and associate them in the transformation rules appropriately.
The SharePoint Online Importer is able to read values from files. This might be necessary if the length of a string might exceed the maximum length of an Oracle database column, which is 4,096 bytes.
To make the SharePoint Online Importer read a string from a text file, place the path of the file containing the value within the markup <@MCFILE>filepath</@MCFILE>.
The File Reader does not read values from files for the following attributes: Check-In Comment, Author, Editor, Creation Date, Modified Date.
Example:
Assuming you have a file at path \\scanlocation\temp\1\4a9daccf-5ace-4237-856c-76f3bd3e3165.txt you must type the following String in a rule:
<@MCFILE>\\scanlocation\temp\1\4a9daccf-5ace-4237-856c-76f3bd3e3165.txt</@MCFILE>
On Import the SharePoint Online Importer extracts the contents of this file and adds them to the associated target attribute.
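As a rough illustration of the markup, the following Python sketch builds and recognizes such a rule value outside of migration-center. The helper names are hypothetical, and measuring the value length in UTF-8 bytes is an assumption, since the encoding used by the database is not stated here.

```python
import re
from typing import Optional

MAX_COLUMN_BYTES = 4096  # limit quoted in this section

_MCFILE = re.compile(r"^<@MCFILE>(.*)</@MCFILE>$", re.DOTALL)


def needs_file(value: str) -> bool:
    """True if the value would not fit into the column (UTF-8 assumed)."""
    return len(value.encode("utf-8")) > MAX_COLUMN_BYTES


def wrap_mcfile(path: str) -> str:
    """Build the rule value that points the importer at a text file."""
    return f"<@MCFILE>{path}</@MCFILE>"


def mcfile_path(rule_value: str) -> Optional[str]:
    """Return the file path if the value uses the MCFILE markup, else None."""
    match = _MCFILE.match(rule_value.strip())
    return match.group(1) if match else None


path = r"\\scanlocation\temp\1\4a9daccf-5ace-4237-856c-76f3bd3e3165.txt"
print(wrap_mcfile(path))
```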
The SharePoint Online Importer is able to modify the filename of a file, which is imported to SharePoint Online. To set the filename, you must create a target attribute named Name in the object type definition. You can associate any filename (without an extension) for a document. The importer automatically picks the extension of the original file or the file specified in rule mc_content_location. If the extension needs to be changed as well, use the system rule fileExtension in order to specify a new file extension. You can also change only the extension of a filename by setting fileExtension and not setting a new filename in the Name attribute.
Example:
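The naming rules above can be sketched in Python as follows. This is only an illustration of the described behaviour, not the importer's code; the function name is hypothetical, and accepting fileExtension values with or without a leading dot is an assumption.

```python
import os
from typing import Optional


def target_filename(original_filename: str,
                    name: Optional[str] = None,
                    file_extension: Optional[str] = None) -> str:
    """Combine the Name attribute and the fileExtension rule (sketch)."""
    stem, ext = os.path.splitext(original_filename)
    # Name replaces the file name without its extension.
    new_stem = name if name else stem
    # fileExtension replaces the extension; otherwise the original is kept.
    new_ext = "." + file_extension.lstrip(".") if file_extension else ext
    return new_stem + new_ext


print(target_filename("report.docx", name="Annual Report"))
print(target_filename("report.docx", file_extension="pdf"))
```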
The SharePoint Online Importer is able to set the check-in comment for documents. To set the check-in comment, you must create a target attribute named checkin_comment in the object type definition. You can associate any string to set the check-in comment.
The SharePoint Online Importer is able to set the author for list items, folders and documents. To set the author, you must create a target attribute named author in the object type definition. The value of this attribute must be the loginname of the user, who shall be set as author.
The SharePoint Online Importer is able to set the editor for list items, folders and documents. To set the editor, you must create a target attribute named editor in the object type definition. The value of this attribute must be the loginname of the user, who shall be set as editor.
The SharePoint Online Importer is able to set the creation date for list items, folders and documents. To set the creation date, you must create a target attribute named created in the object type definition. The value of this attribute must be any valid date.
The SharePoint Online Importer can set the modified date for list items, folders and documents. To set the modified date, you must create a target attribute named modified in the object type definition. The value of this attribute must be any valid date.
The SharePoint Online Importer can set lookup values. If you want to set a lookup value, you can either set the ID of the related item or the title. If you set the title of the document and there is more than one item with the same title in the lookup list, the SharePoint Importer marks your import object as an error, because the lookup item could not be identified unequivocally.
The importer will treat any numeric value provided as the ID of the lookup value. In case you want to look up the value by title and the title is numeric, please surround the numeric value with " characters. The importer will automatically remove any " characters in the provided value before trying to match it with the title of the lookup field.
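The distinction between IDs and titles described above can be sketched as follows. The function is hypothetical and only mirrors the stated rule: a purely numeric value is taken as an ID, anything else as a title with all " characters removed.

```python
from typing import Tuple, Union


def interpret_lookup_value(value: str) -> Tuple[str, Union[int, str]]:
    """Classify a lookup rule value as an item ID or an item title (sketch)."""
    stripped = value.strip()
    if stripped.isascii() and stripped.isdigit():
        # Any numeric value is treated as the ID of the lookup item.
        return ("id", int(stripped))
    # Quotes are removed before the value is matched against the title.
    return ("title", stripped.replace('"', ""))


print(interpret_lookup_value("42"))
print(interpret_lookup_value('"42"'))
```

Quoting is therefore only needed when a title consists of digits alone.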
The SharePoint Online Importer can set URL values. You can define a friendly name as well as the actual URL of the link by concatenating the friendly name with ;# and the actual URL.
Example: migration-center;#http://www.migration-center.com
The SharePoint Importer can set taxonomy values, and migration-center provides two ways to do it. The first possibility is to use the name of the Taxonomy Service Application, a term group name, a term set name and the actual term. Those four values must be concatenated as follows:
TaxonomyServiceApplicationName>TermGroupName>TermSetName>TermName
Example: Taxonomy_+poj2l53kslma4==>Group1>TermSet 1>Value
The second possibility is to use the unique identifier of the term within curly brackets {}.
Example: {2e9b53f9-04fe-4abd-bc33-e1410b1a062a}
Multiple values can also be set, but the target attribute must then be defined as a multi-value (repeating) attribute.
In case you receive an error on setting taxonomy values make sure the TaxonomyServiceApplicationName is up to date and valid since Microsoft changes the identifier at certain times.
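The two taxonomy value formats described above can be produced with a few lines of Python. This is a minimal sketch with hypothetical helper names; it only assembles the strings and does not contact SharePoint.

```python
import uuid


def taxonomy_by_path(service_application: str, term_group: str,
                     term_set: str, term: str) -> str:
    """Concatenate the four parts with '>' as described in this section."""
    parts = [service_application, term_group, term_set, term]
    if any(not part for part in parts):
        raise ValueError("all four parts of the taxonomy path are required")
    return ">".join(parts)


def taxonomy_by_id(term_id: str) -> str:
    """Wrap the unique identifier of a term in curly brackets."""
    # uuid.UUID validates the identifier and normalizes it to lower case.
    return "{" + str(uuid.UUID(term_id)) + "}"


print(taxonomy_by_path("Taxonomy_+poj2l53kslma4==", "Group1", "TermSet 1", "Value"))
print(taxonomy_by_id("2e9b53f9-04fe-4abd-bc33-e1410b1a062a"))
```

Referencing a term by its identifier avoids the dependency on the Taxonomy Service Application name, which can change as noted above.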
The performance of the import process is heavily impacted by some specific features, namely autoAdjustVersioning, autoAdjustAttachments, autoAddContentTypes, autoCreateFolders, setting taxonomy values, and setting system attributes like Author, Editor, Created, and Modified.
When importing an update for a major version, the import can take up to three times as long as a normal document import. Combining all of the above-mentioned features for the same document can increase the import time up to fourfold. Take this into consideration when planning an import, as the time may vary depending on which of these features are used.
You will achieve the highest import performance if you configure your SharePoint Online system appropriately before you start the import and disable the above-mentioned features in the SharePoint Online importer.
A complete history is available for any SharePoint Online Importer job from the respective items’ –History- window. It is accessible through the [History] button/menu entry on the toolbar/context menu. The -History- window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the SharePoint Online Importer can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The OpenText InPlace importer takes the objects processed in migration-center and imports them back into an OpenText repository. The OpenText InPlace importer works only together with the OpenText scanner.
The OpenText InPlace adapter supports a limited number of OpenText features, specifically changing document categories and category attributes.
Importer is the term used for an output adapter used as the last step of the migration process. It takes care of importing the objects processed in migration-center into the target system (such as an OpenText repository).
An importer works as a job that can be run at any time, and can even be executed repeatedly. For every run, a detailed history and log file are created. It is defined by a unique name, a set of configuration parameters and an optional description.
OpenText InPlace can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed by the migration-center Job Server.
OpenText InPlace is compatible with versions 10.5, 16.0, 16.4 and 20.2 of OpenText Content Server.
It requires Content Web Services to be installed on the Content Server. To set classifications on the imported files or folders, the Classification Webservice must be installed on the Content Server. To support Record Management Classifications, the Record Management Webservice is required.
The OpenText InPlace importer allows assigning categories to the imported documents and folders. A category is handled internally by the migration-center client as a target object type, and therefore the categories have to be defined in the migration-center client in the object types window ( <Manage> <Object types> ):
Since multiple categories with the same name can exist in an OpenText repository the category name must be always followed by its internal id. Ex: BankCustomer-44632.
The sets defined in the OpenText categories are supported by migration-center. The set attributes will be defined within the corresponding object type using the pattern <Set Name>#<Attribute Name>. The importer will recognize the attributes containing the separator “#” to be attributes belonging to the named Set and it will import them accordingly.
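The two naming conventions above (category name followed by its internal id, and set attributes written as <Set Name>#<Attribute Name>) can be sketched in Python. The helper names are hypothetical, the attribute name Address#Street is an invented example, and treating the internal id as purely numeric is an assumption based on the BankCustomer-44632 example.

```python
from typing import Optional, Tuple


def parse_category_name(object_type: str) -> Tuple[str, int]:
    """Split 'BankCustomer-44632' into ('BankCustomer', 44632)."""
    # Split at the last '-' so category names may themselves contain '-'.
    name, separator, node_id = object_type.rpartition("-")
    if not separator or not name or not node_id.isdigit():
        raise ValueError(f"expected <category name>-<internal id>: {object_type!r}")
    return name, int(node_id)


def parse_attribute_name(attribute: str) -> Tuple[Optional[str], str]:
    """Split '<Set Name>#<Attribute Name>'; plain attributes have no set."""
    set_name, separator, attribute_name = attribute.partition("#")
    if not separator:
        return None, attribute
    return set_name, attribute_name


print(parse_category_name("BankCustomer-44632"))
print(parse_attribute_name("Address#Street"))  # hypothetical set attribute
```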
Only the categories specified in the system rules “target_type” will be assigned to the imported objects:
For setting the category attributes the rules must be associated with the category attributes in the migration set’s |Associations| tab:
Since version 3.2.9, table key lookup attributes are supported in categories. These attributes should be defined in migration-center in the same way as the other category attributes. The supported types of table key lookup attributes are Varchar, Number and Date. The only limitation is that Date type attributes should be of type String in the migration-center object types.
If the importer parameter overwriteExistingCategories is checked, only the specified category and category attributes associated in the migset will be updated when importing, leaving the rest of the categories the same as they were before the import.
If left unchecked, the categories and category attributes associated in the migset will be updated but any unspecified category in the migset will be removed from the document.
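The effect of overwriteExistingCategories can be modelled with plain dictionaries. The sketch below is one reading of the two paragraphs above, not the importer's implementation; the function name and the sample categories are hypothetical.

```python
from typing import Dict

Categories = Dict[str, Dict[str, object]]  # category name -> attribute values


def apply_categories(existing: Categories, specified: Categories,
                     overwrite_existing_categories: bool) -> Categories:
    """Return the categories of a document after the import (sketch)."""
    if overwrite_existing_categories:
        # Checked: categories not named in the migset stay untouched.
        result = {name: dict(attrs) for name, attrs in existing.items()}
    else:
        # Unchecked: existing categories are removed before adding new ones.
        result = {}
    for name, attrs in specified.items():
        merged = dict(result.get(name, {}))
        merged.update(attrs)  # associated attributes are overwritten
        result[name] = merged
    return result


existing = {"BankCustomer-44632": {"Name": "Old", "City": "Bonn"},
            "Contract-501": {"Status": "Open"}}
specified = {"BankCustomer-44632": {"Name": "New"}}
print(apply_categories(existing, specified, True))
print(apply_categories(existing, specified, False))
```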
To create a new OpenText InPlace Importer job specify the respective adapter type in the importer’s Properties window – from the list of available adapters, “OpenText InPlace” must be selected. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type, in this case, OpenText InPlace.
The Properties window of an importer can be accessed by double-clicking an importer in the list, by selecting the Properties button from the toolbar or from the context menu.
Configuration parameters
Values
Name
Enter a unique name for this importer
Mandatory
Adapter type
Select the “OpenTextInPlace” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server is defined, migration-center will prompt the user to define a Job Server Location when saving the importer.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
username*
User name for connecting to the target repository.
A user account with super user privileges must be used to support the full OpenText functionality offered by migration-center.
Mandatory
password*
The user’s password.
Mandatory
authenticationMode*
The OpenText Content Server authentication mode.
Valid values are:
CWS for regular Content Server authentication
RCS for authentication of OpenText Runtime and Core Services
RCSCAP for authentication via Common Authentication Protocol over Runtime and Core Services
webserviceURL*
The URL to the Authentication service of the “les-services”:
rootFolder*
The ID of the node under which the documents will be imported.
Ex. 2000
overwriteExistingCategories
When checked the attributes of the existing category will be overwritten with the specified values. If not checked, the existing categories will be deleted before the specified categories will be added.
numberOfThreads
The number of threads that will be used for importing objects. The maximum allowed is 20.
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during import
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
On the | Migsets | tab, the user can select the migration sets to be imported with this importer. Depending on the chosen Adapter Type only the migration sets compatible with this type of importer will be displayed and can be selected for import. In addition, only migration sets containing at least one object in a validated state will be displayed (since objects that have not been validated cannot be imported). Available migration sets can be moved between the two columns by double clicking or using the arrows buttons.
A complete history is available for any OpenText Importer job from the respective items’ -History- window. It is accessible through the [History] button/menu entry on the toolbar/context menu. The -History- window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the [Open] button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the OpenText Importer can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the “loggingLevel” start parameter for the respective job.
InfoArchive is an archive system from OpenText which fulfills the international standard OAIS (http://de.wikipedia.org/wiki/OAIS).
The InfoArchive Importer provides the necessary functionality for creating Submission Information Packages (SIP) compressed into ZIP format that will be ready to be ingested into an InfoArchive repository. A SIP is a data container used to transport data to be archived from the producer (source application) to InfoArchive. It consists of a SIP descriptor containing packaging and archival information about the package and the data to be archived. Based on the metadata configured in migration-center the InfoArchive Importer will create a valid SIP descriptor (eas_sip.xml) and a valid PDI file (eas_pdi.xml) for every generated SIP.
The supported InfoArchive versions are 3.2 – 16.7. Synchronous ingestion is only supported for the version 3.2.
Importer is the term used for an output adapter and is used at the last step of the migration process. In the context of the InfoArchive Importer the filesystem itself is considered to be the target location for migrated data, hence the designation “importer”. The InfoArchive Importer imports data sourced from other systems and processed with migration-center to the filesystem into Zip-files (SIPs).
This module works as a job that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created. An importer is defined by a unique name, a set of configuration parameters and an optional description.
InfoArchive Importers can be created, configured, started and monitored through migration-center Client, but the corresponding processes are executed by migration-center Job Server.
Objects meant to be migrated to InfoArchive using the InfoArchive Importer have their own type in migration-center. This allows migration-center and the user to target aspects and properties specific to the filesystem.
Documents targeted at InfoArchive will have to be added to a migration set first. This migration set must be configured to accept objects of type <source object type>ToInfoArchive(document).
Create a new migration set and set the <source object type>ToInfoArchive(document) object type in the Type drop-down. The type of object can no longer be changed after a migration set has been created.
The migration sets of type “<source object type>ToInfoArchive(document)” have a number of predefined rules listed under Rules for system attributes in the –Transformation Rules - window.
The values of system rules prefixed with DSS are used by the InfoArchive Importer to create the SIP descriptor (eas_sip.xml) as shown in the following example:
Every unique combination of the values of the “DSS_” rules together with the “target_type” will correspond to a “Data Submission Session (DSS)”. See more information about DSS in the InfoArchive configuration guide.
Configuration parameters
Values
Mandatory
content_name
Must be set with the names of the content files in the SIP associated with the current document. If the current document does not have content this attribute will be ignored by the importer.
If this attribute contains multiple values, the number of values must match, and the values must be in the same order as, the corresponding ones in the mc_content_location attribute.
Important: This rule must have the same value as the rule associated with the PDI attribute configured in the holding as the content reference.
Yes
custom_attribute_key
Represents the name parameter for custom attributes from eas_sip.xml. Number of repeating attributes must match with custom_attribute_value
custom_attribute_value
Represents the values of custom attributes from eas_sip.xml. Number of repeating attributes must match with custom_attribute_key
DSS_application
Sets the <application> Value
Yes
DSS_base_retention_date
Sets the <base_retention_date> Value
Yes
DSS_entity
Sets the <entity> Value
Yes
DSS_holding
Sets the <holding> Value
This value will be used for naming the generated SIP file(s)
Yes
DSS_id
Sets the <id> Value
Yes
DSS_pdi_schema
Sets the <pdi_schema> Value
Yes
DSS_pdi_schema_version
Sets the <pdi_schema_version> Value
No
DSS_priority
Sets the <priority> Value
Yes
DSS_producer
Sets the <producer> Value
Yes
DSS_production_date
Sets the <production_date> Value
Yes
DSS_rentention_class
Sets the <retention_class> Value if needed
No
mc_content_location
By default the document content will be picked up by the importer from its original location (the location where the scanner exported it to)
If mc_content_location is set with a local path or network share pointing to an existing file then the original location will be ignored and the content will be picked up from the location specified in this attribute.
If the value “nocontent” is set to this system rule the document will be handled by the importer as a content less object.
No
target_type
Must be set with the MC internal types that will be used for association.
Important: The first value of this attribute also determines which XSL and XSD file will be used for generating and validating PDI File (eas_pdi.xml)!
E.g. Value=”Office, PhoneCalls” XML filenames must be: Office.xsl, Office.xsd.
Yes
Working with rules and associations is core product functionality and is described in detail in the Client User Guide.
The target types should be defined in MC in accordance with the InfoArchive PDI schema definition. The object types are used by the validation engine to validate the attributes before the import phase. They are also used by the importer to generate the PDI file for the SIPs.
Working with object type definitions and defining attributes is core product functionality and is described in detail in the Client User Guide.
The importer generates the PDI file (eas_pdi.xml) by transforming the structured data from a standard structure based on an XSL file and validating it against an XSD file. An example of the standard structure of the PDI file can be found at Default format of PDI File.
The “pdiSchemasPath” parameter in the importer configuration is used to locate the XSL and XSD files needed for the PDI file (eas_pdi.xml) transformation and validation. If this parameter does not have any value then the eas_pdi.xml file will be created using the standard structure.
If the parameter does contain a value, then the user must make sure that the XSL and XSD files are present in the path. The name of the XSL and XSD files must match the first value of system rule “target_type” otherwise the importer will return an error. If the “pdiSchemasPath” is set to “D:\IA\config” and “target_type” has the following multiple values “Office,PhoneCalls,Tweets” then the XSL and XSD file names must be: “Office.xsl” and “Office.xsd”.
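The file name rule can be illustrated with a short Python sketch. The function name is hypothetical; the example values are the ones used in the paragraph above, and the check whether the files actually exist is left out.

```python
from pathlib import PureWindowsPath
from typing import Optional, Sequence, Tuple


def pdi_schema_files(pdi_schemas_path: str, target_types: Sequence[str]
                     ) -> Optional[Tuple[PureWindowsPath, PureWindowsPath]]:
    """Return the expected XSL and XSD paths, or None for the default format."""
    if not pdi_schemas_path:
        # Without pdiSchemasPath the standard PDI structure is written.
        return None
    if not target_types:
        raise ValueError("target_type must contain at least one value")
    base = PureWindowsPath(pdi_schemas_path)
    first = target_types[0]  # only the first target_type value is used
    return base / f"{first}.xsl", base / f"{first}.xsd"


xsl, xsd = pdi_schema_files(r"D:\IA\config", ["Office", "PhoneCalls", "Tweets"])
print(xsl)
print(xsd)
```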
The XSD file needed for the PDI validation should be the same one used in InfoArchive when configuring a holding and specifying a PDI schema. The XSL file however needs to be created manually by the user. The XSL is responsible for extracting the correct values from the standard output generated by the importer in memory and to transform them into the needed structure for your configuration. An example of such files can be found at Sample PDI schema and Sample PDI transformation style sheet.
Starting from version 3.2.9 of migration-center, the InfoArchive Importer supports the generation of PDI files that contain metadata from multiple object types. By specifying multiple object type definitions in the “target_type” system attribute, one can associate metadata with multiple object types in the associations tab. Note that only the first value from this rule will be used to find the XSD and XSL files for transforming and validating the eas_pdi.xml file. Those files need to support the standard output provided by the importer as seen in Default format of PDI File.
Starting from version 3.2.9 of migration-center the InfoArchive Importer supports multiple contents per AIU. Each content location must be specified in the mc_content_location system attribute as repeating values, and the names of each content must be specified in the content_name system attribute as repeating values as well.
The number of repeating values must be the same for both attributes.
Starting from version 3.2.9 of migration-center, the InfoArchive Importer supports setting custom attributes in the eas_sip.xml file. This can be done by setting the “custom_attribute_key” and “custom_attribute_value” system attributes. The number of repeating values in these attributes must match.
custom_attribute_key: which represents the name parameter for custom attributes from eas_sip.xml.
custom_attribute_value: which represents the values of custom attributes from eas_sip.xml.
Please see Default format of PDI File for more details on what the output will look like.
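Both multi-value features described above (content_name with mc_content_location, and custom_attribute_key with custom_attribute_value) rely on repeating values that match by position. A minimal Python sketch of that pairing follows; the function name and the sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def pair_repeating(first: Sequence[str], second: Sequence[str],
                   first_name: str, second_name: str) -> List[Tuple[str, str]]:
    """Pair two repeating attributes by position, rejecting unequal counts."""
    if len(first) != len(second):
        raise ValueError(
            f"{first_name} has {len(first)} values but "
            f"{second_name} has {len(second)}"
        )
    return list(zip(first, second))


# Hypothetical values for one document with two content files.
contents = pair_repeating(
    ["report.pdf", "annex.pdf"],
    [r"\\share\export\1.pdf", r"\\share\export\2.pdf"],
    "content_name", "mc_content_location",
)
print(contents)
```

Checking the counts in a transformation step before the import can help to catch mismatches early.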
The InfoArchive Importer offers the possibility to automatically distribute the given content to multiple sequential SIPs grouped together as a batch that pertains to a single Data Submission Session (DSS). For activating this feature the check box “batchMode” must be enabled in the importer configuration. Additionally one of the parameters “maxObjectsPerSIP" or "maxContentSizePerSIP" must be set with a value greater than 0.
The importer will process sequentially the documents having the same values for "DSS_" system rules and a new SIP file will be generated anytime when one of the following conditions is met:
The number of objects in the current SIP exceeds the value of "maxObjectsPerSIP"
The total size of the objects content exceeds the "maxContentSizePerSIP"
The importer will set the value of <seqno> element of the SIP descriptor with the sequence number of the SIP inside the DSS. The value of the element <is_last> will be set to "false" for all SIPs that belong to the same DSS except for the last one where it will be set to "true".
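The batch splitting described above can be sketched as follows. This is an illustration under stated assumptions, not the importer's algorithm: a SIP is closed just before an object would push it past one of the limits, the content size is measured in bytes, and all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class Sip:
    seqno: int
    is_last: bool = False
    objects: List[str] = field(default_factory=list)
    content_size: int = 0


def split_into_batch(objects: Iterable[Tuple[str, int]],
                     max_objects_per_sip: int = 0,
                     max_content_size_per_sip: int = 0) -> List[Sip]:
    """Distribute (object id, content size) pairs of one DSS over SIPs."""
    sips: List[Sip] = []
    current = Sip(seqno=1)
    for object_id, size in objects:
        too_many = (max_objects_per_sip > 0
                    and len(current.objects) + 1 > max_objects_per_sip)
        too_big = (max_content_size_per_sip > 0
                   and current.content_size + size > max_content_size_per_sip)
        if current.objects and (too_many or too_big):
            # Close the current SIP and continue with the next sequence number.
            sips.append(current)
            current = Sip(seqno=current.seqno + 1)
        current.objects.append(object_id)
        current.content_size += size
    if current.objects:
        sips.append(current)
    if sips:
        sips[-1].is_last = True  # only the final SIP of the DSS is marked
    return sips


for sip in split_into_batch([("a", 10), ("b", 10), ("c", 10)], max_objects_per_sip=2):
    print(sip.seqno, sip.is_last, sip.objects)
```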
For the cases when the generated SIP will contain too many objects or the size of the SIP will be too big, the importer offers the possibility to distribute the given content to multiple independent SIPs (that belong to different DSS). For activating this feature the check box “batchMode” must be disabled but one of the parameters “maxObjectsPerSIP" or "maxContentSizePerSIP" must be set to a value greater than 0. The importer will create a new independent SIP any time when one of the following conditions is met:
The number of objects in the current SIP exceeds the value of "maxObjectsPerSIP"
The total size of the objects content exceeds the "maxContentSizePerSIP"
In this scenario the value of <seqno> element in SIP descriptor will be set to “1” and <is_last> will be set to “true” for all generated SIPs.
Additionally the importer will change the value provided by “DSS_id” by adding a counter to the end of the value. This is necessary in order to assure a unique identifier for each DSS that is derived from information contained in the SIP descriptor:
external DSS ID = <holding>+<producer><id>.
The external DSS id of multiple independent SIPs must be unique for InfoArchive in order to process them.
IMPORTANT: In this scenario the length of the DSS_id value should be less than maximum allowed length (32 char) so the importer can add the counter as “_1”, “_2” and so on at the end of the value.
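The counter suffix and the length restriction can be checked up front with a small sketch like the one below. The function name and the sample DSS_id are hypothetical, and applying the 32-character limit to the value including the suffix is an assumption derived from the note above.

```python
from typing import List

MAX_DSS_ID_LENGTH = 32  # maximum length quoted in this section


def independent_dss_ids(dss_id: str, sip_count: int) -> List[str]:
    """Derive one DSS id per independent SIP by appending _1, _2, ..."""
    ids = [f"{dss_id}_{number}" for number in range(1, sip_count + 1)]
    too_long = [value for value in ids if len(value) > MAX_DSS_ID_LENGTH]
    if too_long:
        raise ValueError(
            f"DSS_id {dss_id!r} is too long to carry a counter: {too_long[0]!r}"
        )
    return ids


print(independent_dss_ids("invoices2020", 3))  # hypothetical DSS_id value
```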
Since the InfoArchive Importer does not import the data directly into InfoArchive, it offers a post-processing functionality that allows the user to automate the content ingestion into InfoArchive. This can be done by providing a batch file that will be executed by the importer after all objects have been processed. The path to the batch file can be configured in the importer parameter “targetScript”. Such a script may, for example, start the ingestion process on the InfoArchive server.
When the importer parameter “includeAuditTrails” is checked, the importer will add a list with all audit trail records of the currently processed object to the output XML file. The importer will take the data for the audit trail records from the audit trail migration set that must be assigned to the importer. Therefore, the user has to assign at least two migration sets to the importer: one for the documents and one for the corresponding audit trail records. Each audit trail node in the output XML file will contain all the attributes defined in the audit trail migration set. The default XSLT transformation mechanism can be used to create the needed output structure for the audit trail records.
The default PDI output looks like below:
When the parameter “includeVirtualDocuments” is checked the importer will include for each virtual document it processes all its descendants and add them as a list of child nodes to its output record. Each node will contain the name of the child and the content hash of the primary content (that was calculated by the scanner). The default XSLT transformation mechanism can be used to create the needed output structure for the VD objects.
The PDI file looks like below:
The parameter “includeChildrenVersions” allows specifying if all versions of the children will be included or only the latest version.
There are several limitations that should be taken into consideration when using this feature:
All related objects, i.e. all descendants of a virtual document, must be associated with the same import job in migration-center. This limitation is necessary in order to ensure that all descendant objects of a virtual document are in the transformed and validated state before they are processed by the importer. If a descendant object is not contained in any of the migration sets that are associated with the import job, migration-center will throw an error for the parent object during the import.
For children, the <file_name> value is taken from the first value of the system rule “content_name”. The “content_name” is the system attribute that defines the content names in the zip.
For children, only the content specified in the first value of "mc_content_location" will be added. If "mc_content_location" is null, the content will be taken from the column "content_location" that stores the path where the document was scanned.
If the same document is part of multiple VDs within the same SIP then its content will be added only one time.
If the size limit for one SIP is exceeded, the importer will throw an error
Delta migration does not work with this feature.
If the parameter “includeChildrenOnce” is checked, the VD children are added only to the first imported parent. If it is unchecked, the children are added to every parent they belong to, and they are also added as distinct nodes in the PDI file.
migration-center can ingest the generated ZIP files synchronously over the Webservice into InfoArchive 3.2. For this, the InfoArchive holding must be configured as described in the InfoArchive documentation.
In order to let the importer transfer the files the parameter “webserviceURL” must be filled. If that is the case the importer will try to reach the Webservice at the start of the import to ensure a connection to the webservices can be established. Once a SIP file is created in the filesystem it will be transferred via Webservice to InfoArchive in a separate Thread. The number of threads that run in parallel can be set with the parameter “numberOfThreads”.
If the transfer is successful the SIP file can be moved to a directory specified by the parameter “moveFilesToFolder”. A SIP file that fails to transfer will be deleted by default, unless the parameter “keepUntransferredSIPs” is checked.
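The handling of a SIP file after a transfer attempt can be summarised in a small sketch. It only models the decision described above and is not the importer's implementation. The function name and the returned strings are hypothetical. What happens to a successfully transferred SIP when “moveFilesToFolder” is empty is not stated above; the sketch assumes the file simply stays where it was created.

```python
from typing import Optional


def sip_action_after_transfer(
    transferred: bool,
    move_files_to_folder: Optional[str],
    keep_untransferred_sips: bool,
) -> str:
    """Return what happens to a SIP file after a Webservice transfer attempt.

    Illustrative model only; names and return values are hypothetical.
    """
    if transferred:
        # Successful transfers are moved only if a folder is configured.
        if move_files_to_folder:
            return f"move to {move_files_to_folder}"
        # Assumption: without a folder the file is left where it was created.
        return "leave in place"
    # Failed transfers are deleted unless keepUntransferredSIPs is checked.
    return "keep" if keep_untransferred_sips else "delete"
```

For example, a failed transfer with “keepUntransferredSIPs” unchecked yields "delete", which matches the default behaviour described above.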
To create a new InfoArchive Importer job, select the adapter type “InfoArchive” from the list of available adapters in the Importer Properties window. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type.
The Properties window of an importer can be accessed by double-clicking an importer in the list, or selecting the Properties button/menu item from the toolbar/context menu.
A detailed description is always displayed at the bottom of the window for a selected parameter.
Configuration parameters and values:
Name (mandatory): Enter a unique name for this importer.
Adapter type (mandatory): Select the “InfoArchive” adapter from the list of available adapters.
Location (mandatory): Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server is defined, migration-center will prompt the user to define a Job Server Location when saving the importer.
Description: Enter a description for this job (optional).
Configuration parameters and values:
pdiSchemasPath: The folder path where the XSL and XSD files needed for generating and validating PDI files are located. If no value is set for this parameter, the PDI file will be generated in the default format. In most cases this parameter needs to be set.
targetDirectory (mandatory): The folder where SIP files will be created. Can be a local drive or a network share.
includeAuditTrails: Enables audit trail entries to be added to the generated SIPs. The audit trail migsets need to be associated with the importer. This works only with audit trail objects exported from Documentum.
includeVirtualDocuments: Enables the children of virtual documents (scanned from Documentum) to be included in the SIP together with the parent document.
includeChildrenVersions: Indicates whether all versions of the children of the virtual documents will be included in the SIP. If not checked, only the most recent version of the children will be added to the SIP. This parameter is used only when “includeVirtualDocuments” is checked.
includeChildrenOnce: If enabled, the VD children will only be added under the parent node in the PDI. If disabled, they will also be added as distinct nodes.
batchMode: Enables batch ingestion mode. Enabling this parameter has an effect only when “maxObjectsPerSIP” or “maxContentSizePerSIP” is set to a value greater than 0.
maxObjectsPerSIP: Maximum number of objects in a SIP (ZIP). If it is 0 or less, it will be ignored.
maxContentSizePerSIP: Maximum overall content size of a SIP (ZIP) in MB. If it is 0 or less, it will be ignored.
computeChecksum: Flag indicating whether the checksum of the generated eas_pdi.xml file should be computed. The importer will use the SHA-256 algorithm and base64 encoding.
triggerScript: Path to a custom script or batch file to be executed at the end of the import.
webServiceURL: Set a valid Webservice URL here if the SIP files should be transferred via InfoArchive webservices. If this is empty, no Webservice transfer will be done.
moveFilesToFolder: If set, successfully transferred files will be moved to another folder (only relevant for Webservice transfer).
keepUntransferredSIPs: Enable this to keep SIPs which have produced an error while being transferred. Normally SIPs get deleted in case of an error. Transfer errors can be technical (e.g. connection lost) or caused by misconfiguration (e.g. attribute validation failed or a schema is missing).
numberOfThreads: Set the maximum number of threads to use for each Webservice transfer. Default is 10.
loggingLevel (mandatory): Logging level: 4 - Debug, 3 - Info, 2 - Warning, 1 - Error.
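To verify a checksum produced with “computeChecksum” outside of migration-center, the same combination of SHA-256 and base64 encoding can be reproduced with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and where the importer stores the checksum is not covered here.

```python
import base64
import hashlib


def pdi_checksum(pdi_bytes: bytes) -> str:
    """Return the base64-encoded SHA-256 digest of the given PDI content."""
    digest = hashlib.sha256(pdi_bytes).digest()
    return base64.b64encode(digest).decode("ascii")
```

To check a generated file, read it in binary mode and pass its content to the function, e.g. `pdi_checksum(open("eas_pdi.xml", "rb").read())`.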
A complete history is available for any InfoArchive Importer job from the respective item’s History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the InfoArchive Importer can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the “loggingLevel” start parameter for the respective job.
The Veeva Importer is one of the target adapters available in migration-center starting with version 3.7. It takes the objects processed in migration-center and imports them into a Veeva Vault. Currently, for every supported Veeva Vault, there is a specific importer type:
Clinical: ClinicalImporter
Quality: QualityImporter
RIM: RIMImporter
Importer is the term used for an output adapter which is most likely used at the last step of the migration process. An Importer (such as the Veeva Importer) takes care of importing the objects processed in migration-center into the target system (such as the Veeva Vault).
An importer works as a job that can be run at any time and can even be executed repeatedly. For every run a detailed history and log file are created.
An importer is defined by a unique name, a set of configuration parameters and an optional description. Veeva Importers can be created, configured, started and monitored through migration-center Client, while the corresponding processes are executed in the background by migration-center Job Server.
When the term "Veeva Importer" is used in the following, it refers to all three Veeva Importers. The name of the specific Veeva Importer will be used if there is a feature applicable just for that adapter.
The importer uses FTP to upload the content of the documents and their versions to the Veeva Vault environment. This means the content will be uploaded first to the Veeva FTP server, so the necessary outbound ports (i.e. 21 and 56000 – 56100) should be opened in your firewalls as described here:
In order to successfully migrate documents into Veeva Vault using migration-center, each vault must have the Migration Mode setting enabled. Please contact Veeva Product Support for more details about how to enable Migration Mode for your Vault.
Before starting the ingestion into Veeva Vault you should inform Vault Product Support about this by filling this form:
You should submit the form at least 3 business days in advance of any migration and at least one week in advance of any significant migration. A significant migration includes over 10,000 office documents, 1,000 large documents (like video) or 500,000 object records.
In addition, if Migration is being run during “off hours” for the Vault’s location, or weekends, you should ensure that Vault Product Support is aware of requested activity.
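When planning several migration runs, the thresholds above can be checked with a small helper. This is only a sketch: the function names are hypothetical, and it assumes that "over" applies to all three thresholds, i.e. a migration counts as significant only when one of the numbers is exceeded.

```python
def is_significant_migration(
    office_documents: int = 0,
    large_documents: int = 0,
    object_records: int = 0,
) -> bool:
    """Return True if any of the documented thresholds is exceeded."""
    return (
        office_documents > 10_000
        or large_documents > 1_000
        or object_records > 500_000
    )


def recommended_notice(**volumes: int) -> str:
    """Return the advance notice suggested for the planned volumes."""
    if is_significant_migration(**volumes):
        return "at least one week"
    return "3 business days"
```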
To create a new Veeva Clinical Importer job click on the New Importer button and select “Veeva Clinical” from the adapter type dropdown list. Once the adapter type has been selected, the parameters list will be populated with the Veeva parameters. The other types of Veeva Importer can be created in a similar way by selecting the corresponding adapter type.
The Properties window of an importer can be accessed by double-clicking an importer in the list or selecting the Properties button or entry from the toolbar or context menu.
On the | Migsets | tab, the user can select the migration sets to be imported with this importer. Depending on the chosen Adapter Type only the migration sets compatible with this type of importer will be displayed and can be selected for import. Also, only migration sets containing at least one object in a validated state will be displayed (since objects which haven’t been validated cannot be imported).
Available migration sets can be moved between the two columns by double clicking or using the arrow buttons.
To configure additional parameters that apply to all importer runs, a configuration file (config.properties) is provided in the folder …\lib\mc-veeva-importer. The following settings are available:
To apply changes made in the config.properties file, the Job Server needs to be restarted.
Vault Objects are part of the application data model in Veeva Vault. Vault Objects are usually configured to be referenced in the document fields. For example, the Product document field corresponds to the Product object and is associated with all document types.
Veeva Importer allows importing object records from any supported source system to any supported Veeva Vault. For that, a migset of type “<Source>toVeeva<type>(object)” must be created. Ex: Veeva Quality Importer works with migsets of type “<Source>toVeevaQuality(object)”.
The following rules for system attributes are available when importing objects.
Any editable object fields can be set using the normal transformation rules.
During a delta migration, the existing attachments of an object will be replaced with those set in the "attachments" system attribute. If the attribute is not set, the object will keep its attachments.
Veeva importer allows importing documents from any supported source system to any supported Veeva Vault. For that, a migset of type “<Source>toVeevaClinical(document)” / “<Source>toVeevaQuality(document)” / “<Source>toVeevaRIM(document)” must be created.
Importing documents with multiple versions is supported by the Veeva importer. The structure of the version tree is generated by the scanners of the systems that support this feature and provide means to extract it. The order of the versions in Veeva Vault is set through the system rules major_version_number__v and minor_version_number__v.
All objects from a version structure must be imported since each of them references its antecedents, going back to the very first version. Therefore, it is advised not to drop versions of an object between the scan and the import processes, as this will generate inconsistencies and errors. If an object is intended to be migrated without versions (i.e. only the current version of the object needs to be migrated), then the affected objects should be scanned without enabling the respective scanner’s versioning option.
The following rules for system attributes are available when importing documents.
When importing documents using the Veeva importer you have the option of migrating existing renditions or adding different content as renditions. This is done using the two system rules available in the migset: “mc_rendition_paths” and “mc_rendition_types”. Each of these two rules can have multiple values and you need to set the same number of values for both rules. Otherwise, the importer will throw an error.
If the parameter skipUploadToFTP is not checked, the “mc_rendition_paths” rule needs to contain a path on the filesystem to the content, just like the “mc_content_location” attribute. Otherwise, the rule needs to contain the FTP path of the desired content.
The “mc_rendition_types” rule needs to specify the internal name of the rendition type from Veeva (e.g. large_size_asset__v). The rendition types available are specified in the document type definition in Veeva.
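The requirement that both rendition rules carry the same number of values can be checked before the import, for instance on exported rule values. The following sketch is illustrative only: the function name is hypothetical, and it assumes that the values of the two rules belong together by position (the first path with the first type, and so on), which is not stated explicitly above.

```python
from typing import List, Sequence, Tuple


def pair_renditions(
    paths: Sequence[str], types: Sequence[str]
) -> List[Tuple[str, str]]:
    """Pair rendition paths with rendition types, value by value."""
    if len(paths) != len(types):
        # The importer reports an error in this situation.
        raise ValueError(
            f"{len(paths)} rendition path(s) but {len(types)} rendition type(s)"
        )
    # Assumption: values are matched by their position in the two rules.
    return list(zip(paths, types))
```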
Veeva Importer allows importing "dm_relation" objects scanned by Documentum Scanner as Veeva Vault relations. The internal names of the Veeva Vault relations to be created should be configured in the configuration file "relations-mapping.properties" located in the Job Server folder ./lib/mc-veeva-importer.
The internal name of available Veeva Vault relations can be found in the Vault configuration menu. The internal name must be used instead of the label.
The relations are imported after all documents and versions have been imported. If a document that is part of a relation could not be imported, the relation will not be imported and the parent document of the relation will be moved to the status "Partially Imported". Documents in the status "Partially Imported" can be submitted to the importer again, but their relations will only be imported after the child documents of the relations have been imported.
Binders allow users to organize and group documents in a flexible structure. Binders are comprised of sections, which can nest to build a hierarchical structure, and links to documents in Vault.
Veeva importer allows you to import Virtual Documents scanned from Documentum as Binders in any supported Veeva Vault. For that a migset of type “DctmtoVeeva<type>(binder)” must be created.
The following features are currently supported:
Create binders from scratch by preserving the hierarchical structure of the Virtual Document
Create binders based on a template so the sections are generated as they are defined in the template
Import Virtual Document versions as Binder versions
Import the content of the Virtual Document as the first child of the Binder
Preserve version binding (implicit and symbolic) for documents and binders
When importing Virtual Documents as Binders all normal documents that are referenced by the Virtual Documents should be imported before or together with the binders (in the same importer run).
Setting version labels as minor or major does not work for binders!
The following rules for system attributes are available when importing binders.
When importing Documentum Virtual Documents as Veeva Vault Binders, the binder components can be bound to a specific version. When “preserveVersionBinding” is checked in the importer configuration, the binder components are bound to the same version as they are bound to in the Virtual Documents in Documentum. The components will be bound to the latest version in Veeva Vault in the following cases:
The version binding is not specified in the “version_label” attribute of the “dmr_containment” object in Documentum
The version binding is specified in Documentum but “preserveVersionBinding” is not enabled in the importer configuration
Circular references between binders are not allowed in Veeva Vault and therefore the importer will throw an error when trying to import circular references. To avoid that, the circular references should be fixed in the source system before scanning the Virtual Documents.
Creating binders with sections is possible by specifying a template name that contains sections. The documents can be imported into sections by setting the “parent_section” system rule in the migset containing the normal documents. In case of hierarchical sections, the section path is specified with the sections separated by a slash “/”, e.g. Section1/Section1.1/Section1.1.1
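A value for the “parent_section” rule can be split into its section levels as shown in the following sketch, for example to check the nesting depth of the values produced by a transformation rule. The function name is hypothetical, and ignoring leading or trailing slashes is an assumption made for this illustration, not documented importer behaviour.

```python
from typing import List


def split_section_path(parent_section: str) -> List[str]:
    """Split a slash-separated section path into its section names."""
    # Assumption: empty parts caused by leading/trailing slashes are ignored.
    return [part for part in parent_section.split("/") if part]
```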
Veeva RIM Importer allows you to import Submission Archives into a RIM Vault. For that, a migset of type “DctmtoVeevaRIM(submission)” must be created. The importer does not create the submission definition but imports submission archives into existing submissions. The zip file to be imported must contain the “Submission” folder and the “Application” folder.
The following rules for system attributes are available when importing submission archives.
When performing a job with the Veeva importer you must specify the “target_type” and the type/subtype/classification for the documents. Due to how the Veeva document types are structured, the object type definition in migration-center will act as a container for all the necessary target attributes from Veeva. Therefore the object type definition will be set for the ”target_type” system rule, but the actual type of the document will be defined by the values set for the type__v / subtype__v / classification__v system rules.
Depending on which type of Veeva Vault you are importing into (Clinical, Quality or RIM) there are several attributes which are mandatory, and you will need to associate transformation rules with valid values, as follows:
Clinical: product__v, study__v, blinding__v
Quality: country__v, owning_department__v, owning_facility__v
RIM: none
The values for these attributes need to be the actual label value and not the internal name or id (e.g. “Base Doc Lifecycle” instead of “clinical_doc_lifecycle__c”). This applies to all of the attributes mentioned above.
Each document type in Veeva can have one or multiple Lifecycles associated with it. When importing a document of a certain type you need to specify, using the ”lifecycle__v” attribute, which of the available lifecycles must be set for the document.
To find out or modify the available lifecycles for a specific type you need to navigate in Veeva to the Admin section -> Configuration tab -> Document Types and open the specific type you are interested in.
In the same Configuration section you will find Document Lifecycles. Here you can select a lifecycle and see the specific lifecycle states which you can set for the “status__v” attribute.
The available permissions for a specific document are specified by the lifecycle which is attached to that document. To see what available permissions there are, you need to go to the Admin interface -> Configuration -> Document Lifecycles then select the lifecycle and go to the tab Roles. These are the permissions you will be able to set with the Veeva Importer if you attach this particular lifecycle to the document you are importing.
The format is the following:
{Role_name}.{USERS or GROUPS}=username1,username2,…
Example:
reviewer__v.users=user1@vvtechpartner.com,user2@vvtechpartner.com
mc_custom_role__c.groups=approver_group__c
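Values in this format can be validated before the import with a small parser like the one sketched below. It is not part of migration-center and the function name is hypothetical. Because the format description uses upper case (USERS, GROUPS) while the examples use lower case, the sketch assumes that the keyword is case-insensitive.

```python
from typing import List, Tuple


def parse_role_assignment(value: str) -> Tuple[str, str, List[str]]:
    """Split '{role}.{users|groups}=a,b' into role, kind and members."""
    left, _, right = value.partition("=")
    role, _, kind = left.strip().rpartition(".")
    kind = kind.lower()  # assumption: USERS/GROUPS is case-insensitive
    members = [m.strip() for m in right.split(",") if m.strip()]
    if not role or kind not in ("users", "groups") or not members:
        raise ValueError(f"not a valid role assignment: {value!r}")
    return role, kind, members
```

Applied to the first example above, the function returns the role `reviewer__v`, the kind `users` and the two user names as a list.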
Veeva supports document/object attributes that reference master data objects by their internal ID. Usually these internal IDs are unknown when migrating documents/objects from a source system to Veeva. Thus, migration-center lets users set values other than the “ID” for the referenced objects. By default, the user should provide the value of the field name__v for the referenced objects. Setting the value of other fields is possible through a configuration file that is specified in the importer parameter attributeMappingLocation.
The configuration file should have the following structure for document attributes:
<document field name>=<Object name>,<Object lookup field>
For example, when setting the field “country__v”, the user may want to set the country abbreviation in migration-center. In this case the following line should be added to the configuration file:
country__v=country__v,abbreviation__c
The configuration file should have the following structure for object attributes:
<object field name>.<Object name>=<Ref object name>,<Object lookup field>
If you want to use the "name__v" instead of "id" when you set the "country__v" field from "location__v" object then the following line should be added in the configuration file:
country__v.location__v=country__v,name__v
The lookup field of the object that is set in the configuration file (e.g. abbreviation__c or name__v) should be defined as unique in the object definition in Veeva Vault; otherwise the lookup functionality may not work properly.
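The two line formats of the attribute mapping file can be told apart by the dot in the key: document attributes have a plain field name, while object attributes carry the owning object after the dot. The following sketch parses a single line accordingly. The class and function names are hypothetical and only serve to illustrate the structure; this is not the parser used by the importer.

```python
from typing import NamedTuple, Optional


class Lookup(NamedTuple):
    field: str                   # field that references another object
    owner_object: Optional[str]  # None for document attributes
    referenced_object: str       # object the field points to
    lookup_field: str            # field used instead of the internal id


def parse_mapping_line(line: str) -> Lookup:
    """Parse one line of the attribute mapping configuration file."""
    key, eq, value = line.strip().partition("=")
    referenced, comma, lookup_field = value.partition(",")
    if not (eq and comma and key and referenced.strip() and lookup_field.strip()):
        raise ValueError(f"not a valid mapping line: {line!r}")
    # Object attributes use '<object field name>.<Object name>' as the key.
    field, dot, owner = key.partition(".")
    return Lookup(
        field.strip(),
        owner.strip() if dot else None,
        referenced.strip(),
        lookup_field.strip(),
    )
```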
The system rule “attachments” allows the user to import attachments for the documents. It should be set with the fully qualified paths of the attachments.
If a document has two or more attachments with the same name but different content, the importer will create multiple versions for the same attachment.
Objects that have changed in the source system since the last scan are scanned as update objects. Whether an object in a migration set is an update or not can be seen by checking the value of the Is_update column – if it’s 1, the current object is an update to a previously scanned object (the base object). An update object cannot be imported unless its base object has been imported previously.
Currently, update objects are processed by Veeva importer with the following limitations:
Only document and object updates can be imported. The updates for Binders are reported as errors by the importer
Only metadata is updated for the documents. The content, renditions, permissions and classifications are not updated.
New versions of documents are fully supported
New fields can be added by an update to existing Veeva Objects
Besides the delta mechanism described above, the importer allows importing new versions of documents based on the “external_id__v” system rule. If that rule is set, the importer behaves in the following way:
A document with the same external id exists in the vault: The importer adds the document being imported as a new version of the existing document. The new version can only be imported if a version with the same major and minor value doesn’t already exist.
A document with the same external id cannot be found in the vault:
If the document being imported is a root version (level_in_version_tree = 0), a new document is created.
If the document being imported is not a root version (level_in_version_tree > 0), the document fails to import and an appropriate error message is logged.
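The cases above form a small decision tree, sketched here as a function. It only mirrors the rules described in this section; the function name, the parameters and the returned strings are hypothetical and do not reflect the importer's internal code or its actual error messages.

```python
from typing import Set, Tuple

Version = Tuple[int, int]  # (major, minor)


def external_id_import_action(
    exists_in_vault: bool,
    level_in_version_tree: int,
    version: Version,
    existing_versions: Set[Version],
) -> str:
    """Return the outcome for a document that has external_id__v set."""
    if exists_in_vault:
        # A version with the same major and minor value must not exist yet.
        if version in existing_versions:
            return "error: version already exists"
        return "add as new version"
    if level_in_version_tree == 0:
        return "create new document"
    return "error: no document with this external id"
```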
Object names must be unique. Thus, the importer will report an error if an update tries to change the name of the object to an existing one.
A complete history is available for any Veeva Importer job from the respective item’s History window. It is accessible through the History button/menu entry on the toolbar/context menu. The History window displays a list of all runs for the selected job together with additional information, such as the number of processed objects, the start and ending time and the status.
Double clicking an entry or clicking the Open button on the toolbar opens the log file created by that run. The log file contains more information about the run of the selected job:
Version information of the migration-center Server Components the job was run with
The parameters the job was run with
Execution Summary that contains the total number of objects processed, the number of documents and folders scanned or imported, the count of warnings and errors that occurred during runtime.
Log files generated by the Veeva Importer can be found in the Server Components installation folder of the machine where the job was run, e.g. …\fme AG\migration-center Server Components <Version>\logs
The amount of information written to the log files depends on the setting specified in the ‘loggingLevel’ start parameter for the respective job.
The SharePoint Online Batch importer lets you import documents and folders into a folder of a document library, i.e. you have to configure the SharePoint Online site, document library and the base folder of your import in the importer configuration. All documents and folders that you import go into the specified base folder or into a folder below it.
The information in this chapter is based on the following Microsoft guidelines:
The next step is registering an Azure AD application in the Azure Active Directory tenant that is linked to your Office 365 tenant. To do that, open the Office 365 Admin Center using the account of a user who is a member of the Tenant Global Admins group. Click on the "Azure Active Directory" link that is available under the "Admin centers" group in the left-side tree view of the Office 365 Admin Center. The Microsoft Azure portal (https://portal.azure.com/) will open in a new browser tab. If this is the first time you access the Azure portal with your account, you will have to register a new Azure subscription, providing some information and a credit card for any payment needs. Using Azure AD and registering an Office 365 application are free capabilities, so you will not be charged for them. Once you have access to the Azure portal, select the "Azure Active Directory" section and choose the option "App registrations". See the next figure for further details.
Click on the blue "Add permissions" button at the bottom to add the permissions to your application. The "Application permissions" are those granted to the migration-center application when running as App Only.
Usually you want to set additional custom attributes on the documents. To do so, you need to define the regular transformation rules first and then associate them with the corresponding target type attributes. For more details on transformation rules and associations, please see the corresponding chapters.
Usually you want to set additional custom attributes on the folders. To do so, you need to define the regular transformation rules first and then associate them with the corresponding target type attributes. For more details on transformation rules and associations, please see the corresponding chapters.
For more details on those files, please see
Importing multiple versions of compound documents is not supported. Therefore, the option for scanning only the last version of the virtual documents from Documentum should be activated. For more details please check the user guide.
Each value for a Classification to be assigned must be an existing Content Server Classification path, divided with forward slashes (/), located under the node specified in the importer parameter "classRootFolder".
Working with object type definitions and defining attributes is a core product functionality and is described in detail in the .
Importing lists or libraries into SharePoint differs slightly from importing documents, folders or list items, since the migration-center SharePoint Importer sets Microsoft-defined attributes which are not visible to the user. In order to set those attributes, it is necessary to create object type definitions for each type of list template. The SharePoint Adapter is able to set any possible attribute of a list.
In the "App registrations" tab you will find the list of Azure AD applications registered in your tenant. Click the "New registration" button in the upper left part of the blade. Next, provide a name for your application, e.g. “migration-center” and click on "Register" at the bottom of the blade.
Importing lists or libraries into SharePoint Online differs slightly from importing documents, folders or list items, since the migration-center SharePoint Online Importer sets Microsoft-defined attributes which are not visible to the user. In order to set those attributes, it is necessary to create object type definitions for each type of list template. The SharePoint Online Adapter is able to set any possible attribute of a list.
Note: If this version of the OpenText Content Server Import Adaptor is used together with “Extended ECM for SAP Solutions”, then ‘authenticationmode’ has to be set to “RCS”, since OpenText Content Server together with “Extended ECM for SAP Solutions” is deployed under “Runtime and Core Services”. For details of the individual authentication mechanisms and scenarios provided by OpenText, see the appropriate OpenText documentation.
For more details about PDI generation see .
More information about how relations are working in the Veeva Vault can be found here: .
Configuration parameters and values:
appClientId: The ID of the migration-center Azure AD application.
appCertificatePath: The full path to the certificate .PFX file which you generated when setting up the Azure AD application.
appCertificatePassword: The password to read the certificate specified in appCertificatePath.
Configuration parameters and values:
appClientId: The ID of the SharePoint application you have created.
appClientSecret: The client secret which you generated when setting up the SharePoint application.
Configuration parameters
Values
mc_content_location
Rule for setting alternative content of the document, i.e. if you want to upload a different content file than the scanned one.
Example: D:\Some\Alternative\Content\Path\Content.pdf
Optional
If you leave this rule empty, the importer will import the scanned content along with the document. In case you want to upload a different content, e.g. maybe you converted a PDF file into PDF-A format, use this rule to specify the location of the new content file.
s__contentType*
Rule for setting the SharePoint content type of the document.
Example: Document
Mandatory
This value must match an existing migration-center object type definition and of course a content type in the target SharePoint document library.
s__createdBy
Name of the user who created the document. The value must be a unique username.
Example: MikeGration@asdf.onmicrosoft.com
Optional
If not provided, the system account will be set.
s__createdDate
Date and time when the document was created.
Example: 2020-03-31 14:14:03
Optional
If not provided, the current date and time will be set.
s__declareAsRecord
Flag indicating if the document should be declared as a record.
Example: True
Optional
Record declaration will only work if SharePoint Online is the target (and not OneDrive), the target library is a Records Library, and manual record declaration is enabled in the Records Library.
s__moderationStatus
Sets the moderation / approval status of the document. Must be one of the following values (you can use either the status name or its numerical value): Approved (or 0), Denied (or 1), Pending (or 2), Draft (or 3), Scheduled (or 4).
Optional
s__moderationStatusComment
Sets the moderation / approval status comment of the document.
Optional
s__modifiedBy
Name of the user who last modified the document. The value must be a unique username.
Example: MikeGration@asdf.onmicrosoft.com
Optional
If not provided, the system account will be set.
s__modifiedDate
Date and time when the document was last modified.
Example: 2020-03-31 14:14:03
Optional
If not provided, the current date and time will be set.
s__name*
The name of the document including its file extension.
Example: My Document.docx
Mandatory
s__parentFolder*
The parent folder of the document.
Example: /My subfolder
Mandatory
The parent folder is relative to the base folder configured in the importer definition, e.g. if the base folder is /Shared Documents and the parent folder is /My subfolder, the effective folder for the document will be /Shared Documents/My subfolder
s__roleAssignments
Specify role assignments for the current object. If a role definition is assigned to the current object, the Importer breaks the role inheritance.
It is possible to specify a list of users and/or a list of SharePoint groups. If a group is specified, the targeted SharePoint group must exist in your SharePoint site.
To set role assignments for single users, the value must match the following pattern:
<loginname>;#<roledefinitionname>
To set role assignments for SharePoint groups, the value must match the following pattern:
@<groupname>;#<roledefinitionname>
Example:
user;#Read
@Contributors;#Contribute
Optional
s__versionNumber
Sets the version number of the document. Must be in the format: x.y Example: 1.0
Mandatory if importer parameter "autoVersioning" is set to "false".
Specified value will be ignored if parameter "autoVersioning" is set to "true".
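The value rules for s__moderationStatus and s__versionNumber can be checked before an import run. The following Python snippet is only a minimal sketch and not part of migration-center: the function names normalize_moderation_status and is_valid_version_number are hypothetical, and it assumes that status names are written exactly as listed in the table.

```python
import re

# Status names and numerical values as listed for s__moderationStatus.
MODERATION_STATUS = {
    "Approved": 0,
    "Denied": 1,
    "Pending": 2,
    "Draft": 3,
    "Scheduled": 4,
}


def normalize_moderation_status(value: str) -> int:
    """Return the numerical status for a status name or a numeric string."""
    text = value.strip()
    if text in MODERATION_STATUS:
        return MODERATION_STATUS[text]
    if text.isdecimal() and int(text) in MODERATION_STATUS.values():
        return int(text)
    raise ValueError(f"Not a valid moderation status: {value!r}")


def is_valid_version_number(value: str) -> bool:
    """Check the x.y format required for s__versionNumber, e.g. 1.0."""
    return re.fullmatch(r"\d+\.\d+", value.strip()) is not None
```

A check like this could run on the transformation output, so that invalid values are found before the importer reports them.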
Configuration parameters
Values
s__contentType*
Rule for setting the SharePoint content type of the folder.
Example: Folder
Mandatory
This value must match an existing migration-center object type definition and of course a content type in the target SharePoint document library.
s__createdBy
Name of the user who created the folder. The value must be a unique username.
Example: MikeGration@asdf.onmicrosoft.com
Optional
If not provided, the system account will be set.
s__createdDate
Date and time when the folder was created.
Example: 2020-03-31 14:14:03
Optional
If not provided, the current date and time will be set.
s__modifiedBy
Name of the user who last modified the folder. The value must be a unique username.
Example: MikeGration@asdf.onmicrosoft.com
Optional
If not provided, the system account will be set.
s__modifiedDate
Date and time when the folder was last modified.
Example: 2020-03-31 14:14:03
Optional
If not provided, the current date and time will be set.
s__name*
The name of the folder.
Example: My Folder
Mandatory
s__parentFolder*
The parent folder of the folder.
Example: /My subfolder
Mandatory
The parent folder is relative to the base folder configured in the importer definition, e.g. if the base folder is /Shared Documents and the parent folder is /My subfolder, the effective folder path of the folder will be /Shared Documents/My subfolder/My Folder
s__roleAssignments
Specify role assignments for the current object. If a role definition is assigned to the current object, the Importer breaks the role inheritance.
It is possible to specify a list of users and/or a list of SharePoint groups. If a group is specified, the targeted SharePoint group must exist in your SharePoint site.
To set role assignments for single users, the value must match the following pattern:
<loginname>;#<roledefinitionname>
To set role assignments for SharePoint groups, the value must match the following pattern:
@<groupname>;#<roledefinitionname>
Example:
user;#Read
@Contributors;#Contribute
Optional
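The two s__roleAssignments patterns (for users and for SharePoint groups) can be built and parsed with a few lines of code. The following Python snippet is a minimal sketch under the assumption that each attribute value holds exactly one assignment; the names RoleAssignment, format_role_assignment and parse_role_assignment are hypothetical and not part of migration-center.

```python
from typing import NamedTuple


class RoleAssignment(NamedTuple):
    principal: str  # login name or SharePoint group name
    role: str       # role definition name, e.g. "Read"
    is_group: bool  # True if the principal is a SharePoint group


def format_role_assignment(assignment: RoleAssignment) -> str:
    """Build '<loginname>;#<role>' or '@<groupname>;#<role>'."""
    prefix = "@" if assignment.is_group else ""
    return f"{prefix}{assignment.principal};#{assignment.role}"


def parse_role_assignment(value: str) -> RoleAssignment:
    """Split a single role assignment value into its parts."""
    principal, separator, role = value.strip().partition(";#")
    is_group = principal.startswith("@")
    name = principal[1:] if is_group else principal
    if not separator or not name or not role:
        raise ValueError(f"Not a valid role assignment: {value!r}")
    return RoleAssignment(name, role, is_group)
```

For example, parse_role_assignment("@Contributors;#Contribute") yields a group assignment for the group Contributors with the role definition Contribute.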
Configuration parameters
Values
tenantName*
The name of your SharePoint Online Tenant
Example: Contoso
Mandatory
Several websites explain how to determine a SharePoint Online tenant name, e.g. https://morgantechspace.com/2019/07/how-to-find-your-tenant-name-in-office-365.html
tenantUrl*
The URL of your SharePoint Online Tenant
Examples: https://contoso.sharepoint.com https://contoso-my.sharepoint.com (for OneDrive)
Mandatory
siteName*
The path to your target site for the import.
Examples: /sites/My Site /personal/my_name_company_domain (for OneDrive)
Mandatory
documentLibraryName*
The name of your target document library in the specified site.
Example: Shared Documents
Mandatory
The name of the document library might depend on your site's locale settings, e.g. it might be Dokumente for a German locale.
importLocation*
The base folder for your import in the specified document library.
Example: /The/documents/go/here
Mandatory
If you want to import into the root folder of the document library, just specify “/” as your import location.
appClientId*
The ID of either the migration-center Azure AD application or the SharePoint application.
Example: ab187da0-c04d-4f82-9f43-51f41c0a3bf0
Mandatory
appCertificatePath
The full path to the certificate .PFX file, which you have generated when setting up the Azure AD application.
Example: D:\migration-center\config\azure-ad-app-cert.pfx
Mandatory
appCertificatePassword
The password to read the certificate specified in appCertificatePath.
Mandatory
appClientSecret
The client secret, which you have generated when setting up the SharePoint application (SharePoint app-only principal authentication).
Mandatory
proxyServer
This is the URL, which defines the location of your proxy to connect to the Internet.
Optional
Just leave blank if no proxy is used.
proxyUsername
The login of the user, who connects and authenticates on the proxy, which was specified in parameter proxyServer.
Example: corporatedomain\username
Optional
Just leave blank if no proxy is used.
proxyPassword
Password of the proxy user specified above.
Optional
Just leave blank if no proxy is used.
autoCreateFolder*
Enable/disable whether folders, which do not exist, should be created during import.
Enabled: The importer creates any specified folders automatically.
Disabled: No new folders will be created, but any existing folders (same path and name) are used. References to non-existing folders will throw an error.
autoVersioning*
Enable/disable whether version numbers should be generated automatically by the importer.
Enabled: The importer will generate consecutive version numbers for the versions of a document based on the versioning setting of the target library.
Disabled: The user must provide appropriate version numbers for the documents using the "s__versionNumber" system attribute.
numberOfThreads*
Number of batches that shall be imported in parallel.
Default: 8
Mandatory
You may increase this value for large imports, depending on the configuration of the machine that runs the job server.
loggingLevel*
Sets the verbosity of the log file.
Allowed values:
1 - logs only errors during import
2 - logs all warnings and errors (default value)
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only; use only if instructed by fme product support, since it generates a very large amount of output. Do not use in production)
Mandatory
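The importer accepts two kinds of credentials alongside appClientId: a certificate (appCertificatePath and appCertificatePassword) for the Azure AD application, or a client secret (appClientSecret) for the SharePoint application. The following Python snippet is a minimal sketch of a pre-flight check for these parameters. The function name detect_auth_mode and the returned labels are hypothetical, and treating a configuration that sets both kinds of credentials as an error is an assumption of this sketch, not documented importer behavior.

```python
def detect_auth_mode(params: dict) -> str:
    """Return 'certificate' or 'client_secret' for a parameter dictionary."""
    if not params.get("appClientId"):
        raise ValueError("appClientId is mandatory")

    has_path = bool(params.get("appCertificatePath"))
    has_password = bool(params.get("appCertificatePassword"))
    has_secret = bool(params.get("appClientSecret"))

    # A certificate needs both the .PFX path and its password.
    if has_path != has_password:
        raise ValueError(
            "appCertificatePath and appCertificatePassword must be set together"
        )
    # Assumption of this sketch: exactly one kind of credential is expected.
    if has_path and has_secret:
        raise ValueError("Set either the certificate or the client secret")
    if has_path:
        return "certificate"
    if has_secret:
        return "client_secret"
    raise ValueError("No certificate and no client secret configured")
```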
Configuration parameters
Values
Name
The unique name for the importer
Mandatory
Adapter type
Select the “OpenText” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server exists in migration-center, the application will prompt the user to define a Job Server Location when saving the Importer.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
username*
The OpenText Content Server user with “System Administration Rights”
Note:
“System Administration Rights” are required to internally allow “Impersonating” other users as the individual owners of the objects to be imported.
Use either the built-in user otadmin@otds.admin if Content Server is deployed under OpenText “Runtime and Core Services”, or the user “Admin”, which already has “System Administration Rights”. If you have to use another user, grant this privilege to that user in the OpenText Content Server User Administration.
This user is also internally used for assigning permissions based on the MC System Rule “ACLs”
password*
The user’s password.
authenticationMode*
The OpenText Content Server authentication mode.
Valid values are:
CWS for regular Content Server authentication
RCS for authentication of OpenText Runtime and Core Services
RCSCAP for authentication via Common Authentication Protocol over Runtime and Core Services
Note: If this version of the OpenText Content Server Import Adapter is used together with “Extended ECM for SAP Solutions”, then ‘authenticationMode’ has to be set to “RCS”, since OpenText Content Server together with “Extended ECM for SAP Solutions” is deployed under “Runtime and Core Services”. For details of the individual authentication mechanisms and scenarios provided by OpenText, see the appropriate documentation at OpenText KnowledgeCenter.
webserviceURL*
The URL of the OpenText Content Web Services. Example: http://server:port/les-services/services/Authentication
classificationsWebServiceURL
The URL of the Classification WebService. Required when classifications need to be set on the imported objects. Example: http://server:port/les-classifications/services/Classifications
rmWebserviceURL
The URL of the Record Management WebService. Required when records management classifications need to be set on the imported objects.
rcsAuthenticationWebserviceURL
The URL of the Authentication Service for RCS. It must be set only when RCS or RCSCAP is used.
physicalObjectsWebserviceUrl
The URL of the Physical Objects WebService. Required when physical items need to be imported. Example: http://server:port/les-physicalObjects/services/PhysicalObjects
rootFolder*
The internal node id of the Content Server root folder where the content of the imported objects will be stored.
Note: The individual location for each object to be imported below this “rootfolder” is defined by MC System Rule “ParentFolder”
autoCreateFolders
Check this if you want to automatically create missing folders (Subtype 0) while importing objects to OpenText Content Server.
inheritFolderCategories
When enabled, the imported folders will inherit the categories from the parent folder. When the folders are created by “autoCreateFolders” functionality, the auto created folders will also inherit the categories from the parent folder.
When not enabled the categories will not be inherited by the created folders.
inheritDocumentCategories
When enabled, the categories from the parent folder will be assigned to the imported documents.
The following rules apply:
The importer will not inherit the categories that have been assigned to the documents in the migration-center.
The folder categories (other than the ones defined for documents in migration-center) will be inherited by the imported documents.
excludeFolderCategories
The list of folder categories (list the names of the categories) that will not be inherited from the parent folder by the documents and folders when “inheritFolderCategories” or “inheritDocumentCategories” parameter is activated. If the parameter is left empty, all folder categories will be inherited from the parent folder.
Format: String
Delimiter: |
inheritFolderPermissions
When enabled, the folder permissions together with the permissions set in the “ACLs” rule will be applied to the documents. If disabled, only the permissions set in the “ACLs” rule will be applied to the documents.
classRootFolder
The internal node id of the Classification root folder. This is required when setting classifications.
extractMetadataFromEmail
When checked, the related metadata of email type object will be extracted from the msg file. If not, the email metadata will be mapped from source attributes.
extractAttachmentsFromEmail
When checked, the email attachments will be extracted as XReference into the Content Server. The attachments will be created as distinct objects in the same location as the email.
crossRefCode
The value of this parameter must match the value set in the Content Server configuration (Records Management -> Records Management Administration -> System Settings -> RM Settings -> Email Cross-Reference).
It becomes mandatory if extractAttachmentsFromEmail is checked.
checkContentIntegrity
When checked the content integrity will be checked after the import based on the checksum calculated during scan.
hashAlgorithm
Specify the hash algorithm that will be used for checking the content integrity. It is mandatory when “checkContentIntegrity” is checked and must be the same algorithm that was used during the scan. Valid values are: MD2, MD5, SHA-1, SHA-224, SHA-256, SHA-384, SHA-512
hashEncoding
Specify the hash encoding that will be used for checking the content integrity. It is mandatory when “checkContentIntegrity” is checked and must be the same encoding that was used during the scan. Valid values are: HEX, Base32, Base64
moveDocumentsToCD
When checked, the documents that are children of compound documents are moved from the folder where they were initially imported into the compound document. When not checked, the child document remains in its original location, but a shortcut pointing to the compound document is created.
numberOfThreads
The number of threads that will be used for importing the documents. Maximum allowed is 20.
Note: Due to their hierarchical structure the Folders will be created using a single thread.
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during import
2 - logs all warnings and errors (default value)
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only; use only if instructed by fme product support, since it generates a very large amount of output. Do not use in production)
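To illustrate what the checkContentIntegrity, hashAlgorithm and hashEncoding parameters refer to, the following Python snippet computes a checksum of a content file for a given algorithm and encoding. It is only a sketch using the Python standard library, not the importer's implementation. The function name content_checksum is hypothetical, MD2 is omitted because it is not guaranteed to be available in Python's hashlib, and whether migration-center writes HEX values in lower or upper case is not stated here, so compare case-insensitively if in doubt.

```python
import base64
import hashlib

# Mapping from the documented hashAlgorithm values to hashlib names.
ALGORITHMS = {
    "MD5": "md5",
    "SHA-1": "sha1",
    "SHA-224": "sha224",
    "SHA-256": "sha256",
    "SHA-384": "sha384",
    "SHA-512": "sha512",
}


def content_checksum(path: str, hash_algorithm: str, hash_encoding: str) -> str:
    """Compute the checksum of a file in the requested encoding."""
    if hash_algorithm not in ALGORITHMS:
        raise ValueError(f"Unsupported hash algorithm: {hash_algorithm}")
    digest = hashlib.new(ALGORITHMS[hash_algorithm])
    with open(path, "rb") as handle:
        # Read in chunks so that large content files do not fill the memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    raw = digest.digest()
    if hash_encoding == "HEX":
        return raw.hex()
    if hash_encoding == "Base32":
        return base64.b32encode(raw).decode("ascii")
    if hash_encoding == "Base64":
        return base64.b64encode(raw).decode("ascii")
    raise ValueError(f"Unsupported hash encoding: {hash_encoding}")
```

A checksum computed this way only matches the one calculated during the scan if the same algorithm and the same encoding are used in both places.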
Attribute Name
Multivalue
Mandatory
Description
Area
No
No
Responsible for assigning an existing area code to the physical item.
Client
No
No
The username to be set as Client to the physical item.
Facility
No
No
Responsible for assigning a valid facility code (that allow users to track boxes that are sent to off-site storage facilities) to the physical item.
HomeLocation
No
Yes
Responsible for assigning a value to attribute 'HomeLocation'. This value can be from location table or a new value specified by user.
Keywords
Yes
No
Responsible for assigning values to target object's 'Keywords' attribute.
Locator
No
No
Responsible for assigning a valid Locator name to the physical item. The Locator should be predefined in Content Server
LocatorType
No
No
Responsible for assigning an existing locator type code from Locator Type Table to the physical item.
Name
No
Yes
Responsible for assigning the Name of the physical item.
OffsiteStorageID
No
No
Responsible for assigning a valid value to 'Offsite Storage ID' attribute.
ParentFolder
No
Yes
Responsible for holding the folder path where the physical item will be imported.
PhysicalBoxPath
No
No
Responsible for holding the path to the Physical Box where the item will be assigned. If it’s not set, the physical object is imported to the folder specified in the “ParentFolder” but it is not assigned to any box.
PhysicalItemType
No
Yes
This attribute specifies the Physical Item Type for physical object. The value should be already defined in Content Server.
ReferenceRate
No
No
Responsible for assigning a valid reference rate code to the target type. The value should be already defined in Reference Rate table.
TemporaryID
No
No
Responsible for assigning a valid value to 'TemporaryID' attribute.
UniqueId
No
No
Responsible for assigning a unique value to the 'Unique ID' attribute. This value should be unique across the entire OpenText Content Server.
Attribute Name
Available for migset
Multivalue
Mandatory
Description
ACLs
…OpenText(container)
…OpenText(document)
Yes
No
Responsible for assigning specific permissions for the target object.
Classifications
…OpenText(container)
…OpenText(document)
Yes
No
Responsible for assigning valid Content Server Classifications to the target object.
ContentName
…OpenText(container)
No
Yes
Responsible for assigning the internal target version filename to be uploaded.
Impersonate-User
…OpenText(container)
…OpenText(document)
No
No
Responsible for assigning the owner to the object imported to Content Server.
Note: Since this attribute holds the owner of the object to be imported, this user must already have been granted at least the 'Add Items' permission on the target location in Content Server before any import operation is executed.
mc_content_location
…OpenText(document)
No
No
Can be set with the path where the content of the document is located. If not set, the content will be picked up from the original location where it was exported by the scanner.
Name
…OpenText(container)
…OpenText(document)
No
Yes
Responsible for assigning valid Content Server Name to the target object.
ParentFolder
…OpenText(container)
…OpenText(document)
No
Yes
Responsible for holding the folder path where the target object will be imported.
RenditionTypes
…OpenText(container)
Yes
No
Responsible for controlling the creation of rendition types for the object to be imported, based on the given source rendition types.
Note: This MC System Rule only triggers the creation of the valid rendition on the Content Server side, so no source rendition will be uploaded!
Shortcuts
…OpenText(container)
…OpenText(document)
Yes
No
Responsible for creating shortcuts to the imported object in the specified folder.
target_type
…OpenText(container)
…OpenText(document)
Yes
Yes
Responsible for assigning the main target type and OpenText Categories
VersionControl
…OpenText(document)
No
Yes
Responsible for controlling the versioning type (Linear, Minor or Major) for the imported document.
Element
Possible Values
Description
ACLType
Owner
OwnerGroup
Public
ACL
Owner is responsible for setting owner permissions of the Content Server object
OwnerGroup is responsible for setting owner group permissions of the Content Server object
Public is responsible for setting public permissions of the Content Server object
ACL is responsible for setting specific permissions for a user or group of the Content Server object
RightName
Content Server User Login Name
Content Server Group Name
-1
Use a valid Content Server User Login Name or Group Name for ACLType "Owner", "OwnerGroup" and "ACL". Use -1 for ACLType "Public".
Permissions
See
SeeContents
Modify
EditAtts
CreateNode
Checkout
DeleteVersions
Delete
EditPerms
Use a valid combination of permissions concatenated by |.
Note: You must ensure that the given combination of permissions is valid. For example, setting only "Delete" would not be valid for Content Server, since the "Delete" permission also requires at least "See", "SeeContents", "Modify" and "DeleteVersions".
A correct and valid ACLs value would be then:
ACL#csuser#See|SeeContents|Modify|DeleteVersions|Delete
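The example value above follows the structure ACLType#RightName#Permissions, with the permissions concatenated by |. The following Python snippet is a minimal sketch of how such a value could be assembled and checked before it is used in the “ACLs” rule. The function name build_acl_value is hypothetical; only the "Delete" prerequisite described in the note is checked, and other permission dependencies that Content Server may enforce are not covered.

```python
ACL_TYPES = ("Owner", "OwnerGroup", "Public", "ACL")

# Permission names in the order in which they are listed in the table.
PERMISSIONS = (
    "See", "SeeContents", "Modify", "EditAtts", "CreateNode",
    "Checkout", "DeleteVersions", "Delete", "EditPerms",
)

# According to the note, "Delete" needs at least these permissions as well.
DELETE_PREREQUISITES = {"See", "SeeContents", "Modify", "DeleteVersions"}


def build_acl_value(acl_type: str, right_name: str, permissions) -> str:
    """Build a value of the form ACLType#RightName#Perm1|Perm2|..."""
    requested = set(permissions)
    if acl_type not in ACL_TYPES:
        raise ValueError(f"Unknown ACLType: {acl_type}")
    if acl_type == "Public" and right_name != "-1":
        raise ValueError('Use -1 as RightName for ACLType "Public"')
    unknown = requested - set(PERMISSIONS)
    if unknown:
        raise ValueError(f"Unknown permissions: {sorted(unknown)}")
    if "Delete" in requested and not DELETE_PREREQUISITES <= requested:
        raise ValueError('"Delete" requires See, SeeContents, Modify, DeleteVersions')
    # Keep the table order so that the generated values are stable.
    joined = "|".join(name for name in PERMISSIONS if name in requested)
    return f"{acl_type}#{right_name}#{joined}"
```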
Configuration parameters
Values
Name
Enter a unique name for this importer
Mandatory
Adapter type
Select SharePoint from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server has been created by the user to this point, migration-center will prompt the user to define a Job Server Location when saving the Importer.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
authenticationMethod
Method used for authentication against SharePoint. If your SP is setup to directly authenticate your users, enter "direct" and provide user name and password in the corresponding parameters. If your SP is setup to use ADFS authentication, enter "adfs" and provide values for the user name, password, domain, adfsBaseUrl, adfsEndpoint, and adfsRealm parameters.
Mandatory
serverURL
This is the URL to the root site collection of your SharePoint environment.
Mandatory
adfsBaseUrl
ADFS login page URL. Depends on your ADFS configuration. Example: https://adfs.foo.bar
adfsEndpoint
ADFS Federation Service endpoint URL. Depends on your ADFS configuration. Example: https://foo.bar/_trust/
adfsRealm
Realm for relying party identifier. Depends on your ADFS configuration. Example: https://foo.bar
username
The SharePoint user on whose behalf the import process will be executed.
Should be a SharePoint site administrator.
Example: sharepoint\administrator
Mandatory
Note: If you import into a SharePoint sub-site, this user needs Full Control permission on the root site as well!
password
Password of the user specified above
Mandatory
autoAdjustVersioning
Enable/Disable whether the lists/libraries should be updated to allow versions imports.
Enabled: The importer will update the lists/libraries if needed.
Disabled: If the feature is needed during the import but is not allowed by the target lists/libraries, an error will be thrown.
autoAdjustAttachments
Enable/Disable whether the lists should be updated to allow attachments imports.
Enabled: The importer will update the lists if needed.
Disabled: If the feature is needed during the import but is not allowed by the target lists, an error will be thrown.
autoAddContentTypes
Enable/Disable whether the lists/libraries should be updated to allow items with custom content type imports.
Enabled: The importer will update the lists if needed.
Disabled: If the feature is needed during the import but is not allowed by the target lists/libraries, an error will be thrown.
autoCreateFolders
Enable/disable whether folders which do not exist should be created during import.
Enabled: the importer creates any specified folders automatically.
Disabled: no new folders are created, but any existing folders (same path and name) are used. References to non-existing folders throw an error.
proxyURL
This is the URL, which defines the location of your proxy to connect to the Internet. This parameter can be left blank if no proxy is used to connect to the internet.
proxyUsername
The login of the user, who connects and authenticates on the proxy, which was specified in parameter proxyURL.
Example: corporatedomain\username
proxyPassword
Password of the proxy user specified above
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during import
2 - logs all warnings and errors (default value)
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only; use only if instructed by fme product support, since it generates a very large amount of output. Do not use in production)
Mandatory
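Which parameters have to be filled depends on the value of authenticationMethod. The following Python snippet is a minimal sketch that lists the parameters still missing for the chosen method, based on the description of authenticationMethod above. The function name missing_parameters is hypothetical, and the parameter name domain is taken from that description, although it has no row of its own in the table.

```python
# Parameters required per authentication method, as described for
# the authenticationMethod parameter.
REQUIRED_BY_METHOD = {
    "direct": ("serverURL", "username", "password"),
    "adfs": (
        "serverURL", "username", "password", "domain",
        "adfsBaseUrl", "adfsEndpoint", "adfsRealm",
    ),
}


def missing_parameters(params: dict) -> list:
    """Return the names of required parameters that are empty or absent."""
    method = params.get("authenticationMethod", "")
    if method not in REQUIRED_BY_METHOD:
        raise ValueError('authenticationMethod must be "direct" or "adfs"')
    return [name for name in REQUIRED_BY_METHOD[method] if not params.get(name)]
```

An empty list means that all parameters required for the selected method are present.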
Configuration parameters
Values
contentType
Rule setting the content type.
Example: Task
Mandatory
This value must match existing migration-center object type definitions; see paragraph Object Type definitions
copyRoleAssignments
Specify if roles from the parent object are copied or not. This system rule is necessary only if the system rule roleAssignments is set, otherwise it is not used.
By default, this value is set to false.
fileExtension
Specify the file extension of the file name that is used to upload the content file to SharePoint.
See also section Object Values
isMajorVersion
Specify, if the current object will be created as a major or minor version in the SharePoint target library.
Example:
TRUE or YES or 1 (check in as major version)
FALSE or NO or 0 (check in as minor version)
library
Specify the name of the library, where to import the current object
Mandatory
mc_content_location
Specify the location of a document’s content file.
If not set, the default content location (i.e. where the document has been scanned from) will be used automatically.
Set a different path (including filename) if the content has moved since it was scanned, or if content should be substituted with another file
parentFolder
Rule which sets the path for the current object inside the targeted Document Library item
Example: /username/myfolder/folder
Mandatory
This rule must contain a value. If the current object shall be imported on root level of the targeted library, specify a forward slash “/”
roleAssignments
Specify role assignments for the current object. If a role definition is assigned to the current object, the migration-center SharePoint Importer breaks the role inheritance.
It is possible to specify either a list of users and/or a list of SharePoint groups. If a group is specified, it is necessary, that the targeted SharePoint Group exists in your SharePoint Site.
To set role assignments for single users, the value must match the following pattern:
<loginname>;#roledefinitionname
To set role assignments for SharePoint groups, the value must match the following pattern:
@<group name>;#roledefinitionname
Example: user;#Read
@Contributors;#Contribute
site
Specify the target (sub-) site relative to your site collection, which was specified in the SharePoint Importer adapter parameters.
Mandatory
This rule must contain a value. If the current object shall be imported on root level of the targeted site collection, specify a forward slash “/”
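The isMajorVersion rule accepts several spellings for the same flag. The following Python snippet is a minimal sketch of how transformation output could be checked against the values listed for isMajorVersion. The function name is_major_version is hypothetical, and accepting lower-case spellings is an assumption of this sketch.

```python
MAJOR_VALUES = {"TRUE", "YES", "1"}   # check in as major version
MINOR_VALUES = {"FALSE", "NO", "0"}   # check in as minor version


def is_major_version(value: str) -> bool:
    """Interpret an isMajorVersion value; raise for anything else."""
    text = value.strip().upper()
    if text in MAJOR_VALUES:
        return True
    if text in MINOR_VALUES:
        return False
    raise ValueError(f"Not a valid isMajorVersion value: {value!r}")
```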
Configuration parameters
Values
contentType
Rule setting the content type.
Example: Task
Mandatory
This value must match existing migration-center object type definitions; see paragraph Object Type definitions
library
Specify the name of the library, where to import the current object
Mandatory
name
Rule which sets the name of the current folder inside the targeted Document Library item
Example: /my first folder
Mandatory
parentFolder
Rule which sets the path for the current object inside the targeted Document Library item
Example: /username/myfolder/folder
Mandatory
This rule must contain a value. If the current object shall be imported on root level of the targeted library, specify a forward slash “/”
copyRoleAssignments
Specify if roles from the parent object are copied or not. This system rule is necessary only if the system rule roleAssignments is set, otherwise it is not used.
By default, this value is set to false.
roleAssignments
Specify role assignments for the current object. If a role definition is assigned to the current object, the migration-center SharePoint Importer breaks the role inheritance.
It is possible to specify either a list of users and/or a list of SharePoint groups. If a group is specified, it is necessary, that the targeted SharePoint Group exists in your SharePoint Site.
To set role assignments for single users, the value must match the following pattern:
<loginname>;#roledefinitionname
To set role assignments for SharePoint groups, the value must match the following pattern:
@<group name>;#roledefinitionname
Example: user;#Read
@Contributors;#Contribute
site
Specify the target (sub-) site relative to your site collection, which was specified in the SharePoint Importer adapter parameters.
Mandatory
This rule must contain a value. If the current object shall be imported on root level of the targeted site collection, specify a forward slash “/”
Configuration parameters
Values
contentType
Rule setting the content type.
Example: Task
Mandatory
This value must match existing migration-center object type definitions; see paragraph Object Type definitions
isMajorVersion
Specify, if the current object will be created as a major or minor version in the SharePoint target library.
Example:
TRUE or YES or 1 (check in as major version)
FALSE or NO or 0 (check in as minor version)
library
Specify the name of the library, where to import the current object
Mandatory
parentFolder
Rule which sets the path for the current object inside the targeted Document Library item
Example: /username/myfolder/folder
Mandatory
This rule must contain a value. If the current object shall be imported on root level of the targeted library, specify a forward slash “/”
relationLinkField
Rule which sets a list of column names, where to insert links to AttachmentRelation associated with the current object. This rule can be left blank, if no links shall be inserted.
copyRoleAssignments
Specify if roles from the parent object are copied or not. This system rule is necessary only if the system rule roleAssignments is set, otherwise it is not used.
By default, this value is set to false.
roleAssignments
Specify role assignments for the current object. If a role definition is assigned to the current object, the migration-center SharePoint Importer breaks the role inheritance.
It is possible to specify either a list of users and/or a list of SharePoint groups. If a group is specified, it is necessary, that the targeted SharePoint Group exists in your SharePoint Site.
To set role assignments for single users, the value must match the following pattern:
<loginname>;#roledefinitionname
To set role assignments for SharePoint groups, the value must match the following pattern:
@<group name>;#roledefinitionname
Example: user;#Read
@Contributors;#Contribute
site
Specify the target (sub-) site relative to your site collection, which was specified in the SharePoint Importer adapter parameters.
Mandatory
This rule must contain a value. If the current object shall be imported on root level of the targeted site collection, specify a forward slash “/”
Configuration parameters
Values
baseTemplate
Rule setting the base template for this list. A list of valid values can be found in the appendix.
Mandatory
This value must match existing migration-center object type definitions; see paragraph Object Type definitions
copyRoleAssignments
Specify if roles from the parent object are copied or not. This system rule is necessary only if the system rule roleAssignments is set, otherwise it is not used.
By default, this value is set to false.
roleAssignments
Specify role assignments for the current object. If a role definition is assigned to the current object, the migration-center SharePoint Importer breaks the role inheritance.
It is possible to specify either a list of users and/or a list of SharePoint groups. If a group is specified, it is necessary, that the targeted SharePoint Group exists in your SharePoint Site.
To set role assignments for single users, the value must match the following pattern:
<loginname>;#roledefinitionname
To set role assignments for SharePoint groups, the value must match the following pattern:
@<group name>;#roledefinitionname
Example: user;#Read
@Contributors;#Contribute
site
Specify the target (sub-) site relative to your site collection, which was specified in the SharePoint Importer adapter parameters.
Mandatory
This rule must contain a value. If the current object shall be imported on root level of the targeted site collection, specify a forward slash “/”
title
Specify the title of the list or library, which will be created
mc_content_location
Name
fileExtension
Result filename
-
-
-
content.dat
\\Fileshare\Migration\Conversion\invoice.pdf
-
-
invoice.pdf
-
MyContent
-
MyContent.dat
-
MyContent
MyContent.pdf
-
-
content.pdf
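The rows above suggest how the result filename is derived from the three rules. The following is a minimal sketch of that derivation, not the importer's actual implementation. The function name `result_filename` is hypothetical. It assumes that Name takes precedence over the file name in mc_content_location (the table shows no row with both set), that a fileExtension value of `pdf` applies to the last two rows, and that `content` and `.dat` are the fallbacks.

```python
import ntpath
from typing import Optional


def result_filename(mc_content_location: Optional[str] = None,
                    name: Optional[str] = None,
                    file_extension: Optional[str] = None) -> str:
    """Hypothetical reconstruction of the naming behaviour shown in the table."""
    # Base name: explicit Name first (assumed precedence), then the file name
    # taken from mc_content_location, then the literal "content".
    if name:
        candidate = name
    elif mc_content_location:
        candidate = ntpath.basename(mc_content_location)
    else:
        candidate = "content"

    stem, ext = ntpath.splitext(candidate)
    if file_extension:
        # An explicit fileExtension replaces any extension already present.
        ext = "." + file_extension.lstrip(".")
    elif not ext:
        # No extension available anywhere: fall back to ".dat".
        ext = ".dat"
    return stem + ext
```

Calling `result_filename(name="MyContent")` yields `MyContent.dat`, matching the third row of the table.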
Template Name
Template Identifier
Description
GenericList
100
A basic list which can be adapted for multiple purposes.
DocumentLibrary
101
Contains a list of documents and other files.
Survey
102
Fields on a survey list represent questions that are asked of survey participants. Items in a list represent a set of responses to a particular survey.
Links
103
Contains a list of hyperlinks and their descriptions.
Announcements
104
Contains a set of simple announcements.
Contacts
105
Contains a list of contacts used for tracking people in a site.
Events
106
Contains a list of single and recurring events. An events list typically has special views for displaying events on a calendar.
Tasks
107
Contains a list of items that represent completed and pending work items.
DiscussionBoard
108
Contains discussion topics and their replies.
PictureLibrary
109
Contains a library adapted for storing and viewing digital pictures.
DataSources
110
Contains data connection description files (hidden).
XmlForm
115
Contains XML documents. An XML form library can also contain templates for displaying and editing XML files via forms, as well as rules for specifying how XML data is converted to and from list items.
NoCodeWorkflows
117
Contains additional workflow definitions that describe new processes that can be used within lists. These workflow definitions do not contain advanced code-based extensions (hidden).
WorkflowProcess
118
Contains a list used to support execution of custom workflow process actions (hidden).
WebPageLibrary
119
Contains a set of editable Web pages.
CustomGrid
120
Contains a set of list items with a grid-editing view.
WorkflowHistory
140
Contains a set of history items for instances of workflows.
GanttTasks
150
Contains a list of tasks with specialized Gantt views of task data.
IssueTracking
1100
Contains a list of items used to track issues.
Attribute
Type
Description
ContentTypesEnabled
Boolean
Gets or sets a value that specifies whether content types are enabled for the list.
DefaultContentApprovalWorkflowId
String
Gets or sets a value that specifies the default workflow identifier for content approval on the list. Returns an empty GUID if there is no default content approval workflow.
DefaultDisplayFormUrl
String
Gets or sets a value that specifies the location of the default display form for the list. Clients specify a server-relative URL, and the server returns a site-relative URL.
DefaultEditFormUrl
String
Gets or sets a value that specifies the URL of the edit form to use for list items in the list. Clients specify a server-relative URL, and the server returns a site-relative URL.
DefaultNewFormUrl
String
Gets or sets a value that specifies the location of the default new form for the list. Clients specify a server-relative URL, and the server returns a site-relative URL.
Description
String
Gets or sets a value that specifies the description of the list.
Direction
String
Gets or sets a value that specifies the reading order of the list. Returns "NONE", "LTR", or "RTL".
DocumentTemplateUrl
String
Gets or sets a value that specifies the server-relative URL of the document template for the list. Returns a server-relative URL if the base type is DocumentLibrary, otherwise returns null.
DraftVersionVisibility
Number
Gets or sets a value that specifies the minimum permission required to view minor versions and drafts within the list. Represents an SP.DraftVisibilityType value: Reader = 0; Author = 1; Approver = 2.
EnableAttachments
Boolean
Gets or sets a value that specifies whether list item attachments are enabled for the list.
EnableFolderCreation
Boolean
Gets or sets a value that specifies whether new list folders can be added to the list.
EnableMinorVersions
Boolean
Gets or sets a value that specifies whether minor versions are enabled for the list.
EnableModeration
Boolean
Gets or sets a value that specifies whether content approval is enabled for the list.
EnableVersioning
Boolean
Gets or sets a value that specifies whether historical versions of list items and documents can be created in the list.
ForceCheckout
Boolean
Gets or sets a value that indicates whether forced checkout is enabled for the document library.
Hidden
Boolean
Gets or sets a Boolean value that specifies whether the list is hidden. If true, the server sets the OnQuickLaunch property to false.
IrmEnabled
Boolean
IrmExpire
Boolean
IrmReject
Boolean
IsApplicationList
Boolean
Gets or sets a value that specifies a flag that a client application can use to determine whether to display the list.
LastItemModifiedDate
DateTime
Gets a value that specifies the last time a list item, field, or property of the list was modified.
MultipleDataList
Boolean
Gets or sets a value that indicates whether the list in a Meeting Workspace site contains data for multiple meeting instances within the site.
NoCrawl
Boolean
Gets or sets a value that specifies that the crawler must not crawl the list.
OnQuickLaunch
Boolean
Gets or sets a value that specifies whether the list appears on the Quick Launch of the site. If true, the server sets the Hidden property to false.
ValidationFormula
String
Gets or sets a value that specifies the data validation criteria for a list item. Its length must be <= 1023.
ValidationMessage
String
Gets or sets a value that specifies the error message returned when data validation fails for a list item. Its length must be <= 1023.
Configuration parameters
Values
Name
Enter a unique name for this importer
Mandatory
Adapter type
Select the “Veeva Clinical” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server was selected, migration-center will prompt the user to define a Job Server Location when saving the importer.
Mandatory
Description
Enter a description for this importer (optional)
Configuration parameters
Values
username*
Veeva username. It must be a Vault Owner.
Mandatory
password*
The user’s password.
Mandatory
server*
Veeva Vault name. Ex: fme-clinical.veevavault.com
Mandatory
proxyServer
The name or IP of the proxy server if there is any.
proxyPort
The port for the proxy server.
proxyUser
The username if required by the proxy server.
proxyPassword
The password if required by the proxy server.
attributeMappingLocation
The path of the configuration file that will be used for setting the references to the existing master data objects when importing documents. See Setting References to Existing... for more details.
preserveVersionBinding
Indicates if the version binding will be preserved when importing virtual documents from Documentum as binders in Veeva Vault. See Working with Binders for more details.
importRelations
Indicates whether relations between documents will be imported. If checked, the relations between the documents being imported are imported as well. If not checked, the relations are not imported. They can be imported in another run by assigning the same migsets to the importer.
batchSize*
The number of documents or versions that will be imported in a single bulk operation.
Mandatory
skipUploadToFTP
If not checked, the importer copies the document content to the FTP server during import from the location where it was exported by the scanner.
If checked, the importer presumes that the document content was copied to the FTP server prior to starting the import. In this case, all content paths in the system rules mc_content_location, mc_rendition_paths, attachments or submissionFilePath must be set to the corresponding paths on the FTP server.
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during import
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only; use only if instructed by fme product support, since it generates a very large amount of output; do not use in production)
Mandatory
Configuration name
Description
threshold_burst_limit_remaining
Set this to the number of API calls remaining in the current 5-minute burst window at which the importer enters sleep mode and waits for the next burst interval.
Default value = 500
threshold_daily_limit_remaining
Set this to the number of API calls remaining in the current 24-hour daily window at which the importer stops importing objects.
An appropriate message is logged in the report log when the daily limit is reached.
Default value = 50,000
client_id_prefix
The prefix of the Client ID that is passed with every Vault API call. The Client ID is composed of this prefix followed by the ID of the job run. The Client ID is always logged in the report log.
Default value = fme-mc-importer
request_timeout
The time in milliseconds the importer waits for a response after every API call.
Default value = 60,000
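The two threshold settings above can be read as a simple decision rule. The sketch below illustrates that rule under stated assumptions and is not the importer's code. The names `ApiLimitThresholds` and `next_action` are hypothetical, and the use of "less than or equal" for the comparison is an assumption, as is checking the daily limit before the burst limit.

```python
from dataclasses import dataclass


@dataclass
class ApiLimitThresholds:
    # Defaults taken from the configuration table above.
    threshold_burst_limit_remaining: int = 500
    threshold_daily_limit_remaining: int = 50_000


def next_action(burst_remaining: int, daily_remaining: int,
                thresholds: ApiLimitThresholds = ApiLimitThresholds()) -> str:
    """Return "stop", "sleep" or "continue" for the reported remaining calls."""
    # Assumed order: the daily limit is checked first, because waiting for
    # the next burst interval does not help once the daily budget is used up.
    if daily_remaining <= thresholds.threshold_daily_limit_remaining:
        return "stop"
    # Burst threshold reached: pause until the next burst interval.
    if burst_remaining <= thresholds.threshold_burst_limit_remaining:
        return "sleep"
    return "continue"
```

With the default values, `next_action(500, 100_000)` returns `"sleep"`, while `next_action(1_000, 50_000)` returns `"stop"`.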
System Rule
Description
name__v
Must be set to the name of the object being imported. For some objects, the “name__v” field is configured to be generated automatically by Veeva Vault when the object is created. Usually, “name__v” is configured to be unique, so if duplicate values are provided to the importer, the affected objects fail to import.
Optional
object_name
The name of the object. Ex: study__v, study_country__v
Mandatory
target_type
The name of the migration-center internal object type that is used in the association. Ex: veeva-quality-object
Mandatory
attachments
Can be used to attach files to the object, provided the object allows attachments (the "Allow attachments" option must be set to true).
You must provide the full path (UNC or local filesystem on the job server) to the attachment file. Multiple files can be specified.
If the parameter skipUploadToFTP is checked, you must provide the FTP paths to the files.
Optional
System Rule
Description
type__v*
The Type from the Veeva Document Types tree
Mandatory
subtype__v
The SubType from the Veeva Document Types tree
classification__v
The Classification from the Veeva Document Types tree
file_extension
Optional rule for:
- setting the extension of the imported document in case the source content does not have any extension in the filesystem.
- overriding the existing extension from the source content on the filesystem.
lifecycle__v*
The name of the lifecycle that should be attached to the document.
Mandatory
status__v*
The name of the lifecycle status
Mandatory
major_version_number__v
Rule for indicating the major version number
minor_version_number__v
Rule for indicating the minor version number
mc_content_location
Optional rule for importing the content from another location than the one exported by the scanner.
If the skipUploadToFTP parameter is checked in the adapter definition, the adapter assumes that this rule contains the FTP path for every document.
mc_rendition_paths
Path on the filesystem or FTP where the renditions are located.
Must be the same number of values as mc_rendition_types. See Renditions for more details.
mc_rendition_types
The rendition type from the available Veeva types for the specified document type.
Must be the same number of values as mc_rendition_paths.
name__v
The name of the document
parent_section
Optional rule used for specifying the section names in the binders where the documents will be imported.
permissions
Allows setting permissions for the imported document.
The format is the following:
{Role_name}.{USERS or GROUPS}=username1,username2,…
Example:
reviewer__v.users=user1@vvtechpartner.com,user2@vvtechpartner.com
mc_custom_role__c.groups=approver_group__c
attachments
Should be set with the file paths of the attachments to set for the current document.
If the parameter skipUploadToFTP is not checked, the values must contain the filesystem paths of the files; otherwise, the rule must contain the FTP paths of the attachment files.
target_type*
The name of the object type from the migration-center’s object type definitions. Ex: veeva-rim-document
Mandatory
mc_content_location
This rule makes it possible to override the location of the object's content.
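The rules mc_rendition_paths and mc_rendition_types described above must carry the same number of values. The following sketch checks that requirement before an import is attempted. The function name `pair_renditions` is hypothetical, and pairing the values by position is an assumption that the table does not state explicitly.

```python
from typing import List, Tuple


def pair_renditions(mc_rendition_paths: List[str],
                    mc_rendition_types: List[str]) -> List[Tuple[str, str]]:
    """Pair each rendition path with a rendition type (assumed: by position)."""
    if len(mc_rendition_paths) != len(mc_rendition_types):
        raise ValueError(
            "mc_rendition_paths has %d value(s) but mc_rendition_types has %d"
            % (len(mc_rendition_paths), len(mc_rendition_types))
        )
    return list(zip(mc_rendition_paths, mc_rendition_types))
```

A mismatch, such as two paths with a single type, raises a `ValueError` instead of silently dropping a rendition.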
System Rule
Description
type__v*
The Type from the Veeva Document Types tree
Mandatory
subtype__v
The SubType from the Veeva Document Types tree
classification__v
The Classification from the Veeva Document Types tree
lifecycle__v*
Must be set to the name of the lifecycle that will be attached to the document.
Mandatory
status__v*
The name of the lifecycle status
Mandatory
major_version_number__v
Rule for indicating the major version number (does not work for binders)
minor_version_number__v
Rule for indicating the minor version number (does not work for binders)
mc_content_location
Optional rule for importing the content from another location than the one exported by the scanner.
If the skipUploadToFTP parameter is checked in the adapter definition, the adapter assumes that this rule contains the FTP path of the file.
name__v
The name of the document
permissions
Allows setting permissions for the imported document.
The format is the following:
{Role_name}.{USERS or GROUPS}=username1,username2,…
Example:
reviewer__v.users=user1@vvtechpartner.com,user2@vvtechpartner.com
mc_custom_role__c.groups=approver_group__c
template_name
The name of the template the Binder will be created from. Leave it empty to create the binder from scratch.
target_type*
The name of the object type from the migration-center’s object type definitions. Ex: veeva-clinical-document
Mandatory
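The permissions format described above, `{Role_name}.{USERS or GROUPS}=username1,username2,…`, can be checked before import with a small parser. This is a minimal sketch, not part of migration-center. The names `Permission` and `parse_permission` are hypothetical, and treating the USERS/GROUPS keyword as case-insensitive is an assumption based on the lowercase examples.

```python
from typing import List, NamedTuple


class Permission(NamedTuple):
    role: str
    kind: str            # "users" or "groups"
    members: List[str]


def parse_permission(value: str) -> Permission:
    """Split one permissions value into role, kind and member list."""
    target, sep, member_list = value.partition("=")
    # The last dot before "=" separates the role name from USERS/GROUPS.
    role, dot, kind = target.rpartition(".")
    kind = kind.strip().lower()
    if not sep or not dot or not role.strip() or kind not in ("users", "groups"):
        raise ValueError("not a valid permissions value: %r" % value)
    members = [m.strip() for m in member_list.split(",") if m.strip()]
    if not members:
        raise ValueError("no users or groups listed in: %r" % value)
    return Permission(role.strip(), kind, members)
```

For the second example above, `parse_permission("mc_custom_role__c.groups=approver_group__c")` returns the role `mc_custom_role__c`, the kind `groups` and a single member.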
System Rule
Description
application_number
Application number the submission belongs to
submission__v
The submission name
submission_file_path
The path to the zip file containing the submission archive.
If the skipUploadToFTP parameter is set in the adapter definition, the adapter assumes that this rule contains the FTP path of the zip file.
target_type
The name of the object type from the migration-center’s object type definitions: veeva-rim-submission
Configuration parameters
Values
appClientId
The ID of the migration-center Azure AD application.
appCertificatePath
The full path to the certificate .PFX file, which you have generated when setting up the Azure AD application.
appCertificatePassword
The password to read the certificate specified in appCertificatePath.
Configuration parameters
Values
appClientId
The ID of the SharePoint application you have created.
appClientSecret
The client secret, which you have generated when setting up the SharePoint application.
Configuration parameters
Values
Name
Enter a unique name for this importer
Mandatory
Adapter type
Select SharePoint Online from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should be run. Job Servers are defined in the Jobserver window. If no Job Server has been created by the user to this point, migration-center will prompt the user to define a Job Server Location when saving the Importer.
Mandatory
Description
Enter a description for this job (optional)
Configuration parameters
Values
tenantName*
The name of your SharePoint Online Tenant
Example: Contoso
Mandatory
Several websites explain how to determine a SharePoint Online tenant name, e.g. https://morgantechspace.com/2019/07/how-to-find-your-tenant-name-in-office-365.html
tenantUrl*
siteName*
The path to your target site collection for the import.
Example: /sites/My Site
Mandatory
appClientId*
The ID of either the migration-center Azure AD application or the SharePoint application.
Example: ab187da0-c04d-4f82-9f43-51f41c0a3bf0
Mandatory
See section Configuration
appCertificatePath
The full path to the certificate .PFX file, which you have generated when setting up the Azure AD application.
Example: D:\migration-center\config\azure-ad-app-cert.pfx
Mandatory for Azure AD app
See section Configuration
appCertificatePassword
The password to read the certificate specified in appCertificatePath.
Mandatory for Azure AD app
See section Configuration
appClientSecret
The client secret, which you have generated when setting up the SharePoint application (SharePoint app-only principal authentication).
Mandatory for SharePoint app
See section Configuration
autoAdjustVersioning
Enable/disable whether the lists/libraries should be updated to allow version imports.
Enabled: The importer will update the lists/libraries if needed.
Disabled: If the feature is needed in the import process but not allowed by the target lists/libraries, an error will be thrown.
autoAdjustAttachments
Enable/disable whether the lists should be updated to allow attachment imports.
Enabled: The importer will update the lists if needed.
Disabled: If the feature is needed in the import process but not allowed by the target lists/libraries, an error will be thrown.
autoAddContentTypes
Enable/disable whether the lists/libraries should be updated to allow imports of items with custom content types.
Enabled: The importer will update the lists if needed.
Disabled: If the feature is needed in the import process but not allowed by the target lists/libraries, an error will be thrown.
autoCreateFolders
Enable/disable whether folders which do not exist should be created during import.
Enabled: the importer creates any specified folders automatically.
Disabled: no new folders are created, but any existing folders (same path and name) are used. References to non-existing folders throw an error.
proxyURL
This is the URL, which defines the location of your proxy to connect to the Internet. This parameter can be left blank if no proxy is used to connect to the internet.
proxyUsername
The login of the user, who connects and authenticates on the proxy, which was specified in parameter proxyURL.
Example: corporatedomain\username
proxyPassword
Password of the proxy user specified above
checkContentIntegrity
If checked the importer will check the integrity of each document that has a checksum generated by the scanner, by downloading a copy of the content after the import and comparing the two checksums.
Please see chapter Content Integrity Check for more details.
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during import
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only; use only if instructed by fme product support, since it generates a very large amount of output; do not use in production)
Mandatory
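The authentication parameters above form two alternatives: appClientId with a certificate (Azure AD app) or appClientId with a client secret (SharePoint app). The sketch below checks a parameter set for a consistent combination before a job is started. The function name `auth_mode` and the returned labels are hypothetical, and rejecting a configuration that supplies both a certificate and a secret is an assumption, since the table does not say which one would take precedence.

```python
from typing import Mapping


def auth_mode(params: Mapping[str, str]) -> str:
    """Return "azure-ad-app" or "sharepoint-app" for a valid parameter set."""
    if not params.get("appClientId"):
        raise ValueError("appClientId is mandatory")

    cert_path = params.get("appCertificatePath")
    cert_password = params.get("appCertificatePassword")
    secret = params.get("appClientSecret")

    if cert_path or cert_password:
        # Both certificate parameters are mandatory for the Azure AD app.
        if not (cert_path and cert_password):
            raise ValueError("appCertificatePath and appCertificatePassword "
                             "must be set together")
        if secret:
            # Assumption: mixing both authentication variants is an error.
            raise ValueError("set either the certificate or appClientSecret, "
                             "not both")
        return "azure-ad-app"
    if secret:
        return "sharepoint-app"
    raise ValueError("no certificate and no appClientSecret configured")
```

For example, a parameter set containing only `appClientId` and `appClientSecret` is classified as `"sharepoint-app"`.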
Configuration parameters
Values
contentType
Rule setting the content type.
Example: Task
Mandatory
This value must match existing migration-center object type definitions; see paragraph Object Type definition
copyRoleAssignments
Specify if roles from the parent object are copied or not. This system rule is necessary only if the system rule roleAssignments is set, otherwise it is not used.
By default, this value is set to false.
fileExtension
Specify the file extension of the file name that is used to upload the content file to SharePoint.
See section Object Values
isMajorVersion
Specify whether the current object will be created as a major or minor version in the SharePoint Online target library.
Example:
TRUE or YES or 1 (check in as major version)
FALSE or NO or 0 (check in as minor version)
library
Specify the name of the library, where to import the current object
Mandatory
mc_content_location
Specify the location of a document’s content file.
If not set, the default content location (i.e. where the document has been scanned from) will be used automatically.
Set a different path (including filename) if the content has moved since it was scanned, or if content should be substituted with another file
parentFolder
Rule which sets the path for the current object inside the targeted Document Library item
Example: /username/myfolder/folder
Mandatory
This rule must contain a value. If the current object shall be imported on root level of the targeted library, specify a forward slash “/”
roleAssignments
Specify role assignments for the current object. If a role definition is assigned to the current object, the migration-center SharePoint Online Importer breaks the role inheritance.
You can specify a list of users, a list of SharePoint groups, or both. If a group is specified, the targeted SharePoint group must exist in your SharePoint site.
To set role assignments for single users, the value must match the following pattern: <loginname>;#roledefinitionname
To set role assignments for SharePoint groups, the value must match the following pattern: @<group name>;#roledefinitionname
To set role assignments for AD domain groups, you must specify the group's login name, which is: c:0o.c|federateddirectoryclaimprovider|<group guid>
Examples:
spoAdmin@companyname.onmicrosoft.com;#Read
@Contributors;#Contribute
c:0o.c|federateddirectoryclaimprovider|f919916d-15d1-4b13-8cbd-1973d4e50671;#Read
site
Specify the target (sub-) site relative to your site collection, which was specified in the SharePoint Online Importer adapter parameters.
Mandatory
This rule must contain a value. If the current object shall be imported on root level of the targeted site collection, specify a forward slash “/”
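The roleAssignments patterns described above (single user, SharePoint group prefixed with `@`, and AD domain group login name) can be validated with a short parser before the values are used in a migset. This is a minimal sketch and not the importer's own logic. The names `RoleAssignment` and `parse_role_assignment` are hypothetical, and the classification labels are chosen for illustration only.

```python
from typing import NamedTuple

# Login name prefix for AD domain groups, as given in the rule description.
AD_GROUP_PREFIX = "c:0o.c|federateddirectoryclaimprovider|"


class RoleAssignment(NamedTuple):
    principal: str
    principal_type: str   # "user", "sharepoint_group" or "ad_group"
    role_definition: str


def parse_role_assignment(value: str) -> RoleAssignment:
    """Split "<principal>;#<role definition>" and classify the principal."""
    principal, sep, role = value.partition(";#")
    principal, role = principal.strip(), role.strip()
    if not sep or not principal or not role:
        raise ValueError("expected '<principal>;#<role definition>': %r" % value)
    if principal.startswith("@"):
        group_name = principal[1:].strip()
        if not group_name:
            raise ValueError("missing group name in: %r" % value)
        return RoleAssignment(group_name, "sharepoint_group", role)
    if principal.lower().startswith(AD_GROUP_PREFIX):
        return RoleAssignment(principal, "ad_group", role)
    return RoleAssignment(principal, "user", role)
```

Applied to the examples above, `@Contributors;#Contribute` is classified as a SharePoint group with the role definition `Contribute`, and the `spoAdmin@companyname.onmicrosoft.com;#Read` value as a single user.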
Configuration parameters
Values
contentType
Rule setting the content type.
Example: Task
Mandatory
This value must match existing migration-center object type definitions.
library
Specify the name of the library, where to import the current object.
Mandatory
name
Rule which sets the name of the current folder inside the targeted Document Library item
Example: /my first folder
Mandatory
parentFolder
Rule which sets the path for the current object inside the targeted Document Library item.
Example: /username/myfolder/folder
Mandatory
This rule must contain a value. If the current object shall be imported on root level of the targeted library, specify a forward slash “/”.
copyRoleAssignments
Specify if roles from the parent object are copied or not. This system rule is necessary only if the system rule roleAssignments is set, otherwise it is not used.
By default, this value is set to false.
roleAssignments
Specify role assignments for the current object. If a role definition is assigned to the current object, the migration-center SharePoint Online Importer breaks the role inheritance.
You can specify a list of users, a list of SharePoint groups, or both. If a group is specified, the targeted SharePoint group must exist in your SharePoint site.
To set role assignments for single users, the value must match the following pattern:
<loginname>;#roledefinitionname
To set role assignments for SharePoint groups, the value must match the following pattern:
@<group name>;#roledefinitionname
Example:
spoAdmin@companyname.onmicrosoft.com;#Read
@Contributors;#Contribute
site
Specify the target (sub-) site relative to your site collection, which was specified in the SharePoint Online Importer adapter parameters.
Mandatory
This rule must contain a value. If the current object shall be imported on root level of the targeted site collection, specify a forward slash “/”.
Configuration parameters
Values
contentType
Rule setting the content type.
Example: Task
Mandatory
This value must match existing migration-center object type definitions.
isMajorVersion
Specify whether the current object will be created as a major or minor version in the SharePoint Online target library.
Example:
TRUE or YES or 1 (check in as major version)
FALSE or NO or 0 (check in as minor version)
library
Specify the name of the library, where to import the current object
Mandatory
parentFolder
Rule which sets the path for the current object inside the targeted Document Library item
Example: /username/myfolder/folder
Mandatory
This rule must contain a value. If the current object shall be imported on root level of the targeted library, specify a forward slash “/”
relationLinkField
Rule which sets a list of column names in which links to the AttachmentRelation objects associated with the current object are inserted. This rule can be left blank if no links are to be inserted.
copyRoleAssignments
Specify if roles from the parent object are copied or not. This system rule is necessary only if the system rule roleAssignments is set, otherwise it is not used.
By default, this value is set to false.
roleAssignments
Specify role assignments for the current object. If a role definition is assigned to the current object, the migration-center SharePoint Online Importer breaks the role inheritance.
You can specify a list of users, a list of SharePoint groups, or both. If a group is specified, the targeted SharePoint group must exist in your SharePoint site.
To set role assignments for single users, the value must match the following pattern:
<loginname>;#roledefinitionname
To set role assignments for SharePoint groups, the value must match the following pattern:
@<group name>;#roledefinitionname
Example:
spoAdmin@companyname.onmicrosoft.com;#Read
@Contributors;#Contribute
site
Specify the target (sub-) site relative to your site collection, which was specified in the SharePoint Online Importer adapter parameters.
Mandatory
This rule must contain a value. If the current object shall be imported on root level of the targeted site collection, specify a forward slash “/”.
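The isMajorVersion rule above accepts TRUE, YES or 1 for a major version and FALSE, NO or 0 for a minor version. The sketch below normalizes such a value in a transformation script or pre-check. The function name `parse_is_major_version` is hypothetical, and accepting the values regardless of case or surrounding whitespace is an assumption.

```python
def parse_is_major_version(value: str) -> bool:
    """Map the documented isMajorVersion values to a boolean."""
    normalized = value.strip().upper()
    if normalized in {"TRUE", "YES", "1"}:
        return True     # check in as major version
    if normalized in {"FALSE", "NO", "0"}:
        return False    # check in as minor version
    raise ValueError("unsupported isMajorVersion value: %r" % value)
```

Any other value raises a `ValueError`, so a typo in the rule is caught before the import starts.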
Configuration parameters
Values
baseTemplate
Rule setting the base template for this list. A list of valid values can be found in the appendix.
Mandatory
This value must match existing migration-center object type definitions.
copyRoleAssignments
Specify if roles from the parent object are copied or not. This system rule is necessary only if the system rule roleAssignments is set, otherwise it is not used.
By default, this value is set to false.
roleAssignments
Specify role assignments for the current object. If a role definition is assigned to the current object, the migration-center SharePoint Online Importer breaks the role inheritance.
You can specify a list of users, a list of SharePoint groups, or both. If a group is specified, the targeted SharePoint group must exist in your SharePoint site.
To set role assignments for single users, the value must match the following pattern:
<loginname>;#roledefinitionname
To set role assignments for SharePoint groups, the value must match the following pattern:
@<group name>;#roledefinitionname
Example:
spoAdmin@companyname.onmicrosoft.com;#Read
@Contributors;#Contribute
site
Specify the target (sub-) site relative to your site collection, which was specified in the SharePoint Online Importer adapter parameters.
Mandatory
This rule must contain a value. If the current object shall be imported on root level of the targeted site collection, specify a forward slash “/”.
title
Specify the title of the list or library, which will be created.
mc_content_location
Name
fileExtension
Result filename
-
-
-
content.dat
\\Fileshare\Migration\Conversion\invoice.pdf
-
-
invoice.pdf
-
MyContent
-
MyContent.dat
-
MyContent
MyContent.pdf
-
-
content.pdf
Template Name
Template Identifier
Description
GenericList
100
A basic list which can be adapted for multiple purposes.
DocumentLibrary
101
Contains a list of documents and other files.
Survey
102
Fields on a survey list represent questions that are asked of survey participants. Items in a list represent a set of responses to a particular survey.
Links
103
Contains a list of hyperlinks and their descriptions.
Announcements
104
Contains a set of simple announcements.
Contacts
105
Contains a list of contacts used for tracking people in a site.
Events
106
Contains a list of single and recurring events. An events list typically has special views for displaying events on a calendar.
Tasks
107
Contains a list of items that represent completed and pending work items.
DiscussionBoard
108
Contains discussion topics and their replies.
PictureLibrary
109
Contains a library adapted for storing and viewing digital pictures.
DataSources
110
Contains data connection description files (hidden).
XmlForm
115
Contains XML documents. An XML form library can also contain templates for displaying and editing XML files via forms, as well as rules for specifying how XML data is converted to and from list items.
NoCodeWorkflows
117
Contains additional workflow definitions that describe new processes that can be used within lists. These workflow definitions do not contain advanced code-based extensions (hidden).
WorkflowProcess
118
Contains a list used to support execution of custom workflow process actions (hidden).
WebPageLibrary
119
Contains a set of editable Web pages.
CustomGrid
120
Contains a set of list items with a grid-editing view.
WorkflowHistory
140
Contains a set of history items for instances of workflows.
GanttTasks
150
Contains a list of tasks with specialized Gantt views of task data.
IssueTracking
1100
Contains a list of items used to track issues.
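The template names and identifiers listed above can be captured in an enumeration to validate a baseTemplate value before import. This is a minimal sketch. The names `BaseTemplate` and `resolve_base_template` are hypothetical, and because the rule description does not state whether the template name or the numeric identifier is expected, the lookup accepts either form as an assumption.

```python
from enum import IntEnum


class BaseTemplate(IntEnum):
    # Names and identifiers copied from the table above.
    GenericList = 100
    DocumentLibrary = 101
    Survey = 102
    Links = 103
    Announcements = 104
    Contacts = 105
    Events = 106
    Tasks = 107
    DiscussionBoard = 108
    PictureLibrary = 109
    DataSources = 110
    XmlForm = 115
    NoCodeWorkflows = 117
    WorkflowProcess = 118
    WebPageLibrary = 119
    CustomGrid = 120
    WorkflowHistory = 140
    GanttTasks = 150
    IssueTracking = 1100


def resolve_base_template(value: str) -> BaseTemplate:
    """Accept a template name or a numeric identifier; reject anything else."""
    text = value.strip()
    if text.isdigit():
        # Raises ValueError if the identifier is not in the table.
        return BaseTemplate(int(text))
    try:
        return BaseTemplate[text]
    except KeyError:
        raise ValueError("unknown base template: %r" % value) from None
```

Both `resolve_base_template("DocumentLibrary")` and `resolve_base_template("101")` resolve to the same member, `BaseTemplate.DocumentLibrary`.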
Attribute
Type
Description
ContentTypesEnabled
Boolean
Gets or sets a value that specifies whether content types are enabled for the list.
DefaultContentApprovalWorkflowId
String
Gets or sets a value that specifies the default workflow identifier for content approval on the list. Returns an empty GUID if there is no default content approval workflow.
DefaultDisplayFormUrl
String
Gets or sets a value that specifies the location of the default display form for the list. Clients specify a server-relative URL, and the server returns a site-relative URL.
DefaultEditFormUrl
String
Gets or sets a value that specifies the URL of the edit form to use for list items in the list. Clients specify a server-relative URL, and the server returns a site-relative URL.
DefaultNewFormUrl
String
Gets or sets a value that specifies the location of the default new form for the list. Clients specify a server-relative URL, and the server returns a site-relative URL.
Description
String
Gets or sets a value that specifies the description of the list.
Direction
String
Gets or sets a value that specifies the reading order of the list. Returns "NONE", "LTR", or "RTL".
DocumentTemplateUrl
String
Gets or sets a value that specifies the server-relative URL of the document template for the list. Returns a server-relative URL if the base type is DocumentLibrary, otherwise returns null.
DraftVersionVisibility
Number
Gets or sets a value that specifies the minimum permission required to view minor versions and drafts within the list. Represents an SP.DraftVisibilityType value: Reader = 0; Author = 1; Approver = 2.
EnableAttachments
Boolean
Gets or sets a value that specifies whether list item attachments are enabled for the list.
EnableFolderCreation
Boolean
Gets or sets a value that specifies whether new list folders can be added to the list.
EnableMinorVersions
Boolean
Gets or sets a value that specifies whether minor versions are enabled for the list.
EnableModeration
Boolean
Gets or sets a value that specifies whether content approval is enabled for the list.
EnableVersioning
Boolean
Gets or sets a value that specifies whether historical versions of list items and documents can be created in the list.
ForceCheckout
Boolean
Gets or sets a value that indicates whether forced checkout is enabled for the document library.
Hidden
Boolean
Gets or sets a Boolean value that specifies whether the list is hidden. If true, the server sets the OnQuickLaunch property to false.
IrmEnabled
Boolean
IrmExpire
Boolean
IrmReject
Boolean
IsApplicationList
Boolean
Gets or sets a value that specifies a flag that a client application can use to determine whether to display the list.
LastItemModifiedDate
DateTime
Gets a value that specifies the last time a list item, field, or property of the list was modified.
MultipleDataList
Boolean
Gets or sets a value that indicates whether the list in a Meeting Workspace site contains data for multiple meeting instances within the site.
NoCrawl
Boolean
Gets or sets a value that specifies that the crawler must not crawl the list.
OnQuickLaunch
Boolean
Gets or sets a value that specifies whether the list appears on the Quick Launch of the site. If true, the server sets the Hidden property to false.
ValidationFormula
String
Gets or sets a value that specifies the data validation criteria for a list item. Its length must be <= 1023.
ValidationMessage
String
Gets or sets a value that specifies the error message returned when data validation fails for a list item. Its length must be <= 1023.