Alfresco Scanner

Introduction

The Alfresco Scanner extracts objects such as documents, folders and lists and saves this data to migration-center for further processing. The key features of the Alfresco Scanner are:

  • Extract documents, folders, custom lists and list items

  • Extract content and metadata

  • Extract document versions

The following versions of Alfresco are supported (on Windows or Linux): 4.0, 4.1, 4.2, 5.2, 6.1.1, 6.2.0, 7.1, 7.2, 7.3.1.

Known issues & limitations

  • The content of the last version is missing online edits if cm:autoVersion was false and was then switched to true before scanning (#55983)

Install/Uninstall Alfresco Scanner

Alfresco Jobserver

The Alfresco connectors are not included in the standard migration-center Jobserver. Instead, they are delivered as an Alfresco Module Package (.amp) which has to be installed in the Alfresco Repository Server. This amp file contains an entire Jobserver that runs under Alfresco's Tomcat and contains only the Alfresco connectors. To use other connectors, install the regular Server Components as described in the Installation Guide and use that Jobserver.

The installation kit contains two AMP files, one for Java 11 and one for Java 8. Use the correct one depending on the Java version used by the Tomcat that the Alfresco Server is running on.

To use the Alfresco Scanner, your scanner configuration must use the Alfresco Server as a Jobserver, with port 9701 by default.
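Before configuring a scanner it can help to confirm that the Jobserver port on the Alfresco Server is reachable from the client machine. The following is a minimal sketch that only tests whether a TCP connection can be opened. The function name is hypothetical, and the check says nothing about whether the service answering on the port is actually the migration-center Jobserver.

```python
import socket


def jobserver_reachable(host: str, port: int = 9701, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    9701 is the default Jobserver port named in this guide; pass another
    value if your installation uses a different port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `jobserver_reachable("alfresco.example.com")` (a placeholder host name) returns False if the Alfresco Server is stopped or a firewall blocks the port.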

Install Alfresco Scanner

The first step of the installation is to copy the mc-alfresco-adaptor-<version>.amp file into the “amps-folder” of the Alfresco installation.

The last step is to finish the installation by installing the mc-alfresco-adaptor-<version>.amp file as described in the Alfresco documentation: https://docs.alfresco.com/content-services/latest/install/zip/amp

Before doing this, please back up your original alfresco.war and share.war files to ensure that you can uninstall the migration-center Jobserver after a successful migration. This is currently the only way to uninstall, because the Module Management Tool of Alfresco does not support removing a module from an existing WAR file.

The Alfresco Server should be stopped when applying the amp files. Note that Alfresco provides scripts for installing the amp files, e.g.:

C:\Alfresco\apply_amps.bat (Windows)

/opt/alfresco/commands/apply_amps.sh (Linux)

Due to a bug in older versions of the Alfresco installer under Windows, please verify that the amp installation via apply_amps.bat works correctly!
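The backup of the two WAR files mentioned above can be scripted so it is not forgotten. The following is a minimal sketch, assuming you know the directory that holds alfresco.war and share.war in your installation. The function name and both directory arguments are hypothetical and not part of migration-center or Alfresco.

```python
import shutil
from pathlib import Path

WAR_NAMES = ("alfresco.war", "share.war")


def backup_wars(webapps_dir, backup_dir) -> list:
    """Copy alfresco.war and share.war from webapps_dir into backup_dir.

    Raises FileNotFoundError before copying anything if either file is missing.
    """
    src = Path(webapps_dir)
    dst = Path(backup_dir)
    missing = [name for name in WAR_NAMES if not (src / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Not found in {src}: {', '.join(missing)}")
    dst.mkdir(parents=True, exist_ok=True)
    copies = []
    for name in WAR_NAMES:
        # copy2 also preserves file timestamps where the platform allows it.
        copies.append(Path(shutil.copy2(src / name, dst / name)))
    return copies
```

Run it while the Alfresco Server is stopped and before applying the amp files, so that the copies are the unmodified originals.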

Uninstall Alfresco Scanner

The Alfresco Scanner can be uninstalled by following these steps:

  • Stop the Alfresco Server.

  • Restore the original alfresco.war and share.war files that were backed up before the Alfresco Scanner installation.

  • Remove the file mc-alfresco-adaptor-<version>.amp from the “amps-folder”.
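The file operations in the last two steps can be sketched as follows. This is an illustration only: the function name and the three directory arguments are hypothetical, the glob pattern stands in for the unspecified <version> part of the file name, and the Alfresco Server is assumed to be stopped already.

```python
import shutil
from pathlib import Path


def restore_wars_and_remove_amp(backup_dir, webapps_dir, amps_dir) -> list:
    """Restore the backed-up WAR files and delete the adaptor amp file(s).

    Returns the names of the amp files that were removed.
    """
    backup, webapps, amps = Path(backup_dir), Path(webapps_dir), Path(amps_dir)
    for name in ("alfresco.war", "share.war"):
        # Overwrite the patched WAR with the original copy.
        shutil.copy2(backup / name, webapps / name)
    removed = []
    for amp in sorted(amps.glob("mc-alfresco-adaptor-*.amp")):
        amp.unlink()
        removed.append(amp.name)
    return removed
```

Other amp files in the “amps-folder” are left untouched because only names matching the adaptor pattern are deleted.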

Scanner configuration

To create a new Alfresco Scanner, create a new scanner and select Alfresco from the Adapter Type drop-down. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type. Mandatory parameters are marked with an *.

The Properties of an existing scanner can be accessed after creating the scanner by double-clicking the scanner in the list, or selecting the Properties button/menu item from the toolbar/context menu. A description is always displayed at the bottom of the window for the selected parameter.

Multiple scanners can be created for scanning different locations, provided each scanner has a unique name.

Scanner parameters

The common adaptor parameters are described in Common Parameters.

The configuration parameters available for the Alfresco Scanner are described below:

  • username*

    User name for connecting to the source repository. A user account with admin privileges must be used to support the full Alfresco functionality offered by migration-center.

    Example: Alfresco.corporate.domain\spadmin

  • password*

    Password of the user specified above

  • scanLocations

    The entry point(s) in the Alfresco repository where the scan starts.

    Multiple values can be entered by separating them with the “|” character.

    Each value needs to follow the Alfresco repository folder structure, e.g.:

    /Sites/SomeSite/documentLibrary/Folder/AnotherFolder

    /Sites/SomeSite/dataLists/02496772-2e2b-4e5b-a966-6a725fae727a

    Valid scan locations: an entire site, a specific library, a specific folder in a library, a specific data list.

    If any location is invalid, the scanner reports an appropriate error and does not start.

  • contentLocation*

    Folder path. The location where the exported object content should be temporarily saved. It can be a local folder on the same machine as the Jobserver or a shared folder on the network. This folder must exist prior to launching the scanner. The Jobserver must have write permissions for the folder. This path must be accessible by both the scanner and the importer, so if they are running on different machines, it should be a shared folder.

  • exportLatestVersions

    This parameter specifies how many versions of every version tree will be exported, counting from the latest version back to older versions. If the value is empty, not a valid number, 0, negative, or greater than the number of versions in the tree, all versions will be exported.

  • exportContent

    Setting this parameter to true will extract the actual content of the documents during the scan and save it in the contentLocation specified earlier.

    This setting should always be checked in a production environment.

  • dissolveGroups

    Setting this parameter to true will cause every group permission to be scanned as the separate users that make up the group.

  • excludeAttributes

    List of attributes to be excluded from scanning. Multiple values can be entered by separating them with the “|” character.

  • loggingLevel*

    See Common Parameters.

Parameters marked with an asterisk (*) are mandatory.
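To make two of the rules above concrete, here is a minimal sketch of how a “|”-separated value (as used by scanLocations and excludeAttributes) can be split, and of how the exportLatestVersions rule reads when written out. Both functions are hypothetical illustrations of the descriptions in this section, not the scanner's actual implementation. Trimming whitespace around values is an assumption.

```python
def split_multi_value(raw: str) -> list:
    """Split a "|"-separated parameter value into its entries.

    Assumption: surrounding whitespace is ignored and empty entries are dropped.
    """
    return [part.strip() for part in raw.split("|") if part.strip()]


def select_versions(versions: list, export_latest_versions: str) -> list:
    """Return the versions to export from one version tree.

    versions must be ordered from oldest to newest. An empty, non-numeric,
    zero, negative or too large parameter value selects all versions.
    """
    try:
        n = int(export_latest_versions.strip())
    except ValueError:
        return list(versions)
    if n <= 0 or n > len(versions):
        return list(versions)
    # Keep only the n most recent versions.
    return list(versions[-n:])
```

With a version tree of 1.0, 1.1 and 2.0, a value of 2 selects 1.1 and 2.0, while an empty value, 0 or 5 selects all three versions.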
