Veeva Scanner

Introduction

The Veeva Scanner allows extracting documents, audit trails and their related information from any supported Veeva Vault.

Currently, the Veeva Scanner uses version 24.3 of the Veeva REST API.

Prerequisites

Network

The scanner uses FTP/S to download the content of the documents and their versions from the Veeva Vault environment to the selected export location. The content is first exported from the Veeva Vault to the Veeva FTP server, and the Veeva Scanner then downloads the content files from there via FTP/S. The necessary outbound ports (i.e. TCP 21 and TCP 56000 to 56100) must therefore be open in your firewalls, as described in the Veeva documentation:

https://platform.veevavault.help/en/gr/38653/#network--firewall-settings
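A quick way to verify the firewall rule for the control port is a plain TCP connection test. The following is a minimal sketch, not part of migration-center; the host name is a placeholder, and only port 21 is probed because the data ports are usually only listening while a transfer is in progress.

```python
import socket

# Ports named in the text above: FTP control port 21 plus the range 56000-56100.
FTPS_PORTS = [21, *range(56000, 56101)]


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if an outbound TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


if __name__ == "__main__":
    # Hypothetical host name; replace it with the FTP host of your Vault.
    host = "your-vault.veevavault.com"
    print(f"{host}:21 reachable: {can_connect(host, 21)}")
```

A successful connection only shows that the control port is reachable; the data port range still has to be allowed in the firewall.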

Common errors

MALFORMED_URL when exporting submission archives

The following error is caused by a configurable restriction in Veeva that truncates the names of long submission archive files. The same error is returned when the same request is made in Postman:

[Type: MALFORMED_URL, Message: The specified item [/example/path/submissionArchive.zip] cannot be found]

To solve this, remove or increase the following restriction: "Maximum characters for exported file name (including extension)".

Scanner Configuration

To create a new Veeva Scanner job click on the New Scanner button and select “Veeva” from the adapter type dropdown list. Once the adapter type has been selected, the parameters list will be populated with the Veeva Scanner parameters.

The Properties of an existing scanner can be accessed after creating the scanner by double-clicking the scanner in the list or by selecting the Properties button/menu item from the toolbar/context menu. A description is always displayed at the bottom of the window for the selected parameter.

Multiple scanners can be created for scanning different locations, provided each scanner has a unique name.

Scanner Parameters

The common adapter parameters are described in Common Parameters.

The configuration parameters available for the Veeva Scanner are described below:

username* Veeva username. It must be a Vault Owner.
password* The user’s password.
server* Veeva Vault fully qualified domain name. Ex: fme-clinical.veevavault.com
proxyServer The name or IP of the proxy server if there is any.
proxyPort The port for the proxy server.
proxyUser The username if required by the proxy server.
proxyPassword The password if required by the proxy server.
documentSelection The conditions that will be used in the WHERE clause of the VQL query. The Vault validates the conditions. If the string comprises several conditions, use parentheses to ensure the intended order of evaluation between the logical operators. If the parameter is empty, the entire Veeva Vault will be scanned.
Examples:
type__v='Your Type'
classification__v='SOP'
product__v='00P000000000201' OR product__v='00P000000000202'
(study__v='0ST000000000501' OR study__v='0ST000000000503') AND blinding__v='Blinded'
batchSize* The number of documents to be loaded in a single bulk operation.
skipContent The flag indicates whether the export of the document content from the repository should be skipped. If checked, only the metadata will be exported.
scanBinders The flag indicates whether binders should be scanned. By itself, this parameter scans just the latest version of each version tree.
scanBinderVersions The flag indicates whether all the versions should be scanned. If checked, every version will be scanned.
maintainBinderIntegrity The flag indicates whether the scanner must maintain the binder’s integrity. If the binder is scanned, all of its children will be exported to ensure consistency, regardless of whether they fall within the scanning scope defined by the documentSelection parameter.
exportRenditions This checkbox parameter indicates if the document renditions will be scanned or not.
renditionTypes This parameter indicates which types of renditions will be scanned. If the value is left empty, all the types will be exported. This parameter is repeating and is used just if the exportRenditions is checked.
Examples:
viewable_rendition__v
custom_rendition__c
exportSignaturePages The flag indicates whether the Signature Page should be added to the viewable rendition. If checked, the scanner will export the viewable rendition of the documents, when it is in the scope of the rendition export, including any eSignature pages. If unchecked, the scanner will export the viewable rendition without any eSignature pages.
exportCrosslinkDocuments This checkbox indicates if the cross-link documents should be scanned or not.
scanAuditTrails This checkbox parameter indicates if the audit trail records of the scanned documents will be scanned as a separate MC object. The audit trail object will have ‘Source_type’ attribute set to Veeva(audittrail) and the ‘audited_obj_id’ will contain the source id of the audited document.
enableDeltaForAuditTrails The flag indicates whether new audit trails should be detected during a delta scan even if the document was not changed.
skipFtpContentDownload The flag indicates whether the download of the content files from the FTP staging server should be skipped. If selected, the contents will remain on the staging server and the scanner will store their FTP paths.
The FTP paths have the following OOTB pattern:
ftp:/u<user-id>/<veeva-job-id>/<doc-id>/<version-identifier>/<doc-name>
Examples:
ftp:/u6236635/23625/423651/2_1/monthly-bill.pdf
useHTTPSContentTransfer This flag indicates if the content files should be transferred from the Veeva Vault to the export location via the REST endpoint. When this box is checked, the scanner will download every content file sequentially via HTTPS.
Note: This feature is much slower, but it will not involve any FTP connection.
downloadFtpContentUsingRest Flag indicating if the content from Staging Folder will be downloaded by using REST API.
This is the recommended way to download the content locally.
Note: No connection to the FTPS server is made, so the FTPS ports don't need to be opened.
exportLocation* Folder path. The location where the exported object content should be temporarily saved. It can be a local folder on the same machine as the Job Server or a shared folder on the network. This folder must exist prior to launching the scanner and the user account running the Job Server must have write permission on the folder. migration-center will not create this folder automatically. If the folder does not exist, an appropriate error will be raised and logged. This path must be accessible by both scanner and importer, so if they are running on different machines, it should be a shared folder.
ftpMaxNrOfThreads* Maximum number of concurrent threads that will be used to transfer content from the FTP server to the local system. The maximum value allowed by the scanner is 20, but in accordance with Veeva migration best practices it is strongly recommended to use at most 5 threads.
loggingLevel*
See: Common Parameters.

Parameters marked with an asterisk (*) are mandatory.
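When skipFtpContentDownload is selected, the stored FTP paths may need to be split into their parts by downstream tooling. The sketch below parses the OOTB pattern given for that parameter; the class and function names are hypothetical and not part of migration-center, and it assumes the first four path segments contain no slash.

```python
import re
from typing import NamedTuple


class StagedContentPath(NamedTuple):
    user_id: str
    job_id: str
    doc_id: str
    version: str
    doc_name: str


# ftp:/u<user-id>/<veeva-job-id>/<doc-id>/<version-identifier>/<doc-name>
_PATTERN = re.compile(
    r"^ftp:/u(?P<user_id>[^/]+)/(?P<job_id>[^/]+)/(?P<doc_id>[^/]+)"
    r"/(?P<version>[^/]+)/(?P<doc_name>.+)$"
)


def parse_staged_path(path: str) -> StagedContentPath:
    """Split an OOTB staging path into its five components."""
    match = _PATTERN.match(path)
    if match is None:
        raise ValueError(f"not an OOTB staging path: {path!r}")
    return StagedContentPath(**match.groupdict())


if __name__ == "__main__":
    print(parse_staged_path("ftp:/u6236635/23625/423651/2_1/monthly-bill.pdf"))
```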

Additional Configuration Settings

There is a configuration file for additional settings for the Veeva Scanner located under the …/lib/mc-veeva-scanner/ folder in the Job Server install location. It has the following properties that can be set:

request_timeout The time in milliseconds the scanner waits for a response after every API call. Default: 600000 ms
client_id_prefix This is the prefix of the Client ID that is passed to every Vault API call. The Client ID is always logged in the report log. For a better identification of the requests (if necessary) the default value should be changed to include the name of the company as described here: https://developer.veevavault.com/docs/#client-id Default: fme-vault-client-migrationcenter
no_of_request_attempts The number of request attempts made when the first call fails due to a server error (5xx error code). The REST API call will be executed at least once, independent of the parameter value. Default: 2
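The retry behaviour described for no_of_request_attempts can be pictured roughly as follows. This is an illustrative sketch only, not the scanner's implementation: it assumes the value is the total number of attempts, and the function name and the `send` callable are hypothetical.

```python
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status code, response body)


def call_with_attempts(send: Callable[[], Response],
                       no_of_request_attempts: int = 2) -> Response:
    """Call send() and repeat it only while the server answers with a 5xx code."""
    # The call is executed at least once, whatever the configured value is.
    attempts = max(1, no_of_request_attempts)
    for _ in range(attempts):
        status, body = send()
        if not 500 <= status <= 599:
            break  # success or a client error: repeating would not help
    return status, body
```

With the default of 2, a call that fails once with a 5xx code is repeated one more time before the failure is reported.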

Working with Veeva Scanner

The Veeva Scanner connects to the Veeva Vault by using the username, password and server name from the configuration. The FTP/S connection is established internally by following the standard instructions, with the FTP username computed from the server and the user. Additionally, to use the proxy functionality, you have to provide proxyServer, proxyPort, proxyUser and proxyPassword.

The scanner can export all the versions of a document, as a version tree, together with their rendition files and audit trails if they exist. The scanner will export the documents in batches of the configured size (batchSize) using the condition specified in the configuration (documentSelection).
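To test credentials outside the scanner, a login request against the Vault REST API can be built as sketched below. This is not how the scanner is implemented internally; it only uses the public auth endpoint with API version 24.3, and it assumes the client_id_prefix value is sent unchanged in the X-VaultAPI-ClientID header. The function name is hypothetical.

```python
import urllib.parse
import urllib.request

API_VERSION = "v24.3"  # the API version stated in the introduction


def build_auth_request(server: str, username: str, password: str,
                       client_id: str) -> urllib.request.Request:
    """Build (but do not send) the login request for the given Vault."""
    url = f"https://{server}/api/{API_VERSION}/auth"
    form = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(
        url,
        data=form.encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/json",
            "Content-Type": "application/x-www-form-urlencoded",
            # Assumption: the configured prefix is used as the full Client ID.
            "X-VaultAPI-ClientID": client_id,
        },
    )


if __name__ == "__main__":
    request = build_auth_request(
        "fme-clinical.veevavault.com", "user@example.com", "secret",
        "fme-vault-client-migrationcenter",
    )
    print(request.get_method(), request.full_url)
```

Sending the request with urllib.request.urlopen returns a JSON body; on success it contains a session ID that is passed in the Authorization header of later calls.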

VQL enforces a limit of 5 queries per user. VQL will throttle additional query requests until existing queries have completed.

Please use multiple Vault users for concurrent scan runs of the same Vault.

Exporting Documents & Versions

The Veeva Scanner uses a VQL query to determine the documents to be scanned. By leaving the documentSelection parameter empty, the scanner will export all available documents from the entire Vault.
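When a documentSelection value combines several conditions, generating it with explicit grouping avoids mistakes in operator order. The helper below is a hypothetical sketch, not part of migration-center; it only produces the condition string and, as a simplification, rejects values that contain a single quote instead of escaping them.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Eq:
    field: str
    value: str

    def render(self, nested: bool = False) -> str:
        if "'" in self.value:
            raise ValueError("quote escaping is out of scope for this sketch")
        return f"{self.field}='{self.value}'"


@dataclass(frozen=True)
class Group:
    operator: str  # "AND" or "OR"
    parts: Tuple[Union["Eq", "Group"], ...]

    def render(self, nested: bool = False) -> str:
        text = f" {self.operator} ".join(p.render(nested=True) for p in self.parts)
        # Parenthesise nested groups so the evaluation order is explicit.
        return f"({text})" if nested and len(self.parts) > 1 else text


def any_of(*parts: Union[Eq, Group]) -> Group:
    return Group("OR", parts)


def all_of(*parts: Union[Eq, Group]) -> Group:
    return Group("AND", parts)


if __name__ == "__main__":
    selection = all_of(
        any_of(Eq("study__v", "0ST000000000501"), Eq("study__v", "0ST000000000503")),
        Eq("blinding__v", "Blinded"),
    )
    print(selection.render())
```

The printed string can be pasted into the documentSelection parameter; the Vault still validates it when the scan runs.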

Exporting Crosslink Documents

A Crosslink is a document created in one Vault that uses the viewable rendition of another document in another Vault as its source document.

To scan crosslinks, the exportCrosslinkDocuments checkbox must be set in the scanner configuration. The scanned crosslink object will have the isCrosslink attribute set to true, which is how you can tell documents and crosslinks apart.

Exporting Renditions

The scanner configuration view contains the exportRenditions parameter, which lets you export the renditions. Moreover, you can specify exactly which rendition files are exported by listing the desired types in the renditionTypes parameter.

For example, if the exportRenditions parameter is checked and the renditionTypes parameter contains the values annotated_version__c and rendition_two__c, then just these two renditions will be exported.
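The interplay of exportRenditions and renditionTypes can be summarised as a small selection rule. The function below is only an illustration of the documented behaviour, with hypothetical names; it is not scanner code.

```python
from typing import Iterable, List


def renditions_to_export(available: Iterable[str],
                         export_renditions: bool,
                         rendition_types: Iterable[str] = ()) -> List[str]:
    """Return the rendition types that would be exported for one document."""
    if not export_renditions:
        return []  # renditionTypes is ignored unless exportRenditions is checked
    wanted = set(rendition_types)
    if not wanted:
        return list(available)  # empty renditionTypes means all types
    return [name for name in available if name in wanted]
```

For a document with the renditions viewable_rendition__v, annotated_version__c and rendition_two__c, passing the last two names as rendition_types selects exactly those two.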

Exporting Audit Trails

We strongly recommend using a separate user for the migration project, because the Veeva Vault will generate a new audit trail record for every document for every action (extracting metadata, downloading the document content, etc.) performed during the scanning process.

The Veeva Scanner allows scanning the audit trails of every scanned document as a distinct MC source object. The audit trails will be scanned as ‘Veeva(audittrail)’ objects if the scanAuditTrails checkbox is set. The audited_obj_id attribute contains the source system id of the document that has this audit trail.

If the enableDeltaForAuditTrails checkbox is set, migration-center can detect new audit trails during the delta scan even if the audited document was not changed.

Exporting Binders

The Veeva Scanner allows you to extract binders in such a way that they can be imported into an OpenText Documentum repository as Virtual Documents. The feature is controlled by three parameters: scanBinders, scanBinderVersions and maintainBinderIntegrity. On the Veeva Vault side, binders are managed just like documents, but have the binder__v attribute set to true.

To fully migrate binders to virtual documents, you have to scan them with maintainBinderIntegrity set, so that they can be rebuilt on the Documentum side.

The scanner will create a relationship between the binder and its children. Moreover, the scanner supports scanning of nested binders. The relationship will contain all the information required by the Documentum importer.

