Veeva Scanner
Introduction
The Veeva Scanner allows extracting documents, audit trails and their related information from any supported Veeva Vault.
Veeva Scanners can be created, configured, started and monitored through the migration-center Client, while the corresponding processes are executed by a migration-center Job Server and the migration-center Veeva Scanner.
A scanner is the term used in migration-center for an input adapter. Using a scanner module to read the data that needs processing into migration-center is the first step in a migration project, thus scanning also refers to the process used to input data to migration-center.
Scanners and importers work as jobs that can be run at any time and can even be executed repeatedly. For every run, a detailed history and log file are created. Multiple jobs can be created or run at a time, each being defined by a unique name, a set of configuration parameters and a description (optional).
Prerequisites
Network
The scanner uses FTP/S to download the content of the documents and their versions from the Veeva Vault environment to the selected export location. The content is first exported from the Veeva Vault to the Veeva FTP server, and the Veeva Scanner then downloads the content files from that server via FTP/S. The necessary outbound ports (TCP 21 and TCP 56000 to 56100) must therefore be open in your firewalls, as described here:
http://vaulthelp.vod309.com/wordpress/admin-user-help/admin-basics/accessing-your-vaults-ftp-server/
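A quick connectivity check from the Job Server machine can help confirm the firewall rule before the first scan. The following is a minimal sketch, not part of migration-center: the helper names `required_ports` and `can_connect` are hypothetical, and it assumes the FTP server is reachable under the same host name as the Vault.

```python
import socket

def required_ports():
    """Outbound ports named above: TCP 21 plus TCP 56000 to 56100."""
    return [21] + list(range(56000, 56101))

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False
```

Calling `can_connect("fme-clinical.veevavault.com", 21)` with your own Vault host checks the control port only. The ports in the 56000 to 56100 range are typically opened by the server only while a transfer is in progress, so a failed probe on those ports does not by itself prove that the firewall blocks them.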
Veeva Scanner Properties
To create a new Veeva Scanner job click on the New Scanner button and select “Veeva” from the adapter type dropdown list. Once the adapter type has been selected, the parameters list will be populated with the Veeva Scanner parameters.
The Properties of an existing scanner can be accessed after creating the scanner by double-clicking the scanner in the list or by selecting the Properties button/menu item from the toolbar/context menu. A description is always displayed at the bottom of the window for the selected parameter.
Multiple scanners can be created for scanning different locations, provided each scanner has a unique name.
Common Scanner Parameters
Configuration parameters | Values |
Name | Enter a unique name for this scanner Mandatory |
Adapter type | Select the “Veeva” adapter from the list of available adapters Mandatory |
Location | Select the Job Server location where this job should run. Job Servers are defined in the Jobserver window. If no Job Server was selected, migration-center will prompt the user to define a Job Server Location when saving the scanner. Mandatory |
Description | Enter a description for this scanner (optional) |
Veeva Scanner Parameters
Configuration Parameters | Values |
username* | Veeva username. It must be a Vault Owner. Mandatory |
password* | The user’s password. Mandatory |
server* | Veeva Vault fully qualified domain name. Ex: fme-clinical.veevavault.com Mandatory |
proxyServer | The name or IP of the proxy server if there is any. |
proxyPort | The port for the proxy server. |
proxyUser | The username if required by the proxy server. |
proxyPassword | The password if required by the proxy server. |
documentSelection | The conditions to be used in the WHERE clause of the VQL query. The Vault validates the conditions. If the string combines several conditions, you must ensure the intended order of evaluation between the logical operators (for example with parentheses). If the parameter is empty, the entire Veeva Vault will be scanned. Examples: (1) type__v='Your Type' (2) classification__v='SOP' (3) product__v='00P000000000201' OR product__v='00P000000000202' (4) (study__v='0ST000000000501' OR study__v='0ST000000000503') AND blinding__v='Blinded' |
batchSize* | The batch size representing how many documents to be loaded in a single bulk operation. Mandatory |
scanBinders | The flag indicates if the binders should be scanned or not. This parameter will scan just the latest version of each version tree. |
scanBinderVersions | Flag indicating if all the versions should be scanned or not. If it is checked, every version will be scanned. |
maintainBinderIntegrity | The flag indicates whether the scanner must maintain binder integrity. If it is checked, all the children of a binder will be exported to keep it consistent. |
exportRenditions | This checkbox parameter indicates if the document renditions will be scanned or not. |
renditionTypes | This parameter indicates which types of renditions will be scanned. If the value is left empty, all the types will be exported. This is a repeating parameter and is used only if exportRenditions is checked. Examples: viewable_rendition__v, custom_rendition__c |
exportSignaturePages | The flag indicates if the Signature Pages should be scanned. If it is checked, the scanner will download each viewable rendition which will include the signature page. |
exportCrosslinkDocuments | This checkbox indicates if the cross-link documents should be scanned or not. |
scanAuditTrails | This checkbox parameter indicates if the audit trail records of the scanned documents will be scanned as a separate MC object. The audit trail object will have ‘Source_type’ attribute set to Veeva(audittrail) and the ‘audited_obj_id’ will contain the source id of the audited document. |
skipFtpContentDownload | This flag indicates whether downloading the content files from the FTP staging server should be skipped. If selected, the contents will remain on the staging server and the scanner will store their FTP paths. The FTP paths have the following OOTB pattern: ftp:/u<user-id>/<veeva-job-id>/<doc-id>/<version-identifier>/<doc-name> Example: ftp:/u6236635/23625/423651/2_1/monthly-bill.pdf |
useHTTPSContentTransfer | This flag indicates if the content files should be transferred from the Veeva Vault to the export location via the REST endpoint. When this box is checked, the scanner will download every content file sequentially via HTTPS. Note: This feature is much slower, but it will not involve any FTP connection. |
exportLocation* | Folder path. The location where the exported object content should be temporarily saved. It can be a local folder on the same machine with the Job Server or a shared folder on the network. This folder must exist prior to launching the scanner and the user account running the Job Server must have write permission on the folder. migration-center will not create this folder automatically. If the folder does not exist, an appropriate error will be raised and logged. This path must be accessible by both scanner and importer so if they are running on different machines, it should be a shared folder. Mandatory |
loggingLevel* | Sets the verbosity of the log file. Values: 1 - logs only errors during scan; 2 - default value, logs all warnings and errors; 3 - logs all successfully performed operations in addition to any warnings or errors; 4 - logs all events (for debugging only; use only if instructed by fme product support, since it generates a very large amount of output. Do not use in production). Mandatory |
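When skipFtpContentDownload is selected, downstream tooling may need to take the stored FTP paths apart. The sketch below splits a path according to the OOTB pattern given for that parameter. The names `StagedContentPath` and `parse_staged_path` are hypothetical and not part of migration-center, and the version identifier is kept as an opaque string because its internal format is not specified here.

```python
from typing import NamedTuple

class StagedContentPath(NamedTuple):
    user_id: str
    veeva_job_id: str
    doc_id: str
    version_identifier: str
    doc_name: str

def parse_staged_path(path: str) -> StagedContentPath:
    """Split ftp:/u<user-id>/<veeva-job-id>/<doc-id>/<version-identifier>/<doc-name>."""
    prefix = "ftp:/u"
    if not path.startswith(prefix):
        raise ValueError(f"not a staged content path: {path!r}")
    # Limit the split to 4 so a doc name containing '/' stays in the last part.
    parts = path[len(prefix):].split("/", 4)
    if len(parts) != 5 or not all(parts):
        raise ValueError(f"unexpected path layout: {path!r}")
    return StagedContentPath(*parts)

parsed = parse_staged_path("ftp:/u6236635/23625/423651/2_1/monthly-bill.pdf")
print(parsed.doc_id, parsed.version_identifier, parsed.doc_name)
```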
Additional Configuration Settings
There is a configuration file for additional settings for the Veeva Scanner located under the …/lib/mc-veeva-scanner/ folder in the Job Server install location. It has the following properties that can be set:
Configuration property | Values |
request_timeout | The time in milliseconds the scanner waits for a response after every API call. Default: 600000 ms |
threshold_burst_limit_remaining | The number of API calls remaining in the current 5-minute burst window at which the scanner enters sleep mode and waits for the next burst interval. Default: 500 |
threshold_daily_limit_remaining | The number of API calls remaining in the current 24-hour daily window at which the scanner stops exporting objects. An appropriate message is logged in the report log when the daily limit is reached. Note: Set this value to 0 to enable the automatic pause and resume feature of the scanner. When set to 0, the scanner will pause for "waiting_time" hours when there are fewer than 1,500 API calls left, and then resume processing. Default: 50000 |
client_id_prefix | This is the prefix of the Client ID that is passed to every Vault API call. The Client ID is composed from this prefix followed by the id of the job run. The Client ID is always logged in the report log. Default: fme-mc-migration-job |
waiting_time | Waiting time when reaching the daily limit. This parameter is used just when the threshold_daily_limit_remaining is set to 0. The value is represented in hours. Default: 4 (hours) |
no_of_request_attempts | Represents the number of request attempts when the first call fails due to a server error (5xx error code). The REST API call will be executed at least once independent of the parameter value. Default: 2 |
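The interaction between the burst threshold, the daily threshold and waiting_time can be easier to follow as a small decision function. This is an illustrative sketch of the behaviour described in the table, not the scanner's actual code: `LimitSettings` and `next_action` are hypothetical names, and whether the real comparisons are strict or inclusive is an assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LimitSettings:
    threshold_burst_limit_remaining: int = 500
    threshold_daily_limit_remaining: int = 50000
    waiting_time: int = 4  # hours, only used when the daily threshold is 0

# The "1,500 API calls left" mentioned for the automatic pause and resume feature.
AUTO_PAUSE_FLOOR = 1500

def next_action(burst_remaining, daily_remaining, settings=LimitSettings()):
    """Return what the scanner would do next under the assumptions above."""
    if settings.threshold_daily_limit_remaining == 0:
        # Automatic pause and resume mode.
        if daily_remaining < AUTO_PAUSE_FLOOR:
            return f"pause {settings.waiting_time}h"
    elif daily_remaining <= settings.threshold_daily_limit_remaining:
        return "stop"
    if burst_remaining <= settings.threshold_burst_limit_remaining:
        return "sleep until next burst window"
    return "continue"

print(next_action(burst_remaining=400, daily_remaining=100000))
```

With the default settings the scanner stops once the daily threshold is reached. Setting threshold_daily_limit_remaining to 0 trades that hard stop for a pause of waiting_time hours.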
Working with Veeva Scanner
The Veeva Scanner connects to the Veeva Vault using the username, password and server name from the configuration. The FTP/S connection is established internally, with the FTP username computed from the server and the user according to the standard instructions. To use a proxy, additionally provide proxyServer and proxyPort, plus proxyUser and proxyPassword if the proxy server requires them.
The scanner can export all the versions of a document, as a version tree, together with their rendition files and audit trails if they exist. The scanner exports the documents in batches of the configured size (batchSize), using the condition specified in documentSelection.
Exporting Documents & Versions
The Veeva Scanner uses a VQL query to determine the documents to be scanned. If the documentSelection parameter is left empty, the scanner exports all available documents from the entire Vault.
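To illustrate how a documentSelection value ends up in the query, the sketch below appends it as the WHERE clause of a VQL statement. The function name `build_document_query` is hypothetical, and the selected field list (`id` only) is an assumption; the query the scanner actually issues is not documented here.

```python
def build_document_query(document_selection: str = "") -> str:
    """Build a VQL query; an empty selection means no WHERE clause."""
    query = "SELECT id FROM documents"
    condition = document_selection.strip()
    if condition:
        query += f" WHERE {condition}"
    return query

print(build_document_query("classification__v='SOP'"))
print(build_document_query())
```

Because the condition is passed through unchanged, any grouping of AND and OR has to be expressed in the parameter value itself.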
Exporting Crosslink Documents
A Crosslink is a document created in one Vault that uses the viewable rendition of another document in another Vault as its source document.
To scan crosslinks, check the exportCrosslinkDocuments checkbox in the scanner configuration. Each scanned crosslink object has the isCrosslink attribute set to true, which lets you distinguish crosslinks from regular documents.
Exporting Renditions
The scanner configuration view contains the exportRenditions parameter, which lets you export the renditions. You can also specify exactly which rendition files are exported by listing the desired types in the renditionTypes parameter.
For example, if the exportRenditions parameter is checked and the renditionTypes parameter contains the values annotated_version__c and rendition_two__c, then only these two renditions will be exported.
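The selection rule for renditions can be summarised in a few lines. This is a sketch of the rule as described, using the hypothetical helper `select_renditions`; it is not taken from the scanner's implementation.

```python
def select_renditions(available, export_renditions, rendition_types=()):
    """Return the rendition types that would be exported for one document."""
    if not export_renditions:
        return []                  # renditionTypes is ignored when unchecked
    if not rendition_types:
        return list(available)     # empty list means all types
    wanted = set(rendition_types)
    return [r for r in available if r in wanted]

available = ["viewable_rendition__v", "annotated_version__c", "rendition_two__c"]
print(select_renditions(available, True, ["annotated_version__c", "rendition_two__c"]))
```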
Exporting Audit Trails
We strongly recommend using a separate user for the migration project, because the Veeva Vault generates a new audit trail record for every document for every action (extracting metadata, downloading the document content, etc.) performed during the scanning process.
Veeva Scanner allows scanning audit trails for every scanned document as a distinct MC source object. The audit trails will be scanned as ‘Veeva(audittrail)’ objects if the checkbox scanAuditTrails is set. The audited_obj_id attribute contains the source system id of the document that has this audit trail.
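Since audit trails arrive as separate source objects, they usually need to be matched back to their documents. The sketch below groups audit trail objects by audited_obj_id. It assumes the scanned objects are available as plain dictionaries keyed by attribute name, which is an illustration only and not a migration-center API; `group_audit_trails` is a hypothetical helper.

```python
from collections import defaultdict

def group_audit_trails(objects):
    """Map each audited document id to the list of its audit trail objects."""
    trails = defaultdict(list)
    for obj in objects:
        if obj.get("Source_type") == "Veeva(audittrail)":
            trails[obj["audited_obj_id"]].append(obj)
    return dict(trails)

scanned = [
    {"Source_type": "Veeva(audittrail)", "audited_obj_id": "423651", "action": "view"},
    {"Source_type": "Veeva(audittrail)", "audited_obj_id": "423651", "action": "download"},
    {"Source_type": "Veeva(audittrail)", "audited_obj_id": "423652", "action": "view"},
]
grouped = group_audit_trails(scanned)
print({doc_id: len(items) for doc_id, items in grouped.items()})
```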
Exporting Binders
The Veeva Scanner allows you to extract binders in such a way that they can be imported into an OpenText Documentum repository as Virtual Documents. The feature is provided by three parameters: scanBinders, scanBinderVersions and maintainBinderIntegrity. On the Veeva Vault side, binders are managed just like documents, but have the binder__v attribute set to true.
To fully migrate binders to virtual documents, scan them with their integrity maintained so that they can be rebuilt on the Documentum side.
The scanner will create a relationship between the binder and its children. Moreover, the scanner supports scanning of nested binders. The relationship will contain all the information required by the Documentum importer.