Veeva Scanner
The Veeva Scanner allows extracting documents, audit trails and their related information from any supported Veeva Vault.
Currently, the Veeva Connector uses version 21.2 of the Veeva REST API.
The scanner uses FTP/S to download the content of the documents and their versions from the Veeva Vault environment to the selected export location. The content is first exported from the Veeva Vault to the Veeva FTP server, and the scanner then downloads the content files from that server via FTP/S. The necessary outbound ports (TCP 21 and TCP 56000 to 56100) must therefore be open in your firewalls, as described here:
http://vaulthelp.vod309.com/wordpress/admin-user-help/admin-basics/accessing-your-vaults-ftp-server/
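As a quick aid when preparing firewall rules, the port list above can be enumerated and probed with a few lines of Python. This is only a sketch: `required_ports` and `can_connect` are hypothetical helper names and are not part of migration-center.

```python
import socket

# Outbound ports named in the text: FTP control port and the data port range.
FTP_CONTROL_PORT = 21
FTP_DATA_PORTS = range(56000, 56101)  # 56000..56100 inclusive


def required_ports():
    """Return every outbound TCP port the scanner needs for FTP/S."""
    return [FTP_CONTROL_PORT, *FTP_DATA_PORTS]


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect(server, 21)` from the Job Server machine gives a first indication of whether the control port is reachable. Data ports usually accept connections only while a transfer is running, so a failed probe on that range does not by itself prove that the firewall blocks it.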
To create a new Veeva Scanner job, click the New Scanner button and select “Veeva” from the adapter type dropdown list. Once the adapter type has been selected, the parameters list is populated with the Veeva Scanner parameters.
The Properties of an existing scanner can be accessed after creating the scanner by double-clicking the scanner in the list or by selecting the Properties button/menu item from the toolbar/context menu. A description is always displayed at the bottom of the window for the selected parameter.
Multiple scanners can be created for scanning different locations, provided each scanner has a unique name.
There is a configuration file for additional settings for the Veeva Scanner located under the …/lib/mc-veeva-scanner/ folder in the Job Server install location. It has the following properties that can be set:
The Veeva Scanner connects to the Veeva Vault using the username, password and server name from the configuration. The FTP/S connection is established internally by following the standard instructions, which compute the FTP username from the server and the user. To use a proxy, you must additionally provide proxyServer and proxyPort, plus proxyUser and proxyPassword if the proxy server requires them.
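The scanner computes the FTP username internally, so nothing needs to be configured for it. For troubleshooting with an external FTP client, the sketch below shows the convention as we understand it from the Veeva FTP instructions: the Vault domain name and the Vault username joined by a plus sign. Treat the exact format as an assumption and verify it against the Veeva help page. `ftp_username` is a hypothetical helper.

```python
def ftp_username(server: str, username: str) -> str:
    """Build the FTP login name from the Vault DNS and the Vault user.

    Assumed convention: "<vault domain name>+<vault username>".
    """
    return f"{server.strip()}+{username.strip()}"
```

With the example server from the parameter list, `ftp_username("fme-clinical.veevavault.com", "migration.user@example.com")` would yield `fme-clinical.veevavault.com+migration.user@example.com`.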
The scanner can export all the versions of a document, as a version tree, together with their rendition files and audit trails if they exist. The scanner exports the documents in batches of the configured size (batchSize), using the condition specified in the configuration (documentSelection).
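The batching idea can be illustrated as follows. This is a minimal sketch of splitting a list of document ids into groups of `batchSize`. It does not reproduce the scanner's internal implementation, and `batches` is a hypothetical name.

```python
from typing import Iterator, List, Sequence


def batches(doc_ids: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most batch_size document ids."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(doc_ids), batch_size):
        yield list(doc_ids[start:start + batch_size])
```

The last batch may be smaller than the others when the number of documents is not a multiple of the batch size.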
The Veeva Scanner uses a VQL query to determine the documents to be scanned. By leaving the documentSelection parameter empty, the scanner will export all available documents from the entire Vault.
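To make the role of documentSelection concrete, the sketch below shows how a WHERE condition might be combined into a VQL statement. The exact query the scanner issues is not documented here, so the selected field and the helper name `build_vql` are assumptions for illustration only.

```python
def build_vql(document_selection: str = "", fields=("id",)) -> str:
    """Compose a VQL query over documents with an optional WHERE condition."""
    query = f"SELECT {', '.join(fields)} FROM documents"
    condition = document_selection.strip()
    if condition:
        # The condition is passed through unchanged; the Vault validates it.
        query += f" WHERE {condition}"
    return query
```

An empty selection produces a query without a WHERE clause, which corresponds to scanning all available documents.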
A Crosslink is a document created in one Vault that uses the viewable rendition of another document in another Vault as its source document.
To scan crosslinks, set the exportCrosslinkDocuments checkbox in the scanner configuration. Each scanned crosslink object has its isCrosslink attribute set to true, which lets you distinguish crosslinks from regular documents.
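As an illustration of separating the two kinds of objects after a scan, the sketch below partitions a list of scanned objects by the isCrosslink attribute. The dictionary representation and the helper name `split_crosslinks` are assumptions. The comparison accepts both a boolean and the string "true" because the stored value type is not specified here.

```python
def split_crosslinks(objects):
    """Partition scanned objects into (documents, crosslinks)."""
    documents, crosslinks = [], []
    for obj in objects:
        # Accept True as well as "true"/"True"; anything else counts as a document.
        is_crosslink = str(obj.get("isCrosslink", "")).strip().lower() == "true"
        (crosslinks if is_crosslink else documents).append(obj)
    return documents, crosslinks
```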
The scanner configuration view contains the exportRenditions parameter, which lets you export the renditions. You can also restrict the export to specific rendition files by listing the desired types in the renditionTypes parameter.
If the exportRenditions parameter is checked and the renditionTypes parameter contains the values annotated_version__c and rendition_two__c, then only these two renditions will be exported.
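The interplay of the two parameters can be summarized in a short sketch: nothing is exported when exportRenditions is off, everything when renditionTypes is empty, and otherwise only the listed types. `select_renditions` is a hypothetical helper, not the scanner's code.

```python
def select_renditions(available, export_renditions, rendition_types=()):
    """Return the rendition types to export for one document version."""
    if not export_renditions:
        return []
    wanted = [t.strip() for t in rendition_types if t.strip()]
    if not wanted:
        # An empty renditionTypes value means all types are exported.
        return list(available)
    return [r for r in available if r in wanted]
```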
We strongly recommend using a separate user for the migration project, because the Veeva Vault generates a new audit trail record for every document and every action (extracting metadata, downloading the document content, etc.) performed during the scanning process.
Veeva Scanner allows scanning audit trails for every scanned document as a distinct MC source object. The audit trails will be scanned as ‘Veeva(audittrail)’ objects if the checkbox scanAuditTrails is set. The audited_obj_id attribute contains the source system id of the document that has this audit trail.
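Because audit trails are separate source objects, they have to be matched back to their documents through audited_obj_id. The sketch below groups scanned objects accordingly. The dictionary layout and the function name `group_audit_trails` are assumptions, while the attribute names follow the parameter description of scanAuditTrails.

```python
from collections import defaultdict


def group_audit_trails(objects):
    """Map each audited document id to the list of its audit trail objects."""
    by_document = defaultdict(list)
    for obj in objects:
        if obj.get("Source_type") == "Veeva(audittrail)":
            by_document[obj["audited_obj_id"]].append(obj)
    return dict(by_document)
```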
If the enableDeltaForAuditTrails checkbox is set, migration-center detects new audit trails during a delta scan even if the audited document itself has not changed.
The Veeva Scanner allows you to extract binders in such a way that they can be imported into an OpenText Documentum repository as Virtual Documents. The feature is controlled by three parameters: scanBinders, scanBinderVersions and maintainBinderIntegrity. On the Veeva Vault side, binders are managed just like documents, but with the binder__v attribute set to true.
To fully migrate binders to virtual documents, scan them with maintainBinderIntegrity enabled so that they can be rebuilt on the Documentum side.
The scanner will create a relationship between the binder and its children. Moreover, the scanner supports scanning of nested binders. The relationship will contain all the information required by the Documentum importer.
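The parent/child relationships for nested binders can be pictured with the following sketch. It walks a hypothetical in-memory structure (`children_of`, mapping a binder id to its ordered children) and emits one relation per child, descending into nested binders. The actual relation format produced by the scanner is not shown here.

```python
def binder_relations(children_of, root_id):
    """Return (parent_id, child_id, position) tuples for a binder tree."""
    relations = []
    seen = set()
    stack = [root_id]
    while stack:
        parent = stack.pop()
        if parent in seen:
            continue  # guard against cycles in the hypothetical input
        seen.add(parent)
        for position, child in enumerate(children_of.get(parent, []), start=1):
            relations.append((parent, child, position))
            if child in children_of:
                stack.append(child)  # the child is itself a binder
    return relations
```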
Configuration parameters
Values
Name
Enter a unique name for this scanner
Mandatory
Adapter type
Select the “Veeva” adapter from the list of available adapters
Mandatory
Location
Select the Job Server location where this job should run. Job Servers are defined in the Jobserver window. If no Job Server was selected, migration-center will prompt the user to define a Job Server Location when saving the scanner.
Mandatory
Description
Enter a description for this scanner (optional)
Configuration Parameters
Values
username*
Veeva username. It must be a Vault Owner.
Mandatory
password*
The user’s password.
Mandatory
server*
Veeva Vault fully qualified domain name. Ex: fme-clinical.veevavault.com
Mandatory
proxyServer
The name or IP of the proxy server if there is any.
proxyPort
The port for the proxy server.
proxyUser
The username if required by the proxy server.
proxyPassword
The password if required by the proxy server.
documentSelection
The parameter contains the conditions that will be used in the WHERE clause of the VQL query. The Vault will validate the conditions. If the string comprises several conditions, use parentheses to ensure the intended precedence of the logical operators. If the parameter is empty, the entire Veeva Vault will be scanned.
Examples:
type__v='Your Type'
classification__v='SOP'
product__v='00P000000000201' OR product__v='00P000000000202'
(study__v='0ST000000000501' OR study__v='0ST000000000503') AND blinding__v='Blinded'
batchSize*
The batch size, i.e. how many documents are loaded in a single bulk operation.
Mandatory
scanBinders
The flag indicates if the binders should be scanned or not. This parameter will scan just the latest version of each version tree.
scanBinderVersions
Flag indicating if all the versions should be scanned or not. If it is checked, every version will be scanned.
maintainBinderIntegrity
The flag indicates if the scanner must maintain the binder integrity or not. If it is checked, all the children will be exported to keep consistency.
exportRenditions
This checkbox parameter indicates if the document renditions will be scanned or not.
renditionTypes
This parameter indicates which types of renditions will be scanned. If the value is left empty, all the types will be exported. This parameter is repeating and is used just if the exportRenditions is checked.
Examples:
viewable_rendition__v
custom_rendition__c
exportSignaturePages
The flag indicates if the Signature Pages should be scanned. If it is checked, the scanner will download each viewable rendition which will include the signature page.
exportCrosslinkDocuments
This checkbox indicates if the cross-link documents should be scanned or not.
scanAuditTrails
This checkbox parameter indicates if the audit trail records of the scanned documents will be scanned as a separate MC object. The audit trail object will have ‘Source_type’ attribute set to Veeva(audittrail) and the ‘audited_obj_id’ will contain the source id of the audited document.
enableDeltaForAuditTrails
Flag indicating if new audit trails should be detected during a delta scan even if the document was not changed.
skipFtpContentDownload
This flag indicates whether the download of the content files from the FTP staging server should be skipped. If selected, the contents will remain on the staging server and the scanner will store their FTP paths.
The FTP paths have the following OOTB pattern:
ftp:/u<user-id>/<veeva-job-id>/<doc-id>/<version-identifier>/<doc-name>
Examples:
ftp:/u6236635/23625/423651/2_1/monthly-bill.pdf
useHTTPSContentTransfer
This flag indicates if the content files should be transferred from the Veeva Vault to the export location via the REST endpoint. When this box is checked, the scanner will download every content file sequentially via HTTPS.
Note: This feature is much slower, but it will not involve any FTP connection.
downloadFtpContentUsingRest
Flag indicating if the content from the Staging Folder will be downloaded using the REST API.
This is the recommended way to download the content locally.
Note: No connection to the FTPS server is made, so the FTPS ports do not need to be opened.
exportLocation*
Folder path. The location where the exported object content should be temporarily saved. It can be a local folder on the same machine as the Job Server or a shared folder on the network. This folder must exist prior to launching the scanner and the user account running the Job Server must have write permission on the folder. migration-center will not create this folder automatically. If the folder does not exist, an appropriate error will be raised and logged. This path must be accessible by both the scanner and the importer, so if they are running on different machines, it should be a shared folder.
Mandatory
ftpMaxNrOfThreads*
Maximum number of concurrent threads that will be used to transfer content from the FTP server to the local system. The maximum value allowed by the scanner is 20, but according to Veeva migration best practices it is strongly recommended to use at most 5 threads.
loggingLevel*
Sets the verbosity of the log file.
Values:
1 - logs only errors during scan
2 - is the default value reporting all warnings and errors
3 - logs all successfully performed operations in addition to any warnings or errors
4 - logs all events (for debugging only, use only if instructed by fme product support since it generates a very large amount of output. Do not use in production)
Mandatory
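When skipFtpContentDownload is used, downstream tooling may need to take the stored FTP paths apart. The sketch below parses the out-of-the-box pattern listed for that parameter. The class and function names are hypothetical, and the version identifier is kept as an opaque string (in the example it looks like major and minor version joined by an underscore, but that reading is not confirmed here).

```python
import re
from typing import NamedTuple


class StagedContent(NamedTuple):
    user_id: str
    job_id: str
    doc_id: str
    version: str
    name: str


# ftp:/u<user-id>/<veeva-job-id>/<doc-id>/<version-identifier>/<doc-name>
_FTP_PATH = re.compile(
    r"^ftp:/u(?P<user_id>[^/]+)/(?P<job_id>[^/]+)/(?P<doc_id>[^/]+)"
    r"/(?P<version>[^/]+)/(?P<name>.+)$"
)


def parse_ftp_path(path: str) -> StagedContent:
    """Split a stored staging path into its components."""
    match = _FTP_PATH.match(path)
    if match is None:
        raise ValueError(f"not a staging path: {path!r}")
    return StagedContent(**match.groupdict())
```

For the documented example, `parse_ftp_path("ftp:/u6236635/23625/423651/2_1/monthly-bill.pdf")` returns user id `6236635`, job id `23625`, document id `423651`, version `2_1` and name `monthly-bill.pdf`.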
Configuration property
Values
request_timeout
The time in milliseconds the scanner waits for a response after every API call.
Default: 600000 ms
client_id_prefix
This is the prefix of the Client ID that is passed to every Vault API call. The Client ID is composed of this prefix followed by the id of the job run. The Client ID is always logged in the report log.
Default: fme-mc-migration-job
no_of_request_attempts
Represents the number of request attempts when the first call fails due to a server error (5xx error code). The REST API call will be executed at least once, independent of the parameter value.
Default: 2
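The behaviour described for no_of_request_attempts can be sketched as a simple retry loop. This sketch assumes the value is the total number of attempts (not the number of additional retries), which the description does not state explicitly. `send_request` is a hypothetical callable returning an HTTP status code.

```python
def call_with_retries(send_request, no_of_request_attempts=2):
    """Call send_request until it returns a non-5xx status or attempts run out.

    Returns (last_status, attempts_made). The call is always executed at
    least once, even if the configured value is 0 or negative.
    """
    attempts = max(1, no_of_request_attempts)
    for attempt in range(1, attempts + 1):
        status = send_request()
        is_server_error = 500 <= status <= 599
        if not is_server_error or attempt == attempts:
            return status, attempt
```

Only server errors trigger another attempt in this sketch. Any other response, including a 4xx client error, is returned immediately.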