SharePoint Online Scanner

Introduction

The SharePoint Online Scanner allows extracting documents, folders and their related information from Microsoft SharePoint Online libraries.

Starting with version 3.15 Update 2 the SharePoint Online Scanner only supports the app-only authentication method! If you cannot use app-only authentication for any reason, please do not upgrade to 3.15 Update 2 or a later version.

Known issues & limitations

  • The SPO Scanner might receive a timeout error from SharePoint Online when scanning libraries with more than 5000 documents (#52865). This can be solved by increasing the timeout value that matches your situation.

    • There are 3 situations in which the scanner can get a timeout exception:

      • The initialization phase: the Java component waits for the C# component to retrieve all the objects that satisfy the conditions. This is solved by increasing the value of the initialization_timeout property in the additionalConfig.properties file (...\lib\mc-sharepoint-online-scanner).

      • The communication between the Java component and the C# component: This is solved by increasing the value for the timeout property which is present in the serverConfig.properties file (...\lib\mc-sharepoint-online-scanner).

      • The communication between the C# component and SharePoint: This is solved by increasing the value of the sharepointCommunicationTimeout variable in the de.fme.mc.spo.scanner.winservice.exe.config file (...\lib\mc-sharepoint-online-scanner\CSOM_service).
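Raising the first two timeouts amounts to editing a key=value line in the respective .properties file. The following is a minimal Python sketch of such an edit and is not part of migration-center: the helper set_property is hypothetical, the sample file content and the values 600 and 900 are placeholders, and the unit of the timeout values is not stated here, so check your own files before changing anything. The third setting lives in an XML .exe.config file and is not covered by this sketch.

```python
import re


def set_property(text: str, key: str, value: str) -> str:
    """Return .properties text with `key` set to `value` (appended if absent)."""
    # Match "key=value" or "key: value" at the start of a line, keeping the prefix.
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*[=:][ \t]*).*$", re.MULTILINE
    )
    if pattern.search(text):
        return pattern.sub(lambda m: m.group(1) + value, text, count=1)
    separator = "" if text.endswith("\n") or not text else "\n"
    return f"{text}{separator}{key}={value}\n"


# Placeholder content standing in for serverConfig.properties.
server_config = "port=57096\ntimeout=100\n"
print(set_property(server_config, "timeout", "600"))

# Placeholder content standing in for additionalConfig.properties.
additional_config = "some_other_setting=1"
print(set_property(additional_config, "initialization_timeout", "900"))
```

In practice you would read the file, apply the change, and write it back while the services are stopped; whether a restart is required is an assumption to verify in your environment.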

Installation

To install the main product components, consult the migration-center Installation Guide document.

The migration-center SharePoint Online scanner requires installing an additional component besides the main product components.

This additional component requires the .NET Framework 4.7.2. It is designed to run as a Windows service and must be installed on every machine where a Job Server is installed.

To install this additional component, run the installation file located in the SharePoint folder of your Job Server install location. By default, this is C:\Program Files (x86)\fme AG\migration-center Server Components <Version>\lib\mc-sharepointonline-scanner\CSOM_Service\install.

To install the service, run the install.bat file with administrative privileges. You need to start the service manually the first time; afterwards, it is configured to start automatically at system startup.

To uninstall the service run the uninstall.bat file using administrative privileges.

Before uninstalling the Job Server component, the CSOM service must be uninstalled as described above.

The app-only principal authentication used by the scanner calls the following HTTPS endpoints. Please ensure that the job server machine has access to those endpoints:

  • <tenant name>.sharepoint.com:443

  • accounts.accesscontrol.windows.net:443
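A quick way to check this from the Job Server machine is to attempt a TCP connection to each endpoint. The sketch below is only an illustration under assumptions: the function names are hypothetical, "contoso" is a placeholder tenant name, and a successful TCP connection does not prove that TLS inspection or a proxy will let the actual requests through.

```python
import socket


def required_endpoints(tenant_name: str) -> list[tuple[str, int]]:
    """Return the (host, port) pairs listed above for the given tenant."""
    return [
        (f"{tenant_name}.sharepoint.com", 443),
        ("accounts.accesscontrol.windows.net", 443),
    ]


def is_reachable(host: str, port: int, timeout_seconds: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


def report(tenant_name: str) -> None:
    """Print one status line per required endpoint."""
    for host, port in required_endpoints(tenant_name):
        status = "reachable" if is_reachable(host, port) else "NOT reachable"
        print(f"{host}:{port} {status}")


# Usage (performs real network connections):
# report("contoso")
```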

The CSOM service must run under the same user account as the Job Server service so that it has the same access to the export location.

When running the CSOM service with a domain account you might need to grant access to the account by running the following command: netsh http add urlacl url=http://+:57096/ user=<your user>

<your user> might be in the format domain\username or username@domain.com

Preparation for app-only principal authentication

The scanner supports app-only principal authentication for connecting to SharePoint Online. App-only principal authentication comes in two flavors: Azure AD app-only principal authentication and SharePoint app-only principal authentication.

Azure AD app-only authentication requires full control access for the migration-center application on your SharePoint Online tenant. This includes full control on ALL site collections of your tenant.

If you want to restrict the access of the migration-center application to certain site collections or sites, you can use SharePoint app-only authentication.

Azure AD app-only principal authentication

The migration-center SharePoint Online scanner supports Azure AD app-only authentication. This is the authentication method for background processes accessing SharePoint Online recommended by Microsoft. When using SharePoint Online you can define applications in Azure AD and these applications can be granted permissions to your SharePoint Online tenant.

Please follow these steps to set up your migration-center application in your Azure AD.

The information in this chapter is based on the following Microsoft guidelines: https://docs.microsoft.com/en-us/sharepoint/dev/solution-guidance/security-apponly-azuread

Step 1: Create a self-signed certificate for your migration-center Azure AD application

In Azure AD, App-Only access is typically requested with a certificate: anyone who has the certificate and its private key can use the app and the permissions granted to the app. The steps below walk you through the setup of this model.

To configure the Azure AD application for invoking SharePoint Online with an App-Only access token, you must create and configure a self-signed X.509 certificate. This certificate is used to authenticate your migration-center application against Azure AD when it requests the App-Only access token. The certificate can be created either with the makecert.exe tool that is available in the Windows SDK or with a provided PowerShell script that does not depend on makecert. Using the PowerShell script is the preferred method and is explained in this chapter.

It's important that you run the scripts below with Administrator privileges.

To create a self-signed certificate, run the following script, which you can find in the <job server folder>\lib\mc-spo-batch-importer\scripts folder:

.\Create-SelfSignedCertificate.ps1 -CommonName "MyCompanyName" -StartDate 2020-07-01 -EndDate 2022-06-30

The dates are provided in ISO date format: YYYY-MM-dd

You will be asked to give a password to encrypt your private key, and both the .PFX file and .CER file will be exported to the current folder.

Save the password of the private key as you’ll need it later.
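Because the certificate must remain valid for the whole migration project, it can help to sanity-check the -StartDate and -EndDate values before running the script. The following Python sketch is only an illustration: the helper certificate_validity_days is hypothetical and simply checks that both values are valid ISO dates and that the end date lies after the start date.

```python
from datetime import date


def certificate_validity_days(start: str, end: str) -> int:
    """Parse two ISO dates (YYYY-MM-dd) and return the validity period in days."""
    start_date = date.fromisoformat(start)  # raises ValueError on a malformed date
    end_date = date.fromisoformat(end)
    if end_date <= start_date:
        raise ValueError("EndDate must be later than StartDate")
    return (end_date - start_date).days


# The dates from the example command above.
print(certificate_validity_days("2020-07-01", "2022-06-30"))
```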

Step 2: Register the migration-center Azure AD application

The next step is registering an Azure AD application in the Azure Active Directory tenant that is linked to your Office 365 tenant. To do that, open the Office 365 Admin Center (https://admin.microsoft.com) using the account of a user who is a member of the Tenant Global Admins group. Click on the "Azure Active Directory" link that is available under the "Admin centers" group in the left-side tree view of the Office 365 Admin Center. The Microsoft Azure portal (https://portal.azure.com/) opens in a new browser tab. If this is the first time you access the Azure portal with your account, you will have to register a new Azure subscription, providing some information and a credit card for any payment need. Using Azure AD and registering an Office 365 application are free capabilities, so you will not be charged for them. Once you have access to the Azure portal, select the "Azure Active Directory" section and choose the option "App registrations". See the next figure for further details.

In the "App registrations" tab you will find the list of Azure AD applications registered in your tenant. Click the "New registration" button in the upper left part of the blade. Next, provide a name for your application, e.g. “migration-center” and click on "Register" at the bottom of the blade.

Once the application has been created copy the "Application (client) ID" as you’ll need it later.

Step 3: Configure necessary permissions for the migration-center application

Now click on "API permissions" in the left menu bar and click on the "Add a permission" button. A new blade will appear. Here you choose the permissions that are required by migration-center. Choose the following:

  • Microsoft APIs

    • SharePoint

      • Application permissions

        • Sites

          • Sites.FullControl.All

        • TermStore

          • TermStore.Read.All

        • User

          • User.Read.All

    • Graph

      • Application permissions

        • Sites

          • Sites.FullControl.All

Click on the blue "Add permissions" button at the bottom to add the permissions to your application. The "Application permissions" are those granted to the migration-center application when running as App Only.

Step 4: Uploading the self-signed certificate

The next step is “connecting” the certificate you created earlier to the application. Click on "Certificates & secrets" in the left menu bar. Click on the "Upload certificate" button, select the .CER file you generated earlier and click on "Add" to upload it.

Step 5: Grant admin consent

The “Sites.FullControl.All” application permission requires admin consent in a tenant before it can be used. To grant it, click on "API permissions" in the left menu again. At the bottom you will see a section "Grant consent". Click on the "Grant admin consent for" button and confirm the action by clicking on the "Yes" button that appears at the top.

Final Step: Setting the necessary parameters in the scanner

In order to use Azure AD app-only principal authentication with the SharePoint Online scanner you need to fill in the following scanner parameters with the information you gathered in the steps above:

SharePoint app-only principal authentication

SharePoint app-only authentication allows you to grant fine-grained access permissions on your SharePoint Online tenant for the migration-center application.

The information in this chapter is based on the following guidelines from Microsoft:

https://docs.microsoft.com/en-us/sharepoint/dev/solution-guidance/security-apponly-azuread

https://docs.microsoft.com/en-us/sharepoint/dev/solution-guidance/security-apponly-azureacs

https://docs.microsoft.com/en-us/sharepoint/dev/sp-add-ins/add-in-permissions-in-sharepoint

Step 1: Create a self-signed certificate for your migration-center Azure AD application

In Azure AD, App-Only access is typically requested with a certificate: anyone who has the certificate and its private key can use the app and the permissions granted to the app. The steps below walk you through the setup of this model.

To configure the Azure AD application for invoking SharePoint Online with an App-Only access token, you must create and configure a self-signed X.509 certificate. This certificate is used to authenticate your migration-center application against Azure AD when it requests the App-Only access token. The certificate can be created either with the makecert.exe tool that is available in the Windows SDK or with a provided PowerShell script that does not depend on makecert. Using the PowerShell script is the preferred method and is explained in this chapter.

It's important that you run the scripts below with Administrator privileges.

To create a self-signed certificate, run the following script, which you can find in the <job server folder>\lib\mc-spo-batch-importer\scripts folder:

.\Create-SelfSignedCertificate.ps1 -CommonName "MyCompanyName" -StartDate 2020-07-01 -EndDate 2022-06-30

The dates are provided in ISO date format: YYYY-MM-dd

You will be asked to give a password to encrypt your private key, and both the .PFX file and .CER file will be exported to the current folder.

Save the password of the private key as you’ll need it later.

Step 2: Register the migration-center Azure AD application

The next step is registering an Azure AD application in the Azure Active Directory tenant that is linked to your Office 365 tenant. To do that, open the Office 365 Admin Center (https://admin.microsoft.com) using the account of a user who is a member of the Tenant Global Admins group. Click on the "Azure Active Directory" link that is available under the "Admin centers" group in the left-side tree view of the Office 365 Admin Center. The Microsoft Azure portal (https://portal.azure.com/) opens in a new browser tab. If this is the first time you access the Azure portal with your account, you will have to register a new Azure subscription, providing some information and a credit card for any payment need. Using Azure AD and registering an Office 365 application are free capabilities, so you will not be charged for them. Once you have access to the Azure portal, select the "Azure Active Directory" section and choose the option "App registrations". See the next figure for further details.

In the "App registrations" tab you will find the list of Azure AD applications registered in your tenant. Click the "New registration" button in the upper left part of the blade. Next, provide a name for your application, e.g. “migration-center” and click on "Register" at the bottom of the blade.

Once the application has been created copy the "Application (client) ID" as you’ll need it later.

Step 3: Uploading the self-signed certificate and generating a secret key

The next step is “connecting” the certificate you created earlier to the application. Click on "Certificates & secrets" in the left menu bar. Click on the "Upload certificate" button, select the .CER file you generated earlier and click on "Add" to upload it.

After that, you need to create a secret key. Click on “New client secret” to generate a new secret key. Give it an appropriate description, e.g. “migration-center”, and choose an expiration period that matches your migration project time frame. Click on “Add” to create the key.

Store the retrieved information (client id and client secret) since you'll need it later! Please safeguard the created client id/secret combination as if it were your administrator account. Anyone with this client id/secret can read and update all data in your SharePoint Online environment!

Step 4: Granting permissions to the app-only principal

The next step is granting permissions to the newly created principal in SharePoint Online.

Tenant scoped permissions can only be granted via the “appinv.aspx” page on the tenant administration site. If your tenant administration URL is https://contoso-admin.sharepoint.com, you can reach this page via https://contoso-admin.sharepoint.com/_layouts/15/appinv.aspx.

If you want to grant site collection scoped permissions, open the “appinv.aspx” on the specific site collection, e.g. https://contoso.sharepoint.com/sites/mysite/_layouts/15/appinv.aspx.
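The two URL patterns above differ only in the site they are appended to. As a small illustration, the following Python sketch builds both; the helper names are hypothetical and "contoso" and "mysite" are the placeholder names used in this chapter.

```python
def tenant_admin_site(tenant_name: str) -> str:
    """Return the tenant administration site URL for a tenant name."""
    return f"https://{tenant_name}-admin.sharepoint.com"


def appinv_url(site_url: str) -> str:
    """Append the appinv.aspx page path to a site or site collection URL."""
    return site_url.rstrip("/") + "/_layouts/15/appinv.aspx"


# Tenant scoped permissions: use the tenant administration site.
print(appinv_url(tenant_admin_site("contoso")))

# Site collection scoped permissions: use the site collection itself.
print(appinv_url("https://contoso.sharepoint.com/sites/mysite"))
```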

Once the page is loaded add your client id and look up the created principal by pressing the "Lookup" button:

Please enter “www.migration-center.com” in field “App Domain” and “https://www.migration-center.com” in field “Redirect URL”.

To grant permissions, you'll need to provide the permission XML that describes the needed permissions. The migration-center application will always need the “FullControl” permission. Use the following permission XML for granting tenant scoped permissions:

<AppPermissionRequests AllowAppOnlyPolicy="true">
   <AppPermissionRequest Scope="http://sharepoint/content/tenant" Right="FullControl" />
   <AppPermissionRequest Scope="http://sharepoint/taxonomy" Right="Read" />
</AppPermissionRequests>

Use this permission XML for granting site collection scoped permissions:

<AppPermissionRequests AllowAppOnlyPolicy="true">
   <AppPermissionRequest Scope="http://sharepoint/content/sitecollection" Right="FullControl" />
   <AppPermissionRequest Scope="http://sharepoint/taxonomy" Right="Read" />
</AppPermissionRequests>
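The two permission XML documents differ only in the content scope URI. If you prepare the XML for several tenants or site collections, a sketch like the following can generate it; this is an illustration only, the function permission_xml and the SCOPES mapping are hypothetical names, and the output should be pasted into the permission field of the appinv.aspx page exactly as you would paste the XML shown above.

```python
import xml.etree.ElementTree as ET

# Content scope URIs taken from the two permission XML examples above.
SCOPES = {
    "tenant": "http://sharepoint/content/tenant",
    "sitecollection": "http://sharepoint/content/sitecollection",
}


def permission_xml(scope: str) -> str:
    """Build the permission request XML for 'tenant' or 'sitecollection' scope."""
    root = ET.Element("AppPermissionRequests", {"AllowAppOnlyPolicy": "true"})
    ET.SubElement(
        root, "AppPermissionRequest",
        {"Scope": SCOPES[scope], "Right": "FullControl"},
    )
    ET.SubElement(
        root, "AppPermissionRequest",
        {"Scope": "http://sharepoint/taxonomy", "Right": "Read"},
    )
    return ET.tostring(root, encoding="unicode")


print(permission_xml("sitecollection"))
```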

When you click on “Create” you'll be presented with a permission consent dialog. Press “Trust It” to grant the permissions:

Please safeguard the created client id/secret combination as if it were your administrator account. Anyone with this client id/secret can read and update all data in your SharePoint Online environment!

Final Step: Setting the necessary parameters in the scanner

In order to use SharePoint app-only principal authentication with the SharePoint Online scanner you need to fill in the following scanner parameters with the information you gathered in the steps above:

SharePoint Online Scanner Properties

To create a new SharePoint Online Scanner, create a new scanner and select SharePoint Online from the Adapter Type drop-down. Once the adapter type has been selected, the Parameters list will be populated with the parameters specific to the selected adapter type. Mandatory parameters are marked with an *.

The Properties of an existing scanner can be accessed after creating the scanner by double-clicking the scanner in the list or by selecting the Properties button/menu item from the toolbar/context menu. A description is always displayed at the bottom of the window for the selected parameter.

Multiple scanners can be created for scanning different locations, provided each scanner has a unique name.

Common scanner parameters

SharePoint Scanner parameters

The configuration parameters available for the SharePoint Scanner are described below:

Scanning OneDrive using the SharePoint Online Scanner

You can scan documents from OneDrive using the same parameters as for scanning SharePoint Online libraries but with a different format for some of them.

Scanning from OneDrive requires the Azure AD app-only principal authentication with the self-signed certificate and password. It does not work with SharePoint app-only credentials with the appClientSecret.

Parameters not mentioned here are either not used when scanning from OneDrive or do not have any specific requirement.

Scanning using the CAML query

The SharePoint Online scanner can use SharePoint CAML queries for filtering which objects are to be scanned. Based on the entered query, the scanner scans documents and folders in the lists/libraries.

The queries used must only contain the content that would be placed inside the <Where> block. The scope is already set to recursive.

The following example shows a simple CAML query for scanning the contents of the Docs folder. In this example "mc" is the source site, "Versioning" a subsite, "VersionNumber" a library and "Docs" a folder.

<BeginsWith> 
   <FieldRef Name='FileDirRef'/> 
   <Value Type='Text'>mc/Versioning/VersionNumber/Docs</Value>
</BeginsWith>

More complex queries can also be used. This next example scans only documents created on or before a chosen date from 2 different folders.

<And>
   <Or>
      <BeginsWith>
         <FieldRef Name='FileDirRef'/>
         <Value Type='Text'>mc/Versioning/VersionNumber/Docs</Value>
      </BeginsWith>
      <BeginsWith>
         <FieldRef Name='FileDirRef'/>
         <Value Type='Text'>mc/docLib/SubFolder</Value>
      </BeginsWith>
   </Or>
   <Leq>
      <FieldRef Name='Created'/>
      <Value IncludeTimeValue='TRUE' Type='DateTime'>2020-12-31T00:00:00Z</Value>
   </Leq>
</And>
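Because CAML <And> and <Or> elements each take exactly two operands, longer queries become deeply nested and are easy to get wrong by hand. The following Python sketch rebuilds the example above programmatically; it is only an illustration, the helper functions are hypothetical, and the paths and date are the sample values from this chapter. It requires Python 3.9 or later for ET.indent.

```python
import xml.etree.ElementTree as ET


def begins_with_folder(path: str) -> ET.Element:
    """Build a <BeginsWith> condition on the FileDirRef field."""
    node = ET.Element("BeginsWith")
    ET.SubElement(node, "FieldRef", {"Name": "FileDirRef"})
    value = ET.SubElement(node, "Value", {"Type": "Text"})
    value.text = path
    return node


def created_on_or_before(timestamp: str) -> ET.Element:
    """Build a <Leq> condition on the Created field."""
    node = ET.Element("Leq")
    ET.SubElement(node, "FieldRef", {"Name": "Created"})
    value = ET.SubElement(
        node, "Value", {"IncludeTimeValue": "TRUE", "Type": "DateTime"}
    )
    value.text = timestamp
    return node


def combine(tag: str, first: ET.Element, second: ET.Element) -> ET.Element:
    """Wrap exactly two conditions in an <And> or <Or> element."""
    node = ET.Element(tag)
    node.extend([first, second])
    return node


query = combine(
    "And",
    combine(
        "Or",
        begins_with_folder("mc/Versioning/VersionNumber/Docs"),
        begins_with_folder("mc/docLib/SubFolder"),
    ),
    created_on_or_before("2020-12-31T00:00:00Z"),
)
ET.indent(query, space="   ")
print(ET.tostring(query, encoding="unicode"))
```

The printed fragment contains only the content of the <Where> block, which is the form the scanner expects.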

For details on how to form CAML queries for each version of SharePoint please consult the official Microsoft MSDN documentation.

When using the CAML query parameter “query”, the parameters "excludeListAndLibraries", "includeListAndLibraries", "scanSubsites", "excludeSubsites", "excludeFolders", "includeFolders" must not be set. Otherwise the scanner will fail to start with an error message.
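This rule can be checked before a scanner run is started. The sketch below is a hypothetical pre-check, not the scanner's own validation: it assumes the scanner parameters are available as a dictionary of strings and treats any non-empty value as "set", which may not match how the scanner interprets Boolean parameters such as scanSubsites.

```python
# Parameters that must not be set together with "query" (from the rule above).
CONFLICTING_WITH_QUERY = (
    "excludeListAndLibraries",
    "includeListAndLibraries",
    "scanSubsites",
    "excludeSubsites",
    "excludeFolders",
    "includeFolders",
)


def conflicting_parameters(params: dict[str, str]) -> list[str]:
    """Return the names of parameters that conflict with a non-empty query."""
    if not params.get("query", "").strip():
        return []
    return [
        name for name in CONFLICTING_WITH_QUERY
        if params.get(name, "").strip()
    ]


# Placeholder parameter values for illustration.
example = {
    "query": "<BeginsWith>...</BeginsWith>",
    "includeFolders": "Docs",
    "excludeFolders": "",
}
print(conflicting_parameters(example))
```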

Scanning permissions

The SharePoint Online scanner can extract permission information for documents and folders. Note that only unique permissions are extracted. Permissions inherited from parent objects are not extracted by the scanner.

Additional configuration settings

An additional configuration file holds further settings for the SharePoint Online Scanner. It is located in the …/lib/mc-sharepointonline-scanner/ folder of the Job Server install location and has the following properties that can be set:

Additional logs

An additional log file is generated by the SharePoint Online Scanner.

This log file is located in the same folder as the regular SharePoint Online scanner log files and is named mc-sharepointonline-scanner.log.
