
Sample on how to use ForEach-Object -Parallel to iterate SharePoint site collections in parallel, something some of us do a LOT

Summary

Often we will have to iterate a lot of site collections in order to query if this or that property has been changed, or to update something.

There are a number of ways to speed up the process (divide and conquer using multiple scripts, or fan-out/fan-in in Azure Functions), but ForEach-Object -Parallel has been on my to-do list since I saw it here: https://devblogs.microsoft.com/powershell/powershell-foreach-object-parallel-feature/
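
Before involving SharePoint, it can help to see the cmdlet on its own. The following is a minimal sketch, assuming PowerShell 7 or later; the numbers stand in for site collection URLs and `$delayMs` is a hypothetical workload, neither is part of the sample script.

```powershell
# Requires PowerShell 7 or later. The numbers stand in for site collection URLs.
$delayMs = 200   # hypothetical per-item workload, not part of the sample script

$results = 1..8 | ForEach-Object -ThrottleLimit 4 -Parallel {
    # Variables from the calling scope must be read with the $using: prefix
    Start-Sleep -Milliseconds $using:delayMs
    [pscustomobject]@{ Item = $_; Square = $_ * $_ }
}

# Output order is not guaranteed, so sort before displaying
$results | Sort-Object Item | Format-Table
```

With a throttle limit of 4, at most four script blocks run at the same time, so the eight items should finish in roughly two batches rather than eight sequential waits.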

Implementation

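  • Open a PowerShell 7 (or later) session, for example in Visual Studio Code. ForEach-Object -Parallel is not available in Windows PowerShell 5.1, which is the version Windows PowerShell ISE runs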
  • Create a new file
  • Write a script as below
  • Change the variables to target your environment: the tenant admin URL, the selection of site collections, and the throttle limit
  • Run the script.
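
Because ForEach-Object -Parallel was introduced in PowerShell 7.0, a guard at the top of the script can fail early with a clear message. This is a small sketch and not part of the original sample; a `#Requires -Version 7.0` statement at the top of the file is an alternative.

```powershell
# ForEach-Object -Parallel was introduced in PowerShell 7.0
if ($PSVersionTable.PSVersion.Major -lt 7) {
    throw "This script needs PowerShell 7 or later; current version is $($PSVersionTable.PSVersion)."
}
Write-Host "Running on PowerShell $($PSVersionTable.PSVersion)"
```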

Screenshot of Output

Example Screenshot

  • PnP PowerShell


if(-not $cred)
{
    $cred = Get-Credential
}
$SPAdminSiteUrl = "https://[YourTenant]-admin.sharepoint.com"
#-ReturnConnection is needed for $conn to hold the connection object
$conn = Connect-PnPOnline -Url $SPAdminSiteUrl -Credentials $cred -ReturnConnection -ErrorAction Stop

#substitute this section with your selection of site collections
#in this case I just get the first 40 sitecollections from the tenant
try
{
    $SiteCollections = Get-PnPTenantSite -Connection $conn -ErrorAction Stop | Select-Object -First 40
}
catch
{
    Write-Host $_.Exception
    throw $_.Exception
}
#collect the URLs as plain strings, which are cheap to pass into the parallel runspaces
$urls = @($SiteCollections.Url)

function DoSomethingInASiteCollection ($sitecollectionUrl, $cred )
{
    $success = $false
    $reruncount = 0
    $filecounter = 0
    $errors = @()
    while($success -eq $false -and $reruncount -lt 9)
    {
        try
        {
            #reset the counter so a retry does not count files from a failed attempt twice
            $filecounter = 0
            $localconn = Connect-PnPOnline -Url $sitecollectionUrl -Credentials $cred -ReturnConnection -ErrorAction Stop
            $lists= Get-PnPList -Connection $localconn -ErrorAction Stop| Where-Object {$_.BaseTemplate -eq 101 -and $_.Hidden -eq $false} 
            foreach($list in $lists)
            {
                if( $list.Title -eq "Form Templates" -or $list.Title -eq "Style Library" -or $list.Title -eq "Site Assets")
                {
                    #not sure if I need to count those
                }
                else
                {
                    $listItems = Get-PnPListItem -Connection $localconn -List $list  -ErrorAction Stop
                    $filecounter+= $listItems.Count
                }
                
            }
            $outputobj = new-object PSObject -property @{"SiteUrl" = $sitecollectionUrl; "FileCount" = $filecounter; "Reruncount" = $reruncount ; "errors"  = $errors}
            $success = $true
            return $outputobj
            
        }
        catch
        {
            #this error handling is pretty rudimentary, please replace it with your own :-)
            if($_.Exception.Message -like "*429*")
            {
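                Write-Warning -Message ("Received throttling error ")
                #assumption: wait 30 seconds when the exception carries no Retry-After header
                $waittime = 30
                if($_.Exception.Response -and $_.Exception.Response.Headers)
                {
                    $waittime = [int]$_.Exception.Response.Headers.GetValues("Retry-after")[0]
                }
                Start-Sleep -Seconds $waittime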
            }
            #write-host -f Red "`tError:" $_.Exception.Message $_.Exception
            $reruncount++
            $errors+= $_.Exception.Message
        }
    }
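    #the loop stops when $reruncount reaches 9, so test with -ge; -gt 9 would never be true and nothing would be returned
    if($reruncount -ge 9)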
    {
        $outputobj = new-object PSObject -property @{"SiteUrl" = $sitecollectionUrl; "FileCount" = $filecounter; "Reruncount" = $reruncount ; "errors" = $errors}
        return $outputobj
    
    }
}


#just included in order to show the diff between running it in sequence and parallel
#it also serves as a test bed for the logic in the function as you can debug using this
if($true)
{
    Write-Host "Starting sequential run" -ForegroundColor Green
    $start = (Get-Date)
    foreach($url in $urls)
    {
        $test = DoSomethingInASiteCollection -sitecollectionUrl $url -cred $cred
        write-host "URL $($test.SiteUrl), FileCount   $($test.FileCount)  , Reruns $($test.Reruncount), errors  $($test.errors)" -ForegroundColor Yellow
    }
    $end = (Get-Date)
    $sequentialtimespan = $end - $start
    Write-Host "Running sequentially total time:$sequentialtimespan " -ForegroundColor Green
}


Write-Host "Starting parallel run" -ForegroundColor Blue
$throttleLimit = 10
$funcDef = $function:DoSomethingInASiteCollection.ToString()
$threadSafeDictionary = [System.Collections.Concurrent.ConcurrentDictionary[string,object]]::new()

$start = Get-Date
#-Parallel takes the script block as its argument, so it must come directly before the block
$urls | ForEach-Object -ThrottleLimit $throttleLimit -Parallel {
    $function:DoSomethingInASiteCollection = $using:funcDef
    $res = DoSomethingInASiteCollection -sitecollectionUrl $_  -cred $using:cred 
    $dict = $using:threadSafeDictionary
    $outObject = new-object PSObject -property @{"FileCount" = $res.filecount; "Reruncount" = $res.reruncount; errors = $res.errors }
    $dict.TryAdd($res.SiteUrl, $outObject) | Out-Null

} 
$end = Get-Date
$timespan = $end - $start

$threadSafeDictionary.Count
foreach($key in $threadSafeDictionary.Keys)
{
    $returnObject = $threadSafeDictionary[$key]
    if($returnObject.errors -and $returnObject.errors.Count -gt 0)
    {
        Write-Host " Url : $key failed with codes $($returnObject.errors)" -ForegroundColor Red
    }
    else
    {
        Write-Host " Url : $key contains $($returnObject.FileCount) items, reruns = $($returnObject.Reruncount)"
    }
}
Write-Host "Running sequentially total time:$sequentialtimespan " -ForegroundColor Green
Write-Host "Running parallel total time:$timespan " -ForegroundColor Blue
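
The two techniques the parallel section relies on, passing a function definition into each runspace and collecting results in a thread-safe dictionary, can be tried without a tenant. This is a minimal sketch assuming PowerShell 7 or later; `GetFakeSiteInfo` and the `contoso.example` URLs are hypothetical stand-ins that make no network calls.

```powershell
# Hypothetical stand-in for DoSomethingInASiteCollection; it makes no network calls
function GetFakeSiteInfo {
    param($siteUrl)
    [pscustomobject]@{ SiteUrl = $siteUrl; FileCount = $siteUrl.Length }
}

# Functions defined in the calling session are not visible inside the parallel
# runspaces, so the body is passed along as a string
$funcDef = $function:GetFakeSiteInfo.ToString()

# Thread-safe collection shared by all runspaces
$results = [System.Collections.Concurrent.ConcurrentDictionary[string,object]]::new()
$fakeUrls = 'https://contoso.example/sites/a', 'https://contoso.example/sites/bb'

$fakeUrls | ForEach-Object -ThrottleLimit 2 -Parallel {
    # Recreate the function inside this runspace from the passed-in definition
    $function:GetFakeSiteInfo = $using:funcDef
    $dict = $using:results
    $info = GetFakeSiteInfo -siteUrl $_
    $null = $dict.TryAdd($info.SiteUrl, $info)
}

# Number of entries collected from the parallel runspaces
$results.Count
```

The dictionary key must not be null, which is why the function should return an object with a SiteUrl on every code path, including the failure path.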


Check out PnP PowerShell to learn more: https://aka.ms/pnp/powershell

The way you log in to PnP PowerShell has changed. Please read "PnP Management Shell EntraID app is deleted: what should I do?"

Contributors

Author(s)
Kasper Larsen, Fellowmind

Disclaimer

THESE SAMPLES ARE PROVIDED AS IS WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING ANY IMPLIED WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR NON-INFRINGEMENT.
