SP: Purging document libraries with PowerShell

I wanted to purge the contents of a huge document library. Because of the 5,000-item list view threshold, using the GUI or Explorer view wasn't possible any more. There are a couple of scripts out there that can help, based on the ProcessBatchData method, which processes a whole batch of operations in a single call. This is often the best approach, because deleting items one by one takes a lot of time.
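To give an idea of what such a batch looks like, here is a minimal sketch that only builds the XML string and never touches SharePoint. The function name New-DeleteBatchXml, its parameters and the Id/Url item shape are made up for illustration, and the XML escaping of the URL is my own addition; the Method/SetList/SetVar layout is the batch format as I understand it, so treat it as an assumption and test it against a non-production library first.

```powershell
# Hypothetical helper for illustration: builds a delete batch as a string.
# It needs no SharePoint snap-in, so it can be run in any PowerShell session.
function New-DeleteBatchXml {
    param(
        [Guid] $ListId,
        [object[]] $Items   # assumed shape: hashtables with an Id and a Url key
    )

    # One <Method> element per item to delete
    $method = '<Method><SetList Scope="Request">{0}</SetList>' +
              '<SetVar Name="ID">{1}</SetVar>' +
              '<SetVar Name="owsfileref">{2}</SetVar>' +
              '<SetVar Name="Cmd">Delete</SetVar></Method>'

    $builder = New-Object System.Text.StringBuilder
    [void] $builder.Append('<?xml version="1.0" encoding="UTF-8"?><Batch>')

    foreach ($item in $Items) {
        # Escape the URL so characters such as & do not break the XML
        $url = [System.Security.SecurityElement]::Escape($item.Url)
        [void] $builder.Append(($method -f $ListId, $item.Id, $url))
    }

    [void] $builder.Append('</Batch>')
    return $builder.ToString()
}

# Example with made-up items
$xml = New-DeleteBatchXml -ListId ([Guid]::NewGuid()) -Items @(
    @{ Id = 1; Url = '/subsite/Shared Documents/a.docx' },
    @{ Id = 2; Url = '/subsite/Shared Documents/b & c.docx' }
)
$xml
```

The resulting string is what would be handed to ProcessBatchData on the web object.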

Now, upon deletion, you can get all kinds of non-descriptive errors (operation failed, 0xblabla…). When your script fails after some items were deleted, there is a good chance that check-outs are getting in the way. Checked-out files have to be checked in again before they can be deleted, and folders to be deleted cannot contain any checked-out items either. So I created a purge script that takes care of all of these things before deleting the entire contents of a library. I'm saving it here mainly for my own reference, but feel free to use it if you want to!

Add-PSSnapIn Microsoft.SharePoint.Powershell -ErrorAction SilentlyContinue

# Appends a line to the deletion builder and outputs a line every 
# 100 items to track progress
function AppendLine($line)
{
    # $deleteBldr is created in PurgeDocumentLibrary; it is visible here because
    # PowerShell functions can read variables from the calling scope
    $deleteBldr.Append($line) > $null
    Set-Variable -Name lines -Value ($lines + 1) -Scope Global
    
    # Write a line every 100 items to track the progress on this library
    if ($lines % 100 -eq 0)
    {
        Write-Host "Appended line $lines of $itemcount" 
    }
}

# Helper function which processes an item. This way it doesn't matter 
# what type you put in, the correct function is always executed, or 
# an error is thrown when the type is unusable
function ProcessItem($item)
{
    if ($item -eq $null)
    {
        return
    }

    if ($item -is [Microsoft.SharePoint.SPFolder])
    {
        ProcessFolder($item)
    }
    elseif ($item.FileSystemObjectType -eq "Folder")
    {
        ProcessFolder($item.Folder)
    }
    elseif ($item -is [Microsoft.SharePoint.SPFile])
    {
        ProcessFile($item)
    }
    elseif ($item.FileSystemObjectType -eq "File")
    {
        ProcessFile($item.File)
    }
    else
    {
        throw "Woops, the $item of type $($item.GetType()) cannot be processed" 
    }
}

# Processes a file; check-in when required and then append to delete
function ProcessFile($file)
{
    # When there is no item linked to this file, it's a system file; return and ignore it.
    # (return, not break: break would also end the calling foreach loop)
    if ($file.Item -eq $null)
    {
        return
    }
    
    # When the file is checked out in some way; forcibly check it in
    if ($file.CheckOutType -ne "None")
    {
        $file.CheckIn("Automatic checkin")
        $file.Update()
    }
        
    $line = [System.String]::Format($command, $file.Item.ID.ToString(), $file.ServerRelativeUrl)
    AppendLine $line
}

# Processes a folder; first all subfolders and files in the folder and then
# the folder itself. Appends a line for bulk deletion
function ProcessFolder($folder)
{
    foreach ($item in $folder.SubFolders)
    {
        ProcessItem($item)
    }
    
    foreach ($item in $folder.Files)
    {
        ProcessItem($item)
    }
    
    # only delete a folder when it has an item linked to it... folders without an item are system folders
    if ($folder.item -ne $null)
    {
        $line = [System.String]::Format($command, $folder.Item.ID.ToString(), $folder.ServerRelativeUrl)
        AppendLine $line
    }
}

# Processes a document library and purges all of its content. This includes check-in of files
# without a checked-in version, or plain checked-out files (which cannot be deleted)
function PurgeDocumentLibrary( [Microsoft.SharePoint.SPWeb] $spweb, [Microsoft.SharePoint.SPList] $splist)
{ 
    Write-Host "Processing library $($splist.Title) with $($splist.ItemCount) items"
    
    # Create the batch string builder containing the command XML
    [System.Text.StringBuilder] $deleteBldr = New-Object "System.Text.StringBuilder"
    $deleteBldr.Append("<?xml version=`"1.0`" encoding=`"UTF-8`"?><Batch>") > $null
    # Template for one delete command: {0} is the item ID, {1} the server-relative URL
    $command = "<Method><SetList Scope=`"Request`">" + $splist.ID + "</SetList><SetVar Name=`"ID`">{0}</SetVar><SetVar Name=`"owsfileref`">{1}</SetVar><SetVar Name=`"Cmd`">Delete</SetVar></Method>"
    
    # variable used in output to screen, to track progress
    $itemcount = $splist.FolderCount + $splist.ItemCount

    # forcibly check in any files which do not have a checked-in version... 
    # otherwise these will cause errors    
    foreach ($checkedOutFile in $splist.CheckedOutFiles)
    {
        $checkedOutFile.TakeOverCheckOut()
        $listItem = $splist.GetItemById($checkedOutFile.ListItemId)
        $listItem.File.CheckIn("Automated checkin")
    }

    # Process the root folder, which will then recursively process all sub items
    ProcessItem $splist.RootFolder
    
    # append closing tag for proper XML structure
    $deleteBldr.Append("</Batch>") > $null
    
    # Process the batch command and update the web (not strictly necessary)
    $command = $deleteBldr.ToString()
    $spweb.ProcessBatchData($command)
    $spweb.Update()
}    

# start assignment for disposal
Start-SPAssignment -Global
Set-Variable -Name lines -Value 0 -Scope "global"

# Get the web object, retrieve the list and call the Purge method
$web = Get-SPWeb "http://www.contoso.com/subsite"
$list = $web.Lists["Shared Documents"]
PurgeDocumentLibrary $web $list

# stop assignment for disposal
Stop-SPAssignment -Global
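
The order in which ProcessFolder appends its lines matters: everything inside a folder is queued before the folder itself. The sketch below shows that same ordering on a small in-memory tree, so it can be run without SharePoint. Get-DeletionOrder and the hashtable layout (Url, Files, SubFolders) are hypothetical stand-ins for the real folder objects, not part of the script above.

```powershell
# Hypothetical stand-in for ProcessFolder: walks a tree of hashtables and
# emits URLs in the order they would be queued for deletion.
function Get-DeletionOrder {
    param([hashtable] $Folder)

    # Subfolders first, recursively
    foreach ($sub in $Folder.SubFolders) {
        Get-DeletionOrder -Folder $sub
    }

    # Then the files directly inside this folder
    foreach ($file in $Folder.Files) {
        $file
    }

    # Finally the folder itself, once all of its contents have been emitted
    $Folder.Url
}

# Made-up tree: folder A contains one file and one subfolder B
$tree = @{
    Url        = '/lib/A'
    Files      = @('/lib/A/1.docx')
    SubFolders = @(
        @{ Url = '/lib/A/B'; Files = @('/lib/A/B/2.docx'); SubFolders = @() }
    )
}

Get-DeletionOrder -Folder $tree
```

For this tree the subfolder's file comes out first, then the subfolder, then the file in A, and A itself last, so no folder is queued before its contents.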
