Sometimes you need to break into scripting to get mundane tasks completed accurately and quickly. We’ve started upgrading our current SharePoint (SP) 2010 system, and the upgrade is going to include SP Search with Longitude. Part of the Search project is to identify and configure the Content Source(s). I created this little PowerShell (PS) script to do that job for me when moving between environments.
Intro
PowerShell is a powerful tool that we can use to automate a bunch of different tasks. I used PS to create a script that would:
- Create new Content Sources
- Create new Content Source SP Site Addresses
- Schedule the Full Crawl
- Schedule the Incremental Crawl
Note: my script is designed specifically to create rules for the SharePoint Site Content Source Type, but it can be modified to create other types with minimal work if needed.
This script is accompanied by an XML file that identifies the Sites and Rules associated with each Source. I used this script to help me bash out multiple crawl rules and scopes at the same time.
Give this script a try and give me feedback if you have any problems.
Walking it Down
The first section of the script loads the XML document. Then it runs through each <Source> element from the XML document. If it finds an existing Content Source with the same name, it just updates the rules. If it doesn’t find a match, it creates the new Content Source and then adds the rules.
$xmlData.Sources.Source | ForEach-Object {
    $newSourceName = $_.Name
    $sourceType = $_.Type
    # Suppress the error raised when the source does not exist yet, so $source is simply $null.
    $source = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $newSourceName -ErrorAction SilentlyContinue
    if ($source -eq $null)
    {
        Write-Host -f Magenta "Creating Content Source $newSourceName"
        # Modify the following line if you'd like to use other Source Types.
        New-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Type $sourceType -Name $newSourceName -SharePointCrawlBehavior CrawlSites
        Write-Host -f Green "$newSourceName is created successfully"
        CreateNewContentSourcesRules
    }
    else
    {
        Write-Host -f Yellow "$newSourceName is already created. Updating rules."
        CreateNewContentSourcesRules
    }
}
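To see what that loop is iterating over, here is a minimal sketch that runs without SharePoint. The inline XML and its source names (Intranet, Team Sites) are made-up sample data, not my real configuration; it only illustrates how PowerShell exposes each <Source> element and its attributes as properties.

```powershell
# Sample data only: the source names and URLs below are hypothetical.
[xml]$sampleData = @'
<Sources>
  <SSAName>Search Service Application</SSAName>
  <Source Name="Intranet" Type="sharepoint">
    <StartAddresses URL="http://site1/" />
  </Source>
  <Source Name="Team Sites" Type="sharepoint">
    <StartAddresses URL="http://site2/,http://site2/site2-2" />
  </Source>
</Sources>
'@

# Each <Source> element surfaces as an object; its attributes and child
# elements can be read as properties.
$summary = @($sampleData.Sources.Source | ForEach-Object {
    "{0} ({1}): {2}" -f $_.Name, $_.Type, $_.StartAddresses.URL
})

$summary
```

Running this prints one line per source, which is a quick way to confirm the script will see the names and addresses you expect before it touches the Search Service Application.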
Then I search through all the existing Content Sources to make sure my new Start Addresses do not already exist in any of them. I check each Content Source against every new address and remove the address wherever I find a match.
Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa | ForEach-Object {
    $existName = $_.Name
    $existSource = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $existName
    $i = 0
    while ($addressArray.Length - 1 -ge $i)
    {
        if ($existSource.StartAddresses.Exists($addressArray[$i]))
        {
            Write-Host
            Write-Host "Found Source Address in $existName named $($addressArray[$i]). Removing..."
            $existSource.StartAddresses.Remove($addressArray[$i])
            $existSource.Update()
        }
        else
        {
            Write-Host -NoNewLine ".."
        }
        $i++
    }
}
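The overlap check depends on SharePoint objects, but the matching logic can be tried on its own. The sketch below uses made-up data and a hypothetical helper named Find-AddressOverlap (not part of my script): it splits the comma-separated URL value the same way the script does and reports which existing sources already hold one of the new addresses. A plain hashtable stands in for the Content Sources, and addresses are compared as strings, which is a simplification of the URI comparison SharePoint performs.

```powershell
# Hypothetical helper for illustration; not part of the original script.
function Find-AddressOverlap {
    param(
        [string]$NewAddresses,        # comma-separated, as in the XML URL attribute
        [hashtable]$ExistingSources   # source name -> array of start addresses
    )
    $addressArray = @($NewAddresses.Split(',') | ForEach-Object { $_.Trim() })
    foreach ($name in $ExistingSources.Keys) {
        foreach ($address in $addressArray) {
            # Simplification: string match instead of SharePoint's URI match.
            if ($ExistingSources[$name] -contains $address) {
                New-Object PSObject -Property @{ Source = $name; Address = $address }
            }
        }
    }
}

# Stand-in data: these names and URLs are assumptions, not real sources.
$existing = @{
    'Local SharePoint sites' = @('http://site1/', 'http://other/')
    'Archive'                = @('http://archive/')
}
$overlaps = @(Find-AddressOverlap -NewAddresses 'http://site1/,http://site2/site2-2' -ExistingSources $existing)
$overlaps | ForEach-Object { "{0} already contains {1}" -f $_.Source, $_.Address }
```

Listing the overlaps first is a cheap way to see what the real script is about to remove from other Content Sources.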
Then I add the Start Addresses along with the schedules on which the Full and Incremental crawls need to run.
$source = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $newSourceName
write-host
write-host -f Magenta Creating rules for $newSourceName
Set-SPEnterpriseSearchCrawlContentSource -Identity $newSourceName -StartAddresses $newSourceAddress -SearchApplication $ssa
Set-SPEnterpriseSearchCrawlContentSource -Identity $newSourceName -SearchApplication $ssa -ScheduleType Full -WeeklyCrawlSchedule -CrawlScheduleRunEveryInterval 1 -CrawlScheduleDaysOfWeek $fullCrawlDay -CrawlScheduleStartDateTime $fullCrawlTime -Confirm:$false
Set-SPEnterpriseSearchCrawlContentSource -Identity $newSourceName -SearchApplication $ssa -ScheduleType Incremental -WeeklyCrawlSchedule -CrawlScheduleRunEveryInterval 1 -CrawlScheduleDaysOfWeek $incCrawlDay -CrawlScheduleStartDateTime $incCrawlTime -Confirm:$false
Write-Host -f Green Completed.
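Because the day and time values come straight from the XML as strings, it can help to sanity-check them before handing them to Set-SPEnterpriseSearchCrawlContentSource. This is only a sketch: ConvertTo-CrawlSchedule is a hypothetical helper, and it assumes day names match the .NET System.DayOfWeek names and times are written like "12:00 AM". SharePoint's own parameter types may accept other values, so treat it as a pre-check rather than the definitive rule.

```powershell
# Hypothetical helper for illustration; not part of the original script.
function ConvertTo-CrawlSchedule {
    param([string]$Days, [string]$Time)

    # Assumption: day names match the .NET System.DayOfWeek names.
    $validDays = [Enum]::GetNames([System.DayOfWeek])
    $dayList = @($Days.Split(',') | ForEach-Object { $_.Trim() })
    foreach ($day in $dayList) {
        if ($validDays -notcontains $day) {
            throw "Unrecognized day name: '$day'"
        }
    }

    # Assumption: times are written like "12:00 AM" (12-hour clock).
    $culture = [System.Globalization.CultureInfo]::InvariantCulture
    $start = [datetime]::ParseExact($Time.Trim(), 'h:mm tt', $culture)

    New-Object PSObject -Property @{ Days = $dayList; StartTime = $start }
}

$full = ConvertTo-CrawlSchedule -Days 'Friday' -Time '12:00 AM'
$inc  = ConvertTo-CrawlSchedule -Days 'Monday,Wednesday' -Time '12:00 PM'
"Full crawl: {0} at {1:HH:mm}" -f ($full.Days -join ', '), $full.StartTime
"Incremental crawl: {0} at {1:HH:mm}" -f ($inc.Days -join ', '), $inc.StartTime
```

A misspelled day or an oddly formatted time fails here with a readable message instead of partway through configuring the crawl schedules.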
How to Use
Save the PS script and the XML configuration document (ContentCrawlSource.xml) in the same folder, and run the script from that folder: it resolves the configuration file relative to the current working folder.
Once you have the XML document populated with your Scope, Rules, and Schedule, you can go ahead and run the PS script.
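Before running the script against a farm, a quick structural check of the XML can catch missing pieces. The sketch below is an assumption-laden example: Get-CrawlConfigProblem is a hypothetical helper, and the list of required elements and attributes is simply what the script reads, not an official schema.

```powershell
# Hypothetical helper for illustration; not part of the original script.
function Get-CrawlConfigProblem {
    param([xml]$Config)

    if ($null -eq $Config.SelectSingleNode('/Sources/SSAName')) {
        'Missing <SSAName> element'
    }
    $index = 0
    foreach ($source in $Config.SelectNodes('/Sources/Source')) {
        $index++
        # GetAttribute returns an empty string when the attribute is absent.
        foreach ($attribute in 'Name', 'Type') {
            if (-not $source.GetAttribute($attribute)) {
                "Source #$index is missing the $attribute attribute"
            }
        }
        foreach ($child in 'FullCrawl', 'IncCrawl', 'StartAddresses') {
            if ($null -eq $source.SelectSingleNode($child)) {
                "Source #$index is missing the <$child> element"
            }
        }
    }
}

# Deliberately incomplete sample to show what gets reported.
[xml]$draft = '<Sources><Source Type="sharepoint"><FullCrawl Days="Friday" Time="12:00 AM" /></Source></Sources>'
$problems = @(Get-CrawlConfigProblem -Config $draft)
$problems
```

To check a real file, load it the same way the script does, for example with [xml](Get-Content ContentCrawlSource.xml), and pass the result to the helper; an empty result means nothing the script reads is missing.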
The Done
XML Document
<?xml version="1.0" encoding="utf-8" ?>
<Sources>
  <SSAName>Search Service Application</SSAName>
  <Source Name="..." Type="sharepoint">
    <FullCrawl Days="Friday" Time="12:00 AM" />
    <IncCrawl Days="Monday,Wednesday" Time="12:00 PM" />
    <StartAddresses URL="http://site1/,http://site2/site2-2" />
  </Source>
</Sources>
PowerShell
#--------
#Author: Marion Owen
#Description: This creates Search Content Sources and schedules the crawls.
#Ref: http://technet.microsoft.com/en-us/library/ff607867.aspx
#Ref: http://technet.microsoft.com/en-us/library/ee906563.aspx
#--------
#----------------Get the xml file----------------
if ((Get-PSSnapin | Where-Object { $_.Name -eq "Microsoft.SharePoint.PowerShell" }) -eq $null) {
    Add-PSSnapin Microsoft.SharePoint.PowerShell
}
$xmlConfig = Resolve-Path "ContentCrawlSource.xml" #This is the configuration XML located in the same folder.
[xml]$xmlData = Get-Content $xmlConfig
#----------------Create New Scope Function----------------
Function CreateNewContentSources()
{
    # <SSAName> is a child of the <Sources> root element.
    $ssa = Get-SPEnterpriseSearchServiceApplication -Identity $xmlData.Sources.SSAName
    $xmlData.Sources.Source | ForEach-Object {
        $newSourceName = $_.Name
        $sourceType = $_.Type
        # Suppress the error raised when the source does not exist yet, so $source is simply $null.
        $source = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $newSourceName -ErrorAction SilentlyContinue
        if ($source -eq $null)
        {
            Write-Host -f Magenta "Creating Content Source $newSourceName"
            # Modify the following line if you'd like to use other Source Types.
            New-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Type $sourceType -Name $newSourceName -SharePointCrawlBehavior CrawlSites
            Write-Host -f Green "$newSourceName is created successfully"
            CreateNewContentSourcesRules
        }
        else
        {
            Write-Host -f Yellow "$newSourceName is already created. Updating rules."
            CreateNewContentSourcesRules
        }
    }
}
#----------------Create New Scope Rules Function----------------
Function CreateNewContentSourcesRules()
{
    # $ssa and $newSourceName are inherited from the calling function's scope.
    # Select only the <Source> element being processed, so several sources in one file do not get mixed together.
    $xmlDoc = $xmlData.Sources.Source | Where-Object { $_.Name -eq $newSourceName }
    $newSourceAddress = $xmlDoc.StartAddresses.URL
    $fullCrawlDay = $xmlDoc.FullCrawl.Days
    $incCrawlDay = $xmlDoc.IncCrawl.Days
    $fullCrawlTime = $xmlDoc.FullCrawl.Time
    $incCrawlTime = $xmlDoc.IncCrawl.Time
    $addressArray = @($newSourceAddress.Split(","))
    #-------------- Remove the Source Address if exists in default Source --------------
    Write-Host -ForegroundColor DarkBlue "Making sure the Source Address does not exist anywhere else."
    Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa | ForEach-Object {
        $existName = $_.Name
        $existSource = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $existName
        $i = 0
        while ($addressArray.Length - 1 -ge $i)
        {
            if ($existSource.StartAddresses.Exists($addressArray[$i]))
            {
                Write-Host
                Write-Host "Found Source Address in $existName named $($addressArray[$i]). Removing..."
                $existSource.StartAddresses.Remove($addressArray[$i])
                $existSource.Update()
            }
            else
            {
                Write-Host -NoNewLine ".."
            }
            $i++
        }
    }
    #--------------------------------------------------
    $source = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $newSourceName
    Write-Host
    Write-Host -f Magenta "Creating rules for $newSourceName"
    Set-SPEnterpriseSearchCrawlContentSource -Identity $newSourceName -StartAddresses $newSourceAddress -SearchApplication $ssa
    Set-SPEnterpriseSearchCrawlContentSource -Identity $newSourceName -SearchApplication $ssa -ScheduleType Full -WeeklyCrawlSchedule -CrawlScheduleRunEveryInterval 1 -CrawlScheduleDaysOfWeek $fullCrawlDay -CrawlScheduleStartDateTime $fullCrawlTime -Confirm:$false
    Set-SPEnterpriseSearchCrawlContentSource -Identity $newSourceName -SearchApplication $ssa -ScheduleType Incremental -WeeklyCrawlSchedule -CrawlScheduleRunEveryInterval 1 -CrawlScheduleDaysOfWeek $incCrawlDay -CrawlScheduleStartDateTime $incCrawlTime -Confirm:$false
    Write-Host -f Green "Completed."
}
CreateNewContentSources
Write-Host "Press any key to continue ..."
$x = $host.UI.RawUI.ReadKey("NoEcho,IncludeKeyDown")