PowerShell Search Content Source

Sometimes you need to break into scripting to get mundane tasks completed accurately and quickly. We’ve started upgrading the SharePoint (SP) 2010 system that we currently have, and the upgrade is going to include SP Search with Longitude. Part of the Search project is to identify and configure the Content Source(s). I created this little PowerShell (PS) script that does the job for me when moving between environments.

Intro

PowerShell is a powerful tool that we can use to automate a bunch of different tasks. I used PS to create a script that would:

  1. Create new Content Sources
  2. Create new Content Source SP Site Addresses
  3. Schedule the Full Crawl
  4. Schedule the Incremental Crawl

Note: My script is designed specifically to create rules for the SharePoint Sites Content Source type, but it can be modified to create other types with minimal work if needed.

This script is accompanied by an XML file that identifies the Sites and Rules associated with each Source. I used this script to help me bash out multiple crawl rules and scopes at the same time.

Give this script a try and give me feedback if you have any problems.

Walking it Down

The first section of the script configures the connection to the XML document. Then it runs through each <Source> element in the XML document. If it finds an existing Content Source with the same name, it just updates the rules. If the script doesn’t find a match, it creates the new Content Source and then applies the rules.

$xmlData.Sources.Source | ForEach-Object {
    $newSourceName = $_.Name
    $sourceType = $_.Type
    # -ErrorAction SilentlyContinue keeps a missing Content Source from printing an error; $source is then $null.
    $source = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $newSourceName -ErrorAction SilentlyContinue
    if ($source -eq $null)
    {
        Write-Host -f Magenta "Creating Content Source $($_.Name)"
        # Modify the following line if you’d like to use other Source Types.
        New-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Type $sourceType -Name $newSourceName -SharePointCrawlBehavior CrawlSites
        Write-Host -f Green "$($_.Name) is created successfully"
        CreateNewContentSourcesRules
    }
    else
    {
        Write-Host -f Yellow "$($_.Name) is already created. Updating rules."
        CreateNewContentSourcesRules
    }
}
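To get a feel for how the loop reads the configuration without touching a farm, the same iteration can be run against an inline XML string. This is only a sketch: the source names Intranet and Teams are made up for illustration, and nothing here calls SharePoint.

```powershell
# Inline stand-in for ContentCrawlSource.xml; the Name values are hypothetical.
[xml]$xmlData = @'
<Sources>
  <SSAName>Search Service Application</SSAName>
  <Source Name="Intranet" Type="sharepoint">
    <StartAddresses URL="http://site1/,http://site2/site2-2" />
  </Source>
  <Source Name="Teams" Type="sharepoint">
    <StartAddresses URL="http://site3/" />
  </Source>
</Sources>
'@

# Collect one summary object per <Source> element.
$summary = @($xmlData.Sources.Source | ForEach-Object {
    New-Object PSObject -Property @{
        Name      = $_.Name
        Type      = $_.Type
        Addresses = @($_.StartAddresses.URL.Split(','))
    }
})

# Print what the real script would act on for each source.
$summary | ForEach-Object {
    Write-Host ("{0} ({1}): {2} start address(es)" -f $_.Name, $_.Type, $_.Addresses.Count)
}
```

Each object in $summary carries the name, type, and split start addresses that the real script would hand to the SharePoint cmdlets.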

Then I search through all the existing Content Sources to make sure my new Start Addresses do not already exist in any of them. I check each Source for a match and delete the Address if I find one.

Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa | ForEach-Object {
    $existName = $_.Name
    $existSource = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $existName
    $i = 0
    while ($addressArray.Length - 1 -ge $i)
    {
        if ($existSource.StartAddresses.Exists($addressArray[$i]))
        {
            Write-Host
            Write-Host "Found Source Address in $existName named $($addressArray[$i]). Removing…"
            $existSource.StartAddresses.Remove($addressArray[$i])
            $existSource.Update()
        }
        else
        {
            Write-Host -NoNewLine ".."
        }
        $i++
    }
}
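The check-and-remove idea can be tried on its own with ordinary .NET types. In this sketch a generic List of Uri objects stands in for a Content Source's StartAddresses collection; that substitution, and the sample addresses, are assumptions made so the snippet runs on any machine.

```powershell
# Stand-in for one content source's StartAddresses (hypothetical data).
$startAddresses = New-Object 'System.Collections.Generic.List[System.Uri]'
$startAddresses.Add([uri]'http://site1/')
$startAddresses.Add([uri]'http://other/')

# Addresses the new content source wants to own.
$addressArray = @('http://site1/', 'http://site2/site2-2')

$removed = @()
foreach ($address in $addressArray) {
    $uri = [uri]$address
    if ($startAddresses.Contains($uri)) {
        # Same idea as StartAddresses.Remove() followed by Update() in the script.
        [void]$startAddresses.Remove($uri)
        $removed += $address
    }
}

Write-Host "Removed: $($removed -join ', ')"
Write-Host "Remaining: $(@($startAddresses) -join ', ')"
```

Only addresses that actually match are removed; everything else in the existing source is left alone.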

Then I add the Start Addresses along with the schedules on which I need the Full and Incremental crawls to run.


$source = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $newSourceName
Write-Host
Write-Host -f Magenta "Creating rules for $newSourceName"
Set-SPEnterpriseSearchCrawlContentSource -Identity $newSourceName -StartAddresses $newSourceAddress -SearchApplication $ssa
Set-SPEnterpriseSearchCrawlContentSource -Identity $newSourceName -SearchApplication $ssa -ScheduleType Full -WeeklyCrawlSchedule -CrawlScheduleRunEveryInterval 1 -CrawlScheduleDaysOfWeek $fullCrawlDay -CrawlScheduleStartDateTime $fullCrawlTime -Confirm:$false
Set-SPEnterpriseSearchCrawlContentSource -Identity $newSourceName -SearchApplication $ssa -ScheduleType Incremental -WeeklyCrawlSchedule -CrawlScheduleRunEveryInterval 1 -CrawlScheduleDaysOfWeek $incCrawlDay -CrawlScheduleStartDateTime $incCrawlTime -Confirm:$false
Write-Host -f Green "Completed."
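Because the day and time values arrive as plain strings from the XML, it can be worth checking them before they reach the cmdlet. The sketch below is one way to do that, not part of the original script: it validates day names against System.DayOfWeek, which is only a stand-in for the enum the SharePoint cmdlet uses, and the sample values are assumptions.

```powershell
# Example values in the same shape as the XML attributes.
$incCrawlDay  = 'Monday,Wednesday'
$incCrawlTime = '12:00 PM'

# Check each day name against System.DayOfWeek (throws on a typo).
$days = @($incCrawlDay.Split(',') | ForEach-Object {
    [System.Enum]::Parse([System.DayOfWeek], $_.Trim(), $true)
})

# Parse the time with the invariant culture so the result does not
# depend on the regional settings of the server.
$time = [datetime]::Parse($incCrawlTime, [System.Globalization.CultureInfo]::InvariantCulture)

Write-Host ("Days: {0}; start hour: {1}" -f ($days -join ', '), $time.Hour)
```

A misspelled day or a malformed time fails here with a clear error instead of partway through configuring a Content Source.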

How to Use

Save the PS script along with the XML configuration document (ContentCrawlSource.xml) in the same folder. The PS script looks for the configuration file in the current folder, so run it from there.

Once you have the XML document populated with your Sources, Start Addresses, and Schedules, you can go ahead and run the PS script.
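Before running the script, a quick pre-flight check on the configuration file can save a failed run. The helper below is hypothetical (Test-CrawlConfig is not part of the original script) and only confirms that the file exists, parses as XML, and has a Sources root element.

```powershell
# Hypothetical helper: confirm the configuration file is present and is
# well-formed XML with a <Sources> root before running the main script.
function Test-CrawlConfig([string]$Path = 'ContentCrawlSource.xml') {
    if (-not (Test-Path $Path)) {
        Write-Host -f Red "$Path not found in $(Get-Location)"
        return $false
    }
    try {
        [xml]$config = Get-Content $Path
    }
    catch {
        Write-Host -f Red "$Path is not well-formed XML"
        return $false
    }
    return ($null -ne $config.Sources)
}

# Checks ContentCrawlSource.xml in the current folder.
Test-CrawlConfig
```

If it returns False, fix the XML document first; the main script assumes the file is readable.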

The Done

XML Document


<?xml version="1.0" encoding="utf-8" ?>
<Sources>
  <SSAName>Search Service Application</SSAName>
  <Source Name="…" Type="sharepoint">
    <FullCrawl Days="Friday" Time="12:00 AM" />
    <IncCrawl Days="Monday,Wednesday" Time="12:00 PM" />
    <StartAddresses URL="http://site1/,http://site2/site2-2" />
  </Source>
</Sources>

PowerShell


#--------
# Author: Marion Owen
# Description: This creates Search Content Sources and schedules the crawls.
# Ref: http://technet.microsoft.com/en-us/library/ff607867.aspx
# Ref: http://technet.microsoft.com/en-us/library/ee906563.aspx
#--------

#---------------- Get the xml file ----------------
if ((Get-PSSnapin | Where {$_.Name -eq "Microsoft.SharePoint.PowerShell"}) -eq $null) {
    Add-PSSnapin Microsoft.SharePoint.PowerShell;
}
$xmlConfig = Resolve-Path "ContentCrawlSource.xml" # This is the configuration XML located in the same folder.
[xml]$xmlData = Get-Content $xmlConfig

#---------------- Create New Content Sources Function ----------------
Function CreateNewContentSources()
{
    # SSAName is a child of the <Sources> root element in the XML document.
    $ssa = Get-SPEnterpriseSearchServiceApplication -Identity $xmlData.Sources.SSAName
    $xmlData.Sources.Source | ForEach-Object {
        $newSourceName = $_.Name
        $sourceType = $_.Type
        # -ErrorAction SilentlyContinue keeps a missing Content Source from printing an error; $source is then $null.
        $source = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $newSourceName -ErrorAction SilentlyContinue
        if ($source -eq $null)
        {
            Write-Host -f Magenta "Creating Content Source $($_.Name)"
            # Modify the following line if you’d like to use other Source Types.
            New-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Type $sourceType -Name $newSourceName -SharePointCrawlBehavior CrawlSites
            Write-Host -f Green "$($_.Name) is created successfully"
            CreateNewContentSourcesRules
        }
        else
        {
            Write-Host -f Yellow "$($_.Name) is already created. Updating rules."
            CreateNewContentSourcesRules
        }
    }
}

#---------------- Create New Content Source Rules Function ----------------
Function CreateNewContentSourcesRules()
{
    # Pick the <Source> element currently being processed; $newSourceName and $ssa
    # are set by CreateNewContentSources, which calls this function.
    $xmlDoc = $xmlData.Sources.Source | Where-Object { $_.Name -eq $newSourceName }
    $newSourceAddress = $xmlDoc.StartAddresses.URL
    $fullCrawlDay = $xmlDoc.FullCrawl.Days
    $incCrawlDay = $xmlDoc.IncCrawl.Days
    $fullCrawlTime = $xmlDoc.FullCrawl.Time
    $incCrawlTime = $xmlDoc.IncCrawl.Time
    $addressArray = @($newSourceAddress.Split(","))

    #-------------- Remove the Source Address if it exists in another Source --------------
    Write-Host -ForegroundColor DarkBlue "Making sure the Source Address does not exist anywhere else."
    Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa | ForEach-Object {
        $existName = $_.Name
        $existSource = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $existName
        $i = 0
        while ($addressArray.Length - 1 -ge $i)
        {
            if ($existSource.StartAddresses.Exists($addressArray[$i]))
            {
                Write-Host
                Write-Host "Found Source Address in $existName named $($addressArray[$i]). Removing…"
                $existSource.StartAddresses.Remove($addressArray[$i])
                $existSource.Update()
            }
            else
            {
                Write-Host -NoNewLine ".."
            }
            $i++
        }
    }

    #-------------- Add the Start Addresses and crawl schedules --------------
    $source = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $newSourceName
    Write-Host
    Write-Host -f Magenta "Creating rules for $newSourceName"
    Set-SPEnterpriseSearchCrawlContentSource -Identity $newSourceName -StartAddresses $newSourceAddress -SearchApplication $ssa
    Set-SPEnterpriseSearchCrawlContentSource -Identity $newSourceName -SearchApplication $ssa -ScheduleType Full -WeeklyCrawlSchedule -CrawlScheduleRunEveryInterval 1 -CrawlScheduleDaysOfWeek $fullCrawlDay -CrawlScheduleStartDateTime $fullCrawlTime -Confirm:$false
    Set-SPEnterpriseSearchCrawlContentSource -Identity $newSourceName -SearchApplication $ssa -ScheduleType Incremental -WeeklyCrawlSchedule -CrawlScheduleRunEveryInterval 1 -CrawlScheduleDaysOfWeek $incCrawlDay -CrawlScheduleStartDateTime $incCrawlTime -Confirm:$false
    Write-Host -f Green "Completed."
}

CreateNewContentSources
Write-Host "Press any key to continue ..."
$x = $host.UI.RawUI.ReadKey("NoEcho,IncludeKeyDown")
