<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:pingback="http://madskills.com/public/xml/rss/module/pingback/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>Adventures in SPWonderland. - search</title>
    <link>http://blogs.flexnetconsult.co.uk/colinbyrne/</link>
    <description>Taking apart and putting back together</description>
    <language>en-us</language>
    <copyright>Colin Byrne</copyright>
    <lastBuildDate>Mon, 23 Jun 2008 10:56:49 GMT</lastBuildDate>
    <generator>newtelligence dasBlog 2.0.7226.0</generator>
    <managingEditor>webparts@flexnetconsult.co.uk</managingEditor>
    <webMaster>webparts@flexnetconsult.co.uk</webMaster>
    <item>
      <trackback:ping>http://blogs.flexnetconsult.co.uk/colinbyrne/Trackback.aspx?guid=869a1436-3b78-404c-a538-5eb20fe971aa</trackback:ping>
      <pingback:server>http://blogs.flexnetconsult.co.uk/colinbyrne/pingback.aspx</pingback:server>
      <pingback:target>http://blogs.flexnetconsult.co.uk/colinbyrne/PermaLink,guid,869a1436-3b78-404c-a538-5eb20fe971aa.aspx</pingback:target>
      <dc:creator>Colin Byrne</dc:creator>
      <wfw:comment>http://blogs.flexnetconsult.co.uk/colinbyrne/CommentView,guid,869a1436-3b78-404c-a538-5eb20fe971aa.aspx</wfw:comment>
      <wfw:commentRss>http://blogs.flexnetconsult.co.uk/colinbyrne/SyndicationService.asmx/GetEntryCommentsRss?guid=869a1436-3b78-404c-a538-5eb20fe971aa</wfw:commentRss>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
          <a href="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image_22.png">
            <img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="287" alt="image" src="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image_thumb.png" width="644" border="0" />
          </a>
        </p>
        <p>
Recently I went through the process of indexing a Subversion source code repository
with SharePoint. I thought I'd share the steps, as out of the box SharePoint won't
index ps1, cs or vb files.
</p>
        <p>
Setting up search to index these files works whether the files live in a document
library or are external to SharePoint. The process to index files from other source
control systems will vary depending on how you can get access to the source files.
If you need to index SourceSafe you can set up what's called a mirror directory that
automatically saves the files from your repositories to disk, and I suspect you can
index Team Foundation Server via its Web Access URLs, although I've not tried that.
</p>
        <p>
The Subversion side of things is pretty easy: pick the repository you want and export
the latest version using the svn client, e.g. svn export svn://devhosting/svn/webparts
d:\SVNExport\webparts. Script the export of each repository and then schedule it.
</p>
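        <p>
As a rough sketch of what such an export script could look like, here is a Python
version. The server URL and export folder are the ones from the example above; the
repository list, the function names and the use of --force (so a repeat run can write
over the previous export) are assumptions for illustration, not part of my actual
setup.
</p>

```python
import subprocess
from pathlib import PureWindowsPath

# Values taken from the example command above; the repository list is hypothetical.
SVN_ROOT = "svn://devhosting/svn"
EXPORT_ROOT = PureWindowsPath(r"d:\SVNExport")
REPOSITORIES = ["webparts"]


def build_export_command(repository):
    """Return the svn export command line for one repository as a list."""
    url = f"{SVN_ROOT}/{repository}"
    target = str(EXPORT_ROOT / repository)
    # --force lets svn export write over the files left by the previous run.
    return ["svn", "export", "--force", url, target]


def export_all(repositories, runner=subprocess.run):
    """Export every repository; check=True raises if svn reports a failure."""
    for repository in repositories:
        runner(build_export_command(repository), check=True)
```

        <p>
Calling export_all(REPOSITORIES) from a scheduled task would refresh the exported
folders before each crawl. The runner parameter is only there so the loop can be
exercised without a real svn client.
</p>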
        <p>
On the SharePoint side you set up a new content source to crawl the directories. 
</p>
        <p>
In this case the indexing happens on a separate machine, so we enter the UNC path.
Make sure the content access account has read rights to the share. If needed you can
set up separate credentials for this source.
</p>
        <p>
In the SSP, on the Search Settings page, click <strong>New Content Source</strong> under <strong>Content
source and crawl schedules</strong>.</p>
        <p>
          <a href="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image12.png">
            <img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="484" alt="image" src="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image12_thumb.png" width="644" border="0" />
          </a>
        </p>
        <p>
          <a href="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image15.png">
            <img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="230" alt="image" src="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image15_thumb.png" width="644" border="0" />
          </a>
        </p>
        <p>
The problem now is that if you start a full crawl, typically only the .txt files are
indexed, as the SharePoint indexers have no idea what to do with file extensions they
don't recognise.
</p>
        <p>
There are a couple of steps to getting new file extensions indexed. This assumes you
are a Search Service administrator.
</p>
        <p>
          <strong>First add the extension to File Types</strong>
        </p>
        <p>
1. On the Search Administration page, click <strong>File Types</strong> under <strong>Crawling</strong>. 
</p>
        <p>
          <a href="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image_14.png">
            <img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="436" alt="image" src="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image_thumb_6.png" width="356" border="0" />
          </a>
        </p>
        <p>
2. On the Manage File Types page, click <strong>New File Type</strong>. 
</p>
        <p>
          <a href="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image_18.png">
            <img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="284" alt="image" src="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image_thumb_8.png" width="598" border="0" />
          </a>
        </p>
        <p>
3. On the Add File Type page, type the file name extension in the <strong>File extension</strong> box
for the file type that you want to add.<br />
To search for PowerShell files, type ps1 
<br />
Do not include the period (.) character in front of the file name extension. 
</p>
        <p>
4. Click <strong>OK</strong>. 
</p>
        <p>
          <a href="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image_16.png">
            <img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="248" alt="image" src="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image_thumb_7.png" width="520" border="0" />
          </a>
        </p>
        <p>
5. Rinse and repeat for each file type that you want to add. 
</p>
        <p>
The second step in getting the file extensions recognised is to add each one to the
registry entries that the SharePoint Server Search service reads when it starts up.
The key is located at
</p>
        <p>
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\ContentIndexCommon\Filters\Extension 
</p>
        <p>
Add a new key named after the extension, including the dot, e.g. .ps1.
</p>
        <p>
Save it and set its default value to {4A3DD7AB-0A6B-43B0-8A90-0D8B0CC36AAB}. This
tells search to use the text parser IFilter, tquery.dll, for this extension.
</p>
        <p>
          <a href="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image_12.png">
            <img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="221" alt="image" src="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image_thumb_5.png" width="644" border="0" />
          </a>
        </p>
        <p>
Add a new key for each file extension you want indexed, in this case cs, ps1 and aspx,
but you can add vb, vbs or whatever other text file types you need indexed.
</p>
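        <p>
If you have a lot of extensions to add, the keys could be generated rather than typed
by hand. The sketch below builds the text of a .reg file using the registry path and
filter GUID quoted above; the function name and the idea of going through a .reg file
at all are assumptions for illustration, so check the output before importing it on
a real index server.
</p>

```python
# Registry path and filter GUID are the ones quoted in the post.
EXTENSION_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0"
    r"\Search\Setup\ContentIndexCommon\Filters\Extension"
)
TEXT_FILTER_GUID = "{4A3DD7AB-0A6B-43B0-8A90-0D8B0CC36AAB}"


def build_reg_file(extensions):
    """Return .reg file text with one key per extension."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for extension in extensions:
        # Key names carry the leading dot, unlike the File Types page.
        name = "." + extension.lstrip(".")
        lines.append(f"[{EXTENSION_KEY}\\{name}]")
        # "@" is the .reg syntax for a key's default value.
        lines.append(f'@="{TEXT_FILTER_GUID}"')
        lines.append("")
    return "\r\n".join(lines)
```

        <p>
For example, build_reg_file(["cs", "ps1", "aspx"]) returns text that can be saved to
a file and imported with regedit. The function accepts extensions with or without the
dot, which papers over the difference between the File Types page (no dot) and the
registry key names (dot included).
</p>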
        <p>
Stop and start the search service with these commands:
</p>
        <p>
net stop osearch
</p>
        <p>
net start osearch
</p>
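        <p>
If the registry change is scripted, the restart can be scripted along with it. A minimal
sketch, assuming the same osearch service name as the commands above and hypothetical
function names:
</p>

```python
import subprocess

SERVICE_NAME = "osearch"  # service name used in the commands above


def restart_commands(service=SERVICE_NAME):
    """Return the stop and start command lines, in the order to run them."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_service(runner=subprocess.run):
    """Run stop then start; check=True aborts if either command fails."""
    for command in restart_commands():
        runner(command, check=True)
```

        <p>
Because check=True raises on a failed stop, the start command is not attempted if the
service could not be stopped cleanly.
</p>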
        <p>
Now do a full crawl of your content source and your files should be full-text
indexed. The crawl log is useful in seeing if the filtering barfed on your files.
</p>
        <p>
Now you can go to the Search Center, enter your keyword and get a list of code files
back.
</p>
        <p>
          <a href="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image_6.png">
            <img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="464" alt="image" src="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image_thumb_2.png" width="718" border="0" />
          </a>
        </p>
        <p>
Here I've set up a custom scope and search page, and added a custom search tab to separate
the code results on their own. I won't go into it here but there is a <a href="http://www.zimmergren.net/archive/tags/Search%20Scope/default.aspx" target="_blank">good
post here</a> that shows how you do this.
</p>
        <p>
Even better, with SharePoint Search, if you know you want PowerShell files only, you
can enter the fileextension keyword and search will filter out everything but PowerShell
files.
</p>
        <p>
          <a href="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image_20.png">
            <img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="321" alt="image" src="http://blogs.flexnetconsult.co.uk/colinbyrne/content/binary/WindowsLiveWriter/IndexingyourCSandPowerShellcodewithShare_117D5/image_thumb_9.png" width="799" border="0" />
          </a>
        </p>
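        <p>
If you want to build such queries from a script or a custom search box, the string
is easy to assemble. A minimal sketch, assuming the property filter takes the form
fileextension:ps1 (property name, colon, extension without the dot); the function
name is hypothetical.
</p>

```python
def code_query(keywords, extension=None):
    """Build a keyword query, optionally restricted to one file extension."""
    parts = []
    if keywords.strip():
        parts.append(keywords.strip())
    if extension:
        # Property filter in property:value form; the extension has no dot.
        parts.append(f"fileextension:{extension.lstrip('.')}")
    return " ".join(parts)
```

        <p>
For example, code_query("SPWeb", "ps1") produces the text you would otherwise type
into the search box by hand.
</p>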
        <p>
Searching your entire code repository with sub-second query times is now pretty easy.
</p>
      </body>
      <title>Full Text Searching your CS and PowerShell code with SharePoint Search</title>
      <guid isPermaLink="false">http://blogs.flexnetconsult.co.uk/colinbyrne/PermaLink,guid,869a1436-3b78-404c-a538-5eb20fe971aa.aspx</guid>
      <link>http://blogs.flexnetconsult.co.uk/colinbyrne/2008/06/23/FullTextSearchingYourCSAndPowerShellCodeWithSharePointSearch.aspx</link>
      <pubDate>Mon, 23 Jun 2008 10:56:49 GMT</pubDate>
      <description>Recently I went through the process of indexing a Subversion source code repository with SharePoint. I thought I'd share those steps as OOTB SharePoint won't index ps1, cs or vb files.</description>
      <comments>http://blogs.flexnetconsult.co.uk/colinbyrne/CommentView,guid,869a1436-3b78-404c-a538-5eb20fe971aa.aspx</comments>
      <category>search</category>
      <category>Sharepoint 2007</category>
    </item>
  </channel>
</rss>