
SharePoint Strategist Blog


September 28
Indexing PDF Files - Must Do Steps for SSP
While at the Los Angeles SharePoint User Group the other night, I was reminded by Doug Niles of QuickStart, who gave the presentation, that out of the box the search function in SharePoint 2007 does not index PDF files.  Now I don't know about you, but I think that's kinda funny.  Well, maybe just for me, because I am constantly preaching that people should upload only PDF files, which cannot be edited, to our publishing portals.  Since PDFs are such a common file type, I just think it's funny that Microsoft chose not to include that type by default.
At any rate, no matter how amused I may be, it is important to remember these simple steps so that PDF files are indexed as content when your search crawl executes.  The notes below cover an installation with only one SSP; they would need to be repeated on each SSP in your enterprise.  The process is done in two phases, as broken down below:
Phase One:  Edit the List of Indexable File Types
  1. Browse to Central Administration
  2. Click on SharedServices1 (or the name of the SSP you wish to edit located directly under the Shared Services Administration link)
  3. Click on Search Settings
  4. Click on File Types
  5. Click on New File Type, enter the extension (in this case, PDF) and click OK
Then you're done - well, almost!  This entry ensures that PDF files will be seen by the crawl at your next run; however, by default you won't have an icon to go with those files.
Phase Two: Edit DocIcon.xml to include PDF Icon
The file that controls the icons visible in the system is called docicon.xml, and it is located in the 12 hive under the XML directory.  For those of you not familiar with the 12 hive, let me issue a caution here.  Editing some of these files can cause serious issues or even completely disable your installation, and should be done with great care.  This one simply maps the extension to the proper image file, but no matter how simple the edit, you should keep a backup of the default file nearby just in case.  A stray character or an unclosed tag can really cause you a problem.  Proceed with caution!
You will also need access to the C:\ drive of the server where SharePoint is installed.  If you don't have it, you can pass these instructions on to the server administrator for assistance.
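Since a backup is the one thing you really don't want to skip, here is a minimal sketch of how it could be scripted in Python.  This is my illustration rather than part of Doug's presentation: the `backup_file` helper is a made-up name, and it assumes Python is available on a machine that can reach the file.

```python
import shutil
import time
from pathlib import Path


def backup_file(path):
    """Copy a file to a timestamped .bak sibling and return the new path.

    Hypothetical helper for illustration; not part of SharePoint.
    """
    src = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = src.with_name(f"{src.name}.{stamp}.bak")
    shutil.copy2(src, dest)  # copy2 also preserves the file timestamps
    return dest


# Example call, assuming the default installation path used in this post:
# backup_file(r"c:\Program Files\Common Files\Microsoft Shared"
#             r"\Web Server Extensions\12\Template\XML\docicon.xml")
```

If an edit goes wrong, copying the .bak file back over docicon.xml and performing another IISReset should put things back the way they were.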
1.  Download the icon.  The current link for this file is:
In a default installation, this file should be saved to:
c:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\Template\Images
2. Next you must edit docicon.xml.  In a default installation, the path to this file on the server is:
c:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\Template\XML
Open the file in Notepad or another text editor.  Doug noted that the file is in alphabetical order and that respecting that order is good practice.
3.  Add in the appropriate mapping, using the same attribute capitalization as the existing entries in the file (XML is case-sensitive).  The text is:
<Mapping Key="pdf" Value="filename" />
In my case the filename was pdficon.gif
4.  Save the file
5.  Perform an IISReset
Now, to check that it worked, go back to your list of file types; you should see the icon next to the PDF file type.
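If you would rather not hand-edit the file, the mapping step above can be sketched in Python.  Treat this as a rough illustration under a few assumptions: that docicon.xml keeps its extension mappings inside a `<ByExtension>` element directly under the root, that the attributes are named `Key` and `Value`, and that `add_extension_mapping` is just a name I made up.  The parser also refuses malformed XML, which catches the kind of stray-character mistake warned about earlier.

```python
import xml.etree.ElementTree as ET


def add_extension_mapping(xml_text, key, value):
    """Return xml_text with a <Mapping> for `key` added in alphabetical order.

    Assumes mappings live in a <ByExtension> child of the root element.
    Raises xml.etree.ElementTree.ParseError if the input is not well-formed.
    """
    root = ET.fromstring(xml_text)
    by_ext = root.find("ByExtension")
    if by_ext is None:
        raise ValueError("no <ByExtension> element found")

    mappings = list(by_ext)

    # If the extension is already mapped, just update its icon.
    for mapping in mappings:
        if mapping.get("Key", "").lower() == key.lower():
            mapping.set("Value", value)
            return ET.tostring(root, encoding="unicode")

    # Find the first existing entry that sorts after the new key.
    index = len(mappings)
    for i, mapping in enumerate(mappings):
        if mapping.get("Key", "").lower() > key.lower():
            index = i
            break

    new = ET.Element("Mapping", {"Key": key, "Value": value})
    # Reuse the neighbouring whitespace so the layout stays readable
    # (cosmetic only; it does not affect how the file is parsed).
    new.tail = mappings[index - 1].tail if index > 0 else by_ext.text
    by_ext.insert(index, new)
    return ET.tostring(root, encoding="unicode")


sample = """<DocIcons>
  <ByExtension>
    <Mapping Key="doc" Value="icdoc.gif" />
    <Mapping Key="png" Value="icpng.gif" />
  </ByExtension>
</DocIcons>"""

updated = add_extension_mapping(sample, "pdf", "pdficon.gif")
print(updated)
```

Note that this sketch returns text rather than writing to the server, and ElementTree drops comments and the XML declaration when it re-serializes.  Compare the result against your backup before copying anything into the 12 hive, and remember the IISReset afterwards.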
Thanx again to Doug Niles and @SoCalSPUG for this great reminder!

