Index Server FAQ
The Indexing Services MMC
The Indexing Services MMC is the central point of administration for Indexing Services. You can open it by typing ciadv.msc at the command prompt, or through the Computer Management MMC (accessible by typing compmgmt.msc at the command prompt): expand the Services and Applications node, and click on Indexing Services.
Q. I get different results when I use the Query the Catalog section in the Indexing Services MMC than when I search using my web application. Why is this?
A. By default the Query the Catalog section in the Indexing Services MMC will only query physical roots, and not virtual roots. Your web application by default will only query virtual roots. The Query the Catalog section is illustrated in Figure 1.
Q. I like the Query the Catalog web page. Can I use this as a web page?
A. You could until this security patch (click here to see it). You can find this web page in %WinDir%\help. It is called ciquery.htm. You would query it like this from your web server:
if you had copied the file to the root of your web server's default web site.
Q. What is a virtual root?
A. A virtual root corresponds to the root of your web site, or a virtual directory in your web site. Virtual roots will show up in the directories folder of your catalog in the Indexing Services MMC as folders with blue globes on them. Virtual roots are illustrated in Figure 2. If a directory or a document is a virtual root, or is within a virtual root, the vpath property will have a value in your results set, instead of being NULL.
Q. What is a physical root?
A. Physical roots correspond to directories within your file system. They will show up in the Indexing Services MMC as plain folders - to contrast them with the virtual roots which have blue globes on them. Physical roots are illustrated in Figure 2 (c:\temp is an example of one). If a directory or document is a physical root, or in a physical root, the path property will be populated. The path property is always populated for all documents or roots (physical and virtual).
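The vpath rule described in the two answers above can be sketched in a few lines of Python. This is only an illustration: it assumes each hit has already been fetched into a dictionary with "path" and "vpath" keys (a hypothetical representation, not something Indexing Services hands you directly), and the function name is made up for the example.

```python
def split_hits_by_root(hits):
    """Separate hits that came from virtual roots from those that did not.

    Assumes each hit is a dict with "path" and "vpath" keys (hypothetical
    representation). A hit counts as virtual when vpath has a value; a
    missing, None, or empty vpath is treated as a physical-root hit.
    """
    virtual, physical = [], []
    for hit in hits:
        if hit.get("vpath"):
            virtual.append(hit)
        else:
            physical.append(hit)
    return virtual, physical


# Illustrative data only: one hit under a web site, one under c:\temp.
sample_hits = [
    {"path": "c:\\inetpub\\wwwroot\\default.htm", "vpath": "/default.htm"},
    {"path": "c:\\temp\\notes.txt", "vpath": None},
]
virtual_hits, physical_hits = split_hits_by_root(sample_hits)
```

Both lists keep the path value, since path is populated for every hit regardless of the kind of root it came from.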
Q. I created some custom MetaTags, but they don't show up in the MMC Properties window. How can I get them to show up?
A. Use FiltDump (download the Platform SDK; it can be found in the bin folder) to see how Indexing Services interprets your custom MetaTags. If they show up in the output of FiltDump, Indexing Services should publish them to the properties folder in the Indexing Services MMC. Some iFilters take a while to publish these properties to Indexing Services, so be patient. It may take up to 10 minutes to publish some properties (XML is very slow).
Q. How do I cache properties without using the MMC?
A. Use the registry for this. Consult this KB article for more information: http://support.microsoft.com/default.aspx?scid=kb;en-us;198586
Q. My virtual root is not shown in the MMC. How do I fix this?
A. Open up the IIS MMC (not the Indexing Services MMC; the IIS MMC is accessible by typing %windir%\system32\inetsrv\iis.msc at the command prompt), expand your web site node, and locate the folder that does not have a blue globe on it in the Indexing Services MMC. This folder may have a blue globe on it in the IIS MMC (which means it is a virtual root; in other words, this directory is not a physical subfolder of the web site root, i.e., it is not a subdirectory of c:\inetpub\wwwroot or the root of your web site), or it may have what appears to be a box with a marble in it (which means it is a virtual directory in IIS: a directory which may or may not exist as a subdirectory of your web site root, but has process isolation from the rest of your web site). Indexing Services considers both virtual roots and virtual directories to be virtual roots. Right click on these virtual roots/directories in the IIS MMC, select Properties, and then the Virtual Directory tab. Ensure that the Index this resource check box is checked (Figure 4).
If the virtual root, or the directory you wish to index as a virtual root, does not appear within the scope of your web site, you must locate this folder using file explorer (type explorer at the command prompt), right click on the folder, select sharing, and then click the web sharing tab. Select the Share this folder option, and park the virtual directory in the web site of your choice.
Q. How do I define brand-new property sets?
A. You can't do this. Property sets are generated by an iFilter. You could write your own iFilter to accomplish this. Consult this link for more information on how to write an iFilter. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odc_SP2003_ta/html/ODC_HowToWriteaFilter.asp
Q. How do I create a new catalog?
A. In the Indexing Services MMC, select Action, point to New, and click Create Catalog. Assign your catalog a name, and give it a location. With Win2k and above, the catalog files will not be indexed by default. If you were not careful about your catalog location in NT 4.0, you could end up indexing your catalog files and consuming considerable CPU in the process.
Q. Do I have to use the Indexing Services MMC to create a new catalog?
A. No, you can do it through a reg file. Here is an example of such a reg file. Save this file with the extension .reg.
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ContentIndex\Catalogs\CatalogName]
"Location"="c:\\CatalogLocation"
"MaxCharacterization"=dword:00000140

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ContentIndex\Catalogs\CatalogName\Scopes]
"c:\\inetpub\\wwwroot\\"=",,3"
Note that you have to change the catalog name from CatalogName to whatever you want your catalog called, and you have to change the location from c:\CatalogLocation to the actual catalog location (you will have to pre-create this directory). You will also have to add the scopes manually. After importing the reg file (by double-clicking on it), stop and start the Indexing Services service (go to the command prompt and type net stop cisvc and then net start cisvc).
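If you need to produce this reg file for several catalogs, the text can be generated rather than edited by hand. The following Python is a minimal sketch under stated assumptions: the function names are invented for this example, the MaxCharacterization value and the ",,3" scope data are simply copied from the reg file above, and writing the text to disk and importing it are left to you.

```python
# Key path copied from the reg file shown above.
BASE_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ContentIndex\Catalogs"


def reg_escape(value):
    """Double backslashes and escape quotes, as string values in a reg file require."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_catalog_reg(catalog_name, location, scopes):
    """Return reg file text for one catalog (hypothetical helper).

    catalog_name: name of the new catalog
    location:     local directory that will hold the catalog (pre-create it)
    scopes:       directories to index; each gets the ",,3" data used above
    """
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{BASE_KEY}\\{catalog_name}]",
        f'"Location"="{reg_escape(location)}"',
        '"MaxCharacterization"=dword:00000140',
        "",
        f"[{BASE_KEY}\\{catalog_name}\\Scopes]",
    ]
    for scope in scopes:
        lines.append(f'"{reg_escape(scope)}"=",,3"')
    return "\n".join(lines) + "\n"


reg_text = build_catalog_reg(
    "MyCatalog", "c:\\CatalogLocation", ["c:\\inetpub\\wwwroot\\"]
)
```

Save reg_text to a file with the .reg extension, import it, and then restart the service with net stop cisvc and net start cisvc as described above.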
Q. How do I get Indexing Services to index a new folder?
A. In the Indexing Services MMC, expand your catalog, and then expand your directories folder. Right click on your directories folder and select New Directory. Enter the name of the path, and accept the default to add this directory. You also have the option of indexing a directory, and then adding subdirectories you do not wish to be included in this index. To exclude these directories, ensure you select No for the Include in Index option in the Add Directory dialog box, as illustrated in Figure 3.
Q. How do I get Indexing Services to index my web site?
A. By default, any web site you create will be indexed by the web catalog. You can create a new catalog and have your web site tracked by it instead: right click on the catalog in the Indexing Services MMC, select Properties, and then Tracking. Select your web site in the drop down box labeled WWW Server. This is illustrated in Figure 4.
Q. How do I get Indexing Services to index my virtual directory?
A. You can't add virtual directories through the Indexing Services MMC. You must use the IIS MMC. Open up your IIS MMC (from the command prompt type c:\winnt\system32\inetsrv\iis.msc), and expand your web site. Locate your Virtual Directory, right click on it, select properties, and then click the Home Directory tab. Ensure the check box Index this resource is checked. This is illustrated in Figure 5.
Q. What are the blue globes that show up in my Indexing Services MMC in the directories folder? (These are illustrated in Figure 2).
A. These are virtual roots; they correspond to virtual roots or virtual directories within your web site. In your search results, hits coming from virtual roots will have a vpath value. If they come from a physical root, the value of the vpath property will be null. Hits coming from virtual roots or physical roots will always have a value for the path property which corresponds to the physical directory the document resides in.
Q. Why are some of the folders which show up in the directories folder of the Indexing Services MMC tinged a light green?
A. These are folders, or physical roots, that come from network shares.
Q. How do I index an NNTP site?
A. Open up your IIS MMC (server versions of Win2k and above only), expand your NNTP server, right click on your NNTP virtual directory, and select Properties. Ensure that the Index News Content check box is checked (Figure 6).
Then right click on the catalog you wish to use to index your NNTP server, and select Properties. Click on the Tracking tab. In the drop down box for NNTP server, select your NNTP server (Figure 7) (server versions of Win2k and above only).
Q. How do I index a network share?
A. Select an account to crawl the network share. Expand your catalog, right click on the directories folder, and select Add Directory. You will see the dialog illustrated in Figure 3. Enter the path as a UNC path; this will enable the account option, as illustrated in Figure 8.
By default the network share will be crawled every 2 hours. This interval is defined in the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ContentIndex with the registry value ForcedNetPathScanInterval. Win2k and above support file-modification notifications, where file activity on the network share notifies the change journal on the local machine so that the file is indexed.
Q. How do I index a Novell share?
A. Indexing a Novell share, or a share on any other OS or appliance, is the same as indexing a network share, as long as the account you are using to do the crawl has rights to read this share.
Q. How do I index an FTP site?
A. Expand your catalog, right click on the directories folder, select New and then Directory. For the path, enter the ftp server and root. For user name enter the NT account, or anonymous. Figure 9 illustrates this.
Q. How do I index a DFS share?
A. You can't. Please consult this knowledgebase article for more information. http://support.microsoft.com/default.aspx?scid=kb;en-us;260207
Q. Can I put my index on a network drive?
A. No, Indexing Services requires catalog placement on a local drive.
Q. How do I index the contents of a CD ROM?
A. First off, your drive must be formatted as a FAT or FAT32 drive. Create your catalog on this drive and place your documents on this drive. The documents can be in subdirectories. After Indexing Services has completed indexing the drive, do a merge, stop Indexing Services (net stop cisvc at the command prompt), and burn the contents of this drive (or at least the contents of the catalog.wci folder and all documents indexed, maintaining the directory structure) to your CD ROM. When you place the CD in a drive it will be mounted as Removable_Q, where Q is the drive letter normally associated with your CD ROM. Figure 10 illustrates what a catalog mounted from a CD will look like.
To access this catalog in your code use the name Removable_Q where Q is the drive letter normally associated with your CD ROM.
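As a small illustration of the naming rule, the Python sketch below derives the catalog name from a drive letter. The function name is hypothetical, and upper-casing the letter is an assumption made only to match the Removable_Q form shown above.

```python
import string


def removable_catalog_name(drive):
    """Build the Removable_<letter> catalog name for a CD ROM drive.

    Accepts "Q", "q:", or "Q:\\". Upper-casing is an assumption that
    mirrors the Removable_Q example; it is not taken from documentation.
    """
    letter = drive.strip().rstrip(":\\/")
    if len(letter) != 1 or letter not in string.ascii_letters:
        raise ValueError(f"not a drive letter: {drive!r}")
    return "Removable_" + letter.upper()


catalog_name = removable_catalog_name("Q:\\")
```

You would then pass catalog_name wherever your query code expects the name of the catalog.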
Q. When I open the property folder I see many columns such as Property, Property Sets, Friendly Names, etc. (these columns are illustrated in Figure 11). What do these columns refer to?
A. Indexing Services not only allows you to search content, but it also allows you to search properties; properties like filename, file size, the Office Summary information (to see these click here), NTFS properties (to see these click here), and HTML MetaTags.
Indexing Services uses iFilters to extract the properties and content from documents. Some iFilters are multipurpose: they extract content from different document types. For instance, the Office iFilter extracts content and properties from Word, Excel, and PowerPoint documents; the HTML iFilter also indexes ASP pages.
Some document types have the same properties. All documents on Win2k and above have NTFS properties which are indexable and queryable. Office documents and some HTML MetaTags overlap. You can see these by right clicking on a document in the file system, and selecting Properties and then Summary.
Properties are grouped by function, or sometimes a vendor chooses to group them according to their own idiosyncratic fashion. Microsoft chose to select a random GUID to identify a set of properties. Here is a list of the properties I know about.
You will see these properties in your properties folder if 1) you have the iFilter that corresponds to this property set on your machine, 2) you have documents in your file system that the iFilter indexes, and 3) the properties the iFilter extracts can be found in these documents.
You will not see the last 4 property sets in the Indexing Services MMC.
Part of the reason Microsoft has chosen to adopt this strategy is to prevent property name collisions. For instance you might have one document which has a property of Size. This property might also exist in a different document type, and also exists in the NTFS property set. Grouping properties according to property sets prevents such naming collisions.
The Property column refers to the name of your property. This is the name you will query by, or the name of the column in the results set returned from a query. Not all properties are queryable, indexable, or returnable in a query. You will notice that some of the property names are hex values, and some properties might have duplicate names; in these cases you will want to query the property by its friendly name. For instance, the characterization property has a property name of 0x2. You will want to query it by its friendly name, characterization.
If your property does not have a friendly name, you will want to give it one using the default column file registry setting. Click here for more information on how to use this.
Data type refers to the data type of your property. Most properties will be string, but they can also be integer, or date (among others). This presents some complications in querying and indexing. Please click here for more information on this.
Cached Size is how much space is allocated for property storage in the property store. Accept the defaults in all cases.
Storage level refers to where the property is stored when indexed. Frequently queried properties should be stored in the primary cache, as it is preferentially not paged to disk. Properties which are not queried frequently should be stored in the secondary cache, which is paged to disk.
Q. I want to be able to search or retrieve a property. How do I do this?
A. Not all properties are queryable, indexable, or retrievable. If a property is not retrievable, you will not be able to display its contents in your results set. Use FiltDump to see if your iFilter supports indexing this property, and observe the name that the iFilter gives to it. You may have to seed the property with a value to spot it in FiltDump's output. Then use the default column file to name it if it doesn't have a friendly name. Click here for more information on how to use this. Please return to the troubleshooting section for more information on these classes of problems.
Q. When I search I don't get correct results and I notice that in the Indexing Services MMC, the status for my catalog says Indexing paused (user active), started. What does this mean, and how do I fix this?
A. Please consult this KB article for more information.
Q. How can I find out what documents can't be indexed?
A. Expand your catalog, select the Query the Catalog Page. Click on the Unfiltered Docs link as illustrated in Figure 12.
Q. How can I find out what documents are waiting to be indexed?
A. There is an applet that does this.
Q. How can I stop certain folders from being indexed?
A. Right click on these folders, select Properties, and then Advanced. Uncheck the "For fast searching, allow Indexing Service to index this folder" check box.
Q. How can I stop certain document types from being indexed?
A. You have two options. The first is to remove the persistent handler registry key for the extension for this document type. For instance to prevent Word documents from being indexed remove this key:
or change it to something like this
You can also add the following registry entries to the key
Value Name: c:\inetpub\wwwroot\*.txt
Value Data: ,,4
Here we are preventing any text files from being indexed in the root of our default web site, and any of its subdirectories (this tip courtesy of Matthew Folletti (matthewfoletti at hotmail.com)).
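If you want to exclude several document types this way, the value names can be built programmatically. The Python below is only a sketch: the function name is hypothetical, and extending the *.txt pattern to other extensions is an assumption that follows the single example above rather than anything documented here. The ",,4" data is copied from that example.

```python
def exclusion_entries(root, extensions):
    """Return (value name, value data) pairs that exclude file types under root.

    root:       directory such as c:\\inetpub\\wwwroot
    extensions: extensions with or without a leading dot, e.g. ["txt", ".log"]
    The ",,4" data is copied from the tip above; applying the same pattern
    to extensions other than txt is an assumption.
    """
    base = root.rstrip("\\")
    return [(f"{base}\\*.{ext.lstrip('.')}", ",,4") for ext in extensions]


entries = exclusion_entries("c:\\inetpub\\wwwroot", ["txt", ".log"])
for name, data in entries:
    print("Value Name:", name)
    print("Value Data:", data)
```

Each pair is then added as a registry value in the same way as the entry shown above.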