Search engines have revolutionised the way that we use the Internet and, more recently, the desktop. You no longer have to be careful where you store files, because you can find anything by searching on content. Users increasingly rely on being able to find whatever they have lost, but for a programmer, implementing a search engine is a tall order.
The good news is that you don't have to because Microsoft has made the inner workings of its Desktop Search 3.0/4.0 available for you to use. You can easily build this into an application that finds and organises documents of all kinds – photos, music, web pages and so on. With just a little extra effort you can use it to create a search facility for your ASP .NET website.
Which Desktop Search?
Desktop Search is developing rapidly, but the version to use at the time of writing is 4.0. It is built into Windows 7, where it is renamed Windows Search. If you are using Windows Vista, XP or Windows Server 2008/2003, you will have to download and install it.
Put simply, Windows Search = Windows Desktop Search 4 (WDS 4).
A good place to start finding out about Desktop Search and the relevant downloads is:
There is also an SDK including some examples and a .NET wrapper for most of the COM objects involved in the API.
The big problem is that most of the interfaces have no documentation, and this makes the wrapper difficult to use. There is no separate SDK for Windows 7, as it is included in the Windows 7 SDK.
If you simply want to use the Desktop Search to find files then you probably don't need the wrapper as this portion of the interface is based on a simulated database API. What this means is that we can use a SQL query to search for documents based on a wide range of properties and content.
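To give a feel for what such a query looks like, here is a minimal sketch that only builds the query text and does not run it. Python is used purely for illustration; the string is the same whatever language sends it. The catalog name `SYSTEMINDEX`, the property `System.ItemPathDisplay` and the `FREETEXT` predicate are assumptions taken from the Windows Search SQL dialect rather than from the text above, and the helper name `build_search_query` is hypothetical.

```python
def build_search_query(terms, columns=("System.ItemPathDisplay",)):
    """Return a Windows Search style SQL string for a free-text search.

    Assumptions (not taken from the article): the index is exposed as a
    table called SYSTEMINDEX, columns are named System.*, and FREETEXT
    matches the words against indexed content and properties.
    """
    if not terms or not terms.strip():
        raise ValueError("terms must contain at least one word")
    if not columns:
        raise ValueError("at least one column is required")

    # Double any single quotes so the terms stay inside the SQL literal
    # (the usual SQL convention, assumed to apply here as well).
    escaped = terms.strip().replace("'", "''")

    column_list = ", ".join(columns)
    return (
        f"SELECT {column_list} "
        f"FROM SYSTEMINDEX "
        f"WHERE FREETEXT('{escaped}')"
    )
```

Calling `build_search_query("annual report")` produces a single SELECT statement asking for the display path of every indexed item that matches the words. Passing extra column names widens the result set without changing the search condition.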
Before we get started it is worth understanding how the search actually works.
After you have installed the Desktop Search, it starts to build an index of files. It doesn't index the entire disk, however. By default it indexes your documents directory and the directory in which Outlook (2007 on) stores emails.
To build the index, the search engine has to examine every file. It does this with the help of IFilters, which return properties such as file name, owner and creation date and, in some cases, the text that the file contains.
The index takes time to create, and until it has been completed you can't trust the results of a search. Once the index has been created, the search engine keeps it up-to-date by examining any new or changed files. This runs as a low-priority task and indexing is suspended while you are using the machine, which can mean that the very latest files you have created or downloaded aren't immediately available in the index.
Another problem is that if you store what you consider "searchable" documents in a directory outside your documents directory, on another disk say, they will not be included in the index. The only solution is to configure the search engine, manually or programmatically, to include these directories. It is important to realise that when the Desktop Search cannot find a file, this doesn't imply that the file isn't on the disk, just that it isn't in one of the folders currently included in the index.
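The point about scope can be made concrete with a small sketch. It does not read the search engine's real configuration; it simply takes a hypothetical list of indexed folders and reports whether a given file would fall inside one of them. The function name `is_in_index_scope` and the example paths are made up for illustration, and Python is used only because it keeps the example short.

```python
from pathlib import PureWindowsPath


def is_in_index_scope(file_path, indexed_roots):
    """Return True if file_path lies in or below one of indexed_roots.

    indexed_roots is a hypothetical list of folders assumed to be in the
    index; the real list lives in the search engine's own settings.
    PureWindowsPath compares case-insensitively, as Windows paths do.
    """
    path = PureWindowsPath(file_path)
    for root in indexed_roots:
        root_path = PureWindowsPath(root)
        # A file is in scope if the root is the path itself or an ancestor.
        if path == root_path or root_path in path.parents:
            return True
    return False


# Hypothetical configuration: only the user's documents folder is indexed.
roots = [r"C:\Users\alice\Documents"]
in_documents = is_in_index_scope(r"C:\Users\alice\Documents\notes\todo.txt", roots)
on_other_disk = is_in_index_scope(r"D:\archive\todo.txt", roots)
```

Here `in_documents` is true and `on_other_disk` is false, even though both files exist on disk. A search would only ever return the first one until `D:\archive` is added to the indexed locations.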
The index created by Desktop Search can be queried as if it were an OLE DB database.
The only problem with this is that OLE DB has been more or less completely superseded by ADO .NET. This means that many of the examples supplied are difficult to follow and based on last year's technology. As ADO .NET is perfectly happy working with OLE DB, or almost any of the older database technologies, it seems sensible to use the most up-to-date approach. After all, you spent time learning about it.
If you don't know how ADO .NET works, using it with the Desktop Search is also a good introduction to its basic principles.