Search is an important part of desktop and web-based applications but it can be made to seem more difficult than it needs to be. We take a look at how to implement search via an object-oriented API using dtSearch and C#.
You may have noticed that I Programmer's search facility is, to put it bluntly, not very good. It has been on our fix list for a long time but so far no one has had the courage to decide on the technology to use to replace it.
I also have a great interest in desktop search - or rather how it generally doesn't work under Windows. Since Vista, Windows desktop search has been difficult to use, difficult to configure and difficult to manage. I've tried alternatives such as Windows Search 4.0 and Solr but there are problems with both. They tend to be overcomplex and simply not worth the effort. Now I'm investigating dtSearch and I can tell you now, it's a refreshing return to simplicity.
But see if you agree as I explain how easy it is to get started with it as a system component and as an API.
What is search all about?
This is a difficult question because there are many answers depending on your exact circumstances. Put simply, search is about finding documents based on their content. The dumb way to do this is to search the entire collection of documents each time. The more intelligent way of doing the job is to scan all of the documents once and build an index of the words that they contain. The index is typically much smaller than the collection of documents and much faster to search.
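To make the difference concrete, here is a minimal sketch of the indexing idea in plain C#. It has nothing to do with how dtSearch stores its index internally; the class name, method names and sample documents are all made up for illustration, and the tokenizer is deliberately crude.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: a toy inverted index, not dtSearch's index format.
public static class InvertedIndexSketch
{
    // Lowercase the text and split it into words on anything that is not a letter or digit.
    public static IEnumerable<string> Tokenize(string text)
    {
        var cleaned = new string(text
            .Select(c => char.IsLetterOrDigit(c) ? char.ToLowerInvariant(c) : ' ')
            .ToArray());
        return cleaned.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Scan every document once and record which documents contain each word.
    public static Dictionary<string, SortedSet<string>> BuildIndex(
        IReadOnlyDictionary<string, string> documents)
    {
        var index = new Dictionary<string, SortedSet<string>>();
        foreach (var entry in documents)
        {
            foreach (var word in Tokenize(entry.Value))
            {
                if (!index.TryGetValue(word, out var names))
                {
                    names = new SortedSet<string>(StringComparer.Ordinal);
                    index[word] = names;
                }
                names.Add(entry.Key);
            }
        }
        return index;
    }

    // Searching is now a single dictionary lookup instead of a scan of every document.
    public static IEnumerable<string> Lookup(
        Dictionary<string, SortedSet<string>> index, string word)
    {
        return index.TryGetValue(word.ToLowerInvariant(), out var names)
            ? names
            : Enumerable.Empty<string>();
    }

    public static void Main()
    {
        // Hypothetical documents standing in for files on disk.
        var documents = new Dictionary<string, string>
        {
            ["a.txt"] = "Hello from the first document",
            ["b.txt"] = "Another document says hello world",
            ["c.txt"] = "Nothing relevant here",
        };

        var index = BuildIndex(documents);
        Console.WriteLine(string.Join(", ", Lookup(index, "Hello"))); // prints "a.txt, b.txt"
    }
}
```

The cost of reading the documents is paid once in BuildIndex; every later query only touches the dictionary.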
In most cases you only have to scan the entire collection of files once and then add any new files to the index. The big problem is that most tools make creating an index a difficult task. Not so with dtSearch. It makes it seem easy and direct and you can see exactly what is going on.
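The "scan once, then add only what is new" idea can be sketched in a few lines. Again this is a toy written for this article, not a description of what dtSearch does when it updates an index: the class and its members are hypothetical, and it ignores changed or deleted files, which a real indexer has to handle.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: remembers which documents have been indexed and skips them next time.
public sealed class IncrementalIndexSketch
{
    private readonly HashSet<string> _indexedDocuments = new HashSet<string>();
    private readonly Dictionary<string, HashSet<string>> _wordToDocuments =
        new Dictionary<string, HashSet<string>>();

    // Indexes only documents not seen before and returns how many were actually scanned.
    public int AddNewDocuments(IReadOnlyDictionary<string, string> documents)
    {
        int scanned = 0;
        foreach (var entry in documents)
        {
            // HashSet.Add returns false when the name is already present, so we skip it.
            if (!_indexedDocuments.Add(entry.Key))
                continue;

            var words = entry.Value.ToLowerInvariant()
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (var word in words)
            {
                if (!_wordToDocuments.TryGetValue(word, out var names))
                {
                    names = new HashSet<string>();
                    _wordToDocuments[word] = names;
                }
                names.Add(entry.Key);
            }
            scanned++;
        }
        return scanned;
    }

    // True if the given word was found in the given document during indexing.
    public bool Contains(string word, string documentName)
    {
        return _wordToDocuments.TryGetValue(word.ToLowerInvariant(), out var names)
            && names.Contains(documentName);
    }

    public static void Main()
    {
        var index = new IncrementalIndexSketch();
        var documents = new Dictionary<string, string>
        {
            ["a.txt"] = "hello world",
            ["b.txt"] = "goodbye world",
        };

        Console.WriteLine(index.AddNewDocuments(documents)); // prints 2: first full scan

        documents["c.txt"] = "hello again";
        Console.WriteLine(index.AddNewDocuments(documents)); // prints 1: only c.txt is new
    }
}
```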
You could simply use the dtSearch console to find documents but in most cases it is preferable to build it into an app of your own - and this is where the API comes in. But before looking at the basic structure of the API and building our first "hello world" applications let's take a quick look at getting started with dtSearch.
Installation and first use
All you have to do is download the 30-day evaluation of dtSearch With Spider and start the installer. Once it has finished, run dtSearchDesktop. This is your command centre for everything from creating an index to searching an index.
Your first task is to create an index. There are a number of different options you can select that make the index more useful, but for the moment you can opt for the defaults. All you have to do is give the index a name (test in the following examples) and specify where the index is stored. You can accept the default location for the moment, but the fact that dtSearch doesn't try to hide where the index is stored by using a fixed internal path is welcome. When you come to use it for real you can specify a path for the index that is, say, included in your regular backup. Notice that you can use a network share for the index location but this will run slower than a local one.
Once you click OK the index is created and your next step is to specify the files that will be indexed. A single index can be used to index multiple locations and multiple location types. You can add additional locations to an index at any time by using the Update Index command. Simply add the data you want to index to the "What to Index" list. You can index a folder, a file, a website or an Outlook store. You can customize the files that are included in the index using filename extensions - but for this example just accept the defaults. For the example I simply indexed my local documents directory - not a big collection of files but enough to show the principles of operation.
After a few minutes the index should be complete - you can stop it or pause it any time you want to. The next task is to query it. It is a good idea to become familiar with the query mechanisms before you move on to programming because what you can do using the API is very similar. Simply select the search option and type in your search target. You can type a single word, a phrase or a conditional. For example, "Hello or World" finds documents with "Hello" or "World".
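If you want a feel for what a conditional search amounts to, the sketch below evaluates "a or b" and "a and b" against a tiny hand-built index using set union and intersection. It is an illustration of the general idea only: it is not dtSearch's query parser, it handles just these two forms plus a single word, and the index contents and names are invented.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: evaluates "word", "a or b" and "a and b" against a toy index.
public static class BooleanQuerySketch
{
    // Copy the list of documents for a word into a fresh set (empty if the word is unknown).
    private static SortedSet<string> Postings(
        IReadOnlyDictionary<string, string[]> index, string word)
    {
        return index.TryGetValue(word, out var docs)
            ? new SortedSet<string>(docs, StringComparer.Ordinal)
            : new SortedSet<string>(StringComparer.Ordinal);
    }

    public static SortedSet<string> Evaluate(
        IReadOnlyDictionary<string, string[]> index, string query)
    {
        var parts = query.ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        if (parts.Length == 1)
            return Postings(index, parts[0]);

        if (parts.Length == 3 && (parts[1] == "or" || parts[1] == "and"))
        {
            var result = Postings(index, parts[0]);
            var right = Postings(index, parts[2]);
            if (parts[1] == "or")
                result.UnionWith(right);      // documents containing either word
            else
                result.IntersectWith(right);  // documents containing both words
            return result;
        }

        throw new ArgumentException(
            "This sketch only supports 'a', 'a or b' and 'a and b'.", nameof(query));
    }

    public static void Main()
    {
        // Hypothetical index: lowercase word -> documents that contain it.
        var index = new Dictionary<string, string[]>
        {
            ["hello"] = new[] { "a.txt", "b.txt" },
            ["world"] = new[] { "b.txt", "c.txt" },
        };

        Console.WriteLine(string.Join(", ", Evaluate(index, "Hello or World")));  // prints "a.txt, b.txt, c.txt"
        Console.WriteLine(string.Join(", ", Evaluate(index, "Hello and World"))); // prints "b.txt"
    }
}
```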
The search results give you a list of document locations and various search statistics - hits, score etc. In the lower pane you can also see extracts from the documents where the hits occurred.
There are a great many search options that you can set and the best way to find out about them is to look up what they do and then construct searches using them.
That's about all you need to know about the basic use of dtSearch - set up an index, search it, view documents. It couldn't be simpler. From the programmer's point of view it is made all the simpler because the details of operation aren't hidden from view in some complex admin structure. In this case you can see the paths involved, the search parameters and the results.
There are more details of the way the index is created and searched that you can control but this is enough to move on and start looking at the API.
So before moving on make sure you have an index called test in a known location and that you have tried it out with a search term or two.