Hit Highlighting with dtSearch
Written by Ian Elliot   


The FileConverter

The central object that makes hit highlighting and similar tasks very easy indeed is the FileConverter. This takes a file in any of the supported formats, and there are a lot of them, and converts it to HTML, RTF, XML or plain text. This feat alone is worth its weight in code, but it will also "decorate" the conversion with markers that can be used to highlight the hits.

The first thing to say is that FileConverter is general and will process a document even if you acquired it by some complicated route. All you have to do is set its properties correctly and call its Execute method and the job is done.

The properties that you have to set are also fairly simple: the name of the file to process, a hits array giving the offsets from the start of the file of each of the hits, a specification of what characters constitute a word break, the index that the file was retrieved from and the document id. You also have to specify what format you want the results in and what strings you want to use to mark up the hits.

If you want to process a general file then you need to specify the file name as InputFile or, if the data is in a memory buffer, use InputBytes instead.
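As a rough sketch of the two input routes just described, the following converts either a file on disk or an in-memory buffer to an HTML string. It assumes the dtSearch .NET assembly is referenced and that InputBytes accepts a byte array. The class and method names are hypothetical. No hits array is supplied, so this performs a plain conversion with nothing highlighted.

```csharp
using dtSearch.Engine;  // assumption: the dtSearch .NET API assembly is referenced

static class ManualConversion
{
    // Convert a file on disk to an HTML string.
    public static string FromFile(string path)
    {
        FileConverter fc = new FileConverter();
        fc.InputFile = path;
        return Run(fc);
    }

    // Convert a document already held in memory to an HTML string.
    // Assumption: InputBytes takes a byte[].
    public static string FromBuffer(byte[] data)
    {
        FileConverter fc = new FileConverter();
        fc.InputBytes = data;
        return Run(fc);
    }

    // Shared settings: HTML output, returned as a string rather than a file.
    private static string Run(FileConverter fc)
    {
        fc.OutputFormat = OutputFormats.itHTML;
        fc.OutputToString = true;
        fc.Execute();
        return fc.OutputString;
    }
}
```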

You can do this job one property at a time and it isn't difficult but if the file has been returned as the result of a search then it is even easier. The SetInputItem method can be used to set all of the necessary properties from a SearchResults object. For example:

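FileConverter fc=new FileConverter();
// results is assumed to be the SearchResults object returned by the search
fc.SetInputItem(results,0);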

creates a FileConverter and initializes it so that it is ready to process the first document in the results. After this, the only things we need to set are the output format required, e.g. HTML, RTF, XML or plain text, and the strings to be inserted before and after each hit. For example, to create an HTML document and surround every hit with an <h1> tag pair (an odd choice but still possible) you would use:

fc.OutputFormat = OutputFormats.itHTML;
fc.BeforeHit = "<h1>";
fc.AfterHit = "</h1>";

Almost ready to highlight, or headline in this case, but first we need to set where the output is going. The FileConverter can either create a new file or return a string. In this case we will display the HTML in a WebBrowser control, so a string is most appropriate. Place a WebBrowser control on the form and store the result of the highlighting in it using:

fc.OutputToString = true;
fc.Execute();
webBrowser1.DocumentText = fc.OutputString;

If you now put all this together you will find that every occurrence of the target search item is set as an <h1> headline.
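Putting the steps together might look something like the following sketch. It assumes you already have a SearchResults object from a completed search and that the dtSearch .NET assembly is referenced. The class and method names are hypothetical and only there to make the fragment self-contained.

```csharp
using dtSearch.Engine;  // assumption: the dtSearch .NET API assembly is referenced

static class HitHighlighter
{
    // Returns the chosen search result converted to HTML,
    // with every hit wrapped in an <h1> tag pair.
    public static string HighlightItem(SearchResults results, int item)
    {
        FileConverter fc = new FileConverter();
        fc.SetInputItem(results, item);          // copy file name, hits etc. from the results
        fc.OutputFormat = OutputFormats.itHTML;  // convert to HTML
        fc.BeforeHit = "<h1>";                   // inserted before each hit
        fc.AfterHit = "</h1>";                   // inserted after each hit
        fc.OutputToString = true;                // return a string rather than write a file
        fc.Execute();
        return fc.OutputString;
    }
}
```

On the form you would then write something like webBrowser1.DocumentText = HitHighlighter.HighlightItem(results, 0); to show the first document.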




In most cases it would be better to use a color code to highlight the hits but you really are free to do whatever you want with the text that constitutes the hits.
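One way to get a color highlight instead of a headline is to wrap each hit in a styled span. The helper below is only a sketch: the class name is hypothetical and the choice of a span with an inline background color is an assumption, since any markup valid in the output format would do.

```csharp
static class ColorMarkers
{
    // Opening marker: a span whose background is the given CSS color.
    public static string Before(string cssColor)
    {
        return "<span style=\"background-color: " + cssColor + "\">";
    }

    // Closing marker that matches the span opened by Before().
    public const string After = "</span>";
}
```

You would then set fc.BeforeHit = ColorMarkers.Before("yellow"); and fc.AfterHit = ColorMarkers.After; in place of the <h1> strings.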

A really nice touch is the provision of the special tags %%ThisHit%%, %%NextHit%% and %%PreviousHit%%, which cause the FileConverter to place numbers into the output corresponding to the index of the hit in the array. With a little work these can be used to create hyperlinks that let the user navigate the document from hit to hit.
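As a sketch of how the tags might be used, the marker strings below give every hit a named anchor and turn the hit itself into a link to the next one. The "hit" prefix on the anchor names and the class name are arbitrary, hypothetical choices, and the sketch does not deal with what the link on the final hit should point to.

```csharp
static class HitNavigation
{
    // An anchor named after this hit, followed by the start of a link
    // that jumps to the anchor of the next hit.
    public const string BeforeHit =
        "<a name=\"hit%%ThisHit%%\"></a><a href=\"#hit%%NextHit%%\">";

    // Closes the link opened in BeforeHit.
    public const string AfterHit = "</a>";
}
```

Assigning these to fc.BeforeHit and fc.AfterHit before calling Execute should leave the FileConverter to fill in the numbers.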

Also notice that while the output of the conversion is HTML (or RTF, or XML or plain text) the input file can be in any of the supported formats, making FileConverter a one-stop solution to highlighting hits in almost any type of file in the index.

This is almost too easy.

To try dtSearch for yourself download the 30-day evaluation from dtsearch.com.

See also:

Getting Started with dtSearch

Threading and dtSearch



Last Updated ( Thursday, 06 October 2011 )

Copyright © 2014 i-programmer.info. All Rights Reserved.