Web Weaver Help


Help Topics

Introduction

Creating an Index with Searcher Pro NDX

Creating an Index On-Line

Creating an Index from the Command Line

Setting up your search form

Setting up your results form


Introduction

Thank you for purchasing Web Weaver for Windows!

Web Weaver is the easy-to-use, full-text indexed search engine for Windows web servers. It works with MS Internet Information Server, Netscape Web servers, and Website to provide fast access to your web site pages. Web Weaver provides features not found in other search engines, including:

1. Multi-field entry for syntax free full Boolean combinations.

2. Easy creation of and reference to multiple indexes (including optional site visitor selection of index).

3. Select files for index by any parameter; easy exclusion of individual files.

4. On-line indexer for remote administration.

5. Log file of indexes created.

6. The ranking and sorting of "hits" according to the number of keywords the files contain (including the Boolean terms) or date of the document.

7. Optional site visitor selection of file aging in days.

8. Match partial word option.

9. Displays header, date and size of HTML files found.

10. User selection of number of matches displayed.

How Web Weaver Works

Web Weaver consists of two parts (three if you count the on-line indexer): the indexer and the search engine. When you choose to create an index, Searcher Pro NDX or MakeNDX creates two files: files.bin and text.bin (default names). Files.bin contains a list of the files indexed, and text.bin contains a list of the words found in those files. The indexer stores the file locations in the Weaver.ini file under the section specified by the index name.
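
The two-file layout described above can be pictured with a small Python sketch. The on-disk format of files.bin and text.bin is not documented here, so the in-memory structures and the name build_index are illustrative assumptions, not Web Weaver's actual implementation.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Build a toy two-part index from a {path: text} mapping.

    `files` plays the role of files.bin (the list of indexed files) and
    `words` the role of text.bin (each word mapped to the files containing
    it, with a per-file count). The layout is hypothetical.
    """
    files = sorted(pages)            # position in this list = file id
    words = defaultdict(dict)        # word -> {file id: occurrence count}
    for file_id, path in enumerate(files):
        for word in re.findall(r"[a-z0-9]+", pages[path].lower()):
            words[word][file_id] = words[word].get(file_id, 0) + 1
    return files, dict(words)


files, words = build_index({
    "index.htm": "Welcome to the web site",
    "help.htm": "Help for the web search form",
})
print(files)          # the file list
print(words["web"])   # which files contain "web", and how often
```

A search then only needs to look a word up in `words` and translate the file ids back to paths through `files`.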

Weaver.exe is a CGI program which is launched by the Web server. The Web server passes the search terms to Weaver, which looks up the words in the index files specified by the index name in the search form. Weaver then reads the results.htm page and writes the page information, along with the search results, back to the server, which sends it to the Web browser for display.

The combined command line/on-line indexer, MakeNDX.exe, is also a CGI program; it uses the information passed to it to construct new index files.


Creating an Index with Searcher Pro NDX

Launch Searcher Pro NDX from the server by double clicking on the icon in the Web Weaver group in Program Manager or the Web Weaver folder in the Start Menu. See the Searcher Pro Windows help file for general information on operating Searcher Pro.

To create an index for your web site using Searcher Pro NDX, first configure Searcher Pro to find the files you want to index. Click on the disk icon for the correct drive and enter the path for the root of your web pages. If you have a tree of folders beneath the root for your web site, be sure to click the "All Subs" button. Searcher Pro can search for files by virtually any attribute.

Launch a Search (Alt-S) for the files you want to index and select them with the "Select All" button. You can select or un-select files one at a time by holding the Ctrl key down while clicking on files in the list. Then choose Index from the File menu or click on the "Index" button.

The "Index Selected Files" dialog box will appear with entries for the various files and folders to be used in the index creation. The "Index Name" entry specifies which index to create. You may choose a previous index from the drop down list for updating, enter a new index name, or use the default index name: "Index Files". The "File Index File" and "Text Index File" entries specify the files to which the index should be written. If the files already exist, you will be warned before they are overwritten.

The "Results HTML File" entry specifies the HTML file to which the search results should be written. You may use different results files for each index created, but they must all end with the same 14 characters as the example provided. See Setting up your results form.

The "Log File" entry specifies the HTML file to which the index creation results are written. Searcher Pro NDX writes the name, the total number of words and the number of unique words for each file indexed.
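
As a rough illustration of the two counts written to the log, the Python sketch below computes the total and unique word counts for one file's text. How Searcher Pro NDX actually splits text into words is not stated here, so the tokenizing rule (runs of letters and digits, case ignored) is an assumption.

```python
import re


def word_stats(text):
    """Return (total words, unique words) for one file's text."""
    # Assumed tokenizer: runs of letters and digits, compared without case.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return len(words), len(set(words))


print(word_stats("the quick fox jumps over the lazy dog"))  # (8, 7)
```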

The "Sort Text Index" check box controls whether the text index is sorted. If it is not sorted the text "hits" will be presented in the order found. If the text index is sorted, the results will be presented in the order of files with the most matching words first.

Creating an index can take more than an hour, but Searcher Pro NDX is multi-threaded, so you can continue to perform other tasks, or even index other files, while the index is being created.

Searcher Pro writes the index file information into the Weaver.ini file, which Web Weaver then uses to find the index files. The index is selected according to the value specified by the entry:

<SELECT NAME="section">

<OPTION VALUE="IndexFiles">All Pages

<OPTION VALUE="IndexFiles1">Site 1

<OPTION VALUE="IndexFiles2">Site 2

</SELECT>

in the search form, Search.htm, where the value assigned to "section" determines the index to be used for the search.
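
The lookup from the "section" value to the index files can be sketched in Python as follows. Only the section names come from the form above; the key names FileIndex and TextIndex and the paths are hypothetical, since the actual contents of Weaver.ini are not shown in this help file.

```python
import configparser

# Hypothetical Weaver.ini contents: one section per index name.
SAMPLE_INI = """
[IndexFiles]
FileIndex = c:\\weaver\\files.bin
TextIndex = c:\\weaver\\text.bin

[IndexFiles1]
FileIndex = c:\\weaver\\site1\\files.bin
TextIndex = c:\\weaver\\site1\\text.bin
"""


def index_files_for(section, ini_text=SAMPLE_INI):
    """Return the (file index, text index) paths stored under `section`."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    entry = parser[section]          # raises KeyError for an unknown index
    return entry["FileIndex"], entry["TextIndex"]


# The value posted in the form's "section" field picks the index.
print(index_files_for("IndexFiles1"))
```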


Creating an Index On-Line

You can create an index on-line with somewhat less flexibility than Searcher Pro NDX, by using the MakeNDX.htm form and the MakeNDX program. MakeNDX will only create indexes for a single directory or a root and all sub-directories. It does not contain provisions for including or excluding individual files from the index.

To create an index on-line, open the MakeNDX.htm HTML form in a web browser. This form should be customized in advance with the default values for the index to be created, including the correct path to the MakeNDX.exe CGI program on the server.

The MakeNDX HTML page contains entries for the same values as Searcher Pro NDX, plus one:

File Specification of HTML Pages (e.g. path\*.htm)

Enter the path to the root of the web site page files, followed by the wildcard file specification. Separate multiple paths with semi-colons.

Index Root Directory Only

If this box is checked only the folder specified above will be indexed. Otherwise the files in all sub-directories will be indexed.

Do Not Sort Text

If this box is checked, the files will not be sorted according to the number of matching words.

Index Name

The "Index Name" entry specifies which index to create. You may enter a new index name, or use the default index name: "Index Files".

MakeNDX writes the index file information into the Weaver.ini file, which Web Weaver then uses to find the index files. The index is selected according to the value specified by the entry:

<SELECT NAME="section">

<OPTION VALUE="IndexFiles">All Pages

<OPTION VALUE="IndexFiles1">Site 1

<OPTION VALUE="IndexFiles2">Site 2

</SELECT>

in the search form, Search.htm, where the value assigned to "section" determines the index to be used for the search.

File Index File Name

This entry specifies the file to which the file index should be written. If the file already exists, you will NOT be warned before it is overwritten.

Text Index File Name

This entry specifies the file to which the text index should be written. If the file already exists, you will NOT be warned before it is overwritten.

Results HTML File

The "Results HTML File" entry specifies the HTML file to which the search results should be written. You may use different results files for each index created, but they must all end with the same 14 characters as the example provided. See Setting up your results form.

Log File Name

The "Log File" entry specifies the HTML file to which the index creation results are written. MakeNDX writes the name, the total number of words and the number of unique words for each file indexed.
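
The effect of the "File Specification" and "Index Root Directory Only" entries described above can be sketched in Python. This is not MakeNDX's own code: the function select_files and the sample paths are made up for illustration, and a plain list of paths stands in for the server's directory listing so the sketch runs on any platform.

```python
import fnmatch
import ntpath  # Windows path rules, whatever platform runs the sketch


def select_files(file_specs, candidates, root_only=False):
    """Pick the candidate paths matched by 'path\\*.htm;path2\\*.htm' specs."""
    selected = []
    for spec in file_specs.split(";"):           # multiple paths, semi-colon separated
        root, pattern = ntpath.split(spec.strip())
        root = ntpath.normcase(root).rstrip("\\")
        for path in candidates:
            folder, name = ntpath.split(path)
            folder = ntpath.normcase(folder).rstrip("\\")
            in_root = folder == root
            in_subtree = folder.startswith(root + "\\")
            if (in_root or (in_subtree and not root_only)) and \
                    fnmatch.fnmatchcase(name.lower(), pattern.lower()):
                selected.append(path)
    return selected


pages = [
    r"c:\webroot\index.htm",
    r"c:\webroot\docs\help.htm",
    r"c:\webroot\logo.gif",
]
print(select_files(r"c:\webroot\*.htm", pages))                  # root and sub-directories
print(select_files(r"c:\webroot\*.htm", pages, root_only=True))  # root directory only
```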


Creating an Index from the Command Line

You can create an index from a console command line with MakeNDX.exe, the same program used by the on-line HTML form. The capabilities are exactly the same as described in Creating an Index On-Line.

You can view the command line parameters for MakeNDX by running the program with no parameters. The usage is:

MakeNDX [/ns] [/nw] [/na] [/Ppath\file spec] [/Sindex name] [/Ffile index file] [/Ttext index file] [/Llog file] where:

[/ns] represents the optional "not sorted" flag. When it is used, the text index is not sorted, so files with the most matching words are not moved to the front of the search results.

[/nw] represents "no warning" on file overwrites. If the specified index files exist they are overwritten without warning when this flag is used.

[/na] represents "not all" sub-directories are included. Only files in the specified directory are indexed when this flag is used. Otherwise, all sub-directories are included in the index.

The [path\file spec] (e.g. "c:\webroot\*.htm") specifies the directory and file specification of the files to be included in the index.

The [index name] parameter specifies which index to create. You may enter a new index name, or use the default index name: "Index Files".

The [file index file] specifies the file to which the file index should be written.

The [text index file] parameter specifies the file to which the text index should be written. If the file already exists, you will NOT be warned before it is overwritten.

The [log file] parameter specifies the HTML file to which the index creation results are written. MakeNDX writes the name, the total number of words and the number of unique words for each file indexed.
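
If you drive MakeNDX from a script, the argument list can be assembled as in the Python sketch below. It uses only the switches shown in the usage line above and attaches each value directly to its switch, as that line suggests. The helper name makendx_args and the sample values are made up for illustration, and how MakeNDX treats values containing spaces is not covered here.

```python
def makendx_args(file_spec, index_name=None, file_index=None, text_index=None,
                 log_file=None, sort=True, warn=True, all_subdirs=True):
    """Build the MakeNDX argument list from the documented switches."""
    args = ["MakeNDX"]
    if not sort:
        args.append("/ns")            # not sorted
    if not warn:
        args.append("/nw")            # no warning on overwrite
    if not all_subdirs:
        args.append("/na")            # not all sub-directories
    args.append("/P" + file_spec)     # path\file spec
    if index_name:
        args.append("/S" + index_name)
    if file_index:
        args.append("/F" + file_index)
    if text_index:
        args.append("/T" + text_index)
    if log_file:
        args.append("/L" + log_file)
    return args


# Sample values only; pass the list to subprocess.run() on the server to execute it.
print(makendx_args(r"c:\webroot\*.htm", index_name="IndexFiles1", all_subdirs=False))
```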


Setting up your search form

The search form contains an entry which the server uses to launch Web Weaver, such as:

<FORM METHOD="POST" ACTION="http://localhost/cgi-shl/weaver.exe">

for Website or:

<FORM METHOD="POST" ACTION="http://localhost/scripts/weaver.exe">

for Internet Information Server.

If the URL for your CGI scripts is different from the one the Search.htm file specifies, you may need to change the entry in the Search.htm file, copy Weaver.exe to the new location, or both.

The search form provides entries which specify the key words for which to search. A page is found and listed if it contains the specified combination of words. Full Boolean combinations are supported.

You can simplify the page by deleting the form entries you do not need. Weaver currently supports 3 AND terms, each made up of up to 3 OR terms. If you require additional AND or OR terms, please contact Cognitronix at 858-549-8955.
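
One way to read the multi-field Boolean search is as an AND of OR groups: a page matches when every filled-in group contributes at least one of its words. The Python sketch below illustrates that reading only. The function page_matches is hypothetical, and treating blank entries as ignored and case as irrelevant are assumptions.

```python
def page_matches(page_words, and_groups):
    """True if every non-empty AND group has at least one OR word on the page."""
    words = {word.lower() for word in page_words}
    groups = [[term.lower() for term in group if term] for group in and_groups]
    groups = [group for group in groups if group]      # skip entries left blank
    return bool(groups) and all(
        any(term in words for term in group) for group in groups
    )


page = "web search engine for windows".split()
# (web OR site) AND (windows), third group left blank
print(page_matches(page, [["web", "site"], ["windows"], []]))   # True
# (web) AND (linux OR unix)
print(page_matches(page, [["web"], ["linux", "unix"]]))         # False
```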

Searches can be limited by the date of the documents, specified in number of days. If a document is older than the number of days specified, it is excluded from the listing. If "all" is entered for the number of days, all documents which contain the keywords are listed.
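
The aging rule can be sketched in Python as below. The function name within_age is hypothetical, and whether Web Weaver counts whole calendar days or elapsed time is not stated, so elapsed time is assumed here.

```python
from datetime import datetime, timedelta


def within_age(modified, max_age_days, now=None):
    """True if a document last modified at `modified` passes the age limit."""
    if str(max_age_days).strip().lower() == "all":
        return True                        # "all" disables the age filter
    now = now or datetime.now()
    return now - modified <= timedelta(days=int(max_age_days))


today = datetime(2024, 1, 31)
print(within_age(datetime(2024, 1, 25), "7", today))    # True: 6 days old
print(within_age(datetime(2024, 1, 1), "7", today))     # False: 30 days old
print(within_age(datetime(2024, 1, 1), "all", today))   # True
```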

If the "Match Partial Word" option is checked, words which start with the specified letters are matched. For example, if "window" is specified, "window", "windows" and "windowing" are treated as a match. The entry <INPUT TYPE="checkbox" NAME="partial" VALUE="on"> controls this option.
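
In code terms, partial matching amounts to a prefix test instead of an equality test. A minimal Python sketch, assuming comparison without regard to case (the function word_matches is illustrative only):

```python
def word_matches(term, word, partial=False):
    """Compare a search term with an indexed word, optionally by prefix."""
    term, word = term.lower(), word.lower()
    return word.startswith(term) if partial else word == term


for candidate in ("window", "windows", "windowing", "win"):
    print(candidate, word_matches("window", candidate, partial=True))
```

With partial matching on, the first three words match and "win" does not, because a word must start with the whole term.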

The number of pages which are displayed may be specified by the user. If "all" is specified, all documents which contain the keywords will be displayed. Otherwise, the number of matches specified will be displayed, and the option to display the rest of the documents will be offered at the end of the page list.

Documents will be sorted by either the number of matches they contain, or the date of the document, at the option of the user. If the Number of Matches option is chosen, Web Weaver counts the number of times the keywords are in the documents, and displays pages with the largest count first.
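
The sorting and display-limit options can be sketched together in Python. The record fields, the function rank_hits and the sample data are hypothetical, and showing the newest documents first when sorting by date is an assumption, since the direction is not stated.

```python
from datetime import date


def rank_hits(hits, sort_by="matches", limit="all"):
    """Order hits, then split them into (shown now, offered as 'the rest')."""
    key = "matches" if sort_by == "matches" else "date"
    ordered = sorted(hits, key=lambda hit: hit[key], reverse=True)
    if str(limit).strip().lower() == "all":
        return ordered, []
    count = int(limit)
    return ordered[:count], ordered[count:]


hits = [
    {"page": "a.htm", "matches": 2, "date": date(2024, 3, 1)},
    {"page": "b.htm", "matches": 5, "date": date(2024, 1, 15)},
    {"page": "c.htm", "matches": 1, "date": date(2024, 2, 10)},
]
shown, rest = rank_hits(hits, sort_by="matches", limit=2)
print([hit["page"] for hit in shown], [hit["page"] for hit in rest])
```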


Setting up your results form

The default search results page contains a URL to the default search page:

Search Again

If your search page URL is different (that is, the "Search Again" link does not work), you will need to edit the Results.htm file to correct the URL for the "Search Again" pointer. Be sure that your Web server is running before testing the link.

You can customize the results.htm file to suit the needs of your web site, but any results page that you use will need to end with the same 14 characters which the example results form contains. To be sure that this condition is met, it is recommended that you copy the results.htm file to a new name and edit it without touching the last line.
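
A quick way to confirm that a customized results page still meets this condition is to compare its last 14 characters with those of the original. The Python sketch below does so on strings. The function same_ending and the sample pages are illustrative only, and the ending shown in the sample is a stand-in, not necessarily the one in your results.htm.

```python
def same_ending(original_html, custom_html, length=14):
    """True if both pages end with the same `length` characters."""
    return (len(custom_html) >= length
            and custom_html[-length:] == original_html[-length:])


# Stand-in page contents; read your own files in practice.
original = "<HTML><BODY>Search results</BODY></HTML>"
custom = "<HTML><BODY><H1>My Site</H1></BODY></HTML>"
print(same_ending(original, custom))          # True
print(same_ending(original, custom + "\n"))   # False: a trailing newline changes the ending
```

Note that even an extra blank line added by an editor at the end of the file changes the final characters, which is why leaving the last line untouched is the safest approach.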