
Creating an Index with Searcher Pro NDX
Creating an Index from the Command Line
Thank you for purchasing Web Weaver for Windows!
Web Weaver is the easy-to-use, full-text indexed search engine for Windows web servers. It works with MS Internet Information Server, Netscape Web servers, and Website to provide fast access to your web site pages. Web Weaver provides features not found in other search engines, including:
1. Multi-field entry for syntax-free full Boolean combinations.
2. Easy creation of and reference to multiple indexes (including optional site visitor selection of index).
3. Select files for index by any parameter; easy exclusion of individual files.
4. On-line indexer for remote administration.
5. Log file of indexes created.
6. The ranking and sorting of "hits" according to the number of keywords the files contain (including the Boolean terms) or date of the document.
7. Optional site visitor selection of file aging in days.
8. Match partial word option.
9. Displays header, date and size of HTML files found.
10. User selection of number of matches displayed.
Web Weaver consists of two parts (three if you count the on-line indexer): the indexer and the search engine. When you choose to create an index, Searcher Pro NDX or MakeNDX creates two files: files.bin and text.bin (default names). Files.bin contains a list of the files indexed, and text.bin contains a list of the words found in those files. The indexer stores the locations of these two files in the Weaver.ini file under the section specified by the index name.
Weaver.exe is a CGI program launched by the Web server, which passes the search terms to it. Weaver looks up the words in the index files specified by the index name in the search form. It then reads the results.htm page and writes the page information, along with the search results, back to the server, which returns it for display in a Web browser.
MakeNDX.exe, the combined command line and on-line indexer, is also a CGI program. It uses the information passed to it to construct new index files.
Creating an Index with Searcher Pro NDX
Launch Searcher Pro NDX from the server by double-clicking on the icon
in the Web Weaver group in Program Manager or the Web Weaver folder
in the Start Menu. See the Searcher Pro Windows help file for general
information on operating Searcher Pro.
To create an index for your web site using Searcher Pro NDX, first
configure Searcher Pro to find the files you want to index. Click on the
disk icon for the correct drive and enter the path for the root of your
web pages. If you have a tree of folders beneath the root for your
web site, be sure to click the "All Subs" button. Searcher Pro can
search for files by virtually any attribute.
Launch a Search (Alt-S) for the files you want to index and select them
with the "Select All" button. You can select or un-select files one at a
time by holding the Ctrl key down while clicking on files in the list.
Then choose Index from the File menu or click on the "Index" button.
The "Index Selected Files" dialog box will appear with entries for the
various files and folders to be used in the index creation. The
"Index Name" entry specifies which index to create. You may choose a
previous index from the drop down list for updating, enter a new index
name, or use the default index name: "Index Files". The "File Index File"
and "Text Index File" entries specify the files to which the index should
be written. If the files already exist, you will be warned before they
are overwritten.
The "Results HTML File" entry specifies the HTML file to which the search
results should be written. You may use different results files for each
index created, but they must all end with the same 14 characters as the
example provided. See Setting up your results form.
The "Log File" entry specifies the HTML file to which the index creation
results are written. Searcher Pro NDX writes the name, the total number
of words and the number of unique words for each file indexed.
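As a rough illustration of the two counts reported in the log file, the sketch below computes a total word count and a unique word count for a piece of text. How Searcher Pro NDX actually splits text into words is not documented here, so the rule used (runs of letters and digits, compared without regard to case) is an assumption, and `word_stats` is a hypothetical name.

```python
import re

def word_stats(text: str) -> tuple[int, int]:
    """Return (total words, unique words) for one file's text."""
    # Assumption: a word is a run of ASCII letters or digits, case ignored.
    words = re.findall(r"[A-Za-z0-9]+", text.lower())
    return len(words), len(set(words))

total, unique = word_stats("The cat saw the other cat")
print(total, unique)
```

A file with many repeated words will show a large gap between the two numbers.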
The "Sort Text Index" check box controls whether the text index is sorted.
If it is not sorted the text "hits" will be presented in the order found.
If the text index is sorted, the results will be presented in the order
of files with the most matching words first.
Creating an index can take more than an hour, but Searcher Pro NDX is
multi-threaded, so you can continue to perform other tasks, or even
index other files, while the index is being created.
Searcher Pro writes the index file information into the Weaver.ini file,
which is then used by Web Weaver to find the index files. The index
information is found according to the value specified by the entry:
<SELECT NAME="section">
<OPTION VALUE="IndexFiles">All Pages
<OPTION VALUE="IndexFiles1">Site 1
<OPTION VALUE="IndexFiles2">Site 2
</SELECT>
in the search form, Search.htm, where the
value assigned to "section" determines the index to be used for the search.
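The lookup described above can be pictured with a small sketch: take the "section" value posted by the search form and read the matching section of an INI file. This is only an illustration of the idea, not Weaver's code. The key names `FileIndex` and `TextIndex`, the sample paths, and the function name are all hypothetical; the actual layout of Weaver.ini is not shown in this document.

```python
import configparser
from urllib.parse import parse_qs

# Hypothetical Weaver.ini content. The key names and paths are invented for
# illustration; only "one section per index name" comes from the text.
SAMPLE_INI = r"""
[IndexFiles]
FileIndex = c:\webroot\files.bin
TextIndex = c:\webroot\text.bin

[IndexFiles1]
FileIndex = c:\site1\files.bin
TextIndex = c:\site1\text.bin
"""

def index_files_for(form_body: str, ini_text: str = SAMPLE_INI) -> tuple[str, str]:
    """Return the (file index, text index) paths for the posted "section"."""
    fields = parse_qs(form_body)            # decode the POSTed form fields
    section = fields["section"][0]          # value chosen in the SELECT list
    ini = configparser.ConfigParser(interpolation=None)
    ini.read_string(ini_text)
    entry = ini[section]                    # KeyError if the index is unknown
    return entry["FileIndex"], entry["TextIndex"]

print(index_files_for("section=IndexFiles1"))
```

The point of the sketch is that adding a new index only requires a new OPTION value in the form and a matching section in the INI file.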
Creating an Index On-Line
You can create an index on-line, with somewhat less flexibility than
Searcher Pro NDX, by using the MakeNDX.htm form and the MakeNDX program.
MakeNDX will only create indexes for a single directory or a root and
all sub-directories. It does not contain provisions for including or
excluding individual files from the index.
To create an index on-line, open the MakeNDX.htm HTML form in a web browser.
This form should be customized in advance with the default values for the index to
be created, including the correct path to the MakeNDX.exe CGI program
for the server.
The MakeNDX HTML page contains entries for the same values as Searcher
Pro NDX, plus one:
File Specification of HTML Pages: Enter the path to the root of the web site
page files, followed by the wildcard file specification (e.g. path\*.htm).
Separate multiple paths with semi-colons.
Index Root Directory Only: If this box is checked, only the folder specified
above will be indexed. Otherwise, the files in all sub-directories will be indexed.
Do Not Sort Text: If this box is checked, the files will not be sorted according
to the number of matching words.
The "Index Name" entry specifies which index to create. You may enter a
new index name, or use the default index name: "Index Files".
MakeNDX writes the index file information into the Weaver.ini file,
which is then used by Web Weaver to find the index files. The index
information is found according to the value specified by the entry:
<SELECT NAME="section">
<OPTION VALUE="IndexFiles">All Pages
<OPTION VALUE="IndexFiles1">Site 1
<OPTION VALUE="IndexFiles2">Site 2
</SELECT>
in the search form, Search.htm, where the
value assigned to "section" determines the index to be used for the search.
The "File Index File Name" entry specifies the file to which the file index
should be written. If the file already exists, you will NOT be warned before it
is overwritten.
The "Text Index File Name" entry specifies the file to which the text index
should be written. If the file already exists, you will NOT be warned
before it is overwritten.
The "Results HTML File" entry specifies the HTML file to which the search
results should be written. You may use different results files for each
index created, but they must all end with the same 14 characters as the
example provided. See Setting up your results form.
The "Log File" entry specifies the HTML file to which the index creation
results are written. MakeNDX writes the name, the total number
of words and the number of unique words for each file indexed.
Creating an Index from the Command Line
You can create an index from a console command line with MakeNDX.exe, the
same program used by the on-line HTML form. The capabilities are exactly the
same as described in Creating an Index On-Line.
You can view the command line parameters for MakeNDX by running the program
with no parameters. The usage is:
MakeNDX [/ns] [/nw] [/na] [/Ppath\file spec] [/Sindex name] [/Ffile index file] [/Ttext index file] [/Llog file]
where:
[/ns] represents the optional "not sorted" flag. When this flag is used, the text index is
not sorted, so files with the most matching words do not appear first in the search results.
[/nw] represents "no warning" on file overwrites. If the specified index files exist
they are overwritten without warning when this flag is used.
[/na] represents "not all" sub-directories are included. Only files in the specified
directory are indexed when this flag is used. Otherwise, all sub-directories are
included in the index.
The [path\file spec] (e.g. "c:\webroot\*.htm") specifies the directory and file specification
of the files to be included in the index.
The [index name] parameter specifies which index to create. You may enter a
new index name, or use the default index name: "Index Files".
The [file index file] specifies the file to which the file index should
be written.
The [text index file] parameter specifies the file to which the text index should
be written. If the file already exists, you will NOT be warned
before it is overwritten.
The [log file] parameter specifies the HTML file to which the index creation
results are written. MakeNDX writes the name, the total number
of words and the number of unique words for each file indexed.
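For scripted or scheduled indexing, the switches listed above can be assembled programmatically. The sketch below only builds the argument list and does not run the program. The helper name `makendx_args` and its keyword names are hypothetical, and how MakeNDX treats values containing spaces is not stated in this document, so check the result against the program's own usage output before relying on it.

```python
def makendx_args(path_spec=None, index_name=None, file_index=None,
                 text_index=None, log_file=None,
                 sort=True, warn=True, all_subdirs=True):
    """Build a MakeNDX argument list from the documented switches."""
    args = ["MakeNDX"]
    if not sort:
        args.append("/ns")      # do not sort the text index
    if not warn:
        args.append("/nw")      # overwrite existing index files silently
    if not all_subdirs:
        args.append("/na")      # index the specified directory only
    # Value switches are written as a prefix joined directly to the value,
    # following the usage line above.
    for prefix, value in (("/P", path_spec), ("/S", index_name),
                          ("/F", file_index), ("/T", text_index),
                          ("/L", log_file)):
        if value:
            args.append(prefix + value)
    return args

print(makendx_args(r"c:\webroot\*.htm", index_name="IndexFiles1",
                   all_subdirs=False))
```

On the server, a list like this could be handed to a process launcher such as Python's `subprocess.run`.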
Setting up your search form
The search form contains an entry which the server uses to launch
Web Weaver, such as:
<FORM METHOD="POST" ACTION="http://localhost/cgi-shl/weaver.exe">
for Website or:
<FORM METHOD="POST" ACTION="http://localhost/scripts/weaver.exe">
for Internet Information Server.
If the URL for your CGI scripts differs from the one specified in the
Search.htm file, you may need to change the entry in the Search.htm
file, copy Weaver.exe to the new location, or both.
The search form provides entries which specify the key words for which
to search. A page is found and listed if it contains the specified combination
of words. Full Boolean combinations are supported.
You can simplify the page by deleting form entries you do not need. Weaver
currently supports 3 AND terms of 3 OR terms each. If you require additional
AND or OR terms, please contact Cognitronix at 858-549-8955.
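One way to read "AND terms of OR terms" is that the form supplies several groups of alternative words, and a page is listed only if every group that was filled in has at least one of its words on the page. The sketch below follows that reading; it is an assumption about the form's semantics rather than Weaver's actual logic, and `page_matches` is a hypothetical name.

```python
def page_matches(page_words, and_groups):
    """True if every non-empty group has at least one word on the page."""
    words = {w.lower() for w in page_words}
    # Drop blank entries, then drop groups left with no terms at all.
    groups = [[t.lower() for t in group if t] for group in and_groups]
    groups = [g for g in groups if g]
    if not groups:
        return False            # no keywords were entered
    # AND across groups, OR within each group.
    return all(any(term in words for term in g) for g in groups)

page = ["Web", "server", "for", "Windows"]
print(page_matches(page, [["web", "internet", ""], ["server", "", ""], ["", "", ""]]))
print(page_matches(page, [["web", "", ""], ["unix", "linux", ""]]))
```

Blank form fields are simply ignored, which is why entries can be deleted from the page without changing how the remaining ones behave.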
Searches can be limited by the date of the documents, specified in number
of days. If a document is older than the number of days specified, it is
excluded from the listing. If "all" is entered for the number of days, all
documents which contain the keywords are listed.
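The aging rule can be sketched as a simple date comparison. This assumes the document date is the file's date and that a document exactly as old as the limit is still listed; neither detail is spelled out above, and `within_age` is a hypothetical name.

```python
from datetime import date

def within_age(doc_date: date, max_days: str, today: date) -> bool:
    """True if the document should be listed under the aging limit."""
    if max_days.strip().lower() == "all":
        return True                           # no age limit requested
    age_in_days = (today - doc_date).days
    return age_in_days <= int(max_days)       # older documents are excluded

today = date(2024, 1, 31)
print(within_age(date(2024, 1, 1), "30", today))
print(within_age(date(2024, 1, 1), "29", today))
```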
If the "Match Partial Word" option is checked, words which start with the
specified letters are matched. For example, if "window" is specified,
"window", "windows" and "windowing" are treated as a match. The entry
<INPUT TYPE="checkbox" NAME="partial" VALUE="on"> controls this option.
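The partial word option amounts to a prefix test instead of an exact comparison. A minimal sketch, assuming matching ignores letter case (not stated above) and using the hypothetical name `word_matches`:

```python
def word_matches(indexed_word: str, term: str, partial: bool) -> bool:
    """Compare one indexed word with one search term."""
    word, term = indexed_word.lower(), term.lower()
    if partial:
        return word.startswith(term)   # "window" matches "windowing"
    return word == term                # whole-word match only

for candidate in ("window", "windows", "windowing", "rewindow"):
    print(candidate, word_matches(candidate, "window", partial=True))
```

Only the start of the word is tested, so a word that merely contains the letters elsewhere is not a match.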
The number of pages which are displayed may be specified by the user.
If "all" is specified, all documents which contain the keywords will be displayed.
Otherwise, the number of matches specified will be displayed, and the
option to display the rest of the documents will be offered at the end of
the page list.
Documents will be sorted by either the number of matches they contain, or
the date of the document, at the option of the user. If the Number of Matches
option is chosen, Web Weaver counts the number of times the keywords are
in the documents, and displays pages with the largest count first.
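The two display options, sort order and number of matches shown, can be sketched together. Each hit below is a hypothetical `(url, match_count, doc_date)` tuple. Sorting by date is assumed to list the newest documents first, which the text above does not state.

```python
from datetime import date

def order_and_limit(hits, sort_by="matches", limit="all"):
    """Return (hits to display now, hits held back for "the rest")."""
    if sort_by == "matches":
        key = lambda hit: hit[1]       # number of keyword occurrences
    else:
        key = lambda hit: hit[2]       # document date (newest first assumed)
    ordered = sorted(hits, key=key, reverse=True)
    if limit == "all":
        return ordered, []
    count = int(limit)
    return ordered[:count], ordered[count:]

hits = [("a.htm", 2, date(2024, 1, 1)),
        ("b.htm", 5, date(2023, 1, 1)),
        ("c.htm", 3, date(2024, 6, 1))]
shown, rest = order_and_limit(hits, sort_by="matches", limit="2")
print([h[0] for h in shown], [h[0] for h in rest])
```

The second list corresponds to the documents offered through the option at the end of the page list.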
Setting up your results form
The default search results page contains a URL to the default search
page:
If your search page URL is different (this URL does not work),
you will need to edit the Results.htm file to correct the URL for
the "Search Again" pointer. Be sure that your Web server is running
before trying this test.
You can customize the results.htm file to suit the needs of your web
site, but any results page that you use will need to end with the same
14 characters which the example results form contains. To be sure that
this condition is met, it is recommended that you copy the results.htm
file to a new name and edit it without touching the last line.
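Before putting a customized results page into service, it may be worth checking that its ending still matches the supplied example. The sketch below compares the last 14 bytes of two files' contents. It assumes one byte per character, and the function name and sample contents are placeholders, not the real ending of results.htm.

```python
def same_tail(template: bytes, custom: bytes, n: int = 14) -> bool:
    """True if both byte strings end with the same n bytes."""
    return len(custom) >= n and template[-n:] == custom[-n:]

# Placeholder contents only; the real ending of results.htm is not shown here.
template = b"original page ... ABCDEFGHIJKLMN"
print(same_tail(template, b"my new layout ... ABCDEFGHIJKLMN"))
print(same_tail(template, b"my new layout ... ABCDEFGHIJKLMN\r\n"))
```

The second call shows a common way to break the rule: an editor that appends a line ending after the last line changes the final 14 characters. To check real files, read both with `open(path, "rb").read()` and pass the results to the function.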
Creating an Index with Searcher Pro NDX
Creating an Index On-Line
File Specification of HTML Pages (e.g. path\*.htm)
Index Root Directory Only
Do Not Sort Text
Index Name
File Index File Name
Text Index File Name
Results HTML File
Log File Name
Creating an Index from the Command Line
Setting up your search form
Setting up your results form