English Material

Information management system

Wiliam K. Thomson, U.S.A.

Abstract: An information storage, searching and retrieval system for large (gigabytes) domains of archived textual data. The system includes multiple query generation processes, a search process, and a presentation of search results that is sorted by category or type and that may be customized based on the professional discipline (or analogous personal characteristic) of the user, thereby reducing the amount of time and cost required to retrieve relevant results.

Keywords: Information management; Retrieval system; Object-oriented

1. INTRODUCTION

This invention relates to an information storage, searching and retrieval system that incorporates a novel organization for presentation of search results from large (gigabytes) domains of archived textual data.

2. BACKGROUND OF THE INVENTION

On-line information retrieval systems are utilized for searching and retrieving many kinds of information. Most systems used today work in essentially the same manner; that is, users log on (through a computer terminal or personal microcomputer, and typically from a remote location), select a source of information (i.e., a particular database) which is usually something less than the complete domain, formulate a query, launch the search, and then review the search results displayed on the terminal or microcomputer, typically with documents (or summaries of documents) displayed in reverse chronological order. This process must be repeated each time another source (database) or group of sources is selected (which is frequently necessary in order to ensure that all relevant documents have been found). Additionally, this process places on the user the burden of organizing and assimilating the multiple results generated from the launch of the same query in each of the multiple sources (databases) that the user needs (or wants) to search.

Present systems that allow searching of large domains require persons seeking information in these domains to attempt to modify their queries to reduce the search results to a size that the user can assimilate by browsing through them (thus potentially eliminating relevant results).

In many cases end users have been forced to use an intermediary (i.e., a professional searcher) because the current collections of sources are both complex and extensive, and effective search strategies often vary significantly from one source to another. Even with such guidance, potentially relevant answers are missed because not all potentially relevant databases or information sources are searched on every query.

Much effort has been expended on refining and improving source selection by grouping sources or database files together. Significant effort has also been expended on query formulation through the use of knowledge bases and natural language processing. However, as the groupings of sources become larger, and the responses to more comprehensive search queries become more complete, the person seeking information is often faced with the daunting task of sifting through large unorganized answer sets in an attempt to find the most relevant documents or information.

3. SUMMARY OF THE INVENTION

The invention provides an information storage, searching and retrieval system for a large domain of archived data of various types, in which the results of a search are organized into discrete types of documents and groups of document types so that users may identify relevant information more efficiently and more conveniently than with systems currently in use. The system of the invention includes means for storing a
large domain of data contained in multiple source records, at least some of the source records being comprised of individual documents of multiple document types; means for searching substantially all of the domain with a single search query to identify documents responsive to the query; and means for categorizing documents responsive to the query based on document type, including means for generating a summary of the number of documents responsive to the query which fall within various predetermined categories of document types.

The query generation process may contain a knowledge base including a thesaurus that has predetermined and embedded complex search queries, or use natural language processing, or fuzzy logic, or tree structures, or hierarchical relationships, or a set of commands that allow persons seeking information to formulate their queries.

The search process can utilize any
index and search engine techniques, including Boolean, vector, and probabilistic, as long as a substantial portion of the entire domain of archived textual data is searched for each query and all documents found are returned to the organizing process.

The sorting/categorization process prepares the search results for presentation by assembling the various document types retrieved by the search engine and then arranging these basic document types into sometimes broader categories that are readily understood by and relevant to the user. The search results are then presented to the user and arranged
by category along with an indication as to the number of relevant documents found in each category. The user may then examine search results in multiple formats, allowing the user to view as much of the document as the user deems necessary.

4. BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating an information retrieval system of the invention;
FIG. 2 is a diagram illustrating a query formulation and search process utilized in the