Foreign-language source: Information Management System

Wiliam K. Thomson, U.S.A.

Abstract: An information storage, searching and retrieval system for large (gigabytes) domains of archived textual data. The system includes multiple query generation processes, a search process, and a presentation of search results that is sorted by category or type and that may be customized based on the professional discipline (or analogous personal characteristic) of the user, thereby reducing the amount of time and cost required to retrieve relevant results.

Keywords: information management; retrieval system; object-oriented

1. Introduction

This invention relates to an information storage, searching and retrieval system that incorporates a novel organization for presentation of search results from large (gigabytes) domains of archived textual data.

2. Background of the Invention

Online information retrieval systems are utilized for searching and retrieving many kinds of information. Most systems used today work in essentially the same manner; that is, users log on (through a computer terminal or personal microcomputer, and typically from a remote location), select a source of information (i.e., a particular database) which is usually something less than the complete domain, formulate a query, launch the search, and then review the search results displayed on the terminal or microcomputer, typically with documents (or summaries of documents) displayed in reverse chronological order. This process must be repeated each time another source (database) or group of sources is selected (which is frequently necessary in order to ensure all relevant documents have been found).

Additionally, this process places on the user the burden of organizing and assimilating the multiple results generated from the launch of the same query in each of the multiple sources (databases) that the user needs (or wants) to search. Present systems that allow searching of large domains require persons seeking information in these domains to attempt to modify their queries to reduce the search results to a size that the user can assimilate by browsing through them (thus potentially eliminating relevant results).

In many cases end users have been forced to use an intermediary (i.e., a professional searcher) because the current collections of sources are both complex and extensive, and effective search strategies often vary significantly from one source to another. Even with such guidance, potentially relevant answers are missed because not all potentially relevant databases or information sources are searched on every query. Much effort has been expended on refining and improving source selection by grouping sources or database files together. Significant effort has also been expended on query formulation through the use of knowledge bases and natural language processing. However, as the groupings of sources become larger, and the responses to more comprehensive search queries become more complete, the person seeking information is often faced with the daunting task of sifting through large unorganized answer sets in an attempt to find the most relevant documents or information.

3. Summary of the Invention

The invention provides an information storage, searching and retrieval system for a large domain of archived data of various types, in which the results of a search are organized into discrete types of documents and groups of document types so that users may identify relevant information more efficiently and more conveniently than with systems currently in use. The system of the invention includes means for storing a large domain of data contained in multiple source records, at least some of the source records being comprised of individual documents of multiple document types; means for searching substantially all of the domain with a single search query to identify documents responsive to the query; and means for categorizing documents responsive to the query based on document type, including means for generating a summary of the number of documents responsive to the query which fall within various predetermined categories of document types.

The query generation process may contain a knowledge base including a thesaurus that has predetermined and embedded complex search queries, or use natural language processing, or fuzzy logic, or tree structures, or hierarchical relationships, or a set of commands that allow persons seeking information to formulate their queries.

The search process can utilize any index and search engine techniques, including Boolean, vector, and probabilistic, as long as a substantial portion of the entire domain of archived textual data is searched for each query and all documents found are returned to the organizing process.

The sorting/categorization process prepares the search results for presentation by assembling the various document types retrieved by the search engine and then arranging these basic document types into sometimes broader categories that are readily understood by and relevant to the user.

The search results are then presented to the user, arranged by category, along with an indication of the number of relevant documents found in each category. The user may then examine search results in multiple formats, allowing the user to view as much of the document as the user deems necessary.

4. Brief Description of the Drawings

Fig. 1 is a block diagram illustrating an information retrieval system of the invention;
Fig. 2 is a diagram illustrating a query formulation and search process utilized in the invention;
Fig. 3 is a diagram illustrating a sorting process for organizing and presenting search results.

5. Best Mode for Carrying Out the Invention

As is illustrated in the block diagram of Fig. 1, the information retrieval system of the invention includes an input/output process, a query generation process, a search process that involves a large domain of textual data (typically in the multiple gigabyte range), an organizing process, presentation of the information to the user, and a process to identify and characterize the types of documents contained in the large domain of data.

Turning now to Fig. 2, the query generation process preferably includes a knowledge base containing a thesaurus and a note pad, and preferably utilizes embedded predefined complex Boolean strategies. Such a system allows users to enter their description of the information needed using simple words/phrases made up of "natural" language and to rely on the system to assist in generating the full search query, which would include, e.g., synonyms and alternate phraseology. The user can then request, by a command such as "vi co 1", to view the complete document selected from the list, giving, in this case, complete information about the identity and credentials of the expert.

Fig. 3 illustrates how five typical sources of information (i.e., source records) can be sorted into many document types and then subsequently into categories. For example, a typical trade magazine may contain several types of information such as editorials, regular columns, feature articles, news, product announcements, and a calendar of events. Thus, the trade magazine (i.e., the source record) may be sorted into these various document types, and these document types in turn may be categorized or grouped into categories contained in one or more sets of categories; each document type typically will be sorted into one category within a set of categories, but the individual categories within each set will vary from one set to another. For example, one set of categories may be established for a first characteristic type of user, and a different set of categories may be established for a second characteristic type of user. When a user corresponding to type #1 executes a search, the system automatically utilizes the categories of set #1, corresponding to that particular type of user, in organizing the results of the search for review by the user. When a user from type #2 executes a search, however, the system automatically utilizes the categories of set #2 in presenting the search results to the user.

The information storage, searching and retrieval system of the invention resolves the common difficulties in typical online information retrieval systems that operate on large (e.g., 2 gigabytes or more) domains of textual data: query generation, source selection, and organizing search results. The information base with the thesaurus and embedded search strategies allows users to generate expert search queries in their own "natural" language. Source (i.e., database) selection is not an issue because the search engines are capable of searching substantially the entire domain on every query. Moreover, the unique presentation of search results by category set substantially reduces the time and cost of performing repetitive searches in multiple databases and therefore of efficiently retrieving relevant search results.

While a preferred embodiment of the present invention has been described, it should be understood that various changes, adaptations and modifications may be made therein without departing from the spirit of the invention and the scope of the appended claims.

Chinese translation: Information Management System

Wiliam K. Thomson, U.S.A.

Abstract: An information storage, searching and retrieval system intended mainly for large (gigabyte-scale) domains of archived text. The system includes multiple query generation processes and a search process. Search results are generally sorted by category and type, and their presentation can be tailored to the individual user (or to a similar personal characteristic of the user), thereby reducing the time and cost needed to obtain relevant results.

Keywords: information management; retrieval system; object-oriented

1. Introduction

The information storage, searching and retrieval system is intended mainly for large collections of archived textual data, where search conditions and indexed fields allow results to be found quickly.

2. Background

Online query systems are used mainly to search for and retrieve all kinds of online information. Most systems in use today work in essentially the same way. The user logs on (through a computer terminal or personal microcomputer, or from a remote location), selects an information source (for example, a particular database), which usually covers less than the complete domain, formulates a query, and launches the search; the results are then displayed on the terminal or microcomputer, generally in chronological order. This process has to be repeated for every source or group of sources, which is necessary to make sure that all relevant documents are found. The process also burdens the user, who must organize and assimilate the multiple results that the same query returns from the different sources. Present systems that can search large domains require people seeking information to modify their queries in order to reduce the search results to a manageable size (potentially eliminating relevant results). In many cases users are forced to rely on an intermediary (for example, a professional searcher), because the current collections of sources are complex and extensive and effective search strategies often vary from one source to another. Even then, relevant answers may be missed, because not every potentially relevant database or information source is searched on every query. Much effort has therefore gone into improving source selection, and more still into query formulation. However, as the groupings of sources become larger and more comprehensive results are required, the problem becomes more pronounced: people seeking information often face large unorganized result sets and the heavy task of sifting through them.

3. Summary of the System

The system is intended for storing, searching and retrieving information in a large body of data. Search results are organized by document type, so that users can find the data they want more conveniently and easily than with current systems. The system includes means for storing a large domain of data contained in multiple source records, at least some of which comprise documents of multiple document types; means for searching the large domain with a single query to identify the responsive documents; and means for categorizing the responsive documents, including a count of the documents that fall within various predetermined categories of document types.

The query generation process contains a knowledge base that includes a thesaurus with predetermined, embedded complex queries, or natural language processing, or fuzzy logic, or tree structures, or hierarchical relationships, or a set of commands for formulating queries.

The search process may use any index and search engine technique, including Boolean, vector, and probabilistic retrieval, as long as a substantial portion of the archived textual data is searched for each query and all documents found are returned to the organizing process.

The sorting or categorization process prepares the results retrieved by the search engine for presentation by assembling the basic document types and arranging them into categories that are easy to understand and relevant to the user. The results are then presented to the user together with a count of the relevant documents in each category. The user can examine the results in multiple formats and view as much of each document as needed.

4. Brief Description of the Drawings

Fig. 1 is the overall flow diagram of the information query system;
Fig. 2 is a diagram of the system's query formulation and search process;

5. Best Mode of the System

As illustrated in Fig. 1, the information retrieval system includes an input/output process, a query generation process, a search process over a large domain of data (typically in the multiple-gigabyte range), a process that organizes information for the user, and a process that identifies and characterizes the types of documents in the large domain of data.

As shown in Fig. 2, the query generation process includes a knowledge base containing a thesaurus and a note pad, and uses embedded predefined complex strategies. The system thus allows users to describe the information they need in simple words or phrases of "natural" language and to rely on the system to help generate the full query, including synonyms and alternative phrasing. The user can then issue a command, for example "vi co 1", to view the complete document selected from the list, which in this case gives complete information about the identity and credentials of the expert.

Fig. 3 illustrates how five typical sources of information (i.e., source records) can be sorted into many document types and then into categories. For example, a typical trade magazine may contain several types of information, such as editorials, regular columns, feature articles, news, product announcements, and a calendar of events. The trade magazine (i.e., the source record) may therefore be sorted into these document types, and the document types may in turn be grouped into categories contained in one or more sets of categories. Each document type is typically sorted into one category within a set, but the categories vary from one set to another. For example, one set of categories may be established for a first type of user and a different set for a second type of user. When a user of type #1 runs a query, the system automatically uses the categories of set #1, corresponding to that type of user, to organize the results for review. When a user of type #2 runs a query, the system automatically uses the categories of set #2 to present the results.

The information storage, searching and retrieval system resolves the basic difficulties of online information retrieval systems that operate on large domains of textual data (i.e., two billion bytes or more): query generation, source selection, and organizing search results. The information base with the thesaurus and embedded search strategies lets users run expert queries in "natural" language. Selecting a source (such as a database) is no longer an issue, because the search engine can search the entire domain on every query. The distinctive presentation of results by category set greatly reduces the time and cost of repeated searches in multiple databases and makes retrieval of relevant results efficient.

Although a preferred embodiment of the system has been described, various changes, adaptations and modifications may be made without departing from the spirit of the system and the scope of the appended claims.
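As an illustration only, the sorting process of Fig. 3 (document types grouped into a category set chosen by user type, with a count of responsive documents per category) might be sketched as follows. The sketch assumes each retrieved document already carries a document-type label. The user types "engineer" and "marketer", the category names, and the names Document, CATEGORY_SETS, categorize and summarize are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    title: str
    source: str    # source record, e.g. a trade magazine
    doc_type: str  # document type within the source record


# Hypothetical category sets: one mapping of document type -> category
# per characteristic type of user. All names here are invented.
CATEGORY_SETS = {
    "engineer": {
        "feature article": "Technical articles",
        "product announcement": "Products",
        "news": "Industry news",
        "editorial": "Opinion",
        "regular column": "Opinion",
        "calendar of events": "Events",
    },
    "marketer": {
        "feature article": "Background reading",
        "product announcement": "Competitor products",
        "news": "Market news",
        "editorial": "Market news",
        "regular column": "Background reading",
        "calendar of events": "Trade shows",
    },
}


def categorize(results, user_type):
    """Group search results using the category set for this user type."""
    mapping = CATEGORY_SETS[user_type]
    grouped = {}
    for doc in results:
        # Document types missing from the set fall into a catch-all category.
        category = mapping.get(doc.doc_type, "Other")
        grouped.setdefault(category, []).append(doc)
    return grouped


def summarize(grouped):
    """Return the number of responsive documents in each category."""
    return {category: len(docs) for category, docs in grouped.items()}


if __name__ == "__main__":
    results = [
        Document("New valve line released", "Trade Magazine A", "product announcement"),
        Document("Why standards matter", "Trade Magazine A", "editorial"),
        Document("Pump design trends", "Trade Magazine B", "feature article"),
        Document("Monthly view", "Trade Magazine B", "regular column"),
    ]
    for user_type in CATEGORY_SETS:
        print(user_type, summarize(categorize(results, user_type)))
```

The same four results are summarized differently for each user type, which is the point of keeping the category sets separate from the document-type labels: the search runs once over the whole domain, and only the presentation step depends on who is asking.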