数据挖掘和数据发布外文文献及翻译

资源ID：132683 资源大小：31.24KB 全文页数：14页
资源格式： DOCX 下载积分：100金币

快捷下载

账号登录下载

三方登录下载：

下载资源需要100金币

邮箱/手机：
温馨提示：	快捷下载时，用户名和密码都是您填写的邮箱或者手机号，方便查询和重复下载（系统自动生成）。如填写123，账号就是123，密码也是123。
支付方式：
验证码：	换一换

账号：
密码：
验证码：	换一换
当日自动登录忘记密码？

友情提示

1、下载资料失败解决办法

2、PDF文件下载后，可能会被浏览器默认打开，此种情况可以点击浏览器菜单，保存网页到桌面，就可以正常下载了。

3、本站不支持迅雷下载，请使用电脑自带的IE浏览器，或者360浏览器、谷歌浏览器下载即可。

4、本站资源下载后的文档和图纸-无水印,预览文档经过压缩，下载后原文更清晰。

数据挖掘和数据发布外文文献及翻译

1、What is Data Mining? Many people treat data mining as a synonym for another popularly used term, “Knowledge Discovery in Databases”, or KDD. Alternatively, others view data mining as simply an essential step in the process of knowledge discovery in databases. Knowledge discovery consists of an itera

2、tive sequence of the following steps: data cleaning: to remove noise or irrelevant data, data integration: where multiple data sources may be combined, data selection : where data relevant to the analysis task are retrieved from the database, data transformation : where data are transformed or conso

3、lidated into forms appropriate for mining by performing summary or aggregation operations, for instance, data mining: an essential process where intelligent methods are applied in order to extract data patterns, pattern evaluation: to identify the truly interesting patterns representing knowledge ba

4、sed on some interestingness measures, and knowledge presentation: where visualization and knowledge representation techniques are used to present the mined knowledge to the user . The data mining step may interact with the user or a knowledge base. The interesting patterns are presented to the user,

5、 and may be stored as new knowledge in the knowledge base. Note that according to this view, data mining is only one step in the entire process, albeit an essential one since it uncovers hidden patterns for evaluation. We agree that data mining is a knowledge discovery process. However, in industry,

6、 in media, and in the database research milieu, the term “data mining” is becoming more popular than the longer term of “knowledge discovery in databases”. Therefore, in this book, we choose to use the term “data mining”. We adopt a broad view of data mining functionality: data mining is the process

7、 of discovering interesting knowledge from large amounts of data stored either in databases, data warehouses, or other information repositories. Based on this view, the architecture of a typical data mining system may have the following major components: 1. Database, data warehouse, or other informa

8、tion repository. This is one or a set of databases, data warehouses, spread sheets, or other kinds of information repositories. Data cleaning and data integration techniques may be performed on the data. 2. Database or data warehouse server. The database or data warehouse server is responsible for f

9、etching the relevant data, based on the users data mining request. 3. Knowledge base. This is the domain knowledge that is used to guide the search, or evaluate the interestingness of resulting patterns. Such knowledge can include concept hierarchies, used to organize attributes or attribute values

10、into different levels of abstraction. Knowledge such as user beliefs, which can be used to assess a patterns interestingness based on its unexpectedness, may also be included. Other examples of domain knowledge are additional interestingness constraints or thresholds, and metadata (e.g., describing

11、data from multiple heterogeneous sources). 4. Data mining engine. This is essential to the data mining system and ideally consists of a set of functional modules for tasks such as characterization, association analysis, classification, evolution and deviation analysis. 5. Pattern evaluation module.

12、This component typically employs interestingness measures and interacts with the data mining modules so as to focus the search towards interesting patterns. It may access interestingness thresholds stored in the knowledge base. Alternatively, the pattern evaluation module may be integrated with the

13、mining module, depending on the implementation of the data mining method used. For efficient data mining, it is highly recommended to push the evaluation of pattern interestingness as deep as possible into the mining process so as to confine the search to only the interesting patterns. 6. Graphical

14、user interface. This module communicates between users and the data mining system, allowing the user to interact with the system by specifying a data mining query or task, providing information to help focus the search, and performing exploratory data mining based on the intermediate data mining res

15、ults. In addition, this component allows the user to browse database and data warehouse schemas or data structures, evaluate mined patterns, and visualize the patterns in different forms. From a data warehouse perspective, data mining can be viewed as an advanced stage of on-1ine analytical processi

16、ng (OLAP). However, data mining goes far beyond the narrow scope of summarization-style analytical processing of data warehouse systems by incorporating more advanced techniques for data understanding. While there may be many “data mining systems” on the market, not all of them can perform true data

17、 mining. A data analysis system that does not handle large amounts of data can at most be categorized as a machine learning system, a statistical data analysis tool, or an experimental system prototype. A system that can only perform data or information retrieval, including finding aggregate values,

18、 or that performs deductive query answering in large databases should be more appropriately categorized as either a database system, an information retrieval system, or a deductive database system. Data mining involves an integration of techniques from mult1ple disciplines such as database technolog

19、y, statistics, machine learning, high performance computing, pattern recognition, neural networks, data visualization, information retrieval, image and signal processing, and spatial data analysis. We adopt a database perspective in our presentation of data mining in this book. That is, emphasis is

20、placed on efficient and scalable data mining techniques for large databases. By performing data mining, interesting knowledge, regularities, or high-level information can be extracted from databases and viewed or browsed from different angles. The discovered knowledge can be applied to decision maki

21、ng, process control, information management, query processing, and so on. Therefore, data mining is considered as one of the most important frontiers in database systems and one of the most promising, new database applications in the information industry. A classification of data mining systems Data

22、 mining is an interdisciplinary field, the confluence of a set of disciplines, including database systems, statistics, machine learning, visualization, and information science. Moreover, depending on the data mining approach used, techniques from other disciplines may be applied, such as neural netw

23、orks, fuzzy and or rough set theory, knowledge representation, inductive logic programming, or high performance computing. Depending on the kinds of data to be mined or on the given data mining application, the data mining system may also integrate techniques from spatial data analysis, Information

24、retrieval, pattern recognition, image analysis, signal processing, computer graphics, Web technology, economics, or psychology. Because of the diversity of disciplines contributing to data mining, data mining research is expected to generate a large variety of data mining systems. Therefore, it is n

25、ecessary to provide a clear classification of data mining systems. Such a classification may help potential users distinguish data mining systems and identify those that best match their needs. Data mining systems can be categorized according to various criteria, as follows. 1) Classification according to the kinds of databases mined.

注意事项: 本文（数据挖掘和数据发布外文文献及翻译）为本站会员（泛舟）主动上传，毕设资料网仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若此文所含内容侵犯了您的版权或隐私，请联系网站客服QQ：540560583，我们立即给予删除！