Jianghan University
Graduation Thesis (Design): Foreign Literature Translation

Original source: The Hadoop Distributed File System: Architecture and Design
Chinese translation: Hadoop Distributed File System: Architecture and Design (Hadoop 分布式文件系统:架构和设计)
Name: XXXX
Student ID: 200708202137
Date: April 8, 2013

English Original

The Hadoop Distributed File System: Architecture and Design

Source: http://hadoop.apache.org/docs/r0.18.3/hdfs_design.html

Introduction

The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant. HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. HDFS provides high-throughput access to application data and is suitable for applications that have large data sets. HDFS relaxes a few POSIX requirements to enable streaming access to file system data. HDFS was originally built as infrastructure for the Apache
Nutch web search engine project. HDFS is part of the Apache Hadoop Core project. The project URL is http://hadoop.apache.org/core/.

Assumptions and Goals

Hardware Failure

Hardware failure is the norm rather than the exception. An HDFS instance may consist of hundreds or thousands of server machines, each storing part of the file system's data. The fact that there are a huge number of components and that each component has a non-trivial probability of failure means that some component of HDFS is always non-functional. Therefore, detection of faults and quick, automatic recovery from them is a core architectural goal of HDFS.

Streaming Data Access

Applications that run on HDFS need streaming access to their data sets. They are not general purpose applications that typically run on general purpose file systems. HDFS is designed more for batch processing than for interactive use by users. The emphasis is on high throughput of data access rather than low latency of data access. POSIX imposes many hard requirements that are not