Hadoop Application Architectures

With Early Release ebooks, you get books in their earliest form — the author's raw and unedited content as he or she writes — so you can take advantage of these technologies long before the official release of these titles. You'll also receive updates when significant changes are made, new chapters as they're written, and the final ebook bundle.

Get expert guidance on architecting end-to-end data management solutions with Apache Hadoop. While many sources explain how to use various components in the Hadoop ecosystem, this practical book takes you through architectural considerations necessary to tie those components together into a complete tailored application, based on your particular use case.

To reinforce those lessons, the book’s second section provides detailed examples of architecture used in some of the most commonly found Hadoop applications. Whether you’re designing and implementing a new Hadoop application, or planning to integrate Hadoop into your existing data infrastructure, Hadoop Application Architectures will skillfully guide you through the process.

The Early Release edition begins with chapters that concentrate on design considerations for Data Modeling and Data Movement in Hadoop:

Explore whether your application should store data on Hadoop Distributed File System (HDFS) or HBase

Get best practices for designing an HDFS or HBase schema

Learn how to design schemas for SQL-on-Hadoop (e.g. Hive, Impala, HCatalog) tables

Mark Grover

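Member of the Apache Sentry project management committee and co-author of Programming Hive. He has worked on Apache Hadoop, Apache Hive, Apache Sqoop, and Apache Flume, and has contributed code to the Apache Bigtop and Apache Sentry (incubating) projects.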

Ted Malaska

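Senior Solutions Architect at Cloudera, where he helps customers get the most out of Hadoop and its ecosystem. He was previously lead architect at the Financial Industry Regulatory Authority (FINRA), where he guided the construction of a large number of solutions, including web applications, service-oriented architectures, and big data applications. He has contributed code to Apache Flume, Apache Avro, YARN, and Apache Pig.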

Jonathan Seidman

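Solutions Architect at Cloudera, where he helps partners integrate their solutions into Cloudera's software stack. He is a co-founder of the Chicago Hadoop User Group and Chicago Big Data, and was technical editor of Hadoop in Practice. He was previously technical lead of the big data team at Orbitz Worldwide, where he managed Hadoop clusters holding very large volumes of data for one of the busiest websites. He has also spoken many times at Hadoop and big data conferences.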

Gwen Shapira

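Solutions Architect at Cloudera and a well-known blogger, with 15 years of industry experience helping customers design highly scalable data architectures. She was previously a senior consultant at Pythian, an Oracle ACE Director, and a member of the NoCOUG board, and is active at many industry conferences.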