The Apache Modules Book
by Nick Kew
“Do you learn best by example and experimentation? This book is ideal. Have your favorite editor and compiler ready–you’ll encounter example code you’ll want to try right away. You’ve picked the right book–this is sure to become the de facto standard guide to writing Apache modules.”
–Rich Bowen, coauthor, Apache Administrators Handbook, Apache Cookbook, and The Definitive Guide to Apache mod_rewrite
“A first-rate guide to getting the most out of Apache as a modular application platform–sure to become a must-read for any Apache programmer, from beginner to experienced professional. It builds up carefully and meticulously from the absolute basics, while including chapters on everything from the popular Apache DBD Framework to best practices, security, and debugging.”
–Noirin Plunkett, documentation committer to the Apache httpd project, and member of the ASF conference committee
The Only Comprehensive Guide to Developing Apache 2.x Modules and Applications
Apache is more than the world’s most popular Web server–it’s also an extraordinarily powerful and extensible development platform. Now, ApacheTutor.org’s Nick Kew has written The Apache Modules Book, the first start-to-finish, example-rich guide for every developer who wants to make the most of Apache.
Kew begins with detailed, accessible introductions to Apache’s architecture and API, then illuminates all the techniques you’ll need, from request processing through code security. He brings together the best of both worlds: powerful C-based techniques for accomplishing tasks Perl or PHP can’t handle, implemented with tools that deliver all the productivity you’d expect from higher-level languages. Utilizing realistic code samples, Kew introduces techniques documented in no other book, and often nowhere else at all.
- Using Apache Portable Runtime (APR) to streamline C development and avoid its pitfalls
- Leveraging Apache DBD to build applications far more scalable than classic LAMP software
- Working with the latest Apache 2.x features: filter modules, XML support, and smart proxies
- Mastering best practices, from thread safety to multi-platform development
- Utilizing the Apache Authentication Framework
- Tracing and debugging problems in both Apache and your custom modules
About the Author
Chapter 1 Applications Development with Apache
Chapter 2 The Apache Platform and Architecture
Chapter 3 The Apache Portable Runtime
Chapter 4 Programming Techniques and Caveats
Chapter 5 Writing a Content Generator
Chapter 6 Request Processing Cycle and Metadata Handlers
Chapter 7 AAA: Access, Authentication, and Authorization
Chapter 8 Filter Modules
Chapter 9 Configuration for Modules
Chapter 10 Extending the API
Chapter 11 The Apache Database Framework
Chapter 12 Module Debugging
Appendix A Apache License
Appendix B Contributor License Agreements
Appendix C Hypertext Transfer Protocol: HTTP/1.1
About the Web Site
ApacheTutor.org contains code examples from the book, all designed for easy use and integration into existing applications.
Hadoop: The Definitive Guide
by Tom White
Ready to unlock the power of your data? With this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.
You’ll find illuminating case studies that demonstrate how Hadoop is used to solve specific problems. This third edition covers recent changes to Hadoop, including material on the new MapReduce API, as well as MapReduce 2 and its more flexible execution model (YARN).
- Store large datasets with the Hadoop Distributed File System (HDFS)
- Run distributed computations with MapReduce
- Use Hadoop’s data and I/O building blocks for compression, data integrity, serialization (including Avro), and persistence
- Discover common pitfalls and advanced features for writing real-world MapReduce programs
- Design, build, and administer a dedicated Hadoop cluster—or run Hadoop in the cloud
- Load data from relational databases into HDFS, using Sqoop
- Perform large-scale data processing with the Pig query language
- Analyze datasets with Hive, Hadoop’s data warehousing system
- Take advantage of HBase for structured and semi-structured data, and ZooKeeper for building distributed systems
Apache Hadoop YARN
by Arun Murthy, Vinod Vavilapalli, Douglas Eadline, Joseph Niemiec, Jeff Markham
“This book is a critically needed resource for the newly released Apache Hadoop 2.0, highlighting YARN as the significant breakthrough that broadens Hadoop beyond the MapReduce paradigm.”
—From the Foreword by Raymie Stata, CEO of Altiscale
The Insider’s Guide to Building Distributed, Big Data Applications with Apache Hadoop™ YARN
Apache Hadoop is helping drive the Big Data revolution. Now, its data processing has been completely overhauled: Apache Hadoop YARN provides resource management at data center scale and easier ways to create distributed applications that process petabytes of data. And now in Apache Hadoop™ YARN, two Hadoop technical leaders show you how to develop new applications and adapt existing code to fully leverage these revolutionary advances.
YARN project founder Arun Murthy and project lead Vinod Kumar Vavilapalli demonstrate how YARN increases scalability and cluster utilization, enables new programming models and services, and opens new options beyond Java and batch processing. They walk you through the entire YARN project lifecycle, from installation through deployment.
You’ll find many examples drawn from the authors’ cutting-edge experience—first as Hadoop’s earliest developers and implementers at Yahoo! and now as Hortonworks developers moving the platform forward and helping customers succeed with it.
- YARN’s goals, design, architecture, and components—how it expands the Apache Hadoop ecosystem
- Exploring YARN on a single node
- Administering YARN clusters and Capacity Scheduler
- Running existing MapReduce applications
- Developing a large-scale clustered YARN application
- Discovering new open source frameworks that run under YARN
Kafka: The Definitive Guide
by Neha Narkhede, Gwen Shapira, Todd Palino
Every enterprise application creates data, whether it’s log messages, metrics, user activity, outgoing messages, or something else. And how to move all of this data becomes nearly as important as the data itself. If you’re an application architect, developer, or production engineer new to Apache Kafka, this practical guide shows you how to use this open source streaming platform to handle real-time data feeds.
Engineers from Confluent and LinkedIn who are responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream-processing applications with this platform. Through detailed examples, you’ll learn Kafka’s design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer.
- Understand publish-subscribe messaging and how it fits in the big data ecosystem
- Explore Kafka producers and consumers for writing and reading messages
- Understand Kafka patterns and use-case requirements to ensure reliable data delivery
- Get best practices for building data pipelines and applications with Kafka
- Manage Kafka in production, and learn to perform monitoring, tuning, and maintenance tasks
- Learn the most critical metrics among Kafka’s operational measurements
- Explore how Kafka’s stream delivery capabilities make it a perfect source for stream processing systems
Hadoop For Dummies
by Dirk deRoos, Paul Zikopoulos, Roman B. Melnyk
Let Hadoop For Dummies help harness the power of your data and rein in the information overload
Big data has become big business, and companies and organizations of all sizes are struggling to find ways to retrieve valuable information from their massive data sets without becoming overwhelmed. Enter Hadoop and this easy-to-understand For Dummies guide. Hadoop For Dummies helps readers understand the value of big data, make a business case for using Hadoop, navigate the Hadoop ecosystem, and build and manage Hadoop applications and clusters.
- Explains the origins of Hadoop, its economic benefits, and its functionality and practical applications
- Helps you find your way around the Hadoop ecosystem, program MapReduce, utilize design patterns, and get your Hadoop cluster up and running quickly and easily
- Details how to use Hadoop applications for data mining, web analytics and personalization, large-scale text processing, data science, and problem-solving
- Shows you how to improve the value of your Hadoop cluster, maximize your investment in Hadoop, and avoid common pitfalls when building your Hadoop cluster
From programmers challenged with building and maintaining affordable, scalable data systems to administrators who must deal with huge volumes of information effectively and efficiently, this how-to has something to help you with Hadoop.