Marz, Nathan

Big data : principles and best practices of scalable real-time data systems / Nathan Marz and James Warren - New York : Manning Publications, c2015 - xx, 308 pages : illustrations ; 23 cm.

1. A new paradigm for big data -- Part 1 : Batch layer -- 2. Data model for big data -- 3. Data model for big data : illustration -- 4. Data storage on the batch layer -- 5. Data storage on the batch layer : illustration -- 6. Batch layer -- 7. Batch layer : illustration -- 8. An example batch layer : architecture and algorithms -- 9. An example batch layer : implementation -- Part 2 : Serving layer -- 10. Serving layer -- 11. Serving layer : illustration -- Part 3 : Speed layer -- 12. Realtime views -- 13. Realtime views : illustration -- 14. Queuing and stream processing -- 15. Queuing and stream processing : illustration -- 16. Micro-batch stream processing -- 17. Micro-batch stream processing : illustration -- 18. Lambda architecture in depth.

Web-scale applications like social networks, real-time analytics, or e-commerce sites deal with a lot of data, whose volume and velocity exceed the limits of traditional database systems. These applications require architectures built around clusters of machines to store and process data of any size or speed. Fortunately, scale and simplicity are not mutually exclusive. Big Data teaches you to build big data systems using an architecture designed specifically to capture and analyze web-scale data. This book presents the Lambda Architecture, a scalable, easy-to-understand approach that can be built and run by a small team. You'll explore the theory of big data systems and how to implement them in practice. In addition to discovering a general framework for processing big data, you'll learn specific technologies like Hadoop, Storm, and NoSQL databases. This book requires no previous exposure to large-scale data analysis or NoSQL tools. Familiarity with traditional databases is helpful.

9781617290343


BIG DATA

QA 76.9 .M37 2015