Big Data Processing with Apache Spark (Srini Penchikala)

 

Apache Spark is an open-source big-data processing platform built for speed, ease of use, and sophisticated analytics.

When compared to other big-data and MapReduce systems like Hadoop and Storm, Spark has a number of advantages. It offers a thorough, unified framework for managing big-data processing needs for datasets with a variety of characteristics (text data, graph data, etc.) and from different sources (batch versus real-time streaming data).

Spark lets applications on Hadoop clusters run up to 100 times faster in memory and up to 10 times faster even when running on disk.

The reader of this mini-book will become familiar with the Apache Spark framework and create Spark programs for use in big-data research. The Spark ecosystem, which consists of Spark Core, Spark SQL, Spark Streaming, Spark MLlib, and Spark GraphX, is covered in detail in this book.

Ebook Details

About the Authors
Srini Penchikala works as a software architect at a financial services company in Austin, Texas. He has more than 20 years of experience in software architecture, design, and development.
Publisher
Published
Published Date / Year: 2018
eBook Format: PDF (104 pages)
Language: English

Similar Programming & Computer Books

Strategic Foundations of General Equilibrium: Dynamic Matching and Bargaining Games (Douglas Gale)
Since Adam Smith's day, the theory of competition has played a significant role in economic study. This book, written by one of the most eminent modern economic theorists, details...
The Pure Logic Of Choice (Richard D. Fuerle)
This free book presents a broad theory of economics based on free will. The assumption that humans have free will and the ability to alter physical...
Portfolio Theory and Financial Analyses (Robert Alan Hill)
Whether they involve calculating the return on a portfolio, analyzing portfolio risk, or assessing the effectiveness of the portfolio management process, this free book links each of the...
Price Theory: An Intermediate Text (David D. Friedman)
In order to help the reader grasp the economic way of thinking, the author first gives verbal, intuitive explanations of the topics before using graphs and/or calculus to illustrate...
Mathematical Models in Portfolio Analysis (Farida Kachapova)
This free book presents the mathematical theory of portfolio modeling in financial mathematics as a coherent whole, with justifications for each step. ...
Data Mining in Medical and Biological Research (Eugenia G. Giannopoulou)
The goal of this free programming book is to compile the most recent developments and uses of data mining research from around the globe in the exciting fields of...
The Biostar Handbook (Dr. Istvan Albert)
The scientific field of bioinformatics, which combines biology, computer science, and statistical data analytics, is explained to readers in this useful book. Bioinformatics is concerned with the digital processing...
Planning for Big Data: A CIO's Handbook to the Changing Data Landscape (Edd Dumbill)
This free programming book offers a useful, approachable "brief" on the state of Big Data analytics today and how you may profitably use this technology to boost your company's...
Big Data Now: Current Perspectives from O'Reilly Radar (O'Reilly Radar Team)
This free programming book summarizes the report's findings on trends, techniques, applications, and predictions.  
Designing Event-Driven Systems: Concepts and Patterns for Streaming Services with Apache Kafka (Ben Stopford)
In Concepts and Patterns for Streaming Services with Apache Kafka, the author discusses how you may create mission-critical systems using service-based architectures and stream processing tools like Apache Kafka....

Other Programming Books by InfoQ Inc.

The Angular Mini-Book 2.0 (Matt Raible)
This version (v2.0) makes use of Spring Boot 2.6 and Angular 13. Web or Java developers that want a quick introduction to Angular, Bootstrap, and Spring Boot...
High-Performance Teams: The Foundations (Richard Kasperowski)
This book is a road map for everyone who wants to lead or actively take part in the best team experience of their lives. It builds on The Core...
Dynamic Proxies in Java (Heinz M. Kabutz)
Even now, learning Java is still quite simple if we concentrate on the most important tools. Object orientation, flow control, collections, and Java 8 streams should all be introduced...
Practical Guide to Building an API Back End with Spring Boot (Wim Deblauwe)
In this fast-paced environment, it's critical to be able to prototype quickly while also making sure that no work is being done in vain.
Distributed Agile (John Okoro, et al)
In a product portfolio, various teams may be collaborating on a single product or working on various products separately. The necessity for strong governance is greater the more dispersed...
Conversation Patterns for Software Professionals (Michal Bartyzel)
The relationship between the business and IT is dominated by two false stereotypes: the business believes that IT lacks a business mindset, and IT believes that the business...
Pairing Apache Shiro and Java EE 7 (Nebrass Lamouchi)
Authentication and authorization are crucial security components when protecting systems. Although the two terms have different meanings, they are occasionally used synonymously due to their respective...
Why Agile Works: The Values Behind the Results (Michael de la Maza)
Why do some businesses benefit greatly from being flexible while others barely change? The difference is in the knowledge that agile is a framework for profound cultural transformation rather...
The Cynefin Mini-Book (Greg Brougham)
We can all agree that the world is complex by nature, but what does this actually mean? According to theory, this denotes an open system in which agents and...
The Java Garbage Collection Mini-Book (Charles Humble)
For Java architects and senior engineers who wish to comprehend what garbage collection is, how it functions, and how it affects the execution of their applications, the Java Garbage...
Next Generation HTML5 and JavaScript (David Pitt)
JavaScript is disorganized. A new framework or significant library seems to appear every few weeks due to the faster-than-ever rate of change. As ECMAScript 6 is finished this year,...
Scrum and XP from the Trenches, 2nd Edition (Henrik Kniberg)
This second edition is an annotated version, a "director's cut," in which Henrik discusses the material and offers fresh perspectives gleaned since the book's initial publication. ...
Leading Self-Organising Teams (Siegfried Kaltenecker)
Self-organizing teams: what are they? What makes us require them? How can we exercise effective leadership in a self-organizing setting?
Do Better Scrum - An Unofficial Set of Tips and Insights into How to Implement Scrum Well (Peter Hundermark)
Jim York, a certified Scrum trainer and coach, asserts: "Scrum is Simple." Scrum is challenging to implement. Many people I see in businesses say they struggle to understand how...
Confessions of a Scrum Master (Paul VII)
It has been a mix of my experiences and applying continuous improvement that has, by far, provided me with the most learning in my years as a software engineer,...
Dependency-Oriented Thinking: Volume 2 - Governance and Management (Ganesh Prasad)
Service-Oriented Architecture (SOA) is a rather underwhelming buzzword from the last ten years in technology. It is related to pricey, complex technology that may not deliver the ROI that...
Dependency-Oriented Thinking: Volume 1 - Analysis and Design (Ganesh Prasad)
Service-Oriented Architecture (SOA) is a rather underwhelming buzzword from the last ten years in technology.
Modern Web Essentials Using JavaScript and HTML5 (David Pitt)
A corporate pain point - how to reach people on numerous platforms without degrading user experience - is resolved by creating single-page applications with JavaScript and HTML5 technologies. ...
Agile with Guts - A Pragmatic Guide to Value-Driven Development (Nicolas Gouy)
"Valuable software" is the subject of the Agile Manifesto's first tenet. Value is the perceived advantage we receive from something, and it is arbitrary.
Getting Value out of Agile Retrospectives - A Toolbox of Retrospective Exercises (Luis Gonçalves, et al)
The exercises in this pocketbook are accompanied by information on the "what" and "why" of retrospectives, the value and advantages they can have for your business, and suggestions for...
