LoveReading

Becoming a member of the LoveReading community is free.

No catches, no fine print just unadulterated book loving, with your favourite books saved to your own digital bookshelf.

New members get entered into our monthly draw to win £100 to spend in your local bookshop Plus lots lots more…

Find out more

Sherif Sakr - Author

About the Author

Books by Sherif Sakr

Big Data 2.0 Processing Systems

Big Data 2.0 Processing Systems

Author: Sherif Sakr Format: Hardback Release Date: 10/07/2020

This book provides readers the big picture and a comprehensive survey of the domain of big data processing systems. For the past decade, the Hadoop framework has dominated the world of big data processing, yet recently academia and industry have started to recognize its limitations in several application domains and thus, it is now gradually being replaced by a collection of engines that are dedicated to specific verticals (e.g. structured data, graph data, and streaming data). The book explores this new wave of systems, which it refers to as Big Data 2.0 processing systems. After Chapter 1 presents the general background of the big data phenomena, Chapter 2 provides an overview of various general-purpose big data processing systems that allow their users to develop various big data processing jobs for different application domains. In turn, Chapter 3 examines various systems that have been introduced to support the SQL flavor on top of the Hadoop infrastructure and provide competing and scalable performance in the processing of large-scale structured data. Chapter 4 discusses several systems that have been designed to tackle the problem of large-scale graph processing, while the main focus of Chapter 5 is on several systems that have been designed to provide scalable solutions for processing big data streams, and on other sets of systems that have been introduced to support the development of data pipelines between various types of big data processing jobs and systems. Next, Chapter 6 focuses on covering the emerging frameworks and systems in the domain of scalable machine learning and deep learning processing. Lastly, Chapter 7 shares conclusions and an outlook on future research challenges. This new and considerably enlarged second edition not only contains the completely new chapter 6, but also offers a refreshed content for the state-of-the-art in all domains of big data processing over the last years. Overall, the book offers a valuable reference guide for professional, students, and researchers in the domain of big data processing systems. Further, its comprehensive content will hopefully encourage readers to pursue further research on the subject.

Encyclopedia of Big Data Technologies

Encyclopedia of Big Data Technologies

Author: Sherif Sakr Format: Hardback Release Date: 01/03/2019

The Encyclopedia of Big Data Technologies provides researchers, educators, students and industry professionals with a comprehensive authority over the most relevant Big Data Technology concepts. With over 300 articles written by worldwide subject matter experts from both industry and academia, the encyclopedia covers topics such as big data storage systems, NoSQL database, cloud computing, distributed systems, data processing, data management, machine learning and social technologies, data science. Each peer-reviewed, highly structured entry provides the reader with basic terminology, subject overviews, key research results, application examples, future directions, cross references and a bibliography. The entries are expository and tutorial, making this reference a practical resource for students, academics, or professionals. In addition, the distinguished, international editorial board of the encyclopedia consists of well-respected scholars, each developing topics based upon their expertise.

Linked Data

Linked Data

Author: Sherif Sakr, Marcin Wylot, Raghava Mutharaju, Danh Le Phuoc Format: Paperback / softback Release Date: 11/01/2019

This book describes efficient and effective techniques for harnessing the power of Linked Data by tackling the various aspects of managing its growing volume: storing, querying, reasoning, provenance management and benchmarking. To this end, Chapter 1 introduces the main concepts of the Semantic Web and Linked Data and provides a roadmap for the book. Next, Chapter 2 briefly presents the basic concepts underpinning Linked Data technologies that are discussed in the book. Chapter 3 then offers an overview of various techniques and systems for centrally querying RDF datasets, and Chapter 4 outlines various techniques and systems for efficiently querying large RDF datasets in distributed environments. Subsequently, Chapter 5 explores how streaming requirements are addressed in current, state-of-the-art RDF stream data processing. Chapter 6 covers performance and scaling issues of distributed RDF reasoning systems, while Chapter 7 details benchmarks for RDF query engines and instance matching systems. Chapter 8 addresses the provenance management for Linked Data and presents the different provenance models developed. Lastly, Chapter 9 offers a brief summary, highlighting and providing insights into some of the open challenges and research directions. Providing an updated overview of methods, technologies and systems related to Linked Data this book is mainly intended for students and researchers who are interested in the Linked Data domain. It enables students to gain an understanding of the foundations and underpinning technologies and standards for Linked Data, while researchers benefit from the in-depth coverage of the emerging and ongoing advances in Linked Data storing, querying, reasoning, and provenance management systems. Further, it serves as a starting point to tackle the next research challenges in the domain of Linked Data management.

Large-Scale Graph Processing Using Apache Giraph

Large-Scale Graph Processing Using Apache Giraph

Author: Sherif Sakr, Faisal Moeen Orakzai, Ibrahim Abdelaziz, Zuhair Khayyat Format: Paperback / softback Release Date: 07/07/2018

This book takes its reader on a journey through Apache Giraph, a popular distributed graph processing platform designed to bring the power of big data processing to graph data. Designed as a step-by-step self-study guide for everyone interested in large-scale graph processing, it describes the fundamental abstractions of the system, its programming models and various techniques for using the system to process graph data at scale, including the implementation of several popular and advanced graph analytics algorithms. The book is organized as follows: Chapter 1 starts by providing a general background of the big data phenomenon and a general introduction to the Apache Giraph system, its abstraction, programming model and design architecture. Next, chapter 2 focuses on Giraph as a platform and how to use it. Based on a sample job, even more advanced topics like monitoring the Giraph application lifecycle and different methods for monitoring Giraph jobs are explained. Chapter 3 then provides an introduction to Giraph programming, introduces the basic Giraph graph model and explains how to write Giraph programs. In turn, Chapter 4 discusses in detail the implementation of some popular graph algorithms including PageRank, connected components, shortest paths and triangle closing. Chapter 5 focuses on advanced Giraph programming, discussing common Giraph algorithmic optimizations, tunable Giraph configurations that determine the system's utilization of the underlying resources, and how to write a custom graph input and output format. Lastly, chapter 6 highlights two systems that have been introduced to tackle the challenge of large scale graph processing, GraphX and GraphLab, and explains the main commonalities and differences between these systems and Apache Giraph. This book serves as an essential reference guide for students, researchers and practitioners in the domain of large scale graph processing. It offers step-by-step guidance, with several code examples and the complete source code available in the related github repository. Students will find a comprehensive introduction to and hands-on practice with tackling large scale graph processing problems using the Apache Giraph system, while researchers will discover thorough coverage of the emerging and ongoing advancements in big graph processing systems.

Linked Data

Linked Data

Author: Sherif Sakr, Marcin Wylot, Raghava Mutharaju, Danh Le Phuoc Format: Hardback Release Date: 09/03/2018

This book describes efficient and effective techniques for harnessing the power of Linked Data by tackling the various aspects of managing its growing volume: storing, querying, reasoning, provenance management and benchmarking. To this end, Chapter 1 introduces the main concepts of the Semantic Web and Linked Data and provides a roadmap for the book. Next, Chapter 2 briefly presents the basic concepts underpinning Linked Data technologies that are discussed in the book. Chapter 3 then offers an overview of various techniques and systems for centrally querying RDF datasets, and Chapter 4 outlines various techniques and systems for efficiently querying large RDF datasets in distributed environments. Subsequently, Chapter 5 explores how streaming requirements are addressed in current, state-of-the-art RDF stream data processing. Chapter 6 covers performance and scaling issues of distributed RDF reasoning systems, while Chapter 7 details benchmarks for RDF query engines and instance matching systems. Chapter 8 addresses the provenance management for Linked Data and presents the different provenance models developed. Lastly, Chapter 9 offers a brief summary, highlighting and providing insights into some of the open challenges and research directions. Providing an updated overview of methods, technologies and systems related to Linked Data this book is mainly intended for students and researchers who are interested in the Linked Data domain. It enables students to gain an understanding of the foundations and underpinning technologies and standards for Linked Data, while researchers benefit from the in-depth coverage of the emerging and ongoing advances in Linked Data storing, querying, reasoning, and provenance management systems. Further, it serves as a starting point to tackle the next research challenges in the domain of Linked Data management.

Large-Scale Graph Processing Using Apache Giraph

Large-Scale Graph Processing Using Apache Giraph

Author: Sherif Sakr, Faisal Moeen Orakzai, Ibrahim Abdelaziz, Zuhair Khayyat Format: Hardback Release Date: 12/01/2017

This book takes its reader on a journey through Apache Giraph, a popular distributed graph processing platform designed to bring the power of big data processing to graph data. Designed as a step-by-step self-study guide for everyone interested in large-scale graph processing, it describes the fundamental abstractions of the system, its programming models and various techniques for using the system to process graph data at scale, including the implementation of several popular and advanced graph analytics algorithms. The book is organized as follows: Chapter 1 starts by providing a general background of the big data phenomenon and a general introduction to the Apache Giraph system, its abstraction, programming model and design architecture. Next, chapter 2 focuses on Giraph as a platform and how to use it. Based on a sample job, even more advanced topics like monitoring the Giraph application lifecycle and different methods for monitoring Giraph jobs are explained. Chapter 3 then provides an introduction to Giraph programming, introduces the basic Giraph graph model and explains how to write Giraph programs. In turn, Chapter 4 discusses in detail the implementation of some popular graph algorithms including PageRank, connected components, shortest paths and triangle closing. Chapter 5 focuses on advanced Giraph programming, discussing common Giraph algorithmic optimizations, tunable Giraph configurations that determine the system's utilization of the underlying resources, and how to write a custom graph input and output format. Lastly, chapter 6 highlights two systems that have been introduced to tackle the challenge of large scale graph processing, GraphX and GraphLab, and explains the main commonalities and differences between these systems and Apache Giraph. This book serves as an essential reference guide for students, researchers and practitioners in the domain of large scale graph processing. It offers step-by-step guidance, with several code examples and the complete source code available in the related github repository. Students will find a comprehensive introduction to and hands-on practice with tackling large scale graph processing problems using the Apache Giraph system, while researchers will discover thorough coverage of the emerging and ongoing advancements in big graph processing systems.

Large Scale and Big Data

Large Scale and Big Data

Author: Sherif Sakr Format: Paperback / softback Release Date: 16/11/2016

Large Scale and Big Data: Processing and Management provides readers with a central source of reference on the data management techniques currently available for large-scale data processing. Presenting chapters written by leading researchers, academics, and practitioners, it addresses the fundamental challenges associated with Big Data processing tools and techniques across a range of computing environments. The book begins by discussing the basic concepts and tools of large-scale Big Data processing and cloud computing. It also provides an overview of different programming models and cloud-based deployment models. The book's second section examines the usage of advanced Big Data processing techniques in different domains, including semantic web, graph processing, and stream processing. The third section discusses advanced topics of Big Data processing such as consistency management, privacy, and security. Supplying a comprehensive summary from both the research and applied perspectives, the book covers recent research discoveries and applications, making it an ideal reference for a wide range of audiences, including researchers and academics working on databases, data mining, and web scale data processing. After reading this book, you will gain a fundamental understanding of how to use Big Data-processing tools and techniques effectively across application domains. Coverage includes cloud data management architectures, big data analytics visualization, data management, analytics for vast amounts of unstructured data, clustering, classification, link analysis of big data, scalable data mining, and machine learning techniques.

Big Data 2.0 Processing Systems

Big Data 2.0 Processing Systems

Author: Sherif Sakr Format: Paperback / softback Release Date: 24/08/2016

This book provides readers the big picture and a comprehensive survey of the domain of big data processing systems. For the past decade, the Hadoop framework has dominated the world of big data processing, yet recently academia and industry have started to recognize its limitations in several application domains and big data processing scenarios such as the large-scale processing of structured data, graph data and streaming data. Thus, it is now gradually being replaced by a collection of engines that are dedicated to specific verticals (e.g. structured data, graph data, and streaming data). The book explores this new wave of systems, which it refers to as Big Data 2.0 processing systems. After Chapter 1 presents the general background of the big data phenomena, Chapter 2 provides an overview of various general-purpose big data processing systems that allow their users to develop various big data processing jobs for different application domains. In turn, Chapter 3 examines various systems that have been introduced to support the SQL flavor on top of the Hadoop infrastructure and provide competing and scalable performance in the processing of large-scale structured data. Chapter 4 discusses several systems that have been designed to tackle the problem of large-scale graph processing, while the main focus of Chapter 5 is on several systems that have been designed to provide scalable solutions for processing big data streams, and on other sets of systems that have been introduced to support the development of data pipelines between various types of big data processing jobs and systems. Lastly, Chapter 6 shares conclusions and an outlook on future research challenges. Overall, the book offers a valuable reference guide for students, researchers and professionals in the domain of big data processing systems. Further, its comprehensive content will hopefully encourage readers to pursue further research on the subject.

Large Scale and Big Data

Large Scale and Big Data

Author: Sherif Sakr Format: Hardback Release Date: 12/06/2014

Large Scale and Big Data: Processing and Management provides readers with a central source of reference on the data management techniques currently available for large-scale data processing. Presenting chapters written by leading researchers, academics, and practitioners, it addresses the fundamental challenges associated with Big Data processing tools and techniques across a range of computing environments. The book begins by discussing the basic concepts and tools of large-scale Big Data processing and cloud computing. It also provides an overview of different programming models and cloud-based deployment models. The book's second section examines the usage of advanced Big Data processing techniques in different domains, including semantic web, graph processing, and stream processing. The third section discusses advanced topics of Big Data processing such as consistency management, privacy, and security. Supplying a comprehensive summary from both the research and applied perspectives, the book covers recent research discoveries and applications, making it an ideal reference for a wide range of audiences, including researchers and academics working on databases, data mining, and web scale data processing. After reading this book, you will gain a fundamental understanding of how to use Big Data-processing tools and techniques effectively across application domains. Coverage includes cloud data management architectures, big data analytics visualization, data management, analytics for vast amounts of unstructured data, clustering, classification, link analysis of big data, scalable data mining, and machine learning techniques.

Graph Data Management

Graph Data Management

Author: Sherif Sakr Format: Hardback Release Date: 15/03/2012

Graph Data Management: Techniques and Applications is a central reference source for different data management techniques for graph data structures and their application. This book discusses graphs for modeling complex structured and schemaless data from the Semantic Web, social networks, protein networks, chemical compounds, and multimedia databases and offers essential research for academics working in the interdisciplinary domains of databases, data mining, and multimedia technology.