Apache Spark

Apache Spark is one of the most popular frameworks for big data processing, offering fast, powerful capabilities for data analysis and machine learning. It is designed to scale from a single server to thousands of machines, with an emphasis on speed and developer friendliness.

Key Features of Apache Spark

Speed: Spark's optimized engine can run up to 100 times faster than Hadoop MapReduce when data is processed in memory, and up to 10 times faster when data is processed on disk.
Support for multiple languages: Developers can write Spark applications in Scala, Python, Java, and R, which makes Spark accessible to a wide range of users.
Advanced analytics: Beyond MapReduce-style operations, Spark provides Spark SQL for querying data, MLlib for machine learning, GraphX for graph processing, and Spark Streaming for stream processing.
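
The MapReduce model mentioned in the feature list can be illustrated with a small word count. The sketch below is plain Python and does not use Spark at all; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. It shows the three conceptual steps on one machine, which a cluster framework would distribute across many.

```python
from collections import defaultdict
from functools import reduce


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def shuffle_phase(pairs):
    """Group the emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Combine all values for each key into a single count."""
    return {key: reduce(lambda a, b: a + b, values) for key, values in grouped.items()}


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


sample = ["spark is fast", "spark is general"]
counts = word_count(sample)
print(counts)
```

On a real cluster the map and reduce steps run in parallel on partitions of the data, and the shuffle moves data between machines; this single-process version only shows the data flow.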

Components of Apache Spark

Spark Core: The fundamental execution engine of the Spark platform, on which all other functionality is built.
Spark SQL: Lets you process data with SQL and HiveQL queries, which makes it straightforward to integrate with existing databases and data warehouses.
Spark Streaming: Lets developers process and analyze real-time data streams, comparable to what Apache Storm and Apache Flink offer.
MLlib: A machine learning library that makes it easier to implement predictive analytics and scale it to large datasets.
GraphX: For processing graphs and graph-based computations, well suited to tasks such as social network analysis.

Applications of Apache Spark

Data Processing: Suited to tasks ranging from ETL to interactive queries and data mining.
Machine Learning: Well suited to running complex algorithms, such as clustering and classification, on large datasets.
Real-Time Processing: Used for streaming data, so that companies can extract insights from their data streams in real time.

Getting Started with Apache Spark

To get started with Spark, download the software from the official Apache website. Many developers run Spark on a cluster managed by Hadoop YARN or Apache Mesos, but it can also run locally on a single computer for testing purposes.
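
Running locally on a single computer might look like the following minimal sketch. It assumes the `pyspark` package and a Java runtime are installed, which the text above does not state. The function name `run_local_example`, the application name, the `sales` view and the sample rows are all made up for illustration. The master URL `local[*]` asks Spark to run inside the current process using all available cores instead of connecting to a cluster.

```python
def run_local_example():
    """Start a local Spark session, run one SQL query, and return plain tuples."""
    # Imported here so that merely loading this file does not require PySpark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")            # local mode: no cluster manager involved
        .appName("local-smoke-test")   # hypothetical application name
        .getOrCreate()
    )
    try:
        # Hypothetical sample data: (category, amount) rows.
        df = spark.createDataFrame(
            [("books", 12.0), ("games", 30.0), ("books", 8.0)],
            ["category", "amount"],
        )
        # Register the DataFrame so it can be queried with Spark SQL.
        df.createOrReplaceTempView("sales")
        rows = spark.sql(
            "SELECT category, SUM(amount) AS total "
            "FROM sales GROUP BY category ORDER BY category"
        ).collect()
        return [(row["category"], row["total"]) for row in rows]
    finally:
        # Always release the local Spark resources.
        spark.stop()
```

Calling `run_local_example()` returns the per-category totals as a list of tuples. When the same code is submitted to a cluster, the master is normally supplied from outside the program rather than hard-coded as it is here.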

Learning Resources

For those who want to learn more about Apache Spark, plenty of tutorials, documentation, and online courses are available. Popular resources include the official Spark documentation, specialized blogs, and hands-on lab sessions that are accessible online.

Overview of Apache Spark Courses

Taming Big Data with Apache Spark and Python - Hands On!

Audience: All levels.

Provider: Udemy; Duration: 5 hours (46 lectures); Indicative price: €99.99

"Big data" analysis is a hot and highly valuable skill – and this course will teach you the hottest technology in big data: Apache Spark. Employers including Amazon, eBay, NASA JPL, and Yahoo all use Spark to quickly extract meaning from massive data sets across a fault-tolerant Hadoop cluster. You'll learn those ..

Apache Spark with Scala - Hands On with Big Data!

Audience: Software engineers who want to expand their skills into the world of big data processing on a cluster; If you have no previous programming or scripting experience, you'll want to take an introductory p..

Provider: Udemy; Duration: 9 hours total; Indicative price: €149.99

New! Completely updated and re-recorded for Spark 3, IntelliJ, Structured Streaming, and a stronger focus on the DataSet API. "Big data" analysis is a hot and highly valuable skill – and this course will teach you the hottest technology in big data: Apache Spark. Employers including Amazon, eBay, ..

Scala and Spark for Big Data and Machine Learning

Audience: Someone who already knows how to program and is interested in learning Big Data Technologies; Interested in using Spark with Scala for Machine Learning with Large Data Sets..

Provider: Udemy; Duration: 10 hours total; Indicative price: €194.99

Learn how to utilize some of the most valuable tech skills on the market today, Scala and Spark! In this course we will show you how to use Scala and Spark to analyze Big Data. Scala and Spark are two of the most in-demand skills right now, and with this course you can learn them quickly and easily! This cour..

Streaming Big Data with Spark Streaming and Scala - Hands On

Audience: Students with some prior programming or scripting ability SHOULD take this course; If you're working for a company with "big data" that is being generated continuously, or hope to work for one, this c..

Provider: Udemy; Duration: 6.5 hours total; Indicative price: €149.99

New! Updated for Spark 3.0.0! "Big Data" analysis is a hot and highly valuable skill. Thing is, "big data" never stops flowing! Spark Streaming is a new and quickly developing technology for processing massive data sets as they are created - why wait for some nightly analysis to run when you can constantly upd..

Apache Spark 2.0 with Java -Learn Spark from a Big Data Guru

Audience: Anyone who wants to fully understand how Apache Spark technology works and learn how Apache Spark is being used in the field; Software engineers who want to develop Apache Spark 2.0 applications using..

Provider: Udemy; Duration: 3.5 hours total; Indicative price: €199.99

What is this course about: This course covers all the fundamentals of Apache Spark with Java and teaches you everything you need to know about developing Spark applications with Java. At the end of this course, you will gain in-depth knowledge about Apache Spark and general big data analysis and manip..

CCA 175 - Spark and Hadoop Developer Certification - Scala

Audience: Any IT aspirant or professional willing to learn Big Data and take the CCA 175 certification..

Provider: Udemy; Duration: 29 hours total; Indicative price: €179.99

CCA 175 Spark and Hadoop Developer is one of the well-recognized Big Data certifications. This scenario-based certification exam demands basic programming using Python or Scala along with Spark and other Big Data technologies. This comprehensive course covers all aspects of the certification using Scala as programming la..

Apache Spark for Java Developers

Audience: Anyone who already knows Java and would like to explore Apache Spark; Anyone new to Data Science who wants a fast way to get started, without learning Python, Scala or R!..

Provider: Udemy; Duration: 21.5 hours total; Indicative price: €34.99

Get started with the amazing Apache Spark parallel computing framework - this course is designed especially for Java Developers. If you're new to Data Science and want to find out about how massive datasets are processed in parallel, then the Java API for Spark is a great way to get started, fast. All of the fundame..

Apache Spark 3 - Spark Programming in Python for Beginners

Audience: Software Engineers and Architects who are willing to design and develop Big Data engineering projects using Apache Spark; Programmers and developers who are aspiring to grow and learn Data Engineering..

Provider: Udemy; Duration: 6.5 hours total; Indicative price: €19.99

This course does not require any prior knowledge of Apache Spark or Hadoop. We have taken enough care to explain Spark Architecture and fundamental concepts to help you come up to speed and grasp the content of this course. About the Course: I am creating Apache Spark 3 - Spark Programming in Python for Beginners course t..

Spark 3.0 & Big Data Essentials with Scala | Rock the JVM

Audience: Future data scientists; Programmers getting into the field of Big Data; Engineers wanting to learn Spark in Scala, its native language..

Provider: Udemy; Duration: 7.5 hours total; Indicative price: €49.99

UPDATED FOR SPARK 3.0. In this course, we will learn how to write big data applications with Apache Spark 3 and Scala. You'll write 2000+ lines of Spark code yourself, with guidance, and you will become a rockstar. This course is for Scala programmers who are getting started with Ap..

Apache Spark 3 - Spark Programming in Scala for Beginners

Provider: Udemy; Duration: 7 hours total; Indicative price: €19.99

Master Apache Spark - Hands On!

Audience: Anyone who is a Java developer and wants to add this seriously marketable technology to their resume; Anyone who wants to get into the data science field; Anyone who is interested in the world of..

Provider: Udemy; Duration: 7 hours total; Indicative price: €99.99

LAST UPDATED: November 2020. Apache Spark is the next generation batch and stream processing engine. It's been proven to be almost 100 times faster than Hadoop and much, much easier to develop distributed big data applications with. Its demand has skyrocketed in recent years and having this technolog..

Apache Spark Hands on Specialization for Big Data Analytics

Audience: Anyone who has the passion to develop expertise in Big Data and specifically Apache Spark; Software Engineers or Developers; Data Warehousing or Business Intelligence Professionals; Data Scientist and Ma..

Provider: Udemy; Duration: 12 hours total; Indicative price: €149.99

What if you could catapult your career in one of the most lucrative domains, i.e. Big Data, by learning the state-of-the-art Hadoop technology (Apache Spark) which is considered mandatory in all of the current jobs in this industry? What if you could develop your skill-set in one of the hottest Big Data technol..

Apache Spark Streaming with Python and PySpark

Audience: Python Developers looking to get better at Data Streaming; Managers or Senior Engineers in Data Engineering Teams; Spark Developers eager to expand their skills.

Provider: Udemy; Duration: 4 hours total; Indicative price: €139.99

What is this course about? This course covers all the fundamentals of Apache Spark streaming with Python and teaches you everything you need to know about developing Spark streaming applications using PySpark, the Python API for Spark. At the end of this course, you will gain in-depth knowledge ..

Apache Spark 2.0 + Java : DO Big Data Analytics & ML

Audience: Software Professionals; Big Data Architects; Data Engineers..

Provider: Udemy; Duration: 8 hours total; Indicative price: €199.99

Welcome to our course. Looking to learn Apache Spark 2.0, practice end-to-end projects and take it to a job interview? You have come to the RIGHT course! This course teaches you Apache Spark 2.0 with Java, trains you in building Spark Analytics and machine learning programs and helps you..

From 0 to 1 : Spark for Data Science with Python

Audience: Yep! Analysts who want to leverage Spark for analyzing interesting datasets; Yep! Data Scientists who want a single engine for analyzing and modelling data as well as productionizing it; Yep! Engineer..

Provider: Udemy; Duration: 8.5 hours total; Indicative price: €99.99

Taught by a 4-person team including 2 Stanford-educated, ex-Googlers and 2 ex-Flipkart Lead Analysts. This team has decades of practical experience in working with Java and with billions of rows of data. Get your data to fly using Spark for analytics, machine learning and data science ..

HDPCD:Spark using Scala

Audience: Anyone who wants to prepare for the HDPCD Spark Certification using Scala..

Provider: Udemy; Duration: 18 hours total; Indicative price: €49.99

The course covers the overall syllabus of the HDPCD:Spark Certification. Scala Fundamentals - Basic Scala programming required using REPL. Getting Started with Spark - Different setup options, setup process. Core Spark - Transformations and Actions to process the data. Data Frames and Spark SQL - Leverage SQL skills on top of Dat..
