Practical Machine Learning With Apache Spark
This intensive, hands-on training introduces the core aspects of scalable data processing using Python on the Apache Spark platform. Students learn the essentials of Python, but the primary focus is on the capabilities of Apache Spark and its Machine Learning module. The course also introduces the terminology, concepts, and algorithms used in Machine Learning.
Audience
This course is suitable for Data Scientists, Business Analysts, Software Developers, and IT Architects.
Prerequisites
Participants should have a working knowledge of Python or strong programming experience in another language. Familiarity with core statistical concepts such as variance and correlation is helpful.
Outline
- What is Data Science?
- Machine Learning Life-Cycle
- Overview of Python for Data Science
- Introduction to Apache Spark
- The Spark Shell
- Introduction to Jupyter Notebooks
- Data Visualization with matplotlib
- Data Science and ML Algorithms with PySpark
Is there a discount available for current students?
UMBC students and alumni, as well as students who have previously taken a public training course with UMBC Training Centers, are eligible for a 10% discount, capped at $250. Please provide a copy of your UMBC student ID, an unofficial transcript, or the name of the UMBC Training Centers course you have completed. Asynchronous courses are excluded from this offer.
What is the cancellation and refund policy?
Students will receive a refund of paid registration fees only if UMBC Training Centers receives notice of cancellation at least 10 business days before the class start date (for classes) or the exam date (for exams).