
Course Description

Machine learning can classically be summarized with two methodologies: supervised and unsupervised learning. In supervised learning, the “correct answers” are annotated ahead of time and the algorithm tries to fit a decision space based on those answers. In unsupervised learning, algorithms try to group like examples together, inferring similarities via distance or similarity metrics. These learning types allow us to explore data and categorize it in a meaningful way, predicting where new data will fit into our models.
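
As a rough illustration of the two methodologies, the sketch below fits one supervised and one unsupervised model to the same data. The choice of the bundled iris data set, LogisticRegression, and KMeans is an assumption made for this example only and is not taken from the course materials.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Example data chosen for illustration: 150 flowers, 4 features, 3 species.
X, y = load_iris(return_X_y=True)

# Supervised: the annotated answers (y) are passed to fit(), and the
# model learns a decision space that it then applies to unseen rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
classifier = LogisticRegression(max_iter=1000)
classifier.fit(X_train, y_train)
predicted_labels = classifier.predict(X_test)

# Unsupervised: no labels are passed. KMeans groups like examples
# together using Euclidean distance to the cluster centers.
clusterer = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = clusterer.fit_predict(X)

print("held-out accuracy:", classifier.score(X_test, y_test))
print("cluster sizes:", [int((cluster_ids == k).sum()) for k in range(3)])
```

Note that the cluster numbers returned by KMeans are arbitrary identifiers and need not line up with the species labels used by the classifier.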

Scikit-Learn is a powerful machine learning library implemented in Python on top of the numeric and scientific computing powerhouses NumPy, SciPy, and matplotlib, enabling extremely fast analysis of small- to medium-sized data sets. It is open source, commercially usable, and contains many modern machine learning algorithms for classification, regression, clustering, feature extraction, and optimization. For this reason, Scikit-Learn is often the first tool in a data scientist’s toolkit for machine learning on incoming data sets.

The purpose of this course is to serve as an introduction to machine learning with Scikit-Learn. We will explore several clustering, classification, and regression algorithms for a variety of machine learning tasks and learn how to implement these tasks with our data using Scikit-Learn and Python. In particular, we will structure our machine learning models as though we were producing a data product: an actionable model that can be used in larger programs or algorithms, rather than simply a research or investigation methodology. For more on Scikit-Learn see: Six Reasons why I recommend Scikit-Learn (O’Reilly Radar).

Course Objectives

After this course you should understand the basics of machine learning and how to implement machine learning algorithms on your data sets using Python and Scikit-Learn. In particular, you should understand basic regression, classification, and clustering algorithms, and how to fit a model and use it to predict future outcomes.

After completion of this course, students should:

  • Understand the basic mechanics of machine learning, and how machine learning differs from data mining, pattern recognition, or statistical hypothesis testing.
  • Understand the differences between, and data requirements for, regression, classification, and clustering machine learning methodologies.
  • Understand how to prepare and load datasets into Scikit-Learn, including normalization, standardization, and imputation techniques, as well as pre-investigations of data with feature extraction, dimension analysis, and distance metrics.
  • Have reviewed the many different types of models available in Scikit-Learn and the basic API for building models and saving models to disk.
  • Be able to evaluate models using cross-validation, mean squared error, accuracy, precision, recall, and F1 scores as well as understand confusion matrices. 
  • Be able to deploy models into applications or data products to receive feedback from them, retraining and reinforcing existing models. 
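
Several of these objectives can be sketched in one short script: preparing data with imputation and standardization, fitting a model through the common Scikit-Learn estimator API, evaluating it, and saving it to disk. This is a minimal sketch under stated assumptions, not the course's own workflow: the breast cancer data set, the artificially removed values, the LogisticRegression model, and the file name model.joblib are all chosen here for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Assumption for illustration: blank out a few values so the imputer has work to do.
rng = np.random.default_rng(0)
X = X.copy()
X[rng.integers(0, X.shape[0], 20), rng.integers(0, X.shape[1], 20)] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preparation and model are chained so the same steps run at fit and predict time.
model = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),    # fill missing values
    ("scale", StandardScaler()),                   # zero mean, unit variance
    ("classify", LogisticRegression(max_iter=1000)),
])

# 5-fold cross-validation on the training data only.
cv_accuracy = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")

# Fit once on all training data, then score on the held-out test set.
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
matrix = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

# Save the fitted pipeline to disk and load it back, as an application might.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model.joblib")  # hypothetical file name
    joblib.dump(model, model_path)
    restored = joblib.load(model_path)
    restored_pred = restored.predict(X_test)

print("mean CV accuracy:", cv_accuracy.mean())
print("confusion matrix:\n", matrix)
print("precision, recall, F1:", precision, recall, f1)
```

Wrapping the preparation steps in a Pipeline means the imputer and scaler are refit inside each cross-validation fold, which keeps information from the validation rows out of the training step.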

Notes

Enrollment in this course is restricted. Students must submit an application and be accepted into the Certificate in Data Science in order to register for this course.

Current Georgetown students must create an application using their Georgetown NetID and password. New students will be prompted to create an account.

Course Prerequisites

This course builds on material from prior courses in the Data Analytics Certificate program and assumes familiarity with Python and the command line. Please ensure that all software is installed and ready for your particular operating system. Other course requirements include:

  • A bachelor’s degree or equivalent
  • Completion of at least two college-level math courses (e.g., statistics or calculus)
  • Successful completion of Data Wrangling (XBUS-503)
  • Successful completion of Data Analysis I: Statistics (XBUS-504)
  • Familiarity with Python or Anaconda and the command line
  • A laptop for class meetings and coursework with Python and Scikit-Learn installed


Enroll Now - Select a section to enroll in

Schedule and Location

Type: Class
Days: F
Time: 6:30PM to 9:30PM
Dates: May 07, 2021 to May 14, 2021

Type: Class
Days: Sa
Time: 9:00AM to 4:00PM
Dates: May 08, 2021 to May 15, 2021

Contact Hours: 18.0
Course Tuition: Tuition non-credit $1,249.00

Section Notes

Welcome to the Flex Learning Experience: real-time learning using live Zoom video conferencing, mirroring a more traditional classroom with regular interaction, engaging activities, and the dynamic exploration of topics and concepts.

  • Dynamic exploration of topics, ideas and concepts with the instructor and students in the class
  • Interact regularly and frequently with your instructors and other students
  • Comparable level of accountability and engagement as classroom attendance
  • Lectures, discussions, and presentations occur at a specific hour
  • Face-to-face discussion and individual guidance, bringing speed and immediacy to synchronous online learning
  • Immediate feedback - encouraging quick feedback on ideas, and supporting consensus and decision making
  • Pacing - encouraging students to keep up-to-date and providing discipline to learning
  • Spontaneity - making it easy to add new ideas to the conversation, brainstorming or decision making
  • Familiarity - simulating a more traditional face-to-face environment

 

Computing Requirements

Students will be expected to use a personal laptop to complete analytics and programming workshops and a Capstone project. Students should have administrative access and be able to install required course software and libraries. We recommend the following minimum computing requirements:

  • A laptop with at least a dual-core 1.8 GHz processor, 4GB of RAM, and 20 GB free hard disk space (e.g. a laptop purchased in the past two years).
  • A modern operating system: Windows 10 or newer (updated to the latest semi-annual channel version), OS X 10.15 Catalina or newer, or Ubuntu 20.04 or newer (or an equivalent Linux distribution). OS X and Linux are strongly encouraged.
  • Administrator access on your system to install new software.
  • Python 3.8 (or later) or Anaconda 2020.07 (or later) installed on your system.
  • A command prompt available (PowerShell on Windows, Terminal on OS X or Linux).

Please note that computing requirements and software dependencies may change. 

Schedule and Location

Type: Class
Days: F
Time: 6:30PM to 9:30PM
Dates: Aug 06, 2021 to Aug 13, 2021

Type: Class
Days: Sa
Time: 9:00AM to 4:00PM
Dates: Aug 07, 2021 to Aug 14, 2021

Contact Hours: 18.0
Course Tuition: Tuition non-credit $1,249.00
