Course Description

Cleaning and formatting data, together known as “data wrangling,” are the most underappreciated yet time-consuming steps in the data science pipeline. In real-world analyses, data wrangling can consume up to 80% of project time. During this course, students will learn and apply the Extract/Transform/Load (ETL) process used by professional data scientists to clean and prepare data sets for analysis.

Course Objectives

Upon successful completion of the course, students will:

  • Understand the time commitment needed for data wrangling 
  • Identify data sets that may be time-intensive to clean
  • Efficiently clean data sets of both structured and unstructured data to prepare for analysis
  • Apply the Extract/Transform/Load (ETL) process to a data set
  • Better estimate the time required for data wrangling tasks
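
As a rough illustration of what an ETL pass looks like in practice, the following is a minimal sketch rather than course material. The sample records, the `people` table, and the function names are hypothetical, and it uses only the Python standard library.

```python
import csv
import io
import sqlite3

# Hypothetical raw input: stray whitespace, inconsistent casing, a missing age.
RAW_CSV = """name,city,age
 Alice ,washington,34
BOB,Arlington,
carol, bethesda ,29
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace, normalize casing, drop rows with no age."""
    cleaned = []
    for row in rows:
        age = row["age"].strip()
        if not age:
            continue  # skip incomplete records
        cleaned.append(
            (row["name"].strip().title(), row["city"].strip().title(), int(age))
        )
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS people (name TEXT, city TEXT, age INTEGER)"
    )
    conn.executemany("INSERT INTO people VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
load(transform(extract(RAW_CSV)), conn)
rows = conn.execute("SELECT name, city, age FROM people ORDER BY name").fetchall()
print(rows)
```

Keeping the three stages as separate functions makes it easier to see, and to time, where the cleaning effort actually goes.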

Notes

Enrollment in this course is restricted. Students must submit an application and be accepted into the Certificate in Data Science in order to register for this course.

Current Georgetown students must create an application using their Georgetown NetID and password. New students will be prompted to create an account.

Course Prerequisites

Course prerequisites include:

  • A bachelor's degree or equivalent
  • Completion of at least two college-level math courses (e.g., statistics or calculus)
  • Successful completion of Data Sources (XBUS-502)
  • Basic familiarity with programming or a programming language
  • A laptop for class meetings and coursework

Enroll Now - Select a section to enroll in

Type: Class
Days: Saturday
Time: 9:00 AM to 4:00 PM
Dates: Apr 10, 2021 to Apr 17, 2021
Schedule and Location
Contact Hours: 12.0
Course Tuition: $833.00 (non-credit)
Instructors
Section Notes

Welcome to the Flex Learning Experience: real-time learning using live Zoom video conferencing, mirroring a more traditional classroom with regular interaction, engaging activities, and the dynamic exploration of topics and concepts.

  • Dynamic exploration of topics, ideas, and concepts with the instructor and other students in the class
  • Regular and frequent interaction with your instructors and other students
  • A level of accountability and engagement comparable to classroom attendance
  • Lectures, discussions, and presentations that occur at scheduled times
  • Face-to-face discussion and individual guidance, with the speed and immediacy of synchronous online learning
  • Immediate feedback - encouraging quick feedback on ideas and supporting consensus and decision making
  • Pacing - encouraging students to stay up to date and bringing discipline to learning
  • Spontaneity - making it easy to add new ideas to conversation, brainstorming, or decision making
  • Familiarity - simulating a more traditional face-to-face environment

 

Computing Requirements

Students will be expected to use a personal laptop to complete analytics and programming workshops and a Capstone project. Students should have administrative access and be able to install required course software and libraries. We recommend the following minimum computing requirements:

  • A laptop with at least a dual-core 1.8 GHz processor, 4 GB of RAM, and 20 GB of free hard disk space (e.g., a laptop purchased in the past two years).
  • A modern operating system: Windows 10 or newer (updated to the latest semi-annual channel version), macOS 10.15 Catalina or newer, or Ubuntu 20.04 or newer (or an equivalent Linux distribution). macOS and Linux are strongly encouraged.
  • Administrator access on your system to install new software.
  • Python 3.8 (or later) or Anaconda 2020.07 (or later) installed on your system.
  • A command prompt available (PowerShell on Windows, Terminal on macOS or Linux).

Please note that computing requirements and software dependencies may change. 
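
One way to confirm that an installed interpreter meets the Python 3.8 minimum listed above is a short check like the one below. This is a minimal sketch, not an official course script; the `meets_minimum` helper and `MIN_PYTHON` constant are hypothetical names chosen for illustration.

```python
import sys

# Minimum version taken from the computing requirements above.
MIN_PYTHON = (3, 8)


def meets_minimum(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


current = ".".join(str(part) for part in sys.version_info[:3])
if meets_minimum():
    print(f"Python {current} meets the 3.8+ requirement.")
else:
    print(f"Python {current} is older than 3.8; please upgrade.")
```

Saving this to a file and running it with the same `python` command you plan to use for coursework shows which interpreter your command prompt actually picks up.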

Type: Class
Days: Saturday
Time: 9:00 AM to 4:00 PM
Dates: Jul 10, 2021 to Jul 17, 2021
Schedule and Location
Contact Hours: 12.0
Course Tuition: $833.00 (non-credit)
Instructors
Section Notes

The Flex Learning Experience notes and Computing Requirements for this section are identical to those listed for the Apr 10, 2021 section above.
