Data teams can often generate proofs of concept for their work relatively quickly. But an impressive proof of concept is rarely the finished result; the coordination necessary to metamorphosize a Jupyter notebook into a scalable data product can be complex, and there is no single template for success. Effective data teams must coordinate with product management and engineering functions and align project outcomes with larger business goals in a way that preserves the unique culture of experimentation and curiosity that makes data science projects compelling in the first place.
This course illuminates the “how” of data science within organizations: product management, processes, and infrastructure as well as the tooling that sets the conditions for successful data projects. We will explore some common stumbling blocks for data teams as they work to transition small scale experiments to robust production systems. We will also investigate some of the data engineering and infrastructure that comprises a productive developer experience for data scientists.
This course is part of the Leadership track of the Advanced Data Science Certificate. In particular, it’s part of a three-course leadership sequence that explores different aspects of leadership and management in a data science context: How, focusing on product management and infrastructure; Who, focusing on building effective teams, and Why, What, & Where, focusing on organizational strategy.
Upon successful completion of the course, students will:
Understand the benefits of agile data product development and the special considerations necessary when applying product management processes to data teams that operate with an inherently high degree of uncertainty.
Explore best practices for data product development including hypothesis-driven development, test-driven development, version control, unit testing, code review, pair programming, SCRUM, and Kanban
Scope large, complex, and unpredictable tasks into tractable steps with concrete measures of success.
Evaluate software tools that can help to discourage silos, encourage prototyping and iteration, and keep costs low.
Compare open source data storage, access, and processing tools to proprietary ones and understand the tradeoffs and advantages.
Applies Towards the Following Certificates
- Certificate in Advanced Data Science : Elective Course