Summer School: Data Science - An Overview

Target group

Students who want to develop skills in managing scientific data with the statistical software R and gain a basic understanding of data science.



Introduction: What is Data Science?

  • Big Data and Data Science: the hype, and getting past it
  • Why now? - Datafication
  • Current landscape of perspectives
  • Skill sets needed / Data Science Teams
  • Data Catalog
  • Planning a Data Science project and measuring its success
  • Digital Transformation strategies

Statistical Inference

  • Populations and samples
  • Statistical modeling, probability distributions, fitting a model

Exploratory Data Analysis and the Data Science Process

  • Basic tools (plots, graphs and summary statistics) of EDA
  • Philosophy of EDA
  • The Data Science Process

Three Basic Machine Learning Algorithms

  • Linear Regression
  • k-Nearest Neighbors (k-NN)
  • k-means

One More Machine Learning Algorithm and Usage in Applications

  • Motivating application: Filtering Spam
  • Why Linear Regression and k-NN are poor choices for Filtering Spam
  • Naive Bayes and why it works for Filtering Spam
  • Data Wrangling: APIs and other tools for scraping the Web

Feature Generation and Feature Selection (Extracting Meaning From Data)

  • Motivating application: user (customer) retention
  • Feature Generation (brainstorming, role of domain expertise, and place for imagination)
  • Feature Selection algorithms
  • Filters; Wrappers; Decision Trees; Random Forests

Recommendation Systems: Building a User-Facing Data Product

  • Algorithmic ingredients of a Recommendation Engine
  • Dimensionality Reduction
  • Singular Value Decomposition
  • Principal Component Analysis

Mining Social-Network Graphs

  • Social networks as graphs
  • Clustering of graphs
  • Direct discovery of communities in graphs
  • Partitioning of graphs
  • Neighborhood properties in graphs

Data Science and Ethical Issues

  • Discussions on privacy, security, ethics
  • A look back at Data Science
  • Next-generation data scientists 



Data Science is the study of the generalizable extraction of knowledge from data. Being a data scientist requires an integrated skill set spanning mathematics, statistics, machine learning, databases and other branches of computer science, along with a good understanding of the craft of problem formulation, in order to engineer effective solutions.

This summer school will introduce students to this rapidly growing field and equip them with some of its basic principles and tools as well as its general mindset. Students will learn concepts, techniques and tools they need to deal with various facets of data science practice, including data collection and integration, exploratory data analysis, predictive modeling, descriptive modeling, data product creation, evaluation, and effective communication.

Short Facts
  • Date: 02.05. - 14.07.2023, self-paced learning
    Q&A sessions: 30.05. and 27.06.2023, 05.30 - 07.00 pm
  • Format: Online Course, 40 TU, 3.5 ECTS
  • Costs: EUR 150
  • Lecturers: Peter Schwazer, Mario Tuta, Walter Boyajian
  • Group size: Max. 40 participants
  • Methodology: Self-paced learning
  • Language: English
  • Badge: Data Science | Basics
  • Venue: Online, Sakai

Registration for this career event starts on February 1, 2023!

Online Registration

Online Registration Badges

Students who participate in a Badge program are kindly asked to register for this career event only via:
