Summer School: Data Science - An Overview

Target group

Students who want to develop skills in managing scientific data using the statistical software R and gain a basic understanding of data science.

 

Content

Introduction: What is Data Science?

  • Big Data and Data Science hype and getting past the hype
  • Why now? - Datafication
  • Current landscape of perspectives
  • Skill sets needed / Data Science Teams
  • Data Catalog
  • Planning a Data Science project and measuring its success
  • Digital Transformation strategies


Statistical Inference

  • Populations and samples
  • Statistical modeling, probability distributions, fitting a model


Exploratory Data Analysis and the Data Science Process

  • Basic tools (plots, graphs and summary statistics) of EDA
  • Philosophy of EDA
  • The Data Science Process


Three Basic Machine Learning Algorithms

  • Linear Regression
  • k-Nearest Neighbors (k-NN)
  • k-means


One More Machine Learning Algorithm and Usage in Applications

  • Motivating application: Filtering Spam
  • Why Linear Regression and k-NN are poor choices for Filtering Spam
  • Naive Bayes and why it works for Filtering Spam
  • Data Wrangling: APIs and other tools for scraping the Web


Feature Generation and Feature Selection (Extracting Meaning From Data)

  • Motivating application: user (customer) retention
  • Feature Generation (brainstorming, role of domain expertise, and place for imagination)
  • Feature Selection algorithms
  • Filters; Wrappers; Decision Trees; Random Forests


Recommendation Systems: Building a User-Facing Data Product

  • Algorithmic ingredients of a Recommendation Engine
  • Dimensionality Reduction
  • Singular Value Decomposition
  • Principal Component Analysis


Mining Social-Network Graphs

  • Social networks as graphs
  • Clustering of graphs
  • Direct discovery of communities in graphs
  • Partitioning of graphs
  • Neighborhood properties in graphs


Data Science and Ethical Issues

  • Discussions on privacy, security, ethics
  • A look back at Data Science
  • Next-generation data scientists 

 

Description

Data Science is the study of the generalizable extraction of knowledge from data. Being a data scientist requires an integrated skill set spanning mathematics, statistics, machine learning, databases and other branches of computer science along with a good understanding of the craft of problem formulation to engineer effective solutions.

This summer school introduces students to this rapidly growing field and equips them with some of its basic principles and tools as well as its general mindset. Students will learn the concepts, techniques, and tools they need to deal with the various facets of data science practice, including data collection and integration, exploratory data analysis, predictive modeling, descriptive modeling, data product creation, evaluation, and effective communication.

Short Facts
  • Date: 02.05. - 30.06.2024, self-paced learning
  • Format: Online Course, 40 TU, 3.5 ECTS
  • Cost: EUR 150
  • Lecturers: Peter Schwazer, Mario Tuta, Walter Boyajian
  • Group size: Max. 40 participants
  • Methodology: Self-paced learning
  • Language: English
  • Badge: Data Science | Basics
  • Venue: Online, Sakai


Registration for this career event starts on February 1, 2024!


Online Registration

Online Registration Badges

Students who participate in a Badge program are kindly asked to register for this career event only via:
