Lecture

Introduction to Data Science

Related lectures (32)

Covers decision tree classification using KNIME Analytics Platform for data preprocessing and model creation.

Covers data manipulation and exploration using Python with a focus on visualization techniques.

Offers a comprehensive introduction to Data Science, covering Python, NumPy, Pandas, Matplotlib, and scikit-learn, with a focus on practical exercises and collaborative work.

Decision Trees: Classification

Explores decision trees for classification, entropy, information gain, one-hot encoding, hyperparameter optimization, and random forests.

General Introduction to Big Data

Covers data science tools, Hadoop, Spark, data lake ecosystems, CAP theorem, batch vs. stream processing, HDFS, Hive, Parquet, ORC, and MapReduce architecture.

Logistic Regression: Fundamentals and Applications

Explores logistic regression fundamentals, including cost functions, regularization, and classification boundaries, with practical examples using scikit-learn.

Data Wrangling with Hadoop: Storage Formats and Hive

Explores data wrangling with Hadoop, emphasizing storage formats and Hive for big data processing.

Data Wrangling with Hadoop

Covers data wrangling techniques using Hadoop, focusing on row-oriented versus column-oriented databases, popular storage formats, and HBase-Hive integration.

Spectral Estimation Methods

Explores parametric spectrum estimation methods, including line and smooth spectra, and delves into heart rate variability analysis.

Python Lists: Manipulation and Comprehension

Covers Python list manipulation and comprehension, emphasizing memory representation and mutability.

PyTorch and Convolutional Networks

Covers PyTorch tensor data structure and training a CNN to classify images.

Introduction to Machine Learning

Covers the basics of machine learning, including supervised and unsupervised learning, linear regression, and classification.

Data Wrangling with Hive: Managing Big Data Efficiently

Covers data wrangling techniques using Apache Hive for efficient big data management.

Structures and Mechanisms: Opening a Box

Explores the analysis of structures and mechanisms through a sample problem of opening a box with a string-held lid.

Statistical Signal Processing

Covers Gaussian mixture models, denoising, data classification, and spike sorting using principal component analysis.

3D Stone Scanning Session

Introduces a 'professional' 3D measurement system for stone analysis and feature extraction using stereo photogrammetry and structured light technologies.

Gitlab Agent for Kubernetes (`agentk`)

Covers the setup of a GitLab agent for Kubernetes, focusing on installation, version control, and troubleshooting.

Total scattering and PDF analysis

Explores total scattering and pair distribution function (PDF) analysis in materials science, covering in-situ synthesis, data analysis techniques, and applications in host-guest systems.

Big Data Best Practices and Guidelines

Covers best practices and guidelines for big data, including data lakes, architecture, challenges, and technologies like Hadoop and Hive.

Data Science Essentials: Python, Numpy, Pandas, and Scikit-learn

Covers the essentials of Data Science using Python, NumPy, Pandas, and scikit-learn, including DNA sequence analysis and classification.