MAST32001: Computational Statistics, Autumn 2021

If you are enrolled in the Computational Statistics course for autumn 2021 (MAST32001) at the University of Helsinki, please read the instructions below carefully.

These instructions are the same as those available in Moodle.

Due to a flaw in the new studies system (Sisu), most enrolled students won't be able to access Moodle before the first lecture, so all preliminary course instructions have been copied to this page. As soon as you are able to access Moodle, please use Moodle for up-to-date information, instructions and all the course materials.

Course enrollment

Please double-check in Sisu that you are correctly enrolled in the course (MAST32001).
If you log in to Sisu and see something along the lines of "Requirements not fulfilled" or "Ei täytä vaatimuksia", then you are not correctly enrolled.

Instructions for registering for courses

General instructions: https://studies.helsinki.fi/instructions/article/sisu-instructions-registering-…

PLEASE NOTE: Always register via your study plan. Do not use the search in Sisu; instead, add the course to your study plan and register via the plan. This is how you ensure that you are registering for the correct version of the course.

If you register for a course and receive the notification ‘The course is not included in your primary study plan’, Sisu will advise you to add the course to your plan.

However, if the course is already included in your primary plan, but you still receive the above message, please check from the course brochure that you have selected the most recent version of the course available, i.e., the one for the academic year 2021–2022.

Instructions for registration errors:

https://studies.helsinki.fi/instructions/article/sisu-instructions-registering-…

If you encounter a registration error, please unenroll from the (wrong) course and try enrolling again, following the instructions above (e.g., choose the correct course). Please email the course instructor only if you still have trouble after trying all of the points above.

Introduction

Welcome to the Computational Statistics course (MAST32001) for autumn 2021!

This course is taught by Assistant Professor Luigi Acerbi, who leads the Machine and Human Intelligence research group at the Department of Computer Science, University of Helsinki.

All teaching events of the course will be on Zoom. There will be two kinds of sessions:

  • Thursday 14:15-16:00 and Friday 14:15-16:00 are "lectures": Q&A and discussion of the pre-recorded lecture videos available on Moodle, plus an opportunity to work on example problems in small groups.
  • Monday 16:15-18:00 and Tuesday 16:15-18:00 are workshops for working on and getting help for the weekly exercise problems.

Videos and supporting material will be made available on Moodle before each lecture. Solutions for the computer tasks done in class (during the Thursday/Friday lectures) will be made available after the lecture.

Practicalities and code of conduct

Course attendance

Attendance is not mandatory for any part of the course. However, the Q&A sessions, group exercises and workshop classes have proven to be very useful in the past.

The Zoom sessions are not recorded.

Pre-recorded lectures and readings

Short pre-recorded lectures from the 2020 Computational Statistics I course (by Antti Honkela) will be made available each week.

You are expected to watch the short pre-recorded lectures before each Q&A session (each video is ~5 minutes long).
In preparation for the Q&A, you should also read the relevant chapter in the course textbook (link available in Moodle).

Weekly exercises

The course is completed based on programming tasks and exercises that are returned via Moodle.

To return your weekly exercises you must:

  1. Return the numerical answers to the questions to the Moodle quiz for automated scoring.
  2. Return the complete runnable source code for all the problems.

The source code must be a Jupyter/IPython notebook or a set of clearly labelled source files for each problem that prints the numerical solution required. Any notebooks returned must be runnable from scratch.
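For illustration only, a standalone source file for a single problem might look like the sketch below. The problem itself (a Monte Carlo estimate of π), the file name and the function name are invented for this example and do not come from the course exercises. Only the Python standard library is used, so the file should run from scratch.

```python
# problem1.py (hypothetical file name): one file per problem, runnable on its own.
import random


def estimate_pi(n_samples=100_000, seed=0):
    """Monte Carlo estimate of pi from uniform points in the unit square."""
    rng = random.Random(seed)  # fixed seed so reruns give the same number
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The fraction of points inside the quarter circle approximates pi / 4.
    return 4.0 * inside / n_samples


# Print the numerical answer that would be entered in the Moodle quiz.
print("Problem 1 answer:", estimate_pi())
```

Fixing the random seed is one way to make sure that the number printed by the code matches the answer given in the quiz when the file is rerun.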

No points will be awarded for solutions without corresponding working source code for all problems you have answered, even if the Moodle scoring indicates otherwise!

Please note that this automatic scoring is about trust: if you do not return matching source code for some of the problems you have marked as solved, you risk losing all points for that week.

Collaboration

Collaboration with other students is a good way to enhance learning: explaining things to others helps both of you understand them better. Collaboration on exercise problems is OK on the level of ideas, but each student must write their own code.

Direct copying of code from other students or from the Internet is considered plagiarism!

Home exam

Instructions for the home exam will appear on Moodle before the exam.

Getting started with Python

Software installations before the first lecture

All course sessions will include plenty of computer-based work.

Please make sure you have the necessary software installed (and working) on your laptop before joining the first Zoom session. We will not spend time troubleshooting problems with the Python installation (you can find a lot of support online should you have any trouble).

The default programming language for the course is Python. The reasons for using Python are that:

  • it is the most popular language for computational statistics, data science, and machine learning;
  • we are going to use several Python features (mainly, automatic differentiation) that are not available in other languages for data analysis (such as R).

Required Python software:

  • Python, version >=3.7
  • IPython / Jupyter notebook
  • SciPy
  • NumPy
  • Matplotlib
  • (optional) PyTorch >= 1.9 (to run later examples that require automatic differentiation)
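
As a quick way to confirm that the installation works, something like the following minimal sketch could be run in a terminal or in a notebook cell. It is not part of the official course material: the helper names (`check_packages`, `python_ok`) are made up for illustration, and the import names assume the standard distributions of the packages listed above (PyTorch is imported as `torch`).

```python
import importlib
import sys

# Import names for the packages listed above (assumed: standard distributions).
REQUIRED = ["IPython", "scipy", "numpy", "matplotlib"]
OPTIONAL = ["torch"]  # PyTorch, only needed for later examples


def python_ok(minimum=(3, 7)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


def check_packages(names):
    """Map each name to its version string, or None if it cannot be imported."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Missing or broken installation: either way it is not usable.
            found[name] = None
        else:
            found[name] = getattr(module, "__version__", "unknown")
    return found


print("Python", sys.version.split()[0], "OK" if python_ok() else "too old (need >= 3.7)")
for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
    for name, version in check_packages(names).items():
        print(f"{label:8s} {name:12s} {version or 'NOT FOUND'}")
```

The script only reports what it finds; the printed PyTorch version still has to be compared against the 1.9 minimum by eye.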

If you are not very familiar with software installation, the easiest option is to use the Anaconda distribution of Python (https://www.anaconda.com/products/individual), which is available for any operating system and includes all the packages we need (except for PyTorch).

Otherwise, on Linux, you can install Python (if it's not already installed) using your OS package manager.
Please see the installation instructions of specific packages for more help:

Python tutorials

If you do not know Python (or you are not particularly fluent in it), these are useful resources.
If you are already familiar with any other programming language, learning Python will be quite easy, especially for the relatively simple usage level required for this course.

First lecture (7 September)

The first lecture will be on Zoom on Tuesday, 7 September 2021, at 16:15-18:00.

The Zoom link will be shared beforehand in a message on Sisu.

Before the lecture, please:

  • Ensure that you have Jupyter notebook for Python installed and working on your system (see above, "Getting started with Python").
  • Watch the following short technical video on "statistical computation" by Antti Honkela: /en/unitube/video/50cba159-9bf9-4849-8a0b-189d313fc1b9
  • Read the first chapter from the course textbook: https://www.cs.helsinki.fi/u/ahonkela/teaching/compstats1/book/ (lecture notes by Antti Honkela).