- Bayesian Kernel Methods [Morning tutorial]
- Inside WEKA -- and Beyond the Book [Afternoon tutorial]
- Introduction to Minimum Length Encoding Inference [Afternoon tutorial]

Monday, 8 July, 10am

**Bayesian Kernel Methods**

Dr. Alexander Johannes Smola, The Australian National University, Canberra, Alex.Smola@anu.edu.au

The tutorial will introduce Gaussian Processes for both classification
and regression. This includes a brief presentation of covariance functions,
their connection to Support Vector kernels, and an overview of recent
optimization methods for Gaussian Processes.
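To give a flavour of the material, here is a minimal sketch (not taken from the tutorial itself) of Gaussian Process regression with a squared-exponential covariance function, the same kernel family used with Support Vector Machines. The hyperparameter values (length scale, variance, noise level) are illustrative assumptions, not recommendations:

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0, variance=1.0):
    """Squared-exponential (RBF) covariance function for 1-D inputs."""
    sq_dists = np.subtract.outer(X1, X2) ** 2
    return variance * np.exp(-0.5 * sq_dists / length_scale ** 2)

def gp_posterior(X_train, y_train, X_test, noise=1e-2):
    """Posterior mean and variance of GP regression at the test inputs.

    Hyperparameters are fixed here for illustration; in practice they
    are adapted, e.g. by maximising the marginal likelihood.
    """
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    K_ss = rbf_kernel(X_test, X_test)
    # Cholesky factorisation for numerically stable solves.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
    return mean, var

# Noise-free sine data: the posterior mean should track sin(x) closely.
X = np.linspace(0, 2 * np.pi, 20)
y = np.sin(X)
X_star = np.array([np.pi / 2, np.pi])
mean, var = gp_posterior(X, y, X_star)
```

The Cholesky-based solve is one of the standard computational tricks the tutorial's "recent optimization methods" improve upon for larger datasets.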

Target Audience: Both novices and researchers already well versed in Gaussian Processes will benefit from the presentation. It is self-contained, requiring little more than basic calculus and linear algebra, yet advances to state-of-the-art results in optimization and adaptive inference, so the course caters for graduate students and senior researchers alike. In particular, I will not assume knowledge beyond undergraduate mathematics (see Prerequisites for further detail).

Expected Knowledge Gain: a working knowledge in Gaussian Processes which will enable the audience to apply Bayesian inference methods in their research without much further training.

Prerequisites: Nothing beyond undergraduate knowledge in mathematics is expected. More specifically, I assume:

- Basic linear algebra (matrix inverse, eigenvector, eigenvalue, etc.)
- Some numerical mathematics (beneficial but not required), such as matrix factorization, conditioning, etc.
- Basic statistics and probability theory (Normal distribution, conditional distributions).
- (OPTIONAL:) Some knowledge in Bayesian methods
- (OPTIONAL:) Some knowledge in kernel methods

Monday, 8 July, 2pm

**Inside WEKA -- and Beyond the Book**

Ian H. Witten, Computer Science, University of Waikato, NZ,
ihw@cs.waikato.ac.nz

Eibe Frank, Computer Science, University of Waikato, NZ, eibe@cs.waikato.ac.nz

Bernhard Pfahringer, Computer Science, University of Waikato, NZ,
bernhard@cs.waikato.ac.nz

Mark Hall, Computer Science, University of Waikato, NZ,
mhall@cs.waikato.ac.nz

Weka is an open-source machine learning workbench, implemented
in Java, that incorporates many popular algorithms and is widely used for
practical work in machine learning. This tutorial describes and demonstrates
the many recent developments that have been made in the Weka system.
It also looks inside Weka and sketches its inner workings for people who
want to extend it with their own machine learning implementations and make
them available to the community by contributing to this open-source project.
The goal is to empower attendees to increase the productivity of their machine
learning research and application development by making best use of the
Weka workbench, and to share their efforts with others by contributing them
to the ML community. The tutorial is aimed at people who want to
know about advanced features of Weka, and also at those who want to work
within the system at a programming level.

This tutorial is *not* intended as a comprehensive introduction to Weka: attendees are presumed to have some familiarity with it already. Neither does it reveal any secrets that are not in the current version of Weka: if you have fully explored the features in the latest distribution you do not need to attend the tutorial.

Attendees are expected to have:

1. Basic knowledge of ML algorithms and methodology
2. Some familiarity with Weka
3. Some programming experience in Java.

(The book "Data Mining" by Witten and Frank covers all three at an appropriate
level.)

There will be a 2-hour lab session after the tutorial for those who
want to follow up with some practical work. Computers (Linux) will
be available, or you can bring along your laptop (Windows or Linux) and
we will help you install the latest version of Weka from a CD-ROM.
We will provide exercises for you to work on; alternatively, you are encouraged
to bring your own data files and use Weka on them instead. Tutorial
help will be available throughout the lab session.

Tuesday, 9 July, 2pm

**Introduction to Minimum Length Encoding Inference**

Dr David Dowe, Monash University, Australia, dld@cs.monash.edu.au

The tutorial will be on Minimum Length Encoding, encompassing both Minimum Message Length (MML) and Minimum Description Length (MDL) inductive inference, topics central to the 1999 special issue of the Computer Journal on Kolmogorov complexity (vol. 42, no. 4, 1999). This information-theoretic approach bridges many fields, and is yielding state-of-the-art solutions to many problems in machine learning, statistics, econometrics and "data mining". It has applications right across the sciences.

This work is information-theoretic in nature, with a broad range of
applications in machine learning, statistics, knowledge discovery and data
mining. We discuss statistical parameter estimation and mixture modelling
(or clustering) of continuous, discrete and circular data. We also discuss
learning decision trees and decision graphs, both with standard multinomial
leaf distributions and with more complicated models. We further discuss
MML solutions to cut-point problems and polynomial regression; and,
if time permits, possibly Support Vector Machines (SVMs), causal networks
and finite state machines (Hidden Markov Models, HMMs) or other problems.
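To make the two-part message idea behind MML concrete, here is a hedged Python sketch (not from the tutorial): a message first states a model, then encodes the data under that model, and the hypothesis with the shortest total message is preferred. The fixed 8-bit parameter precision is an illustrative assumption; MML proper derives the optimal precision from the data:

```python
import math

def data_bits(data, p):
    """Second part of the message: -log2 likelihood of binary data."""
    ones = sum(data)
    zeros = len(data) - ones
    length = 0.0
    if ones:
        length -= ones * math.log2(p)
    if zeros:
        length -= zeros * math.log2(1 - p)
    return length

def fair_coin_length(data):
    """Hypothesis 1: fair coin. No parameter to state; 1 bit per flip."""
    return data_bits(data, 0.5)

def biased_coin_length(data, precision_bits=8):
    """Hypothesis 2: biased coin. First state the bias p to a fixed
    precision (the model cost), then encode the data at that value."""
    grid = 2 ** precision_bits
    ones = sum(data)
    # Round the estimate onto the stated grid; keep p strictly in (0, 1).
    p = max(1, min(grid - 1, round(grid * ones / len(data)))) / grid
    return precision_bits + data_bits(data, p)

biased_sample = [1] * 90 + [0] * 10   # strongly biased data
fair_sample = [1, 0] * 50             # balanced data
```

On the biased sample the extra model cost buys a much cheaper data encoding, so the biased-coin hypothesis wins; on the balanced sample the simpler fair-coin hypothesis gives the shorter total message, illustrating the automatic trade-off between model complexity and fit.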

The target audience is academics, machine learning and data mining
practitioners and consultants, and others with at least first-year university
education in at least one of mathematics, statistics, econometrics or
electrical engineering. The audience will learn the introductory fundamentals
of MML inference, and the many state-of-the-art successes of MML in statistics,
machine learning and hybrid problems. The audience will also see some
applications of MML to real-world data.