Appendix
Mathematical Notations
| Notation | Definition |
|---|---|
| $\mathbb{R}$ | Real numbers, i.e., essentially any numerical value. |
| $\mathbb{N}$ | Natural numbers, i.e., any integer greater than 0. |
| $O$ | Object space, i.e., a set of real-world objects. |
| $\phi$ | Feature map, i.e., a map that defines the values of the features for objects. |
| $\mathcal{F}$ | Feature space, i.e., the values of all features. Often $\mathbb{R}^d$, i.e., the $d$-dimensional real space. In this case there are $d \in \mathbb{N}$ features. |
| $X$ (clustering, classification, regression) | Used for the instances of objects in the feature space. Depending on the context, $X$ is a set of instances, i.e., $X = \{x_1, ..., x_n\} \subseteq \mathcal{F}$, or a random variable. In the latter case, the set consists of $n$ realizations of this random variable. |
| $Y$ (clustering, classification, regression) | Used for the value of interest, e.g., the classes for classification or the dependent variable in regression. Defined either as a set $Y= \{y_1, ..., y_n\}$ or a random variable (see $X$). |
| $I$ | Finite set of items $I = \{i_1, ..., i_m\}$. |
| $T$ | Finite set of transactions $T=\{t_1, ..., t_n\}$ where $t_i \subseteq I$ for $i=1, ..., n$. |
| $X$ (association rules) | Antecedent of an association rule. |
| $Y$ (association rules) | Consequent of an association rule. |
| $X \Rightarrow Y$ | Association rule with antecedent $X$ and consequent $Y$. |
| ${n \choose k}$ | The binomial coefficient ${n \choose k} = \frac{n!}{(n-k)!k!}$. |
| $\mathcal{P}(I)$ | The power set of a finite set $I$. |
| $\vert \cdot \vert$ | Cardinality of a set, e.g., $\vert X \vert$ for the number of elements of $X$. |
| $d(x,y)$ | Distance between two vectors $x$ and $y$, e.g., Euclidean distance, Manhattan distance, or Chebyshev distance. |
| $\arg\min_{i=1,...,k} f(i)$ | The value of $i$ for which the function $f$ is minimized. |
| $\arg\min_{i \in \{1, ..., k\}} f(i)$ | Same as $\arg\min_{i=1,...,k} f(i)$. |
| $\min_{i=1,...,k} f(i)$ | The minimum value of the function $f$ over all values of $i$. |
| $\min_{i \in \{1, ..., k\}} f(i)$ | Same as $\min_{i=1,...,k} f(i)$. |
| $\arg\max$ | See $\arg\min$. |
| $\max$ | See $\min$. |
| $\sim$ | Used to define the distribution of a random variable, e.g., $X \sim \mathcal{N}(\mu, \sigma)$ to specify that $X$ is normally distributed with mean value $\mu$ and standard deviation $\sigma$. |
| $C$ (classification) | Set of classes. |
| $C$ (clustering) | Description of a cluster. |
| $h$ | Hypothesis, concept, classifier, classification model. |
| $h^*$ | Target concept. |
| $h'_c$ | Score-based hypothesis that computes the scores for the class $c$. |
| $P(X=x)$ | Probability that the random variable $X$ is realized by the value $x$. |
| $p(x)$ | $p(x) = P(X=x)$ for a random variable $X$. |
| $P(X \vert Y)$ | Conditional probability of the random variable $X$ given the random variable $Y$. |
| $H(X)$ | Entropy of the random variable $X$. |
| $H(X \vert Y)$ | Conditional entropy of the random variable $X$ given the random variable $Y$. |
| $I(X; Y)$ | Information gain, i.e., the reduction in uncertainty about one of the random variables $X$ and $Y$ if the other is known. Also known as mutual information. |
| $e_x$ | Residual of a regression. |
| $x_t$ | Values of a time series $\{x_1, ..., x_T\} = \{x_t\}_{t=1}^T$. |
| $T_t$ | Trend term of a time series. |
| $S_t$ | Seasonal term of a time series. |
| $R_t$ | Autoregressive term of a time series. |
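
Some of the set and distance notations above can be made concrete with a few lines of code. The following is a minimal sketch using only the Python standard library. The toy data (`items`, `points`, `query`) and the helper names are assumptions made purely for illustration. The sketch shows the cardinality, the binomial coefficient, the power set, the three distances named in the table, and the difference between the minimum and the index that attains it.

```python
from itertools import chain, combinations
from math import comb, dist

# Hypothetical toy data, chosen only for illustration.
items = ["a", "b", "c"]                          # item set I with |I| = 3
points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]    # instances x_1, ..., x_n in R^2
query = (1.0, 2.0)                               # another vector in R^2


def power_set(s):
    """Return all subsets of s, i.e., P(s), as a list of tuples."""
    return list(chain.from_iterable(combinations(s, k) for k in range(len(s) + 1)))


def euclidean(x, y):
    """Euclidean distance d(x, y)."""
    return dist(x, y)


def manhattan(x, y):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def chebyshev(x, y):
    """Chebyshev distance: largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))


subsets = power_set(items)          # |P(I)| = 2^|I|
n_pairs = comb(len(items), 2)       # binomial coefficient "3 choose 2"

# f(i) = d(query, x_i): min gives the smallest distance,
# argmin gives the index i at which it is attained.
distances = [euclidean(query, p) for p in points]
min_value = min(distances)
argmin_index = min(range(len(distances)), key=lambda i: distances[i])
```

Note that Python indexes from 0, whereas the table counts from $i=1$. The index returned here therefore corresponds to $i+1$ in the notation above.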