Appendix

Mathematical Notations

Notation Definition
$\mathbb{R}$ Real space, i.e., the set of all real numbers; informally, any non-complex numerical value.
$\mathbb{N}$ Natural numbers, i.e., any integer greater than 0.
$O$ Object space, i.e., a set of real-world objects.
$\phi$ Feature map, i.e., a map that defines the values of the features for objects.
$\mathcal{F}$ Feature space, i.e., the values of all features. Often the $\mathbb{R}^d$, i.e., the $d$-dimensional real space. In this case there are $d \in \mathbb{N}$ features.
$X$ (clustering, classification, regression) Used for the instances of objects in the feature space. Depending on the context, $X$ is either a set of instances, i.e., $X = \{x_1, ..., x_n\} \subseteq \mathcal{F}$, or a random variable. In the latter case, the set consists of $n$ realizations of this random variable.
$Y$ (clustering, classification, regression) Used for the value of interest, e.g., the classes for classification or the dependent variable in regression. Defined either as a set $Y= \{y_1, ..., y_n\}$ or a random variable (see $X$).
$I$ Finite set of items $I = \{i_1, ..., i_m\}$.
$T$ Finite set of transactions $T=\{t_1, ..., t_n\}$ where $t_i \subseteq I$ for $i=1, ..., n$.
$X$ (association rules) Antecedent of an association rule.
$Y$ (association rules) Consequent of an association rule.
$X \Rightarrow Y$ Association rule where $Y$ is a consequent of $X$.
${n \choose k}$ The binomial coefficient ${n \choose k} = \frac{n!}{(n-k)!k!}$.
$\mathcal{P}(I)$ The power set of a finite set $I$.
$\vert \cdot \vert$ Cardinality of a set, e.g., $\vert X \vert$ for the number of elements of $X$.
$d(x,y)$ Distance between two vectors $x$ and $y$, e.g., Euclidean distance, Manhattan distance, or Chebyshev distance.
$argmin_{i=1,...,k} f(i)$ The value of $i$ for which the function $f$ is minimized.
$argmin_{i \in \{1, ..., k\}} f(i)$ Same as $argmin_{i=1,...,k} f(i)$.
$\min_{i=1,...,k} f(i)$ The minimal value of the function $f$ for any value of $i$.
$\min_{i \in \{1, ..., k\}} f(i)$ Same as $\min_{i=1,...,k} f(i)$.
$argmax$ See $argmin$.
$\max$ See $\min$.
$\sim$ Used to define the distribution of a random variable, e.g., $X \sim \mathcal{N}(\mu, \sigma)$ to specify that $X$ is normally distributed with mean value $\mu$ and standard deviation $\sigma$.
$C$ (classification) Set of classes.
$C$ (clustering) Description of a cluster.
$h$ Hypothesis, concept, classifier, classification model.
$h^*$ Target concept.
$h'_c$ Score-based hypothesis that computes the scores for the class $c$.
$P(X=x)$ Probability that the random variable $X$ is realized by the value $x$.
$p(x)$ Shorthand for the probability $P(X=x)$, i.e., $p(x) = P(X=x)$ for a random variable $X$.
$P(X \vert Y)$ Conditional probability of the random variable $X$ given the random variable $Y$.
$H(X)$ Entropy of the random variable $X$.
$H(X \vert Y)$ Conditional entropy of the random variable $X$ given the random variable $Y$.
$I(X; Y)$ Information gain about one of the variables $X$ and $Y$ if the other variable is known. Also known as mutual information.
$e_x$ Residual of a regression.
$x_t$ Values of a time series $\{x_1, ..., x_T\} = \{x_t\}_{t=1}^T$.
$T_t$ Trend term of a time series.
$S_t$ Seasonal term of a time series.
$R_t$ Autoregressive term of a time series.
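
Some of the notations above can be made concrete with a small sketch in plain Python. It is illustrative only and rests on assumptions that the table does not state: the helper names are hypothetical, probabilities are estimated as relative frequencies in a sample, entropy uses the logarithm with base 2, and the information gain is computed via the common identity $I(X; Y) = H(X) - H(X \vert Y)$.

```python
import math
from collections import Counter


def euclidean(x, y):
    """d(x, y) as Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """d(x, y) as Manhattan distance."""
    return sum(abs(a - b) for a, b in zip(x, y))


def chebyshev(x, y):
    """d(x, y) as Chebyshev distance."""
    return max(abs(a - b) for a, b in zip(x, y))


def entropy(values):
    """H(X), with p(x) estimated by relative frequencies (assumption: log base 2)."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(xs, ys):
    """H(X | Y) as the weighted average of H(X) within each value of Y."""
    n = len(xs)
    total = 0.0
    for y, count in Counter(ys).items():
        xs_given_y = [x for x, y_i in zip(xs, ys) if y_i == y]
        total += (count / n) * entropy(xs_given_y)
    return total


def mutual_information(xs, ys):
    """I(X; Y) = H(X) - H(X | Y)."""
    return entropy(xs) - conditional_entropy(xs, ys)


# min returns the smallest function value, argmin the index where it is attained.
f = {1: 4.0, 2: 1.5, 3: 2.0}          # hypothetical values f(1), f(2), f(3)
min_value = min(f.values())           # min_{i=1,...,3} f(i)
argmin_index = min(f, key=f.get)      # argmin_{i=1,...,3} f(i)

# Three distances between the same pair of vectors.
x, y = (0.0, 0.0), (3.0, 4.0)
distances = (euclidean(x, y), manhattan(x, y), chebyshev(x, y))

# Y determines X completely here, so knowing Y removes all uncertainty about X.
xs = ["a", "a", "b", "b"]
ys = [0, 0, 1, 1]
info_gain = mutual_information(xs, ys)
```

For the toy values, `min_value` and `argmin_index` differ (the smallest value versus the position at which it occurs), and the three distances differ for the same pair of vectors, which is why the choice of $d(x,y)$ matters for distance-based methods.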