# Big Data for Volatility vs. Trend

So different aspects of Big Data — in this case dense vs. tall — are of different value for different things.  Dense data promote accurate volatility estimation, and tall data promote accurate trend estimation.
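The claim can be illustrated with a small simulation. The sketch below is not taken from the original post: it assumes a Brownian motion with drift, $dX = \mu\,dt + \sigma\,dW$, reads "tall" as a long calendar span and "dense" as a high sampling frequency, and uses hypothetical names (`simulate_estimates`, `experiment`) and parameter values chosen only for illustration.

```python
import math
import random


def simulate_estimates(mu, sigma, span, steps_per_unit, rng):
    """Simulate one path of dX = mu dt + sigma dW (assumed model).

    Returns (trend_hat, vol_hat) computed from the sampled increments.
    """
    n = int(span * steps_per_unit)      # total number of increments observed
    dt = 1.0 / steps_per_unit           # time between observations
    total = 0.0                         # X_T - X_0
    sum_sq = 0.0                        # realized variance
    for _ in range(n):
        dx = mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        total += dx
        sum_sq += dx * dx
    trend_hat = total / span            # depends only on the endpoints
    vol_hat = math.sqrt(sum_sq / span)  # uses every increment
    return trend_hat, vol_hat


def rmse(values, truth):
    """Root mean squared error of the estimates around the true value."""
    return math.sqrt(sum((v - truth) ** 2 for v in values) / len(values))


def experiment(span, steps_per_unit, reps=200, mu=0.1, sigma=0.2, seed=0):
    """Return (trend RMSE, volatility RMSE) over `reps` simulated paths."""
    rng = random.Random(seed)
    results = [
        simulate_estimates(mu, sigma, span, steps_per_unit, rng)
        for _ in range(reps)
    ]
    trend_rmse = rmse([r[0] for r in results], mu)
    vol_rmse = rmse([r[1] for r in results], sigma)
    return trend_rmse, vol_rmse


if __name__ == "__main__":
    print("base  (span 1,  50 obs per unit):", experiment(1, 50))
    print("dense (span 1,  500 obs per unit):", experiment(1, 500))
    print("tall  (span 10, 50 obs per unit):", experiment(10, 50))
```

Under this assumed model the trend estimator only uses the endpoints of the path, so sampling more densely over the same span should leave its error essentially unchanged, while lengthening the span should shrink it. The realized-variance estimator uses every increment, so its error should fall as the sampling gets denser.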

More (No Hesitations blog)

# Very brief notes on measures: From σ-fields to Carathéodory’s Theorem

Definition 1. A ${\sigma}$-field ${\mathcal{F}}$ is a non-empty collection of subsets of the sample space ${\Omega}$ closed under the formation of complements and countable unions (or, equivalently, countable intersections; note that ${\bigcap_{i} A_i = (\bigcup_i A_i^c)^c}$). Hence ${\mathcal{F}}$ is a ${\sigma}$-field if

1. ${A^c \in \mathcal{F}}$ whenever ${A \in \mathcal{F}}$;
2. ${\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}}$ whenever ${A_i \in \mathcal{F}}$ for all ${i \geq 1}$.
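On a finite sample space the two conditions can be checked by brute force, because any countable union of sets drawn from a finite collection is already a finite union. The following is a minimal sketch under that finiteness assumption; the function name `is_sigma_field` is hypothetical.

```python
from itertools import combinations


def is_sigma_field(omega, collection):
    """Check the sigma-field axioms for a collection of subsets of a FINITE omega.

    Finiteness is an assumption of this sketch: it lets closure under
    pairwise unions stand in for closure under countable unions.
    """
    omega = frozenset(omega)
    sets = {frozenset(a) for a in collection}
    if not sets:
        return False                      # the collection must be non-empty
    for a in sets:
        if not a <= omega:
            return False                  # every member must be a subset of omega
        if omega - a not in sets:
            return False                  # closed under complements
    for a, b in combinations(sets, 2):
        if a | b not in sets:
            return False                  # closed under (finite) unions
    return True


if __name__ == "__main__":
    omega = {1, 2, 3, 4}
    print(is_sigma_field(omega, [set(), {1, 2}, {3, 4}, omega]))  # a sigma-field
    print(is_sigma_field(omega, [set(), {1}, omega]))             # {2, 3, 4} is missing
```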

Definition 2. Set functions and measures. Let ${S}$ be a set and ${\Sigma_0}$ be an algebra on ${S}$, and let ${\mu_0}$ be a non-negative set function

$\displaystyle \mu_0: \Sigma_0 \rightarrow [0, \infty]$

• ${\mu_0}$ is additive if ${\mu_0 (\varnothing) =0}$ and, for ${F,G \in \Sigma_0}$,

$\displaystyle F \cap G = \varnothing \qquad \Rightarrow \qquad \mu_0(F \cup G ) = \mu_0(F) + \mu_0(G)$

• The map ${\mu_0}$ is called countably additive (or ${\sigma}$-additive) if ${\mu_0 (\varnothing)=0}$ and whenever ${(F_n: n \in \mathbb{N})}$ is a sequence of disjoint sets in ${\Sigma_0}$ with union ${F = \bigcup_n F_n}$ in ${\Sigma_0}$, then

$\displaystyle \mu_0 (F) = \sum_{n}\mu_0 (F_n)$

• Let ${(S, \Sigma)}$ be a measurable space, so that ${\Sigma}$ is a ${\sigma}$-algebra on ${S}$.
• A map ${\mu: \Sigma \rightarrow [0,\infty]}$ is called a measure on ${(S, \Sigma)}$ if ${\mu}$ is countably additive. The triple ${(S, \Sigma, \mu)}$ is called a measure space.
• The measure ${\mu}$ is called finite if

$\displaystyle \mu(S) < \infty,$

and ${\sigma}$-finite if there is a sequence ${(S_n: n \in \mathbb{N})}$ of elements of ${\Sigma}$ such that

$\displaystyle \mu(S_n)< \infty \quad \forall n \in \mathbb{N} \qquad \text{and} \qquad \bigcup_n S_n = S.$

• Measure ${\mu}$ is called a probability measure if $\displaystyle \mu(S) = 1,$ and ${(S, \Sigma, \mu)}$ is then called a probability triple.
• An element ${F}$ of ${\Sigma}$ is called ${\mu}$-null if ${\mu(F)=0}$.
• A statement ${\mathcal{S}}$ about points ${s}$ of ${S}$ is said to hold almost everywhere (a.e.) if

$\displaystyle F \equiv \{ s: \mathcal{S}(s) \text{ is false} \} \in \Sigma \text{ and } \mu(F)=0.$
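Several of these definitions can be made concrete on a finite set. The sketch below is only an illustration: the set ${S = \{a, b, c\}}$, the point masses, and the helper `make_measure` are all hypothetical choices, with ${\Sigma}$ taken to be the collection of all subsets of ${S}$.

```python
from fractions import Fraction


def make_measure(weights):
    """Build a measure on all subsets of a finite set from point masses."""
    def mu(event):
        # The measure of an event is the sum of the masses of its points;
        # the empty event sums to zero.
        return sum((weights[s] for s in event), Fraction(0))
    return mu


# Hypothetical point masses: "c" carries no mass, so {"c"} is a null set.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
mu = make_measure(weights)

S = frozenset(weights)
F = frozenset({"a"})
G = frozenset({"b", "c"})

if __name__ == "__main__":
    print("additive on disjoint F, G:", mu(F | G) == mu(F) + mu(G))
    print("probability measure:", mu(S) == 1)
    print("{c} is mu-null:", mu(frozenset({"c"})) == 0)
```

With these masses the statement "${s \neq c}$" holds almost everywhere: the set where it fails is ${\{c\}}$, which belongs to ${\Sigma}$ and has measure zero.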


# David Aldous’ review of the Black Swan

The phrase “Black Swan” (arising earlier in the different context of Popperian falsification) is here defined as an event characterized [p. xviii] by rarity, extreme impact, and retrospective (though not prospective) predictability, and Taleb’s thesis is that such events have much greater effect, in financial markets and the broader world of human affairs, than we usually suppose. The book is challenging to review because it requires considerable effort to separate the content from the style. The style is rambling and pugnacious—well described by one reviewer as “with few exceptions, the writers and professionals Taleb describes are knaves or fools, mostly fools. His writing is full of irrelevances, asides and colloquialisms, reading like the conversation of a raconteur rather than a tightly argued thesis”. And clearly this is perfectly deliberate. Such a book invites a review that reflects the reviewer’s opinions more than is customary in the Notices. My own overall reaction is that Taleb is sensible (going on prescient) in his discussion of financial markets and in some of his general philosophical thought but tends toward irrelevance or ridiculous exaggeration otherwise.

# Rio’s Inequality

Let ${X}$ and ${Y}$ be two integrable real-valued random variables and let ${ Q_X(u) = \inf\{t: P(|X|>t) \leq u \}}$ be the quantile function of ${|X|}$. Then, if ${Q_X Q_Y}$ is integrable over ${ (0,1)}$, we have

$\displaystyle \boxed{|Cov(X,Y)| \leq 2 \int\limits_{0}^{2a} Q_X(u) Q_Y(u) du}$

where ${ a= a(\sigma(X), \sigma(Y)) = \sup\limits_{\substack{B \in \sigma(X) \\ C \in \sigma(Y)}} |Cov(\mathbb{I}_{B},\mathbb{I}_{C})|}$ is the strong mixing (${\alpha}$-mixing) coefficient between the ${\sigma}$-fields generated by ${X}$ and ${Y}$.
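A small worked example, not part of the original post, suggests that the constant in the bound cannot be improved in general. Assume ${X}$ and ${Y}$ take values in ${\{-1, 1\}}$ with ${P(X=1) = P(Y=1) = 1/2}$ and ${P(X=1, Y=1) = q}$ for some ${q \in [0, 1/2]}$ (the symbol ${q}$ is introduced here for illustration). The only non-trivial events generated by ${X}$ and ${Y}$ are ${\{X = \pm 1\}}$ and ${\{Y = \pm 1\}}$, and each pair gives a covariance of indicators equal to ${\pm (q - 1/4)}$, so

$\displaystyle a = |q - \tfrac{1}{4}|, \qquad Cov(X,Y) = \mathbb{E}[XY] = 2q - (1 - 2q) = 4q - 1.$

Since ${|X| = |Y| = 1}$, we have ${P(|X| > t) = 1}$ for ${t < 1}$ and ${0}$ for ${t \geq 1}$, hence ${Q_X(u) = Q_Y(u) = 1}$ for every ${u \in (0,1)}$, and

$\displaystyle 2 \int\limits_{0}^{2a} Q_X(u) Q_Y(u) du = 4a = |4q - 1| = |Cov(X,Y)|.$

In this example the inequality holds with equality.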

Proof: Set ${X^{+} = \sup(0,X)}$ and ${X^{-} = \sup(0,-X)}$, and define ${Y^{+}}$ and ${Y^{-}}$ in the same way. Then

$\displaystyle Cov(X,Y) = Cov(X^{+},Y^{+}) + Cov(X^{-},Y^{-}) - Cov(X^{+},Y^{-}) - Cov(X^{-},Y^{+})$

by bilinearity of the covariance, since ${ X = X^{+} - X^{-}}$ and ${ Y = Y^{+} - Y^{-}}$.

Note also that

$\displaystyle Cov(X^+,Y^+) = \int \int_{\mathbb{R}^{2}_{+}} [P(X>u, Y> \upsilon) - P(X>u)P(Y> \upsilon)]du d\upsilon$

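The integrand is the covariance of the indicators of ${\{X>u\}}$ and ${\{Y> \upsilon\}}$, so its absolute value is at most ${a}$. It is also at most ${P(X>u)}$ and at most ${P(Y> \upsilon)}$, because both ${P(X>u, Y> \upsilon)}$ and ${P(X>u)P(Y> \upsilon)}$ lie between ${0}$ and each of these two probabilities. This implies that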

$\displaystyle |Cov(X^+,Y^+)| \leq \int \int_{\mathbb{R}^{2}_{+}} \inf (a, P(X>u),P(Y> \upsilon))du d\upsilon$

# Kolmogorov’s Maximal Inequality

Let ${X_1, X_2,...,X_n}$ be independent random variables with ${ \mathbb{E}[X_i]=0, \mathbb{E}[X_i^2]< \infty }$. Set ${S_k = \sum_{i=1}^{k} X_i}$ for ${1 \leq k \leq n}$. Then ${\forall \varepsilon > 0}$

$\displaystyle \boxed{ P \left( \max_{1 \leq k \leq n} |S_k| \geq \varepsilon \right) \leq \frac{\mathbb{E}[S_n^2]}{\varepsilon^2} }$
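Before the proof, the inequality can be sanity-checked by simulation. The sketch below assumes a simple symmetric random walk (steps ${\pm 1}$ with probability ${1/2}$ each, so that ${\mathbb{E}[S_n^2] = n}$); the function names and the parameter values are hypothetical, and a Monte Carlo estimate illustrates the bound without proving anything.

```python
import random


def max_abs_partial_sum(steps):
    """Return max_k |S_k| for the partial sums S_k of `steps`."""
    s = 0.0
    m = 0.0
    for x in steps:
        s += x
        m = max(m, abs(s))
    return m


def kolmogorov_check(n=50, eps=10.0, reps=2000, seed=1):
    """Estimate P(max_k |S_k| >= eps) and return it with the bound E[S_n^2]/eps^2.

    Assumes i.i.d. +/-1 steps, which have mean 0 and variance 1,
    so E[S_n^2] = n.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        steps = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        if max_abs_partial_sum(steps) >= eps:
            hits += 1
    empirical = hits / reps
    bound = n / eps ** 2
    return empirical, bound


if __name__ == "__main__":
    empirical, bound = kolmogorov_check()
    print("empirical probability:", empirical)
    print("Kolmogorov bound:", bound)
```

With the default values the bound is ${50/10^2 = 0.5}$, and the estimated probability should fall below it. Note that the bound controls the maximum over all ${k \leq n}$ at the same cost that Chebyshev's inequality pays for ${|S_n|}$ alone.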

Proof: Let

$\displaystyle \begin{array}{rcl} A &\equiv& \{ \max_{1 \leq k \leq n} |S_k| \geq \varepsilon \} ,\\ A_k &\equiv& \{ |S_i| < \varepsilon, i=1,...,k-1,|S_k| \geq \varepsilon \}, \qquad 1 \leq k \leq n \end{array}$

Notice that ${\cup_{k=1}^{n}A_k = A}$ and