Model Performance Metrics

Model	Task	Model Output	KPI	Target Value for Seed	Current Status
iGN	AI early disease warning	Alert	Sensitivity/False negative rate	0.51
iGN	Explicit case definitions	Alert	Specificity	0.60
iGN	Individual disease prediction	Outbreak Probability	Average precision for historical anthrax outbreaks in Victoria	0.65	0.69
iGN	Trader regression	Data feed	RMSE between predicted and actual futures prices	0.65	0.66

Horizon 1

Data

During our Seed Stage, iGN (ingenum Graph Network) will be primarily based on data from WorkMate. Using iLM (ingenum Language Model), clinically relevant data points will be extracted from clinical notes and invoices. iGN will probabilistically map these to diagnoses.

Our primary ForeSight customer during the Seed Stage will be the financial services industry, for the purpose of futures trading. To provide useful outputs to those customers, iGN requires a minimum sample of data to train a useful model. However, the absolute volume of data matters less than the coverage of that data. The penetration of the WorkMate product among vets is a strong proxy for the quality of the sample.

For simplicity, assume that we model the impact of animal health on commodities futures as a single variable. Two important metrics for the purposes of futures trading are:

- The maximum likelihood estimate, which represents the average burden of disease
- The tail risk, which represents the probability of a “black swan” epidemic outbreak

Futures traders will combine these metrics with other data in order to make their trades. The absolute quantity of data collected will improve the quality of the first metric up to a point of diminishing returns, but the more impactful consideration will be whether the sample is statistically representative of the population. This has both geographic and animal density dimensions.

Because it is more sensitive to sample size, the quality of the tail risk metric will continue to improve with WorkMate penetration.

Technical Explanation

We measure coverage - and by extension the quality of metric 1 - by means of the Geographically-Weighted Gini Coefficient (GWGC).

$$ \text{GWGC} = 1 - \sum_{i=1}^{n} \left[(P_i - P_{i-1}) \cdot (D_i + D_{i-1})\right] $$

$$ \begin{aligned}i &= 1, 2, \ldots, n \text{ (where $n$ is the total number of regions)} \\P_i &= \text{Cumulative Proportion of Population up to region $i$} \\D_i &= \text{Cumulative Proportion of Data up to region $i$} \\P_0 &= D_0 = 0\end{aligned} $$
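The GWGC formula above can be sketched in a few lines of Python. This is a minimal illustration, not the production implementation: the function name `gwgc` and the per-region input lists are hypothetical, and the sketch assumes that regions are ordered by records per head of population before accumulating (the usual Lorenz-curve ordering), which the definition above does not state explicitly.

```python
def gwgc(population, data):
    """Geographically-Weighted Gini Coefficient from per-region counts.

    population[i]: animal population of region i (hypothetical input)
    data[i]:       number of records collected in region i (hypothetical input)
    """
    if len(population) != len(data) or not population:
        raise ValueError("population and data must be non-empty and of equal length")
    if any(p <= 0 for p in population) or any(d < 0 for d in data):
        raise ValueError("population must be positive and data non-negative")
    total_pop, total_data = sum(population), sum(data)
    if total_data <= 0:
        raise ValueError("at least one region must contain data")

    # Assumption: sort regions by data per head of population (Lorenz ordering).
    regions = sorted(zip(population, data), key=lambda r: r[1] / r[0])

    gini = 1.0
    cum_p = cum_d = 0.0  # P_0 = D_0 = 0
    for pop, records in regions:
        next_p = cum_p + pop / total_pop      # P_i
        next_d = cum_d + records / total_data  # D_i
        gini -= (next_p - cum_p) * (next_d + cum_d)
        cum_p, cum_d = next_p, next_d
    return gini


if __name__ == "__main__":
    # Data exactly proportional to population: coefficient is 0 (up to rounding).
    print(gwgc([100, 200, 300], [10, 20, 30]))
    # All data concentrated in one of four equally sized regions.
    print(gwgc([1, 1, 1, 1], [0, 0, 0, 10]))
```

Sorting inside the function makes the result independent of the order in which regions are supplied.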

MLE

To quantify the impact of increasing the sample size $n$ and decreasing the Geographically-Weighted Gini Coefficient (GWGC) on the quality of the maximum likelihood estimate (MLE) of the Weibull distribution parameters, we need to consider the asymptotic properties of the MLE and the relationship between the GWGC and the sample's representativeness. From here on, $n$ denotes the number of observations in the sample, not the number of regions used in the GWGC definition.

Let's denote the true Weibull distribution parameters as $θ = (α, β)$, where $α$ is the shape parameter and $β$ is the scale parameter. The MLE of $θ$ based on a sample of size n is denoted as $θ̂ = (α̂, β̂)$.

Impact of increasing sample size ($n$):
- The asymptotic properties of the MLE state that as the sample size $n$ increases, the MLE $θ̂$ converges in probability to the true parameter $θ$.
- The rate of convergence is given by the square root of the sample size, i.e., $√n(θ̂ - θ)$ converges in distribution to a multivariate normal distribution with mean zero and covariance matrix equal to the inverse of the Fisher information matrix, $I(θ)^{-1}$.
- The Fisher information matrix $I(θ)$ measures the amount of information that a single observation provides about the parameters $θ$. It is defined as the negative expected value of the second partial derivatives of the log-likelihood function.
- As the sample size increases, the total information $nI(θ)$ grows in proportion to $n$, so the approximate covariance of $θ̂$, $I(θ)^{-1}/n$, shrinks and the parameter estimates become more precise.
Impact of decreasing GWGC:
- The GWGC measures the unevenness of the sample distribution across geographic regions. A lower GWGC indicates a more balanced and representative sample.
- Let's denote the GWGC as $G$, where $0 ≤ G ≤ 1$. A value of $G = 0$ indicates perfect equality (i.e., a completely balanced sample), while $G = 1$ indicates perfect inequality (i.e., a completely uneven sample).
- We can introduce a representativeness factor $R$ that is inversely related to the GWGC. For simplicity, let's define $R = 1 - G$, where $0 ≤ R ≤ 1$. A higher value of $R$ indicates a more representative sample.
- The representativeness factor $R$ affects the quality of the MLE by modifying the effective sample size. We can define the effective sample size as $n_{\text{eff}} = nR$.
- As the GWGC decreases, $R$ increases, leading to a larger effective sample size and improved quality of the MLE.

Combining the effects of sample size and GWGC, we can express the impact on the quality of the MLE using the following formula:

$$ \sqrt{n(1 - G)}\,(\hat{\theta} - \theta) \xrightarrow{d} N\left(0, I(\theta)^{-1}\right) $$

This formula indicates that as the sample size $n$ increases and the GWGC ($G$) decreases, the quality of the MLE improves. The term $√(n(1 - G))$ is the square root of the effective sample size $n(1 - G)$, which combines the actual sample size with the representativeness factor $(1 - G)$. The per-observation matrix $I(θ)^{-1}$ does not itself change; rather, the approximate covariance of $θ̂$ is $I(θ)^{-1} / (n(1 - G))$, which shrinks as the effective sample size grows, so $θ̂$ concentrates more tightly around the true parameter $θ$.
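To make the trade-off between volume and coverage concrete, the sketch below computes the effective sample size $n(1 - G)$ and the resulting $1/\sqrt{n(1 - G)}$ scaling of the asymptotic standard error. The function names and the example figures are hypothetical and chosen only for illustration; the relationship $n_{\text{eff}} = n(1 - G)$ is the simplifying definition introduced above, not an established result.

```python
import math


def effective_sample_size(n, gwgc):
    """Effective sample size n * (1 - G), following the definition in the text."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= gwgc <= 1.0:
        raise ValueError("gwgc must lie in [0, 1]")
    return n * (1.0 - gwgc)


def standard_error_scale(n, gwgc):
    """Factor 1 / sqrt(n_eff) by which the asymptotic standard errors scale."""
    n_eff = effective_sample_size(n, gwgc)
    return math.inf if n_eff == 0 else 1.0 / math.sqrt(n_eff)


if __name__ == "__main__":
    # Hypothetical scenarios: twice the records with poor coverage versus
    # half the records with good coverage give the same effective sample size.
    for n, g in [(10_000, 0.6), (5_000, 0.2)]:
        print(n, g, effective_sample_size(n, g), standard_error_scale(n, g))
```

Under this definition, halving $1 - G$ costs as much precision as halving the number of records.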

Tail Risk

To analyze the tail risk in the context of the Weibull distribution and its relationship with the sample size ($n$) and the Geographically Weighted Gini Coefficient (GWGC), we need to define a measure of tail risk. One common measure of tail risk is the Value-at-Risk (VaR) at a given confidence level.

For the Weibull distribution with shape parameter $α$ and scale parameter $β$, the VaR $V$ at confidence level $p$ is the $p$-quantile of the distribution:

$V(p) = β(-\ln(1 - p))^{1/α}$

where $0 < p < 1$, so that $1 - p$ is the probability of exceeding the VaR threshold.
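The formula for $V(p)$ can be evaluated directly. The sketch below implements it exactly as written, together with the Weibull survival function $P(X > x) = \exp(-(x/β)^{α})$, so that the exceedance probability at the returned threshold can be checked numerically. The function names and parameter values are hypothetical.

```python
import math


def weibull_var(p, shape, scale):
    """V(p) = scale * (-ln(1 - p)) ** (1 / shape), as in the formula above."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be positive")
    # log1p(-p) computes ln(1 - p) accurately when p is small.
    return scale * (-math.log1p(-p)) ** (1.0 / shape)


def weibull_survival(x, shape, scale):
    """P(X > x) for a Weibull(shape, scale) variable, with x >= 0."""
    return math.exp(-((x / scale) ** shape))


if __name__ == "__main__":
    shape, scale = 1.5, 2.0  # hypothetical parameter values
    threshold = weibull_var(0.95, shape, scale)
    # With the formula as written, the survival probability at V(p) is 1 - p.
    print(threshold, weibull_survival(threshold, shape, scale))
```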

Now, let's consider the impact of increasing the sample size ($n$) and decreasing the GWGC on the estimation of the VaR.

Impact of increasing sample size ($n$):
- As the sample size $n$ increases, the maximum likelihood estimates (MLEs) of the Weibull parameters $(α̂, β̂)$ become more precise and accurate.
- The improved estimates of $α$ and $β$ lead to a more reliable estimation of the VaR.
- With a larger sample size, the estimation of the tail risk becomes more robust, as the increased data points provide more information about the tail behaviour of the distribution.
- The standard errors of the estimated VaR decrease as the sample size increases, indicating higher confidence in the tail risk estimation.
Impact of decreasing GWGC:
- As the GWGC decreases, the sample becomes more balanced and representative of the underlying population.
- A more representative sample captures the true variability and characteristics of the population, including the tail behaviour.
- With a lower GWGC, the estimation of the Weibull parameters $(α̂, β̂)$ becomes less biased and more accurate.
- The improved parameter estimates lead to a more reliable estimation of the VaR and the tail risk.
- A lower GWGC indicates that the sample is less biased and more representative of the overall population, reducing the uncertainty in the tail risk estimation.

To quantify the impact of increasing n and decreasing GWGC on the tail risk estimation, we can combine the effects on the parameter estimates and the VaR calculation:

$V(p, n, G) = β̂(n, G) × (-\ln(1 - p))^{1/α̂(n, G)}$

where $β̂(n, G)$ and $α̂(n, G)$ are the MLEs of the Weibull parameters based on a sample of size $n$ and GWGC $G$.

As the sample size $n$ increases and the GWGC $G$ decreases, the estimates $β̂(n, G)$ and $α̂(n, G)$ become more accurate and precise. Consequently, the estimated $V(p, n, G)$ becomes a more reliable measure of the tail risk.

The rate of improvement in the tail risk estimation can be assessed by comparing the estimated VaR values for different combinations of $n$ and $G$. For example, one can calculate $V(p, n_1, G_1)$ and $V(p, n_2, G_2)$ for two scenarios with sample sizes $n_1$ and $n_2$ and GWGC values $G_1$ and $G_2$, respectively. The relative difference between the two VaR estimates indicates the impact of changing $n$ and $G$ on the tail risk estimation.
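One way to explore such a comparison is by simulation. The sketch below is an illustration under stated assumptions, not a description of how iGN estimates tail risk. It treats a sample of size $n$ with GWGC $G$ as if it were a representative sample of size $n(1 - G)$, which follows the effective sample size definition above but ignores any bias an unrepresentative sample may introduce. The "true" Weibull parameters, the scenario values, and all function names are hypothetical. For each scenario it repeatedly draws a sample, fits the Weibull parameters by maximum likelihood (bisection on the profile score equation for the shape), and reports the spread of the resulting VaR estimates.

```python
import math
import random
import statistics


def fit_weibull(sample, iterations=40):
    """Return (shape, scale) maximum likelihood estimates for positive data."""
    logs = [math.log(x) for x in sample]
    mean_log, max_log = statistics.fmean(logs), max(logs)

    def weights(k):
        # x**k rescaled by max(x)**k, so the powers cannot overflow.
        return [math.exp(k * (lx - max_log)) for lx in logs]

    def score(k):
        # Profile score equation for the shape; its root is the MLE.
        w = weights(k)
        return sum(wi * lx for wi, lx in zip(w, logs)) / sum(w) - 1.0 / k - mean_log

    lo, hi = 1e-3, 1.0
    while score(hi) < 0.0 and hi < 1e3:  # expand until the root is bracketed
        lo, hi = hi, hi * 2.0
    for _ in range(iterations):  # bisection
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = math.exp(max_log) * statistics.fmean(weights(shape)) ** (1.0 / shape)
    return shape, scale


def weibull_var(p, shape, scale):
    """V(p) = scale * (-ln(1 - p)) ** (1 / shape)."""
    return scale * (-math.log1p(-p)) ** (1.0 / shape)


def var_spread(n, gwgc, p=0.99, shape=1.5, scale=2.0, replicates=50, seed=0):
    """Standard deviation of the estimated VaR over simulated samples.

    Assumption: the sample behaves like a representative one of size n * (1 - G).
    """
    rng = random.Random(seed)
    n_eff = max(2, round(n * (1.0 - gwgc)))
    estimates = []
    for _ in range(replicates):
        # random.weibullvariate takes (scale, shape), in that order.
        sample = [rng.weibullvariate(scale, shape) for _ in range(n_eff)]
        estimates.append(weibull_var(p, *fit_weibull(sample)))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    # Same number of records, different coverage (hypothetical scenarios).
    print("G = 0.8:", var_spread(n=1000, gwgc=0.8))
    print("G = 0.2:", var_spread(n=1000, gwgc=0.2))
```

Comparing the spread across replicates, rather than two single VaR estimates, separates the effect of $n$ and $G$ from the sampling noise in any one estimate.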

The choice of confidence level also affects the tail risk estimation. A higher confidence level focuses on more extreme events in the tail of the distribution, while a lower confidence level considers a broader range of tail events. The appropriate choice depends on the specific risk management objectives and the nature of the underlying risk being assessed.