Nearest neighbour averaging


Nearest neighbour averaging is a smoothing method. It estimates the regression function at a point \(x\) by averaging the responses of the training points that fall in a neighbourhood \(\mathcal{N}(x)\) of \(x\). It works well when the dimensionality of the problem is small (\(\leq 4\)) and the training data are dense (large \(N\)). In higher dimensions the nearest data points tend to be far from the target point, so the average is no longer local; this phenomenon is called the curse of dimensionality.

\begin{equation} \hat{f}(x) = \mathrm{Ave}(Y \mid X \in \mathcal{N}(x)) \end{equation}
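
The estimator above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes that the neighbourhood \(\mathcal{N}(x)\) consists of the \(k\) training points closest to \(x\) in Euclidean distance, and the names `nn_average`, `x_train`, `y_train`, `x0` and `k` are hypothetical.

```python
import numpy as np


def nn_average(x_train, y_train, x0, k):
    """Estimate f(x0) as the mean response of the k nearest training points.

    Assumption: the neighbourhood N(x0) is defined as the k training
    points closest to x0 in Euclidean distance.
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    if x_train.ndim == 1:
        # Treat a flat array as N observations of a single feature.
        x_train = x_train[:, None]
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))

    # Distance from every training point to the query point.
    dists = np.linalg.norm(x_train - x0, axis=1)
    # Indices of the k smallest distances form the neighbourhood.
    nearest = np.argsort(dists)[:k]
    # Ave(Y | X in N(x0))
    return y_train[nearest].mean()


# Toy data: y = x^2 observed at five points.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [0.0, 1.0, 4.0, 9.0, 16.0]
estimate = nn_average(x_train, y_train, x0=2.0, k=3)
```

With this toy data the three nearest neighbours of 2.0 are the points at 1, 2 and 3, so `estimate` is the mean of 1, 4 and 9. A fixed-radius neighbourhood would be an equally valid reading of \(\mathcal{N}(x)\); the \(k\)-nearest choice is used here only because it guarantees a non-empty neighbourhood.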

1. Fails

  • This technique does not work well at the boundaries of the training data (extrapolation).
  • Typically, it is good practice to estimate the value at an unknown point using about 10% of the training data; that is, the neighbourhood should contain about 10% of the points. In high dimensions a neighbourhood that large is no longer local.
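
To see why a 10% neighbourhood stops being local as the dimension grows, the following rough sketch computes how wide a neighbourhood has to be. It rests on assumptions that are not in the note itself: the inputs are uniformly distributed in a unit hypercube, the neighbourhood is a smaller hypercube, and `p` denotes the number of dimensions. Under those assumptions, capturing a fraction \(r\) of the data requires an edge length of \(r^{1/p}\). The function name `edge_length_for_fraction` is hypothetical.

```python
def edge_length_for_fraction(fraction, p):
    """Edge length of a sub-cube holding `fraction` of the data.

    Assumption: inputs are uniform on the unit hypercube in p dimensions,
    so a sub-cube with edge e contains a fraction e**p of the points.
    """
    return fraction ** (1.0 / p)


# Edge length needed for a 10% neighbourhood in several dimensions.
for p in (1, 2, 4, 10):
    print(p, round(edge_length_for_fraction(0.10, p), 3))
```

In one dimension the neighbourhood spans 10% of the input range, but it needs roughly 32% of each axis in two dimensions, 56% in four, and 79% in ten, which is hardly a neighbourhood any more.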

Date: 2025-03-02 Sun 18:52

Author: vj

Created: 2026-03-05 Thu 07:53
