Machine learning using k-Nearest Neighbors (kNN)

The kNN algorithm applies the nearest neighbors approach to classification. Let's look at the strengths and weaknesses of this algorithm:

Strengths:

•     Simple and effective
•     Makes no assumptions about the underlying data distribution
•     Fast training phase

Weaknesses:

•     Does not produce a model, which limits insight into the relationships among features
•     Slow classification phase
•     Requires a large amount of memory
•     Nominal features and missing data require additional processing

The kNN algorithm begins with a training dataset made up of examples that are classified into several categories, as labeled by a nominal variable. Assume that we have a test dataset containing unlabeled examples that otherwise have the same features as the training data. For each record in the test dataset, kNN identifies k records in the training data that are the "nearest" in similarity, where k is an integer specified in advance. The unlabeled test instance is assigned the class of the majority of the k nearest neighbors.
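The distance-and-vote procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation; the function name and the toy two-cluster dataset are invented for the example, and Euclidean distance is assumed as the similarity measure:

```python
import math
from collections import Counter

def knn_classify(test_point, train_points, train_labels, k=3):
    """Assign test_point the majority class of its k nearest training points."""
    # Compute the Euclidean distance from the test point to every training example
    distances = [
        (math.dist(test_point, point), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance and keep the k closest neighbors
    distances.sort(key=lambda pair: pair[0])
    nearest = distances[:k]
    # Majority vote among the k nearest neighbors
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters labeled "A" and "B"
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
train_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_classify((1.1, 1.0), train_points, train_labels, k=3))  # A
print(knn_classify((5.1, 5.0), train_points, train_labels, k=3))  # B
```

Note that all the work happens at classification time: "training" is just storing the examples, which is why kNN trains quickly but classifies slowly.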
