Last active August 8, 2019 13:21
K-nearest neighbours in C# (k-NN classification)
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.4,3.7,1.5,0.2,Iris-setosa
4.8,3.4,1.6,0.2,Iris-setosa
4.8,3.0,1.4,0.1,Iris-setosa
4.3,3.0,1.1,0.1,Iris-setosa
5.8,4.0,1.2,0.2,Iris-setosa
5.7,4.4,1.5,0.4,Iris-setosa
5.4,3.9,1.3,0.4,Iris-setosa
5.1,3.5,1.4,0.3,Iris-setosa
5.7,3.8,1.7,0.3,Iris-setosa
5.1,3.8,1.5,0.3,Iris-setosa
5.4,3.4,1.7,0.2,Iris-setosa
5.1,3.7,1.5,0.4,Iris-setosa
4.6,3.6,1.0,0.2,Iris-setosa
5.1,3.3,1.7,0.5,Iris-setosa
4.8,3.4,1.9,0.2,Iris-setosa
5.0,3.0,1.6,0.2,Iris-setosa
5.0,3.4,1.6,0.4,Iris-setosa
5.2,3.5,1.5,0.2,Iris-setosa
5.2,3.4,1.4,0.2,Iris-setosa
4.7,3.2,1.6,0.2,Iris-setosa
4.8,3.1,1.6,0.2,Iris-setosa
5.4,3.4,1.5,0.4,Iris-setosa
5.2,4.1,1.5,0.1,Iris-setosa
5.5,4.2,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.0,3.2,1.2,0.2,Iris-setosa
5.5,3.5,1.3,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
4.4,3.0,1.3,0.2,Iris-setosa
5.1,3.4,1.5,0.2,Iris-setosa
5.0,3.5,1.3,0.3,Iris-setosa
4.5,2.3,1.3,0.3,Iris-setosa
4.4,3.2,1.3,0.2,Iris-setosa
5.0,3.5,1.6,0.6,Iris-setosa
5.1,3.8,1.9,0.4,Iris-setosa
4.8,3.0,1.4,0.3,Iris-setosa
5.1,3.8,1.6,0.2,Iris-setosa
4.6,3.2,1.4,0.2,Iris-setosa
5.3,3.7,1.5,0.2,Iris-setosa
5.0,3.3,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.9,3.1,4.9,1.5,Iris-versicolor
5.5,2.3,4.0,1.3,Iris-versicolor
6.5,2.8,4.6,1.5,Iris-versicolor
5.7,2.8,4.5,1.3,Iris-versicolor
6.3,3.3,4.7,1.6,Iris-versicolor
4.9,2.4,3.3,1.0,Iris-versicolor
6.6,2.9,4.6,1.3,Iris-versicolor
5.2,2.7,3.9,1.4,Iris-versicolor
5.0,2.0,3.5,1.0,Iris-versicolor
5.9,3.0,4.2,1.5,Iris-versicolor
6.0,2.2,4.0,1.0,Iris-versicolor
6.1,2.9,4.7,1.4,Iris-versicolor
5.6,2.9,3.6,1.3,Iris-versicolor
6.7,3.1,4.4,1.4,Iris-versicolor
5.6,3.0,4.5,1.5,Iris-versicolor
5.8,2.7,4.1,1.0,Iris-versicolor
6.2,2.2,4.5,1.5,Iris-versicolor
5.6,2.5,3.9,1.1,Iris-versicolor
5.9,3.2,4.8,1.8,Iris-versicolor
6.1,2.8,4.0,1.3,Iris-versicolor
6.3,2.5,4.9,1.5,Iris-versicolor
6.1,2.8,4.7,1.2,Iris-versicolor
6.4,2.9,4.3,1.3,Iris-versicolor
6.6,3.0,4.4,1.4,Iris-versicolor
6.8,2.8,4.8,1.4,Iris-versicolor
6.7,3.0,5.0,1.7,Iris-versicolor
6.0,2.9,4.5,1.5,Iris-versicolor
5.7,2.6,3.5,1.0,Iris-versicolor
5.5,2.4,3.8,1.1,Iris-versicolor
5.5,2.4,3.7,1.0,Iris-versicolor
5.8,2.7,3.9,1.2,Iris-versicolor
6.0,2.7,5.1,1.6,Iris-versicolor
5.4,3.0,4.5,1.5,Iris-versicolor
6.0,3.4,4.5,1.6,Iris-versicolor
6.7,3.1,4.7,1.5,Iris-versicolor
6.3,2.3,4.4,1.3,Iris-versicolor
5.6,3.0,4.1,1.3,Iris-versicolor
5.5,2.5,4.0,1.3,Iris-versicolor
5.5,2.6,4.4,1.2,Iris-versicolor
6.1,3.0,4.6,1.4,Iris-versicolor
5.8,2.6,4.0,1.2,Iris-versicolor
5.0,2.3,3.3,1.0,Iris-versicolor
5.6,2.7,4.2,1.3,Iris-versicolor
5.7,3.0,4.2,1.2,Iris-versicolor
5.7,2.9,4.2,1.3,Iris-versicolor
6.2,2.9,4.3,1.3,Iris-versicolor
5.1,2.5,3.0,1.1,Iris-versicolor
5.7,2.8,4.1,1.3,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
7.1,3.0,5.9,2.1,Iris-virginica
6.3,2.9,5.6,1.8,Iris-virginica
6.5,3.0,5.8,2.2,Iris-virginica
7.6,3.0,6.6,2.1,Iris-virginica
4.9,2.5,4.5,1.7,Iris-virginica
7.3,2.9,6.3,1.8,Iris-virginica
6.7,2.5,5.8,1.8,Iris-virginica
7.2,3.6,6.1,2.5,Iris-virginica
6.5,3.2,5.1,2.0,Iris-virginica
6.4,2.7,5.3,1.9,Iris-virginica
6.8,3.0,5.5,2.1,Iris-virginica
5.7,2.5,5.0,2.0,Iris-virginica
5.8,2.8,5.1,2.4,Iris-virginica
6.4,3.2,5.3,2.3,Iris-virginica
6.5,3.0,5.5,1.8,Iris-virginica
7.7,3.8,6.7,2.2,Iris-virginica
7.7,2.6,6.9,2.3,Iris-virginica
6.0,2.2,5.0,1.5,Iris-virginica
6.9,3.2,5.7,2.3,Iris-virginica
5.6,2.8,4.9,2.0,Iris-virginica
7.7,2.8,6.7,2.0,Iris-virginica
6.3,2.7,4.9,1.8,Iris-virginica
6.7,3.3,5.7,2.1,Iris-virginica
7.2,3.2,6.0,1.8,Iris-virginica
6.2,2.8,4.8,1.8,Iris-virginica
6.1,3.0,4.9,1.8,Iris-virginica
6.4,2.8,5.6,2.1,Iris-virginica
7.2,3.0,5.8,1.6,Iris-virginica
7.4,2.8,6.1,1.9,Iris-virginica
7.9,3.8,6.4,2.0,Iris-virginica
6.4,2.8,5.6,2.2,Iris-virginica
6.3,2.8,5.1,1.5,Iris-virginica
6.1,2.6,5.6,1.4,Iris-virginica
7.7,3.0,6.1,2.3,Iris-virginica
6.3,3.4,5.6,2.4,Iris-virginica
6.4,3.1,5.5,1.8,Iris-virginica
6.0,3.0,4.8,1.8,Iris-virginica
6.9,3.1,5.4,2.1,Iris-virginica
6.7,3.1,5.6,2.4,Iris-virginica
6.9,3.1,5.1,2.3,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
6.8,3.2,5.9,2.3,Iris-virginica
6.7,3.3,5.7,2.5,Iris-virginica
6.7,3.0,5.2,2.3,Iris-virginica
6.3,2.5,5.0,1.9,Iris-virginica
6.5,3.0,5.2,2.0,Iris-virginica
6.2,3.4,5.4,2.3,Iris-virginica
5.9,3.0,5.1,1.8,Iris-virginica
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

namespace knn
{
    class MainClass
    {
        // One data row: numeric attributes followed by the class label.
        public struct Record
        {
            public double[] attributes;
            public string category;

            public Record(string[] fields)
            {
                // Parse with the invariant culture so "5.1" is read the same way
                // regardless of the machine's regional settings.
                attributes = fields.Take(fields.Length - 1)
                                   .Select(field => Double.Parse(field, CultureInfo.InvariantCulture))
                                   .ToArray();
                category = fields.Last();
            }
        }

        // Rescales every attribute to the [0, 1] range (min-max normalization), in place.
        public static void NormalizeAttributes(Record[] records)
        {
            int numberOfAttributes = records.First().attributes.Length;
            for (int i = 0; i < numberOfAttributes; i++)
            {
                double attrMin = records.Min(record => record.attributes[i]);
                double attrMax = records.Max(record => record.attributes[i]);
                double range = attrMax - attrMin;
                foreach (Record record in records)
                {
                    // A constant attribute has zero range; map it to 0 instead of dividing by zero.
                    record.attributes[i] = range == 0.0 ? 0.0 : (record.attributes[i] - attrMin) / range;
                }
            }
        }

        public static void PrintRecords(Record[] records)
        {
            Console.WriteLine("Records: ");
            foreach (Record r in records)
            {
                Console.WriteLine("Record: category=" + r.category + " attributes=[" + string.Join(",", r.attributes) + "]");
            }
        }

        // Leave-one-out test: classifies each record using all the other records
        // as the training set and returns the percentage of correct predictions.
        public static double TestKNN(int k, Func<double[], double[], double> distanceFunc, Record[] records)
        {
            int hits = 0;
            for (int i = 0; i < records.Length; i++)
            {
                Record record = records[i];
                // Exclude the tested record by position, not by value comparison.
                Record[] trainingSet = records.Where((r, index) => index != i).ToArray();
                string outputCategory = KNN(k, distanceFunc, trainingSet, record.attributes);
                if (outputCategory.Equals(record.category))
                {
                    hits++;
                }
            }
            return (double) hits / records.Length * 100.0;
        }

        // Returns the most common category among the k training samples nearest to the input.
        public static string KNN(int k, Func<double[], double[], double> distanceFunc, Record[] trainingSet, double[] input)
        {
            Record[] nearestNeighbours =
                trainingSet.Select(sample => new Tuple<Record, double>(sample, distanceFunc(sample.attributes, input)))
                           .OrderBy(tuple => tuple.Item2)
                           .Take(k)
                           .Select(tuple => tuple.Item1)
                           .ToArray();

            // OrderByDescending is a stable sort, so when two categories are tied
            // the one whose first neighbour is closer to the input wins.
            string mostCommonCategory = nearestNeighbours.Select(neighbour => neighbour.category)
                                                         .GroupBy(v => v)
                                                         .OrderByDescending(g => g.Count())
                                                         .First()
                                                         .Key;
            return mostCommonCategory;
        }

        // Euclidean (L2) distance: square root of the sum of squared differences.
        public static double DistanceEuclid(double[] v1, double[] v2)
        {
            double sum = 0.0;
            for (int i = 0; i < v1.Length; i++)
            {
                sum += Math.Pow(v1[i] - v2[i], 2);
            }
            return Math.Sqrt(sum);
        }

        // Chebyshev (L-infinity) distance: the largest absolute difference along any attribute.
        public static double DistanceChebyshev(double[] v1, double[] v2)
        {
            List<double> differences = new List<double>();
            for (int i = 0; i < v1.Length; i++)
            {
                differences.Add(Math.Abs(v1[i] - v2[i]));
            }
            return differences.Max();
        }

        public static void Main(string[] args)
        {
            // Skip blank lines (e.g. a trailing newline) so they are not parsed as records.
            string[] rows = File.ReadAllLines("iris.txt")
                                .Where(row => !string.IsNullOrWhiteSpace(row))
                                .ToArray();
            Record[] records = rows.Select(row => row.Split(','))
                                   .Select(fields => new Record(fields))
                                   .ToArray();
            NormalizeAttributes(records);

            // Each training set holds records.Length - 1 samples, so the last value
            // of k simply uses every available neighbour.
            Console.WriteLine("Euclid distance:");
            for (int k = 1; k <= records.Length; k++)
            {
                double hitsPercent = TestKNN(k, DistanceEuclid, records);
                Console.WriteLine("Percent of hits for k=" + k + ": " + hitsPercent + "%");
            }

            Console.WriteLine("Chebyshev distance:");
            for (int k = 1; k <= records.Length; k++)
            {
                double hitsPercent = TestKNN(k, DistanceChebyshev, records);
                Console.WriteLine("Percent of hits for k=" + k + ": " + hitsPercent + "%");
            }
        }
    }
}