K-Nearest Neighbor (KNN) algorithm

KNN classification

  • Select the k samples closest to the sample you want to predict
  • Target of each neighboring sample = a class label
  • Check the classes of those neighbors and predict the majority class as the class of the new sample
  • Judge performance by accuracy: the fraction of test set samples that are classified correctly
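
The steps above can be sketched with scikit-learn's `KNeighborsClassifier`. The toy arrays and `n_neighbors=3` are assumptions made up for illustration; they are not data from these notes.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: two features per sample, two classes (0 and 1)
train_input = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                        [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])
train_target = np.array([0, 0, 0, 1, 1, 1])
test_input = np.array([[1.1, 1.0], [5.1, 5.0]])
test_target = np.array([0, 1])

kn = KNeighborsClassifier(n_neighbors=3)
kn.fit(train_input, train_target)

# Step 1: find the 3 training samples closest to the first test sample
distances, indexes = kn.kneighbors(test_input[:1])

# Steps 2-3: look up the neighbors' classes and take the majority class
neighbor_classes = train_target[indexes[0]]
majority_class = np.bincount(neighbor_classes).argmax()

# predict() should agree with the manual majority vote
prediction = kn.predict(test_input[:1])[0]

# Step 4: accuracy = fraction of test samples classified correctly
accuracy = kn.score(test_input, test_target)
print(majority_class, prediction, accuracy)
```

Calling `kneighbors()` directly is a handy way to see which training samples drive a given prediction.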

KNN regression

  • Select the k samples closest to the sample you want to predict
  • Target of each neighboring sample = an arbitrary numeric value
  • Predict the new sample's target from the neighbors' values = the mean of those values
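from sklearn.neighbors import KNeighborsRegressor

# train_input / train_target and test_input / test_target are assumed to be
# already-prepared training and test splits
knr = KNeighborsRegressor()  # default n_neighbors=5
knr.fit(train_input, train_target)

# Test set score (R^2). The closer the predictions are to the targets, the closer it is to 1
print(knr.score(test_input, test_target))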
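
To check that the prediction really is the mean of the neighbors' targets, here is a minimal sketch. The toy data and `n_neighbors=3` are assumptions for illustration, and it relies on the default uniform weighting.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical toy data: one feature, numeric target
train_input = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
train_target = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])

knr = KNeighborsRegressor(n_neighbors=3)
knr.fit(train_input, train_target)

new_sample = np.array([[2.1]])

# Indexes of the 3 training samples closest to the new sample
distances, indexes = knr.kneighbors(new_sample)

# Average the neighbors' targets by hand
manual_prediction = train_target[indexes[0]].mean()

# With the default weights='uniform', predict() returns the same mean
model_prediction = knr.predict(new_sample)[0]
print(manual_prediction, model_prediction)
```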


Coefficient of determination (R²)

$ R^{2} = 1 - \frac{\sum (target - predict)^2}{\sum (target - average)^2} $

  • The closer the predictions are to the targets, the closer the value is to 1
  • It is hard to tell intuitively how good a given R² is, so also compute the MAE (mean absolute error), which returns the average of the absolute errors between targets and predictions
from sklearn.metrics import mean_absolute_error
test_prediction = knr.predict(test_input)
mae = mean_absolute_error(test_target, test_prediction)
print(mae)
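
Both metrics can also be worked out by hand as a sanity check. The sketch below uses made-up target and prediction arrays (an assumption, not results from the model above), sums the squared differences over all samples for R², and compares against `r2_score` and `mean_absolute_error`.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score

# Hypothetical targets and predictions
target = np.array([3.0, 5.0, 7.0, 9.0])
predict = np.array([2.5, 5.5, 7.0, 8.0])

# R^2 = 1 - sum((target - predict)^2) / sum((target - average)^2)
ss_res = np.sum((target - predict) ** 2)        # 0.25 + 0.25 + 0 + 1 = 1.5
ss_tot = np.sum((target - target.mean()) ** 2)  # 9 + 1 + 1 + 9 = 20
r2_manual = 1 - ss_res / ss_tot                 # 1 - 1.5 / 20 = 0.925

# MAE = mean of the absolute errors, in the same unit as the target
mae_manual = np.mean(np.abs(target - predict))  # (0.5 + 0.5 + 0 + 1) / 4 = 0.5

print(r2_manual, r2_score(target, predict))
print(mae_manual, mean_absolute_error(target, predict))
```

Because the MAE is in the target's own unit, it reads as "predictions are off by about this much on average", which is easier to interpret than R².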

