
17. Other Classic ML Models (2)

Written 2026. 6. 12. · Modified 2026. 6. 12.

Support Vector Machines

Support Vector Machines (SVMs)

  • In the early 2000s, SVMs were the most popular model class for "off-the-shelf" supervised learning (used without prior knowledge of the domain)
  • That position has since been taken over by deep learning networks and random forests, but SVMs retain three attractive properties
    1. SVMs construct a maximum margin separator (a decision boundary with the largest possible distance to the example points). This helps them generalize well
    2. SVMs create a linear separating hyperplane, but they can embed the data into a higher-dimensional space using the kernel trick. Data that are not linearly separable in the original input space are often easily separable in the higher-dimensional space
    3. SVMs are nonparametric. The separating hyperplane is defined by a set of example points, not by a collection of parameter values. A nearest-neighbor model must retain all the examples, whereas an SVM keeps only the examples closest to the separating plane (usually only a small constant times the number of dimensions). SVMs thus combine the advantages of nonparametric and parametric models: they have the flexibility to represent complex functions, yet they are resistant to overfitting

Properties of SVMs

  • SVMs construct a maximum margin separator
  • The margin is the width of the area bounded by the dashed lines in the figure (twice the distance from the separator to the nearest example point)
  • SVMs can embed the data into a higher-dimensional space (the kernel trick)
  • SVMs are nonparametric. The separating hyperplane is defined by a set of example points, not by a collection of parameter values
  • Logistic regression finds some separating line, and the exact location of that line depends on all the example points
  • The key insight of SVMs is that some examples (the support vectors) are more important than others, and that paying attention to them can lead to better generalization
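The embedding idea and the margin definition above can be sketched in plain Python. This is only an illustration under stated assumptions: the 1-D data set, the feature map `phi`, and the helper names are hypothetical, and the separator is found by inspection along one coordinate rather than by an actual SVM solver.

```python
# Hypothetical toy data: points on a line, labeled +1 when |x| > 1, else -1.
xs = [-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0]
ys = [1, 1, -1, -1, -1, 1, 1]


def separable_by_threshold(values, labels):
    """Return True if a single threshold on `values` splits the two classes."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == -1]
    return max(neg) < min(pos) or max(pos) < min(neg)


def phi(x):
    """Explicit feature map into 2-D: x -> (x, x^2)."""
    return (x, x * x)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def kernel(x, z):
    """Computes dot(phi(x), phi(z)) without building phi explicitly."""
    return x * z + (x * z) ** 2


# In the input space no threshold works: positives lie on both sides.
print(separable_by_threshold(xs, ys))   # False

# In the feature space the x^2 coordinate alone separates the classes.
x2 = [phi(x)[1] for x in xs]
print(separable_by_threshold(x2, ys))   # True

# Widest-margin threshold along x^2: midway between the closest points.
closest_neg = max(v for v, y in zip(x2, ys) if y == -1)
closest_pos = min(v for v, y in zip(x2, ys) if y == 1)
threshold = (closest_neg + closest_pos) / 2
margin = closest_pos - closest_neg       # twice the distance to the nearest point
print(threshold, margin)                 # 2.125 3.75

# The kernel agrees with the explicit inner product on every pair.
kernel_matches = all(
    abs(kernel(x, z) - dot(phi(x), phi(z))) < 1e-9 for x in xs for z in xs
)
print(kernel_matches)                    # True
```

Only the points nearest the threshold (here `x = ±0.5` and `x = ±2.0`) determine where it sits, which is the role support vectors play in a real SVM.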

Implementing ML Models in Python

Scikit-learn

  • https://scikit-learn.org/

K-NNs in Scikit-learn

  • https://scikit-learn.org/stable/auto_examples/neighbors/plot_classification.html#sphx-glr-auto-examples-neighbors-plot-classification-py

Clustering in Scikit-learn


(Gaussian) Naïve Bayes in Scikit-learn

  • https://scikit-learn.org/stable/modules/naive_bayes.html

GaussianNB implements the Gaussian Naive Bayes algorithm for classification. The likelihood of the features is assumed to be Gaussian:

$$P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma^2_y}} \exp\left(-\frac{(x_i - \mu_y)^2}{2\sigma^2_y}\right)$$

The parameters $\sigma_y$ and $\mu_y$ are estimated using maximum likelihood.

>>> from sklearn.datasets import load_iris
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.naive_bayes import GaussianNB
>>> X, y = load_iris(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
>>> gnb = GaussianNB()
>>> y_pred = gnb.fit(X_train, y_train).predict(X_test)
>>> print("Number of mislabeled points out of a total %d points : %d"
...       % (X_test.shape[0], (y_test != y_pred).sum()))
Number of mislabeled points out of a total 75 points : 4
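To make the Gaussian likelihood and its maximum-likelihood parameters concrete, here is a minimal hand-rolled sketch for a single feature. The toy data and the function names (`fit_gaussian`, `gaussian_likelihood`, `predict`) are hypothetical. It is not scikit-learn's implementation, which among other things adds a small variance-smoothing term.

```python
import math


def fit_gaussian(values):
    """Maximum-likelihood estimates (mean, variance) for one feature in one class."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n   # MLE divides by n, not n - 1
    return mu, var


def gaussian_likelihood(x, mu, var):
    """P(x_i | y) under the Gaussian assumption."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


# Hypothetical toy data: one feature, two classes.
feature_by_class = {
    "a": [1.0, 2.0, 3.0],
    "b": [6.0, 8.0, 10.0],
}
params = {label: fit_gaussian(vals) for label, vals in feature_by_class.items()}


def predict(x):
    """Pick the class with the largest prior * likelihood."""
    total = sum(len(v) for v in feature_by_class.values())
    scores = {
        label: (len(feature_by_class[label]) / total) * gaussian_likelihood(x, mu, var)
        for label, (mu, var) in params.items()
    }
    return max(scores, key=scores.get)


print(params["a"][0], params["b"][0])   # 2.0 8.0
print(predict(2.5), predict(7.0))       # a b
```

With several features, the naive Bayes assumption multiplies one such likelihood per feature for each class.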

Decision Trees in Scikit-learn

  • https://scikit-learn.org/stable/auto_examples/tree/plot_iris_dtc.html

SVMs in Scikit-learn

  • https://scikit-learn.org/stable/auto_examples/svm/plot_separating_hyperplane.html#sphx-glr-auto-examples-svm-plot-separating-hyperplane-py

Neural Networks and Deep Learning (1)

Introduction to Neural Networks and Deep Learning

AI, ML, and DL (The Slide From the First Lecture)

  ┌─────────────────────┐
┌─┼──────────┐ Machine  │
│ │  AI      │ Learning │
│ │ ┌────────┼───┐      │
│ │ │        │   │      │
│ │ │  Deep  │   │      │
│ │ │  Learning  │      │
│ │ │        │   │      │
│ │ └────────┼───┘      │
│ └──────────┼──────────┘
└────────────┘
  • Artificial Intelligence (AI), Machine Learning (ML), Deep Learning (DL)
  • The media and the public often use these terms interchangeably, without much care
  • The unprecedented success of statistical machine learning in the 2010s eclipsed all other approaches to AI, and some sources (especially in the business world) use the term "artificial intelligence" to mean "machine learning with neural networks"
  • The hierarchy of AI, ML, and DL
  • ML is usually regarded as a subset of AI, and DL is a specific part of ML
  • However, whether ML is entirely contained within AI is still debated

Deep Learning

  • Deep learning is a broad family of techniques for machine learning in which hypotheses take the form of complex algebraic circuits with tunable connection strengths
  • The word "deep" refers to the fact that the circuits are typically organized into many layers, so that computation paths from inputs to outputs have many steps
  • Deep learning is currently the most widely used approach for applications such as visual object recognition, machine translation, speech recognition, speech synthesis, and image synthesis
  • Computer vision
  • Natural language processing
  • Speech & Music Processing
  • Reinforcement learning
  • Medicine, law, ...
  • Physical AI & Robotics

Neural Networks

  • Deep learning has its origins in early work that tried to model networks of neurons in the brain as computational circuits
  • For this reason, networks trained by deep learning methods are often called neural networks (even though the resemblance to real neural cells and structures is superficial)

Why is Deep Learning Successful?

  • The true reasons for the success of deep learning are not yet fully understood, but it has clear advantages over other methods
  • Linear/logistic regression can handle a large number of input variables, but the computation path from each input to the output is very short (multiplication by a single weight, then addition into the aggregate output)
  • Moreover, the different input variables contribute independently to the output, without interacting with each other (panel (a)). This significantly limits the expressive power of such models
  • Most real-world concepts are far more complex, but these models can represent only linear functions and boundaries in the input space
  • Decision trees allow long computation paths that can depend on many input variables
  • However, this holds only for a relatively small fraction of the possible input vectors (panel (b)). If a decision tree has long computation paths for a significant fraction of the possible inputs, it must be exponentially large in the number of input variables
  • The basic idea of deep learning is to train circuits such that the computation paths are long
  • This allows all the input variables to interact in complex ways (panel (c))
  • These circuit models turn out to be expressive enough to capture the complexity of real-world data for many important kinds of learning problems
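The contrast between independent contributions and interacting inputs can be illustrated with XOR, an example chosen here for illustration and not taken from the notes. In the sketch below, the weights of `tiny_network` are hand-picked rather than learned, and all names are hypothetical.

```python
def relu(z):
    return max(0.0, z)


def linear_model(x1, x2, w1, w2, b):
    """Each input is multiplied by one weight and added into the output."""
    return w1 * x1 + w2 * x2 + b


def tiny_network(x1, x2):
    """Two hidden ReLU units followed by a linear output (hand-picked weights)."""
    h1 = relu(x1 + x2)
    h2 = relu(x1 + x2 - 1.0)
    return h1 - 2.0 * h2


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_outputs = [tiny_network(x1, x2) for x1, x2 in inputs]
print(xor_outputs)   # [0.0, 1.0, 1.0, 0.0]

# For any linear model, f(0,0) + f(1,1) == f(0,1) + f(1,0), because the inputs
# contribute independently. XOR would need 0 + 0 == 1 + 1, so no choice of
# weights can represent it. The identity is checked for a few weight settings.
for w1, w2, b in [(1.0, 1.0, 0.0), (2.0, -3.0, 0.5), (-1.5, 0.25, 4.0)]:
    lhs = linear_model(0, 0, w1, w2, b) + linear_model(1, 1, w1, w2, b)
    rhs = linear_model(0, 1, w1, w2, b) + linear_model(1, 0, w1, w2, b)
    assert abs(lhs - rhs) < 1e-12
```

The hidden units let `x1` and `x2` interact before reaching the output, which is what the single-weight path of a linear model cannot do.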

Simple Feedforward Networks

Feedforward Networks

  • (Deep) feedforward (neural) networks (or multi-layer perceptrons (MLPs)) have connections in one direction only
  • They form a directed acyclic graph with designated input and output nodes
  • Each node computes a function of its inputs and passes the result to its successors in the network
  • Information flows through the network from the input nodes to the output nodes, and there are no loops
  • A recurrent network feeds its intermediate or final outputs back into its own inputs
  • This means that the signal values within the network form a dynamical system that has internal state or memory
  • In neural networks, input values are typically continuous, and nodes take continuous inputs and produce continuous outputs
  • Some of the inputs to nodes are parameters of the network
  • The network learns by adjusting the values of these parameters so that the network as a whole fits the training data
  • The goal of a feedforward network is to approximate some function $f^*$
  • Example: for a classifier, $y = f^*(\mathbf{x})$ maps an input $\mathbf{x}$ to a category $y$
  • A feedforward network defines a mapping $y = f(\mathbf{x};~\boldsymbol{\theta})$ and learns the value of the parameters $\boldsymbol{\theta}$ that results in the best function approximation
  • Feedforward networks are called networks because they are typically represented by composing together many different functions
  • Example: three functions $f^{(1)}$, $f^{(2)}$, $f^{(3)}$ connected in a chain form $f(\mathbf{x}) = f^{(3)}(f^{(2)}(f^{(1)}(\mathbf{x})))$
  • In this case $f^{(1)}$ is called the first layer of the network, and $f^{(2)}$ the second layer
  • The overall length of the chain gives the depth of the model. The name "deep learning" comes from this terminology
  • The final layer of a feedforward network is called the output layer
  • The layers between the input layer and the output layer are called hidden layers
    • Each hidden layer of the network is typically vector-valued
    • The dimensionality of these hidden layers determines the width of the model
  • Each element of the vector can be interpreted as playing a role analogous to a neuron
  • A layer can also be thought of as consisting of many units that act in parallel, each representing a vector-to-scalar function
  • Each unit resembles a neuron in the sense that it receives input from many other units and computes its own activation value
  • However, the goal of neural networks is not to model the brain perfectly
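The chain structure described above, with depth as the number of composed layers and width as the dimensionality of a hidden layer, can be sketched as follows. The `dense` helper, the weights, and the layer sizes are hypothetical values picked for illustration, not learned parameters.

```python
def relu(z):
    return max(0.0, z)


def identity(z):
    return z


def dense(weights, biases, activation):
    """Build one layer: a function mapping an input vector to an output vector."""
    def layer(x):
        return [
            activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)
        ]
    return layer


# Hypothetical parameters (theta). Each row holds the weights of one unit.
f1 = dense([[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]], [0.0, 0.0, 0.0], relu)  # 2 -> 3
f2 = dense([[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]], [0.0, 1.0], relu)          # 3 -> 2
f3 = dense([[1.0, -1.0]], [0.0], identity)                                 # 2 -> 1


def f(x):
    """f(x) = f3(f2(f1(x))): the layers are composed in a chain."""
    return f3(f2(f1(x)))


layers = [f1, f2, f3]
depth = len(layers)               # length of the chain
hidden = f1([1.0, 2.0])           # first hidden layer, width 3
print(hidden)                     # [0.0, 1.5, 3.0]
print(f([1.0, 2.0]), depth)       # [4.5] 3
```

Each entry of `hidden` is the activation of one unit, so a layer is a set of vector-to-scalar functions evaluated in parallel.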

Computation of Feedforward Networks

  • Each node within a network is called a unit (or perceptron)
    • Traditionally, a unit (1) computes the weighted sum of the inputs from its predecessor nodes and (2) applies a nonlinear function to produce its output
    • Let $a_j$ denote the output of unit $j$ and $w_{i,j}$ the weight attached to the link from unit $i$ to unit $j$:

    $$a_j = g_j\left(\sum_i w_{i,j} a_i\right) \equiv g_j(in_j)$$

    • Here $g_j$ is the nonlinear activation function associated with unit $j$, and $in_j$ is the weighted sum of the inputs to unit $j$
    • The preceding equation can be written in vector form

    $$\mathbf{a} = \mathbf{g}(\mathbf{W}\mathbf{x})$$

  • The fact that the activation function is nonlinear is important
    • If it were not, any composition of units would still represent a linear function (with $\mathbf{W} = \mathbf{W}_1\mathbf{W}_2$, we have $\mathbf{W}\mathbf{x} = \mathbf{W}_1\mathbf{W}_2\mathbf{x}$, so two linear layers collapse into one)
    • The nonlinearity is what allows sufficiently large networks of units to represent arbitrary functions
  • Universal approximation theorem
    • States that a network with just two layers of computational units (the first nonlinear, the second linear) can approximate any continuous function to an arbitrary degree of accuracy, given enough units in the first layer
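The unit computation and the linear-collapse argument above can be checked numerically with a small sketch. The matrices and the input vector are hypothetical. Note the indexing assumption: `W[j][i]` stores $w_{i,j}$, so that row $j$ of `W` produces $in_j$.

```python
import math


def matvec(W, x):
    """Weighted sums: one entry per unit (one row of W per unit)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matmul(A, B):
    """Plain matrix product A @ B for lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights; W[j][i] is the weight on the link from unit i to unit j.
W = [[0.5, -1.0], [2.0, 1.0]]
x = [1.0, 2.0]

in_j = matvec(W, x)                  # weighted sums in_j
a = [sigmoid(v) for v in in_j]       # a = g(Wx), with g applied elementwise
print(in_j)                          # [-1.5, 4.0]

# Without a nonlinearity, two layers equal one layer with W = W1 W2.
W1 = [[1.0, 2.0], [0.0, 1.0]]
W2 = [[3.0, 0.0], [1.0, -1.0]]
two_layers = matvec(W1, matvec(W2, x))
one_layer = matvec(matmul(W1, W2), x)
print(two_layers, one_layer)         # [1.0, -1.0] [1.0, -1.0]
```

Inserting `sigmoid` (or any other nonlinear function) between the two matrix products breaks this equality, which is why depth only adds expressive power when the activations are nonlinear.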

Activation Functions

  • Many different activation functions are used
  • Logistic or sigmoid function (also used in logistic regression):

$$\sigma(x) = 1/(1 + e^{-x})$$

  • ReLU function (short for Rectified Linear Unit):

$$\text{ReLU}(x) = \max(0,~x)$$

  • Softplus function (a smooth version of the ReLU function)
  • Tanh function:

$$\tanh(x) = \frac{e^{2x} - 1}{e^{2x} + 1}$$
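The four activation functions can be written directly from their formulas. This is a minimal sketch using only the standard library. The notes do not give a formula for softplus, so the usual definition $\log(1 + e^x)$ is assumed here. These naive forms can overflow for inputs of very large magnitude.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


def softplus(x):
    # Assumed definition: log(1 + e^x), a smooth approximation of ReLU.
    return math.log(1.0 + math.exp(x))


def tanh(x):
    e2x = math.exp(2.0 * x)
    return (e2x - 1.0) / (e2x + 1.0)


for x in (-2.0, 0.0, 2.0):
    print(x, sigmoid(x), relu(x), softplus(x), tanh(x))

# tanh is a rescaled and shifted sigmoid: tanh(x) = 2 * sigmoid(2x) - 1.
tanh_identity_gap = abs(tanh(0.7) - (2.0 * sigmoid(1.4) - 1.0))
```

Sigmoid maps to the range (0, 1) and tanh to (-1, 1), while ReLU and softplus are unbounded above. Softplus approaches ReLU as `x` grows.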

Last modified: 26. 6. 12. 3:28 PM
Contributors: kmbzn, Claude Sonnet 4.6

© 2026 kmbzn · MIT License