9. Probabilistic Reasoning (3)

Approximate Inference

  • Monte Carlo algorithms provide approximate answers whose accuracy depends on the number of samples generated
    • Monte Carlo algorithms are used in many branches of science to estimate quantities that are difficult to compute exactly
  • They work by generating random events based on the probabilities in the Bayes net and counting the various answers found in those random events
    • With enough samples, the true probability distribution can be recovered to arbitrary accuracy
    • Our interest is in sampling applied to computing posterior probabilities in a Bayes net, such as $P(X|e)$

Direct Sampling Methods

  • The basic element of every sampling algorithm is the generation of samples from a known probability distribution
  • Variables are sampled one at a time, in topological order
    • The probability distribution from which a value is sampled is conditioned on the values already assigned to the variable's parents
  • Pseudocode (a Python sketch of PRIOR-SAMPLE is given below)
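The pseudocode figure itself is not reproduced on this page, so here is a minimal Python sketch of PRIOR-SAMPLE. The network encoding, the `SPRINKLER_NET` CPT values (chosen to match the Lawn Condition example further down), and the helper names are illustrative assumptions, not part of the original notes.

```python
import random

# Sprinkler network assumed to match the Lawn Condition example below.
# Each CPT maps a tuple of parent values to P(variable = True).
SPRINKLER_NET = {
    "Cloudy":    {"parents": [], "cpt": {(): 0.5}},
    "Sprinkler": {"parents": ["Cloudy"], "cpt": {(True,): 0.1, (False,): 0.5}},
    "Rain":      {"parents": ["Cloudy"], "cpt": {(True,): 0.8, (False,): 0.2}},
    "WetGrass":  {"parents": ["Sprinkler", "Rain"],
                  "cpt": {(True, True): 0.99, (True, False): 0.90,
                          (False, True): 0.90, (False, False): 0.0}},
}
ORDER = ["Cloudy", "Sprinkler", "Rain", "WetGrass"]  # topological order

def prior_sample(bn, order=ORDER):
    """Sample one complete event from the joint distribution defined by bn."""
    event = {}
    for var in order:
        node = bn[var]
        parent_vals = tuple(event[p] for p in node["parents"])
        # The sampling distribution is conditioned on the already-sampled parents
        event[var] = random.random() < node["cpt"][parent_vals]
    return event
```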

Notations

  • $S_{PS}(x_1, \ldots, x_n)$ is the probability that a specific event is generated by the PRIOR-SAMPLE algorithm
    • $S_{PS}(x_1, \ldots, x_n) = \prod_{i=1}^n P(x_i \mid \text{parents}(X_i))$
    • Each sampling step depends only on the parent values
    • This is exactly the probability of the event according to the Bayes net: $S_{PS}(x_1, \ldots, x_n) = P(x_1, \ldots, x_n)$
  • Suppose $N$ samples are generated in total
    • $N_{PS}(x_1, \ldots, x_n)$ is the number of times the specific event $x_1, \ldots, x_n$ occurs in the set of samples
  • This count, taken as a fraction of the total number of samples, converges to its expected value under the sampling probability
    • $\lim_{N \to \infty} \frac{N_{PS}(x_1, \ldots, x_n)}{N} = S_{PS}(x_1, \ldots, x_n) = P(x_1, \ldots, x_n)$
  • The approximate equality ("$\approx$") indicates that the estimated probability becomes exact in the large-sample limit
    • Such an estimate is called consistent
    • Example: a consistent estimate of the probability of a partially specified event $x_1, \ldots, x_m$, where $m \le n$, can be produced as follows
      • $P(x_1, \ldots, x_m) \approx N_{PS}(x_1, \ldots, x_m) / N$
  • $\hat{P}$ ("P-hat") denotes an estimated probability
    • Example: if 1,000 samples are generated and 511 of them have Rain $= true$, then $\hat{P}(\text{Rain} = true) = 0.511$ (see the counting sketch below)
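A minimal sketch of this counting estimate, reusing `prior_sample` and `SPRINKLER_NET` from the sketch above (the helper name and sample count are arbitrary):

```python
def estimate_prob(event_fn, bn, n=1000):
    """Estimate P(event) as the fraction of n prior samples satisfying event_fn."""
    hits = sum(event_fn(prior_sample(bn)) for _ in range(n))
    return hits / n

# P-hat(Rain = true); the estimate is consistent, converging as n grows
print(estimate_prob(lambda e: e["Rain"], SPRINKLER_NET, n=1000))
```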

Rejection Sampling

  • Rejection sampling is a general method for producing samples from a hard-to-sample distribution, given a distribution that is easy to sample from
    • It can be used to compute the conditional probability $P(X|e)$
  • How it works
    1. First, generate samples from the prior distribution specified by the network
    2. Then, reject all samples that do not match the evidence $e$
    3. Finally, the estimate $\hat{P}(X=x \mid e)$ is obtained by counting how often $X=x$ occurs in the remaining samples
  • The estimated distribution $\hat{P}(X|e)$ is computed by normalizing $N_{PS}(X, e)$, the vector of counts over each value of $X$ among the samples that agree with the evidence $e$
    • $\hat{P}(X|e) = \alpha N_{PS}(X, e) = \frac{N_{PS}(X, e)}{N_{PS}(e)}$
    • From $P(x_1, \ldots, x_m) \approx N_{PS}(x_1, \ldots, x_m) / N$, it follows that
      • $\hat{P}(X|e) \approx \frac{P(X, e)}{P(e)} = P(X|e)$
  • Rejection sampling algorithm (a Python sketch follows the pseudocode)
function REJECTION-SAMPLING(X, e, bn, N) returns an estimate of P(X | e)
inputs: X, the query variable
        e, observed values for variables E
        bn, a Bayesian network
        N, the total number of samples to be generated
local variables: C, a vector of counts for each value of X, initially zero

for j = 1 to N do
    x โ† PRIOR-SAMPLE(bn)
    if x is consistent with e then
        C[j] โ† C[j]+1 where x_j is the value of X in x
return NORMALIZE(C)
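A hedged Python sketch of the same procedure, built on the `prior_sample` sketch above (evidence is passed as a dict; the guard against zero accepted samples is an addition):

```python
from collections import Counter

def rejection_sampling(X, evidence, bn, n):
    """Estimate P(X | evidence) by discarding samples inconsistent with evidence."""
    counts = Counter()
    for _ in range(n):
        sample = prior_sample(bn)
        if all(sample[var] == val for var, val in evidence.items()):
            counts[sample[X]] += 1  # tally only the accepted samples
    total = sum(counts.values())
    if total == 0:
        raise ValueError("every sample was rejected; increase n")
    return {value: c / total for value, c in counts.items()}

# e.g. estimate P(Rain | WetGrass=true); the acceptance rate is roughly P(e)
print(rejection_sampling("Rain", {"WetGrass": True}, SPRINKLER_NET, 10_000))
```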

Challenges in Rejection Sampling

  • The complexity of rejection sampling depends mainly on the fraction of samples that are accepted, and this fraction is exactly $P(e)$, the prior probability of the evidence
  • For complex problems with many evidence variables, this fraction becomes vanishingly small
    • This is the weakness of rejection sampling: almost all samples are rejected, leaving too few usable samples for the estimate (e.g., if $P(e) = 0.001$, only about one sample in a thousand is kept)

Importance Sampling

  • Importance sampling aims to mimic the effect of sampling from a distribution $P$ by using samples drawn from another distribution $Q$
    • Applying a correction factor of $\frac{P(x)}{Q(x)}$ (also called a weight) when counting each sample $x$ guarantees that the answer is correct in the limit
    • In practice, sampling directly from the true posterior $P(z|e)$ conditioned on the evidence is too hard, so we sample from an easy distribution and apply the necessary corrections
  • How importance sampling works:
    • Let $Z$ be the nonevidence variables. If we could sample directly from $P(z|e)$, we could form the estimate
      • $\hat{P}(z|e) = \frac{N_P(z)}{N} \approx P(z|e)$, where $N_P(z)$ is the number of samples with $Z = z$ when sampling from $P$
    • If we sample from $Q(z)$ instead, the estimate includes the correction factor
      • $\hat{P}(z|e) = \frac{N_Q(z)}{N} \cdot \frac{P(z|e)}{Q(z)} \approx Q(z) \frac{P(z|e)}{Q(z)} = P(z|e)$
    • Therefore, the estimate converges to the correct value regardless of the sampling distribution $Q$ used (provided $Q(z) > 0$ for every $z$ with $P(z|e) > 0$)
  • The correction factor compensates for oversampling or undersampling
  • We want a $Q$ that is easy to use yet as close as possible to the true posterior $P(z|e)$; the most common approach is likelihood weighting (a generic importance-sampling sketch follows below)
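As a self-contained illustration of the correction factor (not from the original notes), the sketch below estimates the tail probability $P(X > 4)$ for $X \sim N(0,1)$ by sampling from a shifted proposal $Q = N(4,1)$; naive direct sampling would almost never hit the event:

```python
import math
import random

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def importance_estimate(n=100_000):
    """Estimate P(X > 4) under P = N(0,1) using samples from Q = N(4,1)."""
    total = 0.0
    for _ in range(n):
        x = random.gauss(4.0, 1.0)                          # sample from Q
        if x > 4.0:                                         # event indicator under P
            total += normal_pdf(x) / normal_pdf(x, mu=4.0)  # weight P(x)/Q(x)
    return total / n

print(importance_estimate())  # close to the true value of about 3.17e-5
```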

Likelihood Weighting

  • The likelihood weighting algorithm fixes the values of the evidence variables $E$ and samples every nonevidence variable in topological order, each conditioned on its parents
    • The sampling distribution produced by this algorithm is called $Q_{LW}$. With nonevidence variables $Z = \{Z_1, \ldots, Z_k\}$: $Q_{LW}(z) = \prod_{i=1}^k P(z_i \mid \text{parents}(Z_i))$
    • $\hat{P}(z|e) = \frac{N_{LW}(z)}{N} \cdot \frac{P(z|e)}{Q_{LW}(z)} = \frac{N_{LW}(z)}{N} \cdot w(z)$
  • For each sample, the weight $w(z)$ it receives under $Q_{LW}$ must be computed
    • The weight is defined as $w(z) = \frac{P(z|e)}{Q_{LW}(z)} = \alpha \frac{P(z, e)}{Q_{LW}(z)}$
    • $P(z, e)$ is the joint probability of $z$ and $e$ in the Bayes net
    • $w(z) = \alpha \frac{\prod_{i=1}^k P(z_i \mid \text{parents}(Z_i)) \, \prod_{j=1}^m P(e_j \mid \text{parents}(E_j))}{\prod_{i=1}^k P(z_i \mid \text{parents}(Z_i))} = \alpha \prod_{j=1}^m P(e_j \mid \text{parents}(E_j))$
  • The weight is the product of the conditional probabilities of the evidence variables given their parents
    • Since probabilities of evidence are commonly called likelihoods, the algorithm is named likelihood weighting
  • Likelihood weighting algorithm (a Python sketch follows the pseudocode)
function LIKELIHOOD-WEIGHTING(X, e, bn, N) returns an estimate of P(X | e)
inputs: X, the query variable
        e, observed values for variables E
        bn, a Bayesian network specifying joint distribution P(X_1,...,X_n)
        N, the total number of samples to be generated
local variables: W, a vector of weighted counts for each value of X, initially zero

for j = 1 to N do
    x, w ← WEIGHTED-SAMPLE(bn, e)
    W[j] ← W[j] + w where x_j is the value of X in x
return NORMALIZE(W)

function WEIGHTED-SAMPLE(bn, e) returns an event and a weight
    w โ† 1; x โ† an event with n elements, with values fixed from e
    for i = 1 to n do
        if X_i is an evidence variable with value x_i_j in e
            then w โ† w ร— P(x_i_j | parents(X_i))
        else x[i] โ† a random sample from P(X_i | parents(X_i))
    return x, w
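A minimal Python sketch of the two procedures, reusing the `SPRINKLER_NET` encoding and `ORDER` from the PRIOR-SAMPLE sketch (boolean variables only; names are illustrative assumptions):

```python
def weighted_sample(bn, evidence, order=ORDER):
    """Return one event consistent with evidence, plus its likelihood weight."""
    w = 1.0
    event = dict(evidence)  # evidence values are fixed up front
    for var in order:
        node = bn[var]
        parent_vals = tuple(event[p] for p in node["parents"])
        p_true = node["cpt"][parent_vals]
        if var in evidence:
            # Multiply in the likelihood of the observed value given its parents
            w *= p_true if evidence[var] else 1.0 - p_true
        else:
            event[var] = random.random() < p_true
    return event, w

def likelihood_weighting(X, evidence, bn, n):
    """Estimate P(X | evidence) from weighted counts over n samples."""
    totals = {True: 0.0, False: 0.0}
    for _ in range(n):
        event, w = weighted_sample(bn, evidence)
        totals[event[X]] += w
    norm = sum(totals.values())
    return {value: t / norm for value, t in totals.items()}
```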

Lawn Condition Example

  • Query: $P(\text{Rain} \mid \text{Cloudy} = true, \text{WetGrass} = true)$
  • Example run:
    1. Set the weight $w$ to $1.0$
    2. Cloudy is an evidence variable ($true$), so $w \leftarrow w \times P(\text{Cloudy} = true) = 0.5$
    3. Sprinkler is a nonevidence variable, so sample from $P(\text{Sprinkler} \mid \text{Cloudy} = true)$ (say, $false$)
    4. Rain is a nonevidence variable, so sample from $P(\text{Rain} \mid \text{Cloudy} = true)$ (say, $true$)
    5. WetGrass is an evidence variable ($true$), so $w \leftarrow w \times P(\text{WetGrass} = true \mid \text{Sprinkler} = false, \text{Rain} = true) = 0.5 \times 0.9 = 0.45$
  • Result: the event $[true, false, true, true]$ is returned with weight $0.45$ and is tallied under Rain $= true$ (see the usage sketch below)
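Running the sketch above on this query (output varies with the random seed; with the CPT values assumed in the PRIOR-SAMPLE sketch, the exact posterior works out to about 0.976 for Rain $= true$):

```python
posterior = likelihood_weighting(
    "Rain", {"Cloudy": True, "WetGrass": True}, SPRINKLER_NET, 10_000)
print(posterior)  # roughly {True: 0.976, False: 0.024} under the assumed CPTs
```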

Pros and Cons of Likelihood Weighting

  • Likelihood weighting can be much more efficient than rejection sampling because it uses every sample generated
  • However, it suffers a degradation in performance as the number of evidence variables increases
    • Most samples then have very low weights, so the weighted estimate is dominated by the tiny fraction of samples that give more than an infinitesimal likelihood to the evidence
    • $w(z) = \alpha \prod_{i=1}^m P(e_i \mid \text{parents}(E_i))$

Visualized Illustrations for Likelihood Weighting

  • Likelihood weighting starts by fixing the values of the evidence variables
  • The nonevidence variables are sampled using the conditional probabilities of the Bayesian network
  • Each sample is weighted by its likelihood, i.e., the probability of all of the evidence

Python Programming for Bayesian Networks

  • Presents the contents of the files model.py, likelihood.py, inference.py, and sample.py

likelihood.py

  • Example query (a small sketch of the product follows the equations):

$$P(R = none, M = no, T = on\ time, A = attend)$$

$$= P(R = none)\, P(M = no \mid R = none)\, P(T = on\ time \mid R = none, M = no)\, P(A = attend \mid T = on\ time)$$

$$= 0.7 \times 0.6 \times 0.9 \times 0.9 = 0.3402$$
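A minimal sketch of the chain-rule product above; the four factor values are exactly those in the worked example, while the dict keys are just illustrative labels:

```python
import math

# The four CPT factors used in the worked example above
factors = {
    "P(R=none)": 0.7,
    "P(M=no | R=none)": 0.6,
    "P(T=on time | R=none, M=no)": 0.9,
    "P(A=attend | T=on time)": 0.9,
}
print(math.prod(factors.values()))  # 0.3402 (up to float rounding)
```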

inference.py

  • Verification by example
    • $P(\text{rain} = none \mid \text{train} = delayed) = \alpha P(\text{rain} = none, \text{train} = delayed) = 0.46\ldots$
    • $P(\text{rain} = light \mid \text{train} = delayed) = \alpha P(\text{rain} = light, \text{train} = delayed) = 0.30\ldots$
    • $P(\text{rain} = heavy \mid \text{train} = delayed) = \alpha P(\text{rain} = heavy, \text{train} = delayed) = 0.23\ldots$
    • where $\alpha = 4.6948\ldots$ (see the normalization sketch below)
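The normalization step itself can be checked with a small sketch. The unnormalized joint values below are hypothetical, back-solved from the truncated posteriors and $\alpha$ above, so the printed numbers only roughly reproduce them:

```python
def normalize(dist):
    """Scale the values to sum to 1; alpha is 1 over the unnormalized total."""
    total = sum(dist.values())
    return {k: v / total for k, v in dist.items()}

# Hypothetical unnormalized joints P(rain = r, train = delayed)
joints = {"none": 0.098, "light": 0.064, "heavy": 0.049}
print(normalize(joints))         # roughly none: 0.46, light: 0.30, heavy: 0.23
print(1 / sum(joints.values()))  # roughly alpha = 4.7
```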