• Mindscape ๐Ÿ”ฅ
    • Playlist ๐ŸŽง
  • ๐Ÿง  Algorithm

    • Python ์‹œ๊ฐ„ ์ดˆ๊ณผ ๋ฐฉ์ง€๋ฅผ ์œ„ํ•œ ํŒ
    • C++ std::vector ์‚ฌ์šฉ๋ฒ• ์ •๋ฆฌ
    • Vim ์‚ฌ์šฉ ๋งค๋‰ด์–ผ
    • 1018๋ฒˆ: ์ฒด์ŠคํŒ ๋‹ค์‹œ ์น ํ•˜๊ธฐ
    • 1966๋ฒˆ: ํ”„๋ฆฐํ„ฐ ํ
  • ๐Ÿ—‚๏ธ Database System

    • 1. Introduction
    • 2. Relational Model
    • 3. SQL
    • 6. E-R Model
    • 7. Relational Database Design (1)
    • 7. Relational Database Design (2)
    • 13. Data Storage Structures
    • 14. Indexing
    • 15. Query Processing
    • 16. Query Optimization
    • 17. Transactions
    • 18. Concurrency control
  • ๐Ÿง Operating System

    • 7. Deadlocks
    • 8. Memory Management (1)
    • 9. Memory Management (2)
    • 10. Virtual Memory(1)
    • 11. Virtual Memory (2)
    • 12. File System
    • 13. Mass Storage Management
    • 14. I/O Systems

16. Query Optimization

Introduction

  • ์ฃผ์–ด์ง„ query ํ‰๊ฐ€๋ฅผ ์œ„ํ•œ ๋Œ€์•ˆ์  ๋ฐฉ๋ฒ•๋“ค
  • ๋™๋“ฑํ•œ(Equivalent) ํ‘œํ˜„์‹
  • ๊ฐ ์—ฐ์‚ฐ์„ ์œ„ํ•œ ๋‹ค๋ฅธ algorithm
  • Query ์˜ˆ์‹œ: Music department ๊ฐ•์‚ฌ์˜ ์ด๋ฆ„๊ณผ ๊ทธ๋“ค์ด ๊ฐ€๋ฅด์น˜๋Š” ๊ณผ๋ชฉ์˜ title ๊ฒ€์ƒ‰
  • ฮ name,ย title(ฯƒdept_name=ย โ€™Musicโ€™(instructorโ‹ˆ(teachesโ‹ˆฮ course_id,ย title(course))))\Pi_{\text{name,~title}}(\sigma_{\text{dept\_name= 'Music'}}(\text{instructor} \bowtie (\text{teaches} \bowtie \Pi_{\text{course\_id,~title}} (\text{course}))))ฮ name,ย titleโ€‹(ฯƒdept_name=ย โ€™Musicโ€™โ€‹(instructorโ‹ˆ(teachesโ‹ˆฮ course_id,ย titleโ€‹(course))))
  • Evaluation plan: ๊ฐ ์—ฐ์‚ฐ์— ์‚ฌ์šฉ๋˜๋Š” algorithm ๋ฐ ์—ฐ์‚ฐ ์‹คํ–‰ ๋ฐฉ์‹ (materialized/pipelined) ์ •์˜
  • Query์— ๋Œ€ํ•œ evaluation plan ๊ฐ„์˜ ๋น„์šฉ ์ฐจ์ด๋Š” ๋ง‰๋Œ€ํ•  ์ˆ˜ ์žˆ์Œ (์˜ˆ: ์ดˆ vs. ์ผ)
  • Cost-based query optimization ๋‹จ๊ณ„:
    1. ๋…ผ๋ฆฌ์ ์œผ๋กœ ๋™๋“ฑํ•œ(equivalent) ํ‘œํ˜„์‹ ์ƒ์„ฑ (๋™๋“ฑ์„ฑ ๊ทœ์น™(equivalence rules) ์‚ฌ์šฉ)
    2. ๋Œ€์•ˆ์  query plan์„ ์–ป๊ธฐ ์œ„ํ•ด ๊ฒฐ๊ณผ ํ‘œํ˜„์‹์— ์ฃผ์„(annotate) ์ถ”๊ฐ€
    3. ์ถ”์ •๋œ ๋น„์šฉ(estimated cost)์— ๊ธฐ๋ฐ˜ํ•˜์—ฌ ๊ฐ€์žฅ ์ €๋ ดํ•œ plan ์„ ํƒ
  • Plan ๋น„์šฉ ์ถ”์ • ๊ธฐ๋ฐ˜:
    • Relation์— ๋Œ€ํ•œ ํ†ต๊ณ„ ์ •๋ณด (์˜ˆ: tuple ์ˆ˜, attribute์˜ distinct ๊ฐ’ ์ˆ˜)
    • ์ค‘๊ฐ„ ๊ฒฐ๊ณผ์— ๋Œ€ํ•œ ํ†ต๊ณ„ ์ถ”์ • (๋ณต์žกํ•œ ํ‘œํ˜„์‹ ๋น„์šฉ ๊ณ„์‚ฐ์šฉ)
    • Algorithm์— ๋Œ€ํ•œ ๋น„์šฉ ๊ณต์‹ (ํ†ต๊ณ„ ์‚ฌ์šฉ)

Transformation of Relational Expressions

  • ๋‘ ๊ด€๊ณ„๋Œ€์ˆ˜(relational algebra) ํ‘œํ˜„์‹์ด ๋ชจ๋“  legal database instance์—์„œ ๋™์ผํ•œ tuple ์ง‘ํ•ฉ์„ ์ƒ์„ฑํ•˜๋ฉด ๋™๋“ฑ(equivalent)ํ•˜๋‹ค๊ณ  ํ•จ (Tuple ์ˆœ์„œ ๋ฌด๊ด€)
  • SQL์—์„œ ์ž…์ถœ๋ ฅ์€ tuple์˜ multiset (์ค‘๋ณต ํ—ˆ์šฉ ์ง‘ํ•ฉ)
  • Multiset ๋ฒ„์ „์˜ ๊ด€๊ณ„๋Œ€์ˆ˜์—์„œ ๋‘ ํ‘œํ˜„์‹์ด ๋ชจ๋“  legal database instance์—์„œ ๋™์ผํ•œ multiset์„ ์ƒ์„ฑํ•˜๋ฉด ๋™๋“ฑํ•˜๋‹ค๊ณ  ํ•จ
  • Equivalence rule: ๋‘ ํ˜•ํƒœ์˜ ํ‘œํ˜„์‹์ด ๋™๋“ฑํ•จ์„ ์˜๋ฏธ (์ƒํ˜ธ ๊ต์ฒด ๊ฐ€๋Šฅ)

Equivalence Rules

  1. Conjunctive selection ์—ฐ์‚ฐ์€ ๊ฐœ๋ณ„ selection์˜ sequence๋กœ ๋ถ„ํ•ด ๊ฐ€๋Šฅ (cascade of ฯƒ\sigmaฯƒ)
  • ฯƒฮธ1โˆงฮธ2(E)โ‰กฯƒฮธ1(ฯƒฮธ2(E))\sigma_{\theta_1 \land \theta_2}(E) \equiv \sigma_{\theta_1}(\sigma_{\theta_2}(E))ฯƒฮธ1โ€‹โˆงฮธ2โ€‹โ€‹(E)โ‰กฯƒฮธ1โ€‹โ€‹(ฯƒฮธ2โ€‹โ€‹(E))
  1. Selection ์—ฐ์‚ฐ์€ ๊ตํ™˜ ๊ฐ€๋Šฅ(commutative)
  • ฯƒฮธ1(ฯƒฮธ2(E))โ‰กฯƒฮธ2(ฯƒฮธ1(E))\sigma_{\theta_1}(\sigma_{\theta_2}(E)) \equiv \sigma_{\theta_2}(\sigma_{\theta_1}(E))ฯƒฮธ1โ€‹โ€‹(ฯƒฮธ2โ€‹โ€‹(E))โ‰กฯƒฮธ2โ€‹โ€‹(ฯƒฮธ1โ€‹โ€‹(E))
  1. Projection ์—ฐ์‚ฐ sequence์—์„œ๋Š” ๋งˆ์ง€๋ง‰ ํ•˜๋‚˜๋งŒ ํ•„์š”
  • ฮ L1(ฮ L2(โ€ฆ(ฮ Ln(E))โ€ฆโ€‰))โ‰กฮ L1(E)\Pi_{L1}(\Pi_{L2}(\dots(\Pi_{Ln}(E))\dots)) \equiv \Pi_{L1}(E)ฮ L1โ€‹(ฮ L2โ€‹(โ€ฆ(ฮ Lnโ€‹(E))โ€ฆ))โ‰กฮ L1โ€‹(E), (๋‹จ, L1โІL2โ‹ฏโІLnL_1 \subseteq L_2 \dots \subseteq L_nL1โ€‹โІL2โ€‹โ‹ฏโІLnโ€‹)
  1. Selection์€ Cartesian product ๋ฐ theta join๊ณผ ๊ฒฐํ•ฉ ๊ฐ€๋Šฅ
  • a. ฯƒฮธ(E1ร—E2)โ‰กE1โ‹ˆฮธE2\sigma_{\theta}(E_1 \times E_2) \equiv E_1 \bowtie_{\theta} E_2ฯƒฮธโ€‹(E1โ€‹ร—E2โ€‹)โ‰กE1โ€‹โ‹ˆฮธโ€‹E2โ€‹
  • b. ฯƒฮธ1(E1โ‹ˆฮธ2E2)โ‰กE1โ‹ˆฮธ1โˆงฮธ2E2\sigma_{\theta_1}(E_1 \bowtie_{\theta_2} E_2) \equiv E_1 \bowtie_{\theta_1 \land \theta_2} E_2ฯƒฮธ1โ€‹โ€‹(E1โ€‹โ‹ˆฮธ2โ€‹โ€‹E2โ€‹)โ‰กE1โ€‹โ‹ˆฮธ1โ€‹โˆงฮธ2โ€‹โ€‹E2โ€‹
  1. Theta join (๋ฐ natural join)์€ ๊ตํ™˜ ๊ฐ€๋Šฅ
  • E1โ‹ˆฮธE2โ‰กE2โ‹ˆฮธE1E_1 \bowtie_{\theta} E_2 \equiv E_2 \bowtie_{\theta} E_1E1โ€‹โ‹ˆฮธโ€‹E2โ€‹โ‰กE2โ€‹โ‹ˆฮธโ€‹E1โ€‹
  1. (a) Natural join์€ ๊ฒฐํ•ฉ ๊ฐ€๋Šฅ(associative)
  • (E1โ‹ˆE2)โ‹ˆE3โ‰กE1โ‹ˆ(E2โ‹ˆE3)(E_1 \bowtie E_2) \bowtie E_3 \equiv E_1 \bowtie (E_2 \bowtie E_3)(E1โ€‹โ‹ˆE2โ€‹)โ‹ˆE3โ€‹โ‰กE1โ€‹โ‹ˆ(E2โ€‹โ‹ˆE3โ€‹)
  • (b) Theta join์˜ ๊ฒฐํ•ฉ ๋ฒ•์น™
    • (E1โ‹ˆฮธ1E2)โ‹ˆฮธ2โˆงฮธ3E3โ‰กE1โ‹ˆฮธ1โˆงฮธ3(E2โ‹ˆฮธ2E3)(E_1 \bowtie_{\theta_1} E_2) \bowtie_{\theta_2 \land \theta_3} E_3 \equiv E_1 \bowtie_{\theta_1 \land \theta_3} (E_2 \bowtie_{\theta_2} E_3)(E1โ€‹โ‹ˆฮธ1โ€‹โ€‹E2โ€‹)โ‹ˆฮธ2โ€‹โˆงฮธ3โ€‹โ€‹E3โ€‹โ‰กE1โ€‹โ‹ˆฮธ1โ€‹โˆงฮธ3โ€‹โ€‹(E2โ€‹โ‹ˆฮธ2โ€‹โ€‹E3โ€‹) (๋‹จ, ฮธ2\theta_2ฮธ2โ€‹๋Š” E2,E3E_2, E_3E2โ€‹,E3โ€‹์˜ attribute๋งŒ ํฌํ•จ)
  1. Selection ์—ฐ์‚ฐ์€ ๋‹ค์Œ ๋‘ ์กฐ๊ฑด ํ•˜์— theta join์— ๋ถ„๋ฐฐ ๊ฐ€๋Šฅ
  • (a) ฮธ0\theta_0ฮธ0โ€‹์˜ ๋ชจ๋“  attribute๊ฐ€ join ๋˜๋Š” ํ‘œํ˜„์‹ ์ค‘ ํ•˜๋‚˜(E1E_1E1โ€‹)์˜ attribute๋งŒ ํฌํ•จ
    • ฯƒฮธ0(E1โ‹ˆฮธE2)โ‰ก(ฯƒฮธ0(E1))โ‹ˆฮธE2\sigma_{\theta_0}(E_1 \bowtie_{\theta} E_2) \equiv (\sigma_{\theta_0}(E_1)) \bowtie_{\theta} E_2ฯƒฮธ0โ€‹โ€‹(E1โ€‹โ‹ˆฮธโ€‹E2โ€‹)โ‰ก(ฯƒฮธ0โ€‹โ€‹(E1โ€‹))โ‹ˆฮธโ€‹E2โ€‹
  • (b) ฮธ1\theta_1ฮธ1โ€‹์ด E1E_1E1โ€‹์˜ attribute๋งŒ, ฮธ2\theta_2ฮธ2โ€‹๊ฐ€ E2E_2E2โ€‹์˜ attribute๋งŒ ํฌํ•จ
    • ฯƒฮธ1โˆงฮธ2(E1โ‹ˆฮธE2)โ‰ก(ฯƒฮธ1(E1))โ‹ˆฮธ(ฯƒฮธ2(E2))\sigma_{\theta_1 \land \theta_2}(E_1 \bowtie_{\theta} E_2) \equiv (\sigma_{\theta_1}(E_1)) \bowtie_{\theta} (\sigma_{\theta_2}(E_2))ฯƒฮธ1โ€‹โˆงฮธ2โ€‹โ€‹(E1โ€‹โ‹ˆฮธโ€‹E2โ€‹)โ‰ก(ฯƒฮธ1โ€‹โ€‹(E1โ€‹))โ‹ˆฮธโ€‹(ฯƒฮธ2โ€‹โ€‹(E2โ€‹))
  1. Projection ์—ฐ์‚ฐ์€ theta join์— ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๋ถ„๋ฐฐ ๊ฐ€๋Šฅ
  • (a) L1,L2L_1, L_2L1โ€‹,L2โ€‹๊ฐ€ ๊ฐ๊ฐ E1,E2E_1, E_2E1โ€‹,E2โ€‹์˜ attribute์ด๊ณ  ฮธ\thetaฮธ๊ฐ€ L1โˆชL2L_1 \cup L_2L1โ€‹โˆชL2โ€‹์˜ attribute๋งŒ ํฌํ•จ
    • ฮ L1โˆชL2(E1โ‹ˆฮธE2)โ‰กฮ L1(E1)โ‹ˆฮธฮ L2(E2)\Pi_{L_1 \cup L_2}(E_1 \bowtie_{\theta} E_2) \equiv \Pi_{L_1}(E_1) \bowtie_{\theta} \Pi_{L_2}(E_2)ฮ L1โ€‹โˆชL2โ€‹โ€‹(E1โ€‹โ‹ˆฮธโ€‹E2โ€‹)โ‰กฮ L1โ€‹โ€‹(E1โ€‹)โ‹ˆฮธโ€‹ฮ L2โ€‹โ€‹(E2โ€‹)
  • (b) ์ผ๋ฐ˜์ ์ธ ๊ฒฝ์šฐ (E1โ‹ˆฮธE2E_1 \bowtie_{\theta} E_2E1โ€‹โ‹ˆฮธโ€‹E2โ€‹), L1โІE1,L2โІE2L_1 \subseteq E_1, L_2 \subseteq E_2L1โ€‹โІE1โ€‹,L2โ€‹โІE2โ€‹. L3L_3L3โ€‹๋Š” E1E_1E1โ€‹์˜ attribute (join ์กฐ๊ฑด ฮธ\thetaฮธ์— ํฌํ•จ๋˜์ง€๋งŒ L1โˆชL2L_1 \cup L_2L1โ€‹โˆชL2โ€‹์—๋Š” ๋ฏธํฌํ•จ), L4L_4L4โ€‹๋Š” E2E_2E2โ€‹์˜ attribute (join ์กฐ๊ฑด ฮธ\thetaฮธ์— ํฌํ•จ๋˜์ง€๋งŒ L1โˆชL2L_1 \cup L_2L1โ€‹โˆชL2โ€‹์—๋Š” ๋ฏธํฌํ•จ)
    • ฮ L1โˆชL2(E1โ‹ˆฮธE2)โ‰กฮ L1โˆชL2(ฮ L1โˆชL3(E1)โ‹ˆฮธฮ L2โˆชL4(E2))\Pi_{L_1 \cup L_2}(E_1 \bowtie_{\theta} E_2) \equiv \Pi_{L_1 \cup L_2}(\Pi_{L_1 \cup L_3}(E_1) \bowtie_{\theta} \Pi_{L_2 \cup L_4}(E_2))ฮ L1โ€‹โˆชL2โ€‹โ€‹(E1โ€‹โ‹ˆฮธโ€‹E2โ€‹)โ‰กฮ L1โ€‹โˆชL2โ€‹โ€‹(ฮ L1โ€‹โˆชL3โ€‹โ€‹(E1โ€‹)โ‹ˆฮธโ€‹ฮ L2โ€‹โˆชL4โ€‹โ€‹(E2โ€‹))
  1. ์ง‘ํ•ฉ ์—ฐ์‚ฐ union(โˆช\cupโˆช)๊ณผ intersection(โˆฉ\capโˆฉ)์€ ๊ตํ™˜ ๊ฐ€๋Šฅ (Set difference(โˆ’-โˆ’)๋Š” ์•„๋‹˜)
  2. Set union๊ณผ intersection์€ ๊ฒฐํ•ฉ ๊ฐ€๋Šฅ
  3. Selection ์—ฐ์‚ฐ์€ โˆช\cupโˆช, โˆฉ\capโˆฉ, โˆ’-โˆ’ ์— ๋ถ„๋ฐฐ ๊ฐ€๋Šฅ
  • a. ฯƒฮธ(E1โˆชE2)โ‰กฯƒฮธ(E1)โˆชฯƒฮธ(E2)\sigma_{\theta}(E_1 \cup E_2) \equiv \sigma_{\theta}(E_1) \cup \sigma_{\theta}(E_2)ฯƒฮธโ€‹(E1โ€‹โˆชE2โ€‹)โ‰กฯƒฮธโ€‹(E1โ€‹)โˆชฯƒฮธโ€‹(E2โ€‹)
  • b. ฯƒฮธ(E1โˆฉE2)โ‰กฯƒฮธ(E1)โˆฉฯƒฮธ(E2)\sigma_{\theta}(E_1 \cap E_2) \equiv \sigma_{\theta}(E_1) \cap \sigma_{\theta}(E_2)ฯƒฮธโ€‹(E1โ€‹โˆฉE2โ€‹)โ‰กฯƒฮธโ€‹(E1โ€‹)โˆฉฯƒฮธโ€‹(E2โ€‹)
  • c. ฯƒฮธ(E1โˆ’E2)โ‰กฯƒฮธ(E1)โˆ’ฯƒฮธ(E2)\sigma_{\theta}(E_1 - E_2) \equiv \sigma_{\theta}(E_1) - \sigma_{\theta}(E_2)ฯƒฮธโ€‹(E1โ€‹โˆ’E2โ€‹)โ‰กฯƒฮธโ€‹(E1โ€‹)โˆ’ฯƒฮธโ€‹(E2โ€‹)
  • d. ฯƒฮธ(E1โˆฉE2)โ‰กฯƒฮธ(E1)โˆฉE2\sigma_{\theta}(E_1 \cap E_2) \equiv \sigma_{\theta}(E_1) \cap E_2ฯƒฮธโ€‹(E1โ€‹โˆฉE2โ€‹)โ‰กฯƒฮธโ€‹(E1โ€‹)โˆฉE2โ€‹ (โˆช\cupโˆช ์—๋Š” ์„ฑ๋ฆฝ ์•ˆ ํ•จ)
  • e. ฯƒฮธ(E1โˆ’E2)โ‰กฯƒฮธ(E1)โˆ’E2\sigma_{\theta}(E_1 - E_2) \equiv \sigma_{\theta}(E_1) - E_2ฯƒฮธโ€‹(E1โ€‹โˆ’E2โ€‹)โ‰กฯƒฮธโ€‹(E1โ€‹)โˆ’E2โ€‹
  1. Projection ์—ฐ์‚ฐ์€ union์— ๋ถ„๋ฐฐ ๊ฐ€๋Šฅ (๋‹จ, E1,E2E_1, E_2E1โ€‹,E2โ€‹ schema ๋™์ผ)
  • ฮ L(E1โˆชE2)โ‰ก(ฮ L(E1))โˆช(ฮ L(E2))\Pi_L(E_1 \cup E_2) \equiv (\Pi_L(E_1)) \cup (\Pi_L(E_2))ฮ Lโ€‹(E1โ€‹โˆชE2โ€‹)โ‰ก(ฮ Lโ€‹(E1โ€‹))โˆช(ฮ Lโ€‹(E2โ€‹))

Pictorial Depiction of Equivalence Rules

Transformation Example: Pushing Selections

  • Selection์„ ๊ฐ€๋Šฅํ•œ ํ•œ ๋นจ๋ฆฌ ์ˆ˜ํ–‰ํ•˜๋ฉด joinํ•  relation์˜ ํฌ๊ธฐ ๊ฐ์†Œ
  • Query ์˜ˆ
    • ฮ name,ย title(ฯƒdept_name=ย โ€™Musicโ€™(instructorโ‹ˆ(teachesโ‹ˆฮ course_id,ย title(course))))\Pi_{\text{name,~title}}(\sigma_{\text{dept\_name= 'Music'}}(\text{instructor} \bowtie (\text{teaches} \bowtie \Pi_{\text{course\_id,~title}} (\text{course}))))ฮ name,ย titleโ€‹(ฯƒdept_name=ย โ€™Musicโ€™โ€‹(instructorโ‹ˆ(teachesโ‹ˆฮ course_id,ย titleโ€‹(course))))
  • ๊ทœ์น™ 7a๋ฅผ ์ด์šฉํ•œ ๋ณ€ํ™˜
    • ฮ name,ย title(ฯƒdept_name=ย โ€™Musicโ€™(instructor)โ‹ˆ(teachesโ‹ˆฮ course_id,ย title(course)))\Pi_{\text{name,~title}}(\sigma_{\text{dept\_name= 'Music'}} (\text{instructor}) \bowtie (\text{teaches} \bowtie \Pi_{\text{course\_id,~title}} (\text{course})))ฮ name,ย titleโ€‹(ฯƒdept_name=ย โ€™Musicโ€™โ€‹(instructor)โ‹ˆ(teachesโ‹ˆฮ course_id,ย titleโ€‹(course)))

Example with Multiple Transformations

  • Query: "2017๋…„์— ๊ฐ•์˜๋ฅผ ๋‹ด๋‹นํ•œ ์Œ์•…ํ•™๊ณผ ์†Œ์† ๋ชจ๋“  ๊ต์ˆ˜์ง„์˜ ์„ฑ๋ช…์„ ํ•ด๋‹น ๊ต์ˆ˜์ง„์ด ๋‹ด๋‹นํ•œ ๊ฐ•์˜ ์ œ๋ชฉ๊ณผ ํ•จ๊ป˜ ์ฐพ์•„์ฃผ์„ธ์š”."
    • ฮ name,ย title(ฯƒdept_name=ย โ€™Musicโ€™โˆงyear=2017(instructorโ‹ˆ(teachesโ‹ˆฮ course_id,ย title(course))))\Pi_{\text{name,~title}}(\sigma_{\text{dept\_name= 'Music'} \land {year = 2017}} (\text{instructor} \bowtie (\text{teaches} \bowtie \Pi_{\text{course\_id,~title}} (\text{course}))))ฮ name,ย titleโ€‹(ฯƒdept_name=ย โ€™Musicโ€™โˆงyear=2017โ€‹(instructorโ‹ˆ(teachesโ‹ˆฮ course_id,ย titleโ€‹(course))))
  • Join associatively (๊ทœ์น™ 6a) ๋ณ€ํ™˜
    • ฮ name,ย title(ฯƒdept_name=ย โ€œMusicโ€โˆงyear=2017((instructorโ‹ˆteaches)โ‹ˆฮ course_id,ย title(course)))\Pi_{\text{name,~title}}(\sigma_{\text{dept\_name= โ€œMusicโ€} \land {year = 2017}} ((\text{instructor} \bowtie \text{teaches}) \bowtie \Pi_{\text{course\_id,~title}} (\text{course})))ฮ name,ย titleโ€‹(ฯƒdept_name=ย โ€œMusicโ€โˆงyear=2017โ€‹((instructorโ‹ˆteaches)โ‹ˆฮ course_id,ย titleโ€‹(course)))
  • ๋‘ ๋ฒˆ์งธ ํ˜•ํƒœ๋Š” 'selection ์กฐ๊ธฐ ์ˆ˜ํ–‰' ๊ทœ์น™ ์ ์šฉ ๊ธฐํšŒ ์ œ๊ณต (๊ทœ์น™ 7a, 7b)
  • ฯƒdept_nameย =ย โ€œMusicโ€(instructor)โ‹ˆฯƒyearย =ย 2017(teaches)\sigma_{\text{dept\_name = โ€œMusicโ€}}(\text{instructor}) \bowtie \sigma_{\text{year = 2017}}(\text{teaches})ฯƒdept_nameย =ย โ€œMusicโ€โ€‹(instructor)โ‹ˆฯƒyearย =ย 2017โ€‹(teaches)

alt text

Transformation Example: Pushing Projections

  • Projection์„ ๊ฐ€๋Šฅํ•œ ํ•œ ๋นจ๋ฆฌ ์ˆ˜ํ–‰ํ•˜๋ฉด joinํ•  relation์˜ ํฌ๊ธฐ ๊ฐ์†Œ
  • ฮ name,ย title((ฯƒdept_name=ย โ€œMusicโ€(instructor)โ‹ˆteaches)โ‹ˆฮ course_id,ย title(course))\Pi_{\text{name,~title}}((\sigma_{\text{dept\_name= โ€œMusicโ€}}(\text{instructor}) \bowtie \text{teaches}) \bowtie \Pi_{\text{course\_id,~title}} (\text{course}))ฮ name,ย titleโ€‹((ฯƒdept_name=ย โ€œMusicโ€โ€‹(instructor)โ‹ˆteaches)โ‹ˆฮ course_id,ย titleโ€‹(course))
  • (ฯƒdept_nameย =ย โ€œMusicโ€(instructor)โ‹ˆteaches)(\sigma_{\text{dept\_name = โ€œMusicโ€}}(\text{instructor}) \bowtie \text{teaches})(ฯƒdept_nameย =ย โ€œMusicโ€โ€‹(instructor)โ‹ˆteaches) ๊ณ„์‚ฐ ์‹œ,
    • schema๋Š” (ID, name, dept_name, salary, course_id, sec_id, semester, year)
  • ์ด ์ค‘ course_id (join์šฉ)์™€ name (output์šฉ)๋งŒ ํ•„์š”
  • ๋™๋“ฑ์„ฑ ๊ทœ์น™ 8a, 8b๋ฅผ ์‚ฌ์šฉํ•œ projection push
    • ฮ name,ย title(ฮ name,ย course_id(ฯƒdept_name=ย โ€œMusicโ€(instructor)โ‹ˆteaches))โ‹ˆฮ course_id,ย title(course))\Pi_{\text{name,~title}}(\Pi_{\text{name,~course\_id}} (\sigma_{\text{dept\_name= โ€œMusicโ€}}(\text{instructor}) \bowtie \text{teaches})) \bowtie \Pi_{\text{course\_id,~title}} (\text{course}))ฮ name,ย titleโ€‹(ฮ name,ย course_idโ€‹(ฯƒdept_name=ย โ€œMusicโ€โ€‹(instructor)โ‹ˆteaches))โ‹ˆฮ course_id,ย titleโ€‹(course))

Join Ordering

  • ๋ชจ๋“  relation r1,r2,r3r_1, r_2, r_3r1โ€‹,r2โ€‹,r3โ€‹์— ๋Œ€ํ•ด (r1โ‹ˆr2)โ‹ˆr3=r1โ‹ˆ(r2โ‹ˆr3)(r_1 \bowtie r_2) \bowtie r_3 = r_1 \bowtie (r_2 \bowtie r_3)(r1โ€‹โ‹ˆr2โ€‹)โ‹ˆr3โ€‹=r1โ€‹โ‹ˆ(r2โ€‹โ‹ˆr3โ€‹) (Join Associativity)
  • ๋งŒ์•ฝ r2โ‹ˆr3r_2 \bowtie r_3r2โ€‹โ‹ˆr3โ€‹์ด ๋งค์šฐ ํฌ๊ณ  r1โ‹ˆr2r_1 \bowtie r_2r1โ€‹โ‹ˆr2โ€‹๊ฐ€ ์ž‘๋‹ค๋ฉด, (r1โ‹ˆr2)โ‹ˆr3(r_1 \bowtie r_2) \bowtie r_3(r1โ€‹โ‹ˆr2โ€‹)โ‹ˆr3โ€‹๋ฅผ ์„ ํƒํ•˜์—ฌ ๋” ์ž‘์€ ์ž„์‹œ relation ๊ณ„์‚ฐ ๋ฐ ์ €์žฅ
  • ์˜ˆ: ฮ name,ย title((ฯƒdept_name=ย โ€œMusicโ€(instructor)โ‹ˆteaches)โ‹ˆฮ course_id,ย title(course))\Pi_{\text{name,~title}}((\sigma_{\text{dept\_name= โ€œMusicโ€}}(\text{instructor}) \bowtie \text{teaches}) \bowtie \Pi_{\text{course\_id,~title}} (\text{course}))ฮ name,ย titleโ€‹((ฯƒdept_name=ย โ€œMusicโ€โ€‹(instructor)โ‹ˆteaches)โ‹ˆฮ course_id,ย titleโ€‹(course))
  • teachesโ‹ˆฮ course_id,ย title(course)\text{teaches} \bowtie \Pi_{\text{course\_id,~title}}(\text{course})teachesโ‹ˆฮ course_id,ย titleโ€‹(course)๋ฅผ ๋จผ์ € ๊ณ„์‚ฐํ•  ์ˆ˜ ์žˆ์œผ๋‚˜, ์ฒซ join ๊ฒฐ๊ณผ๊ฐ€ ํฐ relation์ผ ๊ฐ€๋Šฅ์„ฑ ๋†’์Œ
  • Music department ๊ฐ•์‚ฌ ๋น„์œจ์€ ๋‚ฎ์„ ๊ฒƒ์ด๋ฏ€๋กœ, ฯƒdept_name=ย โ€œMusicโ€(instructor)โ‹ˆteaches\sigma_{\text{dept\_name= โ€œMusicโ€}}(\text{instructor}) \bowtie \text{teaches}ฯƒdept_name=ย โ€œMusicโ€โ€‹(instructor)โ‹ˆteaches๋ฅผ ๋จผ์ € ๊ณ„์‚ฐํ•˜๋Š” ๊ฒƒ์ด ๋” ์ข‹์Œ

Estimating Statistics of (Sub)Expression Results

Statistical Information for Cost Estimation

  • Database system catalog๋Š” ๋‹ค์Œ ํ†ต๊ณ„ ์ •๋ณด ์ €์žฅ
    • nrn_rnrโ€‹: relation rrr์˜ tuple ์ˆ˜
    • brb_rbrโ€‹: rrr์˜ tuple์„ ํฌํ•จํ•˜๋Š” block ์ˆ˜
    • lrl_rlrโ€‹: rrr์˜ tuple ํฌ๊ธฐ
    • frf_rfrโ€‹: rrr์˜ blocking factor (ํ•œ block์— ๋งž๋Š” rrr์˜ tuple ์ˆ˜)
      • ๋ฌผ๋ฆฌ์ ์œผ๋กœ rrr์˜ tuple์ด ํŒŒ์ผ์— ํ•จ๊ป˜ ์ €์žฅ๋˜๋ฉด
        • br=โŒˆnr/frโŒ‰b_r = \lceil n_r / f_r \rceilbrโ€‹=โŒˆnrโ€‹/frโ€‹โŒ‰
    • V(A,ย r)V(A,~r)V(A,ย r): attribute AAA์— ๋Œ€ํ•ด rrr์— ๋‚˜ํƒ€๋‚˜๋Š” distinct ๊ฐ’์˜ ์ˆ˜ (ฮ A(r)\Pi_A(r)ฮ Aโ€‹(r)์˜ ํฌ๊ธฐ์™€ ๋™์ผ)
  • Index์— ๋Œ€ํ•œ ํ†ต๊ณ„๋„ catalog์— ์œ ์ง€
    • fif_ifiโ€‹: B+-tree ๊ฐ™์€ tree ๊ตฌ์กฐ index iii์˜ internal node์˜ ํ‰๊ท  fan-out
    • HTiHT_iHTiโ€‹: index iii์˜ level ์ˆ˜ (๋†’์ด). B+-tree์˜ ๊ฒฝ์šฐ HTi=โŒˆlogโกfi(V(A,r))โŒ‰HT_i = \lceil \log_{f_i} (V(A,r)) \rceilHTiโ€‹=โŒˆlogfiโ€‹โ€‹(V(A,r))โŒ‰. Hash index๋Š” HTi=1HT_i = 1HTiโ€‹=1
    • LBiLB_iLBiโ€‹: iii์˜ ๊ฐ€์žฅ ๋‚ฎ์€ level์˜ index block ์ˆ˜ (leaf level block ์ˆ˜)

Histograms

  • ์‹ค์ œ optimizer๋Š” ์ข…์ข… ์ถ”๊ฐ€ ํ†ต๊ณ„ ์ •๋ณด ์œ ์ง€
  • ๋Œ€๋ถ€๋ถ„์˜ database๋Š” ๊ฐ attribute์˜ ๊ฐ’ ๋ถ„ํฌ๋ฅผ histogram์œผ๋กœ ์ €์žฅ
    • Equi-width histograms: ๊ฐ’์˜ ๋ฒ”์œ„๋ฅผ ๋™์ผํ•œ ํฌ๊ธฐ์˜ ๋ฒ”์œ„๋กœ ๋ถ„ํ• 
    • Equi-depth histograms: ๊ฐ ๋ฒ”์œ„๊ฐ€ (๊ฑฐ์˜) ๋™์ผํ•œ ์ˆ˜์˜ tuple์„ ๊ฐ–๋„๋ก ๋ฒ”์œ„ ๋ถ„ํ•  (๋” ์„ ํ˜ธ๋จ)
  • ๋งŽ์€ database๋Š” n๊ฐœ์˜ ๊ฐ€์žฅ ๋นˆ๋ฒˆํ•œ ๊ฐ’(most-frequent values)๊ณผ ๊ทธ ๊ฐœ์ˆ˜(count)๋ฅผ ์ €์žฅ
  • Histogram ๋ฐ ๊ธฐํƒ€ ํ†ต๊ณ„๋Š” ๋ณดํ†ต random sample ๊ธฐ๋ฐ˜์œผ๋กœ ๊ณ„์‚ฐ
  • ํ†ต๊ณ„๊ฐ€ ์ตœ์‹ ์ด ์•„๋‹ ์ˆ˜ ์žˆ์Œ (analyze ๋ช…๋ น ํ•„์š” ๋˜๋Š” ์ž๋™ ์žฌ๊ณ„์‚ฐ)
    • ์˜ˆ: relation ์•ˆ์˜ tuple์˜ ๊ฐœ์ˆ˜๊ฐ€ ๋ช‡ % ๋ณ€ํ™”ํ•˜๋Š” ๊ฒฝ์šฐ

Problem

  • ์ด์ „ ์—ฐ์‚ฐ์˜ output์„ input์œผ๋กœ ๋ฐ›๋Š” ์ค‘๊ฐ„ ์—ฐ์‚ฐ์˜ ๋น„์šฉ ์ถ”์ • ๋ฐฉ๋ฒ•?
  • E1,E2,E3E_1, E_2, E_3E1โ€‹,E2โ€‹,E3โ€‹ ํ†ต๊ณ„๋Š” ์žˆ์ง€๋งŒ E1โ‹ˆE2E_1 \bowtie E_2E1โ€‹โ‹ˆE2โ€‹ ํ†ต๊ณ„๋Š” ์—†์Œ
  • ๊ฐ ์—ฐ์‚ฐ output์— ๋Œ€ํ•œ ํ†ต๊ณ„ ๊ฐ’ '์ถ”์ •' ํ•„์š”

Selection Size Estimation

  • ฯƒA=v(r)\sigma_{A=v} (r)ฯƒA=vโ€‹(r)
    • nr/V(A,r)n_r / V(A,r)nrโ€‹/V(A,r): selection์„ ๋งŒ์กฑํ•  ๊ฒƒ์œผ๋กœ ์ถ”์ •๋˜๋Š” record ์ˆ˜ (A ๊ฐ’์ด ๋™์ผํ•œ tuple์˜ ํ‰๊ท  ์ˆ˜)
    • Key attribute์— ๋Œ€ํ•œ ๋™๋“ฑ ์กฐ๊ฑด: ํฌ๊ธฐ ์ถ”์ • = 1
  • ฯƒAโ‰คv(r)\sigma_{A \le v} (r)ฯƒAโ‰คvโ€‹(r) ( ฯƒAโ‰ฅv(r)\sigma_{A \ge v} (r)ฯƒAโ‰ฅvโ€‹(r)๋„ ๋Œ€์นญ)
    • Attribute ๊ฐ’์˜ 'uniform' ๋ถ„ํฌ ๊ฐ€์ •
    • ccc = ์กฐ๊ฑด์„ ๋งŒ์กฑํ•˜๋Š” ์ถ”์ • tuple ์ˆ˜
    • min(A,ย r)\text{min}(A,~r)min(A,ย r)๊ณผ max(A,ย r)\text{max}(A,~r)max(A,ย r)์„ catalog์—์„œ ์‚ฌ์šฉ ๊ฐ€๋Šฅํ•˜๋ฉด:
      • c=0c = 0c=0, (if v<min(A,ย r)v < \text{min}(A,~r)v<min(A,ย r))
      • c=nrc = n_rc=nrโ€‹, (if vโ‰ฅmax(A,ย r)v \ge \text{max}(A,~r)vโ‰ฅmax(A,ย r))
      • c=nrโ‹…vโˆ’min(A,ย r)max(A,ย r)โˆ’min(A,ย r)c = n_r \cdot \frac{v - \text{min}(A,~r)}{\text{max}(A,~r) - \text{min}(A,~r)}c=nrโ€‹โ‹…max(A,ย r)โˆ’min(A,ย r)vโˆ’min(A,ย r)โ€‹, (otherwise)
    • Histogram ์‚ฌ์šฉ ๊ฐ€๋Šฅํ•˜๋ฉด ์œ„ ์ถ”์ •์น˜ ๊ฐœ์„  ๊ฐ€๋Šฅ
    • ํ†ต๊ณ„ ์ •๋ณด ๋ถ€์žฌ ์‹œ, c=nr2c = \frac{n_r} {2}c=2nrโ€‹โ€‹๋กœ ๊ฐ€์ •

alt text

Size Estimation of Complex Selections

  • ์กฐ๊ฑด ฮธi\theta_iฮธiโ€‹์˜ Selectivity: relation rrr์˜ tuple์ด ฮธi\theta_iฮธiโ€‹๋ฅผ ๋งŒ์กฑํ•  ํ™•๋ฅ 
    • sis_isiโ€‹๊ฐ€ rrr์—์„œ ๋งŒ์กฑํ•˜๋Š” tuple ์ˆ˜ (์ด์ „ ์Šฌ๋ผ์ด๋“œ์—์„œ ์ถ”์ •)์ด๋ฉด, ฮธi\theta_iฮธiโ€‹์˜ selectivity๋Š” si/nrs_i / n_rsiโ€‹/nrโ€‹
  • Conjunction
    • ฯƒฮธ1โˆงฮธ2โˆงโ‹ฏโˆงฮธn(r)\sigma_{\theta_1 \land \theta_2 \land \dots \land \theta_n} (r)ฯƒฮธ1โ€‹โˆงฮธ2โ€‹โˆงโ‹ฏโˆงฮธnโ€‹โ€‹(r)
    • ๊ฒฐ๊ณผ tuple ์ถ”์ •์น˜ (๋…๋ฆฝ์„ฑ ๊ฐ€์ •)
      • nrโ‹…(s1nr)โ‹…(s2nr)โ‹ฏ(snnr)n_r \cdot (\frac{s_1}{n_r}) \cdot (\frac{s_2}{n_r}) \cdots (\frac{s_n}{n_r})nrโ€‹โ‹…(nrโ€‹s1โ€‹โ€‹)โ‹…(nrโ€‹s2โ€‹โ€‹)โ‹ฏ(nrโ€‹snโ€‹โ€‹)
  • Disjunction
    • ฯƒฮธ1โˆจฮธ2โˆจโ‹ฏโˆจฮธn(r)\sigma_{\theta_1 \lor \theta_2 \lor \dots \lor \theta_n} (r)ฯƒฮธ1โ€‹โˆจฮธ2โ€‹โˆจโ‹ฏโˆจฮธnโ€‹โ€‹(r)
    • ์ถ”์ • tuple ์ˆ˜
      • nr(1โˆ’(1โˆ’s1nr)(1โˆ’s2nr)โ€ฆ(1โˆ’snnr))n_r \left( 1 - (1 - \frac{s_1}{n_r})(1 - \frac{s_2}{n_r}) \dots (1 - \frac{s_n}{n_r}) \right)nrโ€‹(1โˆ’(1โˆ’nrโ€‹s1โ€‹โ€‹)(1โˆ’nrโ€‹s2โ€‹โ€‹)โ€ฆ(1โˆ’nrโ€‹snโ€‹โ€‹))
  • Negation
    • ฯƒยฌฮธ(r)\sigma_{\neg \theta} (r)ฯƒยฌฮธโ€‹(r)
    • ์ถ”์ • tuple ์ˆ˜
      • nrn_rnrโ€‹ โ€“ (ฯƒฮธ(r)\sigma_{\theta} (r)ฯƒฮธโ€‹(r)์˜ ์ถ”์ • tuple ์ˆ˜)

Join Size Estimation: A Running Example

  • ์˜ˆ์ œ Query: student โ‹ˆ\bowtieโ‹ˆ takes
  • Catalog ์ •๋ณด:
    • nstudent=5000n_{\text{student}} = 5000nstudentโ€‹=5000
    • fstudent=50f_{\text{student}} = 50fstudentโ€‹=50, bstudent=100b_{\text{student}} = 100bstudentโ€‹=100
    • ntakes=10000n_{\text{takes}} = 10000ntakesโ€‹=10000
    • ftakes=25f_{\text{takes}} = 25ftakesโ€‹=25, btakes=400b_{\text{takes}} = 400btakesโ€‹=400
    • V(ID,ย takes)=2500V(\text{ID,~takes}) = 2500V(ID,ย takes)=2500 (๊ณผ๋ชฉ์„ ์ˆ˜๊ฐ•ํ•œ ํ•™์ƒ์€ ํ‰๊ท  4๊ณผ๋ชฉ ์ˆ˜๊ฐ•)
    • takes์˜ ID๋Š” student๋ฅผ ์ฐธ์กฐํ•˜๋Š” foreign key
    • V(ID,ย student)=5000V(\text{ID,~student}) = 5000V(ID,ย student)=5000 (primary key)

Join Size Estimation

  • Cartesian product rร—sr \times srร—s๋Š” nrโ‹…nsn_r \cdot n_snrโ€‹โ‹…nsโ€‹ ๊ฐœ์˜ tuple ํฌํ•จ. ๊ฐ tuple์€ lr+lsl_r + l_slrโ€‹+lsโ€‹ bytes
  • rโ‹ˆs=rร—sr \bowtie s = r \times srโ‹ˆs=rร—s, (if RโˆฉS=โˆ…R \cap S = \emptysetRโˆฉS=โˆ…)
  • RโˆฉSR \cap SRโˆฉS๊ฐ€ RRR์˜ key์ธ ๊ฒฝ์šฐ
    • sss์˜ tuple์€ rrr์˜ ์ตœ๋Œ€ 1๊ฐœ tuple๊ณผ join. ( rโ‹ˆsr \bowtie srโ‹ˆs์˜ tuple ์ˆ˜) โ‰คns\le n_sโ‰คnsโ€‹
  • RโˆฉSR \cap SRโˆฉS๊ฐ€ SSS์—์„œ RRR์„ ์ฐธ์กฐํ•˜๋Š” foreign key์ธ ๊ฒฝ์šฐ: (rโ‹ˆsr \bowtie srโ‹ˆs์˜ tuple ์ˆ˜) =ns= n_s=nsโ€‹ (foreign key ์ œ์•ฝ์กฐ๊ฑด ๋•Œ๋ฌธ)
  • student โ‹ˆ\bowtieโ‹ˆ takes ์˜ˆ์ œ: takes์˜ ID๋Š” student๋ฅผ ์ฐธ์กฐํ•˜๋Š” foreign key, ๋”ฐ๋ผ์„œ ๊ฒฐ๊ณผ๋Š” ntakesn_{\text{takes}}ntakesโ€‹ (10000)๊ฐœ์˜ tuple ๊ฐ€์ง
  • RโˆฉS={A}R \cap S = \{A\}RโˆฉS={A}๊ฐ€ RRR ๋˜๋Š” SSS์˜ key๊ฐ€ ์•„๋‹Œ ๊ฒฝ์šฐ:
    • RRR์˜ ๋ชจ๋“  tuple ttt๊ฐ€ Rโ‹ˆSR \bowtie SRโ‹ˆS์—์„œ tuple์„ ์ƒ์„ฑํ•œ๋‹ค๊ณ  ๊ฐ€์ •ํ•˜๋ฉด, Rโ‹ˆSR \bowtie SRโ‹ˆS์˜ tuple ์ˆ˜ ์ถ”์ •
      • nrโ‹…nsV(A,ย s)\frac{n_r \cdot n_s}{V(A,~s)}V(A,ย s)nrโ€‹โ‹…nsโ€‹โ€‹
    • ๋ฐ˜๋Œ€์˜ ๊ฒฝ์šฐ (S ๊ธฐ์ค€) ์ถ”์ •
      • nrโ‹…nsV(A,ย r)\frac{n_r \cdot n_s}{V(A,~r)}V(A,ย r)nrโ€‹โ‹…nsโ€‹โ€‹
    • ์ด ๋‘ ์ถ”์ •์น˜ ์ค‘ ๋‚ฎ์€ ๊ฐ’์ด ์•„๋งˆ๋„ ๋” ์ •ํ™• (join์— ์ฐธ์—ฌํ•˜์ง€ ์•Š๋Š” dangling tuple ์กด์žฌ ๊ฐ€๋Šฅ)
  • Foreign key ์ •๋ณด ์—†์ด student โ‹ˆ\bowtieโ‹ˆ takes ํฌ๊ธฐ ์ถ”์ •:
    • V(ID,ย takes)=2500V(\text{ID,~takes}) = 2500V(ID,ย takes)=2500, V(ID,ย student)=5000V(\text{ID,~student}) = 5000V(ID,ย student)=5000
    • ๋‘ ์ถ”์ •์น˜
      • (5000โ‹…10000)/2500=20000(5000 \cdot 10000) / 2500 = 20000(5000โ‹…10000)/2500=20000 ๋ฐ (5000โ‹…10000)/5000=10000(5000 \cdot 10000) / 5000 = 10000(5000โ‹…10000)/5000=10000
    • ๋” ๋‚ฎ์€ ์ถ”์ •์น˜ 10000 ์„ ํƒ (foreign key ์‚ฌ์šฉํ•œ ์ด์ „ ๊ณ„์‚ฐ๊ณผ ๋™์ผ)

Size Estimation for Other Operations

  • Projection
    • ฮ A(r)\Pi_A (r)ฮ Aโ€‹(r)์˜ ์ถ”์ • ํฌ๊ธฐ = V(A,ย r)V(A,~r)V(A,ย r)
  • Set operations:
    • ๋™์ผ relation์— ๋Œ€ํ•œ selection์˜ union/intersection: ์žฌ์ž‘์„ฑ ํ›„ selection ํฌ๊ธฐ ์ถ”์ • ์‚ฌ์šฉ (์˜ˆ: ฯƒฮธ1(r)โˆชฯƒฮธ2(r)โ†’ฯƒฮธ1โˆจฮธ2(r)\sigma_{\theta_1} (r) \cup \sigma_{\theta_2} (r) \rightarrow \sigma_{\theta_1 \lor \theta_2} (r)ฯƒฮธ1โ€‹โ€‹(r)โˆชฯƒฮธ2โ€‹โ€‹(r)โ†’ฯƒฮธ1โ€‹โˆจฮธ2โ€‹โ€‹(r))
    • ๋‹ค๋ฅธ relation์— ๋Œ€ํ•œ ์—ฐ์‚ฐ:
      • rโˆชsr \cup srโˆชs ์ถ”์ • ํฌ๊ธฐ = nr+nsn_r + n_snrโ€‹+nsโ€‹
      • rโˆฉsr \cap srโˆฉs ์ถ”์ • ํฌ๊ธฐ = minโก(nr,ย ns)\min(n_r,~n_s)min(nrโ€‹,ย nsโ€‹)
      • rโˆ’sr - srโˆ’s ์ถ”์ • ํฌ๊ธฐ = nrn_rnrโ€‹
    • (์„ธ ์ถ”์ •์น˜ ๋ชจ๋‘ ๋ถ€์ •ํ™•ํ•  ์ˆ˜ ์žˆ์œผ๋‚˜ upper bound ์ œ๊ณต)

Estimation of Number of Distinct Values

  • ๊ฐ ์—ฐ์‚ฐ ๊ฒฐ๊ณผ์˜ attribute AAA์— ๋Œ€ํ•œ distinct ๊ฐ’์˜ ์ˆ˜ V(A,โ€ฆโ€‰)V(A, \dots)V(A,โ€ฆ) ์ถ”์ • ํ•„์š”
  • Selections: ฯƒฮธ(r)\sigma_{\theta} (r)ฯƒฮธโ€‹(r)
    • ฮธ\thetaฮธ๊ฐ€ AAA๋ฅผ ํŠน์ • ๊ฐ’(์˜ˆ: A=3A=3A=3)์œผ๋กœ ๊ฐ•์ œ
      • V(A,ย ฯƒฮธ(r))=1V(A,~\sigma_{\theta} (r)) = 1V(A,ย ฯƒฮธโ€‹(r))=1
    • ฮธ\thetaฮธ๊ฐ€ AAA๋ฅผ ํŠน์ • ๊ฐ’ ์ง‘ํ•ฉ ์ค‘ ํ•˜๋‚˜(์˜ˆ: A=1โˆจA=3โˆจA=4A=1 \lor A=3 \lor A=4A=1โˆจA=3โˆจA=4)๋กœ ๊ฐ•์ œ
      • V(A,ย ฯƒฮธ(r))=V(A,~\sigma_{\theta} (r)) =V(A,ย ฯƒฮธโ€‹(r))= (๋ช…์‹œ๋œ ๊ฐ’์˜ ์ˆ˜)
    • ฮธ\thetaฮธ๊ฐ€ Aย (op)ย vA \text{ (op) } vAย (op)ย v ํ˜•ํƒœ
      • V(A,ย ฯƒฮธ(r))โ‰ˆV(A,ย r)โ‹…sV(A,~\sigma_{\theta} (r)) \approx V(A,~r) \cdot sV(A,ย ฯƒฮธโ€‹(r))โ‰ˆV(A,ย r)โ‹…s (sss๋Š” selection์˜ selectivity)
    • ๊ทธ ์™ธ
      • minโก(V(A,ย r),ย nฯƒฮธ(r))\min(V(A,~r),~n_{\sigma_{\theta}(r)})min(V(A,ย r),ย nฯƒฮธโ€‹(r)โ€‹) (upper bound)
  • Joins: rโ‹ˆsr \bowtie srโ‹ˆs
    • AAA๊ฐ€ rrr์˜ attribute์ธ ๊ฒฝ์šฐ
      • V(A,ย rโ‹ˆs)โ‰ˆminโก(V(A,ย r),ย nrโ‹ˆs)V(A,~r \bowtie s) \approx \min(V(A,~r),~n_{r \bowtie s})V(A,ย rโ‹ˆs)โ‰ˆmin(V(A,ย r),ย nrโ‹ˆsโ€‹)
  • Projections: ฮ A(r)\Pi_A (r)ฮ Aโ€‹(r)
    • V(A,ย ฮ A(r))=V(A,ย r)V(A,~\Pi_A (r)) = V(A,~r)V(A,ย ฮ Aโ€‹(r))=V(A,ย r)

Choice of Evaluation Plans

  • Cost-based optimizer: ์ฃผ์–ด์ง„ query์™€ ๋™๋“ฑํ•œ ๋ชจ๋“  query-evaluation plan ํƒ์ƒ‰, ์ตœ์†Œ ์ถ”์ • ๋น„์šฉ plan ์„ ํƒ
  • Plan ์„ ํƒ ์‹œ evaluation technique ๊ฐ„ ์ƒํ˜ธ์ž‘์šฉ ๊ณ ๋ ค ํ•„์š”
  • ๊ฐ ์—ฐ์‚ฐ์— ๋Œ€ํ•ด ๋…๋ฆฝ์ ์œผ๋กœ ๊ฐ€์žฅ ์ €๋ ดํ•œ algorithm ์„ ํƒ์ด ์ตœ์ ์˜ ์ „์ฒด algorithm์„ ๋ณด์žฅํ•˜์ง€ ์•Š์Œ
    • ์˜ˆ: Merge-join์ด hash-join๋ณด๋‹ค ๋น„์‹ธ์ง€๋งŒ, ์ •๋ ฌ๋œ output ์ œ๊ณตํ•˜์—ฌ ์™ธ๋ถ€ ์ง‘๊ณ„(aggregation) ๋น„์šฉ ์ค„์ผ ์ˆ˜ ์žˆ์Œ
    • (Block) nested-loop join์€ pipelining ๊ธฐํšŒ ์ œ๊ณต ๊ฐ€๋Šฅ
  • ์‹ค์ œ query optimizer๋Š” ๋‹ค์Œ ๋‘ ์ ‘๊ทผ ๋ฐฉ์‹ ์š”์†Œ ํ†ตํ•ฉ
    1. ๋ชจ๋“  plan ํƒ์ƒ‰ ํ›„ cost-based ๋ฐฉ์‹์œผ๋กœ ์ตœ์„  plan ์„ ํƒ
    2. Heuristics๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ plan ์„ ํƒ

Cost-based Join Order Selection

  • r1โ‹ˆr2โ‹ˆโ‹ฏโ‹ˆrnr_1 \bowtie r_2 \bowtie \dots \bowtie r_nr1โ€‹โ‹ˆr2โ€‹โ‹ˆโ‹ฏโ‹ˆrnโ€‹์— ๋Œ€ํ•œ ์ตœ์  join-order ์ฐพ๊ธฐ
  • (2(nโˆ’1))!/(nโˆ’1)!(2(n-1))! / (n-1)!(2(nโˆ’1))!/(nโˆ’1)! ๊ฐœ์˜ ๋‹ค๋ฅธ join order ์กด์žฌ
  • n=3n=3n=3์ผ ๋•Œ 12๊ฐœ

r1โ‹ˆ(r2โ‹ˆr3)r1โ‹ˆ(r3โ‹ˆr2)(r2โ‹ˆr3)โ‹ˆr1(r3โ‹ˆr2)โ‹ˆr1r2โ‹ˆ(r1โ‹ˆr3)r2โ‹ˆ(r3โ‹ˆr1)(r1โ‹ˆr3)โ‹ˆr2(r3โ‹ˆr1)โ‹ˆr2r3โ‹ˆ(r1โ‹ˆr2)r3โ‹ˆ(r2โ‹ˆr1)(r1โ‹ˆr2)โ‹ˆr3(r2โ‹ˆr1)โ‹ˆr3\begin{array}{cccc} r_1 \bowtie (r_2 \bowtie r_3) & r_1 \bowtie (r_3 \bowtie r_2) & (r_2 \bowtie r_3) \bowtie r_1 & (r_3 \bowtie r_2) \bowtie r_1 \\ r_2 \bowtie (r_1 \bowtie r_3) & r_2 \bowtie (r_3 \bowtie r_1) & (r_1 \bowtie r_3) \bowtie r_2 & (r_3 \bowtie r_1) \bowtie r_2 \\ r_3 \bowtie (r_1 \bowtie r_2) & r_3 \bowtie (r_2 \bowtie r_1) & (r_1 \bowtie r_2) \bowtie r_3 & (r_2 \bowtie r_1) \bowtie r_3 \end{array} r1โ€‹โ‹ˆ(r2โ€‹โ‹ˆr3โ€‹)r2โ€‹โ‹ˆ(r1โ€‹โ‹ˆr3โ€‹)r3โ€‹โ‹ˆ(r1โ€‹โ‹ˆr2โ€‹)โ€‹r1โ€‹โ‹ˆ(r3โ€‹โ‹ˆr2โ€‹)r2โ€‹โ‹ˆ(r3โ€‹โ‹ˆr1โ€‹)r3โ€‹โ‹ˆ(r2โ€‹โ‹ˆr1โ€‹)โ€‹(r2โ€‹โ‹ˆr3โ€‹)โ‹ˆr1โ€‹(r1โ€‹โ‹ˆr3โ€‹)โ‹ˆr2โ€‹(r1โ€‹โ‹ˆr2โ€‹)โ‹ˆr3โ€‹โ€‹(r3โ€‹โ‹ˆr2โ€‹)โ‹ˆr1โ€‹(r3โ€‹โ‹ˆr1โ€‹)โ‹ˆr2โ€‹(r2โ€‹โ‹ˆr1โ€‹)โ‹ˆr3โ€‹โ€‹

  • n=7n=7n=7์ผ ๋•Œ 665,280๊ฐœ, n=10n=10n=10์ผ ๋•Œ 1760์–ต ๊ฐœ ์ด์ƒ
  • ๋ชจ๋“  join order ์ƒ์„ฑ ๋ถˆํ•„์š”
  • Dynamic programming ์‚ฌ์šฉ
    • {r1,ย r2,ย โ€ฆ,ย rn}\{r_1,~r_2,~\dots,~r_n\}{r1โ€‹,ย r2โ€‹,ย โ€ฆ,ย rnโ€‹}์˜ ๋ชจ๋“  ๋ถ€๋ถ„์ง‘ํ•ฉ(subset)์— ๋Œ€ํ•œ ์ตœ์†Œ ๋น„์šฉ join order๋ฅผ ํ•œ ๋ฒˆ๋งŒ ๊ณ„์‚ฐํ•˜๊ณ  ํ–ฅํ›„ ์‚ฌ์šฉ ์œ„ํ•ด ์ €์žฅ
  • ์˜ˆ: (r1โ‹ˆr2โ‹ˆr3)โ‹ˆr4โ‹ˆr5(r_1 \bowtie r_2 \bowtie r_3) \bowtie r_4 \bowtie r_5(r1โ€‹โ‹ˆr2โ€‹โ‹ˆr3โ€‹)โ‹ˆr4โ€‹โ‹ˆr5โ€‹
  • {r1,ย r2,ย r3}\{r_1,~r_2,~r_3\}{r1โ€‹,ย r2โ€‹,ย r3โ€‹}์— ๋Œ€ํ•œ ์ตœ์  join order ์ฐพ์œผ๋ฉด, r4,r5r_4, r_5r4โ€‹,r5โ€‹์™€์˜ ์ถ”๊ฐ€ join์— ํ•ด๋‹น order ์‚ฌ์šฉ

Left-Deep Join Trees

  • Left-deep join tree์—์„œ ๊ฐ join์˜ ์˜ค๋ฅธ์ชฝ ์ž…๋ ฅ(right-hand-side input)์€ (์ค‘๊ฐ„ join ๊ฒฐ๊ณผ๊ฐ€ ์•„๋‹Œ) relation
  • Pipelined evaluation์— ํŠนํžˆ ํŽธ๋ฆฌ (์˜ค๋ฅธ์ชฝ ํ”ผ์—ฐ์‚ฐ์ž๊ฐ€ ์ €์žฅ๋œ relation์ด๋ฏ€๋กœ ๊ฐ join์˜ ์ž…๋ ฅ ์ค‘ ํ•˜๋‚˜๋งŒ pipelined)
  • System R optimizer๋Š” left-deep join order๋งŒ ๊ณ ๋ ค
  • ์ด n!n!n! ๊ฐœ์˜ join order ( (2(nโˆ’1))!/(nโˆ’1)!(2(n-1))! / (n-1)!(2(nโˆ’1))!/(nโˆ’1)! ๋ณด๋‹ค ํ›จ์”ฌ ์ ์Œ) alt text

Heuristic Optimization

  • Cost-based optimization์€ dynamic programming ์‚ฌ์šฉํ•ด๋„ ๋น„์šฉ ๋†’์Œ
  • System์€ cost-based ๋ฐฉ์‹์œผ๋กœ ์„ ํƒํ•ด์•ผ ํ•˜๋Š” ๋Œ€์•ˆ ์ˆ˜๋ฅผ ์ค„์ด๊ธฐ ์œ„ํ•ด heuristics ์‚ฌ์šฉ ๊ฐ€๋Šฅ
  • Heuristic optimization: ์‹คํ–‰ ์„ฑ๋Šฅ์„ (ํ•ญ์ƒ์€ ์•„๋‹ˆ์ง€๋งŒ) ์ผ๋ฐ˜์ ์œผ๋กœ ํ–ฅ์ƒ์‹œํ‚ค๋Š” ๊ทœ์น™ ์ง‘ํ•ฉ์„ ์‚ฌ์šฉํ•˜์—ฌ query-tree ๋ณ€ํ™˜
    • Selection ์กฐ๊ธฐ ์ˆ˜ํ–‰ (tuple ์ˆ˜ ๊ฐ์†Œ)
    • Projection ์กฐ๊ธฐ ์ˆ˜ํ–‰ (attribute ์ˆ˜ ๊ฐ์†Œ)
    • ๋‹ค๋ฅธ ์œ ์‚ฌ ์—ฐ์‚ฐ๋ณด๋‹ค ๊ฐ€์žฅ ์ œํ•œ์ ์ธ(๊ฐ€์žฅ ์ž‘์€ ๊ฒฐ๊ณผ ํฌ๊ธฐ) selection ๋ฐ join ์—ฐ์‚ฐ ๋จผ์ € ์ˆ˜ํ–‰
  • ์ผ๋ถ€ system์€ heuristics๋งŒ ์‚ฌ์šฉ, ๋‹ค๋ฅธ system์€ heuristics์™€ ๋ถ€๋ถ„์  cost-based optimization ๊ฒฐํ•ฉ
์ตœ๊ทผ ์ˆ˜์ •: 25. 11. 30. ์˜คํ›„ 2:45
Contributors: kmbzn
Prev
15. Query Processing
Next
17. Transactions

BUILT WITH

CloudflareNode.jsGitHubGitVue.jsJavaScriptVSCodenpm

All trademarks and logos are property of their respective owners.
ยฉ 2025 kmbzn ยท MIT License