This repository has been archived by the owner on Sep 5, 2020. It is now read-only.

50-Day6.md

File metadata and controls

16 lines (14 loc) · 1.22 KB

Lecture 6 Summary

SQL

In this lecture, we learned about:

  • Multiset operators (intersect, union, compositional)
  • Equivalent SQL queries: the same result can be written in several ways, and the form derived through relational algebra tends to be faster
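
To make the multiset operators concrete, here is a minimal sketch using Python's `collections.Counter` as a stand-in for a bag of rows. It assumes bag semantics in the style of SQL's `UNION ALL` and `INTERSECT ALL`; the bags `r` and `s` are made-up examples, not data from the lecture.

```python
from collections import Counter

# Two hypothetical bags (multisets) of values; duplicates are meaningful.
r = Counter(["a", "a", "b"])
s = Counter(["a", "b", "b", "c"])

# Multiset union (in the style of UNION ALL): multiplicities add up.
bag_union = r + s

# Multiset intersection (in the style of INTERSECT ALL):
# each value keeps the smaller of its two multiplicities.
bag_intersect = r & s

print(sorted(bag_union.elements()))      # ['a', 'a', 'a', 'b', 'b', 'b', 'c']
print(sorted(bag_intersect.elements()))  # ['a', 'b']
```

Unlike set operators, the result keeps track of how many times each value occurs.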

IO model

Generally speaking, data is read from the hard disk into a cache (buffer) held in main memory. The hard disk is divided into many pages, and a page is the unit of transfer between disk and cache.

To operate on a file, first read the corresponding page from disk into the cache. Then modify the page in the cache, and either write it back to disk or discard the changes.
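
The read, modify, then write-back-or-discard cycle can be sketched as a toy page cache. This is only an illustration under simplifying assumptions: the "disk" is a Python dict, there is no eviction policy, and the names `BufferCache`, `read_page`, `flush`, and `discard` are hypothetical, not part of any real system.

```python
class BufferCache:
    """Toy page cache over a simulated disk (a dict of page_id -> bytes)."""

    def __init__(self, disk):
        self.disk = disk   # stands in for the hard disk
        self.pages = {}    # page_id -> bytearray currently held in the cache
        self.reads = 0     # number of page reads from "disk"
        self.writes = 0    # number of page writes to "disk"

    def read_page(self, page_id):
        # Fetch the page from disk only if it is not already cached.
        if page_id not in self.pages:
            self.pages[page_id] = bytearray(self.disk[page_id])
            self.reads += 1
        return self.pages[page_id]

    def flush(self, page_id):
        # Save the cached (possibly modified) page back to disk.
        self.disk[page_id] = bytes(self.pages[page_id])
        self.writes += 1

    def discard(self, page_id):
        # Give up: drop the cached copy without writing it back.
        self.pages.pop(page_id, None)


disk = {0: b"hello", 1: b"world"}
cache = BufferCache(disk)

page = cache.read_page(0)
page[0:1] = b"J"
cache.flush(0)     # the change reaches disk: disk[0] == b"Jello"

page = cache.read_page(1)
page[0:1] = b"W"
cache.discard(1)   # the change is abandoned: disk[1] is still b"world"
```

The cost of an operation in this model is counted in page reads and writes, not in CPU steps.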

External sort

External sort is a sorting algorithm that works through I/O buffers (cache pages). Its basic step is a merge: read two sorted runs from disk and write the merged sorted run back to disk. The merge needs three pages of buffer space: one for each input run and one for the output. Because only a few pages are in memory at a time, this algorithm suits data that is too large to sort in memory and must stay in external storage.
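
The three-page merge step can be sketched as follows. This is a simplified illustration, assuming runs are plain Python lists read one page at a time; `PAGE_SIZE`, `pages`, and `merge_runs` are hypothetical names, and the output list stands in for the output file on disk.

```python
PAGE_SIZE = 2  # records per page (hypothetical, tiny for illustration)


def pages(run):
    """Yield a sorted run one page at a time, as if reading it from disk."""
    for i in range(0, len(run), PAGE_SIZE):
        yield run[i:i + PAGE_SIZE]


def merge_runs(run_a, run_b):
    """Merge two sorted runs while holding at most three pages in memory."""
    src_a, src_b = pages(run_a), pages(run_b)
    buf_a, buf_b = next(src_a, []), next(src_b, [])  # the two input pages
    out_buf, output = [], []                          # output page, output "disk"
    ia = ib = 0
    while buf_a or buf_b:
        # Take from A when B is exhausted or A's current record is not larger.
        take_a = bool(buf_a) and (not buf_b or buf_a[ia] <= buf_b[ib])
        if take_a:
            out_buf.append(buf_a[ia])
            ia += 1
            if ia == len(buf_a):          # input page used up: load the next one
                buf_a, ia = next(src_a, []), 0
        else:
            out_buf.append(buf_b[ib])
            ib += 1
            if ib == len(buf_b):
                buf_b, ib = next(src_b, []), 0
        if len(out_buf) == PAGE_SIZE:     # output page full: write it out
            output.extend(out_buf)
            out_buf = []
    output.extend(out_buf)                # write the last, partly filled page
    return output


print(merge_runs([1, 4, 7, 9], [2, 3, 8]))  # [1, 2, 3, 4, 7, 8, 9]
```

Repeating this merge over longer and longer runs sorts a whole file without ever holding more than three pages in memory.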

Index of data

Like the catalog used to manage books, data should be looked up through an index. In practice, a B+ tree is used for indexing, and each node fits in one disk page. Because a page holds many keys, each node has a large fan-out, so the tree is much shallower than a binary tree over the same data. A shallower tree means fewer page reads from external storage per lookup, which lowers the cost of indexing data.
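
A rough calculation shows how fan-out affects height. The sketch below uses a simplified model in which every node is full and each level costs one page read; `btree_levels` is a hypothetical helper, and the fan-out of 100 entries per page is an assumed figure, not one from the lecture.

```python
def btree_levels(num_keys, fanout):
    """Smallest number of levels h such that fanout**h >= num_keys."""
    levels, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


n = 1_000_000
print(btree_levels(n, 2))    # 20 levels with a binary-tree-like fan-out
print(btree_levels(n, 100))  # 3 levels with 100 entries per page (assumed)
```

Under these assumptions, a lookup among a million keys touches about 3 pages instead of about 20.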