Inserting a value into an existing B-Tree so that it remains a valid B-Tree turns out to be amazingly easy for so complex a data structure. If there is still room in the page then insertion really is trivial. The only problem arises if the page is full, i.e. already has 2n items. In this case the full page is split into two new pages at the same level each containing n items and one of the items is inserted into the page above - see Figure 8.
Figure 8: Inserting an item into a B-Tree
Of course there is always the possibility that the item that you have to insert in the page above will need that page to be split and so on, propagating splits perhaps even as far up the tree as the root page.
Notice that the splitting operation propagating back to the root is the only way that a B-Tree ever gets any deeper - which is a weird way to grow a tree. You can work out a similar operation for deleting elements from a B-Tree.
The advantages of the B-Tree form of index are reasonably obvious:
You only need access at most as many pages as there are levels in the tree.
As long as pages are the same size as a disk sector, reading or writing a page only involves a single disk access.
At worst only 50% of the disk space allocated to the index is empty.
Updating the index requires each page involved in the update to be accessed and manipulated only once. A less obvious advantage is the local nature of the update.
In a multi-user system or LAN only the pages involved in the modification have to be locked making it possible to share an index efficiently.
There are subtle ways of improving the performance of a B-Tree by redistributing items between pages to achieve a better balance but this is icing on the tree.
Many database packages proclaim the fact that they are better because they use B-Trees - now you know why. If you need to make use of B-Trees yourself then you can program everything from scratch but there are plenty of B-Tree subroutine libraries that will save you hours of coding and now you understand why you need one and how they work.
Credits
I have to admit that my account of B-Trees is based on the one given by Niklaus Wirth in his classic book Algorithms+Data Structure=Programs. If you need a more complete but more abstract approach complete with code fragments then try to get hold of a copy. Alternatively turn to The Algorithm Design Manualor one of other books in the side panel.
Glossary
ancestor
a node above
binary tree
each node has at most two descendants
descendant
a node below
degree of node
the number of direct descendants
degree of tree
the maximum degree of any node in the tree
depth of tree
the maximum level of any node
interior node
any node that isn't a terminal node
internal path
length the sum of all the path lengths
leaf
a final node
level
the root is at level one, its direct descendants
are at level 2 and so on
ordered tree
one in which the order of the descendants from each node is important
path length of node
the distance of the node from the root
root
the first node
sub-tree
a complete set of nodes connected to any given node
You have probably heard of Quicksort but this is just one of many partitioning algorithms that work in clever ways to do things faster. Here we look at another of the algorithms devised by C.A.R. Hoar [ ... ]
John Conway's Life isn't just a fascinating program, it's an example of a cellular automaton. The theory of cellular automata (CA) sounds intimidating, but in fact it's simple and fun. It is a deep my [ ... ]