A Sort of Balanced Binary Trees

Table Of Contents

Disclaimer

I don't clam this is good code or a good algorithm or good for anything in particular. I'm sure that someone previously came up with the same "stuff". It is me just playing around with code. Take it for what it is.

Result Summary

I hypothesized that shuffling a few nodes during node insertion could make a binary search tree faster by shorting the number of levels (see diagrams below).

There are two different ways of applying my algorithm. btree_9 applies the algorithm on each new node as it is inserted in a tree. btree_a applies the algorithm after the tree is built.

The results of creating the code and testing it is that btree_9 does not work. btree_a, however, can make slight improvements in some cases, but in a few cases it make things slightly worse. (See the code/documentation/diagrams below.)

After seeing that the first hypothesis did not work in many cases, I hypothesized that perhaps the number of nodes in the tree had some effect (small tree vs large tree). This also did not work.

The results of testing various tree sizes indicated the following:

The result of various tree sizes (number of nodes) appears to be random good and bad. In general shuffling nodes does not make things better. In fact, it can make things much worse.

It was however fun playing with the code.

Introduction

This is me playing around with binary search trees.

Depending on the order of the incoming data, a tree can become very un-balanced, awkward, and slow searches down.

I was thinking, "How can I ameliorate really bad trees?". My code will not build a perfect binary tree, but perhaps build a tree that would speed up searches (a shallower, not as deep, tree). It would also be great if this was done at tree build time rather than after the fact. (This could be described as "build seldom" and "search often" data usage.)

Sure I could Google "stuff" (algorithms, etc.), but that's no fun. I wanted to see if I could come up with something. That is why the title is "A Sort of Balanced Binary Trees".

The following diagram shows a "bad" binary tree. As you can see it is much deeper that it needs to be.

image missing

I am hoping that my code will ameliorate the worst cases. If not, it was fun playing around with binary trees.

The code is not as compact as it could be, but hopefully it is easier to read and understand.

Also, there is code commented out. This code is "I tried out code" or "what if code" or "debugging code" or ... Remove it if you want.

By the way, I am using Python 3.

Node Class Definition

class Node:
    def __init__(self, data):
        self.data    = data
        self.left    = None
        self.right   = None
        self.parent  = None

    def PrintNode(self,title='',single_line = True):
        d = 'None' if self.data   == None else self.data
        l = 'None' if self.left   == None else self.left.data
        r = 'None' if self.right  == None else self.right.data
        p = 'None' if self.parent == None else self.parent.data
        if title: print(title)
        if single_line:
            print('(d={},l={},r={},p={})'.format(d,l,r,p))
        else:
            print('data   = {}'.format(d))
            print('left   = {}'.format(l))
            print('right  = {}'.format(r))
            print('parent = {}'.format(p))

image missing

You should know several things about the Node class definition

BinaryTree Class Definition

class BinaryTree:

    def __init__(self):
        self.root    = None
        self.balance = False
        self.debug   = False

    .... tree functions not show here ....
    .... see the link  below          ....

You should know several things about the BinaryTree class definition

Diagrams

As you can see in the diagrams below, adding node 55 will make the depth of the right subtree longer. However, if node 50 has an empty left subtree, the nodes can be shuffled to keep the subtrees the same depth. The shuffling can be done when node 55 is inserted into the tree as part of the insert process.

Insert Node 55
image missing
After Balancing
image missing

As you can see in the diagrams below, adding node 5 will make the depth of the left subtree longer. However, if node 10 has an empty right subtree, the nodes can be shuffled to keep the subtrees the same depth. The shuffling can be done when node 5 is inserted into the tree as part of the insert process.

Insert Node 5
image missing
After Balancing
image missing

Pseudo Code

Pseudo Code #1 Try to balance each time a node is inserted.

Pseudo Code #2 Try to balance each node after the tree is built.

Files

btree_9 tries to balance a new node's child nodes when the new node is inserted into the tree.

btree_a tries to balance a tree after the tree is create.

btree_9.py
btree_9_output.txt (sample output)
btree_9_test.py
btree_9.html (Pydoc)
btree_a.py
btree_a_output.txt (sample output)
btree_a_test.py
btree_a.html (Pydoc)

btree_?_raw_stats.py creates raw data for analysis

btree_9_raw_stats.py
btree_9_raw_stats.txt
btree_a_raw_stats.py
btree_a_raw_stats.txt

btree_?_plot_stats.py creates plot data for analysis1

btree_9_plot_stats.py
btree_9_plot_stats.csv
btree_9_plot_diff.py
btree_a_plot_stats.py
btree_a_plot_stats.csv
btree_a_plot_diff.py

1I have this idea that I could plot this data and get some meaningful information

btree_? functionality tests

btree_avg_test.py (for both btree_9 amd btree_a - see code)

Other btree programs from the web, etc.

Other BTree Files