I am trying to build a binary search tree, but it is vital for the algorithm I am implementing that the tree be stored in a vector, to reduce cache misses. My original idea was to adapt the heap insertion technique, since the data placement is the same and, once you add an item, you need to bubble it up the branch to make sure the properties of the data structure are respected (hence the O(log n) complexity). However, adapting the insert function has proven trickier than anticipated.

This is the original working code for the binary heap:

```
template <typename DataType>
void BinHeap<DataType>::Insert(const DataType& value)
{
    data.push_back(value);
    if(data.size() > 1)
    {
        BubbleUp(data.size() - 1);
    }
}

template <typename DataType>
void BinHeap<DataType>::BubbleUp(unsigned pos)
{
    int parentPos = Parent(pos);
    if(parentPos > 0 && data[parentPos] < data[pos])
    {
        std::swap(data[parentPos], data[pos]);
        BubbleUp(parentPos);
    }
}
```

And here is my attempt to adapt it into a vector-based binary search tree (please do not mind the odd naming of the class, as this is still not the final version):

```
template <typename DataType>
void BinHeap<DataType>::Insert(const DataType& value)
{
    data.push_back(value);
    if(data.size() > 1)
    {
        BubbleUp(data.size() - 1);
    }
}

template <typename DataType>
void BinHeap<DataType>::BubbleUp(unsigned pos)
{
    int parentPos = Parent(pos);
    bool isLeftSon = LeftSon(parentPos) == pos;
    if(parentPos >= 0)
    {
        if(isLeftSon && ( data[parentPos] < data[pos] ) )
        {
            std::swap(data[parentPos], data[pos]);
        }
        else if (data[parentPos] > data[pos]) // RightSon
        {
            std::swap(data[parentPos], data[pos]);
        }
        BubbleUp(parentPos - 1);
        BubbleDown(parentPos - 1);
    }
}

template <typename DataType>
void BinHeap<DataType>::BubbleDown(unsigned pos)
{
    int leftChild = LeftSon(pos);
    int rightChild = RightSon(pos);
    bool leftExists = leftChild < data.size() && leftChild > 0;
    bool rightExists = rightChild < data.size() && rightChild > 0;
    // No children
    if(!leftExists && !rightExists)
    {
        return;
    }
    if(leftExists && data[pos] < data[leftChild])
    {
        std::swap(data[leftChild], data[pos]);
    }
    else if (rightExists && data[pos] > data[rightChild])
    {
        std::swap(data[rightChild], data[pos]);
    }
}
```

This approach guarantees that the properties of the BST are respected locally, but not across siblings or ancestors (grandparents, etc.). For example, if every number from 1 to 16 is inserted in order, 12 will have a left child of 6 and a right child of 14. However, its parent 16 will have a left child of 8 and a right child of 12 (thus 6 is in the right subtree of 16). I feel my current approach is overcomplicating the process, but I am not sure how to rearrange it to make the necessary changes efficiently. Any insight would be greatly appreciated.

The realistic answer to the question title (which is at the time I composed this answer "How to create the insert function for a binary search tree built with a vector?") is: Don't do that!

It is clear from your code that you are trying to preserve the compact storage and self-balancing properties of a heap while also wishing it to be searchable via classic left/right child tree navigation. But the heap trick of using (index-1)/2 to locate the parent node only works for a "perfectly balanced" tree, that is, one whose N elements are packed into the array from 0 to N-1 with no gaps. On top of that, you expect an in-order walk of this tree to be sorted (if you didn't, then your binary left/right search navigation would not be able to find the right node).
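To make that requirement concrete, here is a minimal sketch of a checker that walks the packed array in order and tests whether the result is sorted. The names `LeftChild`, `RightChild`, `InOrder`, and `IsImplicitBst` are hypothetical and not part of the code in the question; the index arithmetic assumes the usual 0-based heap layout.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helpers: 0-based heap-style index arithmetic.
inline std::size_t LeftChild(std::size_t i)  { return 2 * i + 1; }
inline std::size_t RightChild(std::size_t i) { return 2 * i + 2; }

// Appends the in-order traversal of the implicit subtree rooted at `pos`.
template <typename T>
void InOrder(const std::vector<T>& data, std::size_t pos, std::vector<T>& out)
{
    if (pos >= data.size()) return;      // past the end: no such node
    InOrder(data, LeftChild(pos), out);
    out.push_back(data[pos]);
    InOrder(data, RightChild(pos), out);
}

// True if the packed array, read as an implicit tree, is a valid BST,
// i.e. its in-order walk is sorted.
template <typename T>
bool IsImplicitBst(const std::vector<T>& data)
{
    std::vector<T> walk;
    walk.reserve(data.size());
    InOrder(data, 0, walk);
    return std::is_sorted(walk.begin(), walk.end());
}
```

An array such as `{4, 2, 6, 1, 5, 3, 7}` shows the problem described in the question: every node is ordered correctly relative to its own two children, yet 5 sits in the left subtree of 4, so the in-order walk is not sorted and the check fails.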

Thus, you are maintaining a sorted set of elements in your array. Except, you have some strange rules for how to navigate the array to get the sorted order.

There is no way that your scheme can maintain this tree-ordered sorted array any more simply than a scheme that maintains a plain sorted array. The node manipulations only lead to a complicated piece of software that is difficult to understand, maintain, and reason about correctly. A sorted array, on the other hand, is easy to understand and maintain, and it is easy to see how it leads to a correct result. The binary search (or optionally, dictionary search) is fast.

While maintaining a sorted array requires a linear insertion logic, your scheme must be at least as complex, **because it is also maintaining a sorted set of elements in the array.**
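As a rough illustration of the plain sorted array alternative, the sketch below keeps a `std::vector` sorted on insertion and searches it with the standard binary search algorithms. The class name `SortedVector` and its members are hypothetical, chosen only for this example.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical wrapper that keeps its elements in ascending order.
template <typename T>
class SortedVector
{
public:
    // Finds the insertion point in O(log n) comparisons, then shifts the
    // tail of the vector in O(n) to make room for the new element.
    void Insert(const T& value)
    {
        auto pos = std::upper_bound(data_.begin(), data_.end(), value);
        data_.insert(pos, value);
    }

    // Binary search over the contiguous, sorted storage.
    bool Contains(const T& value) const
    {
        return std::binary_search(data_.begin(), data_.end(), value);
    }

    const std::vector<T>& Data() const { return data_; }

private:
    std::vector<T> data_;
};
```

The linear cost is confined to the single `insert` call, which moves a contiguous block of memory.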

If you want a data structure that is hardware data cache friendly, and provides logarithmic insertion and search, use a B+ tree. It is a little more complex than your average data structure, but this is a case where the complexity can be worth it, especially if regular trees just cause too much data cache thrash. As a bit of advice, optimal performance usually results if an interior node (with keys) is sized to fit within a cache line or two.
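One way to apply the sizing advice is to derive the number of keys per interior node from a byte budget. The following is only a layout sketch, not a working B+ tree: it assumes a 64-byte cache line, 32-bit child indices into a node pool, and a key type small enough that compiler padding does not push the node past the budget. `InteriorNode`, `kCacheLineBytes`, and `kMaxKeys` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines, which is common but not universal.
constexpr std::size_t kCacheLineBytes = 64;

// Interior node sized to fit within `Lines` cache lines.
template <typename Key, std::size_t Lines = 1>
struct InteriorNode
{
    static constexpr std::size_t kBudget = Lines * kCacheLineBytes;

    // Space is needed for the key count, kMaxKeys keys, and kMaxKeys + 1
    // child indices; solve for the largest kMaxKeys that fits the budget.
    static constexpr std::size_t kMaxKeys =
        (kBudget - 2 * sizeof(std::uint32_t)) /
        (sizeof(Key) + sizeof(std::uint32_t));

    std::uint32_t count = 0;               // number of keys in use
    Key           keys[kMaxKeys];          // sorted separator keys
    std::uint32_t children[kMaxKeys + 1];  // indices into a node pool
};

// Padding depends on the key type, so verify the size at compile time.
static_assert(sizeof(InteriorNode<std::int32_t>) <= kCacheLineBytes,
              "node should fit in one cache line");
```

With 32-bit keys this gives seven keys and eight children per one-line node, so each cache line fetched narrows the search eightfold instead of twofold.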