Zero  0.1.0
Public Types | Static Public Member Functions | Static Public Attributes | List of all members
btree_impl Class Reference

The internal implementation class which actually implements the functions of btree_m. More...

#include <btree_impl.h>

Inheritance diagram for btree_impl:

Public Types

enum  traverse_mode_t { t_fence_contain, t_fence_low_match, t_fence_high_match }
 3 modes of traverse(). More...
 
enum  slot_follow_t { t_follow_invalid = -3, t_follow_foster = -2, t_follow_pid0 = -1 }
 
enum  { GAC_HASH_BITS = 16, GAC_HASH_MOD = 65521 }
 

Static Public Member Functions

static rc_t _ux_get_page_and_status (StoreID store, const w_keystr_t &key, bool &need_lock, slotid_t &slot, bool &found, bool &took_XN, bool &is_ghost, btree_page_h &leaf)
 Function finds the leaf page for a key, locks the key if it exists, and determines if it was found and if it was a ghost. More...
 
static rc_t _ux_insert (StoreID store, const w_keystr_t &key, const cvec_t &elem)
 This function finds the leaf page to insert the given tuple, inserts it to the page and, if needed, splits the page. More...
 
static rc_t _ux_insert_core (StoreID store, const w_keystr_t &key, const cvec_t &elem)
 
static rc_t _ux_insert_core_tail (StoreID store, const w_keystr_t &key, const cvec_t &el, bool &need_lock, slotid_t &slot, bool &found, bool &alreay_took_XN, bool &is_ghost, btree_page_h &leaf)
 
static rc_t _ux_update (StoreID store, const w_keystr_t &key, const cvec_t &elem)
 This function finds the given key, updates the element if found. If needed, this method also splits the page. More...
 
static rc_t _ux_update_core (StoreID store, const w_keystr_t &key, const cvec_t &elem)
 
static rc_t _ux_update_core_tail (StoreID store, const w_keystr_t &key, const cvec_t &elem, bool &need_lock, slotid_t &slot, bool &found, bool &is_ghost, btree_page_h &leaf)
 
static rc_t _ux_put (StoreID store, const w_keystr_t &key, const cvec_t &elem)
 This function finds the given key, updates the element if found and inserts it if not. If needed, this method also splits the page. Could also be called "insert or update". More...
 
static rc_t _ux_put_core (StoreID store, const w_keystr_t &key, const cvec_t &elem)
 
static rc_t _ux_overwrite (StoreID store, const w_keystr_t &key, const char *el, smsize_t offset, smsize_t elen)
 This function finds the given key, updates the specific part of element if found. More...
 
static rc_t _ux_overwrite_core (StoreID store, const w_keystr_t &key, const char *el, smsize_t offset, smsize_t elen)
 
static rc_t _sx_reserve_ghost (btree_page_h &leaf, const w_keystr_t &key, int elem_len)
 Creates a ghost record for the key as a preparation for insert. Context: System transaction. More...
 
static rc_t _ux_reserve_ghost_core (btree_page_h &leaf, const w_keystr_t &key, int elem_len)
 
static rc_t _ux_remove (StoreID store, const w_keystr_t &key)
 Removes the specified key from B+Tree. More...
 
static rc_t _ux_remove_core (StoreID store, const w_keystr_t &key)
 
static rc_t _ux_undo_ghost_mark (StoreID store, const w_keystr_t &key)
 Reverses the ghost record of specified key to regular state. More...
 
static rc_t _ux_traverse (StoreID store, const w_keystr_t &key, traverse_mode_t traverse_mode, latch_mode_t leaf_latch_mode, btree_page_h &leaf, bool allow_retry=true)
 Traverse the btree starting at root node to find an appropriate leaf page. More...
 
static rc_t _ux_traverse_recurse (btree_page_h &start, const w_keystr_t &key, traverse_mode_t traverse_mode, latch_mode_t leaf_latch_mode, btree_page_h &leaf, PageID &leaf_pid_causing_failed_upgrade)
 For internal recursion. Assuming start is non-leaf, check children recursively. More...
 
static void _ux_traverse_search (btree_impl::traverse_mode_t traverse_mode, btree_page_h *current, const w_keystr_t &key, bool &this_is_the_leaf_page, slot_follow_t &slot_to_follow)
 Internal helper function to actually search for the correct slot and test fence assumptions. More...
 
static rc_t _ux_traverse_try_eager_adopt (btree_page_h &current, PageID next_pid)
 
static rc_t _ux_traverse_try_opportunistic_adopt (btree_page_h &current, btree_page_h &next)
 
static rc_t _ux_lookup (StoreID store, const w_keystr_t &key, bool &found, void *el, smsize_t &elen)
 
static rc_t _ux_lookup_core (StoreID store, const w_keystr_t &key, bool &found, void *el, smsize_t &elen)
 
static rc_t _sx_norec_alloc (btree_page_h &page, PageID &new_page_id)
 Creates a new empty page as a foster-child of the given page. More...
 
static rc_t _ux_norec_alloc_core (btree_page_h &page, PageID &new_page_id)
 
static rc_t _sx_adopt_foster_all (btree_page_h &root, bool recursive=false)
 Checks all direct children of parent and, if some child has foster, pushes them up to parent, resolving foster status. More...
 
static rc_t _sx_adopt_foster_all_core (btree_page_h &parent, bool is_root, bool recursive)
 
static rc_t _sx_adopt_foster (btree_page_h &parent, btree_page_h &child)
 Pushes up a foster pointer of child to the parent. More...
 
static rc_t _ux_adopt_foster_core (btree_page_h &parent, btree_page_h &child, const w_keystr_t &new_child_key)
 
static rc_t _sx_opportunistic_adopt_foster (btree_page_h &parent, btree_page_h &child, bool &pushedup)
 Pushes up a foster pointer of child to the parent IF we can get EX latches immediately. More...
 
static rc_t _sx_adopt_foster_sweep (btree_page_h &parent_arg)
 Pushes up all foster-children of children to the parent. More...
 
static rc_t _sx_adopt_foster_sweep_approximate (btree_page_h &parent, PageID surely_need_child_pid)
 
static void _ux_adopt_foster_apply_parent (btree_page_h &parent_arg, PageID new_child_pid, lsn_t new_child_emlsn, const w_keystr_t &new_child_key)
 
static void _ux_adopt_foster_apply_child (btree_page_h &child)
 
static rc_t _sx_split_foster (btree_page_h &page, PageID &new_page_id, const w_keystr_t &triggering_key)
 Splits a page, making the new page as foster-child. More...
 
static rc_t _sx_split_if_needed (btree_page_h &page, const w_keystr_t &new_key)
 Splits the given page if we need to do so for inserting the given key. More...
 
static rc_t _ux_lock_key (const StoreID &store, btree_page_h &leaf, const w_keystr_t &key, latch_mode_t latch_mode, const okvl_mode &lock_mode, bool check_only)
 Acquires a lock on the given leaf page, tentatively unlatching the page if needed. More...
 
static rc_t _ux_lock_key (const StoreID &store, btree_page_h &leaf, const void *keystr, size_t keylen, latch_mode_t latch_mode, const okvl_mode &lock_mode, bool check_only)
 
static rc_t _ux_lock_range (const StoreID &store, btree_page_h &leaf, const w_keystr_t &key, slotid_t slot, latch_mode_t latch_mode, const okvl_mode &exact_hit_lock_mode, const okvl_mode &miss_lock_mode, bool check_only)
 
static rc_t _ux_lock_range (const StoreID &store, btree_page_h &leaf, const void *keystr, size_t keylen, slotid_t slot, latch_mode_t latch_mode, const okvl_mode &exact_hit_lock_mode, const okvl_mode &miss_lock_mode, bool check_only)
 
static rc_t _ux_assure_fence_low_entry (btree_page_h &leaf)
 Assures the given leaf page has an entry whose key is the low-fence. More...
 
static rc_t _ux_create_tree_core (const StoreID &stid, const PageID &root_pid)
 
static rc_t _sx_shrink_tree (btree_page_h &root)
 Shrink the tree. Copy the child page over the root page so the tree is effectively one level shorter. More...
 
static rc_t _ux_shrink_tree_core (btree_page_h &root)
 
static rc_t _sx_grow_tree (btree_page_h &root)
 On root page split, allocates a new child, shifts all entries of root to new child, and has the only entry in root (pid0) point to child. Tree grows by 1 level. More...
 
static rc_t _ux_verify_tree (StoreID store, int hash_bits, bool &consistent)
 Verifies the integrity of whole tree using the fence-key bitmap technique. More...
 
static rc_t _ux_verify_tree_recurse (btree_page_h &parent, verification_context &context)
 
static rc_t _ux_verify_feed_page (btree_page_h &page, verification_context &context)
 
static rc_t _ux_verify_volume (int hash_bits, verify_volume_result &result)
 Verifies consistency of all BTree indexes in the volume. More...
 
static void inquery_verify_init (StoreID store)
 
static void inquery_verify_fact (btree_page_h &page)
 
static void inquery_verify_expect (btree_page_h &page, slot_follow_t next_follow)
 
static rc_t _sx_defrag_tree (StoreID store, uint16_t inpage_defrag_ghost_threshold=10, uint16_t inpage_defrag_usage_threshold=50, bool does_adopt=true, bool does_merge=true)
 Checks the whole tree to opportunistically adopt, in-page defrag and rebalance. More...
 
static rc_t _ux_defrag_tree_core (StoreID store, uint16_t inpage_defrag_ghost_threshold, uint16_t inpage_defrag_usage_threshold, bool does_adopt, bool does_merge)
 
static rc_t _sx_defrag_page (btree_page_h &page)
 Defrags the given page to remove holes and ghost records in the page. More...
 
static rc_t _ux_defrag_page_core (btree_page_h &p)
 
static okvl_mode create_part_okvl (okvl_mode::element_lock_mode mode, const w_keystr_t &key)
 
static uint32_t shpid2hash (PageID pid)
 
static queue_based_lock_tmutex_for_high_contention (PageID pid)
 
static bool is_ex_recommended (PageID pid)
 
static uint8_t get_expected_childrens (PageID pid)
 
static void increase_ex_need (PageID real_parent_pid)
 
static void increase_forster_child (PageID new_foster_parent_pid)
 
static void clear_ex_need (PageID real_parent_pid)
 
static void clear_forster_child (PageID foster_parent_pid)
 

Static Public Attributes

static uint8_t s_ex_need_counts [1<< GAC_HASH_BITS]
 
static queue_based_lock_t s_ex_need_mutex [1<< GAC_HASH_BITS]
 
static uint8_t s_foster_children_counts [1<< GAC_HASH_BITS]
 

Detailed Description

The internal implementation class which actually implements the functions of btree_m.

To abstract implementation details, all functions of this class should be used only from btree_m except testcases.

Like btree_m, this class is stateless, meaning all functions are static and there are no class properties. Each function receives the page(s) to work on and has no global side-effect.

User and System Transactions

The functions of this class fall into two categories.

  1. (func name starts with "_ux_") functions that can be used both in user and system transactions. Such a function should have only a local change and local latch.
  2. (func name starts with "_sx_") functions that starts a nested system transactions in it. Such a function can affect the structure of BTree.

User transactions do both logical and physical changes and should do only local physical changes.

System transactions do only physical changes. They sometimes do global physical changes like page splitting and key-adopts. They should be kept short-lived.

References

Basic Btree technique is from Mohan, et. al. IBM Research Report # RJ 7008 9/6/89 C. Mohan, Aries/KVL: A Key-Value Locking Method for Concurrency Control of Multiaction Transactions Operating on B-Tree Indexes

IBM Research Report # RJ 6846 8/29/89 C. Mohan, Frank Levine Aries/IM: An Efficient and High Concureency Index Management Method using Write-Ahead Logging

Advanced Btree techniques added as of 2011 Summer is from Graefe 2011. ACM Transactions on Database Systems (TODS), 2011 Modern B-tree techniques

Member Enumeration Documentation

§ anonymous enum

anonymous enum
Enumerator
GAC_HASH_BITS 
GAC_HASH_MOD 

§ slot_follow_t

Used to specify which child/foster to follow next.

Enumerator
t_follow_invalid 
t_follow_foster 
t_follow_pid0 

§ traverse_mode_t

3 modes of traverse().

Enumerator
t_fence_contain 

Finds a leaf page whose low-fence <= key < high-fence. This is used for insert/remove and many other cases.

t_fence_low_match 

Finds a leaf page whose low-fence == key. This is used for forward cursor (similar to page.next()).

t_fence_high_match 

Finds a leaf page whose high-fence == key. This is used for backward cursor (similar to page.prev()).

Member Function Documentation

§ _sx_adopt_foster()

rc_t btree_impl::_sx_adopt_foster ( btree_page_h parent,
btree_page_h child 
)
static

Pushes up a foster pointer of child to the parent.

This method resolves only one foster pointer. If there might be many fosters in this child, or its siblings, consider _sx_adopt_foster_sweep(). Adopt consists of one or two system transactions. If the parent has enough space: one SSX to move the pointer from child to parent. If not: one SSX to split the parent and then another SSX to move the pointer. So, it checks if we have enough space first.

Parameters
[in]parentthe interior node to store new children.
[in]childchild page of the parent that (might) has foster-children.

§ _sx_adopt_foster_all()

rc_t btree_impl::_sx_adopt_foster_all ( btree_page_h root,
bool  recursive = false 
)
static

Checks all direct children of parent and, if some child has foster, pushes them up to parent, resolving foster status.

If you already know which child has foster, you can use _sx_adopt_foster instead. Context: only in system transaction.

Parameters
[in]rootroot node.
[in]recursivewhether we recursively check all descendants and fosters.

§ _sx_adopt_foster_all_core()

rc_t btree_impl::_sx_adopt_foster_all_core ( btree_page_h parent,
bool  is_root,
bool  recursive 
)
static

overload for recursion.

§ _sx_adopt_foster_sweep()

rc_t btree_impl::_sx_adopt_foster_sweep ( btree_page_h parent_arg)
static

Pushes up all foster-children of children to the parent.

This method also follows foster-children of the parent.

Parameters
[in]parent_argthe interior node to store new children.

§ _sx_adopt_foster_sweep_approximate()

rc_t btree_impl::_sx_adopt_foster_sweep_approximate ( btree_page_h parent,
PageID  surely_need_child_pid 
)
static
See also
_sx_adopt_foster_sweep()
_ux_opportunistic_adopt_foster_core() The difference from _sx_adopt_foster_sweep() is that this function doesn't exactly check children have foster-child. So, this is much faster but some foster-child might be not adopted!

§ _sx_defrag_page()

rc_t btree_impl::_sx_defrag_page ( btree_page_h page)
static

Defrags the given page to remove holes and ghost records in the page.

A page can have unused holes between records and ghost records as a result of inserts and deletes. This method removes those dead spaces to compress the page. The best thing of this is that we have to log only the slot numbers of ghost records that are removed because there are 'logically' no changes. Context: System transaction.

Parameters
[in]pagethe page to be defraged

§ _sx_defrag_tree()

rc_t btree_impl::_sx_defrag_tree ( StoreID  store,
uint16_t  inpage_defrag_ghost_threshold = 10,
uint16_t  inpage_defrag_usage_threshold = 50,
bool  does_adopt = true,
bool  does_merge = true 
)
static

Checks the whole tree to opportunistically adopt, in-page defrag and rebalance.

This method recursively checks the entire tree and to find something that needs maintenance, such as B-link chain, ghost records and unbalanced nodes. If it finds such pages, it does adopt/defrag/merge etc. This method is completely opportunistic, meaning it doesn't require a global latch or lock. If there is some page this method can't get EX latch immediately, it skips the page. Context: in system transaction.

Parameters
[in]storeStore ID
[in]inpage_defrag_ghost_threshold0 to 100 (percent). if page has this many ghosts, and:
[in]inpage_defrag_usage_threshold0 to 100 (percent). and it has this many used space, it does defrag.
[in]does_adoptwhether we do adopts
[in]does_mergewhether we do merge/rebalance (which might trigger de-adopt as well)

§ _sx_grow_tree()

rc_t btree_impl::_sx_grow_tree ( btree_page_h root)
static

On root page split, allocates a new child, shifts all entries of root to new child, and has the only entry in root (pid0) point to child. Tree grows by 1 level.

Context: only in system transaction.

Parameters
[in]rootcurrent root page.

§ _sx_norec_alloc()

rc_t btree_impl::_sx_norec_alloc ( btree_page_h page,
PageID new_page_id 
)
static

Creates a new empty page as a foster-child of the given page.

This is the primary way of allocating a new page in Zero. Whenever we need a new page, whether no-record-split or not, we allocate a new page with empty key ranges. This is done as one SSX.

Parameters
[in]pagethe new page belongs to this page as foster-child.
[out]new_page_idPage ID of the new page.

§ _sx_opportunistic_adopt_foster()

rc_t btree_impl::_sx_opportunistic_adopt_foster ( btree_page_h parent,
btree_page_h child,
bool &  pushedup 
)
static

Pushes up a foster pointer of child to the parent IF we can get EX latches immediately.

latch upgrade might block, in that case this method immediately returns without doing anything. Context: only in system transaction.

Parameters
[in]parentthe interior node to store new children.
[in]childchild page of the parent that (might) has foster-children.
[out]pushedupwhether the adopt was done

§ _sx_reserve_ghost()

rc_t btree_impl::_sx_reserve_ghost ( btree_page_h leaf,
const w_keystr_t key,
int  elem_len 
)
static

Creates a ghost record for the key as a preparation for insert. Context: System transaction.

Parameters
[in]leafthe page to which we insert ghost record.
[in]keykey of the ghost record.
[in]elem_lensize of elem to be inserted

§ _sx_shrink_tree()

rc_t btree_impl::_sx_shrink_tree ( btree_page_h root)
static

Shrink the tree. Copy the child page over the root page so the tree is effectively one level shorter.

Context: only in system transaction.

Parameters
[in]rootcurrent root page.

§ _sx_split_foster()

rc_t btree_impl::_sx_split_foster ( btree_page_h page,
PageID new_page_id,
const w_keystr_t triggering_key 
)
static

Splits a page, making the new page as foster-child.

Alternative version which uses full logging and is independent of the buffer pool, i.e., it does not require fixing pages. This version should work for single page recovery, instant restart, as well as instant restore.

It generates a system transaction with two operations:

  1. On the new page (foster child), a page_img_format log record contains the raw binary image with the moved log records and all metadata, fence keys, and pointers correctly set. Only the "used" portion of the page is logged, meaning that the data portion of the log record is at most one page in size, but it should typically be around half of a page.
  2. On the overflowing page (foster parent), a "bulk deletion" is logged. It indicates that a certain range of the page slots has been deleted. This corresponds to the records that have been moved to the new foster child. The same log record is used to set the foster pointer and keys, but this could also be implemented as a third log record.
Author
Caetano Sauer

Context: only in system transaction.

Parameters
[in]pagethe page to split. also called "old" page.
[out]new_page_idID of the newly created page.
[in]triggering_keythe key to be inserted after this split. used to determine split policy.

§ _sx_split_if_needed()

rc_t btree_impl::_sx_split_if_needed ( btree_page_h page,
const w_keystr_t new_key 
)
static

Splits the given page if we need to do so for inserting the given key.

Parameters
[in,out]pagethe page that might split. When this method splits the page and also new_key should belong to the new page (foster-child), then this in/out param is switched to the new page. Remember, we receive a reference here.
[in]new_keythe key that has to be inserted.

§ _ux_adopt_foster_apply_child()

void btree_impl::_ux_adopt_foster_apply_child ( btree_page_h child)
static

Applies the changes of one adoption on child node. Used by both usual adoption and REDO.

§ _ux_adopt_foster_apply_parent()

void btree_impl::_ux_adopt_foster_apply_parent ( btree_page_h parent_arg,
PageID  new_child_pid,
lsn_t  new_child_emlsn,
const w_keystr_t new_child_key 
)
static

Applies the changes of one adoption on parent node. Used by both usual adoption and REDO.

§ _ux_adopt_foster_core()

rc_t btree_impl::_ux_adopt_foster_core ( btree_page_h parent,
btree_page_h child,
const w_keystr_t new_child_key 
)
static

this version assumes we have already split the parent if needed. Context: in system transaction.

See also
_sx_adopt_foster()

§ _ux_assure_fence_low_entry()

rc_t btree_impl::_ux_assure_fence_low_entry ( btree_page_h leaf)
static

Assures the given leaf page has an entry whose key is the low-fence.

When we are doing structural modification on leaf pages such as Merge and Rebalance, we will change the fence-low key of the right (foster-child) page. However, the fence-low key might be used as the target of locks. Also, the fence-low key might not exist as an entry. In that case, we might not be able to assure serializability. Therefore, this method creates a ghost entry with fence-low key if the given leaf page does not have a desired entry. This method is called for foster-child when Merge/Rebalance happens. See jira ticket:84 "Key Range Locking" (originally trac ticket:86) for more details.

§ _ux_create_tree_core()

rc_t btree_impl::_ux_create_tree_core ( const StoreID stid,
const PageID root_pid 
)
static

this version assumes system transaction as the active transaction on current thread.

See also
_sx_shrink_tree()

Implementation of tree grow/shrink related functions in btree_impl.h. Separated from btree_impl.cpp.

§ _ux_defrag_page_core()

rc_t btree_impl::_ux_defrag_page_core ( btree_page_h p)
static

this version assumes system transaction as the active transaction on current thread.

See also
_sx_defrag_page()

§ _ux_defrag_tree_core()

rc_t btree_impl::_ux_defrag_tree_core ( StoreID  store,
uint16_t  inpage_defrag_ghost_threshold,
uint16_t  inpage_defrag_usage_threshold,
bool  does_adopt,
bool  does_merge 
)
static

§ _ux_get_page_and_status()

rc_t btree_impl::_ux_get_page_and_status ( StoreID  store,
const w_keystr_t key,
bool &  need_lock,
slotid_t slot,
bool &  found,
bool &  took_XN,
bool &  is_ghost,
btree_page_h leaf 
)
static

Function finds the leaf page for a key, locks the key if it exists, and determines if it was found and if it was a ghost.

Context: User transaction

Parameters
[in]storeStore ID
[in]keykey of the tuple
[out]need_lockif locking is needed
[out]slotwhere in the page the key was found
[out]foundif the key was found
[out]took_XNif we took an XN latch on the key
[out]is_ghostif the slot was a ghost
[out]leafthe leaf the key should be in (if it exists or if it did exist)

§ _ux_insert()

rc_t btree_impl::_ux_insert ( StoreID  store,
const w_keystr_t key,
const cvec_t elem 
)
static

This function finds the leaf page to insert the given tuple, inserts it to the page and, if needed, splits the page.

Context: User transaction.

Parameters
[in]storeStore ID
[in]keykey of the inserted tuple
[in]elemdata of the inserted tuple

§ _ux_insert_core()

rc_t btree_impl::_ux_insert_core ( StoreID  store,
const w_keystr_t key,
const cvec_t elem 
)
static

_ux_insert()'s internal function without retry by itself.

§ _ux_insert_core_tail()

rc_t btree_impl::_ux_insert_core_tail ( StoreID  store,
const w_keystr_t key,
const cvec_t el,
bool &  need_lock,
slotid_t slot,
bool &  found,
bool &  alreay_took_XN,
bool &  is_ghost,
btree_page_h leaf 
)
static

Last half of _ux_insert, after traversing, finding (or not) and ghost determination.

§ _ux_lock_key() [1/2]

rc_t btree_impl::_ux_lock_key ( const StoreID store,
btree_page_h leaf,
const w_keystr_t key,
latch_mode_t  latch_mode,
const okvl_mode lock_mode,
bool  check_only 
)
static

Acquires a lock on the given leaf page, tentatively unlatching the page if needed.

A lock might be not immediately granted. During the wait time, we shouldn't hold latch on it. So, this function first conditionally request the given lock and returns if succeeds. If it doesn't succeed immediately, it unlatches the page and request the lock with blocking. During this short time, it is possible that other transaction updates the page, so this function also checks the page LSN and if it's updated returns with eLOCKRETRY. The caller is responsible to restart the operation if eLOCKRETRY is thrown. When this method returns with RCOK, the page is always latched so that the caller can still access the page after this method although it might be "re-latched".

Parameters
[in]storestore id of the index
[in]leafthe page that contains the key to lock
[in]keythe key to lock
[in]latch_modeif this has to un-latch/re-latch, this mode is used.
[in]lock_modethe lock mode to be acquired
[in]check_onlywhether the lock goes away right after grant

§ _ux_lock_key() [2/2]

rc_t btree_impl::_ux_lock_key ( const StoreID store,
btree_page_h leaf,
const void *  keystr,
size_t  keylen,
latch_mode_t  latch_mode,
const okvl_mode lock_mode,
bool  check_only 
)
static

raw string and length version.

§ _ux_lock_range() [1/2]

rc_t btree_impl::_ux_lock_range ( const StoreID store,
btree_page_h leaf,
const w_keystr_t key,
slotid_t  slot,
latch_mode_t  latch_mode,
const okvl_mode exact_hit_lock_mode,
const okvl_mode miss_lock_mode,
bool  check_only 
)
static

Lock gap containing nonexistent key key in page leaf with locking mode miss_lock_mode; exception: if key equals the low fence key of leaf, instead lock just that key with lock mode exact_hit_lock_mode (the gap is not locked in this case).

Parameters
[in]slotThe slot where key would be placed if inserted (usually a return value of btree_page_h::search()) or -1, in which case the leaf will be searched to determine the correct slot.
[in]latch_modeIf this has to un-latch/re-latch, this mode is used.
[in]check_onlyIf set, release the lock immediately after acquiring it.
Precondition
key no record with key key exists in leaf (low fence is fine), exact_hit_lock_mode is N for the gap, and miss_lock_mode is N for the key.

Used when the exact key is not found and range locking is needed.

See also
_ux_lock_key()

§ _ux_lock_range() [2/2]

rc_t btree_impl::_ux_lock_range ( const StoreID store,
btree_page_h leaf,
const void *  keystr,
size_t  keylen,
slotid_t  slot,
latch_mode_t  latch_mode,
const okvl_mode exact_hit_lock_mode,
const okvl_mode miss_lock_mode,
bool  check_only 
)
static

raw string version.

§ _ux_lookup()

rc_t btree_impl::_ux_lookup ( StoreID  store,
const w_keystr_t key,
bool &  found,
void *  el,
smsize_t elen 
)
static

Find key in btree. If found, copy up to elen bytes of the entry element into el. Context: user transaction.

Parameters
[in]storeStore ID
[in]keykey we want to find
[out]foundtrue if key is found
[out]elbuffer to put el if !cursor
[out]elensize of el if !cursor

§ _ux_lookup_core()

rc_t btree_impl::_ux_lookup_core ( StoreID  store,
const w_keystr_t key,
bool &  found,
void *  el,
smsize_t elen 
)
static

_ux_lookup()'s internal function which doesn't rety for locks by itself.

§ _ux_norec_alloc_core()

rc_t btree_impl::_ux_norec_alloc_core ( btree_page_h page,
PageID new_page_id 
)
static

this version assumes system transaction as the active transaction on current thread.

See also
_sx_norec_alloc()
Parameters
[out]new_page_idPage ID of the new page.
Precondition
In SSX
In page is EX-latched

§ _ux_overwrite()

rc_t btree_impl::_ux_overwrite ( StoreID  store,
const w_keystr_t key,
const char *  el,
smsize_t  offset,
smsize_t  elen 
)
static

This function finds the given key, updates the specific part of element if found.

Context: User transaction.

Parameters
[in]storeStore ID
[in]keykey of the existing tuple
[in]elnew data of the tuple
[in]offsetoverwrites to this position of the record
[in]elennumber of bytes to overwrite

§ _ux_overwrite_core()

rc_t btree_impl::_ux_overwrite_core ( StoreID  store,
const w_keystr_t key,
const char *  el,
smsize_t  offset,
smsize_t  elen 
)
static

_ux_overwrite()'s internal function without retry by itself.

§ _ux_put()

rc_t btree_impl::_ux_put ( StoreID  store,
const w_keystr_t key,
const cvec_t elem 
)
static

This function finds the given key, updates the element if found and inserts it if not. If needed, this method also splits the page. Could also be called "insert or update".

Context: User transaction.

Parameters
[in]storeStore ID
[in]keykey of the existing tuple
[in]elemnew data of the tuple

§ _ux_put_core()

rc_t btree_impl::_ux_put_core ( StoreID  store,
const w_keystr_t key,
const cvec_t elem 
)
static

_ux_put()'s internal function without retry by itself. Uses _ux_insert_core_tail and _ux_update_core_tail for the heavy lifting

§ _ux_remove()

rc_t btree_impl::_ux_remove ( StoreID  store,
const w_keystr_t key 
)
static

Removes the specified key from B+Tree.

If the key doesn't exist, returns eNOTFOUND. Context: User transaction.

Parameters
[in]storeStore ID
[in]keykey of the removed tuple

§ _ux_remove_core()

rc_t btree_impl::_ux_remove_core ( StoreID  store,
const w_keystr_t key 
)
static

_ux_remove()'s internal function without retry by itself.

§ _ux_reserve_ghost_core()

rc_t btree_impl::_ux_reserve_ghost_core ( btree_page_h leaf,
const w_keystr_t key,
int  elem_len 
)
static

§ _ux_shrink_tree_core()

rc_t btree_impl::_ux_shrink_tree_core ( btree_page_h root)
static

this version assumes system transaction as the active transaction on current thread.

See also
_sx_shrink_tree()

§ _ux_traverse()

rc_t btree_impl::_ux_traverse ( StoreID  store,
const w_keystr_t key,
traverse_mode_t  traverse_mode,
latch_mode_t  leaf_latch_mode,
btree_page_h leaf,
bool  allow_retry = true 
)
static

Traverse the btree starting at root node to find an appropriate leaf page.

The returned leaf page is latched in leaf_latch_mode mode and the caller has to release the latch.

This function provides 3 search mode.

  1. t_fence_contain. find a leaf page whose low-fence <= key < high-fence. This is used for insert/remove and many other cases.
  2. t_fence_low_match. find a leaf page whose low-fence == key. This is used for forward cursor (similar to page.next()).
  3. t_fence_high_match. find a leaf page whose high-fence == key. This is used for backward cursor (similar to page.prev()).

    Context: Both user and system transaction.

    See also
    traverse_mode_t
    Parameters
    [in]storeStore ID
    [in]keytarget key
    [in]traverse_modesearch mode
    [in]leaf_latch_modeEX for insert/remove, SH for lookup
    [out]leafleaf satisfying search
    [in]allow_retryonly when leaf_latch_mode=EX. whether to retry from root if latch upgrade fails

§ _ux_traverse_recurse()

rc_t btree_impl::_ux_traverse_recurse ( btree_page_h start,
const w_keystr_t key,
btree_impl::traverse_mode_t  traverse_mode,
latch_mode_t  leaf_latch_mode,
btree_page_h leaf,
PageID leaf_pid_causing_failed_upgrade 
)
static

For internal recursion. Assuming start is non-leaf, check children recursively.

start has to be already fixed and SH latched, and the caller is responsible to release the latch on it. Context: Both user and system transaction.

Parameters
[in]startwe recursively search starting from this page
[in]keytarget key
[in]traverse_modesearch mode
[in]leaf_latch_modeEX for insert/remove, SH for lookup
[out]leafleaf satisfying search
[in,out]leaf_pid_causing_failed_upgrade[out:] If the latch-mode is EX, and it fails upgrading the leaf page, this function returns eRETRY and fills this value. [in:] On next try, put the page id in this param. This function will try EX-acquire, not upgrade.

cache the flag to avoid calling the functions each time

§ _ux_traverse_search()

void btree_impl::_ux_traverse_search ( btree_impl::traverse_mode_t  traverse_mode,
btree_page_h current,
const w_keystr_t key,
bool &  this_is_the_leaf_page,
slot_follow_t slot_to_follow 
)
inlinestatic

Internal helper function to actually search for the correct slot and test fence assumptions.

Called only from _ux_traverse_recurse.

Parameters
[in]keytarget key
[in]traverse_modesearch mode
[in]currentThe page we are currently searching.
[out]this_is_the_leaf_pageTrue if we have found the actually target page.
[out]slot_to_followThe slot that we will to go down (or sideways) next.

§ _ux_traverse_try_eager_adopt()

rc_t btree_impl::_ux_traverse_try_eager_adopt ( btree_page_h current,
PageID  next_pid 
)
static

Call this function when it seems like the next page will have VERY high contention and the page should adopt childrens. to avoid excessive latch contention, we use mutex only in this case This mutex doesn't assure correctness. latch does it. it just gives us chance to avoid wasting CPU for latch contention. and, false positive is fine. it's just one mutex call overhead. See jira ticket:78 "Eager-Opportunistic Hybrid Latching" (originally trac ticket:80).

§ _ux_traverse_try_opportunistic_adopt()

rc_t btree_impl::_ux_traverse_try_opportunistic_adopt ( btree_page_h current,
btree_page_h next 
)
static

If next has foster pointer and real-parent wants to adopt it, call this function. This tries opportunistic adoption by upgrading latches to EX. This has a very low cost, though might do nothing in high contention. If such a writer-starvation frequently happens, the above eager_adopt function will be called to do it. See jira ticket:78 "Eager-Opportunistic Hybrid Latching" (originally trac ticket:80).

§ _ux_undo_ghost_mark()

rc_t btree_impl::_ux_undo_ghost_mark ( StoreID  store,
const w_keystr_t key 
)
static

Reverses the ghost record of specified key to regular state.

This is only used when a delete transaction aborts. In that case, there must still exist the record with ghost mark. This method removes the ghost mark from the record. If the key doesn't exist, returns eNOTFOUND (shouldn't happen). Context: User transaction (UNDO of delete).

Parameters
[in]storeStore ID
[in]keykey of the removed tuple
See also
btree_ghost_mark_log::undo()

§ _ux_update()

rc_t btree_impl::_ux_update ( StoreID  store,
const w_keystr_t key,
const cvec_t elem 
)
static

This function finds the given key, updates the element if found. If needed, this method also splits the page.

Context: User transaction.

Parameters
[in]storeStore ID
[in]keykey of the existing tuple
[in]elemnew data of the tuple

§ _ux_update_core()

rc_t btree_impl::_ux_update_core ( StoreID  store,
const w_keystr_t key,
const cvec_t elem 
)
static

_ux_update()'s internal function without retry by itself.

§ _ux_update_core_tail()

rc_t btree_impl::_ux_update_core_tail ( StoreID  store,
const w_keystr_t key,
const cvec_t elem,
bool &  need_lock,
slotid_t slot,
bool &  found,
bool &  is_ghost,
btree_page_h leaf 
)
static

Last half of _ux_update, after traversing, finding (or not) and ghost determination.

§ _ux_verify_feed_page()

rc_t btree_impl::_ux_verify_feed_page ( btree_page_h page,
verification_context context 
)
static

Internal method to check each page. Called both from tree-recurse type and sequential-scan type. This function doesn't follow child or foster pointers.

Parameters
[in]pagethe page to check
[in]contextcontext object to maintain

§ _ux_verify_tree()

rc_t btree_impl::_ux_verify_tree ( StoreID  store,
int  hash_bits,
bool &  consistent 
)
static

Verifies the integrity of whole tree using the fence-key bitmap technique.

This method constructs a bitmap, traverses all pages in this BTree and flips each bit based on facts collected from each page. The size of bitmap is 2^hash_bits bits = 2^(hash_bits - 3) bytes. We recommend hash_bits to be around 20. For more details of this algorithm, check out TODS'11 paper. Context: in both user and system transaction.

Note
This method holds a tree-wide read latch/lock. No concurrent update is allowed.
Parameters
[in]storeStore ID
[in]hash_bitsthe number of bits we use for hashing, at most 31.
[out]consistentwhether the BTree is consistent

§ _ux_verify_tree_recurse()

rc_t btree_impl::_ux_verify_tree_recurse ( btree_page_h parent,
verification_context context 
)
static

Internal method to be called from _ux_verify_tree() for recursively check foster and children.

Parameters
[in]parentpage to check
[in]contextcontext object to maintain

§ _ux_verify_volume()

rc_t btree_impl::_ux_verify_volume ( int  hash_bits,
verify_volume_result result 
)
static

Verifies consistency of all BTree indexes in the volume.

Unlike verify_index() this method sequentially scans all pages in this volume to efficiently conduct the batch-verification. However, you cannot have concurrent update operations while you are running this verification. It might cause deadlocks! To allow concurrent transaction while verifying, consider using _ux_verify_tree().

Parameters
[in]hash_bitsthe number of bits we use for hashing per BTree, at most 31.
[out]resultResults of the verification.
See also
_ux_verify_tree()

§ clear_ex_need()

static void btree_impl::clear_ex_need ( PageID  real_parent_pid)
inlinestatic

Call this when adopted children under the page.

§ clear_forster_child()

static void btree_impl::clear_forster_child ( PageID  foster_parent_pid)
inlinestatic

Call this when you cleared foster status of the page.

§ create_part_okvl()

okvl_mode btree_impl::create_part_okvl ( okvl_mode::element_lock_mode  mode,
const w_keystr_t key 
)
static

Helper method to create an OKVL instance on one partition, using the given key.

§ get_expected_childrens()

static uint8_t btree_impl::get_expected_childrens ( PageID  pid)
inlinestatic

Returns if the page is likely to have foster-child.

§ increase_ex_need()

static void btree_impl::increase_ex_need ( PageID  real_parent_pid)
inlinestatic

Call this when encountered a failed upgrade. Again, doesn't need to be exact!

§ increase_forster_child()

static void btree_impl::increase_forster_child ( PageID  new_foster_parent_pid)
inlinestatic

Call this when there happened a split. Again, doesn't need to be exact!

§ inquery_verify_expect()

void btree_impl::inquery_verify_expect ( btree_page_h page,
slot_follow_t  next_follow 
)
static

adds expectation for next page.

§ inquery_verify_fact()

void btree_impl::inquery_verify_fact ( btree_page_h page)
static

checks one page against the given expectation.

§ inquery_verify_init()

void btree_impl::inquery_verify_init ( StoreID  store)
static

initialize context for in-query verification.

§ is_ex_recommended()

static bool btree_impl::is_ex_recommended ( PageID  pid)
inlinestatic

Returns if the page should be fixed with EX latch.

§ mutex_for_high_contention()

static queue_based_lock_t* btree_impl::mutex_for_high_contention ( PageID  pid)
inlinestatic

Returns the mutex we should use when the given page is expected to be high-contended.

§ shpid2hash()

static uint32_t btree_impl::shpid2hash ( PageID  pid)
inlinestatic

simple modular hashing. this must be cheap.

Member Data Documentation

§ s_ex_need_counts

uint8_t btree_impl::s_ex_need_counts
static

The corresponding page is a real-parent of some foster-parent page The value means an approximate count of failed Upgrade of the page.

§ s_ex_need_mutex

queue_based_lock_t btree_impl::s_ex_need_mutex
static

To avoid excessive spin locks on the same page, use this mutex when you are suspicious that the page is in high contention (should be rare). Note that you don't have to. The data protection is still done by latch. This is just optional to avoid CPU usage.

§ s_foster_children_counts

uint8_t btree_impl::s_foster_children_counts
static

The corresponding page is a foster-parent. The value means an approximate count of foster-children. The value is incremented when a page-split happens.


The documentation for this class was generated from the following files: