|
template<typename T > |
std::string | to_string_with_precision (const T a_value, const int n=14) |
|
template<typename T > |
bool | is_prefix (const T &prefix, const T &x) |
| Check if prefix is a prefix of x – works with iterables, including strings and vectors.
|
|
bool | contains (const std::string &s, const std::string &x) |
|
bool | contains (const std::string &s, const char x) |
|
void | replace_all (std::string &s, const std::string &x, const std::string &y, int add=0) |
| Replace all occurrences of x with y in s.
|
|
std::string | remove_characters (std::string s, const std::string &rem) |
| Remove all characters in rem from s. TODO: This implementation is not optimized.
|
|
template<const float & add_p, const float & del_p, typename T = std::string> |
double | p_delete_append (const T &x, const T &y, const float log_alphabet) |
| Probability of converting x into y by deleting some number of characters (each with probability del_p, then stopping with probability 1-del_p) and then adding characters with probability add_p, where each added character is selected from an alphabet whose log size is log_alphabet.
|
|
std::pair< std::string, std::string > | divide (const std::string &s, const char delimiter) |
|
size_t | count (const std::string &str, const std::string &sub) |
|
size_t | count (const std::string &str, const char x) |
|
std::string | reverse (std::string x) |
|
std::string | QQ (const std::string &x) |
|
std::string | Q (const std::string &x) |
|
void | check_alphabet (const std::string &s, const std::string &a) |
| Check that s only uses characters from a. On failure, we print the string and assert false.
|
|
void | check_alphabet (std::vector< std::string > t, const std::string &a) |
|
unsigned int | levenshtein_distance (const std::string &s1, const std::string &s2) |
| Compute the Levenshtein distance between two strings (NOTE: O(N^2)).
|
|
double | p_KashyapOommen1984_edit (const std::string x, const std::string y, const double perr, const size_t nalphabet) |
| The string probability model from Kashyap & Oommen, 1983, basically giving a string edit distance that is a probability model. This could really use some unit tests, but it is hard to find implementations.
|
|
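The O(N^2) note on levenshtein_distance can be illustrated with the standard dynamic-programming recurrence. The following is a minimal sketch only: the name sketch_levenshtein is hypothetical, and this page does not show how the library itself implements the function.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (NOT the library's levenshtein_distance): the textbook
// two-row dynamic program, taking O(|s1| * |s2|) time and O(|s2|) memory.
unsigned int sketch_levenshtein(const std::string& s1, const std::string& s2) {
    const std::size_t n = s1.size();
    const std::size_t m = s2.size();

    // prev[j] holds the distance between the first i-1 characters of s1
    // and the first j characters of s2; cur is the row being filled in.
    std::vector<unsigned int> prev(m + 1), cur(m + 1);
    for (std::size_t j = 0; j <= m; ++j) prev[j] = static_cast<unsigned int>(j);

    for (std::size_t i = 1; i <= n; ++i) {
        cur[0] = static_cast<unsigned int>(i);  // delete all i characters of s1
        for (std::size_t j = 1; j <= m; ++j) {
            const unsigned int substitute = prev[j - 1] + (s1[i - 1] == s2[j - 1] ? 0u : 1u);
            const unsigned int remove     = prev[j] + 1u;
            const unsigned int insert     = cur[j - 1] + 1u;
            cur[j] = std::min({substitute, remove, insert});
        }
        std::swap(prev, cur);
    }
    return prev[m];
}
```

Calling sketch_levenshtein("kitten", "sitting") counts two substitutions and one insertion.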
template<const float & add_p, const float & del_p, typename T = std::string>
double p_delete_append (const T &x, const T &y, const float log_alphabet)
inline
Probability of converting x into y by deleting some number of characters (each with probability del_p, then stopping with probability 1-del_p) and then adding characters with probability add_p, where each added character is selected from an alphabet whose log size is log_alphabet.
- Parameters
-
x | |
y | |
del_p | - probability of deleting the next character (geometric) |
add_p | - probability of adding (geometric) |
log_alphabet | - log of the size of alphabet |
- Returns
- The probability of converting x to y by deleting characters with probability del_p and then adding with probability add_p
This function computes the probability that x would be converted into y when we insert with probability add_p and delete with probability del_p, and when each added character is drawn from an alphabet whose log size is log_alphabet. Note that this is a template function because otherwise we would end up computing log(add_p) and log(del_p) repeatedly, and these are in fact constant.
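The delete-then-append process described above can be sketched as a sum over every shared prefix of x and y that could be kept. This is an illustration under stated assumptions, not the library's code: the name sketch_log_p_delete_append is hypothetical, add_p and del_p are ordinary arguments rather than template parameters, deletions and additions are assumed to happen at the end of the string, the result is assumed to be a log probability, and no "stop deleting" term is charged when all of x is deleted.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>

// Numerically stable log(exp(a) + exp(b)) for finite a and b.
inline double log_sum_exp(double a, double b) {
    const double hi = std::max(a, b);
    const double lo = std::min(a, b);
    return hi + std::log1p(std::exp(lo - hi));
}

// Hypothetical sketch (NOT the library's p_delete_append). Returns the log
// probability of turning x into y by deleting from the end of x and then
// appending characters, summed over every prefix length that could be kept.
double sketch_log_p_delete_append(const std::string& x, const std::string& y,
                                  double add_p, double del_p, double log_alphabet) {
    assert(0.0 < add_p && add_p < 1.0);
    assert(0.0 < del_p && del_p < 1.0);

    const double log_add      = std::log(add_p);
    const double log_del      = std::log(del_p);
    const double log_stop_add = std::log(1.0 - add_p);
    const double log_stop_del = std::log(1.0 - del_p);
    const double nx = static_cast<double>(x.size());
    const double ny = static_cast<double>(y.size());

    // Keep nothing: delete all of x (assumption: no stop term, since nothing
    // is left to delete), then append every character of y and stop adding.
    double lp = log_del * nx + (log_add - log_alphabet) * ny + log_stop_add;

    // Keep a shared prefix of length k: delete the rest of x, stop deleting,
    // append the rest of y, stop adding.
    const std::size_t limit = std::min(x.size(), y.size());
    for (std::size_t k = 1; k <= limit; ++k) {
        if (x[k - 1] != y[k - 1]) break;  // the prefixes diverge here
        const double kd = static_cast<double>(k);
        const double path = log_del * (nx - kd) + log_stop_del
                          + (log_add - log_alphabet) * (ny - kd) + log_stop_add;
        lp = log_sum_exp(lp, path);
    }
    return lp;
}
```

In this sketch the four logarithms are recomputed on every call, which is the cost the description says the template parameters are meant to avoid.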
double p_KashyapOommen1984_edit (const std::string x, const std::string y, const double perr, const size_t nalphabet)
The string probability model from Kashyap & Oommen, 1983, basically giving a string edit distance that is a probability model. This could really use some unit tests, but it is hard to find implementations.
This assumes that deletions, insertions, and changes all happen with a constant, equal probability of perr, but note that swaps and insertions also have to choose the character out of nalphabet. This could probably be optimized to avoid recomputing these logs.