Create New Item
Item Type
File
Folder
Item Name
Search file in folder and subfolders...
Are you sure want to rename?
File Manager
/
blog
/
wp-content
/
themes
/
poe
:
class-wp-token-map.php
Advanced Search
Upload
New Item
Settings
Back
Back Up
Advanced Editor
Save
<?php /** * Class for efficiently looking up and mapping string keys to string values, with limits. * * @package WordPress * @since 6.6.0 */ /** * WP_Token_Map class. * * Use this class in specific circumstances with a static set of lookup keys which map to * a static set of transformed values. For example, this class is used to map HTML named * character references to their equivalent UTF-8 values. * * This class works differently than code calling `in_array()` and other methods. It * internalizes lookup logic and provides helper interfaces to optimize lookup and * transformation. It provides a method for precomputing the lookup tables and storing * them as PHP source code. * * All tokens and substitutions must be shorter than 256 bytes. * * Example: * * $smilies = WP_Token_Map::from_array( array( * '8O' => '😯', * ':(' => '🙁', * ':)' => '🙂', * ':?' => '😕', * ) ); * * true === $smilies->contains( ':)' ); * false === $smilies->contains( 'simile' ); * * '😕' === $smilies->read_token( 'Not sure :?.', 9, $length_of_smily_syntax ); * 2 === $length_of_smily_syntax; * * ## Precomputing the Token Map. * * Creating the class involves some work sorting and organizing the tokens and their * replacement values. In order to skip this, it's possible for the class to export * its state and be used as actual PHP source code. * * Example: * * // Export with four spaces as the indent, only for the sake of this docblock. * // The default indent is a tab character. * $indent = ' '; * echo $smilies->precomputed_php_source_table( $indent ); * * // Output, to be pasted into a PHP source file: * WP_Token_Map::from_precomputed_table( * array( * "storage_version" => "6.6.0", * "key_length" => 2, * "groups" => "", * "long_words" => array(), * "small_words" => "8O\x00:)\x00:(\x00:?\x00", * "small_mappings" => array( "😯", "🙂", "🙁", "😕" ) * ) * ); * * ## Large vs. small words. * * This class uses a short prefix called the "key" to optimize lookup of its tokens. * This means that some tokens may be shorter than or equal in length to that key. * Those words that are longer than the key are called "large" while those shorter * than or equal to the key length are called "small." * * This separation of large and small words is incidental to the way this class * optimizes lookup, and should be considered an internal implementation detail * of the class. It may still be important to be aware of it, however. * * ## Determining Key Length. * * The choice of the size of the key length should be based on the data being stored in * the token map. It should divide the data as evenly as possible, but should not create * so many groups that a large fraction of the groups only contain a single token. * * For the HTML5 named character references, a key length of 2 was found to provide a * sufficient spread and should be a good default for relatively large sets of tokens. * * However, for some data sets this might be too long. For example, a list of smilies * may be too small for a key length of 2. Perhaps 1 would be more appropriate. It's * best to experiment and determine empirically which values are appropriate. * * ## Generate Pre-Computed Source Code. * * Since the `WP_Token_Map` is designed for relatively static lookups, it can be * advantageous to precompute the values and instantiate a table that has already * sorted and grouped the tokens and built the lookup strings. * * This can be done with `WP_Token_Map::precomputed_php_source_table()`. * * Note that if there is a leading character that all tokens need, such as `&` for * HTML named character references, it can be beneficial to exclude this from the * token map. Instead, find occurrences of the leading character and then use the * token map to see if the following characters complete