{"id":3616,"date":"2020-05-26T14:38:00","date_gmt":"2020-05-26T20:38:00","guid":{"rendered":"http:\/\/bililite.com\/blog\/?p=3616"},"modified":"2020-05-26T21:40:01","modified_gmt":"2020-05-27T03:40:01","slug":"string-replacement-in-php","status":"publish","type":"post","link":"https:\/\/bililite.com\/blog\/2020\/05\/26\/string-replacement-in-php\/","title":{"rendered":"String Replacement in PHP"},"content":{"rendered":"<p>Working with <a href=\"http:\/\/bililite.com\/blog\/2020\/05\/22\/extending-parsedown\/\">Parsedown<\/a>, I want to do string manipulation, but only in certain parts: for instance, on text that is not inside HTML tags or not inside quotes. The <em>right<\/em> way to do that is with a real parser. The easy way is to remove the unwanted strings, replacing each with a marker that won't come up in normal text, do the manipulation, then swap the markers back for the original strings. (It is the restoration step that requires \"a marker that won't come up in normal text\"; you don't want to replace text that was present in the original.)<\/p>\n<p>I wanted a marker that can't be typed but is still legal HTML; it turns out that U+FFFC (<a href=\"https:\/\/en.wikipedia.org\/wiki\/Specials_(Unicode_block)\">OBJECT REPLACEMENT CHARACTER<\/a>, &#xFFFC;) is perfect for that. So I made a pair of functions, <code class=\"language-php\" >StringReplace\\remove<\/code> and <code class=\"language-php\" >StringReplace\\restore<\/code>, to make that easy.<\/p>\n<dl>\n<dt><code class=\"language-php\" >StringReplace\\remove ($re, $target)<\/code><\/dt>\n<dd>Every substring of <code class=\"language-php\" >$target<\/code> that matches the regular expression <code class=\"language-php\" >$re<\/code> is replaced by a numbered marker, <code class=\"language-php\" >\"&#xFFFC;{number}&#xFFFC;\"<\/code>. The new string is returned. 
For instance, \n<pre><code class=\"language-php\" >$rawtext = StringReplace\\remove ('#&lt;\/?[^&gt;]*&gt;#', $html);<\/code><\/pre>\nreplaces every HTML tag in <code class=\"language-php\" >$html<\/code> with a marker.<\/dd>\n<dt><code class=\"language-php\" >StringReplace\\restore ($target)<\/code><\/dt>\n<dd>Returns <code class=\"language-php\" >$target<\/code> with each marker replaced by the original string it stands for.<\/dd>\n<\/dl>\n<h2>The code<\/h2>\n<pre><code class=\"language-php\" >namespace StringReplace;\r\n\r\n\/\/ U+FFFC delimits each marker\r\ndefine ('OBJECT_REPLACEMENT_CHARACTER', '\ufffc');\r\n\/\/ Matches a marker and captures its number\r\ndefine ('RE_REPLACEMENT', '\/'.OBJECT_REPLACEMENT_CHARACTER.'(\\d+)'.OBJECT_REPLACEMENT_CHARACTER.'\/');\r\n\r\n\/\/ Every removed string, in order; marker N stands for $strings[N-1]\r\n$strings = array();\r\n\r\n\/\/ Callback: save the matched text and return its numbered marker\r\n$remover = function ($matches){\r\n  global $strings;\r\n  $strings []= $matches[0];\r\n  return OBJECT_REPLACEMENT_CHARACTER.count($strings).OBJECT_REPLACEMENT_CHARACTER;\r\n};\r\n\r\n\/\/ Callback: look up the original text for a marker\r\n$replacer = function ($matches){\r\n  global $strings;\r\n  return $strings[$matches[1]-1];\r\n};\r\n\r\nfunction remove ($re, $target){\r\n  global $remover;\r\n  return preg_replace_callback ($re, $remover, $target);\r\n}\r\n\r\nfunction restore ($target){\r\n  global $replacer;\r\n  return preg_replace_callback (RE_REPLACEMENT, $replacer, $target);\r\n}<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Working with Parsedown, I want to do string manipulation, but only in certain parts. For instance, on text not in HTML tags or not in quotes. The right way to do that is with a real parser. 
The easy way is by removing the unwanted strings, replacing them with a marker that won't come up in [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9],"tags":[],"_links":{"self":[{"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/posts\/3616"}],"collection":[{"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/comments?post=3616"}],"version-history":[{"count":9,"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/posts\/3616\/revisions"}],"predecessor-version":[{"id":3636,"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/posts\/3616\/revisions\/3636"}],"wp:attachment":[{"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/media?parent=3616"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/categories?post=3616"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/bililite.com\/blog\/wp-json\/wp\/v2\/tags?post=3616"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}