If you've ever posted a piece of code on your blog, you've probably realized what a pain it is to convert every special character to its HTML entity. Why not take the pain away and let PHP handle that for you?
There really isn't a foolproof method to pull this off, and I have been told that traversing the code using the DOM would be better, but this function (as ugly as it is) actually works quite well if you have a proper HTML structure with valid code.
It checks for any <code> and <pre> tags and converts any special characters inside them to their respective HTML entities. It even accounts for nested <code> and <pre> tags, as well as any attributes you give them.
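To make the conversion itself concrete, here is a small sketch of what PHP's built-in htmlspecialchars() does to one string. The variable names and the sample string are made up for illustration, and the output assumes the default flags, which convert &, <, > and the double quote.

```php
<?php
// Raw markup as you would want it to show up on the page.
$snippet = '<div id="test">Tom & Jerry</div>';

// Escape the characters that a browser would otherwise treat as markup.
$escaped = htmlspecialchars($snippet);

echo $escaped, "\n";
// Prints: &lt;div id=&quot;test&quot;&gt;Tom &amp; Jerry&lt;/div&gt;
```

A browser renders the escaped string as the literal text of the original snippet instead of as a real DIV.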
The Function
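function fixcodeblocks($string)
{
    // Match <code>...</code> or <pre>...</pre>, attributes included; (?R) recurses into nested blocks of the same tag.
    return preg_replace_callback('#<(code|pre)([^>]*)>(((?!</?\1).)*|(?R))*</\1>#si', 'specialchars', $string);
}
function specialchars($matches)
{
    // $matches[1] is the tag name, $matches[2] holds its attributes.
    $open  = '<'.$matches[1].$matches[2].'>';
    $close = '</'.$matches[1].'>';
    // Slice by length instead of str_replace(), which would also strip a nested, identical opening tag.
    $inner = substr($matches[0], strlen($open), -strlen($close));
    return $open.htmlspecialchars($inner).$close;
}
echo fixcodeblocks($html);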
That would turn something like this:
Heading
Some stuff here, but no actually code yet.
Here is our first code that needs to be escaped:
Something inside a DIV
Hey
Inside nest
Another code block inside some code.
More Content
into this:
Heading
Some stuff here, but no actually code yet.
Here is our first code that needs to be escaped:
<div id="test">Something inside a DIV</div>
<html>
<head>
<meta name="author" content="First Last" />
</head>
<body>
<p>Hey</p>
</body>
<div id="nested">
<div id="nest-inside">
<p>Inside nest</p>
</div>
</div>
<code>Another code block inside some code.</code>
</html>
More Content
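For something that is easier to check by hand, a much smaller input works well. This is only a minimal sketch, assuming PHP 7 or later. The names demo_fixcodeblocks and demo_escape are made up for this example, the callback trims the opening and closing tags off the match by length, and the fallback to the untouched input when the regex engine reports an error is an added assumption.

```php
<?php
// Hypothetical helper names for this sketch; not part of any library.
function demo_escape(array $m): string
{
    $open  = '<' . $m[1] . $m[2] . '>';   // tag name plus attributes, kept as-is
    $close = '</' . $m[1] . '>';
    // Everything between the opening and closing tag of the whole match.
    $inner = substr($m[0], strlen($open), -strlen($close));
    return $open . htmlspecialchars($inner) . $close;
}

function demo_fixcodeblocks(string $html): string
{
    $pattern = '#<(code|pre)([^>]*)>(((?!</?\1).)*|(?R))*</\1>#si';
    // preg_replace_callback() returns null on a regex error (for example a
    // backtrack limit), so fall back to the original input in that case.
    return preg_replace_callback($pattern, 'demo_escape', $html) ?? $html;
}

$input  = '<p>Intro</p><pre class="php"><b>bold</b> <code>x < y</code></pre><p>Outro</p>';
$output = demo_fixcodeblocks($input);

echo $output, "\n";
// <p>Intro</p><pre class="php">&lt;b&gt;bold&lt;/b&gt; &lt;code&gt;x &lt; y&lt;/code&gt;</pre><p>Outro</p>
```

Only the outermost block keeps its tags and attributes as real markup; everything inside it, including the nested code tag, comes out escaped. Markup outside the block is left alone.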
Feel free to offer any suggestions or improvements you might have, as well as any other ways of accomplishing this.