Convert HTML Special Chars Entities Inside Code and Pre Tags

If you've ever posted a piece of code on your blog, you've probably realized that it's a pain to convert every special character to its HTML entity by hand. Why not take the pain away and let PHP handle that for you?

There really isn't a foolproof method to pull this off, and I have been told that traversing the code using the DOM would be better, but this function (as ugly as it is) actually works quite well if you have a proper HTML structure with valid code.
It checks for any <code> and <pre> tags and converts any special characters inside them to their respective HTML entities. It even accounts for nested <code> and <pre> tags, as well as any attributes you give them.
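To make "special characters" concrete, here is a small sketch of what PHP's built-in htmlspecialchars() does on its own, since that is the call the function below relies on. The sample string is made up for illustration, and the explicit ENT_QUOTES and 'UTF-8' arguments are my own choice rather than something the original function passes.

```php
<?php
// Illustrative input only; any markup you want to display literally would do.
$snippet = '<a href="page.php?a=1&b=2">Tom & Jerry</a>';

// Passing the flags and charset explicitly avoids relying on version-specific defaults.
$escaped = htmlspecialchars($snippet, ENT_QUOTES, 'UTF-8');

echo $escaped, "\n";
// &lt;a href=&quot;page.php?a=1&amp;b=2&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```

Only &, <, > and the quote characters are touched; everything else passes through unchanged.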

The Function

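function fixcodeblocks($string)
{
    // Match every <code> or <pre> element, attributes included.
    // (?R) recurses into the whole pattern so nested blocks are matched too.
    return preg_replace_callback('#<(code|pre)([^>]*)>(((?!</?\1).)*|(?R))*</\1>#si', 'specialchars', $string);
}

function specialchars($matches)
{
    $open  = '<'.$matches[1].$matches[2].'>'; // opening tag with its attributes
    $close = '</'.$matches[1].'>';            // matching closing tag

    // Cut the outer tags off by length, escape what is left, then put the tags back.
    return $open.htmlspecialchars(substr($matches[0], strlen($open), -strlen($close))).$close;
}

// $html holds the markup you want to process.
echo fixcodeblocks($html);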
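One thing to watch for: if the function runs over text that has already been converted, the ampersands in the existing entities get escaped a second time. Below is a minimal sketch of a variant that can be run repeatedly, assuming you are happy to leave existing entities alone. The name escape_code_blocks() and the sample input are hypothetical; the only real change is the fourth argument to htmlspecialchars(), which switches double encoding off.

```php
<?php
// Hypothetical variant; escape_code_blocks() is a made-up name for illustration.
function escape_code_blocks(string $html): string
{
    // Recursive pattern: (?R) lets a <code> or <pre> contain another one.
    $pattern = '#<(code|pre)([^>]*)>(((?!</?\1).)*|(?R))*</\1>#si';

    $result = preg_replace_callback($pattern, function (array $m): string {
        $open  = '<' . $m[1] . $m[2] . '>';
        $close = '</' . $m[1] . '>';
        $inner = substr($m[0], strlen($open), -strlen($close));

        // The final false (double_encode) leaves entities such as &lt; untouched.
        return $open . htmlspecialchars($inner, ENT_QUOTES, 'UTF-8', false) . $close;
    }, $html);

    // preg_replace_callback() returns null on error, so fall back to the input.
    return $result ?? $html;
}

$once  = escape_code_blocks('<p>Try <code class="x"><b>bold</b> &amp; <code>nested</code></code></p>');
$twice = escape_code_blocks($once);

echo $once, "\n";
// <p>Try <code class="x">&lt;b&gt;bold&lt;/b&gt; &amp; &lt;code&gt;nested&lt;/code&gt;</code></p>

var_dump($once === $twice); // bool(true)
```

The trade-off is that a code sample which is meant to show a literal entity, such as &amp;, will not have its ampersand escaped, so this only suits content where that does not matter.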

That would turn something like this:

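<h1>Heading</h1>

<p>Some stuff here, but no actual code yet.</p>

<p>Here is our first code that needs to be escaped: <code><div id="test">Something inside a DIV</div></code></p>

<pre>
<html>
<head>
<meta name="author" content="First Last" />
</head>
<body>
<p>Hey</p>
</body>
<div id="nested">
<div id="nest-inside">
<p>Inside nest</p>
</div>
</div>
<code>Another code block inside some code.</code>
</html>
</pre>

<h2>More Content</h2>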

into this:

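<h1>Heading</h1>

<p>Some stuff here, but no actual code yet.</p>

<p>Here is our first code that needs to be escaped: <code>&lt;div id=&quot;test&quot;&gt;Something inside a DIV&lt;/div&gt;</code></p>

<pre>
&lt;html&gt;
&lt;head&gt;
&lt;meta name=&quot;author&quot; content=&quot;First Last&quot; /&gt;
&lt;/head&gt;
&lt;body&gt;
&lt;p&gt;Hey&lt;/p&gt;
&lt;/body&gt;
&lt;div id=&quot;nested&quot;&gt;
&lt;div id=&quot;nest-inside&quot;&gt;
&lt;p&gt;Inside nest&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;code&gt;Another code block inside some code.&lt;/code&gt;
&lt;/html&gt;
</pre>

<h2>More Content</h2>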


Feel free to offer any suggestions or improvements you might have, as well as any other ways of accomplishing this.