html - PHP - DOMDocument - remove tags around text based on class


I have an HTML document that I want to remove specific tags from, identified by a specific class. The tags have multiple classes. A simple example of the markup I have:

<style>.c{background-color:yellow}</style> <span class="a b c">string</span>.   <span class="a b c">another string</span>.   <span class="a b">yet string</span>. 

I want to be able to parse through the string (preferably using PHP's DOMDocument?), finding only the <span> tags with class c, so that the result is this:

<style>.c{background-color:yellow}</style> string.   another string.   <span class="a b">yet string</span>. 

Basically, I want to remove the tags around the text but preserve the text itself in the document.

Update: I think I'm close, but this doesn't work for me:

$test = '<style>.c {background-color:yellow;}</style>' .
    'this <span class="a b c">string</span>.' .
    'this <span class="a b c">another string</span>.' .
    'this <span class="a b">yet string</span>.';

$doc = new DOMDocument();
$doc->loadHTML($test);
$xpath = new DOMXPath($doc);
$query = "//span[contains(@class, 'c')]"; // Gordon
$oldnodes = $xpath->query($query);

foreach ($oldnodes as $oldnode) {
    $txt = $oldnode->nodeValue;
    // $txt is a string here, but replaceChild() expects a DOMNode
    $oldnode->parentNode->replaceChild($txt, $oldnode);
}

echo $doc->saveHTML();

You're close... Create a document fragment from the span's children:

$query = "//span[contains(concat(' ', normalize-space(@class), ' '), ' c ')]";
$oldnodes = $xpath->query($query);

foreach ($oldnodes as $node) {
    // Collect the span's children in a fragment
    $fragment = $doc->createDocumentFragment();
    while ($node->childNodes->length > 0) {
        // appendChild() moves the node, so item(0) is always the next child
        $fragment->appendChild($node->childNodes->item(0));
    }
    // Swap the span for its former children
    $node->parentNode->replaceChild($fragment, $node);
}

Each iteration removes $node from the document, so there's no need for any extra bookkeeping while iterating: the remaining matches in the result set are still visited...

This will also handle cases where you have more than just text inside the span:

<span class="a b c">foo <b>bar</b> baz</span> 
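To see the nested case end to end, here is a minimal self-contained sketch. The function name unwrapSpans and the sample markup are made up for illustration and are not part of the original answer; it also assumes the class name is a plain token with no quotes or spaces, since it is interpolated straight into the XPath expression.

```php
<?php
// Hypothetical helper (not from the original answer): replaces every <span>
// whose class list contains $class with that span's own child nodes.
// Assumption: $class is a plain token without quotes or whitespace.
function unwrapSpans(DOMDocument $doc, string $class): int
{
    $xpath = new DOMXPath($doc);
    $query = "//span[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]";
    $count = 0;

    foreach ($xpath->query($query) as $node) {
        $fragment = $doc->createDocumentFragment();
        // Appending a child to the fragment detaches it from $node,
        // so firstChild advances until the span is empty.
        while ($node->firstChild !== null) {
            $fragment->appendChild($node->firstChild);
        }
        // Put the collected children where the span used to be.
        $node->parentNode->replaceChild($fragment, $node);
        $count++;
    }

    return $count;
}

$doc = new DOMDocument();
$doc->loadHTML('<p><span class="a b c">foo <b>bar</b> baz</span> and <span class="a b">kept</span></p>');

$unwrapped = unwrapSpans($doc, 'c');
$html = $doc->saveHTML();
echo $html;
```

The <b> element survives because whole child nodes are moved, not just the text value. Note that saveHTML() on the document also emits the doctype and <html>/<body> wrappers that loadHTML() adds.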

Note on the recent edit: I changed the XPath query to be more robust, so that it matches the exact class c rather than, say, toc...
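The difference between the two queries can be checked with a small sketch like the one below. The sample markup is invented for the comparison; only the two XPath expressions come from the posts above.

```php
<?php
// Invented sample: one span with class c, one with class toc,
// and one with class c surrounded by extra whitespace.
$doc = new DOMDocument();
$doc->loadHTML(
    '<p><span class="a b c">one</span>' .
    '<span class="toc">two</span>' .
    '<span class="a  c  b">three</span></p>'
);
$xpath = new DOMXPath($doc);

// Substring test: matches any class attribute containing the letter "c".
$loose = $xpath->query("//span[contains(@class, 'c')]");

// Token test: pad the normalized class list with spaces and look for " c ".
$exact = $xpath->query("//span[contains(concat(' ', normalize-space(@class), ' '), ' c ')]");

echo $loose->length, "\n"; // 3 (also matches "toc")
echo $exact->length, "\n"; // 2 (only spans that really have class c)
```

normalize-space() collapses runs of whitespace, which is why the third span still matches the token test.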

What's weird is that this allows you to remove nodes inside the iteration without affecting the results (I know it's been done before, I just don't know why it works here). I tested this code, and it should be good.
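A likely explanation, offered here as an assumption rather than something the answer states: the list returned by DOMXPath::query() behaves like a snapshot taken when the query ran, whereas the list from getElementsByTagName() is live and tracks the document. The sketch below, with invented markup, compares the two after one node is removed.

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<p><span>a</span><span>b</span><span>c</span></p>');
$xpath = new DOMXPath($doc);

$snapshot = $xpath->query('//span');            // result of an XPath query
$live     = $doc->getElementsByTagName('span'); // live view of the document

// Remove the first span from the document.
$first = $snapshot->item(0);
$first->parentNode->removeChild($first);

echo $snapshot->length, "\n"; // 3: the query result is unchanged
echo $live->length, "\n";     // 2: the live list reflects the removal
```

This is why removing nodes inside a foreach over an XPath result is safe, while the same loop over a live list can skip elements.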

