In short, I need a function which attempts a rudimentary code fix by adding brackets/quotes where necessary, for parsing purposes. That is, the resulting code is not expected to be runnable.
Let's see a few examples:
[1] class Aaa { $var a = "hi"; => class Aaa { $var a = "hi"; }
[2] $var a = "hi"; } => { $var a = "hi"; }
[3] class { a = "hi; function b( } => class { a = "hi; function b( }"}
[4] class { a = "hi"; function b( } => class { a = "hi"; function b() {}}
PS: The 4th example above looks quite complicated, but in fact, it's quite easy. If the engine finds an ending bracket token which doesn't match the top of the stack, it should insert the opposite token before that one. As you can see, this works pretty well.
As a function signature, it looks like: balanceTokens($code, $bracket_tokens, $quote_tokens)
The function I wrote works using a stack. Well, it doesn't exactly work, but it does use a stack.
function balanceTokens($code, $bracket_tokens, $quote_tokens){
    $stack = array(); $last = null; $result = '';
    foreach(str_split($code) as $c){
        if($last==$c && in_array($c, $quote_tokens)){
            // handle closing string
            array_pop($stack);
        }elseif(!in_array($last, $quote_tokens)){
            // handle other tokens
            if(isset($bracket_tokens[$c])){
                // handle beginning bracket
                $stack[] = $c;
            }elseif(($p = array_search($c, $bracket_tokens)) != false){
                // handle ending bracket
                $l = array_pop($stack);
                if($l != $p)$result .= $p;
            }elseif(isset($quote_tokens[$c])){
                // handle beginning quote
                $stack[] = $c;
                $last = $c;
            }// else other token...
        }
        $result .= $c;
    }
    // perform fixes
    foreach($stack as $token){
        // fix ending brackets
        if(isset($bracket_tokens[$token]))
            $result .= $bracket_tokens[$token];
        // fix beginning brackets
        if(in_array($token, $bracket_tokens))
            $result = $token . $result;
    }
    return $result;
}
The function is called like this:
$new_code = balanceTokens(
    $old_code,
    array(
        '<' => '>',
        '{' => '}',
        '(' => ')',
        '[' => ']',
    ),
    array(
        '"' => '"',
        "'" => "'",
    )
);
Yes, it's quite generic, there aren't any hard-coded tokens.
I haven't the slightest idea why it's not working...as a matter of fact, I don't even know if it should work. I admit I didn't put much thought into writing it. Maybe there are obvious issues which I'm not seeing.
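One issue can be checked in isolation, offered as an observation rather than a full diagnosis: `$last` is set when a quote opens but is never cleared when it closes, so once the first string has ended, every non-quote character skips the bracket branch. The sketch below mirrors only that control flow; the helper name `charsReachingBracketBranch` is hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper: copies only the $last handling of the function above
// and collects the characters that reach the bracket/quote-handling branch.
function charsReachingBracketBranch(string $code, array $quoteTokens): string
{
    $last = null;
    $seen = '';
    foreach (str_split($code) as $c) {
        if ($last == $c && in_array($c, $quoteTokens)) {
            // closing quote: the original pops the stack here,
            // but $last keeps holding the quote character
        } elseif (!in_array($last, $quoteTokens)) {
            // the original inspects brackets and opening quotes here
            $seen .= $c;
            if (isset($quoteTokens[$c])) {
                $last = $c; // opening quote
            }
        }
    }
    return $seen;
}

echo charsReachingBracketBranch('a "b" (c)', ['"' => '"', "'" => "'"]), "\n";
```

With this sample input only `a "` reaches the branch, so the parentheses after the string are never examined. Resetting `$last` to `null` when the closing quote is handled would be one way to address this particular problem; it does not cover the other rules from the examples.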
An alternative implementation (which does more aggressive balancing):
function balanceTokens($code) {
    $tokens = [
        '{' => '}',
        '[' => ']',
        '(' => ')',
        '"' => '"',
        "'" => "'",
    ];
    $closeTokens = array_flip($tokens);
    $stringTokens = ['"' => true, "'" => true];
    $stack = [];
    for ($i = 0, $l = strlen($code); $i < $l; ++$i) {
        $c = $code[$i];
        // push opening tokens to the stack (for " and ' only if there is no " or ' opened yet)
        if (isset($tokens[$c]) && (!isset($stringTokens[$c]) || end($stack) != $c)) {
            $stack[] = $c;
        // closing tokens have to be matched up with the stack elements
        } elseif (isset($closeTokens[$c])) {
            $matched = false;
            while ($top = array_pop($stack)) {
                // stack has matching opening for current closing
                if ($top == $closeTokens[$c]) {
                    $matched = true;
                    break;
                }
                // stack has unmatched opening, insert closing at current pos
                $code = substr_replace($code, $tokens[$top], $i, 0);
                $i++;
                $l++;
            }
            // unmatched closing, insert opening at start
            if (!$matched) {
                $code = $closeTokens[$c] . $code;
                $i++;
                $l++;
            }
        }
    }
    // any elements still on the stack are unmatched opening, so insert closing
    while ($top = array_pop($stack)) {
        $code .= $tokens[$top];
    }
    return $code;
}
Some examples:
$tests = array(
    'class Aaa { public $a = "hi";',
    '$var = "hi"; }',
    'class { a = "hi; function b( }',
    'class { a = "hi"; function b( }',
    'foo { bar[foo="test',
    'bar { bar[foo="test] { bar: "rgba(0, 0, 0, 0.1}',
);
Passing those to the function gives:
class Aaa { public $a = "hi";}
{$var = "hi"; }
class { a = "hi; function b( )"}
class { a = "hi"; function b( )}
foo { bar[foo="test"]}
bar { bar[foo="test"] { bar: "rgba(0, 0, 0, 0.1)"}}
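To sanity-check results like these, a small checker can confirm that every token in a string is matched. This is only a sketch: the function name `isBalanced` is hypothetical, and it assumes the same simplification as the balancer, namely that quotes are treated as ordinary stack tokens and brackets inside strings still count.

```php
<?php
// Hypothetical checker, not part of the implementation above: returns true
// when every bracket and quote in $code has a matching partner.
function isBalanced(string $code): bool
{
    $pairs = ['{' => '}', '[' => ']', '(' => ')'];
    $closers = array_flip($pairs);
    $quotes = ['"' => true, "'" => true];
    $stack = [];
    foreach (str_split($code) as $c) {
        if (isset($quotes[$c])) {
            // a quote closes only if the same quote is on top of the stack
            if (end($stack) === $c) {
                array_pop($stack);
            } else {
                $stack[] = $c;
            }
        } elseif (isset($pairs[$c])) {
            $stack[] = $c;
        } elseif (isset($closers[$c])) {
            // a closing bracket must match the most recent opening token
            if (array_pop($stack) !== $closers[$c]) {
                return false;
            }
        }
    }
    return $stack === [];
}

$outputs = [
    'class Aaa { public $a = "hi";}',
    '{$var = "hi"; }',
    'class { a = "hi; function b( )"}',
    'class { a = "hi"; function b( )}',
    'foo { bar[foo="test"]}',
    'bar { bar[foo="test"] { bar: "rgba(0, 0, 0, 0.1)"}}',
];
foreach ($outputs as $out) {
    echo isBalanced($out) ? 'balanced:   ' : 'unbalanced: ', $out, "\n";
}
```

Each of the six outputs listed above passes this check, while an unbalanced input such as `$var = "hi"; }` does not.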