
In JavaScript, every function is an object, so you can call toString() on it, e.g. (function(){}).toString(), to get its source code as a string.

I'm working on a function that does the same job in PHP. The intended goal is to convert code from PHP into other languages, such as JavaScript.

It looks like this so far:

    function fn_to_string($fn, $strip_comments = true) {
        static $contents_cache = array();
        static $nl = "\r\n"; # change this to how you want

        if(!is_callable($fn)) return ''; # it should be a function
        if(!class_exists('ReflectionFunction')) return ''; # PHP 5.1 I think

        # get function info
        $rfn = new ReflectionFunction($fn);
        $file = $rfn->getFileName();
        $start = $rfn->getStartLine();
        $end = $rfn->getEndLine();

        if(!is_readable($file)) return ''; # file should be readable

        # cache file contents for subsequent reads (in case we use multiple fns defined in the same file)
        $md5 = md5($file);
        if(!isset($contents_cache[$md5]))
            $contents_cache[$md5] = file($file, FILE_IGNORE_NEW_LINES);

        if(empty($contents_cache[$md5])) return ''; # there should be stuff in the file

        $file = $contents_cache[$md5];

        # get function code and tokens
        $code = "<?php ". implode($nl, array_slice($file, $start-1, ($end+1)-$start));
        $tokens = token_get_all( $code);

        # now let's parse the code;
        $code = '';
        $function_count = 0;
        $ignore_input = false; # we use this to get rid of "use" or function name
        $got_header = false;
        $in_function = false;
        $braces_level = 0;

        foreach($tokens as $token){
            # get the token name or string
            if(is_string($token)){
                $token_name = $token;
            }elseif(is_array($token) && isset($token[0]) ){
                $token_name = token_name($token[0]);
                $token = isset($token[1]) ? $token[1] : "";
            }else{
                continue;
            }

            # strip comments
            if( 1
                && $strip_comments
                && ($token_name == "T_COMMENT" || $token_name == "T_DOC_COMMENT" || $token_name == "T_ML_COMMENT")
            ){
                # but put back the new line
                if(substr($token,-1) == "\n")
                    $code.=$nl;
                continue;
            }

            # let's decide what to do with it now
            if($in_function){
                # nesting level
                if($token_name == "{"){
                    $braces_level++;
                    # done ignoring `use`
                    $ignore_input = false;
                }

                # append
                if( 1
                    && $function_count==1
                    && ( 0
                        # skip function names
                        || ( $ignore_input && $token_name == "(" && !$got_header && (!($ignore_input=false)) )
                        # skip function () use (...) in closures functions
                        || ( $braces_level == 0 && !$got_header && $token_name == ")" && ($ignore_input=true) && ($got_header=true) )
                        # this fall-through is intentional
                        || !$ignore_input
                    )
                ) {
                    $code .= $token;
                }

                # ending "}"
                if($token_name == "}"){
                    $braces_level--;
                    # done collecting the function
                    if($braces_level == 0)
                        $in_function = false;
                }
            }elseif($token_name == "T_FUNCTION"){
                $function_count++;
                $in_function = true;
                $ignore_input = true;
                $braces_level = 0;
                $code.=$token;

                # we can't detect this properly so bail out
                if($function_count>1){
                    $code = '';
                    break;
                }
            }
        }

        return $code;
    }

The function uses the ReflectionFunction class to determine where the passed function was declared, and token_get_all() to process the different parts of the declaration.
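
As a minimal sketch of the first half of that approach, the snippet below shows the location data that ReflectionFunction reports for a closure. The helper name `fn_location` is hypothetical and used only for illustration; the three reflection methods are the same ones called in the function above.

```php
<?php
// Hypothetical helper: reports where a function or closure was declared.
function fn_location($fn) {
    $rfn = new ReflectionFunction($fn);
    return array(
        'file'  => $rfn->getFileName(),  // false for internal (built-in) functions
        'start' => $rfn->getStartLine(), // 1-based line holding the "function" keyword
        'end'   => $rfn->getEndLine(),   // 1-based line holding the closing brace
    );
}

$square = function ($x) {
    return $x * $x;
};

$loc = fn_location($square);
printf("%s: lines %d-%d\n", $loc['file'], $loc['start'], $loc['end']);
```

Because only line numbers are reported, not column offsets, anything else on those lines has to be separated out afterwards, which is what the tokenizer pass is for.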

This works as intended:

  1. Handles function names passed as strings
  2. Handles variable functions
  3. Handles closures and lambdas
  4. Can even handle itself
  5. Can strip out comments

However,

  1. It relies on the undocumented-yet class, ReflectionFunction
  2. Fails if it can't read its own source files
  3. Fails if there are multiple functions declared on the same line(s) where the passed function was declared:

    function a(){} function b(){}
    fn_to_string('a'); // fails
  4. Cannot determine scope or context so it strips out function names and the use keyword to avoid future problems

I'm trying to determine if something like this is ready for the real world, so my questions are:

  1. Are there any reasons for which using this approach may not be a good idea?
  2. Are there any foreseeable performance issues?
  3. Are there any better alternatives?
  4. Are there any overlooked cases which the function doesn't cover?
  5. Are there server settings in which a script may not be able to read itself? For example:

    is_readable(__FILE__)===false 

    1 Answer


    It relies on the undocumented-yet class, ReflectionFunction

    ReflectionFunction is not undocumented.

    Fails if it can't read its own source files

    Seems unavoidable.

    Fails if there are multiple functions declared on the same line(s) where the passed function was declared

    Why not just return everything in the range of lines ReflectionFunction gives us, since that's as accurate as it can get? Which leads to the inevitable question: what is the use case for all of this?

    Cannot determine scope or context so it strips out function names and the use keyword to avoid future problems

    What problems? If you explained how you planned to use this, that might make more sense (context!).

    Are there any reasons for which using this approach may not be a good idea?

    Can you first explain what reasons you are doing this for in the first place?

    Are there any foreseeable performance issues?

    1. Parsing the function (token_get_all) seems unnecessary when all you want to do is get the function's source.

    2. Using the MD5 hash of the filename as a cache key seems unnecessary. Why not just use the filename itself?
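
A minimal sketch of the second point, assuming a per-request static cache is still wanted: PHP array keys can be arbitrary strings, so the path can be used as the key directly. The helper name `cached_file_lines` is hypothetical.

```php
<?php
// Hypothetical helper: returns the lines of $file, reading it from disk at most once.
function cached_file_lines($file) {
    static $cache = array(); // keyed directly by file name, no hashing step
    if (!isset($cache[$file])) {
        $lines = is_readable($file) ? file($file) : false;
        // Unreadable files are cached as an empty array so they are not retried.
        $cache[$file] = ($lines === false) ? array() : $lines;
    }
    return $cache[$file];
}
```

This keys on whatever string is passed in, so it assumes callers always use the same spelling of a path, which should hold when the path comes from getFileName().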

    Are there any better alternatives?
    Are there any overlooked cases which the function doesn't cover?

    Maybe, depending on your intended use case.

    Are there server settings in which a script may not be able to read itself

    is_readable(__FILE__)===false 

    Code run in the PHP interactive shell is one case where is_readable(__FILE__)===false.
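
Built-in functions and functions defined through eval() are in a similar position, since there is no real file behind them. A rough sketch of a guard for these cases, with the hypothetical name `fn_source_is_readable`:

```php
<?php
// Hypothetical guard: true only if the function's source can be read from a file.
function fn_source_is_readable($fn) {
    $rfn = new ReflectionFunction($fn);
    if ($rfn->isInternal()) {
        return false; // built-ins such as strlen have no PHP source file
    }
    $file = $rfn->getFileName();
    return is_string($file) && is_readable($file);
}

// A function defined through eval() reports a pseudo file name
// instead of a real path, so the readability check rejects it.
eval('function defined_in_eval() { return 1; }');

var_dump(fn_source_is_readable('strlen'));
var_dump(fn_source_is_readable('defined_in_eval'));
```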


    Some miscellaneous notes:

    • Using # for comments is unusual; you usually only see # used in the shebang line.

    • Reusing the $file variable for both a file name and an array of lines from the file is confusing.

    • Stripping comments does not seem necessary given your stated goal (acting like JavaScript's Function.prototype.toString).
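
If comment stripping turns out to be needed after all, it could live in its own helper rather than inside the extraction function. The following is only a sketch under that assumption; `strip_php_comments` is a hypothetical name, and the input is assumed to begin with an opening PHP tag so that token_get_all() treats it as code.

```php
<?php
// Hypothetical helper: removes comments from a string of PHP source code.
function strip_php_comments($code) {
    $out = '';
    foreach (token_get_all($code) as $token) {
        if (is_string($token)) {
            $out .= $token; // single-character tokens such as "{" or ";"
            continue;
        }
        list($id, $text) = $token;
        if ($id === T_COMMENT || $id === T_DOC_COMMENT) {
            // Before PHP 8.0 a one-line comment token includes its trailing
            // newline; put it back so the line structure survives.
            if (substr($text, -1) === "\n") {
                $out .= "\n";
            }
            continue;
        }
        $out .= $text;
    }
    return $out;
}
```

Comparing token IDs against the T_COMMENT and T_DOC_COMMENT constants avoids the token_name() string lookups.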


    Without knowing more about your intended use case, I'd suggest something much simpler, along the lines of:

    function fn_to_string($fn) {
        $r = new ReflectionFunction($fn);
        $file = $r->getFileName();
        if (!is_readable($file)) {
            return '';
        }
        $lines = file($file);
        $start = $r->getStartLine() - 1;
        $length = $r->getEndLine() - $start;
        return implode('', array_slice($lines, $start, $length));
    }
    • Hi there. Thank you very much for taking your time on this. The intended goal here is to convert code from PHP into other languages, like JS. Will update the question. (Commented Jun 15, 2014 at 9:05)
