Token list for PHP

troels_kn
Posts: 32
Joined: Fri Oct 28, 2005 12:51 pm

Post by troels_kn »

The following PHP script tokenizes a PHP source file and lists its classes, interfaces and functions, each with its line number and a short excerpt of its docblock comment, if it has one.

Save the script below as tokens.php, then create a new tool with the following settings:
Command: php (you may need to give the full path to php.exe)
Parameters: "C:\Program Files\TextPad 5\System\tokens.php" "$File"
[X] Capture output
[X] Sound alert when completed
Leave the other checkboxes empty.
Regular expression to match output: ^\([0-9]+\)
Registers:
File: <empty> Line: 1 Column: <empty>

<?php
if (!defined('T_UNSPECIFIED_STRING')) {
  define('T_UNSPECIFIED_STRING', -1);
}
function token_get_all_improved($data) {
  $tokens = Array();
  $line = 1;
  $col = 0;
  $level = 0;
  $scope_level = NULL;
  $in_scope = FALSE;
  foreach (token_get_all($data) as $token) {
    if (is_array($token)) {
      list ($token, $text) = $token;
    } else if (is_string($token)) {
      $text = $token;
      $token = T_UNSPECIFIED_STRING;
    }
    if ($token === T_CURLY_OPEN || $token === T_DOLLAR_OPEN_CURLY_BRACES || $text == '{') {
      ++$level;
      if (is_null($scope_level)) {
        $scope_level = $level;
      }
    } else if ($text == '}') {
      --$level;
      if ($in_scope && $level < $scope_level) {
        $in_scope = FALSE;
      }
    }
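    // Track the position of the end of this token.
    $numNewLines = substr_count($text, "\n");
    if (1 <= $numNewLines) {
      $line += $numNewLines;
      // Restart the column count; only the text after the last newline counts.
      $col = 0;
      $text_after_newline = (string) substr($text, strrpos($text, "\n") + 1);
    } else {
      $text_after_newline = $text;
    }
    $col += strlen($text_after_newline);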

    if ($token === T_INTERFACE || $token === T_CLASS) {
      $in_scope = TRUE;
      $scope_level = NULL;
    }

    $xtoken = new StdClass();
    $xtoken->type = $token;
    $xtoken->text = $text;
    $xtoken->line = $line;
    $xtoken->col = $col;
    $xtoken->blockLevel = $level;
    $xtoken->isClassScope = $in_scope && !is_null($scope_level);
    $tokens[] = $xtoken;
  }
  return $tokens;
}

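function docblock_excerpt($str) {
  // Return the first line of text after the opening "/**", or "" if there is no docblock.
  if (preg_match('~\*{2}[\s\n*]+(.*)~', trim((string) $str, "/"), $matches)) {
    return rtrim($matches[1]);
  }
  return "";
}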

function parse_file($file) {
  $buffer = NULL;
  $docblock = NULL;
  $results = Array();
  foreach (token_get_all_improved(file_get_contents($file)) as $token) {
    switch ($token->type) {
      case T_DOC_COMMENT:
        $docblock = $token->text;
        break;
      case T_INTERFACE:
      case T_CLASS:
      case T_FUNCTION:
        $buffer = $token;
        break;
      case T_STRING:
        if (!is_null($buffer)) {
          $buffer->isMember = ($buffer->type != T_FUNCTION) || $buffer->isClassScope;
          $buffer->docblock = $docblock;
          $buffer->name = $token->text;
          $results[] = $buffer;
          $buffer = NULL;
          $docblock = NULL;
        }
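        break;
      case T_UNSPECIFIED_STRING:
        // "(" before any name means an anonymous function or class: nothing to list.
        if ($token->text == '(') {
          $buffer = NULL;
        }
        break;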
    }
  }
  return $results;
}

function results_to_table($results) {
  $view = Array();
  $last = NULL;
  foreach ($results as $token) {
    if ($last && ((!$token->isMember && $last->isMember) || (in_array($token->type, Array(T_INTERFACE, T_CLASS))))) {
      $view[] = Array("", "", "", "");
    }
    $last = $token;

    $view[] = Array(
      $token->line,
      strtolower(str_replace("T_", "", token_name($token->type))),
      $token->name,
      docblock_excerpt($token->docblock)
    );
  }
  return $view;
}

function format_table($map) {
  if (count($map) == 0) {
    return ""; // nothing was found, so there is no table to format
  }
  $out = Array();
  $column_widths = array_fill(0, count($map[0]), 0);
  foreach ($map as $row) {
    foreach ($row as $num => $col) {
      // Cast so that integer line numbers and missing excerpts are measured as strings.
      $column_widths[$num] = max($column_widths[$num], strlen((string) $col));
    }
  }
  foreach ($map as $row) {
    $line = "";
    foreach ($row as $num => $col) {
      $line .= str_pad((string) $col, $column_widths[$num] + 2);
    }
    $out[] = $line;
  }
  return implode("\n", $out);
}

if (!isset($argv[1])) {
  fwrite(STDERR, "Usage: php tokens.php <file>\n");
  exit(1);
}

print(
  format_table(
    results_to_table(
      parse_file($argv[1]))) . "\n");
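
For anyone who wants to see the core idea in isolation: the sketch below is not part of the script above, just a stripped-down illustration of the same approach (wait for a class/interface/function keyword, then take the next T_STRING as its name). It relies on the line number that token_get_all() puts in the third element of each array token, so it does no line counting of its own. The names list_declarations and $source, and the sample input, are made up for this example.

```php
<?php
// Hypothetical sample input; any PHP source string works here.
$source = <<<'SRC'
<?php
/**
 * Greets people.
 */
class Greeter {
  function hello() {}
}
function standalone() {}
SRC;

// Returns one "line kind name" string per class, interface or function declaration.
function list_declarations($source) {
  $found = array();
  $pending = null; // keyword token still waiting for its name
  foreach (token_get_all($source) as $token) {
    if (!is_array($token)) {
      // Single-character tokens come back as plain strings. A "(" before
      // any name means an anonymous function, so forget the keyword.
      if ($token === '(') {
        $pending = null;
      }
      continue;
    }
    list($id, $text, $line) = $token;
    if ($id === T_CLASS || $id === T_INTERFACE || $id === T_FUNCTION) {
      $pending = array($line, strtolower($text));
    } else if ($id === T_STRING && $pending !== null) {
      $found[] = $pending[0] . ' ' . $pending[1] . ' ' . $text;
      $pending = null;
    }
  }
  return $found;
}

print(implode("\n", list_declarations($source)) . "\n");
```

The full script does more than this (class scope tracking, docblock excerpts, column alignment), so treat the sketch only as a way to experiment with the tokenizer.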
kAlvaro
Posts: 17
Joined: Mon Apr 28, 2003 7:46 am

Post by kAlvaro »

I believe there's a typo. Where it says:

Regular expression to match output: ^\([0-9]+\)

it should say:

Regular expression to match output: ^([0-9]+)
ben_josephs
Posts: 2456
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

They're both correct. Which one to use depends on the regular expression syntax that is selected when you enter the expression.

The \( ... \) form is correct for the default style.
The ( ... ) form is correct for POSIX style.

You can select the style with
Configure | Preferences | Editor
[X] Use POSIX regular expression syntax

Recommendation: use POSIX style. It reduces regular expression backslashitis and unreadability.
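
To check what the expression captures outside TextPad, here is a small sketch in PHP. It assumes PHP's preg_match(), which uses PCRE rather than TextPad's own engine; PCRE groups with bare parentheses, like the POSIX style form above. The function name extract_line_number and the sample output line are invented for the illustration.

```php
<?php
// Hypothetical output line, in the shape the tool prints:
// line number first, then type, name and docblock excerpt.
$outputLine = "42  function  parse_file  Parses a file";

// Returns the leading line number, or null when the line has none
// (for example the blank separator rows between classes).
function extract_line_number($outputLine) {
  if (preg_match('/^([0-9]+)/', $outputLine, $matches)) {
    return (int) $matches[1];
  }
  return null;
}

var_dump(extract_line_number($outputLine));
```

The first capture group is what the "Line: 1" register setting refers to, which is why File and Column can stay empty.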