Add NodeIterator #139

Closed
wants to merge 15 commits into from
92 changes: 92 additions & 0 deletions docs/Traversal.md
@@ -0,0 +1,92 @@

# AST Traversal

All Nodes implement the `IteratorAggregate` interface, which means their immediate children can be directly traversed with `foreach`:

```php
foreach ($node as $key => $child) {
    var_dump($key);
    var_dump($child);
}
```

`$key` is set to the child name (e.g. `parameters`).
Multiple child nodes may have the same key.

The Iterator that is returned to `foreach` from `$node->getIterator()` implements the `RecursiveIterator` interface.
To traverse all descendant nodes, you need to "flatten" it with PHP's built-in `RecursiveIteratorIterator`:

```php
$it = new \RecursiveIteratorIterator($node, \RecursiveIteratorIterator::SELF_FIRST);
// Use a separate loop variable so the starting $node is not overwritten
foreach ($it as $descendant) {
    var_dump($descendant);
}
```

The code above walks all Nodes and Tokens depth-first in pre-order: each Node is visited before its children.
Passing `RecursiveIteratorIterator::CHILD_FIRST` also traverses depth-first, but in post-order (children before their parent), while `RecursiveIteratorIterator::LEAVES_ONLY` (the default) only traverses terminal Tokens.
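
To make the difference between the three modes concrete, here is a minimal sketch that runs them over a plain nested array instead of a parsed AST. The array, the `visitOrder()` helper and the use of `RecursiveArrayIterator` are stand-ins chosen for illustration; nested arrays play the role of Nodes and scalars the role of Tokens.

```php
<?php
// Stand-in tree built from plain arrays; not the parser's Node classes.
$tree = ['a' => ['b' => 1, 'c' => 2], 'd' => 3];

// Hypothetical helper: collects the keys in the order the given mode visits them.
function visitOrder(array $tree, int $mode): array {
    $it = new \RecursiveIteratorIterator(new \RecursiveArrayIterator($tree), $mode);
    $keys = [];
    foreach ($it as $key => $value) {
        $keys[] = $key;
    }
    return $keys;
}

$selfFirst = visitOrder($tree, \RecursiveIteratorIterator::SELF_FIRST);   // a, b, c, d
$childFirst = visitOrder($tree, \RecursiveIteratorIterator::CHILD_FIRST); // b, c, a, d
$leavesOnly = visitOrder($tree, \RecursiveIteratorIterator::LEAVES_ONLY); // b, c, d

echo implode(' ', $selfFirst), PHP_EOL;
echo implode(' ', $childFirst), PHP_EOL;
echo implode(' ', $leavesOnly), PHP_EOL;
```

All three modes descend depth-first; they only differ in whether a parent is reported before its children, after them, or not at all.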

## Exclude Tokens

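To exclude terminal Tokens and only traverse Nodes, wrap the `RecursiveIterator` in PHP's built-in `ParentIterator` before flattening it.
`ParentIterator` only accepts a `RecursiveIterator`, so it has to go inside the `RecursiveIteratorIterator`, not around it:

```php
$nodes = new \RecursiveIteratorIterator(new \ParentIterator($node->getIterator()), \RecursiveIteratorIterator::SELF_FIRST);
```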
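
As a rough illustration of what `ParentIterator` keeps, the sketch below applies it to a nested array rather than to a real AST. The array contents are made up; the only assumption is that elements with children (arrays here, Nodes in the parser) are kept and leaf elements (scalars here, Tokens in the parser) are dropped.

```php
<?php
// Stand-in tree: nested arrays play the role of Nodes, scalars the role of Tokens.
$tree = ['stmt' => ['expr' => ['tok' => 'x'], 'semi' => ';'], 'eof' => ''];

// ParentIterator accepts only elements for which hasChildren() is true.
$parentsOnly = new \ParentIterator(new \RecursiveArrayIterator($tree));
$it = new \RecursiveIteratorIterator($parentsOnly, \RecursiveIteratorIterator::SELF_FIRST);

$keys = [];
foreach ($it as $key => $value) {
    $keys[] = $key;
}
echo implode(' ', $keys), PHP_EOL; // stmt expr
```

`SELF_FIRST` matters here: with the default `LEAVES_ONLY` mode, elements that have children are never reported themselves, and those are the only elements `ParentIterator` lets through.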

## Skipping child traversal

To skip traversal of certain Nodes, use PHP's built-in `RecursiveCallbackFilterIterator`.
Naive example of traversing all nodes in the current scope:

```php
// Find all nodes in the current scope
$nodesInScopeReIt = new \RecursiveCallbackFilterIterator($node->getIterator(), function ($current, string $key, \RecursiveIterator $it) {
    // Reject function nodes so they are not traversed into; they form a different scope
    return !($current instanceof Node\Statement\FunctionDeclaration);
});
// Convert the RecursiveIterator to a flat Iterator
$it = new \RecursiveIteratorIterator($nodesInScopeReIt, \RecursiveIteratorIterator::SELF_FIRST);
```
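
The pruning behaviour can be tried out without a parsed file. The following sketch uses a nested array with made-up keys in place of an AST and rejects the `function` subtree by key; in the parser the check would be an `instanceof` test on `$current` instead.

```php
<?php
// Stand-in tree: the 'function' entry plays the role of a nested scope.
$tree = [
    'assign' => ['var' => '$a', 'value' => '1'],
    'function' => ['var' => '$b'],
    'echo' => ['var' => '$c'],
];

// The callback is re-applied at every level, because getChildren()
// wraps the children in a new filter that uses the same callback.
$filtered = new \RecursiveCallbackFilterIterator(
    new \RecursiveArrayIterator($tree),
    function ($current, $key, \RecursiveIterator $it) {
        return $key !== 'function';
    }
);
$it = new \RecursiveIteratorIterator($filtered, \RecursiveIteratorIterator::SELF_FIRST);

$keys = [];
foreach ($it as $key => $value) {
    $keys[] = $key;
}
echo implode(' ', $keys), PHP_EOL; // assign var value echo var
```

Note that a rejected element is dropped together with its subtree, so the `function` entry itself does not show up in the result either.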

## Filtering

Building on that example, to get all variables in that scope use a non-recursive `CallbackFilterIterator`:

```php
// Keep only variables whose name is a plain Token
$vars = new \CallbackFilterIterator($it, function ($current, string $key, \Iterator $it) {
    return $current instanceof Node\Expression\Variable && $current->name instanceof Token;
});

foreach ($vars as $var) {
    // getText() returns the source text of the variable, e.g. `$foo`
    echo $var->getText() . PHP_EOL;
}
```
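
For readers unfamiliar with `CallbackFilterIterator`, here is a small self-contained sketch of the same filtering step on a flat list of made-up values. The "starts with `$`" test is only a stand-in for the `instanceof` checks used above.

```php
<?php
// A flat iterator with mixed values, standing in for the flattened AST.
$items = new \ArrayIterator(['$a', 'foo', '$b', 42]);

// Keep only strings that look like variable names.
$vars = new \CallbackFilterIterator($items, function ($current, $key, \Iterator $it) {
    return is_string($current) && strncmp($current, '$', 1) === 0;
});

// Pass false so the result is re-indexed from 0.
$names = iterator_to_array($vars, false);
echo implode(' ', $names), PHP_EOL; // $a $b
```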

## Traversing ancestors

Use the `NodeAncestorIterator` to walk the AST upwards from a Node to the root.
Example that finds the closest namespace Node to a Node:

```php
use Microsoft\PhpParser\Iterator\NodeAncestorIterator;
use Microsoft\PhpParser\Node;

foreach (new NodeAncestorIterator($node) as $ancestor) {
if ($ancestor instanceof Node\Statement\NamespaceDefinition) {
var_dump($ancestor->name);
break;
}
}
```
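
The upward walk itself is just a loop over `parent` links. The sketch below reproduces it with a hypothetical `StubNode` class and a `walkUp()` generator, neither of which is part of the library; it assumes, like the iterator in this pull request, that the walk starts at the given node itself and stops once `parent` is `null`.

```php
<?php
// Hypothetical stand-in for a parser Node: only the parent link matters here.
final class StubNode {
    public $label;
    public $parent;

    public function __construct(string $label, ?StubNode $parent = null) {
        $this->label = $label;
        $this->parent = $parent;
    }
}

// Yields the start node first, then each parent up to the root.
function walkUp(StubNode $start): \Generator {
    for ($current = $start; $current !== null; $current = $current->parent) {
        yield $current;
    }
}

$root = new StubNode('root');
$namespace = new StubNode('namespace', $root);
$variable = new StubNode('variable', $namespace);

$labels = [];
foreach (walkUp($variable) as $ancestor) {
    $labels[] = $ancestor->label;
}
echo implode(' > ', $labels), PHP_EOL; // variable > namespace > root
```

Because the start node is yielded first, a search like the namespace lookup above also matches when the starting node is itself the node being looked for.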

## Converting to an array

You can convert your iterator to a flat array with

```php
$arr = iterator_to_array($it, false);
```

Passing `false` for `$preserve_keys` ensures that the array is indexed numerically and not by Iterator keys (otherwise later Nodes with the same key will override previous Nodes).
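
The effect of the second argument is easy to see with a generator that yields the same key twice. The `children()` generator and its values are invented for this sketch; the keys mimic child names such as `parameters`.

```php
<?php
// Hypothetical generator that yields duplicate keys, like an iterator over child names.
function children(): \Generator {
    yield 'parameters' => 'first';
    yield 'parameters' => 'second';
    yield 'body' => 'third';
}

// Keys preserved: the second 'parameters' entry overwrites the first.
$keyed = iterator_to_array(children(), true);

// Keys discarded: every element survives, indexed 0, 1, 2.
$flat = iterator_to_array(children(), false);

var_dump(count($keyed)); // int(2)
var_dump(count($flat));  // int(3)
```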
Empty file removed docs/a.md
Empty file.
2 changes: 2 additions & 0 deletions phpunit.xml
@@ -35,6 +35,8 @@
<file>tests/api/getResolvedName.php</file>
<file>tests/api/PositionUtilitiesTest.php</file>
<file>tests/api/TextEditTest.php</file>
<file>tests/api/NodeIteratorTest.php</file>
<file>tests/api/NodeAncestorIteratorTest.php</file>
</testsuite>

<testsuite name="performance">
75 changes: 75 additions & 0 deletions src/Iterator/NodeAncestorIterator.php
@@ -0,0 +1,75 @@
<?php
declare(strict_types = 1);

namespace Microsoft\PhpParser\Iterator;

use Microsoft\PhpParser\Node;

/**
 * An Iterator to walk the ancestors of a Node up to the root
 */
class NodeAncestorIterator implements \Iterator {

    /**
     * @var Node
     */
    private $start;

    /**
     * @var Node
     */
    private $current;

    /**
     * @param Node $node The node to start with
     */
    public function __construct(Node $node) {
        $this->start = $node;
    }

    /**
     * Rewinds the Iterator to the beginning
     *
     * @return void
     */
    public function rewind() {
        $this->current = $this->start;
    }

    /**
     * Returns `true` if `current()` can be called to get the current node.
     * Returns `false` if the last Node was the root node.
     *
     * @return bool
     */
    public function valid() {
        return $this->current !== null;
    }

    /**
     * Always returns null.
     *
     * @return null
     */
    public function key() {
        return null;
    }

    /**
     * Returns the current Node
     *
     * @return Node
     */
    public function current() {
        return $this->current;
    }

    /**
     * Advances the Iterator to the parent of the current Node
     *
     * @return void
     */
    public function next() {
        $this->current = $this->current->parent;
    }
}
161 changes: 161 additions & 0 deletions src/Iterator/NodeIterator.php
@@ -0,0 +1,161 @@
<?php
declare(strict_types = 1);

namespace Microsoft\PhpParser\Iterator;

use Microsoft\PhpParser\{Node, Token};

/**
 * An Iterator over the children (direct descendants) of a Node.
 * Wrap it in a \RecursiveIteratorIterator to reach all descendants.
 */
class NodeIterator implements \RecursiveIterator {

    /**
     * The Node being iterated
     *
     * @var Node
     */
    private $node;

    /**
     * The current index in the CHILD_NAMES array
     *
     * @var int
     */
    private $childNamesIndex;

    /**
     * The length of the CHILD_NAMES array for the current node
     *
     * @var int
     */
    private $childNamesLength;

    /**
     * The current index of value at the current child name, if the value is an array.
     * Otherwise null.
     *
     * @var int|null
     */
    private $valueIndex;

    /**
     * The length of the array at the current child name, if the value is an array.
     * Otherwise null.
     *
     * @var int|null
     */
    private $valueLength;

    /**
     * The CHILD_NAMES array of the Node being iterated
     *
     * @var string[]
     */
    private $childNames;

    /**
     * The child name at the current index in the CHILD_NAMES array
     *
     * @var string|null
     */
    private $childName;

    /**
     * @param Node $node The node that should be iterated
     */
    public function __construct(Node $node) {
        $this->node = $node;
        $this->childNames = $node::CHILD_NAMES;
        $this->childNamesLength = \count($node::CHILD_NAMES);
    }

    /**
     * Rewinds the Iterator to the beginning
     *
     * @return void
     */
    public function rewind() {
        $this->childNamesIndex = -1;
        // Also reset the array position, so that next() moves to the first child name
        // even if the Iterator was previously stopped inside a value array
        $this->valueIndex = null;
        $this->valueLength = null;
        $this->next();
    }

    /**
     * Returns `true` if `current()` can be called to get the current child.
     * Returns `false` if this Node has no more children (direct descendants).
     *
     * @return bool
     */
    public function valid() {
        return $this->childNamesIndex < $this->childNamesLength;
    }

    /**
     * Returns the current child name being iterated.
     * Multiple values may have the same key.
     *
     * @return string
     */
    public function key() {
        return $this->childName;
    }

    /**
     * Returns the current child (direct descendant)
     *
     * @return Node|Token
     */
    public function current() {
        if ($this->valueIndex === null) {
            return $this->node->{$this->childName};
        } else {
            return $this->node->{$this->childName}[$this->valueIndex];
        }
    }

    /**
     * Advances the Iterator to the next child (direct descendant)
     *
     * @return void
     */
    public function next() {
        if ($this->valueIndex === $this->valueLength) {
            // If not iterating value array or finished with it, go to next child name
            $this->childNamesIndex++;
            if ($this->childNamesIndex === $this->childNamesLength) {
                // If child names index is invalid, become invalid
                return;
            }
            $this->childName = $this->childNames[$this->childNamesIndex];
            $value = $this->node->{$this->childName};
            // If new value is null or empty array, skip it
            if (empty($value)) {
                $this->next();
            } elseif (\is_array($value)) {
                // If new value is an array, start index at 0
                $this->valueIndex = 0;
                $this->valueLength = \count($value);
            } else {
                // Else reset everything to null
                $this->valueIndex = null;
                $this->valueLength = null;
            }
        } else {
            // Else go to next item in value array
            $this->valueIndex++;
            // If the new item is null or the end of the array was reached, skip ahead
            if (empty($this->node->{$this->childName}[$this->valueIndex])) {
                $this->next();
            }
        }
    }

    /**
     * Returns true if the current child is another Node (not a Token)
     * and can be used to create another NodeIterator
     *
     * @return bool
     */
    public function hasChildren(): bool {
        return $this->current() instanceof Node;
    }

    /**
     * Returns a NodeIterator for the children of the current child Node
     *
     * @return NodeIterator
     */
    public function getChildren() {
        return new NodeIterator($this->current());
    }
}
11 changes: 10 additions & 1 deletion src/Node.php
@@ -12,7 +12,7 @@
use Microsoft\PhpParser\Node\Statement\NamespaceDefinition;
use Microsoft\PhpParser\Node\Statement\NamespaceUseDeclaration;

-abstract class Node implements \JsonSerializable {
+abstract class Node implements \JsonSerializable, \IteratorAggregate {
/** @var array[] Map from node class to array of child keys */
private static $childNames = [];

@@ -149,6 +149,15 @@ public function getRoot() : Node {
        return $node;
    }

    /**
     * Gets a RecursiveIterator over the immediate children of this Node.
     * Wrap it in a \RecursiveIteratorIterator to iterate all descendants.
     *
     * @return Iterator\NodeIterator
     */
    public function getIterator() {
        return new Iterator\NodeIterator($this);
    }

    /**
     * Gets generator containing all descendant Nodes and Tokens.
     *