This repository has been archived by the owner on Jun 4, 2019. It is now read-only.

CodeQuery

Dan Miller edited this page Nov 9, 2015 · 12 revisions

codequery is an interactive tool, à la SQL, for querying information about the structure of code (the inheritance tree, the call graph, the data graph, etc.). The data is the code. The query language is Prolog (http://en.wikipedia.org/wiki/Prolog), a logic programming language used mainly in AI but also popular for database queries (see http://en.wikipedia.org/wiki/Datalog). The particular Prolog implementation we use is SWI-Prolog (http://www.swi-prolog.org/pldoc/refman/).

By default, when you give codequery just a directory, it builds the Prolog database and then enters Prolog's read-eval-print loop. After the ?- prompt, you can enter a query terminated by a period. For instance:

$ cd /tmp/test/
$ cat foo.php
  <?php
  class A {
  }
  class B extends A {
  }
  class C extends B {
  }
$ codequery .
  generating prolog facts in /tmp/test/facts.pl
  compiling prolog facts with swipl in /tmp/test/prolog_compiled_db
  % /tmp/test/facts.pl compiled 0.00 sec, 13,984 bytes
  % /home/pad/pfff/h_program-lang/database_code.pl compiled 0.00 sec, 19,072 bytes
  ...
  Welcome to SWI-Prolog (Multi-threaded, 64 bits, Version 5.11.29-39-g35fdbf2)
  Copyright (c) 1990-2011 University of Amsterdam, VU Amsterdam
  SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software,
  and you are welcome to redistribute it under certain conditions.
  Please visit http://www.swi-prolog.org for details.

  For help, use ?- help(Topic). or ?- apropos(Word).

  ?- children(X, 'A').

Prolog then tries to find a solution by unifying X with a value that satisfies the query, given all the facts in the database (see facts.pl in the same directory). It shows only one solution at a time; to get the next one, type a semicolon:

X = 'B' ;
X = 'C' ;
false.

?-
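Typing a semicolon after every answer gets tedious when there are many solutions. The following is a minimal, standalone sketch of how to collect them all at once with findall/3. The extends/2 facts and the children/2 rule here are assumptions made for illustration; the real facts.pl and database_code.pl may define them differently, so load this file in a plain swipl session rather than inside codequery.

```prolog
% Hypothetical facts standing in for what might be generated for foo.php.
extends('B', 'A').
extends('C', 'B').

% Assumed definition: X is a child of Parent if it extends Parent
% directly or through a chain of intermediate classes.
children(Child, Parent) :-
    extends(Child, Parent).
children(GrandChild, Parent) :-
    extends(Child, Parent),
    children(GrandChild, Child).

% Collect every solution into a list instead of pressing ';' repeatedly.
all_children(Parent, Children) :-
    findall(X, children(X, Parent), Children).
```

With these facts, the query ?- all_children('A', L). answers L = ['B', 'C'], the same two solutions as the interactive session above.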

For the source of the command-line tool, see https://github.com/facebook/pfff/blob/master/main_codequery.ml

The synopsis is:

$ codequery [-lang <string>] <dir>

The facts.pl file generated in the directory will contain the set of facts about your codebase.

The pfff/h_program-lang/database_code.pl file contains some helper predicates. See https://github.com/facebook/pfff/blob/master/h_program-lang/database_code.pl to find out which predicates are available and what they mean.

See also https://github.com/facebook/pfff/blob/master/lang_php/analyze/foundation/unit_prolog_php.ml for example queries.

children(X, 'InterestingClass'), writeln(X), fail
children('InterestingClass', X), writeln(X), fail
kind(X, class), at(X, A, _), sub_string(A, 0, _, _, 'some/interesting/code'), writeln(X), fail
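The queries above all end in writeln(X), fail. This is the standard Prolog failure-driven loop: fail rejects each solution after it has been printed, which forces Prolog to backtrack into the next one until none are left. A minimal sketch follows, using hypothetical extends/2 facts rather than the real database, together with the equivalent forall/2 form.

```prolog
% Hypothetical facts for illustration only.
extends('B', 'A').
extends('C', 'A').

% Failure-driven loop: print one solution, then fail to force
% backtracking into the next solution.
print_subclasses(Parent) :-
    extends(X, Parent),
    writeln(X),
    fail.
print_subclasses(_).   % succeed once every solution has been printed

% The same loop written with forall/2, which succeeds on its own.
print_subclasses_forall(Parent) :-
    forall(extends(X, Parent), writeln(X)).
```

At the prompt, a query ending in fail always answers false. after printing; that is expected and does not indicate an error.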

In this example, we look for all subclasses of WebController that have a genResponse() method which invokes a getResponse() method.

children(X, 'WebController'), docall((X, 'genResponse'), getResponse, method), writeln(X), fail
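To see what this query matches, here is a small standalone sketch. The class names, the extends/2 and docall/3 facts, and the simplified children/2 rule (direct subclasses only) are all made up for illustration and are not taken from a real facts.pl.

```prolog
% Hypothetical facts: two controllers, only one of which calls
% getResponse from its genResponse method.
extends('HomeController', 'WebController').
extends('ApiController', 'WebController').
docall(('HomeController', 'genResponse'), 'getResponse', method).
docall(('ApiController', 'genResponse'), 'render', method).

% Simplified stand-in for the real helper: direct subclasses only.
children(Child, Parent) :-
    extends(Child, Parent).

% Subclasses of WebController whose genResponse calls getResponse.
matching_controllers(Classes) :-
    findall(X,
            ( children(X, 'WebController'),
              docall((X, 'genResponse'), getResponse, method) ),
            Classes).
```

With these facts, ?- matching_controllers(Cs). answers Cs = ['HomeController'], because ApiController's genResponse calls a different method.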

I have no idea what the next query does, but it's an example in FB's wiki, so it may be useful to you.

docall((Class, Method), 'delegateToYield', method), not((children(Class, Parent), kind((Parent, Method), Kind))), writeln((Class, Method)), fail
aggregate(count, A^docall(A, B, function), Count), writeln((Count, B)), fail

Then pipe the output of that query through | sort -rn | head -50
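The counting, sorting and truncating can also be done inside Prolog instead of the shell. The sketch below assumes a SWI-Prolog recent enough to provide sort/4 (newer than the 5.11 build shown in the session above); the docall/3 facts and the top_callees/2 and take/3 helpers are hypothetical names introduced for illustration.

```prolog
:- use_module(library(aggregate)).
:- use_module(library(lists)).

% Hypothetical call-graph facts: docall(Caller, Callee, Kind).
docall('f', 'strlen', function).
docall('g', 'strlen', function).
docall('g', 'count', function).

% Count callers per callee, sort by descending count, keep the first N.
top_callees(N, Top) :-
    findall(Count-Callee,
            aggregate(count, Caller^docall(Caller, Callee, function), Count),
            Pairs),
    sort(0, @>=, Pairs, Sorted),   % descending, duplicates kept
    take(N, Sorted, Top).

% take(N, List, Prefix): Prefix is the first N elements of List,
% or the whole list when it has fewer than N elements.
take(N, List, Prefix) :-
    length(List, Len),
    K is min(N, Len),
    length(Prefix, K),
    append(Prefix, _, List).
```

For example, ?- top_callees(50, Top). plays the role of | sort -rn | head -50 on the aggregated counts.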

children(X, 'IInterestingInterface'), kind(X, class), writeln(X), fail

You may want to pipe that to | sort | uniq

kind((A, '__construct'), _), docall((A, '__construct'), B, class), at((A, '__construct'), File, Line), writeln((File, A, B)), fail
docall(X, B, function), at(B, File, Col), file(File, Dir), member('PHP_STDLIB', Dir), writeln(B), fail

You may want to pipe that to | sort | uniq -c

kind((C,F), field), is_private((C,F)), use((C,'__construct'), F, field, write), \+ (use((C, M), F, field, write), M \= '__construct'), type((C,F),T), writeln((C,F,T)), fail
kind(Trait, trait), kind((Trait, Method), method), is_private((Trait, Method)), static((Trait, Method)), writeln((Trait, Method)), fail
kind(X, class), aggregate_all(count, children(_, X), Count), open('results.txt', append, Stream), write(Stream, (X, Count)), nl(Stream), close(Stream), fail.

Why use Prolog?

OCaml, now Prolog... Why use those French esoteric programming languages? Because I don't know how to use SQL or PHP, and Prolog is arguably a very good language for querying a database. See http://en.wikipedia.org/wiki/Datalog

To exit the Prolog prompt, press Ctrl-D multiple times.