Scripting tutorial
The ProbCog toolbox supports several scripting libraries for Jython and Python.
ProbCog comes with a Jython interpreter, which you can invoke as either jython or pcjython (use the latter to invoke ProbCog's interpreter when another version of Jython is already installed on your system).
For an optimized workflow, it may be desirable to script learning and inference processes such that they can, for example, easily be repeated as we obtain new data or as our models change. This page lists a few example scripts to give you an idea of how simple it really is.
Contents:
- JyProbCog: A Unified Inference Interface for Jython
- MLN Inference Scripting with Python
- BLN Inference Scripting with Jython
JyProbCog: A Unified Inference Interface for Jython

The ''jyprobcog'' library allows you to conveniently script inference tasks, making use of the Java implementations of BLNs and MLNs (J-MLNs).
Although model pools are primarily designed for use in client-server applications, we can also use them in scripts.
Model pools are defined in XML files. A very simple model pool file could look like this:
<pool>
    <model name="meals_bln" type="BLN" path="meals">
        <file type="decls" name="meals_any_for.blnd" />
        <file type="network" name="meals_any_for_functional.xml" />
        <file type="logic" name="meals_any_for_functional.blnl" />
    </model>
    <model name="alarm_mln" type="MLN" path="alarm">
        <file type="network" name="alarm-noisyor.mln" />
    </model>
    <model name="alarm_bln" type="BLN" path="alarm">
        <file type="decls" name="alarm.blnd" />
        <file type="network" name="alarm.pmml" />
        <file type="logic" name="alarm.blnl" />
        <param name="inferenceMethod" value="EnumerationAsk" />
    </model>
    <model name="smokers" type="MLN" path="smokers">
        <file type="network" name="wts.pybpll.smoking-train-smoking.mln" />
    </model>
</pool>
Source: examples/examples.pool.xml
For any model, we can specify an arbitrary number of default parameters that are always to be used for the particular model (e.g. the inferenceMethod parameter for the model alarm_bln). We can, of course, override these defaults in our scripts.
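A default from the pool file can be overridden simply by passing the corresponding keyword argument in the call. A minimal sketch, assuming the pool and evidence defined in the script below (LikelihoodWeighting is one of the BLN inference methods used elsewhere in this tutorial):

# override the pool's default inferenceMethod for this call only
pool.query("alarm_bln", ["alarm(x)"], evidence, inferenceMethod="LikelihoodWeighting")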
Here's a script that makes use of the models referenced in the pool above:
from jyprobcog import *
# load pool of models
pool = ModelPool("examples.pool.xml")
# query alarm BLN and MLN models
evidence = [
    "livesIn(James, Yorkshire)",
    "livesIn(Stefan, Freiburg)",
    "burglary(James)",
    "tornado(Freiburg)",
    "neighborhood(James, Average)",
    "neighborhood(Stefan, Bad)"]
for modelName in ("alarm_mln", "alarm_bln"):
    print "\n%s:" % modelName
    for result in pool.query(modelName, ["alarm(x)", "burglary(Stefan)"], evidence, verbose=False):
        print result
Source: examples/inference_script.py
Note that we are querying two models (one MLN and one BLN) using exactly the same interface.
Run the script using jython inference_script.py in the examples directory. The output:
alarm_mln:
0.930400 alarm(James)
0.948200 alarm(Stefan)
0.515800 burglary(Stefan)
alarm_bln:
0.972000 alarm(James)
0.962000 alarm(Stefan)
0.567000 burglary(Stefan)
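The result objects can also be processed programmatically; their string representation has the form shown in the output above (a probability followed by the ground atom). A minimal sketch, under that assumption, which collects the results of one query into a dictionary:

# collect query results into a dict mapping ground atoms to probabilities
probs = {}
for result in pool.query("alarm_bln", ["alarm(x)"], evidence, verbose=False):
    prob, atom = str(result).split(None, 1)  # e.g. "0.972000 alarm(James)"
    probs[atom.strip()] = float(prob)
print probs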
We can easily parameterize an inference procedure by specifying an arbitrary number of keyword arguments. (Above, we only had one keyword argument, verbose=False, which turned off any output generated by the inference method itself.) Here's an example that queries the meals model and uses a number of additional parameters (specified in a dictionary):
# query meals BLN using time-limited inference
queries = ["mealT", "usesAnyIn(x,Bowl,M)"]
evidence = ["takesPartIn(P, M)", "takesPartIn(P2, M)", "consumesAnyIn(P, Cereals, M)"]
params = {
    "verbose": True,
    "inferenceMethod": "BackwardSampling",
    "timeLimit": 5.0,
    "infoTime": 1.0,
}
pool.query("meals_bln", queries, evidence, **params)
Here, we are not suppressing the output generated by the inference method (verbose=True), and we are using the time-limited version of backward simulation, with the timeLimit set to 5.0 seconds. The infoTime parameter causes intermediate results to be printed every 1.0 seconds.
It is not required for a model to be part of a pool for it to be queried (though pools are convenient). We can directly construct either a BLNModel or an MLNModel and make use of exactly the same query interface as above. Here's a snippet that loads the smokers MLN and queries it:
from jyprobcog import *
# query smokers MLN (not in pool)
print "\nsmokers:"
mln = MLNModel("smokers/smoking.mln")
for result in mln.query("Smokes(Anna)", ["Smokes(Bob)", "Friends(Anna,Bob)"], verbose=False):
    print result
We could construct and query a BLN model in a similar fashion, using BLNModel(declsFilename, networkFilename, logicFilename). The query method works the same way as for model pools (only we need not specify a model name, obviously).
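For example, the alarm BLN referenced in the pool file above could be constructed directly like this (a minimal sketch, assuming the file layout from the pool definition and an evidence list along the lines of the first script):

from jyprobcog import *

# construct the alarm BLN from its declarations, network and logic files
bln = BLNModel("alarm/alarm.blnd", "alarm/alarm.pmml", "alarm/alarm.blnl")
evidence = ["burglary(James)", "livesIn(James, Yorkshire)"]
for result in bln.query(["alarm(x)"], evidence, verbose=False):
    print result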
The JyProbCog library currently supports ProbCog's Java libraries for MLN and BLN inference. If you are primarily interested in MLN inference, you have two further options, as briefly described below.
MLN Inference Scripting with Python

To use the Python-based MLN engine instead of the Java-based engine that is used by JyProbCog, we can directly apply the Python API of PyMLNs.
Here's a short Python script that computes a query using the smokers MLN:
from MLN import *
mln = MLN("wts.pybpll.smoking-train-smoking.mln")
mrf = mln.groundMRF("smoking-test-smaller.db")
queries = ["Smokes(Ann)", "Smokes(Bob)", "Smokes(Ann) ^ Smokes(Bob)"]
results = mrf.inferMCSAT(queries, verbose=False)
for query, prob in zip(queries, results):
    print " %f %s" % (prob, query)
Source: examples/smokers/pymlns_example.py
The output:
0.452800 Smokes(Ann)
0.142000 Smokes(Ann) ^ Smokes(Bob)
0.238800 Smokes(Bob)
Because we specified verbose=False, the inference process does not produce any additional output.
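Since grounding and inference are separate steps in this API, the same MLN object can be grounded against different evidence databases. A minimal sketch, assuming a second database file exists (the name other-test.db is hypothetical):

# reuse the model with a different evidence database
mrf2 = mln.groundMRF("other-test.db")  # hypothetical file name
results2 = mrf2.inferMCSAT(["Smokes(Ann)"], verbose=False)
print results2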
You can also make use of the MLN Query Tool's wrapper to make inference calls to all the engines that the tool supports. Here's an example that passes the same query to all three supported MLN inference engines:
from mlnQueryTool import MLNInfer
inf = MLNInfer()
mlnFiles = ["wts.pybpll.smoking-train-smoking.mln"]
db = "smoking-test-smaller.db"
queries = "Smokes"
output_filename = "results.txt"
allResults = {}
tasks = (("MC-SAT", "PyMLNs"), ("MC-SAT", "J-MLNs"), ("MC-SAT", "Alchemy - August 2010 (AMD64)"))
for method, engine in tasks:
    allResults[(method, engine)] = inf.run(mlnFiles, db, method, queries, engine, output_filename,
                                           saveResults=True, maxSteps=5000)
for (method, engine), results in allResults.iteritems():
    print "Results obtained using %s and %s" % (engine, method)
    for atom, p in results.iteritems():
        print " %.6f %s" % (p, atom)
Source: examples/smokers/querytool_example.py
The output:
Results obtained using PyMLNs and MC-SAT
0.469200 Smokes(Ann)
0.327400 Smokes(Bob)
Results obtained using Alchemy - August 2010 (AMD64) and MC-SAT
0.472203 Smokes(Ann)
0.343616 Smokes(Bob)
Results obtained using J-MLNs and MC-SAT
0.474200 Smokes(Ann)
0.345400 Smokes(Bob)
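Because allResults maps (method, engine) pairs to dictionaries of atom probabilities, comparing engines is straightforward. A short sketch in plain Python over the results collected above:

# compare PyMLNs and J-MLNs probabilities atom by atom
pyResults = allResults[("MC-SAT", "PyMLNs")]
jResults = allResults[("MC-SAT", "J-MLNs")]
for atom in pyResults:
    print " %s: |difference| = %.6f" % (atom, abs(pyResults[atom] - jResults[atom]))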
BLN Inference Scripting with Jython

Here's a Jython script that queries the ''meals'' model.
from jyblns import infer
network = "meals_any_for_functional.xml"
decls = "meals_any_for.blnd"
logic = "meals_any_for_functional.blnl"
inferenceMethod = "LikelihoodWeighting"
evidenceDB = "query2.blogdb"
queries = "name,usesAnyIn(x,Plate,M)"
inf = infer(network, decls, logic, inferenceMethod, evidenceDB, queries, args=["--confidenceLevel=0.95"])
for result in inf.getResults():
    print result
Source: examples/meals/inference_script.py
In args, we can specify any of the parameters that are supported by the BLNinfer application (issue the command in a console without arguments to get help).
The output:
name(P1) ~ Frank: 0.15395434374819145, Charly: 0.11770215872367056, Emily: 0.17835276188745905,
Anna: 0.13949866695914753, Dorothy: 0.22996125909666892, Bert: 0.18053080958485512
name(P3) ~ Frank: 0.1301522895355616, Charly: 0.13878919351464314, Emily: 0.19012310254477874,
Anna: 0.16327138727971174, Dorothy: 0.20366246595672868, Bert: 0.17400156116856863
name(P2) ~ Frank: 0.13822631521409542, Charly: 0.13177546807435006, Emily: 0.18139164983898534,
Anna: 0.14539097470979825, Dorothy: 0.22480551699023912, Bert: 0.17841007517252438
usesAnyIn(P1,Plate,M) ~ True: 0.6730501979998668, False: 0.32694980200012724
usesAnyIn(P3,Plate,M) ~ True: 0.68600117734425, False: 0.31399882265574375
usesAnyIn(P2,Plate,M) ~ True: 0.6640724469729609, False: 0.3359275530270331
In this excerpt from a variation of the first example, we print all the results ourselves by directly accessing the inf object that the inference process returned:
for r in inf.getResults():
    print "%s" % r.varName
    for i in range(r.getDomainSize()):
        print " %f %s" % (r.probabilities[i], r.domainElements[i]),
        if r.additionalInfo is not None:
            interval = r.additionalInfo[i]
            print " [%f;%f]" % (interval.lowerEnd, interval.upperEnd),
        print
print "time taken: %fs" % inf.getSamplingTime()
print "steps taken: %d" % inf.getNumSteps()
Source: examples/meals/inference_script2.py
The output:
name(P1)
0.144996 Frank [0.124553;0.168196]
0.129045 Charly [0.109700;0.151269]
0.191428 Emily [0.168259;0.216992]
0.156962 Anna [0.135754;0.180834]
0.204797 Dorothy [0.180952;0.230933]
0.172772 Bert [0.150622;0.197462]
name(P3)
0.154836 Frank [0.133760;0.178592]
0.141516 Charly [0.121304;0.164510]
0.171245 Emily [0.149183;0.195859]
0.154705 Anna [0.133637;0.178453]
0.211857 Dorothy [0.187672;0.238278]
0.165842 Bert [0.144096;0.190183]
name(P2)
0.165775 Frank [0.144032;0.190112]
0.123136 Charly [0.104224;0.144974]
0.198498 Emily [0.174966;0.224370]
0.159914 Anna [0.138524;0.183943]
0.192391 Dorothy [0.169171;0.217997]
0.160287 Bert [0.138874;0.184336]
usesAnyIn(P1,Plate,M)
0.688682 True [0.659296;0.716607]
0.311318 False [0.283393;0.340704]
usesAnyIn(P3,Plate,M)
0.668035 True [0.638242;0.696526]
0.331965 False [0.303474;0.361758]
usesAnyIn(P2,Plate,M)
0.698704 True [0.669537;0.726330]
0.301296 False [0.273670;0.330463]
time taken: 0.485000s
steps taken: 1000
The intervals at the end of each line are estimates of the intervals within which the true probability lies at a confidence level of 0.95. The additional parameter --confidenceLevel triggered the computation of these intervals.
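These intervals can also be used programmatically, e.g. to flag estimates that are still too uncertain. A minimal sketch building on the result attributes used in the script above (the width threshold of 0.05 is an arbitrary choice):

# flag domain elements whose confidence interval is still wide
for r in inf.getResults():
    if r.additionalInfo is None:
        continue  # intervals are only available if --confidenceLevel was given
    for i in range(r.getDomainSize()):
        interval = r.additionalInfo[i]
        if interval.upperEnd - interval.lowerEnd > 0.05:
            print "uncertain: %s = %s (p=%f)" % (r.varName, r.domainElements[i], r.probabilities[i])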