Abstracted PCFGs
This is an example of defining a simple discrete RBN equivalent to a PCFG using the AbstractedPCFG
class.
from rbnet.pcfg import AbstractedPCFG
Minimal Example
We start with a minimal example (also used in Discrete RBN):
pcfg = AbstractedPCFG(non_terminals="SAB", terminals="ab", start="S", rules=[
    ("S --> A B", 1), ("S --> B A", 1),  # prior + first transition
    ("A --> B A", 1), ("B --> A B", 1),  # non-terminal transitions
    ("A --> a", 1), ("B --> b", 1)       # terminal transitions
])
print(pcfg.inside(sequence="aaaa"))
print(pcfg.inside(sequence="bbbb"))
print(pcfg.inside(sequence="aaab"))
print(pcfg.inside_chart[0].pretty())
tensor(0., dtype=torch.float64, grad_fn=<SumBackward0>)
tensor(0., dtype=torch.float64, grad_fn=<SumBackward0>)
tensor(0.0078, dtype=torch.float64, grad_fn=<SumBackward0>)
╱╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ [0.0078125 0. 0.0078125]╲
╱╲ ╱╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ [0. 0. 0.]╲╱ [0.03125 0. 0.03125]╲
╱╲ ╱╲ ╱╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ [0. 0. 0.]╲╱ [0. 0. 0.]╲╱ [0.125 0. 0.125]╲
╱╲ ╱╲ ╱╲ ╱╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲
╱ [0. 0.5 0. ]╲╱ [0. 0.5 0. ]╲╱ [0. 0.5 0. ]╲╱ [0. 0. 0.5]╲
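We can check these numbers by hand. Every derivation of A produces at least one a and every derivation of B at least one b, and both symbols appear under the start symbol, so strings like "aaaa" and "bbbb" are ungrammatical and get zero probability. For "aaab" there is exactly one parse tree, and (assuming, as the chart values suggest, that rule weights are normalised per left-hand-side symbol) each of the two unit-weight rules per non-terminal has probability 1/2:

# Hand check of the chart above (plain arithmetic, not using the library).
# Assumption: weights are normalised per left-hand-side symbol, so with two
# unit-weight rules per non-terminal every rule has probability 1/2.
p_rule = 1 / 2

# Span "ab": one tree each for S and B, e.g. B --> A B, A --> a, B --> b:
print(p_rule ** 3)  # 0.125, matching the [0.125 0. 0.125] cell

# Full sequence "aaab": a single parse using 3 binary and 4 terminal rules:
print(p_rule ** 7)  # 0.0078125, matching the top cell and inside("aaab")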
Defining the PCFG
First we define a number of words (terminal symbols) of different categories that sentences can be composed of:
subjects = ["I", "You", "We", "They"]
verbs = ["run", "drink", "sleep"]
adverb_non_gradable = ["a-lot", "alone"]
adverb_gradable = ["fast", "slowly", "quickly"]
grade = ["very", "veeery", "really"]
verb_qualifier = ["rarely", "do-not", "never", "always"]
terminals = subjects + verbs + adverb_non_gradable + adverb_gradable + grade + verb_qualifier
Then we define some non-terminal symbols (a start symbol and one symbol for each category of words used above):
non_terminals = ["start",
"subject",
"verb",
"gradable_adverb",
"non_gradable_adverb",
"verb_qualifier",
"grade"]
Finally, we define the rules and give them a weight (for simplicity we use a weight of 1 everywhere):
non_terminal_rules = [("start --> subject verb", 1),
("verb --> verb_qualifier verb", 1),
("verb --> verb gradable_adverb", 1),
("verb --> verb non_gradable_adverb", 1),
("gradable_adverb --> grade gradable_adverb", 1),
("grade --> grade grade", 1)]
terminal_rules = []
for non_terminal_symbol, corresponding_list_of_terminal_symbols in zip(
        non_terminals[1:],  # skip the start symbol
        [subjects, verbs, adverb_gradable, adverb_non_gradable, verb_qualifier, grade]
):
    for terminal_symbol in corresponding_list_of_terminal_symbols:
        terminal_rules.append((f"{non_terminal_symbol} --> {terminal_symbol}", 1))
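For illustration, the first few generated rules pair the subject symbol with the subject words:

print(terminal_rules[:3])
# [('subject --> I', 1), ('subject --> You', 1), ('subject --> We', 1)]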
Now we can define our PCFG by providing it with the terminals, non-terminals, rules, and start symbol.
pcfg = AbstractedPCFG(terminals=terminals,
                      non_terminals=non_terminals,
                      rules=non_terminal_rules + terminal_rules,
                      start="start")
Parsing Sentences
Let’s test the grammar by computing the marginal likelihood of some grammatical sentences (which should be greater than zero) and of some ungrammatical ones (which should have zero marginal likelihood):
grammatical_sentences = [
    "I run",
    "You never run",
    "We run very veeery slowly",
    "They always run alone",
    "I never sleep really very quickly",
    "You do-not drink very quickly"]
ungrammatical_sentences = [
    "I You",
    "run fast"
]

for sentence in grammatical_sentences + ungrammatical_sentences:
    marginal_likelihood = pcfg.inside(sequence=sentence.split())
    print(f"{sentence} --> {marginal_likelihood}")
I run --> 0.0416666679084301
You never run --> 0.0017361111240461469
We run very veeery slowly --> 1.3563368156610522e-05
They always run alone --> 0.00028935185400769114
I never sleep really very quickly --> 1.1302806797175435e-06
You do-not drink very quickly --> 9.042245437740348e-06
I You --> 0.0
run fast --> 0.0
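The first value can be verified by hand, again assuming that rule weights are normalised per left-hand-side symbol: start has a single rule, subject has four terminal rules, and verb has six rules in total (three non-terminal and three terminal).

# Hand check of P("I run") under per-symbol weight normalisation:
p_start_rule = 1.0   # "start --> subject verb" is the only start rule
p_subject_I = 1 / 4  # one of four equally weighted subject rules
p_verb_run = 1 / 6   # one of six equally weighted verb rules
print(p_start_rule * p_subject_I * p_verb_run)  # 0.041666..., as above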
We can also print a simple textual visualisation of the parse chart, which shows
non-terminal symbol: inside probability
at each location:
pcfg.inside(sequence="You never run".split())
print(pcfg.map_inside_chart(precision=2).pretty())
╱╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ ╲
╱ start:1.74e-03╲
╱╲ ╱╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ ╲ ╱ ╲
╱ []╲╱ verb:6.94e-03╲
╱╲ ╱╲ ╱╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ ╲ ╱ ╲ ╱ ╲
╱ subject:2.5e-01╲╱verb_qualifier:2.5e-01╲╱ verb:1.67e-01╲
Training Parameters
For a given dataset of sentences, we can train the model parameters:
import pytorch_lightning as pl
import torch
import numpy as np

from rbnet.util import SequenceDataModule

print(pcfg)

# we pass pre-tokenised sequences below, so automatic tokenisation is switched off
pcfg.auto_tokenise = False
# pcfg.cells[0].variable.chart_type = "dict"
data = SequenceDataModule([pcfg.tokenise(s.split()) for s in grammatical_sentences],
                          val_split=0, test_split=0)
data.setup()

# for batch in data.train_dataloader():
#     print(batch[0])
#     print(pcfg.inside(batch[0]))
#
# for s in grammatical_sentences:
#     s = pcfg.tokenise(s.split())
#     print(s)
#     print(pcfg.inside(s))

# print(list(pcfg.parameters()))
# trainer = pl.Trainer(max_epochs=100)
# trainer.fit(pcfg, data.train_dataloader())
#
# print(list(pcfg.parameters()))
AbstractedPCFG(
  (_cells): ConstrainedModuleList(
    (0): DiscreteCell(
      (transition_probabilities): LogProb()
      (_transitions): ConstrainedModuleList(
        (0): DiscreteTerminalTransition(
          (transition_probabilities): LogProb()
        )
        (1): DiscreteBinaryNonTerminalTransition(
          (transition_probabilities): LogProb()
        )
      )
    )
  )
  (_prior): DiscretePrior(
    (structural_distribution): LogProb()
    (prior_distributions): ConstrainedModuleList(
      (0): LogProb()
    )
  )
)
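The commented-out lines above sketch the intended training loop. A minimal version of it, using the standard PyTorch Lightning API (the number of epochs is an arbitrary choice here), would look like this:

# A minimal training sketch following the commented-out code above;
# logging and checkpointing are disabled only to keep the example lean.
trainer = pl.Trainer(max_epochs=100, logger=False, enable_checkpointing=False)
trainer.fit(pcfg, data.train_dataloader())

# After training, the marginal likelihoods of the training sentences should
# typically have increased (auto_tokenise is off, so we tokenise explicitly):
for sentence in grammatical_sentences:
    print(sentence, "-->", pcfg.inside(pcfg.tokenise(sentence.split())))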
Total running time of the script: (0 minutes 0.037 seconds)