Par is a simple parse graph for DSLs and NLP
- DSLs (domain specific languages), like Tr3
- NLP (chat bots) with flexible word position and closures
with the following features
- modified Backus Naur Form (BNF) to define a named parse tree
- optional namespace { } brackets to restrict a sub-parse
- allow runtime recompile of syntax for NLP / chat bots
- somewhat idiomatic to Swift syntax
graph based intermediate representation
- breaks graph loops when resolving namespace
- allow future integration with data flow graphs
- allow future bottom-up restructuring of parse tree
allow runtime closures to extend lexicon
- search current Calendar, flight schedules, etc
- integrate procedural code
allow imprecise searching
- allow different word orders
- based on minimal hops (hamming distance) from graph
allow short term memory (STM)
- keep keywords from previous queries to complete imprecise matching
- may be adjusted to 0 seconds for computer language parsing
Here is the ubiquitous Hello World
greeting ≈ "hello" "world"
namespace { }
brackets limits the symbols hello
and world
to greeting
.
greeting ≈ hello world {
hello ≈ "hello"
world ≈ "world"
}
double quotes match strings, while single quotes match regular expressions:
year ≈ '(19|20)[0-9][0-9]'
digits ≈ '[0-9]{1, 5}'
Alternation and repetitions are supported
greetings ≈ cough{, 3} (hello | yo+) (big | beautiful)* world?
in the file test.par is the line
events ≈ 'event' eventList()
whereupon the source in TestNLP+test.swift, attaches to eventList()
root?.setMatch("test show event eventList()", eventListChecker)
and attaches a simple callback to extend the lexicon:
func eventListChecker(_ str: Substring) -> String? {
let ret = str.hasPrefix("yo") ? "yo" : nil
return ret
}
which in the real world could attach to a dynamic calendar, or any other 3rd party API.
Here is the output from ParTests/TestNLP+Test.swift :
⟹ before attaching eventListChecker() - `yo` is unknown
"test show event yo" ⟹ 🚫 failed
⟹ runtime is attaching eventListChecker() callback to eventList()
"test show event eventList()" ⟹ eventList.924 = (Function)
⟹ now `yo` is now matched during runtime
"test show event yo" ⟹ test: 0 show: 0 event: 0 yo: 0 ⟹ hops: 0 ✔︎
For NLP, word order may not perfectly match parse tree order. So, report number of hops (or Hamming Distance) from ideal.
Output from ParTests/TestNLP+Test.swift:
"test event show yo" ⟹ test: 0 show: 1 event: 0 yo: 1 ⟹ hops: 2 ✔︎
"yo test show event" ⟹ test: 1 show: 1 event: 2 yo: 2 ⟹ hops: 6 ✔︎
"test show yo event" ⟹ test: 0 show: 0 event: 1 yo: 0 ⟹ hops: 1 ✔︎
"test event yo show" ⟹ test: 0 show: 2 event: 0 yo: 0 ⟹ hops: 2 ✔︎
For NLP, set a time where words from a previous query continue onto the next query.
Output from ParTests/TestNLP+Test.swift:
⟹ with no shortTermMemory, partial matches fail
"test show event yo" ⟹ test: 0 show: 0 event: 0 yo: 0 ⟹ hops: 0 ✔︎
"test hide yo" ⟹ 🚫 failed
"test hide event" ⟹ 🚫 failed
"hide event" ⟹ 🚫 failed
"hide" ⟹ 🚫 failed
⟹ after setting ParRecents.shortTermMemory = 8 seconds
"test show event yo" ⟹ test: 0 show: 0 event: 0 yo: 0 ⟹ hops: 0 ✔︎
"test hide yo" ⟹ test: 0 show: 10 event: 10 yo: 0 ⟹ hops: 20 ✔︎
"test hide event" ⟹ test: 0 show: 10 event: 1 yo: 9 ⟹ hops: 20 ✔︎
"hide event" ⟹ test: 10 show: 9 event: 0 yo: 8 ⟹ hops: 27 ✔︎
"hide" ⟹ test: 9 show: 8 event: 8 yo: 9 ⟹ hops: 34 ✔︎
Here is the Par definition in the Par format:
par ≈ name "≈" right+ sub? end_ {
name ≈ '^[A-Za-z_]\w*'
right ≈ or_ | and_ | paren {
or_ ≈ and_ orAnd+ {
orAnd ≈ "|" and_
}
and_ ≈ leaf reps? {
leaf ≈ match | path | quote | regex {
match ≈ '^([A-Za-z_]\w*)\(\)'
path ≈ '^[A-Za-z_][A-Za-z0-9_.]*'
quote ≈ '^\"([^\"]*)\"' // skip \"
regex ≈ '^([i_]*\'[^\']+)'
}
}
parens ≈ "(" right ")" reps
}
sub ≈ "{" end_ par "}" end_?
end_ ≈ '[ \\n\\t,]*'
reps ≈ '^([\~]?([\?\+\*]|\{],]?\d+[,]?\d*\})[\~]?)'
}
Here is a complete Par definition for the functional data flow graph, called Tr3:
tr3 ≈ left right* {
left ≈ (path | name)
right ≈ (hash | time | value | child | many | copyat | array | edges | embed | comment)+
hash ≈ "#" num
time ≈ "~" num
child ≈ "{" comment* tr3+ "}" | "." tr3+
many ≈ "." "{" tr3+ "}"
array ≈ "[" thru "]"
copyat ≈ "@" (path | name) ("," (path | name))*
value ≈ scalar | exprs
value1 ≈ scalar1 | exprs
scalar ≈ "(" scalar1 ")"
scalars ≈ "(" scalar1 ("," scalar1)* ")"
scalar1 ≈ (thru | modu | data | num) {
thru ≈ num ("..." | "…") num dflt? now?
modu ≈ "%" num dflt? now?
index ≈ "[" (name | num) "]"
data ≈ "*"
dflt ≈ "=" num
now ≈ ":" num
}
exprs ≈ "(" expr+ ("," expr+)* ")" {
expr ≈ (exprOp | name | scalars | scalar1 | quote)
exprOp ≈ '^(<=|>=|==|<|>|\*|_\/|\/|\%|\:|in|\,)|(\+)|(\-)[ ]'
}
edges ≈ edgeOp (edgePar | exprs | edgeItem) comment* {
edgeOp ≈ '^([<←][<!@⟐⟡◇→>]+|[!@⟐⟡◇→>]+[>→])'
edgePar ≈ "(" edgeItem+ ")" edges?
edgeItem ≈ (edgeVal | ternary) comment*
edgeVal ≈ (path | name) (edges+ | value)?
ternary ≈ "(" tern ")" | tern {
tern ≈ ternIf ternThen ternElse? ternRadio?
ternIf ≈ (path | name) ternCompare?
ternThen ≈ "?" (ternary | path | name | value1)
ternElse ≈ ":" (ternary | path | name | value1)
ternCompare ≈ compare (path | name | value1)
ternRadio ≈ "|" ternary
}
}
path ≈ '^(([A-Za-z_][A-Za-z0-9_]*)?[.º˚*]+[A-Za-z0-9_.º˚*]*)'
name ≈ '^([A-Za-z_][A-Za-z0-9_]*)'
quote ≈ '^\"([^\"]*)\"'
num ≈ '^([+-]*([0-9]+[.][0-9]+|[.][0-9]+|[0-9]+[.](?![.])|[0-9]+)([e][+-][0-9]+)?)'
comment ≈ '^([,]+|^[/]{2,}[ ]*(.*?)[\n\r\t]+|\/[*]+.*?\*\/)'
compare ≈ '^[<>!=][=]?'
embed ≈ '^[{][{](?s)(.*?)[}][}]'
}
"""#
Par is vertically integrated with Tr3 here
- Future version Tr3 may embed Par as a node value type
Bottom up restructuring of parse from user queries
- Parse tree may be discarded in favor of a parse graph
- Graph built from user queries as prevNode & nextNode edges as n-grams
- Match queries by assembling nodes middle-out from dictionary of words and n-gram edges