Rego Treesitter implementation
In my daily routine, I spend most of my time in NeoVim, where I write policies for a corporate application. By default, NeoVim provides regex-based syntax highlighting, which handles basic OPA policies. I needed more than that, so I started investigating how I could get a better experience.
Treesitter libraries
I won't copy-paste the official documentation here; to get the general principles, you can see this page.
In short, you will need a repo. In this case, my repo will be tree-sitter-rego.
Initialize the repository for npm:
yarn init -y
This will create a package.json with defaults, which you can modify.
Then install the necessary dependencies:
yarn add -D tree-sitter-cli nan
Then you need to create a file for the grammar. Name it grammar.js, which is the naming convention for tree-sitter.
Fill it with the following to create a Rego parser:
module.exports = grammar({
  name: 'rego',

  rules: {
    // Entry point: a file is a sequence of top-level definitions.
    source_file: $ => repeat($._definition),

    _definition: $ => choice(
      $.package_definition,
      $.import_package,
      $.operator_check,
      $.comment,
      $.rego_block,
      $.builtin_function,
      $._junk
    ),

    operator: $ => choice(
      '==',
      ':=',
      '=',
      '!=',
      '<',
      '>',
      '/',
      '-',
      '+',
    ),

    true: $ => 'true',
    false: $ => 'false',
    comma: $ => ',',
    comment: $ => /\#.*?\n\r?/,

    // Only the built-ins listed here are recognized as function names.
    function_name: $ => choice(
      'lower',
      'is_string',
      'object.get',
      'print',
      'concat',
      'contains',
      'time.now_ns',
      'io.jwt.encode_sign_raw',
      'io.jwt.encode_sign',
      'io.jwt.decode',
      'io.jwt.verify_es256',
      'strings.replace_n',
      'http.send',
    ),

    opening_parameter: $ => '(',
    closing_parameter: $ => ')',

    // A built-in call: name, '(', any mix of arguments and commas, ')'.
    builtin_function: $ => seq(
      field('function_name',
        $.function_name
      ),
      field('opening_parameter', $.opening_parameter),
      field('function_body',
        repeat(
          choice(
            $.identifier,
            $.array_definition,
            $.true,
            $.false,
            $.number,
            $.object_field,
            $.string_definition,
            $.comma,
          ),
        ),
      ),
      field('closing_parameter', $.closing_parameter),
    ),

    // Double-quoted string limited to the listed characters (no escaped quotes).
    string_definition: $ => seq(
      '"',
      /[a-zA-Z0-9<>@\-._:=\s\/\\]*/,
      '"',
    ),

    _array_opening: $ => '[',
    _array_closing: $ => ']',

    // Indexed reference: a dotted name followed by one bracketed index.
    object_field: $ => prec(
      1,
      seq(
        /[a-zA-Z\._]+\[/,
        choice(
          $.identifier,
          $.number,
          $.object_field,
          $.string_definition
        ),
        $._array_closing,
      ),
    ),

    array_definition: $ => seq(
      $._array_opening,
      repeat(
        choice(
          $.array_definition,
          $.string_definition,
          $.identifier,
          $.number,
          $.object_field,
          $.true,
          $.false,
          $.comma,
        ),
      ),
      $._array_closing,
    ),

    // Binary expression: operand, operator, operand.
    operator_check: $ => seq(
      choice(
        $.identifier,
        $.builtin_function,
        $.string_definition,
        $.object_field,
        $.array_definition,
        $.true,
        $.false,
      ),
      $.operator,
      choice(
        $.identifier,
        $.builtin_function,
        $.string_definition,
        $.object_field,
        $.array_definition,
        $.true,
        $.false,
      ),
    ),

    // A single statement inside a rule body.
    rego_rule: $ => prec(1, choice(
      $.identifier,
      $.operator_check,
      $.array_definition,
      $.test_case,
      $.true,
      $.false,
    )),

    test_case: $ => seq(
      $.identifier,
      repeat(
        seq(
          $.reserved_keywords,
          $.identifier,
        ),
      ),
    ),

    // Rule name, optional "operator identifier" part, then a braced body.
    rego_block: $ => seq(
      field('rego_rule_name', $.identifier),
      optional(
        seq(
          $.operator,
          $.identifier,
        ),
      ),
      '{',
      repeat($.rego_rule),
      '}',
    ),

    // Stray newlines between definitions.
    _junk: $ => /\n/,

    reserved_keywords: $ => choice(
      'as',
      'with',
    ),

    as_keyword: $ => seq(
      $.reserved_keywords,
      field('package_alias', $.identifier),
    ),

    import_package: $ => seq(
      'import',
      field('imported_package_name',
        choice(
          $.identifier,
        ),
      ),
      optional($.as_keyword),
    ),

    package_definition: $ => seq(
      'package',
      field('package_name', $.identifier),
    ),

    identifier: $ => /[a-zA-Z\._]+/,
    number: $ => /\d+/
  }
});
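To get a feeling for what the token patterns in this grammar accept, it can help to try them outside of tree-sitter first. The snippet below is only a rough sketch: it reuses three patterns from the grammar and checks them with JavaScript's own regex engine, which is an approximation, since tree-sitter compiles patterns itself. The `matchesWhole` helper and the sample inputs are my own illustration, not part of the parser.

```javascript
// Token patterns copied from the grammar above.
const identifier = /[a-zA-Z\._]+/;
const number = /\d+/;
const stringBody = /[a-zA-Z0-9<>@\-._:=\s\/\\]*/;

// Hypothetical helper: true only when the whole input matches the pattern.
function matchesWhole(pattern, text) {
  const anchored = new RegExp(`^(?:${pattern.source})$`);
  return anchored.test(text);
}

const checks = {
  // Dotted references are a single identifier token.
  dottedPath: matchesWhole(identifier, 'data.example.allow'), // true
  // Digits are not in the identifier character class.
  withDigit: matchesWhole(identifier, 'user1'), // false
  integer: matchesWhole(number, '8080'), // true
  plainString: matchesWhole(stringBody, 'hello world'), // true
  // Commas are not in the string character class.
  stringWithComma: matchesWhole(stringBody, 'a, b'), // false
};

console.log(checks);
```

The two `false` results are worth keeping in mind: with these patterns, names containing digits and strings containing commas will not be tokenized the way you might expect, so the character classes are the first place to extend.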
After this, you can run the command tree-sitter generate. For convenience, you can also set up aliases in the scripts section of package.json, so that you can easily verify the correctness of your grammar:
{
  "scripts": {
    "generate": "tree-sitter generate",
    "build": "tree-sitter generate && node-gyp configure && node-gyp build",
    "test": "tree-sitter test"
  }
}
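If you are curious what the generate step works with, the DSL calls in grammar.js (seq, choice, repeat, field) essentially describe a tree of plain objects, which tree-sitter also writes out as src/grammar.json. Below is a simplified model of that idea. The helper functions are my own stand-ins, not tree-sitter's implementation, and the real helpers do more normalization, so treat the output as an approximation of the general shape.

```javascript
// Simplified stand-ins for tree-sitter's DSL helpers (assumption: only the
// general shape of the resulting objects is modeled here).
const sym = (name) => ({ type: 'SYMBOL', name });
const str = (value) => ({ type: 'STRING', value });
const seq = (...members) => ({ type: 'SEQ', members });
const choice = (...members) => ({ type: 'CHOICE', members });
const repeat = (content) => ({ type: 'REPEAT', content });
const field = (name, content) => ({ type: 'FIELD', name, content });

// Three rules from the Rego grammar, rebuilt with the stand-ins.
const rules = {
  source_file: repeat(sym('_definition')),
  reserved_keywords: choice(str('as'), str('with')),
  package_definition: seq(
    str('package'),
    field('package_name', sym('identifier')),
  ),
};

console.log(JSON.stringify(rules, null, 2));
```

Seeing the rules as data makes it easier to reason about the grammar: every rule is just a nested combination of sequences, choices, and repetitions over strings, patterns, and references to other rules.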
The complete source for the parser is located here: https://github.com/FallenAngel97/tree-sitter-rego