A parser for the OScript language written in JavaScript. Returns an abstract syntax tree (AST). See also oscript-ast-walker for traversing the AST and oscript-interpreter for its execution.
import { parseText } from 'oscript-parser'
const program = parseText('i = 0', { sourceType: 'script' })
console.log(JSON.stringify(program))
Use your favourite package manager to install this package locally in your Node.js project:
npm i oscript-parser
pnpm i oscript-parser
yarn add oscript-parser
If you want to use the executables osparse or oslint from PATH, install this package globally instead.
The OScript AST resembles the AST for JavaScript, but it includes nodes specific to the OScript syntax. See the language grammar and the AST node declarations.
The OScript language is case-insensitive. Values of keywords and identifiers in tokens and AST nodes (value property) are converted to lower-case to make comparisons and look-ups more convenient. If you need the original letter-case, enable the raw AST node content (raw property) by the rawIdentifiers parser option.
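Because the values are lower-cased, a plain Map keyed by name is enough for a case-insensitive look-up. The sketch below works on hand-written nodes; the node shape carrying both value and raw is an assumption based on the description above, not output captured from the parser, and collectSpellings is a hypothetical helper.

```javascript
// Hand-written stand-ins for Identifier nodes, assuming that the
// rawIdentifiers option adds `raw` next to the lower-cased `value`.
const identifiers = [
  { type: 'Identifier', value: 'fenabled', raw: 'fEnabled' },
  { type: 'Identifier', value: 'fenabled', raw: 'FENABLED' },
  { type: 'Identifier', value: 'document', raw: 'Document' }
]

// Group the spellings found in the source by their lower-cased name.
function collectSpellings (nodes) {
  const spellings = new Map()
  for (const node of nodes) {
    if (!spellings.has(node.value)) spellings.set(node.value, new Set())
    spellings.get(node.value).add(node.raw)
  }
  return spellings
}

const spellings = collectSpellings(identifiers)
console.log([...spellings.get('fenabled')]) // [ 'fEnabled', 'FENABLED' ]
```

The look-up key never needs normalizing, while the raw spellings stay available for reporting or formatting.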
The output of the parser is an Abstract Syntax Tree (AST) formatted in JSON. The parser functionality is exposed by parseText() and parseTokens(). parseText() expects an input text. parseTokens() expects an input text with tokens already produced by tokenize().
The available options are:
defines: {}
Preprocessor named values, used for evaluating preprocessor directives.
tokens: false
Include lexer tokens in the output object. Useful for code formatting or partial analysis in case of errors.
preprocessor: false
Include tokens of preprocessor directives and the content skipped by the preprocessor. Useful for code formatting.
comments: false
Include comment tokens in the output of parsing or lexing. Useful for code formatting.
whitespace: false
Include whitespace tokens in the output of parsing or lexing. Useful for code formatting.
locations: false
Store location information on each parsed node.
ranges: false
Store the start and end character locations on each parsed node.
raw: false
Store the raw original of identifiers and literals.
rawIdentifiers: false
Store the raw original of identifiers only.
rawLiterals: false
Store the raw original of literals only.
sourceType: 'script'
Set the source type to object, script or dump (the old object format).
oldVersion: undefined
Expect the old version of the OScript language.
sourceFile: 'snippet'
File name to refer to in source locations.
The default options are also exposed through defaultOptions, where they can be overridden globally.
import { parseText } from 'oscript-parser'
const program = parseText('foo = "bar"', { sourceType: 'script' })
// { type: "Program",
// body:
// [{ type: "AssignmentStatement",
// variables: [{ type: "Identifier", value: "foo" }],
// init: [{ type: "StringLiteral", value: "bar" }]
// }]
// }
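Since the AST is a plain object tree, it can be inspected with a few lines of recursion. The following is a minimal sketch that runs on the AST from the example above written out as a literal; collectNodeTypes is a hypothetical helper, and for real traversal the oscript-ast-walker package mentioned earlier is the intended tool.

```javascript
// The AST from the example above, written out as a literal.
const program = {
  type: 'Program',
  body: [{
    type: 'AssignmentStatement',
    variables: [{ type: 'Identifier', value: 'foo' }],
    init: [{ type: 'StringLiteral', value: 'bar' }]
  }]
}

// Visit every object that has a string `type`, depth-first,
// and collect the node types in the order of the visit.
function collectNodeTypes (node, types = []) {
  if (Array.isArray(node)) {
    for (const item of node) collectNodeTypes(item, types)
  } else if (node && typeof node === 'object') {
    if (typeof node.type === 'string') types.push(node.type)
    for (const key of Object.keys(node)) collectNodeTypes(node[key], types)
  }
  return types
}

console.log(collectNodeTypes(program))
// [ 'Program', 'AssignmentStatement', 'Identifier', 'StringLiteral' ]
```

The check for a string type is deliberate: tokens, as shown in the lexer examples, carry a numeric type and are skipped by this sketch.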
The lexer can be used independently of the parser. The lexer functionality is exposed by tokenize() and startTokenization(). tokenize() returns an array of tokens. startTokenization() returns a generator advancing to the next token until EOF is reached. The EOF itself is not returned as a token. The options are the same as for the parser, except for tokens, which is ignored.
Each token consists of:
type expressed as an enum flag which can be matched with tokenTypes.
value
line, lineStart
range can be used to slice out the raw token content. For example, foo = "bar" will return a StringLiteral token with the value bar. Slicing out the range, on the other hand, will return "bar".
import { tokenize } from 'oscript-parser'
const tokens = tokenize('foo = "bar"', { sourceType: 'script' })
// [{ type: 8, value: "foo", line: 1, lineStart: 0, range: [0, 3] },
// { type: 32, value: "=", line: 1, lineStart: 0, range: [4, 5] },
// { type: 2, value: "bar", line: 1, lineStart: 0, range: [6, 11] }]
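The difference between value and the sliced range can be checked without running the lexer. This sketch copies the tokens from the output above; rawText and column are hypothetical helpers, and the column computation assumes that lineStart is the character offset at which the token's line begins.

```javascript
const source = 'foo = "bar"'

// Tokens copied from the example output above.
const tokens = [
  { type: 8, value: 'foo', line: 1, lineStart: 0, range: [0, 3] },
  { type: 32, value: '=', line: 1, lineStart: 0, range: [4, 5] },
  { type: 2, value: 'bar', line: 1, lineStart: 0, range: [6, 11] }
]

// Slice the raw text of a token out of the source.
const rawText = token => source.slice(token.range[0], token.range[1])

// Assumption: `lineStart` is the offset of the first character of the line,
// so a 1-based column is the distance from it plus one.
const column = token => token.range[0] - token.lineStart + 1

for (const token of tokens) {
  console.log(`${token.line}:${column(token)} ${rawText(token)}`)
}
// 1:1 foo
// 1:5 =
// 1:7 "bar"
```

The string literal keeps its quotes when sliced from the source, while its value has them stripped.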
Tokens can be consumed incrementally by an iterator:
import { startTokenization } from 'oscript-parser'
const iterator = startTokenization('foo = "bar"', { sourceType: 'script' })
iterator.next() // { value: { type: 8, value: "foo", line: 1, range: [0, 3] } }
iterator.next() // { value: { type: 32, value: "=", line: 1, range: [4, 5]} }
iterator.next() // { value: { type: 2, value: "bar", line: 1, range: [6, 11] } }
iterator.next() // { done: true }
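A generator can also be consumed with for...of, which allows stopping early without tokenizing the rest of the input. The sketch below uses a hypothetical stand-in generator, fakeTokenization, that yields the tokens from the example above, so that it runs without the package; with the package installed, the result of startTokenization() would take its place.

```javascript
// Hypothetical stand-in for startTokenization(): yields the tokens from
// the example above.
function * fakeTokenization () {
  yield { type: 8, value: 'foo', line: 1, range: [0, 3] }
  yield { type: 32, value: '=', line: 1, range: [4, 5] }
  yield { type: 2, value: 'bar', line: 1, range: [6, 11] }
}

// Return the first token with the given value and stop iterating.
function findToken (iterator, value) {
  for (const token of iterator) {
    if (token.value === value) return token
  }
  return undefined
}

const found = findToken(fakeTokenization(), '=')
console.log(found.range) // [ 4, 5 ]
```

Returning from inside for...of closes the generator, so the tokens after the match are never produced.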
The osparse executable can be used from the shell by installing oscript-parser globally using npm:
$ npm i -g oscript-parser
$ osparse -h
Usage: osparse [option...] [file]
Options:
--[no]-tokens include lexer tokens. defaults to false
--[no]-preprocessor include preprocessor directives. defaults to false
--[no]-comments include comments. defaults to false
--[no]-whitespace include whitespace. defaults to false
--[no]-locations store location of parsed nodes. defaults to false
--[no]-ranges store start and end token ranges. defaults to false
--[no]-raw store raw identifiers & literals. defaults to false
--[no]-raw-identifiers store raw identifiers & literals. defaults to false
--[no]-raw-literals store raw identifiers & literals. defaults to false
--[no]-context show near source as error context. defaults to true
--[no]-colors enable colors in the terminal. default is auto
-D|--define <name> define a named value for preprocessor
-S|--source <type> source type is object, script (default) or dump
-O|--old-version expect an old version of OScript. defaults to false
-t|--tokenize print tokens instead of AST
-c|--compact print without indenting and whitespace
-w|--warnings consider warnings as failures too
-s|--silent suppress output
-v|--verbose print error stacktrace
-p|--performance print parsing timing
-V|--version print version number
-h|--help print usage instructions
If no file name is provided, standard input will be read. If no source type
is provided, it will be inferred from the file extension: ".os" -> object,
".e|lxe" -> script, ".osx" -> dump. The source type object will enable the
new OScript language and source type dump the old one by default.
Examples:
echo 'foo = "bar"' | osparse --no-comments -S script
osparse -t foo.os
Example usage:
$ echo "i = 0" | osparse -c -S script
{"type":"Program","body":[{"type":"AssignmentStatement",
"variables":[{"type":"Identifier","value":"i"}],
"init":[{"type":"NumericLiteral","value":0}]}]}
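When calling the parser from a script instead of the command line, the source type has to be passed explicitly. The extension rules from the usage text can be sketched as follows, reading ".e|lxe" as ".e" or ".lxe". The helper inferSourceType is hypothetical and not part of the package API, and falling back to script for unknown extensions is an assumption taken from the documented default.

```javascript
// Hypothetical helper mirroring the extension rules from the usage text.
function inferSourceType (fileName) {
  // Take the text after the last dot of the last path segment.
  const match = /\.([^./\\]+)$/.exec(fileName)
  const extension = match ? match[1] : ''
  switch (extension) {
    case 'os': return 'object'
    case 'e':
    case 'lxe': return 'script'
    case 'osx': return 'dump'
    // Assumption: fall back to the documented default source type.
    default: return 'script'
  }
}

console.log(inferSourceType('foo.os')) // object
console.log(inferSourceType('setup.lxe')) // script
console.log(inferSourceType('legacy.osx')) // dump
```

The returned string can then be used as the sourceType option of parseText().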
The oslint executable can be used in the shell by installing oscript-parser globally using npm:
$ npm i -g oscript-parser
$ oslint -h
Usage: oslint [option...] [pattern ...]
Options:
--[no]-context show near source as error context. defaults to true
--[no]-colors enable colors in the terminal. default is auto
-D|--define <name> define a named value for preprocessor
-S|--source <type> source type is object, script (default) or dump
-O|--old-version expect an old version of OScript. defaults to false
-e|--errors-only print only files that failed the check
-w|--warnings consider warnings as failures too
-s|--silent suppress output
-v|--verbose print error stacktrace
-p|--performance print parsing timing
-V|--version print version number
-h|--help print usage instructions
If no file name is provided, standard input will be read. If no source type
is provided, it will be inferred from the file extension: ".os" -> object,
".e|lxe" -> script, ".osx" -> dump. The source type object will enable the
new OScript language and source type dump the old one by default.
Examples:
echo 'foo = "bar"' | oslint -S script
oslint -t foo.os
Example usage:
$ echo "i = 0" | oslint
snippet succeeded
If tokenizing or parsing fails, both osparse and oslint return a non-zero exit code and print the error with extra context on the console. For example, after deleting an equal sign (=) from example.os:
$ oslint example.os
example.os failed with 1 error and 0 warnings
example.os:7:28: error: <modifier>, <type>, function, script or end expected near 'public'
 5|
 6| public object Document inherits CORE::Node
 7|   override Boolean fEnabled TRUE
  |                             ~~~~
 8|
 9|   // Gets a livelink document
All output of oslint goes to standard output. For osparse, the resulting AST goes to standard output, and error and timing information goes to standard error.
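The split between exit code, standard output and standard error makes osparse easy to drive from another script. The sketch below shows only the interpretation step; readAst is a hypothetical helper, and the result object is hand-written to imitate what spawnSync() from node:child_process returns with encoding set to 'utf8'.

```javascript
// `result` has the shape returned by spawnSync() with `encoding: 'utf8'`:
// { status, stdout, stderr }.
function readAst (result) {
  if (result.status !== 0) {
    // Errors go to standard error; use them as the failure message.
    throw new Error(result.stderr.trim() || `osparse exited with ${result.status}`)
  }
  // On success, standard output carries only the AST.
  return JSON.parse(result.stdout)
}

// Hand-written result imitating `echo "i = 0" | osparse -c -S script`.
const ast = readAst({
  status: 0,
  stdout: '{"type":"Program","body":[{"type":"AssignmentStatement","variables":[{"type":"Identifier","value":"i"}],"init":[{"type":"NumericLiteral","value":0}]}]}',
  stderr: ''
})

console.log(ast.body[0].type) // AssignmentStatement
```

In a real script, the argument would come from a call such as spawnSync('osparse', ['-c', '-S', 'script'], { input: 'i = 0', encoding: 'utf8' }), provided that osparse is installed globally.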
Copyright (c) 2020-2022 Ferdinand Prantl
Licensed under the MIT license.