There have been lots of times that I've wanted to keep my hands on the keyboard when editing, rather than running off to the mouse all the time. There's an implementation of VIM in Javascript but I figured I would learn something by doing it myself. My goal is vi, not vim, since I don't need anything that sophisticated.
The first step is implementing the line-oriented part of vi, called ex, based on the manual from the sourceforge project. My version is based on bililiteRange, and depends on the bililiteRange utilities and undo plugin.
Use it simply as bililiteRange(textarea).ex('%s/foo/bar/');
, passing the ex command to the ex()
function. The biggest difference from real ex is that this uses javascript regular expressions, rather than the original ex ones. Thus s/\w/x/
rather than s/[:class:]/x/
, and use ?/.../
rather than ?...?
to search backwards (the question mark is used in Javascript regular expressions so I don't want to use it as a delimiter).
Command Syntax
The syntax of the function is trivial: bililiteRange(someelement).ex(excommand {String} [, defaultAddress {String}])
, where excommand
is described below, and defaultAddress
is the address of the line to use if no address is given in the command string; the default is '.'
.
Ex commands are in the form of addressrange command variant parameter
; all parts are optional. White space can separate the parts but is not necessary if the command is unambiguous. Multiple commands can be entered in one string, separated by |
. To include |
or other special characters in the parameter
(that's the only place it would be relevant), enclose it in double quotes, as a JSON string. Thus ex('a hello | a bye')
appends two lines, hello
and bye
, while ex('a "hello | a bye"')
appends one line, hello | a bye
.
The parser tries to imitate ex's too-clever-by-half automatic closing of regular expressions (and by extension, strings). This means that an address or a parameter that contains an unmatched /
or "
will have that delimiter added at the end. So ex('a his/hers')
will in fact append his/hers/
. Use JSON strings to avoid that problem. So ex('a "his/hers"')
or ex('a "his\"hers"')
.
Syntax errors or undefined commands are thrown as throw new Error(errormessage)
.
Non-error messages (like the list of options from set
) are returned in a field in the range called exMessage
. Thus console.log( bililiteRange(element).ex('set').exMessage )
.
variant
simply means optionally appending a !
to the command name. For many commands, this changes the behavior slightly. E.g., append text
respects the value of the autoindent
option, but append! text
does the opposite of the autoindent
option.
Command Completion
The parser first looks up the command in the object bililiteRange.ex.commands
. If it is defined as a function, executes it. If it is defined as a string, looks that value up in bililiteRange.ex.commands
with this same algorithm (so infinite loops are possible!). If it is not defined at all, treats it as an abbreviation and returns the first member of bililiteRange.ex.commands
for which the command is a prefix.
So if bililiteRange.ex.commands
were:
{
c: 'cut',
copy: function(){...},
cut: function(){...}
}
Then c
would execute the cut
function and co
would execute the copy function.
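The lookup above can be sketched in a few lines. This is only an illustration of the algorithm as described, not the library's code: resolveCommand and its maxHops guard are hypothetical names of my own (the description above says the real lookup can loop forever on circular synonyms).

```javascript
// Hypothetical helper, not part of bililiteRange: resolves a typed command
// name against a table shaped like bililiteRange.ex.commands.
function resolveCommand(commands, name, maxHops = 20) {
  for (let hops = 0; hops < maxHops; hops++) {
    const entry = Object.prototype.hasOwnProperty.call(commands, name)
      ? commands[name]
      : undefined;
    if (typeof entry === 'function') return entry;      // a real command
    if (typeof entry === 'string') {                    // a synonym: look it up again
      name = entry;
      continue;
    }
    // Not defined: treat the name as an abbreviation and take the first
    // command it is a prefix of.
    const full = Object.keys(commands).find(key => key.startsWith(name));
    if (full === undefined || full === name) throw new Error(name + ' not defined');
    name = full;
  }
  throw new Error('too many synonym hops for ' + name);  // assumed guard against loops
}

// The table from the example above, with placeholder functions.
const exampleCommands = {
  c: 'cut',
  copy: function () { return 'copied'; },
  cut: function () { return 'cut'; }
};
```

With this table, resolveCommand(exampleCommands, 'c') follows the synonym to the cut function, and resolveCommand(exampleCommands, 'co') finds copy as the first key with that prefix.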
Addresses
Ex is line oriented; every command works on one or more complete lines, which are delimited by '\n'
. Address ranges are of the form x ([,;] x)*
meaning one or more address parts separated by commas or semicolons. Each address part (described below) may be followed by a positive or negative offset. Each address part evaluates to a line number, then the offset is added or subtracted to get the "current line". This is pushed onto a stack of line numbers. The final range that the command operates on is the range of lines between the top two lines on that stack (inclusive). If only one address part is given, then the range is that line alone. If there is no address, then the default address passed to ex()
is used. The default for that is '.'
.
The "starting line" is the first line of the range passed in. Separating address parts with a semicolon instead of a comma resets the "starting line" to the "current line". Thus if the current line is line 4, '1,.+1'
means lines 1 through 5 but '1;.+1'
means lines 1 through 2.
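To make the comma-versus-semicolon rule concrete, here is a minimal sketch of the stack evaluation. It is not the real parser: evalAddress is a hypothetical function that handles only numbers, '.', '$' and +n/-n offsets, and it does no validation of the resulting range.

```javascript
// Hypothetical sketch: evaluates a small subset of ex address ranges
// (numbers, '.', '$', and +n/-n offsets) using the stack described above.
function evalAddress(range, currentLine, lastLine) {
  const stack = [];
  let current = currentLine;              // the line that '.' refers to
  // Split into address parts, keeping the separators (',' or ';').
  const tokens = range.split(/([,;])/);
  for (let i = 0; i < tokens.length; i += 2) {
    const m = /^\s*(\.|\$|\d+)\s*([+-]\s*\d+)?\s*$/.exec(tokens[i]);
    if (!m) throw new Error('bad address: ' + tokens[i]);
    let line = m[1] === '.' ? current : m[1] === '$' ? lastLine : Number(m[1]);
    if (m[2]) line += Number(m[2].replace(/\s+/g, ''));   // apply the offset
    stack.push(line);
    // A semicolon makes the line just computed the new '.' for later parts.
    if (tokens[i + 1] === ';') current = line;
  }
  // The range is the top two entries of the stack, or one line twice.
  const top = stack.slice(-2);
  return top.length === 1 ? [top[0], top[0]] : top;
}
```

With the current line at 4, evalAddress('1,.+1', 4, 10) gives [1, 5] while evalAddress('1;.+1', 4, 10) gives [1, 2], matching the example above.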
Address parts can be any of the following:
- %%
- This is my own extension. Pushes both the starting and ending line of the original range onto the stack. If this is the only address, then does not change the range at all (useful for creating commands that are not line-oriented).
- . (a single period)
- Pushes the current line.
- $
- Pushes the last line of the text.
- Any number
- Pushes that number
- 'x (a single quote mark followed by a lower-case letter)
- Pushes the value of the corresponding mark (see the mark command)
- '' (two single quote marks)
- Pushes the current line of the previous command (so you can go back).
- %
- Equivalent to 0,$; all the lines.
- /re/flags
- Searches for the regular expression /re/ from the current line. Pushes the first line found that matches, or the current line if not found. An empty regular expression (//, or, since the parser closes regular expressions, just /) uses the previous regular expression.
- The following flags are valid (this is an extension of real Javascript regular expressions):
- i: Force a case-insensitive search.
- I: Force a case-sensitive search.
- If neither i nor I is set, uses the value of the ignorecase option.
- m: Use multiline mode (^ and $ match the start and end of lines rather than just the start and end of the text).
- M: Do not use multiline mode.
- By default, uses multiline mode.
- w: Search wraps around to the beginning of the text if needed.
- W: Search does not wrap around.
- If neither w nor W is set, uses the value of the wrapscan option.
- The g flag is legal but is ignored (since this is only looking for the first match).
- ?/re/flags
- Searches backward, with /re/ as above.
Commands
append or a
- Inserts parameter as a new line after the address range. Add multiple lines with escaped newlines in a JSON string, as ex('a "first\nsecond\nthird"'). If the autoindent option is set, copies any whitespace at the beginning of the current line to the beginning of each of the appended lines. If variant is set (i.e. append!), toggles the autoindent option for this command.
change or c
- Replaces the address range with parameter, respecting the autoindent option as with append (and its variant). The deleted text is pushed onto a "delete stack", to be popped with put. This is the analogue of the clipboard in modern GUIs, but it is a stack: the contents can only be used once (but see the delete command with named registers).
copy or transcribe or t
- parameter is interpreted as an address range as above, and the text of the original addressed range is copied after that. Thus ex('1,2 copy 4') copies lines 1 and 2 after line 4. Note that this is different from what modern GUIs call "copy", copying the text to the clipboard; for that use yank.
delete or del
- Deletes the address range. Interprets parameter as an optional letter followed by an optional number. If the letter is not present, then the deleted text is pushed onto the delete stack. If a lower-case letter is present, that is taken as the name of a register and the text is stored there. If an upper-case letter is present, the deleted text is appended to the corresponding lower-case letter register. If the number n is present, then the affected range is the last line of the address range and the n-1 following lines. Thus, 1,2 delete deletes lines 1 and 2, but 1,2 delete 2 deletes lines 2 and 3.
global or g
- This one is hard to explain. Interprets parameter as a regular expression followed by a command, then executes the command on all the lines in the address range that match the regular expression. If variant is set, then executes the command on all the lines that do not match the regular expression.
- For example, 1,4 g /foo/ d would delete all lines from 1 to 4 that contain "foo". % g! /^\d/ s /^/"\t" would prepend a tab to every line that does not start with a digit.
- Since a | (vertical bar) is used to separate individual commands, we can't use that to separate the commands passed in parameter. The real ex uses newlines to separate those, but it's hard to type that, so I use the character \n to separate commands. So, to insert a blank line before and prepend a tab to lines that do not start with a digit, use % g! /^\d/ i \n s /^/"\t".
- A fluke of the way I parse regular expressions is that flags to the search expression have to be at the end of parameter, like: % g /foo/ d /i to delete all lines that contain foo, case insensitive. Setting the ignorecase option beforehand would work as well.
insert or i
- Same as append, but adds parameter before the address range.
join or j
- Joins all the lines in the address range, collapsing whitespace at the beginning and end of the lines into a single space (uses replace(/\s*\n\s*/g, ' ')). If variant is set, don't adjust whitespace (uses replace(/\n/, '')). If parameter is present, interpreted as a number of lines to join; the address range is considered to be the last line of the actual address range and the next parameter-1 lines.
- If there is only one line in the address range, joins it to the next line.
mark or k
- parameter should be a single lowercase letter (this is not enforced, but the only way to refer to a mark is with a single lowercase letter, so anything else is write-only). Assigns the current address range to that mark, and makes it live, so it will remain attached to that text even if the text around it is edited (see the description for bililiteRange.live).
move or m
- Same as copy above, but the original address range is deleted after copying.
noglobal or v
- Same as global!
print
- Just selects the address range. If no command is given, this is the default.
put
- parameter, if present, should be a single lowercase letter (anything else is not an error, but there's no defined way to get anything into any other register). If parameter is present, then the inserted text is from that register. If not, pops the delete stack. Then uses append above to insert that text.
redo
- Does bililiteRange(element).redo().
set
- Sets or displays options. If parameter is not set, sets bililiteRange(element).exMessage to JSON.stringify(bililiteRange(element).data()). If parameter is set, it is interpreted as a space-delimited list of commands, in the following forms:
option=value
- Does the command option value. Note that each option is actually a command, with the parameter being the value to set that option to.
option
- Does the command option on.
nooption (that is, the literal string "no" followed by the name of the option)
- Does the command option no.
option?
- Does option ?, which sets bililiteRange(element).exMessage to the value of that option. Note that each option overwrites the previous, so only the last will be displayed.
substitute or s or &
- parameter is a regular expression followed by a string to be used as a replacement, as /regexp/replacement/flags (note that the flags are after the replacement, and the final delimiter is optional if there are no flags). Use JSON string notation if needed for the replacement, as /regexp/"replacement_string" (adding /flags if needed).
- Note that as a quirk of the way I parse regular expressions, flags are read first, then the replacement string. That means that /regexp/i is read as "replace /regexp/i with an empty string", not "replace /regexp/ with "i"".
- Just uses addressrange.text().replace(regexp, replacement), so replacement can use "$1" etc.
- If no parameter is given, repeats the last search.
undo
- Does bililiteRange(element).undo().
yank or y
- Does the same as delete above, moving the text into the register named in parameter or onto the delete stack, but does not delete the text. Analogous to copy in a modern GUI.
=
- Sets exMessage to the current lines; either "[n]" for a single line, or "[m,n]" for a range.
~
- Searches for the last replacement text (uses new RegExp(), so special characters are not quoted), and replaces it with parameter.
>
- Indents the lines of the address range by prepending a tab character. If parameter is set, prepends that many tabs.
- Feel free to start a tab-versus-spaces religious war.
<
- Unindents the lines of the address range by removing a leading tab character or shiftwidth spaces. If parameter is set, removes that many tabs or sets of spaces.
!
- "Shell" escape (actually, Javascript escape). Does eval(parameter), with this set to the bililiteRange covering the address range, and if the result is not undefined, replaces the address range with that text. For instance, to change a line to upper case: 2 ! this.text().toUpperCase().
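Two of the rules in the command list are easier to see as code: the count parameter of delete and join, and the whitespace handling of join. The helpers below are hypothetical illustrations, not the plugin's functions. One assumption to flag: the variant branch here uses a global regular expression so that every newline in the text is removed, whereas the description above quotes replace(/\n/, '').

```javascript
// Hypothetical helpers illustrating two rules from the command list.

// delete/join with a count n: the affected range becomes the last addressed
// line plus the n-1 lines that follow it.
function applyCount(firstLine, lastLine, count) {
  if (count === undefined) return [firstLine, lastLine];
  return [lastLine, lastLine + count - 1];
}

// join: collapse the whitespace around each newline into a single space.
// With the variant (join!) the newlines are removed and whitespace is kept.
// Assumption: the 'g' flag is used in both branches.
function joinLines(text, variant) {
  return variant
    ? text.replace(/\n/g, '')
    : text.replace(/\s*\n\s*/g, ' ');
}
```

So applyCount(1, 2) leaves the range at lines 1 and 2, while applyCount(1, 2, 2) gives lines 2 and 3, the same as the 1,2 delete 2 example.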
Options
The option-setting commands are just commands of the form option value to set and option ? to return the value in exMessage. For boolean options, a value of off, false or no sets it to false; toggle toggles the value, and any other value (including leaving it blank) sets it to true.
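The boolean rule can be written out as a small function. This is a sketch of the rule as stated, with a hypothetical name (booleanOption); it is not how the plugin itself is organized.

```javascript
// Hypothetical helper: computes the new value of a boolean option from its
// current value and the text given after the option name.
function booleanOption(current, value) {
  const v = (value || '').trim();
  if (v === 'off' || v === 'false' || v === 'no') return false;
  if (v === 'toggle') return !current;
  return true;   // anything else, including nothing at all, turns the option on
}
```

For instance, booleanOption(true, 'toggle') is false, and booleanOption(false, '') is true.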
autoindent or ai
- If true, then append, change and insert will copy initial whitespace from the first line of the address range to all inserted lines. Use the variant (append! etc.) to toggle this for one command. Default: false.
shiftwidth or hardtabs or ht or sw or tabstop or ts (in the real ex these are all different, but here they are all synonyms)
- Number of spaces in a tab. Uses tabsize to change the displayed text, if supported. Default: 8.
wrapscan
- If true, regular expression searches wrap around to the beginning of the text if not found from the current line to the end (for backward searches, wrap to the end if not found before the current line). Force wrap around by using the w flag; disable wraparound by using the W flag. Default: true.
State
The state of the editor (including the values of the options and the locations of the marks) is kept in the data attached to the element. Thus, calling ex() on the same element, even if the bililiteRange is different, works correctly. That state is returned with bililiteRange(element).data(), and you can extend the editor commands and take advantage of that object.
The registers (the stored text from delete and yank) are stored in a singleton shared by all editor instances, called bililiteRange.ex.registers. This is an array (so the delete stack just uses bililiteRange.ex.registers.unshift(text) and text = bililiteRange.ex.registers.shift()). Named registers are simply added to that array with bililiteRange.ex.registers['a'] = text. shift and unshift are used rather than push and pop, so the most recent text is bililiteRange.ex.registers[0].
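Here is a small sketch of how one array can serve as both the delete stack and the named registers. The registers array below is a local stand-in for bililiteRange.ex.registers so the example runs on its own, and store and retrieve are hypothetical names; the upper-case append rule is the one described for the delete command.

```javascript
// Local stand-in for bililiteRange.ex.registers (an assumption, so that the
// sketch does not need the library loaded).
const registers = [];

// Hypothetical helper: store text on the delete stack or in a named register.
function store(text, name) {
  if (!name) {
    registers.unshift(text);             // newest text is always registers[0]
    return;
  }
  const key = name.toLowerCase();
  if (name !== key) {
    registers[key] = (registers[key] || '') + text;   // upper-case name: append
  } else {
    registers[key] = text;                            // lower-case name: replace
  }
}

// Hypothetical helper: read a named register, or pop the delete stack.
function retrieve(name) {
  return name ? registers[name] : registers.shift();
}
```

Named registers are plain properties on the array, so they do not affect its length or the order of the stack.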
Unfortunately, browser security keeps Javascript from directly manipulating the clipboard, so I can't integrate this with the browser's cut/paste, but the fact that this is exposed means you can display it to the user for direct manipulation.
Extending
The bililiteRange.ex namespace is used to expose some of the objects and methods that can extend the interpreter.
bililiteRange.ex.commands
- Described above under "Command Completion". This is an object with the keys being the name of the command and the value either being a string (to define a synonym, so you can do: bililiteRange.ex.commands.cut = 'delete') or a function with the signature function(parameter, variant). this is set to the bililiteRange being edited, with bounds set to the address range. parameter is a string, and variant is a Boolean, set to true if the command was followed by a "!".
- Legal command names are /[a-zA-Z=&~><]+/ (letters and a few special characters; note that numbers and underscores are not legal). So
bililiteRange.ex.commands['hello~world'] = function (parameter, variant) {
  // note the tilde between 'hello' and 'world'
  this.text( variant ? 'Hello, world' : 'Goodbye, world');
};
works, but bililiteRange.ex.commands['hello-world'] = function... , while perfectly legal Javascript, will never be executed by ex since the parser will not recognize 'hello-world' (with the dash).
- For instance, there's no write command; there are too many inconsistent ways to implement persistent storage. If you wanted to implement a write command using local storage, you could do:
bililiteRange.ex.createOption('file', 'Untitled'); // see the createOption method below
bililiteRange.ex.commands.write = function (parameter, variant) {
  var state = this.data(); // 'this' in a command is the bililiteRange
  if (parameter) state.file = parameter; // allow changing the "file" name
  var key = 'ex.' + state.file;
  localStorage[key] = this.all();
};
bililiteRange.ex.commands.read = function (parameter, variant) {
  // this isn't exactly the semantics of the original ex read command
  if (!parameter) return; // need a file name
  var key = 'ex.' + parameter;
  if (!(key in localStorage)) throw new Error(parameter + ' not found');
  this.all(localStorage[key]);
  this.data().file = parameter;
};
bililiteRange.ex.createOption (name {String}, value {Any | undefined})
- Does bililiteRange.data(name, {value: value}), then creates an ex command name for setting that option. The actual command created is based on the type of value, which is the default value. So bililiteRange.ex.createOption('happy', true) creates a boolean option named happy, and this can be set with rng.ex('set happy=false') or rng.ex('happy toggle').
bililiteRange.ex.createRE (s {String}, ignoreCase {Boolean})
- Creates an enhanced regular expression from s. Takes a string in the form expected by substitute above, meaning delimiter-regular-expression-string-delimiter-flags, with any of the flags allowed in the /re/ addresses above. Note that the delimiter can actually be any character, though using "/" makes sense. ignoreCase is the default if the i or I flags are not used; normally you would use this.data().ignorecase.
- If the regular expression part of s is missing (it just starts delimiter-delimiter) then it uses the regular expression part from the last createRE().
- The regular expression is formed with new RegExp('regular expression part of s', 'legal flags'), where 'legal flags' are i and m; g is never present. It is then extended with the following fields:
flags
- The actual flags used internally (/[imwIMW]*/).
rest
- The rest of the string, the part that was not part of the regular expression or the flags.
toJSON
- Function used by JSON.stringify() so that the regular expression can be displayed. Normal RegExp's are ignored by JSON.stringify().
bililiteRange.ex.splitCommands (s {String}, delim {String})
- Simple parser that basically does s.split(delim), where delim is a string, not a regular expression, but knows about strings (contained in ") and regular expressions (contained in /) and does not split on a delimiter inside one of those. Also, to be consistent with ex, automatically closes open strings and regular expressions, which can cause problems if you meant to have a slash, not a regular expression. In that case, enclose the whole thing in quotes.
- An example: bililiteRange.ex.splitCommands ('a=1 b="1 2" c=/3 4', ' ') returns ['a=1', 'b="1 2"', 'c=/3 4/']. Note the automatic closing of /3 4/.
bililiteRange.ex.string (s {String})
- First, trims whitespace off the ends of s. If s starts with a quote mark ("), then returns JSON.parse(s); otherwise returns the trimmed string. If s is undefined, returns the empty string.
bililiteRange.ex.toID (s {String})
- Encodes s so it is a legal ex identifier (like encodeURI). Legal identifiers match /(!|[a-zA-Z=&~><]+)/. Note that numbers are not allowed, and that a single exclamation point is, but not any other use of the exclamation point (that is used for command variants).
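For anyone curious how the quote-aware splitting and the string helper could work, here is a standalone sketch. These are my own illustrative re-implementations under the descriptions above, written as free functions (splitCommands and exString are hypothetical names here) rather than the code that ships in bililiteRange.ex; backslash escapes inside a string or regular expression are assumed to protect the next character.

```javascript
// Hypothetical sketch of a splitter that respects "strings" and /regexps/
// and closes any that are left open, as described above.
function splitCommands(s, delim) {
  const parts = [];
  let part = '';
  let open = null;                       // the '"' or '/' we are inside, if any
  for (let i = 0; i < s.length; i++) {
    const ch = s[i];
    if (open) {
      part += ch;
      if (ch === '\\' && i + 1 < s.length) part += s[++i];  // keep escaped char
      else if (ch === open) open = null;                     // closing delimiter
    } else if (ch === '"' || ch === '/') {
      open = ch;
      part += ch;
    } else if (delim && s.startsWith(delim, i)) {
      parts.push(part);                  // split only outside strings/regexps
      part = '';
      i += delim.length - 1;
    } else {
      part += ch;
    }
  }
  if (open) part += open;                // automatic closing
  parts.push(part);
  return parts;
}

// Hypothetical sketch of the string helper: trim, then parse JSON strings.
function exString(s) {
  if (s === undefined) return '';
  const trimmed = s.trim();
  return trimmed.startsWith('"') ? JSON.parse(trimmed) : trimmed;
}
```

Running splitCommands('a=1 b="1 2" c=/3 4', ' ') on this sketch gives the same three parts as the example above, including the automatically closed /3 4/.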
It has been educational writing this; I'm not sure it will be useful to anyone but I am presenting it to my discerning public. If you exist.