There have been lots of times that I've wanted to be able to keep my hand on the keyboard when editing, rather than running off to the mouse all the time. There's an implementation of VIM in Javascript but I figured I would learn something by doing it myself. My goal is vi, not vim, since I don't need anything that sophisticated.

The first step is implementing the line-oriented part of vi, called ex, based on the manual from the sourceforge project. My version is based on bililiteRange, and depends on the bililiteRange utilities and undo plugin.

Use it simply as bililiteRange(textarea).ex('%s/foo/bar/');, passing the ex command to the ex() function. The biggest difference from real ex is that this uses javascript regular expressions, rather than the original ex ones. Thus s/\w/x/ rather than s/[:class:]/x/, and use ?/.../ rather than ?...? to search backwards (the question mark is used in Javascript regular expressions so I don't want to use it as a delimiter).

See a demo.

See the code on github.

Command Syntax

The syntax of the function is trivial: bililiteRange(someelement).ex(excommand {String} [, defaultAddress {String}]), where excommand is described below, and defaultAddress is the address of the line to use if no address is given in the command string; the default is '.'.

Ex commands are in the form of addressrange command variant parameter; all parts are optional. White space can separate the parts but is not necessary where the command is unambiguous. Multiple commands can be entered in one string, separated by |. To include | or other special characters in the parameter (the only place it would be relevant), enclose the parameter in double quotes, as a JSON string. Thus ex('a hello | a bye') appends two lines, hello and bye, while ex('a "hello | a bye"') appends one line, hello | a bye.

The parser tries to imitate ex's too-clever-by-half automatic closing of regular expressions (and by extension, strings). This means that an address or a parameter that contains an unmatched / or " will have that delimiter added at the end. So ex('a his/hers') will in fact append his/hers/. Use JSON strings to avoid that problem: ex('a "his/hers"') or ex('a "his\"hers"').
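
To make the quoting rule concrete, here is a minimal sketch of how a parameter might be decoded, following the behaviour described for bililiteRange.ex.string further down (trim, then JSON.parse if the text starts with a double quote). The name decodeParameter is made up for this illustration; it is not the library's code.

```javascript
// Hypothetical helper, not part of bililiteRange: decodes an ex parameter.
function decodeParameter(s) {
  if (s === undefined) return '';
  const trimmed = s.trim();
  // A leading double quote means "treat the whole thing as a JSON string".
  return trimmed.startsWith('"') ? JSON.parse(trimmed) : trimmed;
}

console.log(decodeParameter(' hello '));         // hello
console.log(decodeParameter('"hello | a bye"')); // hello | a bye
console.log(decodeParameter('"his\\"hers"'));    // his"hers
```

Note the doubled backslash in the last call: it is only there because the example is itself written inside a Javascript string literal.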

Syntax errors and undefined commands are reported with throw new Error(errormessage).

Non-error messages (like the list of options from set) are returned in a field in the range called exMessage. Thus console.log( bililiteRange(element).ex('set').exMessage ).

variant simply means optionally appending a ! to the command name. For many commands, this changes the behavior slightly. E.g., append text respects the value of the autoindent option, but append! text does the opposite of the autoindent option.

Command Completion

The parser first looks up the command in the object bililiteRange.ex.commands. If it is defined as a function, the parser executes it. If it is defined as a string, the parser looks that value up in bililiteRange.ex.commands with this same algorithm (so infinite loops are possible!). If it is not defined at all, the parser treats it as an abbreviation and uses the first member of bililiteRange.ex.commands for which the command is a prefix.

So if bililiteRange.ex.commands were:

{
  c: 'cut',
  copy: function(){...},
  cut: function(){...}
}

Then c would execute the cut function and co would execute the copy function.
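
A rough sketch of that lookup, written from the description above rather than from the actual parser: resolveCommand is a hypothetical name, and the stand-in command functions just return a string so the result is visible.

```javascript
// Hypothetical re-creation of the lookup described above; the real parser may differ.
function resolveCommand(commands, name) {
  const own = Object.prototype.hasOwnProperty.call(commands, name);
  const entry = own ? commands[name] : undefined;
  if (typeof entry === 'function') return entry;
  // A string is a synonym: look that name up with the same algorithm.
  // Two synonyms that point at each other would never resolve, as the text warns.
  if (typeof entry === 'string') return resolveCommand(commands, entry);
  // Otherwise treat the name as an abbreviation: take the first key it is a prefix of.
  const match = Object.keys(commands).find(
    key => key !== name && key.startsWith(name)
  );
  if (match === undefined) throw new Error('Undefined command: ' + name);
  return resolveCommand(commands, match);
}

const commands = {
  c: 'cut',
  copy: function () { return 'copied'; },
  cut: function () { return 'cut'; }
};
console.log(resolveCommand(commands, 'c')());  // cut
console.log(resolveCommand(commands, 'co')()); // copied
```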

Addresses

Ex is line oriented; every command works on one or more complete lines, which are delimited by newlines ('\n'). Address ranges are of the form x ([,;] x)* meaning one or more address parts separated by commas or semicolons. Each address part (described below) may be followed by a positive or negative offset. Each address part evaluates to a line number, then the offset is added or subtracted to get the "current line". This is pushed onto a stack of line numbers. The final range that the command operates on is the range of lines between the top two lines on that stack (inclusive). If only one address part is given, then the range is that line alone. If there is no address, then the default address passed to ex() is used. The default for that is '.'.

The "starting line" is the first line of the range passed in. Separating address parts with a semicolon instead of a comma resets the "starting line" to the "current line". Thus if the current line is line 4, '1,.+1' means lines 1 through 5 but '1;.+1' means lines 1 through 2.
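
The comma-versus-semicolon rule is easier to see in code. The following is a simplified sketch under stated assumptions: resolveRange is a hypothetical function that only understands ".", "$", plain numbers and +/- offsets (no marks, no searches), and it takes the current line and last line as plain numbers instead of reading them from a range.

```javascript
// Simplified, hypothetical address evaluator: handles only ".", "$", numbers and offsets.
function resolveRange(address, currentLine, lastLine) {
  const stack = [];
  let dot = currentLine;                  // what "." currently means
  const tokens = address.split(/([,;])/); // the capture group keeps the separators
  for (let i = 0; i < tokens.length; i += 2) {
    const m = /^\s*(\.|\$|\d+)?\s*([+-]\s*\d+)?\s*$/.exec(tokens[i]);
    if (!m) throw new Error('Bad address: ' + tokens[i]);
    const base = (m[1] === undefined || m[1] === '.') ? dot
      : (m[1] === '$') ? lastLine
      : Number(m[1]);
    const offset = m[2] ? Number(m[2].replace(/\s+/g, '')) : 0;
    const line = base + offset;
    stack.push(line);
    if (tokens[i + 1] === ';') dot = line; // semicolon: later "." refers to this line
  }
  // The range is the top two entries of the stack, or a single line if only one was pushed.
  const end = stack[stack.length - 1];
  const start = stack.length > 1 ? stack[stack.length - 2] : end;
  return [start, end];
}

console.log(resolveRange('1,.+1', 4, 10)); // [ 1, 5 ]
console.log(resolveRange('1;.+1', 4, 10)); // [ 1, 2 ]
```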

Address parts can be any of the following:

%%
This is my own extension. Pushes both the starting and ending line of the original range onto the stack. If this is the only address, then does not change the range at all (useful for creating commands that are not line-oriented).

. (a single period)
Pushes the current line.
$
Pushes the last line of the text.
Any number
Pushes that number
'x (a single quote mark followed by a lower-case letter)
Pushes the value of the corresponding mark (see the mark command)
'' (two single quote marks)
Pushes the current line of the previous command (so you can go back).
%
Equivalent to 0,$; all the lines.
/re/flags
Searches for the regular expression /re/ from the current line. Pushes the first line found that matches, or the current line if not found. An empty regular expression (//, or, since the parser closes regular expressions, just /) uses the previous regular expression.
The following flags are valid (this is an extension of real Javascript regular expressions):
  • i: Force a case-insensitive search.
  • I: Force a case-sensitive search.
  • If neither i nor I is set, uses the value of the ignorecase option.
  • m: Use multiline mode (^ and $ match the beginning and end of each line rather than just the beginning and end of the text).
  • M: Do not use multiline mode.
  • By default, uses multiline mode.
  • w: Search wraps around to the beginning of the text if needed.
  • W: Search does not wrap around.
  • If neither w nor W is set, uses the value of the wrapscan option.
  • The g flag is legal but is ignored (since this is only looking for the first match).
?/re/flags
Search backwards for /re/ as above.
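
As an illustration of how those flags could be resolved, here is a minimal sketch. buildSearch is a hypothetical name, options stands in for the editor's option values, and letting the lower-case flag win when both cases are given is an assumption; the text does not say what happens then.

```javascript
// Hypothetical flag resolution for /re/flags addresses; not the library's code.
function buildSearch(source, flags, options) {
  const has = f => flags.includes(f);
  // Explicit flags win; otherwise fall back to the corresponding option.
  const ignoreCase = has('i') ? true : has('I') ? false : Boolean(options.ignorecase);
  const wrap = has('w') ? true : has('W') ? false : Boolean(options.wrapscan);
  const multiline = !has('M'); // multiline is the default; only M turns it off
  // Only i and m are passed on to the native RegExp; g, w and W are not native here.
  const re = new RegExp(source, (ignoreCase ? 'i' : '') + (multiline ? 'm' : ''));
  return { re: re, wrap: wrap };
}

const search = buildSearch('^b', '', { ignorecase: true, wrapscan: true });
console.log(search.re.flags);        // im
console.log(search.re.test('a\nB')); // true
```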

Commands

append or a
Inserts parameter as a new line after the address range. Add multiple lines with escaped newlines in a JSON string, as ex('a "first\nsecond\nthird"'). If the autoindent option is set, copies any whitespace at the beginning of the current line to the beginning of each of the appended lines. If variant is set (i.e. append!), toggles the autoindent option for this command.
change or c
Replaces the address range with parameter, respecting the autoindent option as with append (and its variant). The deleted text is pushed onto a "delete stack", to be popped with put. This is the analogue of the clipboard in modern GUIs, but it is a stack: the contents can only be used once (but see the delete command with named registers).
copy or transcribe or t
parameter is interpreted as an address range as above, and the text of the original addressed range is copied after that. Thus ex('1,2 copy 4') copies lines 1 and 2 after line 4. Note that this is different from what modern GUIs call "copy", which copies the text to the clipboard; for that use yank.
delete or del
Deletes the address range. Interprets parameter as an optional letter followed by an optional number. If the letter is not present, then the deleted text is pushed onto the delete stack. If a lower-case letter is present, that is taken as the name of a register and the text is stored there. If an upper-case letter is present, the deleted text is appended to the corresponding lower-case letter register. If the number is present, then the affected range is the last line of the address range and the n-1 following lines. Thus, 1,2 delete deletes lines 1 and 2, but 1,2 delete 2 deletes lines 2 and 3.
global or g
This one is hard to explain. Interprets parameter as a regular expression followed by a command, then executes the command on all the lines in the address range that match the regular expression. If variant is set, then executes the command on all the lines that do not match the regular expression.
For example, 1,4 g /foo/ d would delete all lines from 1 to 4 that contain "foo". % g! /^\d/ s /^/"\t" would prepend a tab to every line that does not start with a digit.
Since a | (vertical bar) is used to separate individual commands, we can't use that to separate the commands passed in parameter. The real ex uses newlines to separate those, but it's hard to type that, so I use the character \n to separate commands. So, to insert a blank line before and prepend a tab to lines that do not start with a digit, use % g! /^\d/ i \n s /^/"\t".
A fluke of the way I parse regular expressions is that flags to the search expression have to be at the end of parameter, like: % g /foo/ d /i to delete all lines that contain foo, case insensitive. Setting the ignorecase option beforehand would work as well.
insert or i
Same as append, but adds parameter before the address range.
join or j
Joins all the lines in the address range, collapsing whitespace at the beginning and end of the lines into a single space (uses replace(/\s*\n\s*/g, ' ')). If variant is set, doesn't adjust whitespace (uses replace(/\n/, '')). If parameter is present, it is interpreted as a number of lines to join; the address range is considered to be the last line of the actual address range and the next parameter-1 lines.
If there is only one line in the address range, joins it to the next line.
mark or k
parameter should be a single lowercase letter (this is not enforced, but the only way to refer to a mark is with a single lowercase letter, so anything else is write-only). Assigns the current address range to that mark, and makes it live, so it will remain attached to that text even if the text around it is edited (see the description for bililiteRange.live).
move or m
Same as copy above, but the original address range is deleted after copying.
noglobal or v
Same as global!
print
Just selects the address range. If no command is given, this is the default.
put
parameter, if present, should be a single lowercase letter (anything else is not an error, but there's no defined way to get anything into any other register). If parameter is present, then inserted text is from that register. If not, pop the delete stack. Then use append above to insert that text.
redo
Does bililiteRange(element).redo().
set
Sets or displays options. If parameter is not set, sets bililiteRange(element).exMessage to JSON.stringify(bililiteRange(element).data()). If parameter is set, it is interpreted as a space-delimited list of commands, in the following forms:
option=value
Does the command option value. Note that each option is actually a command, with the parameter being the value to set that option to.
option
Does the command option on.
nooption (that is, the literal string "no" followed by the name of the option)
Does the command option no.
option?
Does option ?, which sets bililiteRange(element).exMessage to the value of that option. Note that each option overwrites the previous, so only the last will be displayed.
substitute or s or &
parameter is a regular expression followed by a string to be used as a replacement, as /regexp/replacement/flags (note that the flags are after the replacement, and the final delimiter is optional if there are no flags). Use JSON string notation if needed for the replacement, as /regexp/"replacement_string" (adding /flags if needed).
Note that as a quirk of the way I parse regular expressions, flags are read first, then the replacement string. That means that /regexp/i is read as "replace /regexp/i with an empty string", not "replace /regexp/ with "i"".
Just uses addressrange.text().replace(regexp, replacement), so replacement can use "$1" etc.
If no parameter is given, repeats the last search.
undo
Does bililiteRange(element).undo().
yank or y
Does the same as delete above, moving the text into the register named in parameter or onto the delete stack, but does not delete the text. Analogous to copy in a modern GUI.
=
Sets exMessage to the current lines; either "[n]" for a single line, or "[m,n]" for a range.
~
Searches for the last replacement text (uses new RegExp(), so special characters are not quoted), and replaces it with parameter.
>
Indents the lines of the address range by prepending a tab character. If parameter is set, prepend that many tabs.
Feel free to start a tab-versus-spaces religious war.
<
Unindents the lines of the address range by removing a leading tab character or shiftwidth spaces. If parameter is set, remove that many tabs or sets of spaces.
!
"Shell" escape—actually, Javascript escape. Does eval(parameter), with this set to the bililiteRange covering the address range, and if the result is not undefined, replaces the address range with that text. For instance, to change a line to upper case: 2 ! this.text().toUpperCase().

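The parameter rules for delete (optional register letter, optional count) can be sketched as follows. Both function names are hypothetical, and ranges are represented as plain [first, last] pairs of line numbers for the sake of the example.

```javascript
// Hypothetical parser for delete's parameter: an optional letter, then an optional number.
function parseDeleteParameter(parameter) {
  const m = /^\s*([a-zA-Z])?\s*(\d+)?\s*$/.exec(parameter || '');
  if (!m) throw new Error('Bad parameter: ' + parameter);
  const letter = m[1];
  return {
    register: letter ? letter.toLowerCase() : null,             // null: use the delete stack
    append: Boolean(letter) && letter !== letter.toLowerCase(), // upper case appends
    count: m[2] ? Number(m[2]) : null
  };
}

// With a count n, the range becomes the last addressed line plus the n-1 lines after it.
function applyCount(range, count) {
  if (count === null) return range;
  const last = range[1];
  return [last, last + count - 1];
}

console.log(applyCount([1, 2], parseDeleteParameter('').count));  // [ 1, 2 ]
console.log(applyCount([1, 2], parseDeleteParameter('2').count)); // [ 2, 3 ]
```
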
Options

The option-setting commands are just commands of the form option value to set the option and option ? to return its value in exMessage. For boolean options, a value of off, false or no sets it to false; toggle toggles the value; and any other value (including leaving it blank) sets it to true.
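
The boolean rule above amounts to a small decision table. A minimal sketch, with nextBooleanValue as a hypothetical helper name (the ? query form is left out, since it reads the value rather than changing it):

```javascript
// Hypothetical helper: computes the new value of a boolean option from the command's value.
function nextBooleanValue(current, value) {
  const v = (value || '').trim();
  if (v === 'off' || v === 'false' || v === 'no') return false;
  if (v === 'toggle') return !current;
  return true; // anything else, including a blank value, turns the option on
}

console.log(nextBooleanValue(true, 'no'));     // false
console.log(nextBooleanValue(true, 'toggle')); // false
console.log(nextBooleanValue(false, ''));      // true
```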

autoindent or ai
If true, then append, change and insert will copy initial whitespace from the first line of the address range to all inserted lines. Use the variant (append! etc.) to toggle this for one command. Default: false.
shiftwidth or hardtabs or ht or sw or tabstop or ts (in the real ex these are all different, but here they are all synonyms)
Number of spaces in a tab. Uses tabsize to change the displayed text, if supported. Default: 8.
wrapscan
If true, regular expression searches wrap around to the beginning of the text if not found from the current line to the end (for backward searches, wrap to the end if not found before the current line). Force wrap around by using the w flag; disable wraparound by using the W flag. Default: true.
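
Since set only rewrites its parameter into these option commands, the translation can be sketched in a few lines. This is an illustration only: translateSet is a hypothetical name, it splits on plain whitespace rather than with the quote-aware parser, and the knownOptions list is an assumption added so that "no" is stripped only from real option names.

```javascript
// Hypothetical translation of a set parameter into a list of option commands.
function translateSet(parameter, knownOptions) {
  return parameter.trim().split(/\s+/).filter(Boolean).map(item => {
    const eq = item.indexOf('=');
    if (eq > 0) return item.slice(0, eq) + ' ' + item.slice(eq + 1); // option=value
    if (item.endsWith('?')) return item.slice(0, -1) + ' ?';          // option?
    if (item.startsWith('no') && knownOptions.includes(item.slice(2))) {
      return item.slice(2) + ' no';                                   // nooption
    }
    return item + ' on';                                              // option
  });
}

console.log(translateSet('ai nowrapscan sw=4 ai?', ['ai', 'wrapscan', 'sw']));
// [ 'ai on', 'wrapscan no', 'sw 4', 'ai ?' ]
```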

State

The state of the editor (including the values of the options and the locations of the marks) is kept in the data attached to the element. Thus, calling ex() on the same element, even if the bililiteRange is different, works correctly. That state is returned with bililiteRange(element).data(), and you can extend the editor commands and take advantage of that object.

The registers (the stored text from delete and yank) are stored in a singleton shared by all editor instances, called bililiteRange.ex.registers. This is an array, so the delete stack just uses bililiteRange.ex.registers.unshift(text) and text = bililiteRange.ex.registers.shift(). Named registers are simply added to that array with bililiteRange.ex.registers['a'] = text. shift and unshift are used rather than push and pop, so the most recent text is bililiteRange.ex.registers[0].
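
A small stand-alone model of that array may help. The registers variable below is a local stand-in for bililiteRange.ex.registers, store and pop are hypothetical helpers, and appending to an upper-case register by plain string concatenation is an assumption (the text does not say whether a separator is added).

```javascript
// Local stand-in for bililiteRange.ex.registers, which the text describes as a shared array.
const registers = [];

function store(text, letter) {
  if (!letter) {
    registers.unshift(text);           // delete stack: the newest entry is registers[0]
  } else if (letter === letter.toLowerCase()) {
    registers[letter] = text;          // lower case: overwrite the named register
  } else {
    const name = letter.toLowerCase(); // upper case: append to the lower-case register
    registers[name] = (registers[name] || '') + text;
  }
}

function pop() {
  return registers.shift();            // undefined when the stack is empty
}

store('one');
store('two');
store('x', 'a');
store('y', 'A');
console.log(registers[0]); // two
console.log(registers.a);  // xy
```

Named registers are ordinary properties on the array, so they do not change its length and are untouched by shift and unshift.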

Unfortunately, browser security keeps Javascript from directly manipulating the clipboard, so I can't integrate this with the browser's cut/paste, but the fact that this is exposed means you can display it to the user for direct manipulation.

Extending

The bililiteRange.ex namespace is used to expose some of the objects and methods that can extend the interpreter.

bililiteRange.ex.commands
Described above under "Command Completion". This is an object with the keys being the names of the commands and each value being either a string (to define a synonym, so you can do: bililiteRange.ex.commands.cut = 'delete') or a function with the signature function(parameter, variant). this is set to the bililiteRange being edited, with bounds set to the address range. parameter is a string, and variant is a Boolean, set to true if the command was followed by a "!".
Legal command names are /[a-zA-Z=&~><]+/ (letters and a few special characters; note that numbers and underscores are not legal). So
bililiteRange.ex.commands['hello~world'] = function (parameter, variant) { // note the tilde between 'hello' and 'world'
  this.text( variant ? 'Hello, world' : 'Goodbye, world')
}
works, but bililiteRange.ex.commands['hello-world'] = function..., while perfectly legal Javascript, will never be executed by ex since the parser will not recognize 'hello-world' (with the dash).
For instance, there's no write command; there are too many inconsistent ways to implement persistent storage. If you wanted to implement a write command using local storage, you could do:

bililiteRange.ex.createOption('file', 'Untitled'); // see the createOption method below
bililiteRange.ex.commands.write = function (parameter, variant){
  var state = this.data(); // 'this' in a command is the bililiteRange
  if (parameter) state.file = parameter; // allow changing the "file" name
  var key = 'ex.'+state.file;
  localStorage[key] = this.all(); // save the entire text under that key
};
bililiteRange.ex.commands.read = function (parameter, variant){ // this isn't exactly the semantics of the original ex read command
  if (!parameter) return; // need a file name
  var key = 'ex.'+parameter;
  if (!(key in localStorage)) throw new Error (parameter+' not found');
  this.all(localStorage[key]); // replace the entire text with the stored copy
  this.data().file = parameter;
};
bililiteRange.ex.createOption (name {String}, value {Any | undefined})
Does bililiteRange.data(name, {value: value}), then creates an ex command name for setting that option. The actual command created is based on the type of value, which is the default value. So bililiteRange.ex.createOption('happy', true) creates a boolean option named happy, and this can be set with rng.ex('set happy=false') or rng.ex('happy toggle').
bililiteRange.ex.createRE (s {String}, ignoreCase {Boolean})
Creates an enhanced regular expression from s. Takes a string in the form expected by substitute above, meaning delimiter-regular-expression-string-delimiter-flags, with any of the flags allowed in the /re/ addresses above. Note that the delimiter can actually be any character, though using "/" makes sense. ignoreCase is the default if the i or I flags are not used; normally you would use this.data().ignorecase.
If the regular expression part of s is missing (it just starts delimiter-delimiter) then it uses the regular expression part from the last createRE().
The regular expression is formed with new RegExp('regular expression part of s', 'legal flags'), where 'legal flags' are i and m; g is never present. It is then extended with the following fields:
flags
The actual flags used internally (/[imwIMW]*/).
rest
The rest of the string, the part that was not part of the regular expression or the flags.
toJSON
Function used by JSON.stringify() so that the regular expression can be displayed. Normal RegExp's are ignored by JSON.stringify().
bililiteRange.ex.splitCommands (s {String}, delim {String})
Simple parser that basically does s.split(delim), where delim is a string, not a regular expression, but knows about strings (contained in ") and regular expressions (contained in /) and does not split on a delimiter inside one of those. Also, to be consistent with ex, automatically closes open strings and regular expressions, which can cause problems if you meant to have a slash, not a regular expression. In that case, enclose the whole thing in quotes.
An example: bililiteRange.ex.splitCommands ('a=1 b="1 2" c=/3 4', ' ') returns ['a=1', 'b="1 2"', 'c=/3 4/']. Note the automatic closing of /3 4/.
bililiteRange.ex.string (s {String})
First, trims whitespace off the ends of s. If s starts with a quote mark ("), then returns JSON.parse(s); otherwise returns the trimmed string. If s is undefined, returns the empty string.
bililiteRange.ex.toID (s {String})
Encodes s so it is a legal ex identifier (like encodeURI). Legal identifiers match /(!|[a-zA-Z=&~><]+)/. Note that numbers are not allowed, and that a single exclamation point is, but not any other use of the exclamation point (that is used for command variants).
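
To illustrate the splitCommands behaviour described above, here is a minimal re-creation written from that description alone. splitCommandsSketch is a hypothetical name, and details such as how backslash escapes are treated inside strings and regular expressions are assumptions, not the library's actual rules.

```javascript
// Hypothetical re-creation of a quote- and regexp-aware split with automatic closing.
function splitCommandsSketch(s, delim) {
  if (!delim) return [s];      // avoid looping forever on an empty delimiter
  const parts = [];
  let current = '';
  let open = null;             // the " or / we are currently inside, if any
  for (let i = 0; i < s.length; i++) {
    const ch = s[i];
    if (open) {
      current += ch;
      if (ch === '\\' && i + 1 < s.length) {
        current += s[++i];     // keep an escaped character; it cannot close anything
      } else if (ch === open) {
        open = null;
      }
    } else if (ch === '"' || ch === '/') {
      open = ch;
      current += ch;
    } else if (s.startsWith(delim, i)) {
      parts.push(current);
      current = '';
      i += delim.length - 1;
    } else {
      current += ch;
    }
  }
  if (open) current += open;   // automatically close an unterminated string or regexp
  parts.push(current);
  return parts;
}

console.log(splitCommandsSketch('a=1 b="1 2" c=/3 4', ' '));
// [ 'a=1', 'b="1 2"', 'c=/3 4/' ]
```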

It has been educational writing this; I'm not sure it will be useful to anyone but I am presenting it to my discerning public. If you exist.
