Package monq

Class Oooo

java.lang.Object
monq.Oooo

public class Oooo extends Object
A program similar to grep/sed/awk. After creating the jar, call it like
  java -cp build/monq-*[0-9].jar monq.Oooo -help

Alternatively, on Linux, rename the jar to something simpler and make it executable:

   mv monq-*.jar ooo
   chmod a+x ooo
 

Examples

Assuming you have done the mv+chmod above, here are some simple examples:

  • Similar to a recursive grep:
      ooo r='Dfa<(.*>)! dfa =' src
    where the exclamation mark syntax (...)! means: shortest match.
  • Similar to grep -o and suppressing the names of the files in the output:
      ooo r='Dfa<(.*>)! +dfa += +(.*;)!' unmatched=DROP names=no src
  • Replacing text:
      ooo r='[a-z]+->[[ALPHA]]' r='[0-9]+->[[NUM]]' src
    Note how multiple r= rules can be used.

See regular expression syntax for the syntax to use before the -> in r= options.
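
As a rough illustration of how an r= rule pairs a regular expression with its replacement text, the following Java sketch splits a rule at the first -> and applies it to one partition. It is not monq code: the class RuleSketch and its methods are hypothetical, java.util.regex stands in for the monq regular expression syntax, and only the plain {} replacement is emulated (no \{} escape, no color braces).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of monq: emulates a single r=regex->template rule. */
class RuleSketch {
    final Pattern regex;
    final String template;

    /** Splits "regex->template" at the first "->". */
    RuleSketch(String rule) {
        int arrow = rule.indexOf("->");
        if (arrow < 0) {
            throw new IllegalArgumentException("missing -> in rule: " + rule);
        }
        // Assumption: java.util.regex is used here in place of the monq regex syntax.
        this.regex = Pattern.compile(rule.substring(0, arrow));
        this.template = rule.substring(arrow + 2);
    }

    /** Replaces every match with the template, substituting {} by the matched text. */
    String apply(String partition) {
        Matcher m = regex.matcher(partition);
        // quoteReplacement keeps '$' and '\' in the template from being interpreted.
        return m.replaceAll(r -> Matcher.quoteReplacement(template.replace("{}", r.group())));
    }

    public static void main(String[] args) {
        RuleSketch rule = new RuleSketch("[0-9]+->[[NUM]]");
        System.out.println(rule.apply("port 8080 and 443"));
    }
}
```

Because the split happens at the first ->, a regex that needs a literal -> has to spell it differently, which is what the -[>] form is for.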

Additional Information

Some background information to better understand the -help output:
  • Options have the form var=value, much like the UNIX dd command. The -help "option" is the one exception, so that asking for help works the way most users expect.
  • The program splits the input into partitions, controlled by the split parameter. By default it splits the input into lines, as grep and similar programs do.
  • Within a partition, regular expression replacements can be performed, defined by one or more r= options. Like split, the format is r=regex->template where regex is a monq regular expression (see link above) and template is any string which may contain brace replacements, like {} (see below), to be replaced by the actual match. So r=[0-9]+-><num>{}</num> will wrap each number found in a num XML tag. If you need -> as part of the regex, write -[>]. Should you need the text {} literally in the output, use \{}. NOTE: in most shells, be sure to enclose regex->template in quotes to prevent the > character from acting as an output redirection.
  • By default, matching is case insensitive; use case=true to avoid case folding.
  • If a partition has a match, the whole partition is shown, except with unmatched=DROP, which drops the unmatched text.
  • A partition without a match is not shown by default, but can be forced to appear with copy=true.
  • Any command line parameter of the form a=b is understood as an option; everything else, and everything after the first '--', is treated as a file or directory.
  • Directories are traversed up to the depth= parameter, where 0 is the immediate directory content. The depth defaults to depth=all if only one non-option parameter is provided. Otherwise it is 0.
  • The file name "-" (minus character) requests processing of standard input. If a real file's name is just the minus sign, provide it as ./- as usual.
  • The option names= can be used to force or suppress showing file names in front of the output partitions. Further, with names=only, only the names of matching files are shown, not the matches, and scanning a file is terminated right after the first match.
  • The option i=, for "include file", specifies a regular expression to find in file names to be included. The file name to be checked is the relative path beyond the directory in which it was found, including path separators. So an include like i=foo/bar will include, among others, dodo/foo/bar and foo/bar/baz/dong.z. Matching is case sensitive. Only file names are matched, not directory names.
  • The option x=, for "exclude file", is checked on files in the same way as i=. A file is processed for i AND !x, so x overrides an i match.
  • The option dotfile= requests the resulting DFA from combining the split and all r options to be written to a file in dot (graphviz) format.
  • Options with a boolean value, if provided, are true unless their value, in lowercase, starts with a prefix of the word false. So case= or case=voodoo mean true, but case=f means false.
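
The command line conventions in the list above can be sketched in a few lines of Java. This is an illustration only, not monq's own parser. The class ArgsSketch is hypothetical and rests on several assumptions: an option needs a non-empty name before the =, the -- marker itself is consumed, -help is not handled, and a boolean is false only when its lowercased value is a non-empty prefix of "false".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of the command line conventions; not monq's own parser. */
class ArgsSketch {
    /** Option values by name; a list, because options such as r= may repeat. */
    final Map<String, List<String>> options = new LinkedHashMap<>();
    /** Everything that is not an option: files and directories. */
    final List<String> files = new ArrayList<>();

    ArgsSketch(String... argv) {
        boolean afterDashes = false;
        for (String arg : argv) {
            int eq = arg.indexOf('=');
            if (afterDashes) {
                files.add(arg);
            } else if (arg.equals("--")) {
                afterDashes = true; // assumption: the "--" itself is consumed
            } else if (eq > 0) {
                // assumption: an option needs a non-empty name before '='
                options.computeIfAbsent(arg.substring(0, eq), k -> new ArrayList<>())
                       .add(arg.substring(eq + 1));
            } else {
                files.add(arg);
            }
        }
    }

    /** True unless the lowercased value is a non-empty prefix of "false" (assumed reading). */
    static boolean parseBool(String value) {
        String v = value.toLowerCase(Locale.ROOT);
        return v.isEmpty() || !"false".startsWith(v);
    }

    public static void main(String[] args) {
        ArgsSketch parsed = new ArgsSketch("r=[0-9]+->[[NUM]]", "case=f", "src", "--", "a=b");
        System.out.println(parsed.options);
        System.out.println(parsed.files);
        System.out.println(parseBool("") + " " + parseBool("voodoo") + " " + parseBool("f"));
    }
}
```

Under the assumed reading, parseBool agrees with the three cases given above: an empty value and voodoo are true, f is false.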
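
The "i AND !x" file selection described above might look roughly like the following Java sketch. The class FileFilterSketch is hypothetical and not taken from monq; java.util.regex replaces the monq regex syntax, and a missing i= is assumed to include every file.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch of the i=/x= file selection; not monq's own code. */
class FileFilterSketch {
    private final Pattern include; // null: no i= given, assumed to include every file
    private final Pattern exclude; // null: no x= given, nothing is excluded

    FileFilterSketch(String includeRegex, String excludeRegex) {
        // Assumption: java.util.regex stands in for the monq regex syntax.
        // Patterns are compiled without flags, so matching is case sensitive.
        this.include = includeRegex == null ? null : Pattern.compile(includeRegex);
        this.exclude = excludeRegex == null ? null : Pattern.compile(excludeRegex);
    }

    /** Applies "i AND !x" to the path relative to the traversed directory. */
    boolean accepts(String relativePath) {
        // find() looks for the regex anywhere in the path, separators included.
        boolean included = include == null || include.matcher(relativePath).find();
        boolean excluded = exclude != null && exclude.matcher(relativePath).find();
        return included && !excluded;
    }

    public static void main(String[] args) {
        FileFilterSketch filter = new FileFilterSketch("foo/bar", "\\.z$");
        System.out.println(filter.accepts("dodo/foo/bar"));
        System.out.println(filter.accepts("foo/bar/baz/dong.z"));
    }
}
```

In this sketch the second path matches the include pattern but is rejected anyway, because the exclude pattern also matches and x overrides i.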

Brace Replacements

The template may contain the following brace replacements:

  1. {} — copy the match
  2. {c:color} — copies the match with ANSI escape sequences around to color it
  3. {c} — shortcut for {c:red}
  4. {s:color} — ANSI escape to start the given color
  5. {e} — ANSI escape to reset all coloring

The color above may be red, green or blue (yes, just these) or #xxxxxx with 6 hex digits for a CSS-like color specification.

Example: match->{s:red}«{}»{e} and match->«{c:red}» will both insert the match between guillemets and use ANSI escapes to show it in red. The first colors the guillemets along with the match, while the second colors only the match.
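
To make the effect of the color braces more concrete, here is a small Java sketch that builds the two outputs from the example by hand. It is not monq code. The class AnsiSketch is hypothetical, and the escape sequences are the common ANSI SGR codes (31, 32, 34 for red, green, blue, 38;2;r;g;b for 24-bit colors, 0 for reset); that monq emits exactly these sequences is an assumption.

```java
/** Hypothetical sketch of the color braces using common ANSI SGR codes; not monq's code. */
class AnsiSketch {
    /** Resets all attributes, playing the role of {e}. */
    static final String RESET = "\u001b[0m";

    /** Escape that starts a color, playing the role of {s:color}. */
    static String start(String color) {
        switch (color) {
            case "red":   return "\u001b[31m";
            case "green": return "\u001b[32m";
            case "blue":  return "\u001b[34m";
            default:
                if (color.matches("#[0-9a-fA-F]{6}")) {
                    // Assumption: #xxxxxx maps to a 24-bit foreground color.
                    int rgb = Integer.parseInt(color.substring(1), 16);
                    return String.format("\u001b[38;2;%d;%d;%dm",
                            (rgb >> 16) & 0xff, (rgb >> 8) & 0xff, rgb & 0xff);
                }
                throw new IllegalArgumentException("unsupported color: " + color);
        }
    }

    /** Wraps the match in a color, playing the role of {c:color}. */
    static String colored(String match, String color) {
        return start(color) + match + RESET;
    }

    public static void main(String[] args) {
        String match = "duck";
        // \u00ab and \u00bb are the left and right guillemets.
        // Template {s:red}<<{}>>{e}: the guillemets are inside the colored span.
        System.out.println(start("red") + "\u00ab" + match + "\u00bb" + RESET);
        // Template <<{c:red}>>: only the match itself is colored.
        System.out.println("\u00ab" + colored(match, "red") + "\u00bb");
    }
}
```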

One common DFA or "what is this weird error?"

All regular expressions from the r= and the split= option are compiled into a common DFA, which in particular means that all regular expressions are matched *at once* or *in parallel*. Now consider having the two rules

  r='[a-z]+->alpha' r='duck->found'

When finding duck, the string also matches [a-z]+, and the machine cannot know which of the two rules is "better". Consequently it does not even compile the DFA and instead reports:

  CompileDfaException: two stop states with different actions recognize the same string.
  The following set(s) of clashes exist:
  1) path `duck':
  ...
 

Currently there is no way around this other than fixing the regular expressions so that they do not overlap. Remember that split= goes into the mix, so a rule like r=\n->... will clash with the default split rule.
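
When hunting for such a clash, it can help to probe a suspicious string against each rule's regular expression separately. The Java sketch below does this with java.util.regex, which is only a stand-in for the monq syntax; the class ClashSketch is hypothetical. It checks one sample string at a time and is not a general overlap test, which would require intersecting the automata.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical probe for overlapping rules; not part of monq. */
class ClashSketch {
    /** Returns the regexes that match the whole sample string. */
    static List<String> rulesMatching(String sample, String... regexes) {
        List<String> hits = new ArrayList<>();
        for (String regex : regexes) {
            // Pattern.matches requires the entire sample to match the regex.
            if (Pattern.matches(regex, sample)) {
                hits.add(regex);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Both rules recognize "duck", which is the situation the exception reports.
        System.out.println(rulesMatching("duck", "[a-z]+", "duck"));
    }
}
```

If more than one regex comes back for the path named in the error message, those are the rules to disentangle.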

  • Constructor Details

    • Oooo

      public Oooo()
  • Method Details

    • main

      public static void main(String[] argv)