MakeRegex

The Perl module MakeRegex composes a regular expression from a list of words. It was inspired by the Emacs Lisp module make-regex.el by Simon Marshall.

Some applications

Usage

A simple example (there are more examples in the package):
#!/usr/bin/perl

use strict;
use MakeRegex;

my @list= qw(
 a al all alla an ann anna annas ananas
);

# Build the regex once, print it, and reuse it in the loop below.
my $regex = MakeRegex::make_regex(@list);
print "$regex\n";
foreach (@list) {
  print "$_:";
  if (/$regex/) {
    print "YES\n";
  } else {
    print "NO\n";
  }  
}
  
This module should be at least as clever as Mr. Marshall's implementation. The test in make-regex.el gives exactly the same result (see the file tests.results). However, the modules differ with respect to grouping and the handling of the '?' operator.

 
(insert (make-regexp '("a" "al" "all" "alla" "an" "ann" "anna" "annas")))
    ->  a(|[ln]|lla?|nn(|as?))

 make_regex( ("a","al","all","alla","an","ann","anna","annas") );
   ->  a(l(la?)?|n(n(as?)?)?)?
 
which gives Marshall a slight victory. But if we append "ananas" (Swedish for 'pineapple'), we see a big difference:
 (insert (make-regexp '("a" "al" "all" "alla" "an" "ann" "anna" "annas" "ananas")))
    -> a(|[ln]|lla?|nn(|as?))a(|[ln]|lla?|n(anas|n(|as?)))
    make_regex( ("a","al","all","alla","an","ann","anna","annas","ananas") );
    -> a(l(la?)?|n(anas|n(as?)?)?)?
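One way to check that the MakeRegex expression above accepts exactly the nine words is to anchor it and run it against the list plus a few strings outside it. This is only a sketch: the expression is copied from the output shown above, and the non-list strings in @others are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @words = qw(a al all alla an ann anna annas ananas);

# The expression reported above for the nine-word list, anchored at both
# ends so that only whole-word matches count.
my $regex = qr/^a(l(la?)?|n(anas|n(as?)?)?)?$/;

# Strings that are NOT in the list (hypothetical, chosen for illustration).
my @others = qw(ana anan alll annass b);

my @missed = grep { !/$regex/ } @words;     # list words the regex rejects
my @extra  = grep {  /$regex/ } @others;    # outside strings it accepts

print "missed: @missed\n";
print "extra:  @extra\n";
```

Both lists should come out empty. Note that without the ^ and $ anchors the pattern matches any string that contains an "a", so the anchors matter when the expression is used to test membership.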

A NOTE AND TODO

This version is based on common prefixes, which means that it does not take proper care of common suffixes or infixes, as the list

    ("alla" "palla" "balla" "kalla")
bluntly shows:
 ->  (alla|palla|balla|kalla)
This may or may not be fixed in the future.
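To illustrate what "based on common prefixes" can mean, here is a minimal sketch that stores the words in a character trie and writes the trie out as a regex. It is not the code in MakeRegex.pm: the names build_trie and trie_to_regex are hypothetical, the alternatives are emitted in sorted order, and the `//=` operator assumes Perl 5.10 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a character trie from a word list. The empty-string key marks
# a node where a word ends.
sub build_trie {
    my @words = @_;
    my %trie;
    for my $word (@words) {
        my $node = \%trie;
        $node = $node->{$_} //= {} for split //, $word;
        $node->{''} = 1;
    }
    return \%trie;
}

# Turn a trie node into a regex fragment that matches every word
# continuation stored below that node.
sub trie_to_regex {
    my ($node) = @_;
    my @alts;
    for my $char (sort grep { $_ ne '' } keys %$node) {
        push @alts, quotemeta($char) . trie_to_regex($node->{$char});
    }
    return '' unless @alts;    # leaf: nothing more to match

    # Several branches need a group; a single branch is plain concatenation.
    my $body = @alts == 1 ? $alts[0] : '(' . join('|', @alts) . ')';

    # If a word may also end at this node, the continuation is optional.
    if (exists $node->{''}) {
        $body = "($body)" if @alts == 1 && length($body) > 1;
        $body .= '?';
    }
    return $body;
}

print trie_to_regex(build_trie(qw(a al all alla an ann anna annas ananas))), "\n";
# prints: a(l(la?)?|n(anas|n(as?)?)?)?

print trie_to_regex(build_trie(qw(alla palla balla kalla))), "\n";
# prints: (alla|balla|kalla|palla)
```

Because the four words in the second list differ in their first letter, the trie branches at the root and the shared "alla" ending is never factored out, which is the limitation described above.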

Download the MakeRegex package. This package includes MakeRegex.pm and some test programs.

Note: This page was earlier located at http://www.netch.se/~hakank/makeregex/ .
MakeRegex was created by Hakan Kjellerstrand, hakank@gmail.com.