[Novalug] regexp problem in perl

Bonnie Dalzell bdalzell at qis.net
Tue Aug 31 19:24:51 EDT 2010


If anyone has the time and the experience with regexps in Perl to help me,
I have a problem that I have not been able to solve with web searches
and a day's worth of experimentation.

It involves extracting a variable number of digits from the middle of a
line of text.

Sometimes there will be 6 digits, sometimes 7, sometimes 8, sometimes 9.

I have a regexp that works over a range of two different digit counts
(such as 7 or 8), but it fails with a wider range.

The input line is from an online dog show catalog:

1  ||46|ATTAWAY-KINOBI SPRING FEVER. HM 82485718. 04-09-99
2dump--|----dog name---------------| |regnumber-| |-birth-|

The dog name can contain non-alphabetic characters as well as letters:
hyphens, ampersands, single quotes.

Sometimes the date of birth is in a format like 02/03/00 or 02\03\00
rather than the hyphen format 02-03-00.
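
As a rough sketch of the three date formats just described, one character
class can list the allowed separators explicitly. The variable name
$date_re is made up for illustration.

```perl
use strict;
use warnings;

# Sketch only: two-digit fields separated by a hyphen, slash or backslash.
my $date_re = qr{\b\d\d[-/\\]\d\d[-/\\]\d\d\b};

for my $text ('04-09-99', '02/03/00', '02\03\00') {
    print "$text looks like a date\n" if $text =~ /^$date_re$/;
}
```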


This regexp works if there are only 7 or 8 digits in regnumber:

if(
$line=~m/(\|.*)(\|)(.*)(\w{2}.*\d{7,8}\w*).(\s\d\d[\W|-]{1}\d\d[\W|-]{1}\d\d)/
){

yields:

$1 ||46                          - to be ignored
$2 |                             - to be ignored
$3 ATTAWAY-KINOBI SPRING FEVER.  = $name
$4 HM 82485718                   = $regnumber
$5  04-09-99                     = $birthday

If I try to widen the range of digit counts by using

\d{6,9}

in place of \d{7,8}, it fails.
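
One possible explanation, offered as a sketch rather than a confirmed
diagnosis: digits also match \w, so with \d{6,9} the greedy (.*) in front
can swallow "HM " and leave \w{2} to match the first two digits of an
8- or 9-digit number. The code below avoids that by requiring two capital
letters before the digits. The helper name parse_entry, the assumption that
the prefix is always two capital letters, and the second sample line are
all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the original post.
# Assumes the registration prefix is exactly two capital letters (as in "HM")
# and that the dog name itself contains no "|".
sub parse_entry {
    my ($line) = @_;
    return unless $line =~ m{
        \| ([^|]+?) \s+                    # name: text after the last "|"
        ([A-Z]{2} \s* \d{6,9}) \b          # regnumber: 2 letters, 6 to 9 digits
        \.? \s+
        (\d\d [-/\\] \d\d [-/\\] \d\d)     # birthday: hyphen, slash or backslash
    }x;
    return ($1, $2, $3);
}

my @samples = (
    '1  ||46|ATTAWAY-KINOBI SPRING FEVER. HM 82485718. 04-09-99',  # line from the post
    '2  ||47|MADE-UP NAME. HM 123456. 02/03/00',                   # invented test line
);

for my $line (@samples) {
    my ($name, $regnumber, $birthday) = parse_entry($line) or next;
    print "$name | $regnumber | $birthday\n";
}
```

The name is matched non-greedily, so it stops at the first place where
the rest of the pattern (prefix, digits, date) can match.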

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                        Bonnie Dalzell, MA
mail:5100 Hydes Rd PO Box 60, Hydes,MD,USA 21082-0060|EMAIL:bdalzell at qis.net
Freelance anatomist, vertebrate paleontologist, writer, illustrator, dog
breeder, computer nerd & iconoclast... Borzoi info at www.borzois.com.
HOME www.batw.net    ART bdalzellart.batw.net  BUSINESS www.boardingatwedge.com



