[Novalug] regexp problem in perl
Bonnie Dalzell
bdalzell at qis.net
Tue Aug 31 19:24:51 EDT 2010
If anyone has the time to help me and experience with regexps in Perl,
I have a problem that I have not been able to solve using web searches
and a day's worth of experimentation.
It involves extracting a variable number of digits from the middle of a
line of text.
Sometimes there will be 6 digits, sometimes 7, sometimes 8, sometimes 9.
I have a regexp that works over a range of two different numbers of digits
(such as 7 or 8), but it fails with a wider range.
The input line is from an online dog show catalog:
1 ||46|ATTAWAY-KINOBI SPRING FEVER. HM 82485718. 04-09-99
2dump--|----dog name---------------| |regnumber-| |-birth-|
The dog name can contain various non-alpha characters as well as letters:
hyphens, ampersands, single quotes.
Sometimes the date of birth is in a format like 02/03/00 or 02\03\00
rather than the hyphen format 02-03-00.
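The three date separators described above can be listed in an explicit character class. What follows is only a minimal sketch: the helper name normalize_birth is hypothetical, and it assumes both separators in a given date are the same character.

```perl
use strict;
use warnings;

# Hypothetical helper: find a two-digit-per-field date separated by a hyphen,
# a slash or a backslash, and return it in the hyphen form.
sub normalize_birth {
    my ($text) = @_;
    # [-/\\] lists the three separators explicitly; \2 requires the second
    # separator to be the same character as the first.
    if ($text =~ m{\b(\d\d)([-/\\])(\d\d)\2(\d\d)\b}) {
        return "$1-$3-$4";
    }
    return;    # no date found
}

for my $sample ('04-09-99', '02/03/00', '02\03\00') {
    my $date = normalize_birth($sample);
    print "$sample => $date\n";
}
```

Inside a character class "|" is a literal character rather than alternation, and \W also matches spaces and periods, so a class such as [\W|-] accepts more than the three separators.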
This regexp works if there are only 7 or 8 digits in regnumber:
if(
$line=~m/(\|.*)(\|)(.*)(\w{2}.*\d{7,8}\w*).(\s\d\d[\W|-]{1}\d\d[\W|-]{1}\d\d)/
){
yields:
$1 ||46 - to be ignored
$2 | -to be ignored
$3 ATTAWAY-KINOBI SPRING FEVER. = $name
$4 HM 82485718 = $regnumber
$5 04-09-99 = $birthday
If I try to give a wider range of digits by using \d{6,9}, it fails.
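One plausible explanation, not confirmed against the full catalog: \w matches digits as well as letters, and the (.*) in front of it is greedy. With \d{6,9} and an 8-digit number, the name group can swallow "HM " because "82" satisfies \w{2} and the remaining "485718" satisfies \d{6,9}, so $4 comes back as "82485718" without its prefix. With \d{7,8} those six leftover digits are too few, which forces the engine back to "HM". The sketch below pins each field down instead of relying on backtracking. The helper name parse_entry is hypothetical, the prefix of one to three capital letters is an assumption, and the field layout is taken from the single sample line.

```perl
use strict;
use warnings;

# Hypothetical helper; the field layout is assumed from the one sample line.
sub parse_entry {
    my ($line) = @_;
    return unless $line =~ m{
        \|                          # the last pipe before the name
        ([^|]+?)                    # name: anything except a pipe, kept short
        \s+
        ([A-Z]{1,3}\s*\d{6,9})      # registration: letters, then 6 to 9 digits
        \b\D*?                      # optional period or filler after the number
        \s
        (\d\d[-/\\]\d\d[-/\\]\d\d)  # birth date: hyphen, slash or backslash
    }x;
    return ($1, $2, $3);
}

my $line = '1 ||46|ATTAWAY-KINOBI SPRING FEVER. HM 82485718. 04-09-99';
my ($name, $regnumber, $birthday) = parse_entry($line);
print "$name\n$regnumber\n$birthday\n";
```

Because the registration prefix is restricted to letters and the name cannot contain a pipe, the digit count can range from 6 to 9 without the prefix drifting into the name.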
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Bonnie Dalzell, MA
mail:5100 Hydes Rd PO Box 60, Hydes,MD,USA 21082-0060|EMAIL:bdalzell at qis.net
Freelance anatomist, vertebrate paleontologist, writer, illustrator, dog
breeder, computer nerd & iconoclast... Borzoi info at www.borzois.com.
HOME www.batw.net ART bdalzellart.batw.net BUSINESS www.boardingatwedge.com