Split Meta Type Data With A Regex

Blnukem Source

I have an array that stores data like this:

<WebPage>
<Action>Action Goes Here 1</Action>
<SystemData>SystemData Goes Here 1</SystemData>
<PageSatausData>PageSatausData Goes Here 1</PageSatausData>
<PageNameData>PageNameData Goes Here 1</PageNameData>
<TitleData>TitleData Goes Here 1</TitleData>
<KeywordData>KeywordData Goes Here 1</KeywordData>
<DescriptionData>DescriptionData Goes Here 1</DescriptionData>
<HeaderData>HeaderData Goes Here 1</HeaderData>
<BodyData>BodyData Goes Here 1</BodyData>
<FooterData>FooterData Goes Here 1</FooterData>
</WebPage>
<WebPage>
<Action>Action Goes Here 2</Action>
<SystemData>SystemData Goes Here 2</SystemData>
<PageSatausData>PageSatausData Goes Here 2</PageSatausData>
<PageNameData>PageNameData Goes Here 2</PageNameData>
<TitleData>TitleData Goes Here 2</TitleData>
<KeywordData>KeywordData Goes Here 2</KeywordData>
<DescriptionData>DescriptionData Goes Here 2</DescriptionData>
<HeaderData>HeaderData Goes Here 2</HeaderData>
<BodyData>BodyData Goes Here 2</BodyData>
<FooterData>FooterData Goes Here 2</FooterData>
</WebPage>

What I'm trying to do is loop through it and assign a variable to each of the values, like this:

foreach my $Line (@Meta_Content) {

my($Var1,$Var2,$Var3,$Var4,$Var5,$Var6,$Var7,$Var8,$Var9,$Var10) = split (/\>\</,$Line,10);

print "Result: $Var1,$Var2,$Var3,$Var4,$Var5,$Var6,$Var7,$Var8,$Var9,$Var10<br>";
 }

So far, no luck. I'm aware of the XML modules, but in this case I need a regex to do it, so modules are not an option.

perl cgi

Answers

answered 1 week ago George Bouras #1

Here it is:

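#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my $hash;

while (<DATA>) {
    if (/<WebPage>/) {
        $hash = {};                  # opening tag: start a new record
    }
    elsif (/<\/WebPage>/) {
        print Dumper $hash;          # closing tag: the record is complete
    }
    elsif (/^<([^>]+)>(.+)<\/\1>\s*$/) {
        $hash->{$1} = $2;            # tag name => text between the tags
    }
}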

__DATA__
<WebPage>
<Action>Action Goes Here 1</Action>
<SystemData>SystemData Goes Here 1</SystemData>
<PageSatausData>PageSatausData Goes Here 1</PageSatausData>
<PageNameData>PageNameData Goes Here 1</PageNameData>
<TitleData>TitleData Goes Here 1</TitleData>
<KeywordData>KeywordData Goes Here 1</KeywordData>
<DescriptionData>DescriptionData Goes Here 1</DescriptionData>
<HeaderData>HeaderData Goes Here 1</HeaderData>
<BodyData>BodyData Goes Here 1</BodyData>
<FooterData>FooterData Goes Here 1</FooterData>
</WebPage>
<WebPage>
<Action>Action Goes Here 2</Action>
<SystemData>SystemData Goes Here 2</SystemData>
<PageSatausData>PageSatausData Goes Here 2</PageSatausData>
<PageNameData>PageNameData Goes Here 2</PageNameData>
<TitleData>TitleData Goes Here 2</TitleData>
<KeywordData>KeywordData Goes Here 2</KeywordData>
<DescriptionData>DescriptionData Goes Here 2</DescriptionData>
<HeaderData>HeaderData Goes Here 2</HeaderData>
<BodyData>BodyData Goes Here 2</BodyData>
<FooterData>FooterData Goes Here 2</FooterData>
</WebPage>
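
The same line-by-line idea can be pointed at the array from the question instead of DATA. What follows is only a sketch: it assumes each element of @Meta_Content holds one line of the data, and parse_pages is a hypothetical helper name, not something from the original post. It collects every record into a list of hash references so the values can be looked up by tag name after the loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turns a list of lines into a list of hash refs,
# one per <WebPage> ... </WebPage> block.
sub parse_pages {
    my @lines = @_;
    my (@pages, $page);
    for my $line (@lines) {
        if ($line =~ /<WebPage>/) {
            $page = {};                      # start a new record
        }
        elsif ($line =~ m{</WebPage>}) {
            push @pages, $page if $page;     # record is complete
            undef $page;
        }
        elsif ($page && $line =~ m{^\s*<([^<>/]+)>(.*)</\1>\s*$}) {
            $page->{$1} = $2;                # tag name => text between the tags
        }
    }
    return @pages;
}

# Sample input; assumed to hold one line per element.
my @Meta_Content = (
    '<WebPage>',
    '<Action>Action Goes Here 1</Action>',
    '<TitleData>TitleData Goes Here 1</TitleData>',
    '</WebPage>',
    '<WebPage>',
    '<Action>Action Goes Here 2</Action>',
    '<TitleData>TitleData Goes Here 2</TitleData>',
    '</WebPage>',
);

for my $page (parse_pages(@Meta_Content)) {
    print "Action: $page->{Action}, Title: $page->{TitleData}<br>\n";
}
```

Keying the values by tag name means the code does not depend on the order of the lines inside a record.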

answered 6 days ago 7stud #2

Instead of this:

my($Var1,$Var2,$Var3,$Var4,$Var5,$Var6,$Var7,$Var8,$Var9,$Var10) = split (/\>\</,$Line,10);

print "Result: $Var1,$Var2,$Var3,$Var4,$Var5,$Var6,$Var7,$Var8,$Var9,$Var10<br>";

you can write:

my @pieces = split(/\>\</, $Line, 10);
my $str = join ',', @pieces;
print "Result: $str<br>";

And if you need to refer to the individual items, you can write $pieces[0] instead of $Var1, $pieces[1] instead of $Var2, and so on.

See how much more succinct that is? Beginners in every language try what you did. The rule is: if you ever find yourself writing variable names that only differ by a number, then you should store the data in an array instead.
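
To make that rule concrete for this data, here is a rough sketch that pulls every value of one record into an array with a global regex match. It assumes the record has been joined into a single string, and field_values is a hypothetical helper name used only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the text of every <Tag>text</Tag> pair in $record.
sub field_values {
    my ($record) = @_;
    my @values;
    # \1 forces the closing tag to match the opening one; [^<]* keeps the
    # match from running across nested tags such as the outer <WebPage>.
    while ($record =~ m{<(\w+)>([^<]*)</\1>}g) {
        push @values, $2;
    }
    return @values;
}

# One record, assumed to be joined into a single string.
my $record = join '',
    '<WebPage>',
    '<Action>Action Goes Here 1</Action>',
    '<SystemData>SystemData Goes Here 1</SystemData>',
    '</WebPage>';

my @pieces = field_values($record);
print "Result: ", join(',', @pieces), "<br>\n";
print "First item: $pieces[0]<br>\n";
```

The array grows to however many fields the record has, so nothing needs renaming if a field is added later.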
