How Can I Parse Data to a Hash of Arrays in Perl?

To parse data into a hash of arrays, the data must first be in a format that Perl can read. A text file split into lines works well: if each line follows a well-defined format, Perl’s regex engine can extract the pieces of that format. The following example text file uses such a format:

data.txt

BOOK: Cat In The Hat
COLOR: Green
GAME: Dodgeball
COLOR: Red
BOOK: If You Give a Mouse a Cookie
GAME: Checkers
COLOR: Orange
GAME: Connect Four
BOOK: Charlie and the Chocolate Factory
GAME: Hide and Seek

In this example, each line contains a category name in capital letters followed by a ‘:’ character, then an item that belongs in that category. The following Perl code could be used to parse this file:

parse.prl

#!/usr/bin/perl
use strict;
use warnings;

my %data; # hash of arrays

# Parse the input file line-by-line. Push each item into the array
# associated with its category.
open my $file, '<', 'data.txt' or die "Cannot open data.txt: $!";
while (my $line = <$file>) {
    if ($line =~ /^([A-Z]+):[ \t]*(.*)/) {
        push @{ $data{$1} }, $2;
    }
}
close $file;

# Print out all items in the GAME category
print "Games:\n";
foreach my $item (@{ $data{GAME} }) {
    print " * $item\n";
}

This code first declares an empty hash to store the data. It then reads the input file line by line and processes each line that matches the regex pattern. For each matching line, it looks up the hash entry for the category, dereferences it as an array (Perl creates the array automatically the first time a category appears, a behavior known as autovivification), and pushes the new item onto it. After the file is closed, the hash of arrays is ready to use. The example prints all of the items stored under one key, GAME. The expected output is:

> ./parse.prl
Games:
 * Dodgeball
 * Checkers
 * Connect Four
 * Hide and Seek
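
The same technique can list every category rather than only GAME. The following is a minimal sketch, not part of the original script: the helper name parse_categories is hypothetical, the sample input is a shortened copy of data.txt held in a string so the example runs on its own, and the regex is slightly stricter than the one above (anchored at the start of the line, with trailing whitespace trimmed).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): read "CATEGORY: item"
# lines from an open filehandle and return a reference to a hash of arrays.
sub parse_categories {
    my ($fh) = @_;
    my %data;
    while (my $line = <$fh>) {
        # Skip blank lines and anything that does not look like "KEY: value".
        next unless $line =~ /^([A-Z]+):[ \t]*(.*?)\s*$/;
        # $data{$1} does not need to exist yet: dereferencing it as an
        # array autovivifies an empty array reference on first use.
        push @{ $data{$1} }, $2;
    }
    return \%data;
}

# Sample input held in a string so the sketch runs without data.txt.
my $text = <<'END';
BOOK: Cat In The Hat
COLOR: Green
GAME: Dodgeball

COLOR: Red
GAME: Checkers
END

# Opening a reference to a scalar gives an in-memory filehandle.
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $data = parse_categories($fh);
close $fh;

# Walk every category in sorted order and list its items.
foreach my $category (sort keys %$data) {
    print "$category:\n";
    print " * $_\n" for @{ $data->{$category} };
}
```

Because the helper returns a hash reference, items are reached with the arrow syntax, for example @{ $data->{GAME} }. Sorting the keys keeps the output order stable, since Perl does not guarantee any particular order for hash keys.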