LinuxChix Perl Course Part 12: Hashes

1) Introduction
2) An Example
3) Hashes and Lists
4) Keys and Values
5) Exercises
6) Answer to Previous Exercise
7) Acknowledgements
8) Licensing


----------------------------------------

1) Introduction

Recall that there are four basic data types in Perl:
a) scalar - A single chunk of data
b) array - A bunch of scalars, indexed by a number
c) hash (hashtable) - A bunch of scalars, indexed by another scalar
d) handle - A pointer enabling the opening of resources from the
operating system (files, directories, etc.)

We've seen three of these four, so there's only one to go: hashes. I
assume that you've already been introduced to hashtables, but if you
haven't, you should still be able to follow. Don't be afraid to ask
the list for examples of when hashtables are useful.

----------------------------------------

2) An Example

The following is an example program.

#!/usr/bin/perl -w
use strict;

my %capitals; # percent sign = hash
$capitals{'England'} = 'London';
$capitals{China} = 'Beijing'; # No quotes around China. OK.

print "Enter a country. ";
chomp(my $country = <STDIN>);

if ( defined( $capitals{$country} ) ) {
    print "The capital of $country is $capitals{$country}\n";
}
else {
    print "I don't know that country.\n";
}
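
The example above tests the key with "defined". As a minimal sketch,
assuming you might one day store an undefined value in a hash: the
built-in "exists" function asks whether a key is present at all, while
"defined" asks whether its value is defined. The "Atlantis" entry is
made up purely for illustration.

```perl
use strict;
use warnings;

my %capitals = (
    England  => 'London',
    Atlantis => undef,   # hypothetical entry: key is present, value is not
);

# Compare what exists() and defined() report for each country.
foreach my $country ('England', 'Atlantis', 'France') {
    my $exists  = exists  $capitals{$country} ? 'yes' : 'no';
    my $defined = defined $capitals{$country} ? 'yes' : 'no';
    print "$country: exists=$exists defined=$defined\n";
}
```

For a hash like %capitals, where every stored value is defined, the
two tests give the same answer.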

----------------------------------------

3) Hashes and Lists

If a hash is assigned to or from a list, the items alternate key1,
value1, key2, value2...

my %h = ('a', 1, 'b', 2, 'c', 3);
print $h{b}; # Output is: 2

To make the association more apparent, we can use the "=>" operator,
also called the "fat comma":

my %h = ('a' => 1, 'b' => 2, 'c' => 3);
print $h{b}; # Output is: 2

However, apart from the key-value pairing, a hash has no order:

my %h = ('a' => 1, 'b' => 2, 'c' => 3);
my @a = %h; # No problem, but order is lost.
print $a[0]; # It will be one of a,b,c - but we don't know which!
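
The following sketch illustrates the point above: the flattened list
has no guaranteed order, but each key is still followed by its own
value, so assigning the list back to a hash rebuilds the same pairs.
Sorting the keys is one way (an assumption about what you want, not
the only option) to get predictable output. The names @flat and %copy
are just illustrative.

```perl
use strict;
use warnings;

my %h = ('a' => 1, 'b' => 2, 'c' => 3);

my @flat = %h;      # six items: each key immediately followed by its value
my %copy = @flat;   # assigning the list back rebuilds the same pairs

print scalar(@flat), " items in the list\n";
print "b is still $copy{b}\n";

# Sorting the keys gives a predictable order for output.
foreach my $k ( sort keys %h ) {
    print "$k => $h{$k}\n";
}
```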

----------------------------------------

4) Keys and Values

The "keys" function returns the list of keys in a hashtable:

foreach my $k ( keys(%hash) ) {
    print "$k : $hash{$k}\n";
}

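The less frequently used "values" function returns the list of values
in the hashtable. As long as the hash is not modified in between,
"values" returns them in the same order in which "keys" returns the
corresponding keys.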
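
As a small sketch of "values" in use, the loop below adds up every
value in a hash without looking at the keys. The hash contents and the
variable $total are made up for illustration.

```perl
use strict;
use warnings;

my %hash = ('a' => 1, 'b' => 2, 'c' => 3);

# Walk over the values only; the keys are not needed here.
my $total = 0;
foreach my $v ( values(%hash) ) {
    $total += $v;
}
print "Sum of values: $total\n";   # Output is: Sum of values: 6
```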

----------------------------------------

5) Exercises

a) Write a program that reads /etc/passwd and creates a mapping of
usernames to UIDs. Test by inviting the user to enter a username and
outputting the UID.

b) Write a Perl program that reads standard input and removes
duplicate lines (so if you enter "foo", then "bar", then "foo" again,
the second "foo" is not output). Remember: There Is More Than One Way
To Do It (TIMTOWTDI).

c) The "reverse" function reverses the order of an array:

@arr = reverse(@arr);

What happens if you use "reverse" on a hash?

d) In our examples we haven't used numbers as hash keys, even though
Perl allows it. If you are indexing by number, you would normally use
an array rather than a hash. However, there are cases when a hash
with integer keys is more appropriate than an array. Can you name
such a case?

e) The original BASIC interpreter might be considered a primitive
IDE, because it was both an editor and an interpreter. You might use
it something like this (with "user" and "comp" added for clarity):

user> 10 PRINT "What is your name?";
comp> Ok

user> 20 INPUT NAME$
comp> Ok

user> 30 PRINT "Hello "; NAME$
comp> Ok

user> LIST
comp> 10 PRINT "What is your name?"
comp> 20 INPUT NAME$
comp> 30 PRINT "Hello "; NAME$

user> RUN
comp> What is your name?
user> Patricia
comp> Hello Patricia
comp> Ok

user> SAVE "hello.bas"
comp> Ok

A user command that begins with a number is a line of the program.
Programs are always executed in order by line number, regardless of
the order in which the lines were entered. Each line entered
overwrites any line with the same number that may already exist. This
was one way of editing programs before we had visual editors.

Write your own subset of the BASIC interpreter. The choice of
commands is up to you, but I recommend PRINT, INPUT, LIST and (of
course) RUN.

Hint 1: There are several good uses for hashes here, not the least of
which is mapping line numbers to lines.

Hint 2: Though you can require the user to enter uppercase if you
want, a better solution is to use the "uc" function and/or "m//i".

Hint 3: Remember that to sort a list in numerical order, you use
"sort( {$a<=>$b} @array )"
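
To see why Hint 3 matters, here is a short sketch comparing the
default sort (which compares items as strings) with a numerical sort.
The sample numbers are arbitrary.

```perl
use strict;
use warnings;

my @array = (100, 9, 20);

my @as_strings = sort @array;                 # string comparison
my @as_numbers = sort { $a <=> $b } @array;   # numerical comparison

print "@as_strings\n";   # Output is: 100 20 9
print "@as_numbers\n";   # Output is: 9 20 100
```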

----------------------------------------

6) Answer to Previous Exercise

a) This program processes a CSV file, including sorting rows by date.

#!/usr/bin/perl -w
use strict;

open MY_INPUT, "< input.csv"
    or die "Couldn't open input file: $!";
my @lines = <MY_INPUT>; # Read whole file.
close MY_INPUT;

foreach my $line (@lines) {
    # Split line into fields.
    my @fields = split /,/, $line;

    # Decide whether amount should be negative.
    my $amount = $fields[1];
    $amount = -$amount if $fields[0] !~ /deposit/i;

    # Reconstruct fields, with date at the beginning
    # (for sorting) and new amount inserted in the middle.
    @fields = ( $fields[2], @fields[0,1], $amount,
                @fields[2..$#fields] );
    $line = join(',', @fields);
}

# Sort by date (which is currently at the beginning).
@lines = sort @lines;

# Fast way of removing date from the beginning.
s/^.*?,// for (@lines);

open MY_OUTPUT, "> output.csv"
    or die "Couldn't open output file: $!";
print MY_OUTPUT @lines; # Write whole file (with newlines).
close MY_OUTPUT;

----------------------------------------

7) Acknowledgements

A big thank you to Jacinta Richardson for suggestions and
proofreading. More advanced Perl users might want to check out the
free material from Perl Training Australia
<http://www.perltraining.com.au/>, which she is a part of.

Other contributors include Meryll Larkin.

----------------------------------------

8) Licensing

This course (i.e., all parts of it) is copyright 2003-2005 by Dan
Richter and Alice Wood, and is released under the same license as
Perl itself (Artistic License or GPL, your choice). This dual license
is the usual choice for Perl code and documentation, because it makes
it easy for other people to integrate the material into their own
projects. It is not generally used in projects unrelated to Perl.