14 Part 13: Functions

LinuxChix Perl Course Part 13: Functions

1) Introduction
2) A Simple Example
3) A More Complicated Example
4) Exercises
5) Answers to Previous Exercises
6) Acknowledgements
7) Licensing


----------------------------------------

1) Introduction

As we write larger and larger programs, the need for functions
becomes apparent. This week we're (finally) going to see how to write
functions in Perl.

----------------------------------------

2) A Simple Example

We're going to start with a simple example that returns whatever is
passed in to it.

sub identity_function {
    return @_;    # @_ is array of parameters.
}

my @result = identity_function('foo', 'bar', 'baz');
print "@result\n";

Notice that the function accepts any number of parameters (in the
form of an array, namely "@_") and returns any number of values
(in the form of a list). Try calling it like this:

my @arr = ('b', 'c', 'd');
my @result = identity_function('a', @arr, 'e');
print "@result\n"; # Output is: a b c d e

As you can see, the array is expanded when passed in. There is no way
to pass in or return two distinct arrays; they are always merged.
However, when we learn about references, we'll see that it's possible
to pass in or return references to two distinct arrays.
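
As a small preview of that later lesson, here is a minimal sketch of
keeping two arrays separate by passing references. The function name
"count_both" is made up for illustration, and the backslash and "@$"
syntax will be explained properly when we get to references.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: takes two array references and reports
# how many elements each array holds.
sub count_both {
    my ($first, $second) = @_;    # Two scalars, each a reference.
    return (scalar(@$first), scalar(@$second));
}

my @odds  = (1, 3, 5);
my @evens = (2, 4);
my ($n_odds, $n_evens) = count_both(\@odds, \@evens);
print "$n_odds $n_evens\n";    # Output is: 3 2
```

Because each reference is a single scalar, the two arrays are not
merged inside "@_".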

Perl provides wonderful flexibility in allowing any number of
parameters to be passed in or returned. USE IT! For example, if you
write a "max" function to return the largest number, don't limit the
user to two: accept any number of parameters.
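
Here is one way such a "max" function might look. It is only a
sketch: returning undef when no parameters are given is an
assumption made for this example, not something the course
prescribes.

```perl
#!/usr/bin/perl -w
use strict;

# A sketch of a "max" that accepts any number of numeric arguments.
# Returns undef if called with no arguments (an assumption).
sub max {
    my $largest = shift;           # Start with the first parameter.
    foreach my $number ( @_ ) {    # Compare against the rest.
        $largest = $number if $number > $largest;
    }
    return $largest;
}

print max(3, 17, 5), "\n";       # Output is: 17
my @values = (-4, -2, -9);
print max(@values, -1), "\n";    # Output is: -1
```

Note that the second call mixes an array and a scalar; both are
flattened into "@_", so the function never needs to know the
difference.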

----------------------------------------

3) A More Complicated Example

Now let's try an example that's a little more interesting. In Perl
there is no command to perform an s/// operation on an array, so
here's a function to do just that.

#!/usr/bin/perl -w
use strict;

sub array_s {
    my $find = shift;           # Get first parameter.
    my $replacement = shift;    # Get second parameter.

    foreach ( @_ ) {               # Now do the work.
        s/$find/$replacement/g;    # s///g = Global replacement.
    }

    # No return statement: change was made to input params.
}

my $x = 'the';
my $y = 'good';
my $z = 'samaritan';
array_s('[aeiou]', '_', $x, $y, $z);

print "$x $y $z\n";

A few things to note:
a) "shift" is applied to "@_" if you provide it with no arguments
inside a function. This has an effect similar to the "shift"
statement in shell scripts.
b) A change to an element of "@_" also changes the corresponding
variable passed in. (In other words, non-literals are passed by
reference.)
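
Both points can be seen in isolation in the following small sketch.
The function names "double_in_place" and "count_after_shift" are
made up for this example.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical function: doubles its first argument in place
# by assigning to an element of @_.
sub double_in_place {
    $_[0] = $_[0] * 2;    # $_[0] is an alias for the caller's variable.
}

# Hypothetical function: shows "shift" removing an item from @_.
sub count_after_shift {
    my $first = shift;    # Same as: shift(@_)
    return scalar(@_);    # Number of parameters left over.
}

my $n = 21;
double_in_place($n);
print "$n\n";                                    # Output is: 42
print count_after_shift('a', 'b', 'c'), "\n";    # Output is: 2
```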

I don't actually recommend changing "@_"; it would probably be better
to return the result instead. But Perl gives you lots of options, and
one of those options is changing what's passed in.
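
For comparison, here is a sketch of a variant that returns the
result and leaves its arguments alone. The name "array_s_copy" is
made up for this example.

```perl
#!/usr/bin/perl -w
use strict;

# Like array_s, but works on copies and returns them,
# leaving the caller's variables untouched.
sub array_s_copy {
    my $find        = shift;
    my $replacement = shift;
    my @copies      = @_;      # Copy, so @_ is not modified.

    foreach ( @copies ) {
        s/$find/$replacement/g;
    }
    return @copies;
}

my $x = 'the';
my @result = array_s_copy('[aeiou]', '_', $x, 'good', 'samaritan');
print "$x\n";         # Output is: the
print "@result\n";    # Output is: th_ g__d s_m_r_t_n
```

A side benefit of copying is that literal strings can be passed in
safely, since nothing tries to modify them.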

By the way, in Perl 4 we would have called the function with a
preceding ampersand, like this:

&array_s('[aeiou]', '_', $x, $y, $z);

This is no longer necessary in Perl 5, but you may see it in legacy
code, and the ampersand is associated with functions in other ways
which we'll see later.

----------------------------------------

4) Exercises

a) The "reverse" function, when used in scalar context, reverses the
order of the characters in a string. It also acts on "$_", but we
still can't do this:

@arr = map(reverse, @arr); # Doesn't work.

Write a "strictly_scalar_reverse" function that returns all the
elements of an array written backwards. So
"strictly_scalar_reverse('foo','bar');" will return "'oof','rab'".

Bonus credit to anyone who can determine why using "map" doesn't
work, and double bonus if you find a work-around.

b) Consider the following code:

sub foo($$$) {
    print "You called foo with arguments @_\n";
}

What happens if you call "foo" with exactly three scalar parameters?
What happens if you call it with more or fewer? What happens if you
call it with an array of exactly three elements?

c) Consider the following code:

sub bar {
    if ( wantarray ) {
        return ('a', 'lot', 'of', 'scalars');
    }
    else {
        return 'one';
    }
}

my @x = bar;
my $y = bar;
print "@x\n";
print "$y\n";

What does the "wantarray" function do?

----------------------------------------

5) Answers to Previous Exercises

a) The following program uses /etc/passwd to map usernames to UIDs.
Note that it takes advantage of "$_", which will no doubt delight
some and disgust others.

#!/usr/bin/perl -w
use strict;

my %map;

open PASSWD, "< /etc/passwd" or die "Couldn't open file: $!";
while ( <PASSWD> ) {
    if ( /^(.*?):.*?:(\d+):/ ) {
        $map{$1} = $2;
    }
    else {
        die "Encountered weird data in password file.";
    }
}
close PASSWD;

print "Enter a username and I'll give you a UID.\n";

while ( my $username = <STDIN> ) {
    chomp($username);
    if ( defined($map{$username}) ) {
        print "User $username has UID $map{$username}\n";
    }
    else {
        print "Sorry: I don't know that user.\n";
    }
}

b) The right way to write a program that removes duplicate lines is
to store previous lines as keys in a hash:

#!/usr/bin/perl -w
use strict;

my %previous;
while ( my $line = <STDIN> ) {
    if ( ! defined( $previous{$line} ) ) {
        $previous{$line} = 1;    # 1 is a dummy value.
        print $line;             # No newline: wasn't chomped.
    }
}

Of course, There Is More Than One Way To Do It (TIMTOWTDI). A less
elegant way is to store previous lines in an array and "grep" for
them. This is much less efficient because it doesn't take advantage
of the fast lookup abilities of a hashtable. But here's an example of
how to do it.

#!/usr/bin/perl -w
use strict;

#
# A much less efficient way to do the same thing.
#

my @previous;
while ( my $line = <STDIN> ) {
    if ( ! grep {$_ eq $line} @previous ) {
        push(@previous, $line);
        print $line;    # No need for \n : wasn't chomped.
    }
}

c) Calling "reverse" on a hash causes keys to become values and
values to become keys.
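
A minimal sketch of this, using made-up usernames and UIDs:

```perl
#!/usr/bin/perl -w
use strict;

my %uid_for  = ( root => 0, alice => 1000 );    # Made-up sample data.
my %user_for = reverse %uid_for;                # Keys and values swap.

print "$user_for{1000}\n";    # Output is: alice
print "$user_for{0}\n";       # Output is: root
```

Be aware that if two keys share the same value, only one of them
survives the reversal, because hash keys must be unique.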

d) A hash is more appropriate than an array for indexing by number if
the indices are sparsely populated. For example, the first exercise
was to map usernames to UIDs, but mapping UIDs to usernames would
also be an appropriate use of a hash, because a system with only a
few users will usually have UIDs over 500 (sometimes over 1000). You
could store these in an array, but a lot of space (memory) would be
wasted, so a hash is probably a better choice. The following exercise
uses a hash to map line numbers to commands in BASIC for the same
reason.
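
The difference is easy to see in a small sketch. The UID and
username below are made up, and the variable names are chosen only
for this example.

```perl
#!/usr/bin/perl -w
use strict;

my @name_by_index;
my %name_by_key;

# Store one username under UID 1000 in each structure.
$name_by_index[1000] = 'alice';
$name_by_key{1000}   = 'alice';

print scalar(@name_by_index), "\n";       # Output is: 1001
print scalar(keys %name_by_key), "\n";    # Output is: 1
```

The array has grown to 1001 elements to hold a single value, while
the hash holds exactly one entry.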

e) Here's the BASIC interpreter I came up with. It supports the
PRINT, INPUT, LIST, RUN, SAVE and QUIT commands, though it's very
light on error checking.

#!/usr/bin/perl -w
use strict;

my %code; # Mapping of lines to commands.
my %vars; # Mapping of variables to values.

while ( my $line = <STDIN> ) {

    # Convert to uppercase, except stuff in quotes.
    # Adding a space prevents split from dropping last quote.
    my @chunks = split /"/, $line . ' ';
    for (my $i=0; $i<@chunks; $i+=2) {
        $chunks[$i] = uc($chunks[$i]);
    }
    $line = join('"', @chunks);

    $line =~ s/^\s*|\s*$//g;    # Remove white space.

    # User entered a line of code (starts with number).
    if ( $line =~ /^(\d+)\s*(.*)$/ ) {
        $code{$1} = $2;
        delete $code{$1} if $2 eq '';    # Blank line = delete.
        print "Ok\n";
    }

    # User asked for a LIST of his code.
    elsif ( $line =~ /^LIST\s*(\d*)$/ ) {
        my @line_nums = ($1 ne '' ? $1 : keys(%code));
        @line_nums = sort( {$a <=> $b} @line_nums);
        foreach my $num (@line_nums) {
            print "$num $code{$num}\n";
        }
        print "Ok\n";
    }

    # User asked to RUN his code.
    elsif ( $line eq 'RUN' ) {
        my @line_nums = sort( {$a <=> $b} keys(%code));
        foreach my $num (@line_nums) {
            my($command,$rest) = split( /\s+/, $code{$num}, 2 );
            #print "Debug: $command|$rest\n";
            if ( $command eq 'INPUT' ) {
                print '? ';
                chomp($vars{$rest} = <STDIN>);
            }
            elsif ( $command eq 'PRINT' ) {
                # A PRINT takes args separated by semicolons.
                my $ends_in_semicolon = ($rest =~ /;\s*$/);
                while ( defined($rest) && $rest ne '' ) {
                    if ($rest =~ /^"/) {    # If it's a string.
                        (my $string, $rest) = ($rest =~
                            /^"(.*?)"\s*;?\s*(.*)$/);
                        print $string;
                    }
                    else {
                        (my $var, undef, $rest) = ($rest =~
                            /^(.*?)\s*(;\s*(.*)|$)/);
                        print $vars{$var} if defined($vars{$var});
                    }
                }
                # A PRINT command ending in a semicolon gets no newline.
                print "\n" unless $ends_in_semicolon;
            }
            else {
                print "Unrecognised command at line $num: $command\n";
            }
        }
    }

    # User asked to SAVE the program.
    elsif ( $line =~ /^SAVE\s+"(.*)"$/ ) {
        my $ok = open OUTPUT, "> $1";
        if ( $ok ) {
            # Sort numerically so the file is in line-number order.
            foreach my $number (sort( {$a <=> $b} keys(%code))) {
                print OUTPUT "$number $code{$number}\n";
            }
            close OUTPUT;
            print "Ok\n";
        }
        else { warn "Couldn't write $1: $!\n"; }
    }

    elsif ( $line eq 'QUIT' ) {
        print "Ok\n";
        exit;
    }

    elsif ( $line eq '' ) {
        # Do nothing.
    }

    else {
        print "Unrecognised command\n";
    }
}

As you can see, there's a lot of functionality in a little code. We
might be tempted to think that we could easily add more features and
write a real BASIC interpreter, but it turns out that we're already
approaching the limits of what Perl can do easily. (For example, we
haven't considered assignments to variables or
mathematical expressions.) Perl's power comes from manipulating text
without worrying about the details, but in a programming language you
MUST worry about the details. You could write a
parser/lexer/interpreter in Perl, but it turns out that it wouldn't
be any easier than implementing it in any other language.

----------------------------------------

6) Acknowledgements

A big thank you to Jacinta Richardson for suggestions and
proofreading. More advanced Perl users might want to check out the
free material from Perl Training Australia
<http://www.perltraining.com.au/>, which she is a part of.

Other contributors include Meryll Larkin.

----------------------------------------

7) Licensing

This course (i.e., all parts of it) is copyright 2003-2005 by Dan
Richter and Alice Wood, and is released under the same license as
Perl itself (Artistic License or GPL, your choice). This is the
license of choice to make it easy for other people to integrate your
Perl code/documentation into their own projects. It is not generally
used in projects unrelated to Perl.