'''''An Introduction to Perl Modules'''''
'''Writing, Documenting and Testing Object-Oriented Perl Modules'''
Author:
''Madison Kelly, mkelly@alteeve.ca''
Date:
''October 29, 2009''
Shameless Plug:
''http://alteeve.ca''
= Welcome! =
This talk is essentially a condensed, hopefully more approachable version of 'perlmod', 'perlobj' and 'perltoot'.
For years, Perl Modules seemed like a mysterious, inaccessible realm best left to perl gurus. This kept me from even considering writing them. It wasn't until I had to read through the source of a set of modules that I realized how, well, accessible they are.
I want to present this talk to help other perl developers see how powerful perl modules are, and how easy it is to write them.
For reference, the modules I will introduce here are Object Oriented. Modules do not need to be so, however.
= So Then, What is a Perl Module, Anyway? =
A perl module is, at its most basic, a collection of methods stored in a file that can be loaded by a program to provide a certain set of functions. It is very similar to how we would load a library of functions. There is a little magic in perl modules, but nothing crazy.
== That's Nice, So Why not just use a library? ==
Perl modules are designed to be very portable blocks of code. In truth, a well written library is not really different from a module, but modules are generally designed to stand alone from the programs that use them.
== So What is the Difference? ==
Three main things:
* Modules have their scope defined via the 'package NAME' command. Any variables and subroutines following the 'package' operator are considered to be in the package's name space. The scope refers to code in the same block, same file or same eval, same as how the 'my' operator defines a variable's scope.
* Modules generally have a "constructor method", usually a special subroutine called 'new'. This sets up the module for use by a program and returns a handle to the module, through which its methods are accessed.
* Module constructor methods receive, as their first argument, the name of the package. This is also the package's "name space". Secondly, they usually 'bless' a variable, which simply marks that reference as being inside the name space of the package.
If this seems overwhelming just now, please don't worry. It will make more sense once the examples start flowing.
= Let's Get Started =
Let's start with a simple module that does simple math.
== Name Space ==
First up, I need a name for my modules. I don't want to risk stepping on any existing module names, so I hopped over to '[http://cpan.org CPAN]' and did a search for names that interest me. In my case, I like the acronym 'AN' for my personal projects, so I would like to name my first module 'AN::Tut::Sample1'. I don't see anything using the 'AN::' root, so I will use it.
There are no special rules for naming your modules. Use whatever appeals to you and fits your project. However, if you want your module to eventually be hosted on CPAN, you may need to think twice about your module name. Specifically, the "top level". CPAN is reluctant to give out new top-level name space and would rather you use an existing top-level name space like 'App::*', 'Net::*' or similar. My choice of 'AN::*' is, admittedly, a little risky and could well bite me one day.
This is a good time to mention the double-colon you see in module names. When perl goes looking for the module's file, it treats this as nothing more than a directory delimiter. It's meant to be more portable between differing operating systems. So, to use my 'AN::Tut::Sample1' module name as an example, this would translate to:
AN/Tut/Sample1.pm
On *nix systems or:
AN\Tut\Sample1.pm
On Microsoft systems.
When you use your module, you do not need to specify the '.pm' at the end of the actual module name. This is because perl knows to look for module files with this extension in the directories listed in its '@INC' variable.
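The name-to-file mapping described above can be sketched in a few lines. This is only an illustration of the idea, not perl's own internal code, and the variable names are made up for the example.

```perl
use strict;
use warnings;

# Turn a module name into the relative file name perl would search for.
my $module = "AN::Tut::Sample1";
(my $file = $module) =~ s{::}{/}g;   # swap each '::' for a directory separator
$file .= ".pm";                      # module files carry the '.pm' extension

print "$module -> $file\n";
```

Perl would then look for that relative file name under each directory in '@INC' in turn.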
== Directory and Files ==
Before we begin, we need to decide where to store our files. Perl maintains an internal variable called '@INC' which contains an array of directories. When you use the 'use ...' command to load a module, perl steps through this array looking for the module or library you are asking for. We will use:
/usr/share/perl5/
Any new directories will be under this one. So our first example will be in the directory:
/usr/share/perl5/AN/Tut/Sample1.pm
If you want to store your module in a directory not in '@INC', you can do so. Simply have your calling script push the directory containing your module into the '@INC' array in a 'BEGIN {}' block so that it can be found before the interpreter goes looking for it.
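As a minimal sketch of that approach, assuming a made-up directory '/home/me/perl_lib' that would hold 'AN/Tut/Sample1.pm':

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Runs at compile time, before any later 'use' lines are processed.
# The directory below is hypothetical; substitute your own.
BEGIN
{
    push @INC, "/home/me/perl_lib";
}

# A 'use AN::Tut::Sample1;' placed here would now also search that directory.
foreach my $dir (@INC)
{
    print "Search path: [$dir]\n";
}
```

The standard 'use lib "/home/me/perl_lib";' pragma does much the same job in one line, though it places the directory at the front of '@INC' rather than the end.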
== Some Lingo ==
There are a few terms that are worth covering now.
* '''Reference''':
** A reference is simply a pointer, stored in a scalar (string) variable, to another scalar variable, an array, a hash or a code block. See 'perldoc perlref'.
* '''Function''':
** A function is another name for a subroutine, generally used for perl's various built-in functions. I often use the term function for my own subroutines out of (bad?) habit. See 'perldoc perlsub'.
* '''Library''':
** This is simply a standard file with a collection of subroutines. Generally their extension is '.lib' and they must end with '1;'.
* '''Module''':
** A module is a special type of library. Exactly how it is special will be explained below. See 'perldoc perlmod'.
* '''Package''':
** A package is, essentially, a block of code with a specified name space. The scope of a package's name space is the same as the scope used by 'my ...'. See 'perldoc -f package'.
* '''Method''':
** A method is simply a subroutine in a module. The only difference is that a method expects an object reference or a package name as its first argument. Which it gets depends on how it was called. See the first part of 'perldoc perlobj'.
* '''Class''':
** A class is simply a package with a set of methods in its scope.
* '''Object''':
** An object is simply a reference with the class it belongs to prefixed onto the front of the reference. See 'perldoc perlobj'.
* '''bless''':
** This is actually a function whose sole purpose is to associate a reference with the package it is in. What it actually does is take the package name and prefix it to the reference. See 'perldoc -f bless'.
If this seems a little vague just now, don't worry. Each item will be shown below, one step at a time.
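To make a few of these terms concrete, here is a small sketch showing a plain reference and what 'bless' does to it. The package name 'My::Demo' and the hash contents are invented for the example, and the addresses printed will differ on your system.

```perl
use strict;
use warnings;

# A reference: a scalar that points at another piece of data.
my %colours = (sky => "blue");
my $ref     = \%colours;
print "Plain reference:   [$ref]\n";           # looks like HASH(0x...)
print "Value via the ref: [$ref->{sky}]\n";

# 'bless' associates the reference with a package (class) name.
bless($ref, "My::Demo");
print "Blessed reference: [$ref]\n";           # now prefixed with My::Demo=
print "ref() reports:     [", ref($ref), "]\n";
```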
= Our First Module; AN::Tut::Sample1 =
Let's start off with a very simple module that provides two methods:
* A constructor method.
* A method that takes two numbers, adds them and returns the result.
== Our First Script; sample1.pl ==
We need a simple script to load and call our method. Let me show you a completed script and the completed module, then we will step through them to show how they work.
It would probably help to copy these into two files on your own system so that you can play along.
This is the normal perl script that we will use to call our module.
== Our First Module; Sample1.pm ==
This is the actual module. It contains only two methods; the 'new' constructor method and the 'add' method.
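As a rough sketch of what such a module and script could look like, here both are combined into one file so that the example runs on its own. This is a reconstruction based on the output shown later in this section, not the original files; the names '$via_object' and '$via_class' are made up for the example. Split across two files, the module part would live in 'AN/Tut/Sample1.pm' and end with '1;', and the script would begin with 'use AN::Tut::Sample1;'.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# --- The part that would live in AN/Tut/Sample1.pm ---
{
    package AN::Tut::Sample1;

    BEGIN
    {
        our $VERSION = "0.1.001";
    }

    # The constructor; it prints what it sees at each step.
    sub new
    {
        my $class = shift;
        print "The first argument passed into my constructor method is the 'class': [$class]\n";

        my $self = {};
        print "This is what the simple hash reference 'self' looks like at first: [$self]\n";

        bless($self, $class);
        print "This is what the hash reference 'self' looks like after being 'bless'ed into this class: [$self]\n";

        return ($self);
    }

    # Adds two numbers and returns the sum.
    sub add
    {
        my $self = shift;
        print "The first argument passed into my 'add' method: [$self]\n";

        my ($num1, $num2) = @_;
        return ($num1 + $num2);
    }
}

# --- The part that would live in sample1.pl ---
package main;

print "Here is what happens when I call the 'new' constructor method.\n";
my $an = AN::Tut::Sample1->new();
print "Now, this is what my 'an' object looks like: [$an]\n";

# Call 'add' via the object returned by the constructor.
my $via_object = $an->add(2, 2);
print "2 + 2 = [$via_object]\n";

# Call 'add' directly, via the class name.
my $via_class = AN::Tut::Sample1->add(2, 2);
print "2 + 2 = [$via_class]\n";
```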
== Step Through the Module; 'AN::Tut::Sample1' ==
Let's start by stepping through the 'AN::Tut::Sample1' file and talk about each section.
=== Function: package ===
The first line is the 'package' function.
This function sets the scope of the 'AN::Tut::Sample1' package. It is traditionally at the top of the module file, but doesn't strictly need to be. Anything within the scope of the 'package' call will be loaded by the calling script. For example, let's say I had something like this:
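A minimal sketch of such a layout, assuming the block form of 'package' scoping (the bodies of 'new' and 'blah' are placeholders invented for the example):

```perl
use strict;
use warnings;

{
    package AN::Tut::Sample1;

    # Inside the package's scope, so this is one of its methods.
    sub new
    {
        my $class = shift;
        return (bless({}, $class));
    }
}

# Outside the block, so this lands in the default 'main' package instead.
sub blah
{
    return ("blah");
}

print "Sample1 has 'new':  [", (AN::Tut::Sample1->can("new")  ? "yes" : "no"), "]\n";
print "Sample1 has 'blah': [", (AN::Tut::Sample1->can("blah") ? "yes" : "no"), "]\n";
```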
The subroutine 'blah' would not be a method available in the 'AN::Tut::Sample1' package as it is outside the package scope.
=== The 'BEGIN' Block ===
At this time, this isn't really needed, but I like to be in the habit of setting versions in my modules. This will be useful later if you want to ensure that a script is using a particular version of a module (or higher).
=== Setting up the Environment ===
The only thing of note here is the 'use Carp' line, which loads the 'Carp' module. This allows me to have the module 'die' (croak) or 'warn' (carp) in a way friendlier for the calling program. Please see 'perldoc Carp' for details.
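To show what "friendlier" means in practice, here is a small sketch using 'croak'. The package 'My::Demo' and its 'halve' method are hypothetical and exist only for this example. The error is reported at the line of the caller, not at the line inside the module where 'croak' sits.

```perl
use strict;
use warnings;

{
    package My::Demo;      # made-up package name for this example
    use Carp;

    sub halve
    {
        my ($class, $number) = @_;
        # Die, but blame the line in the calling code.
        croak "halve() needs a number" if not defined $number;
        return ($number / 2);
    }
}

# 'eval' traps the fatal error so the example can print it and carry on.
my $result = eval { My::Demo->halve() };
my $error  = $@;
chomp($error);
print "Error seen by the caller: [$error]\n";
```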
=== The Constructor Method ===
The constructor method is useful for storing values, setting options and so on within a given invocation of the module. We will see in the next example how we could use the constructor to record how many times a method is called and store the last arguments sent to a method. For now though, let's keep it simple.
Modules do not need this, per se, as we will see when we call the 'add' method later.
Let's look at a few things now:
==== sub new ====
The name 'new' itself is nothing special. I could just as easily have called this method 'constructor', 'glarb' or 'wow'... Perl doesn't care. The only thing to be said for the word 'new' is that it is probably the most often used name for constructor methods, so it might make the most sense to users of your module.
==== my $class=shift; ====
When a method is called directly, the first argument passed into it is the name of the package. In our case, this is 'AN::Tut::Sample1'. This is the module's "class", or name space. This will be used in a moment by 'bless'.
==== my $self={}; ====
This is a scalar variable that stores a pointer (a reference) to an anonymous hash. In the real world, and as we will see in the next example, this would be where we'd store module-wide data. For now though, to keep things simple, we'll leave it empty and focus on what it looks like from invocation and through 'bless'ing.
If the syntax seems a bit off, I could have written:
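A sketch of the longer form next to the short one, using a named hash (the names '%hash' and '$short' are only for illustration):

```perl
use strict;
use warnings;

# Long form: create a named hash, then take a reference to it.
my %hash = ();
my $self = \%hash;

# Short form used in the module: an anonymous hash created in one step.
my $short = {};

print "Both are hash references: [", ref($self), "] and [", ref($short), "]\n";
```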
This is exactly the same as above. If this still doesn't seem clear, please look at 'perldoc perlref'.
==== The 'add' Method ====
The biggest thing to note here is the first argument passed into the function. I've called the variable that will pick up this argument 'self' because I anticipate that this method will be called using the object returned by my 'new' constructor method described above. This would look like:
In this case, the first argument passed in is the 'bless'ed hash reference.
However:
If the user calls this method directly, the first argument passed in will be the module's class, its name space. Specifically, the name following the 'package' function. If the user does this, nothing stored in the 'bless'ed '$self' hash reference will be available. In this case, that is fine as the constructor does nothing yet. As we'll see shortly, the 'add' method will work just fine in either case.
Other than the first passed argument, the rest of this method works like any old subroutine.
==== Closing the Module ====
If you are familiar with writing libraries then this will be familiar.
The module file must end with '1;' so that, when it is loaded, the 'use AN::Tut::Sample1' sees a true value and returns a success.
== The 'sample1.pl' script ==
Before we step through the 'sample1.pl' script itself, let's take a look at the output printed when we run it. This way, as we step through it, we can talk about the relevant output at the same time.
=== Output From Running 'sample1.pl' ===
This shows the output printed to the shell when the 'sample1.pl' script is called. Notice the difference in what is passed to the 'add' method when it's called via the object returned by the 'new' constructor versus being called directly? At this point, it makes no difference at all, but very shortly it will.
/usr/share/perl5/AN/Tut/sample1.pl
Here is what happens when I call the 'new' constructor method.
The first argument passed into my constructor method is the 'class': [AN::Tut::Sample1]
This is what the simple hash reference 'self' looks like at first: [HASH(0x945d880)]
This is what the hash reference 'self' looks like after being 'bless'ed into this class: [AN::Tut::Sample1=HASH(0x945d880)]
Now, this is what my 'an' object looks like: [AN::Tut::Sample1=HASH(0x945d880)]
The first argument passed into my 'add' method: [AN::Tut::Sample1=HASH(0x945d880)]
2 + 2 = [4]
The first argument passed into my 'add' method: [AN::Tut::Sample1]
2 + 2 = [4]
In brief: we call the 'new' constructor method, which in turn shows what happens in that method. Then I call the 'add' method using the object returned by 'new' and print the results. Finally, I call 'add' again with the same arguments, directly this time, to compare what the 'add' method sees in each case.
=== Loading Modules ===
The first two modules are the usual, built-in modules that tell perl to be picky about what it fails on.
Then we load our module.
What we've done here is tell perl to look for the module 'AN::Tut::Sample1' and load anything in its name space, specifically, anything in the scope of the 'package' function.
You probably noticed that the 'AN' sub directory was not added to the '@INC' array, but the module can still be found. You will remember that perl looks at the module name and converts the double-colons '::' into directory delimiters. Therefore, perl actually looks for 'AN/Tut/Sample1.pm' within the directories in '@INC', not simply 'Sample1.pm'.
In our module, we set a version in the 'BEGIN' block. You likely also noticed that the variable was in all capital letters. This is a special variable that we can use to ensure in our calling script that the module is of a certain version or newer. We don't actually do this here, to keep things simple, but you can modify the 'use' line above to specify a certain version. To do so, change the line to:
In this case, perl will fail at compile time if the module version it finds is too old. For example, if you instead changed the version to, say, '0.1.002' and tried to run the script, you would see the error:
We won't explore this further in this paper, but it is worth mentioning so that you can see how it works.
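The version check can be tried without editing any files. The sketch below assumes the module sets '$VERSION' to '0.1.001' in its 'BEGIN' block, and it calls the 'VERSION' method by hand, which is the same check that 'use Module VERSION' performs. The variable names are made up for the example.

```perl
use strict;
use warnings;

{
    package AN::Tut::Sample1;

    BEGIN
    {
        our $VERSION = "0.1.001";   # assumed value for this example
    }
}

# 'use Module VERSION' calls Module->VERSION(VERSION) behind the scenes,
# so the same test can be made directly. 'eval' traps the failure.
my $ok_same  = eval { AN::Tut::Sample1->VERSION("0.1.001"); 1 } ? "passes" : "fails";
my $ok_newer = eval { AN::Tut::Sample1->VERSION("0.1.002"); 1 } ? "passes" : "fails";

print "Requiring 0.1.001: [$ok_same]\n";
print "Requiring 0.1.002: [$ok_newer]\n";
```

In a calling script the requirement would normally sit on the 'use' line itself, something like 'use AN::Tut::Sample1 0.1.001;'.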
=== Creating an Object to Access our Module's Methods ===
We want to show what happens when we call the 'add' method using the object returned by our constructor method versus calling the method directly. To do this, we must first call the constructor and store the returned object in a scalar variable. Once we have this, we will print the contents of the object.
You will notice once we run this script that the contents of '$an' will match the 'bless'ed version of the module's internal '$self' hash reference. The code above generates this output:
Here is what happens when I call the 'new' constructor method.
The first argument passed into my constructor method is the 'class': [AN::Tut::Sample1]
This is what the simple hash reference 'self' looks like at first: [HASH(0x9271880)]
This is what the hash reference 'self' looks like after being 'bless'ed into this class: [AN::Tut::Sample1=HASH(0x9271880)]
Now, this is what my 'an' object looks like: [AN::Tut::Sample1=HASH(0x9271880)]
We can see that the first and last lines come from the 'sample1.pl' script and the rest of the lines are printed by 'AN::Tut::Sample1's 'new' constructor. Pay attention to how the module's class is the argument passed in, what the '$self' hash reference looks like at first and how the module's class gets prefixed to this hash reference by the 'bless' function. This is how the object returned to the caller is constructed and will become the instance of this module.
=== Call the 'add' Method Via the Module's Object ===
First I will call the 'add' method using the object, then, we will call it again directly. Thanks to the 'print' statement in the 'add' method, you will see that the first argument passed into the method differs in either case. When called using '$an', the first argument it receives is the module's object, which again is the 'bless'ed '$self' hash reference. When called directly, you will see that the first argument is the module's class, which is the name set by the 'package' function.
==== Calling 'add' Via the Module's Object ====
Watch when this portion of 'sample1.pl' runs.
You will see a combination of the 'print' from the 'add' method followed by the simple 'print' above.
The first argument passed into my 'add' method: [AN::Tut::Sample1=HASH(0x9271880)]
2 + 2 = [4]
Take notice of how the argument passed into the method is the object used to call it.
==== Calling 'add' Directly ====
This time we call the 'add' method directly.
This time, the 'add' method prints a line showing that the first argument passed in is the module's class.
The first argument passed into my 'add' method: [AN::Tut::Sample1]
2 + 2 = [4]
= Our Second Module; AN::Tut::Sample2 =
This module is simply an expansion of our first module. The difference is that now we will make use of the module's object. We add five methods to this module. Two are meant to be used by the user and three are "private methods".
== The New Module; Sample2.pm ==
This is the new module, 'AN::Tut::Sample2', with the five new methods and an expanded constructor.
== Public Versus Private Methods ==
"Perl doesn't have an infatuation with enforced privacy. It would prefer that you
stayed out of its living room because you weren't invited, not because it has a
shotgun"
- Larry Wall
In truth, there is no difference between "public" and "private" methods. This is entirely a "by convention" concept. If you wish, you can call a private method, though you do so at your own risk.
=== Public Methods ===
Public methods are the "normal" methods in that they are well documented. Good module programmers do all that they can to not change the arguments or returned values of a method so that programs that use them won't break as the method changes over time.
In this module, there are four public methods.
* 'new'
** The constructor method.
* 'add'
** The addition method that takes two numbers and returns the sum of those numbers.
* 'subtract'
** The subtraction method that takes two numbers and returns the subtracted value.
* 'get_counts'
** The method that takes no arguments and returns how many times the module has been called and how many times the 'add' and 'subtract' methods were called.
=== Private Methods ===
Private methods are intended to be used internally within the module itself. By convention, private methods begin with an underscore, though this is not a requirement. There is rarely public documentation and little effort is made towards backwards compatibility.
In this module, there are three private methods.
* '_count_module'
** This method counts each time the module is called.
* '_count_method_add'
** This counts each time the 'add' method is called.
* '_count_method_subtract'
** This counts each time the 'subtract' method is called.
=== The 'bless'ed '$self' Hash Reference, redux ===
Throughout this document, we will use '$self' as our object and it will be a hash reference. Note though that you could use any reference: an array reference, a simple scalar reference, etc. Likewise, you can use a name other than '$self'. All that really matters is that it is 'bless'ed into your package.
When the object is used by the user to call a method, the object itself becomes the first argument passed into the called method. In this module, we are creating a somewhat artificial use for the object's hash reference; we use it to store integer values representing how many times different methods are called.
You will notice that in any method called using the object, the '$self' hash keys in the constructor can be directly accessed. Doing so is considered bad form though. So right off the bat we will create private methods to manipulate the contents of this hash. The reasoning for this extra overhead is protection of the data contained within the hash and to create a single point of manipulation for any given key. This becomes particularly important in complex methods where the values stored in the hash are used to control how things work.
== Stepping Through the New Module and Script ==
There isn't much changed from the user's point of view, so we will just look at the important bits.
In the real world, this would be the next version of the same module. For this reason, we will be careful to not change what arguments are accepted or what values are returned. That is exactly the goal of modules. Portable, backwards compatible code!
=== The Constructor Method ===
The first real difference with the new module is the '$self' hash reference. Now there are keys and values where the first module's '$self' hash reference was empty.
You might have noticed that these keys are in all capital letters. There is no requirement for this, but it is recommended to make it easier to see at a glance that you are accessing keys in the 'bless'ed hash.
These keys will be used to keep a running count of how many times the module is called. Every call to any method will be recorded in '$self->{CALL_COUNT}', while the number of times the 'add' and 'subtract' methods are called will be stored in '$self->{CALLED}{ADD}' and '$self->{CALLED}{SUBTRACT}' respectively.
=== The Updated 'add' Method ===
From a functional point of view, 'add' still does exactly the same thing it did before. This way, if our user upgrades to the new version of this module, their old calls to this method will still work.
There are three key internal differences:
* In the first module, we recommended that users access our method using the module's object returned by the constructor. Now we require it and will return an error if it is called directly.
* We call the '_count_module' private method which will record the call to the module.
* We call the '_count_method_add' private method which will record the call to the 'add' method.
==== Requiring Calls be Via the Module's Object ====
We want to make sure that every call to our module and to the 'add' method is counted. This can only occur if the method is called using the module's '$an' object as this is how we get access to our 'bless'ed hash reference.
If the user tries to directly access this method, they will see this error:
The method 'add' must be called via the object returned by the 'new' constructor method.
at ./sample2.pl line 36
This works because the 'ref()' function will return the module's class if '$self' is indeed our 'bless'ed hash reference. Otherwise '$self' contains the module's class name as a plain string, so 'ref()' returns an empty string, triggering the 'croak'.
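The check itself can be sketched as follows. This is a cut-down stand-in for the module, with the counting left out so that only the guard is visible; the error text is taken from the message shown above, and the names '$sum' and '$error' are made up for the example.

```perl
use strict;
use warnings;

{
    package AN::Tut::Sample2;
    use Carp;

    sub new
    {
        my $class = shift;
        return (bless({}, $class));
    }

    sub add
    {
        my $self = shift;
        # ref() gives the class name for an object and '' for a plain string.
        croak "The method 'add' must be called via the object returned by the 'new' constructor method.\n" if not ref($self);

        my ($num1, $num2) = @_;
        return ($num1 + $num2);
    }
}

# Called via the object: works as before.
my $an  = AN::Tut::Sample2->new();
my $sum = $an->add(2, 2);
print "Via the object: 2 + 2 = [$sum]\n";

# Called directly: 'add' croaks, and 'eval' lets us show the message.
my $direct = eval { AN::Tut::Sample2->add(2, 2) };
my $error  = $@;
print "Called directly, 'add' refused with:\n$error";
```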
==== Automatically Counting the Method and Module Call ====
Before we get to work on the numbers, we count the call to the module and this method by making two calls to the relevant private methods.
The rest of this method is the same as before.
=== The New 'subtract' Public Method ===
This is nearly identical to the 'add' method. The only difference is that it subtracts the second number from the first.
=== The New '_count_module' Private Method ===
This private method is called automatically by all three public methods. This way we can keep a count of how many times the module was called in any way.
It simply increments the object's 'CALL_COUNT' hash key and then returns the same.
=== The New '_count_method_add' Private Method ===
This private method is called automatically when the 'add' method is called.
Like in the '_count_module' above, this simply increments the object's '{CALLED}{ADD}' hash key and then returns it.
=== The New '_count_method_subtract' Private Method ===
This private method is almost identical to '_count_method_add' except that it is called by the 'subtract' public method. It increments and returns the object's '{CALLED}{SUBTRACT}' hash key.
=== The New 'get_counts' Public Method ===
This method returns three values;
* How many times the module has been called, including the current call. It does this by calling '_count_module' in the 'return ()'.
* How many times the 'add' method was called. In this case, we directly call and return the object's hash key because calling '_count_method_add' would increment the value improperly.
* How many times the 'subtract' method was called, in the same way as above.
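Putting the counting pieces together, a sketch of how the constructor, the private counters and 'get_counts' could fit is shown here. The hash keys and method names follow the text above, but the bodies are my reconstruction, and the check that 'add' is called via the object is left out to keep the example short.

```perl
use strict;
use warnings;

{
    package AN::Tut::Sample2;

    sub new
    {
        my $class = shift;
        my $self  = {
            CALL_COUNT => 0,
            CALLED     => {
                ADD      => 0,
                SUBTRACT => 0,
            },
        };
        return (bless($self, $class));
    }

    sub add
    {
        my $self = shift;
        $self->_count_module;
        $self->_count_method_add;
        my ($num1, $num2) = @_;
        return ($num1 + $num2);
    }

    sub subtract
    {
        my $self = shift;
        $self->_count_module;
        $self->_count_method_subtract;
        my ($num1, $num2) = @_;
        return ($num1 - $num2);
    }

    # Counts this call too, but reads the per-method keys without changing them.
    sub get_counts
    {
        my $self = shift;
        return ($self->_count_module, $self->{CALLED}{ADD}, $self->{CALLED}{SUBTRACT});
    }

    # Private methods: each increments one counter and returns the new value.
    sub _count_module          { my $self = shift; return (++$self->{CALL_COUNT}); }
    sub _count_method_add      { my $self = shift; return (++$self->{CALLED}{ADD}); }
    sub _count_method_subtract { my $self = shift; return (++$self->{CALLED}{SUBTRACT}); }
}

my $an = AN::Tut::Sample2->new();
$an->add(2, 2);
$an->add(3, 4);
$an->subtract(10, 4);

my ($total, $adds, $subtracts) = $an->get_counts;
print "Module calls: [$total], add: [$adds], subtract: [$subtracts]\n";
```

With two calls to 'add', one to 'subtract' and the call to 'get_counts' itself, this should report four module calls.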
== The New Script; sample2.pl ==
This is the updated script that will demonstrate the new methods in this module.
This is pretty similar to 'sample1.pl'. We see that the 'add' method is still called the same way and we've got a call to the new 'subtract' method.
The first real difference is the call to the 'get_counts' method to show how many times we called the module and its public methods. The second is the disabled, illegal call to 'add'. If you change 'if (0)' to 'if (1)' and try running the script you will see how the 'add' method will 'croak', not having been called via the module's object.
== Running 'sample2.pl' ==
This is what we see when we run 'sample2.pl'.
It's a lot cleaner than 'sample1.pl' and closer to what your user might actually expect to see.
= Using Sibling Modules =
At some point you may decide that you have too many methods, or you will have methods that are too dissimilar to keep them in the same module. At this point you will want to break up your methods into a suite of modules in a common name space. These modules are then referred to as "sibling" modules.
This raises the question: how do you put methods in different modules while still allowing those methods to call one another? There are many ways to split up your modules, and no one way is particularly better than another. We will see here one method that is somewhat object oriented.
== Overview ==
This section will use almost all new methods and a new file layout. Before we jump into it, I would like to give a brief overview of how the script and modules will work to help you follow along more easily.
Access to all modules and their methods will be done via a single "parent" module. This is in fact just another sibling module, and is being called a "parent" only to help you visualize its role.
When the user calls the parent's constructor method, it automatically calls the constructor methods of all the sibling modules and then stores the returned objects as internal variables. Then the constructor calls the method '_parent' in each sibling, passing into the child module the handle to itself. Finally, the parent returns its own blessed reference, its object, to the user.
The parent module offers a method for each sibling module with a matching name whose sole purpose is to return to the user its handle on the child module's object. It is through these methods that the user will access the methods in each sibling module. This is also how each child module will access its siblings' methods.
This will all make more sense once we step through the new script and modules.
=== The New Suite ===
This example suite uses many files. They are:
* sample3.pl
** This is the new sample script that will use the new module suite.
* AN::Tut::Tools
** This is the "parent" module. Access to all child modules will be through this module.
* AN::Tut::Tools::Math
** This is a module that provides mathematical methods. It provides the same methods as before plus one new method.
* AN::Tut::Tools::Say
** This is a module that provides a localization method.
* Tools.pod
** This is the POD documentation for the AN::Tut::Tools module. We will cover documentation in the next section.
* Math.pod
** This is the POD documentation for AN::Tut::Tools::Math module.
* Say.pod
** This is the POD documentation for AN::Tut::Tools::Say module.
* test.pl
** This is a script used for testing our modules. We will cover tests more in the last section.
* t/Math.t
** This is the test script for testing the AN::Tut::Tools::Math methods.
* t/Say.t
** This is the test script for testing the AN::Tut::Tools::Say methods.
=== The Parent Module; AN::Tut::Tools ===
Let's start by looking at the full "parent" module:
This is the "main" module that your users will load to get access to all the modules and methods in your new suite. It provides only three methods: 'new', the usual constructor, 'Math', which sets and/or returns a handle to 'AN::Tut::Tools::Math' and 'Say', which sets and/or returns a handle to 'AN::Tut::Tools::Say'.
==== Loading the Siblings ====
The first real difference in this module is that it loads the sibling modules itself.
By doing this, the user will not need to load all of the modules in our suite themselves. By loading just 'AN::Tut::Tools', they will have access to all the methods in all our modules.
==== Getting a Handle on our Siblings ====
Simply loading our siblings isn't enough because we are building object oriented style methods. We will need to call each sibling module's constructor method and store the returned object.
In this case, we will store these objects in the hash keys below:
The name of the keys is not important. The only thing that matters is that the objects are stored in the 'bless'ed hash reference. For this reason, we call their constructors after we bless '$self'.
Notice that the call to each sibling's constructor is contained within the argument list for the 'Math' and 'Say' methods? This is just a lazy way to get the module's object and to pass it to the relevant method in one go.
==== Passing on our 'bless'ed Reference to our Siblings ====
The last thing our constructor method does is call a special, internal method called '_parent' for each of our sibling modules.
As we will see later, each sibling will store the parent's object and use it to call the parent's and its siblings' methods.
==== Getting a Handle on AN::Tut::Tools::Math ====
The 'Math' method does only two things; it returns whatever is stored in '$self->{HANDLE}{MATH}' and, if something was passed into it, it saves that argument there first.
When our constructor method called 'AN::Tut::Tools::Math's 'new' constructor within the call to 'Math', the returned object got passed to this method and stored in '$self->{HANDLE}{MATH}'. Now, whenever 'Math' gets called, the object referencing 'AN::Tut::Tools::Math' gets returned.
We do it this way for the same reasons we discussed in the section; "[[TPM Talk: An Introduction Perl Modules#The 'bless'ed '$self' Hash Reference, redux|The 'bless'ed '$self' Hash Reference, redux]]". That is, we never want to directly access or alter the values stored in '$self'. It is always better in the long run to have a single method to do this manipulation so that any future changes, tweaks or edits can be done in one place only.
==== Getting a Handle on AN::Tut::Tools::Say ====
This works exactly the same as the 'Math' call above.
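The combined "set and/or return" pattern used by these handle methods can be sketched on its own. Here a plain string stands in for the sibling's object, and the constructor is stripped down to just the hash key the text mentions; none of this is the original listing.

```perl
use strict;
use warnings;

{
    package AN::Tut::Tools;

    sub new
    {
        my $class = shift;
        my $self  = { HANDLE => { MATH => "" } };
        return (bless($self, $class));
    }

    # Store the handle if one was passed in; always return what is stored.
    sub Math
    {
        my $self = shift;
        $self->{HANDLE}{MATH} = shift if defined $_[0];
        return ($self->{HANDLE}{MATH});
    }
}

my $an = AN::Tut::Tools->new();
$an->Math("stand-in for a Math object");        # set
print "Stored handle: [", $an->Math, "]\n";     # get
```

Because every read and write of the key goes through this one method, a later change to how the handle is stored only has to be made here.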
=== The 'Say' Sibling; AN::Tut::Tools::Say ===
This new module provides a method that can return a language-specific string suitable for display to the end user. We will use this to create English or French strings, depending on the user's preference, describing the results of calls to 'AN::Tut::Tools::Math'.
==== Getting a Handle on our Parent; The '_parent' Method ====
In this module, we will use the hash key 'HANDLE_TUT_TOOLS' in our blessed reference to store the handle to the "parent" 'AN::Tut::Tools' module. The difference this time is that this module can't go out and get the handle. If it tried to, it would get a new object that wouldn't match the handle the user got. This is why 'AN::Tut::Tools' called this module's '_parent' method and passed in its own blessed hash reference. The '_parent' method takes that reference and stores it. Thus, any further call to '_parent' will return the 'AN::Tut::Tools' object.
==== The 'math' Method ====
This method is pretty straightforward from a module point of view. There is a little trick at the start that lets this method pick up arguments via an array or a hash. This trick exists so that the testing done in the last section will be more interesting. It checks to see if the first argument passed in by the user is a hash reference. If it is, it begins looking for variables by name. If it isn't, it switches to array-type reading of arguments and copies '$param' to '$task'.
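The argument trick can be sketched as a stand-alone subroutine. In the module this would be a method and would shift '$self' off first. The parameter names ('task', 'num1', 'num2', 'result', 'lang') and their order are assumptions made for this example; only '$param', '$task' and the two output strings come from this paper.

```perl
use strict;
use warnings;

sub math
{
    my $param = shift;
    my ($task, $num1, $num2, $result, $lang);

    if (ref($param) eq "HASH")
    {
        # Named arguments, passed in as a hash reference.
        $task   = $param->{task};
        $num1   = $param->{num1};
        $num2   = $param->{num2};
        $result = $param->{result};
        $lang   = $param->{lang};
    }
    else
    {
        # Positional arguments; the first one is the task itself.
        $task = $param;
        ($num1, $num2, $result, $lang) = @_;
    }
    $lang = "en" if not defined $lang;

    return ("Unknown task: [$task]") if $task ne "add";
    return ("J'ai additionné: [$num1] et: [$num2] et la somme est: [$result].") if $lang eq "fr";
    return ("I added: [$num1] and: [$num2] and got: [$result].");
}

# The same call made both ways.
my $by_list = math("add", 10, 12, 22, "en");
my $by_hash = math({ task => "add", num1 => 10, num2 => 12, result => 22, lang => "fr" });
print "$by_list\n$by_hash\n";
```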
=== The 'Math' Sibling; AN::Tut::Tools::Math ===
Let's take a look at the whole module before looking at the new parts:
This module is fairly similar to what we saw earlier in 'Sample1.pm' and the 'Sample2.pm' modules. The main difference is that the counting methods are gone and a new 'add_and_say' method has been added.
==== Using the Parent Handle in 'add_and_say' ====
The new 'add_and_say' method is actually a wrapper for two methods: the 'add' method in this module and the 'math' method in the 'AN::Tut::Tools::Say' sibling module.
The first interesting bit is:
This calls the private method '_parent' and stores the returned object in '$an'. By doing this, we now have the 'AN::Tut::Tools' object that matches the object returned to the user. This is particularly important in more complex programs that make use of shared variables. This object is how we will access methods in the 'AN::Tut::Tools::Math' module in the next two steps.
We could have used 'my $result=$self->add' just as effectively. The only reason I don't is that using the '$an' object allows me to move a method from one module to another module without changing any code; it makes this method more portable.
Now the magic!
By using the 'Say' method in the parent module 'AN::Tut::Tools' via the '$an' object, I can call the 'math' method in 'AN::Tut::Tools::Say' from this module! This is how we can create a method that interacts with its siblings, allowing the user to make a single method call that provides functions spanning multiple modules and methods!
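To make the sibling-to-sibling call concrete, here is a compact sketch of the whole arrangement. All 'My::Tools' package names and the 'HANDLE_MATH' and 'HANDLE_SAY' keys are hypothetical stand-ins, the 'math' method is reduced to a single English string, and the argument order is an assumption; the actual 'AN::Tut::Tools' modules will differ.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for 'AN::Tut::Tools::Math'.
    package My::Tools::Math;

    sub new
    {
        my $class = shift;
        my $self  = { HANDLE_TUT_TOOLS => undef };
        return bless $self, $class;
    }

    sub _parent
    {
        my ($self, $parent) = @_;
        $self->{HANDLE_TUT_TOOLS} = $parent if defined $parent;
        return $self->{HANDLE_TUT_TOOLS};
    }

    sub add
    {
        my ($self, $num1, $num2) = @_;
        return $num1 + $num2;
    }

    # Wrapper: do the math, then ask the sibling to describe the result.
    sub add_and_say
    {
        my ($self, $num1, $num2) = @_;
        my $an     = $self->_parent;
        my $result = $an->Math->add($num1, $num2);
        return $an->Say->math("add", $num1, $num2, $result);
    }
}

{
    # Hypothetical stand-in for 'AN::Tut::Tools::Say' (English only).
    package My::Tools::Say;

    sub new
    {
        my $class = shift;
        my $self  = { HANDLE_TUT_TOOLS => undef };
        return bless $self, $class;
    }

    sub _parent
    {
        my ($self, $parent) = @_;
        $self->{HANDLE_TUT_TOOLS} = $parent if defined $parent;
        return $self->{HANDLE_TUT_TOOLS};
    }

    sub math
    {
        my ($self, $task, $num1, $num2, $result) = @_;
        return "I added: [$num1] and: [$num2] and got: [$result].";
    }
}

{
    # Hypothetical stand-in for the parent, 'AN::Tut::Tools'.
    package My::Tools;

    sub new
    {
        my $class = shift;
        my $self  = {
            HANDLE_MATH => My::Tools::Math->new(),
            HANDLE_SAY  => My::Tools::Say->new(),
        };
        bless $self, $class;

        # Hand every sibling a reference back to this parent object.
        $_->_parent($self) for ($self->{HANDLE_MATH}, $self->{HANDLE_SAY});
        return $self;
    }

    # Pointer methods, one per sibling.
    sub Math { my $self = shift; return $self->{HANDLE_MATH}; }
    sub Say  { my $self = shift; return $self->{HANDLE_SAY}; }
}

my $an = My::Tools->new();
print $an->Math->add_and_say(10, 12), "\n";
```

The user only ever touches '$an'; the hop from the 'Math' sibling to the 'Say' sibling happens inside 'add_and_say' through the stored parent handle.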
== Putting it All Together; sample3.pl ==
We accessed both the 'Math' and 'Say' modules via the '$an' object. We only needed to load the parent module 'AN::Tut::Tools' and then use its pointer methods. This lets us expand our module in future releases, adding entirely new modules to our suite, and the user never has to worry about it. They can simply start using the new module's methods via new pointer methods we would add.
The rest of this sample script shows examples of different ways the methods can be called. These are needlessly complex, but will show different ways of writing tests in the last section of this paper.
Here's what it looks like when we run it.
/usr/share/perl5/AN/Tut/sample3.pl
I added: [10] and: [12] and got: [22].
J'ai additionné: [10] et: [12] et la somme est: [22].
J'ai additionné: [10] et: [12] et la somme est: [22].
J'ai soustrait: [40] de: [12] et la différence est: [28].
I added: [15] and: [20] and got: [35].
J'ai additionné: [2] et: [18] et la somme est: [20].
Pretty neat, eh?
= Documentation: PODs (Plain Old Documentation) =
Documenting your modules can be done in a few ways. As good programmers, we already liberally sprinkle our code with comments in the program itself. We also use clear variable names to make the code as self-documenting as possible.
What about our users though? If we're writing a module, we can't expect a user to ever look at our module's source code, so we need another mechanism to make our documentation available to them. Perl has this ability through 'POD's; Plain Old Documentation.
POD documentation can exist in-line in our module or in a dedicated 'Module.pod' file sitting beside our 'Module.pm' module file. In either case, the user can then read our documentation with a tool like 'perldoc', for example by running 'perldoc Your::Module' at the Linux command line.
The syntax for PODs is a simple markup language.
== In-line POD Documentation ==
Before we get into the actual syntax of POD, I wanted to point out the difference between in-line POD and a dedicated POD file.
When writing your POD in-line, you simply wrap the docs in '=pod' and '=cut'. Anything between these will be ignored by the perl interpreter at run time. All POD command syntax must start at the beginning of a new line, with a blank line above and below it, to be parsed properly.
Generally, when doing in-line documentation, a POD section will precede the method or function it relates to. This is by convention only however.
You do not need to wrap the text inside PODs unless you specifically want a certain indentation. Instead, just write each paragraph on a single line and the POD interpreter will handle line wrapping for you.
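As a small illustration of in-line POD, the sketch below places a POD section directly above the subroutine it documents. The 'add' function is a made-up example and not taken from any of the tutorial's modules.

```perl
use strict;
use warnings;

=pod

=head2 add

Takes two numbers and returns their sum. This paragraph is written as one long line; the POD interpreter takes care of wrapping it for display.

=cut

# Normal compilation resumes here, after the '=cut' line.
sub add
{
    my ($num1, $num2) = @_;
    return $num1 + $num2;
}

print add(10, 12), "\n";
```

Running the file executes only the code, while 'perldoc' run against the same file shows only the documentation.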
== POD Markup Syntax ==
POD is a simple markup syntax that POD-compatible readers can interpret and format for display to the user. The most common interpreter is 'perldoc', though many other POD interpreters exist to translate POD documentation into web pages and other formats.
POD syntax is pretty straight-forward. Here is a quick overview of how the syntax would be ordered, with the actual documentation still missing. This is pretty much copypasta of the 'perldoc perlpod' documentation.
=pod
=encoding type
=head1 Heading Text
=head2 Heading Text
=head3 Heading Text
=head4 Heading Text
=over #
=item stuff
=back
=begin format
=end format
=for format text...
=cut
All command syntax requires a blank line preceding it and for the command to be at the start of the new line.
=== =pod ===
This starts the POD documentation. If you are using a dedicated 'something.pod' file, this will be the first line, or at least the start of what you want the user to read. When using in-line documentation, this will suspend the compilation of your code until the first '=cut' is seen.
=== =cut ===
This ends a block of POD documentation. When in-line with a program, normal compilation of the script will resume after this line.
=== =head# (1 - 4) ===
This precedes the heading text. Any text following a '=head#' will be given a prominent font style to draw the user's attention. There are four heading levels supported, with 3 and 4 being recent additions not supported by older versions of perldoc.
Under all four heading styles, text following the heading is indented eight spaces.
==== =head1 ====
Heading 1 bolds the text and does not indent the string following it. An example:
=head1 METHODS
This module provides methods that do foo...
==== =head2 ====
Heading 2 bolds the text and indents the following text four spaces. For example:
=head2 method_name
This method provides bar...
==== =head3 ====
Heading 3 underlines the text and indents the following text eight spaces.
==== =head4 ====
Heading 4 does not format the following text and indents it eight spaces.
=== =over #, =item, =back ===
The '=over' command starts a list of '=item's, ending with the closing '=back' command. This is useful for creating a list, for example, of parameters a method takes.
The '#' following the '=over' is an optional number of 'ems' (a typographic unit of width) to indent the paragraph(s) describing each item in the list. The default is 4 ems when no number is given. At the command line, this translates to the description text being indented four spaces. By contrast, '=over 2' will indent the description text by 2 ems, or two spaces when interpreted by perldoc.
An example:
=over
=item
foo
Foo is the first argument that 'some_method' takes and can be ...
=item
bar
Bar is the second argument that 'some_method' takes and is used to...
=item
baz
Baz is the last argument taken and is optional. It controls...
=back
==== Some notes on '=item' ====
If you put one bare word following '=item', the description text will start in-line after the item's text. If you put a space in front of the first line of description text though, the description text will start on an indented new line with the first line of the paragraph being indented one extra space.
Alternatively, you can put a star '*' between the item command and its text to create a bullet list (=item * foo). Likewise, if the first line in the description is a single bare word, perldoc will create a bulleted entry. Last, you can put a number followed by a period after the item and before its text to create a numbered list (=item 1. foo). A bare word followed by '()' (=item foo()) will be underlined and the description text will start on a new line. Whichever you use, just be sure to be consistent.
Lastly, you cannot use '=head#' inside an '=over' - '=back' block.
=== Code Blocks ===
To prevent a line from being parsed in any way, simply put a space or tab at the beginning of the line. This is an effective way to show code.
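A short sketch of such a code block inside a POD section. The method call shown is only an illustration borrowed from this tutorial's naming style; the two lines starting with a space are the ones that are displayed verbatim.

```perl
=pod

=head2 add

Call the method like this:

 my $sum = $an->Math->add(10, 12);
 print "Sum: [$sum]\n";

This paragraph starts at the left margin, so it is formatted as normal text again.

=cut
```

Because the indented lines sit inside POD, perl never tries to compile them; they exist purely for the reader.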
=== Other Syntax ===
This covers just the basics of documenting your perl module. There are many other commands and formatting options well worth reviewing in 'perldoc perlpod'! There are ways to specify the encoding used in your POD, a method for embedding other bits of data for compatible interpreters like HTML, manual formatting and embedding links.
== Dedicated POD File Examples ==
Let's see some real-world POD files created for our most recent suite of modules.
=== AN::Tut::Tools.pod ===