CGI to mod_perl Porting. mod_perl Coding guidelines.
This chapter is relevant both to writing a new CGI script or perl handler from scratch and to migrating an application from plain CGI to mod_perl.
It also addresses the situation where the CGI script being ported does
the job, but is too dirty to be altered easily to run as a mod_perl
program. (Apache::PerlRun mode)
If you are at the porting stage, you can use this chapter as a reference for possible problems you might encounter when running an existing CGI script in the new mode.
If your project schedule is tight, I would suggest converting to
mod_perl in the following steps: Initially, run all the scripts in the
Apache::PerlRun mode. Then as time allows, move them into
Apache::Registry mode. Later if you need Apache Perl API
functionality you can always add it.
If you are about to write a new CGI script from scratch, it would be a good idea to learn about possible mod_perl related pitfalls and to avoid them in the first place.
If you don't need mod_cgi compatibility, it's a good idea to start writing using the mod_perl API in the first place. This will make your application a little more efficient, and it will be easier to use the full mod_perl feature set, which extends the core Perl functionality with Apache-specific functions and with overridden Perl core functions that were reimplemented to work better in the mod_perl environment.
It can be a good idea to tighten up some of your Perl programming practices, since mod_perl doesn't tolerate sloppy programming.
This chapter relies on a certain level of Perl knowledge. Please read through the Perl Reference chapter and make sure you know the material covered there. This will allow me to concentrate on pure mod_perl issues and make them more prominent to the experienced Perl programmer; otherwise they would be lost in a sea of Perl background notes.
Additional resources:
This page describes the mechanics of creating, compiling, releasing, and maintaining Perl modules. http://world.std.com/~swmcd/steven/perl/module_mechanics.html
The information is very relevant to a mod_perl developer.
"Writing Apache Modules with Perl and C" is a "must have" book!
See the details at http://www.modperl.com .
Let's start with some simple code and see what can go wrong with it, detect bugs and debug them, discuss possible pitfalls and how to avoid them.
I will use a simple CGI script, that initializes a $counter to 0,
and prints its value to the browser while incrementing it.
counter.pl:
----------
#!/usr/bin/perl -w
use strict;
print "Content-type: text/plain\r\n\r\n";
my $counter = 0; # Explicit initialization technically redundant
for (1..5) {
    increment_counter();
}
sub increment_counter{
    $counter++;
    print "Counter is equal to $counter !\r\n";
}
You would expect to see the output:
Counter is equal to 1 !
Counter is equal to 2 !
Counter is equal to 3 !
Counter is equal to 4 !
Counter is equal to 5 !
And that's what you see when you execute this script the first time. But let's reload it a few times... See, suddenly after a few reloads the counter doesn't start its count from 1 any more. We continue to reload and see that it keeps on growing, but not steadily, starting almost randomly at 10, 10, 10, 15, 20... Weird...
Counter is equal to 6 !
Counter is equal to 7 !
Counter is equal to 8 !
Counter is equal to 9 !
Counter is equal to 10 !
We saw two anomalies in this very simple script: Unexpected increment of our counter over 5 and inconsistent growth over reloads. Let's investigate this script.
First let's peek into the error_log file. Since we have enabled
warnings, what we see is:
Variable "$counter" will not stay shared at /home/httpd/perl/conference/counter.pl line 13.
The Variable "$counter" will not stay shared warning is generated when the script contains a named nested subroutine (a named, as opposed to anonymous, subroutine defined inside another subroutine) that refers to a lexically scoped variable defined outside this nested subroutine. This effect is explained in my() Scoped Variable in Nested Subroutines.
Do you see a nested named subroutine in my script? I don't! What's going on? Maybe it's a bug? But wait, maybe the perl interpreter sees the script in a different way, maybe the code goes through some changes before it actually gets executed? The easiest way to check what's actually happening is to run the script with a debugger.
But since we must debug it when it's being executed by the webserver,
a normal debugger won't help, because the debugger has to be invoked
from within the webserver. Luckily Doug MacEachern wrote the
Apache::DB module and we will use this to debug my script. While
Apache::DB allows you to debug the code interactively, we will do
it non-interactively.
Modify the httpd.conf file in the following way:
PerlSetEnv PERLDB_OPTS "NonStop=1 LineInfo=/tmp/db.out AutoTrace=1 frame=2"
PerlModule Apache::DB
<Location /perl>
PerlFixupHandler Apache::DB
SetHandler perl-script
PerlHandler Apache::Registry
Options ExecCGI
PerlSendHeader On
</Location>
Restart the server and issue a request to counter.pl as before. On the surface nothing has changed--we still see the correct output as before, but two things happened in the background:
Firstly, the file /tmp/db.out was written, with a complete trace of the code that was executed.
Secondly, if you have loaded the Carp module already, error_log
now contains the real code that was actually executed. This is
produced as a side effect of reporting the Variable "$counter" will
not stay shared at... warning that we saw earlier. To load the Carp
module, you can add:
use Carp;
in your startup.pl file or in the executed code.
Here is the code that was actually executed:
package Apache::ROOT::perl::conference::counter_2epl;
use Apache qw(exit);
sub handler {
    BEGIN {
        $^W = 1;
    };
    $^W = 1;
    use strict;
    print "Content-type: text/plain\r\n\r\n";
    my $counter = 0; # Explicit initialization technically redundant
    for (1..5) {
        increment_counter();
    }
    sub increment_counter{
        $counter++;
        print "Counter is equal to $counter !\r\n";
    }
}
The code in the error_log wasn't indented. I've indented it for you to stress that the code was wrapped inside the handler() subroutine.
What do we learn from this?
Well firstly that every CGI script is cached under a package whose
name is formed from the Apache::ROOT:: prefix and the relative part
of the script's URL (perl::conference::counter_2epl) by replacing
all occurrences of / with :: and . with _2e. That's how
mod_perl knows what script should be fetched from the cache--each
script is just a package with a single subroutine named handler.
If we were to add use diagnostics to the script we would also see a
reference in the error text to an inner (nested)
subroutine--increment_counter is actually a nested subroutine.
With mod_perl, each subroutine in every Apache::Registry script is
nested inside the handler subroutine.
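To watch the effect outside the web server, here is a minimal sketch that wraps the counter code in a handler() subroutine by hand, roughly the way Apache::Registry does, and calls it twice to stand in for two requests served by the same child. The @seen array and the two explicit calls are my own scaffolding, not something mod_perl generates.

```perl
use strict;
use warnings;
no warnings 'closure';  # remove this line to see "will not stay shared"

my @seen;  # values returned by increment_counter(), in call order

sub handler {
    my $counter = 0;  # a fresh lexical is created on every call

    for (1..5) {
        push @seen, increment_counter();
    }

    # A named sub is compiled only once, so it stays bound to the
    # $counter that existed during the first call of handler().
    sub increment_counter {
        $counter++;
        return $counter;
    }
}

handler();  # first "request"
handler();  # second "request" in the same process
print "@seen\n";
```

The second call carries on from 6 instead of starting again at 1, because increment_counter() is still working on the first $counter while handler() resets a new one.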
It's important to understand that the inner subroutine effect
happens only with code that Apache::Registry wraps with a
declaration of the handler subroutine. If you put all your code
into modules, which the main script use()s, this effect doesn't
occur.
Do not use Perl4-style libraries. Subroutines in such libraries will
only be available to the first script in any given interpreter thread
to require() a library of any given name. This can lead to
confusing sporadic failures.
The easiest and the fastest way to solve the nested subroutines
problem is to switch every lexically scoped variable for which you get
the warning to a package variable. The handler subroutines are
never called re-entrantly and each resides in a package to itself.
Most of the usual disadvantages of package scoped variables are,
therefore, not a concern. Note, however, that whereas explicit
initialization is not always necessary for lexical variables, it is
usually necessary for these package variables, as they persist in
subsequent executions of the handler and, unlike lexical variables,
don't get automatically destroyed at the end of each handler.
counter.pl:
----------
#!/usr/bin/perl -w
use strict;
print "Content-type: text/plain\r\n\r\n";
# In Perl <5.6 our() did not exist, so:
#   use vars qw($counter);
our $counter = 0; # Explicit initialization now necessary
for (1..5) {
    increment_counter();
}
sub increment_counter{
    $counter++;
    print "Counter is equal to $counter !\r\n";
}
If the variable contains a reference it may hold onto lots of
unnecessary memory (or worse) if the reference is left to hang about
until the next call to the same handler. For such variables you
should use local so that the value is removed when the handler
subroutine exits.
my $query = CGI->new;
becomes:
local our $query = CGI->new;
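The same trick can be tried outside the server. In this sketch a plain hash reference stands in for the CGI object and handler() is called by hand; both are assumptions made to keep the example self-contained.

```perl
use strict;
use warnings;

sub handler {
    my ($name) = @_;

    # local() saves the old value of the package variable and restores
    # it when handler() returns, so nothing survives between calls.
    local our $query = { name => $name };  # stand-in for CGI->new

    return response();

    # Nested named sub, as in a registry script; it reads the package
    # variable, so it always sees the value set by the current call.
    sub response {
        return "Thank you, $query->{name}!";
    }
}

my @replies = (handler('Stas Bekman'), handler('The Boss'));
print "$_\n" for @replies;
print defined $main::query ? "query leaked\n" : "query was cleaned up\n";
```

Each call gets its own name back, and once handler() has returned $main::query is undefined again.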
All this is very interesting but as a general rule of thumb, unless
the script is very short, I tend to write all the code in external
libraries, and to have only a few lines in the main script. Generally
the main script simply calls the main function of my library. Usually
I call it init() or run(). I don't worry about nested
subroutine effects anymore (unless I create them myself :).
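As a rough sketch of that layout, here is the counter example split into a library and a tiny main script. The package name My::Counter is made up for the illustration, and I keep both parts in one file only so that the example runs as it stands; normally the package would live in its own .pm file.

```perl
use strict;
use warnings;

# In a real project this package would live in its own file, e.g.
# My/Counter.pm (a hypothetical name), and be loaded with use().
package My::Counter;

sub run {
    my $counter = 0;
    my @lines;
    for (1..5) {
        $counter = increment_counter($counter);
        push @lines, "Counter is equal to $counter !\r\n";
    }
    return @lines;
}

# A top-level sub in a module is not nested inside handler(); it gets
# its data through arguments rather than through an outer lexical.
sub increment_counter {
    my ($value) = @_;
    return $value + 1;
}

# The registry script itself shrinks to a few lines:
package main;

print "Content-type: text/plain\r\n\r\n";
my @output = My::Counter::run();
print @output;
```

Because run() starts from its own fresh $counter each time, calling it again yields the same five lines.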
The section 'Remedies for Inner Subroutines' discusses many other possible workarounds for this problem.
You shouldn't be intimidated by this issue at all, since Perl is your
friend. Just keep the warnings mode On and Perl will gladly tell
you whenever you have this effect, by saying:
Variable "$counter" will not stay shared at ...[snipped]
Just don't forget to check your error_log file, before going into production!
By the way, the above example was pretty boring. In my first days of using mod_perl, I wrote a simple user registration program. I'll give a very simple representation of this program.
use CGI;
$q = CGI->new;
my $name = $q->param('name');
print_response();
sub print_response{
print "Content-type: text/plain\r\n\r\n";
print "Thank you, $name!";
}
My boss and I checked the program at the development server and it worked OK. So we decided to put it in production. Everything was OK, but my boss decided to keep on checking by submitting variations of his profile. Imagine the surprise when after submitting his name (let's say "The Boss" :), he saw the response "Thank you, Stas Bekman!".
What happened is that I tried the production system as well. I was new to mod_perl stuff, and was so excited with the speed improvement that I didn't notice the nested subroutine problem. It hit me. At first I thought that maybe Apache had started to confuse connections, returning responses from other people's requests. I was wrong of course.
Why didn't we notice this when we were trying the software on our development server? Keep reading and you will understand why.
Let's return to our original example and proceed with the second mystery we noticed. Why did we see inconsistent results over numerous reloads?
That's very simple. Every time a server gets a request to process, it hands it over to one of the children, generally in a round robin fashion. So if you have 10 httpd children alive, the first 10 reloads might seem to be correct because the effect we've just talked about starts to appear from the second re-invocation. Subsequent reloads then return unexpected results.
Moreover, requests can appear at random and children don't always run the same scripts. At any given moment one of the children could have served the same script more times than any other, and another may never have run it. That's why we saw the strange behavior.
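A toy model makes this easier to picture. The sketch below assumes three children and a perfectly regular round robin dispatch, which is a simplification of what a real server does, and it only tracks the value each child's stale $counter would have reached.

```perl
use strict;
use warnings;

my $children = 3;                  # assumed number of httpd children
my @counter  = (0) x $children;    # each child keeps its own stale $counter

my @first_line;
for my $request (0..5) {
    my $child = $request % $children;       # idealized round robin
    push @first_line, $counter[$child] + 1; # first value the reload prints
    $counter[$child] += 5;                  # the script increments five times
}

print "Reloads start counting at: @first_line\n";
```

The first three reloads start at 1 and look correct; only the fourth one, which lands on a child that has already run the script, starts at 6.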
Now you see why we didn't notice the problem with the user
registration system in the example. First, we didn't look at the
error_log. (As a matter of fact we did, but there were so many
warnings in there that we couldn't tell what were the important ones
and what were not). Second, we had too many server children running to
notice the problem.
A workaround is to run the server as a single process. You achieve
this by invoking the server with the -X parameter (httpd -X).
Since there are no other servers (children) running, you will see the
problem on the second reload.
But before that, let the error_log help you detect most of the
possible errors--most of the warnings can become errors, so you should
make sure to check every warning that is detected by perl, and
probably you should write your code in such a way that no warnings
appear in the error_log. If your error_log file is filled up
with hundreds of lines on every script invocation, you will have
difficulty noticing and locating real problems--and on a production
server you'll soon run out of disk space if your site is popular.
Of course none of the warnings will be reported if the warning
mechanism is not turned On. Refer to the section "Tracing Warnings Reports" to learn about
warnings in general and to the "Warnings" section
to learn how to turn them on and off under mod_perl.
When you start running your scripts under mod_perl, you might find
yourself in a situation where a script seems to work, but sometimes it
screws up. And the more it runs without a restart, the more it screws
up. Often the problem is easily detectable and solvable. You have to
test your script under a server running in single process mode
(httpd -X).
Generally the problem is the result of using global variables. Because global variables don't change from one script invocation to another unless you change them, you can find your scripts do strange things.
Let's look at three real world examples:
The first example is amazing--Web Services. Imagine that you enter some site where you have an account, perhaps a free email account. Having read your own mail you decide to take a look at someone else's.
You type in the username you want to peek at and a dummy password and try to enter the account. On some services this will work!!!
You say, why in the world does this happen? The answer is simple: Global Variables. You have entered the account of someone who happened to be served by the same server child as you. Because of sloppy programming, a global variable was not reset at the beginning of the program and voila, you can easily peek into someone else's email! Here is an example of sloppy code:
use vars qw($authenticated);
my $q = new CGI;
my $username = $q->param('username');
my $passwd = $q->param('passwd');
authenticate($username,$passwd);
# failed, break out
unless ($authenticated){
    print "Wrong passwd";
    exit;
}
# user is OK, fetch user's data
show_user($username);
sub authenticate{
    my ($username,$passwd) = @_;
    # some checking
    $authenticated = 1 if SOME_USER_PASSWD_CHECK_IS_OK;
}
Do you see the catch? With the code above, I can type in any valid
username and any dummy password and enter that user's account,
provided she has successfully entered her account before me using the
same child process! Since $authenticated is global--if it becomes 1
once, it'll stay 1 for the remainder of the child's life!!! The
solution is trivial--reset $authenticated to 0 at the beginning of
the program.
A cleaner solution of course is not to rely on global variables, but rely on the return value from the function.
my $q = CGI->new;
my $username = $q->param('username');
my $passwd = $q->param('passwd');
my $authenticated = authenticate($username,$passwd);
# failed, break out
unless ($authenticated){
    print "Wrong passwd";
    exit;
}
# user is OK, fetch user's data
show_user($username);
sub authenticate{
    my ($username,$passwd) = @_;
    # some checking
    return (SOME_USER_PASSWD_CHECK_IS_OK) ? 1 : 0;
}
Of course this example is trivial--but believe me it happens!
Just another little one liner that can spoil your day, assuming you
forgot to reset the $allowed variable. It works perfectly OK in
plain mod_cgi:
$allowed = 1 if $username eq 'admin';
But under mod_perl, if your system administrator with superuser access rights has previously used the system, anybody who is lucky enough to be served later by the same child which served your administrator will gain the same rights.
The obvious fix is:
$allowed = $username eq 'admin' ? 1 : 0;
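To see the difference between the two one liners, the following sketch calls each variant three times in a row, which is my stand-in for three requests hitting the same child. The sub names and the list of user names are invented for the demonstration.

```perl
use strict;
use warnings;

our $allowed;  # package variable: survives between calls, just as it
               # survives between requests served by one child

sub sloppy_check {
    my ($username) = @_;
    $allowed = 1 if $username eq 'admin';     # never reset to false
    return $allowed ? 1 : 0;
}

sub fixed_check {
    my ($username) = @_;
    $allowed = $username eq 'admin' ? 1 : 0;  # assigned on every call
    return $allowed;
}

my @users  = qw(guest admin guest);
my @sloppy = map { sloppy_check($_) } @users;
$allowed   = undef;                           # start the second run clean
my @fixed  = map { fixed_check($_) } @users;

print "sloppy: @sloppy\n";
print "fixed:  @fixed\n";
```

In the sloppy run the guest who arrives after admin is let in as well; in the fixed run she is not.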
Another good example is usage of the /o regular expression
modifier, which compiles a regular expression once, on its first
execution, and never compiles it again. This problem can be difficult
to detect, as after restarting the server each request you make will
be served by a different child process, and thus the regex pattern for
that child will be compiled afresh. Only when you make a request that
happens to be served by a child which has already cached the regex
will you see the problem. Generally you miss that. When you press
reload, you see that it works (with a new, fresh child). Eventually it
doesn't, because you get a child that has already cached the regex
and won't recompile because of the /o modifier.
An example of such a case would be:
my $pat = $q->param("keyword");
foreach( @list ) {
    print if /$pat/o;
}
To make sure you don't miss these bugs, always test your CGI in single process mode.
To solve this particular /o modifier problem refer to Compiled Regular Expressions.
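The stale pattern is easy to reproduce in a plain script by calling the same code twice with different keywords. In the sketch below the sub names and the word list are invented, and qr// is shown as just one possible remedy; the section referred to above may well recommend others.

```perl
use strict;
use warnings;

my @list = qw(apple banana cherry);

sub search_with_o {
    my ($pat) = @_;
    return grep { /$pat/o } @list;   # pattern frozen on first execution
}

sub search_with_qr {
    my ($pat) = @_;
    my $re = qr/$pat/;               # compiled afresh on every call
    return grep { /$re/ } @list;
}

my @o_first   = search_with_o('apple');
my @o_second  = search_with_o('cherry');   # still uses the 'apple' pattern
my @qr_first  = search_with_qr('apple');
my @qr_second = search_with_qr('cherry');

print "/o: @o_first, then @o_second\n";
print "qr: @qr_first, then @qr_second\n";
```

With /o the second search still finds apple, exactly the kind of surprise a long-lived child produces; the qr// version follows the new keyword.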
Scripts under Apache::Registry do not run in package main, they run
in a unique name space based on the requested URI. For example, if
your URI is /perl/test.pl the package will be called
Apache::ROOT::perl::test_2epl.
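If you want to predict the package name for a given URI, the mapping can be approximated in a few lines. This is my own simplified re-creation for illustration, not the code Apache::Registry runs, and the function name registry_package is made up.

```perl
use strict;
use warnings;

# Rough re-creation of the name mangling, for illustration only.
sub registry_package {
    my ($uri) = @_;
    my $name = $uri;
    # Anything that is not a word character or a slash becomes _XX,
    # where XX is the character's hex code ("." is 0x2e).
    $name =~ s/([^A-Za-z0-9_\/])/sprintf("_%2x", ord $1)/eg;
    # Each run of slashes becomes a package separator.
    $name =~ s{/+}{::}g;
    return "Apache::ROOT$name";
}

print registry_package('/perl/test.pl'), "\n";
print registry_package('/perl/conference/counter.pl'), "\n";
```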
The basic Perl @INC behaviour is explained in section use(), require(), do(), %INC and @INC Explained.
When running under mod_perl, once the server is up @INC is frozen
and cannot be updated. The only opportunity to temporarily modify
@INC is while the script or the module is loaded and compiled for
the first time. After that its value is reset to the original
one. The only way to change @INC permanently is to modify it at
Apache startup.
Two ways to alter @INC at server startup:
In the configuration file. For example add:
PerlSetEnv PERL5LIB /home/httpd/perl
or
PerlSetEnv PERL5LIB /home/httpd/perl:/home/httpd/mymodules
Note that this setting will be ignored if you have the
PerlTaintCheck mode turned on.
In the startup file directly alter the @INC. For example
startup.pl
----------
use lib qw(/home/httpd/perl /home/httpd/mymodules);
1;
and load the startup file from the configuration file by:
PerlRequire /path/to/startup.pl
You might want to read the "use(), require(), do(), %INC and @INC Explained" section before you proceed with this section.
When you develop plain CGI scripts, you can just change the code, and rerun the CGI from your browser. Since the script isn't cached in memory, the next time you call it the server starts up a new perl process, which recompiles it from scratch. The effects of any modifications you've applied are immediately present.
The situation is different with Apache::Registry, since the whole
idea is to get maximum performance from the server. By default, the
server won't spend time checking whether any included library modules
have been changed. It assumes that they weren't, thus saving a few
milliseconds to stat() the source file (multiplied by however many
modules/libraries you use() and/or require() in your script.)
The only check that is done is to see whether your main script has been changed. So if you have only scripts which do not use() or require() other perl modules or packages, there is nothing to worry about. If, however, you are developing a script that includes other modules, the files you use() or require() aren't checked for modification and you need to do something about that.
So how do we get our mod_perl-enabled server to recognize changes in library modules? Well, there are a couple of techniques:
The simplest approach is to restart the server each time you apply some change to your code. See Server Restarting techniques.
After restarting the server about 100 times, you will tire of it and you will look for other solutions.
Help comes from the Apache::StatINC module. When Perl pulls a file
via require(), it stores the full pathname as a value in the global
hash %INC with the file name as the key. Apache::StatINC looks
through %INC and immediately reloads any files that have been
updated on disk.
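The idea is simple enough to sketch in plain Perl. What follows is my own minimal illustration of the technique (walk %INC, compare modification times, require() again), not the source of Apache::StatINC; the names reload_changed, write_file and MyDemoLib are invented, and the demo fakes an edit by rewriting a temporary module and pushing its mtime forward.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my %last_mtime;   # %INC key => modification time at the last check

# Reload every file listed in %INC whose mtime has moved forward.
sub reload_changed {
    my @reloaded;
    for my $key (sort keys %INC) {
        my $path = $INC{$key};
        next unless defined $path && -f $path;
        my $mtime = (stat $path)[9];
        if (!exists $last_mtime{$key}) {   # first sighting: just record it
            $last_mtime{$key} = $mtime;
            next;
        }
        next if $mtime <= $last_mtime{$key};
        delete $INC{$key};                 # so that require() loads it again
        local $^W = 0;                     # hush "Subroutine redefined"
        require $key;
        $last_mtime{$key} = $mtime;
        push @reloaded, $key;
    }
    return @reloaded;
}

sub write_file {
    my ($path, $text) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $text;
    close $fh or die "Cannot close $path: $!";
}

# Demo with a throwaway module.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/MyDemoLib.pm";
unshift @INC, $dir;

write_file($file, "package MyDemoLib; sub answer { 1 } 1;\n");
require MyDemoLib;
reload_changed();                          # records the current mtimes
my $before = MyDemoLib::answer();

write_file($file, "package MyDemoLib; sub answer { 2 } 1;\n");
my $future = time() + 10;
utime $future, $future, $file;             # force a visibly newer mtime
my @reloaded = reload_changed();
my $after    = MyDemoLib::answer();

print "reloaded: @reloaded ($before -> $after)\n";
```

Note that the sketch stats every file in %INC on each pass, which is exactly the per-request cost the real module pays.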
To enable this module just add two lines to httpd.conf.
PerlModule Apache::StatINC
PerlInitHandler Apache::StatINC
To be sure it really works, turn on debug mode on your development box
by adding PerlSetVar StatINCDebug On to your config file. You end
up with something like this:
PerlModule Apache::StatINC
<Location /perl>
SetHandler perl-script
PerlHandler Apache::Registry
Options ExecCGI
PerlSendHeader On
PerlInitHandler Apache::StatINC
PerlSetVar StatINCDebug On
</Location>
Be aware that only the modules located in @INC are reloaded on
change, and you can change @INC only before the server has been
started (in the startup file).
Nothing you do in your scripts and modules which are pulled in with
require() after server startup will have any effect on @INC.
When you write:
use lib qw(foo/bar);
@INC is changed only for the time the code is being parsed and
compiled. When that's done, @INC is reset to its original
value.
To make sure that you have set @INC correctly, configure
/perl-status location, fetch
http://www.example.com/perl-status?inc and look at the bottom of the
page, where the contents of @INC will be shown.
Notice the following trap:
While "." is in @INC, perl knows to require() files with
pathnames given relative to the current (script) directory. After the
script has been parsed, the server doesn't remember the path!
So you can end up with a broken entry in %INC like this:
$INC{"bar.pl"} eq "bar.pl"
If you want Apache::StatINC to reload your script--modify @INC at
server startup, or use a full path in the require() call.
Apache::Reload comes as a drop-in replacement for
Apache::StatINC. It provides extra functionality and better
flexibility.
If you want Apache::Reload to check all the loaded modules on each
request, you just add to httpd.conf:
PerlInitHandler Apache::Reload
If you want to reload only specific modules when these get changed, you have two ways to do that.
The first way is to turn Off the ReloadAll variable, which is
On by default
PerlInitHandler Apache::Reload
PerlSetVar ReloadAll Off
and add:
use Apache::Reload;
to every module that you want to be reloaded on change.
The second way is to explicitly specify modules to be reloaded in httpd.conf:
PerlInitHandler Apache::Reload
PerlSetVar ReloadModules "My::Foo My::Bar Foo::Bar::Test"
Note that these are split on whitespace, but the module list must be in quotes, otherwise Apache tries to parse the parameter list.
You can register groups of modules using the metacharacter (*).
PerlSetVar ReloadModules "Foo::* Bar::*"
In the above example all modules starting with Foo:: and Bar:: will become registered. This feature allows you to register a whole project's module tree with one pattern.
You can also set a file that you can touch(1) that causes the reloads to be performed. If you set this, and don't touch(1) the file, the reloads don't happen (no matter how you have registered the modules to be reloaded).
PerlSetVar ReloadTouchFile /tmp/reload_modules
Now when you're happy with your changes, simply go to the command line and type:
% touch /tmp/reload_modules
This feature is very convenient in a production server environment, but compared to a full restart, the benefits of preloaded modules memory sharing are lost, since each child will get its own copy of the reloaded modules.
This module might have a problem with reloading single modules that contain multiple packages that all use pseudo-hashes.
Also if you have modules loaded from directories which are not in
@INC, Apache::Reload will fail to find the files, due to the fact
that @INC is reset to its original value even if it gets temporarily
modified in the script. The solution is to extend @INC at
server startup to include the directories you load the files from which
aren't in @INC.
For example, if you have a script which loads MyTest.pm from /home/stas/myproject:
use lib qw(/home/stas/myproject);
require MyTest;
Apache::Reload won't find this file, unless you alter @INC in
startup.pl (or httpd.conf):
startup.pl
----------
use lib qw(/home/stas/myproject);
and restart the server. Now the problem is solved.
Checking all the modules in %INC on every request can add a large
overhead to server response times, and you certainly would not want
the Apache::StatINC module to be enabled in your production site's
configuration. But sometimes you want a configuration file reloaded
when it is updated, without restarting the server.
This is an especially important feature if for example you have a person that is allowed to modify some of the tool configuration, but for security reasons it's undesirable for him to telnet to the server to restart it.
Since we are talking about configuration files, I would like to show you some good and bad approaches to configuration file writing.
If you have a configuration file of just a few variables, it doesn't really matter how you do it. But generally this is not the case. Configuration files tend to grow as a project grows. It's very relevant to projects that generate HTML files, since they tend to demand many easily configurable parameters, like headers, footers, colors and so on.
So let's start with the approach that is most often taken by CGI scripts writers. All configuration variables are defined in a separate file.
For example:
$cgi_dir = "/home/httpd/perl";
$cgi_url = "/perl";
$docs_dir = "/home/httpd/docs";
$docs_url = "/";
$img_dir = "/home/httpd/docs/images";
$img_url = "/images";
... many more config params here ...
$color_hint = "#777777";
$color_warn = "#990066";
$color_normal = "#000000";
The use strict; pragma demands that all the variables be declared.
When we want to use these variables in a mod_perl script we must
declare them with use vars in the script. (As of Perl v5.6.0,
our() can be used instead of use vars.)
So we start the script with:
use strict;
use vars qw($cgi_dir $cgi_url $docs_dir $docs_url
... many more config params here ....
$color_hint $color_warn $color_normal
);
It is a nightmare to maintain such a script, especially if not all the features have been coded yet. You have to keep adding and removing variable names. But that's not a big deal.
Since we want our code clean, we start the configuration file with
use strict; as well, so we have to list the variables with use
vars pragma here as well. A second list of variables to maintain.
If you have many scripts, you may get collisions between configuration files. One of the best solutions is to declare packages, with unique names of course. For example for our configuration file we might declare the following package name:
package My::Config;
The moment you add a package declaration and think that you are done,
you realize that the nightmare has just begun. When you have declared
the package, you cannot just require() the file and use the variables,
since they now belong to a different package. So you have either to
modify all your scripts to use a fully qualified notation like
$My::Config::cgi_url instead of just $cgi_url or to import the
needed variables into any script that is going to use them.
Since you don't want to do the extra typing to make the variables fully qualified, you'd go for the importing approach. But your configuration package has to export them first. That means that you have to list all the variables again, and now you have to keep at least three variable lists updated when you make some changes in the naming of the configuration variables. And that's when you have only one script that uses the configuration file; in the general case you have many of them. So now our example configuration file looks like this:
package My::Config;
use strict;
BEGIN {
    use Exporter ();
    @My::Config::ISA = qw(Exporter);
    @My::Config::EXPORT = qw();
    @My::Config::EXPORT_OK = qw($cgi_dir $cgi_url $docs_dir $docs_url
    ... many more config params here ....
    $color_hint $color_warn $color_normal);
}
use vars qw($cgi_dir $cgi_url $docs_dir $docs_url
... many more config params here ....
$color_hint $color_warn $color_normal
);
$cgi_dir = "/home/httpd/perl";
$cgi_url = "/perl";
$docs_dir = "/home/httpd/docs";
$docs_url = "/";
$img_dir = "/home/httpd/docs/images";
$img_url = "/images";
... many more config params here ...
$color_hint = "#777777";
$color_warn = "#990066";
$color_normal = "#000000";
And in the code:
use strict;
use My::Config qw($cgi_dir $cgi_url $docs_dir $docs_url
... many more config params here ....
$color_hint $color_warn $color_normal
);
use vars qw($cgi_dir $cgi_url $docs_dir $docs_url
... many more config params here ....
$color_hint $color_warn $color_normal
);
This approach is especially bad in the context of mod_perl, since exported variables add a memory overhead. The more variables exported the more memory you use. If we multiply this overhead by the number of servers we are going to run, we get a pretty big number which could be used to run a few more servers instead.
As a matter of fact things aren't so bad. You can group your variables, and call the groups by special names called tags, which can later be used as arguments to the import() or use() calls. You are probably familiar with:
use CGI qw(:standard :html);
We can implement this quite easily, with the help of export_ok_tags()
from Exporter. For example:
BEGIN {
    use Exporter ();
    use vars qw( @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS);
    @ISA = qw(Exporter);
    @EXPORT = qw();
    @EXPORT_OK = qw();
    %EXPORT_TAGS = (
        vars => [qw($fname $lname)],
        subs => [qw(reread_conf untaint_path)],
    );
    Exporter::export_ok_tags('vars');
    Exporter::export_ok_tags('subs');
}
You export subroutines exactly like variables, since what's actually being exported is a symbol. The definition of these subroutines is not shown here.
Notice that we didn't use export_tags(), as it exports the variables automatically without the user asking for them in the first place, which is considered bad style. If a module automatically exports variables with export_tags() you can stop this by not exporting at all:
use My::Config ();
In your code you can now write:
use My::Config qw(:subs :vars);
Groups of group tags:
The :all tag from CGI.pm is a group tag of all other groups. It
will require a little more effort to implement, but you can always save
time by looking at the solution in CGI.pm's code. It's just a
matter of a little code to expand all the groups recursively.
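For the simple case where every group lists plain symbol names, the :all group can be built as the union of the others. The sketch below is my own and does not reproduce CGI.pm's code; it flattens only one level, so groups that refer to other groups would need the recursive expansion mentioned above. The sample values and placeholder sub bodies are assumptions, and the package sits in the same file as the caller only to keep the example runnable.

```perl
use strict;
use warnings;

package My::Config;   # normally in My/Config.pm
use Exporter ();

our @ISA = qw(Exporter);
our ($fname, $lname) = ('Stas', 'Bekman');          # sample values

sub reread_conf  { return 'configuration reread' }  # placeholder bodies
sub untaint_path { return $_[0] }

our %EXPORT_TAGS = (
    vars => [qw($fname $lname)],
    subs => [qw(reread_conf untaint_path)],
);

# Build :all as the union of every other group, skipping duplicates.
{
    my %seen;
    $EXPORT_TAGS{all} = [
        grep { !$seen{$_}++ }
        map  { @{ $EXPORT_TAGS{$_} } } sort keys %EXPORT_TAGS
    ];
}
Exporter::export_ok_tags('all');   # everything in :all may be imported

package main;

My::Config->import(':all');        # what "use My::Config qw(:all)" does
my @all = @{ $My::Config::EXPORT_TAGS{all} };
print "all: @all\n";
print main->can('reread_conf') ? "reread_conf imported\n" : "not imported\n";
```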
After going through the pain of maintaining a list of variables in a big project with a huge configuration file (more than 100 variables) and many files actually using them, I came up with a much simpler solution: keeping all the variables in a single hash, which is built from references to other anonymous scalars, arrays and hashes.
Now my configuration file looks like this:
package My::Config;
use strict;
BEGIN {
    use Exporter ();
    @My::Config::ISA = qw(Exporter);
    @My::Config::EXPORT = qw();
    @My::Config::EXPORT_OK = qw(%c);
}
use vars qw(%c);
%c = (
    dir => {
        cgi => "/home/httpd/perl",
        docs => "/home/httpd/docs",
        img => "/home/httpd/docs/images",
    },
    url => {
        cgi => "/perl",
        docs => "/",
        img => "/images",
    },
    color => {
        hint => "#777777",
        warn => "#990066",
        normal => "#000000",
    },
);
Good perl style suggests keeping a comma at the end of lists. That's because additional items tend to be added to the end of the list. If you keep that last comma in place, you don't have to remember to add one when you add a new item.
So now the script looks like this:
use strict;
use My::Config qw(%c);
use vars qw(%c);
print "Content-type: text/plain\r\n\r\n";
print "My url docs root: $c{url}{docs}\n";
Do you see the difference? The whole mess has gone, there is only one variable to worry about.
There is one small downside to taking this approach: misspelled
keys go unnoticed. For example, if we wrote $c{url}{doc} by mistake,
perl would silently return undef rather than report an error, and a
misspelled intermediate key, as in $c{ulr}{docs}, would be silently
created (auto-vivified) as an empty hash. When we use strict; Perl will
tell us about any misspelling of this kind for a simple scalar, but this
check is not performed for hash elements. This puts the onus of
responsibility back on us since we must take greater care. Pseudo-hashes
were once a possible solution, but they were experimental and have
since been removed from Perl (in 5.10), so we won't cover them here.
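One way to catch misspelled keys, offered here only as a sketch and not as part of the original approach, is to restrict the hashes to their current keys with lock_keys() from the core Hash::Util module (available since Perl 5.8). The trimmed-down %c below is an assumption for illustration.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys);
use vars qw(%c);

%c = (
    url => { cgi => "/perl", docs => "/", img => "/images" },
);

# Restrict each nested hash, then the top-level hash, to its current keys.
lock_keys(%$_) for grep { ref $_ eq 'HASH' } values %c;
lock_keys(%c);

print "docs: $c{url}{docs}\n";    # a valid key still works

# A misspelled key now raises an exception instead of yielding undef.
my $value = eval { $c{url}{doc} };
print "caught: $@" if $@;
```

Values can still be changed after locking, but adding new keys is refused, so check whether that restriction suits code which extends the configuration at run time.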
The benefits of the hash approach are significant and we can do
even better. I would like to get rid of the Exporter stuff
completely. I remove all the exporting code so my config file now
looks like:
package My::Config;
use strict;
use vars qw(%c);
%c = (
dir => {
cgi => "/home/httpd/perl",
docs => "/home/httpd/docs",
img => "/home/httpd/docs/images",
},
url => {
cgi => "/perl",
docs => "/",
img => "/images",
},
color => {
hint => "#777777",
warn => "#990066",
normal => "#000000",
},
);
And the code:
use strict;
use My::Config ();
print "Content-type: text/plain\r\n\r\n";
print "My url docs root: $My::Config::c{url}{docs}\n";
Since we still want to save lots of typing, and since now we need to
use a fully qualified notation like $My::Config::c{url}{docs},
let's use the magical Perl aliasing feature. I'll modify the code to
be:
use strict;
use My::Config ();
use vars qw(%c);
*c = \%My::Config::c;
print "Content-type: text/plain\r\n\r\n";
print "My url docs root: $c{url}{docs}\n";
I have aliased the *c glob with \%My::Config::c, a reference to
a hash. From now on, %My::Config::c and %c are the same
hash and you can read from or modify either of them.
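To see that the two names really refer to one hash, here is a small self-contained sketch. It puts both packages in a single file purely for demonstration, and the value "/manual" is made up.

```perl
use strict;
use warnings;

package My::Config;
use vars qw(%c);
%c = ( url => { docs => "/" } );

package main;
use vars qw(%c);

*c = \%My::Config::c;         # %c is now another name for %My::Config::c

$c{url}{docs} = "/manual";    # modify through the alias
print "My::Config sees: $My::Config::c{url}{docs}\n";
print "same hash\n" if \%c == \%My::Config::c;
```

Because only the hash slot of the glob is assigned, other variables named c in the script (a scalar $c, for instance) are left alone.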
Just one last little point. Sometimes you see a lot of redundancy in the configuration variables, for example:
$cgi_dir = "/home/httpd/perl"; $docs_dir = "/home/httpd/docs"; $img_dir = "/home/httpd/docs/images";
Now if you want to move the base path "/home/httpd" into a new
place, it demands lots of typing. Of course the solution is:
$base = "/home/httpd"; $cgi_dir = "$base/perl"; $docs_dir = "$base/docs"; $img_dir = "$docs_dir/images";
You cannot do the same trick with a hash, since you cannot refer to its values before the definition is finished. So this wouldn't work:
%c =
(
base => "/home/httpd",
dir => {
cgi => "$c{base}/perl",
docs => "$c{base}/docs",
img => "$c{base}{docs}/images",
},
);
But nothing stops us from adding additional variables, which are lexically scoped with my (). The following code is correct.
my $base = "/home/httpd";
%c =
(
dir => {
cgi => "$base/perl",
docs => "$base/docs",
img => "$base/docs/images",
},
);
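If the base path should also be stored in the hash itself, another possibility (a sketch, not something the original configuration does) is to fill the dependent entries in a second step, after the values they refer to exist.

```perl
use strict;
use warnings;
use vars qw(%c);

# Step 1: define the values that others depend on.
%c = ( base => "/home/httpd" );

# Step 2: add the entries that refer to already-defined values.
$c{dir} = {
    cgi  => "$c{base}/perl",
    docs => "$c{base}/docs",
};
$c{dir}{img} = "$c{dir}{docs}/images";

print "$c{dir}{img}\n";
```

The price is that the definition is no longer a single expression, so the order of the statements matters.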
You have just learned how to make configuration files easily maintainable, and how to save memory by avoiding the export of variables into a script's namespace.
First, let's look at a simple case, when we just have to look after a simple configuration file like the one below. Imagine a script that tells you who is the patch pumpkin of the current Perl release.
Sidenote: Pumpkin A humorous term for the token (notional or real) that gives its possessor (the "pumpking" or the "pumpkineer") exclusive access to something, e.g. applying patches to a master copy of some source (for which the token is called the "patch pumpkin").
use CGI ();
use strict;
my $fname = "Larry";
my $lname = "Wall";
my $q = CGI->new;
print $q->header(-type=>'text/html');
print $q->p("$fname $lname holds the patch pumpkin " .
"for this Perl release.");
The script has a hardcoded value for the name. It's very simple: initialize the CGI object, print the proper HTTP header and tell the world who is the current patch pumpkin.
When the patch pumpkin changes we don't want to modify the
script. Therefore, we put the $fname and $lname variables into a
configuration file.
$fname = "Gurusamy";
$lname = "Sarathy";
1;
Please note that there is no package declaration in the above file, so
the code will be evaluated in the caller's package or in the main::
package if none was declared. This means that the variables $fname
and $lname will override (or initialize if they weren't yet) the
variables with the same names in the caller's namespace. This works
for global variables only--you cannot update variables defined
lexically (with my ()) using this technique.
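The difference between global and lexical variables can be checked with a throwaway configuration file. In this sketch the temporary file and the extra $title variable are assumptions added for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use vars qw($fname $lname);

# Write a throwaway configuration file (no package declaration).
my ($fh, $conf) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print $fh q{$fname = "Gurusamy"; $lname = "Sarathy"; $title = "pumpking"; 1;};
close $fh or die "close failed: $!";

$fname = "Larry";
$lname = "Wall";
my $title = "unchanged";    # lexical: invisible to the configuration file

do $conf or die "couldn't load $conf: ", ($@ || $!);

# The globals were overridden; the lexical $title was not. The file
# assigned to the global $main::title instead.
print "$fname $lname ($title)\n";
```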
You have started the server and everything is working properly. After a while you decide to modify the configuration. How do you let your running server know that the configuration was modified without restarting it? Remember we are in production and server restarting can be quite expensive for us. One of the simplest solutions is to poll the file's modification time by calling stat() before the script starts to do real work. If we see that the file was updated, we force a reconfiguration of the variables located in this file. We will call the function that reloads the configuration reread_conf() and have it accept a single argument, which is the relative path to the configuration file.
Apache::Registry calls a chdir() to the script's directory before
it starts the script's execution. So if your CGI script is invoked
under the Apache::Registry handler you can put the configuration
file in the same directory as the script. Alternatively you can put
the file in a directory below that and use a path relative to the
script directory. You have to make sure that the file will be found,
somehow. Be aware that do() searches the libraries in the directories
in @INC.
use vars qw(%MODIFIED);
sub reread_conf{
my $file = shift;
return unless defined $file;
return unless -e $file and -r _;
my $mod = -M _;
unless (exists $MODIFIED{$file} and $MODIFIED{$file} == $mod) {
my $result;
unless ($result = do $file) {
warn "couldn't parse $file: $@" if $@;
warn "couldn't do $file: $!" unless defined $result;
warn "couldn't run $file" unless $result;
}
$MODIFIED{$file} = $mod; # Update the MODIFICATION times
}
} # end of reread_conf
Notice that we use the == comparison operator when checking the file's
modification timestamp, because all we want to know is whether the file
was changed or not.
When the require(), use() and do() operators successfully return, the
file that was passed as an argument is inserted into %INC (the key
is the name of the file and the value the path to it). Specifically,
when Perl sees require() or use() in the code, it first tests %INC
to see whether the file is already there and thus loaded. If the test
returns true, Perl saves the overhead of code re-reading and
re-compiling; however calling do() will (re)load regardless.
You generally don't notice with plain perl scripts, but in mod_perl it's used all the time; after the first request served by a process all the files loaded by require() stay in memory. If the file is preloaded at server startup, even the first request doesn't have the loading overhead.
We use do() to reload the code in this file and not require() because
while do() behaves almost identically to require(), it reloads the
file unconditionally. If do() cannot read the file, it returns
undef and sets $! to report the error. If do() can read the
file but cannot compile it, it returns undef and sets an error
message in $@. If the file is successfully compiled, do() returns
the value of the last expression evaluated.
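The three outcomes of do() can be told apart as in the following sketch. The helper names make_file() and try_do() and the file names are made up for the example. $@ is checked first, since a failed compilation may also set $!.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Hypothetical helper: create a file with the given content, return its path.
sub make_file {
    my ($name, $content) = @_;
    my $path = "$dir/$name";
    open my $fh, '>', $path or die "can't write $path: $!";
    print $fh $content;
    close $fh or die "can't close $path: $!";
    return $path;
}

# Classify the outcome of do(), checking $@ before anything else.
sub try_do {
    my ($file) = @_;
    $@ = '';
    $! = 0;
    my $result = do $file;
    return "compile error" if $@;
    return "unreadable"    unless defined $result;
    return "false value"   unless $result;
    return "ok";
}

my $good   = make_file("good.pl",   "1;\n");
my $broken = make_file("broken.pl", "my \$x = ;\n");
my $false  = make_file("false.pl",  "0;\n");

print "good.pl:    ", try_do($good),             "\n";
print "broken.pl:  ", try_do($broken),           "\n";
print "false.pl:   ", try_do($false),            "\n";
print "missing.pl: ", try_do("$dir/missing.pl"), "\n";
```

A file whose last expression legitimately evaluates to undef would be reported as unreadable by this helper, which is one more reason to end configuration files with 1;.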
The configuration file can be broken if someone has incorrectly modified it. We don't want the whole service that uses that file to be broken, just because of that. We trap the possible failure to do() the file and ignore the changes, by resetting the modification time. If do() fails to load the file it might be a good idea to send an email to the system administrator about the problem.
Notice however, that since do() updates %INC like require() does,
if you are using Apache::StatINC it will attempt to reload this
file before the reread_conf() call. So if the file wouldn't compile,
the request will be aborted. Apache::StatINC shouldn't be used in
production (because it slows things down by stat()'ing all the files
listed in %INC) so this shouldn't be a problem.
Note that we assume that the entire purpose of this function is to reload the configuration if it was changed. This is fail-safe, because if something goes wrong we just return without modifying the server configuration. The script should not be used to initialize the variables on its first invocation. To do that, you would need to replace each occurrence of return() and warn() with die(). If you do that, take a look at the section "Redirecting Errors to the Client instead of error_log".
I used the above approach when I had a huge configuration file that was loaded only at server startup, and another little configuration file that included only a few variables that could be updated by hand or through the web interface. Those variables were initialized in the main configuration file. If the webmaster breaks the syntax of this dynamic file while updating it by hand, it won't affect the main (write-protected) configuration file and so won't stop the proper execution of the programs. Soon we will see a simple web interface which allows us to modify the configuration file without actually breaking it.
A sample script using the presented subroutine would be:
use vars qw(%MODIFIED $fname $lname);
use CGI ();
use strict;
my $q = CGI->new;
print $q->header(-type=>'text/html');
my $config_file = "./config.pl";
reread_conf($config_file);
print $q->p("$fname $lname holds the patch pumpkin " .
"for this Perl release.");
sub reread_conf{
my $file = shift;
return unless defined $file;
return unless -e $file and -r _;
my $mod = -M _;
unless ($MODIFIED{$file} and $MODIFIED{$file} == $mod) {
my $result;
unless ($result = do $file) {
warn "couldn't parse $file: $@" if $@;
warn "couldn't do $file: $!" unless defined $result;
warn "couldn't run $file" unless $result;
}
$MODIFIED{$file} = $mod; # Update the MODIFICATION times
}
} # end of reread_conf
Remember that you should be using (stat $file)[9] instead of -M
$file if you are modifying the $^T variable. In some of my
scripts, I reset $^T to the time of the script invocation with
"$^T = time()". That way I can perform -M and the similar
(-A, -C) file status tests relative to the script invocation
time, and not the time the process was started.
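The following sketch illustrates why the absolute modification time is the safer thing to store once $^T is being changed: the value returned by stat() stays put, while -M moves with $^T. The temporary file is only there to have something to test.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $file) = tempfile(UNLINK => 1);
close $fh or die "close failed: $!";

my $mtime_before = (stat $file)[9];    # absolute mtime, seconds since the epoch
my $age_before   = -M $file;           # days between the mtime and $^T

$^T += 24 * 60 * 60;                   # pretend the script started a day later

my $mtime_after = (stat $file)[9];
my $age_after   = -M $file;

printf "mtime changed: %s\n", $mtime_before == $mtime_after ? "no" : "yes";
printf "-M shifted by %.2f days\n", $age_after - $age_before;
```

In reread_conf() this would mean storing (stat $file)[9] in %MODIFIED instead of the -M value.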
If your configuration file is more sophisticated and it declares a package and exports variables, the above code will work just as well. Even if you think that you will have to import() variables again, when do() recompiles the script the originally imported variables get updated with the values from the reloaded code.
The CGI script below allows a system administrator to dynamically update a configuration file through the web interface. Combining this with the code we have just seen to reload the modified files, you get a system which is dynamically reconfigurable without needing to restart the server. Configuration can be performed from any machine having just a web interface (a simple browser connected to the Internet).
Let's say you have a configuration file like this:
package MainConfig;
use strict;
use vars qw(%c);
%c = (
name => "Larry Wall",
release => "5.000",
comments => "Adding more ways to do the same thing :)",
other => "More config values",
hash => { foo => "ouch",
bar => "geez",
},
array => [qw( a b c)],
);
You want to make the variables name, release and comments
dynamically configurable. You want to have a web interface with an input
form that allows you to modify these variables. Once modified you want
to update the configuration file and propagate the changes to all the
currently running processes. Quite a simple task.
Let's look at the main stages of the implementation. Create a form with preset current values of the variables. Let the administrator modify it and submit the changes. Validate the submitted information (numeric fields should carry numbers, literals--words, etc). Update the configuration file. Update the modified value in the memory of the current process. Present the form as before but with updated fields if any.
The only part that seems to be complicated to implement is a configuration file update, for a couple of reasons. If updating the file breaks it, the whole service won't work. If the file is very big and includes comments and complex data structures, parsing the file can be quite a challenge.
So let's simplify the task. If all we want is to update a few variables, why don't we create a tiny configuration file with just those variables? It can be modified through the web interface and overwritten each time there is something to be changed. This way we don't have to parse the file before updating it. If the main configuration file is changed we don't care, we don't depend on it any more.
The dynamically updated variables are duplicated, they will be in the main file and in the dynamic file. We do this to simplify maintenance. When a new release is installed the dynamic configuration file won't exist at all. It will be created only after the first update. As we just saw, the only change in the main code is to add a snippet to load this file if it exists and was changed.
This additional code must be executed after the main configuration file has been loaded. That way the updated variables will override the default values in the main file.
# remember to run this code in taint mode
use strict;
use vars qw($q %c $dynamic_config_file %vars_to_change %validation_rules);
use CGI ();
use lib qw(.);
use MainConfig ();
*c = \%MainConfig::c;
$dynamic_config_file = "./config.pl";
# load the dynamic configuration file if it exists, and override the
# default values from the main configuration file
do $dynamic_config_file if -e $dynamic_config_file and -r _;
# fields that can be changed and their titles
%vars_to_change =
(
'name' => "Patch Pumpkin's Name",
'release' => "Current Perl Release",
'comments' => "Release Comments",
);
%validation_rules =
(
'name' => sub { $_[0] =~ /^[\w\s\.]+$/; },
'release' => sub { $_[0] =~ /^\d+\.[\d_]+$/; },
'comments' => sub { 1; },
);
$q = CGI->new;
print $q->header(-type=>'text/html'),
$q->start_html();
my %updates = ();
# We always rewrite the dynamic config file, so we want all the
# vars to be passed, but to save time we will only do checking
# of vars that were changed. The rest will be retrieved from
# the 'prev_foo' values.
foreach (keys %vars_to_change) {
# copy var so we can modify it
my $new_val = $q->param($_) || '';
# strip a possible ^M char (DOS/WIN)
$new_val =~ s/\cM//g;
# push to hash if was changed
$updates{$_} = $new_val
if defined $q->param("prev_".$_)
and $new_val ne $q->param("prev_".$_);
}
# Note that we cannot trust the previous values of the variables
# since they were presented to the user as hidden form variables,
# and the user can mangle those. We don't care: it cannot do any
# damage, as we verify each variable by rules which we define.
# Process if there is something to process. Will be not called if
# it's invoked a first time to display the form or when the form
# was submitted but the values weren't modified (we know that by
# comparing with the previous values of the variables, which are
# the hidden fields in the form)
# process and update the values if valid
process_change_config(%updates) if %updates;
# print the update form
conf_modification_form();
# update the config file but first validate that the values are correct ones
#########################
sub process_change_config{
my %updates = @_;
# we will list here all the malformatted vars
my %malformatted = ();
print $q->b("Trying to validate these values<BR>");
foreach (keys %updates) {
print "<DT><B>$_</B> => <PRE>$updates{$_}</PRE>";
# now we have to handle each var to be changed very carefully
# since this file goes immediately into production!
$malformatted{$_} = delete $updates{$_}
unless $validation_rules{$_}->($updates{$_});
} # end of foreach (keys %updates)
# print warnings if there are any invalid changes
print $q->hr,
$q->p($q->b(qq{Warning! These variables were changed
but found malformed, thus the original
values will be preserved.})
),
join(",<BR>",
map { $q->b($vars_to_change{$_}) . " : $malformatted{$_}\n"
} keys %malformatted)
if %malformatted;
# Now complete the vars that weren't changed from the
# $q->param('prev_var') values
map { $updates{$_} = $q->param('prev_'.$_) unless exists $updates{$_}
} keys %vars_to_change;
# Now we have all the data that should be written into the dynamic
# config file
# escape single quotes "'" while creating a file
my $content = join "\n",
map { $updates{$_} =~ s/(['\\])/\\$1/g;
'$c{' . $_ . "} = '" . $updates{$_} . "';\n"
} keys %updates;
# now add '1;' to make require() happy
$content .= "\n1;";
# keep the dummy result in $res so it won't complain
eval {my $res = $content};
if ($@) {
print qq{Warning! Something went wrong with config file
generation!<P> The error was : <BR><PRE>$@</PRE>};
return;
}
print $q->hr;
# overwrite the dynamic config file
use Symbol ();
my $fh = Symbol::gensym();
open $fh, ">$dynamic_config_file.bak"
or die "Can't open $dynamic_config_file.bak for writing :$! \n";
flock $fh,2; # exclusive lock
seek $fh,0,0; # rewind to the start
truncate $fh, 0; # the file might shrink!
print $fh $content;
close $fh;
# OK, now we make a real file
rename "$dynamic_config_file.bak",$dynamic_config_file
or die "Failed to rename: $!";
# rerun it to update variables in the current process! Note that
# it won't update the variables in other processes. Special
# code that watches the timestamps on the config file will do this
# work for each process. Since the next invocation will update the
# configuration anyway, why do we need to load it here? The reason
# is simple: we are going to fill the form's input fields with
# the updated data.
do $dynamic_config_file;
} # end sub process_change_config
##########################
sub conf_modification_form{
print $q->center($q->h3("Update Form"));
print $q->hr,
$q->p(qq{This form allows you to dynamically update the current
configuration. You don\'t need to restart the server in
order for changes to take effect}
);
# set the previous settings in the form's hidden fields, so we
# know whether we have to do some changes or not
map {$q->param("prev_$_",$c{$_}) } keys %vars_to_change;
# rows for the table, go into the form
my @configs = ();
# prepare one textfield entries
push @configs,
map {
$q->td(
$q->b("$vars_to_change{$_}:"),
),
$q->td(
$q->textfield(-name => $_,
-default => $c{$_},
-override => 1,
-size => 20,
-maxlength => 50,
)
),
} qw(name release);
# prepare multiline textarea entries
push @configs,
map {
$q->td(
$q->b("$vars_to_change{$_}:"),
),
$q->td(
$q->textarea(-name => $_,
-default => $c{$_},
-override => 1,
-rows => 10,
-columns => 50,
-wrap => "HARD",
)
),
} qw(comments);
print $q->startform('POST',$q->url),"\n",
$q->center($q->table(map {$q->Tr($_),"\n",} @configs),
$q->submit('','Update!'),"\n",
),
map ({$q->hidden("prev_".$_, $q->param("prev_".$_))."\n" }
keys %vars_to_change), # hidden previous values
$q->br,"\n",
$q->endform,"\n",
$q->hr,"\n",
$q->end_html;
} # end sub conf_modification_form
Once updated the script generates a file like:
$c{release} = '5.6';
$c{name} = 'Gurusamy Sarathy';
$c{comments} = 'Perl rules the world!';
1;
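The escaping step matters because the values end up inside single-quoted Perl strings. The sketch below isolates that step in a hypothetical helper, config_line(), which is not part of the script above, and checks that a value containing a quote and a backslash survives the round trip.

```perl
use strict;
use warnings;
use vars qw(%c);

# Build one line of the dynamic configuration file, escaping single
# quotes and backslashes so the value is safe inside '...'.
sub config_line {
    my ($key, $value) = @_;
    (my $escaped = $value) =~ s/(['\\])/\\$1/g;
    return '$c{' . $key . "} = '" . $escaped . "';\n";
}

my $original = q{Perl's path separator isn't \ on Unix};
my $line     = config_line(comments => $original);
print $line;

# Evaluate the generated line, as do() would when loading the file.
eval $line;
die "generated line failed to compile: $@" if $@;
print "round trip ok\n" if $c{comments} eq $original;
```

Note that under taint mode a string eval of user-supplied data is refused, so this check is meant for experimenting outside the CGI script.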
If you want to reload a perlhandler on each invocation, the following trick will do it:
PerlHandler "sub { do 'MyTest.pm'; MyTest::handler(shift) }"
do() reloads MyTest.pm on every request.
This section requires an in-depth understanding of use(), require(), do(), %INC and @INC.
To make things clear before we go into details: each child process has
its own %INC hash which is used to store information about its
compiled modules. The keys of the hash are the names of the modules
and files passed as arguments to require() and use(). The values are
the full or relative paths to these modules and files.
Suppose we have my-lib.pl and MyModule.pm both located at
/home/httpd/perl/my/.
/home/httpd/perl/my/ is in @INC at server startup.
require "my-lib.pl";
use MyModule;
print $INC{"my-lib.pl"},"\n";
print $INC{"MyModule.pm"},"\n";
prints:
/home/httpd/perl/my/my-lib.pl
/home/httpd/perl/my/MyModule.pm
Adding use lib:
use lib qw(.);
require "my-lib.pl";
use MyModule;
print $INC{"my-lib.pl"},"\n";
print $INC{"MyModule.pm"},"\n";
prints:
my-lib.pl
MyModule.pm
/home/httpd/perl/my/ isn't in @INC at server startup.
require "my-lib.pl";
use MyModule;
print $INC{"my-lib.pl"},"\n";
print $INC{"MyModule.pm"},"\n";
wouldn't work, since perl cannot find the modules.
Adding use lib:
use lib qw(.);
require "my-lib.pl";
use MyModule;
print $INC{"my-lib.pl"},"\n";
print $INC{"MyModule.pm"},"\n";
prints:
my-lib.pl
MyModule.pm
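The role of %INC can be observed outside the server too. The sketch below uses a temporary directory and a counter variable $loads, both invented for the demonstration, to show that a second require() is skipped while do() runs the file again.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use vars qw($loads);

my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/my-lib.pl" or die "can't write: $!";
print $fh '$main::loads++; 1;';
close $fh or die "can't close: $!";

unshift @INC, $dir;     # make the directory searchable

require "my-lib.pl";    # compiled and recorded in %INC
require "my-lib.pl";    # found in %INC, so skipped
print "loads after two require() calls: $loads\n";

do "my-lib.pl";         # do() runs the file again regardless
print "loads after do(): $loads\n";

# The key is the name passed to require(), the value is where it was found.
print "my-lib.pl => $INC{'my-lib.pl'}\n";
```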
Let's look at three scripts with faults related to name space. For the following discussion we will consider just one individual child process.
First, you can't have two identical module names running on the
same server! Only the first one found in a use() or require()
statement will be compiled into the package, the request for the other
module will be skipped, since the server will think that it's already
compiled. This is a direct result of using %INC, which has keys
equal to the names of the modules. Two identical names will refer to
the same key in the hash. (Refer to the section 'Looking inside the server' to find out how you can know
what is loaded and where.)
So if you have two different Foo modules in two different
directories and two scripts tool1.pl and tool2.pl, placed
like this:
./tool1/Foo.pm
./tool1/tool1.pl
./tool2/Foo.pm
./tool2/tool2.pl
Where some sample code could be:
./tool1/tool1.pl
----------------
use Foo;
print "Content-type: text/plain\r\n\r\n";
print "I'm Script number One\n";
foo();
./tool1/Foo.pm
--------------
sub foo{
print "<B>I'm Tool Number One!</B>\n";
}
1;
./tool2/tool2.pl
----------------
use Foo;
print "Content-type: text/plain\r\n\r\n";
print "I'm Script number Two\n";
foo();
./tool2/Foo.pm
--------------
sub foo{
print "<B>I'm Tool Number Two!</B>\n";
}
1;
Both scripts call use Foo;. Only the first one called will know
about Foo. When you call the second script it will not know about
Foo at all--it's like you've forgotten to write use Foo;. Run
the server in single server mode to detect this kind of
bug immediately.
You will see the following in the error_log file:
Undefined subroutine &Apache::ROOT::perl::tool2::tool2_2epl::foo called at /home/httpd/perl/tool2/tool2.pl line 4.
If the files do not declare a package, the above is true for libraries (e.g. my-lib.pl) you require() as well:
Suppose that you have a directory structure like this:
./tool1/config.pl
./tool1/tool1.pl
./tool2/config.pl
./tool2/tool2.pl
and both scripts contain:
use lib qw(.);
require "config.pl";
while ./tool1/config.pl can be something like this:
$foo = 0;
1;
and ./tool2/config.pl:
$foo = 1;
1;
The second scenario is no different from the first: there is almost
no difference between use() and require() if you don't have to import
any symbols into the calling script. Only the first script served will
actually do the require(), for the same reason as in the example above:
%INC already includes the key "config.pl"!
It is interesting that the following scenario will fail too!
./tool/config.pl
./tool/tool1.pl
./tool/tool2.pl
where tool1.pl and tool2.pl both require() the same
config.pl.
There are three solutions for this:
The first two faulty scenarios can be solved by placing your library modules in a subdirectory structure so that they have different path prefixes. The file system layout will be something like:
./tool1/Tool1/Foo.pm
./tool1/tool1.pl
./tool2/Tool2/Foo.pm
./tool2/tool2.pl
And modify the scripts accordingly:
use Tool1::Foo;   # in tool1.pl
use Tool2::Foo;   # in tool2.pl
For require() (scenario number 2) use the following:
./tool1/tool1-lib/config.pl
./tool1/tool1.pl
./tool2/tool2-lib/config.pl
./tool2/tool2.pl
And each script contains respectively:
use lib qw(.); require "tool1-lib/config.pl";
use lib qw(.); require "tool2-lib/config.pl";
This solution isn't good, since while it might work for you now, if
you add another script that wants to use the same module or
config.pl file, it would fail as we saw in the third scenario.
Let's see some better solutions.
Another option is to use a full path to the file, so it will be used
as a key in %INC:
require "/full/path/to/the/config.pl";
This solution solves the problem of the first two scenarios. I was surprised that it worked for the third scenario as well!
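The effect of the key on loading can be reproduced in a standalone sketch. The two temporary directories stand in for ./tool1 and ./tool2, and local @INC is used only to imitate each script finding its own config.pl first.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use vars qw($foo);

# Two hypothetical tool directories, each with its own config.pl.
my @dirs = map { tempdir(CLEANUP => 1) } 1 .. 2;
for my $i (0, 1) {
    open my $fh, '>', "$dirs[$i]/config.pl" or die "can't write: $!";
    print $fh "\$main::foo = $i; 1;\n";
    close $fh or die "can't close: $!";
}

# Bare name: both calls use the same %INC key, so the second is skipped.
{ local @INC = ($dirs[0], @INC); require "config.pl"; }
{ local @INC = ($dirs[1], @INC); require "config.pl"; }
print "after bare requires: foo = $foo\n";

# Full path: the key is the path itself, so the second file is loaded.
require "$dirs[1]/config.pl";
print "after full-path require: foo = $foo\n";
```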
With this solution you lose some portability. If you move the tool around in the file system you will have to change the base directory or write some additional script that will automatically update the hardcoded path after it was moved. Of course you will have to remember to invoke it.
Make sure you read all of this solution.
Declare a package name in the required files! It should be unique in
relation to the rest of the package names you use. %INC will then
use the unique package name for the key. It's a good idea to use at
least two-level package names for your private modules,
e.g. MyProject::Carp and not Carp, since the latter will collide
with an existing standard package. Even though a
package may not exist in the standard distribution now, a package may
come along in a later distribution which collides with a name you've
chosen. Using a two part package name will help avoid this problem.
An even better approach is to use three-level naming, like
CompanyName::ProjectName::Module, which is most unlikely to have
conflicts with later Perl releases. Foresee problems like this and
save yourself future trouble.
What are the implications of package declaration?
Without package declarations, it is very convenient to use() or
require() files because all the variables and subroutines are part of
the main:: package. Any of them can be used as if they are part of
the main script. With package declarations things are more awkward.
You have to use the Package::function() method to call a subroutine
from Package and to access a global variable $foo inside the
same package you have to write $Package::foo.
Lexically defined variables, those declared with my () inside Package
will be inaccessible from outside the package.
You can leave your scripts unchanged if you import the names of the
global variables and subroutines into the namespace of package
main:: like this:
use Module qw(:mysubs sub_b $var1 :myvars);
You can export both subroutines and global variables. Note however that this method has the disadvantage of consuming more memory for the current process.
See perldoc Exporter for information about exporting other
variables and symbols.
This completely covers the third scenario. When you use different module names in package declarations, as explained above, you cover the first two as well.
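As a minimal sketch of this solution, here are two configuration packages with unique names. The package names are hypothetical, and both are placed in one file only to keep the example runnable; in a real project each would live in its own file, such as MyProject/Tool1/Config.pm.

```perl
use strict;
use warnings;

package MyProject::Tool1::Config;
use vars qw($foo);
$foo = 0;
sub project_name { return "tool1" }

package MyProject::Tool2::Config;
use vars qw($foo);
$foo = 1;
sub project_name { return "tool2" }

package main;

# The same names no longer collide: each lives in its own namespace.
my $name1 = MyProject::Tool1::Config::project_name();
my $name2 = MyProject::Tool2::Config::project_name();
print "$name1: $MyProject::Tool1::Config::foo\n";
print "$name2: $MyProject::Tool2::Config::foo\n";
```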
The following solution should be used only as a short term
bandaid. You can force reloading of the modules by either fiddling
with %INC or replacing use() and require() calls with do().
If you delete the module entry from the %INC hash, before calling
require() or use() the module will be loaded and compiled again. For
example:
./project/runA.pl
-----------------
BEGIN {
delete $INC{"MyConfig.pm"};
}
use lib qw(.);
use MyConfig;
print "Content-type: text/plain\n\n";
print "Script A\n";
print "Inside project: ", project_name();
Apply the same fix to runB.pl.
Another alternative is to force module reload via do():
./project/runA.pl
-----------------
use lib qw(.);
do "MyConfig.pm";
print "Content-type: text/plain\n\n";
print "Script A\n";
print "Inside project: ", project_name();
Apply the same fix to runB.pl.
If you needed to import() something from the loaded module, call the import() method explicitly. For example if you had:
use MyConfig qw(foo bar);
the code will now look like:
do "MyConfig.pm";
MyConfig->import(qw(foo bar));
Both presented solutions are inefficient, since the modules in question will be reloaded on each request, slowing down the response times. Therefore use these only when a very quick fix is needed, and replace them later with one of the more robust solutions discussed in the previous sections.
See also the perlmodlib and perlmod manpages.
From the above discussion it should be clear that you cannot run development and production versions of the tools using the same apache server! You have to run a separate server for each. They can be on the same machine, but the servers will use different ports.
If you have the following:
PerlHandler Apache::Work::Foo
PerlHandler Apache::Work::Foo::Bar
And you make a request that pulls in Apache/Work/Foo/Bar.pm first,
then the Apache::Work::Foo package gets defined, so mod_perl does
not try to pull in Apache/Work/Foo.pm.
Apache::Registry scripts cannot contain __END__ or __DATA__
tokens.
Why? Because Apache::Registry scripts are being wrapped into a
subroutine called handler, like the script at URI /perl/test.pl:
print "Content-type: text/plain\r\n\r\n";
print "Hi";
When the script is being executed under Apache::Registry handler,
it actually becomes:
package Apache::ROOT::perl::test_2epl;
use Apache qw(exit);
sub handler {
print "Content-type: text/plain\r\n\r\n";
print "Hi";
}
So if you happen to put an __END__ tag, like:
print "Content-type: text/plain\r\n\r\n";
print "Hi";
__END__
Some text that wouldn't be normally executed
it will be turned into:
package Apache::ROOT::perl::test_2epl;
use Apache qw(exit);
sub handler {
print "Content-type: text/plain\r\n\r\n";
print "Hi";
__END__
Some text that wouldn't be normally executed
}
If you try to execute this script, you will receive the following error:
Missing right bracket at .... line 4, at end of line
Perl cuts everything after the __END__ tag. The same applies to
the __DATA__ tag.
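The failure can be reproduced without a server. The sketch below is a simplified stand-in for the wrapping step, not Apache::Registry's real code: compile_wrapped() and the package name Hypothetical::Wrapped are invented, and the script body is compiled inside a subroutine with a string eval.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the wrapping: put the script body inside a
# subroutine and compile the result with a string eval.
sub compile_wrapped {
    my ($body) = @_;
    my $code = "package Hypothetical::Wrapped;\nsub handler {\n$body\n}\n1;\n";
    my $ok = eval $code;
    return $ok ? "compiled" : "failed: $@";
}

# A plain body compiles; a body with __END__ loses the closing brace.
my $plain    = compile_wrapped(qq{print "Hi";});
my $with_end = compile_wrapped(qq{print "Hi";\n__END__\nSome text});

print "$plain\n";
print "$with_end\n";
```

The exact wording of the error differs between Perl versions, but in each case the parser stops at __END__ and never sees the closing brace of the subroutine.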
Also, remember that whatever applies to Apache::Registry scripts,
in most cases applies to Apache::PerlRun scripts.
The output of system(), exec(), and open(PIPE,"|program") calls
will not be sent to the browser unless your Perl was configured with
sfio.
You can use backticks as a possible workaround:
print `command here`;
But you're throwing performance out the window either way. It's best not to fork at all if you can avoid it. See the "Forking or Executing subprocesses from mod_perl" section to learn about implications of forking.
Also read about Apache::SubProcess for overridden system() and exec() implementations that work with mod_perl.
The interface to filehandles which are linked to variables with Perl's
tie() function is not yet complete. The format() and write() functions
are missing. If you configure Perl with sfio, write() and format()
should work just fine.
Otherwise you could use sprintf() to replace format(): ##.##
becomes %2.2f and ####.## becomes %4.2f.
Pad all strings with (" " x 80) before using, and set their length with: %.25s for a max 25 char string. Or prefix the string with (" " x 80) for right-justifying.
Another alternative is to use the Text::Reform module.
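To make the sprintf() replacement more concrete, here is a small sketch of a fixed-width report row. The helper format_row() and the column widths are assumptions chosen for the example; %-25.25s both pads and truncates, so no manual padding is needed in this variant.

```perl
use strict;
use warnings;

# Hypothetical report row: a name column and an amount column.
sub format_row {
    my ($name, $amount) = @_;
    # %-25.25s pads or truncates to exactly 25 characters, left-justified;
    # %8.2f right-justifies the number in 8 columns with 2 decimals.
    return sprintf("%-25.25s %8.2f", $name, $amount);
}

my $short = format_row("Widget", 3.14159);
my $long  = format_row("A name that is much longer than the column", 1234.5);

print "$short\n";
print "$long\n";
```

Using %25.25s instead would right-justify the string in the same width.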
Perl's exit() built-in function (all versions pr