NAME
    ModPerl2::Tools - a few hopefully useful tools

SYNOPSIS
     use ModPerl2::Tools;

     ModPerl2::Tools::spawn +{keep_fd=>[3,4,7], survive=>1}, sub {...};
     ModPerl2::Tools::spawn +{keep_fd=>[3,4,7], survive=>1}, qw/bash -c .../;

     ModPerl2::Tools::safe_die $status;
     $r->safe_die($status);
     $f->safe_die($status);

     $content=ModPerl2::Tools::fetch_url $url;
     $content=$r->fetch_url($url);

INSTALLATION
     perl Makefile.PL
     make
     make test
     make install

DESCRIPTION
    This module is a collection of functions and methods that I found useful
    when working with "mod_perl".

    I work mostly under Linux, so I don't expect all of these functions to
    work on other operating systems.

  Forking off long running processes
    Sometimes one needs to spawn off a long running process as the result of
    a request. Under mod_perl this is not as simple as calling "fork",
    because that way all open file descriptors would be inherited by the
    child and, more subtly, the long running process would be killed when
    the administrator shuts down the web server. The former is usually
    considered a security issue, the latter a design decision.

    There is already $r->spawn_proc_prog that serves a similar purpose as
    the "spawn" function. However, "spawn_proc_prog" is not usable for long
    running processes because it kills the children after a certain timeout.

   Solution
     $pid=ModPerl2::Tools::spawn \%options, $subroutine, @parameters;

    or

     $pid=ModPerl2::Tools::spawn \%options, @command_line;

    "spawn" expects an options hash reference as the first parameter. The
    second parameter may be a code reference or a string.

    In case of a code ref no other program is executed; the subroutine is
    called instead. The remaining parameters are passed to this subroutine.

    Note, the perl environment under mod_perl differs in certain ways from a
    normal perl environment. For example, %ENV is not bound to the C-level
    "environ". These modifications are not undone by this module.
    So, it's generally better to execute another perl interpreter instead of
    using the $subroutine feature.

    The options parameter accepts these options:

    keep_fd => \@fds
        An array of file descriptor numbers (not file handles) is expected
        here. All file descriptors except the listed ones and file
        descriptor 2 (STDERR) are closed before calling $subroutine or
        executing @command_line.

    survive => $boolean
        If false, the created process will be killed when Apache shuts
        down. If true, it will survive an Apache restart.

    The return code on success is the PID of the process. On failure "undef"
    or an empty string is returned.

    The created process is not related as a child process to the current
    Apache child.

  Serving "ErrorDocument"s
    Triggering "ErrorDocument"s from a registry script, or even more so from
    an output filter, is not simple. The normal way as a handler is

      return Apache2::Const::STATUS;

    This does not work for registry scripts. An output filter, even if it
    returns a status, can trigger only a "SERVER_ERROR".

    The main interface to enter standard error processing in Apache is
    "ap_die()" at C-level. Its Perl interface is hidden in Apache2::HookRun.

    There is one case when an error message cannot be sent to the user. This
    happens if the HTTP headers are already on the wire. Then it is too
    late.

    The various flavors of "safe_die()" take this into account.

    "safe_die" won't return. Instead it calls ModPerl::Util::exit(0), which
    raises an exception.

   ModPerl2::Tools::safe_die $status
    This function is designed to be called from registry scripts. It uses
    Apache2::RequestUtil->request to fetch the current request object. So,

      PerlOption +GlobalRequest

    must be enabled.

    Usage example:

      ModPerl2::Tools::safe_die 401;

   $r->safe_die($status)
   $f->safe_die($status)
    These 2 methods are to be used if a request object or a filter object
    is available.
    Usage from within a filter:

      package My::Filter;
      use strict;
      use warnings;

      use ModPerl2::Tools;
      use base 'Apache2::Filter';

      sub handler : FilterRequestHandler {
        my ($f, $bb)=@_;
        $f->safe_die(410);
      }

    The filter flavor removes the current filter from the request's output
    filter chain.

   $r->headers_sent
    This function checks if the HTTP_HEADER output filter is still present.
    If the filter is still present, it returns an empty list (false);
    otherwise it returns true.

    The presence of this filter means no output has yet been written to the
    client. The HTTP status code and header fields can still be modified.

  Fetching the content of another document
    Sometimes a handler or a filter needs the content of another document in
    the web server's realm. Apache provides subrequests for this purpose.

    The 2 "fetch_url" variants use a subrequest to fetch the content of
    another document. The document can even be fetched via "mod_proxy" from
    another server.

    "ModPerl2::Tools::fetch_url" needs

      PerlOption +GlobalRequest

    Usage:

      $content=ModPerl2::Tools::fetch_url '/some/where?else=42';

      $content=$r->fetch_url('/some/where?else=42');

      ($content, $headers)=
          $r->fetch_url('http://what.is/the/meaning/of?life=42');

    If "mod_proxy" is available, "fetch_url" can use it to fetch a document
    from another web server. If "mod_ssl" is configured to allow proxying
    SSL (see "SSLProxyEngine"), even the "https" scheme works. Another
    subtle point: "ProxyErrorOverride" may affect the output in case of an
    error.

    Further, if "fetch_url" is passed a subroutine as the last argument, the
    content is not accumulated in a single variable but passed brigade-wise
    to the subroutine:

      ($content, $headers)=
          $r->fetch_url('http://what.is/the/meaning/of?life=42', sub {
              my ($subr, @brigade)=@_;
              ...
          });

    The subroutine is called with the subrequest as the first parameter,
    followed by a list of non-empty strings. The list itself may be empty if
    none of the buckets of the brigade contains data.

    When a subroutine is passed, the resulting $content is the empty string
    on success.
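    The brigade-wise callback described above can be tried out without a
    running server. The sketch below assumes only the calling convention
    stated above (subrequest first, then a list of non-empty strings).
    "make_digest_callback" is a hypothetical helper, not part of
    ModPerl2::Tools, and the source does not say whether the callback's
    return value is used, so the sketch returns nothing.

```perl
use strict;
use warnings;
use Digest::MD5 ();

# Hypothetical helper (not part of ModPerl2::Tools): builds a callback
# that processes the content chunk by chunk instead of accumulating it.
sub make_digest_callback {
    my ($state) = @_;                 # hash ref that receives the results
    $state->{bytes} = 0;
    $state->{md5}   = Digest::MD5->new;
    return sub {
        my ($subr, @brigade) = @_;    # subrequest first, then the strings
        for my $chunk (@brigade) {    # @brigade may be empty
            $state->{bytes} += length $chunk;
            $state->{md5}->add($chunk);
        }
        return;
    };
}

# Stand-alone demonstration with hand-made "brigades". No subrequest
# exists here, so undef is passed in its place.
my %state;
my $cb = make_digest_callback(\%state);
$cb->(undef, 'Hello, ', 'world');
$cb->(undef);                         # a brigade without data
$cb->(undef, '!');
my $hex = $state{md5}->hexdigest;
printf "%d bytes, md5=%s\n", $state{bytes}, $hex;
```

    Under mod_perl the code reference would presumably be passed as the
    last argument, as in "$r->fetch_url($url, make_digest_callback(\%state))".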
"fetch_url()" normally strips almost all input HTTP header fields from the subrequest before running it. However, if the $r request object has a "Host" header field it is passed on. Also, a "User-Agent" header is set for the subrequest containing "ModPerl2::Tools/$VERSION" where $VERSION is the module's version. If you need to pass more fields pass an array reference as the 2nd parameter to "fetch_url()": ($content, $headers)= $r->fetch_url('http://what.is/the/meaning/of?life=42', [qw/ X-MyHeader my-value X-MyNextHeader my-next-value /]); or even: ($content, $headers)= $r->fetch_url('http://what.is/the/meaning/of?life=42', [qw/ X-MyHeader my-value X-MyNextHeader my-next-value /], sub { my ($subr, @brigade)=$_; ... }); If "Host" or "User-Agent" headers are passed this way they overwrite the default ones. Note, though, the header fields are assigned to the subrequest just before the response handler is run. Earlier phases will see a copy of the main request's headers. How does it work? If the passed URL starts with "https://" or "http://" a subrequest for the URI "/" is initiated via "$r->lookup_uri('/')". Before the subrequest is run it is changed into a proxy request for the passed URL. One precondition for this to work is that there are no access restrictions for the URL "/". Otherwise it is simply a subrequest for the passed URL. Then "ModPerl2::Tools::Filter::fetch_content_filter" is installed as output filter for the subrequest. After that the subrequest is run. The output filter collects all output. When the request is done its "$r->headers_out" is copied into a normal hash and in list context the output string and this hash are returned. In scalar context only the string is returned. HTTP header names are case insensitive. Their names are all converted to lower case in the $headers hash. There are 2 hash members in upper case. "STATUS" contains the HTTP status code of the subrequest and "STATUSLINE" the status line. 
   Useful functions for similar cases
    Note, it is always better to process data one chunk at a time. Try hard
    to do that! Collecting data in memory should only be a last resort.

   ModPerl2::Tools::Filter::read_bb $bucket_brigade, \@buffer
    "read_bb" collects the data of a bucket brigade in the @buffer array.
    It returns true if an "EOS" bucket has been seen, false otherwise.

    A simple output filter that collects all data could look like:

      sub filter {
        my ($f, $bb)=@_;
        my @buffer;
        do_something(join '', @buffer)
          if ModPerl2::Tools::Filter::read_bb $bb, \@buffer;
        return Apache2::Const::OK;
      }

   ModPerl2::Tools::Filter::fetch_content_filter
    This function is a "FilterRequestHandler". It is controlled by 2
    elements of "$r->pnotes", "out" and "force_fetch_content".

    "out" must be an array reference. It is passed to "read_bb" to collect
    the output.

    "force_fetch_content" is a flag. If it is false and "$r->status" is not
    "HTTP_OK" on the first invocation of the filter, the filter does nothing
    and removes itself.

    Usage:

      my $subr=$r->lookup_uri(...);
      my $output=[];
      @{$subr->pnotes}{qw/out force_fetch_content/}=($output,1);
      $subr->add_output_filter
        (\&ModPerl2::Tools::Filter::fetch_content_filter);
      $subr->run;
      do_something(join '', @$output);

EXPORTS
    None.

TODO
    Look at APR to see what it provides to make things easier, for example
    "apr_proc_create()".

SEE ALSO

AUTHOR
    Torsten Förtsch

COPYRIGHT AND LICENSE
    Copyright (C) 2010 by Torsten Förtsch

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.