LWP::Parallel::UserAgent(3pm) User Contributed Perl Documentation LWP::Parallel::UserAgent(3pm)
NAME
LWP::Parallel::UserAgent - A class for parallel User Agents
SYNOPSIS
require LWP::Parallel::UserAgent;
$ua = LWP::Parallel::UserAgent->new();
...
$ua->redirect (0); # prevents automatic following of redirects
$ua->max_hosts(5); # sets maximum number of locations accessed in parallel
$ua->max_req (5); # sets maximum number of parallel requests per host
...
$ua->register ($request); # or
$ua->register ($request, '/tmp/sss'); # or
$ua->register ($request, \&callback, 4096);
...
$ua->wait ( $timeout );
...
sub callback { my($data, $response, $protocol) = @_; .... }
DESCRIPTION
This class implements a user agent that accesses web sources in parallel.
Using a LWP::Parallel::UserAgent as your user agent, you typically start by registering
your requests, along with how you want the Agent to process the incoming results (see
$ua->register).
Then you wait for the results by calling $ua->wait. This method returns only when all
requests have returned an answer or the Agent has timed out. Individual callback
functions can also tell the Agent to stop waiting for requests and return (see
$ua->register).
See the file LWP::Parallel for a set of simple examples.
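The register-then-wait flow described above can be sketched as follows. This is a minimal illustration, not the module's reference usage: the helper name fetch_all and the 30 second timeout are made up for the example, and it assumes (as the examples shipped with LWP::Parallel do) that $ua->wait returns a hash reference of entry objects with request and response accessors.

```perl
use strict;
use warnings;
use HTTP::Request;
use LWP::Parallel::UserAgent;

# Hypothetical helper: fetch every URL in parallel and return one
# "code url" string per registered request.
sub fetch_all {
    my ($timeout, @urls) = @_;
    my $pua = LWP::Parallel::UserAgent->new();

    for my $url (@urls) {
        my $request = HTTP::Request->new(GET => $url);
        # register() returns undef on success and an error object otherwise.
        if (my $error = $pua->register($request)) {
            warn $error->error_as_HTML;
        }
    }

    # Assumption: wait() hands back a hash reference of entry objects.
    my $entries = $pua->wait($timeout);
    return map {
        my $entry = $entries->{$_};
        $entry->response->code . ' ' . $entry->request->url;
    } keys %$entries;
}

# URLs are taken from the command line; nothing is fetched without them.
if (@ARGV) {
    print "$_\n" for fetch_all(30, @ARGV);
}
```

Run it as, for example, perl fetch.pl http://www.example.com/ to print one status line per URL.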
METHODS
The LWP::Parallel::UserAgent is a sub-class of LWP::UserAgent, but not all of its methods
are available here. However, you can use its main methods, $ua->simple_request and
$ua->request, in order to simulate singular access with this package. Of course, if a
single request is all you need, then you should probably use LWP::UserAgent in the first
place, since it will be faster than our emulation here.
For parallel access, you will need to use the new methods that come with
LWP::Parallel::UserAgent, called $ua->register and $ua->wait. See below for more
information on each method.
$ua = LWP::Parallel::UserAgent->new();
Constructor for the parallel UserAgent. Returns a reference to a
LWP::Parallel::UserAgent object.
Optionally, you can give it an existing LWP::Parallel::UserAgent (or even an
LWP::UserAgent) as a first argument, and it will "clone" a new one from it. (This
just copies the behavior of LWP::UserAgent. I have never actually tried this, so let
me know if it does not do what you want.)
$ua->initialize;
Takes no arguments and initializes the UserAgent. It is automatically called in
LWP::Parallel::UserAgent::new, so usually there is no need to call this explicitly.
However, if you want to re-use the same UserAgent object for a number of "runs", you
should call $ua->initialize after you have processed the results of the previous call
to $ua->wait, but before registering any new requests.
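Re-using one agent for several runs might look like the sketch below. The helper name run_batches is hypothetical, and the code assumes that $ua->wait returns a hash reference of entries whose response accessor yields an HTTP::Response, as in the LWP::Parallel examples.

```perl
use strict;
use warnings;
use HTTP::Request;
use LWP::Parallel::UserAgent;

# Hypothetical helper: push several batches of URLs through one agent.
# Each batch is an array reference of URLs. Returns all status codes seen.
sub run_batches {
    my ($pua, $timeout, @batches) = @_;
    my @codes;
    for my $batch (@batches) {
        $pua->register(HTTP::Request->new(GET => $_)) for @$batch;
        my $entries = $pua->wait($timeout);

        # Process the results of this run first ...
        push @codes, map { $_->response->code } values %$entries;

        # ... then reset the agent before registering the next batch.
        $pua->initialize;
    }
    return @codes;
}

if (@ARGV) {
    my $pua = LWP::Parallel::UserAgent->new();
    # One URL per batch, purely for demonstration.
    print "$_\n" for run_batches($pua, 30, map { [$_] } @ARGV);
}
```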
$ua->redirect ( $ok )
Changes the default value for permitting Parallel::UserAgent to follow redirects and
authentication-requests. The standard value is 'true'.
See $ua->register for how to change the behaviour for particular requests only.
$ua->nonblock ( $ok )
By default, LWP::Parallel connects to a site using a blocking call. If you want
to speed this step up, you can try the new non-blocking version of the connect call by
setting $ua->nonblock to 'true'. The standard value is 'false' (although this might
change in the future if non-blocking connects turn out to be stable enough).
$ua->duplicates ( $ok )
Changes the default value for permitting Parallel::UserAgent to ignore duplicate
requests. The standard value is 'false'.
$ua->in_order ( $ok )
Changes the default for restricting Parallel::UserAgent to connecting to the
registered sites in the order they were registered. The default value FALSE allows
Parallel::UserAgent to make the connections in an apparently random order.
$ua->remember_failures ( $yes )
If set to one, enables Parallel::UserAgent to ignore requests or connections to sites
that it failed to connect to before during this "run". If set to zero (the default),
Parallel::UserAgent will try to connect to every single URL you registered, even if it
constantly fails to connect to a particular site.
$ua->max_hosts ( $max )
Changes the maximum number of locations accessed in parallel. The default value is 7.
Note: Although it says 'host', it really means 'netloc/server'! That is, multiple
servers on the same host (e.g. one server running on port 80, the other one on port
6060) will count as two 'hosts'.
$ua->max_req ( $max )
Changes the maximum number of requests issued per host in parallel. The default value
is 5.
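The settings described above are usually applied once, right after construction. The values below are arbitrary illustrations, not recommendations, and timeout() is the accessor inherited from LWP::UserAgent.

```perl
use strict;
use warnings;
use LWP::Parallel::UserAgent;

my $pua = LWP::Parallel::UserAgent->new();

$pua->redirect(1);           # follow redirects and authentication requests
$pua->nonblock(0);           # keep the blocking connect (documented default)
$pua->duplicates(0);         # documented default for duplicate handling
$pua->in_order(1);           # connect in the order requests were registered
$pua->remember_failures(1);  # skip sites that already failed during this run

# At most 3 netlocs at once and 2 requests per netloc, so no more than
# 6 requests should be in flight at any time. Note that
# http://www.example.com:80/ and http://www.example.com:6060/ count as
# two separate 'hosts'.
$pua->max_hosts(3);
$pua->max_req(2);

$pua->timeout(10);           # inherited from LWP::UserAgent, in seconds
```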
$ua->register ( $request [, $arg [, $size [, $redirect_ok]]] )
Registers the given request with the User Agent. In case of an error, a
"HTTP::Response" object containing the HTML error message is returned. Otherwise (that
is, in case of success) it will return undef.
The $request should be a reference to a "HTTP::Request" object with values defined for
at least the method() and url() attributes.
$size specifies the number of bytes Parallel::UserAgent should try to read each time
some new data arrives. Setting it to '0' or 'undef' will make Parallel::UserAgent use
the default. (8k)
Specifying $redirect_ok will alter the redirection behaviour for this particular
request only. '1' or any other true value will force Parallel::UserAgent to follow
redirects, even if the default is set to 'no_redirect' (see $ua->redirect). '0' or
any other false value should do the reverse. See LWP::UserAgent for using an object's
"requests_redirectable" list for fine-tuning this behavior.
If $arg is a scalar it is taken as a filename where the content of the response is
stored.
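As an illustration of the filename form of $arg combined with a per-request $redirect_ok, consider the sketch below. The helper name register_downloads and the page%03d.html naming scheme are invented for the example. Passing undef for $size relies on the statement above that undef selects the default chunk size.

```perl
use strict;
use warnings;
use HTTP::Request;
use LWP::Parallel::UserAgent;

# Hypothetical helper: register one download per URL. Each body is saved
# under $dir, and redirects are followed for these requests regardless of
# the agent-wide default. Returns the number of URLs processed.
sub register_downloads {
    my ($pua, $dir, @urls) = @_;
    my $count = 0;
    for my $url (@urls) {
        my $file    = sprintf '%s/page%03d.html', $dir, ++$count;
        my $request = HTTP::Request->new(GET => $url);

        # $arg = filename, $size = undef (default), $redirect_ok = 1
        my $error = $pua->register($request, $file, undef, 1);
        warn "could not register $url\n" if $error;
    }
    return $count;
}

if (@ARGV) {
    my $pua = LWP::Parallel::UserAgent->new();
    register_downloads($pua, '/tmp', @ARGV);
    $pua->wait(30);
}
```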
If $arg is a reference to a subroutine, then this routine is called as chunks of the
content are received. An optional $size argument is taken as a hint for an appropriate
chunk size. The callback function is called with 3 arguments: the data received this
time, a reference to the response object and a reference to the protocol object. The
callback can use the predefined constants C_ENDCON, C_LASTCON and C_ENDALL as a return
value in order to influence pending and active connections. C_ENDCON will end this
connection immediately, whereas C_LASTCON will indicate that no further connections
should be made. C_ENDALL will immediately end all requests and let the
Parallel::UserAgent return from $ua->wait().
If $arg is omitted, then the content is stored in the response object itself.
If $arg is a "LWP::Parallel::UserAgent::Entry" object, then this request will be
registered as a follow-up request to this particular entry. This will not create a new
entry, but instead link the current response (i.e. the reason for re-registering) as
$response->previous to the new response of this request. All other fields are either
re-initialized ($request, $fullpath, $proxy) or left untouched ($arg, $size). (This
should only be used internally.)
LWP::Parallel::UserAgent->request also allows the registration of follow-up requests
to existing requests that required redirection or authentication. In order to do
this, a Parallel::UserAgent::Entry object will be passed as the second argument to
the call. Usually, this should not be used directly, but left to the internal
$ua->handle_response method!
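A callback that uses one of the return constants could be sketched like this. Several details are assumptions taken from the examples in LWP::Parallel rather than from this page: that the constants are imported with the :CALLBACK tag, that the callback itself stores the data with add_content, and that returning the number of bytes received keeps the connection open. The $byte_limit variable and the callback name are hypothetical.

```perl
use strict;
use warnings;
use HTTP::Request;
use LWP::Parallel::UserAgent qw(:CALLBACK);  # assumed tag for C_ENDCON etc.

my $byte_limit = 65536;   # hypothetical cap per response body

# Called with (data, response, protocol). A callback is a plain function,
# not a method, so there is no $self.
sub limited_callback {
    my ($data, $response, $protocol) = @_;

    # Store the chunk ourselves so the body ends up in the response.
    $response->add_content($data) if length $data;

    # End only this connection once enough has been collected.
    return C_ENDCON if length($response->content) >= $byte_limit;

    # Assumption: returning the byte count lets the transfer continue.
    return length $data;
}

if (@ARGV) {
    my $pua = LWP::Parallel::UserAgent->new();
    for my $url (@ARGV) {
        # 4096 is the chunk size hint, as in the SYNOPSIS.
        $pua->register(HTTP::Request->new(GET => $url), \&limited_callback, 4096);
    }
    $pua->wait(30);
}
```

Returning C_LASTCON or C_ENDALL instead of C_ENDCON would affect the other connections as described above.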
$ua->on_connect ( $request, $response, $entry )
This method should be overridden in an (otherwise empty) subclass in order to present
customized messages for each connection attempted by the User Agent.
$ua->on_failure ( $request, $response, $entry )
This method should be overridden in an (otherwise empty) subclass in order to present
customized messages for each connection or registration that failed.
$ua->on_return ( $request, $response, $entry )
This method should be overridden in an (otherwise empty) subclass in order to present
customized messages for each request returned. If a callback function was registered
with this request, this callback function is called before $ua->on_return.
Please note that while $ua->on_return is a method (which should be overridden in a
subclass), a callback function is NOT a method, and does not have $self as its first
parameter. (See more on callbacks below)
The purpose of $ua->on_return is mainly to provide messages when a request returns.
However, you can also re-register follow-up requests in case you need them.
If you need specialized follow-up requests depending on the request that just
returned, use a callback function instead (which can be different for each request
registered). Otherwise you might end up writing a HUGE if..elsif..else.. branch in
this global method.
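The three hooks above are typically overridden together in one small subclass. The sketch below is modelled on the examples in LWP::Parallel. The class name LoggingUA and the message wording are invented, and the guard on $response in on_failure is a precaution in case no response object is passed.

```perl
package LoggingUA;    # hypothetical subclass name
use strict;
use warnings;
use LWP::Parallel::UserAgent;
our @ISA = ('LWP::Parallel::UserAgent');

# Called for each connection the agent attempts.
sub on_connect {
    my ($self, $request, $response, $entry) = @_;
    print "Connecting to ", $request->url, "\n";
}

# Called for each connection or registration that failed.
sub on_failure {
    my ($self, $request, $response, $entry) = @_;
    print "Failed: ", $request->url, "\n";
    print "\t", $response->code, " ", $response->message, "\n" if $response;
}

# Called for each request that returned, after any registered callback.
sub on_return {
    my ($self, $request, $response, $entry) = @_;
    print "Returned: ", $request->url, " ", $response->code, "\n";
    return;
}

package main;
use HTTP::Request;

if (@ARGV) {
    my $pua = LoggingUA->new();
    $pua->register(HTTP::Request->new(GET => $_)) for @ARGV;
    $pua->wait(30);
}
```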
$ua->discard_entry ( $entry )
Completely removes an entry from memory, in case its output is not needed. Use this in
callbacks such as "on_return" or "on_failure" if you want to make sure an entry that
you do not need does not occupy valuable main memory.
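For instance, a subclass that only cares about status codes could record the code and drop the entry straight away. This is a sketch under assumptions: the class name StatusOnlyUA and the %status hash are hypothetical, and the on_return signature is the one given in this page.

```perl
package StatusOnlyUA;    # hypothetical subclass name
use strict;
use warnings;
use LWP::Parallel::UserAgent;
our @ISA = ('LWP::Parallel::UserAgent');

our %status;    # URL => HTTP status code

sub on_return {
    my ($self, $request, $response, $entry) = @_;

    # Keep only the status code ...
    $status{ $request->url } = $response->code;

    # ... and release the entry, including the response body.
    $self->discard_entry($entry);
    return;
}

package main;
use HTTP::Request;

if (@ARGV) {
    my $pua = StatusOnlyUA->new();
    $pua->register(HTTP::Request->new(GET => $_)) for @ARGV;
    $pua->wait(30);
    print "$_ => $StatusOnlyUA::status{$_}\n" for sort keys %StatusOnlyUA::status;
}
```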
$ua->wait ( $timeout )
Waits for available sockets to write to or read from. Times out after $timeout
seconds. Blocks if a $timeout of 0 is specified. If $timeout is omitted, it will use
the Agent default timeout value.
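This page does not describe the return value of $ua->wait. The examples in LWP::Parallel treat it as a hash reference of entry objects, each with a response accessor. Under that assumption, the results of a run can be split into successes and failures as sketched below. The helper name summarize is hypothetical.

```perl
use strict;
use warnings;
use HTTP::Request;
use LWP::Parallel::UserAgent;

# Hypothetical helper: split the entries returned by wait() into
# successful and unsuccessful responses.
sub summarize {
    my ($entries) = @_;
    my (@ok, @failed);
    for my $entry (values %$entries) {
        my $response = $entry->response;
        if ($response->is_success) {
            push @ok, $response;
        }
        else {
            push @failed, $response;
        }
    }
    return (\@ok, \@failed);
}

if (@ARGV) {
    my $pua = LWP::Parallel::UserAgent->new();
    $pua->register(HTTP::Request->new(GET => $_)) for @ARGV;

    # Give the whole run at most 20 seconds.
    my ($ok, $failed) = summarize($pua->wait(20));
    printf "%d succeeded, %d failed\n", scalar @$ok, scalar @$failed;
}
```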
$ua->handle_response($request, $arg [, $size])
Analyses results, handling redirects and security. This method may actually register
several different, additional requests.
This method should not be called directly. Instead, indicate for each individual
request registered with $ua->register() whether or not you want Parallel::UserAgent
to handle redirects and security, or specify a default value for all requests in
Parallel::UserAgent by using $ua->redirect().
$ua->simple_request($request, [$arg [, $size]])
This method simulates the behavior of LWP::UserAgent->simple_request. It is actually
kinda overkill to use this method in Parallel::UserAgent, and it is mainly here for
testing backward compatibility with the original LWP::UserAgent. The following
description is taken directly from the corresponding libwww pod:
$ua->simple_request dispatches a single WWW request on behalf of a user, and returns
the response received. The $request should be a reference to a "HTTP::Request" object
with values defined for at least the method() and url() attributes.
If $arg is a scalar it is taken as a filename where the content of the response is
stored.
If $arg is a reference to a subroutine, then this routine is called as chunks of the
content are received. An optional $size argument is taken as a hint for an appropriate
chunk size.
If $arg is omitted, then the content is stored in the response object itself.
$ua->request($request, $arg [, $size])
Included for compatibility testing with LWP::UserAgent. Everyday usage is
deprecated! Here is what LWP::UserAgent has to say about it:
Process a request, including redirects and security. This method may actually send
several different simple requests.
The arguments are the same as for "simple_request()".
$ua->as_string
Returns text that describes the state of the UA. This would be useful for debugging if
it printed anything important, but it does not (at least not yet). Try using
LWP::Debug instead.
ADDITIONAL METHODS
$ua->use_alarm([$boolean])
This function is not in use anymore and will display a warning when called and
warnings are enabled.
Callback functions
You can register a callback function. See LWP::UserAgent for details.
BUGS
Probably lots! This was meant only as an interim release until this functionality is
incorporated into LWPng, the next generation libwww module (though it has been this way
for over 2 years now!)
Needs a lot more documentation on how callbacks work!
SEE ALSO
LWP::UserAgent
COPYRIGHT
Copyright 1997-2004 Marc Langheinrich <marclang AT cpan.org>
This library is free software; you can redistribute it and/or modify it under the same
terms as Perl itself.
perl v5.10.0 2004-02-10 LWP::Parallel::UserAgent(3pm)