LWP::Parallel::RobotUA(3pm)    User Contributed Perl Documentation    LWP::Parallel::RobotUA(3pm)



NAME
       LWP::Parallel::RobotUA - A class for Parallel Web Robots

SYNOPSIS
         require LWP::Parallel::RobotUA;
         $ua = LWP::Parallel::RobotUA->new('my-robot/0.1', 'me@foo.com');
         $ua->delay(0.5);  # in minutes, i.e. 30 seconds
         ...
         # then use it just like a normal LWP::Parallel::UserAgent
         $ua->register($request, \&callback, 4096);
         $ua->wait ( $timeout );

DESCRIPTION
       This class implements a user agent that is suitable for robot applications.  Robots should
       be nice to the servers they visit.  They should consult the /robots.txt file to ensure
       that they are welcomed and they should not make requests too frequently.

       Before you write a robot, take a look at
       <URL:http://info.webcrawler.com/mak/projects/robots/robots.html>.

       When you use an LWP::Parallel::RobotUA as your user agent, you do not have to handle
       these things yourself.  Send requests as you would with a normal
       LWP::Parallel::UserAgent, and this agent will consult /robots.txt and pace requests for
       you.

METHODS
       The LWP::Parallel::RobotUA is a sub-class of LWP::Parallel::UserAgent and LWP::RobotUA and
       implements a mix of their methods.

       In addition to LWP::Parallel::UserAgent, these methods are provided:

   $ua = LWP::Parallel::RobotUA->new($agent_name, $from, [$rules])
       Your robot's name and the mail address of the human responsible for the robot (i.e. you)
       are required by the constructor.

       Optionally, you can pass the WWW::RobotRules object to use. (See
       WWW::RobotRules::AnyDBM_File for persistent caching of robot rules in a local file.)
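
       As a rough sketch of passing a persistent rules object to the constructor, the
       following assumes that WWW::RobotRules::AnyDBM_File is installed.  The cache file name
       robot-rules-cache and the address me@foo.com are placeholders, not values prescribed
       by this manual.

```perl
use strict;
use warnings;

require LWP::Parallel::RobotUA;
require WWW::RobotRules::AnyDBM_File;

# The same robot name is given to the rules object and to the user agent.
my $agent_name = 'my-robot/0.1';

# Placeholder cache file; parsed robots.txt rules are stored here between runs.
my $rules = WWW::RobotRules::AnyDBM_File->new($agent_name, 'robot-rules-cache');

# The third constructor argument is the optional rules object.
my $ua = LWP::Parallel::RobotUA->new($agent_name, 'me@foo.com', $rules);

# as_string describes the state of the UA; handy while debugging.
print $ua->as_string;
```

       Without the third argument the agent keeps its rules in memory only, so they are
       fetched again on the next run.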

   $ua->delay([$minutes])
       Set/Get the minimum delay between requests to the same server.  The default is 1 minute.

       Note: Previous versions of LWP::Parallel::RobotUA used seconds instead of
             minutes! The unit is now compatible with LWP::RobotUA.
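
       Because the unit is minutes, code that thinks in seconds has to convert first.  A
       minimal sketch, in which delay_minutes_from_seconds is a hypothetical helper and not
       part of this module:

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by LWP::Parallel::RobotUA): convert a
# pause given in seconds into the minutes that delay() expects.
sub delay_minutes_from_seconds {
    my ($seconds) = @_;
    die "delay must be non-negative\n" if $seconds < 0;
    return $seconds / 60;
}

my $minutes = delay_minutes_from_seconds(30);
printf "delay(%s) pauses %d seconds between requests to one server\n",
    $minutes, $minutes * 60;

# With a real agent object this would be passed on as:
#   $ua->delay($minutes);
```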

   $ua->host_wait($netloc)
       Returns the number of seconds you must wait before you can make a new request to this
       server. This method keeps track of all of the robot's connections and enforces the
       delay constraint set via the delay method above for each server individually.

       Note: Although it says 'host', it really means 'netloc/server', i.e. it differentiates
       between individual servers running on different ports, even though they might be on the
       same machine ('host'). This function is mostly used internally, where RobotUA calls it to
       find out when to send the next request to a certain server.
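
       The per-netloc bookkeeping described above can be pictured with a small model.  This
       is an illustration only, not the module's actual code: the names %last_visit,
       record_visit and host_wait_model are made up for this sketch, and the rounding is an
       assumption.

```perl
use strict;
use warnings;

# Illustrative model only; names and details are assumptions.
my %last_visit;          # netloc ("host:port") => epoch seconds of last request
my $delay_minutes = 1;   # same unit as $ua->delay

# Remember when a request was last sent to a netloc.
sub record_visit {
    my ($netloc, $when) = @_;
    $last_visit{$netloc} = $when;
}

# Seconds still to wait before the next request to this netloc is allowed.
sub host_wait_model {
    my ($netloc, $now) = @_;
    my $last = $last_visit{$netloc};
    return 0 unless defined $last;    # never visited: no wait needed
    my $wait = int($delay_minutes * 60 - ($now - $last));
    return $wait > 0 ? $wait : 0;
}

my $now = time;
record_visit('www.example.com:80', $now - 20);   # requested 20 seconds ago

# Same machine, different ports: each netloc is tracked on its own.
for my $netloc ('www.example.com:80', 'www.example.com:8080') {
    printf "%s: wait %d seconds\n", $netloc, host_wait_model($netloc, $now);
}
```

       In this model the server on port 80 still has 40 seconds to wait, while the server on
       port 8080 can be contacted right away.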

   $ua->as_string
       Returns a string that describes the state of the UA.  Mainly useful for debugging.

SEE ALSO
       LWP::Parallel::UserAgent, LWP::RobotUA, WWW::RobotRules

COPYRIGHT
       Copyright 1997-2004 Marc Langheinrich <marclang@cpan.org>

       This library is free software; you can redistribute it and/or modify it under the same
       terms as Perl itself.



perl v5.10.0                                2004-02-10                LWP::Parallel::RobotUA(3pm)
