About PHPCrawl
PHPCrawl is a framework for crawling/spidering websites, written in PHP. In other words, it is a web-crawler library or crawler engine for PHP.
PHPCrawl “spiders” websites and passes information about every document it finds (pages, links, files and so on) to users of the library for further processing.
It provides several options for specifying the behaviour of the crawler, such as URL and Content-Type filters, cookie handling, robots.txt handling, limiting options, multiprocessing and much more.
PHPCrawl is completely free open-source software, licensed under the GNU General Public License v2.
A usage example is available at http://phpcrawl.cuab.de/example.html
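
As a rough illustration of the options listed above, here is a minimal sketch of how a crawl might be configured. It assumes the library has been unpacked so that libs/PHPCrawler.class.php is reachable from the script; that path, the class name MyCrawler, the target host www.example.com and the chosen limits are all placeholders for illustration, not values taken from this document. Treat it as a starting point rather than a reference implementation.

```php
<?php
// Assumed location of the library; adjust to wherever PHPCrawl is installed.
include("libs/PHPCrawler.class.php");

// MyCrawler is a hypothetical name. The crawler hands each document it
// finds to handleDocumentInfo(), which subclasses override.
class MyCrawler extends PHPCrawler
{
    function handleDocumentInfo($DocInfo)
    {
        $lb = (PHP_SAPI == "cli") ? "\n" : "<br />";

        echo "Page requested: " . $DocInfo->url
           . " (" . $DocInfo->http_status_code . ")" . $lb;

        if ($DocInfo->received == true) {
            echo "Content received: " . $DocInfo->bytes_received . " bytes" . $lb;
        } else {
            echo "Content not received" . $lb;
        }
    }
}

$crawler = new MyCrawler();

// Placeholder start URL.
$crawler->setURL("www.example.com");

// Content-Type filter: only receive the content of HTML documents.
$crawler->addContentTypeReceiveRule("#text/html#");

// URL filter: do not follow links to common image files.
$crawler->addURLFilterRule("#\.(jpg|jpeg|gif|png)$# i");

// Cookie handling and robots.txt handling.
$crawler->enableCookieHandling(true);
$crawler->obeyRobotsTxt(true);

// Limiting option: stop after roughly 1 MB of traffic (arbitrary value).
$crawler->setTrafficLimit(1000 * 1024);

// Start the crawl, then print a short summary.
$crawler->go();

$report = $crawler->getProcessReport();
$lb = (PHP_SAPI == "cli") ? "\n" : "<br />";
echo "Links followed: " . $report->links_followed . $lb;
echo "Documents received: " . $report->files_received . $lb;
echo "Bytes received: " . $report->bytes_received . " bytes" . $lb;
```

The filter rules are ordinary PCRE patterns, so they can be tightened or loosened without touching the rest of the setup. Running the script requires network access to the target host, and the traffic limit keeps a test crawl small.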