|
"Phparchitect's guide to web scraping with PHP"
by Matthew Turland.
Document Type
|
:
|
BL
|
Record Number
|
:
|
726640
|
Doc. No
|
:
|
b546372
|
Main Entry
|
:
|
by Matthew Turland.
|
Title & Author
|
:
|
Phparchitect's guide to web scraping with PHP / by Matthew Turland.
|
Publication Statement
|
:
|
Toronto, Ont.: Marco Tabini & Associates, Inc., 2010. ©2010
|
Page. NO
|
:
|
xviii, 173 pages : illustrations ; 24 cm
|
ISBN
|
:
|
0981034519
|
|
:
|
9780981034515
|
Abstract
|
:
|
Despite all the advancements in web APIs and interoperability, it's inevitable that, at some point in your career, you will have to "scrape" content from a website that was not built with web services in mind. And, despite its sometimes less-than-stellar reputation, web scraping is usually an entirely legitimate activity, for example capturing data from an old version of a website for insertion into a modern CMS. This book, written by scraping expert Matthew Turland, covers web scraping techniques and topics that range from the simple to the exotic, using a variety of technologies and frameworks: * Understanding HTTP requests * The PHP HTTP streams wrapper * cURL * pecl_http * PEAR:HTTP * Zend_Http_Client * Building your own scraping library * Using Tidy * Analyzing code with the DOM, SimpleXML and XMLReader extensions * CSS selector libraries * PCRE pattern matching * Tips and Tricks * Multiprocessing / parallel processing.
|
Subject
|
:
|
Application program interfaces (Computer software)
|
Subject
|
:
|
PHP (Computer program language)
|
LC Classification
|
:
|
QA76.76.A63B963 9999
|
Added Entry
|
:
|
Matthew Turland
|
Parallel Title
|
:
|
PHP architect's guide to web scraping with PHP
|
| |