Archive for April 6th, 2016

I've put an Amazon Wishlist Widget for WordPress on my GitHub site; it uses the techniques described before. You can see it running in the sidebar here.

I figured out how to get all the pages when screen-scraping the Amazon wish list. Basically, look for the "Next" button (it's in a <li class=a-last> element). If that element is present, fetch the next page and repeat.

function getwishlistitems ($listID, $page=1){
	$wishlistdom = new DOMDocument();
	// the @ suppresses parsing warnings
	@$wishlistdom->loadHTMLFile("http://www.amazon.com/gp/registry/wishlist/$listID?disableNav=1&page=$page");
	$wishlistxpath = new DOMXPath ($wishlistdom);
	// each wish list entry is a <div> whose id starts with "item_"
	$items = iterator_to_array($wishlistxpath->query("//div[starts-with(@id,'item_')]"));
	if ($wishlistxpath->evaluate("count(//li[@class='a-last'])")) { // this is the "Next->" button
		// recurse into the next page and append its items
		// ($this-> assumes the function is a method of the widget class)
		$items = array_merge($items, $this->getwishlistitems($listID, $page+1));
	}
	return $items;
}

Note that this creates a complication: the array of items now includes nodes from different documents (one per page), and a DOMXPath is tied to the document it was constructed with, so you can't use one saved DOMXPath. Instead, where the original code has $wishlistxpath->evaluate($xpath, $node), use

(new DOMXPath($node->ownerDocument))->evaluate($xpath, $node);
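
For anyone who wants to try the idea without hitting Amazon, here is a minimal sketch. The helper name evaluate_on_node and the two inline HTML strings are made up for the demo (they stand in for two wish list pages and are not Amazon's real markup); the only thing taken from the post is the one-liner above.

```php
<?php
// Hypothetical helper (not part of the original widget): evaluate an XPath
// expression relative to a node, using the document that node belongs to.
function evaluate_on_node(string $xpath, DOMNode $node) {
	return (new DOMXPath($node->ownerDocument))->evaluate($xpath, $node);
}

// Two separate documents stand in for two pages of a wish list.
// The markup here is invented for the demo.
$pages = [
	'<div id="item_1"><a>First</a></div>',
	'<div id="item_2"><a>Second</a></div>',
];

libxml_use_internal_errors(true); // keep parser warnings quiet
$items = [];
foreach ($pages as $html) {
	$doc = new DOMDocument();
	$doc->loadHTML($html);
	$xpath = new DOMXPath($doc);
	// same item query as in getwishlistitems
	$found = iterator_to_array($xpath->query("//div[starts-with(@id,'item_')]"));
	$items = array_merge($items, $found);
}

// $items now mixes nodes from both documents, so build a fresh
// DOMXPath per node instead of reusing one saved instance.
$titles = [];
foreach ($items as $item) {
	$titles[] = evaluate_on_node('string(.//a)', $item);
}

print implode(', ', $titles) . "\n";
```

Creating a DOMXPath for every node is a bit wasteful, but it keeps the calling code simple; caching one DOMXPath per document would be the obvious refinement if it ever matters.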

Hope this helps someone.