As I wrote, I'm using Amazon S3 to store files that are too expensive to keep on my web server, with the plan of keeping frequently-updated files on the server and relatively constant stuff on S3. The address of my S3 server is stored in the global variable $_SERVER['CDN'].

So to include a file, I would do:

$filename = '/toinclude.php';
if (file_exists($_SERVER['DOCUMENT_ROOT'].$filename)){
  $filename = $_SERVER['DOCUMENT_ROOT'].$filename;
}else{
  $filename = $_SERVER['CDN'].$filename;
}
include ($filename);

Which I use often enough to want to generalize it into a class.

The other thing that would be useful is a directory listing, which is harder than it sounds for S3 since it has no directory structure; it's just a flat database of keys (the equivalent of filenames) and values (the files themselves). Thus requesting the key /images/silk/add.png has S3 return that file; the key has no inherent relationship to /images/silk/delete.png or /images/silk/.

The key is that just retrieving the server URL returns an XML listing of all the files, and there is an API to limit the files returned. The bare bucket URL returns all the files (up to a numerical limit; see below). Adding ?prefix=images/silk/ returns all the filenames that start with images/silk/ (note no leading slash). That's not quite enough, since it gives us sub-folders as well, but the delimiter parameter tells S3 to group all files that contain the delimiter after the prefix into one entry in the XML list. That's the equivalent of a subfolder. So ?prefix=images/silk/&delimiter=/ gives us the list we want.
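A minimal sketch of parsing such a listing (the function name and sample XML here are illustrative, not from my class; note that real S3 responses also carry a default XML namespace, which you'd have to tell simplexml about):

```php
<?php
// Parse an S3 bucket-listing XML string into an array of names.
// Individual keys appear as <Contents><Key>; with a delimiter set,
// the grouped "subfolders" appear as <CommonPrefixes><Prefix>.
function parse_s3_listing($xml_string) {
  $xml = simplexml_load_string($xml_string);
  $names = array();
  foreach ($xml->Contents as $entry) {
    $names[] = (string) $entry->Key;      // a file
  }
  foreach ($xml->CommonPrefixes as $entry) {
    $names[] = (string) $entry->Prefix;   // a simulated subfolder
  }
  return $names;
}
```
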

One more subtlety: S3 returns a maximum of 1000 names, then sets a flag in the XML to say the list was truncated. You can then ask for the next 1000 by passing the last returned name as the marker parameter.
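The paging loop looks something like this (a sketch, not my actual code: $fetch stands in for the real HTTP request, taking a marker and returning a page as array('keys' => ..., 'truncated' => ...), mirroring the IsTruncated flag in the XML):

```php
<?php
// Collect every key from a possibly-truncated S3 listing by re-requesting
// with the last returned name as the marker until the truncated flag clears.
function list_all_keys($fetch) {
  $keys = array();
  $marker = '';
  do {
    $page = $fetch($marker);                 // e.g. GET ?prefix=...&marker=...
    $keys = array_merge($keys, $page['keys']);
    $marker = (string) end($page['keys']);   // last name becomes the next marker
  } while ($page['truncated']);
  return $keys;
}
```
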

The documentation is pretty opaque, but it's all in there.

I put it all together into an abstract class that just handles the file part (assuming that any CDN would work the same way, just appending the host name to the file name) and a concrete class that handles the S3-specific directory-simulating parts. See the source code. The method names are meant to parallel the built-in PHP functions.
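The shape of that split is roughly this (a sketch only: the class name CDNBase and the bodies are illustrative, the method names are the ones listed below; see the linked source for the real thing):

```php
<?php
// Abstract base: handles the file part, which any CDN that just prepends
// a host name to the file name could share.
abstract class CDNBase {
  protected $root; // CDN host prefix passed to the constructor

  public function __construct($root) {
    $this->root = $root;
  }

  // Parallel of PHP's realpath(): the local path if the file exists on the
  // web server, otherwise the CDN path (without checking it exists there).
  public function realpath($filename) {
    $local = $_SERVER['DOCUMENT_ROOT'].$filename;
    return file_exists($local) ? $local : $this->root.$filename;
  }

  // TRUE unless the path starts with the web server's document root.
  public function isCDN($path) {
    return strpos($path, $_SERVER['DOCUMENT_ROOT']) !== 0;
  }

  // Listing is service-specific; S3 fakes directories with prefix/delimiter.
  abstract public function scandir($dir);
}
```

The concrete S3 class then implements scandir with the prefix/delimiter request described above.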

$s3 = new S3($_SERVER['CDN']);
$path = $s3->realpath('/toinclude.php');
include ($path);
// or
$content = file_get_contents($path);
realpath returns the real path for the file, either from $_SERVER['DOCUMENT_ROOT'] or from the S3 root passed in with the constructor. In other words, if the file exists on the web server, realpath returns something like '/public/www/toinclude.php'; if it does not, it returns the S3 root followed by the file name. Note that if the file does not exist on the web server, this returns the path on the S3 root without checking whether the file actually exists there; use file_exists for that.
$flag = $s3->isCDN($s3->realpath('/toinclude.php')); returns FALSE if the path represents a file on the web server (i.e. from $_SERVER['DOCUMENT_ROOT']), TRUE otherwise (note that it does not check whether the file actually exists on the S3 server). Note also that this requires the path returned by realpath.
$flag = $s3->file_exists('/toinclude.php'); returns TRUE if the file exists on the web server or the S3 server (this does check the S3 server).
$timestamp = $s3->filemtime('/toinclude.php'); returns the time the file was last modified.
$files = $s3->scandir('/images/'); returns an array of names of files that exist either on the web server or the S3 server (it's the union of the directory contents).

This assumes that the ACL (access control list) for the files is set to allow anonymous reading; if not, use Donovan Schönknecht's excellent S3 class. Of course, you'd have to rename one of these classes to avoid the conflict (or use namespaces).

Hope this helps someone.
