PHP file-handling tips and tricks

File-type detection

Since PHP 5.3, the finfo class has been built in. It's a handy way of identifying a file's type from its contents without having to load the whole file into memory. For many types it can also provide the MIME type and encoding.

The fileinfo module is both a set of functions and a class, though the latter is very poorly documented in the PHP manual. Even more baffling is why this hasn't been folded into the SplFileInfo object (I mean look at the name… you'd think… but never mind).

<?php
$info = new finfo(FILEINFO_MIME_TYPE);
echo $info->file('myImage.jpg');
// prints "image/jpeg"

That FILEINFO_MIME_TYPE is optional. Leave it out entirely and you'll get a more verbose (and occasionally more useful) description for some files; for instance, some image types will return their dimensions and colour depth. Using FILEINFO_MIME instead will return the MIME type and encoding if available (e.g. image/png; charset=binary or text/x-php; charset=us-ascii).
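
To compare the three modes side by side, here's a minimal sketch. The sample file is my own assumption (a throwaway plain-text file in the system temp directory), and the verbose description in particular varies between libmagic versions, so treat the output as indicative only.

```php
<?php
// Hypothetical sample file: a short plain-text file in the temp directory.
$path = tempnam(sys_get_temp_dir(), 'finfo');
file_put_contents($path, "Hello, world!\n");

// No flag: a verbose, human-readable description of the file.
$description = (new finfo())->file($path);

// FILEINFO_MIME_TYPE: just the MIME type.
$mimeType = (new finfo(FILEINFO_MIME_TYPE))->file($path);

// FILEINFO_MIME: the MIME type plus the encoding, where it can be detected.
$mimeFull = (new finfo(FILEINFO_MIME))->file($path);

echo $description, "\n", $mimeType, "\n", $mimeFull, "\n";

// Tidy up the sample file.
unlink($path);
```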

SplFileObject and SplFileInfo

SplFileObject and SplFileInfo provide object-oriented interfaces to most of PHP's built-in file-handling functions. This isn't just for the sake of style though; you can extend the classes to add your own methods, and make mock objects so that you can test your code without having to fake a file-system.

SplFileInfo provides information about a file (its size, timestamps, path, etc), while SplFileObject provides an interface for interacting with the file (reading from it, writing to it, etc). If you need to do both, just create an SplFileObject, because it extends SplFileInfo and so offers all of its methods too. You can also create an SplFileObject from an SplFileInfo object by calling its openFile() method.

One neat trick for handling CSV files is that once you have an SplFileObject for the file, you can call ->setFlags(SplFileObject::READ_CSV) on it, and then reading it with a foreach will return an array of fields instead of a string for each line.
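
Here's a rough sketch of the two classes working together. The temp file and its contents are made up purely for illustration.

```php
<?php
// Hypothetical sample file with two short lines.
$path = tempnam(sys_get_temp_dir(), 'spl');
file_put_contents($path, "first line\nsecond line\n");

// SplFileInfo: metadata only, the file is never opened.
$info = new SplFileInfo($path);
$size = $info->getSize();          // size in bytes
$isWritable = $info->isWritable();

// openFile() returns an SplFileObject for the same path.
$file = $info->openFile('r');
$firstLine = rtrim($file->fgets());

// SplFileObject extends SplFileInfo, so the metadata methods are still there.
$sameSize = $file->getSize();

echo "$firstLine ($size bytes)\n";

// Release the file handle before deleting the sample file.
$file = null;
unlink($path);
```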
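
As a sketch of that CSV trick: the sample data below is invented, and the extra flags (READ_AHEAD, SKIP_EMPTY and DROP_NEW_LINE) are my own addition, commonly used to stop a trailing newline turning into an empty final row. The explicit setCsvControl() call just spells out the usual delimiter, enclosure and escape characters.

```php
<?php
// Hypothetical sample CSV, including one quoted field containing a comma.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "name,colour\napple,red\n\"banana, ripe\",yellow\n");

$file = new SplFileObject($path);
$file->setFlags(
    SplFileObject::READ_CSV
    | SplFileObject::READ_AHEAD
    | SplFileObject::SKIP_EMPTY
    | SplFileObject::DROP_NEW_LINE
);
$file->setCsvControl(',', '"', '\\');

// Each iteration now yields an array of fields rather than a raw line.
$rows = [];
foreach ($file as $fields) {
    $rows[] = $fields;
}

print_r($rows);

// Release the file handle before deleting the sample file.
$file = null;
unlink($path);
```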

DirectoryIterator

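This is a nice simple interface to a directory of files. Pass a folder name to the constructor, then foreach over the object and each iteration gives you an SplFileInfo-style object for a file or folder (strictly speaking it's the DirectoryIterator itself, pointed at the current entry, and that includes the . and .. entries, which you can skip with isDot()). DirectoryIterator extends SplFileInfo, so you can also get information about the folder itself (like its path, whether it's writeable, etc).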
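
A minimal sketch of a directory listing, assuming a scratch folder that I create (and remove) just for the demonstration:

```php
<?php
// Hypothetical scratch directory containing one file and one sub-folder.
$dir = sys_get_temp_dir() . '/dir_iterator_demo_' . uniqid();
mkdir($dir);
touch($dir . '/report.txt');
mkdir($dir . '/archive');

$entries = [];
foreach (new DirectoryIterator($dir) as $entry) {
    // Skip the "." and ".." entries.
    if ($entry->isDot()) {
        continue;
    }
    $entries[$entry->getFilename()] = $entry->isDir() ? 'dir' : 'file';
}
ksort($entries);

print_r($entries);

// Release the iterator, then tidy up the scratch directory.
unset($entry);
unlink($dir . '/report.txt');
rmdir($dir . '/archive');
rmdir($dir);
```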

A nifty thing about SPL's iterators is that they can be filtered. For example, you can wrap a RegexIterator around a DirectoryIterator to quickly find files whose names match a certain pattern.
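
Something along these lines should do it. The folder, the file names and the image-extension pattern are all assumptions of mine for the sake of the example. The pattern gets matched against each entry cast to a string, which for a DirectoryIterator is the file name.

```php
<?php
// Hypothetical scratch directory with a few empty sample files.
$dir = sys_get_temp_dir() . '/regex_iterator_demo_' . uniqid();
mkdir($dir);
$samples = ['a.jpg', 'b.png', 'notes.txt'];
foreach ($samples as $name) {
    touch($dir . '/' . $name);
}

// Only entries whose file name matches the pattern are passed through.
$images = new RegexIterator(new DirectoryIterator($dir), '/\.(jpe?g|png)$/i');

$matches = [];
foreach ($images as $file) {
    $matches[] = $file->getFilename();
}
sort($matches);

print_r($matches);

// Release the iterators, then tidy up the scratch directory.
unset($file, $images);
foreach ($samples as $name) {
    unlink($dir . '/' . $name);
}
rmdir($dir);
```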

The Symfony Finder component

The Finder component is a fantastically powerful way of fetching a list of files from a directory or directory structure. It can perform very detailed and complex searches (on dates, paths, and file contents) very simply, and iterating over the result gives you SplFileInfo objects (actually a slightly extended version that adds three helpful methods: getRelativePath(), getRelativePathname() and getContents()).

Temporary files

Sometimes you need some temporary storage and you need to make a judgement whether to keep it all in memory for speed (and risk your script dying when it exceeds PHP's memory_limit setting) or on disk (slow and often unnecessary). PHP has a built-in stream called php://temp and a class called SplTempFileObject that both store data in memory up to a limit (2MB by default), then transparently switch to disk storage. They're very handy for things like retrieving files in cURL, or preparing CSV output using SplFileObject's fputcsv() method.
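
Here's a sketch of the CSV case, assuming some made-up rows. I pass the 2MB limit explicitly to make it visible, and spell out the delimiter, enclosure and escape characters rather than relying on the defaults.

```php
<?php
// Hypothetical rows to export.
$rows = [
    ['name', 'colour'],
    ['apple', 'red'],
    ['banana, ripe', 'yellow'],
];

// Held in memory up to 2MB, then spilled to a temporary file on disk.
$csv = new SplTempFileObject(2 * 1024 * 1024);
foreach ($rows as $row) {
    $csv->fputcsv($row, ',', '"', '\\');
}

// Go back to the start and read the finished CSV back out line by line.
$csv->rewind();
$output = '';
foreach ($csv as $line) {
    $output .= $line;
}

echo $output;
```

The temporary storage is discarded when the object is destroyed, so there's nothing to clean up afterwards.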


Disclaimer

It should go without saying, but any example code shown on this site is yours to use without obligation or warranty of any kind. As far as it's possible to do so, I release it into the public domain.