PHP’s ZipArchive and zip archives made on OSX gotcha
Think happy posts! :) While trying to unzip (extract) archives made on OS X, using PHP v5.3 and its built-in ZipArchive library, I ran into a mysterious problem. ZipArchive was seeing the directory (folder) separators and filenames inside the zip archive in a really screwed-up way:
dizajn publikacija:jelovnik/1:poc?etna.jpg -> this is the folder “dizajn publikacija”, the subfolder “jelovnik” and the file “1_početna.jpg”
1:poc?etna.jpg -> the file named “1_početna.jpg”
Obviously an encoding problem of some sort, but I really didn’t have time to find out which one (if that really is the case; maybe I’ve scored yet another undocumented bug).
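One way to narrow this down is to look at the raw bytes of an entry name. The sketch below is only a diagnostic aid, not a confirmed explanation of the problem: `name_bytes()` is a hypothetical helper, and the sample string assumes the name is UTF-8 with “č” stored in decomposed form (a plain “c” followed by the combining caron U+030C). If the archive really contains such names, a tool that does not combine the two code points could plausibly display them as “c?”.

```php
<?php
// Hypothetical helper: show each byte of a name as two hex digits.
function name_bytes($name) {
    return implode(' ', str_split(bin2hex($name), 2));
}

// Assumed sample: "poč" with the "č" decomposed into "c" + U+030C (bytes CC 8C).
$sample = "poc\xCC\x8C";
echo name_bytes($sample), "\n"; // prints "70 6f 63 cc 8c"

// With a real archive the names would come from getNameIndex(), e.g.:
// $zip = new ZipArchive;
// if ($zip->open('archive.zip') === true) {
//     for ($i = 0; $i < $zip->numFiles; $i++) {
//         echo name_bytes($zip->getNameIndex($i)), "\n";
//     }
//     $zip->close();
// }
```

A `63 cc 8c` sequence where a single “č” (`c4 8d`) was expected would point to decomposed Unicode rather than a ZipArchive bug.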
So I kissed ZipArchive’s extractTo() method goodbye and wrote my own extraction, using the getStream() method and writing the output with fwrite(). Luckily I only had to unzip all files into a single folder (flatten the archive), so I couldn’t care less about folders. The filenames did, however, contain the “:” character, which is illegal on Windows file systems. To get around that, I “normalized” each filename into something more relaxed for storage on a web server (pseudo code):
...
$entry = $zip->getNameIndex($i);
$base = basename($entry);
$base = str_replace(array(' ', '-'), array('_', '_'), $base);
$base = preg_replace('/[^A-Za-z0-9_\.]/', '', $base);
...
This didn’t preserve filenames but I got the content out of the archive.
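Because the archive is flattened and characters are stripped, two different entries can end up with the same name and silently overwrite each other. Here is a minimal sketch of one way around that, assuming it is acceptable to append a counter to later duplicates. `normalize_name()` and `unique_name()` are hypothetical helpers, not part of ZipArchive:

```php
<?php
// Same normalization as in the snippet above, wrapped in a function.
function normalize_name($entry) {
    $base = basename($entry);
    $base = str_replace(array(' ', '-'), '_', $base);
    return preg_replace('/[^A-Za-z0-9_\.]/', '', $base);
}

// Returns $name, or $name with _1, _2, ... inserted before the extension
// if that name was already handed out. $seen is updated in place.
function unique_name($name, array &$seen) {
    $candidate = $name;
    $n = 0;
    while (isset($seen[$candidate])) {
        $n++;
        $info = pathinfo($name);
        $ext = isset($info['extension']) ? '.' . $info['extension'] : '';
        $candidate = $info['filename'] . '_' . $n . $ext;
    }
    $seen[$candidate] = true;
    return $candidate;
}

// Two entries from different folders that normalize to the same name.
$seen = array();
foreach (array('a/menu 1.jpg', 'b/menu-1.jpg') as $entry) {
    echo unique_name(normalize_name($entry), $seen), "\n";
}
// prints "menu_1.jpg" and then "menu_1_1.jpg"
```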
Here is the full code (it lacks proper error handling):
function unzip_archive($zipfile, $destination) {
    $zipfile = str_replace("\\", "/", $zipfile);
    $destination = str_replace("\\", "/", $destination);
    if (!file_exists($zipfile)) {
        throw new Exception('No such file.');
    }
    if (!is_dir($destination)) {
        $oldumask = umask(0);
        $created = mkdir($destination, 0777);
        umask($oldumask); // restore the umask even if mkdir() failed
        if (!$created) {
            throw new Exception('Cannot create destination folder.');
        }
    }
    $zip = new ZipArchive;
    // open() returns true on success and an integer error code (truthy!) on failure
    if ($zip->open($zipfile) !== true) {
        return false; // not a zip archive?
    }
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $entry = $zip->getNameIndex($i);
        if (substr($entry, -1) == '/') continue; // skip directories
        // skip OS X metadata: "._*" files, .DS_Store and __MACOSX;
        // the dots are escaped so that names like "1_foo.jpg" are not skipped
        $pattern = '/((^|\/)\._|\.DS_Store|__MACOSX)/';
        if (preg_match($pattern, $entry)) continue;
        $base = basename($entry);
        $base = str_replace(array(' ', '-'), array('_', '_'), $base);
        $base = preg_replace('/[^A-Za-z0-9_\.]/', '', $base);
        // $zip->extractTo($destination, array($entry));
        // echo $zip->getStatusString();
        $fp = $zip->getStream($entry);
        if (!$fp) throw new Exception('Unable to extract the file.');
        $ofp = fopen($destination . '/' . $base, 'w');
        if (!$ofp) {
            fclose($fp);
            throw new Exception('Unable to write the file.');
        }
        while (!feof($fp)) fwrite($ofp, fread($fp, 8192));
        fclose($fp);
        fclose($ofp);
    }
    $zip->close();
    return true;
}
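The fread()/fwrite() copy loop above can also be written with stream_copy_to_stream(). Below is a self-contained sketch of that variant, assuming the zip extension is loaded and the system temp directory is writable. The demo archive, the entry name and `extract_entry()` are made up for illustration:

```php
<?php
// Hypothetical helper: copy one archive entry to $target via its stream.
function extract_entry(ZipArchive $zip, $entry, $target) {
    $in = $zip->getStream($entry);
    if (!$in) {
        throw new Exception("Unable to read entry: $entry");
    }
    $out = fopen($target, 'wb');
    if (!$out) {
        fclose($in);
        throw new Exception("Unable to write: $target");
    }
    stream_copy_to_stream($in, $out);
    fclose($in);
    fclose($out);
}

// Build a tiny throwaway archive so the example runs on its own.
$dir = sys_get_temp_dir() . '/unzip_demo_' . uniqid();
mkdir($dir);
$zipPath = $dir . '/demo.zip';

$zip = new ZipArchive;
if ($zip->open($zipPath, ZipArchive::CREATE) !== true) {
    throw new Exception('Cannot create demo archive.');
}
$zip->addFromString('folder/hello.txt', 'hello from the archive');
$zip->close();

// Reopen it and pull the entry out into a flat file.
$zip = new ZipArchive;
if ($zip->open($zipPath) !== true) {
    throw new Exception('Cannot open demo archive.');
}
extract_entry($zip, 'folder/hello.txt', $dir . '/hello.txt');
$zip->close();

echo file_get_contents($dir . '/hello.txt'), "\n";
```

Checking the input stream before opening the output file avoids leaving an empty file behind when an entry cannot be read.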