PHP – Safe way to download large files

curl, download, file-get-contents, fopen, php

Information

There are many ways to download files in PHP: file_get_contents + file_put_contents, fopen, readfile, and cURL.

Question

  • With a large file, let's say 500 MB on another server / domain, what is the "correct" way to download it safely? If the connection fails, it should find the last position and continue, OR download the file again if it contains errors.
  • It's going to be used on a website, not from the php.exe shell.

What I figured out so far

  • I've read about AJAX solutions with progress bars, but what I'm really looking for is a PHP solution.
  • I don't need to buffer the whole file into a string the way file_get_contents does; that would also eat memory.
  • I've also read about memory problems. A solution that doesn't use much memory would be preferred (see the sketch right after this list).
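
As an illustration of that last point (a minimal sketch; the URL, target path and chunk size are assumptions), reading with fopen/fread keeps only one chunk in memory at a time, unlike file_get_contents, which buffers the entire file:

$in  = fopen("http://example.com/big.dat", "rb");   // remote stream
$out = fopen(__DIR__ . "/big.dat", "wb");           // local file

// Only one 8 KB chunk is ever held in memory
while (!feof($in)) {
    $chunk = fread($in, 8192);
    if ($chunk === false) {
        break; // read error
    }
    fwrite($out, $chunk);
}

fclose($in);
fclose($out);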

Concept

This is roughly what I want the function to return when the download fails.

function download_url( $url, $filename ) {
    // Code
    $success['success'] = false;
    $success['message'] = 'File not found';
    return $success;
}
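
One hedged way the stub above could be filled in, using a plain stream copy (the message strings and the @ error suppression are illustrative choices, not part of the question):

function download_url( $url, $filename ) {
    $success = array('success' => false, 'message' => '');

    // Suppress the warning and detect failure through the return value instead
    $in = @fopen($url, 'rb');
    if ($in === false) {
        $success['message'] = 'File not found';
        return $success;
    }

    $out = fopen($filename, 'wb');
    // stream_copy_to_stream() copies chunk by chunk, so the whole file
    // is never held in memory at once
    $copied = stream_copy_to_stream($in, $out);
    fclose($in);
    fclose($out);

    $success['success'] = ($copied !== false);
    $success['message'] = $success['success'] ? "Copied $copied bytes" : 'Copy failed';
    return $success;
}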

Best Solution

The easiest way to copy large files is demonstrated in Save large files from php stdin, but that example does not show how to copy files using an HTTP range.

$url = "http://REMOTE_FILE";
$local = __DIR__ . "/test.dat";

try {
    $download = new Downloader($url);
    $download->start($local); // Start Download Process
} catch (Exception $e) {
    printf("Copied %d bytes\n", $pos = $download->getPos());
}

When an Exception occurs you can resume the download from the last position:

$download->setPos($pos);
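
Putting the two snippets together, a simple resume-and-retry loop could look like this (the attempt limit is an arbitrary illustration, not part of the answer):

$download = new Downloader($url);

for ($attempt = 1; $attempt <= 5; $attempt++) {
    try {
        $download->start($local);
        break; // download finished
    } catch (Exception $e) {
        // Remember where the failed attempt stopped ...
        $pos = $download->getPos();
        printf("Attempt %d failed, copied %d bytes so far\n", $attempt, $pos);
        // ... and continue from that byte. On the same object the position
        // is already retained, so setPos() mainly matters when resuming
        // with a fresh Downloader instance (e.g. on a later request).
        $download->setPos($pos);
    }
}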

Class used

class Downloader {
    private $url;
    private $length = 8192;
    private $pos = 0;
    private $timeout = 60;

    public function __construct($url) {
        $this->url = $url;
    }

    public function setLength($length) {
        $this->length = $length;
    }

    public function setTimeout($timeout) {
        $this->timeout = $timeout;
    }

    public function setPos($pos) {
        $this->pos = $pos;
    }

    public function getPos() {
        return $this->pos;
    }

    public function start($local) {
        $part = $this->getPart("0-1");

        // Check partial support: a valid 2-byte response to the 0-1 range
        // means the server honours Range requests
        if (is_string($part) && strlen($part) === 2) {
            // Split data with curl
            $this->runPartial($local);
        } else {
            // Use stream copy
            $this->runNormal($local);
        }
    }

    private function runNormal($local) {
        $in = fopen($this->url, "rb");
        // Open with "c" so an existing partial file is kept, then continue at $this->pos
        $out = fopen($local, "cb");
        fseek($out, $this->pos);

        // The remote stream is not seekable, so read and discard everything
        // up to the resume position, at most $this->length bytes at a time
        while (($pos = ftell($in)) < $this->pos) {
            $n = min($this->length, $this->pos - $pos);
            fread($in, $n);
        }

        // Copy the rest of the stream without buffering it in memory
        $this->pos += stream_copy_to_stream($in, $out);

        fclose($in);
        fclose($out);
        return $this->pos;
    }

    private function runPartial($local) {
        $i = $this->pos;
        // "c" keeps an existing partial file instead of truncating it
        $fp = fopen($local, "cb");
        fseek($fp, $this->pos);

        while (true) {
            $data = $this->getPart(sprintf("%d-%d", $i, $i + $this->length));

            // Unexpected response code: stop before writing anything
            if ($data === -1) {
                fclose($fp);
                throw new Exception("File Corrupted");
            }

            // false (416) or an empty string means nothing is left to fetch
            if (!$data) {
                break;
            }

            fwrite($fp, $data);
            $i += strlen($data);
            $this->pos = $i;
        }

        fclose($fp);
    }

    private function getPart($range) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $this->url);
        curl_setopt($ch, CURLOPT_RANGE, $range);
        curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_TIMEOUT, $this->timeout);
        $result = curl_exec($ch);
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);

        // 416 Range Not Satisfiable: the requested range is past the end of the file
        if ($code == 416)
            return false;

        // Anything other than 206 Partial Content is treated as an error
        if ($code != 206)
            return -1;

        return $result;
    }
}
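
As a side note, range support could also be detected up front with a header-only request instead of the 0-1 probe in start(). A sketch (not part of the class above):

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, true);          // send a HEAD-style request, no body
curl_setopt($ch, CURLOPT_HEADER, true);          // include response headers in the output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$headers = curl_exec($ch);
curl_close($ch);

// Servers that support byte ranges usually announce "Accept-Ranges: bytes"
$supportsRanges = $headers !== false
    && stripos($headers, 'Accept-Ranges: bytes') !== false;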