Parse Print_r output and convert into CSV

These days I seem to be looking at a lot of print_r output. For the uninitiated, print_r is a handy PHP function that prints the contents of an array in a neat, readable format.

When I create a PHP script, I generally redirect its progress output to a text file:
php somescript.php > output.txt

Typically, I collect data in arrays and I use print_r to see what’s going on.

Sometimes, depending on what’s in this data, I may need to reuse it. I could run the script again, this time serializing the data or converting the array into a CSV string and saving it to its own file.
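For reference, the serialize route could look something like the sketch below. The sample IDs, the variable names and the temp-file path are placeholders I made up for illustration, not part of my actual script.

```php
<?php
// Hypothetical sample data; a real script would use whatever it collected.
$item_ids = array(1031393, 1941666, 1986147);

// Hypothetical output path in the system temp directory.
$data_file = sys_get_temp_dir() . '/item-ids.ser';

// Save the array as a serialized string.
file_put_contents($data_file, serialize($item_ids));

// Later (possibly in another script), restore the array from the file.
$restored = unserialize(file_get_contents($data_file));

print_r($restored);
```

The restored array has the same keys and values as the original, so nothing needs to be parsed back out of text.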

However, when your script takes hours to complete, it’s much quicker to parse the print_r output and rebuild the array from it. From there I can create a CSV string and store it away to be used later.

Here is what I use.

$file_in  = "/output.txt";
$file_out = "/item-ids.csv";

$fh            = fopen( $file_in, 'r' );
$file_contents = fread( $fh, filesize( $file_in ) );
fclose($fh);

$original_array = print_r_reverse( $file_contents );
$csv_string     = array_2_csv( $original_array );

$fh = fopen( $file_out, 'w' );
fputs( $fh, $csv_string );
fclose( $fh );

function print_r_reverse($in) { 
    $lines = explode("\n", trim($in)); 
    if (trim($lines[0]) != 'Array') { 
        // bottomed out to something that isn't an array 
        return $in; 
    } else { 
        // this is an array, lets parse it 
        if (preg_match("/(\s{5,})\(/", $lines[1], $match)) { 
            // this is a nested array/recursive call to this function
            // take a set of spaces off the beginning 
            $spaces = $match[1]; 
            $spaces_length = strlen($spaces); 
            $lines_total = count($lines); 
            for ($i = 0; $i < $lines_total; $i++) { 
                if (substr($lines[$i], 0, $spaces_length) == $spaces) { 
                    $lines[$i] = substr($lines[$i], $spaces_length); 
                } 
            } 
        } 
        array_shift($lines); // Array 
        array_shift($lines); // ( 
        array_pop($lines); // ) 
        $in = implode("\n", $lines); 
        // make sure we only match stuff with 4 preceding spaces (stuff for this array and not a nested one) 
        preg_match_all("/^\s{4}\[(.+?)\] \=\> /m", $in, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER); 
        $pos = array(); 
        $previous_key = ''; 
        $in_length = strlen($in); 
        // store the following in $pos: 
        // array with key = key of the parsed array's item 
        // value = array(start position in $in, $end position in $in) 
        foreach ($matches as $match) { 
            $key = $match[1][0]; 
            $start = $match[0][1] + strlen($match[0][0]); 
            $pos[$key] = array($start, $in_length); 
            if ($previous_key != '') $pos[$previous_key][1] = $match[0][1] - 1; 
            $previous_key = $key; 
        } 
        $ret = array(); 
        foreach ($pos as $key => $where) { 
            // recursively see if the parsed out value is an array too 
            $ret[$key] = print_r_reverse(substr($in, $where[0], $where[1] - $where[0])); 
        } 
        return $ret; 
    } 
} 

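// Flatten a (possibly nested) array into one comma-separated string.
// Values are joined as-is, with no quoting or escaping.
function array_2_csv($array) {
    $csv = array();
    foreach ($array as $item) {
        if (is_array($item)) {
            $csv[] = array_2_csv($item);
        } else {
            $csv[] = $item;
        }
    }
    return implode(',', $csv);
}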

Found the print_r_reverse function here.
Found the array_2_csv function here.
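
One caveat: array_2_csv joins values with plain commas, so a value that itself contains a comma or a quote would produce a broken row. If that might apply to your data, a variant built on PHP's fputcsv could handle the quoting. This is only a sketch; the function name array_to_csv_line is my own invention and not part of the code above.

```php
<?php
// Hypothetical helper (not from the original script): flattens nested
// arrays into a single row and lets fputcsv() do the quoting.
function array_to_csv_line(array $array) {
    // Collect every leaf value, in order, into one flat row.
    $row = array();
    array_walk_recursive($array, function ($value) use (&$row) {
        $row[] = $value;
    });

    // Write the row to an in-memory stream, then read it back as a string.
    $fh = fopen('php://memory', 'w+');
    fputcsv($fh, $row, ',', '"', '\\');
    rewind($fh);
    $line = stream_get_contents($fh);
    fclose($fh);

    // fputcsv() appends a newline; strip it so the caller decides.
    return rtrim($line, "\n");
}

echo array_to_csv_line(array(1031393, 'a,b', array(2047279))), "\n";
// prints: 1031393,"a,b",2047279
```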

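NOTE: For best results, make sure your $file_in file contains a simple flat array and nothing else. print_r_reverse returns its input unchanged unless the first line is Array, so leading debug messages will break the conversion.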

So if your script output looks like…

Some debug messages...
blah blah blah
Array
(
    [item_ids] => Array
        (
            [0] => 1031393
            [1] => 1941666
            [2] => 1986147
            [3] => 2047279
            [4] => 2054282
            [5] => 2065078
            [6] => 2067484
            [7] => 2012085
            [8] => 2096719
            [9] => 2206253
            [10] => 2168771
            [11] => 2211019
            [12] => 2233896
            [13] => 2264443
            [14] => 2275373
            [15] => 2320905
            [16] => 2359348
            [17] => 2397892
            [18] => 2451768
            [19] => 2576203
            [20] => 2589806
        )

)

You just want the item_ids, so copy them into a new file that looks like…

Array
(
    [0] => 1031393
    [1] => 1941666
    [2] => 1986147
    [3] => 2047279
    [4] => 2054282
    [5] => 2065078
    [6] => 2067484
    [7] => 2012085
    [8] => 2096719
    [9] => 2206253
    [10] => 2168771
    [11] => 2211019
    [12] => 2233896
    [13] => 2264443
    [14] => 2275373
    [15] => 2320905
    [16] => 2359348
    [17] => 2397892
    [18] => 2451768
    [19] => 2576203
    [20] => 2589806
)
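
If copying by hand gets tedious, the numerically keyed entries can probably be pulled straight out of the mixed output with a regular expression. The sketch below is an assumption on my part rather than something I use in the script above. The helper name extract_numeric_entries is hypothetical, and it ignores nesting entirely, so a numeric key that points at a nested array would come back as the literal string "Array".

```php
<?php
// Hypothetical helper: collects the value of every "[n] => value" line
// that has a numeric key, regardless of how deeply it is indented.
function extract_numeric_entries($text) {
    preg_match_all('/^\s*\[\d+\] => (.*)$/m', $text, $matches);
    // trim() drops stray whitespace such as a trailing carriage return.
    return array_map('trim', $matches[1]);
}

// Shortened sample of the script output shown earlier.
$output = <<<TXT
Some debug messages...
Array
(
    [item_ids] => Array
        (
            [0] => 1031393
            [1] => 1941666
        )

)
TXT;

echo implode(',', extract_numeric_entries($output)), "\n";
// prints: 1031393,1941666
```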
