A fairly common problem that I’ve come across is data parsing, usually of HTML. The process usually involves a number of regular expressions and a few if/else statements.
I’ve found that generally no two processing tasks are the same, so the code is rarely reusable. I decided to write a method that is reusable and, as an added bonus, returns a nicely structured array.
Welcome to preg_magic. Using an associative array of regular expressions, you can very quickly create structured output from almost any input data.
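To make the idea concrete, here is a minimal sketch of the underlying technique. It is not preg_magic's actual implementation: the function name `extract_fields`, the `/is` flags, and the sample input are all assumptions made for illustration. Each key maps to a pattern, and the first capture group of each match becomes the value stored under that key.

```php
<?php
// Hypothetical helper illustrating the idea; not part of preg_magic.
function extract_fields(array $fields, string $input): array
{
    $output = array();
    foreach ($fields as $name => $pattern) {
        // Wrap the bare pattern in "/" delimiters. The flags are an assumption:
        // "i" = case-insensitive, "s" = let "." match newlines.
        if (preg_match('/' . $pattern . '/is', $input, $match)) {
            $output[$name] = $match[1]; // first capture group
        } else {
            $output[$name] = null;      // no match for this field
        }
    }
    return $output;
}

// Made-up sample input, shaped to suit the two patterns below.
$html = '<span>Card number:</span> 0110 0220 0330 0440 '
      . '<td>Card balance:</td> <td> $23.00</td>';

$result = extract_fields(array(
    'cardnumber' => 'card number:<\/span>\s*((?:\d{4}\s*){3}\d{4})',
    'balance'    => 'Card balance:<\/td>\s*<td>\s*(\$\d+\.\d+)',
), $html);

print_r($result);
```

The sketch assumes every pattern contains at least one capture group and that patterns escape the `/` delimiter themselves, as the patterns in this post do.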
Here’s the example included with the download:
<?php
include_once('../preg_magic.php');
$HistoryFields = array(
    'cardnumber' => 'card number:<\/span>\s*((?:\d{4}\s*){3}\d{4})',
    'balance' => 'Card balance:<\/td>\s*<td>\s*(\$\d+\.\d+)',
    'history' => array(
        '@table' => true,
        '@table_start' => 'class="results_table">.+?<\/tr>\s*',
        '@table_end' => '\s*<\/table>',
        '@table_fields' => array(
            'stamp' => '\s*<tr class="(?:odd|even)">\s*'
                . '<td>(\d+-\w+-\d+\s*\d+:\d+\s*\w+)<\/td>',
            'type' => '<td.*?>\s*([^<]*?)\s*<\/td>',
            'location' => '<td.*?>\s*([^<]*?)\s*<\/td>',
            'amount' => '<td.*?>\s*([^<]*?)\s*<\/td>',
            'gst' => '<td.*?>\s*([^<]*?)\s*<\/td>\s*<\/tr>\s*'
        )
    )
);
$HistoryHTML = preg_replace("/\s\s+| /",' ',file_get_contents('history.html'));
$Output = preg_magic::execute($HistoryFields, $HistoryHTML);
print_r($Output);
Example Output
Array
(
    [cardnumber] => 0110 0220 0330 0440
    [balance] => $23.00
    [history] => Array
        (
            [0] => Array
                (
                    [stamp] => 05-Feb-2008 07:27 AM
                    [type] => Touch Off
                    [location] => Milton
                    [amount] => -5.40
                    [gst] => *
                )
            [1] => Array
                (
                    [stamp] => 05-Feb-2008 06:11 AM
                    [type] => Touch On
                    [location] => Caboolture
                    [amount] =>
                    [gst] =>
                )
            [2] => Array
                (
                    [stamp] => 04-Feb-2008 05:12 PM
                    [type] => Touch Off
                    [location] => Caboolture
                    [amount] => -5.40
                    [gst] => *
                )
            [3] => Array
                (
                    [stamp] => 04-Feb-2008 03:54 PM
                    [type] => Touch On
                    [location] => Milton
                    [amount] =>
                    [gst] =>
                )
            [4] => Array
                (
                    [stamp] => 04-Feb-2008 07:19 AM
                    [type] => Touch Off
                    [location] => Milton
                    [amount] => -5.40
                    [gst] => *
                )
            //SNIP - You don't need the whole output :)//
        )
)
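The nested `history` entry describes a repeating table, which is why it comes back as a list of rows. One plausible way to handle such a section is sketched below. This is a guess at the technique, not preg_magic's actual code: the function name `extract_table`, its parameters, the `/is` flags, and the sample HTML are all assumptions. The sketch isolates the text between a start and an end pattern, joins the column patterns into a single row pattern, and maps each capture group to its column name.

```php
<?php
// Hypothetical helper; a guess at how an '@table' section could be processed.
function extract_table(string $start, string $end, array $rowFields, string $input): array
{
    // Isolate the text between the start and end markers (named group "body").
    if (!preg_match('/' . $start . '(?<body>.*?)' . $end . '/is', $input, $region)) {
        return array();
    }

    // Join the per-column patterns into one pattern matching a whole row.
    // Assumes each column pattern contains exactly one capture group.
    $rowPattern = '/' . implode('', $rowFields) . '/is';
    preg_match_all($rowPattern, $region['body'], $matches, PREG_SET_ORDER);

    $names = array_keys($rowFields);
    $rows = array();
    foreach ($matches as $match) {
        $row = array();
        foreach ($names as $i => $name) {
            // Group 0 is the whole row; group $i + 1 belongs to column $i.
            $row[$name] = isset($match[$i + 1]) ? $match[$i + 1] : '';
        }
        $rows[] = $row;
    }
    return $rows;
}

// Made-up sample input with a header row and two data rows.
$tableHtml = '<table class="results_table"><tr><th>Time</th><th>Type</th></tr> '
           . '<tr class="odd"><td>05-Feb-2008 07:27 AM</td><td>Touch Off</td></tr> '
           . '<tr class="even"><td>05-Feb-2008 06:11 AM</td><td> Touch On </td></tr> '
           . '</table>';

$rows = extract_table(
    'class="results_table">.+?<\/tr>\s*',
    '\s*<\/table>',
    array(
        'stamp' => '\s*<tr class="(?:odd|even)">\s*'
                 . '<td>(\d+-\w+-\d+\s*\d+:\d+\s*\w+)<\/td>',
        'type'  => '<td.*?>\s*([^<]*?)\s*<\/td>\s*<\/tr>\s*',
    ),
    $tableHtml
);

print_r($rows);
```

Matching a whole row with one combined pattern keeps the columns aligned: a row that fails to match any column is skipped entirely rather than shifting values into the wrong fields.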
Download
File:
preg_magic-2008-02-05.tar.gz [12.59 kB]