preg_magic · Mar 4, 10:19 by Shannon Wynter

A fairly common problem that I’ve come across is data parsing, usually of HTML. The process usually involves a number of regular expressions and a few if/else statements.

I’ve found that, generally, no two processing tasks are the same, so the code is not very recyclable. I decided to code up a method that was recyclable, and as an added bonus it returns a nice structured array.

Welcome to preg_magic: using an associative array of regular expressions, you can very quickly create structured output from almost any input data.
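To give a feel for the idea, here is a minimal sketch of the flat (non-table) case. It is not preg_magic's actual source: `extract_fields` is a hypothetical helper, and it assumes that patterns are written without delimiters, that `/` is already escaped, that matching is case-insensitive, and that the first capture group holds the value.

```php
<?php
// Hypothetical helper, NOT part of preg_magic: illustrates the idea only.
// Assumes bare patterns (no delimiters) with "/" already escaped.
function extract_fields(array $fields, string $input): array
{
    $result = array();
    foreach ($fields as $name => $pattern) {
        // Wrap the bare pattern in delimiters.
        // "i" = case-insensitive, "s" = dot also matches newlines (both are assumptions).
        if (preg_match('/' . $pattern . '/is', $input, $m)) {
            // Use the first capture group, or the whole match if there is none.
            $result[$name] = $m[1] ?? $m[0];
        } else {
            $result[$name] = null; // field not found in the input
        }
    }
    return $result;
}

// Made-up input, shaped like the page the example patterns target.
$html = '<span>Card number:</span> 0110 0220 0330 0440 '
      . '<td>Card balance:</td> <td>$23.00</td>';

$fields = array(
    'cardnumber' => 'card number:<\/span>\s*((?:\d{4}\s*){3}\d{4})',
    'balance'    => 'Card balance:<\/td>\s*<td>\s*(\$\d+\.\d+)',
);

$result = extract_fields($fields, $html);
print_r($result);
```

The point of the design is that the field map is plain data, so a new parsing task only needs a new array, not new control flow.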

Here’s the included example:

    <?php
    include_once('../preg_magic.php');

    // Field name => regular expression (no delimiters).
    // The captured group becomes the field's value in the output.
    $HistoryFields = array(
        'cardnumber'    => 'card number:<\/span>\s*((?:\d{4}\s*){3}\d{4})',
        'balance'       => 'Card balance:<\/td>\s*<td>\s*(\$\d+\.\d+)',
        // A nested array flagged with '@table' describes repeating rows.
        'history'       => array(
            '@table'        => true,
            '@table_start'  => 'class="results_table">.+?<\/tr>\s*',
            '@table_end'    => '\s*<\/table>',
            '@table_fields' => array(
                'stamp'    => '\s*<tr class="(?:odd|even)">\s*'
                            . '<td>(\d+-\w+-\d+\s*\d+:\d+\s*\w+)<\/td>',
                'type'     => '<td.*?>\s*([^<]*?)\s*<\/td>',
                'location' => '<td.*?>\s*([^<]*?)\s*<\/td>',
                'amount'   => '<td.*?>\s*([^<]*?)\s*<\/td>',
                'gst'      => '<td.*?>\s*([^<]*?)\s*<\/td>\s*<\/tr>\s*'
            )
        )
    );

    // Collapse runs of whitespace and &nbsp; entities into single spaces.
    $HistoryHTML = preg_replace("/\s\s+|&nbsp;/", ' ', file_get_contents('history.html'));

    $Output = preg_magic::execute($HistoryFields, $HistoryHTML);

    print_r($Output);
Download this code: full_example.php
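The `preg_replace` call in the example is worth a closer look, because the field patterns rely on the input being tidied first. The snippet below runs the same pattern on a made-up fragment (the `$raw` string is an assumption, not taken from history.html) to show what it does and does not change.

```php
<?php
// Made-up fragment with an &nbsp; entity and a newline followed by indentation.
$raw = "<td>Card&nbsp;balance:</td>\n      <td>\$23.00</td>";

// Same pattern as in the example: two or more whitespace characters,
// or a literal &nbsp;, become a single space.
$clean = preg_replace('/\s\s+|&nbsp;/', ' ', $raw);

echo $clean, "\n";
// prints: <td>Card balance:</td> <td>$23.00</td>

// A single whitespace character on its own is left untouched,
// so a lone newline between two tags survives.
$single = preg_replace('/\s\s+|&nbsp;/', ' ', "a\nb");
```

Because lone newlines can survive, patterns applied afterwards are safer using `\s*` between tags than a literal space.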

Example Output

Array
(
 [cardnumber] => 0110 0220 0330 0440
 [balance] => $23.00
 [history] => Array
 (
  [0] => Array
  (
   [stamp] => 05-Feb-2008 07:27 AM
   [type] => Touch Off
   [location] => Milton
   [amount] => -5.40
   [gst] => *
  )
  [1] => Array
  (
   [stamp] => 05-Feb-2008 06:11 AM
   [type] => Touch On
   [location] => Caboolture
   [amount] => 
   [gst] => 
  )
  [2] => Array
  (
   [stamp] => 04-Feb-2008 05:12 PM
   [type] => Touch Off
   [location] => Caboolture
   [amount] => -5.40
   [gst] => *
  )
  [3] => Array
  (
   [stamp] => 04-Feb-2008 03:54 PM
   [type] => Touch On
   [location] => Milton
   [amount] => 
   [gst] => 
  )
  [4] => Array
  (
   [stamp] => 04-Feb-2008 07:19 AM
   [type] => Touch Off
   [location] => Milton
   [amount] => -5.40
   [gst] => *
  )
  //SNIP - You don't need the whole output :)//
 )
)
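For the curious, the `history` rows above could be produced along the following lines. This is only a sketch under stated assumptions, not preg_magic's real code: `extract_table` is a hypothetical function, the per-column patterns are assumed to be joined with `\s*` into one row pattern, and the `i` and `s` modifiers are assumed. The sample HTML is made up to mirror the output shown.

```php
<?php
// Hypothetical helper, NOT preg_magic's real implementation.
function extract_table(array $spec, string $input): array
{
    // Isolate the text between the start and end markers.
    $outer = '/' . $spec['@table_start'] . '(.*?)' . $spec['@table_end'] . '/is';
    if (!preg_match($outer, $input, $m)) {
        return array();
    }

    // Join the column patterns into one row pattern (joining with \s* is an assumption).
    // Each column pattern contributes exactly one capture group.
    $names = array_keys($spec['@table_fields']);
    $row   = '/' . implode('\s*', $spec['@table_fields']) . '/is';

    $rows = array();
    if (preg_match_all($row, $m[1], $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            // $match[0] is the whole row; the remaining entries line up with $names.
            $rows[] = array_combine($names, array_slice($match, 1));
        }
    }
    return $rows;
}

$spec = array(
    '@table_start'  => 'class="results_table">.+?<\/tr>\s*',
    '@table_end'    => '\s*<\/table>',
    '@table_fields' => array(
        'stamp'    => '\s*<tr class="(?:odd|even)">\s*'
                    . '<td>(\d+-\w+-\d+\s*\d+:\d+\s*\w+)<\/td>',
        'type'     => '<td.*?>\s*([^<]*?)\s*<\/td>',
        'location' => '<td.*?>\s*([^<]*?)\s*<\/td>',
        'amount'   => '<td.*?>\s*([^<]*?)\s*<\/td>',
        'gst'      => '<td.*?>\s*([^<]*?)\s*<\/td>\s*<\/tr>\s*',
    ),
);

// Made-up table: one header row, then two data rows.
$html = '<table class="results_table"><tr><th>Date</th><th>Type</th>'
      . '<th>Location</th><th>Amount</th><th>GST</th></tr> '
      . '<tr class="odd"> <td>05-Feb-2008 07:27 AM</td> <td>Touch Off</td> '
      . '<td>Milton</td> <td class="num">-5.40</td> <td>*</td> </tr> '
      . '<tr class="even"> <td>05-Feb-2008 06:11 AM</td> <td>Touch On</td> '
      . '<td>Caboolture</td> <td></td> <td></td> </tr> '
      . '</table>';

$rows = extract_table($spec, $html);
print_r($rows);
```

Empty cells come back as empty strings rather than being dropped, which matches the blank `amount` and `gst` values in the "Touch On" rows above.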

Download
file: preg_magic-2008-02-05.tar.gz [12.59kB]

---== Copyright Shannon Wynter - All rights reserved - All wrongs avenged ==---