Skip to main content

PHP Simple HTML DOM Parser is a dream utility for developers that work with both PHP and the DOM because developers can easily find DOM elements using PHP. The following script checks two websites for changes.

View full post...

// Pull in PHP Simple HTML DOM Parser
include("simplehtmldom/simple_html_dom.php");

// Settings on top
$sitesToCheck = array(
          // id is the page ID for selector
          array("url" => "http://www.arsenal.com/first-team/players", "selector" => "#squad"),
          array("url" => "http://www.liverpoolfc.tv/news", "selector" => "ul[style='height:400px;']")
        );
$savePath = "cachedPages/";
$emailContent = "";

// For every page to check...
foreach($sitesToCheck as $site) {
  $url = $site["url"];

  // Calculate the cachedPage name, set oldContent = "";
  $fileName = md5($url);
  $oldContent = "";

  // Get the URL's current page content
  $html = file_get_html($url);

  // Find content by querying with a selector, just like a selector engine!
  foreach($html->find($site["selector"]) as $element) {
    $currentContent = $element->plaintext;;
  }

  // If a cached file exists
  if(file_exists($savePath.$fileName)) {
    // Retrieve the old content
    $oldContent = file_get_contents($savePath.$fileName);
  }

  // If different, notify!
  if($oldContent && $currentContent != $oldContent) {
    // Here's where we can do a whoooooooooooooole lotta stuff
    // We could tweet to an address
    // We can send a simple email
    // We can text ourselves

    // Build simple email content
    $emailContent = "David, the following page has changed!nn".$url."nn";
  }

  // Save new content
  file_put_contents($savePath.$fileName,$currentContent);
}

// Send the email if there's content!
if($emailContent) {
  // Sendmail!
  mail("david@davidwalsh.name","Sites Have Changed!",$emailContent,"From: alerts@davidwalsh.name","rn");
  // Debug
  echo $emailContent;
}