Quantcast
Channel: PHPExcel Forum Rss Feed
Viewing all articles
Browse latest Browse all 2707

New Post: searching files very slow

$
0
0
Hello!

I created a simple program which is supposed to search 'query' in the first column of all .xls or .xlsx files in a given 'directory' and then return each row (6 cells) where 1st cell is matching 'query', grouping them as a table.

This program actually works as it was supposed, but it's VERY slow to be usable. The script is working about 400 seconds and more, also it takes a lot of memory - up to several hundreds of Mb for some large files. In the 'directory' I have 38 .xls and .xlsx files each up to 5 Mb of size (~ up to 120 000 rows).

So my question: is it possible to speed up somehow the program (maybe the algorythm in general or the way I use PHPExcel) to consume some reasonable amount of time (several seconds)? Would be ideal to speed it up to 1 second or less. Thank you in advance!

My code:
<?php
/** Error reporting */
error_reporting(E_ALL);
ini_set('display_errors', TRUE); 
ini_set('display_startup_errors', TRUE); 

define('EOL',(PHP_SAPI == 'cli') ? PHP_EOL : '<br />');

/** Include PHPExcel */require_once dirname(__FILE__) . '/../../PHPExcel/Classes/PHPExcel.php';

date_default_timezone_set('UTC');

if(isset($_POST['query']) && trim($_POST['query']) != '')
{
    $dir    = dirname(__FILE__);
    $files = scandir($dir);
    $found = False;
    
    print"<html><head><meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\"><title>Test search</title>";
    print"<table border=1>\n";
    
    print"<tr>\n";
    print"<td>Col1</td><td>Col2</td><td>Col3</td><td>Col4</td><td>Col5</td><td>Col6</td>";
    print"</tr>\n";
    
    
    for($i=0;$i<count($files);$i++){    
        $file=$files[$i];
        $is_xls = stripos($file, ".xls");
        $is_xlsx = stripos($file, ".xlsx");
        if($is_xls === False && $is_xlsx === False)
            continue;
        $pathToFile =  $dir."/".$file;
        $inputFileType = PHPExcel_IOFactory::identify($pathToFile);
        $objReader = PHPExcel_IOFactory::createReader($inputFileType);
        $objReader->setReadDataOnly(true);
        $objPHPExcel = $objReader->load($pathToFile);
        $objWorksheet = $objPHPExcel->getActiveSheet();
        // Get the highest row and column numbers referenced in the worksheet
        $highestRow = $objWorksheet->getHighestRow(); 
        $highestColumn = $objWorksheet->getHighestColumn(); 
        $highestColumnIndex = PHPExcel_Cell::columnIndexFromString($highestColumn);
        
        $query = str_replace('-', '', preg_replace('/\s+/', '', strtolower(trim($_POST['query']))));
        
        for($row=1;$row<$highestRow;$row++){
        
            $firstColumn = $objWorksheet->getCellByColumnAndRow(0, $row)->getValue();
            $name = str_replace('-', '', preg_replace('/\s+/', '', strtolower(trim($firstColumn))));
            $pos = strpos($name, $query);
            if( $pos == True || $pos === 0){
                $found=True;
                print"<tr>\n";
                print"<td>".$objWorksheet->getCellByColumnAndRow(0, $row)->getValue()."</td>
                <td>".$objWorksheet->getCellByColumnAndRow(1, $row)->getValue()."</td>
                <td>".$objWorksheet->getCellByColumnAndRow(2, $row)->getValue()."</td>
                <td>".$objWorksheet->getCellByColumnAndRow(3, $row)->getValue()."</td>
                <td>".$objWorksheet->getCellByColumnAndRow(4, $row)->getValue()."</td>
                <td>".$objWorksheet->getCellByColumnAndRow(5, $row)->getValue()."</td>\n";
                print"</tr>\n";
            }
        }
        
        $objPHPExcel->disconnectWorksheets();
        unset($objReader);
        unset($objPHPExcel);
        unset($objWorksheet);
    }
    if(!$found){
        print"<tr>\n";
        print"<td>Nothing found.</td>\n";
        print"</tr>\n";
    }
    print"</table>\n";
} else# If "search" parameter doesn't exist, show an error and exit 
{
    header("HTTP/1.1 500 Internal Server Error");
print"<h1>500 Internal Server Error</h1>".
      "Request is not set";
exit;
}
My software:
PHPExcel 1.8.0
PHP 5.6.0
Apache 2.4.10
Windows 7

Viewing all articles
Browse latest Browse all 2707

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>