ABIS Infor - 2012-04

PHP, more than a web interface

Peter Vanroose (ABIS) - February 2012

Abstract

PHP: we all know it as a server-side programming language for web servers. But actually PHP can offer us much more! In this short contribution, some concrete examples are worked out, each of them explained in enough detail even for the PHP-ignoramus. Together they demonstrate what PHP can offer us outside the context of a web server as well.

PHP in the context of a web interface

Originally (around 1995) developed by Rasmus Lerdorf as a "Personal Home Page" preprocessing tool to generate HTML for his personal web page, PHP was quickly picked up by others. Nowadays it is the de facto "server-side scripting" language for web pages: embedded in the HTML of a page, it dynamically generates HTML or just performs server-side actions such as communicating with a relational database server.

Actually, PHP is practically exclusively used as the programming interface of a web server (very often Apache); it seems to be a well-kept secret that PHP programs can also be written and used outside the context of HTML, and actually even outside a web server. So let's zoom in on that aspect of the language!

PHP as a "stand-alone" programming language

Just like with any other scripting language, you can pass a PHP program to the PHP interpreter, which compiles it and immediately executes it. Every PHP installation has a program called "php", which is the command-line interface to that interpreter. Just pass it your PHP source file (or set it as the default program for the .php file extension on Windows) and off you go!

Formally, PHP code is always embedded in HTML (which may be empty), so the code in a .php source file must be placed between PHP tags:

  <?php
   ...
  ?>

where the "..." will of course be replaced by the actual PHP program. The closing tag may be left out when it is the last thing in the file.
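As a minimal sketch of such a file (the file name hello.php and the variable $greeting are just illustrative choices, not something PHP prescribes), the following complete program prints one line when started from the command line:

```php
<?php
// hello.php: a complete stand-alone PHP program (hypothetical file name).
// Run it from the command line with:  php hello.php
$greeting = "Hello from PHP on the command line";
print $greeting . "\n";
// The closing tag may be omitted when nothing follows the PHP code.
```

Nothing web-related is involved here: no HTML is generated and no web server needs to be running.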

The PHP programming language is, syntactically speaking, very similar to Java, or actually even more to Perl. (If you know one of these languages, be aware though of several subtle but important differences...) In my opinion the most important reasons to consider PHP for your application development (instead of the more typical alternatives like Java, C#, C++, C, Perl, Python, Ruby, ...) are probably (1) its "no-nonsense" syntax, more importantly (2) its load of extension libraries such as ODBC, CURL, LDAP, or XML (but that's also true for most other languages), and maybe most importantly (3) its powerful built-in text manipulation features, including several functions that support pattern matching using regular expressions. An example of the latter follows.

The PHP programming language is object-oriented, but actually the OO aspect is not enforced upon the programmer, and indeed most PHP programs (and certainly the smaller ones) are written in a non-OO fashion. All my examples are non-OO.

When used outside the web server context, PHP lacks a GUI component, so stand-alone PHP programs are most useful in a command-line (text-based) interface context which is natural for Unix users but may feel uncomfortable to MS-Windows users. Not really a limitation actually: just click on that .php source file and a black window opens and communicates with you - agreed, in a text-based, non-graphical I/O way but who cares if the program is functionally what you need?

If that is your situation, it could be a good idea to add the following statement at the end of your PHP program, so that the window does not close before you have read the output:

	readline("Press ENTER to terminate this program. ");

Stand-alone PHP programs can indeed be very powerful and unbeatable when it comes to text processing, mathematics, data analysis or reformatting data like dates, names, addresses, or whatever you need. Let's look at some examples.

Text manipulation

Suppose you need to search for dates in some text document. Let's say you just need to find out whether a date, i.e., some text of the form "two digits slash two digits slash four digits" is present in that file or not.

First of all you'll need to open that file, read its content into memory, then search that content for the date "pattern" as described above. Reading the content of a file into a variable is easy in PHP:

	$content = file_get_contents("myfile.txt");

where "myfile.txt" is the file name (in the current directory) to be loaded, and $content will be the variable into which the file content data will be stored after completion of the function call to "file_get_contents". Note that variables are always to be prefixed with a dollar sigil. There is no need to declare them: they come into existence when first used.

Next let's search through the content of the variable $content for a date, i.e., something of the form "dd/dd/dddd", where "d" stands for a digit, i.e., a character between "0" and "9". The function dedicated to exactly this purpose is preg_match. It takes two arguments: the search pattern ("what" to search for) and the target string ("where" to search in). It returns 1 when the pattern is found and 0 otherwise, values which behave like "true" and "false" in a condition.

	preg_match("!\d\d/\d\d/\d\d\d\d!", $content);

The first argument of preg_match is a so-called regular expression: it describes the date pattern. A regular expression must be enclosed in delimiters (here the two exclamation marks, but you can choose any character that is not a letter, a digit, a backslash or whitespace). Further, anything that you search for literally can be written literally (here the two slashes); some characters have a special meaning; e.g. the combination "\d" means "a digit".

A program with just the above two statements does not yet do anything useful, since nothing is displayed, let alone a different output depending on the return value of preg_match. Well, just add a (conditional) print statement which will write something on your screen; e.g.:

	if (preg_match("!\d\d/\d\d/\d\d\d\d!", $content))
	{  print "There was a date in myfile.txt\n";  }
	else
	{  print "There was no date in myfile.txt\n";  }

Note the "\n" at the end of the text to be printed: this prints an "end-of-line" character to the screen, which allows a possible next print statement to place its output on a new line on the screen.
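To try the pattern without needing a file, the test can be wrapped in a small function and applied to a few strings. This is only a sketch: the function name contains_date and the sample sentences are made up for the illustration.

```php
<?php
// Hypothetical helper: true when $text contains something like dd/dd/dddd.
function contains_date($text)
{
    // preg_match returns 1 on a match and 0 otherwise; comparing with 1
    // makes the function itself return a real boolean.
    return preg_match('!\d\d/\d\d/\d\d\d\d!', $text) === 1;
}

// Made-up sample sentences, one with a date and one without.
$samples = array("The course starts on 13/06/2012.", "No date in here.");
foreach ($samples as $s) {
    print $s . " => " . (contains_date($s) ? "date found" : "no date") . "\n";
}
```

The pattern is written between single quotes here, so that PHP passes the backslashes on to preg_match untouched.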

Now suppose that we don't just want to detect the presence of a date, but we want to "normalise" dates, e.g., convert all occurrences of dates in the day/month/year format into the ISO format: year-month-day. And keep the rest of the file content unchanged. The function preg_replace is ideally suited for this purpose:

	$content = preg_replace("!(\d\d)/(\d\d)/(\d\d\d\d)!", "$3-$2-$1", $content);

preg_replace returns the modified content of $content, which we use here to overwrite that variable. Observe the $1, $2, and $3 expressions in the replacement string (second argument of preg_replace): these are not PHP variables but back-references, which preg_replace replaces by the actual content of the three parenthesized groups in the regular expression.

If we next want to write $content into a file, thereby either overwriting myfile.txt or creating a new myfile2.txt, we could say:

	file_put_contents("myfile2.txt", $content);

Note that all dates will have been reformatted now, since preg_replace by default replaces every occurrence of the pattern.
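The effect of the replacement is easiest to see on a short string instead of a file. The following sketch uses a made-up sentence with two dates; the variable names $text and $iso are illustrative only.

```php
<?php
// Made-up sample text containing two dates in day/month/year format.
$text = "Sessions on 13/06/2012 and 05/11/2012.";
// Single quotes keep PHP from looking for variables inside the strings.
$iso = preg_replace('!(\d\d)/(\d\d)/(\d\d\d\d)!', '$3-$2-$1', $text);
print $iso . "\n";   // prints: Sessions on 2012-06-13 and 2012-11-05.
```

Keep in mind that the pattern only looks at the shape of the text: it does not check that the digits form a valid date.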

There are two more useful functions when it comes to textual pattern matching: preg_split and preg_grep. The former will create an array of text fragments from a given text, based on a "separator" pattern, while the latter will "filter" a given array based on a search pattern, returning a shorter array with just the matching entries.

As an example, suppose we want to find all 3-letter words in a file. First split the text into "words":

	$wordlist = preg_split("!\W+!", file_get_contents("myfile.txt"));

The regular expression "\W" stands for "any non-word character", which includes spaces, punctuation marks, ...; the "+" means "one or more of this", so any run of non-word characters is seen as a single separator. (Word characters are letters, digits and the underscore.) So $wordlist contains all words of the text; the only possible extras are an empty string at the start or the end, when the text begins or ends with a separator. Now filter this array by throwing out all words which do not have exactly 3 characters:

	$wordlist = preg_grep("!^...$!", $wordlist);

Again, we overwrite the existing variable (an array now) with its new content. The regular expression now contains three ingredients: "^" stands for "beginning of string", "." means "any single character", and "$" represents the end-of-string. Since preg_grep does the array iteration, "string" stands for each of the individual list elements, i.e., words.

Finally print out $wordlist to the screen, one word per line:

	foreach ($wordlist as $w) { print "$w\n"; }
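Put together on a short made-up sentence, the split-and-filter steps could look as follows. Two details in this sketch go beyond the statements above and are optional refinements: the PREG_SPLIT_NO_EMPTY flag, and the call to array_values.

```php
<?php
// Made-up sample text; in the article the text comes from myfile.txt.
$text = "The cat and the dog sat on a mat, far away.";
// PREG_SPLIT_NO_EMPTY drops the empty string that would otherwise
// follow the final "." of the text.
$wordlist = preg_split('!\W+!', $text, -1, PREG_SPLIT_NO_EMPTY);
// Keep only the entries that consist of exactly three characters.
$three = preg_grep('!^...$!', $wordlist);
// preg_grep preserves the original array keys; re-index for a clean list.
$three = array_values($three);
foreach ($three as $w) { print "$w\n"; }
```

This prints the eight three-letter words of the sentence, in their original order and one per line.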

Or if you want to write the words in alphabetic order to a file, first sort the $wordlist array, then write the words into a single variable $words, and finally use the function file_put_contents as we did before.

	$words = ""; sort($wordlist);
	foreach ($wordlist as $w) { $words .= "$w\n"; }
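One thing to be aware of: by default, sort puts all capitalised words before the lower-case ones. A possible variation, sketched here with a made-up word list and a temporary output file (both assumptions for the sake of the example), sorts without regard to case and uses implode to build the text:

```php
<?php
// Made-up word list; in the article it is the result of preg_grep.
$wordlist = array("the", "Cat", "and", "dog");
// SORT_FLAG_CASE | SORT_STRING compares the words case-insensitively.
sort($wordlist, SORT_FLAG_CASE | SORT_STRING);
// Join the words with newlines, and end the last line as well.
$words = implode("\n", $wordlist) . "\n";
// Hypothetical output location: a fresh file in the system temp directory.
$target = tempnam(sys_get_temp_dir(), "words");
file_put_contents($target, $words);
print file_get_contents($target);
```

With the default sort, "Cat" would come first; with these flags the order is and, Cat, dog, the.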

Some useful libraries

The basic functionality of PHP is built into the interpreter. But most functions, including the preg-functions used in the previous section, are provided by extension libraries. Some of these (like the one behind the preg-functions) are part of every standard installation, while others may or may not be present on a particular system.

Let me mention just two interesting extension libraries of PHP: the XML library and the MySQL library. They provide functions for parsing and writing XML documents, and for communicating with a MySQL database server, respectively.

Just to give an idea of the possibilities, the following PHP program will read and parse the XML file called "courses.xml" which is supposed to contain course information in the form

  <Course><name>PHP programming: fundamentals</name><date>13.06.2012</date></Course>

The program will then write that information into a two-column MySQL table called "courseinfo".

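  $xml = new DOMDocument();
  $xml->load("courses.xml");
  // the mysql_* functions were removed in PHP 7; mysqli is their successor
  $db = new mysqli("localhost", "username", "password", "courses");
  $db->set_charset("utf8");
  // a prepared statement keeps quotes in the data from breaking the SQL
  $stmt = $db->prepare("INSERT INTO courseinfo VALUES (?, ?)");
  foreach ($xml->getElementsByTagName('Course') as $row) {
    $name = $row->getElementsByTagName('name')->item(0)->nodeValue;
    $date = $row->getElementsByTagName('date')->item(0)->nodeValue;
    $stmt->bind_param("ss", $name, $date);
    $stmt->execute();
  }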
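The XML half of this program can be tried out without a database server. The sketch below parses the XML from a string instead of a file and just prints what it finds. The <Courses> root element and the second course are invented for the example; the article only specifies the form of a single <Course> element.

```php
<?php
// Sample data in the format described above, wrapped in a hypothetical
// <Courses> root element (an XML document needs exactly one root).
// The second course is made up to show that the loop handles several rows.
$data = '<Courses>'
      . '<Course><name>PHP programming: fundamentals</name><date>13.06.2012</date></Course>'
      . '<Course><name>Example course</name><date>20.06.2012</date></Course>'
      . '</Courses>';
$xml = new DOMDocument();
$xml->loadXML($data);            // parse from a string instead of a file
$courses = array();
foreach ($xml->getElementsByTagName('Course') as $row) {
    $name = $row->getElementsByTagName('name')->item(0)->nodeValue;
    $date = $row->getElementsByTagName('date')->item(0)->nodeValue;
    $courses[] = array($name, $date);   // collect instead of inserting
    print "$name\t$date\n";
}
```

Each entry of $courses holds exactly the two values that the database version writes into one row of the table.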

What else is there?

What follows are some (maybe useful) little PHP programs, one-liners actually, in response to concrete questions:

  • What's on television today? print file_get_contents("http://www.een.be/tv-gids");
  • What date is Easter in 2013? print date("d/m/Y",easter_date(2013));
  • At what time does the sun rise tomorrow morning in Leuven?
    print date_sunrise(time()+86400,1,50.88,4.71,90,1); (Add an hour in case tomorrow is DST)
  • Give information on the content (i.e., the file type) of all files in the current directory:
    $f=finfo_open();foreach(glob("*") as $n){print $n."\t".finfo_file($f,$n)."\n";}
  • What's the probability to throw at least one six with 10 dice? print 1-pow(5/6,10);
  • How many times does the word "ABIS" occur in the file whose name is given on the command line?
    $f=file_get_contents($argv[1]); $l=preg_split('/ABIS/i',$f); print count($l)-1;
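The last one-liner counts every occurrence of the letters "ABIS", also inside longer words. If only whole words should count, a variation along the following lines is possible. It is a sketch, not a one-liner: the function name count_word and the sample sentence are made up.

```php
<?php
// Hypothetical helper: number of times $word occurs as a whole word in $text.
function count_word($word, $text)
{
    // preg_quote protects any regex metacharacters in $word;
    // \b marks a word boundary, and the i modifier ignores case.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
    // preg_match_all returns the number of matches found.
    return preg_match_all($pattern, $text, $matches);
}

print count_word("ABIS", "ABIS teaches PHP; ask abis. Not: rabist.") . "\n";
```

For the sample sentence this prints 2: "ABIS" and "abis" are counted, while the "abis" inside "rabist" is not.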

More information?

First of all there is the official website of PHP: www.php.net, which offers, among other things, very extensive online documentation. And, needless to say, you can of course follow a fundamentals course on PHP at ABIS! For more information please refer to our website: http://www.abis.be/html/en1521.html.