only for RuBoard - do not distribute or recompile

2.6 Strings

A string of characters—a string—is probably the most commonly used data type when developing scripts, and PHP provides a large library of string functions to help transform, manipulate, and otherwise manage strings. We introduced PHP strings earlier, in Section 2.1.1. Here, we examine string literals in more detail and describe some of the useful string functions PHP provides.

2.6.1 String Literals

As already shown in previous examples, enclosing characters in single quotes or double quotes can create a string literal. Single-quoted strings are the simplest form of string literal; double-quoted strings are parsed to substitute variable names with the variable values and allow characters to be encoded using escape sequences. Single-quoted strings don't support all the escape sequences, only \' to include a single quote and \\ to include a backslash.

Tab, newline, and carriage-return characters can be included in a double-quoted string using the escape sequences \t, \n, and \r, respectively. To include a backslash, a dollar sign, or a double quote in a double-quoted string, use the escape sequences \\, \$, or \".

Other control characters and characters with the most significant bit set can be included using escaped octal or hexadecimal sequences. For example, to include the umlauted character ö (character code 246 in the ISO-8859-1 character set), either the octal sequence \366 or the hexadecimal sequence \xf6 can be used:

//Print a string that includes a lowercase
//o with the umlaut mark
echo "See you at the G\xf6teborg Film Festival";

PHP uses eight-bit characters in string values, so the range of characters that can be represented is \000 to \377 in octal notation or \x00 to \xff in hexadecimal notation.
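
The difference between the two quoting styles can be seen in a minimal sketch such as the following. The variable names are illustrative only, and strlen( ), described in Section 2.6.1.2, is used simply to make the effect of each escape sequence visible:

```php
<?php
// In single quotes, \n is two characters: a backslash and the letter n
$single = 'a\nb';

// In double quotes, \n is a single newline character
$double = "a\nb";

echo strlen($single), "\n";   // prints 4
echo strlen($double), "\n";   // prints 3

// The octal and hexadecimal escapes encode the same eight-bit character
$octal = "G\366teborg";
$hex   = "G\xf6teborg";

echo ($octal === $hex) ? "same" : "different";  // prints same
```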

Unlike many other languages, PHP allows newline characters to be included directly in a string literal. The following example shows the variable $var assigned a string that contains a newline character:

// This is OK. $var contains a newline character
$var = 'The quick brown fox
        jumps over the lazy dog';

This feature is used in later chapters to construct SQL statements that are readable in the source code, for example:

$query = "SELECT max(order_id) 
          FROM orders 
          WHERE cust_id = $custID";

Other control characters, such as tabs and carriage returns, and characters with the most significant bit set—those in the range \x80 to \xff—can also be directly entered into a string literal. We recommend that escape sequences be used in practice to aid readability and portability of source files.

2.6.1.1 Variable substitution

Variable substitution provides a convenient way to output variables embedded in string literals. When PHP parses double-quoted strings, variable names are identified when a $ character is found and the value of the variable is substituted. We have already used examples earlier in this chapter such as:

$cm = 127;
$inch = $cm / 2.54;

// prints "127 centimeters = 50 inches"
echo "$cm centimeters = $inch inches";

When the name of the variable is ambiguous, braces {} can delimit the name as shown in the following example:

$memory = 256;

// Fails: no variable called $memoryMbytes
$message = "My computer has $memoryMbytes of RAM";

// Works: curly braces are used to delimit the variable name
$message = "My computer has {$memory}Mbytes of RAM";

// This also works, but the ${...} form is deprecated as of PHP 8.2
$message = "My computer has ${memory}Mbytes of RAM";

Braces are also used for more complex variables, such as multidimensional arrays and objects:

echo "Mars is {$planets['Mars']['dia']} times the diameter of the Earth";

echo "There are {$order->count} green bottles ...";

Example 2-4 shows how the multidimensional array $planets is assigned, and objects and the member access operator -> are discussed in Section 2.11.
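
For readers who want to try the brace syntax without the surrounding examples, here is a self-contained sketch. The array contents and the use of a generic stdClass object are our assumptions for illustration; they are not the data from Example 2-4:

```php
<?php
// Hypothetical data for illustration only; not the values from Example 2-4
$planets = array("Mars" => array("dia" => 0.53));

// A generic object with a single member variable
$order = new stdClass();
$order->count = 10;

$line1 = "Mars is {$planets['Mars']['dia']} times the diameter of the Earth";
$line2 = "There are {$order->count} green bottles ...";

echo $line1, "\n";  // Mars is 0.53 times the diameter of the Earth
echo $line2, "\n";  // There are 10 green bottles ...
```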

2.6.1.2 Length of a string

The length of a string is determined with the strlen( ) function, which returns the number of eight-bit characters in the subject string:

integer strlen(string subject)

Consider an example that prints 16:

print strlen("This is a String");  // prints 16
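
Because strlen( ) counts eight-bit characters, a character that an encoding stores in more than one byte is counted more than once. The sketch below assumes a UTF-8 encoding for the second string (our assumption, not something the text above relies on) and spells the bytes out with escape sequences so that the result does not depend on how the source file is saved:

```php
<?php
// One byte per character: "G", 0xF6, then "teborg"
$latin1 = "G\xf6teborg";

// The same word with the umlauted o stored as two bytes (0xC3 0xB6)
$utf8 = "G\xc3\xb6teborg";

echo strlen($latin1), "\n";  // prints 8
echo strlen($utf8), "\n";    // prints 9
```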

2.6.2 Printing and Formatting Strings

Earlier we presented the basic method for outputting text—with echo and print—and the functions print_r( ) and var_dump( ), which can determine the contents of variables during debugging.

PHP provides several other functions that allow more complex and controlled formatting of strings.

2.6.2.1 Creating formatted output with sprintf( ) and printf( )

Sometimes more complex output is required than can be produced with echo or print. For example, a floating-point value such as 3.14159 might need to be rounded to 3.14 as it is output. For complex formatting, the sprintf( ) or printf( ) functions are useful:

string sprintf (string format [, mixed args...]) 
integer printf (string format [, mixed args...])

The operation of these functions is modeled on the C programming language functions of the same names, and both expect a format string with optional conversion specifications, followed by variables or values as arguments to match the conversion specifications. The difference between sprintf( ) and printf( ) is that the output of printf( ) goes directly to the output buffer PHP uses to build an HTTP response, whereas the output of sprintf( ) is returned as a string.

Consider an example printf statement:

printf("Result: %.2f\n", $variable);

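The format string Result: %.2f\n is the first parameter to the printf statement. Strings like Result: are output the same as with echo or print. The %.2f component is a conversion specification: the % character introduces it, the .2 is a precision that requests two digits after the decimal point, and the f indicates that the value is formatted as a floating-point number.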

In the example, the value that is actually output using the formatting string %.2f is the value of the second parameter to the printf function—the variable $variable.
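
The difference between the two functions can be sketched as follows. The value assigned to $variable and the names $text and $length are illustrative assumptions:

```php
<?php
$variable = 3.14159;   // illustrative value

// sprintf() returns the formatted string instead of printing it
$text = sprintf("Result: %.2f\n", $variable);

// printf() prints the string and returns the length of what was printed
$length = printf("Result: %.2f\n", $variable);   // prints "Result: 3.14"

echo (strlen($text) === $length) ? "same length" : "different length";
```

Capturing the string with sprintf( ) is useful when the formatted text has to be stored or passed to another function rather than sent straight to the browser.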

To illustrate other uses of printf, consider the examples in Example 2-5.

Example 2-5. Using printf to output formatted data
<!DOCTYPE HTML PUBLIC 
   "-//W3C//DTD HTML 4.0 Transitional//EN"
   "http://www.w3.org/TR/html4/loose.dtd" >
<html>
<head>
  <title>Examples of using printf(  )</title>
</head>
<body bgcolor="#ffffff">
<h1>Examples of using printf(  )</h1>
<pre>
<?php
  // Outputs "3.14"
  printf("%.2f\n", 3.14159);

  // Outputs "      3.14"
  printf("%10.2f\n", 3.14159);

  // Outputs "3.1415900000"
  printf("%.10f\n", 3.14159);

  // Outputs "halfofthe"
  printf("%.9s\n", "halfofthestring");

  // Outputs " 3.14 3.141590   3.142"
  printf("%5.2f %f %7.3f\n", 3.14159, 3.14159, 3.14159);

  // Outputs "1111011 123 123.000000 test"
  printf("%b %d %f %s\n", 123, 123, 123, "test");
?>
</pre>
</body>
</html>

2.6.2.2 Padding strings

A simple method to space strings is to use the str_pad( ) function:

string str_pad(string input, int length [, string padding [, int pad_type]])

Characters are added to the input string so that the resulting string is length characters long. The following example shows the simplest form of str_pad( ), which adds spaces to the end of the input string:

// prints "PHP" followed by three spaces
echo str_pad("PHP", 6); 

An optional string argument padding can be supplied that is used instead of the space character. By default, padding is added to the end of the string. By setting the optional argument pad_type to STR_PAD_LEFT or to STR_PAD_BOTH, the padding is added to the beginning of the string or to both ends. The following example shows how str_pad( ) can create a justified index:

$players =  
   array("DUNCAN, king of Scotland"=>"Larry", 
         "MALCOLM, son of the king"=>"Curly",  
         "MACBETH"=>"Moe",
         "MACDUFF"=>"Rafael");
 
echo "<pre>";
 
// Print a heading
echo str_pad("Dramatis Personae", 50, " ", STR_PAD_BOTH) . "\n";
 
// Print an index line for each entry
foreach($players as $role=>$actor)
  echo str_pad($role, 30, ".")
      . str_pad($actor, 20, ".", STR_PAD_LEFT) . "\n";
 
echo "</pre>";

The example prints:

                Dramatis Personae                 
DUNCAN, king of Scotland.....................Larry
MALCOLM, son of the king.....................Curly
MACBETH........................................Moe
MACDUFF.....................................Rafael
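
A few further cases may help when experimenting with str_pad( ). This is only a sketch; the values and variable names are invented for illustration:

```php
<?php
// Pad on the left with zeros to a width of 6
$invoice = str_pad("42", 6, "0", STR_PAD_LEFT);
echo $invoice, "\n";                 // prints 000042

// The input is returned unchanged when it is already long enough
$unchanged = str_pad("PHP and MySQL", 6);
echo $unchanged, "\n";               // prints PHP and MySQL

// A multi-character padding string is repeated as often as needed
$ruled = str_pad("Title", 11, "-=");
echo $ruled, "\n";                   // prints Title-=-=-=
```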

2.6.2.3 Changing case

The following PHP functions return a copy of the subject string with changes in the case of the characters:

string strtolower(string subject)
string strtoupper(string subject)
string ucfirst(string subject)
string ucwords(string subject)

The following fragment shows how each operates:

print strtolower("PHP and MySQL"); // php and mysql
print strtoupper("PHP and MySQL"); // PHP AND MYSQL
print ucfirst("now is the time");  // Now is the time
print ucwords("now is the time");  // Now Is The Time

2.6.2.4 Trimming whitespace

PHP provides three functions that trim leading or trailing whitespace characters—null, tab, vertical-tab, newline, carriage-return, and space characters—from strings:

string ltrim(string subject)
string rtrim(string subject)
string trim(string subject)

The three functions return a copy of the subject string: trim( ) removes both leading and trailing whitespace characters, ltrim( ) removes leading whitespace characters, and rtrim( ) removes trailing whitespace characters. The following example shows the effect of each:

$var = trim(" Tiger Land\n");  // "Tiger Land"
$var = ltrim(" Tiger Land\n"); // "Tiger Land\n"
$var = rtrim(" Tiger Land\n"); // " Tiger Land"

2.6.2.5 Rendering newline characters with <br>

Whitespace characters generally don't have any significance in HTML, but it's often useful to preserve newlines when a page is rendered. The nl2br( ) function generates a string by inserting the HTML break element <br />[2] before all occurrences of the newline character in the source argument:

[2] From PHP Version 4.0.5 onwards, nl2br( ) inserts the XHTML-compliant <br /> markup that includes the shorthand way of closing an empty element. Earlier versions inserted <br>, which isn't valid XML.

string nl2br(string source)

The following example shows how nl2br( ) works:

// A short poem
$verse = "Isn't it funny\n";
$verse .= "That a bear likes honey.\n";
$verse .= "I wonder why he does?\n";
$verse .= "Buzz, buzz, buzz.\n";

// The four lines are rendered as one
echo $verse;

// Renders the poem on four lines in HTML as intended
echo nl2br($verse);
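
To see exactly what nl2br( ) produces, the returned string can be inspected directly. The short sketch below uses only the first two lines of the poem and an illustrative variable $html:

```php
<?php
$verse = "Isn't it funny\nThat a bear likes honey.\n";

// Each newline is kept, and a break element is inserted before it
$html = nl2br($verse);

echo $html;
// Isn't it funny<br />
// That a bear likes honey.<br />
```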

2.6.3 Comparing Strings

PHP provides the string comparison functions strcmp( ) and strncmp( ) that safely compare two strings, str1 and str2:

integer strcmp(string str1, string str2)
integer strncmp(string str1, string str2, integer length)

While the equality operator == can compare two strings, the result isn't always as expected when the strings contain characters with the most significant bit set. Both strcmp( ) and strncmp( ) take two strings as arguments, str1 and str2, and return 0 if the strings are identical, a negative number (-1 in the examples below) if str1 is less than str2, and a positive number (1 in the examples below) if str1 is greater than str2. Depending on the PHP version, the nonzero values may have a magnitude other than 1, so it is safest to test only the sign. The function strncmp( ) takes a third argument length that restricts the comparison to the first length characters. These examples show the results of various comparisons:

print strcmp("aardvark", "zebra");        // -1
print strcmp("zebra", "aardvark");        //  1
print strcmp("mouse", "mouse");           //  0
print strncmp("aardvark", "aardwolf", 4); //  0
print strncmp("aardvark", "aardwolf", 5); // -1

The functions strcasecmp( ) and strncasecmp( ) are case-insensitive versions of strcmp( ) and strncmp( ).

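The functions strcmp( ) and strcasecmp( ) take exactly two string arguments and so can be used directly as the callback function when sorting arrays with usort( ). The functions strncmp( ) and strncasecmp( ) need the additional length argument, so they must be wrapped in a user-defined comparison function first.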
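
As a sketch of a comparison function used as a sort callback, the following fragment sorts the same hypothetical list twice, once with strcmp( ) and once with strcasecmp( ). The array contents are invented for illustration:

```php
<?php
$animals = array("zebra", "Mouse", "aardvark", "Aardwolf");

// Case-sensitive: uppercase letters sort before lowercase letters
$sensitive = $animals;
usort($sensitive, "strcmp");
echo implode(", ", $sensitive), "\n";
// prints Aardwolf, Mouse, aardvark, zebra

// Case-insensitive: the case of each letter is ignored
$insensitive = $animals;
usort($insensitive, "strcasecmp");
echo implode(", ", $insensitive), "\n";
// prints aardvark, Aardwolf, Mouse, zebra
```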

2.6.4 Finding and Extracting Substrings

PHP provides several simple and efficient functions that can identify and extract specific substrings of a string.

2.6.4.1 Extracting a substring from a string

The substr( ) function returns a substring from a source string:

string substr(string source, integer start [, integer length])

When called with two arguments, substr( ) returns the characters from the source string starting from position start—counting from zero—to the end of the string. With the optional length argument, a maximum of length characters are returned. The following examples show how substr( ) works:

$var = "abcdefgh";

print substr($var, 2);       //  "cdefgh"
print substr($var, 2, 3);    //  "cde"
print substr($var, 4, 10);   //  "efgh"

If a negative start position is passed, the starting point of the returned string is counted from the end of the source string. If the length is negative, it is treated as a position counted from the end of the source string rather than as a number of characters, so the returned string stops length characters before the end. The following examples show how negative values can be used:

$var = "abcdefgh";

print substr($var, -1);      //  "h"
print substr($var, -3);      //  "fgh"
print substr($var, -5, 2);   //  "de"
print substr($var, -5, -2);  //  "def"

2.6.4.2 Finding the position of a substring

The strpos( ) function returns the index of the first occurring substring needle in the string haystack:

integer strpos(string haystack, string needle [, integer offset])

When called with two arguments, the search for the substring needle is from the start of the string haystack at position zero. When called with three arguments, the search occurs from the index offset into the haystack. The following examples show how strpos( ) works:

$var = "To be or not to be";

print strpos($var, "T");     // 0
print strpos($var, "be");    // 3

// Start searching from the 5th character in $var
print strpos($var, "be", 4); // 16

The strrpos( ) function returns the index of the last occurrence of the single character needle in the string haystack:

integer strrpos(string haystack, string needle)

Unlike strpos( ), strrpos( ) searches for only a single character, and only the first character of the needle string is used. (This is the PHP 4 behavior; from PHP 5.0 onwards, strrpos( ) matches the whole needle string.) The following examples show how strrpos( ) works:

$var = "To be or not to be";

// Prints 13: the last occurrence of "t"
print strrpos($var, "t");

// Prints 0: Only searches for "T" which 
// is found at position zero
print strrpos($var, "Tap"); 

// False: "Z" does not occur in the subject
print strrpos($var, "Zoo"); 

If the substring needle isn't found by strpos( ) or strrpos( ), both functions return false. The is-identical operator === should be used when testing the returned value from these functions against false. If the substring needle is found at the start of the string haystack, the index returned is zero and is interpreted as false if used as a Boolean value.
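
The pitfall can be sketched as follows. The variable names $looseTest and $strictTest are ours and purely illustrative:

```php
<?php
$var = "To be or not to be";

// "To" is found at index 0, which a plain Boolean test treats as false
$position = strpos($var, "To");

$looseTest  = ($position == false);   // true: 0 == false
$strictTest = ($position === false);  // false: 0 is an integer, not false

if ($position === false)
  echo "not found\n";
else
  echo "found at index $position\n";  // prints "found at index 0"
```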

2.6.4.3 Extracting a found portion of a string

The strstr( ) and stristr( ) functions search for the substring needle in the string haystack and return the portion of haystack from the first occurrence of needle to the end of haystack:

string strstr(string haystack, string needle)
string stristr(string haystack, string needle)

The strstr( ) search is case-sensitive; the stristr( ) search isn't. If the needle isn't found in the haystack string, both strstr( ) and stristr( ) return false. The following examples show how the functions work:

$var = "To be or not to be";

print strstr($var, "to");    //  "to be"
print stristr($var, "to");   //  "To be or not to be"
print stristr($var, "oz");   // false

The strrchr( ) function also returns a portion of haystack, but it searches for the single character needle and returns the portion from the last occurrence of needle to the end of haystack:

string strrchr(string haystack, string needle)

Unlike strstr( ) and stristr( ), strrchr( ) searches for only a single character, and only the first character of the needle string is used. The following examples show how strrchr( ) works:

$var = "To be or not to be";

// Prints: "not to be"
print strrchr($var, "n"); 

// Prints "o be": Only searches for "o" which
// is found at position 14
print strrchr($var, "oz");
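
Two common uses of these functions are sketched below: taking the domain from an email address and the extension from a filename. The address and filename are hypothetical:

```php
<?php
$email = "fred@example.com";      // hypothetical address
$file  = "archive.tar.gz";        // hypothetical filename

// strstr() returns everything from the first "@" onwards;
// substr() then drops the "@" itself
$domain = substr(strstr($email, "@"), 1);
echo $domain, "\n";               // prints example.com

// strrchr() returns everything from the last "." onwards
$extension = strrchr($file, ".");
echo $extension, "\n";            // prints .gz
```

In real code the return value of strstr( ) and strrchr( ) should be tested against false before it is used, since either needle may be absent.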

2.6.4.4 Extracting multiple values from a string

PHP provides the explode( ) and implode( ) functions, which convert strings to arrays and back to strings:

array explode(string separator, string subject [, integer limit])
string implode(string glue, array pieces)

The explode( ) function returns an array of strings created by breaking the subject string at each occurrence of the separator string. The optional integer limit determines the maximum number of elements in the resulting array; when the limit is met, the last element in the array is the remaining unbroken subject string. The implode( ) function returns a string created by joining each element in the array pieces, inserting the string glue between each piece. The following example shows both the implode( ) and explode( ) functions:

$guestList = "Sam Meg Sarah Ben Jess May Adam";
$name = "Fred";

// Check if $name is in the $guestList
if (strpos($guestList, $name) === false)
{
  $guestArray = explode(" ", $guestList);
  sort($guestArray);
  echo "Sorry '$name' is not on the guest list.\n";
  echo "Guest list: " . implode(", ", $guestArray);
}

When the string $name isn't found in the string $guestList using strpos( ), the fragment of code prints a message to indicate that $name isn't contained in the list. The message includes a sorted list of comma-separated names: explode( ) creates an array of guest names that is sorted and then, using implode( ), is converted back into a string with each name separated by a comma and a space. The example prints:

Sorry 'Fred' is not on the guest list.
Guest list: Adam, Ben, Jess, May, Meg, Sam, Sarah 
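
The optional limit argument is not used in the example above. A short sketch of its effect, with illustrative variable names, might look like this:

```php
<?php
$guestList = "Sam Meg Sarah Ben Jess May Adam";

// With a limit of 3, the third element holds the rest of the string
$firstThree = explode(" ", $guestList, 3);

echo count($firstThree), "\n";   // prints 3
echo $firstThree[2], "\n";       // prints Sarah Ben Jess May Adam

// implode() reverses explode() when the separator is reused as the glue
$rejoined = implode(" ", explode(" ", $guestList));
echo ($rejoined === $guestList) ? "round trip ok" : "mismatch";
```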

2.6.5 Replacing Characters and Substrings

PHP provides several simple functions that can replace specific substrings or characters in a string with other strings or characters. In the next section we discuss regular expressions, which are powerful tools for finding and replacing complex patterns of characters. The functions described in this section, however, are more efficient than regular expressions and are often the better choice for simple search-and-replace tasks.

2.6.5.1 Replacing substrings

The substr_replace( ) function replaces a substring identified by an index with a replacement string:

string substr_replace(string source, string replace, int start [, int length])

The function returns a copy of the source string with the characters from the position start to the end of the string replaced with the replace string. If the optional length is supplied, only length characters are replaced; a length of zero inserts the replace string without removing anything. The following examples show how substr_replace( ) works:

$var = "abcdefghij";

// prints "abcDEF";
echo substr_replace($var, "DEF", 3);

// prints "abcDEFghij";
echo substr_replace($var, "DEF", 3, 3);

// prints "abcDEFdefghij";
echo substr_replace($var, "DEF", 3, 0);

The str_replace( ) function returns a string created by replacing occurrences of the string search in subject with the string replace:

mixed str_replace(mixed search, mixed replace, mixed subject)

In the following example, the subject string, "old-age for the old.", is printed with both occurrences of old replaced with new:

$var = "old-age for the old.";

echo str_replace("old", "new", $var);

The result is:

new-age for the new.

Since PHP Version 4.0.5, str_replace( ) allows an array of search strings and a corresponding array of replacement strings to be passed as parameters. The following example shows how the fields in a very short form letter can be populated:

// A short form-letter for an overdue account
$letter = "Dear #title #name, You owe us $#amount.";

// Set-up an array of three search strings that  
// will be replaced in the form-letter
$fields = array("#title", "#name", "#amount");

// An array of debtors. Each element is an array that
// holds the replacement values for the form-letter
$debtors = array(
    array("Mr", "Cartwright", "146.00"),
    array("Ms", "Yates", "1,662.00"),
    array("Dr", "Smith", "84.75"));

foreach($debtors as $debtor)
  echo "<p>" . str_replace($fields, $debtor, $letter);

The output of this script is as follows:

Dear Mr Cartwright, You owe us $146.00.
Dear Ms Yates, You owe us $1,662.00.
Dear Dr Smith, You owe us $84.75.

If the array of replacement strings is shorter than the array of search strings, the unmatched search strings are replaced with empty strings.
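
The following sketch shows that behavior with a cut-down version of the form letter. The shortened $partial array is our own illustration:

```php
<?php
$letter = "Dear #title #name, You owe us #amount.";
$fields = array("#title", "#name", "#amount");

// Only two replacement values: "#amount" has no counterpart
$partial = array("Mr", "Cartwright");

$result = str_replace($fields, $partial, $letter);
echo $result;   // prints "Dear Mr Cartwright, You owe us ."
```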

2.6.5.2 Translating characters and substrings

The strtr( ) function translates characters or substrings in a subject string:

string strtr(string subject, string from, string to)
string strtr(string subject, array map)

When called with three arguments, strtr( ) translates the characters in the subject string that match those in the from string with the corresponding characters in the to string. When called with two arguments, a subject string and an array map, occurrences of the map keys in subject are replaced with the corresponding map values.

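The following example uses strtr( ) to replace all lowercase vowels with the corresponding umlauted character. Because the three-argument form translates one eight-bit character at a time, the example assumes that the source file is saved in a single-byte encoding such as ISO-8859-1: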

$mischief = strtr("command.com", "aeiou", "äëïöü");
print $mischief;  // prints cömmänd.cöm

When an associative array is passed as a translation map, strtr( ) replaces substrings rather than characters. The following example shows how strtr( ) can expand acronyms:

// Short list of acronyms used in e-mail
$glossary = array("BTW"=>"by the way",
                  "IMHO"=>"in my humble opinion",
                  "IOW"=>"in other words",
                  "OTOH"=>"on the other hand");

// A sample message (illustrative text)
$geekMail = "BTW, IMHO (IOW) you're wrong!";

// Maybe now I can understand
print strtr($geekMail, $glossary);
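
One property of the array form of strtr( ) is worth a small sketch: text that has already been substituted is not translated again, whereas str_replace( ) with arrays applies each search and replacement pair in turn. The map below is invented for illustration:

```php
<?php
$map = array("Hi" => "Hello", "Hello" => "Goodbye");

// strtr() scans the subject once, so the "Hello" it produces from "Hi"
// is not replaced again
$once = strtr("Hi all", $map);
echo $once, "\n";     // prints Hello all

// str_replace() applies each pair in turn, so the result of the first
// replacement is itself replaced by the second
$twice = str_replace(array_keys($map), array_values($map), "Hi all");
echo $twice, "\n";    // prints Goodbye all
```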