Monday, March 2, 2009

bug 271 - strpos() function in PHP returns unexpected results

Issue: #271
Affects: PHP4, PHP5

PHP is by far one of the most popular server-side programming languages available today (55% market share in our poll); however, it has its fair share of quirks.

For reasons unknown to us, PHP decided that the strpos() function will return an integer index... or maybe a boolean FALSE!

Huh say what?!

Almost every programming language has a method or two to find the position of a substring in a string.

JavaScript Example:

<script type="text/javascript">
var idx = 'Hello World'.indexOf('Wo');
alert('Found: "Wo" at index ' + idx);
</script>


In most of them (JavaScript's indexOf() above, for example), the method returns a zero-based index (0 or greater) if the substring is found, or -1 if not.

Unfortunately PHP can sometimes return FALSE, or 0 (zero), or "" when it doesn't find the substring, which makes programming logic to handle this extremely difficult.
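For reference, here is a minimal sketch of what strpos() hands back in the three cases that matter: a match in the middle of the string, a match at the very start, and no match. The helper name describe_strpos() is made up for this illustration and is not part of PHP. As far as the PHP manual describes it, FALSE is the only "not found" value; 0 is a genuine index, and the confusion only arises under a loose comparison.

```php
<?php
// describe_strpos() is a hypothetical helper used only for this illustration.
// It reports the type and value that strpos() returns.
function describe_strpos(string $haystack, string $needle): string
{
    $pos = strpos($haystack, $needle);
    // var_export() renders false as "false" and integers as plain digits.
    return gettype($pos) . ' ' . var_export($pos, true);
}

echo describe_strpos('Hello World', 'Wo'), "\n";    // integer 6
echo describe_strpos('Hello World', 'Hello'), "\n"; // integer 0
echo describe_strpos('Hello World', 'xyz'), "\n";   // boolean false

// Loose comparison treats "found at index 0" like "not found";
// strict comparison keeps the two apart.
var_dump(0 == false);  // bool(true)
var_dump(0 === false); // bool(false)
```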


Known Workarounds: Several. Many developers have written new extensions to the String Class, standalone functions and more. Rather than claim to have the best solution ourselves, please send in your workaround using the comments below and we'll try to pick and highlight the best solutions from the submissions.

Example Workaround Code:

[Coming soon...]



Related Issues: None.


12 comments:

McArrow said...

Do you really think that behaviour clearly described in the manual is a bug?

And where did you find the phrase 'Unfortunately PHP can sometimes return FALSE, or 0 (zero), or "" when it doesn't find the substring'?

As I understand it, PHP will return 0 if the source string starts with the substring you are searching for. And if you use the equal operator (==) instead of the identical operator (===), you can draw the wrong conclusion about whether the substring occurs in the string.
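The wrong conclusion described in this comment can be reproduced in a few lines. This is only a sketch; contains_loose() and contains_strict() are hypothetical names chosen for the comparison.

```php
<?php
// Buggy variant: != converts the integer 0 to false, so a match at the
// very start of the string is reported as "not found".
function contains_loose(string $haystack, string $needle): bool
{
    return strpos($haystack, $needle) != false;
}

// Correct variant: !== also compares the type, so 0 and false stay distinct.
function contains_strict(string $haystack, string $needle): bool
{
    return strpos($haystack, $needle) !== false;
}

var_dump(contains_loose('http://example.com', 'http'));      // bool(false), wrong
var_dump(contains_strict('http://example.com', 'http'));     // bool(true)
var_dump(contains_loose('see http://example.com', 'http'));  // bool(true)
```

The loose version only misbehaves when the match sits at index 0, which is why this mistake can go unnoticed for a long time.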

Lionel said...

Are you kidding me?
strpos returns an integer if the value is found and false otherwise.
That's why, in order to test whether the needle isn't in the string, you have to test it against false with === (explicit type comparison, baby).

But it should be a normal thing for programmers to test the type of a return value.
JavaScript programmers are used to doing that too.

Lionel said...

What's that?
This is not really a bug, just a bad specification, but the php.net page is very explicit.

In order to test if no position was found, you have to test the result against false (type included with ===).

I see no real problem overcoming this.

Anonymous said...

Call it what you want, folks. It's a bug in the design.

A method should only have 1 (ONE) return type.

If your method returns multiple datatypes... you've Failed.

I love PHP and use it every day but I'll admit the first time I saw this function I just cried.

You know when other developers (.NET, Java, C#, Ruby, etc.) laugh at PHP as a kiddie script language? Now you know why.

Jake said...

The PHP bug with strpos() is really annoying.

http://webbugtrack.blogspot.com/2009/03/bug-271-strpos-function-in-php-returns.html

It's right there in the definition of the function: "int"
int strpos ( string $haystack , mixed $needle [, int $offset= 0 ] )

"int" as in: "This function WILL return an INT" - not a string, not a boolean or an object.

Java gets it right:

http://java.sun.com/javase/6/docs/api/java/lang/String.html#indexOf(java.lang.String,%20int)

I bet even ASP gets it right!

Anyway, I don't have a perfect solution, but I look forward to whatever turns up here.

Kit Grose said...

Full workaround for people who can't bear to have mixed response types:

function strpos_int($haystack, $needle, $offset = 0) {
    $result = strpos($haystack, $needle, $offset);
    return ($result !== false ? $result : -1);
}

Anonymous said...

PHP works differently, so why are you calling this a bug? This is working properly, by design.

There is a good reason why it returns FALSE and not -1. In some functions, like substr (http://www.php.net/substr), "-1" means the last character; it doesn't mean an index that doesn't exist. To avoid any confusion, it returns FALSE. Get inform before posting stuff like that.
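The substr() point can be illustrated with a small sketch. It assumes a hypothetical helper, tail_from(), that converts "not found" to -1 the way other languages do and then forgets to check the result before passing it on.

```php
<?php
// tail_from() is a hypothetical helper: it is meant to return the rest of
// $haystack starting at the first occurrence of $needle.
function tail_from(string $haystack, string $needle): string
{
    $pos = strpos($haystack, $needle);
    $idx = ($pos === false) ? -1 : $pos; // -1 as the "not found" sentinel

    // The sentinel is not checked here: substr() reads a start of -1 as
    // "begin at the last character", so no error is raised.
    return substr($haystack, $idx);
}

echo tail_from('Hello World', 'Wo'), "\n";  // World
echo tail_from('Hello World', 'xyz'), "\n"; // d
```

In either design the result still has to be checked before it is used as an offset; the sketch only shows that -1 already carries a meaning of its own in PHP's string functions.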

Max Graham said...

@Anonymous - re: "Get inform before posting stuff like that."

As many have mentioned, just because the documentation says that it may return (a) or (b) or (c) doesn't mean that it isn't a bug.

Most developers work under the premise that a function has a defined return type (in this case "int") and strongly believe that any other datatype returned is an error.

If the datatype truly is "variable", then the function's return type should be defined as such. E.g. in VB it would be Variant, in Java it would be Object, etc.
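For what it's worth, newer PHP versions can express exactly this kind of "variable" return type. Since PHP 8.0 the manual lists strpos() as returning int|false, and user functions can declare the same union type. A minimal sketch, assuming PHP 8.0 or later; find_index() is a hypothetical wrapper name.

```php
<?php
// Hypothetical wrapper; the int|false union type requires PHP 8.0+.
// The declared return type now states both possible outcomes.
function find_index(string $haystack, string $needle, int $offset = 0): int|false
{
    return strpos($haystack, $needle, $offset);
}

$pos = find_index('Hello World', 'Wo');
if ($pos === false) {
    echo "not found\n";
} else {
    echo "found at index $pos\n"; // found at index 6
}
```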

Personally I love using PHP (I'm not hating on the language at all) but similar to Kit Grose's solution I call a function that returns the true index of the match or -1 if not found.

Max.

Anonymous said...

@Max Graham & everyone that thinks it's a bug

http://en.wikipedia.org/wiki/Software_bug

It's intended, it's not an error, it's not a mistake, etc.

This could have been filed under "Bug or Feature", but not under "bug".

Also I'd like to make you notice that even if it was returning -1, you would still have to test if it was founded or not. What you have to do is to test if it's FALSE instead of -1.

Also, the argument of most people who think it's a bug is that the native functions of most languages return -1, but PHP's doesn't. PHP doesn't do it like the other languages, but that doesn't mean it's wrong. It's a totally wrong way of thinking to say that because most languages do it one way, all languages should do it the same way.

Max Graham said...

@Anonymous - re: "Also I'd like to make you notice that even if it was returning -1, you would still have to test if it was founded or not."

Not sure if I follow you there? If the value -1 was returned, then the searchword was not found.

Typically one would run it inline like this:

if (strpos_idx($someStr, "http") != -1) {
    // http was found in the string
} else {
    // no http found
}

(where in this example "strpos_idx" is a revised version of the function that returns -1 for "not found")

Max.

Anonymous said...

The only difference is that instead of doing:

if (strpos_idx($someStr, "http") != -1) {
    // http was found in the string
} else {
    // no http found
}

You would do:

if (strpos($someStr, "http") !== FALSE) {
    // http was found in the string
} else {
    // no http found
}

So no matter whether it returns -1 or FALSE, you still have to do a test. It changes nothing about the way you would write your script, and it doesn't cause any problem for a script in any circumstance. And it's a bug?

This is being called a bug only because of the opinion of the author and his opinion is highly debatable.

Pete Hamswell said...

I use this class to work around the buggy implementation in PHP.

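// Named StringUtil here because "String" cannot be used as a class name since PHP 7.
class StringUtil {
    // Returns true if $needle occurs in $haystack at or after $offset.
    // Arguments are passed by value so that literals can be used.
    public static function contains($haystack, $needle, $offset = 0) {
        return strpos($haystack, $needle, $offset) !== FALSE;
    }

    // Returns the index of $needle in $haystack, or -1 if it is not found.
    public static function strpos($haystack, $needle, $offset = 0) {
        // The unqualified call resolves to PHP's global strpos().
        $res = strpos($haystack, $needle, $offset);
        return ($res === FALSE) ? -1 : $res;
    }
}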

Can't recall if I wrote it or scooped it up from somewhere.
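On PHP 8.0 and later, the yes/no half of this workaround is available as the built-in str_contains(), which returns a boolean directly. The sketch below prefers the built-in and falls back to a strict strpos() check on older versions; has_substring() is a hypothetical name, and the empty-needle guard is an assumption made so that both paths agree.

```php
<?php
// Hypothetical helper: uses str_contains() where it exists (PHP 8.0+),
// otherwise falls back to a strict strpos() comparison.
function has_substring(string $haystack, string $needle): bool
{
    // str_contains() reports an empty needle as found; mirror that here
    // so the fallback path behaves the same way.
    if ($needle === '') {
        return true;
    }
    if (function_exists('str_contains')) {
        return str_contains($haystack, $needle);
    }
    return strpos($haystack, $needle) !== false;
}

var_dump(has_substring('http://example.com', 'http')); // bool(true)
var_dump(has_substring('Hello World', 'xyz'));         // bool(false)
```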