Slow-paths in GNU libc strstr

Author: Wojciech Muła
Added on:2014-03-03

I've observed that some patterns issued to strstr cause significant slowdown.

Sample program kill-strstr.c executes strstr(data, pattern), where data is a large string (16MB) filled with the character '?'. Patterns are read from the command line.

On my machine following times were recorded:

1. searching string 'johndoe'...
                time: 0.032
2. searching string '??????????????????a'...
                time: 0.050
3. searching string '??????????????????????????????a'...
                time: 0.049
4. searching string '???????????????????????????????a'...
                time: 0.274
5. searching string '??????????????????????????????a?'...
                time: 0.356
6. searching string '??????????????????????????????a??????????????????????????????'...
                time: 0.396
  1. Slowdown is visible in case 4 (5 times slower than pattern 3). Pattern has 32 characters, and contains '?', except the last char.
  2. Even bigger slowdown occurs in case 5 (7 times slower). This pattern also contains 32 chars, but the position of the single letter 'a' is last but one.
  3. Similar slowdown occurs in case 6 (nearly 8 times slower). In this pattern single letter 'a' is surrounded by thirty '?'.