Does Python have a string 'contains' substring method?

Question

I'm looking for a string.contains or string.indexof method in Python.

I want to do:

if not somestring.contains("blah"): continue

Praskand · Accepted Answer · 2023-11-01 12:43:44Z

8594

Use the in operator:

if "blah" not in somestring: continue

Note: This is case-sensitive.
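
Since continue is only valid inside a loop, here is a minimal sketch of the full pattern. The list lines and the name kept are made up for illustration; they are not from the question.

```python
lines = ["blah blah", "nothing here", "more blah"]  # hypothetical sample data

kept = []
for somestring in lines:
    if "blah" not in somestring:
        continue  # skip strings that lack the substring
    kept.append(somestring)

print(kept)  # ['blah blah', 'more blah']
```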

edited Nov 1, 2023 at 12:43

Praskand

answered Aug 9, 2010 at 2:56

Michael Mrozek

3
What would be a case-insensitive way?
– Panagiotis D.
Commented May 29, 2024 at 6:50
15
@PanagiotisD. if "blah" not in somestring.lower():
– Muiz Sheikh
Commented Jun 11, 2024 at 12:24
6
@MuizSheikh to make your example completely robust it should be if "blah".lower() not in somestring.lower():. Sure "blah" is already lower case, but if you replaced it with something else (like a variable) it might not be.
– Mark Ransom
Commented Jul 3, 2024 at 19:25
Would the .lower solution run into the same issue with some languages, as this stackoverflow.com/questions/444798/… comment describes? (i.e. 2 Greek letters are distinct in their upper case form, but they share the same lower case character)
– Matthias Schippling
Commented Aug 4, 2024 at 19:21
In case you don't know the content of somestring you can be more careful with: if somestring and "blah" in str(somestring): continue
– asmaier
Commented Aug 6, 2024 at 15:45
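
On the case-insensitive question raised in these comments: one option that may behave better than .lower() for non-ASCII text is str.casefold(), which Python provides for caseless matching. A minimal sketch, where contains_casefold is a hypothetical helper name and not a built-in:

```python
def contains_casefold(haystack, needle):
    """Case-insensitive substring test using str.casefold() (hypothetical helper)."""
    return needle.casefold() in haystack.casefold()

print(contains_casefold("Some BLAH here", "blah"))  # True
print(contains_casefold("Straße", "STRASSE"))       # True: casefold maps ß to ss
print("STRASSE".lower() in "Straße".lower())        # False: lower() keeps ß
```

Whether casefold is enough depends on the text; fully normalized comparison may also need unicodedata.normalize, which is not shown here.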

Pavel Vlasov · Accepted Answer · 2024-06-08 12:17:41Z

938

You can use str.find:

s = "This be a string"
if s.find("is") == -1:
    print("Not found")
else:
    print("Found")

The find() method should be used only if you need to know the position of sub. To check if sub is a substring or not, use the in operator. (c) Python reference

edited Jun 8, 2024 at 12:17

Pavel Vlasov

answered Aug 9, 2010 at 2:55

eldarerathis

99
+1 for highlighting the gotchas involved in substring searches. the obvious solution is if ' is ' in s: which will return False as is (probably) expected.
– aaronasterling
Commented Aug 9, 2010 at 3:22
141
@aaronasterling Obvious it may be, but not entirely correct. What if you have punctuation or it's at the start or end? What about capitalisation? Better would be a case insensitive regex search for \bis\b (word boundaries).
– Bob
Commented Nov 8, 2012 at 0:07
3
Why would this not be what the OP wants
– uh_big_mike_boi
Commented Feb 18, 2022 at 3:55
4
@uh_big_mike_boi The problem with substring searches is that, in this example, you're looking for the word is inside "This be a string." That will evaluate to True because of the is in This. This is bad for programs that search for words, like swear filters (for example, a dumb word check for "ass" would also catch "grass").
– TheTechRobo
Commented Jun 19, 2022 at 18:44
You can also use Python's str.index. 'Hai there'.index('there') returns the position of the match (4). The only difference is that index raises an exception when the substring is missing, while find returns -1. Happy Python.
– kta
Commented Jul 4, 2024 at 6:29

Aaron Hall · Accepted Answer · 2021-05-30 22:34:42Z

Does Python have a string contains substring method?

99% of use cases will be covered using the keyword, in, which returns True or False:

'substring' in any_string

For the use case of getting the index, use str.find (which returns -1 on failure, and has optional positional arguments):

start = 0
stop = len(any_string)
any_string.find('substring', start, stop)

or str.index (like find but raises ValueError on failure):

start = 100
end = 1000
any_string.index('substring', start, end)
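
To illustrate the optional start argument, here is a minimal sketch that collects every position of a substring by calling str.find repeatedly. The helper name find_all is hypothetical; nothing by that name exists on str.

```python
def find_all(haystack, needle):
    """Return the start index of every (possibly overlapping) match."""
    positions = []
    start = 0
    while True:
        idx = haystack.find(needle, start)
        if idx == -1:          # find signals "not found" with -1
            break
        positions.append(idx)
        start = idx + 1        # resume one character past the last match
    return positions

print(find_all("This is a string", "is"))  # [2, 5]
```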

Explanation

Use the in comparison operator because

the language intends its usage, and
other Python programmers will expect you to use it.

>>> 'foo' in '**foo**'
True

The opposite (complement), which the original question asked for, is not in:

>>> 'foo' not in '**foo**' # returns False
False

This is semantically the same as not 'foo' in '**foo**' but it's much more readable and explicitly provided for in the language as a readability improvement.

Avoid using `__contains__`

The __contains__ method implements the behavior for in. This example,

str.__contains__('**foo**', 'foo')

returns True. You could also call this function from the instance of the superstring:

'**foo**'.__contains__('foo')

But don't. Methods that start with underscores are considered semantically non-public. The only reason to use this is when implementing or extending the in and not in functionality (e.g. if subclassing str):

class NoisyString(str):
    def __contains__(self, other):
        print(f'testing if "{other}" in "{self}"')
        return super(NoisyString, self).__contains__(other)

ns = NoisyString('a string with a substring inside')

and now:

>>> 'substring' in ns
testing if "substring" in "a string with a substring inside"
True

Don't use `find` and `index` to test for "contains"

Don't use the following string methods to test for "contains":

>>> '**foo**'.index('foo')
2
>>> '**foo**'.find('foo')
2
>>> '**oo**'.find('foo')
-1
>>> '**oo**'.index('foo')
Traceback (most recent call last):
  File "<pyshell#40>", line 1, in <module>
    '**oo**'.index('foo')
ValueError: substring not found

Other languages may have no methods to directly test for substrings, and so you would have to use these types of methods, but with Python, it is much more efficient to use the in comparison operator.

Also, these are not drop-in replacements for in. You may have to handle the exception or -1 cases, and if they return 0 (because they found the substring at the beginning) the boolean interpretation is False instead of True.

If you really mean not any_string.startswith(substring) then say it.
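
The truthiness pitfall described above can be sketched as follows. The variable names are illustrative only.

```python
s = "foobar"

# find() returns 0 for a match at the very start, and 0 is falsy.
buggy_present = bool(s.find("foo"))   # False, although "foo" is in s

# find() returns -1 when nothing is found, and -1 is truthy.
buggy_missing = bool(s.find("xyz"))   # True, although "xyz" is not in s

# Either of these expresses the intent correctly.
correct = "foo" in s                  # True
explicit = s.find("foo") != -1        # True

print(buggy_present, buggy_missing, correct, explicit)
```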

Performance comparisons

We can compare various ways of accomplishing the same goal.

import timeit

def in_(s, other):
    return other in s

def contains(s, other):
    return s.__contains__(other)

def find(s, other):
    return s.find(other) != -1

def index(s, other):
    try:
        s.index(other)
    except ValueError:
        return False
    else:
        return True

perf_dict = {
    'in:True': min(timeit.repeat(lambda: in_('superstring', 'str'))),
    'in:False': min(timeit.repeat(lambda: in_('superstring', 'not'))),
    '__contains__:True': min(timeit.repeat(lambda: contains('superstring', 'str'))),
    '__contains__:False': min(timeit.repeat(lambda: contains('superstring', 'not'))),
    'find:True': min(timeit.repeat(lambda: find('superstring', 'str'))),
    'find:False': min(timeit.repeat(lambda: find('superstring', 'not'))),
    'index:True': min(timeit.repeat(lambda: index('superstring', 'str'))),
    'index:False': min(timeit.repeat(lambda: index('superstring', 'not'))),
}

And now we see that using in is much faster than the others. Less time to do an equivalent operation is better:

>>> perf_dict
{'in:True': 0.16450627865128808,
 'in:False': 0.1609668098178645,
 '__contains__:True': 0.24355481654697542,
 '__contains__:False': 0.24382793854783813,
 'find:True': 0.3067379407923454,
 'find:False': 0.29860888058124146,
 'index:True': 0.29647137792585454,
 'index:False': 0.5502287584545229}

How can `in` be faster than `__contains__` if `in` uses `__contains__`?

This is a fine follow-on question.

Let's disassemble functions with the methods of interest:

>>> from dis import dis
>>> dis(lambda: 'a' in 'b')
  1           0 LOAD_CONST               1 ('a')
              2 LOAD_CONST               2 ('b')
              4 COMPARE_OP               6 (in)
              6 RETURN_VALUE
>>> dis(lambda: 'b'.__contains__('a'))
  1           0 LOAD_CONST               1 ('b')
              2 LOAD_METHOD              0 (__contains__)
              4 LOAD_CONST               2 ('a')
              6 CALL_METHOD              1
              8 RETURN_VALUE

so we see that the .__contains__ method has to be separately looked up and then called from the Python virtual machine - this should adequately explain the difference.

Why should one avoid str.index and str.find? How else would you suggest someone find the index of a substring instead of just whether it exists or not? (or did you mean avoid using them in place of contains - so don't use s.find(ss) != -1 instead of ss in s?)
– coderforlife
Commented Jun 10, 2015 at 3:35
Precisely so, although the intent behind the use of those methods may be better addressed by elegant use of the re module. I have not yet found a use for str.index or str.find myself in any code I have written yet.
– Aaron Hall
Commented Jun 10, 2015 at 3:39
Please extend your answer to advise against using str.count as well (string.count(something) != 0). shudder
– cs95
Commented Jun 5, 2019 at 3:05
@jpmc26 it's the same as in_ above - but with a stackframe around it, so it's slower than that: github.com/python/cpython/blob/3.7/Lib/operator.py#L153
– Aaron Hall
Commented Aug 18, 2019 at 23:34

Cristian Ciupitu · Accepted Answer · 2015-06-20 04:07:17Z

if needle in haystack: is the normal use, as @Michael says -- it relies on the in operator, more readable and faster than a method call.

If you truly need a method instead of an operator (e.g. to do some weird key= for a very peculiar sort...?), that would be 'haystack'.__contains__. But since your example is for use in an if, I guess you don't really mean what you say;-). It's not good form (nor readable, nor efficient) to use special methods directly -- they're meant to be used, instead, through the operators and builtins that delegate to them.

How much faster than a method call?
– SO_fix_the_vote_sorting_bug
Commented Nov 11, 2021 at 17:21

firelynx · Accepted Answer · 2021-10-12 19:03:34Z

`in` Python strings and lists

Here are a few useful examples that speak for themselves concerning the in operator:

>>> "foo" in "foobar"
True
>>> "foo" in "Foobar"
False
>>> "foo" in "Foobar".lower()
True
>>> "foo".capitalize() in "Foobar"
True
>>> "foo" in ["bar", "foo", "foobar"]
True
>>> "foo" in ["fo", "o", "foobar"]
False
>>> ["foo" in a for a in ["fo", "o", "foobar"]]
[False, False, True]

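Caveat: the in operator also works on lists and other iterables, not just strings. On a list it tests whether some element is equal to the value, not whether the value is a substring of an element.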

If you want to compare strings in a fuzzier way, to measure how "alike" they are, consider using the Levenshtein package.

Here's an answer that shows how it works.

Jeffrey04 · Accepted Answer · 2019-11-09 12:36:08Z

If you are happy with "blah" in somestring but want it to be a function/method call, you can probably do this

import operator
if not operator.contains(somestring, "blah"): continue

All operators in Python can be more or less found in the operator module including in.
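
One detail that is easy to get backwards is the argument order: operator.contains(a, b) corresponds to b in a, so the container comes first. A small sketch with made-up sample data:

```python
import operator

lines = ["blah one", "two", "three blah"]  # hypothetical sample data

# operator.contains(a, b) is the function form of `b in a`:
# the container comes first, the item second.
flags = [operator.contains(line, "blah") for line in lines]
print(flags)  # [True, False, True]
```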

Ufos · Accepted Answer · 2016-04-06 08:49:22Z

57

So apparently there is nothing similar for vector-wise comparison. An obvious Python way to do so would be:

names = ['bob', 'john', 'mike']
any(st in 'bob and john' for st in names)  # gives True
any(st in 'mary and jane' for st in names)  # gives False

edited Apr 6, 2016 at 8:49

answered Jul 17, 2015 at 13:19

Ufos

1
That's because there is a bajillion ways of creating a Product from atomic variables. You can stuff them in a tuple, a list (which are forms of Cartesian Products and come with an implied order), or they can be named properties of a class (no a priori order) or dictionary values, or they can be files in a directory, or whatever. Whenever you can uniquely identify (iter or getitem) something in a 'container' or 'context', you can see that 'container' as a sort of vector and define binary ops on it. en.wikipedia.org/wiki/…
– Niriel
Commented Aug 10, 2015 at 9:50
2
Worth noting that in is slow for membership tests on a list, because it does a linear scan of the elements. Use a set instead, especially if membership tests are to be done repeatedly.
– cs95
Commented Jun 5, 2019 at 3:06

smci · Accepted Answer · 2023-09-15 20:53:46Z

36

You can use str.count().

It returns an integer: the number of non-overlapping occurrences of a substring in a string.

For example:

string = "Hello, world"  # assumed example value; the original snippet does not define it
string.count("bah")  # gives 0
string.count("Hello")  # gives 1
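
For reference, a short sketch of how count relates to a plain membership test. count reports non-overlapping occurrences, so it answers a different question than in; the sample string is made up.

```python
text = "aaaa"  # hypothetical sample value

occurrences = text.count("aa")  # non-overlapping matches only
present = "aa" in text          # plain yes/no membership test

print(occurrences)  # 2
print(present)      # True

# The two agree on presence, but `in` states the intent directly.
print((text.count("zz") != 0) == ("zz" in text))  # True
```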

edited Sep 15, 2023 at 20:53

smci

answered Feb 6, 2019 at 11:06

Brandon Bailey

14
counting a string is costly when you just want to check if it's there...
– Jean-François Fabre♦
Commented May 16, 2019 at 5:53
1
I agree, I had an in depth answer that proposed 3 possible solutions. but this was changed by Jean-Francois Fabre to be what it currently is. Not sure why he would change it so.
– Brandon Bailey
Commented Jun 5, 2019 at 9:34
8
Shifting right is almost certainly not what you want to do here.
– rsandwick3
Commented Mar 28, 2020 at 3:53
1
This is just not the answer to the question. It is by no means an idiomatic way to find out if a string is in another string
– Chr1s
Commented Dec 14, 2022 at 19:50
@rsandwick3: I think OP only means ">>" to denote "gives the result", like "->" or "==>". I edited to clarify.
– smci
Commented Sep 15, 2023 at 20:52

WhyAreYouReadingThis · Accepted Answer · 2018-05-03 14:18:35Z

Here is your answer:

if "insert_char_or_string_here" in "insert_string_to_search_here":
    pass  # DOSTUFF

For checking if it is false:

if not "insert_char_or_string_here" in "insert_string_to_search_here":
    pass  # DOSTUFF

OR:

if "insert_char_or_string_here" not in "insert_string_to_search_here":
    pass  # DOSTUFF

PEP 8 prefers "if x not in y" to "if not x in y".
– gerrit
Commented Oct 18, 2021 at 13:11

Jean-François Fabre · Accepted Answer · 2019-05-17 05:07:46Z

13

You can use regular expressions to get the occurrences:

>>> import re
>>> print(re.findall(r'( |t)', to_search_in)) # searches for t or space
['t', ' ', 't', ' ', ' ']

edited May 17, 2019 at 5:07

Jean-François Fabre♦

answered Nov 23, 2018 at 7:15

Muskovets

It is actually less efficient in terms of time complexity. You are better off using the in operator. But it's a fun solution. If you insist on using re, re.search is the better choice for a boolean test (re.match only matches at the start of the string).
– Sadegh Moayedizadeh
Commented Aug 8, 2023 at 14:00
