Split a string with multiple delimiters in Python

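To split a string with multiple delimiters:

Use the re.split() method, e.g. re.split(r',|-', my_str).
The re.split() method will split the string on every occurrence of any of the delimiters.

main.py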


import re

# 👇️ split string with 2 delimiters

my_str = 'one,two-three,four'

my_list = re.split(r',|-', my_str)  # 👈️ split on comma or hyphen

print(my_list)  # 👉️ ['one', 'two', 'three', 'four']

# ------------------------------------------------------

# 👇️ split string with 3 delimiters
my_str_2 = 'one,two-three:four'

my_list_2 = re.split(r',|-|:', my_str_2)  # 👈️ comma, hyphen or colon

print(my_list_2)  # 👉️ ['one', 'two', 'three', 'four']

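The re.split method takes a pattern and a string and splits the string on each occurrence of the pattern.

The pipe | character is an OR. It matches either A or B.

The first example splits the string using 2 delimiters: a comma and a hyphen.

The second example splits the string using 3 delimiters: a comma, a hyphen and a colon.

You can use as many | characters as necessary in your regular expression.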
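
When the delimiters are stored in a list, the alternation pattern can be assembled instead of typed by hand. The sketch below is only an illustration: split_on_any is a hypothetical helper name that does not appear in the original examples, and it assumes every delimiter should be matched literally, which is why each one is passed through re.escape().

```python
import re


# Hypothetical helper (not part of the original examples): builds an
# alternation pattern such as ',|\-|:' from any number of delimiters.
def split_on_any(text, delimiters):
    # re.escape() protects characters such as '|' or '.' that have a
    # special meaning inside a regular expression
    pattern = '|'.join(re.escape(d) for d in delimiters)
    return re.split(pattern, text)


print(split_on_any('one,two-three:four', [',', '-', ':']))
# 👉️ ['one', 'two', 'three', 'four']

print(split_on_any('a|b.c', ['|', '.']))
# 👉️ ['a', 'b', 'c']
```

Escaping matters as soon as one of the delimiters is itself a regex metacharacter, as in the second call.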

Alternatively, you can use square brackets [] to indicate a set of characters.

main.py


import re

my_str = 'one,two-three,four'

my_list = re.split(r'[,-]', my_str)

print(my_list)  # 👉️ ['one', 'two', 'three', 'four']

Make sure to add all of the delimiters between the square brackets.

main.py


import re

# 👇️ split string with 3 delimiters
my_str = 'one,two-three:four'

# 👇️ the hyphen is placed last; between two characters (as in ',-:') it denotes a range
my_list = re.split(r'[,:-]', my_str)

print(my_list)  # 👉️ ['one', 'two', 'three', 'four']
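
One detail about character sets is worth illustrating: a hyphen placed between two characters inside the brackets is read as a range, not as a literal hyphen. The sample string below is made up for this sketch; it contains digits so that the difference becomes visible.

```python
import re

my_str = 'a1,b2-c3:d4'

# ',-:' is a range from ',' to ':' that also covers '.', '/' and the
# digits 0-9, so the digits are treated as delimiters too
as_range = re.split(r'[,-:]', my_str)
print(as_range)  # 👉️ ['a', '', 'b', '', 'c', '', 'd', '']

# placing the hyphen last (or escaping it as \-) makes it literal
as_literal = re.split(r'[,:-]', my_str)
print(as_literal)  # 👉️ ['a1', 'b2', 'c3', 'd4']
```

With input that contains only letters and the three delimiters, both patterns give the same result, which is why the problem is easy to miss.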

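If your string starts or ends with one of the delimiters, you might get empty string values in the output list.

You can use a list comprehension to remove any empty strings from the list.

main.py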


import re

# 👇️ split string with 3 delimiters
my_str = ',one,two-three:four:'

my_list = [
    item for item in re.split(r'[,:-]', my_str)
    if item
]

print(my_list)  # 👉️ ['one', 'two', 'three', 'four']

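The list comprehension takes care of removing the empty strings from the list.

List comprehensions are used to perform some operation for every element or to select a subset of elements that meet a condition.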
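
Empty strings also appear when two delimiters sit next to each other, not only at the start or end of the string. The sketch below uses a made-up input to show this, together with filter(None, ...), a built-in alternative that drops falsy items in the same way as the comprehension.

```python
import re

my_str = ',one,,two-:three:four:'

parts = re.split(r'[,:-]', my_str)
print(parts)  # 👉️ ['', 'one', '', 'two', '', 'three', 'four', '']

# keep only the truthy (non-empty) items
with_comprehension = [item for item in parts if item]
print(with_comprehension)  # 👉️ ['one', 'two', 'three', 'four']

# filter(None, ...) removes falsy items and gives the same result here
with_filter = list(filter(None, parts))
print(with_filter)  # 👉️ ['one', 'two', 'three', 'four']
```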

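An alternative approach is to use the str.replace() method.

Split a string with multiple delimiters using `str.replace()`

To split a string with multiple delimiters:

Use the str.replace() method to replace the first delimiter with the second.
Use the str.split() method to split the string on the second delimiter.

main.py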


my_str = 'one_two!three_four'

my_list = my_str.replace('_', '!').split('!')

print(my_list)  # 👉️ ['one', 'two', 'three', 'four']

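This approach is only convenient when you need to split on very few delimiters, e.g. 2.

First, we replace every occurrence of the first delimiter with the second, and then we split on the second delimiter.

The str.replace method returns a copy of the string with all occurrences of a substring replaced by the provided replacement.

The method takes the following parameters:

Name	Description
old	The substring we want to replace in the string
new	The replacement for each occurrence of `old`
count	Only the first `count` occurrences are replaced (optional)

Note that the method doesn't change the original string. Strings are immutable in Python.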
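
A minimal sketch of the count parameter and of the fact that the original string stays untouched. The variable names are chosen for this illustration only.

```python
my_str = 'one_two_three'

# replace only the first occurrence of the underscore
new_str = my_str.replace('_', '!', 1)

print(new_str)  # 👉️ one!two_three
print(my_str)  # 👉️ one_two_three (the original is unchanged)
```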

Here is another example.

main.py


my_str = 'apple banana, kiwi # melon. mango'


my_list = my_str.replace(',', '').replace('#', '').replace('.', '').split()
print(my_list)  # 👉️ ['apple', 'banana', 'kiwi', 'melon', 'mango']

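We used the str.replace() method to remove the punctuation before splitting the string on whitespace characters.

We used an empty string for the replacement because we want to remove the specified characters.

You can chain as many calls to the str.replace() method as necessary.

The last step is to use the str.split() method to split the string into a list of words.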
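
When the list of characters to remove grows, the chained calls can be written as a loop. This is only a sketch: split_words is a hypothetical helper name, and it assumes every character in to_remove should simply be deleted before splitting on whitespace.

```python
# Hypothetical helper (not part of the original examples)
def split_words(text, to_remove):
    # same effect as chaining one str.replace() call per character
    for char in to_remove:
        text = text.replace(char, '')

    # split on runs of whitespace
    return text.split()


my_str = 'apple banana, kiwi # melon. mango'

print(split_words(my_str, ',#.'))
# 👉️ ['apple', 'banana', 'kiwi', 'melon', 'mango']
```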

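The str.split() method splits the string into a list of substrings using a delimiter.

The method takes the following 2 parameters:

Name	Description
separator	Split the string into substrings on each occurrence of the separator
maxsplit	At most `maxsplit` splits are done (optional)

When no separator is passed to the str.split() method, it splits the input string on one or more whitespace characters.

main.py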


my_str = 'apple banana kiwi'

print(my_str.split())  # 👉️ ['apple', 'banana', 'kiwi']

If the separator is not found in the string, a list containing only 1 element is returned.
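
The three behaviors described above (the maxsplit parameter, splitting on whitespace when no separator is given, and a separator that is not found) can be sketched with a few made-up strings:

```python
# at most 1 split is done, the rest of the string stays in one piece
limited = 'a-b-c-d'.split('-', 1)
print(limited)  # 👉️ ['a', 'b-c-d']

# no separator: runs of spaces, tabs and newlines count as one delimiter
on_whitespace = '  apple   banana\tkiwi\n'.split()
print(on_whitespace)  # 👉️ ['apple', 'banana', 'kiwi']

# separator not found: a list with the whole string as its only element
not_found = 'apple'.split(',')
print(not_found)  # 👉️ ['apple']
```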

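If you need to split a string into a list of words with multiple delimiters, you can also use the re.findall() method.

Split a string into a list of words using `re.findall()`

Use the re.findall() method to split a string into a list of words with multiple delimiters, e.g. my_list = re.findall(r'[\w]+', my_str).

The re.findall() method matches each word in the string and returns a list containing the words.

main.py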


import re

# ✅ split string into list of words with multiple delimiters (re.findall())
my_str = 'apple banana, kiwi # melon. mango'

my_list = re.findall(r'[\w]+', my_str)
print(my_list)  # 👉️ ['apple', 'banana', 'kiwi', 'melon', 'mango']

The re.findall method takes a pattern and a string as arguments and returns a list of strings containing all non-overlapping matches of the pattern in the string.

The first argument we passed to the re.findall() method is a regular expression.

The square [] brackets are used to indicate a set of characters.

The \w character matches Unicode word characters and includes most characters that can be part of a word in any language.

The plus + causes the regular expression to match 1 or more repetitions of the preceding element (here, the set of word characters).

The re.findall() method returns a list containing the words in the string.
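
Because \w also matches digits and the underscore, but not apostrophes or hyphens, the result can differ from what "word" suggests. The sample string below is invented for this sketch; the second pattern adds the apostrophe to the set as one possible adjustment.

```python
import re

my_str = "snake_case, item42 # don't"

# digits and underscores are word characters, the apostrophe is not
words = re.findall(r'\w+', my_str)
print(words)  # 👉️ ['snake_case', 'item42', 'don', 't']

# adding the apostrophe to the set keeps contractions together
words_with_apostrophes = re.findall(r"[\w']+", my_str)
print(words_with_apostrophes)  # 👉️ ['snake_case', 'item42', "don't"]
```

Note that with \w as the only member of the set, [\w]+ and \w+ match exactly the same text; the brackets become useful once more characters are added.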

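If you ever need help reading or writing a regular expression, consult the regular expression syntax subheading in the official docs.

The page contains a list of all of the special characters with many useful examples.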
