Question

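In my program, every character is followed by its count, even when that count is 1. For example, for abbbbbc I get back a1b5c1. I don't want single characters to carry a count; I'd like the program to output ab5c. Here is the program: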

def rle(character_string):
  compressed_string=""
  count = 1 
  
  for i in range(len(character_string)-1):
    if character_string[i]== character_string[i+1]:  
      count+=1
    else:
     compressed_string += character_string[i] + str(count) 
     count=1
  
  compressed_string += character_string[-1] + str(count) 

  if len(compressed_string) >= len(character_string):
    return character_string
                                
  return compressed_string 


user_string= input("hello user spam character: ")
x=rle(user_string)

print(x)

FYI, I hope you never need to encode "a3bbbbbc"! :) Or anything with more than 9 repetitions.
@Amadan More than 9 repetitions is fine. It depends on how you write the decoder.
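@SIGHUP Does a213 decode to aa111, or to 213 copies of the letter a? (Obviously this is a toy example of RLE; normally you wouldn't have ambiguity between values and counts.)
@Amadan The assumption here is that the string to be encoded contains no digits. If it does, the result cannot be decoded.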

Answer 1

You only need to append the count when it is greater than 1.

def rle(character_string):
    compressed_string = ""
    count = 1

    for i in range(len(character_string) - 1):
        if character_string[i] == character_string[i + 1]:
            count += 1
        else:
            compressed_string += character_string[i]
            # Append the count only if it's greater than 1
            if count > 1:
                compressed_string += str(count)
            count = 1

    compressed_string += character_string[-1]
    # Same thing for the last character
    if count > 1:
        compressed_string += str(count)

    if len(compressed_string) >= len(character_string):
        return character_string

    return compressed_string

user_string = input("hello user spam character: ")
x = rle(user_string)

print(x)
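
One edge case worth noting: `character_string[-1]` raises an IndexError when the input is empty. Below is a minimal sketch of a guarded variant, assuming an empty input should simply be returned unchanged. The name `rle_safe` is made up for this illustration and is not part of the answer above.

```python
def rle_safe(character_string):
    # Assumption: an empty input is returned as-is instead of raising IndexError.
    if not character_string:
        return character_string

    parts = []
    count = 1
    for i in range(len(character_string) - 1):
        if character_string[i] == character_string[i + 1]:
            count += 1
        else:
            parts.append(character_string[i])
            # Append the count only if the run is longer than 1
            if count > 1:
                parts.append(str(count))
            count = 1

    # Same handling for the final run
    parts.append(character_string[-1])
    if count > 1:
        parts.append(str(count))

    compressed = "".join(parts)
    # Keep the original rule: fall back to the input if nothing was saved
    if len(compressed) >= len(character_string):
        return character_string
    return compressed


print(rle_safe("abbbbbc"))  # ab5c
print(repr(rle_safe("")))   # ''
```

Collecting the pieces in a list and joining once at the end is a stylistic choice here; repeated `+=` on a string works just as well for short inputs.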

Answer 2

A more concise way to implement the RLE algorithm using regular expressions, with a decompressor included:

import re

def rle(x):
    return re.sub(r"([a-z])\1+", lambda m: f"{m[1]}{len(m[0])}", x)

def unrle(x):
    return re.sub(r"([a-z])(\d+)", lambda m: m[1] * int(m[2]), x)

x = rle('aabbbzbbccpppppxxypppppkk')
print(x)
print(unrle(x))

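It works by finding a single letter followed by one or more repetitions of that same letter. The replacement function then substitutes the RLE representation for that sequence. The reverse operation is simple: we look for a letter followed by one or more digits and replace it with the right run of letters. This even handles arbitrarily long numbers (e.g. a123 decompresses to 123 a's).

The code above assumes you are only compressing strings of lowercase letters; if not, adjust the character class [a-z] as needed.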
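
As one possible adjustment of the character class, the sketch below swaps `[a-z]` for `\D` (any non-digit character), so uppercase letters, spaces and punctuation are compressed too. The names `rle_any` and `unrle_any` are hypothetical, and the input is still assumed to contain no digits.

```python
import re


def rle_any(text):
    # (\D) captures one non-digit character; \1+ matches one or more repeats of it.
    return re.sub(r"(\D)\1+", lambda m: f"{m[1]}{len(m[0])}", text)


def unrle_any(text):
    # A non-digit followed by digits expands to that many copies of the character.
    return re.sub(r"(\D)(\d+)", lambda m: m[1] * int(m[2]), text)


sample = "AAb  !!!zz"
packed = rle_any(sample)
print(packed)             # A2b 2!3z2
print(unrle_any(packed))  # AAb  !!!zz
```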

Answer 3

To show off the standard library, you could do it like this:

import itertools


def rle_parts(text):
    for ch, grouper in itertools.groupby(text):
        count = sum(1 for _ in grouper)
        yield f"{ch}{count}" if count > 1 else ch


def rle(text):
    return "".join(rle_parts(text))


print(rle("abbbbbcc"))
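
To make it clearer what `itertools.groupby` contributes here, this small illustration lists the runs it finds in the same input. The variable name `runs` is just for this example.

```python
import itertools

text = "abbbbbcc"
# Each group is (character, iterator over the consecutive run of that character).
runs = [(ch, len(list(group))) for ch, group in itertools.groupby(text)]
print(runs)  # [('a', 1), ('b', 5), ('c', 2)]
```

The encoder above then only has to decide, per run, whether to emit the count.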

Answer 4

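This code addresses your run-length encoding problem. The algorithm you describe needs to skip the count when a character occurs only once. You can do that with an if/else: if the run length is greater than 1, append the count; otherwise append just the character.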


def rel(string):
    count, compressed_str = 1, ''

    for index in range(len(string) - 1):
        if string[index] == string[index + 1]:
            count += 1
        else:
            # Append the count only when the run is longer than 1
            if count > 1:
                compressed_str += string[index] + str(count)
            else:
                compressed_str += string[index]
            count = 1
            
    # Handling the last character
    if count > 1:
        compressed_str += string[-1] + str(count)
    else:
        compressed_str += string[-1]

    return compressed_str

print(rel('aabbbbbcccdccc'))

Answer 5

The count only needs to be appended when it is not 1. So:

def rle(s: str) -> str:
    p, *r = s
    count = 1
    result = p
    for c in r:
        if c == p:
            count += 1
        else:
            if count > 1:
                result = f"{result}{count}{c}"
                count = 1
            else:
                result += c
            p = c
    return result if count < 2 else f"{result}{count}"

print(rle("abbbbbc"))
print(rle("abbbbbcc"))

Output:

ab5c
ab5c2

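Note:

This assumes the string to be encoded contains no digits. If it does, this encoding technique produces a string that cannot be decoded because of the obvious ambiguity.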
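
A short demonstration of that ambiguity, as a sketch: `rle_demo` is a compact stand-in encoder written for this example (not the function above) that follows the same rule of appending the count only when it exceeds 1. Two different inputs end up with the same encoding.

```python
from itertools import groupby


def rle_demo(s):
    # Stand-in encoder: emit the character, plus the run length when it exceeds 1.
    out = []
    for ch, run in groupby(s):
        n = len(list(run))
        out.append(ch if n == 1 else f"{ch}{n}")
    return "".join(out)


twelve_a = "a" * 12   # twelve copies of the letter a
a_then_ones = "a11"   # one a followed by two digit characters

print(rle_demo(twelve_a))     # a12
print(rle_demo(a_then_ones))  # a12
```

Since both inputs map to a12, no decoder can tell them apart without extra information such as an escape character or a delimiter.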

Addendum:

Here is a decoder:

def rld(s: str) -> str:
    result, *r = s
    count = 0
    for c in r:
        if c.isdecimal():
            count = count * 10 + int(c)
        else:
            if count > 0:
                result += (result[-1] * (count-1))
                count = 0
            result += c
    return result if count == 0 else result + (result[-1] * (count-1))

python – How do I get rid of the count when there is only one character? – 代码日志
