Remove offending characters from strings in a list

Sample data to parse (a list of unicode strings):

[u'\n', u'1\xa0', u'Some text here.', u'\n', u'1\xa0', u'Some more text here.', 
u'\n', u'1\xa0', u'Some more text here.']

I want to remove \xa0 from these strings.

Edit: my current approach, which doesn't work:

def remove_from_list(l, x): 
    return [li.replace(x, '') for li in l] 

remove_from_list(list, u'\xa0')

I still get exactly the same output.

Source

2013-05-17 Dan

Have you tried anything? –

Yes, I'll show what I tried. – Dan

Check these: http://stackoverflow.com/questions/3939361/remove-specific-characters-from-a-string-in-python, http://www.tutorialspoint.com/python/string_replace.htm – Rupak

The problem is different in each version of your code. Let's start with this one:

newli = re.sub(x, '', li) 
l[li].replace(newli)

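First, newli is already the line you want (that's what the re.sub call does), so you don't need replace here at all. Just assign newli.

Second, l[li] won't work, because li is the value of the line, not its index.

In this version, the problem is a bit more subtle: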

li = re.sub(x, '', li)

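re.sub returns a new string, and you're assigning that string to li. But that doesn't affect anything in the list; it just says "li no longer refers to the current line in the list, it now refers to this new string".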
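To make the rebinding point concrete, here is a minimal sketch. The list name items and its contents are made up for illustration; the loop body is the same assignment as above.

```python
import re

# Illustrative data, not the asker's real list.
items = [u'1\xa0', u'Some text here.']

for li in items:
    # This rebinds the local name li to a new string.
    # The element stored in items is not touched.
    li = re.sub(u'\xa0', u'', li)

# The first element still ends with the non-breaking space.
print(items[0].endswith(u'\xa0'))
```

After the loop, items compares equal to the list it started as, which matches the "exactly the same output" symptom in the question.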

The only way to replace a list element is to have its index, so that you can use the [] operator. To get the index, use enumerate.

So:

import re

def remove_from_list(l, x):
    # Assign through the index so the list element itself is replaced.
    for index, li in enumerate(l):
        l[index] = re.sub(x, '', li)
    return l

But really, you probably want to use str.replace instead of re.sub:

def remove_from_list(l, x):
    for index, li in enumerate(l):
        l[index] = li.replace(x, '')
    return l

Then you don't have to worry about what happens if x is a special character in a regular expression.
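As a small illustration of that risk, the sketch below passes a dot as x. The sample strings are invented for the example. re.escape is shown only as one way to keep using re.sub if a regular expression is needed elsewhere; it is not part of the original code.

```python
import re

# Invented sample data; x is a character that is special in a regex.
lines = [u'v1.0', u'a.b']
x = u'.'

# As a pattern, '.' matches any character (except a newline),
# so every character in these strings is removed.
regex_result = [re.sub(x, u'', li) for li in lines]

# str.replace treats x as literal text and removes only the dots.
literal_result = [li.replace(x, u'') for li in lines]

# re.escape quotes the special character, so re.sub acts literally too.
escaped_result = [re.sub(re.escape(x), u'', li) for li in lines]

print(regex_result)
print(literal_result)
print(escaped_result)
```

For a fixed string such as u'\xa0' both approaches give the same result; the difference only shows up when x contains regex metacharacters.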

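Also, in Python you almost never want to modify an object in place and also return it. Either modify it and return None, or return a new copy of the object. So, either: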

def remove_from_list(l, x):
    # Modifies l in place and implicitly returns None.
    for index, li in enumerate(l):
        newli = li.replace(x, '')
        l[index] = newli

...or:

def remove_from_list(l, x):
    # Builds and returns a new list; l is left unchanged.
    new_list = []
    for li in l:
        newli = li.replace(x, '')
        new_list.append(newli)
    return new_list

You can simplify the latter with a list comprehension, as in unutbu's answer:

def remove_from_list(l, x): 
    new_list = [li.replace(x, '') for li in l] 
    return new_list

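It's no coincidence that the second one is easier to write (no need for enumerate, a handy shortcut is available, and so on): it's usually what you want, so Python makes it easy.

I don't know how else to make this clearer, but here is one last try:

If you choose the version that returns a fixed new copy of the list instead of modifying the list in place, your original list will not be modified in any way. If you want to use the fixed new copy, you have to use the function's return value. For example: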

>>> def remove_from_list(l, x): 
...  new_list = [li.replace(x, '') for li in l] 
...  return new_list 
>>> a = [u'\n', u'1\xa0'] 
>>> b = remove_from_list(a, u'\xa0') 
>>> a 
[u'\n', u'1\xa0'] 
>>> b 
[u'\n', u'1']

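The problem you're having with your actual code, where everything turns into a list of 1-character and 0-character strings, is that you don't actually have a list of strings in the first place. You have a single string, which is the repr of a list of strings. So for li in l means "for each character li in the string l", not "for each string li in the list l".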
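The sketch below reproduces that situation with a made-up string named text. Using ast.literal_eval to turn the repr back into a real list is an assumption added here, not something the thread proposes, and it only works if the string is a valid Python literal.

```python
import ast

# Hypothetical input: one string holding the repr of a list of strings.
text = r"[u'\n', u'1\xa0']"

# Iterating over a string yields its characters, one at a time,
# so this produces a list of single-character strings.
per_character = [li.replace(u'\xa0', u'') for li in text]

# Assumption: the string is a valid Python literal, so it can be
# parsed back into a real list before cleaning it.
real_list = ast.literal_eval(text)
cleaned = [li.replace(u'\xa0', u'') for li in real_list]

print(per_character[:3])
print(cleaned)
```

If the data can be obtained as a real list in the first place, that is simpler than parsing its repr.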

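Source

2013-05-17 19:20:59 abarnert

For some reason it still isn't working. Following the last line, I'm using 'return [li.replace(x, '') for li in l]', but it still has those characters. – Dan

I've just updated my post to show what I did based on this answer. – Dan

This doesn't modify 'l' in place; it returns a new list with those characters stripped from each string. You have to print the new list, assign it to something, or otherwise use it. – abarnert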

You can use a list comprehension and str.replace:

>>> items 
[u'\n', 
u'1\xa0', 
u'Some text here.', 
u'\n', 
u'1\xa0', 
u'Some more text here.', 
u'\n', 
u'1\xa0', 
u'Some more text here.'] 
>>> [item.replace(u'\xa0', u'') for item in items] 
[u'\n', 
u'1', 
u'Some text here.', 
u'\n', 
u'1', 
u'Some more text here.', 
u'\n', 
u'1', 
u'Some more text here.']

Source

2013-05-17 19:10:36 unutbu

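@DanO'Day: What valid characters do you want to keep that this version doesn't? This keeps everything except '\xa0', which is exactly what you asked for. – abarnert

@DanO'Day: The code hasn't changed. – Matthias

@Matthias My bad; it's still not working, though. – Dan

Another option, if you are only interested in ASCII characters (you mention "characters" in general, but this also happens to work for the posted example):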

[text.encode('ascii', 'ignore') for text in your_list]
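One caveat worth sketching, assuming the code runs on Python 3 (the thread itself uses Python 2 style u'' literals): there, encode returns bytes objects rather than text, so a decode step is needed to get strings back. The contents of your_list below are illustrative. Note also that this approach drops every non-ASCII character, not only \xa0.

```python
# Illustrative data; your_list stands in for the real list.
your_list = [u'\n', u'1\xa0', u'Some text here.']

# On Python 3 each element of encoded is a bytes object.
encoded = [text.encode('ascii', 'ignore') for text in your_list]

# Decode back to text strings; this is safe because only ASCII remains.
decoded = [raw.decode('ascii') for raw in encoded]

print(decoded)
```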

Source

2013-05-17 19:22:55
