2015-06-18 36 views
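Regex does not work for multiple pattern occurrences

I want to capture every occurrence of the string "genome_" together with everything that follows it, stopping just before ",(", and replace it with a specific string, say "XXX".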

In the following text:

(ID_Bxylanisolvens_NLAE-ZL-C182_genome_orf00003 ____ Bxylanisolvens_NLAE -.._ 843_unknown ___ 1278-2120_1 _ ^^ neighbours_ID_Bxylanisolvens_NLAE-ZL-C182_genome_orf00002_1__ID_Bxylanisolvens_NLAE-ZL-C182_genome_orf00004_1__neighbour_genes_Bxylanisolvens_NLAE -.._ Bxylanisolvens_NLAE- ..:0.00000230914009336068,((ID_Bxylanisolvens_NLAE-ZL-G421_genome_orf00003 ____ Bxylanisolvens_NLAE -.._ 843_unknown ___ 1315-2157_1 _ ^^ neighbours_ID_Bxylanisolvens_NLAE-ZL-G421_genome_orf00002_1__ID_Bxylanisolvens_NLAE-ZL-G421_genome_orf00004_1__neighbour_genes_Bxylanisolvens_NLAE -.._ Bxylanisolvens_NLAE- ..:0.00000230914009336068,ID_Bxylanisolvens_NLAE-ZL-C339_genome_orf00003 ____ Bxylanisolvens_NLAE -.._ 843_unknown ___ 1084- 1926_1 _ ^^ neighbours_ID_Bxylanisolvens_NLAE-ZL-C339_genome_orf00002_1__ID_Bxylanisolvens_NLAE-ZL-C339_genome_orf00004_1__neighbour_genes_Bxylanisolvens_NLAE -.._ Bxylanisolvens_NLAE- ..:0.00000230914009336068)28:0.00000230914009336068,(

Desired result:

(ID_Bxylanisolvens_NLAE-ZL-C182_XXX,((ID_Bxylanisolvens_NLAE-ZL-G421_XXX,(

Which regex flavor are you using (PCRE, Python, JavaScript)? – ap88

What have you tried? – Jota

I am using Python's re module. I have already tried a few patterns: '_genome_.*\,\(' and '_genome_.*?\,\(' – ap88

Answers

Based on your sample data and desired output, lookaround assertions should help:

(?<=ID_Bxylanisolvens_NLAE-zl-[A-Z]\d{3,3}_)(genome.*?)(?=,\() 
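  • (?<=ID_Bxylanisolvens_NLAE-zl-[A-Z]\d{3,3}_) is a lookbehind that checks for a specific character sequence immediately before the match. It may need adjusting depending on how much the real data varies.
  • (genome.*?) captures the part to replace; the question mark makes the quantifier non-greedy, so it stops at the first possible end point.
  • (?=,\() is a lookahead for the character combination that marks the end of the part being removed, without consuming it.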
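
The pattern above can be tried from Python's re module with re.sub. The following is only a minimal sketch under a few assumptions: the input string is an abbreviated stand-in for the data in the question (the long labels are shortened to x, y, z), the variable names are illustrative, and re.IGNORECASE is added because the sample as displayed uses "ZL" while the pattern uses "zl". The last lines contrast the greedy and non-greedy variants of the patterns mentioned in the comments.

```python
import re

# Abbreviated stand-in for the tree string in the question (assumption:
# labels shortened; "ZL" casing follows the sample as displayed).
text = (
    "(ID_Bxylanisolvens_NLAE-ZL-C182_genome_orf00003____x:0.0000023,"
    "((ID_Bxylanisolvens_NLAE-ZL-G421_genome_orf00003____y:0.0000023,"
    "ID_Bxylanisolvens_NLAE-ZL-C339_genome_orf00003____z:0.0000023)28:0.0000023,("
)

# Lookbehind + lazy match + lookahead. Python's re requires a fixed-width
# lookbehind, which this one is. IGNORECASE lets "zl" match "ZL".
pattern = re.compile(
    r"(?<=ID_Bxylanisolvens_NLAE-zl-[A-Z]\d{3}_)genome.*?(?=,\()",
    re.IGNORECASE,
)
result = pattern.sub("XXX", text)
print(result)

# Without lookarounds: the match consumes "_" and ",(" so the replacement
# has to put them back.
lazy = re.sub(r"_genome_.*?,\(", "_XXX,(", text)    # stops at the first ",("
greedy = re.sub(r"_genome_.*,\(", "_XXX,(", text)   # runs to the last ",("
print(lazy)
print(greedy)
```

On this shortened input, the lookaround version and the non-greedy version both give the desired result, while the greedy version collapses everything between the first "_genome_" and the last ",(" into a single replacement.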

See it in action: RegEx101
If you need further details or adjustments, please leave a comment.