How to index a comma-separated text column with Oracle Text

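I have a column of comma-separated numbers, such as '2323,23323,23323'. The table has 20 million records, and it takes about 37 seconds to return results for a keyword search like the one below.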

SELECT count(*) from testtable WHERE node_sequence like '%324%';

I tried to use Oracle Text by creating the following index:

CREATE INDEX node_sequence_index ON testtable(node_sequence) INDEXTYPE IS ctxsys.context; 
exec ctx_ddl.sync_index('node_sequence_index');

But the following query, which I hoped would improve the query time, only works with words:

SELECT count(*) from testtable WHERE CONTAINS(node_sequence, '324') > 0;

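From the documentation, the index tokenizes the text into words (separated by spaces). Is there a way to tokenize on commas instead? I have not been able to find a sample that does this. Please help me understand what I am missing here.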

Source

2016-03-27 Chandan

You could apply a string replace function to node_sequence to get rid of the commas. –

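Yes, but I wanted to see whether this is possible without replacing the commas. Replacing them would require a lot of changes in the places that reference this column – Chandan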

You need to create your own lexer and tune it with the required parameters (documentation).

Something like this (sorry, not tested):

begin 
    ctx_ddl.create_preference('comma_lexer', 'BASIC_LEXER'); 
    ctx_ddl.set_attribute('comma_lexer', 'PRINTJOINS', '''()/^&"'); 
    ctx_ddl.set_attribute('comma_lexer', 'PUNCTUATIONS', ',.-?!'); 
end; 
/

create index node_sequence_index 
    on testtable(node_sequence) 
    indextype is ctxsys.context 
    parameters ('lexer comma_lexer') 
;

Update

Code from the comment by @Chandan, which works for the conditions mentioned in the question:

begin 
    ctx_ddl.create_preference('comma_lexer', 'BASIC_LEXER'); 
    ctx_ddl.set_attribute('comma_lexer', 'WHITESPACE', ','); 
    ctx_ddl.set_attribute('comma_lexer', 'NUMGROUP', '#'); 
end; 
/

create index node_sequence_index 
    on testtable(node_sequence) 
    indextype is ctxsys.context 
    parameters ('lexer comma_lexer') 
;
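If the index from the question already exists, the `create index` above will typically fail because the name is already in use (ORA-00955), so the old index has to be dropped first. The following is only a sketch of that rebuild and a follow-up check, assuming the object names used in this thread (`testtable`, `node_sequence_index`, `comma_lexer`); it has not been run against the original data.

```sql
-- Drop the index created in the question so the name can be reused.
drop index node_sequence_index;

-- Drop the preference only if an earlier attempt already created it;
-- this call raises an error when the preference does not exist.
begin
    ctx_ddl.drop_preference('comma_lexer');
end;
/

-- ... now run the create_preference / set_attribute / create index
-- statements shown above ...

-- With the comma treated as whitespace, this matches rows where 324 is one
-- of the comma-separated values. Unlike LIKE '%324%', it does not match
-- 324 inside a longer number such as 13245.
select count(*)
from   testtable
where  contains(node_sequence, '324') > 0;
```

Note that this changes the meaning of the search from a substring match to a whole-value match, which may or may not be what the original `LIKE` query intended.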

Source

2016-03-28 15:07:37 ThinkJet

begin ctx_ddl.create_preference('comma_lexer', 'BASIC_LEXER'); ctx_ddl.set_attribute('comma_lexer', 'WHITESPACE', ','); ctx_ddl.set_attribute('comma_lexer', 'NUMGROUP', '#'); end; create index node_sequence_index on testtable(node_sequence) indextype is ctxsys.context parameters('lexer comma_lexer'); – Chandan

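The code in the comment above works for me. Because the comma is the default NUMGROUP character, the string '4677,45555,45555,55555,5555' was treated as a single number, so I had to set NUMGROUP to an arbitrary other value such as '#'. – Chandan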
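One way to check how the lexer actually split the values is to look at the tokens Oracle Text stored for the index. The sketch below assumes the index is named `node_sequence_index`, so its token table would be `DR$NODE_SEQUENCE_INDEX$I`, and it uses the sample values from the comment above purely for illustration.

```sql
-- Oracle Text keeps the indexed tokens in a table named DR$<index_name>$I.
-- If the commas are treated as separators, each value should show up as
-- its own token here rather than as part of one long token.
select distinct token_text
from   dr$node_sequence_index$i
where  token_text in ('4677', '45555', '55555', '5555')
order  by token_text;
```

If the query returns no rows for values known to be in the column, the lexer settings are probably still grouping the whole string into one token.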
