2017-09-14 20 views

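Oracle PL/SQL - Identify and extract matching words between two strings

I am fairly new to Oracle functions, so apologies for my naivety.

I am looking for a function that compares the string in COL_A with the string in COL_B and outputs any words that appear in both strings to COL_C.

For example:

  • COL_A = 'Microsoft Office', COL_B = 'Windows Microsoft', so the expected result in COL_C is 'Microsoft'
  • COL_A = 'Microsoft Office', COL_B = 'Microsoft Office', so the expected result in COL_C is 'Microsoft Office'
  • COL_A = 'Microsoft Office', COL_B = 'Microsoft Windows', so the expected result in COL_C is 'Microsoft'
  • COL_A = 'Microsoft Office', COL_B = 'Outlook', so the expected result in COL_C is NULL

I found a function that almost meets the requirement (Count sequential matching words in two strings oracle). However, it returns a count of the matching words rather than the words themselves, and it only counts a match when the words also appear in the same order. For my purposes the order does not matter, and ideally I would like the matching words themselves to be returned.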

CREATE OR REPLACE FUNCTION STR_WORD_MATCH(
    P_STR1 IN VARCHAR2,
    P_STR2 IN VARCHAR2)
RETURN NUMBER
IS
    L_STR1     VARCHAR2(4000) := P_STR1;
    L_STR2     VARCHAR2(4000) := P_STR2;
    L_RES      NUMBER DEFAULT 0;
    L_DEL_POS1 NUMBER;
    L_DEL_POS2 NUMBER;
    L_WORD1    VARCHAR2(1000);
    L_WORD2    VARCHAR2(1000);
BEGIN
    LOOP
        -- Find the next space in each remaining string.
        L_DEL_POS1 := INSTR(L_STR1, ' ');
        L_DEL_POS2 := INSTR(L_STR2, ' ');
        -- Take the leading word of the first string.
        CASE L_DEL_POS1
            WHEN 0 THEN
                L_WORD1 := L_STR1;
                L_STR1 := '';
            ELSE
                L_WORD1 := SUBSTR(L_STR1, 1, L_DEL_POS1 - 1);
        END CASE;
        -- Take the leading word of the second string.
        CASE L_DEL_POS2
            WHEN 0 THEN
                L_WORD2 := L_STR2;
                L_STR2 := '';
            ELSE
                L_WORD2 := SUBSTR(L_STR2, 1, L_DEL_POS2 - 1);
        END CASE;
        -- Stop at the first position where the words differ or either string runs out.
        EXIT WHEN (L_WORD1 <> L_WORD2) OR ((L_WORD1 IS NULL) OR (L_WORD2 IS NULL));
        L_RES := L_RES + 1;
        L_STR1 := SUBSTR(L_STR1, L_DEL_POS1 + 1);
        L_STR2 := SUBSTR(L_STR2, L_DEL_POS2 + 1);
    END LOOP;
    RETURN L_RES;
END;
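
To illustrate the limitation, here is a quick check of the function above against the sample values. The values in the comments come from tracing the loop by hand, not from a captured run:

```sql
-- Assumes STR_WORD_MATCH above has been compiled.
-- Returns 0: the first words differ, so the loop exits immediately,
-- even though 'Microsoft' occurs in both strings.
select str_word_match('Microsoft Office', 'Windows Microsoft') as cnt from dual;

-- Returns 1: only the leading word matches.
select str_word_match('Microsoft Office', 'Microsoft Windows') as cnt from dual;

-- Returns 2: both words match, in the same order.
select str_word_match('Microsoft Office', 'Microsoft Office') as cnt from dual;
```

The first call is the case I care about: I would want 'Microsoft' back, but the function only reports a count of 0.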

As always, any help would be greatly appreciated.

Answers


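You can do this in a single query, but to simplify the syntax I created a short function that splits a string into words. I then use that function together with listagg():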

select rn, max(c1) c1, max(c2) c2,
       listagg(t2.column_value, ' ') within group (order by rn, c1, c2) common
  from (select rownum rn, c1, c2 from t) t
 cross join table(split(c1)) t1
  left join table(split(c2)) t2 on t2.column_value = t1.column_value
 group by rn

Function:

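create or replace function split(i_str in varchar2)
    return sys.odcivarchar2list pipelined is
begin
    -- Number of words = number of spaces + 1 (assumes single-space separators).
    for i in 1..length(' '||regexp_replace(i_str, '[^ ]+')) loop
        -- Emit the i-th run of non-space characters.
        pipe row (regexp_substr(i_str, '[^ ]+', 1, i));
    end loop;
    return;
end;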
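
As a quick sanity check, the function can be queried on its own. This is only a sketch using one of the sample strings from the question; it should return one row per word:

```sql
-- Assumes the split function above has been created.
-- Expected rows: 'Windows', then 'Microsoft'.
select column_value as word
  from table(split('Windows Microsoft'));
```

The column name column_value is the default name Oracle gives to the elements of a collection queried through table(), which is why the join in the main query refers to t1.column_value and t2.column_value.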

Example:

with t (c1, c2) as (
    select 'Microsoft Office', 'Windows Microsoft' from dual union all
    select 'Microsoft Office', 'Microsoft Office'  from dual union all
    select 'Microsoft Office', 'Microsoft Windows' from dual union all
    select 'Microsoft Office', 'Outlook'           from dual)
select rn, max(c1) c1, max(c2) c2,
       listagg(t2.column_value, ' ') within group (order by rn, c1, c2) common
  from (select rownum rn, c1, c2 from t) t
 cross join table(split(c1)) t1
  left join table(split(c2)) t2 on t2.column_value = t1.column_value
 group by rn

Result:

        RN C1               C2                COMMON
---------- ---------------- ----------------- -----------------
         1 Microsoft Office Windows Microsoft Microsoft
         2 Microsoft Office Microsoft Office  Microsoft Office
         3 Microsoft Office Microsoft Windows Microsoft
         4 Microsoft Office Outlook
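
If you would rather have a scalar function that can be called directly to populate COL_C, as the question asks, something along these lines might work. It is only a sketch under a few assumptions: the name str_common_words and the table name your_table are made up for illustration, words are separated by single spaces, matching is case-sensitive, and regexp_count requires Oracle 11g or later.

```sql
-- Hypothetical helper (not part of the original answer): returns the words of
-- p_str1 that also occur in p_str2, in the order they appear in p_str1.
create or replace function str_common_words(
    p_str1 in varchar2,
    p_str2 in varchar2)
    return varchar2
is
    l_word   varchar2(4000);
    l_result varchar2(4000);
begin
    -- Walk through the space-separated words of p_str1 one at a time.
    -- nvl(...) makes the loop run zero times when p_str1 is NULL.
    for i in 1 .. nvl(regexp_count(p_str1, '[^ ]+'), 0) loop
        l_word := regexp_substr(p_str1, '[^ ]+', 1, i);
        -- Pad with spaces so that only whole words match, and skip
        -- words that have already been added to the result.
        if instr(' ' || p_str2 || ' ', ' ' || l_word || ' ') > 0
           and instr(' ' || l_result || ' ', ' ' || l_word || ' ') = 0
        then
            l_result := l_result
                        || case when l_result is not null then ' ' end
                        || l_word;
        end if;
    end loop;
    return l_result;  -- NULL when the strings share no word
end;
/

-- Usage sketch; your_table is a placeholder for the real table name.
select col_a, col_b, str_common_words(col_a, col_b) as col_c
  from your_table;
```

Unlike the listagg() query, this keeps the matched words in the order they occur in the first string and lists each word only once, at the cost of a PL/SQL call per row.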