2014-06-13

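Initcap: skip words shorter than 4 characters

I have a question about INITCAP. Is it possible to write an INITCAP statement that leaves words shorter than 4 characters unchanged?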

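At the moment I have to change the words shorter than 4 characters back to normal after INITCAP has run.
So I thought it might be possible to create a function/procedure/trigger that simply skips those words. These words occur in place names such as "Son En Breugel", where the middle "En" has to become lowercase.
The first letter of the string does not need to change, only a short word that follows a space (that is, in the middle of the string).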


I have started writing a procedure, but it needs some fine-tuning:

* Not every string changed by INITCAP needs to be changed back
* INITCAP with the xDutch format

-- Still need to find a way to change the "'s" conversion (•). I think I may have deleted records with this script?

Can someone help me?

CREATE OR REPLACE PROCEDURE Location_Name_Routine IS
BEGIN
    DELETE
      FROM Location
     WHERE Name LIKE '%[^0-9a-zA-Z]%';
    UPDATE Location
       SET Name = NLS_INITCAP(Name, 'NLS_SORT=xDutch');
    UPDATE Location
       SET Name = REGEXP_REPLACE(Name, ' En', ' en');
    UPDATE Location
       SET Name = REGEXP_REPLACE(Name, ' Van', ' van');
    UPDATE Location
       SET Name = REGEXP_REPLACE(Name, ' De', ' de');
    UPDATE Location
       SET Name = REGEXP_REPLACE(Name, ' Den', ' den');
    UPDATE Location
       SET Name = REGEXP_REPLACE(Name, ' Over', ' over');
    UPDATE Location
       SET Name = REGEXP_REPLACE(Name, ' Aan', ' aan');
    UPDATE Location
       SET Name = REGEXP_REPLACE(Name, ' Bij', ' bij');
END;

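Shouldn't 'Son' be converted to 'son' as well, since it is also shorter than 4 characters? Or do you only want specific words such as 'en' and 'van' to stay lowercase? – Emmanuel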

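No, I forgot to mention that the first letter of the string does not need to change. I have just edited my question. I only want specific words to stay unchanged. – Isene112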

Answers

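There may be no simple answer to the underlying problem. I assume you are trying to capitalize Dutch addresses correctly, and that this question is related to this other question from yesterday.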

Taking the two questions together, there are at least three special cases so far:

'S GRAVENHAGE => 's Gravenhage 
IJSLAND   => IJsland 
SON EN BREUGEL => Son en Breugel 

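Neither INITCAP nor NLS_INITCAP('...', 'NLS_SORT=xDutch') handles all of these properly. Before you start coding, you should gather all the requirements. Are these the only rules for Dutch capitalization, or are there more?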
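
To see where the built-in functions fall short, a query along the following lines puts both of them side by side on the three cases listed above. This is only a sketch: the column aliases are invented for illustration, and sys.odcivarchar2list is used purely as a convenient way to supply the test strings inline.

```sql
-- Compare plain INITCAP with the Dutch-aware NLS_INITCAP on the three
-- special cases. The aliases (input_name, plain_initcap, dutch_initcap)
-- are illustrative only.
select column_value                                 as input_name,
       initcap(column_value)                        as plain_initcap,
       nls_initcap(column_value, 'NLS_SORT=xDutch') as dutch_initcap
from table(sys.odcivarchar2list(
         '''S GRAVENHAGE',
         'IJSLAND',
         'SON EN BREUGEL'
     ));
```

Any row where neither result column matches the desired spelling is a case that needs its own rule.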

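The answers posted so far may help with one specific exception, but you cannot simply combine regular expressions and expect to solve every case. You may want to take a more top-down approach here.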


UPDATE

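Based on wolφi's idea, the problem can be brute-forced using all the existing names. NLS_INITCAP alone works 95% of the time. Using the 431 names from the spreadsheet at this link, you can build a list containing all 25 special cases.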

Run this statement once to build a DECODE expression that handles all the non-trivial cases:

--Build decode for UPDATE. 
select 
    --Start the decode 
    'decode(upper(name),'|| 
    --List all the exceptions. Single quotes are a mess, no way around it. 
    listagg(
    --Upper case version to match 
    ''''||upper(replace(column_value, '''', ''''''))|| 
    --Pre-defined init-capped version 
    ''','''||replace(column_value, '''', '''''')||'''' 
    , ','||chr(10) 
    )
    within group (order by column_value) 
    || 
    --Default to NLS_INITCAP 
    ',nls_initcap(name, ''NLS_SORT=xDutch''))' 
from table(sys.odcivarchar2list(
    'Bellingwedde','Menterwolde','Oldambt','Pekela','Stadskanaal','Veendam','Vlagtwedde','Appingedam',
    'Delfzijl','Loppersum','Bedum','Ten Boer','Eemsmond','Groningen','Grootegast','Haren',
    'Hoogezand-Sappemeer','Leek','De Marne','Marum','Slochteren','Winsum','Zuidhorn','Achtkarspelen',
    'Ameland','het Bildt','Boarnsterhim','Dantumadiel','Dongeradeel','Ferwerderadiel','Franekeradeel','Harlingen',
    'Kollumerland en Nieuwkruisland','Leeuwarden','Leeuwarderadeel','Littenseradiel','Menaldumadeel','Schiermonnikoog','Terschelling','Tytsjerksteradiel',
    'Vlieland','Bolsward','Gaasterlân-Sleat','Lemsterland','Nijefurd','Sneek','Wûnseradiel','Wymbritseradiel',
    'Heerenveen','Ooststellingwerf','Opsterland','Skarsterlân','Smallingerland','Weststellingwerf','Aa en Hunze','Assen',
    'Midden-Drenthe','Noordenveld','Tynaarlo','Borger-Odoorn','Coevorden','Emmen','Hoogeveen','Meppel',
    'Westerveld','De Wolden','Dalfsen','Hardenberg','Kampen','Ommen','Staphorst','Steenwijkerland',
    'Zwartewaterland','Zwolle','Deventer','Olst-Wijhe','Raalte','Almelo','Borne','Dinkelland',
    'Enschede','Haaksbergen','Hellendoorn','Hengelo','Hof van Twente','Losser','Oldenzaal','Rijssen-Holten',
    'Tubbergen','Twenterand','Wierden','Apeldoorn','Barneveld','Ede','Elburg','Epe',
    'Ermelo','Harderwijk','Hattem','Heerde','Nijkerk','Nunspeet','Oldebroek','Putten',
    'Scherpenzeel','Voorst','Wageningen','Buren','Culemborg','Geldermalsen','Lingewaal','Maasdriel',
    'Neder-Betuwe','Neerijnen','Tiel','West Maas en Waal','Zaltbommel','Aalten','Berkelland','Bronckhorst',
    'Brummen','Doetinchem','Lochem','Montferland','Oost Gelre','Oude IJsselstreek','Winterswijk','Zutphen',
    'Arnhem','Beuningen','Doesburg','Druten','Duiven','Groesbeek','Heumen','Lingewaard',
    'Millingen aan de Rijn','Nijmegen','Overbetuwe','Renkum','Rheden','Rijnwaarden','Rozendaal','Ubbergen',
    'Westervoort','Wijchen','Zevenaar','Almere','Dronten','Lelystad','Noordoostpolder','Urk',
    'Zeewolde','Abcoude','Amersfoort','Baarn','De Bilt','Breukelen','Bunnik','Bunschoten',
    'Eemnes','Houten','IJsselstein','Leusden','Loenen','Lopik','Maarssen','Montfoort',
    'Nieuwegein','Oudewater','Renswoude','Rhenen','De Ronde Venen','Soest','Utrecht','Utrechtse Heuvelrug',
    'Veenendaal','Vianen','Wijk bij Duurstede','Woerden','Woudenberg','Zeist','Andijk','Anna Paulowna',
    'Drechterland','Enkhuizen','Harenkarspel','Den Helder','Hoorn','Koggenland','Medemblik','Niedorp',
    'Opmeer','Schagen','Stede Broec','Texel','Wervershoof','Wieringen','Wieringermeer','Zijpe',
    'Alkmaar','Bergen (NH.)','Heerhugowaard','Heiloo','Langedijk','Schermer','Beverwijk','Castricum',
    'Heemskerk','Uitgeest','Velsen','Bloemendaal','Haarlem','Haarlemmerliede en Spaarnwoude','Heemstede','Zandvoort',
    'Wormerland','Zaanstad','Aalsmeer','Amstelveen','Amsterdam','Beemster','Diemen','Edam-Volendam',
    'Graft-De Rijp','Haarlemmermeer','Landsmeer','Oostzaan','Ouder-Amstel','Purmerend','Uithoorn','Waterland',
    'Zeevang','Blaricum','Bussum','Hilversum','Huizen','Laren','Muiden','Naarden',
    'Weesp','Wijdemeren','Hillegom','Kaag en Braassem','Katwijk','Leiden','Leiderdorp','Lisse',
    'Noordwijk','Noordwijkerhout','Oegstgeest','Teylingen','Voorschoten','Zoeterwoude','''s-Gravenhage','Leidschendam-Voorburg',
    'Pijnacker-Nootdorp','Rijswijk','Wassenaar','Zoetermeer','Delft','Midden-Delfland','Westland','Alphen aan den Rijn',
    'Bergambacht','Bodegraven','Boskoop','Gouda','Nieuwkoop','Reeuwijk','Rijnwoude','Schoonhoven',
    'Vlist','Waddinxveen','Albrandswaard','Barendrecht','Bernisse','Binnenmaas','Brielle','Capelle aan den IJssel',
    'Cromstrijen','Dirksland','Goedereede','Hellevoetsluis','Korendijk','Krimpen aan den IJssel','Lansingerland','Maassluis',
    'Middelharnis','Nederlek','Oostflakkee','Oud-Beijerland','Ouderkerk','Ridderkerk','Rotterdam','Rozenburg',
    'Schiedam','Spijkenisse','Strijen','Vlaardingen','Westvoorne','Zuidplas','Alblasserdam','Dordrecht',
    'Giessenlanden','Gorinchem','Graafstroom','Hardinxveld-Giessendam','Hendrik-Ido-Ambacht','Leerdam','Liesveld','Nieuw-Lekkerland',
    'Papendrecht','Sliedrecht','Zederik','Zwijndrecht','Hulst','Sluis','Terneuzen','Borsele',
    'Goes','Kapelle','Middelburg','Noord-Beveland','Reimerswaal','Schouwen-Duiveland','Tholen','Veere',
    'Vlissingen','Bergen op Zoom','Breda','Drimmelen','Etten-Leur','Geertruidenberg','Halderberge','Moerdijk',
    'Oosterhout','Roosendaal','Rucphen','Steenbergen','Woensdrecht','Zundert','Aalburg','Alphen-Chaam',
    'Baarle-Nassau','Dongen','Gilze en Rijen','Goirle','Hilvarenbeek','Loon op Zand','Oisterwijk','Tilburg',
    'Waalwijk','Werkendam','Woudrichem','Bernheze','Boekel','Boxmeer','Boxtel','Cuijk',
    'Grave','Haaren','''s-Hertogenbosch','Heusden','Landerd','Lith','Maasdonk','Mill en Sint Hubert',
    'Oss','Schijndel','Sint Anthonis','Sint-Michielsgestel','Sint-Oedenrode','Uden','Veghel','Vught',
    'Asten','Bergeijk','Best','Bladel','Cranendonck','Deurne','Eersel','Eindhoven',
    'Geldrop-Mierlo','Gemert-Bakel','Heeze-Leende','Helmond','Laarbeek','Nuenen, Gerwen en Nederwetten','Oirschot','Reusel-De Mierden',
    'Someren','Son en Breugel','Valkenswaard','Veldhoven','Waalre','Beesel','Bergen (L.)','Gennep',
    'Horst aan de Maas','Mook en Middelaar','Peel en Maas','Venlo','Venray','Echt-Susteren','Leudal','Maasgouw',
    'Nederweert','Roerdalen','Roermond','Weert','Beek','Brunssum','Eijsden','Gulpen-Wittem',
    'Heerlen','Kerkrade','Landgraaf','Maastricht','Margraten','Meerssen','Nuth','Onderbanken',
    'Schinnen','Simpelveld','Sittard-Geleen','Stein','Vaals','Valkenburg aan de Geul','Voerendaal'
))
where column_value <> nls_initcap(column_value, 'NLS_SORT=xDutch'); 

Use the result of that statement to build an UPDATE like this:

--Update names to properly init-capped name, as defined by: 
--http://epp.eurostat.ec.europa.eu/portal/page/portal/nuts_nomenclature/local_administrative_units 
update location 
set name = 
    decode(upper(name),'''S-GRAVENHAGE','''s-Gravenhage', 
    '''S-HERTOGENBOSCH','''s-Hertogenbosch', 
    'AA EN HUNZE','Aa en Hunze', 
    'ALPHEN AAN DEN RIJN','Alphen aan den Rijn', 
    'BERGEN (NH.)','Bergen (NH.)', 
    'BERGEN OP ZOOM','Bergen op Zoom', 
    'CAPELLE AAN DEN IJSSEL','Capelle aan den IJssel', 
    'GILZE EN RIJEN','Gilze en Rijen', 
    'HAARLEMMERLIEDE EN SPAARNWOUDE','Haarlemmerliede en Spaarnwoude', 
    'HOF VAN TWENTE','Hof van Twente', 
    'HORST AAN DE MAAS','Horst aan de Maas', 
    'KAAG EN BRAASSEM','Kaag en Braassem', 
    'KOLLUMERLAND EN NIEUWKRUISLAND','Kollumerland en Nieuwkruisland', 
    'KRIMPEN AAN DEN IJSSEL','Krimpen aan den IJssel', 
    'LOON OP ZAND','Loon op Zand', 
    'MILL EN SINT HUBERT','Mill en Sint Hubert', 
    'MILLINGEN AAN DE RIJN','Millingen aan de Rijn', 
    'MOOK EN MIDDELAAR','Mook en Middelaar', 
    'NUENEN, GERWEN EN NEDERWETTEN','Nuenen, Gerwen en Nederwetten', 
    'PEEL EN MAAS','Peel en Maas', 
    'SON EN BREUGEL','Son en Breugel', 
    'VALKENBURG AAN DE GEUL','Valkenburg aan de Geul', 
    'WEST MAAS EN WAAL','West Maas en Waal', 
    'WIJK BIJ DUURSTEDE','Wijk bij Duurstede', 
    'HET BILDT','het Bildt', 
    nls_initcap(name, 'NLS_SORT=xDutch')); 
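
Before running the UPDATE, it may be worth previewing which rows would actually change. The query below is a sketch that assumes the same location table and name column as the statement above. Its DECODE is cut down to two exceptions for brevity and would need the full generated list, and the aliases current_name and new_name are illustrative.

```sql
-- Preview only: lists the rows whose name would change, modifies nothing.
-- The DECODE here is abbreviated; substitute the full generated expression.
select name as current_name,
       decode(upper(name),
              'SON EN BREUGEL', 'Son en Breugel',
              'HET BILDT',      'het Bildt',
              nls_initcap(name, 'NLS_SORT=xDutch')) as new_name
from location
where name <> decode(upper(name),
                     'SON EN BREUGEL', 'Son en Breugel',
                     'HET BILDT',      'het Bildt',
                     nls_initcap(name, 'NLS_SORT=xDutch'));
```

Rows that are already capitalized correctly drop out of the result, so an empty result would suggest the UPDATE has nothing left to do.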

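Thanks for your answer. Yes, you are right, this is about properly capitalizing addresses in the Netherlands. It is giving me a bit of a headache, so first I will focus on the middle words, like the 'En' in Son En Breugel. But there are also names like Alpen Aan De Rijn, which has to become Alpen aan de Rijn, without capitals in the middle. That is priority 1. – Isene112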


Please see my edit. Do you have any suggestions? – Isene112