你所看到的是IN()bieng受绑定变量窥视的结果。当你有直方图的时候,你会写一个查询,比如“where a ='a'”,oracle将使用直方图来猜测返回多少行(与inlist运算符相同的想法,它对每个项目进行迭代并聚合行)。如果没有直方图,它将以行/不同值的形式进行猜测。 在子查询中,oracle并不这样做(在大多数情况下,它有一个独特的例子)。
例如:
SQL> create table test
2 (val1 number, val2 varchar2(20), val3 number);
Table created.
Elapsed: 00:00:00.02
SQL>
SQL> insert into test select 1, 'aaaaaaaaaa', mod(rownum, 5) from dual connect by level <= 100;
100 rows created.
Elapsed: 00:00:00.01
SQL> insert into test select 2, 'aaaaaaaaaa', mod(rownum, 5) from dual connect by level <= 1000;
1000 rows created.
Elapsed: 00:00:00.02
SQL> insert into test select 3, 'aaaaaaaaaa', mod(rownum, 5) from dual connect by level <= 100;
100 rows created.
Elapsed: 00:00:00.00
SQL> insert into test select 4, 'aaaaaaaaaa', mod(rownum, 5) from dual connect by level <= 100000;
100000 rows created.
所以我有101200行的表。对于VAL1,100是“1”1000是“2”100是“3”而100k是“4”。
现在如果直方图聚集(并且我们希望他们在这种情况下)
SQL> exec dbms_stats.gather_table_stats(user , 'test', degree=>4, method_opt=>'for all indexed columns size 4', estimate_percent=>100);
SQL> exec dbms_stats.gather_table_stats(user , 'lookup', degree=>4, method_opt =>'for all indexed columns size 3', estimate_percent=>100);
我们看到以下内容:
SQL> explain plan for select * from test where val1 in (1, 2, 3) ;
Explained.
SQL> @explain ""
Plan hash value: 3165434153
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1200 | 19200 | 23 (0)| 00:00:01 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| TEST | 1200 | 19200 | 23 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | TEST1 | 1200 | | 4 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
VS
SQL> explain plan for select * from test where val1 in (select id from lookup where str = 'A') ;
Explained.
SQL> @explain ""
Plan hash value: 441162525
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 25300 | 518K| 106 (3)| 00:00:02 |
| 1 | NESTED LOOPS | | 25300 | 518K| 106 (3)| 00:00:02 |
| 2 | TABLE ACCESS BY INDEX ROWID| LOOKUP | 1 | 5 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | LOOKUP1 | 1 | | 0 (0)| 00:00:01 |
|* 4 | TABLE ACCESS FULL | TEST | 25300 | 395K| 105 (3)| 00:00:02 |
----------------------------------------------------------------------------------------
哪里查找表
SQL> select * From lookup;
ID STR
---------- ----------
1 A
2 B
3 C
4 D
(str是唯一索引并具有直方图)。
注意到基数为1200的基数是一个很好的计划,但在子查询中是非常不准确的? Oracle没有计算连接条件的直方图,而是说“看,我不知道会是什么id,所以猜猜总行数(100k + 1000 + 100 + 100)/不同值(4)= 25300并使用如果你知道这个子查询将匹配少量的行(我们这样做),那么你必须提示外部查询,试图把它使用索引,如:。
SQL> explain plan for select /*+ index(t) */ * from test t where val1 in (select id from lookup where str = 'A') ;
Explained.
SQL> @explain
Plan hash value: 702117913
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 25300 | 518K| 456 (1)| 00:00:06 |
| 1 | NESTED LOOPS | | 25300 | 518K| 456 (1)| 00:00:06 |
| 2 | TABLE ACCESS BY INDEX ROWID| LOOKUP | 1 | 5 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | LOOKUP1 | 1 | | 0 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| TEST | 25300 | 395K| 455 (1)| 00:00:06 |
|* 5 | INDEX RANGE SCAN | TEST1 | 25300 | | 61 (2)| 00:00:01 |
----------------------------------------------------------------------------------------
另一件事是在我的具体情况为VAL1 = 4是最表的,可以说,我有我的标准查询: select * from test t where val1 in (select id from lookup where str = :B1);
为可能的:B1
输入。如果我知道传入的有效值是A,B和C(即不是映射到id = 4的D)。我可以添加这一招:
SQL> explain plan for select * from test t where val1 in (select id from lookup where str = :b1 and id in (1, 2, 3)) ;
Explained.
SQL> @explain ""
Plan hash value: 771376936
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 250 | 5250 | 24 (5)| 00:00:01 |
|* 1 | HASH JOIN | | 250 | 5250 | 24 (5)| 00:00:01 |
|* 2 | VIEW | index$_join$_002 | 1 | 5 | 1 (100)| 00:00:01 |
|* 3 | HASH JOIN | | | | | |
|* 4 | INDEX RANGE SCAN | LOOKUP1 | 1 | 5 | 0 (0)| 00:00:01 |
| 5 | INLIST ITERATOR | | | | | |
|* 6 | INDEX UNIQUE SCAN | SYS_C002917051 | 1 | 5 | 0 (0)| 00:00:01 |
| 7 | INLIST ITERATOR | | | | | |
| 8 | TABLE ACCESS BY INDEX ROWID| TEST | 1200 | 19200 | 23 (0)| 00:00:01 |
|* 9 | INDEX RANGE SCAN | TEST1 | 1200 | | 4 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
现在发现甲骨文已经得到了合理的卡(其1,2,3推到测试表和有1200..not 100%准确,因为我是唯一的过滤功能noe,但是我告诉oralce一定不是4!
你的查询在语法上不正确,你在'select'语句中有val2和val3,但这些不是我的'group by'语句 –
使用INNER JOIN而不是子查询,并比较结果,IT应该表现得更好。 –
subselect返回多少个值?这些值包含的列val1中的数据的百分比多少? – sufleR