2011-12-22 27 views
2

我目前正在试图优化我的数据库。问题是以下几点: 我有一张表,当前存储超过83Mio。时间依赖值。它们由高分辨率(ms)时间戳索引。我需要做的是计算特定值出现在给定时间间隔内的次数 - 例如说,我想知道值1.56787在时间戳x到时间戳y的时间间隔中出现了多少次。现在这需要几乎永远。 即时通讯使用InnoDB,我已经花了很多时间来优化配置文件,这增加了速度immensly。优化MySQL数据库的快速计数

我很感激任何输入,因为我几乎不知道如何将它关闭。我能想到的唯一解决方法是创建包含固定时间间隔的预计数值的表,由于整个事物也应该是完全可更新的(我们正在讨论每隔几毫秒到达的新值),所以这并不令人满意。另一个数据库系统会更适合我的问题吗?

下面是解释输出:

Field Type Null Key Default Extra 

timestamp bigint(20) NO PRI NULL  
ask decimal(6,5) NO  NULL  
bid decimal(6,5) NO  NULL  
askvolume decimal(6,5) NO  NULL  
bidvolume decimal(6,5) NO  NULL  

# The MySQL server 
[mysqld] 
port= 3306 
socket= "C:/xampp/mysql/mysql.sock" 
basedir="C:/xampp/mysql" 
tmpdir="C:/xampp/tmp" 
datadir="C:/xampp/mysql/data" 
pid_file="mysql.pid" 
skip-external-locking 
key_buffer = 16M 
max_allowed_packet = 61M 
table_cache = 64 
sort_buffer_size = 512K 
net_buffer_length = 8K 
read_buffer_size = 256K 
read_rnd_buffer_size = 512K 
myisam_sort_buffer_size = 8M 
log_error="mysql_error.log" 
bind-address="192.168.1.2" 


# Don't listen on a TCP/IP port at all. This can be a security enhancement, 
# if all processes that need to connect to mysqld run on the same host. 
# All interaction with mysqld must be made via Unix sockets or named pipes. 
# Note that using this option without enabling named pipes on Windows 
# (via the "enable-named-pipe" option) will render mysqld useless! 
# 
# commented in by lampp security 
#skip-networking 
skip-federated 

# Replication Master Server (default) 
# binary logging is required for replication 
# log-bin deactivated by default since XAMPP 1.4.11 
#log-bin=mysql-bin 

# required unique id between 1 and 2^32 - 1 
# defaults to 1 if master-host is not set 
# but will not function as a master if omitted 
server-id = 1 

# Replication Slave (comment out master section to use this) 
# 
# To configure this host as a replication slave, you can choose between 
# two methods : 
# 
# 1) Use the CHANGE MASTER TO command (fully described in our manual) - 
# the syntax is: 
# 
# CHANGE MASTER TO MASTER_HOST=<host>, MASTER_PORT=<port>, 
# MASTER_USER=<user>, MASTER_PASSWORD=<password> ; 
# 
# where you replace <host>, <user>, <password> by quoted strings and 
# <port> by the master's port number (3306 by default). 
# 
# Example: 
# 
# CHANGE MASTER TO MASTER_HOST='125.564.12.1', MASTER_PORT=3306, 
# MASTER_USER='joe', MASTER_PASSWORD='secret'; 
# 
# OR 
# 
# 2) Set the variables below. However, in case you choose this method, then 
# start replication for the first time (even unsuccessfully, for example 
# if you mistyped the password in master-password and the slave fails to 
# connect), the slave will create a master.info file, and any later 
# change in this file to the variables' values below will be ignored and 
# overridden by the content of the master.info file, unless you shutdown 
# the slave server, delete master.info and restart the slaver server. 
# For that reason, you may want to leave the lines below untouched 
# (commented) and instead use CHANGE MASTER TO (see above) 
# 
# required unique id between 2 and 2^32 - 1 
# (and different from the master) 
# defaults to 2 if master-host is set 
# but will not function as a slave if omitted 
#server-id  = 2 
# 
# The replication master for this slave - required 
#master-host  = <hostname> 
# 
# The username the slave will use for authentication when connecting 
# to the master - required 
#master-user  = <username> 
# 
# The password the slave will authenticate with when connecting to 
# the master - required 
#master-password = <password> 
# 
# The port the master is listening on. 
# optional - defaults to 3306 
#master-port  = <port> 
# 
# binary logging - not required for slaves, but recommended 
#log-bin=mysql-bin 


# Point the following paths to different dedicated disks 
#tmpdir = "C:/xampp/tmp" 
#log-update = /path-to-dedicated-directory/hostname 

# Uncomment the following if you are using BDB tables 
#bdb_cache_size = 4M 
#bdb_max_lock = 10000 

# Comment the following if you are using InnoDB tables 
#skip-innodb 
innodb_data_home_dir = "C:/xampp/mysql/data" 
innodb_data_file_path = ibdata1:10M:autoextend 
innodb_log_group_home_dir = "C:/xampp/mysql/data" 
#innodb_log_arch_dir = "C:/xampp/mysql/data" 
## You can set .._buffer_pool_size up to 50 - 80 % 
## of RAM but beware of setting memory usage too high 
innodb_buffer_pool_size = 1024M 
innodb_additional_mem_pool_size = 20M 
## Set .._log_file_size to 25 % of buffer pool size 
innodb_log_file_size = 5M 
innodb_log_buffer_size = 16M 
innodb_flush_log_at_trx_commit = 0 
innodb_lock_wait_timeout = 50 

[mysqldump] 
quick 
max_allowed_packet = 16M 

[mysql] 
no-auto-rehash 
# Remove the next comment character if you are not familiar with SQL 
#safe-updates 

[isamchk] 
key_buffer = 20M 
sort_buffer_size = 20M 
read_buffer = 2M 
write_buffer = 2M 

[myisamchk] 
key_buffer = 20M 
sort_buffer_size = 20M 
read_buffer = 2M 
write_buffer = 2M 

[mysqlhotcopy] 
interactive-timeout 

喔机器是i7-950的RAM 6GB和系统+数据库上的SSD。所以我认为这不应该是问题?

感谢您的帮助,我们将非常感谢!

+0

在这个问题不是一种选择砸钱?你可以做很多事情来提高性能,而更多的钱绝对不是我的第一选择......但是有些观点你必须得到更好的硬件。根据你的描述,你似乎越来越接近这一点。这就是说...我强烈建议给(配置良好的)Postgres一枪。 – shesek 2011-12-22 09:05:22

+0

没有看到用于运行MySQl实例的表格和硬件 - 很难提出任何建议。我怀疑Postgres会在I/O绑定系统上做出什么改变,尤其是因为知道InnoDB的B-tree实现是业界最好的实现之一。以远距离交换整个RDBMS并不是一个可行的选择。我敢打赌,MySQL配置不正确(SHOW VARIABLES LIKE'%innodb%'和'EXPLAIN ...'的输出不存在)。如果您可以发布这些信息,那么分析出现问题会更容易。 – 2011-12-22 09:17:46

+0

感谢球员们,我把所有的信息都加入到了描述中......如果有必要,它不应该成为一个问题,但是当我贴出来的时候,它是一个带有ssd和6GB内存的i7 ... – user871784 2011-12-22 11:45:31

回答

-1

第一步:如果您尚未完成此操作,请使用explain plan查看查询的瓶颈究竟是什么,以及引擎是否正确使用索引。

第二步:按时间戳范围对表进行分区。我不确定MySQL/InnoDB是否具备这种能力,但如果没有,您最好更换DBMS。

在任何情况下,MySQL都不是高性能的好选择:根据您的需求,使用Oracle或Postgre或者甚至内存存储可能会更好(尤其是如果您不太在意的话)为了安全而不是表现)。

+0

InnoDB在内存中存储工作数据集,其工作速度比内存中特定的解决方案快得多。 Postgres或Oracle不会更快。这是一个优化系统I/O并查看查询花费多长时间的问题。 – 2011-12-22 09:19:08

+0

我真诚地怀疑InnoDB比H2更快(他们说它不是,它的价值:http://www.h2database.com/html/performance.html)。尽管如此,清晰地转换RDBMS是需要考虑的另一件事,而不是最简单的解决方案,但值得一提的是。 – Viruzzo 2011-12-22 11:07:15

+0

InnoDB @ 5.6速度很快,并且您链接的基准测试表明,MySQL在他们的情况下每秒执行140条语句,这比我的小型测试xeon和7200rpm驱动器的速度要慢。我不相信其他系统的开箱即用基准。再次,有专门针对MySQL的基于列的分析引擎,其性能比Oracle快。问题是为什么数据库的数据少于1亿个。记录执行速度如此之慢,我知道一个事实,即如果不更改数据库系统,可以在InnoDB上获得非常快的结果。 – 2011-12-22 11:29:58

0

我没有感觉你在你的索引时间戳值的值范围,但在我看来,partitioning你的表可以帮助你在这里。具体为RANGE partitioningHASH partitioning

这应该会显着提升性能。

+0

哈希分区是不可取的:它确保数据在分区中的均匀分布(作为散列),但对于间隔查询(如他所做的那样),它不提供特别的好处,而范围分区最明显。 – Viruzzo 2011-12-22 11:01:22

+0

谢谢大家的建议,我想我现在有很多东西要挖掘! – user871784 2011-12-22 11:30:03