
Adding or removing nodes from an existing GCE hadoop/spark cluster with bdutil

I'm getting started running Spark on Google Compute Engine, deployed with bdutil (from the GoogleCloudPlatform GitHub), which I did as follows:

./bdutil -e bigquery_env.sh,datastore_env.sh,extensions/spark/spark_env.sh -b myhdfsbucket deploy 

I expect I might start with a 2-node cluster (the default), and later want to add another worker node to cope with a large job that needs to be run. If possible, I'd like to do this without completely tearing down and redeploying the cluster.

I've tried redeploying with a different number of nodes, and also running "create" and "run_command_group install_connectors", but for each of these I get errors about nodes that already exist, e.g.

./bdutil -n 3 -e bigquery_env.sh,datastore_env.sh,extensions/spark/spark_env.sh -b myhdfsbucket deploy 

./bdutil -n 3 -b myhdfsbucket create 
./bdutil -n 3 -t workers -b myhdfsbucket run_command_group install_connectors 

I've also tried snapshotting and cloning a worker that was already running, but not all of the services seemed to start up correctly, and I was left a bit out of my depth there.

Any guidance on how I can/should add and/or remove nodes from an existing cluster?

Answer


Update: We've added resize_env.sh to the base bdutil repo, so you no longer need to go to my fork for it.
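If you use that in-repo copy, a minimal sketch of the deploy step would be the following (this assumes the file lives at extensions/google/experimental/resize_env.sh, the path linked in the comments below, and that you edit NEW_NUM_WORKERS in it before running; the rest of the workflow is the same as the fork-based commands in the original answer):

# Hedged sketch: deploy only the new workers using the resize_env.sh shipped in the base repo.
# Assumes extensions/google/experimental/resize_env.sh exists and NEW_NUM_WORKERS is already set in it.
./bdutil -e bigquery_env.sh,datastore_env.sh,extensions/spark/spark_env.sh -b myhdfsbucket -n 2 -e extensions/google/experimental/resize_env.sh deploy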

Original answer:

There's no official support for resizing a bdutil-deployed cluster, but it's certainly something we've discussed before, and it's in fact feasible to put together some basic resize support. This may take a different form once merged into the master branch, but I've pushed a first draft of resize support to my fork of bdutil. It's implemented in two commits: one to allow skipping all "master" operations (including create, run_command, delete, etc.) and another to add the resize_env.sh file.

I haven't tested it against every combination of the other bdutil extensions, but I've at least run it successfully with the base bdutil_env.sh plus extensions/spark/spark_env.sh. In theory it should also work with your bigquery and datastore extensions. To use it in your case:

# Assuming you initially deployed with this command (default n == 2) 
./bdutil -e bigquery_env.sh,datastore_env.sh,extensions/spark/spark_env.sh -b myhdfsbucket -n 2 deploy 

# Before this step, edit resize_env.sh and set NEW_NUM_WORKERS to what you want. 
# Currently it defaults to 5. 
# Deploy only the new workers, e.g. {hadoop-w-2, hadoop-w-3, hadoop-w-4}: 
./bdutil -e bigquery_env.sh,datastore_env.sh,extensions/spark/spark_env.sh -b myhdfsbucket -n 2 -e resize_env.sh deploy 

# Explicitly start the Hadoop daemons on just the new workers: 
./bdutil -e bigquery_env.sh,datastore_env.sh,extensions/spark/spark_env.sh -b myhdfsbucket -n 2 -e resize_env.sh run_command -t workers -- "service hadoop-hdfs-datanode start && service hadoop-mapreduce-tasktracker start" 

# If using Spark as well, explicitly start the Spark daemons on the new workers: 
./bdutil -e bigquery_env.sh,datastore_env.sh,extensions/spark/spark_env.sh -b myhdfsbucket -n 2 -e resize_env.sh run_command -t workers -u extensions/spark/start_single_spark_worker.sh -- "./start_single_spark_worker.sh" 

# From now on, it's as if you originally turned up your cluster with "-n 5". 
# When deleting, remember to include those extra workers: 
./bdutil -b myhdfsbucket -n 5 delete 

In general, the best-practice recommendation is to condense your configuration into a single file rather than always passing flags around. For example, in your case you might want a file named my_base_env.sh containing:

import_env bigquery_env.sh 
import_env datastore_env.sh 
import_env extensions/spark/spark_env.sh 

NUM_WORKERS=2 
CONFIGBUCKET=myhdfsbucket 

Then the resize commands are much shorter:

# Assuming you initially deployed with this command (default n == 2) 
./bdutil -e my_base_env.sh deploy 

# Before this step, edit resize_env.sh and set NEW_NUM_WORKERS to what you want. 
# Currently it defaults to 5. 
# Deploy only the new workers, e.g. {hadoop-w-2, hadoop-w-3, hadoop-w-4}: 
./bdutil -e my_base_env.sh -e resize_env.sh deploy 

# Explicitly start the Hadoop daemons on just the new workers: 
./bdutil -e my_base_env.sh -e resize_env.sh run_command -t workers -- "service hadoop-hdfs-datanode start && service hadoop-mapreduce-tasktracker start" 

# If using Spark as well, explicitly start the Spark daemons on the new workers: 
./bdutil -e my_base_env.sh -e resize_env.sh run_command -t workers -u extensions/spark/start_single_spark_worker.sh -- "./start_single_spark_worker.sh" 

# From now on, it's as if you originally turned up your cluster with "-n 5". 
# When deleting, remember to include those extra workers: 
./bdutil -b myhdfsbucket -n 5 delete 

Finally, this isn't quite 100% the same as if you'd originally deployed the cluster with -n 5; in that case, the files /home/hadoop/hadoop-install/conf/slaves and /home/hadoop/spark-install/conf/slaves on your master node will be missing the new nodes. If you plan on ever using /home/hadoop/hadoop-install/bin/[stop|start]-all.sh or /home/hadoop/spark-install/sbin/[stop|start]-all.sh, you can manually SSH into the master node and edit those files to add the new nodes to the lists; if not, there's no need to change those slaves files.
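If you do want to patch those slaves files, a rough sketch of that manual edit follows (the hadoop-w-N hostnames assume the default bdutil prefix, as in the example above; adjust them to whichever workers you actually added):

# On the master node (e.g. after "gcloud compute ssh hadoop-m"), append each new worker
# to both the Hadoop and Spark slaves files so the stop/start-all.sh scripts pick them up:
for w in hadoop-w-2 hadoop-w-3 hadoop-w-4; do echo "$w" | sudo tee -a /home/hadoop/hadoop-install/conf/slaves /home/hadoop/spark-install/conf/slaves > /dev/null; done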


Awesome! Is your fork still available? Just wondering what the simplest option is for adding a new node with a new disk to an existing bdutil cluster. – 2016-02-01 12:13:50


Actually, we've added `resize_env.sh` to the [base bdutil repo](https://github.com/GoogleCloudPlatform/bdutil/blob/master/extensions/google/experimental/resize_env.sh), so you no longer need to go to my fork for it. – 2016-02-01 20:16:19