Asked 2017-02-18

Why can't I connect to the Hive metastore using Apache Spark?

I am trying to connect to Apache Hive from Apache Spark with the help of a Java program. Here is the program:

import org.apache.spark.sql.SparkSession; 

public class queryhive { 

    public static void main(String[] args) 
    { 
        String warehouseLocation = "spark-warehouse"; 

        SparkSession spark = SparkSession 
          .builder() 
          .appName("Java Spark Hive Example") 
          .master("local[*]") 
          .config("spark.sql.warehouse.dir", warehouseLocation) 
          .enableHiveSupport() 
          .getOrCreate(); 
        try 
        { 
            // query the Hive table (note: "health1", as in the log below, not "heath1") 
            spark.sql("select count(*) from health1").show(); 
        } 
        catch (Exception e)   // Spark raises AnalysisException when the table cannot be resolved 
        { 
            System.out.print("\nTable is not found\n"); 
        } 
    } 
} 

I have added the HDFS address and the Hive address to the <properties> tag in Maven's pom.xml.
I want to query the Hive table using Spark, but I cannot see the table; instead I get "Table is not found":

log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell). 
log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
17/02/18 11:30:56 INFO SparkContext: Running Spark version 2.1.0 
17/02/18 11:30:56 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
17/02/18 11:30:56 WARN Utils: Your hostname, aims resolves to a loopback address: 127.0.1.1; using 10.0.0.3 instead (on interface wlp2s0) 
17/02/18 11:30:56 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address 
17/02/18 11:30:56 INFO SecurityManager: Changing view acls to: aims 
17/02/18 11:30:56 INFO SecurityManager: Changing modify acls to: aims 
17/02/18 11:30:56 INFO SecurityManager: Changing view acls groups to: 
17/02/18 11:30:56 INFO SecurityManager: Changing modify acls groups to: 
17/02/18 11:30:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(aims); groups with view permissions: Set(); users with modify permissions: Set(aims); groups with modify permissions: Set() 
17/02/18 11:30:57 INFO Utils: Successfully started service 'sparkDriver' on port 32975. 
17/02/18 11:30:57 INFO SparkEnv: Registering MapOutputTracker 
17/02/18 11:30:57 INFO SparkEnv: Registering BlockManagerMaster 
17/02/18 11:30:57 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 
17/02/18 11:30:57 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 
17/02/18 11:30:57 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-6263f04a-5c65-4dda-9e9a-faafb32a066a 
17/02/18 11:30:57 INFO MemoryStore: MemoryStore started with capacity 335.4 MB 
17/02/18 11:30:57 INFO SparkEnv: Registering OutputCommitCoordinator 
17/02/18 11:30:58 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
17/02/18 11:30:58 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.0.3:4040 
17/02/18 11:30:58 INFO Executor: Starting executor ID driver on host localhost 
17/02/18 11:30:58 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43772. 
17/02/18 11:30:58 INFO NettyBlockTransferService: Server created on 10.0.0.3:43772 
17/02/18 11:30:58 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 
17/02/18 11:30:58 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.0.0.3, 43772, None) 
17/02/18 11:30:58 INFO BlockManagerMasterEndpoint: Registering block manager 10.0.0.3:43772 with 335.4 MB RAM, BlockManagerId(driver, 10.0.0.3, 43772, None) 
17/02/18 11:30:58 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.0.0.3, 43772, None) 
17/02/18 11:30:58 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.0.0.3, 43772, None) 
17/02/18 11:30:58 INFO SharedState: Warehouse path is 'hdfs://localhost:8020/user/hive/warehouse/default.db/spark-warehouse'. 
17/02/18 11:30:58 INFO HiveUtils: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes. 
17/02/18 11:30:59 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore 
17/02/18 11:30:59 INFO ObjectStore: ObjectStore, initialize called 
17/02/18 11:31:00 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored 
17/02/18 11:31:00 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored 
17/02/18 11:31:02 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order" 
17/02/18 11:31:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 
17/02/18 11:31:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 
17/02/18 11:31:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 
17/02/18 11:31:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 
17/02/18 11:31:03 INFO Query: Reading in results for query "[email protected]" since the connection used is closing 
17/02/18 11:31:03 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY 
17/02/18 11:31:03 INFO ObjectStore: Initialized ObjectStore 
17/02/18 11:31:05 INFO HiveMetaStore: Added admin role in metastore 
17/02/18 11:31:05 INFO HiveMetaStore: Added public role in metastore 
17/02/18 11:31:05 INFO HiveMetaStore: No user is added in admin role, since config is empty 
17/02/18 11:31:05 INFO HiveMetaStore: 0: get_all_databases 
17/02/18 11:31:05 INFO audit: ugi=aims ip=unknown-ip-addr cmd=get_all_databases 
17/02/18 11:31:05 INFO HiveMetaStore: 0: get_functions: db=default pat=* 
17/02/18 11:31:05 INFO audit: ugi=aims ip=unknown-ip-addr cmd=get_functions: db=default pat=* 
17/02/18 11:31:05 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table. 
17/02/18 11:31:06 INFO SessionState: Created local directory: /tmp/cac4110a-ebb3-47a6-b21e-682a12724ba2_resources 
17/02/18 11:31:06 INFO SessionState: Created HDFS directory: /tmp/hive/aims/cac4110a-ebb3-47a6-b21e-682a12724ba2 
17/02/18 11:31:06 INFO SessionState: Created local directory: /tmp/aims/cac4110a-ebb3-47a6-b21e-682a12724ba2 
17/02/18 11:31:06 INFO SessionState: Created HDFS directory: /tmp/hive/aims/cac4110a-ebb3-47a6-b21e-682a12724ba2/_tmp_space.db 
17/02/18 11:31:06 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is hdfs://localhost:8020/user/hive/warehouse/default.db/spark-warehouse 
17/02/18 11:31:06 INFO HiveMetaStore: 0: get_database: default 
17/02/18 11:31:06 INFO audit: ugi=aims ip=unknown-ip-addr cmd=get_database: default 
17/02/18 11:31:06 INFO HiveMetaStore: 0: get_database: global_temp 
17/02/18 11:31:06 INFO audit: ugi=aims ip=unknown-ip-addr cmd=get_database: global_temp 
17/02/18 11:31:06 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException 
17/02/18 11:31:06 INFO SparkSqlParser: Parsing command: select count(*) from health1 
17/02/18 11:31:08 INFO HiveMetaStore: 0: get_table : db=default tbl=health1 
17/02/18 11:31:08 INFO audit: ugi=aims ip=unknown-ip-addr cmd=get_table : db=default tbl=health1 
17/02/18 11:31:08 INFO HiveMetaStore: 0: get_table : db=default tbl=health1 
17/02/18 11:31:08 INFO audit: ugi=aims ip=unknown-ip-addr cmd=get_table : db=default tbl=health1 

Table is not found 
17/02/18 11:31:08 INFO SparkContext: Invoking stop() from shutdown hook 
17/02/18 11:31:08 INFO SparkUI: Stopped Spark web UI at http://10.0.0.3:4040 
17/02/18 11:31:08 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 
17/02/18 11:31:08 INFO MemoryStore: MemoryStore cleared 
17/02/18 11:31:08 INFO BlockManager: BlockManager stopped 
17/02/18 11:31:08 INFO BlockManagerMaster: BlockManagerMaster stopped 
17/02/18 11:31:08 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 
17/02/18 11:31:08 INFO SparkContext: Successfully stopped SparkContext 
17/02/18 11:31:08 INFO ShutdownHookManager: Shutdown hook called 
17/02/18 11:31:08 INFO ShutdownHookManager: Deleting directory /tmp/spark-ea93f7ec-6151-43e9-b5d9-bedbba537d62 

I am using Apache Hive 1.2.0 and Spark 2.1.0.
I believe the problem is due to a version mismatch. I am using Eclipse Neon as the IDE. Kindly let me know why I am facing this problem and how I can resolve it.
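One detail worth checking in the log above: the line "MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY" indicates that Spark started its own embedded Derby-backed metastore rather than connecting to the existing Hive metastore, so tables created from the Hive CLI are invisible to it. Connecting to the real metastore usually requires a hive-site.xml on Spark's classpath (e.g. in $SPARK_HOME/conf/). A minimal sketch, assuming the metastore Thrift service runs on localhost:9083 (adjust host and port to your setup):

```xml
<!-- hive-site.xml: place on Spark's classpath, e.g. $SPARK_HOME/conf/ -->
<configuration>
  <!-- Address of the running Hive metastore Thrift service (assumed port) -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
  </property>
  <!-- Warehouse directory on HDFS, matching the path in the log above -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>hdfs://localhost:8020/user/hive/warehouse</value>
  </property>
</configuration>
```

With this in place, enableHiveSupport() should pick up the existing metastore; without it, Spark falls back to a local Derby instance that knows nothing about tables created elsewhere.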

Answer


You need to specify the schema name, either as SELECT * FROM schemaName.tableName or as shown below:

try 
{ 
     spark.sql("use schemaName");   // name of the schema 
     spark.sql("select count(*) from health1").show(); 
} 
catch (Exception e)   // Spark raises AnalysisException when the table cannot be resolved 
{ 
    System.out.print("\nTable is not found\n"); 
} 
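Equivalently, the schema can be qualified inline instead of issuing a separate "use" statement. A sketch assuming the table lives in the default database:

```sql
-- one-statement form: qualify the table with its database name
SELECT COUNT(*) FROM default.health1;
```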
Comment: Tried it.. not working.

Comment: What are the contents of the HDFS location /user/hive/warehouse? Can you find your schema, and then the table, under that location?

Comment: It's a file, comment.csv, which contains all the data.
