How do I preserve column data types after calling createOrReplaceTempView in PySpark?
I have a DataFrame whose column data types are shown below:
orders.printSchema()
root
|-- order_id: long (nullable = true)
|-- user_id: long (nullable = true)
|-- eval_set: string (nullable = true)
|-- order_number: short (nullable = true)
|-- order_dow: short (nullable = true)
|-- order_hour_of_day: short (nullable = true)
|-- days_since_prior_order: short (nullable = true)
But when I register it as a table, all of the data types change to string.
orders.createOrReplaceTempView("orders")
spark.sql("describe orders").show()
+--------------------+---------+-------+
| col_name|data_type|comment|
+--------------------+---------+-------+
| order_id| string| |
| user_id| string| |
| eval_set| string| |
| order_number| string| |
| order_dow| string| |
| order_hour_of_day| string| |
|days_since_prior_...| string| |
+--------------------+---------+-------+
So how can I keep the DataFrame's original types when registering it as a table in PySpark?
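One possible workaround, offered only as a sketch: if the view really does report every column as string, the columns could be cast back explicitly in SQL. The helper below, `build_cast_query`, is a hypothetical name and not part of PySpark; it takes (column, type) pairs in the shape that `DataFrame.dtypes` returns and builds a SELECT statement with one CAST per column. The type names are the Spark SQL spellings of the types in the printSchema() output above (long is bigint, short is smallint). It assumes the column names contain no backticks.

```python
def build_cast_query(dtypes, view_name):
    """Build a SELECT that casts every column to its recorded type.

    dtypes: list of (column_name, spark_sql_type) pairs, the same shape
            that DataFrame.dtypes returns.
    view_name: name of the temp view to read from.
    """
    casts = [
        f"CAST(`{name}` AS {dtype}) AS `{name}`"
        for name, dtype in dtypes
    ]
    return f"SELECT {', '.join(casts)} FROM {view_name}"


# Column types copied from the printSchema() output in the question,
# written with Spark SQL type names (long -> bigint, short -> smallint).
orders_dtypes = [
    ("order_id", "bigint"),
    ("user_id", "bigint"),
    ("eval_set", "string"),
    ("order_number", "smallint"),
    ("order_dow", "smallint"),
    ("order_hour_of_day", "smallint"),
    ("days_since_prior_order", "smallint"),
]

query = build_cast_query(orders_dtypes, "orders")
print(query)

# With a live session (assumed, not created here), the query could be used as:
#   typed = spark.sql(query)
#   typed.printSchema()
```

Before casting, it may be worth comparing orders.printSchema() with spark.table("orders").printSchema(), since a temp view is normally expected to carry the same schema as the DataFrame it was created from; a mismatch would suggest the DataFrame was changed or re-read somewhere before registration.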