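How can I loop over macro variables in PySpark the way I would in SAS?

I want to run the same code over several sets of macro variables, as I would in SAS, and then append all of the resulting tables together. Since I come from a SAS background, I am confused about how to do this in a PySpark environment. Any help is much appreciated!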

Example code is below:

STEP 1: Define the macro variables

lastyear_st=201615 
lastyear_end=201622 

thisyear_st=201715 
thisyear_end=201722

STEP 2: Loop through the various macro variables

customer_spend=sqlContext.sql(""" 
select a.customer_code, 
sum(case when a.week_id between %d and %d then a.spend else 0 end) as spend 
from tableA 
group by a.card_code 
""" 
%(lastyear_st,lastyear_end) 
(thisyear_st,thisyear_end))

STEP 3: Append each of the resulting datasets to the base table

2017-06-17 Cagdas Kanar

import os
import tempfile

# macroVars holds the start and end values as a list of lists,
# where each inner list contains one [start, end] pair.
macroVars = [[201615, 201622], [201715, 201722]]

# Pick the output location once, outside the loop, so that every
# iteration appends to the same dataset.
output_path = os.path.join(tempfile.mkdtemp(), 'data')

# Loop through the list of lists.
for start, end in macroVars:

    # Prepare the query using the values of start and end.
    # The table needs the alias "a", and the GROUP BY column must match
    # the selected non-aggregated column.
    query = """
        SELECT a.customer_code,
               SUM(CASE WHEN a.week_id BETWEEN {} AND {}
                        THEN a.spend
                        ELSE 0 END) AS spend
        FROM tableA a
        GROUP BY a.customer_code
    """.format(start, end)

    # Execute the query (sqlContext is assumed to exist already).
    customer_spend = sqlContext.sql(query)

    # Depending on your base table setup, use the appropriate write
    # command, for example:
    customer_spend.write.mode('append').parquet(output_path)

2017-06-17 17:36:14 Pushkr
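
As an alternative to appending inside the loop, the per-period results could be combined into a single DataFrame before writing. This is only a minimal sketch, assuming Spark 2.x or later (where `DataFrame.union` is available; Spark 1.x uses `unionAll`), an existing `sqlContext`, and a table registered as `tableA`. The helper names `build_query` and `spend_for_periods` are hypothetical, and the extra `period_start` column is an assumption added so that rows from different periods stay distinguishable.

```python
from functools import reduce


# Hypothetical helper: builds the query text for one period.
def build_query(start, end):
    return (
        "SELECT a.customer_code, {start} AS period_start, "
        "SUM(CASE WHEN a.week_id BETWEEN {start} AND {end} "
        "THEN a.spend ELSE 0 END) AS spend "
        "FROM tableA a GROUP BY a.customer_code"
    ).format(start=start, end=end)


# Hypothetical helper: runs one query per period and stacks the results.
def spend_for_periods(sql_context, periods):
    frames = [sql_context.sql(build_query(start, end)) for start, end in periods]
    # DataFrame.union matches columns by position, so every query
    # must return the same columns in the same order.
    return reduce(lambda left, right: left.union(right), frames)
```

It might be called as `combined = spend_for_periods(sqlContext, [[201615, 201622], [201715, 201722]])`, after which `combined` can be written once with whatever write command suits the base table.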

Hi Pushkr, thank you. Can I also use string values in the list? I mean, could it be something like [['a', 'b', 'c'], [1, 2, 'x']] and so on?

Yes, you can use strings as well. – Pushkr
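
To illustrate the point about strings: `str.format` accepts any mix of value types, so each inner list can carry a text label alongside the numeric bounds. In this small sketch the labels `last_year` and `this_year` and the `period` column are made up for illustration. Note that a string meant as a SQL literal has to be wrapped in quotes inside the template.

```python
# Each inner list mixes a string label with integer week bounds.
macroVars = [["last_year", 201615, 201622], ["this_year", 201715, 201722]]

# Named placeholders keep the template readable once there are several values.
template = (
    "SELECT a.customer_code, '{label}' AS period, "
    "SUM(CASE WHEN a.week_id BETWEEN {start} AND {end} "
    "THEN a.spend ELSE 0 END) AS spend "
    "FROM tableA a GROUP BY a.customer_code"
)

# One query string per inner list.
queries = [
    template.format(label=label, start=start, end=end)
    for label, start, end in macroVars
]
```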

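Can I also define a macro variable separately, outside the array, and then reference it inside the array? For example: a = """spend > 0 then 1 else 0 end""" with [[a, 1, 2], [a, 2, 4]]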
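
The idea in the comment above can be sketched as follows: a list element may be a variable that holds a SQL fragment, and the same variable can appear in several inner lists. This is an illustrative assumption rather than a confirmed answer from the thread. The variable `flag_expr` plays the role of `a` in the comment (renamed to avoid confusion with the table alias `a`), the fragment is written as a complete `CASE WHEN ... END` expression, and the `flag` column name is made up.

```python
# A reusable SQL fragment, defined once outside the list.
flag_expr = "CASE WHEN a.spend > 0 THEN 1 ELSE 0 END"

# The same fragment is referenced in every inner list.
macroVars = [[flag_expr, 1, 2], [flag_expr, 2, 4]]

# Substitute the fragment and the bounds into one query per inner list.
queries = [
    "SELECT a.customer_code, {} AS flag FROM tableA a "
    "WHERE a.week_id BETWEEN {} AND {}".format(expr, start, end)
    for expr, start, end in macroVars
]
```

Because the fragment is pasted into the query text verbatim, it should come only from trusted code, not from user input.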
