2015-04-06 45 views

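Conditional imputation in SAS

My data is a list of schools, their performance on assessments in certain subjects, and the percentage of each gender enrolled in the course. I created a sample data set below. As you can see, no gender percentages were collected for school Y.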

data have; 
    input school $ subject $ perc_male perc_female score similar_school $; 
datalines; 
X math 51 49 93 Y 
X english 48 52 95 Y 
X tech 60 40 90 Y 
X science 57 43 92 Y 
Y math . . 87 X 
Y english . . 83 X 
Y science . . 81 X 
Y language . . 91 X 
Z math 40 60 78 Z 
Z english 50 50 76 Z 
Z science 45 55 80 Z 
; 
run; 

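Research shows that school X has a very similar gender distribution, so I want to impute the subject-specific percentages from X to Y. The other problem is that Y has a language score, whereas X did not take that assessment. In this case, I would like to use the mean of the imputed values (51, 48, 57) to get 52 as the percentage of male language takers.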
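As a quick check of that arithmetic, here is a minimal sketch, assuming the `have` data set above. It averages X's percentages over only the subjects that Y also took, which leaves out tech. The column names `mean_male` and `mean_female` are made up for illustration.

```sas
proc sql;
    /* Average X's percentages over the subjects Y was also assessed in. */
    /* X rows kept: math, english, science (tech is not a Y subject).    */
    select mean(perc_male)   as mean_male,   /* (51 + 48 + 57) / 3 = 52 */
           mean(perc_female) as mean_female  /* (49 + 52 + 43) / 3 = 48 */
    from have
    where school = 'X'
      and subject in (select subject from have where school = 'Y');
quit;
```

These are the values I would expect to see for Y's language row.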

To illustrate, this is my desired output:

data want; 
    input school $ subject $ perc_male perc_female score similar_school $; 
datalines; 
X math 51 49 93 Y 
X english 48 52 95 Y 
X tech 60 40 90 Y 
X science 57 43 92 Y 
Y math 51 49 87 X 
Y english 48 52 83 X 
Y science 57 43 81 X 
Y language 52 48 91 X 
Z math 40 60 78 Z 
Z english 50 50 76 Z 
Z science 45 55 80 Z 
; 
run; 

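I got a downvote, so I am adding what I have tried, which gets me almost where I need to be. To whoever downvoted, I would like to know if you have any constructive feedback. Thanks! I would like to know whether there is a way to build the mean imputation part into my current snippet. I also suspect there is a more efficient way to do this. Any help would be greatly appreciated.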

proc sql; 
    select distinct cats("'",similar_school,"'") into :school_list separated by ',' 
    from have 
    where perc_male=.; 
quit; 

proc sql; 
    create table stuff as 
    select similar_school as school, subject, perc_male, perc_female 
    from have 
    where school in (&school_list.); 
quit; 

proc sql; 
    create table want2 as 
    select a.school, a.subject, coalesce(a.perc_male,b.perc_male) as perc_male, coalesce(a.perc_female,b.perc_female) as perc_female, a.score, a.similar_school 
    from have as a 
    left join stuff as b 
     on a.school=b.school and a.subject=b.subject 
    ; 
quit; 

Answers

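Based on your expected data, plain SQL can solve your problem. First, self-join the table on school and similar school, and coalesce the perc_male and perc_female values. That takes care of your first problem. For the second part, compute the mean for each school and coalesce perc_male and perc_female with the corresponding school mean. Take a look at the SQL below and let me know if it helps.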

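proc sql; 
create table want as 
/* Step 2: fill anything still missing with the school's own mean.  */
/* mean() ignores missing values; SAS remerges the group mean onto  */
/* every row of the school because of the GROUP BY.                 */
select aa.school 
    , aa.subject 
    , coalesce(aa.perc_male, mean(aa.perc_male)) as perc_male 
    , coalesce(aa.perc_female,mean(aa.perc_female)) as perc_female 
    , aa.score 
    , aa.similar_school 
from (
     /* Step 1: self-join each row to the same subject at its */
     /* similar school and borrow that school's percentages.  */
     select a.school 
      , a.subject 
      , coalesce(a.perc_male ,b.perc_male) as perc_male 
      , coalesce(a.perc_female,b.perc_female) as perc_female 
      , a.score 
      , a.similar_school 
     from have as a 
     left join have as b 
       on b.school=a.similar_school 
      and a.subject=b.subject 
    ) as aa 
group by aa.school 
; 
quit; 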
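To confirm that nothing is left unimputed, a check along these lines might help. It is only a sketch, assuming the `want` table created above; the column names `missing_male` and `missing_female` are made up for illustration.

```sas
proc sql;
    /* Count the remaining missing percentages per school. */
    select school
        , nmiss(perc_male)   as missing_male
        , nmiss(perc_female) as missing_female
    from want
    group by school;
quit;
```

With the sample data, every school should show zero in both columns. A nonzero count would mean that neither the similar school nor the school's own rows supplied any value to borrow or average.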

Thank you, sushil. This performs the imputation I need. – pyll 2015-04-06 20:37:57