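Match the distribution of a modeled dataset to the distribution of an observed dataset?

I have two datasets. Let's assume they look as simple as this: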
observed <- data.frame(
  name   = c("Jenny", "Mark", "James", "Amber", "Jamie"),
  height = c(68, 69, 72, 63, 77),
  mood   = c("content", "content", "melancholy", "happy", "melancholy"))
modeled <- data.frame(
  name   = c("Alex", "Jimmy", "Sal", "Evelyn", "Maria", "George",
             "Hilary", "Donny", "Jose", "Luke", "Leia"),
  height = c(74, 71, 68, 66, 80, 59, 67, 67, 69, 65, 72),
  mood   = c("content", "content", "melancholy", "happy", "melancholy", "content",
             "content", "melancholy", "happy", "melancholy", "happy"))
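I want to select rows from modeled such that the distribution of modeled$height is as close as possible to the distribution of observed$height. I need to keep the rows intact, rather than simply matching the distribution of the height integers. Any insight would be greatly appreciated.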
What do you mean by *as close as possible*? If you filter `modeled` based on `modeled$height %in% observed$height`, then you will get an exact match. Is that what you want? – coffeinjunky
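For reference, the filter described in this comment could be written roughly as below. This is only a sketch on the toy data from the question; `exact` is a name made up here for illustration. Note that it keeps every modeled row whose height value also occurs in observed, so it does not control how often each height appears, and observed heights with no counterpart in modeled (63 and 77 here) are simply not represented.

```r
# Toy data from the question (only the columns needed here).
observed <- data.frame(height = c(68, 69, 72, 63, 77))
modeled <- data.frame(
  name   = c("Alex", "Jimmy", "Sal", "Evelyn", "Maria", "George",
             "Hilary", "Donny", "Jose", "Luke", "Leia"),
  height = c(74, 71, 68, 66, 80, 59, 67, 67, 69, 65, 72)
)

# Keep whole rows of `modeled` whose height also appears in `observed`.
# `exact` is a hypothetical name used only in this sketch.
exact <- modeled[modeled$height %in% observed$height, ]

as.character(exact$name)
# "Sal" "Jose" "Leia"
```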
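These datasets are poor examples for this problem because they are too small. What I want is for the density distribution of the height column to match. –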
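One possible way to approach a density match while keeping rows intact is density-ratio weighted subsampling: estimate the height density in both datasets, weight each modeled row by the ratio of observed to modeled density at its height, and draw whole rows with those weights. The sketch below is an assumption about what might work, not a method confirmed anywhere in this thread. The helper `density_at`, the subset size `n_keep`, and the seed are all hypothetical choices, and as noted above the toy data are far too small for the result to be meaningful.

```r
# Toy data from the question (illustration only; too small for a real density match).
observed <- data.frame(height = c(68, 69, 72, 63, 77))
modeled <- data.frame(
  name   = c("Alex", "Jimmy", "Sal", "Evelyn", "Maria", "George",
             "Hilary", "Donny", "Jose", "Luke", "Leia"),
  height = c(74, 71, 68, 66, 80, 59, 67, 67, 69, 65, 72)
)

# Hypothetical helper: kernel density of `values` on [from, to],
# linearly interpolated at the points `x`.
density_at <- function(values, x, from, to) {
  d <- density(values, from = from, to = to)
  approx(d$x, d$y, xout = x, rule = 2)$y
}

# Evaluate both densities at the modeled heights, over a common range.
rng   <- range(c(observed$height, modeled$height))
f_obs <- density_at(observed$height, modeled$height, rng[1], rng[2])
f_mod <- density_at(modeled$height,  modeled$height, rng[1], rng[2])

# Rows over-represented in `modeled` get small weights, under-represented rows large ones.
w <- f_obs / f_mod

set.seed(1)   # arbitrary seed, only for reproducibility of the sketch
n_keep <- 5   # assumed subset size; choose to suit the real data
idx <- sample(nrow(modeled), size = n_keep, replace = FALSE, prob = w)

# Whole rows are kept, so any other columns (such as mood) would travel with height.
matched <- modeled[idx, ]
```

On real data, the fit could be checked by overlaying `density(matched$height)` on `density(observed$height)`, or with `ks.test(matched$height, observed$height)`.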