案例:为什么非线性规划(NLP)解算器Rsolnp中的目标函数不符合?
我有3个地区(巴西,新西兰,美国)的宇宙(我的实际问题是更大 - 31个区)。这三个地区通过迁移相连。例如,如果有10人从巴西转移到美国(BRZ-USA),我们就有移民到美国(人流入)和从巴西移民(人流出)。我有一个关于给定宇宙中所有可能的迁移流的迁移率数据集(3 * 2 = 6)。另外,我还有每个地区的人口数据集。当我将迁移率乘以人口时,我获得了移民数量。然后我可以计算每个地区的移民人数和移民人数。从移民中扣除移民导致净移民数量(可以是正数或负数)。然而,由于我们有一个平衡的制度(每个地区的流入流量相等),所有地区的净流动人口总和应为零。除了净移民率和人口外,我还会从每个地区假设的未来情景中获得净移民数量。但情景净移民数量与我可以从我的数据计算的数量不同。因此,我想通过增加或减少一个固定数字来扩大和缩小6个迁移率,以使得到的净迁移数量符合方案值。我使用非线性规划(NLP)解算器Rsolnp来完成此任务(请参阅下面的示例脚本)。
问题:
我已指定在最小二乘方程的形式的目标函数,因为它是迫使6个定标器以尽可能接近零越好目标。另外,我正在使用等式约束函数来满足场景值。这一切都正常工作,解决方案提供了可以添加到迁移率的标量,从而导致迁移计数与场景值完美匹配(请参阅脚本部分“测试目标是否已达到”)。但是,我还想将权重(变量:w)应用于目标函数,以便某些标量的较高值受到较大的惩罚。但是,无论我如何指定权重,我总是会获得相同的解决方案(请参阅“不同权重的示例结果”)。所以看起来求解器不遵守目标函数。有没有人有一个想法,为什么是这种情况,以及如何改变目标函数,以便使用权重是可能的?非常感谢您的帮助!
library(Rsolnp)
# Regions
regUAll=c("BRZ","NZL","USA") # "BRZ"=Brazil; "NZL"=New Zealand; "USA"=United States
#* Generate unique combinations of regions
uCombi=expand.grid(regUAll,regUAll,stringsAsFactors=F)
uCombi=uCombi[uCombi$Var1!=uCombi$Var2,] # remove same region combination (e.g., BRZ-BRZ)
uCombi=paste(uCombi$Var2,uCombi$Var1,sep="-")
#* Generate data frames
# Migration rates - rows represent major age groups (row1=0-25 years, row2=26-50 years, row3=51-75 years)
dfnm=data.frame(matrix(rep(c(0.01,0.04,0.02),length(uCombi)),ncol=length(uCombi),nrow=3),stringsAsFactors=F) # generate empty df
names(dfnm)=uCombi # assign variable names
# Population (number of people) in region of origin
pop=c(rep(c(20,40,10),2),rep(c(4,7,2),2),rep(c(30,70,50),2))
dfpop=data.frame(matrix(pop,ncol=length(uCombi),nrow=3),stringsAsFactors=F) # generate empty df
names(dfpop)=uCombi # assign variable names
#* Objective function for optimization
# Note: Least squares method to keep the additive scalers as close to 0 as possible
# The sum expression allows for flexible numbers of scalars to be included but is identical to: w[1](scal[1]-0)^2+w[2](scal[2]-0)^2+w[3](scal[3]-0)^2+w[4](scal[4]-0)^2+w[5](scal[5]-0)^2+w[6](scal[6]-0)^2
f.main=function(scal,nScal,w,dfnm,dfpop,regUAll){
sum(w*(scal[1:nScal]-0)^2)
}
#* Equality contraint function
f.equal=function(scal,nScal,w,dfnm,dfpop,regUAll){
#* Adjust net migration rates by scalar
for(s in 1:nScal){
dfnm[,s]=dfnm[,s]+scal[s]
}
#* Compute migration population from data
nmp=sapply(dfpop*dfnm,sum) # sums migration population across age groups
nmd=numeric(length(regUAll)); names(nmd)=regUAll # generate named vector to be filled with values
for(i in 1:length(regUAll)){
colnEm=names(nmp)[grep(paste0("^",regUAll[i],"-.*"),names(nmp))] # emigration columns
colnIm=names(nmp)[grep(paste0("^.*","-",regUAll[i],"$"),names(nmp))] # immigration columns
nmd[regUAll[i]]=sum(nmp[colnIm])-sum(nmp[colnEm]) # compute net migration population = immigration - emigration
}
nmd=nmd[1:(length(nmd)-1)] # remove the last equality constraint value - not needed because we have a closed system in which global net migration=0
return(nmd)
}
#* Set optimization parameters
cpar2=list(delta=1,tol=1,outer.iter=10,trace=1) # optimizer settings
nScal=ncol(dfnm) # number of scalars to be used
initScal=rep(0,nScal) # initial values of additive scalars
lowScal=rep(-1,nScal) # lower bounds on scalars
highScal=rep(1,nScal) # upper bounds on scalars
nms=c(-50,10) # target values: BRZ=-50, NZL=10, USA=40; last target value does not need to be included since we deal with a closed system in which global net migration sums to 0
w=c(1,1,1,1,1,1) # unity weights
#w=c(1,1,2,2,1,1) # double weight on NZL
#w=c(5,1,2,7,1,0.5) # mixed weights
#* Perform optimization using solnp
solRes=solnp(initScal,fun=f.main,eqfun=f.equal,eqB=nms,LB=lowScal,UB=highScal,control=cpar2,
nScal=nScal,w=w,dfnm=dfnm,dfpop=dfpop,regUAll=regUAll)
scalSol=solRes$pars # return optimized values of scalars
# Example results for different weights
#[1] 0.101645349 0.110108019 -0.018876993 0.001571639 -0.235945755 -0.018134294 # w=c(1,1,1,1,1,1)
#[1] 0.101645349 0.110108019 -0.018876993 0.001571639 -0.235945755 -0.018134294 # w=c(1,1,2,2,1,1)
#[1] 0.101645349 0.110108019 -0.018876993 0.001571639 -0.235945755 -0.018134294 # w=c(5,1,2,7,1,0.5)
#*** Test if target was reached
# Adjust net migration rates using the optimized scalars
for(s in 1:nScal){
dfnm[,s]=dfnm[,s]+scalSol[s]
}
# Compute new migration population
nmp=sapply(dfpop*dfnm,sum) # sums migration population across age groups
nmd=numeric(length(regUAll)); names(nmd)=regUAll # generate named vector to be filled with values
for(i in 1:length(regUAll)){
colnEm=names(nmp)[grep(paste0("^",regUAll[i],"-.*"),names(nmp))] # emigration columns
colnIm=names(nmp)[grep(paste0("^.*","-",regUAll[i],"$"),names(nmp))] # immigration columns
nmd[regUAll[i]]=sum(nmp[colnIm])-sum(nmp[colnEm]) # compute net migration population = immigration - emigration
}
nmd # should be -50,10,40 if scalars work correctly
尝试不同于零的起始值('initScal');对于所有w,和(w * 0^2)= 0。 – 2014-12-01 19:10:10
伟大的建议@NealFultz!当将初始值设置为不同于零的值时,权重实际上就起作用。如果您将答案作为普通帖子发布,我可以接受。唯一令我担心的是初始值的值对所得到的标量值有很大影响。 – Raphael 2014-12-02 16:18:16