Grails DuplicateKeyException/NonUniqueObjectException when batch loading within an async promise
I have a large amount of data that I want to load into a database using GORM.
// assumes Grails 2.3+, where grails.transaction.Transactional is available
import grails.transaction.Transactional

class DbLoadingService {
    static transactional = false

    // these are used to expedite the batch loading process
    def sessionFactory
    def propertyInstanceMap = org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP

    // these are example services that will assist in the parsing of the input data
    def auxLoadingServiceA
    def auxLoadingServiceB

    def handleInputFile(String filename) {
        def inputFile = new File(filename)
        // parse each line and process according to record type
        inputFile.eachLine { line, lineNumber ->
            this.handleLine(line, lineNumber)
        }
    }

    @Transactional
    def handleLine(String line, int lineNumber) {
        // do some further parsing of the line, based on its content
        // example here is based on 1st 2 chars of line
        switch (line[0..1]) {
            case 'AA':
                auxLoadingServiceA.doSomethingWithLine(line)
                break
            case 'BB':
                auxLoadingServiceB.doSomethingElseWithLine(line)
                break
            default:
                break
        }
        // flush and clear the session every 100 lines
        if (lineNumber % 100 == 0) cleanUpGorm()
    }

    def cleanUpGorm() {
        def session = sessionFactory.getCurrentSession()
        session.flush()
        session.clear()
        propertyInstanceMap.get().clear()
    }
}
class AuxLoadingServiceA {
    static transactional = false

    def doSomethingWithLine(String line) {
        // do something here
    }
}

class AuxLoadingServiceB {
    static transactional = false

    def doSomethingElseWithLine(String line) {
        // do something else here
    }
}
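I have deliberately made only the top-level service transactional, and only for the load of each line. There are actually many levels of services beneath the top level, not just the single Aux A service layer shown. I therefore do not want to incur the overhead of multiple layers of transactions: I believe I should only need one.
The data model being loaded into the database contains a pair of domain objects with a hasMany/belongsTo relationship. The interaction with these domain objects happens inside the sub-layers and is not shown in my code, to keep the example manageable.
The domain objects that seem to be causing the problem look similar to this: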
class Parent {
    static hasMany = [children: Child]
    static mapping = {
        children lazy: false
        cache true
    }
}

class Child {
    String someValue
    // also contains some other sub-objects
    static belongsTo = [parent: Parent]
    static mapping = {
        parent index: 'parent_idx'
        cache true
    }
}
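The cleanUpGorm() method shown is needed; otherwise the service grinds to a complete halt after a large number of lines.
When I start the database load, everything works exactly as expected:
// Called from within a service/controller
dbLoadingService.handleInputFile("someFile.txt")
However, as soon as I move the load into an asynchronous process like so:
def promise = task {
    dbLoadingService.handleInputFile("someFile.txt")
}
I get a DuplicateKeyException/NonUniqueObjectException: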
error details: org.springframework.dao.DuplicateKeyException: A different object with the same identifier value was already associated with the session : [com.example.SampleDomainObject#1]; nested exception is org.hibernate.NonUniqueObjectException: A different object with the same identifier value was already associated with the session : [com.example.SampleDomainObject#1]
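So my question is: what are the best practices for loading a large amount of data asynchronously into a Grails DB? Does something need to be done about flushing/clearing the session to ensure that in-memory objects stay consistent within the session? Does something need to be done when objects are cached?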
First of all, you shouldn't be doing heavy batch processing like this. Use a real batch framework such as Spring Batch. That said, have you tried using a new Hibernate session for each task? That might help. – 2015-02-10 19:37:38
A new Hibernate session, meaning a new Hibernate session here? inputFile.eachLine { line, lineNumber -> this.handleLine(line, lineNumber) } – John 2015-02-10 19:42:30
Yes, use this: http://grails.github.io/grails-doc/latest/ref/Domain%20Classes/withNewSession.html – 2015-02-10 19:45:06
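The withNewSession suggestion from the comments could be sketched roughly as follows. This is an untested sketch, not a confirmed fix: it assumes Grails 2.3+ with the Hibernate plugin, that the caller is a controller or service with dbLoadingService injected, and that Parent (the domain class from the question) is used only as a convenient entry point to the GORM static method. Whether the per-line @Transactional boundary and cleanUpGorm() behave as intended inside the new session would still need to be verified.

```groovy
import static grails.async.Promises.task

// Assumption: this runs inside a controller or service where
// dbLoadingService has been injected by Grails.
def promise = task {
    // withNewSession runs the closure in a fresh Hibernate session bound to
    // the task's thread, rather than one shared with the calling thread.
    Parent.withNewSession { session ->
        dbLoadingService.handleInputFile("someFile.txt")
    }
}

// Optional: register a callback so failures inside the task are not lost.
promise.onError { Throwable t ->
    log.error("Asynchronous load failed", t)
}
```

The idea is that the asynchronous thread gets its own session, so objects loaded or cached under another session are not re-associated with it, which is the situation the NonUniqueObjectException message describes.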