Is there any synchronization cost to this, calling a synchronized method from a synchronized method?
synchronized void x() {
y();
}
synchronized void y() {
}
versus this:
synchronized void x() {
y();
}
void y() {
}
Yes, there is an extra performance cost, unless and until the JVM inlines the call to y(), which a modern JIT compiler will do in fairly short order. First, consider the case in which y() is visible outside the class. In that case, the JVM must check on entering y() that it can enter the object's monitor; this check will always succeed when the call comes from x(), but it cannot be skipped, because the call could come from a client outside the class. The cost of this additional check is small.
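To make that monitor check visible, here is a small sketch (the class name, the recorded markers, and the timing values are mine, for illustration only): while one thread holds the object's monitor inside x(), a direct call to y() from a second thread has to wait for the monitor.

```java
// Sketch: a direct external call to y() must acquire the monitor,
// so it blocks while another thread holds it inside x().
public class MonitorCheckDemo {
    public static final StringBuilder order = new StringBuilder();

    synchronized void x() throws InterruptedException {
        order.append("x-enter;");
        Thread.sleep(200);      // hold the monitor for a while
        order.append("x-exit;");
    }

    synchronized void y() {
        order.append("y;");     // runs only after x() releases the monitor
    }

    public static void main(String[] args) {
        MonitorCheckDemo d = new MonitorCheckDemo();
        Thread t = new Thread(() -> {
            try { d.x(); } catch (InterruptedException ignored) { }
        });
        t.start();
        try {
            Thread.sleep(50);   // let t enter x() first
            d.y();              // blocks until t leaves x()
            t.join();
        } catch (InterruptedException ignored) { }
        System.out.println(order); // typically x-enter;x-exit;y;
    }
}
```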
Additionally, consider the case in which y() is private. In that case, the compiler still does not optimize away the synchronization; see the following disassembly of an empty y():
private synchronized void y();
flags: ACC_PRIVATE, ACC_SYNCHRONIZED
Code:
stack=0, locals=1, args_size=1
0: return
According to the spec's definition of synchronized, each entry into a synchronized block or method performs a lock action on the object, and leaving performs an unlock action. No other thread can acquire that object's monitor unless the lock counter goes down to zero. Presumably, some sort of static analysis could demonstrate that a private synchronized method is only ever called from other synchronized methods, but Java's multi-source-file support would make that fragile at best, even ignoring reflection. This means that the JVM must still increment the counter on entering y(). The spec has this to say about method invocation and monitor exit on return:

Monitor entry on method invocation, and monitor exit on method return, are handled implicitly by the Java Virtual Machine's method invocation and return instructions, as if monitorenter and monitorexit were used.
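The lock counter the spec describes is not directly observable on an intrinsic monitor, but java.util.concurrent.locks.ReentrantLock exposes an equivalent hold count. A minimal sketch (class and field names are mine, for illustration only) showing the counter being incremented on re-entry:

```java
import java.util.concurrent.locks.ReentrantLock;

public class HoldCountDemo {
    static final ReentrantLock lock = new ReentrantLock();
    public static int countInX, countInY;

    public static void x() {
        lock.lock();
        try {
            countInX = lock.getHoldCount(); // first acquisition: 1
            y();
        } finally {
            lock.unlock();
        }
    }

    static void y() {
        lock.lock();                        // re-entry: counter incremented, no blocking
        try {
            countInY = lock.getHoldCount(); // now 2
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        x();
        System.out.println(countInX + " " + countInY); // prints "1 2"
    }
}
```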
@AmolSonawane correctly notes that the JVM may optimize this code at runtime by performing lock coarsening, essentially inlining the y() method. In that case, once the JVM has decided to perform the JIT optimization, calls from x() to y() will not incur any extra performance cost, but of course calls to y() directly from any other location will still need to acquire the monitor separately.
In the case where both methods are synchronized, you would be locking the monitor twice, so the first approach carries the extra overhead of locking again. But your JVM can reduce the cost of locking by lock coarsening and may inline the call to y().
The lock doesn't need to be acquired if you already hold it... – assylias
That is not true: if both methods are synchronized and non-static, no extra locking is required. –
"A thread t may lock a particular monitor multiple times; each unlock reverses the effect of one lock operation." – Java Language Specification 17.1 –
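The reentrancy that JLS quote describes can be observed directly with Thread.holdsLock(); a small sketch (names are mine) showing that a synchronized method can call another synchronized method on the same object without deadlocking:

```java
public class ReentrancyDemo {
    public static boolean heldInY;

    synchronized void x() {
        y();                                 // monitor already held...
    }

    synchronized void y() {
        heldInY = Thread.holdsLock(this);    // ...and re-entered: no deadlock
    }

    public static void main(String[] args) {
        new ReentrancyDemo().x();
        System.out.println(heldInY);         // prints "true"
    }
}
```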
Why not test it? I ran a quick benchmark. The benchMark() method is called in a loop for warm-up. This may not be very accurate, but it does show some consistent and interesting patterns.
public class Test {
public static void main(String[] args) {
for (int i = 0; i < 100; i++) {
System.out.println("+++++++++");
benchMark();
}
}
static void benchMark() {
Test t = new Test();
long start = System.nanoTime();
for (int i = 0; i < 100; i++) {
t.x();
}
System.out.println("Double sync:" + (System.nanoTime() - start)/1e6);
start = System.nanoTime();
for (int i = 0; i < 100; i++) {
t.x1();
}
System.out.println("Single sync:" + (System.nanoTime() - start)/1e6);
}
synchronized void x() {
y();
}
synchronized void y() {
}
synchronized void x1() {
y1();
}
void y1() {
}
}
Results (last 10):
+++++++++
Double sync:0.021686
Single sync:0.017861
+++++++++
Double sync:0.021447
Single sync:0.017929
+++++++++
Double sync:0.021608
Single sync:0.016563
+++++++++
Double sync:0.022007
Single sync:0.017681
+++++++++
Double sync:0.021454
Single sync:0.017684
+++++++++
Double sync:0.020821
Single sync:0.017776
+++++++++
Double sync:0.021107
Single sync:0.017662
+++++++++
Double sync:0.020832
Single sync:0.017982
+++++++++
Double sync:0.021001
Single sync:0.017615
+++++++++
Double sync:0.042347
Single sync:0.023859
It looks like the second variant is indeed slightly faster.
A micro benchmark run with jmh:
Benchmark Mean Mean error Units
c.a.p.SO18996783.syncOnce 21.003 0.091 nsec/op
c.a.p.SO18996783.syncTwice 20.937 0.108 nsec/op
=> No statistically significant difference in the results.
Looking at the generated assembly shows that lock coarsening has been performed and that y_sync has been inlined into x_sync, even though it is synchronized.
Full results:
Benchmarks:
# Running: com.assylias.performance.SO18996783.syncOnce
Iteration 1 (5000ms in 1 thread): 21.049 nsec/op
Iteration 2 (5000ms in 1 thread): 21.052 nsec/op
Iteration 3 (5000ms in 1 thread): 20.959 nsec/op
Iteration 4 (5000ms in 1 thread): 20.977 nsec/op
Iteration 5 (5000ms in 1 thread): 20.977 nsec/op
Run result "syncOnce": 21.003 ±(95%) 0.055 ±(99%) 0.091 nsec/op
Run statistics "syncOnce": min = 20.959, avg = 21.003, max = 21.052, stdev = 0.044
Run confidence intervals "syncOnce": 95% [20.948, 21.058], 99% [20.912, 21.094]
Benchmarks:
com.assylias.performance.SO18996783.syncTwice
Iteration 1 (5000ms in 1 thread): 21.006 nsec/op
Iteration 2 (5000ms in 1 thread): 20.954 nsec/op
Iteration 3 (5000ms in 1 thread): 20.953 nsec/op
Iteration 4 (5000ms in 1 thread): 20.869 nsec/op
Iteration 5 (5000ms in 1 thread): 20.903 nsec/op
Run result "syncTwice": 20.937 ±(95%) 0.065 ±(99%) 0.108 nsec/op
Run statistics "syncTwice": min = 20.869, avg = 20.937, max = 21.006, stdev = 0.052
Run confidence intervals "syncTwice": 95% [20.872, 21.002], 99% [20.829, 21.045]
There will be no difference in contention: threads only contend to acquire the lock at x(). A thread that has acquired the lock at x() can acquire the lock at y() without any contention (because it is the only thread that can reach that point at a given time). So placing synchronized there has no effect.
The test can be found below (you have to guess what some methods do, but nothing complicated):
It tests each variant with 100 threads and starts counting the averages after 70% of them have completed (as warm-up).
It prints once at the end.
public static final class Test {
final int iterations = 100;
final int jiterations = 1000000;
final int count = (int) (0.7 * iterations);
final AtomicInteger finishedSingle = new AtomicInteger(iterations);
final AtomicInteger finishedZynced = new AtomicInteger(iterations);
final MovingAverage.Cumulative singleCum = new MovingAverage.Cumulative();
final MovingAverage.Cumulative zyncedCum = new MovingAverage.Cumulative();
final MovingAverage singleConv = new MovingAverage.Converging(0.5);
final MovingAverage zyncedConv = new MovingAverage.Converging(0.5);
// -----------------------------------------------------------
// -----------------------------------------------------------
public static void main(String[] args) {
final Test test = new Test();
for (int i = 0; i < test.iterations; i++) {
test.benchmark(i);
}
Threads.sleep(1000000);
}
// -----------------------------------------------------------
// -----------------------------------------------------------
void benchmark(int i) {
Threads.async(()->{
long start = System.nanoTime();
for (int j = 0; j < jiterations; j++) {
a();
}
long elapsed = System.nanoTime() - start;
int v = this.finishedSingle.decrementAndGet();
if (v <= count) {
singleCum.add (elapsed);
singleConv.add(elapsed);
}
if (v == 0) {
System.out.println(elapsed);
System.out.println("Single Cum:\t\t" + singleCum.val());
System.out.println("Single Conv:\t" + singleConv.val());
System.out.println();
}
});
Threads.async(()->{
long start = System.nanoTime();
for (int j = 0; j < jiterations; j++) {
az();
}
long elapsed = System.nanoTime() - start;
int v = this.finishedZynced.decrementAndGet();
if (v <= count) {
zyncedCum.add(elapsed);
zyncedConv.add(elapsed);
}
if (v == 0) {
// Just to avoid the output not overlapping with the one above
Threads.sleep(500);
System.out.println();
System.out.println("Zynced Cum: \t" + zyncedCum.val());
System.out.println("Zynced Conv:\t" + zyncedConv.val());
System.out.println();
}
});
}
synchronized void a() { b(); }
void b() { c(); }
void c() { d(); }
void d() { e(); }
void e() { f(); }
void f() { g(); }
void g() { h(); }
void h() { i(); }
void i() { }
synchronized void az() { bz(); }
synchronized void bz() { cz(); }
synchronized void cz() { dz(); }
synchronized void dz() { ez(); }
synchronized void ez() { fz(); }
synchronized void fz() { gz(); }
synchronized void gz() { hz(); }
synchronized void hz() { iz(); }
synchronized void iz() {}
}
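The Threads helper class used above is not shown in the post; a minimal sketch of what it presumably does (this is my guess, not the author's actual code):

```java
// Hedged sketch of the Threads helper assumed by the test above:
// async runs a task on a freshly started thread, sleep wraps
// Thread.sleep without the checked exception.
final class Threads {
    // Run a task asynchronously on a new thread.
    static void async(Runnable task) {
        new Thread(task).start();
    }

    // Sleep, restoring the interrupt flag instead of throwing.
    static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```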
Basically, MovingAverage.Cumulative add does (performed atomically): average = (average * n + number) / (++n);
MovingAverage.Converging can be looked up, but uses another formula.
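The MovingAverage.Cumulative class is not shown in the post either; a minimal sketch implementing the stated formula (the class name and structure are assumed):

```java
// Minimal sketch of MovingAverage.Cumulative: an incremental mean,
// average = (average * n + number) / (++n), updated atomically
// by synchronizing both accessors.
public class CumulativeAverage {
    private double average = 0.0;
    private long n = 0;

    public synchronized void add(double number) {
        average = (average * n + number) / (++n);
    }

    public synchronized double val() {
        return average;
    }

    public static void main(String[] args) {
        CumulativeAverage avg = new CumulativeAverage();
        avg.add(10);
        avg.add(20);
        avg.add(30);
        System.out.println(avg.val()); // prints 20.0
    }
}
```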
The results after 50 seconds of warm-up:
With: jiterations -> 1 million
Zynced Cum: 3.2017985649516254E11
Zynced Conv: 8.11945143126507E10
Single Cum: 4.747368153507841E11
Single Conv: 8.277793176290959E10
Those are nanosecond averages. That's really nothing, and the results even suggest that the zynced variant takes less time.
With: jiterations -> original * 10 (this takes much longer)
Zynced Cum: 7.462005651190714E11
Zynced Conv: 9.03751742946726E11
Single Cum: 9.088230941676143E11
Single Conv: 9.09877020004914E11
As you can see, the results show that there is really not much of a difference. The zynced variant actually had the lower average time over the last 30% of completions.
With one thread each (iterations = 1) and jiterations = original * 100:
Zynced Cum: 6.9167088486E10
Zynced Conv: 6.9167088486E10
Single Cum: 6.9814404337E10
Single Conv: 6.9814404337E10
In a single-threaded environment (the Threads.async calls removed):
With: jiterations -> original * 10
Single Cum: 2.940499529542545E8
Single Conv: 5.0342450600964054E7
Zynced Cum: 1.1930525617915475E9
Zynced Conv: 6.672312498662484E8
The zynced variant seems to be slower here, by roughly a factor of 10. A possible reason is that the zynced loop runs after the single loop every time; who knows. I didn't have the energy to try it the other way around.
The last test was run with:
public static final class Test {
final int iterations = 100;
final int jiterations = 10000000;
final int count = (int) (0.7 * iterations);
final AtomicInteger finishedSingle = new AtomicInteger(iterations);
final AtomicInteger finishedZynced = new AtomicInteger(iterations);
final MovingAverage.Cumulative singleCum = new MovingAverage.Cumulative();
final MovingAverage.Cumulative zyncedCum = new MovingAverage.Cumulative();
final MovingAverage singleConv = new MovingAverage.Converging(0.5);
final MovingAverage zyncedConv = new MovingAverage.Converging(0.5);
// -----------------------------------------------------------
// -----------------------------------------------------------
public static void main(String[] args) {
final Test test = new Test();
for (int i = 0; i < test.iterations; i++) {
test.benchmark(i);
}
Threads.sleep(1000000);
}
// -----------------------------------------------------------
// -----------------------------------------------------------
void benchmark(int i) {
long start = System.nanoTime();
for (int j = 0; j < jiterations; j++) {
a();
}
long elapsed = System.nanoTime() - start;
int s = this.finishedSingle.decrementAndGet();
if (s <= count) {
singleCum.add (elapsed);
singleConv.add(elapsed);
}
if (s == 0) {
System.out.println(elapsed);
System.out.println("Single Cum:\t\t" + singleCum.val());
System.out.println("Single Conv:\t" + singleConv.val());
System.out.println();
}
long zstart = System.nanoTime();
for (int j = 0; j < jiterations; j++) {
az();
}
long elapzed = System.nanoTime() - zstart;
int z = this.finishedZynced.decrementAndGet();
if (z <= count) {
zyncedCum.add(elapzed);
zyncedConv.add(elapzed);
}
if (z == 0) {
// Just to avoid the output not overlapping with the one above
Threads.sleep(500);
System.out.println();
System.out.println("Zynced Cum: \t" + zyncedCum.val());
System.out.println("Zynced Conv:\t" + zyncedConv.val());
System.out.println();
}
}
synchronized void a() { b(); }
void b() { c(); }
void c() { d(); }
void d() { e(); }
void e() { f(); }
void f() { g(); }
void g() { h(); }
void h() { i(); }
void i() { }
synchronized void az() { bz(); }
synchronized void bz() { cz(); }
synchronized void cz() { dz(); }
synchronized void dz() { ez(); }
synchronized void ez() { fz(); }
synchronized void fz() { gz(); }
synchronized void gz() { hz(); }
synchronized void hz() { iz(); }
synchronized void iz() {}
}
Conclusion: there is really no difference.
I would be surprised if there were a difference. See also http://www.oracle.com/technetwork/java/6-performance-137236.html (2.1.1 and 2.1.2) – assylias