spf4j 8.2.1 is available in maven central

04:54PM Dec 16, 2016 in category General by Zoltan Farkas

This is a bug fix release, for details see: https://github.com/zolyfarkas/spf4j/releases/tag/spf4j-8.2.1  


The rise of demagogues in the age of technological disruption > /brain/dump

06:54AM Nov 24, 2016 in category General by Zoltan Farkas

I thought I would never write about politics on my blog, but recent events triggered this brain dump…

We have been living in the age of technological disruption for a while now. As with any disruption, there will be winners and losers, and historically the repercussions have often been severe: weavers smashed stocking frames, and rebellions and revolutions had deadly consequences… (see https://en.wikipedia.org/wiki/Luddite).

In the past few decades, technological advancements (the internet, faster and cheaper computers, sensors, better programming languages, etc.) accelerated the speed of disruption and caught a lot of people unprepared for the new world. There are plenty of jobs; the problem is that they don’t match the skill set of the unemployed population.

Right now in the UK and the US this dislocation resulted in Brexit + Trump. In both cases demagogues with generic promises like “bring jobs back” and nationalistic slogans like “[your nationality here] first” won the election. Although the motivation of the vote was similar (blame globalization), people voted for very different paths. (California voting to secede from the union to better fund Obama-care would be a closer comparison to Brexit…).

Although Brexit might not actually happen (I don’t think the goals of the separatist side are achievable, and 51.9% is not really “overwhelming support”, as some say; another referendum with a different turnout and the “overwhelming support” could be for the opposite), Trump will happen… and my only hope is that the limits of presidential power will stop the nonsensical ideas that were floated by the Trump campaign, and that maybe the few good ones will get through. (It is unlikely that Congress will vote to limit its own terms... and there is no reason to believe anything that was said...)

The thing is, more dislocation is on the horizon. Are you a truck driver? A bus driver? A taxi driver? Your job might not be there for long, and globalization will have nothing to do with it. These dislocations will also create a lot of new jobs… you just need to have the skills and the willingness to relocate…

I recently watched an excellent presentation, “Types vs Tests” (https://www.infoq.com/presentations/Types-Tests), and found out that one of the presenters (Amanda Laucher) is teaching computer programming to coal miners (http://money.cnn.com/2016/04/22/news/economy/coal-workers-computer-coders/). This is what we need, not more coal mining jobs. Coal mining was always dangerous, and I would prefer sending robots into the depths of the earth and avoiding the dramatic results of mining accidents. Kudos to Amanda and her husband, they are a good example of what can be done.

Do we want to prop up the dying industries of yesterday instead of investing in the industries of tomorrow?

I also don’t understand the anti-environmental rhetoric. I don’t want to live in a country that does not care about the environment, I don’t want to breathe dirty air, I don’t want to drink dirty water (do people in Flint, Michigan like their water?), I don’t want to eat produce grown in a polluted environment, I don’t want to swim in water polluted by an oil spill. Not too long ago, after arriving in Germany, the controversial artist Ai Weiwei said “The light is so beautiful and the air is so clear. It seems surreal to me. Sometimes I feel like I could cry.” (http://www.spiegel.de/international/world/artist-ai-weiwei-has-left-china-for-germany-a-1047793.html). Do we really want to travel (if not emigrate) to other parts of the world to see clear skies and breathe clean air?

We are all either immigrants or descendants of immigrants. We should “Do to others as you would have them do to you.” and keep in mind that any one of us can become an immigrant in the future…

And now let’s celebrate by eating that turkey!


The age of privacy started in 2010!

06:19AM Nov 05, 2016 in category General by Zoltan Farkas

The age of privacy started right when the Facebook founder said "Privacy no longer a social norm". Recently I overheard two teenagers having a conversation about privacy... I was shocked! I thought this was the generation that updates/tweets/shares its every step on the internet... so the future might not be as bad as I believed... there is hope!... now back to earth and reality...

Privacy is necessary, and in my opinion it is a human right.

If you don't think so, see, or think about your telco being hacked and your internet access logs becoming available for everyone to analyze... The best way to understand the dangers is to study computer science (computer networks), which everyone should do...


Spf4j + Junit a new way to look at your unit tests

04:58PM Oct 08, 2016 in category General by Zoltan Farkas

JUnit reports the execution time of unit tests. I recently experienced some very slow unit tests... but had no idea why... Since guesswork is not my favorite pastime, spf4j-junit was born.

Easy to use (see), and now you can reduce the guesswork. Each unit test will have its own ssdump file for you to look at...

If you run a code coverage tool when running your unit tests (which you should always do), you should see its impact...




Benchmarking/Profiling (In)sanity

05:34AM Jun 19, 2016 in category Java by Zoltan Farkas


(with more complete data at

It all started when I stumbled upon AppendableWriter in guava which is nothing more than an “adapter” class that adapts an Appendable to a Writer:

    // It turns out that creating a new String is usually as fast, or faster
    // than wrapping cbuf in a light-weight CharSequence.
    target.append(new String(cbuf, off, len));

This statement smells… it looks like a decision made too quickly, based on either a bad benchmark or a bad interpretation of the benchmark results (which is unfortunately the norm).

So I created an implementation with a lightweight CharSequence, with a comment claiming the opposite (with slightly more detail though):


    // I wrote a JMH benchmark, and the results are as expected the opposite:
    //
    // Benchmark                                          Mode  Cnt      Score     Error  Units
    // AppendableWriterBenchmark.guavaAppendable         thrpt   10  10731.940 ± 427.258  ops/s
    // AppendableWriterBenchmark.spf4jAppendable         thrpt   10  17613.093 ± 344.769  ops/s
    //
    // Using a light weight wrapper is more than 50% faster!
    // See AppendableWriterBenchmark for more detail...
    appendable.append(CharBuffer.wrap(cbuf), off, off + len);

I have written a benchmark based on the JMH benchmarking framework to validate my results… (complete code at). I run these benchmarks as part of the release process of spf4j.
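To make the idea concrete, here is a minimal sketch of such an adapter. The class name AppendableWriterSketch and its demo() helper are made up for illustration; this is not the spf4j or guava source, only the wrapping technique discussed above.

```java
import java.io.Flushable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.CharBuffer;

/** Hypothetical sketch of a Writer that adapts an Appendable (not the spf4j or guava class). */
public final class AppendableWriterSketch extends Writer {

  private final Appendable target;
  private boolean closed;

  public AppendableWriterSketch(Appendable target) {
    this.target = target;
  }

  @Override
  public void write(char[] cbuf, int off, int len) throws IOException {
    checkNotClosed();
    // Wrap the array in a CharSequence view instead of copying it into a new String.
    target.append(CharBuffer.wrap(cbuf), off, off + len);
  }

  @Override
  public void flush() throws IOException {
    checkNotClosed();
    if (target instanceof Flushable) {
      ((Flushable) target).flush();
    }
  }

  @Override
  public void close() throws IOException {
    if (!closed) {
      flush();
      closed = true;
    }
  }

  private void checkNotClosed() throws IOException {
    if (closed) {
      throw new IOException("Writer is closed");
    }
  }

  /** Writes the last 5 chars of "hello world" through the adapter and returns the result. */
  public static String demo() {
    StringBuilder sb = new StringBuilder();
    try (Writer w = new AppendableWriterSketch(sb)) {
      w.write("hello world".toCharArray(), 6, 5);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sb.toString();
  }
}
```

The only interesting line is the append call: the char array is never copied into an intermediate String, the target reads straight from the wrapped buffer.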

The world was good for a while... after several JVM updates, OS updates.... I noticed something weird about my benchmark:

# JMH 1.12 (released 75 days ago)

# VM version: JDK 1.8.0_92, VM 25.92-b14

# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_92.jdk/Contents/Home/jre/bin/java

# VM options: -Xmx256m -Xms256m -XX:+UnlockCommercialFeatures -Djmh.stack.profiles=/Users/zoly/NetBeansProjects/spf4j/spf4j-benchmarks/target -Dspf4j.executors.defaultExecutor.daemon=true -Djmh.executor=FJP -Djmh.fr.options=defaultrecording=true,settings=/Users/zoly/NetBeansProjects/spf4j/spf4j-benchmarks/src/main/jfc/profile.jfc

# Warmup: 10 iterations, 1 s each

# Measurement: 10 iterations, 1 s each

# Timeout: 10 min per iteration

# Threads: 8 threads, will synchronize iterations

# Benchmark mode: Throughput, ops/time

# Benchmark: com.google.common.io.AppendableWriterBenchmark.spf4jAppendable

# Run progress: 50.00% complete, ETA 00:00:21

# Fork: 1 of 1

# Preparing profilers: JmhFlightRecorderProfiler 

# Profilers consume stderr from target VM, use -v EXTRA to copy to console

# Warmup Iteration   1: 70485.272 ops/s

# Warmup Iteration   2: 77614.082 ops/s

# Warmup Iteration   3: 4704.416 ops/s

# Warmup Iteration   4: 4730.839 ops/s

# Warmup Iteration   5: 4722.175 ops/s

# Warmup Iteration   6: 4718.885 ops/s

# Warmup Iteration   7: 4575.629 ops/s

# Warmup Iteration   8: 4710.284 ops/s

# Warmup Iteration   9: 4805.996 ops/s

# Warmup Iteration  10: 4747.728 ops/s

Iteration   1: 4903.879 ops/s

Iteration   2: 4912.047 ops/s

Iteration   3: 4972.782 ops/s

Iteration   4: 5015.564 ops/s

Iteration   5: 5021.001 ops/s

Iteration   6: 4958.149 ops/s

Iteration   7: 4883.546 ops/s

Iteration   8: 4733.823 ops/s

Iteration   9: 4800.601 ops/s

Iteration  10: 4896.969 ops/s

More than a 10x degradation during warmup?!!! What the heck? This smells like a memory issue... but what the **** is going on? There are no leaks, as the memory chart captured during the benchmark looks benign, and I can see that GC is working smoothly. Why would some piece of code just slow down? (Warmup is supposed to make things faster, after all…) Is the JIT doing a “deoptimization”? Is it memory allocations? Let’s use the “drunk man anti-method” (see for detail) and do an experiment with G1GC:

# JMH 1.12 (released 75 days ago)

# VM version: JDK 1.8.0_92, VM 25.92-b14

# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_92.jdk/Contents/Home/jre/bin/java

# VM options: -XX:+UseG1GC -Xmx256m -Xms256m -XX:+UnlockCommercialFeatures -Djmh.stack.profiles=/Users/zoly/NetBeansProjects/spf4j/spf4j-benchmarks/target -Dspf4j.executors.defaultExecutor.daemon=true -Djmh.executor=FJP -Djmh.fr.options=defaultrecording=true,settings=/Users/zoly/NetBeansProjects/spf4j/spf4j-benchmarks/src/main/jfc/profile.jfc

# Warmup: 10 iterations, 1 s each

# Measurement: 10 iterations, 1 s each

# Timeout: 10 min per iteration

# Threads: 8 threads, will synchronize iterations

# Benchmark mode: Throughput, ops/time

# Benchmark: com.google.common.io.AppendableWriterBenchmark.spf4jAppendable

# Run progress: 50.00% complete, ETA 00:00:21

# Fork: 1 of 1

# Preparing profilers: JmhFlightRecorderProfiler 

# Profilers consume stderr from target VM, use -v EXTRA to copy to console

# Warmup Iteration   1: 62995.070 ops/s

# Warmup Iteration   2: 70660.918 ops/s

# Warmup Iteration   3: 73127.240 ops/s

# Warmup Iteration   4: 69744.209 ops/s

# Warmup Iteration   5: 71235.279 ops/s

# Warmup Iteration   6: 70993.784 ops/s

# Warmup Iteration   7: 73375.331 ops/s

# Warmup Iteration   8: 74783.664 ops/s

# Warmup Iteration   9: 74197.163 ops/s

# Warmup Iteration  10: 74138.756 ops/s

Iteration   1: 73277.416 ops/s

Iteration   2: 72664.450 ops/s

Iteration   3: 72385.558 ops/s

Iteration   4: 71675.424 ops/s

Iteration   5: 74182.841 ops/s

Iteration   6: 75584.596 ops/s

Iteration   7: 74036.789 ops/s

Iteration   8: 72369.089 ops/s

Iteration   9: 72409.741 ops/s

Iteration  10: 72425.595 ops/s

Surprise surprise! The performance degradation magically disappears... heap chart looks good as well.

So for whatever reason this degradation does not happen with the G1 garbage collector, but why? I can’t imagine G1 GC being so magic (there is no magic in software)…

I have improved the spf4j profiler (www.spf4j.org) so that it can collect iteration-specific stack samples with JMH (Spf4jJmhProfiler), and I made a comparison between Iteration 1 (fast) and Iteration 6 (slow) with the default GC (Parallel). I also did the same with G1GC: 

Hmm, in the G1 world, like in the observed fast iteration with the Parallel GC, we don’t see the HeapCharBuffer.get invocation as much, which means this invocation is probably inlined… which makes me think that a de-optimization happens, and that this de-optimization is somehow related to garbage collection…

Let’s dig in further, and turn on compilation output (-XX:+PrintCompilation):

   5697  654       3       java.nio.Buffer::position (43 bytes)   made zombie

   5697  662       3       java.nio.Buffer::limit (62 bytes)   made zombie

   5697  969       3       org.apache.avro.generic.GenericData::getField (11 bytes)

   5697  741       3       java.nio.HeapCharBuffer::get (15 bytes)   made zombie

   5697  739       3       java.nio.CharBuffer::charAt (16 bytes)   made zombie

   5697  745       3       java.nio.Buffer::<init> (121 bytes)   made zombie

   5697  747       3       java.nio.CharBuffer::<init> (22 bytes)   made zombie

   5697  754       3       java.lang.StringBuilder::append (8 bytes)   made zombie

   5697  750       3      71901.755 ops/s — LAST time it was fast!

 java.nio.CharBuffer::length (5 bytes)   made zombie

   5697  753       3       org.spf4j.io.AppendableWriter::write (15 bytes)   made zombie

   5697  757       3       java.lang.AbstractStringBuilder::append (54 bytes)   made zombie

   5697  755       3       java.lang.StringBuilder::append (6 bytes)   made zombie

   5697  749       3       java.lang.AbstractStringBuilder::append (144 bytes)   made zombie

   5699  971       3       java.lang.String::toString (2 bytes)

   5700  972       3       java.io.ObjectOutputStream$BlockDataOutputStream::setBlockDataMode (35 bytes)

# Warmup Iteration   3:    5728  973       3       java.util.Arrays::fill (21 bytes)

   6712  975       3       java.util.ArrayList::<init> (12 bytes)

   8121  984     n 0       java.lang.Class::isInstance (native)   

   8122  985       3       java.io.ObjectOutputStream::writeHandle (21 bytes)

5461.899 ops/s]

The key methods of the benchmark seem to get transformed into zombies :-) (which is never good; everybody knows that zombies will come at you in a clumsy way and do bad things to you). We are getting closer now… It seems like a de-optimization happens… let’s dig in and try to understand why (a good explanation worth reading is at https://gist.github.com/chrisvest/2932907).

Since the difference in behavior makes this smell like a memory-related issue, let’s look at the usage of the code cache (-XX:+PrintCodeCache):

 CodeCache: size=245760Kb used=3366Kb max_used=3366Kb free=242393Kb

 bounds [0x000000010e997000, 0x000000010ed07000, 0x000000011d997000]

 total_blobs=1280 nmethods=897 adapters=297

 compilation: enabled

There seems to be no problem there… memory is not the issue here…

Let’s look into inlining next (-XX:+PrintInlining):

                                @ 47   org.spf4j.io.AppendableWriter::close (39 bytes)   inline (hot)

                                  @ 8   org.spf4j.io.AppendableWriter::flush (24 bytes)   inline (hot)

                                    @ 1   org.spf4j.io.AppendableWriter::checkNotClosed (35 bytes)   inline (hot)

                              @ 23   org.openjdk.jmh.infra.Blackhole::consume (42 bytes)   disallowed by CompilerOracle

                              @ 20   com.google.common.io.AppendableWriterBenchmark::spf4jAppendable (52 bytes)   force inline by CompilerOracle

                                @ 7   java.lang.StringBuilder::setLength (6 bytes)   inline (hot)

                                  @ 2   java.lang.AbstractStringBuilder::setLength (45 bytes)   inline (hot)

                                    @ 15   java.lang.AbstractStringBuilder::ensureCapacityInternal (16 bytes)   inline (hot)

                                @ 15   org.spf4j.io.AppendableWriter::<init> (23 bytes)   inline (hot)

                                  @ 1   java.io.Writer::<init> (10 bytes)   inline (hot)

                                    @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)

                                @ 36   org.spf4j.io.AppendableWriter::write (15 bytes)   inline (hot)

                                  @ 5   java.nio.CharBuffer::wrap (8 bytes)   inline (hot)

               !                    @ 4   java.nio.CharBuffer::wrap (20 bytes)   inline (hot)

                                      @ 7   java.nio.HeapCharBuffer::<init> (14 bytes)   inline (hot)

                                        @ 10   java.nio.CharBuffer::<init> (22 bytes)   inline (hot)

                                          @ 6   java.nio.Buffer::<init> (121 bytes)   inline (hot)

                                            @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)

                                            @ 55   java.nio.Buffer::limit (62 bytes)   inline (hot)

                                            @ 61   java.nio.Buffer::position (43 bytes)   inline (hot)

                                  @ 8   java.lang.StringBuilder::append (6 bytes)   inline (hot)

                                   \-> TypeProfile (307072/307072 counts) = java/lang/StringBuilder

                                    @ 2   java.lang.StringBuilder::append (8 bytes)   inline (hot)

                                      @ 2   java.lang.AbstractStringBuilder::append (54 bytes)   inline (hot)

                                        @ 45   java.nio.CharBuffer::length (5 bytes)   inline (hot)

                                          @ 1   java.nio.Buffer::remaining (10 bytes)   inline (hot)

                                        @ 50   java.lang.StringBuilder::append (8 bytes)   inline (hot)

                                          @ 4   java.lang.StringBuilder::append (10 bytes)   inline (hot)

                                            @ 4   java.lang.AbstractStringBuilder::append (144 bytes)   inline (hot)

                                              @ 18   java.nio.CharBuffer::length (5 bytes)   inline (hot)

                                                @ 1   java.nio.Buffer::remaining (10 bytes)   inline (hot)

                                              @ 89   java.lang.AbstractStringBuilder::ensureCapacityInternal (16 bytes)   inline (hot)

                                                @ 12   java.lang.AbstractStringBuilder::expandCapacity (50 bytes)   inline (hot)

                                                  @ 43   java.util.Arrays::copyOf (19 bytes)   inlining too deep

                                              @ 116   java.nio.CharBuffer::charAt (16 bytes)   inline (hot)

                                                @ 2   java.nio.Buffer::position (5 bytes)   accessor

                                                @ 8   java.nio.Buffer::checkIndex (24 bytes)   inline (hot)

                                                @ 12   java.nio.HeapCharBuffer::get (15 bytes)   inline (hot)

                                                  @ 7   java.nio.Buffer::checkIndex (22 bytes)   inlining too deep

                                                  @ 10   java.nio.HeapCharBuffer::ix (7 bytes)   inlining too deep

Now that is something interesting! “Inlining too deep”?! And exactly for the method I am seeing in my profile when things are slow!
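Before touching the flag, it can help to confirm which inlining depth limit a given JVM is actually running with. The following is a minimal sketch that assumes a HotSpot-based JVM exposing com.sun.management.HotSpotDiagnosticMXBean; the class name InlineLevelSketch is made up, and on other JVMs the method simply returns -1.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper that reads the effective MaxInlineLevel of the running JVM. */
public final class InlineLevelSketch {

  /** Returns the effective MaxInlineLevel, or -1 when the JVM does not expose it. */
  public static int maxInlineLevel() {
    try {
      HotSpotDiagnosticMXBean bean =
          ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
      if (bean == null) {
        return -1;
      }
      // getVMOption throws IllegalArgumentException when the flag is unknown.
      return Integer.parseInt(bean.getVMOption("MaxInlineLevel").getValue());
    } catch (RuntimeException e) {
      return -1; // not a HotSpot JVM, or the flag is not available
    }
  }

  public static void main(String[] args) {
    System.out.println("MaxInlineLevel=" + maxInlineLevel());
  }
}
```

Running it once without flags and once with -XX:MaxInlineLevel=12 is a cheap way to check that the option really reaches the forked benchmark JVM.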

Let’s increase the maximum inlining level a bit (the default is 9) and see what happens (-XX:MaxInlineLevel=12):

# Run progress: 0.00% complete, ETA 00:00:20

# Fork: 1 of 1

# Warmup Iteration   1: 83142.216 ops/s

# Warmup Iteration   2: 103245.276 ops/s

# Warmup Iteration   3: 96166.461 ops/s

# Warmup Iteration   4: 96081.599 ops/s

# Warmup Iteration   5: 95880.021 ops/s

# Warmup Iteration   6: 95450.397 ops/s

# Warmup Iteration   7: 94175.079 ops/s

# Warmup Iteration   8: 80373.298 ops/s

# Warmup Iteration   9: 76777.619 ops/s

# Warmup Iteration  10: 89284.943 ops/s

Iteration   1: 83158.890 ops/s

Iteration   2: 81364.498 ops/s

Iteration   3: 86231.231 ops/s

Iteration   4: 84652.819 ops/s

Iteration   5: 84800.907 ops/s

Iteration   6: 82218.109 ops/s

Iteration   7: 86128.886 ops/s

Iteration   8: 84813.081 ops/s

Iteration   9: 79925.358 ops/s

Iteration  10: 85968.824 ops/s

Benchmark                                   Mode  Cnt      Score      Error  Units

AppendableWriterBenchmark.spf4jAppendable  thrpt   10  83926.260 ± 3285.072  ops/s

Hurray! No more degradation with the Parallel GC!!!! There is still something interesting happening in warmup iteration 2 though… (I have a feeling more can be squeezed out here…and it has to do with zombies :-))

Let’s redo the benchmarks with our newly learned tweaks…

With G1GC: # VM options: -XX:MaxInlineLevel=12 -XX:+UseG1GC -Xmx256m -Xms256m

Benchmark                                   Mode  Cnt      Score      Error  Units

AppendableWriterBenchmark.guavaAppendable  thrpt   10  44132.545 ±  615.835  ops/s

AppendableWriterBenchmark.spf4jAppendable  thrpt   10  76099.587 ± 1877.786  ops/s

With Parallel GC(default in my case): # VM options:-XX:MaxInlineLevel=12 -Xmx256m -Xms256m

Benchmark                                   Mode  Cnt      Score      Error  Units

AppendableWriterBenchmark.guavaAppendable  thrpt   10  45104.009 ±  395.868  ops/s

AppendableWriterBenchmark.spf4jAppendable  thrpt   10  81934.901 ± 1121.964  ops/s

We observe significantly better performance from the spf4j implementation with both GC implementations. And this time G1 GC loses its magic…
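For readers who want the relative numbers, the speedup can be computed directly from the scores in the two tables above. This is only arithmetic on the reported throughput values; the class name SpeedupSketch is made up for the example.

```java
/** Computes relative throughput from the JMH scores reported above. */
public final class SpeedupSketch {

  /** Ratio of candidate throughput to baseline throughput (both in ops/s). */
  public static double speedup(double baselineOpsPerSec, double candidateOpsPerSec) {
    return candidateOpsPerSec / baselineOpsPerSec;
  }

  public static void main(String[] args) {
    // guavaAppendable is the baseline, spf4jAppendable the candidate.
    System.out.printf("G1GC:        %.2fx%n", speedup(44132.545, 76099.587));
    System.out.printf("Parallel GC: %.2fx%n", speedup(45104.009, 81934.901));
  }
}
```

With these scores the spf4j variant comes out at roughly 1.7x the guava throughput with G1GC and roughly 1.8x with the Parallel GC.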

We could have made a naive conclusion that G1GC is superior to the Parallel GC based on one of my first observations (drunk man anti-method sometimes leads to drunk man conclusions)… Or the conclusion the guava developers made… A few more data points to visualize the resource usage difference…

In the guava implementation Memory/GC overview we can see lots of allocations and garbage collection being done really fast, making sure the CPU is not bored..

In the spf4j implementation Memory/GC overview we see a lot less garbage created and collected, which is where the extra performance (close to 2x) is coming from. The quietness we see is probably also related to some extra escape analysis, which most likely happens for the char buffer involved here. It is always good to have the GC bored… Think of all the other garbage your app can now create :-)


  1. Benchmarking is hard. Really hard. Beware of benchmarks and benchmark results, most benchmarks and conclusions I am seeing on the web are BAD, be skeptical.
  2. Disregard any benchmark not made with a good framework (like JMH, see for more detail)
  3. Single threaded benchmarks are not valuable most of the time, resource hogs might perform well single threaded, but when multithreaded they are less likely to do so.
  4. Disregard benchmark results/conclusions that lack detail.
  5. Understanding benchmarking measurements is hard. Really hard. Do NOT jump to conclusions, try to understand what is going on.
  6. Drunk man anti-method can lead to drunk man conclusions, but it can still provide interesting knowledge.
  7. Beware of tools. Measuring a system changes the system, so tools will impact your measurements. It is good to have a solid understanding of how they work and what their potential impact might be. Validate measurements made with one tool with another tool. (One of the reasons I am using 2 profilers.)

The tools used in this analysis are:

  1. Java Flight Recorder + Java Mission Control to record and visualize profiles of the benchmarks. (http://www.oracle.com/technetwork/java/javaseproducts/mission-control/java-mission-control-1998576.html)
  2. Spf4j stack sampling profiler  + UI to record and visualize profiles. (http://www.spf4j.org)
  3. JMH benchmarking framework. (http://openjdk.java.net/projects/code-tools/jmh/)

I made every effort to make the measurements and conclusions in this article as “correct” as possible, but they are not perfect. More measurements on more platforms would have been nice, with more iterations, on server hardware (not a MacBook Pro… although with 4 cores and 16 GB of RAM + SSD it makes a better server than most of the cloud servers out there :-))…

The benchmark should also be improved to make it more realistic to actual real life workloads….

One thing we still have unanswered is: why is the inlining not an issue with G1 GC?

I will leave this for another day…

All code is on github: https://github.com/zolyfarkas/spf4j/tree/master/spf4j-benchmarks 

For any comments/questions, email me at: zolyfarkas@yahoo.com 



APIs MUST be Copyrightable

11:16AM Jun 10, 2016 in category General by Zoltan Farkas

With the Oracle vs Google lawsuit I have seen a lot of opinion articles where people claim that copyright should not apply to APIs... which I strongly disagree with...

Some opinions trivialize APIs, which is plain STUPID, and the opinion of such an individual can be ignored. An API takes a lot of time to develop and to get adopted... it is not simple! And in the Oracle vs Google case the Java API is very complex.

The EFF opinion claims that it kills interoperability, which is nonsense! A famous counterargument is: Intel created the x86 instruction set, which AMD licenses, while AMD created AMD64, which Intel licenses... News flash EFF, APIs can be licensed! This way creation gets rewarded, which will incentivize more creation... Without incentives nothing happens... (see the Communist experience, and how well that worked...) Creators can release APIs without any restrictions if they want to, but they should be the ones making that decision, not others who want to take advantage...

Creation needs to be protected, and APIs are as important as any other part of a software component. If an author shares their code for free and, through the GPL license, asks you to keep the code free, please RESPECT that! 

Extracting the Java API and simply relicensing it is wrong! It took a lot of investment to develop the Java API and to get it used by millions of organizations and developers... you cannot simply copy something and disregard the license... it is stealing! The Java API is licensed under the GPL for a reason, and everybody needs to respect that.


Implementing a distributed semaphore.

08:49AM May 14, 2016 in category Java by Zoltan Farkas

I recently needed to limit the number of parallel accesses to a resource across a cluster. The classic use case for a Semaphore. 
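Within a single JVM, this is exactly what java.util.concurrent.Semaphore provides. The sketch below is only a local illustration of the semantics a distributed version has to preserve across processes; the class name LocalSemaphoreSketch and the task shape are made up for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Local (single JVM) illustration of limiting parallel access with a Semaphore. */
public final class LocalSemaphoreSketch {

  /** Runs the given number of tasks and returns the highest concurrency observed inside the guarded section. */
  public static int maxObservedConcurrency(int permits, int tasks) {
    Semaphore semaphore = new Semaphore(permits);
    AtomicInteger active = new AtomicInteger();
    AtomicInteger maxActive = new AtomicInteger();
    ExecutorService pool = Executors.newFixedThreadPool(tasks);
    for (int i = 0; i < tasks; i++) {
      pool.execute(() -> {
        try {
          semaphore.acquire(); // blocks while all permits are taken
          try {
            int now = active.incrementAndGet();
            maxActive.accumulateAndGet(now, Math::max);
            Thread.sleep(10); // simulate work on the limited resource
          } finally {
            active.decrementAndGet();
            semaphore.release();
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("tasks did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return maxActive.get();
  }

  public static void main(String[] args) {
    System.out.println("max concurrency: " + maxObservedConcurrency(2, 8));
  }
}
```

However many tasks are submitted, the observed concurrency never exceeds the number of permits. The hard part in a cluster is keeping that permit count consistent across machines.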

So I went ahead and looked around for existing distributed implementations. The most solid one I could find is the ZooKeeper-based one that is part of Apache Curator. This is probably a good implementation. (I say probably, since the quality of popular open source libraries is questionable more often than I would like.)

The one drawback in my case is that it requires ZooKeeper, an extra piece of infrastructure I don’t have (yet). What I do have is an MVCC SQL database with ACID… 

So a few hours of coding later the JdbcSemaphore  was born!

What is special (I think) about this implementation is: 
1) You can increase/decrease the total number of reservations and see the actual total/available reservations with JMX, which is pretty useful!
2) You can see the state in the DB with plain SQL!
3) Relatively low overhead, and it is part of spf4j!

Currently it is in beta, and is being peer reviewed… concurrency is hard to get right!

Enjoy, for information, comments, suggestions go to www.spf4j.org and don't be afraid to use the issue tracker! :-)


The awesomeness of ZEL part 2.

10:00AM Apr 23, 2016 in category General by Zoltan Farkas

Since I started ZEL to calculate fibonacci for the PART 1 article, I have added some extra info about the result of an operation:

ZEL Shell
zel>func det fib(x) { fib(x-1) + fib(x-2) }
type>class org.spf4j.zel.vm.Program
executed in>6889844 ns
zel>fib(0) = 0
type>class java.lang.Integer
executed in>4319632 ns
zel>fib(1) = 1
type>class java.lang.Integer
executed in>317251 ns
type>class java.lang.Integer
executed in>50519054 ns
type>class java.lang.Long
executed in>5201067 ns
type>class java.math.BigInteger
executed in>7518551 ns
type>class java.math.BigInteger
executed in>572010 ns

As you might notice there are 2 "strange" things to observe: 

  1. The result types are different. Thanks to ZEL's type auto upgrade feature. (no overflows in ZEL)
  2. Execution time for fib(101) < execution time for fib(100), and this is thanks to the "det" (deterministic) keyword used to declare the fibonacci function.
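For readers more at home in Java, the two effects can be approximated with a memoizing function that uses BigInteger. This is only an analogy: the class name MemoFibSketch is made up, and it says nothing about how ZEL implements "det" or type upgrades internally.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Java analogy for a "det" function: results are cached, and BigInteger avoids overflow. */
public final class MemoFibSketch {

  // Cache of already computed values, seeded with fib(0) = 0 and fib(1) = 1.
  private static final List<BigInteger> CACHE =
      new ArrayList<>(Arrays.asList(BigInteger.ZERO, BigInteger.ONE));

  public static synchronized BigInteger fib(int n) {
    if (n < 0) {
      throw new IllegalArgumentException("n must be >= 0, got " + n);
    }
    // Extend the cache only as far as needed; earlier results are reused.
    while (CACHE.size() <= n) {
      int size = CACHE.size();
      CACHE.add(CACHE.get(size - 1).add(CACHE.get(size - 2)));
    }
    return CACHE.get(n);
  }

  public static void main(String[] args) {
    System.out.println(fib(100));
    System.out.println(fib(101)); // needs a single addition after fib(100)
  }
}
```

After fib(100) has been computed, fib(101) needs just one more addition, which is the same reason the second call is cheaper in the shell session above.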

For more detail please see 


The awesomeness of ZEL

09:33AM Apr 23, 2016 in category General by Zoltan Farkas

I recently showed someone how big fib(10000) is, and the best tool for this was ZEL:

ZEL Shell
zel>func det fib(x) { fib(x-1) + fib(x-2) }
zel>fib(0) = 0
zel>fib(1) = 1
zel>fib (10000)

More information about ZEL you will find at.



Anticompetitive behavior? or just stupid behavior?

10:45AM Apr 17, 2016 in category General by Zoltan Farkas

I was reading an article about how UC Davis attempted to manipulate search results related to a pepper-spraying incident that took place there (which I had already forgotten about :-)). I am sure they wish they were a European university, protected by the "right to be forgotten" laws :-).

The proposed solution was pretty interesting:

A "flood of content with positive sentiment and off-topic subject matter", with the content hosted on Google's own services, which would appear higher in the firm's search results.

This raises eyebrows, since Google Search has a monopoly position in quite a few markets, and it is being investigated in Europe for exactly this behavior... Now there might be practical/technical explanations for this "behavior"... Ultimately, crawling sites hosted in your own data centers is cheaper (less overhead)... On the other hand, when I use Google Search my expectation as a customer is to get the most relevant search results, and I don't believe the fact that content is hosted by Google has anything to do with relevance!

My suggestion to Google: if this "behavior", which Nevins & Associates seem to think they can take advantage of, exists, correct it! Not because of the antitrust issues, but to improve the quality of the product!


SPF4j 7.2.25 is out!

10:43AM Apr 12, 2016 in category General by Zoltan Farkas

SPF4j 7.2.25 is out! UPDATE: 7.2.26 is out to fix a bug with the slf4j formatting utility.

This is probably the last release that will be usable with JDK 1.7. Going forward, the library will be compiled with JDK 1.8 and will not be usable with JDKs older than 1.8.

7.2.25 contains a few notable additions:

1) Java nio TCP proxy. This is a useful utility for testing HA for tcp based services.

Here is how simple it is to use:

        ForkJoinPool pool = new ForkJoinPool(8);
        // Proxy local port 1976 to www.zoltran.com:80, then compare a direct read with a proxied read.
        try (TcpServer server = new TcpServer(pool,
                new ProxyClientHandler(HostAndPort.fromParts("www.zoltran.com", 80), null, null, 10000, 5000),
                1976, 10)) {
            byte[] originalContent = readfromSite("www.zoltran.com");
            byte[] proxiedContent = readfromSite("http://localhost:1976");
            Assert.assertArrayEquals(originalContent, proxiedContent);
        }

2) Slf4jMessageFormatter - a useful utility that is more flexible and faster than the one shipped with slf4j. (see Slf4jMessageFormatterTest for more detail)

3) AppendableLimiterWithOverflow and a fast implementation of AppendableWriter

4) FastStackCollector allows usage with a more flexible filter: FastStackCollector(Predicate<Thread>) allowing you to reduce the profiling overhead and improve the relevance of the profiled data.
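To illustrate what filtering threads before sampling buys you, here is a minimal sketch that uses only the JDK. It is not the spf4j API: the class name FilteredStackSampleSketch and its sample method are made up, and the real collector works differently under the hood.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** JDK-only illustration of sampling stack traces for a filtered set of threads. */
public final class FilteredStackSampleSketch {

  /** Takes one stack sample of every live thread accepted by the filter. */
  public static Map<Thread, StackTraceElement[]> sample(Predicate<Thread> filter) {
    // Walk up to the root thread group so that all live threads are visible.
    ThreadGroup root = Thread.currentThread().getThreadGroup();
    while (root.getParent() != null) {
      root = root.getParent();
    }
    // activeCount() is only an estimate, so leave some headroom.
    Thread[] threads = new Thread[root.activeCount() * 2 + 16];
    int count = root.enumerate(threads, true);
    Map<Thread, StackTraceElement[]> samples = new HashMap<>();
    for (int i = 0; i < count; i++) {
      Thread t = threads[i];
      if (filter.test(t)) {
        // Only accepted threads pay the cost of a stack walk.
        samples.put(t, t.getStackTrace());
      }
    }
    return samples;
  }

  public static void main(String[] args) {
    // Example filter: ignore daemon threads.
    sample(t -> !t.isDaemon())
        .forEach((t, trace) -> System.out.println(t.getName() + ": " + trace.length + " frames"));
  }
}
```

Skipping threads you do not care about (idle pool workers, JVM housekeeping threads) lowers the sampling cost and keeps the profile focused on the code under investigation.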

For more detail see: http://www.spf4j.org


To Spin or not to Spin!

11:02AM Sep 12, 2015 in category General by Zoltan Farkas

Both spf4j LifoThreadPool and java ForkJoin pool use Spinning to try to improve performance. It seems like FJP decided to stop spinning...

Latest JDK update disabled spinning for the fork-join pool:


the latest update also fixes serious bugs with FJP like:


The spf4j thread pool spinning is configurable (on by default), and the decision to employ it is in the hands of the user... By default spf4j will have at most nr_processors/2 spinning threads at any time (configurable via system property: lifoTp.maxSpinning). 
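For readers unfamiliar with the technique, spinning means a worker busy-checks for new work for a short while before it gives up its CPU and parks. The sketch below shows the general spin-then-park pattern; it is not the LifoThreadPool code, the class and parameter names are made up, and Thread.onSpinWait requires Java 9 or later.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

/** Generic spin-then-park wait, as an illustration of what "spinning" means for a worker. */
public final class SpinThenParkSketch {

  /**
   * Waits until a value shows up in the slot, takes it and returns it.
   * Spins for at most maxSpins checks, then falls back to short timed parks.
   */
  public static <T> T awaitValue(AtomicReference<T> slot, int maxSpins) {
    for (int i = 0; i < maxSpins; i++) {
      T value = slot.getAndSet(null);
      if (value != null) {
        return value; // got work without a context switch
      }
      Thread.onSpinWait(); // hint to the CPU that this is a busy-wait loop
    }
    T value;
    while ((value = slot.getAndSet(null)) == null) {
      LockSupport.parkNanos(100_000L); // back off and let other threads run
    }
    return value;
  }

  public static void main(String[] args) {
    AtomicReference<String> slot = new AtomicReference<>();
    new Thread(() -> slot.set("task")).start();
    System.out.println(awaitValue(slot, 1_000));
  }
}
```

Spinning trades CPU time for latency, which is why capping the number of spinning threads, as the pool does, is a sensible default.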

There are significant differences between spf4j lifo thread pool and FJP, and these differences make them useful for different use cases...  

FJP lacks the configurability of the spf4j thread pool. It is not possible to configure a FJP to reject tasks before an unrecoverable error like an OOM happens. The FJP timeout to retire a thread is hardcoded to 2 s; why this isn't at least configurable via a system property is a bit of a mystery to me... I am a big fan of minimalistic configurability, but in the case of FJP it is a bit extreme... The cost of creating a thread can be significant if one factors in all the thread-local caches that are typically created by the average server software and which are lost when the thread is brought down... so it would be useful if one could specify a minimum number of threads to be kept alive and a custom timeout... I understand that you can fork the FJP code, but if you only need a thread pool you can use the spf4j thread pool instead... and you can even choose to spin. 

The SPF4J thread pool will schedule the tasks in FIFO order to worker threads, which are picked in LIFO order...
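To make the effect of picking workers in LIFO order more concrete, here is a toy simulation. It is not spf4j code: the class and method names are made up for illustration, and it only models a list of idle workers receiving one task at a time. It counts how many distinct workers get touched under each policy.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Toy model (hypothetical, not the spf4j implementation) of idle worker selection. */
public class IdleWorkerOrderDemo {

  /**
   * Runs {@code tasks} tasks one after another over {@code workers} idle workers
   * and returns how many distinct workers were used.
   */
  public static int distinctWorkersUsed(int workers, int tasks, boolean lifo) {
    Deque<Integer> idle = new ArrayDeque<>();
    for (int id = 0; id < workers; id++) {
      idle.addLast(id);
    }
    Set<Integer> used = new HashSet<>();
    for (int t = 0; t < tasks; t++) {
      // LIFO takes the most recently parked worker, FIFO the longest-idle one.
      int worker = lifo ? idle.removeLast() : idle.removeFirst();
      used.add(worker);
      // The task finishes before the next one arrives, so the worker parks again.
      idle.addLast(worker);
    }
    return used.size();
  }

  public static void main(String[] args) {
    System.out.println("LIFO workers used: " + distinctWorkersUsed(8, 100, true));
    System.out.println("FIFO workers used: " + distinctWorkersUsed(8, 100, false));
  }
}
```

Under light load the LIFO policy keeps reusing the same worker, so the remaining workers stay idle long enough to time out and be retired, and the busy one keeps its caches warm. With FIFO every worker is woken up in turn and none of them ever looks idle.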

On the performance side, in some of the tests FJP is faster than spf4j, in others it is significantly slower... which means that my tests need improvement...



JDK ThreadPoolExecutor NOT.

08:16AM Jun 20, 2015 in category General by Zoltan Farkas

I have finally reached the limit of my patience with the JDK thread pools....

1) JDK thread pools are biased towards queuing. So instead of spawning a new thread, they will queue the task. Only if the queue reaches its limit will the thread pool spawn a new thread.

2) Thread retirement does not happen when load lightens. For example if we have a burst of jobs hitting the pool that causes the pool to go to max, followed by light load of max 2 tasks at a time, the pool will use all threads to service the light load preventing thread retirement. (only 2 threads would be needed…)
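The queuing bias from point 1) is easy to reproduce with a plain java.util.concurrent.ThreadPoolExecutor. The sketch below is only an illustration (class and method names are made up): it uses a core size of 1 and a maximum of 4, submits tasks that block on a latch, and reports how many threads the pool created. Note that the pool does start threads up to its core size first; the bias shows up between the core size and the maximum size.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueueBiasDemo {

  /**
   * Submits {@code tasks} blocking tasks to a pool with core size 1 and max size 4
   * and returns the number of threads the pool has created at that point.
   * tasks must not exceed 4 + queueCapacity, otherwise the pool rejects the extra ones.
   */
  public static int poolSizeAfter(int tasks, int queueCapacity) {
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        1, 4, 1, TimeUnit.SECONDS, new ArrayBlockingQueue<>(queueCapacity));
    CountDownLatch release = new CountDownLatch(1);
    try {
      for (int i = 0; i < tasks; i++) {
        pool.execute(() -> awaitQuietly(release));
      }
      return pool.getPoolSize();
    } finally {
      release.countDown(); // let the blocked tasks finish
      pool.shutdown();
    }
  }

  private static void awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    // 5 tasks, room for 10 in the queue: 1 runs, 4 wait, no extra thread is started.
    System.out.println("queue capacity 10 -> pool size " + poolSizeAfter(5, 10));
    // 5 tasks, room for 2 in the queue: extra threads appear only once the queue is full.
    System.out.println("queue capacity 2 -> pool size " + poolSizeAfter(5, 2));
  }
}
```

With the large queue the pool stays at one thread even though four tasks are waiting and three more threads are allowed. With the small queue the pool grows to three threads, but only after the queue has filled up.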

Unhappy with this behavior, I went ahead and implemented a pool that overcomes both deficiencies.

To resolve 1) there are several “hacks” using the existing JDK pools:



To resolve 2), picking the worker threads in LIFO order is enough. This idea was presented by Ben Maurer at the ACM Applicative 2015 conference:

So a new implementation was born:


So far this implementation improves async execution performance for ZEL (http://zolyfarkas.github.io/spf4j/spf4j-zel/index.html).

The implementation is spin capable to reduce context switch overhead, yielding superior performance for certain use cases.



Spf4j 7.1.10 is Out

07:48PM Apr 19, 2015 in category General by Zoltan Farkas

Bug fixes, performance improvements and some new functionality to record CPU and thread usage for your process.

The open files monitor has been improved: it automatically detects the open-files limit, calls the GC if the warn threshold is exceeded, and can even shut down the process when the limit is reached, if so configured.

JMX exporter utility has been improved to fully support "mixin"  MX beans. Spf4j monitors can now be controlled with JMX.

release is available in the central repo as always at 




Spf4j 7.1.8 is out

07:29PM Apr 01, 2015 in category General by Zoltan Farkas

New MemorizingBufferedInputStream is now available for troubleshooting purposes. (see where you fail)

Plus lots of bug fixes...

code available at http://www.spf4j.org


Spf4j 7.1.4 is out

08:26AM Mar 08, 2015 in category General by Zoltan Farkas

This release contains a lot of enhancements:

Object recycler had a few bugs fixed, and should be  production ready.

Added SizedObjectRecyclers which are very useful for recycling buffers. (ByteArrayBuilder can now use a recycled array.)

PipedOutputStream and PipedInputStream: a significantly better implementation than the stock JDK one. Not only is it slightly faster, but the producer controls the byte handover with flush, giving it buffering semantics. This implementation supports timeouts as well, by integrating with spf4j Runtime.get|setDeadline()

New UpdateablePriorityQueue implementation.

New Strings utilities, for fast to/from utf8 coding/decoding.

release is available in the central maven repo.



spf4j + flight recorder

06:49PM Nov 02, 2014 in category General by Zoltan Farkas

Spf4j now has a Flight Recorder profiler integration for JMH; all you need to do to use it is:

        Options opt = new OptionsBuilder()
                .jvmArgs("-XX:+UnlockCommercialFeatures", "-Djmh.stack.profiles="
                        + System.getProperty("jmh.stack.profiles",
         new Runner(opt).run();

As you can see in the example above you can actually use spf4j profiler and flight recorder at the same time.

(not that it makes sense to do that :-) )

 enjoy, cheers!


Jmh + Spf4j

11:27AM Nov 02, 2014 in category General by Zoltan Farkas

I had some time this weekend to code due to bad weather :-), and I have integrated spf4j and jmh so that spf4j can be used to profile benchmarks. This way as you see a performance degradation you can immediately take a look at what potentially is the cause. All you need to do is to look at the ssdump files generated. (ex spf4j benchmark profiles).

Spf4j profiler is a better and lower overhead implementation compared with the JMH StackProfiler, however both suffer from safe point bias, which makes their results less accurate. (a lot of commercial profilers suffer from the same issue, I believe java flight recorder does not)

Spf4j will contain JMH profiler integration with java Flight recorder in the near future. 


Sleep sort in Zel

11:45AM Nov 01, 2014 in category General by Zoltan Farkas

One of the candidates we have been recently interviewing implemented, as an anecdote, a sleep sort during the interview.

So I thought to myself, this can easily be implemented in ZEL as well, so here it is:

func sleepSort(x) {
  l = x.length;
  if l <= 0 {
    return x
  };
  resChan = channel();
  max = x[0];
  sl = func (x, ch) {sleep x * 10; ch.write(x)};
  sl(max, resChan)&;
  for i = 1; i < l; i++ {
    val = x[i];
    sl(val, resChan)&;
    if (val > max) {
      max = val
    }
  };
  sleep (max + 1) * 10;
  for c = resChan.read(), i = 0; c != EOF; c = resChan.read(), i++ {
    x[i] = c
  };
  return x
}

 and it works like a charm, enjoy!


Generating a Unique ID

12:13PM Oct 26, 2014 in category General by Zoltan Farkas

Most applications I encounter use UUID.randomUUID().toString() to generate unique IDs for various things like requests, transactions.... which is quite a slow implementation.

Since I implemented a UID generator in SPF4J, I decided to do a little bit of benchmarking with JMH: 


and here are the results on my 4 core macbook pro: 

Benchmark                              Mode  Samples         Score        Error  Units
o.s.c.UIDGeneratorBenchmark.jdkUid    thrpt       60    261797.856 ±  11388.450  ops/s
o.s.c.UIDGeneratorBenchmark.atoUid    thrpt       60   8102280.696 ± 159030.080  ops/s
o.s.c.UIDGeneratorBenchmark.scaUid    thrpt       60  25371629.029 ± 354517.591  ops/s

As you can see, the spf4j UID generator is roughly 100x faster than the JDK one.

It is also about 3x faster than the implementation using atomic instructions. In a lot of the code I stumble upon I see a lot of unjustified use of atomics, and the scalability impact is significant.
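One common way to avoid both the cost of a random UUID and contention on a single atomic counter is to hand out counter ranges to threads in batches. The sketch below illustrates that general idea only; it is not the spf4j UIDGenerator, and the class name, the batch size and the prefix are assumptions made for the example. In a real deployment the prefix would have to be unique per process (for example host, pid and start time).

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: unique IDs from a prefix plus a counter allocated in per-thread batches. */
public class BatchedUidGenerator {

  private static final int BATCH = 1024; // assumed batch size, tune as needed

  private final String prefix;
  private final AtomicLong nextBatchStart = new AtomicLong();
  // Per thread: [0] = next value to hand out, [1] = end of the current batch (exclusive).
  private final ThreadLocal<long[]> range = ThreadLocal.withInitial(() -> new long[] {0L, 0L});

  public BatchedUidGenerator(String prefix) {
    this.prefix = prefix;
  }

  public String next() {
    long[] r = range.get();
    if (r[0] == r[1]) {
      // Batch exhausted: the shared atomic is touched once per BATCH ids, not once per id.
      r[0] = nextBatchStart.getAndAdd(BATCH);
      r[1] = r[0] + BATCH;
    }
    return prefix + Long.toString(r[0]++, Character.MAX_RADIX);
  }

  public static void main(String[] args) {
    BatchedUidGenerator gen = new BatchedUidGenerator("node1-");
    System.out.println(gen.next());
    System.out.println(gen.next());
  }
}
```

The trade-off is that IDs are unique but not ordered across threads, and unused values in a batch are simply skipped when a thread dies.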


SPF4J release 6.5.17 is OUT

07:00AM Sep 21, 2014 in category General by Zoltan Farkas

Release 6.5.17 is out, code and binaries at. Some of the notable changes:

 1) Added 3 measurement stores: tsdbtxt a simple text based format to store measurements. Graphite UDP store, and Graphite TCP store.

 2) ObjectPool is now called RecyclingSupplier, an extension to the Guava Supplier with 2 methods: get() and recycle(object)...

 3) Performance enhancements to further reduce the library overhead (and Heisenberg uncertainty principle)

 4) Retry methods in the Callable class have been further refined. A randomized Fibonacci back-off with immediate retries has been introduced as default.

 5) Added Either utility class.

6) Easy to export JMX operations and attributes. Simply annotate the method or the getters and setters with @JmxExport, register the object with the new Registry class, and you're done.
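For readers unfamiliar with the retry policy mentioned in 4), here is a minimal sketch of a randomized Fibonacci back-off with a few immediate retries. It is an illustration under my own assumptions, not the spf4j code: the class, the method names and the parameters (number of immediate retries, delay cap) are all hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical sketch of retrying with immediate retries followed by a randomized Fibonacci back-off. */
public class FibonacciRetry {

  /** Upper bound in milliseconds for the sleep before retry number {@code retry} (0-based). */
  public static long maxDelayMillis(int retry, int immediateRetries, long capMillis) {
    if (retry < immediateRetries) {
      return 0; // the first few retries happen without any delay
    }
    long a = 1;
    long b = 1;
    for (int i = immediateRetries; i < retry; i++) {
      long next = Math.min(a + b, capMillis);
      a = b;
      b = next;
    }
    return Math.min(a, capMillis);
  }

  /** Calls the task, retrying up to {@code maxRetries} times on any exception. */
  public static <T> T call(Callable<T> task, int maxRetries, int immediateRetries, long capMillis)
      throws Exception {
    for (int retry = 0;; retry++) {
      try {
        return task.call();
      } catch (Exception e) {
        if (retry >= maxRetries) {
          throw e;
        }
        long bound = maxDelayMillis(retry, immediateRetries, capMillis);
        if (bound > 0) {
          // Randomizing the delay keeps many failing clients from retrying in lockstep.
          Thread.sleep(ThreadLocalRandom.current().nextLong(bound + 1));
        }
      }
    }
  }

  public static void main(String[] args) throws Exception {
    int[] attempts = {0};
    String result = call(() -> {
      if (attempts[0]++ < 4) {
        throw new IllegalStateException("transient failure");
      }
      return "ok after " + attempts[0] + " attempts";
    }, 10, 2, 1000);
    System.out.println(result);
  }
}
```

The delay bounds grow as 0, 0, 1, 1, 2, 3, 5, 8, ... milliseconds up to the cap, which is gentler than exponential back-off while still spreading out the retries.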


Apple Watch makes Yo obsolete(or not)

08:28PM Sep 09, 2014 in category General by Zoltan Farkas

One of the interesting features of apple watch is its instant messaging capability.

It allows you to send a YO in the most efficient way.

Based on popularity of YO, I see this as apple watch's killer feature :-).


spf4j jmx utilities enhanced

01:57PM Aug 22, 2014 in category General by Zoltan Farkas

Added some enhancements to the spf4j library to export attributes and operations.

All you need to do is annotate your getters, setters and operations with @JmxExport and call Registry.export() with your objects.

code @ http://www.spf4j.org



Why does history have to repeat itself?

10:52PM Aug 02, 2014 in category Java by Zoltan Farkas

I wonder if the large loss of life in the Iraq and Afghanistan wars was worth it… and I am pretty sure it was not…

13 years after 9/11, and 10 years after the initial 9/11 commission report

"Al Qaeda–affiliated groups are now active in more countries than before 9/11."

“The struggle against terrorism is far from over—rather, it has entered a new and dangerous phase.”

“A senior national security official told us that the forces of Islamist extremism in the Middle East are stronger than in the last decade.”

“ISIS now controls vast swaths of territory in Iraq and Syria, creating a massive terrorist sanctuary. One knowledgeable former Intelligence Community leader expressed concern that Afghanistan could revert to that condition once most American troops depart at the end of 2014.”

On PBS Frontline on Jul 29 somebody said about the new terrorist threat:

“This is Al Qaeda 6.0, they make Bin Laden’s Al Qaeda look like boy scouts”

I see the same failed strategy being employed by Israel in Gaza…  

The Israeli army is creating the next generation of extremists that will make the previous one look like boy scouts…

Why does history have to repeat itself?


spf4j alternative java flight recorder

08:56PM Jul 11, 2014 in category General by Zoltan Farkas

With JDK 7 update 40, Oracle released Java Mission Control + Java Flight Recorder:

(for more detail see: http://docs.oracle.com/javase/8/docs/technotes/guides/jfr/)

As with spf4j you can implement continuous profiling, and there are some pros and cons of using Java Flight Recorder:

Java Flight Recorder has some implementation advantages that in theory will provide better data quality. Oracle calls the impact "Zero performance overhead", which is sales BS; every engineer with an IQ greater than the room temperature knows that there is no such thing. However, the overhead can be minimal and potentially lower than that of spf4j, although not significantly lower.

But don't get ready to throw spf4j out of the window: Java Flight Recorder is available only on the Oracle JVM, and is free to use in your test environments only; for production environments you will need to buy a license. Meanwhile spf4j runs on any JVM, and is free to use in any environment.

Also some of the visualizations in spf4j are in my view better...

In any case Java flight recorder is a great tool for implementing continuous profiling.



Easily expose attributes and operations via JMX

10:14AM Jul 05, 2014 in category General by Zoltan Farkas

I implemented a small utility to export attributes and operations via jmx.

All you need to do is:

1) Annotate your attribute getter/setter or operation with @JmxExport

2) invoke: Registry.export("test", "Test", testObj1, testObj2...);

 and your attributes and methods will be available via JMX.

This is available in the latest version of spf4j
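For comparison, this is roughly what exporting one attribute and one operation looks like with the plain JDK API, without any annotation support. The sketch uses only standard javax.management calls; the Counter class and the object name "test:name=Test" are made up for the example and are not meant to mirror what the spf4j Registry generates.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class PlainJmxDemo {

  /** Standard MBean convention: the interface must be named <ClassName>MBean. */
  public interface CounterMBean {
    int getCount();          // exposed as readable attribute "Count"
    void setCount(int count); // makes the attribute writable
    void reset();            // exposed as an operation
  }

  public static class Counter implements CounterMBean {
    private volatile int count;

    @Override
    public int getCount() {
      return count;
    }

    @Override
    public void setCount(int count) {
      this.count = count;
    }

    @Override
    public void reset() {
      count = 0;
    }
  }

  /** Registers a Counter, reads its attribute back through the MBean server, then unregisters it. */
  public static int exportAndRead(int initial) {
    try {
      MBeanServer server = ManagementFactory.getPlatformMBeanServer();
      ObjectName name = new ObjectName("test:name=Test"); // hypothetical name
      Counter counter = new Counter();
      counter.setCount(initial);
      server.registerMBean(counter, name);
      try {
        return (Integer) server.getAttribute(name, "Count");
      } finally {
        server.unregisterMBean(name);
      }
    } catch (JMException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("Count read via JMX: " + exportAndRead(7));
  }
}
```

The extra interface per class is the boilerplate that an annotation-based export removes.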


Parallel qsort in zel

08:43PM Apr 04, 2014 in category Java by Zoltan Farkas

I had a bit of time to implement some extra features in zel, just enough so that I can write quick sort in zel:

func qSortP(x, start, end) {
  l = end - start;
  if l < 2 {
    return
  };
  pidx = start + l / 2;
  pivot = x[pidx];
  lm1  = end - 1;
  x[pidx] <-> x[lm1];
  npv = start;
  for i = start; i < lm1; i++ {
    if x[i] < pivot {
      x[npv] <-> x[i];
      npv ++
    }
  };
  x[npv] <-> x[lm1];
  qSortP(x, start, npv)&;
  qSortP(x, npv + 1, end)&
};

qSortP(x, 0, x.length)

As you can see it is pretty much the standard implementation, and since it is ZEL it is parallel.

Parallel exec time = 510 ms
Parallel exec time = 470 ms
Parallel exec time = 473 ms
Single Threaded exec time = 1640 ms
Single Threaded exec time = 1528 ms
Single Threaded exec time = 1527ms

Tests executed on a quad core MacBook pro and show good scalability of the execution engine.

pretty cool, tests are at https://code.google.com/p/spf4j/source/browse/trunk/spf4j-zel/src/test/java/org/spf4j/zel/vm/QSort.java
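For readers who do not know ZEL, here is a rough Java equivalent of the same algorithm using the JDK fork/join framework. It follows the same partitioning scheme as the ZEL code (pivot in the middle, moved to the end, then swapped into place); the class names are made up, and this is a sketch for comparison, not the benchmark code linked above.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ParallelQSort {

  /** Sorts x[start, end) in place; the two halves are sorted as parallel subtasks. */
  static final class SortTask extends RecursiveAction {
    private final int[] x;
    private final int start;
    private final int end;

    SortTask(int[] x, int start, int end) {
      this.x = x;
      this.start = start;
      this.end = end;
    }

    @Override
    protected void compute() {
      int l = end - start;
      if (l < 2) {
        return;
      }
      int pidx = start + l / 2;
      int pivot = x[pidx];
      int lm1 = end - 1;
      swap(x, pidx, lm1); // park the pivot at the end
      int npv = start;
      for (int i = start; i < lm1; i++) {
        if (x[i] < pivot) {
          swap(x, npv, i);
          npv++;
        }
      }
      swap(x, npv, lm1); // move the pivot to its final position
      invokeAll(new SortTask(x, start, npv), new SortTask(x, npv + 1, end));
    }
  }

  private static void swap(int[] a, int i, int j) {
    int tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
  }

  public static int[] sort(int[] x) {
    ForkJoinPool.commonPool().invoke(new SortTask(x, 0, x.length));
    return x;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(sort(new int[] {5, 3, 9, 1, 5, 0})));
  }
}
```

A production version would fall back to a sequential sort below some size threshold instead of forking down to single elements.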



ZEL now has channels.

09:22PM Mar 24, 2014 in category General by Zoltan Farkas

Here is a simple program where we have 1 producer and 10 consumers:

        ch = channel();
        func prod(ch) { for i = 0; i < 100 ; i++ { ch.write(i) }; ch.close()};
        func cons(ch, nr) {
            sum = 0;
            for v = ch.read(); v != EOF; v = ch.read() {
                out(v, ","); sum++
            };
            out("fin(", nr, ",", sum, ")")
        };
        prod(ch); // start producer
        for i = 0; i < 10; i++ { cons(ch, i) } //start consumers

as with zel futures, channel operations do not block a thread.

ZEL coroutines are multiplexed over a pool of threads, where channel.read and future.get (transparent) are points where execution can be suspended.

The current channel implementation is an unbounded channel.


spf4j release 6.5.2

11:06PM Feb 23, 2014 in category General by Zoltan Farkas

I finally managed to  clean up and improve ZEL to make it worthy of being part of spf4j.

You can check out the source or download the binaries (from the maven repo) at www.spf4j.org



zel and replicas

08:44PM Feb 19, 2014 in category General by Zoltan Farkas

Implemented the zel system function "first", which will return the first value returned by a set of async invocations.

This is in general practical for implementing replica invocations, where we care about the first and fastest result.

Here is a dummy example:

replica = func async (x) {
    sleep random() * 1000;
    out(x, " finished\n");
    return x
};
out(first(replica(1), replica(2), replica(3)), " finished first\n");
sleep 1000

returns something like:

3 finished
3 finished first
2 finished
1 finished

As you can see, in this case 3 finishes first. 2 and 1 finish afterwards, but their results are discarded.
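The closest JDK analogue I know of is CompletableFuture.anyOf, which also completes with whichever future finishes first while the others keep running. The sketch below is only meant to illustrate the semantics in Java; the class and the replica helper are made up, and unlike ZEL the replicas here each occupy a pool thread while sleeping.

```java
import java.util.concurrent.CompletableFuture;

public class FirstOfDemo {

  /** Returns the value of whichever replica completes first; the others are left running. */
  @SafeVarargs
  public static <T> T first(CompletableFuture<? extends T>... replicas) {
    @SuppressWarnings("unchecked")
    T result = (T) CompletableFuture.anyOf(replicas).join();
    return result;
  }

  /** Hypothetical replica call: returns the given value after the given delay. */
  public static <T> CompletableFuture<T> replica(T value, long delayMillis) {
    return CompletableFuture.supplyAsync(() -> {
      try {
        Thread.sleep(delayMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      return value;
    });
  }

  public static void main(String[] args) {
    Integer winner = first(replica(1, 300), replica(2, 200), replica(3, 10));
    System.out.println(winner + " finished first");
  }
}
```

As in the ZEL example, the slower replicas are not cancelled here; their results are simply ignored.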

Next on my list are exceptions and canceling async tasks whose results are not needed anymore...