How to install Groovy on a Banana Pi or Raspberry Pi

The Raspberry Pi and the Banana Pi come with Python pre-installed.
If you prefer Groovy on these nice little devices, here are the instructions to install it.
GVM makes the whole process very easy and convenient.

Open a terminal window and use these commands:

curl -s get.gvmtool.net | bash
source "$HOME/.gvm/bin/gvm-init.sh"
gvm install groovy

This will download and install the latest Groovy version.
Check the installed version:

groovy -version
Groovy Version: 2.4.2 JVM: 1.8.0 Vendor: Oracle Corporation OS: Linux
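To double-check that the runtime itself works, a Groovy one-liner will do (GroovySystem.version and the java.version system property are part of the standard API):

```groovy
// quick smoke test of the freshly installed runtime
println "Groovy ${GroovySystem.version} running on Java ${System.getProperty('java.version')}"
```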

Happy Groovy coding on your Banana Pi!

Posted in Development, Groovy

Adaptive machine learning in a streaming environment

Here is a very nice explanation about adaptive machine learning in a streaming environment.

Adaptive machine learning

Posted in Big Data

How to Solve: “Too many connections”; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException:


In one of my projects I used GPars and its withPool() method to execute some asynchronous tasks.
The main purpose of each task is to fetch some data from external sources and save the result to a database. Very simple.

Since the tasks did not consume much memory and performance was mostly determined by the response time of some external REST services, I chose quite a high degree of parallelism, with a pool size of 500.

// GPars
withPool(500) {
   tasksToDo.eachParallel { task ->
      // .. do some long running tasks asynchronously
      // and save the result to a database
   }
}

Everything worked fine in the local Grails test environment.
As production environment I used Amazon's AWS Elastic Beanstalk with an RDS MySQL instance as database backend, both on t1.small instances.

But with the Amazon installation I got the following error message:

"Too many connections"; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException:

2014-08-20 09:19:54,373 [ForkJoinPool-1-worker-218] ERROR crawler.BasicRssGatherer  - org.springframework.dao.DataAccessResourceFailureException: Hibernate operation: could not prepare statement; .....
Data source rejected establishment of connection,  message from server: "Too many connections"

It turned out that Amazon RDS instances have a fixed limit of available connections.
You can check this out by

  • opening the AWS console,
  • going to your RDS instance,
  • opening “Parameter Groups” and
  • searching for “max_connections”.

In my case the formula found is {DBInstanceClassMemory/12582880}, which calculates to:

	MODEL       max_connections  innodb_buffer_pool_size
	----------  ---------------  -----------------------
	t1.micro      34               326107136 (  311M)
	m1-small     125              1179648000 ( 1125M,  1.097G)
	m1-large     623              5882511360 ( 5610M,  5.479G)
	m1-xlarge   1263             11922309120 (11370M, 11.103G)
	m2-xlarge   1441             13605273600 (12975M, 12.671G)
	m2-2xlarge  2900             27367833600 (26100M, 25.488G)
	m2-4xlarge  5816             54892953600 (52350M, 51.123G)
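To make the formula concrete, here is a small Groovy sketch. The memory value is an assumption on my part (roughly 1.5 GiB, picked to reproduce the m1-small row above); RDS reports DBInstanceClassMemory in bytes.

```groovy
// max_connections = {DBInstanceClassMemory/12582880}, integer division
long dbInstanceClassMemory = 1572864000L // assumed value for an m1.small, in bytes
long maxConnections = dbInstanceClassMemory.intdiv(12582880)
println maxConnections // 125, matching the m1-small row in the table above
```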

It is not possible to adjust these settings. So you can either switch to a bigger and more costly RDS instance, or reduce the pool size, which might reduce the overall performance of your app. This might be Amazon's approach to monetization. A third option is to roll out your own MySQL installation, where you can set the connection limit yourself, but you will lose all the conveniences of a managed RDS database.
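A possible middle way, sketched here with plain java.util.concurrent instead of GPars (the names, numbers and the semaphore approach are my own illustration, not from the original setup): keep many worker threads for the slow REST calls, but cap the number of concurrent database operations with a semaphore, so the RDS connection limit is never hit.

```groovy
import java.util.concurrent.Executors
import java.util.concurrent.Semaphore
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicInteger

int workers = 100  // high parallelism for the slow REST calls
int dbLimit = 10   // hard cap for concurrent "DB" operations
def dbPermits = new Semaphore(dbLimit)
def inDb = new AtomicInteger(0)     // current concurrent DB ops
def maxInDb = new AtomicInteger(0)  // observed peak

def pool = Executors.newFixedThreadPool(workers)
(1..workers).each {
   pool.submit {
      // ... the long running REST call would happen here, outside the semaphore ...
      dbPermits.acquire()
      try {
         int now = inDb.incrementAndGet()
         synchronized (maxInDb) { if (now > maxInDb.get()) maxInDb.set(now) }
         Thread.sleep(10) // stand-in for the database write
      } finally {
         inDb.decrementAndGet()
         dbPermits.release()
      }
   }
}
pool.shutdown()
pool.awaitTermination(30, TimeUnit.SECONDS)
println "peak concurrent DB ops: ${maxInDb.get()}"
```

The peak never exceeds dbLimit, while the thread pool stays large for the network-bound part of the work.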

So, be aware that Amazon's RDS MySQL instances have a fixed connection limit if you are working with high numbers of parallel database accesses. The limit can be calculated with the formula found in the max_connections parameter as DBInstanceClassMemory/12582880.


Posted in Development

Create a WebCrawler with less than 100 Lines of Code


While there are excellent open source web crawlers available (e.g. crawler4j), I wanted to write a crawler with very little code.
Very often I use the famous Jsoup library. Jsoup has some nice features to find and extract data from a URL:

// extract URLs from HTML using Jsoup
def doc = Jsoup.connect("").get() as Document
def links = doc.select("a[href]") as Elements

To crawl the web, you can use recursive programming and code a closure. This may be elegant and produces very compact code, but can end up in an out-of-memory scenario. If you prefer this way, remember Groovy's @TailRecursive annotation, which can help to avoid an out-of-memory error.
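A minimal sketch of what the annotation buys you (my own example, not the crawler code): without @TailRecursive, a recursion depth of one million would throw a StackOverflowError; with it, the compiler rewrites the self-call into a loop.

```groovy
import groovy.transform.TailRecursive

class Counter {
   // the recursive call is in tail position, so @TailRecursive can unroll it
   @TailRecursive
   static long countDown(long n, long acc) {
      if (n == 0L) {
         return acc
      }
      return countDown(n - 1, acc + 1)
   }
}

// a depth that would normally blow the stack
println Counter.countDown(1000000L, 0L)
```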

Another solution is to work with queues, e.g. ArrayDeque. Array deques have no capacity restrictions, grow as necessary, and are considerably faster than Stack or LinkedList. But remember: an ArrayDeque is not thread-safe. If you need thread safety, you have to provide your own synchronization code. ArrayDeque implements java.util.Deque, which defines a container supporting fast element addition and removal at both the beginning and the end; this is used in the following code.
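As a tiny standalone illustration of that FIFO behaviour (the example urls are just placeholders):

```groovy
// ArrayDeque as a work queue: add() appends to the tail, poll() takes from the head
def queue = [] as ArrayDeque
queue.add("http://example.org/a")
queue.add("http://example.org/b")

assert queue.poll() == "http://example.org/a" // head first, FIFO order
assert queue.contains("http://example.org/b")
assert queue.size() == 1
```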

package crawler

import org.codehaus.groovy.grails.validation.routines.UrlValidator
import org.jsoup.Jsoup
import org.jsoup.nodes.Document
import org.jsoup.nodes.Element
import org.jsoup.select.Elements

import groovy.util.logging.Slf4j
import java.util.regex.Matcher
import java.util.regex.Pattern

@Slf4j
class BasicWebCrawler {

   private final boolean followExternalLinks = false
   def linksToCrawl = [] as ArrayDeque
   def visitedUrls = [] as HashSet
   def urlValidator = new UrlValidator()
   def seedURLs = []
   final static Pattern IGNORE_SUFFIX_PATTERN = Pattern.compile(".*(\\.(css|js|bmp|gif|jpe?g" +
      "|png|tiff?|mid|mp2|mp3|mp4" +
      "|wav|avi|mov|mpeg|ram|m4v|pdf" +
      "|rm|smil|wmv|swf|wma|zip|rar|gz))\$")

   private final timeout = 3000
   private final userAgent = "Mozilla"

   def collectUrls(List seedURLs) {
      this.seedURLs = seedURLs
      linksToCrawl.addAll(seedURLs)
      while (!linksToCrawl.isEmpty()) {
         // "poll" removes and returns the first url in the queue
         def urlToCrawl = linksToCrawl.poll() as String
         visitedUrls.add(urlToCrawl)
         try {
            // extract URLs from HTML using Jsoup
            def doc = Jsoup.connect(urlToCrawl).userAgent(userAgent).timeout(timeout).get() as Document
            def links = doc.select("a[href]") as Elements
            links.each { Element link ->
               // find the absolute path
               def absHref = link.attr("abs:href") as String
               if (shouldVisit(absHref)) {
                  if (!visitedUrls.contains(absHref) && !linksToCrawl.contains(absHref)) {
                     log.debug "new link ${absHref} added to queue"
                     linksToCrawl.add(absHref)
                  }
               }
            }
         } catch (org.jsoup.HttpStatusException e) {
            // ignore 404
         } catch (java.net.SocketTimeoutException e) {
            // handle exception
         } catch (IOException e) {
            // handle exception
         } catch (Exception e) {
            // handle exception
         }
      }
   }

   private boolean shouldVisit(String url) {
      // filter out invalid links
      def visitUrl = false
      try {
         boolean followUrl = false
         def match = IGNORE_SUFFIX_PATTERN.matcher(url) as Matcher
         def isUrlValid = urlValidator.isValid(url)

         if (!followExternalLinks) {
            // follow only urls which start with any of the seed urls
            followUrl = seedURLs.any { seedUrl -> url.startsWith(seedUrl) }
         } else {
            // follow any url
            followUrl = true
         }
         visitUrl = (!match.matches() && isUrlValid && followUrl)
      } catch (Exception e) {
         // handle exception
      }
      return visitUrl
   }
}
As shown, it is possible to code a web crawler in less than 100 lines of code.
Just provide a list of seed urls as argument to the collectUrls method:

def seedUrls = []
def crawler = new BasicWebCrawler()
crawler.collectUrls(seedUrls)

HTH Johannes

Posted in Development, Groovy

How to solve: *** java.lang.instrument ASSERTION FAILED ***: “!errorOutstanding” with message transform method call failed at ../../../src/share/instrument/JPLISAgent.c line:


As I tested some memory-intensive, recursive functions with GGTS 3.6 and Grails 2.4, execution was interrupted with the following error message:

*** java.lang.instrument ASSERTION FAILED ***: 
"!errorOutstanding" with message transform method call failed at 
../../../src/share/instrument/JPLISAgent.c line: 844

While this error message is not really helpful for finding the root cause, I guess it has to do with the forked execution of tests.

To make the error disappear you can either comment out the complete grails.project.fork section in BuildConfig.groovy, or use the following setting:

// BuildConfig.groovy

forkConfig = [maxMemory: 1024, minMemory: 64, debug: false, maxPerm: 256]
grails.project.fork = [
    // configure settings for the test-app JVM, uses the daemon by default
    test: false,
    //test: [maxMemory: 768, minMemory: 64, debug: false, maxPerm: 256, daemon:true],
]

After you have edited your BuildConfig.groovy, also do a grails clean and you will be back in business again.

HTH Johannes

Posted in Development, Grails

Debugging in GGTS fails with: Error – your app path – does not appear to be part of a Grails application.


Recently I worked on a Grails 2.4.x project in GGTS 3.6. After some coding, refactoring and reconfiguration, debugging stopped working with the following error message:

Error | /Users/jolo/Documents/workspace-ggts-3.6.0.M1/FeedHarvesterGPars does not appear to be part of a Grails application.

The following commands are supported outside of a project:

|Run 'grails help' for a complete list of available scripts.

The weird thing was that grails run-app still worked, while only debugging wasn't possible anymore.

In a first attempt, I did the standard routines: deleted the local caches (e.g. rm -rf .grails) and ran the typical Grails commands:

grails clean
grails compile
grails refresh-dependencies

With the result that

grails run-app 

was still successful, but

grails run-app -debug-fork

failed again, with the error message above.

With that kind of error message, it is almost impossible to track down the root cause, especially when grails run-app works and grails run-app -debug-fork does not. These are the moments I dislike GGTS/Grails and all of its magic.

After I had lost a reasonable amount of my precious time guessing what the heck was going on, it turned out that the problem was caused by pulling the project from the local GGTS repository into a new local git repository!

To solve the issue, do in GGTS:

  1. choose Run -> Debug Configurations,
  2. find the Grails section,
  3. mark the corresponding configuration and
  4. delete it with a right mouse click.

So be warned, if you move your project to a local git repository!


Posted in Development, Grails