Synthetic Data creates free course on Big Data to ensure quality in its professionals

Madrid, April 10, 2018 - Synthetic Data, a leading Big Data solution provider, has developed a free course on Big Data. The idea is to offer free training in order to develop professionals whom the company trusts and whose capabilities it knows firsthand. The course lasts two full days and covers the most sought-after Big Data technologies, such as Hadoop, Spark, HBase, and Hive. It is extremely hands-on, going step by step through every aspect of the installation, configuration, and use of those technologies. According to Synthetic Data's CEO Al Costa: "The market is in desperate need of qualified professionals, and the best way to ensure we can provide the best is to train them ourselves. There are far too many professionals out there with technology claims in their CVs which, in the end, do not match expectations, and this is something we want to tackle in a productive way." At the end of the course, students take a written test, and only the top three are chosen for a 12-month work period with Synthetic Data on its client projects. The course is always accepting new students: they need to send their CVs for approval to

Heimdall Data and Synthetic Data team up for Intelligent SQL Optimization

Madrid, July 15, 2017 - Heimdall Data, the next generation SQL data access platform, and Synthetic Data, a leading Big Data solution provider, today announced a strategic alliance for the European Union market, combining Synthetic Data's market reach in the financial services sector with Heimdall Data's unique transparent caching solution. Learn more here or read the press release

We will be speaking at IT Arena!

Madrid, August 10, 2017 - Synthetic Data will be speaking at IT Arena. This is by far the biggest IT event in Eastern Europe, and it takes place every year from September 29 to October 1 in beautiful Lviv, Ukraine.
There, our CEO will be speaking alongside companies such as Amazon Web Services, Bayer, and 500 Startups on how Big Data is just one part of
a larger puzzle and what is required to fit those pieces together properly. If you attend, don't forget to say hi!

Check out a project we published on GitHub!

Modern cities have replaced the stop lights in their traffic systems with roundabouts. Besides requiring no maintenance, roundabouts prevent collisions and improve traffic flow
in general. In a similar way, Big Data systems today rely on semaphore-based designs in which one script can only run after another has started, or after
it has received some data, etc. This eventually becomes a nightmare to manage and is thus prone to failure. As an alternative, we propose a system we call
"Roundabout", where several scripts (developed in Python, Java, Scala, etc.) feed data to a central script running on Spark Streaming.
This central script, the "roundabout", runs continuously, gathering data from those scripts (the "cars"), processing that data, and forwarding it to other scripts once they enter
the roundabout. This way, the automation is seamless, as data is kept by Spark until the script that needs it shows up, without the need for external job
managers. Download the code (beta) at and let us know your thoughts!
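
For readers who want a feel for the idea before downloading anything, the roundabout pattern described above can be sketched in a few lines of plain Python. This is only an illustration under our own assumptions, not the published beta code: it uses an in-memory dictionary where the real system would use Spark Streaming, and the names Roundabout, enter, exit, and report_job are hypothetical.

```python
from collections import defaultdict, deque
from typing import Any, Callable, Deque, Dict, List


class Roundabout:
    """Toy in-memory stand-in for the central streaming script (hypothetical)."""

    def __init__(self, process: Callable[[Any], Any] = lambda x: x) -> None:
        # Function applied to every payload as it enters the roundabout.
        self._process = process
        # Processed data held per destination script until that script shows up.
        self._held: Dict[str, Deque[Any]] = defaultdict(deque)

    def enter(self, destination: str, payload: Any) -> None:
        """A producer "car" drops off data addressed to another script."""
        self._held[destination].append(self._process(payload))

    def exit(self, consumer: str) -> List[Any]:
        """A consumer "car" arrives and collects everything held for it."""
        return list(self._held.pop(consumer, ()))


# Two producers deliver data before the consumer is running.
hub = Roundabout(process=lambda record: {**record, "seen": True})
hub.enter("report_job", {"rows": 10})
hub.enter("report_job", {"rows": 5})

# The consumer shows up later and picks up everything in arrival order.
first_pickup = hub.exit("report_job")
# A second visit finds nothing waiting.
second_pickup = hub.exit("report_job")
```

The point of the sketch is that producers and consumers never wait on each other: data simply stays in the roundabout until the script that needs it arrives, which is the role the text assigns to Spark.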