Recently we had to move a full Cassandra backup to another cluster of machines (another datacenter, in Cassandra's jargon). Although this can be achieved using DC replication, we opted for a more conservative approach that neither changes production configuration nor increases production load through data streaming. This post is a quick comparison to find out which tool performs better for copying a large directory tree locally.

The Data

One of our Cassandra clusters contains 12 nodes; each node has 532 GB of data distributed among 1,753,200 files (the /var/lib/cassandra folder).
Last week I had the opportunity to share Socialmetrix's experience installing and configuring Datastax Analytics clusters on Azure. Datastax offers a commercial solution as a bundle that integrates Cassandra, Spark, and Solr. The talks were given at the Argentina Big Data Meetup, hosted by Jampp, and at the Nardoz Meetup, hosted by Medallia.
We run several processes that may take hours to complete, and it is nice to be notified on a Slack channel when those processes finish correctly. Using Slack's Incoming Webhooks API, a small bash script, and a couple of tricks, it is really simple!
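As a rough illustration of the idea, here is a minimal Python sketch (the post itself uses a bash script, which is not shown in this excerpt). It assumes an Incoming Webhook has already been created in Slack and that its URL is available in an environment variable. The variable name `SLACK_WEBHOOK_URL`, the helper names `build_payload` and `notify`, and the message wording are all hypothetical.

```python
import json
import os
import urllib.request

# Assumption: the webhook URL is supplied through this (hypothetical)
# environment variable rather than hard-coded in the script.
WEBHOOK_URL = os.environ.get("SLACK_WEBHOOK_URL", "")


def build_payload(job_name: str, exit_code: int) -> bytes:
    """Build the JSON body for an Incoming Webhook message."""
    if exit_code == 0:
        status = "finished successfully"
    else:
        status = f"failed (exit code {exit_code})"
    # Incoming Webhooks accept a JSON object with a "text" field.
    return json.dumps({"text": f"Job `{job_name}` {status}"}).encode("utf-8")


def notify(job_name: str, exit_code: int, url: str = WEBHOOK_URL) -> int:
    """POST the message to Slack and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=build_payload(job_name, exit_code),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A long-running job could call `notify("nightly-backup", exit_code)` as its last step, so the channel hears about both successes and failures.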
This is the second post about the Socialmetrix Quantum API; this time we'll use the API to show summary statistics about campaigns. Please refer to the first post to get your API token and basic API usage instructions.
We walk you through the process of creating a campaign and assigning posts to it through the web UI. Once the information is loaded, we'll extract these metrics using the Quantum API.
Sometimes you just need data to learn how an algorithm works, to run a stress test, or just to have an excuse to spin up several machines in a cluster and see how it crunches the data. More often than not, it is incredibly hard to obtain data, and a few colleagues I've talked to had a similar problem, so this post is a collection of links and references to datasets that I know have been open sourced. Please contribute =)
An interview La Nacion did with us about social networks and politics.
Although the tag cloud seems a somewhat outdated and criticized visualization format, I have no doubt it can be useful sometimes. And if you can create one with only a few keystrokes, it is pretty sweet. Below I'll show the technique for extracting Twitter #hashtags, but you can apply it to virtually any text source.
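The counting step behind a tag cloud can be sketched in a few lines of Python. This is not the post's own method, which is not shown in this excerpt; it is only an illustration under simple assumptions: a hashtag is `#` followed by word characters, tags are compared case-insensitively, and the sample tweets are made up.

```python
import re
from collections import Counter

# Assumption: a hashtag is "#" followed by one or more word characters.
HASHTAG = re.compile(r"#(\w+)")


def hashtag_counts(texts):
    """Count hashtags across an iterable of strings, ignoring case."""
    counts = Counter()
    for text in texts:
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts


# Made-up sample input for illustration only.
sample_tweets = [
    "Loving #BigData and #Spark",
    "#spark on #Azure",
    "no tags here",
]

for tag, count in hashtag_counts(sample_tweets).most_common():
    print(f"{count}\t#{tag}")
```

The resulting frequencies are what a tag cloud renderer needs: each tag's count determines its font size.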
Most of our data is stored in MySQL and Cassandra; MySQL was the primary data store when we started the company. Currently our MySQL workload runs on AWS RDS, and we would like to give Microsoft Azure a try. This post documents a few tricks we learned to reduce the total time of dump, transfer, and restore. Hope it can help you too.
On November 11 I was invited to take part in the program Aldea Global on the radio station FM Tribunales 90.5, where we talked about the use of social networks as a tool for understanding public opinion.
On this occasion I was able to describe the work we do at Socialmetrix to measure the candidates and to understand public sentiment and conversation topics, helping parties understand their audience and its wishes or complaints.
This is the first post of a series on how to use Socialmetrix Quantum as a data source, which enables you to create custom dashboards or ingest social data into your internal systems, empowering your big data initiatives.
To get started, we will log in to your Quantum account and get the API authentication token. Don't have an account yet? Get a free trial!