Imitation game?

The quote from Charles Caleb Colton is well known: “Imitation is the sincerest form of flattery”.

On the Internet, everything is ultimately up for grabs. Content is freely and easily accessible – the power of the web. However, copying is not flattery.

Sharing information doesn’t mean allowing someone else to take ownership of that information. The principle is straightforward: if you like, you link. And do not forget to return the favor by sharing genuine information as well, thereby expanding the web of knowledge.

So beware. There’s a copycat around copying content from our web site. We are currently assessing our options (and I will probably write a piece on this once it’s been dealt with) – but in the meantime, please make sure that your bookmarks are up to date.

Thanks for reading!


[Reading Time: 5mn]

Time flies. And good things mature.

Looking at a couple of posts from a few years back, I see that the ones on outlier detection with Clojure and Incanter are still valuable and, if I believe the blog stats, frequently read.

But they are way out of date regarding the libraries I used at the time. Those libraries have matured quite a bit too, which means that the Clojure ecosystem is as strong as ever (if not stronger). I have updated the posts slightly, and the code heavily, so you might want to check them out.

Thanks for reading

Big Data Taxonomy Visualisation

[Reading Time: 15mn]

What is it?

In the last couple of blog posts, I have explained my thoughts on software architecture documentation in general, with a focus on data-driven architecture documentation.

Data management has become a complex topic over the last decade, mostly caused by the explosion of data sources, itself a consequence of the explosion of internet-connected services and devices. A term has been coined to depict this situation: Big Data.

Oftentimes, our clients want to understand 2 things about Big Data:

  • What it means, in order to understand whether they need to feel concerned.
  • What the landscape is, or the cartography, or even better: the taxonomy.

Continue reading “Big Data Taxonomy Visualisation”

Architecture & Software Documentation, Part II – The Database

[Reading Time: 15mn]


I introduced the topic of documenting software artefacts in my previous blog post. As I said, the purpose of good documentation is to communicate – to your technical peers, to your boss, to your sponsors.

But a good documentation set is hard to achieve and then to maintain. It should live as close to the actual code as possible. A good deal can be obtained through reverse engineering to / from UML when your project is object-oriented. But let’s face it, we live in a polyglot programming world. At least some layers in our software design won’t be amenable to UML because they don’t use object-oriented design. The canonical example: how would you go about documenting your SQL-based data model?

Continue reading “Architecture & Software Documentation, Part II – The Database”

Architecture and Software Documentation

[Reading Time: 15mn]


In software development projects, documentation is often considered an afterthought – if not completely ignored. As professionals, we all know that documenting is part of the design and development activity, much like pushing artifacts to production is part of the development process. Of course, I am not talking about the project’s own documentation, such as specifications / requirements, sponsor meeting minutes or project planning. They are obviously necessary, but not what I want to review here. No, here I am talking about documenting the design and the code. And remember, we have all been told that it is not only good practice, but also contributes to the overall quality of the project. Not convinced? Okay, let me tell you a story we all know already.

Continue reading “Architecture and Software Documentation”

The Art of Queueing – Simulated

[Reading Time: 30mn]

I don’t know a lot of people who enjoy queueing. But queueing is a fact of life: each time we compete for access to some limited resource, the natural tendency is to form queues: taxi ranks, social security counters, Eiffel Tower entrances, Disneyland’s latest attraction, overcrowded supermarket tills on Saturdays, the annual iPhone launch day at any Apple Store. The list goes on and on.

Last week I was standing in a queue which had formed in front of 2 metro ticket machines at the St Lazare station in Paris, France. The line was short but growing. Not a good sign, but the Metro hall at St Lazare is huge: you can fit many more people in before the crowding becomes uncomfortable. And in any case, my turn was close.

Continue reading “The Art of Queueing – Simulated”

Prevent Technical Debt with SOLID

[Reading Time: 15mn]

Following my previous post on Technical Debt, some readers argued that yes, it is definitely easy to put the blame on developers. But, put under pressure to deliver business-critical features with time-to-market as the main requirement, it is very difficult to produce “right” code – the kind of code you will not resent a few days later.

I am a big proponent of equipping developers with patterns. This is totally different from using frameworks – I will dedicate another blog post to that topic in the future. For a developer, patterns are maps: given a specific requirement aspect and environment, they reveal the best possible paths of implementation. They are invaluable. And because they are cataloged, shared and well known, they help produce “right” maintainable code more easily.

Continue reading “Prevent Technical Debt with SOLID”

Technical Debt – A Primer

[Reading Time: 5mn – Presentation: 30mn]

A few weeks ago, I had the opportunity to give a presentation on Technical Debt to a panel of corporate IT managers of a large European bank.

Ward Cunningham introduced the Technical Debt financial metaphor around 20 years back, in an effort to help non-technical people understand why money needs to be spent continually on software products. Well, I think his message is still very valuable today.

Following this presentation and the workshops and discussions which ensued, some patterns seemed to emerge.

Continue reading “Technical Debt – A Primer”

Data Quality – Testing Outliers With Live Market Data And The Stockings Library

[Reading time: 10mn]

While working on enhancing our outlier detection library, we saw an announcement for a new Clojure library called “stockings”. This library offers an API which you can use to get market data from the Yahoo Finance service (stockings.core) or the Google Finance service (stockings.alt).

We’ll leave it to you to have a look at the complete capabilities of the library. As you might guess from our previous posts, we were specifically interested in its market data history features. Our existing example starts with data available on disk in a CSV file. We’ve adapted it so that it dynamically gets historical data using the stockings library.

Continue reading “Data Quality – Testing Outliers With Live Market Data And The Stockings Library”

Data Quality – Outliers Display with Incanter

[Reading time: 5mn]

In our previous post, we briefly explained how we used Clojure to detect data outliers with descriptive statistics. Since then, we have enriched our prototype library with further detection methods: MAD (Median Absolute Deviation) and IQR (Interquartile Range). The source code is available on GitHub if you want to play around with it.
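
As a rough illustration of what these two methods compute, here is a minimal Python sketch. It is not the Clojure code from our library: the function names, the return shapes, the sample prices and the cut-off values (3.5 for the MAD-based score, 1.5 for the IQR fences) are assumptions chosen for the example, although both cut-offs are common conventions.

```python
from statistics import median, quantiles


def mad_outliers(values, threshold=3.5):
    """Flag points whose MAD-based (modified z) score exceeds the threshold.

    Returns a list of (index, value, score) tuples. The 0.6745 factor rescales
    the MAD so that the score is comparable to a z-score on normal data.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half of the points are identical: the score is undefined.
        return []
    scored = [(i, v, 0.6745 * (v - med) / mad) for i, v in enumerate(values)]
    return [item for item in scored if abs(item[2]) > threshold]


def iqr_outliers(values, k=1.5):
    """Flag points outside the fences [Q1 - k*IQR, Q3 + k*IQR].

    Returns a list of (index, value) tuples.
    """
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]


# Hypothetical closing prices with one obvious spike at index 5.
prices = [10.0, 10.2, 9.9, 10.1, 10.0, 25.0, 10.3, 9.8]
print(mad_outliers(prices))
print(iqr_outliers(prices))
```

Both functions rely on the median or the quartiles rather than the mean and standard deviation, so a single extreme value does not distort the baseline it is compared against.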

Now, how good are these outlier detection methods? As the functions return a collection of offending points with calculation details, it is rather difficult to judge whether the results are pertinent or not. For this, you want to see the time series on a chart with outliers highlighted – well, let’s say that we want to see this.

Continue reading “Data Quality – Outliers Display with Incanter”