Is data mining really that boring?

Many of you may think that analyzing data and running statistics is dreadfully boring, and you may be right – but as a self-proclaimed geek, I actually enjoy playing around with data, especially if it poses a challenge and, on top of that, shows some nice or surprising patterns.


My latest “assignment” was quite interesting: I was investigating mission wins during each war that has been played out. Not only did I look at faction wins for each mission, I also looked into wins for the different battle types and wins depending on which faction was the defending party.

Some of our players have been concerned about faction balance within the game, and we of course wanted to investigate whether this is actually a problem. That meant looking into a vast amount of old log files going back to last year, so I decided to write a program that would do it all for me – God only knows, I am too lazy to do it all by hand :-)

Since most of Heroes & Generals is written in C#, I wanted to write the program in the same language, even though my previous programming experience has been restricted to Java, C++ and VBA. So, with the help of nicely available online tutorials and heavy use of Google (thank God for Google!), I spent some time teaching myself C#. I was told it would be quite similar to Java, but seriously folks, for a newbie like me – the hell it is! However, I did manage to write a functioning program in Visual Studio (I am sure all of you hard-core programmers could write a much prettier one, but hey, it works!). The program went through many thousands of lines in multiple log files, cross-referenced them against other log files, and extracted the data needed for querying mission wins. Boy, how I love writing programs – especially ones that can save me a lot of time :-)
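For the curious, here is a minimal sketch of that kind of log crunching. It is written in Python rather than C#, and it is not the actual tool: the two log formats, the IDs and the function names are all made up for illustration. The idea is simply to read one log that says who won each mission, cross-reference a second log that says which war each mission belonged to, and turn the counts into win percentages per war.

```python
from collections import Counter, defaultdict

# Hypothetical log formats (assumptions, not the real game logs):
#   mission log line:  "<mission_id> <winning_faction>"
#   war log line:      "<mission_id> <war_id>"

def parse_pairs(lines):
    """Turn 'key value' lines into a dict, skipping blank or malformed lines."""
    pairs = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            pairs[parts[0]] = parts[1]
    return pairs

def win_percentages(mission_lines, war_lines):
    """Return {war_id: {faction: % of that war's missions won}}."""
    winners = parse_pairs(mission_lines)
    wars = parse_pairs(war_lines)

    # Cross-reference: count wins per faction within each war.
    counts = defaultdict(Counter)
    for mission_id, faction in winners.items():
        war_id = wars.get(mission_id)
        if war_id is None:
            continue  # mission does not appear in the war log
        counts[war_id][faction] += 1

    result = {}
    for war_id, per_faction in counts.items():
        total = sum(per_faction.values())
        result[war_id] = {f: 100.0 * n / total for f, n in per_faction.items()}
    return result

# Invented sample data, only to show the shape of the input.
mission_log = ["m1 Axis", "m2 Allies", "m3 Axis", "m4 Axis",
               "garbage line here", "m5 Allies"]
war_log = ["m1 war1", "m2 war1", "m3 war1", "m4 war1", "m5 war2"]

shares = win_percentages(mission_log, war_log)
print(shares["war1"])  # Axis 75.0, Allies 25.0 for this toy input
```

Malformed lines are skipped rather than treated as errors, which is usually what you want when digging through months of old logs.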

Anyway, enough about my venture into the secret world of C#. Below are some of the results – keep in mind that this is a work in progress and something we will continue to investigate and monitor to provide the best experience for our players.

From November 2012 to January 15th 2013 (before the Guderian build), the overall mission wins (% of missions won within a war) looked like this:


and after the Guderian build, the results were:

It seems that the Guderian build may have taken care of some of the inconsistencies regarding faction wins, but this is something we will monitor closely.

When dividing wins by battle type and looking at the defending faction, the graphs below show the percentage of wins when defending. Again, the data is divided into before the Guderian build (January 15th 2013, top graph) and after (bottom graph).


These results are quite interesting as it seems that for the Airfield battles, defenders lose almost 75% of the times a battle is fought. We will definitely investigate this further.
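A defender win percentage of this kind is just a grouped ratio: for each battle type and defending faction, the number of missions the defender won divided by the number of missions fought. Here is a rough Python sketch of that calculation. The tuple layout, the function name, the "Town" battle type and every number in the sample are invented for illustration, so it says nothing about the real results.

```python
from collections import defaultdict

def defender_win_rates(missions):
    """missions: iterable of (battle_type, defender, winner) tuples.

    Returns {(battle_type, defender): % of missions the defender won}.
    """
    fought = defaultdict(int)
    held = defaultdict(int)
    for battle_type, defender, winner in missions:
        key = (battle_type, defender)
        fought[key] += 1
        if winner == defender:
            held[key] += 1
    return {key: 100.0 * held[key] / fought[key] for key in fought}

# Invented sample data (not real game results).
sample = [
    ("Airfield", "Axis", "Allies"),
    ("Airfield", "Axis", "Allies"),
    ("Airfield", "Axis", "Axis"),
    ("Airfield", "Axis", "Allies"),
    ("Airfield", "Allies", "Allies"),
    ("Airfield", "Allies", "Axis"),
    ("Town", "Axis", "Axis"),
]

rates = defender_win_rates(sample)
for (battle_type, defender), pct in sorted(rates.items()):
    print(f"{battle_type:10s} defended by {defender:7s}: {pct:5.1f}% held")
```

Keeping the raw counts in `fought` next to the percentages makes it easy to see when a percentage rests on only a handful of missions.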

So, these are some of the things I work with here at Reto-Moto, and in the future I will continue looking into the game statistics to help make the game fun and interesting for our users.
Now, isn’t this fun – do you still think data mining is boring!?

  1. Unbeugsam, 03-12-2013


    we need ACTION !!!!! xD

    • SkyDaggers, 03-12-2013

      A release date for Hansen would be better, until Hansen Arma 3 time.

  2. zerov25, 03-12-2013

    Nice to see some real numbers so we can tell with precision who won more wars and why or how :) Good job, and learning any kind of coding on your own is always hard to do!

  3. cRo4Ti4, 03-12-2013

    Woooowww, nice. But one question: since Axis seems more grouped / clan-active, this means this data are ….. ahh wait … now I know why there is no group/clan feature … it would distort the data outcome if clans play against the public.

    …hmmm nice .. can we see more? Or daily info?

  4. Dondergod, 03-12-2013

    What I would mostly find interesting is weapon stats.
    Kills with every unmodified weapon. Caps with every unmodified weapon and so on.
    What is the average k/d with a garand, how many kills an hour, and how many caps an hour?
    These kinds of stats would really seem interesting to me.

  5. bsnake(Latvia), 03-12-2013

    Thanks for the data, it was interesting to see and read. And +1 to Dondergod. I’d like to see this type of data about weapons. :)

  6. katthedemon, 03-12-2013

    So let me get this right: we have one faction that is more noob-friendly and easier to learn, and has some really strong equipment that the other side can’t effectively counter, and you try to equalize the ratio of won battles?

    Statistics are a nice tool if you factor in every detail that influences the statistic. It’s dangerous to look at general statistics like win rates and balance something around them. World of Tanks comes to mind as horrible balancing and a horrible way of treating statistics in general.

    As it is now, both factions have different amounts of skilled players and teamwork. If a good Axis clan joined the Allies for a couple of wars, the numbers could change dramatically, so balancing around them might be a mistake. Instead we need to get rid of the player factor, or at least minimize it, by normalizing the statistics according to player skill.

    Would be nice to see statistics on kills with the K98 and M1, and the range as well as missed shots etc. I’m sure this would show some very revealing results.

    Same goes for the Stuart and the Panzer 1. A simple comparison of the number of killed infantry and tanks against player skill in general would reveal a lot of issues imo.

  7. Gerhacht, 03-12-2013

    Nice 2c some1 is looking into statistics. Interesting data that need some analysis for sure!

    It would be more informative if the number of wins, not the percentage, were presented, because it would show short and long wars better than percentages.

    Any interpretations done for faction balance already, crazycat?

  8. CraneTYS, 03-12-2013

    Defenders are losing because it’s a small map with a short distance between points. Also, the spawn place for tower defenders is awful.

  9. cRo4Ti4, 03-13-2013
    • fagadaba, 03-13-2013


  10. Reto.CrazyCat, 03-13-2013

    Thanks for the nice comments. I will of course in time look into more of this and also in more detail – even running real stats instead of just averages, where I can factor in several variables and get P-values and such.
    cRo4Ti4 – nice graphs :-)

  11. Marrv, 03-13-2013

    I have to say it:
    Lies, Damned Lies & Statistics! :P
    Seriously though, I love numbers and using them to find patterns and anomalies. So I see what you’re doing here, and trying to find empirical evidence to confirm or contradict in-game player reports is a useful thing. It will allow better feedback that is moderated by those results, limiting the effect of “those who shout loudest are often placated”.

    I hope you enjoy metaphorical cake & red bull, am going to send some to help with the crunching!

  12. Reto.CrazyCat, 03-13-2013

    Thanks Marrv – I appreciate it :-) And I do like cake and red bull, even if it is just metaphorical!

  13. Mako, 03-13-2013

    Thanks for posting this, Crazycat! You’re not the only one that enjoys poring over a bunch of data; it’s always interesting to compare the ‘eye test’ to the real numbers.

    In regards to defenders and the airfield map, my eye test says the reason for the difficulty in defending is the spawns. Once the frontline is condensed beyond the first and especially the 2nd cap points on either side, the defenders are essentially spawning right out in the line of fire, especially at the tower. A few terrain and spawn location alterations would address it.

    As for campaign balance overall, eshhh. Numbers don’t lie and they parallel the eye test. More often than not the Axis are stomping the Allies, even if the win % on each map has equalized somewhat. This is a case of sheer numbers being the ace card in the end. That volume is brought on by equipment ease of use, plain and simple, i.e. not ‘yours is better than mine’. Overall the Allied equipment requires a different style of play to use effectively, and on average most players don’t desire or aren’t able to adapt.

  14. cRo4Ti4, 03-13-2013

    That would be awesome. Can we see a graph of the actual “veteran” % per side? Let’s say those higher than rank 5.

    Just out of interest, to see if it really is true that “the veterans play more on the Axis side”. Maybe it’s only that the veterans who play on the Axis side are more dancers (who are more visible) than the Allied ones.

  15. Zapfcreations, 03-13-2013

    I know why defenders are losing the battles on the Airfield map, and also why the Allies tend to win more on the Airfield map. And it isn’t the placement of the CPs :-)

    hint: just do a data query of the names of allied players who are at the winning battles on the Airfield map.

  16. Pego, 03-13-2013

    haha ! I knew that Shovel was going to be used for something !

  17. _DaltonBlue_, 03-14-2013

    how long is the game down

  18. Summarius, 03-15-2013

    Ok, as one of the players complaining the loudest about this, as well as someone whose adult jobs have centered around data analysis:


    You will need to break out a lot of details. You get scads and scads of data, the more you dig through and sift through, the more you will learn about your own game.

    This is an excellent first step. When I get some time, I will offer a few suggested breakdowns and things to tease out of the data.

    As an accountant and former intel analyst… data analysis is always fascinating.

  19. Reto.CrazyCat, 03-15-2013

    Summarius – it is nice to hear that someone else finds data analysis cool :-)

  20. Dormant, 03-20-2013

    “These results are quite interesting as it seems that for the Airfield battles, defenders lose almost 75% of the times a battle is fought. We will definitely investigate this further.”

    No need to investigate: it’s a bug.
    Defending paratroopers with aeroplanes (a common thing on airfields) cannot spawn.

