
Parsing the parse: Demo Specifics


  • Parsing the parse: Demo Specifics

    So using the tool that Valve has provided, we get a 2 million line or so file of everything that happened in the game. I figured it'd be good to start a thread to compile different ways of calculating metrics (kills, deaths, gpm, xp, tower kills, ck, cd, etc.) using the information provided in the file.

    There are a couple of fairly obvious things we can pick up immediately from the file. Player names and corresponding heroes are all listed at the very bottom of the file, along with the match ID, the game mode, and who won. This list is even more important because it establishes the player indexes that are used throughout the rest of the file to reference players. The player list is 0-indexed, meaning the player IDs are 0-9. There's an is_fake_client flag, which I'm assuming indicates whether or not the player is a bot, but I can't verify this: I haven't played against many bots at all, and for the three recent matches I did play with bots, no replay is available to parse.

    Of the 168 different events, I've only seen 3 appear in the parse: dota_combatlog, hltv_status, and dota_chase_hero. (Edit: I listed the event IDs, but apparently they can change from replay to replay.) The most interesting is the combat log data:

    How the combat log works:

    Each time an effect/damage is seen, it's added to a list called CombatLogNames, which matches a # with a corresponding in-game entity. That entity can be just about anything: a hero, a tower, a "goodguy" (lol) creep, siege, etc. CombatLogNames is constantly refreshed throughout the game as new events happen, so only the last list given should be used to find the corresponding entity. The list is only ever appended to, so the # - entity relationship never changes; the list just keeps getting larger.

    Type: the type of action that occurred in the log (Damage - 0, Healing - 1, Modifier - 2 (gaining a buff/debuff), Modifier - 3 (losing a buff/debuff), Death - 4) -- I'm confident about all 5 now.
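For anyone writing their own parser, the five type values above could be captured as an enum so the rest of the code doesn't pass bare integers around. This is only a sketch: the member names (DAMAGE, HEAL, MODIFIER_ADD, MODIFIER_REMOVE, DEATH) and the describe_type helper are my own labels, not identifiers from the Valve tool.

```python
from enum import IntEnum


class CombatLogType(IntEnum):
    """Combat log 'type' values as listed in the post (member names are made up)."""
    DAMAGE = 0
    HEAL = 1
    MODIFIER_ADD = 2      # gaining a buff/debuff
    MODIFIER_REMOVE = 3   # losing a buff/debuff
    DEATH = 4


def describe_type(raw_type: int) -> str:
    """Return a readable label, tolerating values that were not seen in this thread."""
    try:
        return CombatLogType(raw_type).name
    except ValueError:
        return f"UNKNOWN_{raw_type}"
```

Unknown values are kept as UNKNOWN_n rather than raising, since nobody here has confirmed that 0-4 is the full set.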

    Sourcename: This is the source of the action, corresponding to the CombatLogNames map, NOT a player ID.

    Targetname: pretty much the same as sourcename. This is the target of the spell/damage/healing corresponding to the CombatLogNames map.

    Attackername: this appears to almost always be the same as the Sourcename. It appears to always be the same for type: 0 (damage), but for type: 2 (modifiers), the attackername is often different than the sourcename, where the sourcename might not exist at all.

    Inflictorname: This is non-zero when type == 2 or type == 3. Type 2 and 3 both deal with modifiers, so the inflictorname corresponds to the name of the modifier being applied or removed.

    Attackerillusion/Targetillusion: I'm not sure what exactly these are, but I imagine it has something to do with illusions if they are present.

    Value: This always seems to be 1 for type 2, but for type 0, this represents the damage done. For type 1 (healing), this is the amount healed.

    Health: The health of the target AFTER the value is applied. When there is a log of health: 0, the immediate following log is an exact duplicate, but type 4 (death).

    Messages / Gathering useful statistics:

    There isn't any apparent way to reliably track things like gold or xp other than through messages that are sent to the screen. The useful messages can be found in dota_usermessages.proto and it seems the most useful messages are the messages in DOTA_OVERHEAD_ALERT and DOTA_CHAT_MESSAGE. Here are some of the ones I've looked at so far:

    There are a couple of different scenarios for CHAT_MESSAGE_HERO_KILL. The first is when one hero kills another, in which case two player IDs are listed: playerid_1 refers to the hero that was killed, while playerid_2 refers to the killer. If this kill was first blood, there will immediately be another chat event, CHAT_MESSAGE_FIRSTBLOOD, where playerid_1 is the hero that obtained first blood, while playerid_2 is always -1. The "value" for both CHAT_MESSAGE_HERO_KILL and CHAT_MESSAGE_FIRSTBLOOD is the gold that was received. In the case of a first blood, the values for the first blood and the kill are identical. If there are assists, it appears that playerid_1 is the death, playerid_2 is the last hit, and all subsequent playerid_# are the assists.
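A rough sketch of how a kill/death tally could be built from these messages, assuming each chat event has already been parsed into a dict with the same field names as the dump (type, value, playerid_1, playerid_2). The function name and the dict layout are hypothetical, and treating -1 as "no hero involved" for kills is an assumption carried over from how -1 is used in the other messages. A later reply in this thread reports totals that differ from the in-game scoreboard, so treat the result as provisional.

```python
from collections import Counter
from typing import Iterable, Mapping, Tuple


def tally_kills_deaths(
    chat_events: Iterable[Mapping[str, object]],
) -> Tuple[Counter, Counter]:
    """Count kills and deaths per player ID from CHAT_MESSAGE_HERO_KILL events."""
    kills: Counter = Counter()
    deaths: Counter = Counter()
    for event in chat_events:
        # CHAT_MESSAGE_FIRSTBLOOD duplicates the kill, so only HERO_KILL is counted.
        if event.get("type") != "CHAT_MESSAGE_HERO_KILL":
            continue
        victim = event.get("playerid_1", -1)  # per the post: the hero that died
        killer = event.get("playerid_2", -1)  # per the post: the killer
        if victim != -1:
            deaths[victim] += 1
        if killer != -1:  # assumption: -1 means no hero got the kill
            kills[killer] += 1
    return kills, deaths
```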

    This message (CHAT_MESSAGE_TOWER_KILL) is displayed when a tower is killed. It doesn't appear possible to tell what tier the tower was from this message alone, but it might be possible to determine this from the gold given, using a different message. I'm not sure what the 'value' field is, but playerid_1 is the person who killed the tower, and if no hero killed the tower, playerid_1 is -1. The same goes for CHAT_MESSAGE_TOWER_DENY.

    This message is displayed when a rune is picked up. The 'value' (0-4) corresponds to which rune was picked up (I'm unclear on which runes correspond to which value), and playerid_1 is the player that picked it up. This also goes for CHAT_MESSAGE_RUNE_BOTTLE.

    This message is only displayed for most recipe items, so it isn't a reliable way to determine the player's inventory.

    Overhead Alerts

    These demos, as I'm sure you're aware, are the SourceTV demos: they aren't what you see on your screen, but what the spectator sees on theirs. Because of this, there are quite a few useful overhead alerts we can use to determine gold gains, creep denies, and XP gained.

    As I stated above, it isn't possible to determine which tier of tower was killed via CHAT_MESSAGE_TOWER_KILL, but it should definitely be possible by checking the OVERHEAD_ALERT_GOLD immediately before the tower kill message. If a hero killed the tower, there will be 4 instances and if a neutral killed the tower, there will be 5 instances of this:
    ---- CDOTAUserMsg_OverheadEvent (9 bytes) -----------------
    message_type: OVERHEAD_ALERT_GOLD
    value: 280
    target_player_entindex: 3
    target_entindex: 121
    The 'value' is the amount of gold received in all cases and target_player_entindex APPEARS to be playerID + 2. Player IDs range from 0-9, but player_entindexes appear to range from 2-11.
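If the playerID + 2 pattern holds, mapping overhead alerts back to players and summing their gold is straightforward. The sketch below assumes the alerts have been parsed into dicts using the field names from the dump; the offset of 2 is only the observation above (it is flagged as APPEARS for a reason), and both function names are made up for illustration.

```python
from typing import Dict, Iterable, Mapping, Optional

PLAYER_ENTINDEX_OFFSET = 2  # observed in the post, not confirmed by Valve


def player_id_from_entindex(entindex: int) -> Optional[int]:
    """Map target_player_entindex (2-11) to a player ID (0-9), else None."""
    player_id = entindex - PLAYER_ENTINDEX_OFFSET
    return player_id if 0 <= player_id <= 9 else None


def gold_by_player(overhead_events: Iterable[Mapping[str, object]]) -> Dict[int, int]:
    """Sum OVERHEAD_ALERT_GOLD values per player ID."""
    totals: Dict[int, int] = {}
    for event in overhead_events:
        if event.get("message_type") != "OVERHEAD_ALERT_GOLD":
            continue
        player_id = player_id_from_entindex(int(event["target_player_entindex"]))
        if player_id is not None:
            totals[player_id] = totals.get(player_id, 0) + int(event["value"])
    return totals
```

Note that this only adds up gold that produced an overhead alert, so it would not by itself give a full GPM figure.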

    I believe this is used for both tower denies and creep denies. The 'value' appears to always be 0 and the source_player_entindex is the same as the OVERHEAD_ALERT_GOLD entindex. The target_entindex seems like it should be used for determining if a creep or a tower was denied, but it's unclear to me how to determine the entity indexes of towers/creeps to be able to differentiate the two.

    I could have sworn this was shown when you spectated games, but I can't find any instances of this appearing in the replay file.

    This is all I've gathered from reading through the replay files thus far. I haven't found a way to determine when or what players buy (unless it's specially noted in the chat), how much gold is lost on death, and how much XP is gained by hero/creep/tower kills.

    Does anyone have any more information on this, or could Zoid elaborate on some more specifics that might lead us in the right direction? Leave your thoughts or anything you've found below so we can discuss!

    Edit: Something I also forgot to mention was that I can't find a way to determine the players' corresponding Steam IDs unless they reconnect (in which case it's in the entity data). This is a pretty huge deal for leagues/tournaments that uniquely identify players by their Steam ID, so linking the in-game player name to the steam ID is fairly important. Does anyone have any insight on this?
    Last edited by snevets; 04-28-2012, 03:13 PM.

  • #2
    I've updated the combat log information. The only thing I'm not sure about is Type 3. It seems to be used when the abilities are detached from the hero. In Sand King's case, when he sand storms, the sand storm becomes a separate 'entity' (not sure on the actual term to use here) and a Type 3 is generated. This seems to also be the case with Omniknight's degen aura, so perhaps someone with more knowledge about the technical mechanics of the game can fill in more about Type 3.


    • #3
      @Zoid: Are there any future plans to expose additional events (to accompany the combat log) or something to give us information about item purchases, inventory changes, player deaths, etc.? It seems the information is definitely there somewhere since it's shown in the replay, but it doesn't appear to be visible in any readable/parseable form?


      • #4
        Ok, here I bring lots of more goodies. This is really scattered information, but it will help to build a better compendium of what we have.

        Disclaimer: I might be wrong in what I say, ESPECIALLY when I say "there is no <whatever>", as it means I wasn't able to find it in the more than reasonable amount of time I've been toying around with the tool, so it should be read as (there's no APPARENT way of easily doing/obtaining <whatever>).

        a) The game works in ~0.0333-second tick intervals (i.e., 30 ticks per second).

        b) Information is saved on a replay every 2 ticks, so you have 15 DEM_Packet structures per second.

        c) Every tick where nothing happens consists of 2 messages:
        1)CNETMsg_Tick: which contains 3 values: Tick (which is not the same value displayed in the DEM_Packet for some reason), host_frametime and host_frametime_std_deviation. I'm not sure what these 2 values measure, but host_frametime is a number that's always around 3300 and stddev is a number that oscillates, in my case in the low hundreds. Maybe this has something to do with latency?
        2)CSVCMsg_PacketEntities: Now, this is, I think, a key message from which we have no info. entity_data is a binary field and there are lots of things going on there.

        d) Every 1800 ticks (1 minute), you get a FULL_DEM_Packet, a much longer packet that should carry all the info up to that point in the game, so the whole game state can be recreated without simulating from the start. This is the main change they made to the replay format a while back: previously you had to "run" the whole replay up to the point you wanted, whereas now you only have to "run" up to 1 minute from the last FULL_DEM_Packet.

        e) The replay format is not meant to be parsing-friendly, it's meant to be Efficient. Which means, there are certain events that don't have a nice message, and a few things that won't appear as such. Namely:
        1) There's no message for buying/selling items, it's something that happens within that binary entity_data on the PacketEntities message. You want to see who bought what when? Tough luck.
        2) There are NO scoreboards, so you can't quickly check KDA/GPM/CS/etc.; there's no "number" holding that anywhere. There are events, as snevets suggests, and they should in theory be sufficient to calculate all those stats, but not without a considerable amount of work. KILLS can be obtained by checking CHAT_MESSAGE_HERO_KILL messages; same with DEATHS. ASSISTS, on the other hand, prove tricky, as there's no message saying who assisted, meaning you'll have to analyze the last 17(?) seconds (i.e., 510 ticks) of dota_combatlog messages to see who else damaged the killed target. GPM is even more complicated: you have to add up every instance where your hero gets gold, plus the standard GPM you get just for being there, and subtract gold lost to deaths/buybacks (which means calculating time and current hero level yourself). I don't even know how to measure XPM at the moment.
        3) There's no clear movement info. There are a few packets that have to do with clicking (either for moving or casting spells) such as CDOTAUserMsg_SpectatorPlayerClick, and some coordinates here and there, but from looking at it, unless you find a way to simulate the game map and pathfinding, it's nearly impossible to determine where a hero is, based on move click input.
        4) Buying/Selling items is part of entity_data updates and not a packet on its own (unless it's a big item, in which case you get a message saying "XXX has bought Butterfly")
        5) Not surprisingly, Team chat is not stored on the replays (if that was the case, you'd be able to see the other team's chat log, and if they made it so that somehow, you could only read your team's messages, you'd need 2 different replays per match, one with each team's team chat).
        6) I can't for the life of me find any info regarding picks/bans. It has to be in the entity_data info too, I guess. (You can find who picked whom, but not when the pick itself happens or, in the case of -CM, the pick/ban order.)

        f) There's a key structure CDemoFileInfo at the end of every file holding Match ID, Game mode, Player name -> Player hero relationships. Unfortunately there's no SteamID here, which would be great because it would allow for being able to gather info for the same person from different matches even if a player who's been in all of them changes nicknames (The best example is tournament games).

        g) CUserMsg_SayText2 is the packet holding Allchat info. It has who says it (player name) and the message (string).

        h) Also unsurprisingly, the parser doesn't work with console recorded games (command record <filename>), just with the replays you download from the Watch tab.

        i) Does anyone know the actual map dimensions in game coordinates? I guess one would be able to stochastically measure where a player is by checking where he clicks.
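The timing figures in a), b) and d) above translate into a few small helpers. This is a sketch using only the numbers quoted in this post (30 ticks per second, a packet every 2 ticks, a full packet every 1800 ticks). It additionally assumes that full packets land on exact multiples of 1800 counted from tick 0, which nobody has verified; the function names are made up.

```python
TICKS_PER_SECOND = 30              # from a): ~0.0333 s per tick
TICKS_PER_PACKET = 2               # from b): one DEM_Packet every 2 ticks
FULL_PACKET_INTERVAL_TICKS = 1800  # from d): one full packet per minute


def ticks_to_seconds(ticks: int) -> float:
    """Convert a tick count to seconds of game time."""
    return ticks / TICKS_PER_SECOND


def seconds_to_ticks(seconds: float) -> int:
    """Convert seconds to the nearest whole tick count."""
    return round(seconds * TICKS_PER_SECOND)


def last_full_packet_tick(tick: int) -> int:
    """Tick of the most recent full packet (assumes they start at tick 0)."""
    return (tick // FULL_PACKET_INTERVAL_TICKS) * FULL_PACKET_INTERVAL_TICKS


def ticks_to_replay_from_full_packet(tick: int) -> int:
    """How many ticks must be 'run' forward from the last full packet."""
    return tick - last_full_packet_tick(tick)
```

For example, the 17-second assist window guessed at in e) 2) comes out as seconds_to_ticks(17) == 510, matching the figure quoted there.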

        How can we actually move along with this? The best way I found was to host solo/2-player games with a single objective in mind, for example:
        - Go mid and get killed by a tower
        - Deny a creep
        - Cast a spell
        - Grab a rune
        - Issue orders to a courier
        - etc.

        Every time you are going to do something, say it on ALL Chat first. That way it's easy to find it later on the replay parse.

        Try to aim for 2-3 minute replays, to keep the parser output size to a minimum and not get flooded by information. Parsing a full game replay is not recommended, as the output file is in the hundreds of MB and you'll get lost in a sea of data.

        I believe it'd be ideal to pool our resources and try to collaborate in this together, so send me a PM if you're working on this and I'll try to coordinate a research group.

        Best of wishes, hope this is useful for someone.



        • #5

          Are you positive that tracking CHAT_MESSAGE_HERO_KILL will return the proper K/D for players at the end of parsing? I made a small parser which parses the parse; I track CHAT_MESSAGE_HERO_KILL and add a death/kill to the players specified in that message block, and this is what I get:

          As you can see, numbers are a bit different in parser and actual game. Is maybe CHAT_MESSAGE_STREAK_KILL connected with this error somehow?

          Also, you can notice that alchemist's nick is different ingame and in replay parser, so I guess that steam ID is stored inside the replay file and that ingame scoreboard gets current nick from steam ID.

          Oh, and there is CHAT_MESSAGE_HERO_DENY, which appears in this game when Pudge suicides. Both playerid_1 and playerid_2 are equal to 1, i.e. Pudge's in-game ID. I believe that in other cases playerid_1 is the ID of the player that got denied, and playerid_2 is the ID of the player that denied him.
          Last edited by reiser; 04-27-2012, 06:32 PM.


          • #6
            I'm not sure I would track hero kills with CHAT_MESSAGE_HERO_KILL, I'd probably track k/d/a with the combat log. You can track deaths by looking for Type: 4 in the combat log and then figuring out if they killed a hero or not. The tricky thing with that is determining assists, which I don't really have a good theory on how to do with the combat log. You could obviously check past damage against a target within X time (not sure what the cut-off is for assists), but your hashtable or whatever you're using would have to be relative to the tick so you can track the time.
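One way to sketch the "check past damage within X time" idea is to use the timestamp field (in seconds) that each combat log entry carries, rather than counting ticks. The code below assumes entries have been parsed into dicts with the dump's field names. The 17-second window is only the guess floated earlier in the thread, and the function name and hero_indexes parameter are hypothetical. The result is a set of CombatLogNames indexes, not player IDs.

```python
from typing import Iterable, Mapping, Optional, Set

DAMAGE = 0  # combat log "type" value for damage, per the first post


def find_assist_candidates(
    combat_log: Iterable[Mapping[str, float]],
    death_event: Mapping[str, float],
    window_seconds: float = 17.0,  # unconfirmed guess from earlier in the thread
    hero_indexes: Optional[Set[int]] = None,
) -> Set[int]:
    """Return attackers that damaged the victim shortly before it died."""
    victim = death_event["targetname"]
    killer = death_event["attackername"]
    death_time = death_event["timestamp"]
    candidates: Set[int] = set()
    for entry in combat_log:
        if entry["type"] != DAMAGE or entry["targetname"] != victim:
            continue
        age = death_time - entry["timestamp"]
        if age < 0 or age > window_seconds:
            continue  # damage after the death, or too old to count
        attacker = entry["attackername"]
        if attacker in (killer, victim):
            continue  # the killer gets the kill; self-damage is not an assist
        if hero_indexes is not None and attacker not in hero_indexes:
            continue  # optionally ignore creeps, towers and other non-heroes
        candidates.add(attacker)
    return candidates
```

Passing hero_indexes (the CombatLogNames numbers whose names start with npc_dota_hero_) would keep creeps and towers out of the result.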

            Edit: It's also very possible that playerid_1 isn't always the person dying. I tried to be semi-thorough and compare a couple of replays to confirm that theory, but you may want to check against your demo to make sure that it is correct.
            Last edited by snevets; 04-27-2012, 06:35 PM.


            • #7
              I just tried observing the output dump of demoinfo2.exe, and I think "type: 3" is losing a buff/debuff, since "type: 2" is gaining a buff/debuff.


              [01:53.36] Spectre receives Spectral Dagger Path buff from Spectre.
              dota_combatlog eventid:170 
               type: 2 
               sourcename: 0 
               targetname: 1 
               attackername: 1 
               inflictorname: 2 
               attackerillusion: 0 
               targetillusion: 0 
               value: 0 
               health: 511 
               timestamp: 113.365280 
               targetsourcename: 0
              [02:09.36] Spectre loses Spectral Dagger Path buff.
              dota_combatlog eventid:170 
               type: 3 
               sourcename: 0 
               targetname: 1 
               attackername: 0 
               inflictorname: 2 
               attackerillusion: 0 
               targetillusion: 0 
               value: 0 
               health: 511 
               timestamp: 129.365341 
               targetsourcename: 0
              [02:00.03] Pudge receives Rot debuff from Pudge.
              dota_combatlog eventid:170 
               type: 2 
               sourcename: 0 
               targetname: 7 
               attackername: 7 
               inflictorname: 9 
               attackerillusion: 0 
               targetillusion: 0 
               value: 1 
               health: 620 
               timestamp: 120.031845 
               targetsourcename: 0
              [02:00.19] Pudge loses Rot debuff.
               type: 3 
               sourcename: 0 
               targetname: 7 
               attackername: 0 
               inflictorname: 9 
               attackerillusion: 0 
               targetillusion: 0 
               value: 1 
               health: 620 
               timestamp: 120.198509 
               targetsourcename: 0

              I think the combat log can be parsed directly from the 100 MB+ file by filtering for these keywords:
              • name: "dota_combatlog"
              • CombatLogNames
              • dota_combatlog eventid:## (## - depends on what eventid is seen after searching dota_combatlog)

              I tried tracing a 60-second combat log in-game and searching for "dota_combatlog eventid:##" in the dump file. I successfully traced all entries in the combat log without skipping a single search entry.

              It's easily recreated just by looking at the DEM_Packet of the combatlog eventid:

              [01:54.59] Clockwerk hits Earthshaker with Rocket Flare for 60 damage (568->508).
              dota_combatlog eventid:170 
               type: 0 
               sourcename: 4 
               targetname: 3 
               attackername: 4 
               inflictorname: 5 
               attackerillusion: 0 
               targetillusion: 0 
               value: 60 
               health: 508 
               timestamp: 114.598595 
               targetsourcename: 3
              this is the CombatLogNames
              #7 CombatLogNames flags:0x1 (15 Items) 429 bytes
                  #0 'dota_unknown' (0 bytes)
                  #1 'npc_dota_hero_spectre' (0 bytes)
                  #2 'modifier_spectre_spectral_dagger_in_path' (0 bytes)
                  #3 'npc_dota_hero_earthshaker' (0 bytes)
                  #4 'npc_dota_hero_rattletrap' (0 bytes)
                  #5 'rattletrap_rocket_flare' (0 bytes)
                  #6 'npc_dota_hero_vengefulspirit' (0 bytes)
                  #7 'npc_dota_hero_pudge' (0 bytes)
                  #8 'pudge_rot' (0 bytes)
                  #9 'modifier_pudge_rot' (0 bytes)
                  #10 'npc_dota_hero_venomancer' (0 bytes)
                  #11 'modifier_item_ring_of_basilius_aura_bonus' (0 bytes)
                  #12 'npc_dota_hero_queenofpain' (0 bytes)
                  #13 'npc_dota_hero_antimage' (0 bytes)
                  #14 'npc_dota_hero_mirana' (0 bytes)
              So the format will be
              ["timestamp"] "attackername" hits "targetname" with "inflictorname" for "value" damage (???->"health")

              There is still 1 problem that I can't determine though, I don't know where to find the "???" value
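A possible answer for the "???": in the one damage example above, 568 = 508 + 60, i.e. health + value, so for type 0 the pre-hit health may simply be the logged health plus the damage. That is inferred from a single entry and could break in cases like overkill. The sketch below parses the CombatLogNames rows and rebuilds the log line on that assumption. It prints the raw entity names rather than the display names, and truncates the timestamp to hundredths since that is what the quoted in-game lines appear to do. Both function names are made up.

```python
import re
from typing import Dict, Iterable, Mapping

NAME_ROW = re.compile(r"^\s*#(\d+)\s+'([^']*)'")


def parse_combat_log_names(lines: Iterable[str]) -> Dict[int, str]:
    """Build the # -> entity name map from CombatLogNames rows."""
    names: Dict[int, str] = {}
    for line in lines:
        row = NAME_ROW.match(line)  # the table header has no quoted name, so it is skipped
        if row:
            names[int(row.group(1))] = row.group(2)
    return names


def format_damage(event: Mapping[str, float], names: Mapping[int, str]) -> str:
    """Recreate a damage line from a type-0 combat log entry."""
    centis = int(event["timestamp"] * 100)          # truncate, do not round
    minutes, rest = divmod(centis, 6000)
    seconds, hundredths = divmod(rest, 100)
    health_after = int(event["health"])
    health_before = health_after + int(event["value"])  # assumption: before = after + damage
    return "[%02d:%02d.%02d] %s hits %s with %s for %d damage (%d->%d)." % (
        minutes, seconds, hundredths,
        names[int(event["attackername"])],
        names[int(event["targetname"])],
        names[int(event["inflictorname"])],
        int(event["value"]), health_before, health_after,
    )
```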

              Hope this helps on documenting entries in the parsed file


              • #8
                I made a working K/D parser, except that it counts a kill and a death when a hero who has the Aegis is killed. Is there any info in the replay from which I can identify whether a hero has the Aegis at the time of his death, or is the only way to make an item parser too and check that hero's item slots for the Aegis? I checked the combat log event for when a hero is killed and nothing is different from other similar events, so the Aegis data is not contained there.


                • #9
                  anyone figured out yet what that number in the url to the match replay comes from?

                  Let's say I want to create a website where you enter a match ID; the tool then grabs the replay file (or the header), parses the necessary info, and displays it.

                  for example an url:
                  the 12838212 is the match-id. 750113279 seems random tho. Doesn't look like a timestamp.

                  Any idea?


                  • #10

                    You are correct about the Type 3, Zoid clarified this last night, and it is indeed losing the modifier. Type 2 is gaining and Type 3 is losing. I'll probably update the OP later today with some more info.


                    • #11
                      Everything above seems to correlate with what I've been seeing. I've been trying to find information on an entity's location on the map, but haven't had any luck. My guess is that it is in the CSVCMsg_Entity class, which we don't have the definition for yet.
                      Author of Stats for Dota 2 in the Android Play Store.


                      • #12
                        I did a bit more work around pulling information out of the replays. Managed to get kills/deaths but like reiser have an issue with aegis. The combat log also gives information about buildings dying so you can get tower kills and denies (if you look up the targetname from the combat log it resolves to names like npc_dota_badguys_tower1_mid). You can also get roshan kills (and which neutrals they killed). With a bit of work that lets you get the creep kills and denies as well.

                        Another thing I didn't see mentioned is that the CUserMsg_TextMsg messages contain the combat log summary you can see in the console after a game (total damage, total stuns, total slows) and looks reasonably easy to pull out. Haven't really had any luck fishing out items (I suspect Bruno is right in that it is only contained in the mystery PacketEntities entity_data).

                        Progress is available at:
                        Most of the new stuff will be in which at the moment prints out player names, heroes, kills/denies, creep kills/denies and tower stats


                        • #13
                          You should be able to use CHAT_MESSAGE_AEGIS to keep track of who picks up the Aegis of the Immortal, and keep count of the ticks to know whether it expires or not. What I don't know is how the message is calculated. I killed Roshan as Sniper, and this is what is in the file:
                          ---- CDOTAUserMsg_ChatEvent (8 bytes) -----------------
                          type: CHAT_MESSAGE_AEGIS
                          value: 0
                          playerid_1: 5
                          playerid_2: -1
                          But I don't know why it gave Sniper an ID of 5 (I am assuming that the playerid corresponds to who the message is directed at), since his CombatLogNames ID is #1, and 5 is an illusion (which I didn't even get until after Roshan had fallen).
                          Last edited by Drkirby; 04-28-2012, 02:53 PM.


                          • #14
                            CombatLogNames are not related to Player IDs, they are completely separate. Player IDs are 0-indexed at the bottom of the replay file, so in a normal game you have player IDs 0-9. In your game as sniper, you'll be 6th from the top of the list.

                            CombatLogNames are only used in the combat log.


                            • #15
                              Ok, so it looks like Player IDs are hard coded to slots. I was in a solo private game, but on Dire. In normal games, it shouldn't matter, but it is an issue that could crop up in private games.