OpenAI Five

  • #2
    I should confess that I was wrong: they are actually still trying. What they have done is certainly very cool and promising, but it is still very far from what I would call "actually playing Dota". As long as they don't give up, they will get there (hopefully in the next two years). Their PR was much better and more honest this time (still somewhat exaggerated, but much less than last year), but I am still struggling to understand the 'Open' part of 'OpenAI'!

    The downside is that the custom bots are now barely alive (if not completely dead). Looks like even Chris is not active anymore!

    If I were at Valve, I would definitely make OpenAI run a bot tournament before letting them have a segment at TI. Even though it is unlikely for other bots to win against OpenAI's bots, this might bring back the people who left and would certainly attract new people, which is really good for the game. This would also be a good move for OpenAI, since people won't be able to argue that their bots win only because they have unachievable reflexes and are better and faster at calculating things than humans. I think whether they do this or not depends solely on how 'unlikely' they think it is for other bots to win against them, since if someone's bots beat them, that will not look good for them at all!

    PS. Since I was away for a long time: how are things going with you guys?


    • #3
      I hope "Open"AI open-sources their code post TI, or at least provides an avenue to get other enthusiasts involved. Currently, without dedicated resources to throw against the problem it is hard for anyone else to find motivation.

      For example, I am very interested in the topic and technology and have lots of ideas/thoughts about how to better generalize their approach to eliminate their restrictions. I would test it myself if I could leverage some of the work they did getting it set up.

      From my understanding they didn't write a simulator but rather leverage AWS to deploy numerous clients connecting to the game to perform self-play learning. Since AWS instances don't have a "monitor" hooked up, Dota 2 was crashing on them (apparently it requires the presence of a registered display device) until they faked one through a virtual USB device that registers itself as a monitor, and then stubbed out many OpenGL calls. As a result they basically have a headless client, since all their learning and decision making is (I assume here, no facts to base it on) based on the protobuf dumps Valve provides. Doing some back-of-the-napkin math, what they achieved is basically doable with $50,000 worth of investment using cloud services and their code over the length of a 2-3 day weekend. I mean this as the cost of leveraging cloud services and infrastructure (AWS or Google's equivalent) at scale.
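
      To put that napkin figure in perspective, here is a small sketch that uses only the two numbers quoted above ($50,000 and a 2-3 day run) and assumes the budget is spent evenly. The function name is mine, and no real cloud pricing is involved.

```python
# Back-of-the-napkin check using only the figures quoted in the post:
# a $50,000 budget spent evenly over a 2-3 day weekend.
BUDGET_USD = 50_000


def implied_hourly_spend(budget_usd: float, days: float) -> float:
    """Return the average spend per hour if the budget is used evenly."""
    return budget_usd / (days * 24)


for days in (2, 3):
    rate = implied_hourly_spend(BUDGET_USD, days)
    print(f"{days} days -> about ${rate:,.0f} per hour")
```

      In other words, the estimate implies sustaining somewhere between roughly $700 and $1,000 of compute per hour for the whole run.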

      All of the above is a big hurdle to cross for one-off developers like myself. Do I understand how to do it? Yes (although it would take a bit of research and google-fu), but I don't have the time to do this myself currently.

      Now, Chris has been missing from these forums since Oct 2017, but someone is making bot API fixes and just not communicating them: as I noted in the protobuf thread, when testing yesterday I found that many of the bugs and flaws I identified back in Oct 2017 have now been fixed. This is probably driven by OpenAI's involvement; they probably have a direct line of communication with Valve (after all, their 5v5 bots competed against Valve employees per the article). So in a way they are at least making progress that benefits us; we just don't know it until we try it ourselves.

      I reached out to OpenAI via private messages to see if they are interested in further dialogue with the community and interested parties, but have not heard back yet.


      • #4
        Some thoughts on the topic I posted elsewhere:

        This is really awesome and exciting. I have sooooo many questions regarding design decisions and the restrictions though, so this is likely to be a very long post. Please note, none of my questions or comments are meant to diminish in any way, shape, or form the achievements presented by OpenAI; they are just the things that immediately pop into my head.

        1) Why use a separate LSTM for each hero instead of a master controller LSTM instance that can control 5 heroes similarly to how in robotics a robot dog can control 4 separate legs, a tail and a head?

        To be fair, I am not sure that a robot dog would actually not use independent LSTMs for each limb, but I assume not. I would hazard a guess that it is just easier and faster to train the independent heroes. Additionally, it allows for better future integration in co-op AI + Human matches, since the human heroes would not be controllable and thus an implementation that does the "best" action given its localized environment would fare quite well even in the presence of human laning partners.

        However, I would think in the long run a 6th LSTM to control the "team" action pool will be necessary. Currently some of the restrictions in place make it unnecessary (for example the fact that 5 invulnerable couriers exist; also their restrictions seem very reminiscent of the Turbo game rules), but in traditional Dota courier control is important. This 6th agent could also control glyph usage (removing it from consideration at each individual hero's level and thus reducing the action space), determine item builds (which currently are hard-scripted, but ultimately you don't want 5 Meks on a team, so you would want to monitor who picks up which team-impacting items), and assign limited team items (such as gems, wards, tomes of knowledge, smokes, etc.). Furthermore, this 6th agent could also influence the decision making of the 5 independent LSTMs by leveraging the "team spirit" hyperparameter (which in the default scripted bots is referred to as "desire").
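
        To make the "team spirit" idea concrete, here is a minimal sketch, assuming it is a weight between 0 and 1 that blends each hero's own reward with the team average. The function name and the sample rewards are hypothetical; this is my reading of the idea, not OpenAI's code.

```python
from statistics import mean


def blend_rewards(individual, team_spirit):
    """Mix each hero's own reward with the team average.

    team_spirit = 0.0 -> each hero only sees its own reward
    team_spirit = 1.0 -> every hero sees the team average
    """
    if not 0.0 <= team_spirit <= 1.0:
        raise ValueError("team_spirit must be between 0 and 1")
    team_average = mean(individual)
    return [
        (1.0 - team_spirit) * reward + team_spirit * team_average
        for reward in individual
    ]


# Hypothetical per-hero rewards for one time step (five heroes).
rewards = [4.0, 0.0, 1.0, 0.0, 0.0]
print(blend_rewards(rewards, 0.0))  # purely selfish
print(blend_rewards(rewards, 0.5))  # half selfish, half team
print(blend_rewards(rewards, 1.0))  # purely team-oriented
```

        A 6th "team" agent could, under this assumption, nudge the five hero agents simply by raising or lowering that one weight over the course of a game.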

        Finally, it is my gut feeling that the 5 heroes currently supported were chosen specifically b/c of their lack of global abilities, so the decision space for each hero can be localized to its immediate surroundings and thus greatly reduced. For example, if Zeus/Invoker/AA/Silencer/Gyro/NP/SB/IO/Underlord/etc. are included, you now have to consider all the other visible units when determining if your global (or globally-influencing) ability should be used. Even harder to calculate is the impact for heroes like AA/SB/NP that have long travel/projectile times to arrive at global coordinates.

        2) Couldn't heroes really just be represented by their stats & abilities?

        IMHO, a hero can really be represented by stats such as: base Int/Agi/Str, turn speed, attack speed, attack cast point, movement speed, bounding box, starting armor, magic resistance, attack range; plus per level gain of Int/Agi/Str; plus the Talents inherent to the hero (which technically are abilities, but not really as no talent grants an active ability but rather influences an existing ability). (hopefully I'm not forgetting any others here). From these all the other possibly critical data pieces can be inferred (like health regen rate, mana regen rate, total mana pool, total health pool, base damage, etc.).

        Abilities likewise can be represented by parameters such as: passive or active (if passive, the bonus it provides), targeting restrictions (friendly, enemy, tree, point - meaning ground targetable), type of damage (physical, magic, pure), ignores spell immunity (yes, no), ability cast range, ability cast point, channeled (yes, no), ability AoE size (if appropriate) and length of time that the AoE persists (the OpenAI article seems to indicate AoE is not accounted for, based on the Shrapnel comments made).

        Items are treated as abilities in Dota so same applies to them.

        Based on all of the above, a model could be trained to do the "right action" based on those parameters, and the AI could handle Ability Draft in the future just as easily as any hero selection. I imagine this is the plan long term.
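
        As a rough illustration of what such a parameter-based representation could look like, here is a sketch that flattens a hero and its abilities into one numeric vector. The class names, the field selection, the damage-type coding, and all sample values are my own placeholders (not real Dota numbers), and only a subset of the stats listed above is included.

```python
from dataclasses import astuple, dataclass, fields


@dataclass(frozen=True)
class AbilityFeatures:
    is_passive: bool
    targets_friendly: bool
    targets_enemy: bool
    targets_tree: bool
    targets_point: bool
    damage_type: int          # hypothetical coding: 0 physical, 1 magic, 2 pure
    ignores_spell_immunity: bool
    cast_range: float
    cast_point: float
    is_channeled: bool
    aoe_radius: float         # 0.0 when the ability has no area of effect
    aoe_duration: float       # 0.0 when the area does not persist

    def to_vector(self):
        # Booleans become 0.0 / 1.0 so everything is numeric.
        return [float(value) for value in astuple(self)]


@dataclass(frozen=True)
class HeroFeatures:
    base_str: float
    base_agi: float
    base_int: float
    str_gain: float
    agi_gain: float
    int_gain: float
    move_speed: float
    turn_rate: float
    attack_range: float
    base_armor: float
    abilities: tuple = ()     # items could be appended here as abilities too

    def to_vector(self):
        vector = [
            float(getattr(self, f.name))
            for f in fields(self)
            if f.name != "abilities"
        ]
        for ability in self.abilities:
            vector.extend(ability.to_vector())
        return vector


# Placeholder values only, chosen for illustration.
nuke = AbilityFeatures(False, False, True, False, False, 1, False,
                       600.0, 0.3, False, 0.0, 0.0)
hero = HeroFeatures(20.0, 15.0, 25.0, 2.0, 1.5, 3.0,
                    300.0, 0.5, 600.0, 1.0, abilities=(nuke,))
print(len(hero.to_vector()))
```

        Because nothing in the vector names a specific hero, the same encoding would cover an Ability Draft hero with an arbitrary mix of abilities.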

        3) Do trees matter?

        It is possible to destroy trees using tangos or Force Staff usage (there are other ways in reality, but not with the restricted hero pool and itemization, with the possible exception of Meteor Hammer, which I don't recall offhand if it destroys trees). Also, destroyed trees naturally grow back after a certain amount of time. Does the AI consider this world state information and plan for it? I would hazard a guess of "not yet". Once again, this adds complexity, but tree interaction is not listed as a restriction currently.

        To add to this, does the AI understand terrain in any fashion other than possibly how it affects Line-of-Sight?

        Tree destruction events are included in the protobuf information sent by the server; however, to model tree destruction you would also have to track all the trees in the game, which greatly increases the size of the state space.
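
        One way to keep that cost down would be to track only the trees that are currently destroyed, rather than every tree on the map. The sketch below assumes destruction events carry some tree identifier and a game time; the class, its method names, and the respawn delay are hypothetical and are not taken from the protobuf schema or the actual game rules.

```python
class TreeTracker:
    """Remember only destroyed trees and when they are due to regrow."""

    def __init__(self, respawn_seconds):
        # respawn_seconds is a caller-supplied assumption, not the real game value.
        self.respawn_seconds = respawn_seconds
        self._regrows_at = {}  # tree_id -> game time at which the tree is back

    def on_tree_destroyed(self, tree_id, game_time):
        self._regrows_at[tree_id] = game_time + self.respawn_seconds

    def update(self, game_time):
        """Forget trees whose regrow time has passed."""
        regrown = [t for t, when in self._regrows_at.items() if when <= game_time]
        for tree_id in regrown:
            del self._regrows_at[tree_id]

    def is_standing(self, tree_id, game_time):
        self.update(game_time)
        return tree_id not in self._regrows_at

    def destroyed_count(self):
        return len(self._regrows_at)


tracker = TreeTracker(respawn_seconds=100.0)
tracker.on_tree_destroyed(tree_id=7, game_time=10.0)
print(tracker.is_standing(7, game_time=50.0))   # still down at t=50
print(tracker.is_standing(7, game_time=110.0))  # regrown by t=110
```

        The state then grows with the number of trees cut down at any moment instead of with the total number of trees on the map.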

        4) How do you handle dropped items (if at all) or items that affect environment?

        This is not a listed restriction, although eliminating Divine Rapiers and Roshan handles the typical situations where an item ends up on the ground. Similarly, with no warding allowed and no stealth heroes, the need for a gem doesn't exist (itemization is hard-scripted anyway). But... in the match against human players, would the bots react at all or know how to handle the presence of items placed on the ground (like a TP scroll or an ironwood branch)?

        Furthermore, as a human player trying to break the bots, I would use ironwood branches to plant trees in lane, since that is not forbidden, and thus force the bots into unknown situations, possibly giving me an advantage.

        Has this been considered? It has been (and in some cases continues to be) the Achilles' heel for the default bots (specifically with Roshan dropping Aegis as the owner of the item becomes "nil" once Roshan dies).

        5) Is there any logic for shop location and travel to secret/side shops?

        I would guess not, given that item progression is hard-coded for now, 5 individual couriers exist, and the rules used seem to be essentially Turbo rules, which allow the purchase of any item from the Fountain and thus eliminate the need to know that some items are exclusive to the secret shop. Just a guess though.

        6) Any logic for moving items between stash, backpack, main inventory?

        I would guess not for now, but just curious.


        • #5
          They made some headway and posted an update:

          * currently support 18 heroes (a mirror match is no longer required)
          * more items allowed (bottles and divine rapiers still not allowed)
          * stealth/invis is allowed (one of the 18 heroes is Riki after all)
          * warding is allowed now
          * roshan is allowed now


          • #6
            Impressive! But now I'm worried that they might be shooting themselves in the foot because they want to move too fast!

            Assuming their model is strong enough to generalize to more heroes (and different hero combinations), I think the only remaining restriction that will be hard to deal with is summons/illusions, since Chris removed the trick I had for detecting them. Also, people on Reddit were saying the games are random drafts, which is surprising to me: some of the heroes in their pool hard-counter other heroes in the pool, so no matter how well their algorithm works, there is a chance they lose! I did not expect this, since it seemed to me that they are perfectionists!

            BTW. It is surprising to me that Divine is not allowed but Gem is allowed!


            • #7
              The Divine issue might be more with the fact that the item doesn't provide the benefit if picked up by a team-mate after dropping on your death (attuned to the hero that bought it, unless picked up by opposing team which clears attuning and re-attunes to that person) and the fact that you cannot "give" it to a person unless it is attuned to them. Gem has no such constraints.
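
              Sketched as a tiny state machine, the rule as I described it above looks like this. The class and the hero/team names are hypothetical, and this models my description rather than the game's actual implementation.

```python
class AttunedItem:
    """An item that only works for the hero it is attuned to."""

    def __init__(self, buyer, team):
        self.attuned_to = buyer
        self.attuned_team = team
        self.holder = buyer

    def drop(self):
        # For example, the holder died and the item fell to the ground.
        self.holder = None

    def pick_up(self, hero, team):
        if team != self.attuned_team:
            # An opposing pickup clears the attunement and re-attunes.
            self.attuned_to = hero
            self.attuned_team = team
        self.holder = hero

    def grants_bonus(self):
        return self.holder is not None and self.holder == self.attuned_to


rapier = AttunedItem(buyer="carry", team="radiant")
rapier.drop()
rapier.pick_up("support", "radiant")
print(rapier.grants_bonus())  # a team-mate holds it, so no benefit
rapier.drop()
rapier.pick_up("enemy_mid", "dire")
print(rapier.grants_bonus())  # an enemy pickup re-attunes, so it works
```

              A bot would need to learn (or be told) this extra ownership state, which Gem does not require.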


              • #8
                Pretty disappointing to see OpenAI has no plans to release their bots for public play after this weekend. I guess that's it for bot scripting now, since Valve has effectively abandoned it.


                • #9
                  I'm still working on it on and off when I have time.