
Thread: pydota2

  1. #1
    Basic Member
    Join Date
    Dec 2016
    Posts
    731

    Lightbulb pydota2

    As I brush up on my understanding of Reinforcement Learning and Deep Learning, I have been combing through the approaches used by AlphaGo, AlphaGo Zero, and OpenAI (including their Gym and Baselines packages), as well as many articles posted to the MachineLearning subreddit, and I really want to see what I (or we as a group) can achieve.

    I looked at the approach being taken by DeepMind's StarCraft II team and have decided to try and replicate their framework within the Dota 2 universe.

    I just recently started this, so it is a very early work-in-progress project. I was 50% hesitant to even make this post at this time, but decided to do so because I'm hopeful others will be willing to jump on board and help propel this forward with me.
    You can find it here: https://github.com/pydota2/pydota2

    I have tried to put a lot of thoughts and forward-looking ideas into the README.md, so please read through it (it shows up as the text below the GitHub project at the link above). One piece is currently complete: the ability to pull down protobuf dumps from a bot game. A single replay (Dire & Radiant from the same game) is uploaded.

    The initial goal is really to create an appropriate framework for ALL OF US to use in developing RL/ML approaches to Dota 2. Writing the framework is a critical step, a big one, but not unfathomable. Yes, we will find faults in the API Valve makes available, and hopefully Valve will be responsive about providing the functionality we need to interact with the game, game elements, and replays as we implement these pieces. Also, many members have already started writing, in their own private or public repos, many of the pieces I hope will become mainstream components of the framework. For actual RL/ML approaches to writing agents, people can do what they want (and collaborate as they want), hopefully within the framework so that code sharing, if desired, can easily be done.

    I haven't started a wiki or a task list for things that need to be worked on, and I have no time until much later today (Halloween, after all) to do so.

    My personal plan is to devote a minimum of 4-5 hours per week to this project (yeah, I know it's not much, but I have been very busy the last few months and will continue to be for many more). Some weeks I might hit 10-20 hours, but I cannot commit to that.

    Cheers (I'm out of time to write this post).
    Last edited by nostrademous; 10-31-2017 at 10:53 AM.

  2. #2
    Basic Member aveyo's Avatar
    Join Date
    Aug 2012
    Location
    EU West
    Posts
    2,927
    Hint: create an org instead of keeping it as a repository under your personal profile. Then you can easily add contributors and whatnot, set up dedicated repositories for various parts of the project, etc.

  3. #3
    Basic Member
    Join Date
    Dec 2016
    Posts
    731
    Okay, created the organization known as 'pydota2' and updated the OP with the link.

    Added the first few issues with a milestone version 0.2 deadline.

  4. #4
    Basic Member
    Join Date
    Dec 2016
    Posts
    76
    Well, it took me about 100 hours to write a basic one with C++ and Python bindings.
    Efficiency is not that important, but it must be multithread friendly.

    I spent a lot of time on the algorithms; that's the real hard part QAQ.
    https://github.com/lenLRX/Dota2_DPPO_bots ---- my ML bot, work in progress

  5. #5
    Basic Member
    Join Date
    Dec 2016
    Posts
    731
    Okay, I modularized the protobuf connection to the CMsgBotWorldState into its own class (proto_connector) for easy reuse in the self_play and human_play models later on. I verified this morning that the proto_ingest.py script still works and records the Radiant & Dire replay binaries in the appropriate location. This was tested with Python 3.6, but it should work on 2.7 as well.

    I also started writing client_connector.py to act as the server that Dota 2 will connect to when sending CreateHTTPRequest POST messages from live bots. The design here is for the bots not to actually POST anything meaningful (although they could); rather, the POST acts as a polling mechanism that lets the server send data to the live bots through its responses (after all, we will be getting our world-state information through the CMsgBotWorldState dumps in a separate thread). I can test this code later tonight, although no promises.
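    To make the polling idea concrete, here is a minimal sketch of what such a server could look like. This is my own illustration, not the actual client_connector.py: the handler name, the `build_reply` helper, and the port are all assumptions.

```python
import json
import queue
from http.server import BaseHTTPRequestHandler, HTTPServer

# Filled elsewhere by the ML back-end; each entry is an action payload
# waiting to be delivered to a polling bot.
action_queue = queue.Queue()

def build_reply(poll, data):
    """Echo the poll's Type/Time back, attaching any queued action data."""
    return {'status': 200, 'Type': poll.get('Type'),
            'Time': poll.get('Time'), 'Data': data}

class PollHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The bot's POST body is treated as a poll, not as real data.
        length = int(self.headers.get('Content-Length', 0))
        poll = json.loads(self.rfile.read(length))
        try:
            data = action_queue.get_nowait()  # pending actions, if any
        except queue.Empty:
            data = {}
        body = json.dumps(build_reply(poll, data)).encode()
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # silence per-request console logging

# To run (blocking):
# HTTPServer(('127.0.0.1', 8080), PollHandler).serve_forever()
```

    The key design point, as described above, is that the response to the poll is the channel for sending actions down to the live bots.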

  6. #6
    Basic Member
    Join Date
    Jun 2017
    Posts
    7
    I have read your README.md and I think the approach you describe is very comprehensive and feasible, but I still have something to say. To be honest, I am a novice in ML: I started learning DL two months ago and RL just a few days ago, so forgive me if my understanding is wrong.

    I notice that you want to represent abilities as hand-picked features (isStun, blink, mana, cd, duration, ...), but I don't think that is a good idea. You would need to collect and input a lot of ability data, and that data can change with each Dota 2 update, forcing you to retrain. Moreover, some abilities can't be represented by such simple features, such as Pudge's hook or Io's Tether. You can't extract all the features needed to represent every ability. In my opinion, we could instead put each ability through an embedding layer; roughly 100 dimensions is, I think, enough to represent an ability. It is just like Word2Vec in the NLP domain, so we might call it Ability2Vec. In this network, the input might be (observation, abilityVector), passed through the embedding and strategy layers, and the output would be the action for that ability (use it or not, and on whom, where, and when). If my understanding is right, such a layer has three advantages:
    1) After good training, we would get useful embedding vectors for abilities, with similar abilities lying close together (AM's Blink and QoP's Blink may end up very similar, just like synonyms in Word2Vec). More importantly, we would get a strategy network that generalizes across abilities (it may overfit to recently used abilities, but some trick may fix that), so we don't need to train it per ability.
    2) Even if a Dota 2 update changes the ability data, the embedding vectors should in theory change only a little, which means we can feed recent games into the old network to fine-tune the embeddings rather than training from zero again.
    3) The time spent extracting features by hand would be saved.
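    As a toy illustration of the Ability2Vec idea (my own sketch; the ability names are examples, and real vectors would be learned rather than random), an embedding table is just a mapping from ability name to a dense vector, with similarity measured by cosine distance:

```python
import math
import random

EMBED_DIM = 100  # the ~100 dimensions suggested above
random.seed(0)

# In a trained model these would come from backprop; random here
# only so the sketch runs standalone.
abilities = ['antimage_blink', 'queenofpain_blink', 'pudge_meat_hook']
embedding = {a: [random.gauss(0, 1) for _ in range(EMBED_DIM)]
             for a in abilities}

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv)

def nearest(ability):
    """After training, similar abilities (e.g. the two Blinks) should
    score highest here, like synonyms in Word2Vec."""
    others = [a for a in embedding if a != ability]
    return max(others, key=lambda a: cosine(embedding[ability], embedding[a]))
```

    A strategy network would then consume (observation, embedding[ability]) instead of hand-coded feature flags.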

    Actually, I wrote so much about ability vectors just to give one example: we could do Item2Vec and so on. Hero2Vec and Unit2Vec may be possible too; any entity can be passed through an embedding layer. To be honest, I'd like to work with you to build the RL model together, provided fast game execution and multi-processing are possible. It's really exciting to let a Dota 2 AI learn how to play the game and beat humans.

    One thing I can assure you of is that a neural network would be really useful in this kind of video game. I use something like a NN in my team_desire.lua to adjust push desire (of course its parameters were tuned by hand, without ML). The inputs are information about each lane: the positions of the heroes we can see and their item worth (whether they have a TP or not), the push-relevant items my team has, and the towers' and barracks' health. Through just one hidden-layer neuron, the output is a push desire for each of the three lanes. Finally, each bot has a last layer that combines its own state with the team desire to get its final desire. It is a simple pushing system, but it can beat most of the bot scripts out there right now. My bot script is named "Army Bots"; you can have a look at it.
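    The single-hidden-neuron net described above can be sketched in a few lines (my own reconstruction; the feature list and weights are made up for illustration, and the real version lives in team_desire.lua with hand-tuned parameters):

```python
import math

def sigmoid(x):
    """Squash any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def push_desire(features, w_in, w_out):
    """Lane features -> one hidden neuron -> a desire per lane.

    features: numeric inputs (hero positions, item worth, tower HP, ...)
    w_in:     one weight per input feature
    w_out:    one weight per lane (top/mid/bot)
    """
    hidden = sigmoid(sum(w * x for w, x in zip(w_in, features)))
    return [sigmoid(w * hidden) for w in w_out]
```

    Each bot's "last layer" would then combine its own state with these team-level desires to produce its final push desire.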

  7. #7
    Basic Member aveyo's Avatar
    Join Date
    Aug 2012
    Location
    EU West
    Posts
    2,927
    That's the OpenAI "what does this button do?" self-learning approach. It has greater potential since it mimics human learning more closely, but it requires an enormous amount of processing resources, to the point of being unviable for this small independent project. Dota is not LoL with its "this is my... cone move, this is my... circle move, this is my... throw stuff on the ground to blow it up move". After 7.07 it's even harder to come up with "synonyms" in a timely fashion. There is nothing wrong with "hard-coding" abilities/items and focusing on other aspects for starters (the OpenAI team also did that, and they were working with just one hero).

  8. #8
    Basic Member
    Join Date
    Dec 2016
    Posts
    731
    @pilaoda - sorry, "reply with quote" seems to be currently broken on the forums.

    I have no issue with what you suggest and would love to have your help on the project. It's "one" approach, and the whole point is to allow multiple approaches so people can test ideas. In Dota, as far as coding is concerned, items and abilities are synonymous; you can treat them as the same thing.

    My push right now is to get the various modes of interacting with Dota 2 for ML purposes working. As soon as I have live message passing between the Python back-end and the game client working, I will start considering how to properly extract the various world-state inputs we want to use. In other words, getting the framework up and running so people can test whatever ML approach they want is priority #1; doing actual ML is #2.

    Ultimately, my goal is to make the ML approach as hero-agnostic as possible, focusing instead on the distinguishing features of heroes, abilities, and items and how they relate to local and global objectives. This addresses the huge hero pool without having to train for each hero. Also, because patches are fairly frequent, being able to quickly re-learn a specific piece of the overall system seems critical.

    Shoot me a PM with your github user name or request access to the pydota2 org and I will add you as collaborator.
    Last edited by nostrademous; 11-03-2017 at 05:40 AM.

  9. #9
    Basic Member
    Join Date
    Dec 2016
    Posts
    731
    Started some documentation on the pydota2 topic.

    https://github.com/pydota2/pydota2/b...er/pydota2.pdf

  10. #10
    Basic Member
    Join Date
    Dec 2016
    Posts
    731
    Okay, I got the basic framework for pydota2/bin/human_play.py working.

    Run Using:
    Code:
    python pydota2\bin\human_play.py --team Radiant
    What It Does:
    1) Sets up a protobuf listener for the specified team and accepts the dumps as they arrive (at a rate set by the Dota 2 launch options, e.g. `-botworldstatetosocket_frames 10`). It then invokes a callback to convert each serialized protobuf dump into a JSON-represented proto object, as specified by the CMsgBotWorldState.proto file maintained by Valve.

    2) Sets up an HTTP POST listener that the bots playing in the actual game call using a polling model (after authenticating).
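    For reference, the listener in step 1 boils down to reading length-prefixed frames off a socket. This sketch assumes each serialized CMsgBotWorldState dump is preceded by a 4-byte big-endian length (an assumption on my part; check proto_ingest.py for the exact framing):

```python
import struct

def read_exactly(recv, n):
    """Read exactly n bytes using a recv-like callable (e.g. sock.recv)."""
    buf = b''
    while len(buf) < n:
        chunk = recv(n - len(buf))
        if not chunk:
            raise ConnectionError('socket closed mid-frame')
        buf += chunk
    return buf

def read_frame(recv):
    """Return one raw serialized protobuf dump from the stream."""
    (size,) = struct.unpack('>I', read_exactly(recv, 4))
    return read_exactly(recv, size)
```

    Each returned blob would then be handed to the protobuf parser and converted to the JSON-represented proto object mentioned above.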

    Example Cmd Shell Output:
    Code:
    Radiant protoSize: 68
    Received Post:  {'Type': 'X', 'Time': 1976.0185546875}                      <====== THIS IS AUTHENTICATION PACKET
    SENDING RESPONSE:
     {'status': 200, 'Type': 'X', 'Time': 1976.0185546875}
    
    Radiant protoSize: 22424
    Received Post:  {'Type': 'P', 'Time': 1976.2802734375, 'PlayerID': 3}   <====== THIS IS A POLLING PACKET
    TODO - fill out with actions for agents
    SENDING RESPONSE:
     {'status': 200, 'Type': 'P', 'Data': {}, 'Time': 1976.2802734375}
    Example Console Output in Dota2 Client:
    Code:
    [VScript] 50.16624 [npc_dota_hero_bane]: SENDING SERVER UPDATE
    [VScript] Connected Successfully to Backend Server
    
    [VScript] 50.56624 [npc_dota_hero_bane]: SENDING SERVER UPDATE
    [VScript] Received Update from Server
    [VScript] 50.56624 [npc_dota_hero_bane]: Need to Process new Server Reply
    [VScript] 50.56624 [npc_dota_hero_bane]: Packet RTT: 0.12145996095001
    [VScript] 50.56624 [npc_dota_hero_bane]: Server Data: table: 0x0024efd0
    I still need to work out the exact frequency. Currently I'm polling every 0.1 seconds, but the fastest round trip from a poll to its reply appears to be about 0.12 seconds, so there might need to be some deconfliction between which update got which response.
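    One possible way to handle that deconfliction (purely a suggestion, not what human_play.py currently does) is to tag each poll with a monotonically increasing sequence number, echo it in the reply, and discard any reply older than the newest one already applied:

```python
class PollTracker:
    """Match poll replies to polls when round trips overlap."""

    def __init__(self):
        self.next_seq = 0      # sequence number for the next outgoing poll
        self.last_applied = -1 # newest reply sequence applied so far

    def new_poll(self):
        """Stamp an outgoing poll with a fresh sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def accept(self, reply_seq):
        """Apply a reply only if it is newer than anything already applied;
        stale replies from slower round trips are dropped."""
        if reply_seq > self.last_applied:
            self.last_applied = reply_seq
            return True
        return False
```

    The sequence number would ride inside the poll/response JSON alongside the existing `Type` and `Time` fields.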

    Next Work:
    * model self_play.py after human_play.py by creating 2 sub-processes (one for each team) so they are isolated from each other
    * write a thread-locked queue for putting the JSON-encoded proto data and handing it over to the RL system for processing
    * write a thread-locked queue for retrieving the actions from the RL system for sending as a response to the HTTP POST message
    * write some basic movement actions to test the system
    * a lot more stuff...
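    The two thread-locked queues in the list above map naturally onto the stdlib's queue.Queue, which is already lock-protected internally. A minimal sketch (my own, with placeholder action payloads):

```python
import queue
import threading

world_state_q = queue.Queue()  # JSON-encoded proto data -> RL system
action_q = queue.Queue()       # actions from RL system -> HTTP responses

def rl_worker():
    """Consume world states and emit (placeholder) actions.

    A None sentinel on the input queue shuts the worker down cleanly.
    """
    while True:
        state = world_state_q.get()
        if state is None:
            break
        # Real RL processing would go here; this just echoes a no-op.
        action_q.put({'PlayerID': state.get('PlayerID'), 'move': 'noop'})
```

    The protobuf listener thread would feed world_state_q, and the HTTP POST handler would drain action_q when building its poll responses.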
