Creep block Bot using Reinforcement Learning

  • Creep block Bot using Reinforcement Learning

    Hi guys
    I tried training a creep block agent using the Deep Deterministic Policy Gradient (DDPG) algorithm. Here is one of the runs (not the best):
    I was inspired by BeyondGodlikeBot's post last August, and derived a lot of the training environment setup from him. There are a couple of differences in our approaches:
    • He uses a stochastic policy, whereas I'm using a deterministic policy.
    • He uses a hard-coded policy to speed up learning. This hard-coded policy is quite simple (move 100 units in front of the farthest-ahead creep) but still works quite well. What ends up happening is that the agent learns to mimic this hard-coded policy instead of exploring for itself.
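For anyone curious what that hard-coded rule amounts to, here is a minimal sketch in Python. It is only my reading of "move 100 units in front of the farthest-ahead creep", not BeyondGodlikeBot's actual code. The function name, the (x, y) tuple format and the `lane_direction` argument are all assumptions made up for illustration.

```python
import math

def hardcoded_block_target(creep_positions, lane_direction, lead=100.0):
    """Return the point `lead` units ahead of the front-most creep.

    creep_positions: list of (x, y) tuples (hypothetical format).
    lane_direction:  (dx, dy) vector pointing down the lane; it does not
                     need to be unit length.
    """
    if not creep_positions:
        raise ValueError("need at least one creep position")
    norm = math.hypot(lane_direction[0], lane_direction[1])
    if norm == 0.0:
        raise ValueError("lane_direction must be non-zero")
    ux, uy = lane_direction[0] / norm, lane_direction[1] / norm

    # The farthest-ahead creep is the one with the largest projection
    # onto the lane direction.
    fx, fy = max(creep_positions, key=lambda p: p[0] * ux + p[1] * uy)

    # Step `lead` units further along the lane from that creep.
    return (fx + lead * ux, fy + lead * uy)


creeps = [(0.0, 0.0), (50.0, 0.0), (30.0, 10.0)]
print(hardcoded_block_target(creeps, (1.0, 0.0)))
```

An agent that is rewarded for matching this target will mostly reproduce it, which is the bias described above.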

    I intentionally did not use any hard-coded policy and instead allowed the agent to explore randomly by itself. This results in slower learning, but at least the agent isn't biased by what we humans might feel is the 'right' way to block creeps.
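Since DDPG's policy is deterministic, the random exploration has to come from noise added to the action. The sketch below shows one common way to do that, using an Ornstein-Uhlenbeck process as in the original DDPG paper. The post does not say which noise process was actually used, so treat the class, the parameter values and the [-1, 1] action bounds as assumptions.

```python
import random

class OUNoise:
    """Ornstein-Uhlenbeck noise: temporally correlated, mean-reverting to 0."""

    def __init__(self, size, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.size = size
        self.theta = theta      # pull-back strength towards zero
        self.sigma = sigma      # scale of the random kicks
        self.dt = dt
        self.rng = random.Random(seed)
        self.state = [0.0] * size

    def reset(self):
        """Call at the start of each episode."""
        self.state = [0.0] * self.size

    def sample(self):
        # x <- x - theta * x * dt + sigma * sqrt(dt) * N(0, 1)
        self.state = [
            x - self.theta * x * self.dt
            + self.sigma * (self.dt ** 0.5) * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


def explore(policy_action, noise, low=-1.0, high=1.0):
    """Add exploration noise to a deterministic action and clip to bounds."""
    return [
        min(high, max(low, a + n))
        for a, n in zip(policy_action, noise.sample())
    ]


noise = OUNoise(size=2, seed=0)
print(explore([0.5, -0.5], noise))  # noisy version of the actor's output
```

At evaluation time the noise is simply dropped and the actor's output is used as is.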
    This is still my first iteration of the agent and it was quite slow to train. I plan to speed up learning a lot and to try other exploration algorithms and maybe even other learning algorithms. I also plan to make a clean OpenAI-Gym-like environment setup (gym-dota?) which I can put on GitHub so other people can train their agents on the same environment, and even add more environments.
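To give an idea of what a Gym-like setup could look like, here is a toy sketch that follows the classic Gym convention of `reset()` returning an observation and `step(action)` returning `(observation, reward, done, info)`. It is a 1-D stand-in, not Dota 2 and not the environment used for the run above. The class name, the dynamics, the reward and every parameter are hypothetical.

```python
class ToyCreepBlockEnv:
    """Hypothetical 1-D stand-in for a creep-block environment (not Dota 2)."""

    def __init__(self, max_steps=50, creep_speed=1.0, hero_speed=1.5,
                 block_radius=1.0):
        self.max_steps = max_steps
        self.creep_speed = creep_speed
        self.hero_speed = hero_speed
        self.block_radius = block_radius
        self.reset()

    def reset(self):
        """Start a new episode and return the first observation."""
        self.hero = 2.0    # hero starts slightly ahead of the creep
        self.creep = 0.0
        self.t = 0
        return self._obs()

    def _obs(self):
        return (self.hero, self.creep)

    def step(self, action):
        """action is a float in [-1, 1]: move back or forward along the lane."""
        action = max(-1.0, min(1.0, float(action)))
        self.hero += action * self.hero_speed

        # The creep is slowed while the hero stands just ahead of it.
        gap = self.hero - self.creep
        blocked = 0.0 <= gap <= self.block_radius
        advance = self.creep_speed * (0.25 if blocked else 1.0)
        self.creep += advance

        self.t += 1
        reward = -advance                  # slower creep means higher reward
        done = self.t >= self.max_steps
        return self._obs(), reward, done, {"blocked": blocked}


env = ToyCreepBlockEnv(max_steps=3)
obs = env.reset()
done = False
while not done:
    obs, reward, done, info = env.step(0.0)  # stand still every step
    print(obs, reward, done, info)
```

Any agent written against `reset()` and `step()` could then be pointed at a real Dota-backed environment with the same interface.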

  • #2

    The result looks pretty good! How did you manage to reset the waves after they get past the tower?


    • #3
      Thanks! Right now I'm spawning each creep manually around the mid lane spawn point and then force-killing it. You could also issue console commands for spawning and killing creep waves ('dota_spawn_creeps_mid' and 'dota_kill_creeps all', AFAIK).