Linked below is an interesting article about AlphaGo Zero (a completely self-trained variant) and how it fared against the supervised-training variants (those that beat human professionals).

https://www.nature.com/nature/journa...ature24270.pdf

I found it to be a cool read with some new insights into things I hadn't considered (mind you, I'm still a noob in this domain).

Obviously there are differences between Go and Dota2, the main one being that one is a turn-based game and the other is continuous, but perhaps there are approaches or approximations that can handle that. Anyways, if you think the ideas from the article are not applicable to Dota2 then I apologize for wasting your time; otherwise, enjoy the read.