
Thread: My Project :D

  1. #1

    My Project :D

    Site: datadrivendota.com

    I've been working on something I'd like to share in case you're interested or have feedback. I want to do the best data analytics possible with Dota's data, so I've been taking per-second snapshots of replay data and building a charting frontend to present it.

    Examples:

    Comparing Slark and Drow from the last game of the Shanghai Major
    Game snapshots from a parsed match
    Meta snapshot from the Shanghai Major

    There are some things (like fight analysis) that just don't work when rolled up to a 10-second or 1-minute interval, so I am keeping things as granular as possible, even though that sometimes presents speed/optimization problems. If you'd like to see your data presented this way, let me know your Steam ID and I can hook you up. There are still bugs being ironed out and some things are clunky, but we have some cool stuff in the pipe.

  2. #2
    MuppetMaster42 (Basic Member, joined Nov 2011, Australia, 585 posts)
    looks pretty slick man.
    you've got a lot of data there, it's great for comparing.
    I'd love to see what sort of analytical tools you could build off of this.


    a few things:
    - pls sir, get rid of the Times New Roman on the graphs. my eyes.
    - your graphing library seems really slow with that amount of data.
    you're using D3 i think?
    One of the problems is that your library uses SVG to do the graphs, but with that much data you end up having a really deep AND wide DOM tree.
    Which means the browser just sucks at re-rendering and querying it.
    Additionally with an SVG model the library ends up having the graph representation in the DOM, as well as the same data duplicated in internal data structures.
    might be worth seeing if you can find another library that uses canvas drawing instead?
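    something like this is what I mean by canvas drawing: one flat <canvas> per chart and a single stroked path per series, instead of thousands of SVG nodes. rough sketch only (the `Point` shape and `drawSeries` are just illustrative names, not your code):
    Code:
    // Rough sketch: render one series into a single <canvas>, so the DOM stays flat.
    // xScale/yScale can still come from D3 scales; only the drawing target changes.
    interface Point { time: number; value: number; }

    function drawSeries(canvas: HTMLCanvasElement, points: Point[],
                        xScale: (t: number) => number,
                        yScale: (v: number) => number): void {
        const ctx = canvas.getContext("2d");
        if (!ctx) return;
        ctx.clearRect(0, 0, canvas.width, canvas.height);
        ctx.beginPath();
        points.forEach((p, i) => {
            const x = xScale(p.time);
            const y = yScale(p.value);
            if (i === 0) { ctx.moveTo(x, y); } else { ctx.lineTo(x, y); }
        });
        ctx.stroke(); // one path draw per series, zero per-point DOM nodes
    }
    the tradeoff is you lose free hit-testing/mouseover from the DOM, so you'd have to do your own nearest-point lookup for tooltips.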


    some things to help your bandwidth:
    - rejig your library bundler so that it either uses only minified library files, uglifies each library file before bundling, or uglifies after bundling.
    (I'd go with option 1 or 2; rough sketch of option 2 after this list.)
    - rejig your application bundler so that it uglifies before bundling.
    - use a minifier on your HTML.
    - use a minifier on your CSS after you've compiled it from SASS.
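    to illustrate the ordering for option 2 (uglify each library file, then bundle). sketch only; `minifyJs` here is just a placeholder for whatever minifier your build actually calls:
    Code:
    // Sketch of option 2: minify each library file first, then concatenate.
    // minifyJs() is a placeholder pass-through; swap in the real minifier.
    import * as fs from "fs";

    const minifyJs = (source: string): string => source; // placeholder only

    function bundleLibraries(files: string[], outFile: string): void {
        const bundled = files
            .map(f => fs.readFileSync(f, "utf8"))
            .map(src => minifyJs(src))   // per-file minification before bundling
            .join(";\n");                // ";" guards against missing semicolons
        fs.writeFileSync(outFile, bundled);
    }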

  3. #3
    Thank you for the feedback, Muppet! Your opinion means a lot; you have been a big help in this. If you think of anything you would like to see, or have any other feedback, please send it my way.

    I am pushing the font change to production today, and will reexamine minifying all the things. When you say things seem slow, were you looking at the blog or elsewhere? (The blog still uses some older holdover code, which is known to be worse. I am upgrading the core version before going back and redoing graphics.) I intend to upgrade all the zooming charts to automatically change their step intervals as you zoom, trading off granularity against performance, but if things feel bad I'll move that issue up the queue.
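    For concreteness, the step-interval idea is roughly this (just a sketch; `visibleSeconds` and `targetPoints` are made-up names for the chart's zoom state, not actual code from the site):
    Code:
    // Sketch: keep the number of plotted points roughly constant by widening
    // the sampling step as the visible time window grows.
    interface Sample { time: number; value: number; }

    function downsample(samples: Sample[], visibleSeconds: number,
                        targetPoints: number = 500): Sample[] {
        // e.g. 2400 visible seconds / 500 target points => keep every 5th sample
        const step = Math.max(1, Math.floor(visibleSeconds / targetPoints));
        return samples.filter((_, i) => i % step === 0);
    }
    Zooming in shrinks the visible window, so the step drops back toward the raw per-second data.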

    As for upcoming things: I am currently spiking something that looks like Mario Kart ghost racing. Hopefully I will be able to demo it here this week!

  4. #4
    MuppetMaster42
    The only reason I mention minifying is to save you some of that sweet, sweet bandwidth.
    Save you money so you can keep doing bigger and better things.



    For slowness, I was referring to this page in particular: https://www.datadrivendota.com/match...parsed-detail/
    It has a large set of time data across a smaller number of graphs, and it is choppy as hell on my Chrome (scrolling is jumpy, mouseover lags behind the cursor, zooming in is slow).

    it's one of those weird things, because this page: https://www.datadrivendota.com/match...54&pmses=130,4
    has a smaller set of time data, but probably has more total points across all of its graphs, yet it performs near perfectly (not 100% smooth, but near as makes no appreciable difference!)

    another weird thing that I just noticed is that this page: https://www.datadrivendota.com/match...99&pmses=130,4 (which is a hybrid of the first and second)
    performs similarly to the second, even though it has the same time scale as the first, and more graphs.
    Maybe there's something different you've done on the first page? dunno.



    ohhh that sounds interesting. I can't wait to see it!

  5. #5
    MuppetMaster42
    I wrote this next post before actually considering how much gzip does... so it may not be worth the effort.
    The TL;DR of it is that you can save ~1KB on each JSON request by changing your data structures.


    depending on how much your bandwidth costs, it also might be useful to look into some form of pseudo-minification algorithm for your json data.

    because you have keys repeated many, many times - a lot of wasted bandwidth!

    a few options:

    1)
    replace all string keys with number keys, and send along a lookup table for the string keys,
    i.e.
    Code:
    [
        {"time": 131, "offset_time": -89, "hero_kill_income": 0},
        {"time": 132, "offset_time": -88, "hero_kill_income": 0},
        ..............
    ]
    becomes
    Code:
    {
        "dict": [
            "time",
            "offset_time",
            "hero_kill_income"
        ],
        "data": [
            {"0": 131, "1": -89, "2": 0},
            {"0": 132, "1": -88, "2": 0},
            ..............
        ]
    }
    which can be reduced to arrays for even further space savings (as long as every object in data follows the same pattern!)
    Code:
    {
        "dict": [
            "time",
            "offset_time",
            "hero_kill_income"
        ],
        "data": [
            [131, -89, 0],
            [132, -88, 0],
            ..............
        ]
    }
    I did a test on this json: https://s3.amazonaws.com/datadrivend...lls_v1.json.gz
    before minification: 106483B - zipped: 7.02KB
    after minification: 26703B - zipped: 5.94KB
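    and on the client side, expanding the array form back into keyed objects is cheap; something like this (sketch only, `decodeRows` is just an illustrative name):
    Code:
    // Sketch: expand the compact {dict, data} payload back into keyed objects
    // so the charting code can keep using named fields.
    interface CompactPayload {
        dict: string[];     // e.g. ["time", "offset_time", "hero_kill_income"]
        data: number[][];   // e.g. [[131, -89, 0], [132, -88, 0], ...]
    }

    function decodeRows(payload: CompactPayload): Record<string, number>[] {
        return payload.data.map(row => {
            const obj: Record<string, number> = {};
            payload.dict.forEach((key, i) => { obj[key] = row[i]; });
            return obj;
        });
    }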

  6. #6
    Hmmm. You are right about those examples, and that is very concerning. I will try splicing in a different charting implementation locally. You are also right that D3 can be very burdensome on the DOM, and now that I have a better understanding of my required feature set I might be able to cut some corners.

    As for the data structures, writing them this way makes the code clean and extensible at the cost of some bandwidth, but I try to economize by only pulling the slices I need. (The full JSON file with all the parts in it is tens of MB.) The hope is that I can add a cheap subscription service (~$2/mo?) to cover all the costs, which should be more than sufficient for these kinds of waste (and help make rent). In theory I can offer replay caching, deep parsing + chart analytics, visualizations of all kinds, and maybe a redundant data API for the community if this gets off the ground. We'll see!

    Also, I pushed the font change just for you (and everyone else with eyes). If you see anything else, let me know, or if you want all of _your_ matches to show up parsed, send me a Steam ID.

  7. #7
    Yeah, it looks like the high-intensity zoomable charts, at least, will need to be WebGL-based. Plotly will probably work, and I can still use D3 for some smaller things. Annoying, but doable!
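    To sketch what I mean by WebGL-based (not final code; Plotly's "scattergl" trace type renders with WebGL instead of SVG, and I'm assuming the library is loaded globally here):
    Code:
    // Sketch: a WebGL-backed line trace in plotly.js via the "scattergl" type.
    // Assumes plotly.js is loaded globally (e.g. from a <script> tag).
    declare const Plotly: any;

    function renderSeries(el: HTMLElement, times: number[], values: number[]): void {
        const trace = {
            type: "scattergl",   // WebGL rendering instead of SVG nodes
            mode: "lines",
            x: times,
            y: values,
        };
        Plotly.newPlot(el, [trace], { title: "per-second series" });
    }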

  8. #8
    I added some light technical documentation for working with replay files to my blog; maybe it will be helpful to someone.

  9. #9
    MuppetMaster42
    I started writing another replay parser a long while ago (https://github.com/bradzacher/Eaglesong),
    but after a while of trying to reverse-engineer dotabuff's sange/yasha, skadistat's smoke, etc., I threw in the towel lol - so many magic numbers and weird things.

    also your link in the conclusion is pointing to localhost: dynamic charting environments = http://127.0.0.1:8000/matches/time-lapse/

  10. #10
    You are right! I have corrected the link. The Times regrets the error.

    When you were building your own parser, what did you want to get out of the data?
