Wednesday, April 22, 2015

tldr: Bug in #MySQL or my query, but I can't tell which

The TEDxIU Twitter user tweeted only once on a Saturday in the last year.

I know this because I wrote a program to put all their tweets into a database, along with all the tweets from TEDxEast, TEDxPurdueU, and a bunch of others. It's still running, but I want to get started on the analysis.

Here's a query:

    SELECT  ''
        ,   DAYOFWEEK(b.datestamp) dw   -- MySQL: 1 = Sunday ... 7 = Saturday
        ,   HOUR(b.datestamp) h
        -- ,   COUNT(master.id) 'master'
        ,   COUNT(tedxiu.text) 'tedxiu'

    FROM base_week b

    -- tweets from the tedxiu account in the same day-of-week/hour slot
    LEFT OUTER JOIN tedx_master tedxiu
        ON ( tedxiu.screen_name = 'tedxiu'
        AND DAYOFWEEK(tedxiu.created) = DAYOFWEEK(b.datestamp)
        AND      HOUR(tedxiu.created) = HOUR(b.datestamp) )

    -- tweets from every account in the same slot (commented out for now)
    -- LEFT OUTER JOIN tedx_master master
    --     ON ( DAYOFWEEK(master.created) = DAYOFWEEK(b.datestamp)
    --     AND      HOUR(master.created) = HOUR(b.datestamp) )
    WHERE DAYOFWEEK( b.datestamp ) = 7
    GROUP BY dw , h
    ORDER BY dw , h
    ;

And when I run it, it tells me they tweeted once, in hour 12 (the noon hour).

dw h tedxiu
7 0 0
7 1 0
7 2 0
7 3 0
7 4 0
7 5 0
7 6 0
7 7 0
7 8 0
7 9 0
7 10 0
7 11 0
7 12 1
7 13 0
7 14 0
7 15 0
7 16 0
7 17 0
7 18 0
7 19 0
7 20 0
7 21 0
7 22 0
7 23 0
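
For reference, MySQL's DAYOFWEEK() numbers days from 1 = Sunday through 7 = Saturday, so the dw = 7 filter picks out Saturdays. A quick way to check the numbering is to run it against known dates. The two dates below are just examples I picked from the week of this post, not values from the tweet data:

```sql
-- DAYNAME() returns English names under the default lc_time_names setting.
SELECT DAYNAME('2015-04-18') AS day_name, DAYOFWEEK('2015-04-18') AS dw   -- Saturday, 7
UNION ALL
SELECT DAYNAME('2015-04-19'), DAYOFWEEK('2015-04-19')                      -- Sunday, 1
;
```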

When I un-comment the "master" join, I get this output.

dw h tedxiu
7 0 0
7 1 0
7 2 0
7 3 0
7 4 0
7 5 0
7 6 0
7 7 0
7 8 0
7 9 0
7 10 0
7 11 0
7 12 18
7 13 0
7 14 0
7 15 0
7 16 0
7 17 0
7 18 0
7 19 0
7 20 0
7 21 0
7 22 0
7 23 0

The "tedxiu" count for hour 12 jumps from 1 to 18. The results from "master" are bleeding out into "tedxiu", which is stupid.
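
One likely explanation, and it is only a guess since I haven't verified it against the tables: this is join fan-out rather than a MySQL bug. Both joins match on the same day/hour slot, so each base_week row is paired with every combination of a matching "tedxiu" row and a matching "master" row. If hour 12 has 1 tedxiu tweet and 18 tweets overall, the join yields 1 x 18 = 18 rows, all with a non-NULL tedxiu.text, and COUNT(tedxiu.text) counts all 18. A sketch of one way around it is to join tedx_master once and count conditionally. It assumes id is never NULL and that base_week has one row per day/hour slot. The aliases m, master_count and tedxiu_count are mine:

```sql
SELECT  DAYOFWEEK(b.datestamp) dw
    ,   HOUR(b.datestamp) h
    -- every tweet in this slot, from any account
    ,   COUNT(m.id) AS master_count
    -- only tedxiu's tweets: the CASE yields NULL for other accounts,
    -- and COUNT() skips NULLs
    ,   COUNT(CASE WHEN m.screen_name = 'tedxiu' THEN m.id END) AS tedxiu_count

FROM base_week b

LEFT OUTER JOIN tedx_master m
    ON ( DAYOFWEEK(m.created) = DAYOFWEEK(b.datestamp)
    AND      HOUR(m.created) = HOUR(b.datestamp) )

WHERE DAYOFWEEK( b.datestamp ) = 7
GROUP BY dw , h
ORDER BY dw , h
;
```

Keeping both joins and switching to COUNT(DISTINCT tedxiu.id) should also stop the double counting, as long as id is unique per tweet, but the single join avoids building the multiplied rows in the first place.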
