It is a classical question and I know there are many work around like here: Select a Column in SQL not in Group By but they do not work for my issue on Bigquery.
I have a table with tweets from Twitter and I want a ranking for the urls including any tweet text.
ID tweet url
1 my github tweet http://www.github.com/xyz
2 RT github tweet http://www.github.com/xyz
3 another tweet http://www.twitter.com
4 more tweeting http://www.github.com/abc
I tried the following query, but then id 1 and 2 are counted separately.
SELECT tweet, count(url) as popularity, url FROM table group by tweet, url order by popularity desc
How can I count/rank the urls correctly and still preserve any associated tweet text in the result? I do not care if it is from ID 1 or 2.
Here is one approach:
SELECT url, COUNT(*) AS popularity, GROUP_CONCAT(tweet)
FROM Table GROUP BY url ORDER BY popularity
GROUP_CONCAT aggregation function will concatenate all the tweets associated with same URL using comma as separator (you can pick another separator as second parameter to GROUP_CONCAT).