Pandas rely on groups

I have a pandas dataframe that looks as follows: ID round player1 player2 1 1 A B 1 2 A C 1 3 B D 2 1 B C 2 2 C D 2 3 C E 3 1 B C 3 2 C D 3 3 C A The dataframe contains sport match results, where the ID column denotes one tournament, the round column

SQL case and count each line

How can I get the required results below? I could get all unique categories by adding DISTINCT, but in retrieving the total of each category the query below doesn't work. Table structure: BEER ID | NAME | TYPE | ALCOHOL | Required Result category | t

Count all past occurrences of an element in a large dataset

I have a quite large dataframe (3 million rows) that looks like this: df = pd.DataFrame({'user_id' : ['100','101','102','103','104'], 'service_id' : ['73', '73', '46', '12', '12'], 'date_of_service' : ['2015-06-10 17:00:00', '2014-09-27 17:00:00', '2

how to insert ungrouped data

Inspired by this great answer I wrote the following query that returns the AVG calculated over 5-minute intervals for the last year. What I would like to have is all the 5-minute intervals, set to null if no rows fit into a partic

python groupby itertools list of methods

I have a list like this: #[YEAR, DAY, VALUE1, VALUE2, VALUE3] [[2014, 1, 10, 20, 30], [2014, 1, 3, 7, 4], [2014, 2, 14, 43,5], [2014, 2, 33, 1, 6] ... [2013, 1, 34, 54, 3], [2013, 2, 23, 33, 2], ...] and I need to group by years and days, to obtain s
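A minimal sketch of grouping such a list by year and day with `itertools.groupby`. The excerpt is cut off before it says what should be computed per group, so summing each value column is only an assumption here, and `sum_by_year_and_day` is a hypothetical helper name. Only the rows visible in the excerpt are used.

```python
from itertools import groupby
from operator import itemgetter

# Rows visible in the excerpt: [YEAR, DAY, VALUE1, VALUE2, VALUE3]
rows = [
    [2014, 1, 10, 20, 30],
    [2014, 1, 3, 7, 4],
    [2014, 2, 14, 43, 5],
    [2014, 2, 33, 1, 6],
    [2013, 1, 34, 54, 3],
    [2013, 2, 23, 33, 2],
]


def sum_by_year_and_day(rows):
    """Group rows by (year, day) and sum each value column (assumed aggregation)."""
    key = itemgetter(0, 1)
    result = []
    # groupby only merges adjacent items, so the rows must be sorted by the same key first.
    for (year, day), group in groupby(sorted(rows, key=key), key=key):
        # Transpose the value columns of the group and sum each one.
        totals = [sum(column) for column in zip(*(row[2:] for row in group))]
        result.append([year, day, *totals])
    return result


summed = sum_by_year_and_day(rows)
```

Sorting before grouping matters: without it, `groupby` would start a new group every time the key changes, even if the same (year, day) pair appears again later in the list.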

Group to get a transaction min

Currently I have a list of transactions as below: Front Office ID Transaction ID TradeDate SettlementDate 10000 1234 2015-03-03 2015-03-04 10000 1235 2015-03-03 2015-06-17 10001 1232 2015-03-13 2015-03-18 10001 1231 2015-03-13 2015-06-17 What I need

Select 1 column in a Group By LINQ query

I think what I need is relatively simple but every example I Google just returns results using First(), which I'm already doing. Here is my expression: var options = configData.AsEnumerable().GroupBy(row => row["myColumn"]).Select(grp => g

Percentage by group - oracle

I have this sample. What I need is an average per key, not per key and value. However, the syntax I used appears to give me the average per key and value: select avg(value2),KEY,VALUE from testavg GROUP BY key,value order by key, value Doing otherw

Group R by aggregate

In R (which I am relatively new to) I have a data frame that consists of many columns, including a numeric column I need to aggregate according to groups determined by another column. SessionID Price '1', '624.99' '1', '697.99' '1', '649.00' '7', '779.00' '7', '7

Sql query for grouping with a deduplicated column

I have the following table create table events ( event_id, event_name, datetime, email) And I want to display the events per week, and the events per week deduplicated by emails, in a single query. While doing: select date_trunc('week', datetime) wdt

Group by Month Showing Duplicate Months in SQL

I have a table that logs transactions for a Warehouse DB. Amongst the information in this table is the location the transaction occurred to, the date the transaction ended, the time it ended, the qty transferred, and the Division. I am trying to get

The GROUP BY and ORDER BY clauses cause the query to be slow

I have the following query, which takes a very long time to execute. Both article_category and article tables have approximately 250k rows. I tried several multiple-column indexes, but none of them sped up the query. The current EXPLAIN looks like this (that st

Group of SQL queries by problem

I am trying to display the number of quotes made during a certain period, sum the forecast for each and group them by the person who created the quote. Below is my query...but I think I am doing something wrong with the group by, but I don't know wha

Select column ID and maximum row ID

Is there a way to tell MySQL that while making something like this SELECT id, MAX(seq) FROM t1 GROUP BY ident; I can also get the id value? I know I shouldn't be using id if it's not in a group by but I feel like it's strange to make a multi pass to g

SQL query to retrieve SUM in different ranges DATE

I have a table with information about sold products, the customer, the date of the purchase and summary of sold units. The result I am trying to get should be 4 rows where the 1st three are for January, February and March. The last row is for the pro

GROUP BY table1.column_name ORDER BY table2.column_name

I have a table of posts, and a table of users, and I have a table in-between that has the relationships between the two. One user can have many posts etc. The posts can be 'starred' by adding a 0 or 1 into the starred column on posts table. Only one

Linq to split / analyze substrings

I have got a List of strings like: String1 String1.String2 String1.String2.String3 Other1 Other1.Other2 Test1 Stuff1.Stuff1 Text1.Text2.Text3 Folder1.Folder2.FolderA Folder1.Folder2.FolderB Folder1.Folder2.FolderB.FolderC Now I would like to group th

How to group the results by month?

Currently, my code here produces such results: SELECT YEAR(date_added) AS YEAR, MONTHNAME(date_added) AS MONTH, COUNT(*) AS TOTAL FROM news GROUP BY MONTH UNION ALL SELECT YEAR(date_added) AS YEAR, MONTHNAME(date_added) AS MONTH , COUNT(*) AS TOTAL

Translation of the SQL query into Doctrine2 DQL

I'm trying to translate this (My)SQL to DQL SELECT content, created, AVG(rating) FROM point GROUP BY DAY(created) ORDER BY created ASC And I'm stuck at GROUP BY part, apparently DAY/WEEK/MONTH isn't recognized as valid "function". [Semantical Er

MySQL filesort on GROUP BY YEAR & Month

I have a large table that stores debug information for my web app. The issue is that the table is now 500,000 rows and one of the queries is slow because the index isn't being used. SQL: EXPLAIN SELECT count(*) AS `count`, month(event_date) AS `month

Dynamic column value with group by (sql and linq server)

I have a table Plan with the following sample data. I want to aggregate the result by PlanMonth, and for PlanStatus I want Drafted in the result if any of the values in a group is Drafted, and Under Approval otherwise. I have done it using foll

Group by & Eliminate variable values

Scenario 1: Table: id column1 column2 1 "bla" "foo" 2 "bla" "bar" I want to group by column1 and get null for column2, cause there's not the same value in all rows. Scenario 2: Table: id column1 column2 1 "bla&
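The rule described in Scenario 1 (group by column1, and return null for column2 when the rows in a group disagree) can be sketched in plain Python as follows. This is only an illustration of the rule, not a SQL answer; `group_collapse` is a hypothetical helper name, and the behaviour for groups whose values all agree (returning that shared value) is an assumption, since Scenario 2 is cut off in the excerpt.

```python
from collections import defaultdict


def group_collapse(rows):
    """Group (column1, column2) pairs by column1.

    column2 is kept only when every row in the group has the same value;
    otherwise it becomes None (the Python stand-in for SQL NULL).
    """
    values_by_key = defaultdict(set)
    for column1, column2 in rows:
        values_by_key[column1].add(column2)
    return {
        column1: next(iter(values)) if len(values) == 1 else None
        for column1, values in values_by_key.items()
    }


# Scenario 1 from the excerpt: the two rows disagree on column2.
scenario_1 = [("bla", "foo"), ("bla", "bar")]
collapsed = group_collapse(scenario_1)
```

Collecting the distinct values per group and checking whether there is exactly one is the same idea as comparing the minimum and maximum of column2 within each group in SQL.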

Django Models Group by

I have this simple SQL query - SELECT pid, COUNT(*) AS docs FROM xml_table WHERE suid='2' GROUP BY pid; How do I get this using Django ORM (i.e. django models). Basically I am not getting how to do GROUP BY? XML_table.objects.filter(suid='2').values('

Top N per group with multiple table joins

Based on my research, this is a very common problem which generally has a fairly simple solution. My task is to alter several queries from get all results into get top 3 per group. At first this was going well and I used several recommendations and a

Group by a range of X days

I have a set of records and I want to count and group them by a certain range, e.g. I want to count the records that were created in groups of X days: SELECT COUNT(*) FROM `table` GROUP BY /*`created` 3 days/* You can do something like SELECT COUN

Number and group by date must return 0 to no value

I'm trying to report the number of interviews we did per day. So I have a table of interviews such as interviewid,staffid,date,comments... and a date reference table containing all dates from 2005 to 2020, with a single date field named ref. My quer