Use the ternary operator in the apply function of a pandas DataFrame, without grouping columns

How can I use the ternary operator in a lambda function passed to the apply function of a pandas DataFrame?

First of all, here is the R/plyr code that produces exactly what I want:

ddply(mtcars, .(cyl), summarise, sum(ifelse(carb==4,1,0))/sum(ifelse(carb %in% c(4,1),1,0)))

In the code above, I can use the ifelse function, R's ternary operator, to compute the resulting data frame.

However, when I try to do the same in Python/pandas with the following code:

mtcars.groupby(["cyl"]).apply(lambda x: sum(1 if x["carb"] == 4 else 0) / sum(1 if x["carb"] in (4, 1) else 0))

the following error occurs:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

So how can I compute and get the same dataframe as in R/plyr?

For your information, if I use the ternary operator without grouping the columns, as in:

mtcars.apply(lambda x: sum(1 if x["carb"] == 4 else 0) / sum(1 if x["carb"] in (4, 1) else 0), axis=1)

I do get a resulting data frame for some reason (but it's not what I want).

Thanks.

[Update]

Sorry, the original example was not a good illustration of the ternary operator, since it uses 1 and 0, which can be treated as booleans. Here is the updated R/plyr code:

ddply(mtcars, .(cyl), summarise, sum(ifelse(carb==4,6,3))/sum(ifelse(carb %in% c(4,1),8,4)))

Is it feasible to use the ternary operator in this situation?


I think your code could be translated to this:

mtcars.groupby(["cyl"])['carb'].apply(lambda x: sum((x == 4).astype(float)) / sum(x.isin((4, 1))))

Toy example:

>>> mtcars = pd.DataFrame({'cyl':[8,8,6,6,6,4], 'carb':[4,3,1,5,4,1]})
>>> mtcars
   carb  cyl
0     4    8
1     3    8
2     1    6
3     5    6
4     4    6
5     1    4
>>> mtcars.groupby(["cyl"])['carb'].apply(lambda x: sum((x == 4).astype(float)) / sum(x.isin((4, 1))))
cyl
4      0.0
6      0.5
8      1.0
dtype: float64
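
As for why the original lambda fails, here is a rough sketch on the toy data. Inside groupby().apply the lambda receives a whole group, so x["carb"] == 4 is a boolean Series with one value per row, while Python's "a if cond else b" needs a single True/False. The names group, mask, count_4 and count_4_or_1 are only illustrative:

```python
import pandas as pd

mtcars = pd.DataFrame({'cyl': [8, 8, 6, 6, 6, 4], 'carb': [4, 3, 1, 5, 4, 1]})

group = mtcars[mtcars['cyl'] == 6]   # roughly what the lambda sees for cyl == 6
mask = group['carb'] == 4            # boolean Series, one value per row

try:
    1 if mask else 0                 # `if` needs one bool, not a whole Series
    raised = False
except ValueError:
    raised = True                    # the "truth value ... is ambiguous" error

# Summing a boolean mask counts its True values, which is what
# sum(ifelse(cond, 1, 0)) does in R.
count_4 = int(mask.sum())
count_4_or_1 = int(group['carb'].isin([4, 1]).sum())
ratio = count_4 / count_4_or_1
```

So the comparison has to stay vectorized (==, isin) and be summed afterwards, rather than being fed to a Python-level if/else.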

update

In a more complex case, you can use the numpy.where() function:

>>> import numpy as np
>>> mtcars.groupby(["cyl"])['carb'].apply(lambda x: sum(np.where(x == 4,6,3).astype(float)) / sum(np.where(x.isin((4,1)),8,4)))
cyl
4      0.375
6      0.600
8      0.750
dtype: float64
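
If the lambda gets hard to read, the same idea can be written as a standalone script with a named function. This is only a sketch on the toy data above; weighted_ratio is a made-up helper name, not something pandas provides:

```python
import numpy as np
import pandas as pd

mtcars = pd.DataFrame({'cyl': [8, 8, 6, 6, 6, 4], 'carb': [4, 3, 1, 5, 4, 1]})

def weighted_ratio(carb):
    # np.where(cond, a, b) picks a where cond is True and b elsewhere,
    # element by element, much like R's ifelse(cond, a, b).
    numerator = np.where(carb == 4, 6, 3).sum()
    denominator = np.where(carb.isin([4, 1]), 8, 4).sum()
    return numerator / denominator

# One value per cyl group, indexed by cyl.
result = mtcars.groupby('cyl')['carb'].apply(weighted_ratio)
print(result)
```

On the toy frame this should give the same three values as the one-liner above.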