Attach datetimes of DataFrames and the fill data before

advertisements

I have two pandas DataFrames:

DataFrame a

2013-03-25 13:15:00     1
2013-03-26 13:15:00     2
2013-03-28 13:15:00     4
2013-03-29 13:15:00     5

and DataFrame b

2013-03-25 13:15:00    25
2013-03-27 13:15:00    15
2013-03-28 13:15:00     5
2013-03-29 13:15:00    10

I am trying to join the dates and forward fill the values. Right now I am doing it like this:

ab = pd.concat([a, b], axis=1)
ab.fillna(method='ffill', inplace=True)

a = ab.ix[:,0]
b = ab.ix[:,1]

So, ab is

2013-03-25 13:15:00     1    25
2013-03-26 13:15:00     2   NaN
2013-03-27 13:15:00   NaN    15
2013-03-28 13:15:00     4     5
2013-03-29 13:15:00     5    10

and then

2013-03-25 13:15:00     1    25
2013-03-26 13:15:00     2    25
2013-03-27 13:15:00     2    15
2013-03-28 13:15:00     4     5
2013-03-29 13:15:00     5    10

This has two disadvantages. Firstly, a and b are Series now. Secondly, this solution wouldn't work for multi-column DataFrames. Is it possible to just do this for a and b in place without going the way over ab. This seems like a fairly standard process. What am I missing?

Edit:

a.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 4 entries, 2013-03-25 13:15:00 to 2013-03-29 13:15:00
Data columns (total 1 columns):
icap    4 non-null float64
dtypes: float64(1)
memory usage: 64.0 bytes

b is equivalent.


I think in your case using combine_first combined with ffill will give you what you want:

In [46]:
a.combine_first(b).ffill()

Out[46]:
                     a   b
index
2013-03-25 13:15:00  1  25
2013-03-26 13:15:00  2  25
2013-03-27 13:15:00  2  15
2013-03-28 13:15:00  4   5
2013-03-29 13:15:00  5  10

This will join and align on indices taking the union of both dfs, this will introduce NaN values which you can fill using ffill

From the result of the above you can just assign back the cols of interest, also it looks like what you really want is to reindex using the union of the indices:

In [48]:
a.reindex(a.index.union(b.index)).ffill()

Out[48]:
                     a
index
2013-03-25 13:15:00  1
2013-03-26 13:15:00  2
2013-03-27 13:15:00  2
2013-03-28 13:15:00  4
2013-03-29 13:15:00  5

So you can do this for both dfs without having to perform any merging/combining