How to select column's unique values by date: pandas, time series

First, you may want to convert the date column from dtype O (object) to datetime64[ns] and then set it as the index, which gives you a DatetimeIndex. Since this is time series data, that is a powerful structure to work with in pandas.
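A minimal sketch of that conversion, since I can't see your actual data: the column names "date" and "value" and the sample rows are made up for illustration, so swap in your own.

```python
import pandas as pd

# Hypothetical sample data; "date" and "value" are assumed column names.
df = pd.DataFrame({
    "date": ["2020-02-01", "2020-02-01", "2020-02-02", "2020-02-03"],
    "value": [5, 3, 7, 1],
})

# The date strings start out as plain strings, not datetimes.
df["date"] = pd.to_datetime(df["date"])

# Use the dates as the index; sorting keeps label-based slicing well behaved.
df = df.set_index("date").sort_index()

print(type(df.index).__name__)
```

Sorting the index is optional here, but date-range slicing is more reliable on a sorted DatetimeIndex.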

From there, you should be able to use .loc to slice out the "2020-02-01" date and then aggregate with min.
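Roughly, the slice-then-min step could look like this. Again, the "value" column and the numbers are invented for the example, and I'm assuming the frame already has a sorted DatetimeIndex.

```python
import pandas as pd

# Hypothetical frame that already has a sorted DatetimeIndex.
df = pd.DataFrame(
    {"value": [5, 3, 7, 1]},
    index=pd.to_datetime(
        ["2020-02-01", "2020-02-01", "2020-02-02", "2020-02-03"]
    ),
)

# Slice from the date to itself so the result is always a set of rows,
# whether that date appears once or many times.
one_day = df.loc["2020-02-01":"2020-02-01"]

# Smallest value recorded on that date.
day_min = one_day["value"].min()
print(day_min)
```

Writing the slice with the same start and end date is a design choice: it avoids getting a single row back as a Series when the date happens to be unique.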

For the second question, you could use groupby on the dataframe, pass df.index as the by parameter, and aggregate with count. That gives you essentially a value_counts over the index. In theory you could also call value_counts on the date column before converting it.

In all honesty, you could do both of these tasks a thousand different ways. As long as a solution is readable and maintainable, doesn't cause bottlenecks or performance problems, and matches the style your group or team expects (if that even applies), it is just as viable as any other solution that meets the same criteria.
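Here is a small sketch of the per-date count both ways, on made-up data with an assumed "value" column, so treat it as an illustration rather than a drop-in answer.

```python
import pandas as pd

# Hypothetical frame with a DatetimeIndex; "value" is an assumed column name.
df = pd.DataFrame(
    {"value": [5, 3, 7, 1]},
    index=pd.to_datetime(
        ["2020-02-01", "2020-02-01", "2020-02-02", "2020-02-03"]
    ),
)

# Option 1: group on the index itself and count rows per date.
counts = df.groupby(df.index)["value"].count()

# Option 2: value_counts directly on the index, sorted by date for comparison.
vc = df.index.value_counts().sort_index()

print(counts)
print(vc)
```

Note that count skips missing values in the column, while value_counts on the index counts every row. If what you actually want is the number of distinct values per date, swapping count for nunique should do it.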

/r/learnpython Thread