## Pandas .min() method doesn’t seem to be the fastest

I’m trying to get `min`, `max`, `mean`, etc. (some kind of aggregate value) for a Pandas DataFrame column, and the Pandas methods don’t seem to be the fastest. It seems that if I call `.values` first, the run time of these operations improves greatly.

Is this expected behavior? (Meaning: is Pandas doing something wasteful, or is this intentional?) Maybe `.values` costs extra memory, or maybe I’m making assumptions and/or simplifying something in a way that isn’t a given…

“Evidence” of unexpected behavior:

```
import time

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 1000, size=(100000000, 4)), columns=list('ABCD'))

start = time.time()
print(df['A'].min())
print(time.time() - start)
# 0
# 1.35876178741

start = time.time()
print(df['A'].values.min())
print(time.time() - start)
# 0
# 0.225932121277

start = time.time()
print(np.mean(df['A']))
print(time.time() - start)
# 499.49969672
# 1.58990907669

start = time.time()
print(df['A'].values.mean())
print(time.time() - start)
# 499.49969672
# 0.244406938553
```

### Solution

When you select a single column, you get a pandas `Series`, which is built on top of a numpy array but carries extra machinery (an index, dtype handling, missing-value support). Pandas objects are optimized for spreadsheet- and database-style operations such as joins, lookups, and so on.
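As a concrete illustration of that extra machinery (a small sketch, not from the original answer): a `Series` skips missing values by default when reducing, whereas the bare `ndarray` propagates them.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, np.nan])

# Series.min() skips NaN by default (skipna=True)
print(s.min())         # 1.0

# The raw ndarray has no such handling: NaN propagates
print(s.values.min())  # nan
```

That NaN bookkeeping is part of what the Series-level reductions pay for on every call.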

When you call `.values` on a column, you get the underlying numpy `ndarray`, a type optimized for mathematical and vector operations implemented in C.

Even accounting for the cost of extracting the `ndarray`, the efficiency of its mathematical operations easily beats the Series type. Here is a quick discussion on some of the differences.
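As an aside not in the original answer: in modern pandas (0.24+), `Series.to_numpy()` is the documented way to get the underlying array, and for this purpose it behaves like `.values`. A quick sketch:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 1000, size=(1000, 4)),
                  columns=list('ABCD'))

# .to_numpy() returns the underlying ndarray, like .values
arr = df['A'].to_numpy()
print(type(arr))                    # <class 'numpy.ndarray'>
print(arr.min() == df['A'].min())   # True
```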

As a side note, there is a dedicated module, `timeit` (exposed in IPython as the `%timeit` magic), for exactly these kinds of timing comparisons:

```
type(df['A'])
# pandas.core.series.Series

%timeit df['A'].min()
# 6.68 ms ± 121 µs per loop

type(df['A'].values)
# numpy.ndarray

%timeit df['A'].values.min()
# 696 µs ± 18 µs per loop
```
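Outside IPython, the standard-library `timeit` module does the same job. A sketch (a smaller DataFrame is assumed here; absolute numbers will vary by machine, so no expected timings are shown):

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 1000, size=(1_000_000, 1)),
                  columns=['A'])

# Time each approach over a fixed number of runs
series_t = timeit.timeit(lambda: df['A'].min(), number=100)
ndarray_t = timeit.timeit(lambda: df['A'].values.min(), number=100)

print(f"Series.min():  {series_t:.4f} s")
print(f"ndarray.min(): {ndarray_t:.4f} s")
```

Unlike ad-hoc `time.time()` deltas, `timeit` runs the statement many times and disables garbage collection during measurement, which makes small differences far less noisy.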