Python and Pandas on Jupyter

Maybe the title should say “in Jupyter” instead? In any case, I’ve been studying Python in Jupyter notebooks, and it’s some pretty radical stuff. Using NumPy and %matplotlib inline can yield some incredible results. This is a list of commonly used pandas features, with a sample of each.

Loading Dataset

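import pandas as pd

df_18 = pd.read_csv('all_alpha_18.csv')
# The snippets below also use df_08; the 2008 filename is assumed by analogy with the 2018 file
df_08 = pd.read_csv('all_alpha_08.csv')
df_18.head()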

Concise summary of columns and rows – pandas.DataFrame.info

df_18.info()
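
To try the loading and summary steps without the real file, here is a minimal sketch that reads a few made-up rows from an in-memory string. The column names and values are placeholders I invented for illustration, not the actual contents of the dataset.

```python
import io

import pandas as pd

# Made-up CSV text standing in for a real file on disk (hypothetical rows).
csv_text = """Model,Displ,Cyl
ACURA RDX,3.5,6
AUDI A4,2.0,4
BMW X5,3.0,6
"""

# read_csv accepts a file-like object as well as a path
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.head(2))  # first two rows
sample.info()          # prints column names, non-null counts and dtypes
print(sample.shape)    # (rows, columns)
```

In a notebook, leaving sample.head() as the last line of a cell renders the table without needing print.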

Print duplicate rows – pandas.DataFrame.duplicated

duplicate = df_08[df_08.duplicated()]
print("Duplicate rows:")
duplicate

Count duplicate rows – pandas.DataFrame.duplicated

print(df_08.duplicated().sum())
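
Here is a small self-contained sketch of what duplicated() actually marks, using made-up rows rather than the real dataset. It also shows drop_duplicates(), which is the usual next step once you have counted the repeats; that step is my addition, not something covered above.

```python
import pandas as pd

# Made-up sample rows (hypothetical); rows 0 and 2 are identical.
cars = pd.DataFrame({
    "Model": ["ACURA RDX", "AUDI A4", "ACURA RDX", "BMW X5"],
    "Cyl": [4, 4, 4, 6],
})

dup_mask = cars.duplicated()      # True for each repeat of an earlier row
dup_count = int(dup_mask.sum())   # number of repeated rows
deduped = cars.drop_duplicates()  # new DataFrame with the repeats removed

print(dup_mask.tolist())
print(dup_count, len(deduped))
```

By default the first occurrence of a row is not flagged, only the later copies, so the count is the number of rows that would be removed.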

Print/count rows with missing data – pandas.DataFrame.isnull

null_data = df_08[df_08.isnull().any(axis=1)]
print(null_data)
print(len(null_data))  # number of rows with at least one missing value
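
A quick sketch of the same idea on toy data, so the counts are easy to check by eye. The frame below is invented for illustration; the per-column tally with isnull().sum() is an extra I find handy, not part of the snippet above.

```python
import numpy as np
import pandas as pd

# Made-up sample (hypothetical): row 1 lacks Displ, row 2 lacks Fuel.
sample = pd.DataFrame({
    "Model": ["A", "B", "C"],
    "Displ": [2.0, np.nan, 3.5],
    "Fuel": ["Gasoline", "Diesel", None],
})

# Rows where at least one column is missing
rows_with_nulls = sample[sample.isnull().any(axis=1)]
null_row_count = len(rows_with_nulls)

# Missing values counted per column
nulls_per_column = sample.isnull().sum()

print(null_row_count)
print(nulls_per_column)
```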

Column data types – pandas.DataFrame.dtypes

df_08.dtypes

Distinct values in a column – pandas.Series.unique

SmartWayColCnt08 = df_08['SmartWay'].unique()
SmartWayColCnt08.size
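
One thing worth knowing: unique() keeps a missing value as one of the distinct entries, so its size can be one higher than you expect. A minimal sketch with an invented SmartWay-like column (the values are made up), comparing it with nunique():

```python
import pandas as pd

# Made-up values (hypothetical), including one missing entry.
smartway = pd.Series(["yes", "no", "yes", None, "no"], name="SmartWay")

distinct_values = smartway.unique()                 # array of distinct entries, missing value included
distinct_count = smartway.nunique()                 # missing values are skipped by default
distinct_with_nan = smartway.nunique(dropna=False)  # count the missing value as well

print(distinct_values.size, distinct_count, distinct_with_nan)
```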

Dropping columns – pandas.DataFrame.drop

df_08.drop(['Stnd', 'Underhood ID', 'FE Calc Appr', 'Unadj Cmb MPG'], axis=1, inplace=True)
df_18.drop(['Stnd', 'Stnd Description', 'Underhood ID', 'Comb CO2'], axis=1, inplace=True)
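
If you would rather not modify the frame in place, drop() also returns a new DataFrame that you can assign. A small sketch on made-up data (the values are placeholders), including errors="ignore" for labels that may not exist in every file:

```python
import pandas as pd

# Made-up sample (hypothetical values) with two columns we do not need.
sample = pd.DataFrame({
    "Model": ["A", "B"],
    "Stnd": ["s1", "s2"],
    "Underhood ID": ["x1", "x2"],
})

# columns=[...] is equivalent to passing the labels with axis=1
trimmed = sample.drop(columns=["Stnd", "Underhood ID"])

# errors="ignore" skips labels that are not present instead of raising KeyError
safe = sample.drop(columns=["Stnd", "No Such Column"], errors="ignore")

print(list(trimmed.columns))
print(list(safe.columns))
print(list(sample.columns))  # the original is untouched
```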

Rename columns – pandas.DataFrame.rename

df_08.rename(columns={'Sales Area': 'Cert Region'}, inplace=True)

Replace spaces with underscores, lowercase labels

df_08.rename(columns=lambda x: x.strip().lower().replace(" ", "_"), inplace=True)
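
An easy mistake with rename() is forgetting that it returns a new DataFrame unless inplace=True is passed or the result is assigned. A minimal sketch of both rename styles on a made-up frame (column values are placeholders):

```python
import pandas as pd

# Made-up sample (hypothetical); note the stray leading space in the second label.
sample = pd.DataFrame({"Sales Area": ["CA", "FA"], " Air Pollution Score": [6, 7]})

# Without inplace=True the original is left alone, so keep the returned frame
renamed = sample.rename(columns={"Sales Area": "Cert Region"})

# Apply the same cleanup function to every label
cleaned = renamed.rename(columns=lambda x: x.strip().lower().replace(" ", "_"))

print(list(sample.columns))
print(list(cleaned.columns))
```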

Confirm column labels are identical

(df_08.columns == df_18.columns).all()
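
Comparing with == works label by label, so it only makes sense when both frames have the same number of columns. Here is a sketch using empty frames with invented column names, which also shows Index.equals() as an option that copes with different lengths:

```python
import pandas as pd

# Empty frames with made-up column labels (hypothetical).
a = pd.DataFrame(columns=["model", "displ", "cyl"])
b = pd.DataFrame(columns=["model", "displ", "cyl"])
c = pd.DataFrame(columns=["model", "displ"])

# Element-wise comparison collapsed to a single True/False
same_elementwise = bool((a.columns == b.columns).all())

# equals() returns False for a length mismatch, where == would raise an error
same_ab = a.columns.equals(b.columns)
same_ac = a.columns.equals(c.columns)

print(same_elementwise, same_ab, same_ac)
```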

Posted on March 1, 2022 by Aron. This entry was posted in Python.