Here's a simple cheat sheet for working with Pandas functions in Data Science:
Basic Data Structures
Series: A one-dimensional labeled array capable of holding any data type.
import pandas as pd
s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])
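A small sketch of what the labeled index gives you: the same Series can be read by label or by position, and arithmetic applies element-wise. The variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])

by_label = s['b']        # look up a value by its index label
by_position = s.iloc[0]  # look up a value by its integer position
doubled = s * 2          # element-wise arithmetic; the index is preserved
```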
DataFrame: A two-dimensional labeled data structure with columns of potentially different types.
data = {'Country': ['Belgium', 'India', 'Brazil'], 'Capital': ['Brussels', 'New Delhi', 'Brasilia'], 'Population': [11190846, 1303171035, 207847528]}
df = pd.DataFrame(data, columns=['Country', 'Capital', 'Population'])
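Once a DataFrame exists, a quick look at its shape and column types is usually the first step. This is a minimal sketch that rebuilds the same data; the variable names `n_rows`, `n_cols`, and `preview` are just illustrative.

```python
import pandas as pd

data = {
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasilia'],
    'Population': [11190846, 1303171035, 207847528],
}
df = pd.DataFrame(data, columns=['Country', 'Capital', 'Population'])

n_rows, n_cols = df.shape    # (number of rows, number of columns)
column_types = df.dtypes     # one dtype per column
preview = df.head(2)         # first two rows
```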
Data Manipulation
Selecting Data:
df['Country'] # Selects the 'Country' column
df.loc[:, 'Country':'Population'] # Selects all rows and the columns from 'Country' through 'Population' (label slices include both ends)
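The difference between label-based and position-based selection is easy to trip over, so here is a short sketch comparing `loc` and `iloc` on the same data. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasilia'],
    'Population': [11190846, 1303171035, 207847528],
})

# loc slices by label and includes both endpoints
by_label = df.loc[:, 'Country':'Capital']

# iloc slices by integer position and excludes the end, like a Python list
by_position = df.iloc[:, 0:2]

# A single cell: row label 0, column 'Capital'
first_capital = df.loc[0, 'Capital']
```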
Filtering Data:
df[df['Population'] > 10000000] # Filters rows where Population > 10000000
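Filters can also be combined. A minimal sketch, assuming the same three-row data: each condition goes in its own parentheses and they are joined with `&` (and) or `|` (or), not the Python keywords `and` / `or`.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasilia'],
    'Population': [11190846, 1303171035, 207847528],
})

# Single condition: a boolean mask picks the matching rows
large = df[df['Population'] > 100_000_000]

# Two conditions joined with &; the parentheses are required
large_not_india = df[(df['Population'] > 100_000_000) & (df['Country'] != 'India')]
```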
Adding Data:
df['Area'] = [30528, 3287263, 8515767] # Adds a new column 'Area'
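A new column can also be computed from existing ones. The sketch below adds a `Density` column, which is not part of the original cheat sheet and is only there for illustration; it assumes the area values are in square kilometres. When assigning a plain list, its length has to match the number of rows.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Population': [11190846, 1303171035, 207847528],
})

# Assign a list: one value per row, in row order
df['Area'] = [30528, 3287263, 8515767]

# Derive a column from two others (hypothetical 'Density' column)
df['Density'] = df['Population'] / df['Area']
```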
Data Cleaning
Handling Missing Values:
df.fillna(0) # Returns a copy with missing values replaced by 0 (df itself is unchanged)
df.dropna() # Returns a copy without the rows that contain missing values
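The example data has no gaps, so here is a sketch with one missing value put in on purpose (the `NaN` is hypothetical, not real data). It shows how to count missing values and that `fillna` and `dropna` both return new objects rather than changing `df`.

```python
import numpy as np
import pandas as pd

# Hypothetical: India's population is deliberately set to NaN for the demo
df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Population': [11190846, np.nan, 207847528],
})

missing_per_column = df.isna().sum()  # number of missing values in each column
filled = df.fillna(0)                 # new DataFrame with NaN replaced by 0
dropped = df.dropna()                 # new DataFrame without the incomplete row
```

To keep the result, assign it back, for example `df = df.fillna(0)`.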
Dropping Columns:
df.drop(columns=['Area']) # Returns a copy without the 'Area' column; assign the result to keep the change
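A short sketch of the same point: `drop` leaves the original DataFrame alone and hands back a new one. The name `without_area` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Population': [11190846, 1303171035, 207847528],
    'Area': [30528, 3287263, 8515767],
})

without_area = df.drop(columns=['Area'])  # new DataFrame, 'Area' removed
```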
Data Analysis
Descriptive Statistics:
df.describe() # Summary statistics (count, mean, std, min, quartiles, max) for the numeric columns
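By default `describe()` only covers numeric columns. A minimal sketch of the default call next to `include='all'`, which also summarizes the text columns:

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['Belgium', 'India', 'Brazil'],
    'Capital': ['Brussels', 'New Delhi', 'Brasilia'],
    'Population': [11190846, 1303171035, 207847528],
})

numeric_stats = df.describe()           # numeric columns only
all_stats = df.describe(include='all')  # text columns get count/unique/top/freq
```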
Grouping Data:
df.groupby('Country')['Population'].sum() # Groups rows by 'Country' and sums 'Population' within each group
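Grouping is easier to see when a key repeats. The country data has one row per country, so this sketch uses a tiny made-up frame (`Group` and `Value` are hypothetical names and values) to show a grouped sum and several aggregations at once.

```python
import pandas as pd

# Hypothetical toy data: group 'A' appears twice
toy = pd.DataFrame({
    'Group': ['A', 'A', 'B'],
    'Value': [1, 2, 3],
})

totals = toy.groupby('Group')['Value'].sum()                  # one sum per group
summary = toy.groupby('Group')['Value'].agg(['sum', 'mean'])  # several statistics per group
```

Selecting the column before aggregating keeps text columns out of the sum.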
Data Visualization
Plotting Data:
import matplotlib.pyplot as plt
df.plot(kind='bar', x='Country', y='Population')
plt.show()