Pandas DataFrames & Data Analysis
Master data manipulation, analysis, and cleaning with the Pandas library.
Series and DataFrames
Pandas provides two primary data structures: Series (a one-dimensional labeled array) and DataFrame (a two-dimensional labeled table).
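As a minimal sketch of how the two objects relate, the snippet below builds a Series and then a DataFrame that uses it as a column. The values and the names `prices` and `table` are made up for illustration.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
prices = pd.Series([1200, 25, 300], name="Price")

# A DataFrame is a two-dimensional table; selecting one column returns a Series.
table = pd.DataFrame({"Product": ["Laptop", "Mouse", "Monitor"], "Price": prices})

print(type(table["Price"]))  # the column comes back as a pandas Series
print(table.shape)           # (number of rows, number of columns)
```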
Creating a DataFrame
import pandas as pd
data = {
"Product": ["Laptop", "Mouse", "Monitor"],
"Price": [1200, 25, 300],
"InStock": [True, True, False]
}
df = pd.DataFrame(data)
print(df.head())
Reading External Data
Pandas can load data from CSV files, Excel workbooks, SQL databases, and other sources through functions such as read_csv, read_excel, and read_sql.
df = pd.read_csv("sales.csv")
stats = df.describe() # Summary statistics
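To try the loading step without a file on disk, one option is to parse CSV text from an in-memory buffer. This is only a sketch: the column names and values below are invented stand-ins, not the contents of sales.csv.

```python
import io

import pandas as pd

# Stand-in for a file on disk; the columns and values are illustrative only.
csv_text = """Product,Category,Price
Laptop,Computers,1200
Mouse,Accessories,25
Monitor,Computers,300
"""

# read_csv accepts a path or any file-like object.
sales = pd.read_csv(io.StringIO(csv_text))

print(sales.dtypes)  # column types inferred while parsing

# describe() summarizes the numeric columns: count, mean, std, min, quartiles, max.
stats = sales.describe()
print(stats.loc["mean", "Price"])
```

With a real file, the only change would be passing the path, as in pd.read_csv("sales.csv").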
Selection and Filtering
# Select columns
prices = df["Price"]
# Filter rows
expensive = df[df["Price"] > 500]
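The same ideas extend to selecting several columns and combining conditions. The sketch below reuses the small product table from earlier under the hypothetical name `inventory`.

```python
import pandas as pd

inventory = pd.DataFrame({
    "Product": ["Laptop", "Mouse", "Monitor"],
    "Price": [1200, 25, 300],
    "InStock": [True, True, False],
})

# Select several columns by passing a list of names.
subset = inventory[["Product", "Price"]]

# Combine conditions with & (and) or | (or); wrap each comparison in parentheses.
cheap_in_stock = inventory[(inventory["Price"] < 500) & inventory["InStock"]]

# .loc filters rows and picks a column in one step.
expensive_names = inventory.loc[inventory["Price"] > 500, "Product"]

print(cheap_in_stock)
print(expensive_names)
```

The parentheses matter because & binds more tightly than comparison operators in Python, so an unparenthesized condition would not be evaluated as intended.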
Grouping and Aggregating
# Group by category and find the average price (assumes df has a "Category" column)
avg_prices = df.groupby("Category")["Price"].mean()
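For a runnable version of the grouping step, the sketch below uses a made-up table named `catalog` that includes a "Category" column, and also shows agg for computing several statistics at once.

```python
import pandas as pd

# Illustrative data; the categories and prices are invented for this example.
catalog = pd.DataFrame({
    "Product": ["Laptop", "Monitor", "Mouse", "Keyboard"],
    "Category": ["Computers", "Computers", "Accessories", "Accessories"],
    "Price": [1200, 300, 25, 75],
})

# One value per group: the mean price in each category.
avg_prices = catalog.groupby("Category")["Price"].mean()
print(avg_prices)

# agg computes several statistics per group and returns a DataFrame.
summary = catalog.groupby("Category")["Price"].agg(["mean", "min", "max", "count"])
print(summary)
```

The result of groupby(...).mean() is indexed by the grouping column, so a single group's value can be read with avg_prices["Computers"].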
Practice (20 minutes)
- Create a DataFrame with 10 rows of "Student" names and their "Scores".
- Find the average score using df["Scores"].mean().
- Filter the DataFrame to only show students who scored above 90.
- Add a new column "Passed" based on whether the score is above 50.