Python Data Science
Fee: ₹25,000 + GST ₹4,500, Duration: 5 months, Contact: patkar@rajeshpatkar.com
Upcoming Class
Duration: 5 months
Start Date: 9th Feb 2025
End Date: 28th June 2025
Timings: 9:00 am to 11:00 am
Days: Sat/Sun
Course Content
1. Introduction to Data Science
Overview of Data Science
Importance and Applications
Data Science vs. Data Analytics vs. Machine Learning
Data Science Lifecycle
Data Collection, Cleaning, Exploration, Modeling, Deployment, and Maintenance
Tools and Environment Setup
Python Ecosystem for Data Science
Introduction to Jupyter Notebooks
Version Control with Git and GitHub
2. Data Manipulation with NumPy and pandas
NumPy Fundamentals
Arrays and Array Operations
Mathematical Functions and Vectorization
Indexing, Slicing, and Reshaping Arrays
Advanced NumPy Techniques
Broadcasting and Vectorization
Performance Optimization
pandas for Data Manipulation
Series and DataFrames
Importing and Exporting Data (CSV, Excel, JSON, SQL)
Data Cleaning and Preparation
Handling Missing Data
Dropping Duplicates
Filtering and Sorting
3. Exploratory Data Analysis (EDA) and Visualization
Descriptive Statistics
Measures of Central Tendency and Dispersion
Probability Distributions
Data Visualization with matplotlib
Basic Plotting: Line, Scatter, Bar, Histogram
Customizing Plots: Colors, Labels, Legends, Titles
Advanced Visualization with seaborn
Statistical Plots: Boxplots, Violin Plots, Pair Plots
Heatmaps and Correlation Matrices
Interactive Visualizations
Introduction to Plotly and Bokeh
Creating Interactive Dashboards
4. Basic Statistics for Data Analysis
Probability Theory
Basic Probability Concepts
Conditional Probability and Bayes’ Theorem
Inferential Statistics
Sampling Methods
Confidence Intervals and Margin of Error
Hypothesis Testing
Null and Alternative Hypotheses
t-Tests, Chi-Square Tests, p-Values
Regression Analysis
Simple and Multiple Linear Regression
Logistic Regression for Classification
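To give a flavour of the probability topics in this module, here is a minimal sketch of Bayes' Theorem in plain Python. The function name bayes_posterior and the screening-test numbers are hypothetical, chosen only for illustration.

```python
def bayes_posterior(prior, true_positive_rate, false_positive_rate):
    """P(condition | positive test) via Bayes' Theorem.

    prior               -- P(condition)
    true_positive_rate  -- P(positive | condition)
    false_positive_rate -- P(positive | no condition)
    """
    # Total probability of a positive result (law of total probability).
    evidence = (true_positive_rate * prior
                + false_positive_rate * (1.0 - prior))
    return true_positive_rate * prior / evidence


# Hypothetical screening test: 1% prevalence, 95% sensitivity,
# 5% false positive rate.
posterior = bayes_posterior(prior=0.01,
                            true_positive_rate=0.95,
                            false_positive_rate=0.05)
print(f"P(condition | positive) = {posterior:.3f}")
```

Even with a fairly accurate test, the posterior stays low because the condition is rare, which is the kind of result conditional probability helps explain.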
5. Introduction to Machine Learning
Machine Learning Fundamentals
Supervised vs. Unsupervised Learning
Overview of scikit-learn
Supervised Learning - Regression
Linear Regression Concepts and Implementation
Evaluating Regression Models: MSE, RMSE, R²
Supervised Learning - Classification
Logistic Regression: Binary Classification
Performance Metrics: Accuracy, Precision, Recall, F1-Score
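The evaluation metrics named in this module can be sketched in plain Python as follows. The function names and sample values are hypothetical and for illustration only; in practice the equivalent functions in scikit-learn's metrics module would normally be used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MSE, RMSE, R^2) for two equal-length sequences."""
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mse = ss_res / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot  # assumes y_true is not constant
    return mse, rmse, r2


def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, F1) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical sample data.
mse, rmse, r2 = regression_metrics([3.0, 5.0, 7.0, 9.0],
                                   [2.0, 5.0, 8.0, 9.0])
acc, prec, rec, f1 = classification_metrics([1, 0, 1, 1, 0, 1],
                                            [1, 0, 0, 1, 1, 1])
print(mse, rmse, r2)
print(acc, prec, rec, f1)
```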
6. Advanced Machine Learning
Decision Trees and Ensemble Methods
Decision Trees: Structure, Splitting Criteria, Pruning
Random Forests: Ensemble Learning, Feature Importance
Support Vector Machines (SVM)
Margin Maximization
Kernel Functions: Linear, Polynomial, RBF
SVM Implementation for Classification
Unsupervised Learning - Clustering
k-Means Clustering: Algorithm Steps, Applications
Hierarchical Clustering: Agglomerative and Divisive Methods
Dimensionality Reduction Techniques
Principal Component Analysis (PCA)
t-Distributed Stochastic Neighbor Embedding (t-SNE)
7. Model Evaluation and Optimization
Cross-Validation Techniques
k-Fold Cross-Validation
Stratified Sampling
Hyperparameter Tuning
Grid Search and Randomized Search
Using scikit-learn’s GridSearchCV and RandomizedSearchCV
Handling Imbalanced Data
Challenges with Imbalanced Datasets
Techniques: Resampling, SMOTE, Adjusting Class Weights
Model Deployment and Monitoring
Saving and Loading Models with joblib and pickle
Introduction to ML Model Monitoring and Maintenance
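As a rough illustration of k-Fold Cross-Validation from this module, the sketch below splits sample indices into k consecutive folds without shuffling. The helper name k_fold_indices is hypothetical; scikit-learn's KFold class provides this behaviour in real projects.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k consecutive folds.

    The first n_samples % k folds receive one extra sample so that
    every sample lands in exactly one test fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


# Ten samples split into three folds.
folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each model is trained on the train indices and scored on the held-out test indices, and the k scores are then averaged.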
8. Data Engineering and Tooling
Introduction to Data Engineering
Role of Data Engineering in Data Science
Overview of Data Pipelines and ETL Processes
Data Pipeline Tools
Introduction to Apache Airflow
Designing and Orchestrating Workflows
Big Data Technologies
Introduction to Apache Hadoop and HDFS
Basics of Apache Spark for Large-Scale Data Processing
Cloud Data Engineering
Overview of Cloud Platforms (AWS, GCP, Azure)
Introduction to AWS S3, Redshift, and Glue
NoSQL Databases
Introduction to MongoDB and Cassandra
Use-Cases for NoSQL vs. SQL
Data Warehousing Concepts
Designing Data Warehouses
Star and Snowflake Schemas
9. Databases and Data Storage
Introduction to Relational Databases and SQL
Understanding Databases and SQL Syntax
Data Definition and Manipulation (CREATE, SELECT, INSERT, UPDATE, DELETE)
Advanced SQL Concepts
Joins, Subqueries, and Set Operations
Aggregate Functions and Grouping
Interacting with Databases in Python
Using SQLite and PostgreSQL with Python
Database Connections and Transactions
Introduction to ORMs
Using SQLAlchemy for Database Operations
Mapping Classes to Database Tables
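The database topics in this module can be previewed with Python's built-in sqlite3 module. This is a minimal sketch: the students table, its columns, and the sample rows are hypothetical and used only for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite database with a small sample table."""
    conn = sqlite3.connect(":memory:")
    # Used as a context manager, the connection commits the transaction
    # on success and rolls it back if an exception is raised.
    with conn:
        conn.execute(
            "CREATE TABLE students ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL, score REAL)"
        )
        conn.executemany(
            "INSERT INTO students (name, score) VALUES (?, ?)",
            [("Asha", 82.5), ("Ravi", 67.0), ("Meera", 91.0)],
        )
    return conn


def average_score(conn):
    """Aggregate query: mean score across all rows."""
    return conn.execute("SELECT AVG(score) FROM students").fetchone()[0]


def names_above(conn, threshold):
    """Parameterized query: names with a score above the threshold."""
    rows = conn.execute(
        "SELECT name FROM students WHERE score > ? ORDER BY name",
        (threshold,),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_demo_db()
print(average_score(conn))
print(names_above(conn, 80))
```

The `?` placeholders let the driver handle quoting, which avoids building SQL strings from user input.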
10. Web Development for Data Science
Introduction to Web Development and Flask
Understanding HTTP and Client-Server Architecture
Setting Up a Flask Application
Routing and View Functions
Defining Routes and Handling Requests
Templates and Rendering HTML Pages
Forms and User Input
Handling Form Submissions
Validating User Input
Building RESTful APIs
Creating APIs with Flask
Consuming and Serving JSON Data
11. Project
Exam System