AdmitOK - Study Abroad Consultancy

Data Science

Admitok-IT is a leading Data Science training institute in Hyderabad. We offer the best training and 100% placement assistance.

Data science is a field that involves using scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. It is a multidisciplinary field that combines domain expertise, statistical and mathematical skills, and computer science and information technology knowledge to analyze and interpret complex data sets.

Data scientists use a wide range of tools and techniques to clean, process, and analyze data, including machine learning algorithms, statistical analysis, and visualization techniques. They apply these tools and techniques to solve real-world problems in a variety of industries, such as healthcare, finance, and marketing.

Some common tasks that data scientists perform include:

Collecting and cleaning data from various sources

Analyzing data to identify patterns and trends

Building and evaluating machine learning models

Visualizing and communicating results to stakeholders

To be successful in data science, it is important to have strong skills in programming, statistics, and math, as well as domain knowledge in the industry you are working in. Some of the most commonly used programming languages in data science include Python, R, and SQL.

Course Duration:

Online

The course is 50% theory and 50% hands-on.

It is a 45-day program, with sessions of up to 1 hour each.

Corporate

The course is 50% theory and 50% hands-on.

It is a 5-day program, with sessions of up to 8 hours each.

Classroom

Classroom sessions are arranged on request; the minimum batch size is 5 attendees.

Data Science Course Syllabus

Introduction to Data Science

What is data science?

What is the difference between AI, Data Science, Machine Learning, and Deep Learning?

Job Landscape and Preparation Time

Who are data scientists?

What is the day-to-day job of a Data Scientist?

End-to-End Data Science Project Life Cycle

Data Science roles – functions, pay across domains, experience

Business Statistics

Data types

Continuous variables
Ordinal Variables
Categorical variables
Time Series
Miscellaneous
Common Data Science Terminology

Descriptive statistics

Basic concepts of probability
Frequentist versus Bayesian Probability
Axioms of probability theory
Permutations and combinations
Conditional and marginal probability
Joint Probability
Bayes Theorem
Probability Mass Function and Probability Density Function
Cumulative Distribution Function (for discrete and continuous variables)
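As an illustration of conditional probability and Bayes Theorem from the list above, here is a minimal sketch. The function name `posterior` and the screening numbers are assumptions chosen only for the example; they are not course data.

```python
def posterior(prior, sensitivity, specificity):
    """Return P(condition | positive test) using Bayes' theorem."""
    true_positive = sensitivity * prior                 # P(positive and condition)
    false_positive = (1 - specificity) * (1 - prior)    # P(positive and no condition)
    evidence = true_positive + false_positive           # marginal P(positive)
    return true_positive / evidence


# Hypothetical numbers: 1% prevalence, 95% sensitivity, 90% specificity.
p = posterior(prior=0.01, sensitivity=0.95, specificity=0.90)
print(f"P(condition | positive test) = {p:.3f}")
```

Even with a fairly accurate test, the posterior stays low here because the prior is small, which is the kind of intuition this topic aims to build.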

Central Tendencies

Mean
Median
Mode
Spread
Variance
Standard Deviation
Effects on central tendencies after transformations
Quartile Analysis
Implementation of central tendencies using Python
Box plots for outlier identification
Drawing box plots using Python
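The topics above could be tried out with Python's built-in `statistics` module. This is a small sketch, assuming a made-up list of values; it computes the central tendencies, the spread, the quartiles, and the 1.5 × IQR fences that a box plot uses to flag outliers.

```python
import math
import statistics

# Hypothetical sample with one unusually large value.
values = [12, 15, 11, 14, 15, 13, 16, 15, 48]

mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)       # sample standard deviation

# Quartiles; method="inclusive" interpolates between the observed min and max.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Box-plot rule: points more than 1.5 * IQR outside the quartiles are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

# Effect of a transformation: adding a constant shifts the mean
# but leaves the standard deviation unchanged.
shifted = [v + 10 for v in values]
mean_shift = statistics.mean(shifted) - mean
same_spread = math.isclose(statistics.stdev(shifted), std_dev)

print(mean, median, mode, outliers)
```

The mean is pulled upward by the large value while the median and mode are not, which is why the median is often preferred for skewed data.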

Sampling

Why is sampling needed?
Different types of Sampling
Simple random sampling
Systematic sampling
Stratified Sampling
Implementation of sampling techniques using Python
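The three sampling types listed above can be sketched with the standard `random` module. The population, the `region` field, and the function names below are hypothetical and exist only to make the example self-contained.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 records, 60 in one region and 40 in another.
population = [{"id": i, "region": "north" if i < 60 else "south"} for i in range(100)]


def simple_random_sample(items, k):
    """Every item has the same chance of being picked."""
    return rng.sample(items, k)


def systematic_sample(items, k):
    """Pick a random start, then take every step-th item."""
    step = len(items) // k
    start = rng.randrange(step)
    return items[start::step][:k]


def stratified_sample(items, key, fraction):
    """Sample the same fraction from each group (stratum)."""
    strata = {}
    for item in items:
        strata.setdefault(item[key], []).append(item)
    sample = []
    for group in strata.values():
        size = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, size))
    return sample


srs = simple_random_sample(population, 10)
systematic = systematic_sample(population, 10)
stratified = stratified_sample(population, "region", 0.10)
print(len(srs), len(systematic), len(stratified))
```

Stratified sampling keeps the 60/40 regional split in the sample, which simple random sampling only does on average.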

Data distributions

Normal Distribution
Binomial Distribution
Binomial Approximated to Normal
Implementation of distributions using Python
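To illustrate the "Binomial Approximated to Normal" topic, the sketch below compares an exact binomial probability with its normal approximation. The values of `n` and `p` are arbitrary choices for the example, and the 0.5 continuity correction is a common convention rather than something the syllabus specifies.

```python
import math
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 100, 0.5                        # assumed example parameters
mu = n * p                             # binomial mean
sigma = math.sqrt(n * p * (1 - p))     # binomial standard deviation
normal = NormalDist(mu, sigma)

# P(X <= 55): exact binomial sum versus normal CDF with continuity correction.
exact = sum(binomial_pmf(k, n, p) for k in range(0, 56))
approximate = normal.cdf(55.5)

print(f"exact={exact:.4f} approximate={approximate:.4f}")
```

The approximation works well here because both `n * p` and `n * (1 - p)` are large; it degrades for small `n` or for `p` close to 0 or 1.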

Inferential statistics

Why inferential statistics?
Z score calculation
Defining the p-value and implementations using Python
Inferring from sample to population
Sampling distribution of sample means

Hypothesis testing

Confidence Interval
Testing the hypothesis
Type I error
Type II error
Null and alternative hypotheses
Rejection and acceptance criteria
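The hypothesis-testing topics above (z score, p-value, confidence interval, rejection criterion) can be tied together in one short sketch. It assumes a two-sided one-sample z-test with a known population standard deviation; the function name, the sample, and the hypothesised mean are all made up for illustration.

```python
import math
from statistics import NormalDist, mean


def one_sample_z_test(sample, population_mean, population_std, alpha=0.05):
    """Two-sided z-test of H0: the true mean equals population_mean."""
    n = len(sample)
    sample_mean = mean(sample)
    standard_error = population_std / math.sqrt(n)
    z = (sample_mean - population_mean) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    interval = (sample_mean - z_critical * standard_error,
                sample_mean + z_critical * standard_error)
    return {
        "z": z,
        "p_value": p_value,
        "confidence_interval": interval,
        "reject_null": p_value < alpha,   # alpha is the accepted Type I error rate
    }


# Hypothetical measurements; H0 says the mean is 50 with a known std of 4.
sample = [52, 55, 49, 58, 54, 56, 53, 57, 51, 55]
result = one_sample_z_test(sample, population_mean=50, population_std=4)
print(result)
```

Rejecting a true null hypothesis is a Type I error, and failing to reject a false one is a Type II error; lowering `alpha` reduces the first at the cost of the second.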

Introduction to R

A Primer to R programming

What is R? Similarities to OOP and SQL

Types of objects in R – lists, matrices, arrays, data.frames etc.

Creating new variables or updating existing variables

If statements and loops – for, while, etc.

String manipulations

Subsetting data from matrices and data.frames

Casting and melting data to long and wide format

Merging datasets

Python for Data Science

Understanding the reasons for Python’s popularity

Basics of Python: Operations, loops, functions, dictionaries

NumPy – creating arrays, reading, writing, manipulation techniques

Ground-up for Deep-Learning

Exploratory Data Analysis with Python

Understanding the structure of Matplotlib

Configuring grid and ticks

Text, color maps, markers, and widths with Matplotlib

Configuring axes and grid

Histograms and scatter plots

Bar charts

Multiple plots

3D plots

Correlation matrix plotting

Data Munging with Python

Introduction to pandas

Data loading with Pandas

Data types in Python

Descriptive Statistics with Pandas

Quartile analysis with Pandas

Sort, Merge, join with Pandas

Indexing and Slicing with pandas

Pivot table, Aggregate and cross tab with pandas

Apply function for row-wise and column-wise processing with pandas

Cleaning data with Python

Determining correlation

Handling missing values

Plotting with Pandas

Time series with Pandas

Introduction to Artificial Intelligence

Dealing with prediction problems

Forecasting for industry

Optimization in logistics

Segmentation in customer analytics

Supervised learning

Unsupervised Learning

Optimization

Types of AI: Statistical Modelling, Machine Learning, Deep Learning, Optimization, Natural Language Processing, Computer Vision, Speech Processing, Robotics

Artificial Intelligence I – Statistical Modelling

Linear Regression

Assumptions
Model development and interpretation
Least squares (minimizing the sum of squared errors)

Logistic Regression

Need for logistic regression
Logit link function
Maximum likelihood estimation
Model development and interpretation
Confusion Matrix – error measurement
ROC curve
Measuring sensitivity and specificity
Advantages and disadvantages of logistic regression models

Time series analysis – Forecasting

Simple moving averages
Exponential smoothing
Time series decomposition
ARIMA
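The two simplest forecasting techniques listed above, simple moving averages and exponential smoothing, can be sketched in a few lines. The function names and the `alpha` value are illustrative assumptions; the first smoothed value is initialised to the first observation, which is one common convention among several.

```python
def simple_moving_average(series, window):
    """Average of the last `window` points, for each position where it is defined."""
    return [
        sum(series[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_smoothing(series, alpha):
    """Weighted average that gives recent points more weight (0 < alpha <= 1)."""
    smoothed = [series[0]]  # assumed initialisation: start from the first value
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


sma = simple_moving_average([1, 2, 3, 4, 5], window=3)
ses = exponential_smoothing([10, 20, 30], alpha=0.5)
print(sma, ses)
```

A larger `alpha` makes the smoothed series react faster to new data, while a larger `window` makes the moving average smoother but slower to respond.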

Model validation and deployment

RMSE – Root Mean Squared Error
MAPE – Mean Absolute Percentage Error
Confusion matrix and misclassification rate
Area under the curve (AUC), ROC curve
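The validation measures above can be written out directly from their definitions. This is a minimal sketch with hypothetical helper names and toy data; the MAPE function assumes no actual value is zero.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def confusion_counts(actual, predicted, positive=1):
    """Return true/false positive and negative counts for binary labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


# Toy regression and classification results, made up for the example.
error_rmse = rmse([100, 200, 400], [110, 180, 400])
error_mape = mape([100, 200, 400], [110, 180, 400])

c = confusion_counts([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
total = sum(c.values())
misclassification_rate = (c["fp"] + c["fn"]) / total
sensitivity = c["tp"] / (c["tp"] + c["fn"])   # true positive rate
specificity = c["tn"] / (c["tn"] + c["fp"])   # true negative rate
print(error_rmse, error_mape, c, misclassification_rate)
```

An ROC curve is built by recomputing sensitivity and specificity at many classification thresholds; AUC is the area under that curve.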

Artificial Intelligence II – Machine Learning

Supervised Learning

Decision trees and Random Forest

C5.0
Classification and Regression Trees (CART)
Process of tree building
Entropy and Gini Index
Problem of overfitting
Pruning a tree back
Trees for Prediction (Linear) – example
Trees for classification models – example
Advantages of tree-based models
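Entropy and the Gini index, listed above, are the impurity measures a tree uses to choose splits. The sketch below computes both from their standard definitions, plus the information gain of a candidate split; the labels and function names are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits: 0 for a pure node, 1 for a 50/50 binary node."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity: 0 for a pure node, 0.5 for a 50/50 binary node."""
    total = len(labels)
    return 1 - sum((c / total) ** 2 for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Reduction in entropy from splitting parent into left and right."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


# Hypothetical node with five "yes" and five "no" labels.
parent = ["yes"] * 5 + ["no"] * 5
perfect_gain = information_gain(parent, ["yes"] * 5, ["no"] * 5)
useless_gain = information_gain(parent, parent[::2], parent[1::2])
print(entropy(parent), gini(parent), perfect_gain, useless_gain)
```

A tree grown until every leaf is pure will fit noise in the training data, which is the overfitting problem that pruning addresses.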

Association Rule Mining

Rule generation from decision trees
Apriori algorithm
Support, confidence and lift measures
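Support, confidence, and lift can be computed directly from a list of transactions. The basket data and function names below are made up for illustration; the Apriori algorithm uses the same support measure to prune candidate itemsets.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence relative to the consequent's baseline support; 1 means independent."""
    return confidence(transactions, antecedent, consequent) / support(transactions, consequent)


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

s = support(baskets, {"bread", "milk"})
conf_rule = confidence(baskets, {"bread"}, {"milk"})
lift_rule = lift(baskets, {"bread"}, {"milk"})
print(s, conf_rule, lift_rule)
```

A lift below 1, as in this toy data, means that buying bread makes milk slightly less likely than its baseline, so the rule would not be a useful one.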

Support Vector Machines

Linear learning machines
SVM case for linearly separable data
Kernel space

Neural Networks

Motivation for Neural Networks
Perceptron and Single Layer Neural Network
Back Propagation algorithm
Feed Forward Neural Net
Sigmoid parameters
Weight initialization
Weight decay
Learning rate
Momentum

Ensemble Techniques

Bagging
Boosting
Stacking
Gradient Boosting Machines

Unsupervised Learning

Clustering Techniques

Hierarchical clustering
K-Means clustering
Distance measures
Applications of cluster analysis – Customer Segmentation

Collaborative Filtering, PCA

Artificial Intelligence III – Natural Language Processing

NLP I – Text Preprocessing

Tokenization
Stemming
Lemmatization

NLP II – Text Modelling

POS tagging
TFIDF and classification
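Tokenization and TF-IDF from the NLP topics above can be sketched with the standard library alone. TF-IDF has several variants; this one assumes term frequency is the count divided by document length and inverse document frequency is the natural log of N / df. The regex tokenizer, the function names, and the sample sentences are all illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf(documents):
    """Return one {term: score} dict per document (documents must be non-empty)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


docs = [
    "The model predicts churn",
    "The model predicts fraud",
    "The dashboard shows churn",
]
scores = tf_idf(docs)
print(scores[1])
```

Words that appear in every document, such as "the" here, score zero, while words unique to one document score highest. These scores can then be used as features for a text classifier.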

Artificial Intelligence IV – Deep Learning

ReLU

Sigmoid, Depth vs Width tradeoffs

Convolutional networks

Concepts of filters

Sliding

Pooling and Padding

Comparison between DL and ML performances over the MNIST dataset
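The filter, sliding, padding, and pooling ideas in this module can be shown on a tiny grid of numbers without any deep learning library. This is a plain-Python sketch with hypothetical function names. As in most deep learning frameworks, the "convolution" here slides the filter without flipping it (strictly speaking, cross-correlation).

```python
def pad(image, size, value=0):
    """Surround a 2D list with `size` rows and columns of `value`."""
    width = len(image[0]) + 2 * size
    border = [[value] * width for _ in range(size)]
    middle = [[value] * size + list(row) + [value] * size for row in image]
    return border + middle + [row[:] for row in border]


def convolve2d(image, kernel, stride=1):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    output = []
    for i in range(0, len(image) - kh + 1, stride):
        row = []
        for j in range(0, len(image[0]) - kw + 1, stride):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        output.append(row)
    return output


def max_pool(image, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [
        [max(image[i + a][j + b] for a in range(size) for b in range(size))
         for j in range(0, len(image[0]) - size + 1, size)]
        for i in range(0, len(image) - size + 1, size)
    ]


# Hypothetical 4 x 4 "image" and a 2 x 2 diagonal-difference filter.
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[1, 0], [0, -1]]

feature_map = convolve2d(image, kernel)   # 3 x 3 output without padding
pooled = max_pool(image, size=2)          # 2 x 2 output
padded = pad(image, 1)                    # 6 x 6 grid with a zero border
print(feature_map, pooled)
```

Padding by one cell before applying a 3 x 3 filter keeps the output the same size as the input, which is why it is commonly paired with convolution layers.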

Practical use cases of AI and best practices in AI

Translating a business problem into an analytical problem

Guidelines in model development

Big Data, Azure for AI, Data Science applications

Big data and analytics?

Leveraging big data platforms for Data Science

Introduction to evolving tools

Machine learning with Spark

Creation of R-Server clusters

Computation of Big-Data ML algorithms over the Azure cloud

Analytical Visualisation with Tableau

Why it is important for data analysts

Tableau workbook walkthrough

Instructions for creating your own workbooks

Demo of a few more workbooks

What we offer as part of this course

2 real-time projects with end-to-end explanations and pseudocode

All classes explained with real-time project experience

Data sets with code

End-to-end Data Science project workflow explanation

Free online mock test for Data Science Interview preparation

Free mock interviews for the best performers in the exam

Copies of handwritten notes and slides from the institute

Detailed assistance in resume preparation, with special attention to presenting previous experience for experienced candidates

E-book of real-time interview questions and answers

Trainer available to answer questions on the Slack channel

Latest resources, blogs, and articles shared on the Slack channel

Special focus on profile building for experienced candidates