Data Science [Updated on: Aug-7-2022]
*Public* | Reading Time: About 5 minutes

The data science lifecycle focuses on machine learning and other analytical techniques that derive insights and forecasts from data in order to achieve business objectives.

Throughout the data science lifecycle, machine learning algorithms and statistical techniques are used to produce better prediction models. The process involves several common steps, including data extraction, preparation, cleaning, modeling, and evaluation.

The data science lifecycle involves the following steps:

1. Data Acquisition

2. Data Processing

3. Model Building

4. Pattern Evaluation

5. Knowledge Representation

Any type of model involves two types of variables, independent and dependent, denoted X and Y respectively. X can hold any number of independent variables, whereas Y holds a single value: either the actual (observed) outcome or the outcome the model predicts.

Y = Dependent Variable

X = Independent Variable

A machine learning model identifies the relation between Y and X.
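As a minimal sketch of what identifying the relation between Y and X can look like, the Python snippet below fits a straight line to a handful of points with ordinary least squares. The data and the `fit_line` helper are illustrative assumptions, not something taken from this article, and a real project would more likely rely on an established library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: X is the independent variable, Y the dependent one.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [3.0, 5.0, 7.0, 9.0, 11.0]  # made up so that Y = 2 * X + 1

slope, intercept = fit_line(X, Y)

# Use the learned relation to predict Y for an unseen X.
predicted = slope * 6.0 + intercept
print(slope, intercept, predicted)
```

The fitted slope and intercept are the "relation" the model has learned; once they are known, Y can be estimated for any new value of X.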

Suppose that, based on data, we need to predict customer churn. What could be the primary drivers of churn?

If the dependent variable is churn, then X holds the values that help predict it.

For example, if Y = customer churn, which affects the business,

then X1 = price, X2 = customer demographics, X3 = network quality, X4 = customer tenure, X5 = service, and X6 = complaints.
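The churn example can be sketched in Python as follows. The customer records, column names, and the `pearson` helper are all made up for illustration, and only three of the drivers listed above are included. Correlation is used here as a rough first look at which variables move with churn; it is not a full churn model and does not establish cause.

```python
from math import sqrt

# Hypothetical customer records; every value here is invented.
customers = [
    {"price": 70, "tenure": 2, "complaints": 4, "churn": 1},
    {"price": 50, "tenure": 36, "complaints": 0, "churn": 0},
    {"price": 65, "tenure": 5, "complaints": 3, "churn": 1},
    {"price": 40, "tenure": 48, "complaints": 1, "churn": 0},
    {"price": 80, "tenure": 1, "complaints": 5, "churn": 1},
    {"price": 45, "tenure": 24, "complaints": 0, "churn": 0},
]

FEATURES = ["price", "tenure", "complaints"]

# X holds the independent variables, y the dependent variable (churn).
X = [[row[name] for name in FEATURES] for row in customers]
y = [row["churn"] for row in customers]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


# Correlate each column of X with y.
correlations = {
    name: pearson([row[i] for row in X], y)
    for i, name in enumerate(FEATURES)
}

# List the candidate drivers, strongest association first.
for name, r in sorted(correlations.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name}: {r:+.2f}")
```

In this toy data, complaints and price rise with churn while tenure falls with it, which is the kind of signal that would prompt a closer look at those variables.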

We may also identify data that is not required for modeling or prediction. Data points that deviate markedly from the rest, often because they were poorly measured, are called outliers, and removing them is part of data cleaning.

Discard an outlier when it is clear that it results from a measurement or data-entry error. You may also exclude an outlier that does not change the results but does affect the assumptions; if you do, note the exclusion in a footnote of your write-up.
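One common way to flag candidate outliers is the 1.5 × IQR rule. This is a general statistical convention rather than a method prescribed by this article, and the bill amounts below are invented. A minimal sketch:

```python
import statistics

# Hypothetical monthly bill amounts; 250 looks like a data-entry error.
bills = [12, 14, 15, 15, 16, 17, 18, 19, 20, 250]

# Quartiles: quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
q1, _, q3 = statistics.quantiles(bills, n=4)
iqr = q3 - q1

# Values beyond 1.5 * IQR from the quartiles are flagged as outliers.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

outliers = [v for v in bills if v < lower or v > upper]
kept = [v for v in bills if lower <= v <= upper]

print("outliers:", outliers)
print("kept:", kept)
```

Flagging is only the first step: whether a flagged value is actually discarded should still follow the reasoning above, namely whether it is clearly a measurement or entry error.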

R data types help you define data, compute on it, and eliminate unwanted values. The definition stage in particular involves different inputs, and those inputs are defined using various data types.

List: heterogeneous. The values would sit in the same column of an Excel file, but they can be of different types.



## Comments (2)

Posted this article about 12 months ago. Data science is evolving day by day, and it's ruling the world for now. Understand all the terminology and keep learning to earn a data science certificate.

This is really required to start with learning the data science life cycle and machine learning models. Need more practical examples from your page.

I need a problem and a practical solution solved in Python. Thanks!