Data Set – What is it?

What is a Dataset?
A dataset is the collection of data used to train a machine to make predictions. Let’s understand this with a real-world example.

Assume that you have a room with 5 kinds of fruits in 5 buckets. (Apple, Banana, Grapes, Blackberry, Blueberry)

Each bucket contains 20 pieces of fruit.

Now we have here:
1. A Room
2. 5 Buckets (5 kinds of fruit)
3. Each bucket has 20 fruits

Let’s convert this into technical terms.
1. A Room – We call it the Dataset
2. 5 Buckets – We call them Labels
3. The 20 fruits in each bucket – We call them Examples

The example above is for image prediction.
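
The room, buckets, and fruits can be sketched in Python as a dictionary that maps each label to its list of examples. This is only an illustration: the image file names are made up, since nothing above says how the fruit pictures would actually be stored.

```python
# The 5 buckets: one label per kind of fruit.
labels = ["Apple", "Banana", "Grapes", "Blackberry", "Blueberry"]

# The room: a dataset mapping each label to its 20 examples.
# The file names are hypothetical placeholders for real fruit images.
dataset = {
    label: [f"{label.lower()}_{i:02d}.jpg" for i in range(1, 21)]
    for label in labels
}

# Count every example across all labels.
total_examples = sum(len(examples) for examples in dataset.values())

print("Labels:", len(dataset))
print("Examples per label:", len(dataset["Apple"]))
print("Total examples:", total_examples)
```

With 5 labels and 20 examples per label, the dataset holds 100 examples in total.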

What is a dataset in text?
Suppose you have these pizza order lines:
Order a pizza
Order the pizza
Order pizza for me
Please book a pizza for me

The lines above all mean the same thing, which is Ordering a Pizza. (We can call it Order Pizza.)

Add some extra cheese
I want more cheese
Please add my cheese

The lines above all mean the same thing, which is Adding Extra Cheese. (We can call it Extra Cheese.)

So here we have:
1. Dataset – Pizza Order System
2. Labels – Ordering a Pizza and Adding Extra Cheese
3. Examples – The different lines under each label
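
The same idea can be sketched in Python for the text case. The dictionary layout and the variable names are assumptions made for illustration; a real training tool may expect a different format. Flattening the dictionary into (text, label) pairs gives the shape that text classifiers commonly take as input.

```python
# The dataset: the Pizza Order System, mapping each label to its example lines.
dataset = {
    "Order Pizza": [
        "Order a pizza",
        "Order the pizza",
        "Order pizza for me",
        "Please book a pizza for me",
    ],
    "Extra Cheese": [
        "Add some extra cheese",
        "I want more cheese",
        "Please add my cheese",
    ],
}

# Flatten into (text, label) pairs, one pair per example line.
pairs = [
    (text, label)
    for label, examples in dataset.items()
    for text in examples
]

for text, label in pairs:
    print(f"{text!r} -> {label}")
```

Each pair ties one example line to the label it belongs to, so the machine can learn which kinds of sentences mean "Order Pizza" and which mean "Extra Cheese".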


