What is a Dataset?
A dataset is a collection of data used to train a machine to make predictions. Let’s understand this with a real-world example.
Assume that you have a room with 5 kinds of fruit in 5 buckets (Apple, Banana, Grapes, Blackberry, Blueberry).
Each bucket contains 20 fruits.
So here we have:
1. A Room
2. 5 Buckets (5 kinds of fruit)
3. 20 fruits in each bucket
Let’s convert this into technical terms.
1. A Room – We call it the Dataset
2. 5 Buckets – We call them Labels
3. The 20 fruits in each bucket – We call them Examples
The example above is for image prediction, where each example could be a picture of a fruit.
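The room, buckets, and fruits above can be sketched in Python. This is only an illustration: the name fruit_dataset and the image file names (such as apple_01.jpg) are made up for this example and stand in for real fruit pictures.

```python
# The 5 buckets: each bucket name is a label.
labels = ["Apple", "Banana", "Grapes", "Blackberry", "Blueberry"]

# The room: a dataset that maps each label to its 20 examples.
# The file names are hypothetical placeholders for real images.
fruit_dataset = {
    label: [f"{label.lower()}_{i:02d}.jpg" for i in range(1, 21)]
    for label in labels
}

print(len(fruit_dataset))                                # 5 labels
print(sum(len(ex) for ex in fruit_dataset.values()))     # 100 examples in total
print(fruit_dataset["Apple"][:2])                        # first two Apple examples
```

Here the dictionary is the room, each key is a bucket, and each file name in a list is one fruit.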
What is a Dataset in Text?
Suppose you have these pizza order lines:
Order a pizza
Order the pizza
Order pizza for me
Please book a pizza for me
The lines above all mean the same thing, which is Ordering a Pizza. (We can call this label Order Pizza.)
Add some extra cheese
I want more cheese
Please add my cheese
The lines above all mean the same thing, which is Adding Extra Cheese. (We can call this label Extra Cheese.)
So here we have:
1. Dataset – Pizza Order System
2. Labels – Ordering a Pizza and Adding Extra Cheese
3. Examples – The different lines under each label
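The same pizza order system can be written down as a small Python sketch. The name pizza_dataset and the (text, label) pair layout are assumptions chosen for illustration, not a required format.

```python
from collections import Counter

# Each item is one example: (the text line, its label).
pizza_dataset = [
    ("Order a pizza", "Order Pizza"),
    ("Order the pizza", "Order Pizza"),
    ("Order pizza for me", "Order Pizza"),
    ("Please book a pizza for me", "Order Pizza"),
    ("Add some extra cheese", "Extra Cheese"),
    ("I want more cheese", "Extra Cheese"),
    ("Please add my cheese", "Extra Cheese"),
]

# Count how many examples each label has.
label_counts = Counter(label for _, label in pizza_dataset)

print(sorted(label_counts))           # the labels in the dataset
print(label_counts["Order Pizza"])    # 4 examples
print(label_counts["Extra Cheese"])   # 3 examples
```

Keeping each line next to its label makes it easy to see which examples belong to which label.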