18IT045_Practical_Work_DS
Kindly perform the following tasks for the given dataset.
Dataset: https://www.kaggle.com/gpreda/coronavirus-2019ncov
Task-1:
Describe the dataset using the Orange tool.
What needs to be done to improve the classification accuracy on the given dataset? Obtain the maximum possible classification accuracy by applying the following methods.
→Pre-processing
o Encoding
o Normalization
o Missing value handling
o Feature Selection
Compare the accuracy with and without the pre-processing steps. Perform the classification and visualize the accuracy before and after pre-processing in Orange/Python.
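The pre-processing steps listed above can be sketched in plain Python. This is only a minimal illustration, not the Orange implementation: the helper names (`impute_mean`, `min_max_normalize`, `one_hot_encode`) and the toy columns are assumptions made up for the example, and mean imputation and min-max scaling are just one possible choice for each step.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(labels):
    """Return (categories, rows); each row is a 0/1 indicator list."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


# Hypothetical toy columns, not taken from the actual dataset.
deaths = [0, None, 4, 2]
country = ["China", "Japan", "China", "Thailand"]

deaths_filled = impute_mean(deaths)                # missing value handling
deaths_scaled = min_max_normalize(deaths_filled)   # normalization
categories, country_encoded = one_hot_encode(country)  # encoding
```

Feature selection is left out of the sketch because it depends on the chosen scoring method (for example, information gain in Orange's Rank widget).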
Task-2:
Generate a dashboard of the pre-processed dataset from Task-1.
Extract as many data insights as possible by plotting a bar chart, box plot, pie chart, and stacked plot in a Power BI dashboard.
The following answers need to be submitted in a single PDF file:
1. Provide a screenshot of the data description and explain it in brief.
2. Provide screenshot(s) of the data pre-processing steps showing their significance.
3. Provide a screenshot showing the accuracy before and after pre-processing.
4. Provide a screenshot of the Power BI dashboard with a description.
Load the data in the Orange tool and set the confirmed cases as the target variable.
The workflow of the task is:
Preprocess the data with value imputation and discretization.
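Discretization turns a numeric column such as the confirmed-case count into a small set of classes, which is what a classifier needs as a target. The report does not say which discretization method was selected in Orange, so the sketch below assumes equal-width binning; the function name `equal_width_bins` and the sample values are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Assign each value a bin index in 0..n_bins-1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread, so everything falls in bin 0.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clip it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical confirmed-case counts, for illustration only.
confirmed = [1, 5, 12, 40, 100]
bins = equal_width_bins(confirmed, 3)
```

With skewed counts like these, most rows fall into the lowest bin, which is one reason equal-frequency binning may be worth trying as an alternative.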
The accuracy before and after pre-processing is shown here:
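For reference, classification accuracy is the fraction of rows whose predicted class matches the true class. A minimal sketch of that calculation follows; the `accuracy` helper and the label lists are hypothetical and are not results from this dataset.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty list")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels for illustration only; not results from the dataset.
actual = ["low", "low", "high", "medium"]
predicted = ["low", "high", "high", "medium"]
score = accuracy(actual, predicted)
```

Computing this once for the raw data and once for the pre-processed data gives the two numbers to compare.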
Save the data in an Excel file.
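If the pre-processing is done in Python rather than Orange, the table can be exported with the standard library. The sketch below writes CSV instead of a native Excel workbook, on the assumption that a CSV file is acceptable (both Excel and Power BI can open it); the `write_table` helper and the column names are hypothetical.

```python
import csv
import io


def write_table(handle, header, rows):
    """Write a header line followed by the data rows as CSV to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(header)
    writer.writerows(rows)


# Hypothetical pre-processed rows, for illustration only.
header = ["country", "confirmed_bin"]
rows = [["China", 2], ["Japan", 0]]

# An in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
write_table(buffer, header, rows)
csv_text = buffer.getvalue()
```

To write to disk, pass a file opened with `open("preprocessed.csv", "w", newline="")` in place of the buffer.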
Load the saved data in Power BI and create visualizations of the data using a bar chart, pie chart, and donut chart.