A few months ago I took part in a challenge that brought together data analysts from various parts of the world. I was privileged to be paired with an amazing group and to connect with some great minds. This is one of the projects we did.
DASL Challenge 4 involved cleaning the movie time dataset. The objective of the cleaning was to eliminate errors and redundancy, make the data more reliable, accurate, and consistent, and so provide trustworthy information for decision-making. The tasks were:
- Remove duplicate values from the movies column
- Clean the year column to get the appropriate year
- Get the first genre from the genre column
- Clean the rating column
- Extract the director's name into its own column and the stars' names into a separate column
- Clean the votes column
- Clean the runtime column
- Clean the gross column
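
To make the tasks above a bit more concrete, here is a rough sketch of how each one could be handled in plain Python. It is not the exact code our group wrote. The function names, the `MOVIES` column key, and the sample raw values (such as `"(2021– )"` or `"$0.01M"`) are assumptions made for illustration, so adjust them to whatever your copy of the dataset actually contains.

```python
import re


def drop_duplicates(rows, key="MOVIES"):
    """Keep only the first row for each value of `key` (hypothetical column name)."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def clean_year(raw):
    """Return the first four-digit year found in `raw`, or None."""
    match = re.search(r"\d{4}", str(raw))
    return int(match.group()) if match else None


def first_genre(raw):
    """Return the first entry of a comma-separated genre string, or None."""
    if not isinstance(raw, str):
        return None  # missing value
    return raw.split(",")[0].strip() or None


def split_credits(raw):
    """Split a combined credits string into (director, stars).

    Assumes a layout like 'Director: ... | Stars: ...'; either part may be absent.
    """
    text = " ".join(str(raw).split())  # collapse newlines and repeated spaces
    director = re.search(r"Directors?:\s*([^|]*)", text)
    stars = re.search(r"Stars?:\s*(.*)", text)
    return (
        director.group(1).strip() if director else None,
        stars.group(1).strip() if stars else None,
    )


def clean_votes(raw):
    """Turn a string like '21,062' into the integer 21062, or None."""
    text = str(raw).replace(",", "").strip()
    return int(text) if re.fullmatch(r"[0-9]+", text) else None


def first_number(raw):
    """Return the first number in `raw` as a float, or None.

    Usable for rating ('6.1'), runtime ('121 min') and gross ('$0.01M');
    any unit such as 'M' is dropped, so track the unit separately.
    """
    match = re.search(r"\d+(?:\.\d+)?", str(raw))
    return float(match.group()) if match else None
```

If the data is loaded into a pandas DataFrame, each of these helpers can be applied to a column with `Series.map`, and the duplicate removal can be done with `DataFrame.drop_duplicates(subset=...)` instead of the hand-written loop.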