Data Engineering

Answer these questions to the best of your ability within 24 hours. If you get stuck on a technical aspect, describe in words how you would go about answering the question and what other information you would need.

  1. Attached is a sample dataset with 100,000 rows of random placements and media exposure measured in impressions. What queries would you use to analyze the data? (Hint: think about cleaning the data first. Common data problems include duplicates, missing values, and erroneous entries.)
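One way to approach the cleaning step is sketched below, using Python's built-in `sqlite3` so the SQL can be run as written. The attached file is not reproduced here, so the table name `exposures`, the columns `placement` and `impressions`, and the toy rows are all assumptions made for illustration. The three queries look for exact duplicates, missing values, and impossible values.

```python
import sqlite3

# Toy rows standing in for the attached file. The column names
# (placement, impressions) are assumptions, not the real schema.
ROWS = [
    ("site_a", 120),
    ("site_a", 120),   # exact duplicate
    ("site_b", None),  # missing impressions
    ("site_c", -5),    # impossible negative count
    (None, 40),        # missing placement
    ("site_d", 300),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exposures (placement TEXT, impressions INTEGER)")
conn.executemany("INSERT INTO exposures VALUES (?, ?)", ROWS)

# Rows that appear more than once with identical values.
DUPLICATES = """
    SELECT placement, impressions, COUNT(*) AS copies
    FROM exposures
    GROUP BY placement, impressions
    HAVING COUNT(*) > 1
"""

# Number of NULLs per column (a comparison evaluates to 0 or 1 in SQLite).
MISSING = """
    SELECT SUM(placement IS NULL)   AS missing_placement,
           SUM(impressions IS NULL) AS missing_impressions
    FROM exposures
"""

# Values that cannot be real impression counts.
INVALID = "SELECT COUNT(*) FROM exposures WHERE impressions < 0"

duplicates = conn.execute(DUPLICATES).fetchall()
missing = conn.execute(MISSING).fetchone()
invalid = conn.execute(INVALID).fetchone()[0]

print("duplicates:", duplicates)
print("missing (placement, impressions):", missing)
print("negative impression rows:", invalid)
```

On the real file the same three checks would be repeated for every column, and the definition of a duplicate (all columns versus a key subset) would need to be agreed on before deleting anything.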

 

  1. Now that you have identified issues with the data, do you notice anything unusual about this dataset? What queries would you use to investigate? (Hint: think about what you know about digital marketing. Do you think these would be good exposures?)
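The question does not say what the anomaly is, so the following is only a sketch of how one might look for it. It assumes the same hypothetical `exposures(placement, impressions)` table, invented toy rows, and an arbitrary cutoff `SUSPICIOUS_ABOVE`. The idea is to compare the typical impression count with the extremes and to check how concentrated the total is in a few rows.

```python
import sqlite3

# Toy rows; the schema and the values are assumptions for illustration.
ROWS = [
    ("site_a", 3),
    ("site_b", 5),
    ("site_c", 4),
    ("site_d", 2),
    ("site_e", 986),  # one row carries almost all impressions
]
SUSPICIOUS_ABOVE = 100  # hypothetical cutoff, to be tuned on the real data

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exposures (placement TEXT, impressions INTEGER)")
conn.executemany("INSERT INTO exposures VALUES (?, ?)", ROWS)

# Spread of the counts: a mean far above most rows hints at outliers.
summary = conn.execute(
    "SELECT MIN(impressions), AVG(impressions), MAX(impressions) FROM exposures"
).fetchone()

# Share of all impressions held by the single largest row.
top_share = conn.execute(
    """
    SELECT placement,
           impressions,
           100.0 * impressions / (SELECT SUM(impressions) FROM exposures) AS pct
    FROM exposures
    ORDER BY impressions DESC
    LIMIT 1
    """
).fetchall()

# Rows above the assumed cutoff.
suspicious = conn.execute(
    "SELECT COUNT(*) FROM exposures WHERE impressions > ?", (SUSPICIOUS_ABOVE,)
).fetchone()[0]

print("min / avg / max:", summary)
print("largest row and its share of the total:", top_share)
print("rows above cutoff:", suspicious)
```

Whether a heavily skewed result means bad exposures (for example, non-human traffic or uncapped frequency) is a judgment that depends on columns the real file may or may not contain.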

 

  1. Segment the media exposures into 5 groups. What queries would you use to help you with this? Create a histogram in your answer.
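A minimal sketch of one possible segmentation follows: five equal-width bins computed in SQL, then a text histogram printed from the counts. The table name, the `impressions` column, and the toy values are assumptions; equal-width binning is just one choice among several.

```python
import sqlite3

# Toy impression counts; the real column name and values are assumptions.
VALUES = [1, 2, 2, 3, 10, 11, 25, 26, 27, 40, 49, 50]
GROUPS = 5

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exposures (impressions INTEGER)")
conn.executemany("INSERT INTO exposures VALUES (?)", [(v,) for v in VALUES])

lo, hi = conn.execute(
    "SELECT MIN(impressions), MAX(impressions) FROM exposures"
).fetchone()

# Integer arithmetic maps each value to a bucket index 0..GROUPS-1.
BUCKETS = """
    SELECT (impressions - :lo) * :groups / (:hi - :lo + 1) AS bucket,
           COUNT(*) AS n
    FROM exposures
    GROUP BY bucket
    ORDER BY bucket
"""
rows = conn.execute(BUCKETS, {"lo": lo, "hi": hi, "groups": GROUPS}).fetchall()

# GROUP BY omits empty buckets, so fill them in with zero.
counts = dict(rows)
histogram = [counts.get(b, 0) for b in range(GROUPS)]

for group, n in enumerate(histogram, start=1):
    print(f"group {group}: {'#' * n} ({n})")
```

Equal-width bins tend to put most rows in the first group when the data is skewed. Equal-sized groups are an alternative, for instance with the `NTILE(5)` window function available in SQLite 3.25 and later.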

 

  1. How many green T-shirts do you think are sold in a year? (This is an open-ended question; let us know how you would arrive at this figure.)
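An estimate like this is usually built by multiplying a few explicit assumptions. The sketch below shows the structure only: the function name and every number in it are made-up placeholders, not market data, and the market (city, country, world) is left for the answerer to define.

```python
def green_tshirts_per_year(population, shirts_per_person, green_share):
    """Estimate = people x T-shirts bought per person per year x share that is green."""
    return round(population * shirts_per_person * green_share)


# Placeholder inputs only; each one would need a source or a stated rationale.
POPULATION = 1_000_000

low = green_tshirts_per_year(POPULATION, shirts_per_person=1, green_share=0.02)
mid = green_tshirts_per_year(POPULATION, shirts_per_person=2, green_share=0.05)
high = green_tshirts_per_year(POPULATION, shirts_per_person=4, green_share=0.10)

print(f"low: {low:,}  mid: {mid:,}  high: {high:,}")
```

Reporting a low, middle, and high scenario makes it clear which assumption drives the result and where better information would help most.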
