My First SQL Project - An SQL Analysis of Movies Produced Between 1980 – 2022.

This is my first data analysis (SQL) project, and I wanted to infuse it with elements of data analysis and my personal touch (drama and beauty).

According to beloved Wikipedia (this is me sweetly referencing Wikipedia (also, this is not a science project, so I can reference Wikipedia!))... Double brackets: grammatically wrong, but SQL says otherwise and she’s the queen here, so...

Okay! Back to data analysis: it is the process of inspecting, cleansing, transforming, and modelling data with the goal of discovering useful information and supporting decision-making.

Structured Query Language (SQL), on the other hand, is a standardized programming language used to manage relational databases and perform various operations on the data in them. Reading and understanding it isn’t hard: it uses pretty clear words and simple sentences.

Before we move on, a question for the language/pronunciation cognoscenti: is it pronounced "ess-queue-ell" or "sequel"?

Learning SQL was fun, educational and pretty easy (thanks to my tutors and HackerRank). At some point, though, concepts became confusing and I could hear the word “join” ringing constantly in my ears. The tables and their connections puzzled me so much that I was tempted to skip them and hope that someday it would all just “magically” click in my head - DON’T DO THAT!!

I stood by what Richard Feynman said when he was asked to explain why spin one-half particles obey Fermi-Dirac statistics: “I couldn’t do it. I couldn’t reduce it to freshman level. That means we do not really understand this.” Seeing as I am a freshman myself in the world of data, I always ask, “Can I explain this to my 6-year-old cousin, who still believes an orange tree will sprout from her stomach because she swallowed a seed?” Whenever I am satisfied that I can, I move on to the next topic.

How was this applied?

The dataset came from Kaggle.

The file was in CSV (comma-delimited) format. Not a lot of cleaning was done: I only added an “ID” column, divided the data into two tables (’cause I’m bougie like that) and called my firstborn tables “dvd” and “cd” for no great reason. And oh! The “released” column. Trying to figure her out took twenty attempts, a 120g, 680-calorie bar of Dairy Milk Whole Nut, 2 Snickers bars, 4 buckets of tears, bouts of anger and the viral TikTok sound “I’m gonna kill my mum, I’m gonna kill my dad”, which portrays exactly how I felt. She flummoxed me and I bamboozled her: she now resides in Kaggle, in the original dataset, not in my twins “dvd” and “cd” (she was also not necessary for my analysis).

I imported the CSV file into pgAdmin and wrote my first query:
SELECT * FROM dvd;
Felt great. Encouraging. Reassuring.
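
Since the data lives in two tables that share the added ID column, most questions need the two joined back together. Here is a rough sketch of such a join. The column split between “dvd” and “cd” is an assumption (the post does not show it), and the toy rows stand in for the real data:

```sql
-- Toy stand-ins for the two tables; which columns live where is an assumption.
WITH dvd (id, name, genre, year, score, country) AS (
    VALUES (1, 'Film A', 'Action',  1985, 7.1, 'United States'),
           (2, 'Film B', 'Comedy',  1994, 8.3, 'United Kingdom'),
           (3, 'Film C', 'Western', 2003, 6.0, 'United States')
),
cd (id, budget, gross) AS (
    VALUES (1, 20000000, 90000000),
           (2,  5000000, 40000000)
)
-- An inner join keeps only ids present in both tables,
-- so Film C (no matching row in cd) drops out of the result.
SELECT d.name, d.genre, c.budget, c.gross
FROM dvd AS d
JOIN cd  AS c ON c.id = d.id
ORDER BY d.id;
```

Swapping JOIN for LEFT JOIN would keep Film C with empty budget and gross, which is a handy way to spot rows that went missing in the split.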

This dataset came with questions, and I answer them below with screenshots of the queries I used and the results I got. Please ignore the structure of my queries (no semicolons) and focus on the results.

A. Total Gross
This one was straightforward and easy.
image.png $587,105,546,272.
That is a gross of over $587 billion for movies produced in just 42 years: roughly 140 billion every decade, 14 billion every year, 1 billion every month and 38 million daily on film consumption! For context, the Human Genome Project cost $3 billion, spread across 20 institutions around the world over a period of 13 years. The Treaty of Versailles, signed after World War 1, required Germany to pay $12.5 billion in reparations unconditionally, less than the annual gross of movies.
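
For anyone following along without the screenshot, a total like this comes from a single SUM, and the per-year breakdown is plain division. This is a sketch only: the “cd” layout and toy rows are assumptions, while 587105546272 is the total reported above.

```sql
-- Assumption: cd.gross holds each movie's gross in US dollars (toy rows here).
WITH cd (id, gross) AS (
    VALUES (1, 90000000),
           (2, 40000000),
           (3, NULL)          -- a movie with no recorded gross
)
SELECT SUM(gross) AS total_gross   -- SUM ignores NULLs
FROM cd;

-- Spreading the reported total over 42 years.
SELECT ROUND(587105546272 / 4.2)        AS per_decade,
       ROUND(587105546272 / 42.0)       AS per_year,
       ROUND(587105546272 / 42.0 / 12)  AS per_month,
       ROUND(587105546272 / 42.0 / 365) AS per_day;
```

Dividing by 42.0 rather than 42 keeps the arithmetic in decimals. The results come out at roughly 140 billion per decade, 14 billion per year, 1.2 billion per month and 38 million per day.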

B. Total Revenue
Also pretty straightforward. Revenue here means what is left of a movie’s gross after removing the cost of production (budget), which strictly speaking is profit.

image.png $364,569,800,248.
This was the profit made from putting out movies. Once again, for context, because we never grasp the magnitude of numbers when they are simply written down: every year, the movie industry made a whopping $8.7 billion or so. These numbers show the importance of data analysis: we never know the magnitude of things, especially those we cannot count or quantify.
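
One thing worth checking with a gross-minus-budget total is how missing budgets are treated, because two innocent-looking ways of writing the sum can disagree. A small sketch, assuming a “cd” table with budget and gross columns (hypothetical layout, toy numbers):

```sql
WITH cd (id, budget, gross) AS (
    VALUES (1, 20000000, 90000000),
           (2,  5000000, 40000000),
           (3, NULL,     12000000)   -- budget unknown
)
SELECT
    -- Row 3 gives NULL for gross - budget, so it is left out entirely.
    SUM(gross - budget)      AS profit_known_budgets,
    -- Here row 3's gross is still counted, with nothing subtracted for it.
    SUM(gross) - SUM(budget) AS gross_minus_budget
FROM cd;
```

With these toy rows the first column is 105,000,000 and the second is 117,000,000. Neither is wrong, but it helps to know which one a headline number is using.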

C. Top 10 movies by total revenue
These are the movies that made the most money.
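
A ranking like this is an ORDER BY with a LIMIT. The sketch below assumes a joined table called “movies” with name, budget and gross columns; that name and the toy rows are made up for illustration.

```sql
WITH movies (name, budget, gross) AS (
    VALUES ('Film A', 200000000, 900000000),
           ('Film B',  20000000,  50000000),
           ('Film C',  30000000,  60000000),
           ('Film D', 150000000, 700000000)
)
SELECT name,
       gross - budget AS profit
FROM movies
WHERE budget IS NOT NULL
  AND gross  IS NOT NULL      -- skip rows where profit cannot be computed
ORDER BY profit DESC
LIMIT 10;
```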

image.png image.png

Were these movies rated highly? Or is this totally dependent on hype? Can we say people would rather see a movie because of intense marketing than because of a beautiful storyline and magnificent directing?

image.png image.png

These were the best-rated movies, and none of them made the top 10 by revenue. This suggests that great marketing can sell a movie greatly. It could also be said that people do not appreciate some elements of cinema – screenplay, dialogue, directing, etc.
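
Comparing two top-10 lists by eye works, but the overlap can also be checked in one query by ranking every movie twice. This is a sketch under assumptions: the “movies” table, its columns and the toy rows are hypothetical, and the cutoff is 2 here only because the toy table is tiny (use 10 on the real data).

```sql
WITH movies (name, score, budget, gross) AS (
    VALUES ('Film A', 6.5, 200000000, 900000000),
           ('Film B', 9.0,  20000000,  50000000),
           ('Film C', 8.8,  30000000,  60000000),
           ('Film D', 7.0, 150000000, 700000000)
),
ranked AS (
    SELECT name,
           RANK() OVER (ORDER BY score DESC)          AS score_rank,
           RANK() OVER (ORDER BY gross - budget DESC) AS profit_rank
    FROM movies
)
SELECT name,
       score_rank,
       profit_rank,
       (score_rank <= 2 AND profit_rank <= 2) AS in_both_lists
FROM ranked
WHERE score_rank <= 2 OR profit_rank <= 2
ORDER BY profit_rank;
```

If no row comes back with in_both_lists true, the two lists do not share a single movie.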

image.png image.png

D. All genres by total revenue
Everybody’s making some sort of profit. Action, animation and comedy are the big three, the revered holy trinity; and then there’s western: if she were a spice, she’d be flour.

image.png image.png

E. All genres by total gross
From this, we can deduce the size (profitability/demand) of a genre. There’s action, comedy and animation again, being the ridiculous triplets. Then, once again, there’s western.
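
Both genre breakdowns (D and E) can come out of one GROUP BY. A minimal sketch, assuming a joined “movies” table with genre, budget and gross columns (hypothetical name and toy rows):

```sql
WITH movies (name, genre, budget, gross) AS (
    VALUES ('Film A', 'Action',    200000000, 900000000),
           ('Film B', 'Comedy',     20000000,  50000000),
           ('Film C', 'Western',    30000000,  35000000),
           ('Film D', 'Action',    150000000, 700000000),
           ('Film E', 'Animation',  80000000, 400000000)
)
SELECT genre,
       COUNT(*)            AS movie_count,
       SUM(gross)          AS total_gross,
       SUM(gross - budget) AS total_profit
FROM movies
GROUP BY genre
ORDER BY total_profit DESC;
```

Keeping movie_count next to the totals helps separate a genre that earns a lot per movie from one that simply has a lot of movies.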

image.png image.png

F. Top 10 score by revenue
You can see that making an exceptional movie, one with excellent screenplay, directing, storyline and dialogue, seems to matter less at the box office than hype and marketing.

image.png image.png

G. Top movies over the decades (1980-2020)
1980 – 1990

image.png image.png

“They need people like me, so they can make us the bad guy”: what movie is this quote from? It’s in the top 15 of its time, stars a legendary actor, and the movie is just fabulous! Also, look at my baby sitting graciously at the top. He’s top 2 and he’s not number 2!

1990 – 2000

image.png image.png

1990 to 2000 might just be the GOAT decade in movie production, ’cause look at these gems. I recommend seeing every movie in this list.

2000 – 2020

image.png image.png

The age of lightsabers and dark cloaks. I’d quote the greatest sports coach to ever do it: “If I speak, I am in big trouble.”
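
The per-period lists in this section could also be produced in one go with a window function. This is a sketch under assumptions, not the query from the screenshots: the “movies” table and toy rows are hypothetical, “top” is taken to mean highest gross, and the buckets are calendar decades rather than the exact ranges used above.

```sql
WITH movies (name, year, gross) AS (
    VALUES ('Film A', 1984, 300000000),
           ('Film B', 1988, 150000000),
           ('Film C', 1989, 250000000),
           ('Film D', 1995, 500000000),
           ('Film E', 1997, 900000000),
           ('Film F', 2004, 700000000)
),
ranked AS (
    SELECT name,
           year,
           gross,
           (year / 10) * 10 AS decade,   -- integer division: 1988 -> 1980
           ROW_NUMBER() OVER (
               PARTITION BY (year / 10) * 10
               ORDER BY gross DESC
           ) AS position_in_decade
    FROM movies
)
SELECT decade, position_in_decade, name, gross
FROM ranked
WHERE position_in_decade <= 2          -- raise the cutoff for longer lists
ORDER BY decade, position_in_decade;
```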

H. Highest gross (top 10 countries)
This would give directors, actors and people who generally work in filmmaking an idea of the size of a country’s movie market. A large market can easily propel you to stardom. Does this account for why US movie stars are known globally?
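
A country ranking gets easier to read with each country’s share of the total next to it. Here is a rough sketch, assuming a “movies” table with country and gross columns (hypothetical name, toy rows). One caveat: if the country column records where a movie was produced rather than where tickets were sold, this measures output more than market size.

```sql
WITH movies (name, country, gross) AS (
    VALUES ('Film A', 'United States',  900000000),
           ('Film B', 'United Kingdom',  50000000),
           ('Film C', 'United States',  700000000),
           ('Film D', 'France',          30000000)
)
SELECT country,
       SUM(gross) AS total_gross,
       -- SUM(SUM(gross)) OVER () is the grand total across all groups.
       ROUND(100.0 * SUM(gross) / SUM(SUM(gross)) OVER (), 1) AS pct_of_total
FROM movies
GROUP BY country
ORDER BY total_gross DESC
LIMIT 10;
```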

image.png image.png

I. Highest grossing movies

image.png image.png

J. Bottom 5 lowest-grossing movies (1980-2020)

image.png image.png
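
The highest and lowest lists (I and J) are the same query sorted in opposite directions, with one PostgreSQL quirk worth knowing: a descending sort puts NULLs first unless told otherwise. A sketch with a hypothetical demo table and toy rows:

```sql
-- Hypothetical demo table; names and numbers are placeholders.
CREATE TEMP TABLE movies_demo (name TEXT, year INTEGER, gross BIGINT);

INSERT INTO movies_demo (name, year, gross) VALUES
    ('Film A', 1985, 90000000),
    ('Film B', 1994, 40000000),
    ('Film C', 2003, NULL),       -- gross not recorded
    ('Film D', 2019, 3000);

-- Highest grossing: NULLS LAST keeps unrecorded rows out of the top.
SELECT name, year, gross
FROM movies_demo
ORDER BY gross DESC NULLS LAST
LIMIT 10;

-- Lowest grossing: drop unrecorded rows, then sort ascending.
SELECT name, year, gross
FROM movies_demo
WHERE gross IS NOT NULL
ORDER BY gross ASC
LIMIT 5;
```

Showing the year next to the name also helps tell apart different movies that share a title.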

SERMON.

  1. People are certainly into action, animation and comedy as they are the biggest genres.
  2. Such a shame romance did not do well, certainly a shocker. Also, why is horror making more money than romance? We need to have a conversation!
  3. Numbers never lie, and this time they have shown us that “Titanic” is the greatest romance movie ever produced, followed closely by “The Notebook” (the second one based purely on my sentiments).
  4. It is shocking to see Parasite in the list of lowest-grossing movies over 42 years. What exactly happened?
  5. Something big has to happen in the western movies arena. Their performance reminds me of Manchester United against Villarreal in the 2021 Europa League final. Shambolic!

I hope this was an insightful and enjoyable read.

P.S. That is Richard Feynman at the top. A great physicist, he said:

“If you think you understand quantum mechanics, you don’t understand quantum mechanics.”

Ironic, as he won a Nobel Prize for contributions to the development of quantum electrodynamics. He is known for his work on the path integral formulation of quantum mechanics and the theory of quantum electrodynamics. That quote is both literal and figurative: he meant it, while I believe it fosters poise.

You don’t think so? Well, a discourse then?