CS50 Video Player
    • 🧁

    • 🥧

    • 🍓

    • 🍿
    • 0:00:00Introduction
    • 0:00:17Tables
    • 0:03:37Spreadsheets
    • 0:04:26Databases
    • 0:12:28SQLite
    • 0:13:56SELECT
    • 0:19:36LIMIT
    • 0:21:34WHERE
    • 0:31:45NULL
    • 0:34:23LIKE
    • 0:45:58Range Conditions
    • 0:53:10ORDER BY
    • 1:02:00Aggregate Functions
    • 1:17:21Conclusion
    • 0:00:00[MUSIC PLAYING]
    • 0:00:17CARTER ZENKE: Well, hello, one and all, and welcome to CS50's Introduction to
    • 0:00:21Databases with SQL.
    • 0:00:23My name is Carter Zenke.
    • 0:00:25And in this course, you'll learn how to represent, how to organize and manage,
    • 0:00:29and how to ask questions of the data that's around you
    • 0:00:32in your everyday life.
    • 0:00:34But why learn those skills?
    • 0:00:36Well, you may have heard we're living in the information age, where we generate
    • 0:00:40so much information, so much data by virtue of interaction
    • 0:00:43with computers and with each other over the internet.
    • 0:00:46You might think of, let's say, Google keeping track of the sites you click on
    • 0:00:50or the sites you search for.
    • 0:00:52You could think of maybe the smartphone in your pocket or the smartwatch
    • 0:00:55on your wrist keeping track of health information, emails,
    • 0:00:58text messages, and so on.
    • 0:01:00You might even think of YouTube, where you
    • 0:01:02might be watching this same video keeping track of all
    • 0:01:05the videos on their platform, the creators of those videos,
    • 0:01:08and even the comment you might leave on this video.
    • 0:01:12So although, we're living in this information age, where
    • 0:01:15there is so much data, so much information,
    • 0:01:17we can use these new tools, like database and SQL,
    • 0:01:20to interact with that information to store it and manage it.
    • 0:01:23And although we're using some of these new tools,
    • 0:01:25some of the other concepts we'll learn aren't actually so new.
    • 0:01:29So here is a diagram from literally a few thousand years ago.
    • 0:01:34And notice how this diagram has rows and columns.
    • 0:01:39And this seems to store the stipends for workers at a temple
    • 0:01:43some few thousand years ago.
    • 0:01:45So given what you know based on your prior knowledge, what kind of name
    • 0:01:50might you give this diagram with rows and with columns?
    • 0:01:55What can we give a name to for this?
    • 0:01:58So I'm seeing some ideas of a table, perhaps a spreadsheet as well.
    • 0:02:02For our purposes, we'll call this a table, where
    • 0:02:05a table stores some set of information.
    • 0:02:08And every row in that table stores one item
    • 0:02:12in that set, where every column has some piece of information
    • 0:02:15about that item, some attribute of that item.
    • 0:02:18So here, for example, we do have a table of workers at a temple.
    • 0:02:22Every row is one worker.
    • 0:02:24And every column is their stipend for a particular month.
    • 0:02:29So I could take this idea of a table, this very ancient idea,
    • 0:02:33and apply it to a more modern context.
    • 0:02:35So let's say I'm a librarian, for instance.
    • 0:02:38I want to organize my library.
    • 0:02:41Well, here I have book titles and authors.
    • 0:02:44And I could certainly use a table to store this information.
    • 0:02:47But how might you propose I store this information?
    • 0:02:51What could I do with my rows and with my columns
    • 0:02:54if I have book titles and book authors?
    • 0:02:58I see one thing I could do is probably organize my titles and my authors
    • 0:03:03next to each other like this.
    • 0:03:05I could take my titles to my authors, put them right next together.
    • 0:03:07So I have Song of Solomon by Toni Morrison,
    • 0:03:10Goodnight Moon by Margaret Wise Brown.
    • 0:03:12And notice hear how each book is one row.
    • 0:03:15But every row has two columns worth two pieces of information for each book.
    • 0:03:21I have a title for one column and an author for the other.
    • 0:03:25And so together, I have a table of books where every column tells me
    • 0:03:29one piece of information.
    • 0:03:31And every row tells me one book in this data set.
    • 0:03:36So thankfully, now that we're living in this information age,
    • 0:03:40we no longer have to use stone tablets or perhaps pencil and paper
    • 0:03:45to store our tables.
    • 0:03:47We have software now like Apple Numbers, Google Sheets, and Microsoft Excel.
    • 0:03:51But this isn't a course on Apple Numbers or Microsoft Excel or so on.
    • 0:03:57It's actually a course on databases and on SQL.
    • 0:04:00So feel free to raise your hand if you'd like, but why
    • 0:04:03would we decide to move along from these spreadsheet
    • 0:04:07softwares towards a database?
    • 0:04:10What might that database give us that a spreadsheet might not give us?
    • 0:04:14So I'm seeing a few ideas here and among them are some simplicity,
    • 0:04:18some ability to organize some data.
    • 0:04:20But there are a few other ideas which I think
    • 0:04:22about too for why we might move beyond spreadsheets and go towards databases.
    • 0:04:26Now one of these is this idea of scale.
    • 0:04:29So let's say you are a Google or an Instagram.
    • 0:04:32You're trying to store not just tens of thousands
    • 0:04:35of users or hundreds of thousands but literally
    • 0:04:37millions of users or billions of users.
    • 0:04:39And with that kind of scale may be better served by a database
    • 0:04:43to store that much information.
    • 0:04:46Another reason to move from spreadsheets to databases
    • 0:04:49is the idea of being able to update data more frequently.
    • 0:04:53Maybe you're a Twitter of the world, and you're
    • 0:04:55trying to have others tweet multiple times per second.
    • 0:04:58Well, a database can handle that kind of capacity
    • 0:05:00much better than a spreadsheet could alone.
    • 0:05:04And a third reason to move beyond this might be speed.
    • 0:05:08Let's say I'm trying to look up some piece of information in my database.
    • 0:05:11Well, I could do that much faster with a database
    • 0:05:14than I could with a spreadsheet.
    • 0:05:15You could think of yourself using Command-F or Control-F to find
    • 0:05:18a piece of information in your spreadsheet,
    • 0:05:20going one by one through the rows.
    • 0:05:22A database gives you access to more kinds
    • 0:05:24of algorithms you could use to search this data much faster, ultimately.
    • 0:05:28So these three reasons, among others, are
    • 0:05:31those might want to move beyond spreadsheets and start using databases.
    • 0:05:36Now, it's worth thinking first, what is a database?
    • 0:05:40We'll be talking about databases throughout this course.
    • 0:05:42So what is a database?
    • 0:05:44Well, a database is simply some way to organize your data such
    • 0:05:48that you can actually create data, update data, read data, and delete
    • 0:05:54data.
    • 0:05:54And often these our for interactions that we'll do with a database--
    • 0:05:57like adding some data, looking at data, deleting data, and even updating it
    • 0:06:02along the way.
    • 0:06:03But the database isn't the only thing in our picture here.
    • 0:06:06We also have a database management system,
    • 0:06:09a way to interact with our database.
    • 0:06:13So you might think of perhaps writing a program on your computer.
    • 0:06:17You have some interface with to write that program,
    • 0:06:19like VS Code for instance.
    • 0:06:21Or you might think of your own desktop on your own computer.
    • 0:06:23You have icons that you interact with the underlying operating system.
    • 0:06:27In the same way, we can use this software called a database management
    • 0:06:31system, or a DBMS for short, to interact with a database perhaps
    • 0:06:35using a graphical interface or using a textual language too.
    • 0:06:40Now, there are a few varieties of database management softwares.
    • 0:06:46And these are a few of them here.
    • 0:06:47MySQL, Oracle, PostgreSQL, and SQLite.
    • 0:06:51But this is a non-exhaustive list.
    • 0:06:53So let me also ask again, what kinds of other database management systems
    • 0:06:57have you perhaps heard of in this case?
    • 0:07:01I'm seeing one for Microsoft Access, perhaps, MongoDB.
    • 0:07:05There are other kinds of softwares, other companies out there that make
    • 0:07:08these ways to interact with a database.
    • 0:07:10And this is, again, a nonexhaustive list.
    • 0:07:13Now, if you are a database administrator,
    • 0:07:16or maybe you're somebody who's making a choice of which software to use,
    • 0:07:19you have a few trade-offs to consider.
    • 0:07:21Let's say you might think of one being proprietary, for instance, costing
    • 0:07:25money to work with.
    • 0:07:26What you get for that money is additional support
    • 0:07:29to actually implement your own database.
    • 0:07:31On the other hand, you might have open source software or free software
    • 0:07:35to use, stuff like PostgreSQL, MySQL, and SQLite.
    • 0:07:39But the downside is you are then responsible for actually implementing
    • 0:07:43that database.
    • 0:07:45Another thing to consider too is that maybe some
    • 0:07:48are going to be heavier weight than others, more fully featured
    • 0:07:52as a consequence but perhaps heavier weight requiring
    • 0:07:54more computation to run.
    • 0:07:56You can think of those like MySQL or PostgreSQL
    • 0:07:58being a little bit heavier weight but being more fully featured,
    • 0:08:02whereas SQLite down below will be a little lighter weight, as the name
    • 0:08:06might imply, but allow you to do most of the same work
    • 0:08:09that these other softwares could allow you to do as well.
    • 0:08:13And in this course, we'll actually be using SQLite for you
    • 0:08:16to work with your own databases.
    • 0:08:18But gradually, we'll move on to MySQL and PostgreSQL too.
    • 0:08:24So let me go ahead and talk about then SQL in this case.
    • 0:08:29You might notice that in each of these, MySQL and PostgreSQL and SQLite,
    • 0:08:34each of them have this idea of SQL in them.
    • 0:08:36And SQL is that language that we'll use to interact with our database.
    • 0:08:40Now, let me ask, what does SQL stand for, perhaps?
    • 0:08:45We talk about S-Q-L or SQL, but what might SQL stand for?
    • 0:08:51So I'm saying it stands for Structured Query Language,
    • 0:08:53which is good if you already know this, but not to worry if you don't.
    • 0:08:56So SQL does stand for Structured Query Language.
    • 0:09:00As we'll see in this course, it is structured.
    • 0:09:02It does have some keywords you can use to interact with the database.
    • 0:09:05And it is a query language-- it can be used to ask
    • 0:09:08questions of data inside a database.
    • 0:09:13We'll see that this is the language we can use to create data, to read data,
    • 0:09:18to update data, and delete data all with SQL in this case.
    • 0:09:24And our next thing will be to talk about this idea of querying.
    • 0:09:28So SQL is a query language, but what can we do with SQL?
    • 0:09:32Well, the first thing we can do, we'll focus on first
    • 0:09:35in this course is writing queries--
    • 0:09:37trying to ask questions of data using SQL.
    • 0:09:41Well, what kinds of questions could we ask?
    • 0:09:43Well, you might imagine perhaps working at an Instagram or a Facebook trying
    • 0:09:48to work as an engineer to figure out what kinds of posts
    • 0:09:51are the most liked on your platform.
    • 0:09:53That's a question you can answer with databases and with SQL.
    • 0:09:57You might also think of whether your numbers of daily users
    • 0:10:00are growing or shrinking if you work at a startup for instance.
    • 0:10:03Even maybe you might be working for some a company like Spotify that could ask,
    • 0:10:07how could we play songs that are like those a user just played?
    • 0:10:11This too is a question you can answer with databases and with SQL.
    • 0:10:17Now today, we'll be focusing on this database of books.
    • 0:10:21And in particular, books that have been longlisted,
    • 0:10:24quote, unquote, "for the International Booker Prize."
    • 0:10:27The International Booker Prize, if you're not familiar,
    • 0:10:29is an award given to books written around the world
    • 0:10:32by authors from many countries.
    • 0:10:34And it's designed to award books of fiction particularly good
    • 0:10:38in some cases.
    • 0:10:39And every year, the committee selects 13 books
    • 0:10:43to include on a longlist for consideration for this prize.
    • 0:10:47And our database then has five years worth of longlists
    • 0:10:50for the International Booker Prize inside of it.
    • 0:10:53We could use this database, perhaps, if we
    • 0:10:55were a librarian trying to find books for our library or even as a book
    • 0:11:00reader, an avid reader myself trying to find books to read that I
    • 0:11:03could put on my own shelf overall.
    • 0:11:06So we'll look at this database.
    • 0:11:08But we'll need a few tools in our toolkit,
    • 0:11:10metaphorically, to actually interact with this database.
    • 0:11:13And one of them is going to be Visual Studio Code.
    • 0:11:17Visual Studio code is an IDE, integrated development environment, to write code
    • 0:11:22and to edit files with.
    • 0:11:24It's also often called VS Code.
    • 0:11:27Now, in VS Code, we'll also be able to use SQLite, this database management
    • 0:11:32system, or a DBMS for short, to actually interact with that database.
    • 0:11:37So we'll be using these two tools combined
    • 0:11:39to work with the database of longlisted books for the International Booker
    • 0:11:44Prize.
    • 0:11:45And although we'll be using it here, SQLite is not just used in this course.
    • 0:11:49It's used in a variety of applications.
    • 0:11:51You could think too of phone applications,
    • 0:11:54where SQLite is often used on those devices have much lower memory.
    • 0:11:58You could think too of it being used on desktop applications
    • 0:12:02to simplify the process of storing data there too.
    • 0:12:05You could even think of it being used on websites
    • 0:12:07to help store information that user submits via a form, for example.
    • 0:12:12So we'll jump into using SQLite, but keep in mind
    • 0:12:15that not just in this course you'll use it,
    • 0:12:18but also, it's used in a variety of applications here too.
    • 0:12:23So why don't we just jump right into things and start using our environment
    • 0:12:27and start using SQLite?
    • 0:12:28So I'll go over here to my computer and will open up, let's say, VS Code.
    • 0:12:34We're here.
    • 0:12:35You can see I have my terminal environment.
    • 0:12:37And if you're familiar, you can type things like ls
    • 0:12:40to see the files that are in your current folder.
    • 0:12:43So I'll type ls right here.
    • 0:12:44And I'll see this database called longlist.db.
    • 0:12:48Again, working with books that have been longlisted or considered
    • 0:12:52for the International Booker Prize.
    • 0:12:54So if I want to open up this file, I can use this command.
    • 0:13:00Then this command is going to be called SQLite 3.
    • 0:13:04Or I can take some file that I have, like longlist.db, and open
    • 0:13:09it using this program called SQLite 3.
    • 0:13:12Well, it's called SQLite 3 because this is the third version of the SQLite
    • 0:13:16software.
    • 0:13:17So let's try this in our terminal.
    • 0:13:20I'll go back over here, and I'll say sqlite3 longlist.db.
    • 0:13:27And now, I'll hit Enter.
    • 0:13:29And notice how my terminal prompt changes.
    • 0:13:32It's no longer dollar sign.
    • 0:13:33It now says sqlite in front.
    • 0:13:35This means I'm inside of my SQLite environment.
    • 0:13:39So to clear things up, let me just clear my terminal.
    • 0:13:42I can use Control-L for this.
    • 0:13:44And now I have just that prompt up top.
    • 0:13:48And now a question I want to answer in this case first is, what
    • 0:13:52data do I actually have in my database?
    • 0:13:55What data is actually here for me to look at and to ask questions about?
    • 0:14:00Now, for this question, I can use my very first SQL
    • 0:14:04keyword, which will be called SELECT.
    • 0:14:07So SELECT is a way for me to select some rows in a table inside of my database.
    • 0:14:13Using SELECT, I can get back certain rows
    • 0:14:15or, in this case, perhaps all of them just to get a taste of what's inside.
    • 0:14:20So let's try using SELECT on this database to understand
    • 0:14:23what rows we have in our table here.
    • 0:14:26Let me go back to my computer and to my SQLite environment.
    • 0:14:30And I will try this very first SQL keyword.
    • 0:14:33I'll say SELECT.
    • 0:14:35And I can use this star operator here to say select everything.
    • 0:14:39I want every row and every column from this table.
    • 0:14:45Now, it's not enough for me to simply say select star and end my query.
    • 0:14:49I had to tell SQL which table do I want to select rows from.
    • 0:14:54In this case, I know I table was called longlist.
    • 0:14:57So I'll say SELECT star from "longlist," quote, unquote.
    • 0:15:02And to end my query, I'll say Semicolon.
    • 0:15:06And then finally, I can hit Enter.
    • 0:15:09And notice how I get a lot of data back.
    • 0:15:11This is a lot of data all at once.
    • 0:15:13But it's because my terminal is a little bit small,
    • 0:15:15there's a lot of rows and columns here.
    • 0:15:17So I could probably simplify this just a little bit.
    • 0:15:20And instead of saying SELECT star, I could also select a particular column
    • 0:15:24from my table.
    • 0:15:26I could say, for instance, SELECT, let's say,
    • 0:15:29just the title column from my database for my table like this.
    • 0:15:35I just know already that there is a column called title.
    • 0:15:39So now, I'll try this instead-- not select star, but select title instead.
    • 0:15:43I'll hit Enter.
    • 0:15:44And now, this looks a little bit better.
    • 0:15:46I can see the titles inside of this table from top to bottom.
    • 0:15:53Now, the neat thing here is I can select more than one column too.
    • 0:15:58Let's say I don't want just the title--
    • 0:15:59I want titles and authors in my search.
    • 0:16:02Well, I could do that as well.
    • 0:16:04Let me try this.
    • 0:16:05SELECT not just "title," quote, unquote, but then I'll
    • 0:16:09say comma and some new column to select.
    • 0:16:12I'll select also the authors from this table.
    • 0:16:16And I'll select them from the longlist table like this.
    • 0:16:20Now I'll hit Semicolon to end my query.
    • 0:16:22Hit Enter.
    • 0:16:23And now I'll see a variety of columns here,
    • 0:16:27in particular the title and the author column.
    • 0:16:31Now, this is going to be like all of my columns so far.
    • 0:16:34And there is a way for me to get back just to some of them
    • 0:16:36later on that we'll see.
    • 0:16:37But for now, let's ask, what questions do
    • 0:16:40you have on this SQL SELECT statement and how we're getting back these rows?
    • 0:16:47SPEAKER: Do I need to use the quotes around the words like you are?
    • 0:16:50CARTER ZENKE: Yeah, great question.
    • 0:16:52Do I have to use quotes around the words like I am?
    • 0:16:55In general, it's a good practice to use these double quotes around your table
    • 0:16:59names and your column names.
    • 0:17:01These are called SQL identifiers.
    • 0:17:03Later on, we'll see we'll also have strings in SQL-- strings
    • 0:17:08being collections of characters we can use.
    • 0:17:10For those, we'll simply single-quote them
    • 0:17:12to note the difference between a string and an actual column name.
    • 0:17:16So good style convention here to use-- double quotes for column names
    • 0:17:19and single quotes for string names.
    • 0:17:23Other questions too?
    • 0:17:25SPEAKER: OK.
    • 0:17:25I wanted to know, where did we take all this information from.
    • 0:17:29Like, this data, I don't know this list of books,
    • 0:17:34where did we take all this information from?
    • 0:17:36CARTER ZENKE: Yeah, a great question.
    • 0:17:38Where do we take this information from?
    • 0:17:39So some of this data is publicly available.
    • 0:17:42In fact, if you look at the Booker Prize website,
    • 0:17:44you can find a set of longlisted books over the years.
    • 0:17:47In this case, we have books from 2018 to 2023.
    • 0:17:51We'll also see later on this table has data on the ratings of those books
    • 0:17:56and the number of votes that were given to those books.
    • 0:17:59That data is from Goodreads, the site that aggregates
    • 0:18:02reviews from people like you, people like me who rate books online.
    • 0:18:06So we've taken data from a variety of sources and combined it into one here.
    • 0:18:11Let's take one more question to from Vinayak.
    • 0:18:14SPEAKER: Yeah.
    • 0:18:15So I wanted to ask regarding the syntax of what you used in the terminal.
    • 0:18:20So is the whole SQLite 3 is case-sensitive
    • 0:18:24because while using the syntax you used capital letters, whereas can we
    • 0:18:30use small-case letters as well?
    • 0:18:32CARTER ZENKE: A great question.
    • 0:18:33So here, I used capital letters for SQL keywords
    • 0:18:36and lowercase, my table names and column names.
    • 0:18:39Do I have to do that?
    • 0:18:40I, in some cases, do, some cases don't.
    • 0:18:43I think I could use lowercase for these SQL keywords,
    • 0:18:46but it's not very good style, for instance.
    • 0:18:48So let me just show you an example of this while I go back to my computer.
    • 0:18:52So the question again was, can I use lowercase for SQL keywords?
    • 0:18:57I think I could, but the question is, should I?
    • 0:19:00And probably not.
    • 0:19:01So let me try this.
    • 0:19:02I'll say select, let's say, title from--
    • 0:19:05I'm in the habit of uppercase it.
    • 0:19:06So I'll say from in lowercase longlist like this Semicolon.
    • 0:19:11Hit Enter.
    • 0:19:12And that still works.
    • 0:19:14But the problem you might run into is someone
    • 0:19:17who's reading your query particularly, a long one, might want to know,
    • 0:19:21what are the SQL keywords?
    • 0:19:22And what are your column names and other identifiers here?
    • 0:19:26By capitalizing your SQL keywords, you can make it clear
    • 0:19:29that this is a SQL keyword and not some other name overall.
    • 0:19:35OK.
    • 0:19:35So let's keep going.
    • 0:19:37And we saw just a little bit ago that we could select "title" from "longlist."
    • 0:19:41And we would get back a whole list of titles, literally all the titles that
    • 0:19:46are in this database.
    • 0:19:48But it's often maybe good practice for me to not look at all the data.
    • 0:19:53Like, imagine if we had millions of rows in this column,
    • 0:19:56but only to give back some.
    • 0:19:57Take a peek of what's inside this database.
    • 0:20:00And for that, I could use this other SQL keyword, this one called LIMIT.
    • 0:20:05So LIMIT, as its name might imply, limits
    • 0:20:09the number of queries or the number of rows I get back from my query.
    • 0:20:13I could say LIMIT 3, for instance, or LIMIT 5
    • 0:20:16to get back only the top 3 or the top 5 rows from my table.
    • 0:20:20And let me ask folks, if I wanted to peak
    • 0:20:22in this data set, how many rows should I try to limit it to?
    • 0:20:27I could say SELECT title from, let's say,
    • 0:20:30longlist, but limit to what number?
    • 0:20:33I'm seeing 10.
    • 0:20:35So let's try 10 first.
    • 0:20:36I'll go back to my computer.
    • 0:20:38I'll come back over here, and I'll say, why don't I
    • 0:20:40select "title" from "longlist?"
    • 0:20:45But now, instead of hitting Semicolon, I will instead say LIMIT 10 Semicolon.
    • 0:20:53And I'll hit Enter.
    • 0:20:54Now I see only the first 10 rows in my data set.
    • 0:21:00Handy for peeking in at the top of my data set here too.
    • 0:21:03Let me try not just 10 but 5.
    • 0:21:05So I'll say SELECT "title" from "longlist" LIMIT 5 Semicolon.
    • 0:21:13This then gives me just the top 5 titles in my database.
    • 0:21:19Just in whatever order they were added to my table,
    • 0:21:21I'll see them in that order here too.
    • 0:21:25So using LIMIT, we can actually try to get back a certain number of rows.
    • 0:21:31But this isn't quite that interesting.
    • 0:21:33It's good for peaking in your data set.
    • 0:21:35I think we've answered that question of, what data do we have?
    • 0:21:38But let's say we want to make more advanced queries.
    • 0:21:40We want to find the books that were nominated in 2023 or perhaps books
    • 0:21:44by a certain author.
    • 0:21:45Well, for that, we could use this next SQL keyword, this one called WHERE.
    • 0:21:50So WHERE allows me to get back not all rows, but only
    • 0:21:56some rows where some condition is true.
    • 0:22:00So WHERE is often combined with other conditions
    • 0:22:04to make sure I only get back some rows, where that condition is true.
    • 0:22:09Let's try looking at that in SQLite to get
    • 0:22:11a feel for what it can do for us here.
    • 0:22:14So I'll go back to my WHERE.
    • 0:22:17And I will then go back to SQLite here.
    • 0:22:20Hit Control-L to clear my terminal.
    • 0:22:23And I'll then try this query.
    • 0:22:25I'll say SELECT "title."
    • 0:22:28SELECT "title."
    • 0:22:29And why don't we also select author along the way?
    • 0:22:32Two columns here.
    • 0:22:34And I'll select them from my longlist table.
    • 0:22:37But I don't want all rows.
    • 0:22:39I only want, let's say, those titles and authors that were longlisted in 2023.
    • 0:22:45So I'll do this.
    • 0:22:46I'll say WHERE the "year" column is equal to 2023.
    • 0:22:53And notice here how 2023 is not in quotes because it
    • 0:22:56is an actual number, an integer.
    • 0:22:59So I don't need to quote it like I would a string some collection of characters
    • 0:23:02or a table or column name.
    • 0:23:05So I'll hit Enter here.
    • 0:23:07And what do I see?
    • 0:23:08Well, I see only those books that were nominated in 2023.
    • 0:23:12Let's try this again.
    • 0:23:14I might try not just 2023--
    • 0:23:17I might try 2022 like this.
    • 0:23:20Hit Semicolon.
    • 0:23:21Now, I'll see those books nominated in 2022.
    • 0:23:25I could keep going.
    • 0:23:26I could say why not 2022?
    • 0:23:28Why not 2021?
    • 0:23:29Now, I have all those books nominated in 2021.
    • 0:23:34So this is handy.
    • 0:23:37We can set things equal to or not equal to make some condition here.
    • 0:23:40And we also have others we could use.
    • 0:23:42We saw equals just now.
    • 0:23:44But we similarly have not equals.
    • 0:23:46And we have this kind of obscure operator down here.
    • 0:23:50This one being also equivalent to not equals as we'll see in just a minute.
    • 0:23:55But let me first ask now, what questions do
    • 0:23:58we have on how to use WHERE or using SELECT so far?
    • 0:24:02SPEAKER: Why are the subsets of SQL?
    • 0:24:04CARTER ZENKE: A good question.
    • 0:24:06Are there subsets of SQL?
    • 0:24:07So there are.
    • 0:24:09In fact, S-Q-L or SQL was defined by the,
    • 0:24:12I believe it's the ANSI, like standard corporation.
    • 0:24:15They have a whole set of the SQL language
    • 0:24:17that is like the official version of it.
    • 0:24:19You might be able to use some subset of that version
    • 0:24:22with the database manager system that you actually use.
    • 0:24:24So for SQLite, we're using a subset of SQL that works with SQLite.
    • 0:24:29Similarly, if you were using another software like PostgreSQL or MySQL,
    • 0:24:33you could use another subset there too.
    • 0:24:36Let's take another question from Tayas, perhaps.
    • 0:24:40SPEAKER: Then I want to know that, can we add 2022 and 2021 in a terminal?
    • 0:24:45CARTER ZENKE: Yeah, good question.
    • 0:24:47Could I, perhaps, try to filter by 2021 and 2022?
    • 0:24:52I could do that.
    • 0:24:53And we'll see that in just a moment here.
    • 0:24:55So let's keep going and exploring some other options with not equals.
    • 0:24:58And then we'll see how we can combine conditions using WHERE too.
    • 0:25:02So let's go back, and let's focus first on trying
    • 0:25:05to use these not equal operators.
    • 0:25:07We saw the exclamation point equals and this greater than, less than sign put
    • 0:25:12together.
    • 0:25:13So let's try a few of those.
    • 0:25:15Let's say I want to find books that are written by a certain author.
    • 0:25:20Well, I could use equals for that.
    • 0:25:21But let's say I also want to find books that are not
    • 0:25:24written in the hardcover format, like they
    • 0:25:26tend to be more expensive and so on.
    • 0:25:28So I don't want hardcover books.
    • 0:25:30Well, I could try a query like this.
    • 0:25:32I could say SELECT "title" and "format," where format is either hardcover
    • 0:25:38or paperback, FROM my "longlist" table.
    • 0:25:42And now I'll say, WHERE the "format" is not equal to hardcover Semicolon.
    • 0:25:49So notice here I'm using single quotes for hardcover.
    • 0:25:53This is a string.
    • 0:25:54It's not a table name or a column name.
    • 0:25:57It is just a string.
    • 0:25:58So I'm using single quotes here.
    • 0:26:00Everything else, though, like format, longlist, title, et cetera,
    • 0:26:05those are all table names or column names.
    • 0:26:08So again, I'll hit Enter.
    • 0:26:09And now I'll see that these are all in paperback according to my table.
    • 0:26:15I've omitted those that are hardcover.
    • 0:26:17Well, I could also use in this case, the greater than or less
    • 0:26:21than sign put together to say not equals as well.
    • 0:26:25Let me just hit the Up Arrow on my computer
    • 0:26:27to reveal what I just previously typed.
    • 0:26:29I'll then tab back over and say not exclamation point equals, but less than
    • 0:26:36and then greater than.
    • 0:26:37Hit Enter now.
    • 0:26:38And I should see the very same results.
    • 0:26:41But all I did was change this operator from exclamation point
    • 0:26:44equals to less than and greater than.
    • 0:26:47It tends to be that the exclamation point equals
    • 0:26:49is the more common operator in this case, but they do the very same thing.
    • 0:26:55Now, one more keyword I could use here that's worth mentioning
    • 0:26:59is this keyword called NOT.
    • 0:27:01So here, I was able to use exclamation point equals or the less
    • 0:27:05than or greater than sign to say not equals.
    • 0:27:08But I could also negate a condition using NOT.
    • 0:27:12So let's try using this one in our SQLite terminal too.
    • 0:27:16I'll go back over here.
    • 0:27:17And I'll bring back my SQLite terminal.
    • 0:27:21I'll say let's do not this operator here, but instead use NOT.
    • 0:27:27So I might, in front of WHERE, say WHERE NOT "format" equals "hardcover."
    • 0:27:35So I have a condition, "format" equals "hardcover,"
    • 0:27:38but now I'm going to negate it, take the opposite of it
    • 0:27:41and get back the very same results here too.
    • 0:27:46OK.
    • 0:27:48So to the question earlier, how we can combine these conditionals.
    • 0:27:52So let's try that here in just a minute.
    • 0:27:55So let's say I wanted to find the books that
    • 0:27:58were not just written in 2022 or 2023 alone, but all the books together.
    • 0:28:05Well for this, I could use a few other SQL keywords that
    • 0:28:08might be a little familiar to you too.
    • 0:28:11Let's try looking at some of these over here.
    • 0:28:14So here, we have these called AND, OR, and these parentheses here too.
    • 0:28:20So using AND and OR, I can change conditionals.
    • 0:28:25I can put them together to make a more complex conditional, a compound
    • 0:28:28condition.
    • 0:28:29And I could also use these parentheses to symbolize
    • 0:28:32that this condition should come first and then
    • 0:28:35some condition should come afterwards as well.
    • 0:28:37So let's try these in SQLite as well.
    • 0:28:40I'll go back to my computer.
    • 0:28:42And again, our goal is to find not just books in 2022 or 2023,
    • 0:28:47but books that work across those years as well.
    • 0:28:51So I'll, in this case say, SELECT--
    • 0:28:53let's go for a "title" and "author" from my "longlist" table.
    • 0:28:59Now, I'll say WHERE the "year" is 2022, as we did
    • 0:29:05before, OR perhaps the "year" is 2023.
    • 0:29:10And notice how my query is kind of wrapping around my terminal.
    • 0:29:14I could leave it like this, or if I backspace just a little bit,
    • 0:29:19I could hit Enter.
    • 0:29:21And now I'm on a new line to continue my query.
    • 0:29:25So here, I'll say OR "year" is 2023.
    • 0:29:30And now my query is done.
    • 0:29:32So I'll hit Semicolon.
    • 0:29:33And I should see those books published in or nominated in 2022 or 2023.
    • 0:29:42Let's try a few more here.
    • 0:29:44Let's try using our parentheses as well.
    • 0:29:46Maybe I want not just these books but also
    • 0:29:49those that are formatted in a hardcover format.
    • 0:29:52So I'll say--
    • 0:29:53I'll clear my terminal, Control-L again, and SELECT "title" as well as,
    • 0:30:00let's say, "format" from my "longlist" table.
    • 0:30:05And now, I'll hit my new line to extend my query
    • 0:30:09without wrapping it on my terminal.
    • 0:30:10I'll hit Enter.
    • 0:30:11And I'll say WHERE the "year" is 2022 OR the "year" is 2023.
    • 0:30:20That's one condition.
    • 0:30:22And I can denote that with a single set of parentheses here.
    • 0:30:26I also want it to be true that the "format" is not "hardcover."
    • 0:30:33So now, I'm adding another condition in here.
    • 0:30:36Now, I'll say Semicolon, hit Enter.
    • 0:30:39And I'll get back only those books that are published in the paperback
    • 0:30:43format in 2022 and 2023.
    • 0:30:49OK.
    • 0:30:50So let me pause again and ask if there are any questions so far
    • 0:30:53in how we've been using WHERE and SELECT and other conditions as well.
    • 0:31:00SPEAKER: I would like to know about the--
    • 0:31:05we can have a list of the top titles available in the database.
    • 0:31:10Like you have mentioned, the title, author, where we can know about that.
    • 0:31:15What are the titles available in the database using command below?
    • 0:31:19CARTER ZENKE: Yeah.
    • 0:31:20So here I've been using these column names called "title" and "author."
    • 0:31:23And I think your question is, how would I know that I have these columns?
    • 0:31:26Well, as we'll see in future times together,
    • 0:31:29I'll be able to actually look at the schema of my database, what
    • 0:31:32columns are inside of it.
    • 0:31:33For now, just take it on my own word that I
    • 0:31:37knew what was inside the database before I actually started querying it.
    • 0:31:40We'll see later on how you can take a database
    • 0:31:42and understand the columns you have there too.
    • 0:31:44A good question.
    • 0:31:45Let's jump into some more queries then.
    • 0:31:48I'll go back to my computer.
    • 0:31:49And let's see what else we could do with these conditions.
    • 0:31:53Well, not only could I try to make compound conditions,
    • 0:31:57I could also try to find, let's say, which data is missing.
    • 0:32:02So I know in this table, I have not just authors of books,
    • 0:32:06but translators of those books.
    • 0:32:09Often, books for the International Booker Prize
    • 0:32:11were translated from some other language into English--
    • 0:32:15but some weren't or at least they didn't have a translator
    • 0:32:18that was separate from the author.
    • 0:32:20So to think about what data is missing from our table,
    • 0:32:23we should introduce this new idea--
    • 0:32:25this one called NULL.
    • 0:32:28So I'll walk over here.
    • 0:32:29And we'll see that we have this type called NULL, where this
    • 0:32:33means that this value doesn't exist.
    • 0:32:35It's not in our database.
    • 0:32:37We can actually put together a condition around this idea of null, something
    • 0:32:42not being there.
    • 0:32:44We could use IS NULL to figure out if a value is null.
    • 0:32:47It's not there.
    • 0:32:48It's missing from our database.
    • 0:32:49Or IS IT NULL, meaning that it actually is there.
    • 0:32:54So I'll go back to SQLite and show you what we could
    • 0:32:56do with some of these concepts here.
    • 0:32:59Let me go back to my terminal.
    • 0:33:01And let's say I do want to find those translators that
    • 0:33:04don't exist in my database.
    • 0:33:06Well, I could use SELECT, let's say, "title" and "translator" from my
    • 0:33:13"longlist."
    • 0:33:14And I want to make sure that these translators are null--
    • 0:33:18they don't exist.
    • 0:33:19So I'll say WHERE "translator" IS NULL Semicolon.
    • 0:33:25Now, I'll hit Enter.
    • 0:33:26And I should see two books.
    • 0:33:28Titles are The Perfect Nine and the indictment of The Enlightenment
    • 0:33:31of The Greengate Tree.
    • 0:33:31But notice how over here, this value is null--
    • 0:33:35it doesn't exist in my table.
    • 0:33:38I could conversely find those books that do have translators using IS NOT NULL.
    • 0:33:43And I will try this one again.
    • 0:33:46But in this case, I'll say, WHERE "translator" IS NOT NULL Semicolon.
    • 0:33:53And I'll hit Enter.
    • 0:33:55And now, let me just zoom out a little bit.
    • 0:33:58I can see that I have both titles on the left-hand side
    • 0:34:03and translators on the right-hand side.
    • 0:34:05All of these actually exists.
    • 0:34:08These are books that did have translators in this case.
    • 0:34:11So a good way to find data that's missing in your table using
    • 0:34:14NULL or IS NOT NULL.
    • 0:34:17So let's come back over here and figure out what more
    • 0:34:19we can do with some of these queries.
    • 0:34:22We've kind of exhausted our work with some of our conditions,
    • 0:34:27like chaining them together and using NULL and so on.
    • 0:34:29But one more thing we could do is trying to use this idea of matching
    • 0:34:34some kind of pattern in my database.
    • 0:34:36So maybe I'm a book reader.
    • 0:34:40And I want to find a book with the word "love" somewhere in the title.
    • 0:34:44Well, for this, I could use another keyword--
    • 0:34:47this one called LIKE.
    • 0:34:49So LIKE is a good keyword to use when I want to roughly match
    • 0:34:54some string in my database.
    • 0:34:56Let's say I want to look at book titles and find
    • 0:34:59if some word exists in that title.
    • 0:35:01I could use LIKE for that.
    • 0:35:04And LIKE becomes powerful when you combine it
    • 0:35:06with these other operators, namely this percent sign and this underscore.
    • 0:35:10The percent sign can match any character around a string I give it.
    • 0:35:15And the underscore can match any single character
    • 0:35:18that I pass in with my string.
    • 0:35:20It's probably best shown with an example.
    • 0:35:23So let me show you some in my terminal here.
    • 0:35:26I'll walk back.
    • 0:35:27And, again, we'll try to find these books that
    • 0:35:29have "love" somewhere in the title.
    • 0:35:32So I'll say in this case, SELECT, let's say, "title" from "longlist."
    • 0:35:38But I don't want all titles.
    • 0:35:39I only want those that have "love" somewhere in this title.
    • 0:35:43So I'll say WHERE "title" LIKE, let's say, percent love percent Semicolon.
    • 0:35:53Now before I run this, let me explain what this is doing.
    • 0:35:57I have here a SELECT query asking for the "title"
    • 0:36:01column from my "longlist" table.
    • 0:36:04But I'll only get back those rows where "title" is LIKE percent love percent.
    • 0:36:11But what does that mean?
    • 0:36:12Well, the percent, remember, matches any string of characters.
    • 0:36:16It could match a, b, c, 1, 2, 3.
    • 0:36:18As long as any string of characters comes after and has "love,"
    • 0:36:22I could match that value here.
    • 0:36:25Similarly, the percent sign after says anything
    • 0:36:28that comes after "love," as long as "love" is somewhere in the middle.
    • 0:36:32So anything before, anything after, but so long as "love"
    • 0:36:36is just somewhere in there, I'll get it back.
    • 0:36:38So let me try that running this query then and come back over here.
    • 0:36:42I will hit Enter on my query.
    • 0:36:44And I'll see I get back four books--
    • 0:36:47Love in the Big City, More Than I Love My Life and so on.
    • 0:36:51So notice how if I come back over here that each of these titles
    • 0:36:54has "love" somewhere in it.
    • 0:36:56For this one, I match love up front and then
    • 0:37:00had any string of characters coming after it like this.
    • 0:37:02For this one, I had More Than I Love My Life.
    • 0:37:05I got some string before it.
    • 0:37:07And then afterwards, any string after it, "love" is somewhere in the middle.
    • 0:37:12Let me show you another example too, where
    • 0:37:14we use percent in a different way.
    • 0:37:17Let's say I want to find only those books that
    • 0:37:19have "The" at the very beginning of the title.
    • 0:37:23Let me try this.
    • 0:37:24I'll say a SELECT "title" from "longlist" then WHERE,
    • 0:37:30let's say, the "title" is LIKE the percent Semicolon.
    • 0:37:39Now, I've changed something up.
    • 0:37:41I have not percent in front and behind, but only after the "The."
    • 0:37:47So in this case, I'll get back not anything
    • 0:37:50that has "The" the title wherever, but now
    • 0:37:53at the very beginning of this string.
    • 0:37:56And I see perhaps a style mistake.
    • 0:37:59Let me ask the audience, what style mistake did
    • 0:38:01I just make when I typed in this query?
    • 0:38:05So I'm seeing maybe I used double quotes when I should have used single quotes.
    • 0:38:09So let me come back and fix that first.
    • 0:38:11I'll come back over here.
    • 0:38:12And, again, by convention, we tend to use single quotes for our strings.
    • 0:38:17So let me fix that right here.
    • 0:38:19And now let me run this query to see what we get back.
    • 0:38:22I'll hit Enter.
    • 0:38:23And I'll see only those books that begin with "The."
    • 0:38:28Now, let me show this query again, though.
    • 0:38:32This was our query here.
    • 0:38:34Knowing what we know about the percent sign,
    • 0:38:39what other titles might I accidentally get back by running this query?
    • 0:38:45I have "The" percent.
    • 0:38:47But what other words actually begin these book titles
    • 0:38:51if I were to run this query here?
    • 0:38:53We saw only those with "The," but if I had other book titles, what might
    • 0:38:57I get back?
    • 0:38:58So I might get back those book titles that
    • 0:39:00have not just "The" the beginning but also, let's say,
    • 0:39:04"There" or "They" or so on.
    • 0:39:06There are many words beginning with T-H-E.
    • 0:39:08And if I had the percent sign right after it,
    • 0:39:10I might match one of those words like Y or R-E or so on.
    • 0:39:15So I didn't have any of those titles in this database.
    • 0:39:17But you can imagine a different database where I have that kind of data.
    • 0:39:22Let's fix this then.
    • 0:39:24I say, I want to match not just "The" but "The" and a space,
    • 0:39:28and match any characters after that to make this
    • 0:39:31query better designed in this instance.
    • 0:39:35OK.
    • 0:39:36So let me pause here then and ask what questions we
    • 0:39:38have on using percent along with LIKE.
    • 0:39:44SPEAKER: Can we use two percent signs between two words?
    • 0:39:48CARTER ZENKE: Yeah, I think so.
    • 0:39:49So let me go back to my terminal here and let me try to answer this question.
    • 0:39:52So can we use two percent signs to say something's between two words?
    • 0:39:56So let me say this.
    • 0:39:57I'll go with SELECT, let's say, "title" from "longlist" WHERE maybe
    • 0:40:06the "title" has something like--
    • 0:40:10maybe it begins with "The."
    • 0:40:11So I'll say WHERE "title" LIKE "The."
    • 0:40:17And then I'll take any set of characters afterwards.
    • 0:40:20But I want to have "love" also in here too.
    • 0:40:23So I'll say "love."
    • 0:40:25And then I'll take any other characters after it, percent again, single quote,
    • 0:40:31Semicolon.
    • 0:40:32Now, I don't know if I have any books that actually fit this search.
    • 0:40:37But I could say the percent love percent to mean give me
    • 0:40:41back any title that has "The" at the beginning
    • 0:40:44then any words or characters then "love" then any words or characters
    • 0:40:48after that.
    • 0:40:49I'll hit Enter.
    • 0:40:50And I see I don't have any books that fit that kind of description.
    • 0:40:54But if I did, I would see them here with "The" at the beginning and "love"
    • 0:40:58somewhere in the title.
    • 0:41:01All right.
    • 0:41:02So let's keep going.
    • 0:41:03Let's focus not just on this percent sign, but also on this underscore.
    • 0:41:08So if I want to find--
    • 0:41:09let's say, I don't know what a particular character is in my title.
    • 0:41:13I could use this underscore to match any particular character.
    • 0:41:17Not any string of characters, but any single character too.
    • 0:41:20So let's try this in our terminal.
    • 0:41:23And there is a book, this one called a Pyre.
    • 0:41:26And I actually keep forgetting how it's spelled.
    • 0:41:29I don't know whether it's P-I-R-E or P-Y-R-E. It could be either one,
    • 0:41:34but I want to find it in my database.
    • 0:41:36So let me try this.
    • 0:41:37I'll say SELECT, let's say, "title" FROM "longlist," WHERE, in this case,
    • 0:41:44"title" is LIKE--
    • 0:41:46well, I know that it starts with a P. And I
    • 0:41:49don't know if this is an I or a Y. But I'll at least leave it as an underscore
    • 0:41:54now to say it could be any character here.
    • 0:41:56Then I'll say R-E single quote, Semicolon.
    • 0:42:02And now I could try hitting Enter.
    • 0:42:04And I'll see I get back this title called Pyre.
    • 0:42:08So notice in this case that the underscore is matching
    • 0:42:12literally any single character.
    • 0:42:14This could be a Y. It could be an I. But in this case, I have this Y here.
    • 0:42:20OK.
    • 0:42:20Let me go back.
    • 0:42:21And let's actually ask in this case what questions we
    • 0:42:23have on using LIKE with these single underscores if any.
    • 0:42:28SPEAKER: Yeah.
    • 0:42:28So Carter, I wanted to ask you, that as you use the underscore sign here,
    • 0:42:33so for multiple characters, can we use multiple underscores in order
    • 0:42:37to find something in the database?
    • 0:42:39CARTER ZENKE: A great question.
    • 0:42:40Could we use more than one underscore to try to find
    • 0:42:43some characters in our database?
    • 0:42:45You absolutely could.
    • 0:42:45So let me try that myself.
    • 0:42:47I'll go back to my terminal.
    • 0:42:49And let me try to find a book title this could work with.
    • 0:42:52I'll say maybe SELECT "title" from "longlist"
    • 0:42:56to get back all the books in this table.
    • 0:42:58I'll hit Semicolon.
    • 0:42:59And maybe I will go with--
    • 0:43:05let's try this one called Tyll.
    • 0:43:08Well, maybe I want to just find the titles that have a Y or an I in here,
    • 0:43:13but I also don't know if it's one L or maybe two L's, for instance.
    • 0:43:17Let me go back and try this.
    • 0:43:18I'll say SELECT, let's say, "title" FROM "longlist" WHERE my "title" is LIKE--
    • 0:43:26I know it begins with a T. I don't know if this is a Y or an I.
    • 0:43:31And maybe I know that it has maybe one or two characters after it.
    • 0:43:36So I'll try this one.
    • 0:43:37Now I have three underscores, single underscore.
    • 0:43:40So match any book title that has T and then any three
    • 0:43:45individual characters after it.
    • 0:43:47I'll hit Enter.
    • 0:43:48And I'll see I get back Tyll.
    • 0:43:50This is the only title that has T and then three characters after it.
    • 0:43:56I could try to get better with this.
    • 0:43:57I could say maybe I'll accept five or six characters like that.
    • 0:44:00Hit Enter.
    • 0:44:02And I'll see-- whoops, I didn't compute my query here.
    • 0:44:04Let me just try it from the top again.
    • 0:44:06I'll say SELECT "title" from "longlist" where
    • 0:44:08title is like, let's say, T-Y underscore, underscore,
    • 0:44:12underscore, underscore.
    • 0:44:13Hit Semicolon.
    • 0:44:16And now I get no matches.
    • 0:44:18So there is no book in the database that has T-Y
    • 0:44:22and then, let's say, three or four underscores for any character after it.
    • 0:44:28OK.
    • 0:44:28So this covers our use of LIKE, but let's
    • 0:44:32keep going and building more complex conditions
    • 0:44:35to find even more answers to questions we have about this data over here.
    • 0:44:40Let me think what we should show next.
    • 0:44:43We've seen LIKE.
    • 0:44:44We've seen some compound conditionals.
    • 0:44:46Well, let's go back to trying to find books that are in a certain year.
    • 0:44:51So we saw earlier we had this kind of query.
    • 0:44:54We could say SELECT "title" and "year" FROM, let's say, our "longlist."
    • 0:45:00Now I can try to find those books that are written or nominated
    • 0:45:04in 2022 and 2023.
    • 0:45:06But let's say I want to go further.
    • 0:45:08I want those from 2019 to 2022, a span of multiple years here.
    • 0:45:15So I could try it like this, WHERE "year" equals--
    • 0:45:18let's go ahead and say 2019--
    • 0:45:21OR "year" is 2020 OR "year" is 2021.
    • 0:45:28And let me make a new line again.
    • 0:45:30OR "year" is 2022 Semicolon.
    • 0:45:36And before I run this query, let me ask our audience,
    • 0:45:39what strikes you as being not very well designed about this query?
    • 0:45:44What could I be doing better here?
    • 0:45:47So I'm seeing maybe one improvement is that I don't need to write out
    • 0:45:52OR "year" is this, OR "year" is that.
    • 0:45:55I could probably do better with this.
    • 0:45:57And let's introduce some new keywords for working
    • 0:46:00with ranges in terms of our conditions.
    • 0:46:04So here, we can see some new operators to use.
    • 0:46:08We have this greater than sign, this less than sign,
    • 0:46:12greater than or equal to, and less than or equal to.
    • 0:46:15And we can use these to build ranges inside of our queries to say,
    • 0:46:19I want something to be greater than this number or less than this number too.
    • 0:46:24And we can combine these with AND and our OR
    • 0:46:27we saw before to get back in this case some set of rows
    • 0:46:31that match what we intend to find.
    • 0:46:34So let me go back and try some of these out.
    • 0:46:36I'll try to improve the design of this query.
    • 0:46:39Let me first run it, and we'll see we do get back 2019 to 2022.
    • 0:46:46But I could probably do better.
    • 0:46:47So let's try using our new operators here
    • 0:46:50that can give us some range capabilities.
    • 0:46:53I'll say SELECT "title" and also SELECT "year" from "longlist."
    • 0:46:59But now I want those rows where the year is greater than or equal to 2019,
    • 0:47:07and the year is less than or equal to 2022 Semicolon.
    • 0:47:13I'll hit Enter.
    • 0:47:14Notice I'll get the very same results.
    • 0:47:17So I get all those same rows.
    • 0:47:19But now my query is much smaller.
    • 0:47:21It's making use of these range operators I've seen so far.
    • 0:47:26I could even further improve this.
    • 0:47:28I could make this a little better designed to.
    • 0:47:31Let me go back to some slides and show you we could use these keywords
    • 0:47:34BETWEEN blank AND blank, where this can be some condition
    • 0:47:39or some number in this case.
    • 0:47:41I could say between, let's say, 2019 AND 2022.
    • 0:47:46This is inclusive.
    • 0:47:47So if I say 2019 AND 2022, I'll get back a query that includes 2019 and 2022.
    • 0:47:56So let me try this one.
    • 0:47:57I'll go back over here.
    • 0:47:59And I will now try SELECT "title" and "year"
    • 0:48:04from "longlist," WHERE the "year" is between 2019 AND 2022.
    • 0:48:12Same results now, which is a different way of writing this same query.
    • 0:48:17Now what else can we do with these ranges?
    • 0:48:21Well, as we've said before, these books actually have some ratings involved.
    • 0:48:26These ratings are crowdsourced from Goodreads,
    • 0:48:29a site you can review books online.
    • 0:48:31And I want to find maybe the books that have a rating of 4.0 or higher.
    • 0:48:35Well, I could do that now with my ranges I could say SELECT "title" and "rating"
    • 0:48:41from my "longlist."
    • 0:48:43And I could say WHERE the "rating" is greater than 4.0 Semicolon.
    • 0:48:48I'll hit Enter.
    • 0:48:49And I'll see now only those books that have a rating of 4.0
    • 0:48:55or higher like this.
    • 0:48:58I could even combine conditions.
    • 0:49:00So I know that these books have a certain rating,
    • 0:49:03but how many votes do they really get?
    • 0:49:05Well, let's take a peek.
    • 0:49:07I'll come back over here.
    • 0:49:08And let me try this one.
    • 0:49:10I could say SELECT "title."
    • 0:49:13Oops, let me clear my terminal again so it's back up top.
    • 0:49:17SELECT "title" and "rating" and the number of
    • 0:49:22votes that these books got from, let's say, our "longlist" table.
    • 0:49:28Now, I want to find those that have a rating of greater than 4.0,
    • 0:49:34and, let's say, a number of votes--
    • 0:49:38a number of votes which is greater than at least--
    • 0:49:41let's go with 10,000.
    • 0:49:43So at least we know a good number of folks actually voted on these books
    • 0:49:46to find the best among them.
    • 0:49:48I'll Enter.
    • 0:49:49And now I'll see we're only down to a few books,
    • 0:49:53four in fact, where each one has a rating higher than 4.0.
    • 0:49:58And indeed, every vote row has a vote total greater than 10,000 in this case.
    • 0:50:06So a good way to try to find the top books in our data set here.
    • 0:50:12Let's keep going with these ranges.
    • 0:50:13And let's think about one more thing we could do.
    • 0:50:17Maybe I want to find books that are less than a certain length.
    • 0:50:20So I'll try that as well.
    • 0:50:21I'll say SELECT, let's say, "title" and "pages" from my "longlist."
    • 0:50:27And now I can make a condition based on pages.
    • 0:50:30I'll say WHERE "pages" is less than 300.
    • 0:50:33Hit Enter.
    • 0:50:34And now I should see that I have all these books that
    • 0:50:37are less than 300 pages long when they were first published.
    • 0:50:43So let's pause here and ask what questions
    • 0:50:45we have on these range conditions.
    • 0:50:49SPEAKER: I just wanted to check if for a proper query in this case
    • 0:50:55to be able to run operations, they have to be integers in the database.
    • 0:51:00And my second question is for when we're matching a string,
    • 0:51:04is it case-sensitive or not?
    • 0:51:06CARTER ZENKE: Yeah, two great questions.
    • 0:51:07So the first one here is going to be, do I have to use integers in this case?
    • 0:51:12And what types, maybe, should I use?
    • 0:51:14And the second one being, could I match strings like case insensitively?
    • 0:51:17So for the first one, in this case, it'll
    • 0:51:20depend on the design of your database.
    • 0:51:23So we'll see later on in the course, how we
    • 0:51:25can choose the types for our columns.
    • 0:51:27And how that might impact the types we actually use in our queries.
    • 0:51:31For now, I made this database.
    • 0:51:33So I just know that my year column is an integer,
    • 0:51:36my ratings column is a real number or a float, if you're familiar,
    • 0:51:40and my votes is an integer.
    • 0:51:41So I just know to use those numbers there.
    • 0:51:43To the question of matching things case-insensitively,
    • 0:51:47let's actually revisit LIKE just briefly here to show you what that can do.
    • 0:51:50So I go back to my terminal, and let's say I want to find just a book title.
    • 0:51:58And I want to type it in kind of sloppily.
    • 0:51:59I don't want to capitalize it like capital books are.
    • 0:52:02So I'll say SELECT let's say "title" from "longlist."
    • 0:52:06And maybe, I'll want to find that book Pyre again.
    • 0:52:10So I could say WHERE "title" is LIKE 'pyre', but all in lowercase.
    • 0:52:17Now I'll hit Enter.
    • 0:52:19And I'll see I do get back Pyre.
    • 0:52:21So even though I said WHERE "title" is LIKE lowercase 'pyre', I got back
    • 0:52:27capital Pyre.
    • 0:52:29Now this is in contrast to saying WHERE "title" equals lowercase 'pyre'.
    • 0:52:35Let's try that.
    • 0:52:36I'll come back over here.
    • 0:52:37And I'll say again SELECT "title" from "longlist,"
    • 0:52:43but now WHERE "title" equals, quote, unquote, 'pyre' Semicolon.
    • 0:52:52I'll hit Enter.
    • 0:52:53And now I see no results.
    • 0:52:55So in this case, the equal is going to be case-sensitive.
    • 0:52:59Case matters in this case, but LIKE is case-insensitive.
    • 0:53:05OK.
    • 0:53:06Why don't we keep going then?
    • 0:53:09And let's take a look at a few other things
    • 0:53:11we can do with these SQL keywords for querying.
    • 0:53:15Well, earlier, we were trying to find a way
    • 0:53:18to find the best books in our data set.
    • 0:53:21And we did that by filtering them based on some ranges.
    • 0:53:24But we could probably do that a little bit more
    • 0:53:27methodically in this case using a new keyword, this one called ORDER BY.
    • 0:53:33So ORDER BY allows us to take the results of our query
    • 0:53:37and order them, as it suggests, by some column itself.
    • 0:53:43So we could put them in alphabetical order or in order by a number of votes
    • 0:53:47or in order by number of ratings.
    • 0:53:50And let's just try an example of this to see how ORDER BY works for us.
    • 0:53:54But in the end, we'll see it can arrange columns, arrange rows for us
    • 0:53:58in our resulting query.
    • 0:54:00So I'll go back to my computer.
    • 0:54:02And let's try this question here.
    • 0:54:04I want to try to find the top 10 books in my table.
    • 0:54:09So I'll say SELECT "title" and "rating" from "longlist."
    • 0:54:16Enter.
    • 0:54:18Not only query yet though.
    • 0:54:19Now I'll say ORDER BY the rating.
    • 0:54:24And let's only take the top 10.
    • 0:54:26So I'll say limit 10 in this instance Semicolon.
    • 0:54:31So now I've combined some of my prior keywords.
    • 0:54:34I'm using SELECT.
    • 0:54:35I'm using ORDER BY.
    • 0:54:36And I'm still using our old friend LIMIT.
    • 0:54:38So let me hit Enter here.
    • 0:54:40And I'll get back, well, not quite the top 10.
    • 0:54:46I see rating of 3.05 here and rating of 3.42 down here.
    • 0:54:52So based on this, what do you think the default ordering of ORDER BY is?
    • 0:55:00So you might be from least to greatest.
    • 0:55:02So we saw here that we have rating being pretty small,
    • 0:55:05but we said ORDER BY our rating.
    • 0:55:07So it starts from small and goes down to large.
    • 0:55:10So we need to fix this in some way.
    • 0:55:12And let's introduce a new addition to ORDER BY
    • 0:55:15to have us fix this query overall.
    • 0:55:18So let me show you that ORDER BY does by default sort from least to greatest.
    • 0:55:24But let's try some addition here.
    • 0:55:27We have not just ORDER BY but ORDER BY some column
    • 0:55:31and then ascending or descending.
    • 0:55:35So ascending is the default. It means from least to greatest.
    • 0:55:39Descending, we can specify meaning from greatest to smallest.
    • 0:55:44So let's try using ORDER BY but now with this other keyword
    • 0:55:47called DESC for descending here.
    • 0:55:50I'll go back to my terminal.
    • 0:55:53And let's rewrite this query to include DESC.
    • 0:55:57I'll say SELECT "title" and "rating" from "longlist."
    • 0:56:02And let me-- before I run this query, let
    • 0:56:04me just clear my terminal so it's back up at the top.
    • 0:56:06I'll backspace this.
    • 0:56:08And then a moment here, I'll press Control-L. Now I'm back at the top.
    • 0:56:13I'll say SELECT "title", SELECT "title" and "rating" from "longlist" WHERE--
    • 0:56:21actually, not WHERE.
    • 0:56:22We're not filtering yet.
    • 0:56:23I'll say ORDER BY rating but not by ascending by default--
    • 0:56:30going from least to greatest.
    • 0:56:31I want greatest to least.
    • 0:56:33So I'll say DESC here.
    • 0:56:35Now, I can say LIMIT 10 Semicolon.
    • 0:56:38Hit Enter.
    • 0:56:39And now I'll see the top 10 books.
    • 0:56:42Here, I have The Eighth Life coming in at 4.52 and The Books of Jacob
    • 0:56:47coming in at 4.06.
    • 0:56:49So now we're going from greatest to smallest.
    • 0:56:53Well, I could order by not just these ratings,
    • 0:56:56but also by the number of votes.
    • 0:56:59It seems there's a tie to break here.
    • 0:57:00If I look at Still Born and When We Cease to Understand the World,
    • 0:57:04those both have a rating of 4.14.
    • 0:57:07But presumably, one book has maybe more votes than the other.
    • 0:57:12So I could try to break this tie by ordering not just by rating but also
    • 0:57:18by votes, the number votes this book actually received on Good Reads.
    • 0:57:22So let's try that then to break this tie.
    • 0:57:25I'll come back over here.
    • 0:57:26And I'll try this query now.
    • 0:57:29I'll say, again, SELECT "title" and "rating" from "longlist."
    • 0:57:35Now I'll order by, first, the rating column in descending order.
    • 0:57:41But I also want to order by the number of votes after I order by rating.
    • 0:57:47So I'm saying first order by rating, but afterwards,
    • 0:57:50followed by a comma, let's order by the number of votes
    • 0:57:54also in descending order-- from greatest to smallest.
    • 0:57:58Now I'll just continue my query on the next line.
    • 0:58:01And I'll say LIMIT 10 Semicolon.
    • 0:58:04This then gives me, let's see, the books but now
    • 0:58:09they're going to be in the order that allows us to see the number of votes.
    • 0:58:12Let me just actually refine this.
    • 0:58:14Let me say not just title and rating, let's make
    • 0:58:17sure we can see the votes here too.
    • 0:58:18So SELECT "title" and "rating" and "votes" from "longlist."
    • 0:58:23Hit Enter on my query.
    • 0:58:24Now I'll say ORDER BY "rating" and "votes."
    • 0:58:27Then I'll say LIMIT 10.
    • 0:58:29And here, I'm just getting the Up Arrow on my computer.
    • 0:58:32I'll hit Enter.
    • 0:58:33And now I'll see the votes included.
    • 0:58:36So let me show you this on the big screen.
    • 0:58:38Here, we see that the tie is broken.
    • 0:58:40So when we cease to understand the world,
    • 0:58:42these both have 4.14 along with Still Born.
    • 0:58:46But here, this book has more votes and so is higher in our order
    • 0:58:51now that we've ordered by multiple columns.
    • 0:58:55So let me pause here and ask what questions
    • 0:58:58we have on ordering with data.
    • 0:59:01Ordering by one column or multiple, and how we can sort data like this.
    • 0:59:07SPEAKER: Sir, I want to know that can we write rating to 4.93 to 4.9?
    • 0:59:14CARTER ZENKE: Yeah good question.
    • 0:59:15I think if I understand you correctly, how can we select a rating
    • 0:59:18or try to find a rating that's like equal to 4.92 or things like that.
    • 0:59:22Let's try that here.
    • 0:59:23So if I want to find a particular rating,
    • 0:59:26I could simply use my WHERE friend from before.
    • 0:59:30I could say SELECT, let's say, "title" and "rating."
    • 0:59:34And maybe I could try to find a particular rating for a book
    • 0:59:37from longlist.
    • 0:59:39I could say then WHERE this rating is equal to, let's say, 4.932 Semicolon.
    • 0:59:46If this book exists, it will get it back here.
    • 0:59:49So I'll hit Enter.
    • 0:59:50And I see there's no book with this particular rating, 4.392.
    • 0:59:55But good question for how to find particular ratings for our books here.
    • 1:00:01OK.
    • 1:00:02Other questions too on how we've been able to sort our data and use ORDER BY?
    • 1:00:09SPEAKER: Will descend work on a string on an alphabetic basis?
    • 1:00:13Or do we need to have special conditions for alphabetic characters?
    • 1:00:18CARTER ZENKE: Yeah, a great question.
    • 1:00:19So how could we use ORDER BY with some characters or strings or some text
    • 1:00:23in our database?
    • 1:00:24Let's try that one out too and see how that works with ASC
    • 1:00:27for ascending and DESC for descending.
    • 1:00:30So I'll go back to my terminal.
    • 1:00:32And I'll demonstrate here how we can use this for some text.
    • 1:00:35So let's try to simply sorting our books alphabetically for, let's say,
    • 1:00:40our library.
    • 1:00:41I'll say SELECT "title" from "longlist," Enter.
    • 1:00:46And I want to order by title, just plain and simple.
    • 1:00:50And then hit Semicolon.
    • 1:00:51Let's see what happens.
    • 1:00:52I'll hit Enter.
    • 1:00:54And now I'll see that these books are ordered.
    • 1:00:57But they seem to be ordered alphabetically.
    • 1:00:59So here, we have some titles lower in the alphabet.
    • 1:01:03And up here, we have titles earlier in the alphabet.
    • 1:01:06So by default ORDER BY seems to order alphabetically.
    • 1:01:10If I change that default, though, from ascending to descending,
    • 1:01:15let's see what happens.
    • 1:01:16I'll go back over here.
    • 1:01:18And I'll try the same query but now using DESC.
    • 1:01:22SELECT "title" from "longlist."
    • 1:01:24ORDER BY "title" now in descending order.
    • 1:01:29Hit Enter.
    • 1:01:30And now I'll see these titles in reverse alphabetical order.
    • 1:01:34So notice how earlier on, we have titles that are lower in the alphabet.
    • 1:01:39But down below, we have titles that are earlier in the alphabet here.
    • 1:01:42So you can use ORDER BY with these texts.
    • 1:01:46But you then have to specify whether you want it in alphabetical order
    • 1:01:49or in reverse alphabetical order.
    • 1:01:53OK, so let's show a few other concepts here
    • 1:01:58we can use alongside of these orderings.
    • 1:02:01One thing we could also do is try to find more information
    • 1:02:06about the ratings of these books.
    • 1:02:07So let's say I want not just to order these books
    • 1:02:10but try to find the average rating, or to try to find the number of books,
    • 1:02:15or try to find let's say maybe the sum of my total votes on each of my books.
    • 1:02:21Well, for this, we could introduce some new concepts,
    • 1:02:24these ones called SQL's aggregate functions.
    • 1:02:29These allow us to take a whole set of rows and return not each of those rows
    • 1:02:35individually.
    • 1:02:36But instead, in this case, one number based on the values in those rows.
    • 1:02:42You could imagine trying to count the number of rows you have
    • 1:02:45or take the average of the number of rows,
    • 1:02:47or take the average of let's say a rating, for instance.
    • 1:02:50Finding the minimum rating of the maximum rating
    • 1:02:52or finding the sum of some votes.
    • 1:02:55And we'll see each of these in action here.
    • 1:02:58Let's go back to our terminal, try some of these out.
    • 1:03:01I will try, in this case first trying to find
    • 1:03:05the average rating from my longlist.
    • 1:03:08Well, I just from experience, and as you now know too,
    • 1:03:12I can try to find the average of some column
    • 1:03:14by using the AVG aggregate function.
    • 1:03:18So I'll say SELECT not just rating in this case
    • 1:03:22but select the average rating FROM "longlist."
    • 1:03:27Notice how in this case, I'm using this kind of syntax, where I take rating,
    • 1:03:33my column I want to aggregate or to sum up or to average like this.
    • 1:03:38And I apply the function by saying its name
    • 1:03:41followed by some parentheses around that column name.
    • 1:03:45So this will return to me not all of the rating
    • 1:03:47rows but the average of the rating rows in one single cell.
    • 1:03:52Let me try this.
    • 1:03:54I'll come back, and I will then hit Enter.
    • 1:03:56And I'll see this is the average rating.
    • 1:04:00We have 3.7537179471795 is our average rating for all of these books.
    • 1:04:08But of course, this isn't great.
    • 1:04:11What might I want to do if I was going to show this to somebody else?
    • 1:04:15I could probably improve the presentation of this in some way.
    • 1:04:20So I could probably round this result. I have 3.75371,
    • 1:04:25we can probably stop after two decimal points, right?
    • 1:04:28Just simply like 3.75.
    • 1:04:30So I could introduce some new keyword here, this one around the results.
    • 1:04:35Let me show you this one in action.
    • 1:04:37I'll come back, and I'll try not just select average rating
    • 1:04:41but select the rounded average rating.
    • 1:04:44So I'll say SELECT ROUND and then take average of "rating" and round
    • 1:04:51to 2 decimal points FROM "longlist" Semicolon.
    • 1:04:58So now this query decides to first find the average of the rating column.
    • 1:05:04Then take the result and round it using two decimal points.
    • 1:05:10Notice how round takes two inputs or two arguments, the first one
    • 1:05:14being the rating, the average rating, the second one being number
    • 1:05:18of decimal points to round, 2.
    • 1:05:20And we complete our query in the way we usually do by saying FROM this table.
    • 1:05:24So let's try this one to figure this out.
    • 1:05:26I'll come back.
    • 1:05:27And I'll hit Enter.
    • 1:05:29And now I'll see we do get back 3.75.
    • 1:05:33But there is still one thing to improve here.
    • 1:05:35When I write this query, I see this ugly title name--
    • 1:05:39ROUND average "rating" comma 2.
    • 1:05:41I wouldn't send this to my boss or somebody else who I
    • 1:05:44work for or maybe even a friend, right?
    • 1:05:46I want to make sure it's pretty so they can read it correctly.
    • 1:05:49So what could I do then to try to make this prettier?
    • 1:05:53I could maybe rename this column.
    • 1:05:55I could try to take this and make it not just this ugly mess of SQL
    • 1:05:59keyword but to give it some name I could use instead.
    • 1:06:02So for this, we'll introduce a brand new one--
    • 1:06:05new brand new keyword called AS.
    • 1:06:07Let's try this one too.
    • 1:06:09I'll come back, and I'll say SELECT, again, ROUND average "rating" comma 2.
    • 1:06:19But now, I'll select it AS, let's say, average rating.
    • 1:06:25And now before I actually finish this query,
    • 1:06:27let me try to bring it up to the top my terminal so we can see it all in one
    • 1:06:30go.
    • 1:06:30I'll backspace this.
    • 1:06:32And I'll say SELECT the rounded version of the average rating rounded
    • 1:06:40to 2 decimal points AS--
    • 1:06:42let's call this one "average rating."
    • 1:06:45Now hit Enter.
    • 1:06:46And I'll say FROM my "longlist" table Semicolon.
    • 1:06:51Now, I see it's much prettier overall.
    • 1:06:53I have no longer these SQL keywords but instead
    • 1:06:57just average rating as my column name.
    • 1:07:01OK.
    • 1:07:02So let me pause here and ask questions then
    • 1:07:04on using average or using ROUND or using AS in these cases.
    • 1:07:12SPEAKER: I'm wondering, do these sorts of commands have a funct--
    • 1:07:16are these commands encode something like data types or just encode something?
    • 1:07:21Do these commands also have a name?
    • 1:07:23CARTER ZENKE: Yeah.
    • 1:07:24And can I ask are you referring to the AVG like COUNT like SUM
    • 1:07:28those kinds of things, or?
    • 1:07:29Yeah.
    • 1:07:30So these functions do have a name.
    • 1:07:33They are called aggregate functions.
    • 1:07:34And aggregate means to combine, to bring together.
    • 1:07:37So they're called aggregate functions because they take some number of rows,
    • 1:07:41like all my ratings for instance, and bring it down to one single cell,
    • 1:07:45like the average or the sum or the count.
    • 1:07:48So if you look up or read more about SQL aggregate functions,
    • 1:07:52you'll see all of these and perhaps some more
    • 1:07:54depending on the software you're using.
    • 1:07:57OK.
    • 1:07:58So let's keep going then and try to start counting some other rows
    • 1:08:02and use our other aggregate functions here.
    • 1:08:04I'll go back to my terminal.
    • 1:08:06And so far, we've seen average, as well as we have seen ROUND and so on.
    • 1:08:13But why don't I try to find the maximum or the minimum rating in my table?
    • 1:08:18I'll say SELECT, let's say, the MAX "rating" FROM my "longlist" Semicolon.
    • 1:08:25Hit Enter.
    • 1:08:26Now I see the highest rated book had a rating of 4.52.
    • 1:08:31Well, what about the minimum rating?
    • 1:08:33I could use MIN here too.
    • 1:08:34I could say SELECT, let's say, MIN of my "rating" column FROM my "longlist"
    • 1:08:40table.
    • 1:08:41I'll hit Semicolon.
    • 1:08:42And I'll see it 3.05.
    • 1:08:44It is the lowest-rated book I have in this set.
    • 1:08:48Well, as we've seen, let me try to view this for you all.
    • 1:08:53I could say SELECT "title" and "votes" FROM my "longlist."
    • 1:08:59SELECT "title" and "votes" from my "longlist."
    • 1:09:01Here, I have many books with many user-generated votes.
    • 1:09:06Maybe people on the internet decided to rate this book out of 5
    • 1:09:09and maybe Go, Went, Gone got about, let's say, 592 votes.
    • 1:09:15So I'm curious then, how many total votes do I have in my data set?
    • 1:09:20Well, for that, I could use the SUM aggregate function.
    • 1:09:23Try to count up each one of these rows and/or turn it back to me
    • 1:09:27in a single cell.
    • 1:09:29So I'll use SUM here.
    • 1:09:30I'll come back and I'll say I want to find the sum of my votes column.
    • 1:09:36I'll say SELECT, let's say, the SUM of my "votes" column
    • 1:09:42FROM my "longlist" table.
    • 1:09:45And then I'll just Enter--
    • 1:09:46Semicolon Enter.
    • 1:09:47And I'll see over 600,000 people offer to vote
    • 1:09:52for each of these books that were longlisted for the International Booker
    • 1:09:56Prize.
    • 1:09:58Now there'd be a few more here.
    • 1:10:00Let's check out what else we have left to do and our aggregate functions.
    • 1:10:03We could also try to count up just the number of books in our data set.
    • 1:10:08So why don't I try to find the number of rows I have?
    • 1:10:11For that, I could use count.
    • 1:10:14And often to find the number of rows in your data set,
    • 1:10:17you might use count and star as we saw a little earlier.
    • 1:10:20I could say SELECT COUNT star from "longlist"
    • 1:10:25And this means, again star means give me every row and every column.
    • 1:10:30Give me basically my whole table, right?
    • 1:10:32And if I say COUNT star, that means count up the number of rows
    • 1:10:36that I have in my database.
    • 1:10:39So I'll say COUNT star from "longlist."
    • 1:10:41And I get back 78 books in this database.
    • 1:10:45Well, let me try counting up the number of translators here.
    • 1:10:48I'll say SELECT, let's say, COUNT of "translators" FROM "longlist"
    • 1:10:54Semicolon.
    • 1:10:55Hit Enter.
    • 1:10:57And now I see 76.
    • 1:11:00So I have 78 books, but if I count translators, I have 76 of them.
    • 1:11:08So why might that be?
    • 1:11:10Free to raise your hand and try to guess at this.
    • 1:11:14Why do I have 78 rows but 76 translators?
    • 1:11:19SPEAKER: Hi.
    • 1:11:20I actually had raised my hand for the question.
    • 1:11:22CARTER ZENKE: Yeah, go ahead.
    • 1:11:23SPEAKER: Also, I wanted to know whether the MAX
    • 1:11:26and the MIN functions can be used for finding the longest or the shortest
    • 1:11:30string as well?
    • 1:11:30Or do you have a different command for that?
    • 1:11:32CARTER ZENKE: Good question.
    • 1:11:33Could we use MAX and MIN to find the longest or shortest string?
    • 1:11:37That's a good question.
    • 1:11:38So let's actually pause on this counting here and try that out real quick.
    • 1:11:41So I'll come back to my terminal.
    • 1:11:43And let me try to use MAX and MIN with some book titles.
    • 1:11:47So I'll say Select--
    • 1:11:50let's select the MAX title.
    • 1:11:52And at the same time, why don't we select the MIN title as well?
    • 1:11:57And I'll select these from my "longlist" table.
    • 1:12:00I'll hit Semicolon.
    • 1:12:02And now, let me try this out.
    • 1:12:05I'll get back Wretchedness and A New Name--
    • 1:12:09Septology VI-VII.
    • 1:12:12Now, there's a few hypotheses here.
    • 1:12:14It does seem that our MAX "title" is shorter than our MIN "title."
    • 1:12:19So it's probably not that MAX gives us the length of the string,
    • 1:12:23but what do you notice?
    • 1:12:25Well, I see MIN is really early on in the alphabet.
    • 1:12:28It has an A here, whereas MAX has a W, pretty low in the alphabet.
    • 1:12:34And I would bet if we ordered these book titles,
    • 1:12:37we would see a new name up at the very top and a Wretchedness, the book here,
    • 1:12:43down at the bottom.
    • 1:12:44So MAX seems to give us the lowest alphabetically, which
    • 1:12:48is kind of contradictory with titles here or strings.
    • 1:12:51And MIN gives us the earliest in the alphabet using this A as well.
    • 1:12:58OK.
    • 1:12:58So a good question.
    • 1:12:59Let's come back to our counting here.
    • 1:13:01Let's go back to my terminal.
    • 1:13:03And again, we had, in this case, 78 rows, but only 76 translators.
    • 1:13:09So, again, if I did SELECT COUNT star from, let's say,
    • 1:13:12"longlist" then Semicolon, I get back 78.
    • 1:13:17But if I do SELECT COUNT of "translator" from "longlist," I get back 76.
    • 1:13:25And let me ask again, why do we have 78 rows but 76 translators?
    • 1:13:31Feel free to say it.
    • 1:13:34OK, so I'm seeing maybe we have some number of rows, 78.
    • 1:13:39But for our translators, you remember, two of those were null values.
    • 1:13:43They didn't exist in our table.
    • 1:13:45So it seems like if we use COUNT star, we're counting all the rows.
    • 1:13:50But if we use COUNT "translator," some column that has null values,
    • 1:13:54we're only getting back those rows or those values that aren't null.
    • 1:13:59So COUNT, when given a column, counts only those that
    • 1:14:02are not null that exist in our table.
    • 1:14:06OK, let's look at one more example here for counting.
    • 1:14:09And let's try this.
    • 1:14:11Let's say I want to find all of the publishers in this database.
    • 1:14:16I'll say SELECT COUNT of "publisher" from my "longlist."
    • 1:14:22And I'll hit Semicolon.
    • 1:14:24So you might think that I have 78 publishers in this long list.
    • 1:14:30But would it be accurate if I were to say
    • 1:14:32I have 78 different publishers in this longlist?
    • 1:14:37Could I say that?
    • 1:14:40I'm seeing no, right?
    • 1:14:41I couldn't try to count up these publishers
    • 1:14:44and then say I have 78 different ones.
    • 1:14:46I might double count the publisher along the way.
    • 1:14:49And let me show you what we mean here.
    • 1:14:51So I'll go back to my table.
    • 1:14:52And let me try to select from publishers or select the publisher column
    • 1:14:57from longlist.
    • 1:14:58I'll select "publisher" from "longlist."
    • 1:15:02Hit Semicolon.
    • 1:15:03Oops, and now I see something a little odd.
    • 1:15:06Let me scroll back up and maybe ask for a raised hand here.
    • 1:15:13Why might I get this odd result?
    • 1:15:19SPEAKER: Because of the quotes?
    • 1:15:21CARTER ZENKE: Yeah.
    • 1:15:22So I think I mistyped some of my query here.
    • 1:15:24I said it looks like "pubsliher" instead of "publisher."
    • 1:15:28And in this case, SQL will give me what I asked for.
    • 1:15:31I said SELECT "pubsliher" from "longlist."
    • 1:15:34And says, OK, here it is.
    • 1:15:35But that column doesn't exist so it creates this data for me.
    • 1:15:38So let me try this again.
    • 1:15:40I'll go back.
    • 1:15:41And I'll hopefully type this correctly now.
    • 1:15:43I'll say SELECT, let's say, "publisher," this one, FROM "longlist" Semicolon.
    • 1:15:52And now I'll see all of the publishers that I have in my table.
    • 1:15:57But what do you see?
    • 1:15:58Well, some repeat, right?
    • 1:15:59I have Harvill Secker multiple times here.
    • 1:16:02I have similarly MacLehose Press multiple times as well.
    • 1:16:06So if I count it up, these publishers, I would
    • 1:16:09get each one counted one time, which I want to find
    • 1:16:12the distinct ones the different ones.
    • 1:16:14I need a new keyword for this.
    • 1:16:16And for this, we'll use this keyword indeed called DISTINCT, trying
    • 1:16:20to find unique values from our column.
    • 1:16:23So let's try this.
    • 1:16:24I'll go back over here and I will now select not just publishers,
    • 1:16:30but distinct publishers.
    • 1:16:32I'll say SELECT DISTINCT publisher from "longlist" Semicolon.
    • 1:16:39Now, if I scroll through here, I should see each publisher in here
    • 1:16:44only one time.
    • 1:16:46If they have the same name, they've been filtered out.
    • 1:16:49And now they're only the same publisher here too.
    • 1:16:52So I will then try to say SELECT, let's say,
    • 1:16:54COUNT of "publisher," SELECT COUNT of "publisher" from--
    • 1:17:01COUNT of DISTINCT "publisher," for instance.
    • 1:17:02COUNT distinct publisher-- oh, typo, "publisher" FROM "longlist" Semicolon.
    • 1:17:15And I'll see I have 33 distinct publishers.
    • 1:17:19OK.
    • 1:17:21So this just about brings us to the conclusion
    • 1:17:24of all of these new SQL keywords here.
    • 1:17:27We've seen so far that we have several here to use.
    • 1:17:30But let's figure out how to actually exit this prompt.
    • 1:17:33So you might it be in your SQLite prompt right now.
    • 1:17:35If you want to leave it, you could also use this command, dot quit.
    • 1:17:39Dot quit is not a SQL keyword.
    • 1:17:41It's a SQLite keyword to leave your terminal
    • 1:17:44and go back to where you started.
    • 1:17:47So just to review then, what we've seen so far
    • 1:17:50is how to select data from our table.
    • 1:17:53We can use select column to take some column from our table
    • 1:17:56and give us back all of those rows from that table for that column.
    • 1:18:02We've seen we can apply some aggregate functions to take
    • 1:18:05maybe the count of our columns or the average or so on.
    • 1:18:08And we can get back not just all of our rows,
    • 1:18:10but only some of them using our WHERE clause here along with a condition.
    • 1:18:16We could have multiple conditions, having not just one but perhaps two.
    • 1:18:20Like, let's say, here, condition 0 and condition 1.
    • 1:18:23And we could also use, we saw before, this idea of equals and LIKE
    • 1:18:27to match some pattern or to make something exactly equal over here.
    • 1:18:31We could, again, use AND and OR.
    • 1:18:33And we saw later on how we could order our data
    • 1:18:35and use our LIMIT function to get back only some number of rows.
    • 1:18:41Now, this then is our interaction to querying.
    • 1:18:45And so far, we've seen this world of books.
    • 1:18:48And the table we've had so far really just has books inside of it.
    • 1:18:52But next time, what we'll see is how to take this world of books and split
    • 1:18:57into multiple tables.
    • 1:18:58How do we find information on publishers or books or authors too?
    • 1:19:02And how do we try to put that in a table that
    • 1:19:04can present the relationships among all of these different entities?
    • 1:19:07We'll talk about all that and more when we come back next time.
    • 1:19:10And we'll see you there.
  • CS50.ai
Shortcuts
Before using a shortcut, click at least once on the video itself (to give it "focus") after closing this window.
Play/Pause spacebar or k
Rewind 10 seconds left arrow or j
Fast forward 10 seconds right arrow or l
Previous frame (while paused) ,
Next frame (while paused) .
Decrease playback rate <
Increase playback rate >
Toggle captions on/off c
Toggle mute m
Toggle full screen f or double-click video