CS50 Video Player
    • 🧁

    • 🍬

    • 🍓

    • 🍿
    • 0:00:00Introduction
    • 0:00:17Designing
    • 0:06:48MBTA
    • 0:14:04Normalization
    • 0:20:24CREATE TABLE
    • 0:27:25Data Types and Storage Classes
    • 0:45:27Table Constraints
    • 0:53:34Column Constraints
    • 1:00:45Altering Tables
    • 1:20:01Charlie
    • 0:00:00[MUSIC PLAYING]
    • 0:00:17CARTER ZENKE: Well, hello, one and all.
    • 0:00:18And welcome back to CS50's Introduction to Databases with SQL.
    • 0:00:22My name is Carter Zenke.
    • 0:00:24And last we left off, we learned about relating--
    • 0:00:26that is, how to have multiple tables in our database for people, places,
    • 0:00:31and things, and how to have them relate to one another in the way
    • 0:00:34you might do in the real world.
    • 0:00:36Now, today, we'll take a step forward.
    • 0:00:38And we'll talk about how to put you in the driver's seat,
    • 0:00:41designing your very own database schemas to organize your data.
    • 0:00:45Now, we'll pick up where we left off, which is with this database of books.
    • 0:00:49So we had this database.
    • 0:00:52It was full of books that have been longlisted for the International Booker
    • 0:00:56Prize.
    • 0:00:57To be longlisted means to be nominated for some prize, let's say.
    • 0:01:01So we had the past five years of books in this database.
    • 0:01:05And we worked on improving this database over time from week 0 to week 1.
    • 0:01:10Now, this week, we'll actually take a look underneath the hood
    • 0:01:13and see what commands we had used to create these varied databases.
    • 0:01:17So let's reveal now what we had done all along.
    • 0:01:21I'll go back to my terminal here.
    • 0:01:24And if you remember, I could use a command
    • 0:01:26to open up a database file, which was this command here,
    • 0:01:29SQL sqlite3, and then the name of the file I want to open.
    • 0:01:34So let's try this.
    • 0:01:35I'll go back to week 0, to my week 0 folder like this.
    • 0:01:39And I'll open up my long list database, week0/longlist.db.
    • 0:01:46I'll hit Enter.
    • 0:01:47Now, I'm in my sqlite prompt.
    • 0:01:49So I could work on typing some SQL commands,
    • 0:01:52SQL statements or queries inside of this terminal prompt here.
    • 0:01:58So if I want to get a feel for what's inside this database,
    • 0:02:01we saw I could use a command called select.
    • 0:02:04So I'll select now some rows from this table.
    • 0:02:07I'll say SELECT, let's say, the title and also the author columns
    • 0:02:14from my longlist table, semicolon, and Enter.
    • 0:02:19Now, I'll see all the titles and all the authors that
    • 0:02:23were inside of this long list table.
    • 0:02:27But I only want to peek here.
    • 0:02:29I only want to see roughly what kind of data is inside.
    • 0:02:32So I could probably improve this command here.
    • 0:02:36I could instead say SELECT the author and title columns from longlist--
    • 0:02:43from longlist, but limit now to the first five rows we saw before.
    • 0:02:49Semicolon, and I'll hit Enter on this query.
    • 0:02:52And now, I'll see only the top five rows.
    • 0:02:56So I'm able now to see what kind of data is inside my database.
    • 0:03:02But what I can't yet see is what command was used to create this table
    • 0:03:08and what kind of data could be stored inside of it.
    • 0:03:11So let's reveal now what was going on underneath the hood all this time.
    • 0:03:16I'll say, this new command, a sqlite command-- not a SQL keyword,
    • 0:03:20but sqlite command called .schema, .schema.
    • 0:03:24Now, if I hit Enter, I'll see the following--
    • 0:03:28the command, the statement, the query that
    • 0:03:30was used to create this longlist table.
    • 0:03:34And notice how I have many columns inside this table.
    • 0:03:37I have an ISBN column, a title column, an author column, and so on.
    • 0:03:43And each column seems to have some kind of data
    • 0:03:47that could be stored inside of it like text, or integers, or real,
    • 0:03:52or floating point values, or decimals, if you're familiar.
    • 0:03:56So this is how we created the very first version of longlist.db.
    • 0:04:01But let's also see how we created the second.
    • 0:04:04So I'll type .quit to leave this version of longlist.db.
    • 0:04:09And now, let me open up the next version we had created,
    • 0:04:12which is more relational.
    • 0:04:13It had tables inside of it that could relate to one another.
    • 0:04:17So I'm going to type sqlite3, then long--
    • 0:04:21sqlite3, and go to week 1, and then type longlist.db.
    • 0:04:26I'll hit Enter.
    • 0:04:27Now, I'm on my next version of longlist.db.
    • 0:04:31Well, what could I do?
    • 0:04:32I could type SELECT and look at some of the tables in here
    • 0:04:36to see what kind of data is inside.
    • 0:04:38I could perhaps say SELECT maybe the names from the authors table
    • 0:04:44here, from the authors table, and hit semicolon.
    • 0:04:48Now, I'll see all the names of authors that are inside of the authors table.
    • 0:04:54I could do the same thing for books.
    • 0:04:55I could maybe look at the titles of books.
    • 0:04:58I could say SELECT title from the books table, semicolon, and Enter.
    • 0:05:05Now, I see all of the titles that are inside my books table.
    • 0:05:10But what I haven't seen yet is the schema of this database.
    • 0:05:14It is the way it is organized and the commands
    • 0:05:17that were used to create these tables.
    • 0:05:19So let me work on that now.
    • 0:05:21I'll clear my terminal using Control-L.
    • 0:05:24And now, let me type .schema again.
    • 0:05:28I'll type .schema to see what commands were used to create this database.
    • 0:05:34Hit Enter, and I can see--
    • 0:05:37I mean, there are quite a lot of commands here.
    • 0:05:39Now, if this feels overwhelming, I mean, I'd be right there with you.
    • 0:05:42This is a lot of commands to parse through and read.
    • 0:05:45So there's probably a better way to do this.
    • 0:05:49And one way to try is to type .schema, and then give it some table name.
    • 0:05:55Let's say, want to understand the schema for just the books table,
    • 0:05:58like just that for now.
    • 0:06:00So I'll say .schema, and then the table name--
    • 0:06:04books, in this case.
    • 0:06:05Then I can hit Enter.
    • 0:06:07Now, I'll see the schema, the organization,
    • 0:06:11the command that we used to create the books table, in this case.
    • 0:06:15And notice again, we have several columns--
    • 0:06:18ID, ISBN, title, publisher ID, and so on.
    • 0:06:22Each one has their own kind of data they could support or take
    • 0:06:26into this column like integers, text, and so on.
    • 0:06:30So again, what we'll do today is have you all
    • 0:06:33learn how to write your very own create table
    • 0:06:36commands to build your very own databases that represent what you
    • 0:06:41want to represent in the real world.
    • 0:06:44So let me exit this prompt here.
    • 0:06:47And let me propose that we'll have a bit of a design challenge
    • 0:06:51today to actually try to represent a real world entity with some database
    • 0:06:55here.
    • 0:06:56And if you're not already familiar, Boston
    • 0:06:59is perhaps famous for being among the first cities
    • 0:07:02to have a subway system in the United States.
    • 0:07:05So here is a picture from the late 1800s of a subway being
    • 0:07:09built in Boston's city streets.
    • 0:07:11Underneath the streets here, there would be
    • 0:07:14trolley cars that would go and transport people across Boston.
    • 0:07:18Here's another picture of a trolley actually working
    • 0:07:20underneath the streets.
    • 0:07:21So people would go down underneath.
    • 0:07:23They would hop on a trolley.
    • 0:07:25They'd be able to different parts of Boston, perhaps from Harvard to MIT
    • 0:07:29or downtown up to, let's say, Braintree or down to Braintree,
    • 0:07:33which is more south of Boston, for example.
    • 0:07:36One of the famous stops is the Park Street stop,
    • 0:07:39which is right down in the middle of Boston,
    • 0:07:41one of the central hubs of this subway system.
    • 0:07:45And now, these photos are all from, let's say, the early 1900s, late 1800s,
    • 0:07:50and so on.
    • 0:07:51But the subway has gotten a lot more modern since then.
    • 0:07:54And actually, now, we have several lines that span the entire city and beyond.
    • 0:07:59So here, we have the Red Line, of which Harvard and MIT are a part.
    • 0:08:02We have the green line, which brings you kind of west to east--
    • 0:08:05the blue line, the orange line, and so on.
    • 0:08:09So many more lines and stations have been added to this system.
    • 0:08:14It's a big design challenge to represent all of these stations,
    • 0:08:18all of these lines, and all of these people who might ride this subway, too.
    • 0:08:24So the question then becomes, how can we create a schema for this data?
    • 0:08:31And again, by schema, we mean what kinds of tables should we have,
    • 0:08:36what kinds of columns might those tables have, and what kind of data
    • 0:08:40should we put in each of those columns, for instance.
    • 0:08:44So let me propose this.
    • 0:08:46Let's say we start just first with names and stations.
    • 0:08:49So Charlie here, our very first rider on this system,
    • 0:08:53is going to be at the Kendall at MIT station.
    • 0:08:56So this is what this table represents now.
    • 0:08:58But what more could we have?
    • 0:09:01Well, we might also want to have maybe what Charlie is doing at that station.
    • 0:09:05Maybe he's entering the station, for instance.
    • 0:09:09And if you're familiar with the subway system,
    • 0:09:11you often have to pay to get onto a train or get onto the station itself.
    • 0:09:17So let's say Charlie pays some fare to enter into the Kendall/MIT station.
    • 0:09:23Well, back in the mid 1900s, the fare was only about a dime.
    • 0:09:26It was $0.10.
    • 0:09:27So we'll say Charlie paid $0.10 to enter the Kendall/MIT station.
    • 0:09:33And now, this seems pretty good.
    • 0:09:36But if I am the Transit Authority, the person who runs the subway system,
    • 0:09:41I probably want to know, does Charlie have enough money to get on the train?
    • 0:09:45And if so I want to make sure that, OK, well, Charlie actually
    • 0:09:48could get on this train.
    • 0:09:50So let's say not only does Charlie pay a fare.
    • 0:09:53He has some remaining balance afterwards.
    • 0:09:56So Charlie here has gotten onto the Kendall/MIT stop.
    • 0:10:00He's paid the fare of $0.10 and has $0.05 left.
    • 0:10:05OK, so here's a bit of a table.
    • 0:10:08We probably add more information to it.
    • 0:10:11Let's say Charlie then leaves at the Jamaica Plain stop.
    • 0:10:15And the fare to leave is about a nickel, $0.05.
    • 0:10:18And now, Charlie has no cents left over.
    • 0:10:21So again, Charlie paid $0.10 to get on, had $0.05 left, paid $0.05 to get off,
    • 0:10:27and now has no remaining balance here anymore.
    • 0:10:31OK.
    • 0:10:32So that's Charlie's story.
    • 0:10:34Let's look at Alice, though.
    • 0:10:35Let's say Alice gets on at the Harvard stop.
    • 0:10:38They too pay $0.10 to get on at the Harvard stop.
    • 0:10:42And they have a remaining balance of $0.20.
    • 0:10:45Alice will go, let's say, to Park Street, get off at Park Street,
    • 0:10:49pay the nickel to leave.
    • 0:10:51And now, they'll have a balance of $0.15 at the end.
    • 0:10:55Let's go to Bob.
    • 0:10:56Bob enters the Alewife station.
    • 0:10:59They pay $0.10.
    • 0:11:00They have remaining balance of $0.30.
    • 0:11:02And let's say they leave at Park Street and have a fare of $0.10 to leave
    • 0:11:08because it's a further distance.
    • 0:11:09Now, they'll have a running balance of $0.20 overall.
    • 0:11:14So this table is OK, I might admit.
    • 0:11:18I mean, last time, we learned about having what we called primary keys
    • 0:11:23and foreign keys.
    • 0:11:24So it seems like that's missing here.
    • 0:11:26Let's go ahead and add that here.
    • 0:11:28I'll give each row a unique ID, so I can know who entered, who exited,
    • 0:11:32and so on, give that a unique ID here.
    • 0:11:35But I might even say that this table could really be improved substantially.
    • 0:11:41And I want to ask you what redundancies or inefficiencies do you see here?
    • 0:11:49If we're trying to represent riders and stations,
    • 0:11:52what can we improve about this design?
    • 0:11:56AUDIENCE: So probably the redundancy will be the names and the stations,
    • 0:12:01too.
    • 0:12:02For example, if Charlie will go to the train daily,
    • 0:12:06then he will become most of the names in the data.
    • 0:12:10CARTER ZENKE: Yeah, a good point.
    • 0:12:12If I'm hearing what you're saying, [? Loren, ?]
    • 0:12:13let me show you some examples that I highlighted here.
    • 0:12:16One example could be, to your point, about these names.
    • 0:12:20These names seem to be telling us the name of a person.
    • 0:12:24But here, we have only three names--
    • 0:12:27Charlie, Alice, and Bob.
    • 0:12:29Well, my name is Carter.
    • 0:12:32And what if somebody else named Carter also
    • 0:12:35tried to get on and leave at some stop?
    • 0:12:37Well, I wouldn't be able to know which Carter
    • 0:12:39was which or which Charlie was which, which Alice was which, and so on.
    • 0:12:44So we probably need a way to represent people and their names a little better
    • 0:12:47here, too.
    • 0:12:50What other ideas do we have for how to improve the design of this table?
    • 0:12:55AUDIENCE: Yes, I think we can have a singular ID for a singular person.
    • 0:13:00That way, we'll be better able to track their activities.
    • 0:13:02CARTER ZENKE: Nice.
    • 0:13:03So we probably have an ID for each person,
    • 0:13:06a bit what we learned about last week, putting people in their own table
    • 0:13:10and giving them their own unique ID, a primary key.
    • 0:13:14Let's show that here.
    • 0:13:16I'll go to some slides.
    • 0:13:18And I'll pick out one that shows us just riders.
    • 0:13:21So to your point, [? Soqanya, ?] we could try
    • 0:13:23to have maybe a table for just riders.
    • 0:13:25And maybe to simplify, this table has only two columns.
    • 0:13:29It has a column for ID and a column for name.
    • 0:13:32So here, we have Charlie, Alice, and Bob all in their own table.
    • 0:13:38Well, let me propose to you, we could do the same thing for stations.
    • 0:13:42Let's say we have a table of stations now.
    • 0:13:44And we give each one their very own ID as well,
    • 0:13:48our own primary key for this table.
    • 0:13:50We have Harvard, Kendall, and Park Street.
    • 0:13:52We can differentiate between them using their IDs here.
    • 0:13:57So a few improvements could be made.
    • 0:14:00And as we're making these improvements, splitting one table into many,
    • 0:14:04and debating what kind of data to store in each table,
    • 0:14:08the process we're going through is one called normalizing.
    • 0:14:12We're normalizing our data.
    • 0:14:14To normalize means to reduce redundancies, effectively,
    • 0:14:18to take a table--
    • 0:14:19take one table, for instance, split up into multiple,
    • 0:14:22and have each entity be part of its very own table.
    • 0:14:26Some academics in the world have named different normal forms, quote unquote.
    • 0:14:31There's like first normal form, second normal form, third normal form,
    • 0:14:36this progression of making your day more and more efficient.
    • 0:14:39You can use those as heuristics.
    • 0:14:41But end of the day, a few principles might apply.
    • 0:14:43First, take your entities, like in this case stations and riders,
    • 0:14:48and put them each in their own table.
    • 0:14:51And if you add more data, make sure that if I were to add a column, let's say,
    • 0:14:56to riders, it is only a data point about riders--
    • 0:15:00not about stations, not about fares, only about riders.
    • 0:15:05And that process can help us make a data set that
    • 0:15:08is more dynamic, more easy to reproduce, and more easy to write queries on.
    • 0:15:14So that is the process here of normalizing.
    • 0:15:18OK.
    • 0:15:18So if we have now riders and stations, we want to represent them in our table.
    • 0:15:25Well, we could use what we learned about relating
    • 0:15:28last week to ask, how could we actually represent
    • 0:15:30these riders and these stations?
    • 0:15:33So let's say here I can have riders and stations.
    • 0:15:36I want to make sure that I have the right relationship between them.
    • 0:15:40Well, if you're familiar with subways, we
    • 0:15:42might say that a rider goes to one station.
    • 0:15:45And this big T here is the symbol for a station here in Boston,
    • 0:15:50for the T's that we call it, for the subway.
    • 0:15:52So a rider might go to one station.
    • 0:15:56But of course, that might not be the full picture.
    • 0:15:58A rider also gets off at some station on.
    • 0:16:01So a rider could be associated with not just one station, but multiple.
    • 0:16:06And if you're familiar, at least with any subway system or the Boston one,
    • 0:16:10too, it can often get pretty busy.
    • 0:16:12And so riders might not just go to, of course, one station or two.
    • 0:16:16Stations could also have multiple riders that are on a particular station here.
    • 0:16:22So to recap, one rider might be associated with more than one station.
    • 0:16:28They might get on at this first one and get off at this later one.
    • 0:16:33But each station could presumably have more than one rider.
    • 0:16:37Each station here could have rider A or rider B, the rider
    • 0:16:41up here or the rider down below, and even many more than that as well.
    • 0:16:47So to put it in the language of our ER diagrams,
    • 0:16:50our entity relation diagrams from last week,
    • 0:16:53we could look at it bit like this, where we have riders and stations.
    • 0:16:57Riders visit stations, and they're associated like this.
    • 0:17:02A rider must have at least one station associated with them.
    • 0:17:07That's what this horizontal line means.
    • 0:17:09If they aren't at a station, they aren't really a rider, right?
    • 0:17:13A rider though could have many stations associated with them.
    • 0:17:16That's what this three prongs down here means.
    • 0:17:19They could have one, two, three, four, they
    • 0:17:22could have many stations they get on and get off of.
    • 0:17:26Now a station could have anywhere between zero riders,
    • 0:17:31if it's maybe out of commission or isn't very popular, upwards to many.
    • 0:17:35It could have two, three, four, five, even hundreds of riders associated
    • 0:17:39with this particular station.
    • 0:17:42So here is our entity relation diagram for these particular riders
    • 0:17:47and these stations here.
    • 0:17:50So let me ask, what questions do we have on these relationships between riders
    • 0:17:55and stations and how to design this table so far?
    • 0:17:59AUDIENCE: Then I want to ask that you have
    • 0:18:01used the same ID for stations and riders so that maybe
    • 0:18:06give us a problem in coding?
    • 0:18:08CARTER ZENKE: Yeah, a good observation.
    • 0:18:10So you might have noticed that in the riders table and in the stations table,
    • 0:18:14I gave the same kind of ID.
    • 0:18:17Like I had one, two, three for each of them.
    • 0:18:19And let me just show you that again here.
    • 0:18:21I'll come back to some slides, and I'll show you again,
    • 0:18:24the riders table where we had Charlie, Alice, and Bob, ID 1, 2, 3.
    • 0:18:30Same for the stations.
    • 0:18:31We had stations Harvard, Kendall, Park Street, ID 1, 2, 3.
    • 0:18:36And to your question, isn't that a problem?
    • 0:18:39Well, I would argue in this case, it's not,
    • 0:18:43so long as we keep clear that these IDs are for stations
    • 0:18:48and these IDs are for riders.
    • 0:18:50And we'll see how to do that using our SQL keywords later on.
    • 0:18:54But again, so long as we have an ID just for our riders and an ID
    • 0:18:59just for our stations, we can keep these separate
    • 0:19:02even if they might have the same values.
    • 0:19:05But a great question here.
    • 0:19:07Let's take just one more.
    • 0:19:10AUDIENCE: Regarding the entity relationship diagram,
    • 0:19:12how is it possible for a station to have a possibility of zero riders,
    • 0:19:17but riders must compulsorily have at least one station.
    • 0:19:21CARTER ZENKE: Yeah, good question.
    • 0:19:22So this might be up to you and how you formulate it,
    • 0:19:25but for me, let me show that diagram again.
    • 0:19:28I'll go back to over here.
    • 0:19:32In my mind, to be a rider, you have to visit a station.
    • 0:19:36If you aren't visiting a station, you aren't really a rider, right?
    • 0:19:39Now presumably, there are stations that were built but aren't really being
    • 0:19:43used right now or aren't really in service yet.
    • 0:19:46That could be a station that has no visitors.
    • 0:19:49So you could argue--
    • 0:19:50let's make sure every station has at least one rider
    • 0:19:54and every rider may or may not have to visit a station.
    • 0:19:56For that I would say, we could probably reasonably
    • 0:19:59disagree there and talk about how we could represent the diagram here, too.
    • 0:20:03But a great observation and a good other critique of this system here.
    • 0:20:09All right.
    • 0:20:11So let's now turn to representing this in our database.
    • 0:20:16I'll go back to my computer and we'll learn
    • 0:20:19about this new SQL keyword, SQL statement,
    • 0:20:23this one called Create Table.
    • 0:20:26Create table allows us to, as the name suggests, create a brand new table.
    • 0:20:31So let's do just that in our new database
    • 0:20:34to represent riders and stations.
    • 0:20:37I'll go into my terminal, and I want to make a brand new database.
    • 0:20:42I'll call this one mbta.db, because mbta stands for the Massachusetts Bay
    • 0:20:49Transportation Authority, the people who run the subway, essentially.
    • 0:20:52So I'll do sqlite3, mbta.db, hit Enter, and I'll
    • 0:20:57type Y to say yes, I want to create this brand new database.
    • 0:21:04Now if I type dot schema, I see nothing.
    • 0:21:09But that's expected.
    • 0:21:10I don't have any tables yet.
    • 0:21:11I have nothing inside this database.
    • 0:21:13It's up to me to create these tables myself.
    • 0:21:17So what I'll do is clear my terminal, and let's start first with riders.
    • 0:21:22I might create a table for riders.
    • 0:21:23I'll say, create table.
    • 0:21:26And now I have to give that table some name.
    • 0:21:29I might call it riders here.
    • 0:21:31And then in parentheses, like this, I can specify what
    • 0:21:36columns should be part of this table.
    • 0:21:39So let's start first.
    • 0:21:40I'll hit Enter here and continue this query.
    • 0:21:44Now, all by convention, I'll just indent four spaces--
    • 0:21:47one, two, three, four, and I'll give an ID to each of these riders
    • 0:21:52as we saw before.
    • 0:21:54I'll say ID here is one of my columns.
    • 0:21:58Now to create a new column, I'll follow this up with a comma and hit Enter.
    • 0:22:04I'll again, by convention, for style, just indent four spaces.
    • 0:22:08And what's my next column?
    • 0:22:09Perhaps a name for these riders.
    • 0:22:12I'll give this column the name, name, and I'll leave it at that.
    • 0:22:17Once I'm done adding columns, I no longer need to have a comma.
    • 0:22:22I could simply close out this query, this statement.
    • 0:22:26I could hit Enter here, say close in parentheses
    • 0:22:29to close the top parentheses here, semicolon, hit Enter.
    • 0:22:33And now nothing seems to happen.
    • 0:22:36But that's actually a good sign.
    • 0:22:38So let me type dot schema, hit Enter, and I'll
    • 0:22:41see the result of that statement before.
    • 0:22:44Create table, if it doesn't already exist, riders, and riders
    • 0:22:48has inside of it two columns, ID and name.
    • 0:22:53OK, let's keep going.
    • 0:22:54Let's make one for stations, too.
    • 0:22:56I'll clear my terminal and I'll say, create me
    • 0:22:59a table called stations and include--
    • 0:23:02actually, not station.
    • 0:23:04If you ever want to, let's say fix this kind of table here,
    • 0:23:09let me try closing the parentheses, hit semicolon, Enter.
    • 0:23:14I'll get a syntax error.
    • 0:23:15I can restart.
    • 0:23:17I'll do Control L. Now I'll do create table, stations, plural.
    • 0:23:22Open parentheses, Enter.
    • 0:23:24I'll indent by four spaces.
    • 0:23:25One, two, three, four.
    • 0:23:27And now, I'll similarly include an ID for each of these stations.
    • 0:23:31I'll say ID comma, and then what all should I have?
    • 0:23:36Well, stations tend to have a name, like the Kendall MIT station,
    • 0:23:40the Harvard station, the Park Street Station.
    • 0:23:43So I'll say, I'll give each of these their very own name, comma.
    • 0:23:48What else do stations have?
    • 0:23:50Well, they might also have a line that they're on.
    • 0:23:54Let's say it's the red line, or the blue line, or the green line and so on.
    • 0:23:58I'll have a space for them to write in their line that they're a part of.
    • 0:24:03OK, and maybe I'll leave it at that to keep it simple.
    • 0:24:06I'll say, stations have an ID, a name and a line.
    • 0:24:09Now I'll close this out.
    • 0:24:10I'll say end parentheses, semicolon, hit Enter, and nothing seems to happen.
    • 0:24:15But if I type dot schema, I'll now see my riders and my stations
    • 0:24:20tables inside of my database.
    • 0:24:24Now one more step here.
    • 0:24:26We have riders, we have stations, but we have to relate to them.
    • 0:24:31We have to relate them using this many to many relationships,
    • 0:24:34as we saw last week.
    • 0:24:37So let me try making a table to implement
    • 0:24:40this many to many relationship.
    • 0:24:43And if you remember, we might call this kind of table a junction table,
    • 0:24:47an associative entity, a join table.
    • 0:24:50It has a lot of names, but it looks a bit like this.
    • 0:24:53I'll create this new table to represent, let's say, visits.
    • 0:24:58A rider visits a station.
    • 0:25:01So I'll call this table visits.
    • 0:25:03And inside, I'll make sure it has two columns, one for a rider
    • 0:25:09ID to represent a rider, and one, let's say, for a station ID,
    • 0:25:17to represent a station.
    • 0:25:19Now when I see a rider ID next to a station ID in this table,
    • 0:25:24I'll know the rider with that certain ID visited
    • 0:25:27the station with that certain ID.
    • 0:25:30So I'll close this out.
    • 0:25:31I'll say end, parentheses here, semicolon, Enter, and finally, clear
    • 0:25:36my terminal, type dot schema.
    • 0:25:38And I can see, I have riders, stations, and visits between riders and stations
    • 0:25:45in this associative entity, this junction table or a joined table.
    • 0:25:48Up to you what you might want to call it in this case.
    • 0:25:51Now what questions do we have?
    • 0:25:55AUDIENCE: Why we have not use the primary key
    • 0:25:58and secondary key in this table?
    • 0:26:00CARTER ZENKE: Good question.
    • 0:26:01So we're going to get there in just a minute.
    • 0:26:03But if I look back at my terminal here, my schema, I'll see,
    • 0:26:07I really just have column names.
    • 0:26:09And we saw before and we typed dot schema on our longlist.db,
    • 0:26:13we had more than just column names.
    • 0:26:16We had column names, we had perhaps data types,
    • 0:26:19we had primary keys and foreign keys.
    • 0:26:21So we'll get to that in just a minute.
    • 0:26:23But suffice to say for now, we're going to keep improving this over time.
    • 0:26:28Let's take one more.
    • 0:26:30AUDIENCE: Is it required to put spaces--
    • 0:26:33the four spaces indents, or that's just for the visual look?
    • 0:26:37CARTER ZENKE: Yeah, great question.
    • 0:26:38Is it required to have these four spaces before each column name.
    • 0:26:43And in fact, no, it's not, but it makes the code more readable.
    • 0:26:46So I could put this all in one line--
    • 0:26:49I shouldn't, but I could.
    • 0:26:51And if I have instead this new line followed by four spaces,
    • 0:26:55I can make this more readable for myself and for my colleagues, too.
    • 0:27:00Good question.
    • 0:27:02OK, so to our earlier point, there are things
    • 0:27:07that are missing from this schema.
    • 0:27:09Like, we have column names, but as we saw before,
    • 0:27:13we should ideally specify what kind of data
    • 0:27:16should be able to go into each of these columns.
    • 0:27:20And for that, we'll need some new ideas to talk about here.
    • 0:27:24So let's focus now on this idea of data types and storage classes.
    • 0:27:30Data types and storage classes are two very similar but distinct IDs,
    • 0:27:36and they're distinct in a way we'll talk about in just a minute.
    • 0:27:40Now SQLite has five storage classes, five kind of storage--
    • 0:27:45kind of type, so to speak, of values that can hold.
    • 0:27:49So let's talk about the first one, null, for instance.
    • 0:27:51Null in this case means nothing.
    • 0:27:54There's nothing that actually is inside of this value.
    • 0:27:57It's kind of a central value to mean nothing is here.
    • 0:27:59Integer means a whole number, like 1, 2, 3, 4, or 5.
    • 0:28:04Real talks about decimals like floating points like 1.2 or 2.4 and so on.
    • 0:28:10Text is used for characters.
    • 0:28:12And blob, kind of a funny one, blob stands for Binary Large Object,
    • 0:28:19and it represents the data exactly as I give it to this value.
    • 0:28:24If I tell it to store 101010, it will store exactly 101010 in binary.
    • 0:28:30So useful for storing, in this case like images and video files
    • 0:28:34and audio files, things that have some structure
    • 0:28:37we don't want to mess around with.
    • 0:28:40Now let's focus on this idea of a storage class.
    • 0:28:44These, I'll say, are storage classes and not data types in SQLite.
    • 0:28:50Now a storage class like integer can comprise, can hold several data types.
    • 0:28:58Notice how there are seven different data
    • 0:29:01types that could be stored under this integer storage class.
    • 0:29:05We have a 6-byte integer, 2-byte integer, a 8 and 0 and so on.
    • 0:29:10It could be any of these particular types,
    • 0:29:13but each of these types under the umbrella of this integer storage class.
    • 0:29:19And SQLite itself will take care of making sure
    • 0:29:23it uses the appropriate data type.
    • 0:29:25Like if I give a very large number, like let's say 4 billion or 6 billion
    • 0:29:30or even bigger than that, it will probably use a longer--
    • 0:29:35that is, a bigger byte integer to store that kind of value.
    • 0:29:39If I give it a smaller one, like 1, 2, 3, or 4,
    • 0:29:42it will probably use a 1-byte or a 2-byte integer for that.
    • 0:29:47But SQLite's idea is that I, as a programmer,
    • 0:29:50shouldn't have to care if I use an 8-byte or a 1-byte or a 2-byte integer,
    • 0:29:54I just care that I'm using integers, whole numbers,
    • 0:29:58and they give me a storage class to use any of these up to their choice
    • 0:30:03here as well.
    • 0:30:05Now let's look at a few examples of values in SQLite that we could store.
    • 0:30:10Well, we have perhaps the red line as some text.
    • 0:30:14And because this is characters, it's quoted,
    • 0:30:16we could use the text storage class to represent this particular value here.
    • 0:30:22We could have maybe an image, and to the earlier point,
    • 0:30:25we could say, well, this image might be best represented
    • 0:30:29using a blob, a binary large object, to keep all of these pixels
    • 0:30:33exactly as they are in this image.
    • 0:30:36But we do get some choice, some interesting design
    • 0:30:40challenges when we look at the idea of fares.
    • 0:30:43So let's say to our point earlier, fares are $0.10 back in the 1950s or so.
    • 0:30:50Well, $0.10 we could store as an integer, which seems just fine.
    • 0:30:55But this could get confused.
    • 0:30:58I'm talking about dollars here or cents?
    • 0:31:00Maybe it would be better, let's say, if I did this.
    • 0:31:03A dollar sign, 0.10.
    • 0:31:07And what might that be stored as?
    • 0:31:09Well, probably text, right?
    • 0:31:11I could say this dollar sign isn't really a number,
    • 0:31:14but now I have to include it, so I'll say this will be quoted, essentially,
    • 0:31:18as dollar sign 0.10.
    • 0:31:22Now there's some downsides here, too.
    • 0:31:23Like let's say I have--
    • 0:31:25I want to add these up.
    • 0:31:26Well, I can't add up text.
    • 0:31:28Like, what does it mean to say dollar sign 0.10 plus dollar sign 0.20.
    • 0:31:33I can't do math with text.
    • 0:31:35So maybe it'll be better after all if I used a real or a decimal like this,
    • 0:31:410.10.
    • 0:31:43But I mean even here, you'll run into some problems.
    • 0:31:46If you are familiar with how the number is represented in binary,
    • 0:31:50you might know that decimal values or floating point values
    • 0:31:54can't be perfectly, precisely represented.
    • 0:31:57And if I talk about 0.10, the computer might store 0.10000056789.
    • 0:32:04It could get very wonky out to the many, many decimal digits down below here.
    • 0:32:09So trade offs and challenges overall.
    • 0:32:13Let's look at these three, though.
    • 0:32:15I have the first one to store as an integer.
    • 0:32:17I'm trying to store fares here, second one as text, and the third one
    • 0:32:23as a floating point or a real in this case.
    • 0:32:27Let me ask for some opinions here, which one
    • 0:32:30would you use and why, for trying represent fares in this case?
    • 0:32:36AUDIENCE: Thank you.
    • 0:32:37I prefer using integers because of course,
    • 0:32:41I need to get the calculation very accurately.
    • 0:32:44That's my point of view.
    • 0:32:45Well, sometimes I can use float but, you know,
    • 0:32:50like you said before, it can get very wonky
    • 0:32:52if I really need that kind of precision, I don't really recommend using floats.
    • 0:32:59CARTER ZENKE: Yeah, a good point.
    • 0:33:01So if I go back to some slides here, you might argue for the integer
    • 0:33:04because you know you can precisely represent integers.
    • 0:33:07And let's say I want to add up fares over a lot, a lot of riders.
    • 0:33:12This might be useful for me because I know that each number will
    • 0:33:15be perfectly, precisely represented.
    • 0:33:16I can do lots of big kind of math with this number here.
    • 0:33:20To your point, this decimal might kind of, as you said,
    • 0:33:24get wonky later on towards the later decimal points.
    • 0:33:26I might get some unexpected results if we add up these overall.
    • 0:33:30Let me ask, though, are there any proponents of this floating point
    • 0:33:36value or a real value?
    • 0:33:40AUDIENCE: So I think the--
    • 0:33:42I think the float is the number of, for example,
    • 0:33:48for each pair, like the answers like truncation
    • 0:33:56probably suggest by the comments there.
    • 0:33:58CARTER ZENKE: Yeah, a good point.
    • 0:34:00So if you talk about using this float value,
    • 0:34:02I mean, one thing we could say for it is that this decimal could be--
    • 0:34:06it's more accurate to say, like, where you're working with dollars now,
    • 0:34:09and we could have maybe $0.10, which is only 0.1 of a dollar.
    • 0:34:13I totally hear that point as well.
    • 0:34:15And the point we're making here is that they're really
    • 0:34:17just trade offs among these data storage classes
    • 0:34:20to use, whether you're using integers or real values,
    • 0:34:23it just depends on your use case and what you're designing for.
    • 0:34:27So be sure to read up on trade offs among these data types,
    • 0:34:30and determine for yourself which one should you best use.
    • 0:34:35OK.
    • 0:34:36So we have now these storage classes to store values,
    • 0:34:41and it turns out that columns can be made to store certain storage classes
    • 0:34:47or prioritize certain classes.
    • 0:34:49And the key distinction here is that columns in SQLite
    • 0:34:53don't always store one particular type.
    • 0:34:57Instead, they have type affinities, meaning
    • 0:35:01that they'll try to convert some value you insert into a given cell or given
    • 0:35:07row to the type they have the affinity for.
    • 0:35:11Now there are, let's say five type affinities in SQLite--
    • 0:35:16text.
    • 0:35:17Columns can be of the type affinity text,
    • 0:35:19meaning they store just characters at the end of the day.
    • 0:35:22There's also numeric, which stores either integer values or real values,
    • 0:35:27depending on which one seems best to convert to.
    • 0:35:31You have integer type affinity, which means it can store whole numbers.
    • 0:35:33Real, to store floating points or decimal values.
    • 0:35:36And we have blob here again, our old friend
    • 0:35:39to store binary exactly as we get it.
    • 0:35:42So whereas before we were talking about storage classes--
    • 0:35:44those are associated with individual values--
    • 0:35:47type affinities are now associated with individual columns.
    • 0:35:51So let's see some example of how this might work in SQLite.
    • 0:35:55Let's say I have this table of fares, and we've
    • 0:35:59decided to store fares as integers.
    • 0:36:02Well, if I said that this column called amount has the affinity for the text
    • 0:36:09storage class, what would happen is if I insert this integer, 10,
    • 0:36:14it would look a bit like this later on.
    • 0:36:16It would be converted to text, it would be quoted in a sense,
    • 0:36:20to represent-- it's now been converted to some set of characters.
    • 0:36:24Let's say I insert this value 25.
    • 0:36:27Well, 25 has a storage class right now of an integer.
    • 0:36:31It is a whole number.
    • 0:36:33But if I insert this into a column that has the text affinity,
    • 0:36:38it will be converted into, in this case, text at the end of the day.
    • 0:36:44Let's do the opposite.
    • 0:36:45Let's say I have my fare as a string, some text in this case.
    • 0:36:51I want to insert it into this column called amount,
    • 0:36:54but now amount has the integer type affinity.
    • 0:36:58Well, if I insert 10, quote unquote, into the column amount,
    • 0:37:03I'll get back not 10, the text, but 10 the integer, because again, amount--
    • 0:37:10this column here-- has an affinity for the integer storage class.
    • 0:37:16Let's try this.
    • 0:37:1725, some text again I'll insert it into this table.
    • 0:37:21Now I'll have 25 as an integer.
    • 0:37:26So this is how SQLite allows us to give certain columns
    • 0:37:31an affinity for certain types, that they'll
    • 0:37:33try to store values of that type, so long as we insert values it could be
    • 0:37:38feasibly converted to that type here.
    • 0:37:41So let's go back to our schema and try to improve it now,
    • 0:37:45to use not just column names, but also type affinities to store
    • 0:37:50certain data of a certain type.
    • 0:37:53Go back to my computer here, and let's improve this once more.
    • 0:37:58So I'll go over to my table.
    • 0:38:00And now, I probably want to improve the design here.
    • 0:38:05And often, if I want to improve this, I might just
    • 0:38:08need to erase what I've already done.
    • 0:38:10So let me introduce this new keyword, this new statement called Drop Table.
    • 0:38:16To drop a table means to delete it, to remove it, effectively.
    • 0:38:19So let me try doing this for riders, stations, and visits.
    • 0:38:23I'll type drop table writers, semicolon, Enter.
    • 0:38:28Nothing seems to happen, but if I type dot schema, well, riders is gone.
    • 0:38:34I'll try drop table stations, then semicolon, hit Enter, and type,
    • 0:38:40let's say dot schema again.
    • 0:38:42No more stations.
    • 0:38:44I'll try drop table visits, semicolon, Enter,
    • 0:38:49and then dot schema, and our table--
    • 0:38:52our database is really gone.
    • 0:38:54There are no more tables inside of it.
    • 0:38:57So let me propose that instead of working inside of the SQLite
    • 0:39:02prompt, like typing out again, and again, and again, create table riders,
    • 0:39:05create table stations, create table visits,
    • 0:39:07let me be more efficient about this and create myself
    • 0:39:10a schema file that I could reuse throughout this lesson
    • 0:39:14and also later on while I'm working on this database on my own.
    • 0:39:18To do that, let me quit my SQLite prompt here, and let me
    • 0:39:23type something like code schema dot sql.
    • 0:39:27I'm just creating this file called schema.sql.
    • 0:39:31Now a dot SQL file allows me to type in SQL keywords
    • 0:39:36and have them be syntax highlighted, so I know
    • 0:39:38what's going on inside of this file.
    • 0:39:42So let's just try this once more.
    • 0:39:44I'll type Create table riders, and inside I'll
    • 0:39:49say it has the ID column of what type affinity?
    • 0:39:53Well, IDs are whole numbers, so perhaps integer in this case.
    • 0:39:58I could say, ID has the integer type affinity.
    • 0:40:02Now let me say the riders also have a name,
    • 0:40:05and how could names be best represented?
    • 0:40:07Maybe text, right?
    • 0:40:09Characters here.
    • 0:40:10So I'll say name and text.
    • 0:40:12Now I'll include a semicolon to say, this
    • 0:40:14is the end of my create table statement.
    • 0:40:17And before, remember how I had to kind of air out
    • 0:40:21or I had to like, backspace and so on to improve the design?
    • 0:40:25Here I can literally just point and click and edit this file
    • 0:40:28to improve my schema.
    • 0:40:30And I'll later on apply this in my database using
    • 0:40:35a command that we'll see a bit later.
    • 0:40:37So let's keep going here.
    • 0:40:38I'll say create table stations, and inside of the stations table,
    • 0:40:44I'll make sure it has an ID column of type integer,
    • 0:40:48a name column that probably stores some text,
    • 0:40:51and a line column that also stores some text, where name
    • 0:40:57is the name of the station like Kendell MIT, Harvard,
    • 0:41:00and line is blue line or green line, what line it's part of in our subway.
    • 0:41:05Let me try now visits.
    • 0:41:07I'll say, create table visits.
    • 0:41:08And then I'll do rider ID, which has what type affinity?
    • 0:41:13Probably integer.
    • 0:41:15And then I'll do station ID, which has this same type of affinity.
    • 0:41:19It's relying on integers, whole numbers here for IDs.
    • 0:41:23So my colon to finish this statement.
    • 0:41:25Now this is my schema as a whole.
    • 0:41:29I have riders, stations, and visits.
    • 0:41:31But now I want to apply this schema to my database.
    • 0:41:36So what could I do?
    • 0:41:37I could reopen it, let's say.
    • 0:41:39I'll do sqlite3 mbta.db in my terminal.
    • 0:41:43And now I want to read in this schema.sql file,
    • 0:41:47just run the commands that are inside of this file.
    • 0:41:51So what could I do?
    • 0:41:52I could say dot read schema.sql, where dot read is a command that
    • 0:41:59says take whatever file you give me, like schema.sql
    • 0:42:01and read it into this database, running any SQL keywords you come across there.
    • 0:42:08I'll hit Enter.
    • 0:42:09And now nothing seems to happen.
    • 0:42:11But if I type dot schema, I now see my schema improved.
    • 0:42:17And this is helpful for me, because what I could do now
    • 0:42:19is I could just edit my schema.sql file and rerun it and rerun
    • 0:42:23it to make sure I now have these tables being improved over time.
    • 0:42:29OK.
    • 0:42:30So this, then, is our new representation of our database.
    • 0:42:37We have riders, of course, their own entity, and stations.
    • 0:42:42They have an ID and a name, and stations have an ID, a name, and a line.
    • 0:42:47We've also now included these type affinities-- integer, text, integer,
    • 0:42:52text, to tell SQL what kinds of storage classes
    • 0:42:56could be put inside of each of these columns.
    • 0:43:00Now before we keep improving this, let me ask,
    • 0:43:03what questions do we have on these storage classes and type affinities?
    • 0:43:10AUDIENCE: That you would use--
    • 0:43:12when you were creating the table, the table was not in line.
    • 0:43:17So when we search for the authors of books,
    • 0:43:21so it comes with a perfect table, so how can we make a perfect table in the SQL?
    • 0:43:25CARTER ZENKE: Yeah, do you mind clarifying what
    • 0:43:27you mean by like, the perfect table?
    • 0:43:30AUDIENCE: So I mean that it was arranged in something
    • 0:43:33like in boxes, block boxes.
    • 0:43:36CARTER ZENKE: Good question.
    • 0:43:37So before we able to see the results of our queries
    • 0:43:41inside some boxes in our terminal, and that is actually a mode of SQLite,
    • 0:43:46I think type like dot mode table to see your results in that version.
    • 0:43:51Here though, we have no data inside of our tables,
    • 0:43:54so I can't really select anything from them.
    • 0:43:56Like, if I go to my terminal here and I try
    • 0:44:01to select some data from this riders table,
    • 0:44:04let me say select star from rider's, semicolon, I won't get anything back.
    • 0:44:10Next week, though, we'll see how to insert, and add, and update, and delete
    • 0:44:15data inside of these tables, at which point,
    • 0:44:17you could write, select star from riders and see some data
    • 0:44:21you've inserted yourself.
    • 0:44:23Great question.
    • 0:44:25Let's take one more here.
    • 0:44:27AUDIENCE: Yes.
    • 0:44:28I would like to know if you have a column of the type boolean.
    • 0:44:33CARTER ZENKE: Yeah, do we have a Boolean type affinity, let's say.
    • 0:44:36So here, we don't, at least in SQLite.
    • 0:44:39Some other DBMS', Database Management Systems might have bool or Boolean,
    • 0:44:45true or false, right?
    • 0:44:47Let me show you this instead.
    • 0:44:48If I go to my terminal, I can see-- if I type dot schema,
    • 0:44:52I haven't used Boolean, there's no need for me in this case,
    • 0:44:56but I have used integer.
    • 0:44:58And integer for SQLite can kind of serve the same purpose.
    • 0:45:02I could have the integer 0 or the integer 1 to be true
    • 0:45:06or to be false or true, respectively, 0 being false and true being 1,
    • 0:45:09I believe in this case.
    • 0:45:11But good question.
    • 0:45:13OK.
    • 0:45:14So to the earlier question, like, we've improved our tables.
    • 0:45:19We now have type affinities for our columns.
    • 0:45:21But we don't yet have this ID we talked about last week,
    • 0:45:26which was primary keys and foreign keys.
    • 0:45:29This idea of trying to uniquely represent each item in our own table
    • 0:45:34using primary keys and trying to reference
    • 0:45:36that primary key from some other table using foreign keys.
    • 0:45:40So let's try to work on that now.
    • 0:45:43And for this, we'll need this new ID called a table constraint.
    • 0:45:48In SQLite, you can apply what's called a constraint to your entire table.
    • 0:45:54A constraint means that some values have to be a certain way.
    • 0:45:59Like let's say for a primary key, primary keys must be unique.
    • 0:46:04They can't repeat, and they must be, at least in our case,
    • 0:46:07is going to be integers, to be able to quickly add on to them over time.
    • 0:46:12Similarly, for foreign keys, well, a constraint
    • 0:46:16is that if you have a foreign key, you better
    • 0:46:19find that value in some other table, otherwise
    • 0:46:22you violated this constraint to having a foreign key.
    • 0:46:26So we have two kinds of table constraints, among others,
    • 0:46:29but two of these are primary key and foreign key.
    • 0:46:33And we can apply these to our table by applying them
    • 0:46:36underneath the columns we tend to say will be inside of our table.
    • 0:46:42Let's try these two here now.
    • 0:46:43So we come back to my terminal here so we implement our very own primary key
    • 0:46:47and foreign key constraints.
    • 0:46:49We'll go back to my SQLite terminal and clear my screen,
    • 0:46:52and let's then pull up our schema.sql file
    • 0:46:55so we can keep modifying our schema.
    • 0:46:58I can now see I have the riders table, stations, and visits,
    • 0:47:01and I have some columns that could be primary keys or foreign keys,
    • 0:47:06but I need to declare them as such.
    • 0:47:08So here in the riders table, what was our primary key?
    • 0:47:12Well, it was ID.
    • 0:47:13Every rider should have their own, unique ID that should not
    • 0:47:16be duplicated across any two riders.
    • 0:47:19So to ensure that constraint is applied, I could follow this up with a comma
    • 0:47:24and then say, primary key ID, just like this.
    • 0:47:28Now ID is the primary key of this rider's table.
    • 0:47:32I can go down to stations and ask, what was my primary key?
    • 0:47:36Well, similarly, it was ID, this ID column on line 8.
    • 0:47:40So I'll type a comma followed up with primary key ID,
    • 0:47:45and now that ID column is a primary key.
    • 0:47:49It has that constraint applied in stations.
    • 0:47:52But now if I get down to visits, we'll have a few more options.
    • 0:47:56Visits here actually doesn't have its own ID column that I created.
    • 0:48:01I instead have a rider ID and a station ID column.
    • 0:48:05So a few options here.
    • 0:48:07One option is actually to make a joint primary key.
    • 0:48:10I could have a primary key composed of two columns, both rider ID and station
    • 0:48:17ID.
    • 0:48:18If I applied that constraint, that would mean
    • 0:48:20that if I were to ever insert a row that had the same rider
    • 0:48:24ID and the same station ID as another row,
    • 0:48:26I would trigger a constraint violation.
    • 0:48:29Every row that has a rider ID and a station ID
    • 0:48:33has to be unique in their combination of those two values.
    • 0:48:36To write that kind of scenario, I could follow this up
    • 0:48:39and I could say similarly, the primary key of this table
    • 0:48:43is not just rider ID like this, but it's also station ID.
    • 0:48:48Now this is a joint primary key constraint.
    • 0:48:52But if we think about this logically, I mean,
    • 0:48:54it kind of stands to reason that somebody would visit a station more
    • 0:48:57than once, and I don't want to make sure that every combination of rider
    • 0:49:01and station ID should be unique.
    • 0:49:03I want people to able to visit a station more than once.
    • 0:49:06So maybe not the best design for this table, but I could certainly use it.
    • 0:49:09In other cases or other contexts.
    • 0:49:12One other option would be to do it before
    • 0:49:15and to have maybe the ID column here of type integer,
    • 0:49:19and then down below, make that our primary key, a bit like this.
    • 0:49:23And now visits has its own ID column.
    • 0:49:27But actually, SQLite by default will give me its very own primary key,
    • 0:49:33one called row ID.
    • 0:49:35It's implicit, I can't see it, but I actually could query it.
    • 0:49:39I could query it for row ID, all one word,
    • 0:49:41and get back a unique primary key for this table SQLite has automatically
    • 0:49:46created for me.
    • 0:49:47Now we have seen the primary key options,
    • 0:49:50what are our foreign key options?
    • 0:49:52Well, it seems like rider ID and station ID
    • 0:49:55are the foreign keys of this table where rider ID references the ID
    • 0:49:59column in rider's, and station ID references the ID column in stations.
    • 0:50:05So to codify that, to make that a reality in our schema,
    • 0:50:08I could follow this up with the foreign key.
    • 0:50:11The foreign key of this table is rider ID,
    • 0:50:14and it references the rider's table and the ID column
    • 0:50:19inside of it using this syntax here.
    • 0:50:22Now I could keep going.
    • 0:50:24I could say I have more than one foreign key.
    • 0:50:26I also have a foreign key--
    • 0:50:29foreign key called station ID.
    • 0:50:32And that references, like we said before,
    • 0:50:34the stations table and the ID column inside of it.
    • 0:50:37So now here I have my completed schema.
    • 0:50:40I have a primary key for the tables I've declared an explicit column
    • 0:50:43for a primary key, and now I also have foreign key constraints
    • 0:50:47for those columns that should be foreign keys.
    • 0:50:51So now let me ask, what questions do we have on the schema
    • 0:50:55or on primary keys and foreign keys?
    • 0:50:59AUDIENCE: So, yeah.
    • 0:51:00I just noticed that whenever before we hadn't
    • 0:51:03added the affinities and the keys, we were not
    • 0:51:06applying commas after each column name.
    • 0:51:09So what is the difference there?
    • 0:51:11CARTER ZENKE: Yeah, a good catch.
    • 0:51:12Let me kind of show you what this looks like in my terminal
    • 0:51:15so you can see it live.
    • 0:51:17You, I think, had noticed that before we had,
    • 0:51:20let's say, this primary key ID constraint in riders,
    • 0:51:25we had done something like this.
    • 0:51:26Let me just copy paste that.
    • 0:51:28And we had removed this last column--
    • 0:51:30or comma from the name column.
    • 0:51:32Is that right?
    • 0:51:33And if that's the case, well, it's just convention, just style here.
    • 0:51:38So if I want to keep adding some constraint
    • 0:51:41or like a new line to my table, I should include a comma.
    • 0:51:45Here, this name column was the last portion of my table I had specified.
    • 0:51:51I have this column called name that has type affinity text, right?
    • 0:51:55But now, if I add this new constraint, we'll
    • 0:51:58have to follow it up with-- follow it up after a comma from this new column
    • 0:52:03here.
    • 0:52:04Notice now, this constraint, primary key ID, is the last--
    • 0:52:10let's say attribute of my table I specified.
    • 0:52:12I no longer need to include a comma at the end of it.
    • 0:52:15So whatever is the last portion, I should not have a comma after,
    • 0:52:20but everything else, I should.
    • 0:52:23Let's take one more question here, too.
    • 0:52:25AUDIENCE: Would it be OK for the visits table to have an ID column as well?
    • 0:52:29CARTER ZENKE: Yeah, good question.
    • 0:52:31Would it be OK for the visits table to have an ID column as well.
    • 0:52:35It certainly would be.
    • 0:52:36We could define our very own primary key for this table, too.
    • 0:52:39So let me go back and show you how that could work.
    • 0:52:42I'll go to my visits table here.
    • 0:52:44And I could try to add my own primary key to this table.
    • 0:52:49I could say ID, make a new column here, make it a type affinity integer,
    • 0:52:54like this.
    • 0:52:55Let me scroll up.
    • 0:52:56And now let me add some new constraint.
    • 0:53:00I could say, because I've made my very own primary key,
    • 0:53:03I'll say primary key ID.
    • 0:53:06Now, this table has a primary key that I've created called ID.
    • 0:53:12And this will be in place of SQLite, so that it
    • 0:53:15would have made called row ID itself but hidden from my own view.
    • 0:53:21OK, let's keep going then.
    • 0:53:24We've seen table constraints, we've seen type affinities,
    • 0:53:27but we could probably do more to improve the design
    • 0:53:30of this table or this database.
    • 0:53:32So let's introduce one more kind of constraint,
    • 0:53:35this one called a column constraint.
    • 0:53:38So whereas a table constraint applies to the table as a whole,
    • 0:53:42a column constraint applies to a particular column, let's say.
    • 0:53:47Maybe I want to column to have certain data inside of it.
    • 0:53:51Maybe I want to make sure it doesn't have null values, and so on.
    • 0:53:54I could do that with column constraints.
    • 0:53:57There are four in SQLite--
    • 0:54:00check, default, not null, and unique, and each one does something different.
    • 0:54:07Check allows me to make my very own check, like check to be sure
    • 0:54:11that this amount is greater than 0.
    • 0:54:14Or I could use default. Default means if I
    • 0:54:17don't supply a value when I add a new row,
    • 0:54:20it'll just use a default value instead.
    • 0:54:23Not null means I can't insert null or empty values into this column.
    • 0:54:29And in fact, it's required.
    • 0:54:31Unique means, I want to make sure that every row in this column
    • 0:54:37is a unique value.
    • 0:54:39It doesn't appear twice in my data set.
    • 0:54:42So let's try applying a few of these to our schema here.
    • 0:54:46Go back to my terminal.
    • 0:54:48And now let me check out this.
    • 0:54:50Well, I could try applying the not null constraint
    • 0:54:55when I know I want a column to be required, effectively.
    • 0:54:59Now where could I best apply that?
    • 0:55:01Maybe I could apply that to the name column in stations,
    • 0:55:05like stations must have a name.
    • 0:55:07So I'll say, the name column cannot be null.
    • 0:55:11It cannot be empty in here.
    • 0:55:14Line also should probably be not null.
    • 0:55:17A station must be part of some line.
    • 0:55:19I can't have an empty value for line.
    • 0:55:21So I'll say, too, this should be not null.
    • 0:55:26Now I could apply this up at name.
    • 0:55:28I could say, riders must have a name, too.
    • 0:55:31Let me try that.
    • 0:55:32I'll say text not null.
    • 0:55:34Or I could leave it optional.
    • 0:55:35I could say, maybe text just on its own and let
    • 0:55:39riders choose to supply a name or not.
    • 0:55:42Now the question here is, should I apply not null to my primary key columns,
    • 0:55:48like ID not null or ID not null here?
    • 0:55:53You might think that you should, for thoroughness sake.
    • 0:55:56Well it turns out that when you apply the primary key table constraint down
    • 0:56:01below here, this already ensures that there are several constraints applied
    • 0:56:06to this particular column called ID, among them being that ID cannot be
    • 0:56:12null.
    • 0:56:13So no need to duplicate this and say that this ID cannot be null when I
    • 0:56:17already have it specified down below that ID is a primary key.
    • 0:56:23Let me check others here.
    • 0:56:25You might also think, could I do it for rider ID and station ID?
    • 0:56:29Should I include not null here?
    • 0:56:32Rider ID, not null, station ID, not null.
    • 0:56:36Well, that would be a good thought, but again, we're
    • 0:56:39taken care of by our table constraint using our foreign key here.
    • 0:56:43Again, this constraint will say, if rider ID doesn't already
    • 0:56:48exist in the ID column of rider's, I can't insert that value.
    • 0:56:53And we could probably presume that if rider's ID is a primary key,
    • 0:56:58well, null will not be part of this column,
    • 0:57:01and therefore, I already can't insert null for rider ID or station ID.
    • 0:57:06This would be in this case, redundant.
    • 0:57:08So not null is good when you have columns that are neither
    • 0:57:12primary keys nor foreign keys and you want
    • 0:57:15to make sure that they have-- they always have a value, that they
    • 0:57:20are never null, effectively.
    • 0:57:25So that is not null.
    • 0:57:27So we could keep going here.
    • 0:57:28We also had one called unique that makes sure every value,
    • 0:57:34every row in this column is unique.
    • 0:57:37Where could we apply this?
    • 0:57:38I could try to apply it, let's say, to the name of a station.
    • 0:57:42Like, station should have unique names.
    • 0:57:44So I'll say not null and unique.
    • 0:57:47Now this column has two constraints.
    • 0:57:50The first, not null.
    • 0:57:51It should always have a value.
    • 0:57:53The second, unique.
    • 0:57:55The value shouldn't repeat throughout this column.
    • 0:57:58Line I might leave without this constraint.
    • 0:58:01I could imagine two stations being on the same line, like both on blue,
    • 0:58:04I'll allow that in this case.
    • 0:58:07Now again, we could try to apply unique to our primary keys or our foreign keys
    • 0:58:12as I just did here, but it's already taken
    • 0:58:15care of for us using this primary key constraint.
    • 0:58:19A primary key, again, is always going to be unique and never null.
    • 0:58:24So we'll take advantage of that already using our primary key and foreign key
    • 0:58:29constraints here.
    • 0:58:32OK.
    • 0:58:33So we've seen unique and not null.
    • 0:58:36And I might argue, we're at the point where this schema seems
    • 0:58:39to be fairly optimized, at least using our column constraints, our table
    • 0:58:44constraints, our type affinities and so on.
    • 0:58:47So let's ask then, what questions do we have on not null and unique, if any?
    • 0:58:55AUDIENCE: So basically, to recap if I understood correctly, it's
    • 0:58:58not precisely about not null and unique but about
    • 0:59:01the concept of the key, labeling key immediately
    • 0:59:06gives the attribute of not null, unique, and to be referenced, right?
    • 0:59:10CARTER ZENKE: That is true.
    • 0:59:11So when you use a primary key or a foreign key constraint,
    • 0:59:14there are other constraints that go along with that constraint.
    • 0:59:17A primary key, for instance, must not be null, it must be unique, and so on.
    • 0:59:23So it would be redundant to apply that again,
    • 0:59:25to say that this primary key should be unique or not null.
    • 0:59:29Good clarification there.
    • 0:59:31OK.
    • 0:59:32So I think we're at the point where this schema is pretty well set for us,
    • 0:59:36and we're going to need to think about how we use this in the real world.
    • 0:59:42Like, if this is our schema here, we have riders and stations, well,
    • 0:59:48what could we do?
    • 0:59:49I mean, riders tend to--
    • 0:59:52I mean, they could register for the subway.
    • 0:59:54And riders, well, they do visit stations.
    • 0:59:56But I think if we applied this to the real world,
    • 0:59:59we'd see that this isn't quite how it's actually done.
    • 1:00:02Like, riders don't really register that often.
    • 1:00:05If a rider who's new to the city comes in, they want to ride the subway,
    • 1:00:08they should be able to ride, too.
    • 1:00:10So it turns out at least here in Boston, the MBTA doesn't really
    • 1:00:15track riders, per se, but they do track what we call CharlieCards.
    • 1:00:20If you want to ride a subway, you get a CharlieCard.
    • 1:00:23A CharlieCard allows you access to the subway,
    • 1:00:26keeps track of your fare, your balance and so on,
    • 1:00:28and allows you to swipe in to certain stations.
    • 1:00:31So when we come back from a break here, we'll
    • 1:00:33see how we can actually implement these CharlieCards instead of riders
    • 1:00:37to make our system more efficient and more in line with what
    • 1:00:40happens in the real world.
    • 1:00:42We'll see you all in just a few.
    • 1:00:45And we're back.
    • 1:00:46So we saw last time, this challenge of designing
    • 1:00:49a system for the Massachusetts Bay Transportation Authority
    • 1:00:53to represent riders and stations-- people
    • 1:00:55who get on a subway to go around Boston.
    • 1:00:58But as we also saw, we learned that the MBTA doesn't
    • 1:01:03keep track of riders themselves, they keep track of CharlieCards, this card
    • 1:01:09that a rider might carry in their pocket.
    • 1:01:11And they can scan this card at a station to enter or even to exit in some cases,
    • 1:01:17to make sure that their fare is counted as they enter that station.
    • 1:01:21So let's think now how to improve our schema to represent not just riders,
    • 1:01:27like me, But CharlieCards themselves people might carry around
    • 1:01:31when they enter the subway station.
    • 1:01:34Well, we saw before we had riders and cards.
    • 1:01:37But our goal now is to remove riders from the picture, focus only on cards.
    • 1:01:43Well cards, as we've seen, might make us swipe at a station.
    • 1:01:47If I enter Harvard station, I might swipe my CharlieCard
    • 1:01:51to enter that station.
    • 1:01:54And we could see that a card would have maybe many swipes involved.
    • 1:01:59Like if I swipe at Harvard, I might also swipe it MIT or swipe it Park Street,
    • 1:02:04and so on.
    • 1:02:05We could see that a swipe can only happen at a single station at a time,
    • 1:02:10though.
    • 1:02:10Like if I swipe at Park Street, I'm only swiping at Park Street here,
    • 1:02:14and similarly, a swipe might only involve one, single card.
    • 1:02:20So if we think about these entities and how they relate,
    • 1:02:23we could also think about what kinds of columns you have inside of each entity.
    • 1:02:27In this case, I would argue we have something a bit like this.
    • 1:02:30We could say that a card has a ID for its primary key, in this case.
    • 1:02:36Similarly, this card makes a swipe, and this swipe has, itself, an ID,
    • 1:02:42as well as a type, some time that that swipe happened,
    • 1:02:46and an amount or a transaction that is involved.
    • 1:02:49So for example, let's say I swipe in at the Harvard Square Station.
    • 1:02:54That type of swipe might be to enter the station at some certain time.
    • 1:02:59Now associated with that swipe is also some amount
    • 1:03:03in dollars that happened to be subtracted from my card.
    • 1:03:07Like let's say the fare nowadays is $2.40, well,
    • 1:03:10that amount is subtracted from my card from this swipe.
    • 1:03:15Now, of course, I do all of this at a station, which has our same columns
    • 1:03:20from before, ID, name, and line.
    • 1:03:23So a similar idea now, but we're replacing riders with cards
    • 1:03:28and adding more information to these visits.
    • 1:03:31There are instead swipes that could be maybe entering the station,
    • 1:03:34exiting the station or just adding funds to my balance, for instance.
    • 1:03:39So let's see how we could improve our schema now
    • 1:03:41using some new SQL keywords to alter our tables and add some new ones, too.
    • 1:03:47I'll go back to my computer.
    • 1:03:49And let's see how we could try to alter the table we've already created.
    • 1:03:54We already have a riders table, a visits table and a stations table,
    • 1:03:58but we could learn some new statements, some new queries to actually
    • 1:04:02update and alter these tables as well.
    • 1:04:06The first one as we saw a little bit before
    • 1:04:08is this one called drop table, arguably the most dramatic thing
    • 1:04:11you can do to table, is just drop it, delete it like this.
    • 1:04:14So let's try just deleting the riders table from our database.
    • 1:04:18I'll go back to my--
    • 1:04:21go back to my database here.
    • 1:04:24I'll type sqlite mbta.db to pull up this database again.
    • 1:04:29And now if I type dot schema, I'll see I have a riders, stations, and visits
    • 1:04:34table.
    • 1:04:34But no longer-- I don't want to have the riders table.
    • 1:04:38I want to remove that.
    • 1:04:39So I'll say, drop table riders, semicolon.
    • 1:04:45Now I'll hit enter.
    • 1:04:46No more riders table.
    • 1:04:48If I type dot schema, that's gone for my database.
    • 1:04:52Well, what could I do now?
    • 1:04:54I've dropped the table, but I'd still need to maybe update
    • 1:04:58visits to instead be swipes.
    • 1:05:00I could probably leave stations as is.
    • 1:05:02But I want to update a table or alter its schema,
    • 1:05:06I can use some new commands as well.
    • 1:05:09I'll use this one called alter table.
    • 1:05:12Alter table looks a bit like this.
    • 1:05:16I can use the alter table statement here and give it some table name.
    • 1:05:22Like let's say alter table visits, and then inside this visits table,
    • 1:05:27I get to ask, what do I want to do?
    • 1:05:30Do I want to, let say, rename the table?
    • 1:05:33I could do that.
    • 1:05:34I could also decide to add a column, rename a column,
    • 1:05:37or drop a column altogether.
    • 1:05:40Let's try first looking at rename to.
    • 1:05:43I want to rename this visits table to a swipes table,
    • 1:05:47representing not just a visit to the station, but a swipe of a card.
    • 1:05:51So let's try this one here.
    • 1:05:53I'll go back to my computer and I'll go back to SQLite.
    • 1:05:57And I'll say, I no longer want visits to be called visits.
    • 1:06:01I ideally want visits to actually be called swipes.
    • 1:06:06So let me try this.
    • 1:06:07I'll say alter table, as we saw before, visits
    • 1:06:13rename to swipes, like this, semicolon.
    • 1:06:18Now I'll hit enter.
    • 1:06:19And now if I type dot schema, again, I see--
    • 1:06:24oops, kind of strange--
    • 1:06:25I'll hit dot schema again, I'll see swipes,
    • 1:06:28no longer called visits, but now called swipes.
    • 1:06:32What we saw-- I'd ideally like to add a type of the swipe,
    • 1:06:36maybe I'm entering the station, maybe I'm exiting.
    • 1:06:40Maybe I'm just adding funds or depositing some funds.
    • 1:06:43So let me try adding, let's say, a new column to swipes.
    • 1:06:47I'll say alter table, add column--
    • 1:06:51alter table, swipes, I'm going to name that table, swipes, like this.
    • 1:06:57And then let's add a column--
    • 1:07:00add a column called type.
    • 1:07:03And this will have the type affinity text.
    • 1:07:06I'll hit semicolon, enter.
    • 1:07:08And now if I type dot schema, I'll see--
    • 1:07:12well, I have a column called a ttpe.
    • 1:07:16So clearly, I made a typo here.
    • 1:07:18I had rider ID, station ID.
    • 1:07:20This new column down below called ttpe, I kind of want to fix this, right?
    • 1:07:26I don't want ttpe, I want type.
    • 1:07:28So probably good chance to use my rename column.
    • 1:07:31I'll come back here and I'll try that.
    • 1:07:34I'll instead do alter table--
    • 1:07:37alter table, swipes, and I will rename a column now.
    • 1:07:43I'll say rename the column, ttpe to type, spelled correctly now.
    • 1:07:51Hit semicolon, clear my terminal, type dot schema.
    • 1:07:55And now I see over here that type has been added to my table of swipes.
    • 1:08:01I see rider ID, station ID, and now a new column called type.
    • 1:08:06So through alter table, we go ahead and add new columns, rename them even,
    • 1:08:11or if I wanted to just drop the column altogether.
    • 1:08:15Let's say I add this column type and I change my mind,
    • 1:08:19I don't want it here anymore.
    • 1:08:20I go back to my computer and I could try dropping a particular column.
    • 1:08:24Let me try in this case, alter table, swipes.
    • 1:08:28And now let me drop column type, semicolon.
    • 1:08:33Hit enter.
    • 1:08:34And now if I type dot schema, I'm back to where I began.
    • 1:08:40So these are the commands we can use to alter our tables, to improve them
    • 1:08:45if you make a mistake during our first create table command,
    • 1:08:48or if you want to add more functionality later down the line.
    • 1:08:52So ideally, I could keep using alter table, add table, create table, drop
    • 1:08:58table, and so on.
    • 1:08:58But what I want to do here is just start from scratch.
    • 1:09:01Like I have stations and swipes and so on.
    • 1:09:04Why don't I just go back to my schema file and use that instead?
    • 1:09:07So what I'll do is I'll drop table for stations and--
    • 1:09:11oops, semicolon.
    • 1:09:12I will drop table for swipes now, semicolon.
    • 1:09:17I'll type dot schema, and now I'll see nothing in here.
    • 1:09:22I'll quit and I'll type code schema.sql.
    • 1:09:26Let me just start from scratch using this schema.sql.
    • 1:09:29So we no longer want to have riders, we only want to have cards.
    • 1:09:35So I could just rename this table here.
    • 1:09:38I'll call it cards.
    • 1:09:39Create table, cards.
    • 1:09:41Now cards don't have a name, they only have some unique ID in this case.
    • 1:09:47Now I'll leave stations just as it is.
    • 1:09:50This seems perfectly fine to me.
    • 1:09:51Stations have an ID, a name and a line with these constraints
    • 1:09:55applied to the name and line columns.
    • 1:09:58But now the visits table.
    • 1:10:00Well, the visits table is no longer a visit to the station per se,
    • 1:10:04it's more a swipe of the card at that station.
    • 1:10:07So let's now say visits becomes swipes.
    • 1:10:10And among these new columns to add are the following.
    • 1:10:15I want to have not just an ID for each swipe,
    • 1:10:18not an ID for rider ID and station ID and so on,
    • 1:10:22but also I want to have a type of swipe.
    • 1:10:26Am I entering the station, exiting, or depositing some funds?
    • 1:10:30So I'll say the type of this is going to be a new column,
    • 1:10:34and the data type, this whole will be text, enter, exit,
    • 1:10:40or deposit for funds.
    • 1:10:42Now let me try another column, two.
    • 1:10:45I'll include a date time.
    • 1:10:47A date time is like a timestamp.
    • 1:10:49What time did this swipe actually happen?
    • 1:10:51And I'll make this type numeric.
    • 1:10:54Numeric can store all kinds of dates and times for me in this table.
    • 1:11:00Now let me add one final column.
    • 1:11:02This one will be an amount.
    • 1:11:04I'll also use numeric for this kind of column here.
    • 1:11:09And I'll say that this column called amount
    • 1:11:13can store, in this case integers or real numbers, like floats and so on.
    • 1:11:18I'll probably decide on that when I actually add some data to this table.
    • 1:11:24So here we've updated our schema to represent that diagram we saw before.
    • 1:11:28I have cards, I have stations, and I have
    • 1:11:31swipes that have some type associated with them, enter, exit, deposit,
    • 1:11:36some date, time, and some amount that was charged to me
    • 1:11:39while I made this swipe.
    • 1:11:41Either I added some funds, in which case amount is positive,
    • 1:11:44or I subtracted some funds, entering exiting, in this case from the station.
    • 1:11:50All right.
    • 1:11:50So I have these tables now.
    • 1:11:53And now I want to probably apply some of those same column
    • 1:11:56constraints we saw before.
    • 1:11:58Like here, it's fine, but I also want to make sure I'm not adding some data
    • 1:12:02that I don't want to add to this table.
    • 1:12:05So I could go back to my old friends, these column constraints.
    • 1:12:09And we saw before, we had default and not null, unique and check.
    • 1:12:14We've used not null and unique, but we haven't check or default.
    • 1:12:20So let's start using more than just not null and unique
    • 1:12:22and also focus on check and default, what they can do for us here.
    • 1:12:27I'll go back to my schema.
    • 1:12:29And let me just make sure that I'm making all the columns that I
    • 1:12:32want to be required actually required.
    • 1:12:35I'll go into my swipes table and I'll say that type--
    • 1:12:39I mean, this should be required.
    • 1:12:41I should know whether I entered, exited, or deposited some funds.
    • 1:12:46So I'll say text not null.
    • 1:12:48Similarly, for the timestamp, the time this swipe happened,
    • 1:12:52I want that to be not null as well.
    • 1:12:54I must know what time I swiped the card.
    • 1:12:57And also, it makes sense for there to always be
    • 1:13:00some amount associated with this swipe.
    • 1:13:02Either I added some funds to my balance or I removed some funds overall.
    • 1:13:07So I'll make this not null as well.
    • 1:13:11Well, let's see what we could do here with default.
    • 1:13:14Default gives me some default value, some initial value
    • 1:13:18to add to this column if I don't specify what kind of value to add,
    • 1:13:25what be good for date time.
    • 1:13:27Date time here is, again, the timestamp.
    • 1:13:30The time I swiped this card to enter, let's say, Harvard station.
    • 1:13:35Well, if I want this to always have the current time,
    • 1:13:40I could use default. I could say the default
    • 1:13:43value for this column is this special value here,
    • 1:13:48CURRENT TIMESTAMP, in all caps.
    • 1:13:52CURRENT TIMESTAMP will show me the year, the month, the day, the hour,
    • 1:13:57the minute, the second, all in one value, and insert that into my table.
    • 1:14:02So as soon as I add new row, if I don't supply a time for this datetime column,
    • 1:14:08I'll instead get back the current time exactly as it is represented in SQLite.
    • 1:14:15Now what can I do further than this?
    • 1:14:17I could also try to add a-- my very own check.
    • 1:14:21Maybe I want to make sure that the amounts here are never equal to zero.
    • 1:14:26Like, nobody should ever be able to make a $0 transaction or $0 swipe.
    • 1:14:31They're always being charged some money or they're always
    • 1:14:33depositing some money, in this case.
    • 1:14:36So I could say, amount here has my very own check on this column.
    • 1:14:41And inside check, I can actually write my very own expression to check for.
    • 1:14:45I could say, for example, amount is not equal to zero.
    • 1:14:51Using those same operators we saw back in week zero,
    • 1:14:54this will ensure that any value inside amount will not be equal to zero.
    • 1:15:01Let's try also for type.
    • 1:15:03I mean, type can only have a few values.
    • 1:15:06We saw enter, exit, deposit some funds.
    • 1:15:11I could make sure that only those values are included inside my type column.
    • 1:15:16I could say check that type is in some list of values.
    • 1:15:20Going back to week one here, we talk about in.
    • 1:15:23I could say, maybe the type is in enter exit or is in deposit.
    • 1:15:31So now when I have this table called swipes,
    • 1:15:34I'm representing what I'm actually doing when I go to Harvard station.
    • 1:15:38I have here a visit for myself, my very own ID
    • 1:15:42here, which will update in second.
    • 1:15:44I have a station ID where I'm actually going to visit.
    • 1:15:47I have a type that I'm going to use to enter, exit, or deposit
    • 1:15:52some funds at this station.
    • 1:15:54I'm doing it at a certain time, and I have an amount
    • 1:15:56associated with this transaction.
    • 1:16:00Now, there's one thing to fix here, which
    • 1:16:02is that we're still talking about riders inside our swipes table.
    • 1:16:07So let's fix that here, too.
    • 1:16:08I'll go back to my computer, and let's try fixing this.
    • 1:16:12I have a rider ID inside of my swipes table, but no longer do I have riders,
    • 1:16:19I have cards.
    • 1:16:20So let me say that this is now a card ID, and down below in my foreign key,
    • 1:16:25I'll say that this card ID column references the ID column in cards.
    • 1:16:32And I think this should represent everything
    • 1:16:35I want to represent about swipes at the station.
    • 1:16:39So let me ask now what questions we have about this new schema
    • 1:16:44and the constraints we've applied.
    • 1:16:47AUDIENCE: I wonder, how could you delete or drop
    • 1:16:50the table of riders when you use ID as a foreign key?
    • 1:16:56I try to do that, but I got an error.
    • 1:17:00CARTER ZENKE: Yeah, so you're getting into some more advanced stuff here.
    • 1:17:03And suffice to say for now, my foreign key constraints
    • 1:17:06aren't actually being checked right now, but yours might be.
    • 1:17:09If you try to drop a table that actually has some data that's
    • 1:17:14referenced by a foreign key, SQLite will warn you, perhaps to tell you,
    • 1:17:17you can't do that because this ID is referenced from this table over there.
    • 1:17:21So in that case, best to delete the value that has that foreign key
    • 1:17:26and then proceed with dropping that table altogether.
    • 1:17:30Good question there.
    • 1:17:31Yeah, let's take one more.
    • 1:17:34AUDIENCE: How might this syntax send the other SQL languages, same as SQLite
    • 1:17:43languages, how much is syntax same?
    • 1:17:46CARTER ZENKE: Yeah, a good question.
    • 1:17:48So here we're using the SQLite database management system.
    • 1:17:52It is similar to but distinct in some ways
    • 1:17:55from others like MySQL or PostgreSQL.
    • 1:17:57I would say that most of what you're doing here
    • 1:18:00could also be ported to MySQL and PostgreSQL with a few exceptions
    • 1:18:04that you might need to treat on a case by case basis.
    • 1:18:07In fact, the developers of SQLite built things
    • 1:18:10so that it would be easy to port their database schemas to another schema
    • 1:18:14like MySQL or PostgreSQL as well.
    • 1:18:18But good question there.
    • 1:18:21All right, let's take one more question here.
    • 1:18:23AUDIENCE: Imagine if we put in the ID, we don't put any data type,
    • 1:18:30it's going to give us--
    • 1:18:32in this schema, it's going to give us a text or--
    • 1:18:37CARTER ZENKE: Yeah, great question.
    • 1:18:38So you're asking if I didn't tell SQLite what kind of type affinity
    • 1:18:42a column had, what type of affinity would it actually have?
    • 1:18:46A great question.
    • 1:18:47In this case, by default, SQLite gives the numeric type affinity,
    • 1:18:52where numeric can store integers or real values.
    • 1:18:56But if you give it like a string of text, like let's say red line,
    • 1:18:59it will store that for you, too, kind of non-intuitively, but it will.
    • 1:19:04But the default type affinity is numeric in this case,
    • 1:19:08if you don't otherwise specify.
    • 1:19:10Good question.
    • 1:19:11OK, let's come back then and let's focus on wrapping up on a final few pieces
    • 1:19:17here.
    • 1:19:17So we've seen some table constraints, which
    • 1:19:20we apply primary keys and foreign keys.
    • 1:19:23We saw column constraints, where we could make sure
    • 1:19:26that certain values were given to us through not null.
    • 1:19:28We could also make sure that the--
    • 1:19:31let's say the value is in some list of values,
    • 1:19:33they're using check or making sure that it's not some value,
    • 1:19:37also using check as well.
    • 1:19:39Default allows us to specify a given value for every new insertion of rows
    • 1:19:44here.
    • 1:19:45And this is actually pretty important to have.
    • 1:19:49Not just schemas that have column names and let's
    • 1:19:53say type affinities as well, but also constraints,
    • 1:19:57make sure the data we insert works well for us.
    • 1:20:00And there's actually a story behind this person
    • 1:20:02who's on the CharlieCard, this person who's on the subway.
    • 1:20:07His name is Charlie, and he's perhaps the most famous subway
    • 1:20:11rider in all of Boston.
    • 1:20:13Back in the 19-- let's say 1900s, the band
    • 1:20:17called The Kingston Trio wrote a song about this man named Charlie.
    • 1:20:21Charlie, supposedly, got on at the Kendall Square station where MIT is,
    • 1:20:26and he made for Jamaica Plain.
    • 1:20:27But once he got to Jamaica Plane, the conductor
    • 1:20:30asked him for one more nickel, and well, he didn't have that nickel.
    • 1:20:34So he got stuck on the train for years, and years, and years.
    • 1:20:36He couldn't get off of the subway.
    • 1:20:38So keep in mind, Charlie, when you're writing your own database constraints,
    • 1:20:42making sure that if you get on the train,
    • 1:20:44you're able to get off of it at some point, right?
    • 1:20:46Don't end up like Charlie in this case.
    • 1:20:49So with this in mind, you're able to design your very own database
    • 1:20:53schemas that keep not just certain columns involved,
    • 1:20:56but also type affinities for those columns, types--
    • 1:20:59those data types-- types the columns can store.
    • 1:21:01You're also able to apply constraints to those columns to make sure
    • 1:21:04that the data you're inserting is that you actually
    • 1:21:07want to have in that column.
    • 1:21:09Next time, we'll focus on actually adding data to our columns,
    • 1:21:13actually write data to a database file to insert,
    • 1:21:16update, and delete that data altogether.
    • 1:21:18So with that in mind, we'll see you next time.
  • CS50.ai
Shortcuts
Before using a shortcut, click at least once on the video itself (to give it "focus") after closing this window.
Play/Pause spacebar or k
Rewind 10 seconds left arrow or j
Fast forward 10 seconds right arrow or l
Previous frame (while paused) ,
Next frame (while paused) .
Decrease playback rate <
Increase playback rate >
Toggle captions on/off c
Toggle mute m
Toggle full screen f or double-click video