CS50 Video Player
    • 🧁

    • 🍬

    • 🥥

    • 🍿
    • 0:00:00Introduction
    • 0:14:12Stacks and Queues
    • 0:23:15Jack Learns the Facts
    • 0:25:39Resizing Arrays
    • 0:44:18Linked Lists
    • 1:33:39Break
    • 1:45:58Trees
    • 2:00:22Dictionaries
    • 2:04:15Hashing and Hash Tables
    • 2:23:33Tries
    • 0:14:13All right,
    • 0:14:15this is CS 50
    • 0:14:17this is week five.
    • 0:14:19And among our goals for today are to revisit some topics from past weeks,
    • 0:14:23but to focus all the more on design possibilities,
    • 0:14:27particularly by way of data structures.
    • 0:14:28So data structures again is this way via which you can structure your data,
    • 0:14:32but more specifically and see it's like how you can use your computer's memory
    • 0:14:36in interesting and dares say clever ways
    • 0:14:38to actually solve problems more effectively.
    • 0:14:40But we're gonna see today that there's actually
    • 0:14:42different types of data structures.
    • 0:14:45And we'll make the distinction between abstractions like high level
    • 0:14:48descriptions of these structures and the lower level implementation details,
    • 0:14:52so to speak.
    • 0:14:52So in particular, we'll talk first today about what we call abstract data type.
    • 0:14:56So an abstract data type is kind of like a data structure,
    • 0:14:59but it offers certain properties,
    • 0:15:01certain characteristics and it's actually up to the programmer,
    • 0:15:04how to implement the underlying implement
    • 0:15:06detail.
    • 0:15:07So for instance,
    • 0:15:08there's actually this abstract data type that's common in computing
    • 0:15:11known as a cue and from the real world,
    • 0:15:13most of us are presumably familiar with cues otherwise known
    • 0:15:16in the US typically as lines or forming lines.
    • 0:15:19In fact, I have here three stacks of cook three bags of cookies.
    • 0:15:23Could I get like three volunteers to come up on stage and queue up? Ok.
    • 0:15:26I saw your hand first. How about your hand? Second and
    • 0:15:29in the blue, in the blue light.
    • 0:15:34L
    • 0:15:34ok. Come on down. Just do three.
    • 0:15:37And if you could kindly come on up,
    • 0:15:43come on up. Yep. There's our blue.
    • 0:15:46Come on over and if you want to queue up over here, if you could
    • 0:15:53come on down. Thank you as we begin. Do you want to introduce yourselves first?
    • 0:15:57Uh, my name is
    • 0:15:57Holly
    • 0:15:58Harwit.
    • 0:15:58Um, I'm the first year of studying computer science and economics and,
    • 0:16:02uh, I
    • 0:16:03think
    • 0:16:05all right, next.
    • 0:16:07Hi,
    • 0:16:07my
    • 0:16:07name is Katherine. Um, I'm planning on studying engineering.
    • 0:16:10Not sure mechanical or electrical yet, but one of the two.
    • 0:16:13and I'm currently in Kennedy.
    • 0:16:14Nice, nice to meet you.
    • 0:16:16Hi, everyone. I'm Isabella.
    • 0:16:17I'm in Strauss and I plan on majoring in computer science. Wonderful.
    • 0:16:21Well, welcome to all three of you. And I think this will be pretty straightforward.
    • 0:16:24I have here these three bags of cookies you formed nicely this line or this queue.
    • 0:16:28So if you'd like to come up first and take your cookies,
    • 0:16:31thank you. And right that way, that's all there is to this demonstration,
    • 0:16:34your cookies as well, right this way
    • 0:16:37and your cookies right this way. Wonderfully well done.
    • 0:16:39Thank you to our volunteers.
    • 0:16:41The point is actually sincere though simple as that demonstration was
    • 0:16:45and as easy as it was to get those cookies,
    • 0:16:47cues actually manifest a property that actually is germane
    • 0:16:51to a lot of problem solving in computing.
    • 0:16:53And the real world specifically cues offer this characteristic
    • 0:16:57fifo first in first out. And indeed, as our volunteers just noticed
    • 0:17:01they queued up on stage 123, that is the order in which I handed them their cookies.
    • 0:17:06And dare say it's a very equitable approach.
    • 0:17:08It's very fair first come first served, might be a more casual way of describing
    • 0:17:12FIFO first in first out.
    • 0:17:14But there's other ways to store data, especially as data is coming in
    • 0:17:19to a computer system. And we might also describe data as being known as oops. Ah
    • 0:17:25let me remind
    • 0:17:27as for the
    • 0:17:28now structures like these actually offer specific operations that makes sense.
    • 0:17:32And in the context of cues,
    • 0:17:33we generally describe these operations as N Qing and D Qing.
    • 0:17:36So when our first three volunteers came up, they N Qed and as I handed them each,
    • 0:17:41a bag of cookies, they D Qed and exited in that same order.
    • 0:17:44Now, how could you go about implementing A Q
    • 0:17:47in code specifically in C?
    • 0:17:49Well, we can actually implement it in bunches of different ways,
    • 0:17:52but perhaps the most obvious is to borrow our old friend
    • 0:17:54namely arrays and we could use a data
    • 0:17:57structure that looks a little something like this,
    • 0:17:59whereby we specify the total capacity of this data structure.
    • 0:18:04For instance, we might store a total of 50 people or just three.
    • 0:18:07In this case,
    • 0:18:08we might define our structure then as containing those people as simply an array.
    • 0:18:12And if a person is a data type that we've defined in week past,
    • 0:18:15you can imagine each of our volunteers is indeed a person and we've stored
    • 0:18:19them one after the other continuously in memory by way of this actual array.
    • 0:18:25But we do need to keep track inside of a queue using one other piece of data,
    • 0:18:29namely we need to keep track of an integer like
    • 0:18:31the size like how many people are actually in the queue
    • 0:18:34at this moment.
    • 0:18:35Because if we have a total capacity of 50 I'd
    • 0:18:38like to know if I only have three volunteers that
    • 0:18:40I can do some quick arithmetic and know that I
    • 0:18:42could have fit another 47 people in this same queue.
    • 0:18:46But it's finite.
    • 0:18:47Of course, if we had 50 volunteers all wanting cookies,
    • 0:18:50that's as many people as we could actually handle.
    • 0:18:52So there is this upper bound then on how many we could fit.
    • 0:18:55But there's yet other ways for storing data inside of a computer's memory.
    • 0:18:59And there's this other abstract data type known as
    • 0:19:01a stack and stacks are actually omnipresent as well.
    • 0:19:04Even though it's not necessarily the system you would want when you line up on stage,
    • 0:19:09for instance, could we get three more volunteers?
    • 0:19:12OK. I saw hands here right here and right here. Come on down.
    • 0:19:16We'll have the orchestra come up this time.
    • 0:19:18All right.
    • 0:19:19Come on over.
    • 0:19:21And if you wouldn't mind, come on over, we'll do introductions first.
    • 0:19:26This will be almost as easy as the last one if you wanna
    • 0:19:28introduce yourself and let me just stack you against the lectern this time.
    • 0:19:30So if you could go there
    • 0:19:32and if you could come over here and if you could come over here,
    • 0:19:35we'll stack all three of you. So you were first. So you're first in the stack. Hi, I'm,
    • 0:19:39I have no idea what I'm studying and I live in Strauss.
    • 0:19:42Wonderful. And next I am tonight, I am studying Econ and CS and I live in Canada.
    • 0:19:47Um Hi, I'm Claire. I want to study applied math and I'm in Wigglesworth. Wonderful.
    • 0:19:50Welcome to all three of you.
    • 0:19:52And if I may let me just advance a bit more information about stacks,
    • 0:19:56the catch is that stacks uh actually support what's known as
    • 0:20:00LIFO. So last in first out, which is sort of the opposite really of a Q or a line.
    • 0:20:04So, in fact, you were last in line. So here we have your cookies. Thank you so much.
    • 0:20:08If you'd like to exit that way, we have your cookies here. Thank you so much.
    • 0:20:11We'd like to you to exit this way.
    • 0:20:13And uh even though you were first like,
    • 0:20:16Lifo doesn't really give you any cookies because you're first in, not last
    • 0:20:21in and so,
    • 0:20:22yeah. Ok. Points made. We'll give you their cookies.
    • 0:20:25All right. So thank you to all three of our volunteers. But
    • 0:20:28Lifo suffice it to say,
    • 0:20:31doesn't offer the same fairness guarantees as
    • 0:20:34a queue or a line more traditionally and
    • 0:20:36imagine just go lining up in any store or the dining hall or the like,
    • 0:20:39ideally, you want the people running the place to adhere to that cue to that line.
    • 0:20:45So that
    • 0:20:45FIFO is preserved if you indeed care about being first,
    • 0:20:48whereas there are contexts in which
    • 0:20:50Lifo does actually make sense.
    • 0:20:53In fact, if you think about gmail, your inbox or outlook, typically,
    • 0:20:56you're viewing your inbox as a stack, right?
    • 0:20:59Because when you get new mail, where does it end up? It
    • 0:21:01actually ends up on the top, on the top and the top.
    • 0:21:04And if you're like me, odds are, which emails do you tend to first?
    • 0:21:08I mean, probably the ones on the top, the ones that came in last most recently,
    • 0:21:12that is,
    • 0:21:13and that might actually be to the detriment of people who emailed you earlier today
    • 0:21:17or yesterday because once they sort of fall off the bottom of your screen,
    • 0:21:20frankly, unless you click next, you may never see those emails again.
    • 0:21:23But stacks are indeed one way of storing data
    • 0:21:26and Google and Microsoft presumably made the judgment call that
    • 0:21:29in general, we users want to see most recent data.
    • 0:21:32First, the last information might be the first we want out.
    • 0:21:36Now, just in terms of nomenclature,
    • 0:21:38the two operations that are analogous to N Qing and D Qing.
    • 0:21:42But with this property of lifo are instead called push
    • 0:21:45and pop.
    • 0:21:45So when our first volunteer came up on stage, so to speak,
    • 0:21:48I pushed him onto the stack against the lectern.
    • 0:21:50There, second person was pushed,
    • 0:21:52third person was pushed and then when it was time to hand out the cookies,
    • 0:21:55we popped them, so to speak one after the other.
    • 0:21:58But preserving that LIFO property.
    • 0:22:00But here's where
    • 0:22:01are a little interesting in terms of implementation details.
    • 0:22:04A stack could be implemented almost identically underneath the
    • 0:22:08hood to a queue because what do you need?
    • 0:22:10You need an array of people which we could use our person data type for,
    • 0:22:14for per pass classes.
    • 0:22:16We have to keep track of how many people are in the stack.
    • 0:22:18So that even if we have a capacity of like 50 we
    • 0:22:21know at least that we can store three plus maybe 47 others.
    • 0:22:26Now there's still going to be a
    • 0:22:27change in the underlying implementation details because not
    • 0:22:30picture here is actual C code that actually pushes and pops or NQ and DQ.
    • 0:22:35So whatever loops you're using whatever code you're using.
    • 0:22:39Odds are that's where those properties are going to be implemented.
    • 0:22:42FIFO versus
    • 0:22:43LIFO you're going to implement maybe the loop in this
    • 0:22:45direction instead of this one or some such distinction.
    • 0:22:48But at the end of the day,
    • 0:22:49stacks and cues are just abstract data types in the
    • 0:22:52sense that we can implement them in bunches of ways,
    • 0:22:55two of them among them here, thus far on the screen,
    • 0:22:57but that array is going to come back to bite us because if you only have a
    • 0:23:01capacity of 50 what happens if 51 people want cookies next time?
    • 0:23:05Like you just don't have room for them, even though clearly,
    • 0:23:08we have enough room for the people themselves, we have enough memory.
    • 0:23:11And so it seems a little short sighted to limit
    • 0:23:13just how much data can fit in our data structures.
    • 0:23:16So with that said, a friend of our Shannon Duval at Elon University,
    • 0:23:19kindly put together
    • 0:23:21uh a visualization of the same and allow me to introduce you
    • 0:23:25to two fellows known as Jack and Lou. If we could dim the lights for this video.
    • 0:23:36Once upon a time, there was a guy named Jack when it came to making friends,
    • 0:23:40Jack did not have the knack.
    • 0:23:42So Jack went to talk to the most popular guy.
    • 0:23:44He knew he went up to Lou and asked, what do I do?
    • 0:23:48Lou saw that his friend was really distressed.
    • 0:23:50Well, Lou began just look how you're dressed.
    • 0:23:53Don't you have any clothes with a different look?
    • 0:23:55Yes. Said Jack, I sure do. Come to my house and I'll show them to you.
    • 0:24:00So they went off to Jack's and Jack showed Lou the box
    • 0:24:03where he kept all his shirts and his pants and his socks.
    • 0:24:06Lou said, I see you have all your clothes in a pile. Why don't you wear some others?
    • 0:24:10Once in a while?
    • 0:24:11Jack said, well, when I remove clothes and socks,
    • 0:24:15I washed them and put them away in the box.
    • 0:24:17Then comes the next morning and up by hop, I go to the box and get my clothes up
    • 0:24:22up.
    • 0:24:23Lou quickly realized the problem with Jack, he kept clothes C DS and books in a stack.
    • 0:24:29When he reached for something to read or to wear, he chose a top book or underwear.
    • 0:24:34Then when he was done, he would put it right back back. It would go on top of the stack.
    • 0:24:39I know the solution said a triumphant lou,
    • 0:24:41you need to learn to start using a cue.
    • 0:24:44Lou took Jack's clothes and hug them in a closet.
    • 0:24:47And when he had emptied the box, he just asked it.
    • 0:24:49Then he said, now Jack at the end of the day,
    • 0:24:52put your clothes on the left when you put them away.
    • 0:24:55Then tomorrow morning when you see the sun shine,
    • 0:24:57get your clothes on the right from the end of the line.
    • 0:25:00Don't you see said, lou, it will be so nice.
    • 0:25:03You'll wear everything once before you wear something twice
    • 0:25:06and with everything in cues in his closet and
    • 0:25:09shelf jack started to feel quite sure of himself.
    • 0:25:12Oh, thanks to Lou and his wonderful cue.
    • 0:25:17All right.
    • 0:25:19So the same wonderful thanks to Shannon.
    • 0:25:22So you might have noticed I wear black all the
    • 0:25:24time so we could make a similar gag about.
    • 0:25:26Here's what my stack of clothes at home looks like.
    • 0:25:28Even I, even though I might own a blue and a red sweatshirt doesn't really work.
    • 0:25:31If you're popping everything from a stack every time cleaning it,
    • 0:25:34replenishing the black sweaters before the red
    • 0:25:36or the blue even get popped themselves.
    • 0:25:38But we're gonna focus today,
    • 0:25:40not just on like stacks and cues which for us are
    • 0:25:42really meant to motivate like different ways of designing data,
    • 0:25:45even though the implementation details might differ.
    • 0:25:48But we're gonna start focusing on solving some problems that invariably we'd be
    • 0:25:51bumping up against anyway as we develop more and more real world software,
    • 0:25:54not just smaller programs as in class and arrays recall or what,
    • 0:25:58like what's the key characteristic or definition of an array
    • 0:26:01with respect to your computer's memory and storing things in it. Yeah,
    • 0:26:06perfect. So it stores the data contiguously back to back to back.
    • 0:26:10And as we've seen thus far, when you allocate space for an array,
    • 0:26:14you typically do it with square brackets,
    • 0:26:16you specify a number in those brackets or maybe
    • 0:26:18a constant like capacity like I just did.
    • 0:26:20And that fixates just how much data you can actually store in there.
    • 0:26:23We did see last week though
    • 0:26:25that we could start to use Malo
    • 0:26:27to allocate an equivalent number of bytes.
    • 0:26:30But even that when you call it just once
    • 0:26:32gives you back a specific finite number of bytes,
    • 0:26:34so you're similar to lead, decided in advance how much memory you can store
    • 0:26:38in an array. So let's consider what kinds of problems this would get us into.
    • 0:26:41So here's an array of size three.
    • 0:26:43And suppose for the sake of discussion, we've already put three numbers into it.
    • 0:26:4612 and three, literally suppose now we want to add 1/4 number to that array.
    • 0:26:51Well, where does it go like intuitively and pictorially,
    • 0:26:54you'd like to think it could go there.
    • 0:26:56But remember the context we introduced last
    • 0:26:58week when we talked about computers memories,
    • 0:27:01there's lots of stuff going on.
    • 0:27:02And if you only ask the computer, the opera system for three room for three integers,
    • 0:27:08who knows what's here and here and here,
    • 0:27:10not to mention everywhere else on the screen.
    • 0:27:12So if we zoom out, for instance, we might like to put the number four there.
    • 0:27:16But we can't, if in that greater context, there's a lot more stuff going on.
    • 0:27:20So for instance, suppose that elsewhere in my same programmer function,
    • 0:27:24I've already created a string like hello comma space world backslash
    • 0:27:29zero just by bad luck that could be allocated right next to my 1231.
    • 0:27:34Well, if I ask the operating system for space for three numbers,
    • 0:27:37then I asked the operating system for space for a string,
    • 0:27:40it's pretty reasonable for the computer to put those things back
    • 0:27:43to back because it's not going to anticipate for us that.
    • 0:27:46Well, maybe they actually want four numbers eventually or five numbers or more.
    • 0:27:51Now, as for all of these Oscars, the grouch that's just meant to represent,
    • 0:27:54uh pictorially here, the notion of garbage values,
    • 0:27:57like there's clearly other bits there and available.
    • 0:28:00I don't know what it is and I don't care what it is,
    • 0:28:02but I do care that I can't just presume to put something right where
    • 0:28:07I want in the computer's memory unless I preemptively ask it for more memory.
    • 0:28:11Now, if all of those are garbage values, which is to say that who cares what they are?
    • 0:28:15It's just junk left over from previous runs of the function or the,
    • 0:28:18like there's clearly plenty of room for 1/4 number.
    • 0:28:22I could put the number four here or here or here or down here or here or here.
    • 0:28:26But why would I not want to just plop
    • 0:28:28the four wherever there is a garbage value currently?
    • 0:28:32Yeah,
    • 0:28:35exactly.
    • 0:28:36I want to be next to my array of 123 because again,
    • 0:28:38arrays must be and must remain contiguous.
    • 0:28:42Now, that's not a deal breaker, right?
    • 0:28:43Because where else could I put maybe the entire array?
    • 0:28:47Well, there's room up here for four numbers.
    • 0:28:49There's room down here for four numbers. So that's fine.
    • 0:28:51And that could be a solution to the problem if
    • 0:28:53you've run out of space in your fixed size array.
    • 0:28:56Well, maybe I just abstract everything else away
    • 0:28:59and I just move my array to a different location. That's a little bit bigger.
    • 0:29:03But there is going to be a downside even though this is a solution,
    • 0:29:05even though I can certainly copy the one, the two,
    • 0:29:08the three and now I can pop the four there and heck I can then let go of
    • 0:29:12the old memory in some way and give it back to the operating system to be reused later.
    • 0:29:17This is successful.
    • 0:29:18But why
    • 0:29:20intuitively might we not want this to be our solution
    • 0:29:23of creating a new array?
    • 0:29:25That's a little bigger, copying the old into the new and getting rid of the old
    • 0:29:30to
    • 0:29:30make
    • 0:29:30it
    • 0:29:33good. Yeah, I think I had one more step. Suppose I want to add 1/5 number, 1/6 number.
    • 0:29:37Like that's a lot of work.
    • 0:29:38And in fact, what's the expensive part or what's the slow part of that story? Yeah,
    • 0:29:44it takes a lot of time,
    • 0:29:45but specifically what's taking time if we can put our finger on it?
    • 0:29:48Yeah, and back.
    • 0:29:52Ok, for a month, for a period of time I'm using twice as much memory,
    • 0:29:55which certainly seems wasteful because even though I don't eventually need it
    • 0:29:59it is gonna kind of mushroom and then shrink back down,
    • 0:30:01which seems like an inefficient use of resources.
    • 0:30:04But what specifically is slow about this process too?
    • 0:30:07Yeah. And me.
    • 0:30:11Yeah. Good choice of words.
    • 0:30:12You're iterating over the array to copy it over using a four loop A Y loop.
    • 0:30:16So it's probably like big O of N steps just to copy the array.
    • 0:30:19And technically big O of N plus one if we add one more, but that's still big O of N.
    • 0:30:23So it's the copying the moving of the data, so to speak. That's certainly
    • 0:30:27correct. But maybe it's not
    • 0:30:29the best design. Wouldn't it be better if we could do something otherwise?
    • 0:30:33Well,
    • 0:30:33let's consider what this might actually translate into in
    • 0:30:36code and what the implications then might be.
    • 0:30:38Let me switch over here to VS code. Let me propose to open up a file called list dot C
    • 0:30:44brand new and just let's create this list of numbers and then add to it
    • 0:30:48over time and see when and where we actually bump up against these problems.
    • 0:30:51So let me include standard
    • 0:30:52IO dot H
    • 0:30:53in order to simply be able to print things out ultimately
    • 0:30:57in main void. So no need for command line arguments here.
    • 0:31:00Let me give myself a,
    • 0:31:02an array called list of just a size three for consistency with the picture thus far.
    • 0:31:08And now let me go ahead and just manually make it
    • 0:31:10look like in memory what it did on the screen.
    • 0:31:12So list bracket zero is going to equal the number one,
    • 0:31:15list bracket one is going to equal the number
    • 0:31:18two and list bracket two equals the number three.
    • 0:31:21So even though the array is of course, zero, index,
    • 0:31:23I'm just sort of using more familiar 123 as my digits here.
    • 0:31:26Now, suppose I want to print these things out.
    • 0:31:28Let's just do something as a simple exercise.
    • 0:31:30So for N to I equals zero, I is less than three I plus plus
    • 0:31:36uh inside of this loop. I'm going to do something simple, like print out
    • 0:31:39iterative, iteratively as you note
    • 0:31:41uh backslash N list bracket. I
    • 0:31:45so very simple program.
    • 0:31:47It's not the best design because I've got this magic number there.
    • 0:31:49I'm hard coding the three,
    • 0:31:50but the point is just to go through the motions of demonstrating how this code works.
    • 0:31:56Ah Oh, good. You got, oh, you got it in before I hit compile. So wait,
    • 0:32:01thank you.
    • 0:32:03All right.
    • 0:32:04Very. Maybe a round of applause. Thank you.
    • 0:32:07All right.
    • 0:32:10All right. So this is going to get aggressive though eventually.
    • 0:32:12So let me add the semicolon.
    • 0:32:14Let me recompile this list seems to compile. OK?
    • 0:32:19And if I do dot slash list, I should see of course, 123. So the code works.
    • 0:32:23There's no memory constraints here because I'm
    • 0:32:25not trying to actually add some values.
    • 0:32:27But let me consider how I could go about
    • 0:32:29implementing this idea of copying everything from the old array
    • 0:32:33to the new array frankly, just to kind of see how annoying it is, how painful it is.
    • 0:32:38So you're about to see the code escalate quickly and it will
    • 0:32:41be helpful to try to wrap your mind around each individual step.
    • 0:32:44Even though if you take a step back, it's going to look like a crazy amount of code
    • 0:32:48to solve a simple idea.
    • 0:32:49But that's the point we're going to get to a place particularly in week six
    • 0:32:52where all of what we're about to do reduces to like one line of code.
    • 0:32:56So
    • 0:32:56hang in there for now. So let me go ahead and do this. If I want to create
    • 0:33:00a version of this code that
    • 0:33:03can grow to fit more numbers.
    • 0:33:06For instance, how can I go about doing this or at least demonstrate as much?
    • 0:33:10Well,
    • 0:33:10I cannot use an array in this traditional way of using square
    • 0:33:15brackets because that makes list the variable forever of size three.
    • 0:33:20I can't free. It's remember free. You can only use with Mao.
    • 0:33:22So you can't give it back and then recreate it using this syntax.
    • 0:33:26But I can use this trick from last time
    • 0:33:28whereby if I know there is this function called Malo
    • 0:33:31whose purpose in life is to give me memory, I could for instance, red,
    • 0:33:34declare list to be a pointer, so to speak, that is the address of a chunk of memory.
    • 0:33:39And I could ask Malo
    • 0:33:41for a chunk of memory, namely of size
    • 0:33:44three, but not three per se three integers for good measure.
    • 0:33:47So technically,
    • 0:33:48that's three times the size of whatever an in is now for our purposes today,
    • 0:33:53that's technically three times four or 12.
    • 0:33:55But I'm trying to do this very generally in case we use it on an old
    • 0:33:58computer or maybe a future computer where the size of an ink might very well change.
    • 0:34:03That's why I'm using size of in it will
    • 0:34:05tell me always the correct answer for my computer.
    • 0:34:07So to use M
    • 0:34:09lock, not going to catch me on this one. What header file do I need to add
    • 0:34:14standard
    • 0:34:16standard
    • 0:34:16lib dot H. So I'm gonna go ahead and include standard
    • 0:34:19lib dot H which gives me access to Malo.
    • 0:34:22And what I'm gonna additionally do is practice what
    • 0:34:24I preached last week whereby in extreme cases,
    • 0:34:27Malo
    • 0:34:27can return not the address of an actual chunk of memory. What else can malo
    • 0:34:31return in cases of error?
    • 0:34:34Yeah.
    • 0:34:35No null in all caps which represents technically address zero,
    • 0:34:40but you're never supposed to use address zero.
    • 0:34:42So it's a special Sentinel value that just means something went wrong.
    • 0:34:45Do not proceed. So it's going to add some bulk to my code, but it is good practice.
    • 0:34:49So if list at this point actually equals equals no,
    • 0:34:52there's no more work to be done here. I've got to abort the demo altogether.
    • 0:34:55So I'm going to return one just arbitrarily to say we're done with this exercise.
    • 0:34:59It's not going to be German for class. We can
    • 0:35:01find room for three integers. But best practice when using Malo.
    • 0:35:06Now,
    • 0:35:06this code here does not need to change because list
    • 0:35:09is now still a chunk of memory of size 12.
    • 0:35:13I can actually get away with still using square bracket notation
    • 0:35:16and treating this chunk of memory as though it's an array.
    • 0:35:20And this is a bit subtle.
    • 0:35:21But recall from last time, we talked briefly about pointer arithmetic,
    • 0:35:25whereby the computer can do
    • 0:35:26arithmetic, some addition,
    • 0:35:28subtraction on the actual addresses to get from one location to the other.
    • 0:35:32And that's what the computer is going to do here because it says list bracket zero,
    • 0:35:37that's essentially just going to put the number one literally
    • 0:35:40at the chunk that beginning of that chunk of memory.
    • 0:35:43And because this is a modern computer, it's going to take four bytes in total.
    • 0:35:46But I don't want to put the number four here
    • 0:35:49to shift it over myself because I'm using square.
    • 0:35:53And because the computer knows that this, this chunk of memory is being treated
    • 0:35:57as a chunk of addresses of integers, pointer arithmetic sort of magically kicks in.
    • 0:36:02So what the computer is going to do for me is put this one at location zero.
    • 0:36:07It's going to put this number two
    • 0:36:09at location one time size of in so four.
    • 0:36:12And it's going to put this number three at location
    • 0:36:15two times size of in which gives me a.
    • 0:36:18So in other words, you don't have to think about how big that chunk of memory is.
    • 0:36:21If you already gave the compiler a clue as to the size for our purposes today,
    • 0:36:25don't worry too much about that.
    • 0:36:27The bigger takeaway is that when you allocate space using Malo,
    • 0:36:30you can certainly treat it as though it's an array using week two notation,
    • 0:36:35which is arguably simpler than using dots and stars and, and all of that.
    • 0:36:39But this isn't quite enough now because let
    • 0:36:41me stipulate that for the sake of discussion
    • 0:36:43at this point in time here on line 16 where the cursor is blinking.
    • 0:36:48Suppose I realize just for the sake of discussion like ah
    • 0:36:51I should have allocated space for four integers instead of three.
    • 0:36:55Now, obviously, if I were writing this for real,
    • 0:36:57I should just go fix the code now and recompile it all together.
    • 0:37:00But let's just pretend for the sake of discussion that somewhere in your program,
    • 0:37:04you want to dynamically allocate more space and free
    • 0:37:07up the old in order to implement this idea
    • 0:37:09of copying from old to new memory.
    • 0:37:12So how could I do that?
    • 0:37:13Well,
    • 0:37:13let me go ahead and temporarily give myself another chunk of memory
    • 0:37:17and I'm going to literally call it T MP for short,
    • 0:37:19which is a common convention,
    • 0:37:20temp I'm going to set that equal to the amount of space that I actually do now want.
    • 0:37:25So I'm going to say four times the size of an in.
    • 0:37:28So technically, it'll give me 16, but space for four integers this time.
    • 0:37:32And what that's doing for me in code is essentially trying to find me
    • 0:37:37four space for four integers elsewhere that might very well be garbage values now,
    • 0:37:42but I can therefore reuse them.
    • 0:37:44So once I've done this,
    • 0:37:46something could still go wrong and I could check if temp equals equals null,
    • 0:37:50then actually I should exit all together and finish up. But
    • 0:37:54there's a subtlety here and you don't need to dwell too much on this for today.
    • 0:37:58But there is technically a bug right now.
    • 0:38:00Why based on week four last week,
    • 0:38:03might it not be correct technically to
    • 0:38:06immediately return one and abort the program altogether
    • 0:38:10at this point?
    • 0:38:15OK.
    • 0:38:19OK. So when you allocate memory, sometimes there might be garbage values there.
    • 0:38:22That is true.
    • 0:38:23But that is to say that those 16 bits might be
    • 0:38:26garbage values have Oscar the grouches all on the screen.
    • 0:38:30But temp itself will literally be the return value of Mao
    • 0:38:33and Mao
    • 0:38:34will always return to you the address of a valid chunk of memory or it will return. No.
    • 0:38:39So this line is actually OK. What I don't love is that I'm returning one immediately.
    • 0:38:48Yes. So this is where it's subtle.
    • 0:38:50It's not quite right to just abort right now and return one. Why?
    • 0:38:53Because up here, remember a few moments ago, we used Malo
    • 0:38:57presumably successfully because if we got all the way down here,
    • 0:39:01we did not abort on line nine.
    • 0:39:03So we kept going. But that means we've allocated three times the size of it.
    • 0:39:07So 12 bytes earlier.
    • 0:39:09So frankly, if you compile this code, run it and then ask Val Gren,
    • 0:39:12it's gonna
    • 0:39:13a memory leak of size 12 because as you know, we did not free the original memory.
    • 0:39:18So this is where frankly c does get a little annoying because you
    • 0:39:21and I as the programmers have to remember all of these details.
    • 0:39:24So what I really want to do here before I return one
    • 0:39:27to be best practice, I want to free the original list.
    • 0:39:31So I give back those bytes to the operating system.
    • 0:39:33Now as an aside, technically, when any program quits,
    • 0:39:37all of the memory is going to be given back to the office
    • 0:39:39system.
    • 0:39:39But practicing what I'm preaching now will get
    • 0:39:41you into better situations later because if you don't
    • 0:39:45free up memory, you will have leaks.
    • 0:39:47And that's when our own macs and pcs tend to start
    • 0:39:49to slow down and use up more memory than they should.
    • 0:39:52But let's avoid discussion with more error checking there.
    • 0:39:55Let's just assume that now I'm on line 23 of
    • 0:39:57this program whereby I have presumably successfully allocated enough space.
    • 0:40:02So the next step after allocating
    • 0:40:04four bytes is to, as you noted earlier,
    • 0:40:06iteratively copy the old numbers into the new space.
    • 0:40:10So this is actually pretty straightforward.
    • 0:40:12I'm going to go ahead and four
    • 0:40:14I get zero, I is less than three I plus plus, just like I was printing last time.
    • 0:40:20I'm going to go ahead and set the I location of temp
    • 0:40:24equal to the IH location of list semicolon and that's it.
    • 0:40:28I'm just copying
    • 0:40:29into the temporary array, whatever was in the old array.
    • 0:40:34But that still leaves me with this fourth bite.
    • 0:40:36Of course, this four or sorry,
    • 0:40:38this fourth location where I want to put the number four.
    • 0:40:41But if I'm going to do that for the sake of discussion,
    • 0:40:43even though this isn't really a compelling real world program,
    • 0:40:45I'm gonna just manually go into the
    • 0:40:48last location in temp A K A temp bracket three and set that equal to my fourth number.
    • 0:40:55So that's all the whole point here is to mimic encode what it was we wanted to do here.
    • 0:41:01But now there's one more step.
    • 0:41:02What was the next step after copying the one, the two, the three and adding the four?
    • 0:41:06What do I wanna do?
    • 0:41:08Now, I can safely free the list.
    • 0:41:10Now,
    • 0:41:10I want to go ahead and get rid of the original
    • 0:41:12memory or at least hand it back to the operating system.
    • 0:41:14So here is where I can free the list, not in the case of an error,
    • 0:41:19but actually deliberately free the original list because
    • 0:41:22I don't need those 12 bytes anymore.
    • 0:41:24But now if I want to really have quote unquote list point at this new chunk of memory,
    • 0:41:31well, then I could also do this list
    • 0:41:34equals temp.
    • 0:41:36So this is a little weird but recall that list has just now been freed.
    • 0:41:39So even though list technically contains the address of a chunk of memory,
    • 0:41:42it's no longer valid because again, it was freed.
    • 0:41:45So yes, it's still technically there, but it's effectively garbage values now.
    • 0:41:48So I'm certainly free. No pun intended.
    • 0:41:51I'm certainly allowed to update the value of list.
    • 0:41:54And I want list to now point to the new chunk of memory.
    • 0:41:57So sort of metaphorically if list was originally
    • 0:41:59pointing at a chunk of memory there,
    • 0:42:01maybe now I want it to point over here.
    • 0:42:03So I'm just updating the value of list ultimately.
    • 0:42:07All right, now that I've got this all done,
    • 0:42:09I think I can just use this same loop as before I
    • 0:42:11could change the three to a four because I now have four numbers
    • 0:42:14uh at the very bottom of this program though also subtle,
    • 0:42:17I should probably now at the very end free this list.
    • 0:42:20And for good measure, let me go ahead and return zero.
    • 0:42:24But now I think I have a complete program that again,
    • 0:42:27to be clear is not how you would write this in the real world.
    • 0:42:30Because you would not
    • 0:42:31all
    • 0:42:3183 by three integers,
    • 0:42:33then I decide you want to allocate for, then fix all of this.
    • 0:42:37But we could probably borrow copy and paste
    • 0:42:39some of this code into production code eventually,
    • 0:42:41whereby this would solve some actual problems dynamically.
    • 0:42:44So let me cross my fingers make list
    • 0:42:46so far. So good dot slash list and I should see
    • 0:42:491234.
    • 0:42:52So long story short,
    • 0:42:53it's a lot of work just to get from the original array to the second.
    • 0:42:57So ideally, we would not, let's do any of this in the first place.
    • 0:43:01Ideally, what could we do instead?
    • 0:43:03Well, maybe we should just allocate more memory from the get go
    • 0:43:07in order to avoid this problem altogether. So how might I do that?
    • 0:43:10Well, instead of having allocated three,
    • 0:43:13an array of size three, let alone an array of size four,
    • 0:43:16why don't I just proactively from the beginning of my program
    • 0:43:19allocate an array of size 30 or heck 300 or 3000
    • 0:43:24and then just keep track of how much of it I'm using.
    • 0:43:27Like that would be correct. It would solve the problem
    • 0:43:30of not painting yourself into a corner so quickly.
    • 0:43:34But what remains as an issue?
    • 0:43:37I'm using a bunch more memory,
    • 0:43:39especially if this is programs only gonna ever manage a few numbers.
    • 0:43:41Why are you wasting 100 times more memory than you might actually?
    • 0:43:46And there's another corner case that could still arise,
    • 0:43:48even though this sort of solves the problem.
    • 0:43:56Exactly, we can eventually still run into the exact same problem.
    • 0:43:59Because if I want to put 301 numbers in the list or 3001, well,
    • 0:44:03I'm still gonna have to jump through all of
    • 0:44:04these hoops and reallocate all of that space.
    • 0:44:07And honestly, now per your concern about the looping iterating 300 times,
    • 0:44:113000 times is certainly eventually gonna start to add up if we're
    • 0:44:15doing it a lot in terms of speed and slow down.
    • 0:44:18So maybe there's a better way altogether than doing this.
    • 0:44:21And indeed, there is if we start to treat our computers, me
    • 0:44:24as a canvas that we can start to use to design data structures,
    • 0:44:29more generally a razor data structure.
    • 0:44:31Arguably they're super simple, they're continuous chunks of memory,
    • 0:44:34but we could use memory a little more cleverly,
    • 0:44:37especially now per last week that we have pointers,
    • 0:44:40which is sort of painful as they might be to wrap your mind around.
    • 0:44:43Sometimes they really just let us point to different places in memory.
    • 0:44:46And so we can start to stitch things together in an interesting way.
    • 0:44:50So the only syntax we'll really need
    • 0:44:53to do that to sort of
    • 0:44:54stitch things together in memory and
    • 0:44:57build more interesting structures are these things
    • 0:44:59struct which allows us to represent strucks already.
    • 0:45:02And we did this with persons and we played with this
    • 0:45:04last time as well and we saw it already for cues
    • 0:45:07and stacks
    • 0:45:08the dot operator, we haven't used it that much.
    • 0:45:11But recall that whenever you have a struct,
    • 0:45:13you can go inside of it using the dot operator.
    • 0:45:15And we did that for a person, person dot name
    • 0:45:17and person dot number. When we were implementing a very simple address book,
    • 0:45:22the star was new last week and it can mean different things in different contexts,
    • 0:45:25use it when declaring a pointer,
    • 0:45:28but you also use it when de referencing a pointer to go there.
    • 0:45:32But just so you've seen it before, it actually tends to be a little annoying.
    • 0:45:36A little confusing to use star and dot together,
    • 0:45:38you might remember one example last week we in parentheses,
    • 0:45:41I put star something and then
    • 0:45:43I use the dot operator to go there and then go inside the structure.
    • 0:45:47Long story short, we'll see today that you can combine
    • 0:45:50simultaneous use of star and dot into something that actually looks like an arrow,
    • 0:45:55something that vaguely looks like a foam finger that
    • 0:45:57might be pointing from one place to another.
    • 0:46:00So we'll see that
    • 0:46:01actually in some code.
    • 0:46:03So where can we take this?
    • 0:46:06Well, let's implement the first of these ideas,
    • 0:46:08namely something that's very canonical and computing
    • 0:46:11known as a linked list. And let's see if we can maybe do this.
    • 0:46:16How about Scully could we get you to come on up and volunteer here?
    • 0:46:20So our friend Scully
    • 0:46:22some cookies in this for you.
    • 0:46:24So,
    • 0:46:24Scully's come prepared with a whole bunch of balloons to represent chunks
    • 0:46:27of memory because we'd like to paint a picture here of what's
    • 0:46:30involved in actually allocating space that's
    • 0:46:32not necessarily contiguous and might be
    • 0:46:34over there or over here or over here in the computer's memory.
    • 0:46:37So for instance,
    • 0:46:38if I want to start allocating space one at a time for a list of numbers Scully,
    • 0:46:43could you go ahead and mal lock one balloon for me?
    • 0:46:46And in this balloon, I'll store, for instance, the number one, ultimately.
    • 0:46:50So we have a balloon here.
    • 0:46:53We rehearsed this before and these balloons are actually
    • 0:46:55really hard to blow up and tie off quickly.
    • 0:46:57So, thank you.
    • 0:46:58So, here we have a chunk of memory and I could certainly, for instance, go in here
    • 0:47:02and store if I might
    • 0:47:04see
    • 0:47:05where to go.
    • 0:47:08Oh,
    • 0:47:09where'd it go? Oh, here we go.
    • 0:47:10I could certainly go ahead here and store in this balloon, for instance, like
    • 0:47:14the number one. But if in the world of an array, it would just be back to back to back.
    • 0:47:18And actually, frankly, why don't we need the balloons even?
    • 0:47:21I could just use these numbers 123.
    • 0:47:23But the problem doesn't indeed arise. Note that when we want to put 1/4 number.
    • 0:47:27Well, where does it go? Well, again, just to paint a picture, ideally,
    • 0:47:30I might allocate space for four
    • 0:47:33But if this is my array of size three, like where does it go?
    • 0:47:36Like this is the point, we can't just put it next to the three,
    • 0:47:40maybe there's room for the four over here,
    • 0:47:42but we have to somehow connect these from one to the other.
    • 0:47:45So in fact, let's sort of act that out.
    • 0:47:47So if I instead use this balloon metaphor
    • 0:47:49of just allocating space from wherever it is,
    • 0:47:50can you go ahead and allocate like another chunk of memory for me?
    • 0:47:54And here is where I'll now have a
    • 0:47:56chunk of memory in which I can store the number
    • 0:47:59computer's a little slow.
    • 0:48:02So in here, the second balloon, I'll have a separate chunk of memory.
    • 0:48:09It's OK. No one's watching.
    • 0:48:10There we go. OK. Good.
    • 0:48:13Second chunk of memory. Thank you. Scully.
    • 0:48:17Now, I can certainly,
    • 0:48:20I can. Thank you. I can certainly.
    • 0:48:22Now store the number two in this chunk of memory,
    • 0:48:25but it's not necessarily contiguous like this trunk
    • 0:48:28came from over here as per Scully's position.
    • 0:48:29Originally, this chunk obviously is coming from over here.
    • 0:48:32And if you don't mind holding that for a moment,
    • 0:48:33this is breaking the metaphor of an array which was indeed contiguous.
    • 0:48:38And even though I as the human could certainly go over and walk next to her,
    • 0:48:41that's the equivalent of like copying values from one place to another.
    • 0:48:44What if we a little more clever though?
    • 0:48:45And if Scully found space for this number one over here.
    • 0:48:48Let's just leave this balloon here.
    • 0:48:50And if she's found space for the number two over there,
    • 0:48:51let's leave that balloon there.
    • 0:48:53But we do somehow have to connect these numbers together.
    • 0:48:57And here is where two, I'll try to do this on the fly.
    • 0:49:00Maybe I could do something like this.
    • 0:49:02I can take this balloon here and I can actually tie a string to it
    • 0:49:06so that if I want to connect one to the other,
    • 0:49:09we can sort of link these if you will together.
    • 0:49:12And so here, now I have a linked list that is not necessarily contiguous.
    • 0:49:16There's a whole bunch of memory that may very well have real values,
    • 0:49:19may very well have garbage values.
    • 0:49:21But I've somehow now linked these two together
    • 0:49:24and maybe just as a final flourish if we could
    • 0:49:26blow up one more balloon to represent more space.
    • 0:49:29And now she's finding room for that balloon over there.
    • 0:49:34Nice. This one is a Yale chunk of memory.
    • 0:49:37So
    • 0:49:39now I'll need
    • 0:49:42one more link if you will. And if I actually connect these two in this way,
    • 0:49:47let me go ahead and tie this off here.
    • 0:49:50And hopefully this will hold.
    • 0:49:53Now, I can go ahead and connect these two
    • 0:49:59if you never see this demonstration again in next year's
    • 0:50:01videos is because this did not go very well here
    • 0:50:04here. Now we have the number one where we first ma
    • 0:50:08locked it. The number two roughly where we next m locked it and the number three.
    • 0:50:11OK. So maybe we'll fix this some other year.
    • 0:50:14Now, we'll have the number three allocated there.
    • 0:50:16But the whole point of this silly exercise is that we
    • 0:50:19can certainly use the computer's memory as more of a canvas,
    • 0:50:22put things wherever we want, wherever is available.
    • 0:50:26So long as we somehow connect the dots, so to speak,
    • 0:50:30and can make our way from one chunk of memory to the next to the next,
    • 0:50:33thereby literally linking them together.
    • 0:50:36But of course, we're using balloons for this metaphor.
    • 0:50:38But at the end of the day, this is just memory.
    • 0:50:40So how could we encode link one chunk to another chunk to a third chunk?
    • 0:50:46Might you think?
    • 0:50:48What's the trick? Yeah.
    • 0:50:50Using pointers, right?
    • 0:50:51That's why we introduced pointers last week because as simple as an idea
    • 0:50:54as it is as hard as it is to write sometimes in code,
    • 0:50:57it's literally just a pointer sort of a
    • 0:51:00foam finger pointing to another chunk of memory.
    • 0:51:03And so these pointers really are metaphorically being
    • 0:51:05implemented now in with these pieces of string.
    • 0:51:08So we'll have to debrief later and decide if we ever do this demo again.
    • 0:51:10But thank you to Scully for participating.
    • 0:51:15OK.
    • 0:51:16We have plenty of. Oh OK. But bear,
    • 0:51:19sir.
    • 0:51:20OK.
    • 0:51:21There we go. Thank you to Scully.
    • 0:51:23So
    • 0:51:24let's now actually translate this to something a little more concrete and then
    • 0:51:28get to the point where we can actually solve this problem in code.
    • 0:51:30So here's that same canvas of memory.
    • 0:51:32And if in this canvas of memory, now,
    • 0:51:34I actually want to implement this idea of the number one, the number two,
    • 0:51:38the number three, let's stop tying our hands
    • 0:51:40in terms of expecting our memory to be contiguous back to back
    • 0:51:44and start to move away from using array.
    • 0:51:46So for instance, suppose I want a Malox space for the number one, just as Scully,
    • 0:51:50I first asked of Scully, suppose it ends up over there on the board.
    • 0:51:53The important thing for discussion here is that
    • 0:51:55that number one, wherever it ends up is surely located at some address.
    • 0:52:00And for the sake of discussion, as in the past,
    • 0:52:02suppose the number one just ends up at location ox 123.
    • 0:52:06So
    • 0:52:06ox 123 is where Scully was originally standing right here. Then we asked for Malo
    • 0:52:10for another chunk of memory.
    • 0:52:12Suppose that it ends up over here at address ox 456.
    • 0:52:16So that's maybe roughly here when Scully was standing in her second position.
    • 0:52:20Lastly, we allocate the number three, maybe it ends up at location ox 789,
    • 0:52:24which was again per Scully's third Malo
    • 0:52:26roughly over here on stage.
    • 0:52:29Now,
    • 0:52:29this picture alone doesn't seem to lend itself to
    • 0:52:32an implementation of the string metaphorically to the pointers
    • 0:52:36unless we allow ourselves a new luxury instead of just
    • 0:52:40storing the number 123 in the our usual squares.
    • 0:52:45I think what I'm gonna have to do is kind of cheat
    • 0:52:47and use more memory to store what the pointers as you proposed.
    • 0:52:52So here
    • 0:52:52there is a trade off that I promised we would sort of
    • 0:52:54start to see more and more if you want to improve your,
    • 0:52:58your performance in terms of time and avoid stupid
    • 0:53:01copying of data from one place to another.
    • 0:53:03Again and again and again, if you want to save time,
    • 0:53:06you're gonna have to give up some space.
    • 0:53:08And there's gonna be this trade off between time and
    • 0:53:10space and it's up to you to decide ultimately,
    • 0:53:12which is more important.
    • 0:53:13So if you allow yourself not enough memory for the numbers two and three,
    • 0:53:17but twice as much memory for the numbers 12 and three and three pointers one for each.
    • 0:53:23What could we now do
    • 0:53:25well if this node and this is a computing term like node
    • 0:53:28is just a generic term describing like a box of memory,
    • 0:53:31a chunk of memory.
    • 0:53:32In this case,
    • 0:53:33if I've given you this blank slate here,
    • 0:53:36what value would make sense to store here if it's associated with this
    • 0:53:40number one?
    • 0:53:42Yeah,
    • 0:53:44good. Maybe the address of the next element.
    • 0:53:47So the next element technically supposed to be the number two.
    • 0:53:50So at this location, I'm gonna store the value ox 456.
    • 0:53:54What then logically should go here in the second box,
    • 0:53:58ox 789.
    • 0:54:00And then here's a little non obvious it's the end of the list as of now.
    • 0:54:04So we can't afford to let it be a garbage value because the garbage value is a value.
    • 0:54:08And we don't want Oscar to effectively be pointing
    • 0:54:11to some random location lest we go there.
    • 0:54:13So what would be a good special value to put here to terminate a list?
    • 0:54:18So
    • 0:54:18null.
    • 0:54:18So not nul which we used for strings, but same idea null,
    • 0:54:22which we keep using now for pointers, otherwise known as the zero address,
    • 0:54:27which I could just write for shorthand as ox zero in this case,
    • 0:54:31which is the same thing as null.
    • 0:54:33So here then even though we've changed nothing about how a computer works,
    • 0:54:37this is just my computer's memory.
    • 0:54:39I'm using more memory now to effectively link one
    • 0:54:42chunk to the next chunk to the next chunk.
    • 0:54:44So easy. Uh Just note that the downside is more space,
    • 0:54:48but now we don't have to worry about ever copying and moving
    • 0:54:52this data around which may be over time for really big programs,
    • 0:54:55big data sets could very well be a net positive and a win for us.
    • 0:55:00So any questions first on
    • 0:55:02this notion of what a linked list
    • 0:55:05actually is
    • 0:55:08any questions on
    • 0:55:10linked
    • 0:55:11lists? No. All right. Well, recall from last time too, that rarely do.
    • 0:55:15We actually care what the specific addresses are. 456789. And so forth.
    • 0:55:20So we can actually abstract those away much like we tried to do with the.
    • 0:55:24Uh do I wanna say that?
    • 0:55:27Yeah, let me back up for one second. So
    • 0:55:29do we,
    • 0:55:31so what though is the link list?
    • 0:55:33Let me describe each of these chunks of memory indeed as a node.
    • 0:55:37So this is one node,
    • 0:55:38two node and three nodes and inside of each of
    • 0:55:40these nodes is two values the actual number we care about
    • 0:55:43and then a pointer which and now this is actually an opportunity
    • 0:55:47to introduce a term that you might see increasingly nowadays data.
    • 0:55:50So 12 and three, which we obviously care about in this case.
    • 0:55:53And then we could actually refer to these points
    • 0:55:55more generally as metadata like it's actual data
    • 0:55:59because it's helping me solve a problem,
    • 0:56:00get from one place to another.
    • 0:56:02But metadata is distinct from data and that I don't
    • 0:56:04fundamentally care about the metadata that's like an implementation detail,
    • 0:56:08but it does help me organize
    • 0:56:10my actual data. So this is more of a high level concept.
    • 0:56:13So what though is a linked list?
    • 0:56:15It turns out the store linked list will generally use just one more value.
    • 0:56:19And I'm going to draw it only as a square,
    • 0:56:21a single box because if I declare now in my code as I
    • 0:56:24soon will a variable may be called list that points to a node.
    • 0:56:30This is effectively how I could implement a linked list I use
    • 0:56:34one node per value and I use one extra pointer to find
    • 0:56:38the first of those nodes.
    • 0:56:40And in fact,
    • 0:56:41here again is where don't need to care
    • 0:56:43fundamentally where any of these addresses are.
    • 0:56:45It suffices to know that.
    • 0:56:46Yes, computers have memory addresses so I could just abstract this away.
    • 0:56:51And this is how I might pictorially represent
    • 0:56:53a linked list, a cleaner version of those three balloons, whereby I was here,
    • 0:56:58this was Scully's first balloon, second balloon, third balloon,
    • 0:57:01these arrows now just represent pointers or strings with the balloons.
    • 0:57:05So with that said,
    • 0:57:07how can we go about translating this to some actual code?
    • 0:57:11Well, here's where we can call into play some of that same syntax from last time.
    • 0:57:16And even a couple of weeks ago when we introduced the notion of a
    • 0:57:19structure.
    • 0:57:20So here, for instance is how we defined a couple of classes ago,
    • 0:57:23the notion of a person.
    • 0:57:24Why? Well, c doesn't come with a person, data type, but we concluded it was
    • 0:57:28of useful to be able to associate someone's name with
    • 0:57:31their number and maybe even other fields as well.
    • 0:57:33So we type deft a structure containing these two values.
    • 0:57:37We learned last week that string is technically char star,
    • 0:57:39but that doesn't change what the actual structure
    • 0:57:42is and we call this struct a person.
    • 0:57:44Well, here's what we revealed last time again, taking those training wheels off.
    • 0:57:48It's just a char star.
    • 0:57:49Let's keep going in this direction though, if I want to define not a person,
    • 0:57:53but maybe more generically something I'll call today a node
    • 0:57:56like a container for my numbers and my pointers.
    • 0:57:59Well, I similarly just need two values,
    • 0:58:01not a name and a number which isn't relevant today,
    • 0:58:04but maybe the number as an actual in.
    • 0:58:06So I can store the one, the two, the three, the four and so forth.
    • 0:58:10And this is a little less obvious,
    • 0:58:12but conceptually what should be the second value inside of any of these nodes?
    • 0:58:17Yeah.
    • 0:58:18So indeed a pointer, a pointer to what though
    • 0:58:22a pointer to another node?
    • 0:58:23And here's where the syntax gets a little weird.
    • 0:58:26But how do I define there to be a pointer in here to another node?
    • 0:58:31Well, you might be inclined to say node star next
    • 0:58:36because this means next is the name of the property or the attribute,
    • 0:58:39the variable inside the struct star means it's a pointer.
    • 0:58:42What is it a point or two? Clearly a node? But here's where C can kind of bite you.
    • 0:58:46The word node does not exist until you get to this
    • 0:58:50line of code, right? C goes top to bottom left to right.
    • 0:58:53So you literally can't use the word node here if it's not existing until here.
    • 0:58:58The simple fix for this is to actually use
    • 0:59:00a slightly more verbose way of defining a structure.
    • 0:59:03You can actually do this and we didn't bother doing
    • 0:59:05this with person because it didn't solve a problem.
    • 0:59:07But if you actually make your first line a little more verbose
    • 0:59:10and say give me a definition for a structure called node.
    • 0:59:14Now in here, you can actually do this
    • 0:59:18right.
    • 0:59:18This is sort of an annoying implementation detail
    • 0:59:21when it comes to implementing structures in C.
    • 0:59:23But essentially we're leveraging the fact that because
    • 0:59:25C code is read from top to bottom.
    • 0:59:27If you give this structure a name called struct
    • 0:59:30node, now you can refer to it here.
    • 0:59:32But you know what it's annoying to write struct node,
    • 0:59:35struct node struck node everywhere in your code.
    • 0:59:37So this last line now just gives you a synonym and it
    • 0:59:40or instruct node
    • 0:59:41to just node.
    • 0:59:43So long story short,
    • 0:59:44this is a good template for any time you implement
    • 0:59:46some notion of a node as we will today.
    • 0:59:49But it's fundamentally the same idea as a person just containing now a
    • 0:59:52number and a pointer to the next as to oppose to someone's name
    • 0:59:56and phone number. So let me go ahead and walk through with some code.
    • 1:00:00How we might actually
    • 1:00:01implement this process of allocating a balloon and putting a number on it,
    • 1:00:06allocating another balloon
    • 1:00:07and putting a number on it and then connecting those two balloons again and again.
    • 1:00:12So we'll do this step by step in a vacuum.
    • 1:00:14So you can see the syntax that maps to each of these ideas.
    • 1:00:17Then we'll actually pull vs code and combine it all and make a
    • 1:00:20demonstrative program.
    • 1:00:22So here for instance,
    • 1:00:23is the single line of C code via which I can give myself the beginning
    • 1:00:27of a linked list that is a pointer that will eventually be pointing to something.
    • 1:00:32So sort of metaphorically, it's like creating a pointer.
    • 1:00:34Right now, we've gotten some,
    • 1:00:35some complaints about that and the audience will use
    • 1:00:38the Harvard one to represent a pointer to something.
    • 1:00:40But if I only do this
    • 1:00:42and I only say give me a variable called list,
    • 1:00:45that is a pointer to a node that's gonna leave a garbage value.
    • 1:00:50So this is like pointing to some
    • 1:00:51random location because it's previously some value,
    • 1:00:54who knows what it is.
    • 1:00:55But we can solve that. How, what would be a good
    • 1:00:58initial value
    • 1:00:59to set this equal to
    • 1:01:01so null, at least if it's null, we then know that this isn't a garbage value.
    • 1:01:05This is literally ox zero A K A null and
    • 1:01:08I'm just going to leave it blank for cleanliness.
    • 1:01:10So this would be the right way to begin to create a link list of size zero,
    • 1:01:15there's nothing there.
    • 1:01:16But at least now that foam finger is not pointing to some bogus chunk of memory,
    • 1:01:20some garbage value.
    • 1:01:21So this is how the world might exist now in computer's memory.
    • 1:01:25How do I go about allocating space now for a node?
    • 1:01:27Well, it's just ideas from last week.
    • 1:01:30Once the word node exists as by as via that type DEA
    • 1:01:34I can just use Mao
    • 1:01:35to ask for the size of a node. I don't have to do the math myself.
    • 1:01:39I don't care how big a node is.
    • 1:01:40Just let it do the math for me,
    • 1:01:43then that's going to return presumably the address of a
    • 1:01:46chunk of memory big enough for that big rectangle.
    • 1:01:49And I'm going to store that for now
    • 1:01:50in a temporary variable called N
    • 1:01:52that itself is a pointer to a node.
    • 1:01:55So if this might look like a lot altogether,
    • 1:01:57but this is just like before when I allocated space for a string or
    • 1:02:01I allocated space for a bunch of numbers and set it equal to a pointer
    • 1:02:05to integers for instance mos
    • 1:02:07recently.
    • 1:02:08All right.
    • 1:02:09So this gives me a box in memory of size N for instance here, whereby
    • 1:02:16oh sorry,
    • 1:02:18what does this do? This gives me a pointer called N.
    • 1:02:21So it's similarly just a single square because it's just an address.
    • 1:02:24And it similarly gives me a bigger chunk of memory
    • 1:02:27somewhere in the computer's memory containing enough space for the,
    • 1:02:30for the number that's going to go there a 1 to 2 or three or whatever
    • 1:02:34and a pointer to the next value.
    • 1:02:36So these lines of code collectively, this half creates this in memory.
    • 1:02:39This half creates this in memory and the assigned
    • 1:02:43here,
    • 1:02:43the equal sign essentially does the equivalent of
    • 1:02:46that I don't care what the address is,
    • 1:02:47the actual number.
    • 1:02:48It's as though and is now pointing to that chunk of memory.
    • 1:02:51But this isn't very useful if I want to store the
    • 1:02:54number one here with what code can I do that?
    • 1:02:57Well, I could do this, borrowing an idea from last week. So star N
    • 1:03:02presumes that N is a pointer. Star N means go there, go to whatever you're pointing at
    • 1:03:07the dot operator means if you're pointing at a structure,
    • 1:03:10go inside of it to the number field.
    • 1:03:12And we did this a couple of weeks ago with
    • 1:03:14number and person when we implemented an address book.
    • 1:03:17So star N is go there.
    • 1:03:19And the dot operator means go to the number field,
    • 1:03:22the one on the right hand side and the equal sign
    • 1:03:24means set whatever is there equal to the number one.
    • 1:03:27And you can perhaps guess what the next step is going to be.
    • 1:03:30Um Rather
    • 1:03:33uh
    • 1:03:34it turns out this is the syntax though that I alluded to
    • 1:03:37being a little bit cryptic and not very pleasant to remember or type
    • 1:03:40here though is where you can synonymously instead use this
    • 1:03:44line of code which most C programmers would use.
    • 1:03:47Instead.
    • 1:03:47This means and is still a pointer the arrow literally with
    • 1:03:51a hyphen and a greater than sign means go there.
    • 1:03:54It's the exact same thing as the parentheses with
    • 1:03:57the star with the dot This just simplifies it,
    • 1:04:00look like these actual pictorial arrows.
    • 1:04:03So this would be the most conventional way of doing this.
    • 1:04:05How now do I update the next field?
    • 1:04:07Well, I think I'm going to just say the same thing and
    • 1:04:09go there but go into the next field and set it equal to null. Why null?
    • 1:04:15If the whole point here was to allocate just one chunk of memory, one node,
    • 1:04:20you don't want to leave this as a garbage value because
    • 1:04:22that value will be mistaken for an arrow pointing to some
    • 1:04:25random location.
    • 1:04:27All right, that's a lot.
    • 1:04:29And again,
    • 1:04:29we're doing it in isolation step by step just to paint the picture on the screen.
    • 1:04:32But any questions on any
    • 1:04:35of these steps,
    • 1:04:37each picture translates to one line of code there. All right. So
    • 1:04:42if you're comfy enough with those lines there, what can I proceed to now do?
    • 1:04:49Well, let me propose that what I could now do
    • 1:04:52with this same approach is set list itself equal to
    • 1:04:56N because if the whole goal is to build up
    • 1:04:58a linked list and list represents that linked list list
    • 1:05:01equals N is essentially saying whatever address is here,
    • 1:05:04put it here and pictorially what that means is
    • 1:05:06temporarily point both pointers to the same exact place.
    • 1:05:09Why? Because this is the list that I care about long term.
    • 1:05:12This is maybe my global variable that I'm going
    • 1:05:14to keep around forever in my computer's memory.
    • 1:05:16This was just a temporary pointer.
    • 1:05:18So that I could get a chunk of memory and go to its low
    • 1:05:20and
    • 1:05:21update it with those values.
    • 1:05:22So eventually this is probably going to go away altogether.
    • 1:05:25And this then is a link list of size one.
    • 1:05:27This is what happened when Scully inflated one balloon.
    • 1:05:30I wrote the number one on it and we, I pointed at that single balloon.
    • 1:05:34All right.
    • 1:05:35If I want to go ahead and do this again and again, we'll do this a little more quickly.
    • 1:05:38But it's the same kind of code for. Now,
    • 1:05:41here's how I allocate space for another node.
    • 1:05:44Here's how I can temporarily store in N and I'll red declare
    • 1:05:47it here just to make clear that it's indeed just a pointer.
    • 1:05:49So the left hand side of the expression gives me this,
    • 1:05:51the right hand side of the expression gives me this.
    • 1:05:53Where could it be? I mean, I put it here, it could have been there.
    • 1:05:56It could have been anywhere else. But mao
    • 1:05:58gets to decide that for us N equals this just
    • 1:06:01sets that temporary pointer equal to that chunk of memory.
    • 1:06:05I should clean this up. How do I now put the number two
    • 1:06:08into this node? While I start at N
    • 1:06:10I go there, I go to the number field which I keep drawing on top and I set it equal to two.
    • 1:06:15Now
    • 1:06:17it's a little non obvious what we should do here.
    • 1:06:19So I'm gonna be a little lazy at first.
    • 1:06:21And rather than put these numbers into the link list in sorted order,
    • 1:06:25like ascending order 1234, I'm just gonna plop it at the beginning of the list.
    • 1:06:29Why? Because it's actually a little simpler.
    • 1:06:31If the H time I allocate a new node, I just prepend it.
    • 1:06:34So to speak to the beginning of the list,
    • 1:06:36even though it's gonna end up looking backwards in this case.
    • 1:06:38So notice at this point in the story,
    • 1:06:41I've got list pointing to the original link list.
    • 1:06:44I've got n pointing to the brand new node.
    • 1:06:47And ultimately,
    • 1:06:48I kind of wanna connect these just as Scully and I did with the strings.
    • 1:06:52This is just temporary. So I want to connect these things.
    • 1:06:55Here's how I could do it wrong
    • 1:06:56if I proceed now and update.
    • 1:06:59Oh rather after one more line setting this equal to N sorry,
    • 1:07:02let's at least get rid of that garbage value.
    • 1:07:04Here's how I could proceed to maybe do this wrong. Let me go ahead and update.
    • 1:07:09For instance, list equals N.
    • 1:07:12So if I update list
    • 1:07:14equaling N
    • 1:07:15that's going to point the list at
    • 1:07:18this new node.
    • 1:07:20But what has just happened?
    • 1:07:23What did I do wrong? Yeah.
    • 1:07:25So nothing's pointing to one.
    • 1:07:27And even though you and I obviously have this like
    • 1:07:28bird's eye view of everything in the computer's memory,
    • 1:07:30the computer doesn't, if you have no variable,
    • 1:07:32remembering the location of that node
    • 1:07:34for all intents and purposes, it is gone.
    • 1:07:36And if I haven't belabored this metaphor too much already,
    • 1:07:41let me go ahead and propose that what I've essentially
    • 1:07:45done here
    • 1:07:46is if this is my one, this is my two, what if I, what I've essentially done rather,
    • 1:07:51what is this metaphor?
    • 1:07:5212,
    • 1:07:54pointing at the two?
    • 1:07:56Yeah, this kind of works. All right,
    • 1:07:58we'll fix this later.
    • 1:07:59So what I've essentially done is this when I
    • 1:08:01update that pointer to point at the number two,
    • 1:08:03it's as though this,
    • 1:08:04this was a much nicer idea in theory when we talked about it,
    • 1:08:06but it's not really working.
    • 1:08:08And also Sanders didn't want us using helium because the balloons
    • 1:08:10would go up and not come down for quite a while.
    • 1:08:12So forgive the metaphor today.
    • 1:08:14But this is effectively what we've tried to achieve, which is I've orphaned,
    • 1:08:17so to speak.
    • 1:08:18The number one. And that too is a technical term in the context of memory.
    • 1:08:21If no one is pointing at it, if no string is connected to it,
    • 1:08:24I have indeed orphaned a chunk of memory A KA A memory leak.
    • 1:08:28And
    • 1:08:28valgrind would not, in fact like this and
    • 1:08:30valgrind would in fact notice this. So what would be the better approach?
    • 1:08:35Instead of let me rewind
    • 1:08:37instead of updating that address to be that of this node,
    • 1:08:40let's rewind to where we were a moment
    • 1:08:42ago where list is still pointing at the original
    • 1:08:45and is still pointing at the new chunk of memory. And what should I do instead?
    • 1:08:49Well, what should do is maybe this, let's go to the next field of the new node.
    • 1:08:55So follow the arrow, go to the next field. And what should I put here instead?
    • 1:09:00Why don't I put the memory address of the original node? How can I get that?
    • 1:09:05Well, that's actually this.
    • 1:09:07So if list is pointing at the original node,
    • 1:09:09I can just copy that address into this next field, which has the effect
    • 1:09:14of doing that albeit in duplicate,
    • 1:09:16I've updated the next field to point at the very
    • 1:09:19thing that the original list is already pointing at.
    • 1:09:22And now for the sake of discussion, let me get rid of my temporary node called N.
    • 1:09:27And what you'll see ultimately
    • 1:09:29is that
    • 1:09:30once we set list equal to N
    • 1:09:33and get rid of it,
    • 1:09:34now we can just
    • 1:09:36treat the whole length list as being connected in linked in this way.
    • 1:09:41How do we do this again? We won't belabor the point with more.
    • 1:09:44But suppose I want to allocate a third node.
    • 1:09:46I have to do the exact same thing,
    • 1:09:47but I have to update this next field to point
    • 1:09:50at the existing list before I update list itself.
    • 1:09:53Long story short order of operations is going to be super important.
    • 1:09:57And if I want to stitch these data structures together,
    • 1:10:00if I would encourage you to think ultimately,
    • 1:10:02certainly when it comes time to write something like this,
    • 1:10:04think about what it is that we're actually trying to tie together.
    • 1:10:08So let me go ahead and do this. I'm going to go over to VS code here.
    • 1:10:12I'm gonna delete the old code for list dot C and perhaps
    • 1:10:16now we can transition away from our old approach and actually do something
    • 1:10:20with these pointers instead.
    • 1:10:23So I'm gonna go ahead and let's say, include as before
    • 1:10:27uh include standard
    • 1:10:29IO dot H. Let's go ahead and include standard
    • 1:10:32lib dot H proactively. And let's go ahead and create that data type. So type de
    • 1:10:37a
    • 1:10:37struct called node
    • 1:10:39and inside of this node, let's give us an integer called number to store the one,
    • 1:10:43the two, the three, the four.
    • 1:10:44And then let's create a struct node star value called next,
    • 1:10:48whose purpose in life is going to point to the next node
    • 1:10:51in any such list. I'm going to shorten the name of all this to just node simply.
    • 1:10:55And then in Maine, let's go ahead and do this.
    • 1:10:58We'll bring back our friend uh ARG C and RV.
    • 1:11:01So that I can actually implement a program this time that lets me construct
    • 1:11:04a linked list using numbers that I just passed at the command line.
    • 1:11:07I don't want to bother with get in again and again over the CS 50 library.
    • 1:11:10So let's just use
    • 1:11:11ARG C
    • 1:11:12and
    • 1:11:13RG V. But with ARG V recall string now as of last week is synonymous with char star.
    • 1:11:19So that's the exact same thing
    • 1:11:21as we've used in week two onward for command line arguments.
    • 1:11:25So what do I wanna do?
    • 1:11:26My goal in life with this demonstration is to create and
    • 1:11:29code this linked list here or at least the beginnings thereof.
    • 1:11:34So how can I do this? Let me go back into vs code.
    • 1:11:37Let me declare a linked list called list, but initialize it to null.
    • 1:11:42So there's nothing there just yet.
    • 1:11:44How now can I go about building this linked list
    • 1:11:48uh by taking numbers from the command line? So let's do this.
    • 1:11:51Uh four N I equals one.
    • 1:11:55I is less than RG C I plus plus. Let me go ahead and do this.
    • 1:12:01I'm gonna go ahead and just for the sake of discussion,
    • 1:12:04let me print out where we're going with this.
    • 1:12:06Let me go ahead and print out percent S backslash N whatever is in RG V bracket. I,
    • 1:12:13so I'm not doing anything interesting yet,
    • 1:12:15but let's just demonstrate where we're going with this.
    • 1:12:17Let me go ahead and make list
    • 1:12:19dot slash list and let me put the numbers 12 and three as command line arguments enter
    • 1:12:25there. We just have those numbers spit out.
    • 1:12:27I'm just kind of jumping through this hoop
    • 1:12:28to demonstrate how I'm getting those values.
    • 1:12:31But notice the values in RV are always strings a Kar star.
    • 1:12:36So if I actually want to convert a string
    • 1:12:38to an integer like this.
    • 1:12:41How can I do this? I wanna set
    • 1:12:44the number variable equal to RV bracket. I but RV bracket I is a string.
    • 1:12:49How can I convert a string to a number?
    • 1:12:52Anyone recall?
    • 1:12:54Yeah.
    • 1:12:55A A to I. So ask E to I, so ask you to integer. So if I do A to I, I can actually convert one
    • 1:13:03to the other in this way. And now I can actually print this as
    • 1:13:07an in instead of a string.
    • 1:13:09Now, that's not going to change the aesthetics of the program if I print it out again,
    • 1:13:12but it does in fact give me an integer to work with. But let's not bother printing it.
    • 1:13:16Let's instead put this number and any other number at the command line
    • 1:13:20into a linked list.
    • 1:13:22So let me go ahead and allocate
    • 1:13:24a pointer called N.
    • 1:13:26Let me set it equal to the return value of Malo
    • 1:13:30asking Malo
    • 1:13:31for the size of one node.
    • 1:13:34Ideally,
    • 1:13:34that will give me a chunk of memory that can
    • 1:13:36fit this number and a pointer just for good measure.
    • 1:13:39I'm gonna check.
    • 1:13:40Well, if N equals equals null, then actually this isn't going to work.
    • 1:13:45So we should probably, you know, free memory thus far.
    • 1:13:49So I'm just gonna leave this like this because there's a few steps involved.
    • 1:13:52So free memory thus far.
    • 1:13:55Um And then we can go ahead for instance and return one.
    • 1:13:59All right. If now
    • 1:14:01I don't have an error and, and is not, in fact, no, but it's a valid address.
    • 1:14:05I can go into N
    • 1:14:06I can follow that pointer to the number field and set it equal to the actual number.
    • 1:14:11So this is a little strange at first glance that I've
    • 1:14:13got number on the left and number on the right.
    • 1:14:15But they're different.
    • 1:14:16N is currently pointing at a chunk of memory that big enough to fit a node
    • 1:14:20N brack.
    • 1:14:21Uh N arrow number means go to that chunk of
    • 1:14:24memory and go to the top half of the rectangle
    • 1:14:27and update that number
    • 1:14:28to be whatever the human typed in. After we've converted it on line 16 here
    • 1:14:33to an actual
    • 1:14:34integer.
    • 1:14:36All right. What next do I do
    • 1:14:38N bracket or N arrow next should probably be at this point initialized to null.
    • 1:14:44And how now do I actually add this node N to my original linked list?
    • 1:14:49Well,
    • 1:14:49I could just do list equals N and that would update all allow
    • 1:14:53the foam finger my list variable to point at this new node.
    • 1:14:57But we said before
    • 1:14:58that that's potentially bad. Why?
    • 1:15:00Because if list is already pointing at something,
    • 1:15:03we can't just blindly change what it's pointing
    • 1:15:05at because we'll have orphaned any previous numbers.
    • 1:15:08It's not relevant at the moment because we're
    • 1:15:10still in the first iteration of this loop.
    • 1:15:11But we don't want to orphan or leak any memory.
    • 1:15:14So what do I first want to do before I actually point the linked list at that new node,
    • 1:15:20I'm going to
    • 1:15:20say, instead say go to this current node
    • 1:15:24arrow next and actually set that equal to list.
    • 1:15:28So strictly speaking, I don't actually need to initialize it to null,
    • 1:15:31I can initialize the next field of this new node
    • 1:15:35to point at
    • 1:15:37the existing list. And so what I've effectively done here
    • 1:15:40is this whereby
    • 1:15:43I have pointed, let me rewind. And here before switching over,
    • 1:15:47what I essentially want to do is
    • 1:15:51give me one second to prepare this
    • 1:15:57what I essentially wanted. No, that's not it. Sorry,
    • 1:16:03apologies.
    • 1:16:10Next list.
    • 1:16:14Nope, I don't want to say that sorry.
    • 1:16:16So what I'm gonna do here is instead of initializing the next field equal to null.
    • 1:16:20If I want to insert this new node
    • 1:16:22in front of any nodes that already exist, I can simply say set the N nodes,
    • 1:16:28next field equal to whatever the list currently is.
    • 1:16:32And now in this last line, I can update the list itself to point to N.
    • 1:16:36So after this, let's just go ahead and do something relatively simple.
    • 1:16:40Even though the syntax for this is going to look a little complicated.
    • 1:16:42At first, how do I go about printing
    • 1:16:45uh the whole list? So print whole list. Well, there's a couple of ways to do this.
    • 1:16:50But if you imagine a world,
    • 1:16:51if we fast forward to a world in which we now have a linked list of size three.
    • 1:16:55For instance,
    • 1:16:56here's where we might be at some point in the computer's memory.
    • 1:16:59We've inserted the one, then we inserted the two, then we inserted the three.
    • 1:17:03But because we're pre pending everything, it actually looks like 321. So
    • 1:17:07how could I go about printing this? Well, ideally I could do this.
    • 1:17:11If a computer can only look at one location at a time,
    • 1:17:13I can sort of grab my foam finger and point at the three and print it out
    • 1:17:17point at the two and print it out point at the one and print it out.
    • 1:17:21And then because this is null, I'm all done pointing and printing.
    • 1:17:24But how can I translate this to actual code?
    • 1:17:27Well, I could implement that foam finger, so to speak. And
    • 1:17:29following way,
    • 1:17:30I could give myself a pointer often abbreviated by computer scientists as PTR
    • 1:17:35specify that that's indeed a pointer to a node as per
    • 1:17:39that star and initialize that pointer to be the list itself.
    • 1:17:42So this is the code equivalent of if I have this same picture on the screen,
    • 1:17:47declaring a pointer variable and pointed at whatever the list itself
    • 1:17:52is uh storing first.
    • 1:17:54And now
    • 1:17:56that's akin to doing this.
    • 1:17:57If I now go back into my code, how can I do this?
    • 1:18:01Well, so long as that pointer does not equal null,
    • 1:18:04that is so long as that pointer is not at the end of the list.
    • 1:18:07Let me go ahead and print out using print F and integer with backs
    • 1:18:12with percent I
    • 1:18:14and then let's print out whatever I'm currently pointing at in PTR
    • 1:18:18arrow number.
    • 1:18:20So whatever I'm pointing at, go there and print the number that you find.
    • 1:18:24Uh after that, what do I want to go ahead and do?
    • 1:18:27I'm going to set pointer equal to pointer
    • 1:18:30arrow next. So what does this mean if I go back to my picture here?
    • 1:18:34And I want to actually walk through this thing,
    • 1:18:37that first line of code ensures that this foam finger a Kaptr
    • 1:18:42represented here is pointing at the first elements of the list.
    • 1:18:45Once I've printed it out with print F, I'm then doing P pointer equal
    • 1:18:49pointer next, which is like following this next arrow.
    • 1:18:53So PTR now points at the two,
    • 1:18:56I then print that out and set pointer equal to pointer next.
    • 1:18:58That's like following this arrow and updating pointer
    • 1:19:01to point at this node instead at that point.
    • 1:19:04It's the next step is going to be to point it to no.
    • 1:19:06So for all intents and purposes, I'm done.
    • 1:19:08And that's why we can actually get away with
    • 1:19:11this while loop because while pointer is not null,
    • 1:19:15it's going to print and print and print.
    • 1:19:18Now let me go into my terminal window,
    • 1:19:20let me go ahead and make list and really hope I
    • 1:19:22didn't make any mistakes because there was a lot all at once
    • 1:19:25seems to have compiled. Ok. When I run dot slash list of 123,
    • 1:19:30theoretically, this code,
    • 1:19:31if correct should unbeknownst to me build up an entire link list in memory.
    • 1:19:36But what's it going to print out ultimately?
    • 1:19:40What do you think it's gonna be printed?
    • 1:19:43Uh, could print out? No, uh if I really screwed up. Yes,
    • 1:19:47what else
    • 1:19:49or could print out 321? And frankly, that's what I'm hoping for.
    • 1:19:52So, even though I've given it in RG V 123,
    • 1:19:57because I'm pre pending to the beginning of the list, the beginning of the list,
    • 1:20:00beginning of the list each time, I think indeed, we're gonna see 32 one now.
    • 1:20:05That's fine. That's correct. But it's not necessarily what we might want.
    • 1:20:10So how could we actually go about inserting things maybe uh otherwise,
    • 1:20:15because in fact, if we consider this algorithm,
    • 1:20:17what's the running time of insert
    • 1:20:20the running? How many steps are required right now?
    • 1:20:22Given the link list of size N, if you want to go ahead and insert one more node,
    • 1:20:26there's actually a reason I took this lazy approach of pre pending, pre pending
    • 1:20:30in big O notation. How much does it cost us to insert into a linked list?
    • 1:20:39Think about it this way. Does it matter how many nodes are already in the link list?
    • 1:20:43Whether it's one or two or three or 300 or 3000?
    • 1:20:46If you're pre pending, doesn't matter how long that chain is,
    • 1:20:50you're just constantly putting it in the beginning at the beginning,
    • 1:20:53at the beginning.
    • 1:20:53Now, how many steps is this? I don't know exactly.
    • 1:20:55I'd have to count the lines of code, but it's some small number.
    • 1:20:58It's like two steps, three steps. How many lines of code is it?
    • 1:21:00It's very few to pre penn, pre Penn.
    • 1:21:03So I would dare say that the running time of insertion
    • 1:21:08into a linked list, it's actually constant time.
    • 1:21:11It's big o of one and that's super fast because it doesn't matter how big the list is.
    • 1:21:15Boom, boom, boom, you've prepend it to the list.
    • 1:21:17But there's a flip side.
    • 1:21:19What's the running time of searching a linked list looking for something in it,
    • 1:21:24finding a number in it?
    • 1:21:26Well, if it looks like this,
    • 1:21:27how long does it take you to find some
    • 1:21:29arbitrary number that the human might ask you for?
    • 1:21:32Like how many steps it will take to find me the number one if it's there.
    • 1:21:35So N big O of N because in the worst case,
    • 1:21:37the number you're looking for might be all the way at the end.
    • 1:21:40And even though you and I again have this bird's eye view,
    • 1:21:42and we can obviously see where the one is.
    • 1:21:44The only way we can get to. The one is by starting at the two. How do you get to the two?
    • 1:21:48You gotta start at the three.
    • 1:21:49How do you get to the three, you've got to start at the beginning of the list itself.
    • 1:21:53And so, whereas in the world of arrays where you had this contiguous chunk of memory,
    • 1:21:57just like we had lockers on the stage weeks ago and you
    • 1:22:00could jump to the middle and then the middle of the middle,
    • 1:22:02in the middle of the middle of the middle, that was all predicated on contiguous.
    • 1:22:05Why?
    • 1:22:06Because if you know where the first uh locker
    • 1:22:08was and you know where the last locker was,
    • 1:22:11you can subtract one from the other divide by two.
    • 1:22:13And boom,
    • 1:22:13you get the index or the location numerically of the
    • 1:22:16middle locker and you can do that again and again,
    • 1:22:19I cannot do any such math here.
    • 1:22:21The middle of this linked list is obviously here but it
    • 1:22:25doesn't matter what the location of this one is in memory.
    • 1:22:28It doesn't matter what the location of this is in
    • 1:22:29memory because they could be anywhere in the computer's memory.
    • 1:22:32So you can subtract one from the other, divide by two.
    • 1:22:34And that's going to put you in some random location
    • 1:22:37because these chunks of memory are not back to back to back to back.
    • 1:22:41They're every which way.
    • 1:22:42So this is to say what algorithm from week zero can we not use on linked lists?
    • 1:22:48S say again,
    • 1:22:49oh what sorry,
    • 1:22:51the log N one. But what was it called?
    • 1:22:54So binary search. So that very algorithm we started the class with
    • 1:22:58was all predicated on contiguous chunks of memory like an array.
    • 1:23:01But the problem with an array though, of course though is that
    • 1:23:04you paint yourself into this corner and you have
    • 1:23:06to know in advance how many locations you want.
    • 1:23:09And if you round up your waste space, if you round down you're wasting time,
    • 1:23:12so you're sort of screwed either way.
    • 1:23:14A linked list avoids those problems.
    • 1:23:16It's more of a dynamic data structure that can grow.
    • 1:23:18And frankly, if we code it up, it could even shrink,
    • 1:23:20we could remove these nodes back and forth.
    • 1:23:23And so we're not necessarily wasting time but on insertion.
    • 1:23:27But we are on searching this thing.
    • 1:23:30We're back to big O of N when it comes to searching a linked list as opposed to
    • 1:23:34it being log N, which was much, much better.
    • 1:23:37Moreover, suppose that we do want to keep this list sorted.
    • 1:23:41Suppose we want to actually sort the data more like this.
    • 1:23:43So when we create a linked list, we've got the one, we maybe have the two, the three.
    • 1:23:48That's if it's backwards. But what if we instead want to do it in sorted order?
    • 1:23:52Well, it's gonna instead of course, look like this.
    • 1:23:54I insert like the number one.
    • 1:23:56Now I insert the number two, now I insert the number three.
    • 1:23:59And again, these boxes are ending up who knows where in memory,
    • 1:24:02but I can still link them together with these pointers.
    • 1:24:05But if I now want to keep this list in sorted order,
    • 1:24:08what's the running time of insertion?
    • 1:24:10Now
    • 1:24:11they go of
    • 1:24:12N because if you insert the number four,
    • 1:24:14obviously it belongs like somewhere over there.
    • 1:24:17But the only means you the programmer or it,
    • 1:24:20the computer has to find the end of this list is to start at the beginning.
    • 1:24:23Follow the breadcrumbs, follow the breadcrumbs, follow the breadcrumbs. Oh, no.
    • 1:24:27Now I can plop the number four somewhere else and heck
    • 1:24:30the number four might end up in memory over here.
    • 1:24:32But that just means that it's gonna be pointing way over which way.
    • 1:24:36So this is where now computer
    • 1:24:38and programming really becomes a lot less obvious.
    • 1:24:40And like every story we tell isn't necessarily better
    • 1:24:43than the last because what have we done?
    • 1:24:45We've solved the problem of painting ourselves again into that corner,
    • 1:24:49whereby with the rays, you have to decide in advance how much memory you want.
    • 1:24:53And you could overestimate, you could underestimate,
    • 1:24:55but either way you're probably going to end up doing work
    • 1:24:57eventually to fix that problem by allocating more or less memory.
    • 1:25:01There's a function in C called real
    • 1:25:03lock that can help a little bit with that process.
    • 1:25:05But it's still doing the looping underneath the hood for
    • 1:25:08you time wise with linked lists in these nodes,
    • 1:25:11we now have the ability to dynamically grow the data structure.
    • 1:25:14And even though we haven't coded it yet shrink the data structure as well.
    • 1:25:17So we have a lot more dynamism. So we're only using as much memory as we might need
    • 1:25:21except for the fact that we're already using twice as much memory,
    • 1:25:25at least for those pointers.
    • 1:25:26And it's technically even more than twice as much on some
    • 1:25:29systems because pointers are very often eight bytes and not even
    • 1:25:32just four. So it's almost, it's a bigger blow up,
    • 1:25:35but it depends on what's important is it,
    • 1:25:37do you want the uh your code to be very memory efficient,
    • 1:25:40very time efficient is one algorithm easier or harder than the other to implement.
    • 1:25:45Now, programming is going to start to become a lot more argumentative.
    • 1:25:49A lot more up to you up to your colleagues, your friends,
    • 1:25:52whoever you're collaborating with as to what makes the better design.
    • 1:25:55And among our goals for today are to give you
    • 1:25:57tools mentally for a value
    • 1:25:59like the quality of design.
    • 1:26:01But there's always going to be a trade off
    • 1:26:03and it ultimately matters what's important to you.
    • 1:26:05Now, as for this code here, it's not technically quite done
    • 1:26:08because at the very end of this program,
    • 1:26:10I should probably actually free all of the memory I've done.
    • 1:26:14And so just so you've seen this like free list at
    • 1:26:17the bottom I can use a loop for this too.
    • 1:26:19I can declare a variable temporarily set it equal to list.
    • 1:26:23I already have pointer from up here. So I don't need to bother wasting
    • 1:26:26time creating it again with its data type.
    • 1:26:29So I can say this while pointer does not equal null,
    • 1:26:32go ahead and free each of those nodes.
    • 1:26:34How do I want to do this?
    • 1:26:35Well,
    • 1:26:36let me temporarily give myself a next pointer set it
    • 1:26:38equal to whatever this current node's next field is.
    • 1:26:41Then let me go ahead and free whatever I'm pointing at,
    • 1:26:44then let me go ahead and update pointer
    • 1:26:46equaling whatever next is so long story short,
    • 1:26:50there's a lot going on here.
    • 1:26:52But this
    • 1:26:53code here is representative of how you could
    • 1:26:56iteratively delete a list from left to right,
    • 1:26:59no matter the order of the numbers.
    • 1:27:01This is code that sets the pointer equal to the beginning of the list.
    • 1:27:04This makes sure that you've not walked off the end of the list,
    • 1:27:07this line temporarily points sort of in duplicate
    • 1:27:10another pointer to point at the same location.
    • 1:27:13So these two foam fingers would technically be pointing at the same place.
    • 1:27:17Why do you do that?
    • 1:27:18Because you want to be able to free one of them,
    • 1:27:20but still remember where you were previously.
    • 1:27:24Sorry. This, that was not true with these foam fingers.
    • 1:27:26One is essentially pointing at the current node,
    • 1:27:28the other is technically pointing at the next node
    • 1:27:31so that when you actually free this one,
    • 1:27:33you still have a variable pointing at the next node in the list.
    • 1:27:37And that's what I literally called
    • 1:27:38next in this case.
    • 1:27:40So if I go ahead and recompile this make list dot slash list 123, I'll indeed see
    • 1:27:47that
    • 1:27:48it's still 321 in this order.
    • 1:27:50And we don't want to go too into the weeds of how we can actually in
    • 1:27:53insert things into the middle because the code starts
    • 1:27:55to get a little involved a little quickly.
    • 1:27:57But let me at least propose pictorially. We consider the other scenarios.
    • 1:28:02If you want to maintain sorted order of the list, it's not sufficient to just prep
    • 1:28:07obviously, because then you get 321
    • 1:28:09and it's not safe to assume that you can just append
    • 1:28:11to the end of the list because if I insert 1234,
    • 1:28:15I get lucky.
    • 1:28:16But if I insert zero after that, well,
    • 1:28:19it's not going to end up at the beginning if by definition, my new code is upending.
    • 1:28:23So how do I insert in some arbitrary location? Well, let's consider this.
    • 1:28:27Suppose I first insert the number two, it can go anywhere. Who cares?
    • 1:28:31It's a list of size one. But then suppose I insert the number one.
    • 1:28:35Notice, I've got to be, I'm a little lucky here and that I can prepend one to this list.
    • 1:28:40And then if I go ahead and insert four, well,
    • 1:28:42that should be appended to the end of the list.
    • 1:28:44And frankly, there's one other annoying case. What if I now insert three?
    • 1:28:48There's even more work to be done because this
    • 1:28:50one belongs to uh in between two separate nodes,
    • 1:28:55which is to say that we actually have to update
    • 1:28:57a whole bunch of pointers. And this is code we won't run live.
    • 1:29:00But I will actually open up here.
    • 1:29:02Let me go ahead and open up for instance, in version
    • 1:29:07uh what you'll see online is version five. Let me go ahead and open up
    • 1:29:12into the same file called list dot C A premade version.
    • 1:29:16So this assume this has just come out of the oven.
    • 1:29:18And this one actually has some comments for me,
    • 1:29:20the only difference here is that this one is more generalized and will
    • 1:29:23actually allow me to insert things into different locations based on the ordering.
    • 1:29:28So here, same code as before,
    • 1:29:30I'm just iterating over all of the command line arguments.
    • 1:29:32I'm skipping the first one because the name of
    • 1:29:34the program is always first in the list.
    • 1:29:36Then as before, I'm going ahead and storing
    • 1:29:39the number
    • 1:29:41as in a variable after converting it to an actual integer with A to I,
    • 1:29:45this code is exactly the same as before, whereby I am allocating a node for the number
    • 1:29:50and I am making sure to update that node,
    • 1:29:54the current number and I'm defaulting it to null.
    • 1:29:56But now here's where there's a conditional.
    • 1:29:58And again, we're just going to wave our hands at some of this code,
    • 1:30:00but it might very well prove useful.
    • 1:30:02Ultimately,
    • 1:30:03if the list is empty.
    • 1:30:04And I've just created a node that's like the easiest possible scenario.
    • 1:30:07There's nothing on the board, so to speak. There's just one new node.
    • 1:30:10All I have to do is set list equal to that new node. And I'm done.
    • 1:30:13But that's a lucky scenario when it's the very first node.
    • 1:30:16Suppose that the list has numbers already. Well, that's my ielts condition.
    • 1:30:20How do I find the location
    • 1:30:22for this node in the existing list?
    • 1:30:25This is sort of scary looking syntax,
    • 1:30:27but it's just the four loop version of a wild loop like
    • 1:30:30we saw before this sets a pointer equal to the list itself.
    • 1:30:34So it's like a foam finger pointing at the
    • 1:30:36list itself.
    • 1:30:37I'm going to do this so long as that pointer is not null and on every iteration,
    • 1:30:41I'm going to update pointer
    • 1:30:43equal to pointer next.
    • 1:30:44So this has the effect of using a foam finger which is over there.
    • 1:30:47Now pointing at one node, then the next node, then the next node,
    • 1:30:50then the next node all that's doing
    • 1:30:53is this line here.
    • 1:30:54Now there's a few scenarios and this is where it gets a little hard,
    • 1:30:57logically if you're at the end of the list within
    • 1:31:01this loop and your next field is thus null.
    • 1:31:04So you've kind of run off the end of the list.
    • 1:31:05Well, obviously,
    • 1:31:06then mathematically it seems that this new node
    • 1:31:08belongs at the end because if you walked
    • 1:31:10all the way from left to right and you made your way to the end.
    • 1:31:13Let's go ahead and append it using code like before.
    • 1:31:16And then I'm just going to break out, break,
    • 1:31:17breaks you out of a loop if you know you're already done logically.
    • 1:31:21But that's another relatively simple scenario. Let me scroll down a bit further.
    • 1:31:26Um
    • 1:31:27And
    • 1:31:28note, sorry,
    • 1:31:30this is version five.
    • 1:31:31Let me do this. Let's see.
    • 1:31:36This will look almost
    • 1:31:40uh huh
    • 1:31:43We're gonna fake this.
    • 1:31:47If though the number you're trying to insert
    • 1:31:49is not at the beginning of the list,
    • 1:31:50which was the very first scenario we checked when there was no
    • 1:31:53list and it doesn't belong at the end of the list.
    • 1:31:55And so it's not meant to be appended.
    • 1:31:57This is the sort of harder scenario whereby I want to insert it somewhere in the
    • 1:32:00middle because there's a number to my left that's smaller and a number to my right,
    • 1:32:04that's bigger.
    • 1:32:04So how do I do that?
    • 1:32:06If the current node that I'm trying to insert
    • 1:32:09has a number
    • 1:32:10that is less than
    • 1:32:13the next nodes number, I found the spot.
    • 1:32:16So if this number is less than the one that's coming next,
    • 1:32:19then I want to insert it in between those two.
    • 1:32:22And how do I do this?
    • 1:32:23I'm going to update the next pointer of this new node to 0.1 away over there.
    • 1:32:29I'm then going to update this current node that I'm pointing at to be equal to me.
    • 1:32:34And then I'm going to break out and that has the effect of inserting one node
    • 1:32:38in between two existing nodes,
    • 1:32:40but doing it in the right order because I'm
    • 1:32:42sort of borrowing the pointer from this middle node,
    • 1:32:44I'm then updating the new pointer, the new nodes to point at this middle one.
    • 1:32:48And at that point, I've sort of stitched it into the middle.
    • 1:32:52So it's a lot of steps and we didn't want to dwell
    • 1:32:54on this particular example because it does start to get complex quickly.
    • 1:32:58But what you've just seen with that last
    • 1:33:01conditional is this scenario where you're trying to insert three. But to do that,
    • 1:33:06you have to do some touch up between two and four in the right order.
    • 1:33:10So as to not get things uh orphaned along the way.
    • 1:33:15OK.
    • 1:33:16Question on any of those processes.
    • 1:33:23It's a lot
    • 1:33:24and next.
    • 1:33:25Oh And but this, the spoiler is next week,
    • 1:33:27all of that gets reduced to like one line of Python code.
    • 1:33:31Yay.
    • 1:33:32OK. That was a lot. Someone really wants to have a, a cookie break.
    • 1:33:35Let's take our 10 minute break. Now, cookies are served. We'll be back in 10.
    • 1:45:58All right, we are back and to recap
    • 1:46:01the problems we've solved and the problems we've created our arrays
    • 1:46:05were problematic because they were a fixed size and that can
    • 1:46:08get us into trouble or it causes us to waste more
    • 1:46:11space preemptively even though we might not ever use it.
    • 1:46:14So we introduced the link lists again
    • 1:46:16to solve that problem by being more dynamic and only allocate
    • 1:46:20as much memory as we need on demand step by step.
    • 1:46:22But of course, we're spending extra space for
    • 1:46:25the pointers.
    • 1:46:25We might gain performance if we at least prepend all of our elements to it.
    • 1:46:29But we lose time again if we append or in certain sordid order.
    • 1:46:34So it's not clear frankly,
    • 1:46:35I think to me in even hearing these upsides and downsides if there is a clear win,
    • 1:46:39but maybe there's a way to get the best of both worlds by trying to capture
    • 1:46:44the upsides of having information that is kept
    • 1:46:48in sorted order that allows us to divide and
    • 1:46:51are still but still gives us the dynamism to grow or shrink the data structure
    • 1:46:55and thus we're born trees.
    • 1:46:57So what we're about to explore are variants of
    • 1:47:01these ideas of arrays and link lists and see
    • 1:47:04if we can maybe kind of mash up some
    • 1:47:05of those building blocks and create more interesting,
    • 1:47:08more compelling solutions that are even not just
    • 1:47:12one dimensional sort of left to right,
    • 1:47:14but are maybe two dimensional and sort of have different axes to them or dimensions.
    • 1:47:18So a tree in the real world of course, tends to grow up from the ground like this,
    • 1:47:22but it tends to branch out and branches branch and that might already in your
    • 1:47:26mind's eye evoke notions of forks in the road or conditionals as we've seen.
    • 1:47:32And let me propose that we first consider what the world calls binary search trees.
    • 1:47:36And so ba is back in that we can do things in half and half and half somehow,
    • 1:47:40if maybe we think about arrays a little bit more cleverly.
    • 1:47:43So here's an array of size seven
    • 1:47:45and I chose that deliberately because there's a perfect middle,
    • 1:47:48there's a middle of middle and so forth just like the lockers a few weeks back.
    • 1:47:51So when the world of arrays like this was actually pretty
    • 1:47:54efficient because we can do binary search and middle of middle,
    • 1:47:57middle of, middle of middle and so forth.
    • 1:47:58And that gave us logarithmic running time,
    • 1:48:01but it's only size seven.
    • 1:48:03And we concluded that it's gonna be like big O of
    • 1:48:05N headache to copy this into a slightly bigger array,
    • 1:48:08free the old memory and so forth.
    • 1:48:09And thus we're born linked list. But with linked lists, we lost
    • 1:48:13log of N
    • 1:48:14running time. Why?
    • 1:48:15Because we have to always start at the,
    • 1:48:17at the beginning to get for instance to the middle or
    • 1:48:20to the end of the list in the worst case.
    • 1:48:22But what if we start to think a little more cleverly in multiple dimensions?
    • 1:48:25So just for the sake of discussion, let me highlight the middle of this here array.
    • 1:48:28Let me highlight the middle of the middle and
    • 1:48:30then the middle of the middle of the middle.
    • 1:48:31So there's sort of implicit structure here, there's a pattern of sorts.
    • 1:48:36And in fact, just to make this more obvious,
    • 1:48:38let me not treat this in as one dimension left to right.
    • 1:48:41But how about two and give myself a bit of vertical space,
    • 1:48:44it's the exact same array.
    • 1:48:45But allow me to just think about it now as though the middle elements way up here,
    • 1:48:48the middle of the middles are slightly lower and the middle of the middle
    • 1:48:51of the middles or the leaves really are at the bottom of this tree.
    • 1:48:54And that word is deliberate,
    • 1:48:55we actually borrow vernacular from the world of trees where the
    • 1:48:58leaf nodes or leaves are the ones at the very bottom
    • 1:49:02and the root node is the one at the very top.
    • 1:49:05So for the sake of discussion,
    • 1:49:06computer scientists draw trees like this instead of like this,
    • 1:49:09but it's the exact same idea, they just tend to grow down
    • 1:49:13in uh in discussion.
    • 1:49:14It's more like a family tree if you drew those growing up for instance.
    • 1:49:18So what's interesting here?
    • 1:49:21Well, at the moment,
    • 1:49:22we've sort of broken the array model because this
    • 1:49:24memory is absolutely not contiguous because this number here,
    • 1:49:27this number here here, here and here it's all over the place.
    • 1:49:30But we do have pointers now
    • 1:49:32in our toolkit,
    • 1:49:33whereby even if these numbers are anywhere in the computer's memory,
    • 1:49:36we can kind of stitch them together like we did string and those balloons.
    • 1:49:40Now, it's not sufficient just to have one
    • 1:49:42of string for each node or one pointer.
    • 1:49:45But what if we actually give each of these nodes, not just a number,
    • 1:49:48like the number four, the number two, the number six,
    • 1:49:50let's give them each a number and two pointers,
    • 1:49:53a so called left child and a right child, so to speak.
    • 1:49:57So we could do this. And I'm going to abstract away.
    • 1:49:59Now, technically, they're sort of like they're not even rectangles anymore,
    • 1:50:02like they're really long rectangles or they're like sort of
    • 1:50:04upside down ts that have three boxes to them.
    • 1:50:07But I'm just going to abstract away nodes now is just simple squares and
    • 1:50:11implementation detail as to what the struts actually are.
    • 1:50:14But the arrows suggest that each of these nodes now has two pointers.
    • 1:50:19You don't have to use them, the leaf nodes have nothing to point to.
    • 1:50:22So those can all be null probably. But each of these nodes now has two pointers.
    • 1:50:27Now, what's the implication of this? This here is what we call a
    • 1:50:31binary search tree because one and first and foremost, it's obviously a tree,
    • 1:50:36but it also is a data structure that's kept
    • 1:50:39in sorted order whereby notice what is true.
    • 1:50:41If you pick any node in this tree like the number four,
    • 1:50:44everything to the left of it,
    • 1:50:46it's left sub tree, so to speak is smaller, everything to the right of it.
    • 1:50:51It's right sub tree is larger and that's true.
    • 1:50:53Elsewhere, look at the six, everything to the left is smaller,
    • 1:50:56everything to the right is bigger and same thing over here.
    • 1:51:00So in some sense,
    • 1:51:01this is a recursive data structure because you can say the same thing about
    • 1:51:05each of these nodes because each of these sub tree compose a larger tree.
    • 1:51:10Conversely, this big tree is a composition of 12 sub trees plus one more node.
    • 1:51:15So think back to our morrow example in those bricks.
    • 1:51:18Well, what's a pyramid of height four?
    • 1:51:20There's a pyramid of height three plus one more row. What's a tree of height three?
    • 1:51:24Well,
    • 1:51:24it's two sub trees of height two plus one more
    • 1:51:27row or really one new root node to connect them.
    • 1:51:30So this already is sort of a recursive data structure by that logic.
    • 1:51:34How do we translate this into code?
    • 1:51:35Well, we won't sludge through so much low level C code this time around. But let me
    • 1:51:39that we could implement a node now as being similar in spirit to what we
    • 1:51:43did last time where every node used to have a number and a next pointer.
    • 1:51:47But now let's actually make some room for ourselves.
    • 1:51:49And redefine a node is still having a number,
    • 1:51:52but now having 22 pointers and I'll call them obviously left
    • 1:51:56and right that we could call them anything we want.
    • 1:51:58I could call it next and previous but really left.
    • 1:52:00And right would seem to make more sense with Children of a given node like this.
    • 1:52:05So this in C is how we might implement therefore
    • 1:52:08a node
    • 1:52:09in a binary search tree.
    • 1:52:11And so let's consider pictorially what the running time
    • 1:52:14is of searching for something if this here is
    • 1:52:15the tree and it follows that binary search tree
    • 1:52:18definition where everything to the left is smaller,
    • 1:52:20everything to the right is bigger.
    • 1:52:22Well, how many steps might it take if you have N nodes in a tree like this?
    • 1:52:27Well,
    • 1:52:28it's not going to take me end steps because
    • 1:52:30I certainly don't have to look through every node.
    • 1:52:32And in fact, just like a link list starts on the left hand side, so to speak.
    • 1:52:36So that's just an artist's rendition just as the link list starts on one end
    • 1:52:39and you have to traverse the whole thing a tree because
    • 1:52:42it's two dimensional always starts in memory at the root node.
    • 1:52:45So this is always where you start any operation insertion deletion searching.
    • 1:52:50So by that logic, in the worst case, if there's N nodes here,
    • 1:52:53how many steps would it seem to take?
    • 1:52:56It's not big O of N but
    • 1:52:58so it's actually back to big o of log N. Why?
    • 1:53:02Because actually if you kind of think of the height,
    • 1:53:03there's roughly eight nodes in here and log base two of eight is actually three.
    • 1:53:07And so 123 is the height of this tree So in the worst case at the moment,
    • 1:53:11it seems that it's only going to take me like one node, two nodes,
    • 1:53:14three nodes are really just two steps to get to
    • 1:53:17the very bottom of this tree to decide is a
    • 1:53:19number there
    • 1:53:20or not. I certainly can ignore like this entire sub tree. Why?
    • 1:53:24Because I'm, I'm searching for the number seven,
    • 1:53:26just like the phone book from week zero.
    • 1:53:28I can kind of divide and conquer this problem.
    • 1:53:30If I'm looking for seven,
    • 1:53:31I don't need to bother wasting any time looking at this entire
    • 1:53:35sub tree which is almost 50% of the picture on the screen.
    • 1:53:39And
    • 1:53:40I can focus on this half, then this half and boom,
    • 1:53:43I'm done. So we sort of have binary search back.
    • 1:53:46We have the metaphor of the lockers back by operating now in two
    • 1:53:49dimensions to mitigate the reality that our memory is no longer contiguous.
    • 1:53:53But that's fine. We can follow these arrows. We can use these pointers instead
    • 1:53:58to get anywhere that we actually want.
    • 1:54:01So any questions now on trees or specifically binary search trees,
    • 1:54:05which I dare say are sort of like the best of both worlds,
    • 1:54:07all of the upsides of an array and it's like log and running time
    • 1:54:10and all of the upsides of the dynamism of
    • 1:54:12link list because this thing can grow and shrink
    • 1:54:15and doesn't need to be contiguous.
    • 1:54:18Any questions on this.
    • 1:54:21All right. Well, the code too lends itself to relative simplicity.
    • 1:54:25And here's where recursion applies, not just to the structure of the data,
    • 1:54:29but also the code itself.
    • 1:54:31So just for the sake of discussion, we won't run this kind of code,
    • 1:54:33we'll just look at it on screen here.
    • 1:54:34Suppose you're implementing a function called search whose purpose in life
    • 1:54:38is to search a tree and return true or false.
    • 1:54:40I found the number you're looking for. Well, here's the number I'm looking for.
    • 1:54:43It's one of the arguments.
    • 1:54:45And the first argument more importantly is actually a pointer to the tree itself,
    • 1:54:49a pointer to the root of the tree.
    • 1:54:51And that's all the information we need to search a tree and go left, go right, go left,
    • 1:54:55go right.
    • 1:54:55How well, let me do this. As always, we'll have a base case when it comes to recursion.
    • 1:55:00Because if there's no tree there,
    • 1:55:02then it makes no sense to even ask me this question. I'm just going to return false.
    • 1:55:04If you hand me null, there's nothing to do return false.
    • 1:55:08But suppose that you don't hand me null.
    • 1:55:10And suppose that the number I'm looking for is less
    • 1:55:13than the number in the tree at the moment,
    • 1:55:16the number at that root.
    • 1:55:17Well, what do I want to do? I effectively want to go left.
    • 1:55:20I want to search the left sub tree. How do I do that?
    • 1:55:22I'm going to return
    • 1:55:24the recursive return value from the same search
    • 1:55:27function passing in a slightly smaller tree,
    • 1:55:30a so called sub tree, but the same number.
    • 1:55:33And this is where recursion is kind of beautiful.
    • 1:55:35Like look at the relative simplicity of this if search exists,
    • 1:55:38which it doesn't exist in its entirety yet.
    • 1:55:40But we'll get there. If you want to search half of the tree,
    • 1:55:43just go there.
    • 1:55:44So go to the root of the tree,
    • 1:55:45follow the left child pointer and pass that in because it's a tree,
    • 1:55:48it's just a smaller tree, but pass in the same number.
    • 1:55:51What if though it's a bigger number?
    • 1:55:53So what if the number you're looking for is bigger
    • 1:55:55than the number at the root of the tree?
    • 1:55:58Well, then just search the right sub tree instead.
    • 1:56:01And now logically
    • 1:56:03what's the fourth and final case
    • 1:56:06if it's equal?
    • 1:56:08So I can express that as if the number you're looking for equals
    • 1:56:11equals the number in the tree that is the root of the tree,
    • 1:56:15then I'm going to go ahead and return true.
    • 1:56:17And you might remember from our days with scratch,
    • 1:56:18like even this conditional is not necessary.
    • 1:56:20I just did it to be explicit. We can tighten that up as just an E
    • 1:56:23instead
    • 1:56:24and that's it.
    • 1:56:25And this is where again, recursion finally is maybe a little more accessible,
    • 1:56:29a little more obvious in its cleanliness.
    • 1:56:31There's relatively little logic here.
    • 1:56:33But what's important is that these recursive calls here and
    • 1:56:36here are dividing and conquering the problem implicitly why?
    • 1:56:40Because it's solving the same problem, search for a number but it
    • 1:56:43doing it on just half of the tree or the other half of the tree.
    • 1:56:46And because we have this base case here,
    • 1:56:48even if you get all the way to the bottom of the tree and you try
    • 1:56:51to go down the left child or you try to go down the right child,
    • 1:56:53but those pointers are null,
    • 1:56:55then, you know,
    • 1:56:56you didn't find it because you would have returned
    • 1:56:57true sooner if anything had been in fact equal.
    • 1:57:01So that then is recursive code for searching a binary search tree,
    • 1:57:05which is again just to connect the dots of what we introduced.
    • 1:57:08Last time of actually doing things now recursively and
    • 1:57:11revisiting some of our own weak zero problems.
    • 1:57:14But I'm kind of lying to you here like yes, this is a binary search tree,
    • 1:57:18but it's not always as pretty as this.
    • 1:57:20It's certainly not always seven elements,
    • 1:57:22but it doesn't actually have to be as well balanced as this one
    • 1:57:25here is
    • 1:57:26in fact, suppose that
    • 1:57:28we insert the following numbers into an empty list starting with two.
    • 1:57:31I can plop the two right there. That's the current root of this tree.
    • 1:57:35Suppose though that I insert next the number. How about one?
    • 1:57:38Well, it stands to reason that it should go now to the left.
    • 1:57:41And so now this is the tree of size two.
    • 1:57:44Now I insert the number, say three, it of course can go there.
    • 1:57:48So that makes perfect sense.
    • 1:57:49Just kind of got lucky because I inserted these numbers as two, then one, then three,
    • 1:57:53I very cleanly got a balanced tree that sort of, you know,
    • 1:57:57weighted properly left and right.
    • 1:57:59But what if you have a more perverse set of inputs, so to speak, you're not lucky.
    • 1:58:04And like the worst possible situation happens in terms of the order
    • 1:58:07in which the human is inputting data into this data structure.
    • 1:58:10What if the human inserts one first? OK. Well, it goes as the root of the tree
    • 1:58:13but here's where things get start to devolve. What if the human then inserts two?
    • 1:58:18OK? It goes there. What if the human then inserts three?
    • 1:58:20Well, according to our definition, it goes there,
    • 1:58:23it looks like part of a tree because of how I've drawn it. But what is it really
    • 1:58:28if you kind of tilt your head? Right?
    • 1:58:32It looks really just like a linked list and there really is no second dimension.
    • 1:58:36I've drawn it this way.
    • 1:58:37But this for all intents and purposes is a linked list of size three. Why?
    • 1:58:40Because there's no having there's no actual
    • 1:58:43um choosing left or right now. This is fixable. How could you fix this?
    • 1:58:47It's still the same numbers 123. And it does
    • 1:58:50uh adhere to the binary search tree definition.
    • 1:58:52Every number to the right is greater every number to the right is greater.
    • 1:58:56Every number to the left is, well, it's inapplicable,
    • 1:58:58but it certainly doesn't violate that definition.
    • 1:59:00Could you kind of fix this tree somehow and make it balanced?
    • 1:59:04So it's not devolving into big O of N, but it's still technically log
    • 1:59:08of N
    • 1:59:10what should be the root
    • 1:59:14so I could reverse the pointer from 1 to 2. Yeah.
    • 1:59:17And so sort of pictorially if I kind of take this and I
    • 1:59:20just kind of like swing everything over and make two the new route,
    • 1:59:24then indeed, this could be the new route up here.
    • 1:59:26One could be hanging off of it over here and three,
    • 1:59:29it can be hanging off of the two as is so long
    • 1:59:32story short when it comes to binary search trees by themselves,
    • 1:59:36they don't necessarily
    • 1:59:37guarantee any sort of balance.
    • 1:59:38So even though theoretically yes, it's big o of log in which is fantastic,
    • 1:59:42not if you get a perverse set of inputs that just happen to be for instance,
    • 1:59:46the worst possible scenario.
    • 1:59:47Now it is fixable.
    • 1:59:48And in fact, in higher level courses in computer science,
    • 1:59:50specifically on algorithms and data structures,
    • 1:59:53you'll be introduced if you go down that road of
    • 1:59:55how you can tweak the code for insertion and deletion
    • 2:00:00in a binary search tree to kind of make these fixes along the way.
    • 2:00:03And it's going to cost you a few more steps to kind of fix things when
    • 2:00:06they get out of whack But if you do it every insertion or every deletion,
    • 2:00:10at least you can maintain a balanced tree and
    • 2:00:13you'll learn about different types of balanced trees.
    • 2:00:15But for our purposes, now,
    • 2:00:16we don't necessarily get that property even if we do want
    • 2:00:19log N unless you keep it balanced along the way.
    • 2:00:23Now, what about other combinations of arrays and linked list?
    • 2:00:27Like we can really start to match these things up and see what comes out of them.
    • 2:00:30Dictionaries are another abstract data type similar in spirit to
    • 2:00:35stacks and cues in that you can implement them in different ways.
    • 2:00:39A dictionary is a data structure that stores
    • 2:00:42keys and values and those are technical terms,
    • 2:00:44keys and values, the analog in the human world.
    • 2:00:47It would be like literally a dictionary that you'd have in a classroom,
    • 2:00:50like a dictionary with words and definitions more generally known as keys
    • 2:00:55and values. So that's all a dictionary. It is, it associates keys with value.
    • 2:01:00So for instance, you could think of it
    • 2:01:02almost as like two columns in a spreadsheet where on
    • 2:01:04the left you put the key on the right,
    • 2:01:06you put the value or specifically you put the word in a dictionary
    • 2:01:09and the definition thereafter.
    • 2:01:11And that's roughly how the pages on the printed pages in a dictionary are laid out.
    • 2:01:14So dictionaries associate words with definitions
    • 2:01:17or more generally keys with value,
    • 2:01:20but it's an abstract data type
    • 2:01:22in that we could implement this in a bunch of ways.
    • 2:01:23We could use maybe two arrays, one array for the keys, one array for the definitions.
    • 2:01:28And you just kind of hope that they line up at bracket I
    • 2:01:31and this one is the maps to bracket I in this one,
    • 2:01:33but an array is not going to give us the dynamism that we want. Right?
    • 2:01:37You might run out of space when Merriam Webster or whoever
    • 2:01:40comes up with adds new words to the English language,
    • 2:01:42you might not want to be using an array, you might want to use a linked list, but again,
    • 2:01:46linked list kind of then devolve into big O of N and it's not good for
    • 2:01:50dictionaries and spell checking if you have to
    • 2:01:52check every possible word to find something,
    • 2:01:54you know, getting something that's a little faster than that is compelling.
    • 2:01:57So let's consider how maybe Apple, maybe Google,
    • 2:02:00maybe others are actually implementing contacts because even though
    • 2:02:04I implied in week zero and maybe outright said
    • 2:02:07it's an array.
    • 2:02:08It's a big uh list of all of your names of contacts, maybe of some fixed size,
    • 2:02:13they probably better be using some Amal,
    • 2:02:15some variants of
    • 2:02:17a list.
    • 2:02:17Otherwise you could never add more friends,
    • 2:02:19potentially you'd max out and they'd say you have to
    • 2:02:21unfriend someone just to fit it as an aside.
    • 2:02:23This is sort of true in the social media world.
    • 2:02:25Like once you have like 5000 friends on Facebook, you can't have 5000 in one.
    • 2:02:29Once you have some number on linkedin, you can't have more connections.
    • 2:02:32That's not necessarily that they're using a raise,
    • 2:02:34but it is the same implication that they've chosen some finite size for memory.
    • 2:02:39So how might we consider implementing a dictionary
    • 2:02:41specifically for your address book or your context?
    • 2:02:44So you can store the names of it on ideally
    • 2:02:46alphabetically but also their phone numbers and maybe anything else.
    • 2:02:50Well, ultimately,
    • 2:02:51we want to be able to get at someone's name and lead to their number.
    • 2:02:54So the keys and values for our discussion here will be
    • 2:02:57names are the keys and phone numbers are the values.
    • 2:03:00But the values themselves could also include email, address and
    • 2:03:03address, mailing address and all of that.
    • 2:03:04But we'll keep it simple names and phone numbers.
    • 2:03:07So here's how you might think about this or draw it on a chalkboard,
    • 2:03:10two columns or in a spreadsheet left and right.
    • 2:03:13But how could we actually
    • 2:03:14this in memory? Because ideally, we don't want it to devolve into something linear.
    • 2:03:19Like we don't have to look through all of my friends and family and
    • 2:03:21colleagues to find someone whose name starts with Z for instance or anything else.
    • 2:03:25It would be nice to have something logarithmic with binary search.
    • 2:03:30But with binary search again,
    • 2:03:32we have to maybe I maybe use a tree instead.
    • 2:03:36But now we have to use two pointers instead of one.
    • 2:03:38Like there's a lot of trade offs here.
    • 2:03:40But let's see how else we could solve the same problem
    • 2:03:43because wouldn't it be nice?
    • 2:03:44And we've not really talked about this before,
    • 2:03:46if we instead aspire to this sort of holy grail of algorithms,
    • 2:03:50like the best algorithm out there is surely one
    • 2:03:52that's big o of one like constant time.
    • 2:03:55Because what that means is that it doesn't matter if you have one friend, 10 friends,
    • 2:03:59101,000, a million, a billion friends.
    • 2:04:01It doesn't matter how big N is,
    • 2:04:02your searches will always take you the same amount of time.
    • 2:04:06It is independent of N. And that's why
    • 2:04:09it's sort of the ultimate goal for performance.
    • 2:04:13So can we get to this this aspiration?
    • 2:04:16Well, a couple of building blocks,
    • 2:04:17there's this notion in computing known as
    • 2:04:19hashing and hashing is a technique literally
    • 2:04:22a function in math or in code that actually takes any number of inputs
    • 2:04:27and maps them to a finite number of outputs.
    • 2:04:30So if you think back to like high school math domains and ranges,
    • 2:04:33you can take an infinite domain with any
    • 2:04:35values in the world,
    • 2:04:36but it reduces them a hash function to a finite range of specific values.
    • 2:04:40So for instance, it's no accident that we have these four buckets on the stage. Now,
    • 2:04:44each of which has a suit from a deck of cards we got for visibility's sake.
    • 2:04:49The biggest cards we can uh these are the super jumbo playing cards.
    • 2:04:52And in this box are a bunch of randomly ordered playing cards.
    • 2:04:55And typically if you were to ever like,
    • 2:04:57play some game or you wanted to sort these for some reason,
    • 2:05:00how would you go about sorting them by
    • 2:05:01suits and also by number, you know, odds are, if you're like me,
    • 2:05:04you'd probably kind of take some shortcuts and maybe pull out all of the hearts,
    • 2:05:08pull out all of the, the spades, pull out all of the clubs or you kind of bucket,
    • 2:05:12ize it into categories.
    • 2:05:14And that term is actually technical. Here are four buckets to make this clear.
    • 2:05:18And for instance, if the first card I find is like the five of hearts, you know what?
    • 2:05:21Just to kind of make my life easier.
    • 2:05:23I'm going to put that into the hearts bucket or here we have four, here we have five,
    • 2:05:28here, we have six, here, we have the
    • 2:05:31queen and notice that I'm putting these cards into the appropriate buckets. Why?
    • 2:05:36Because ultimately,
    • 2:05:37then I'm gonna have four problems but of smaller size of 13 size problem 13, 13, 13.
    • 2:05:42And frankly, it's just gonna be easier,
    • 2:05:44cognitively dare say algorithmically to then sort each of the 13 cards in
    • 2:05:49these buckets rather than deal with like four suits somehow combined all together.
    • 2:05:53So if you've ever in life made
    • 2:05:54piles, if you've ever literally used buckets like this,
    • 2:05:56you are hashing, I'm taking some number of inputs.
    • 2:06:0052 in this case, and I'm mapping it to a finite number of outputs four in this case.
    • 2:06:05So hashing again, just takes in
    • 2:06:07inputs and hashes them to output values in this way.
    • 2:06:11So beyond that terminology, let's consider what we can now do
    • 2:06:15with hash functions.
    • 2:06:16That's a little more germane to storing things
    • 2:06:18like our friends and family and colleagues in
    • 2:06:20and dictionaries.
    • 2:06:22A hash function is just one that does that I as
    • 2:06:25the human was just implementing or behaving like a hash function.
    • 2:06:28But technically a hash function is actually a math function or a function in C
    • 2:06:32or scratch or soon Python or other languages that takes as input some value,
    • 2:06:37be it a physical card or a name or a number or something else
    • 2:06:41and output some value.
    • 2:06:42And we can use hashing
    • 2:06:45as an operation to implement what we'll call hash tables.
    • 2:06:49And that's kind of what that dictionary was.
    • 2:06:51If you think about how I drew it on the screen as two columns,
    • 2:06:53it's like a table of information keys on the left values on the right.
    • 2:06:57So what is a hash table?
    • 2:06:59The simplest way to think about it is that
    • 2:07:01this is an amalgam of a combination of arrays
    • 2:07:04and length lists, right?
    • 2:07:06We kind of borrowed some ideas of linked lists a moment ago to give us
    • 2:07:10in two dimensions. What if we stick with this idea of having two dimensional world?
    • 2:07:14But now use an array initially.
    • 2:07:17So we get the speed benefits of arrays because everything's contiguous,
    • 2:07:20we can do simple arithmetic and jump to the middle or the
    • 2:07:22middle of the middle or the first or the last very easily.
    • 2:07:25And then you know what,
    • 2:07:25let's kind of use the horizontal part of the screen to give us linked lists as needed.
    • 2:07:30So for instance,
    • 2:07:31if the goal at hand is to implement the contacts in my cell phone or my Mac or PC,
    • 2:07:36let me propose
    • 2:07:37that we start at least in English with an array of size 26. Of course, it's zero index.
    • 2:07:42So it's really location zero through 25.
    • 2:07:45And for the sake of discussion,
    • 2:07:46let me propose that location zero represents a location 25 represents Z.
    • 2:07:50And then everything else in between Y we know from C that we can
    • 2:07:54convert thanks to Asi and unicode from letters to numbers and back and forth.
    • 2:07:58So in constant time, we can find location a in constant time, we can find location Zy
    • 2:08:04because we're using an array just like in week two.
    • 2:08:08All right.
    • 2:08:08Well, suppose that I want to think about these more as letters of the alphabet,
    • 2:08:12the English alphabet rather than numbers.
    • 2:08:14So it's equivalent to label them A through Z.
    • 2:08:16And suppose now I want to start adding friends
    • 2:08:18and family and contacts to my address book.
    • 2:08:21How might this look?
    • 2:08:22Well,
    • 2:08:22if the first one I want to add is Mario Mario's name starts with an M and so that's,
    • 2:08:27you know, ABC DEFG, OK.
    • 2:08:28M goes there. So I'm going to put Mario at that location
    • 2:08:33in the, in the array. After that, I add a second person for instance. How about Luigi?
    • 2:08:38Well, L comes just before M so it stands to reason that it goes there
    • 2:08:41in the array.
    • 2:08:42Meanwhile, if I go and add another character like peach,
    • 2:08:45she's going to go there a few spots away because her name starts with P.
    • 2:08:50Meanwhile,
    • 2:08:50here's a whole bunch of other Nintendo characters that happen
    • 2:08:53to have unique uh letters of their first names.
    • 2:08:56And there's room for everyone,
    • 2:08:58room for everyone on the board A through Z with some blanks in the middle.
    • 2:09:01But you can perhaps see where this is going when and
    • 2:09:04where might a problem arise with this array based approach?
    • 2:09:08Yeah.
    • 2:09:08So when we add someone else whose name collides with
    • 2:09:12one of these existing characters just because by accident,
    • 2:09:15they have a name that starts with the same letter.
    • 2:09:17So for instance, there's Latu
    • 2:09:19here,
    • 2:09:20which collides with Luigi potentially here is the link who collides with both,
    • 2:09:24both of them.
    • 2:09:25But I've drawn a solution to this along the way I could if I was Cronian,
    • 2:09:29just remove Luigi from the data structure and put Laquita
    • 2:09:32in or remove and then put Lincoln there instead.
    • 2:09:34But that's kind of stupid if you can only have like one
    • 2:09:36friend whose name starts with L like that's just bad design.
    • 2:09:40But what if we now
    • 2:09:42in the off chance I have two friends whose names start with the same letter?
    • 2:09:46Well, I'll just kind of string them together, link them together.
    • 2:09:50No pun intended using pointers of sorts.
    • 2:09:53So my vertical here is an array and this is just an artist's rendition.
    • 2:09:56There's no actual notion of up, down left, right in the computer's memory.
    • 2:09:59But this is my array always of size 26 and each of the elements in this array
    • 2:10:05are now not a simple number, but it's
    • 2:10:07a pointer to a linked list.
    • 2:10:09And if there's nothing there, it's just null, null, null, null.
    • 2:10:13But otherwise it's a valid address that points to the first node.
    • 2:10:16And you know what if we have multiple names with the same letters,
    • 2:10:19we can just string these nodes together together
    • 2:10:22using pointers as well.
    • 2:10:24So a hash table then as implemented here is an array of
    • 2:10:28linked lists and that allows us to one get some speed,
    • 2:10:32benefit it because look how fast we inserted or found Mario Luigi and Peach.
    • 2:10:36But it still covers the scenario where OK,
    • 2:10:38some people can have the same first letters, some of these names will collide.
    • 2:10:43So collisions are an expected problem with the hash table whereby two
    • 2:10:47values from some domain happen to map to the same value.
    • 2:10:52And frankly, you'll see this here too.
    • 2:10:53So these buckets are technically a finite size.
    • 2:10:56They're definitely big enough for 30
    • 2:10:57cards each.
    • 2:10:58But you can imagine a world where if I'm using two decks, three decks or four decks,
    • 2:11:02I'm gonna run out of space and then my data structure can't fit any more information.
    • 2:11:06But we're not going to have this problem here because the link lists as we've seen
    • 2:11:09can grow and even shrink as much as they want in the world of Nintendo.
    • 2:11:13There's actually lots of collisions and these aren't even all of the characters.
    • 2:11:17So that's then a hash table. So with a hash table in mind,
    • 2:11:21how fast is it?
    • 2:11:23Did we achieve that holy grail of like constant time?
    • 2:11:26Well, for some of these names, if I back up,
    • 2:11:28yeah, it's kind of constant time like Yoshi and Zelda boom, constant time.
    • 2:11:32Location, 24 location 25. Some of them though like Luigi Lupi
    • 2:11:36to link uh it's not quite constant time because I first have to
    • 2:11:40get to Luigi's location and then I have to follow this link list.
    • 2:11:44So technically, then
    • 2:11:46what's the running time of searching a hash table?
    • 2:11:50Sometimes you'll get lucky, but sometimes you won't
    • 2:11:53consider the worst case. Big O is often used to describe worst case.
    • 2:11:57So what would be the worst case in your own contacts?
    • 2:12:01Little letter?
    • 2:12:03So and why?
    • 2:12:06Oly,
    • 2:12:07but then the data notation
    • 2:12:09correct?
    • 2:12:09And so to summarize like, you know,
    • 2:12:10some weird scenario like all of your friends and family in
    • 2:12:13contacts could have names that start with the same letter and
    • 2:12:16then it doesn't matter that this is a hash table with
    • 2:12:19an array of link lists for all intents and purposes.
    • 2:12:22If your friends' names only start
    • 2:12:24the same letter, all you have is a linked list much like with a tree.
    • 2:12:28If you don't keep it balanced, all you have really is a linked list.
    • 2:12:32So technically speaking,
    • 2:12:34yes, hash tables are big o of N
    • 2:12:36even though even if you're good about
    • 2:12:39um
    • 2:12:40even if you have friends uh
    • 2:12:42in the worst case, hash tables are big. O even why?
    • 2:12:45Because it can devolve into this perverse scenario where you just
    • 2:12:48have lots and lots of collisions all at the same values.
    • 2:12:50But you know, there's gotta be a way to fix this, right?
    • 2:12:52Like how could we chip away at the length of these chains, so to speak,
    • 2:12:56could I decrease
    • 2:12:57the length of these link lists?
    • 2:12:58So that with much higher probability there's no collisions?
    • 2:13:01Well, maybe the problem is that I started with just 26 buckets,
    • 2:13:05I mean four buckets here 26 here.
    • 2:13:07Maybe the problem is the size of my array.
    • 2:13:09So why don't I do this instead of storing and let me jump ahead
    • 2:13:12instead of
    • 2:13:15let me fix that. Do we wanna do that?
    • 2:13:17How about this?
    • 2:13:18Instead of storing the uh just the first letter is the indexes indices of this array?
    • 2:13:25What if I instead just give myself a bigger array and it's too big to fit on the screen.
    • 2:13:29But what if I instead have a bucket for names that start with LA A
    • 2:13:33and lab and L
    • 2:13:35AC L
    • 2:13:35ad dot dot dot All the way down.
    • 2:13:37Now, when I hash these names into my, into my hash table, Latu
    • 2:13:43is going to end up at their own location here, link at their own location here,
    • 2:13:46Luigi at their own location here.
    • 2:13:48And so now I don't have linked lists. I really just have a an array of names.
    • 2:13:54So now I'm actually back to constant time. Why?
    • 2:13:57Because so long as every letter of the alphabet has an ask you value,
    • 2:14:01I can get that in constant time and we did that as far back as week one.
    • 2:14:05And so I can figure out what the arithmetic location is of each of these buckets
    • 2:14:10just by looking at 123 characters or the total number of letters that I care about,
    • 2:14:15which is just three in this case.
    • 2:14:17So this feels like a solution even though I haven't drawn all the names,
    • 2:14:20it feels like we've solved the problem.
    • 2:14:22But
    • 2:14:23what's the downside or trade off of what we've just done?
    • 2:14:26Sorry,
    • 2:14:28memory.
    • 2:14:29So not pictured here is the dot dot dot And everything above and everything
    • 2:14:33below this just exploded in terms of the number of locations in this array.
    • 2:14:38Why?
    • 2:14:39Because if I'm taking into account, not just the first letter but the first,
    • 2:14:42the 2nd and 3rd, that's 26 to the third power 26 times 26 times 26.
    • 2:14:48And even though there's going to be a crazy number of names that just
    • 2:14:52don't exist, I can't think of a Nintendo character whose name starts with LA A.
    • 2:14:56You still need that bucket. Why?
    • 2:14:58Because otherwise you don't have contiguous,
    • 2:15:00you can't just arbitrarily label these buckets if you want to
    • 2:15:03be able to use a function that looks at 1st 2nd,
    • 2:15:063rd letter and then arithmetically figures out
    • 2:15:08where to go,
    • 2:15:09whether it's 0 to 25 or 0 to 26 to the
    • 2:15:12third power minus one being the number of buckets there.
    • 2:15:16So there's a trade off there.
    • 2:15:17You're wasting a huge amount of memory just to give yourself that time.
    • 2:15:21But that would then give us constant time.
    • 2:15:23So in that sense, if we have an ideal hash function,
    • 2:15:26whereby the function ensures that no values collide,
    • 2:15:30we do actually obtain that holy grail of big o of one because
    • 2:15:34it only takes one or maybe three steps to find that name's location.
    • 2:15:39Now, to make this clear, how do we translate this to something like code?
    • 2:15:42Well,
    • 2:15:42here again is the structure we used last time for that of a person and
    • 2:15:46a person who had a name and a number here for a hash table.
    • 2:15:49We might do something a little bit differently. We might now have a node
    • 2:15:53in a hash table, storing the person's name,
    • 2:15:56person's phone number
    • 2:15:58and a pointer to the next such person in that chain if needed.
    • 2:16:02Hopefully, this is going to be null most of the time all of the time.
    • 2:16:05But we need it just in case we do have that collision we've seen in our pictures,
    • 2:16:09the names like Mario Luigi and so forth.
    • 2:16:11We didn't see the numbers, but that's what's inside of those boxes on the picture.
    • 2:16:14But that kind of node would give us what we need to build up these linked lists.
    • 2:16:19Meanwhile, what is the hash table itself? That vertical strip along the left?
    • 2:16:23Well, it's really just
    • 2:16:24a variable.
    • 2:16:25We could call it table for short of size 26 and each
    • 2:16:28of the locations in that array that was on the side here,
    • 2:16:31at least in the simple small version
    • 2:16:33was a pointer to a node.
    • 2:16:35So it's null if there's no,
    • 2:16:37no one there or it's a valid address of the first node in the linked list.
    • 2:16:42So this then is a hash table and each of
    • 2:16:44those nodes to be clear would be defined as follows.
    • 2:16:48So what's the takeaway then with a hash table?
    • 2:16:50Ideally with good hash function and with a good set of
    • 2:16:53inputs where you're not presented with some perverse set of inputs.
    • 2:16:56That's like all of the friends whose names start with the same letter.
    • 2:17:00Ideally what the hash function will be doing for you
    • 2:17:02is this the input is going to be someone's name
    • 2:17:05the algorithm in the middle is going to be the hash function and
    • 2:17:07the output is the so called hash value or location in this case.
    • 2:17:11So for instance, in the case of Mario, when we had just, when
    • 2:17:14we had just 26 buckets total,
    • 2:17:17the input to the hash function would be Mario,
    • 2:17:19that hash function would really just look at the first letter
    • 2:17:21M in that case and would ideally output the number 12,
    • 2:17:25I did the same thing.
    • 2:17:26But in my head, whenever I pulled out a card, like the five of diamonds here,
    • 2:17:29I figured out, OK, that's location, zero out of my 0123 or four total buckets.
    • 2:17:35Here, we're doing it instead alphabetically.
    • 2:17:37And so someone like Luigi meanwhile would have a hash value of 11,
    • 2:17:41these numbers would be bigger. Of course, though, if we're looking at 1 to 3 letters
    • 2:17:46instead of just one.
    • 2:17:48So with that said,
    • 2:17:50if we were to implement this in actual code, a hash function, I did it sort of
    • 2:17:55um physically by acting out the cards here is how we might
    • 2:17:59implement this in code using C if the input is a string
    • 2:18:03and that string is a word like the first name of a person.
    • 2:18:06Well,
    • 2:18:07we could implement the function as follows the
    • 2:18:09argument that it takes is a charge star A
    • 2:18:11K A string called word I want it to return for instance an integer,
    • 2:18:150 to 25 in the simplest of cases.
    • 2:18:18What do I then want to do?
    • 2:18:19Well, if I borrow that C type library that we introduced a couple of weeks ago,
    • 2:18:23I can call the two upper function
    • 2:18:25on the Oh,
    • 2:18:26there's a bug.
    • 2:18:28Don't change slides right before class.
    • 2:18:31Sorry.
    • 2:18:34No cookies though.
    • 2:18:35Um,
    • 2:18:38sorry about that.
    • 2:18:39Let me fix this.
    • 2:18:43OK, if you don't mind, let me remind for those watching online.
    • 2:18:46So if we want to implement this in code,
    • 2:18:48I could have a function called hash whose argument is a string A K hr star,
    • 2:18:53a name of which is word where the word is like the first name, first word in their name,
    • 2:18:58we want this function to return an inch which ideally in this
    • 2:19:00case of 26 buckets would be a number from 0 to 26.
    • 2:19:04And how do we achieve that?
    • 2:19:05Well, if we use our old friend C type,
    • 2:19:07which had a function like two upper from a couple of weeks back,
    • 2:19:10we could pass in the first letter of that word,
    • 2:19:14capitalize it,
    • 2:19:15which is going to give us a number that's 65 66 67 on up for the 26 English letters.
    • 2:19:20And if I subtract 65 A K A quote
    • 2:19:23unquote
    • 2:19:24a single quotes because it's a char that's going to
    • 2:19:26mathematically give me a number between zero and 25 inclusive.
    • 2:19:32There's a potential bug if I pass in punctuation or anything, that's not
    • 2:19:36be
    • 2:19:37like bad things will happen. So I should probably have some more error checking.
    • 2:19:40But this is the simplest way in code that I can implement a
    • 2:19:43hash function that looks only at the first letter of their name.
    • 2:19:46Probably not ideal because I can think of friends in the
    • 2:19:48real world who have the same first letter of their name,
    • 2:19:50whether this is better or worse than looking at two letters, three letters,
    • 2:19:54four letters,
    • 2:19:55it's going to depend on how much memory you want to
    • 2:19:57spend and how much time you want to ultimately save.
    • 2:19:59Let me tweak this though a little bit.
    • 2:20:01It's conventional in C just so you know that if you're passing in a string,
    • 2:20:06that is a char star to a function and you
    • 2:20:08have no intention of letting that function change the string,
    • 2:20:11you should probably declare the argument to the function as constant.
    • 2:20:15And that will tell the compiler to,
    • 2:20:17please don't let the human programmer actually
    • 2:20:19change that actual word in this function.
    • 2:20:22It's just not their place to do so.
    • 2:20:23And we can actually do something else in
    • 2:20:25a hash function because you're using this case,
    • 2:20:28the output the integer as a location in an array.
    • 2:20:31It had better not be negative, right? You want it to be zero or positive.
    • 2:20:35And so technically, if you want to impose that in code,
    • 2:20:38you can specify that the ins that's being returned has to be unsigned.
    • 2:20:42That is it's zero on up through the positive numbers, it is not a negative value.
    • 2:20:47So this is slightly better than the previous version
    • 2:20:49where we didn't have these defenses in place.
    • 2:20:53All right.
    • 2:20:54So what does this actually mean? In practice,
    • 2:20:57you don't get to necessarily pick the hash
    • 2:21:00function based on the names of your friends,
    • 2:21:02right?
    • 2:21:02Presumably Apple and Google and others already chose their
    • 2:21:05hash function independent of what your friends names are.
    • 2:21:08So ideally, they wanna pick a hash function that generally is quite fast big O of one.
    • 2:21:14But practically speaking in a hash table,
    • 2:21:16unless you get really lucky with the inputs, uh which you generally won't really,
    • 2:21:21it's big O of N running time.
    • 2:21:23Why? Because in the worst possible scenario, you might have one long length list.
    • 2:21:27But in practice, ideally,
    • 2:21:29and this is a little naive but suppose that you have
    • 2:21:31a uniform distribution of friends in the world where one third,
    • 2:21:341 26 of them have names starting with a
    • 2:21:38and then another 1 26 out of B and then dot dot
    • 2:21:41dot Z that would be a nice uniform distribution of friends.
    • 2:21:44Technically,
    • 2:21:45then you're running time of a hash table
    • 2:21:47for searching it deleting or inserting would technically be
    • 2:21:50big O of N divided by K where K is the number of buckets a constant.
    • 2:21:54So it's technically big O of N divided by 26.
    • 2:21:57Now, again, per our discussion of big O notation,
    • 2:22:01that's still the same thing, right? You get rid of constant factors.
    • 2:22:04So, yes, it's 26 times faster. The chains are 21 26 the length.
    • 2:22:09But asymptotically in terms of big O notation,
    • 2:22:11it's still big O of N and here's where now we can start to veer away from, like,
    • 2:22:16what is theoretically, right?
    • 2:22:17Versus what is practically right
    • 2:22:20in reality, in the real world, if you work for Google, Microsoft Apple and others,
    • 2:22:2426 times faster is actually faster in the real world.
    • 2:22:28Even though a mathematician might say, ah,
    • 2:22:30that's really the same thing, but it's not like the real world wall clock time.
    • 2:22:34If you watch the number of seconds passing on the clock N over
    • 2:22:37K is a much better running time than big O of N.
    • 2:22:41So here too, we're getting to the point where the conversation
    • 2:22:43need to become a little more sophisticated.
    • 2:22:46It's not quite as simple as theory versus practice.
    • 2:22:48It depends on what matters ultimately to you.
    • 2:22:51But ideally and literally,
    • 2:22:53if somehow or other they picked an ideal hash function big O of one
    • 2:22:58would really be the ideal here would really be the running time we achieve.
    • 2:23:02And what you'll generally find in the real world is that you don't use
    • 2:23:04hash functions that are as simplistic as just look at the first letter.
    • 2:23:08And honestly, they won't generally look at the first and the second and letter.
    • 2:23:11They'll use some even fancier math to put real downward
    • 2:23:14pressure on the probability of collisions so that yes,
    • 2:23:16they will still happen.
    • 2:23:18But most of the time a really good hash function,
    • 2:23:21even if it's not quite ideal will be darn close to constant time,
    • 2:23:25which makes hash tables.
    • 2:23:26And in turn dictionaries,
    • 2:23:28one of the most universally compelling data structures to use.
    • 2:23:32Now, with that said, we have time for just another data structure or so.
    • 2:23:35And this is not a typo, this one's called a try
    • 2:23:38and a try is short for retrieval,
    • 2:23:40which is weird because you say retrieval but we say try.
    • 2:23:42But that's the etymology of try and a try is sort of the weirdest
    • 2:23:46amalgamation of all of these things whereby a try is a tree of arrays.
    • 2:23:52So a hash table is an array of length lists.
    • 2:23:56A
    • 2:23:56try is a tree of arrays.
    • 2:24:00So at some point,
    • 2:24:01computer scientists just started like mashing
    • 2:24:03together all of these different inputs.
    • 2:24:04And let's see what comes out of it.
    • 2:24:06But a try is actually really interesting.
    • 2:24:08And what you're about to see is a data structure that is literally big o of one time,
    • 2:24:13constant time.
    • 2:24:15But there is a downside.
    • 2:24:16So in a try,
    • 2:24:18every node is an array
    • 2:24:20and every location in that array generally represents a letter of
    • 2:24:24the alphabet that you could generalize this away from words too.
    • 2:24:27But a try in this case, if we have a root node,
    • 2:24:31that root node is technically a big array with 26 locations.
    • 2:24:34And if you want to insert names or words more generally into a try, what you do is this,
    • 2:24:40you hash again and again and again, creating one array for every letter in your word.
    • 2:24:46So what do I mean by that?
    • 2:24:47If we've got 20 six elements here, this would be representing a,
    • 2:24:51this would be representing Z and initially,
    • 2:24:53these are all null by default when you have just this root.
    • 2:24:56But suppose I want to insert a few friends of mine including Toad, for instance to
    • 2:25:01O
    • 2:25:01ad is the name. So how would I do that?
    • 2:25:04I would first find the location for T based on its numbers zero through 25.
    • 2:25:09And if this is T, what would I then do,
    • 2:25:11I would change the null to actually be a pointer to another node, a K a another array.
    • 2:25:16And then I would go into the second array and hash on the second letter of Toad's name,
    • 2:25:21which is of course, o
    • 2:25:22and then I would set a pointer to a third
    • 2:25:24node in my tree which we'd be represented here.
    • 2:25:28So another 26 pointers,
    • 2:25:31then I would find the pointer representing A and I would create finally 1/4 node,
    • 2:25:35another array representing
    • 2:25:38the fourth letter of Tode's name.
    • 2:25:40But because Toad's name ends with
    • 2:25:43D
    • 2:25:44and therefore I already have four nodes here.
    • 2:25:47We need to specially color though we could probably use an actual variable here.
    • 2:25:52I need to somehow indicate that Toad's name stops here. So it's not null per se.
    • 2:25:57This actually means that to
    • 2:25:59A is in this data structure.
    • 2:26:01But I did this deliberately because another friend of mine might be Todt
    • 2:26:03in the Nintendo world. And Todt
    • 2:26:05course is a super string of Toad that is, it's longer but it shares a common prefix.
    • 2:26:10So Todt
    • 2:26:11could continue and I could have another node for the E another node for the T,
    • 2:26:14another node for the second T and another node for the last E.
    • 2:26:17But I somehow have to sort of mark that E is the end of her name as well.
    • 2:26:22So even though they share a common prefix,
    • 2:26:25the fact that there's two green boxes on the screen means that
    • 2:26:28to O
    • 2:26:29ad is in this dictionary as a key as is to A
    • 2:26:33E
    • 2:26:33TT E is another key. And technically speaking, what's in these boxes too.
    • 2:26:37It's not just a pointer, it's probably Toad and Todd's
    • 2:26:40phone number and email address. And like the actual value of the dictionary,
    • 2:26:44which is to say this too is in fact a dictionary,
    • 2:26:47a dictionary is just an abstract data type, a collection of key value pairs,
    • 2:26:51just like I claimed a stack and A Q as in how you implement it can differ,
    • 2:26:55you could implement it with a hash table, an array of linked lists as we just did.
    • 2:26:59Or you can implement
    • 2:27:00a dictionary as a try a tree of arrays. And let me add one more name to the mix.
    • 2:27:06Um Tom for instance,
    • 2:27:08a valid name from the universe. Tom just means that OK,
    • 2:27:11that name exists in this structure as well.
    • 2:27:14Now, what is the implication of storing the names in this way,
    • 2:27:19which is sort of implicitly like I'm effectively storing toad and Todt
    • 2:27:24and Tom in this uh data structure
    • 2:27:28without actually storing T or O or A or D or any of the other letters,
    • 2:27:32I'm just implicitly storing those letters by actually
    • 2:27:35using valid pointers that lead to another node.
    • 2:27:38And so what's the implication of this in code? Well, in code, it might look like this.
    • 2:27:42Every node in a try is now redefined as being an array
    • 2:27:47of size 26. And I'll call it Children just to borrow the family tree metaphor.
    • 2:27:52And that every um
    • 2:27:54uh in each of these nodes there is room for the person's
    • 2:27:58phone number for instance a KA A string or char star.
    • 2:28:01So what does this mean?
    • 2:28:02Well, if there's actually a non null number there,
    • 2:28:05that's equivalent to there being a green box.
    • 2:28:07Like if you actually see plus 1 6, 1 7 dash,
    • 2:28:09whatever there,
    • 2:28:10that means there's a green box because Toad's number is right here or
    • 2:28:14Todd's number is down here or Tom's is over there.
    • 2:28:17But if this is null,
    • 2:28:18that just means that maybe this is the T or the O
    • 2:28:20or the E which are not actually ends of people's names.
    • 2:28:25So that's all these nodes actually are.
    • 2:28:27And if we think back now to what this data structure looks like,
    • 2:28:31this is in fact
    • 2:28:33a data structure that can be navigated in constant time. Why?
    • 2:28:37Well,
    • 2:28:37all we need to keep track of this data structure is literally one pointer called tri.
    • 2:28:41That's a pointer to the first of these nodes, the so called root of the tri.
    • 2:28:44And when it comes to now thinking about the running time of a try.
    • 2:28:48Well, what is it?
    • 2:28:49Well, if you've got n friends in your contacts already or if there's N keys
    • 2:28:54in that data structure,
    • 2:28:56how many steps does it take to find anyone?
    • 2:28:59Well, whether I have three names, Toad Todt
    • 2:29:01or Tom or 3 million names in that data structure,
    • 2:29:04how many steps will it take me to find Toad ever
    • 2:29:07to
    • 2:29:08ad how many steps for Toad
    • 2:29:10to
    • 2:29:10ad E tt E eight steps? How about for Tom? 123?
    • 2:29:15And frankly, I'm sure if we looked it up,
    • 2:29:17there's probably a limit on the number
    • 2:29:18of characters in a Nintendo character's name.
    • 2:29:21Maybe it's 20 characters total or maybe a little longer 30 there's some fixed value,
    • 2:29:25it's not unbounded,
    • 2:29:26there's not an infinite number of letters in any Nintendo character's name.
    • 2:29:29So there's some constant value call it K.
    • 2:29:32So no matter whose name you're looking for, it's gonna take you maximally K steps.
    • 2:29:36But K is a constant and we ever always said that big O of K is the same thing as big O of
    • 2:29:41one.
    • 2:29:42So for all intents and purposes,
    • 2:29:44even though we're taking a bit of uh liberty here
    • 2:29:46searching a try inserting into a try deleting from a try
    • 2:29:50is constant time because if you have a
    • 2:29:52million a billion names in the dictionary already,
    • 2:29:55it's gonna take up a huge amount of space,
    • 2:29:57but it does not affect how many steps it takes to find Toad or Todt
    • 2:30:01or
    • 2:30:02Tom.
    • 2:30:03That depends only on the length of their names,
    • 2:30:05which effectively is a constant value.
    • 2:30:07But there is a downside here and it's kind of a big one.
    • 2:30:11In practice. I dare say most computers, most systems would actually use
    • 2:30:16the hash tables,
    • 2:30:17not tries to implement dictionaries collections of key value pairs.
    • 2:30:21What's the downside of this here?
    • 2:30:23Data structure might you think?
    • 2:30:26And this is just representative picture
    • 2:30:28for to Tom and Todt
    • 2:30:32all the space it takes up.
    • 2:30:33I mean, even for these three names, look at how many empty pointers there are.
    • 2:30:37So they're null to be sure.
    • 2:30:38But there's like 25 unused spaces here, another 25 unused spaces here,
    • 2:30:4224 unused spaces here.
    • 2:30:44And what's not pictured is if I've got more and more names,
    • 2:30:47this thing's just gonna blow up with more and more and more and more arrays even
    • 2:30:51though there's not going to be someone whose name starts with like LA A or L
    • 2:30:56or LBB,
    • 2:30:57there's going to be so many combinations of letters
    • 2:30:59where it's just going to be null pointers instead.
    • 2:31:01So it takes up a huge amount of space but it does give us constant time.
    • 2:31:06And that then is this here trade off.
    • 2:31:08So I would encourage you here on out as we exit the world of C.
    • 2:31:12And so much of today's code in the past several weeks,
    • 2:31:14code will soon be reduced in a week's time to just one line of code, two lines of code.
    • 2:31:19Because Python and the authors of Python will have
    • 2:31:22implemented all of this week's and last week's.
    • 2:31:24And prior week's ideas for us,
    • 2:31:26we'll be able to operate at a higher level of
    • 2:31:28abstraction and just think about what problems we want to
    • 2:31:30solve and how we want to do so algorithmically and
    • 2:31:33with data structures and data structures and conclusion are everywhere.
    • 2:31:37Has anyone recognized this spot
    • 2:31:41in Harvard Square?
    • 2:31:44Anyone what are we looking at?
    • 2:31:47So this is sweet green,
    • 2:31:48a popular salad place and this is actually a
    • 2:31:50dictionary or really a hash table of sorts.
    • 2:31:54Why?
    • 2:31:54Well,
    • 2:31:54if you buy a very expensive salad at sweet green and they put it on
    • 2:31:57the shelf for you if you've ordered via the app or online in advance.
    • 2:32:01And if I for instance, were to order a salad, it would probably go under the d heading.
    • 2:32:04If Carter were to order a salad, it would go under C
    • 2:32:06ya, under Y and so they hash the salads based on your first name
    • 2:32:11to a particular location on the shelf. Why is that a good thing?
    • 2:32:14Well, if it were just one long shelf that wasn't even alphabetical,
    • 2:32:17it would be big O of N for me to find my salad and
    • 2:32:19for Carter and Julia to find theirs because they've got 26 letters here.
    • 2:32:23It's big o of one, it's one step for any of us to find our salads except again,
    • 2:32:28in perverse situations where to my, this system devolve at like,
    • 2:32:3312:30 p.m. in the afternoon.
    • 2:32:35For instance,
    • 2:32:36what could go wrong?
    • 2:32:40Yeah, a lot of people with the same first letters of their names might order a salad.
    • 2:32:44So there's a lot of like DDDDD. Where do we put the next person? Ok.
    • 2:32:48Well, maybe we overflow to e, what if there's a lot of e people, it overflows to f,
    • 2:32:51what if it overflows and then we go to G,
    • 2:32:52and it sort of devolves anyway into a linked list or really multiple
    • 2:32:56arrays that you have to search in big O of end time.
    • 2:32:59I've even been to Sweet Green at non popular times and
    • 2:33:02sometimes the sta just don't even choose to use the dictionaries.
    • 2:33:04They just put it like what's closest to them.
    • 2:33:06So you have to search the same thing anywhere.
    • 2:33:08But you'll start to see now that you've seen some
    • 2:33:10of these building blocks that data structures are everywhere.
    • 2:33:12Algorithms are everywhere.
    • 2:33:13And among the goals of CS 50 now are to harness these ideas most efficiently.
    • 2:33:17So that's a wrap. We'll see you next time.
    • 2:33:21Yeah.
  • CS50.ai
Shortcuts
Before using a shortcut, click at least once on the video itself (to give it "focus") after closing this window.
Play/Pause spacebar or k
Rewind 10 seconds left arrow or j
Fast forward 10 seconds right arrow or l
Previous frame (while paused) ,
Next frame (while paused) .
Decrease playback rate <
Increase playback rate >
Toggle captions on/off c
Toggle mute m
Toggle full screen f or double-click video