CS50 Video Player
    • 🧁

    • 🍫

    • 🍏

    • 🍿
    • 0:00:00Introduction
    • 0:00:54Week 1 Recap
    • 0:04:47Preprocessing
    • 0:07:05Compiling
    • 0:09:01Assembling
    • 0:09:29Linking
    • 0:12:36buggy0.c
    • 0:16:13buggy2.c
    • 0:25:14Debugging Tools
    • 0:26:02RAM
    • 0:29:11Arrays
    • 0:30:01scores0.c
    • 0:41:47scores2.c
    • 0:49:45scores4.c
    • 0:52:21string0.c
    • 1:00:42Null Terminator
    • 1:03:06strlen.c
    • 1:06:16ascii0.c
    • 1:09:39capitalize0.c
    • 1:12:23capitalize1.c
    • 1:16:38argv0.c
    • 1:21:25argv1.c
    • 1:24:52Ciphering
    • 1:33:15exit.c
    • 1:36:58Finding 50
    • 1:40:38Sorting on Stage
    • 1:50:27Bubble Sort
    • 1:51:34Selection Sort
    • 1:52:23Computational Complexity
    • 1:57:42Merge Sort
    • 2:04:29Comparing Sorts Visually
    • 0:00:00[MUSIC PLAYING]
    • 0:00:49DAVID J. MALAN: All right.
    • 0:00:50This is CS50 and this is the start of week two.
    • 0:00:54And you'll recall that over the past couple of weeks,
    • 0:00:56we've been building up.
    • 0:00:57First initially from Scratch, the graphical programming language
    • 0:01:00that we then, just last week, translated to the equivalent program NC.
    • 0:01:04And of course, there's a lot more syntax now.
    • 0:01:07It's entirely text but the ideas, recall, were fundamentally the same.
    • 0:01:11The catch is that computers don't understand this.
    • 0:01:13They only understand what language?
    • 0:01:15AUDIENCE: [INAUDIBLE]
    • 0:01:16DAVID J. MALAN: zeros and ones or binary.
    • 0:01:18And so there's a requisite step in order for us to get from this code to binary.
    • 0:01:23And what was that step or that program or process called?
    • 0:01:26AUDIENCE: [INAUDIBLE]
    • 0:01:27DAVID J. MALAN: Yeah, so compiling.
    • 0:01:28And of course, recall as you've now experimented
    • 0:01:30with this past week that to compile a program,
    • 0:01:32you can use clang for C, language.
    • 0:01:34And you can just say clang and then the name of the file
    • 0:01:36that you want to compile.
    • 0:01:37And that outputs by default a pretty oddly named program.
    • 0:01:40Just a dot out.
    • 0:01:41Which stands for assembler output.
    • 0:01:43More on that in just a moment.
    • 0:01:44But recall too that you can override that default behavior.
    • 0:01:47And you can actually say, Output instead a program
    • 0:01:49called, hello instead of just a dot out.
    • 0:01:52But you can go one step further, and you actually use Make.
    • 0:01:55And Make it self is not a compiler, it's a build utility.
    • 0:01:58But in layman's terms, what does it do for us?
    • 0:02:00AUDIENCE: [INAUDIBLE]
    • 0:02:02DAVID J. MALAN: compiles it.
    • 0:02:03And it essentially figures out all of those otherwise cryptic
    • 0:02:07looking command line arguments.
    • 0:02:08Like dash-o something, and so forth.
    • 0:02:10So that the program is built just the way
    • 0:02:12we want it without our having to remember
    • 0:02:14those seemingly magical incantations.
    • 0:02:16And though that only works for programs as simple as this.
    • 0:02:20In fact, some of you with the most recent problems that
    • 0:02:23might have encountered compilation errors that we actually
    • 0:02:25did not encounter deliberately in class because Make was helping us out.
    • 0:02:29In fact, as soon as you enhance a program
    • 0:02:31to actually take user input using CS50's library by including CS50 dot H,
    • 0:02:36some of you might have realized that all of a sudden the sandbox,
    • 0:02:39and more generally Clang, didn't know what get_string was.
    • 0:02:43And frankly, Clang might not even known what a string was.
    • 0:02:45And that's because those two are features of CS50's library
    • 0:02:49that you have to teach Clang about.
    • 0:02:51But it's not enough to teach Clang what they look like, as by including CS50.h.
    • 0:02:57Turns out there's a missing step that Make helps us solve
    • 0:03:01but that you too can just solve manually if you want.
    • 0:03:04And by that I mean this, instead of compiling a program with just Clang,
    • 0:03:08hello.c.
    • 0:03:09When you want to use CS50's library, you actually
    • 0:03:13need to add this additional command line argument.
    • 0:03:15Specifically at the end, can't go in the beginning like dash-O.
    • 0:03:18And dash-L stands for link.
    • 0:03:20And this is a way of telling Clang, by the way when compiling my program,
    • 0:03:24please link in CS50's zeros and ones that we the staff
    • 0:03:28wrote some weeks ago and installed in the sandbox for you.
    • 0:03:31So you've got your zeros and ones and then
    • 0:03:33you've got our zeros and ones so to speak.
    • 0:03:35And dash-LCS50 says to link them together.
    • 0:03:38So if you were getting some kind of undefined reference error to get_string
    • 0:03:42or you didn't--
    • 0:03:43you weren't able to compile a program that just used any of the get functions
    • 0:03:46from CS50's library.
    • 0:03:47Odds are, this simple change dash-LCS50 would have fixed.
    • 0:03:51But of course, this isn't interesting stuff to remember, let alone
    • 0:03:54remembering how to use dash-0 as well, at which point
    • 0:03:57the command gets really tedious to type.
    • 0:03:59So here comes, Make again.
    • 0:04:00Make automates all of this for us.
    • 0:04:02And in fact, if you henceforth start running Make and then
    • 0:04:04pay closer attention to the fairly long line of output that it outputs,
    • 0:04:08you'll actually see mention of dash-LCS50,
    • 0:04:11you'll see mention of even dash-LM, which stands for math.
    • 0:04:14So if you're using round, for instance, you
    • 0:04:16might have discovered that round two also
    • 0:04:18doesn't work out of the box unless you use Make itself
    • 0:04:21or this more nuanced approach.
    • 0:04:25So this is all to say that compiling is a bit of a white lie.
    • 0:04:28Like, yes you've been compiling and you've been going from source code
    • 0:04:31to machine code.
    • 0:04:32But it turns out that there's been a number of other steps happening
    • 0:04:35for you that we're going to just slap some labels on today.
    • 0:04:37At the end of the day, we're just breaking the abstraction.
    • 0:04:40So compiling is this abstraction from source code to machine code.
    • 0:04:42Let's just kind of zoom in briefly to appreciate
    • 0:04:45what it is that's going on in hopes that it makes the code we're
    • 0:04:47compiling a little more understandable.
    • 0:04:50So step one of four, when it comes to actually compiling a program
    • 0:04:54is called Pre-processing.
    • 0:04:55So recall that this program we just looked at had a couple of
    • 0:04:58includes at the top of the file.
    • 0:05:00These are generally known as pre-processor directives.
    • 0:05:02Not a particularly interesting term but they're
    • 0:05:05demarcated by the hash at the start of these lines.
    • 0:05:08That's a signal to Clang that these things should be handled first.
    • 0:05:12Preprocessed.
    • 0:05:13Process before everything else.
    • 0:05:15And in fact, the reason for this we did discuss last week, inside of CS50.h
    • 0:05:20is what, for instance?
    • 0:05:21AUDIENCE: [INAUDIBLE]
    • 0:05:24DAVID J. MALAN: Specifically, the declaration of get strings.
    • 0:05:27So there's some lines of code, the prototype if you recall,
    • 0:05:30that one line of code that teaches Clang what the inputs to get_string are
    • 0:05:34and what the outputs are.
    • 0:05:35The return type and the arguments, so to speak.
    • 0:05:38And so when you have include CS50.h at the top of the file, what
    • 0:05:42is happening when you first run Clang during this so-called pre-processing
    • 0:05:45step, is Clang looks on the hard drive for the file literally called CS50.h.
    • 0:05:49It grabs its contents and essentially finds and replaces this line here.
    • 0:05:54So somewhere in CS50.h is a line like this yellow one here
    • 0:05:58that says get_string, is a function that returns a string.
    • 0:06:02And it takes as input, the so-called argument, a string
    • 0:06:05that we'll call prompt.
    • 0:06:06Meanwhile, with include standard I/O. What's the point of including that?
    • 0:06:10What is declared inside of that file presumably?
    • 0:06:14Yeah?
    • 0:06:14AUDIENCE: It's the standard inputs and outputs.
    • 0:06:16DAVID J. MALAN: Standard inputs and outputs.
    • 0:06:17And more specifically, what example there of?
    • 0:06:19What function?
    • 0:06:19AUDIENCE: [INAUDIBLE]
    • 0:06:20DAVID J. MALAN: So printf.
    • 0:06:21The other function we keep using.
    • 0:06:22So inside of standard io.h, somewhere on the sandbox's hard drive
    • 0:06:26is similarly a line of code that frankly looks a little more cryptic
    • 0:06:29but we'll come back to this sort of thing
    • 0:06:31down the road, that says print if is a function.
    • 0:06:33Happens to return on int, but more on that another time.
    • 0:06:36Happens to take a char* format.
    • 0:06:38But more on that another time.
    • 0:06:40Indeed, this is one of the reasons we hide
    • 0:06:41this detail early on because there's some syntax that's
    • 0:06:44just a distraction for now.
    • 0:06:45But that's all that's going on.
    • 0:06:46The sharp include sign is just finding and replacing the contents.
    • 0:06:50Plus dot, dot, dot, a bunch of other things in those files as well.
    • 0:06:54So when we say pre-processing, we just mean
    • 0:06:56that that's getting substituted in so you don't have to copy and paste
    • 0:06:59this sort of thing manually yourself.
    • 0:07:01So "compiling" is a word that actually has a well-defined meaning.
    • 0:07:04Once you've preprocessed your code, and your code looks essentially like this,
    • 0:07:08unbeknownst to you, then comes the actual compilation step.
    • 0:07:11And this code here gets turned into this code here.
    • 0:07:15Now this is scary-looking, and this is the sort of thing
    • 0:07:18that if you take a class like CS61 at Harvard,
    • 0:07:21or, more generally, systems programming, so to speak,
    • 0:07:23you might see something like this.
    • 0:07:25This is x86 64-bit assembly instructions.
    • 0:07:28And the only thing interesting about that claim for the moment
    • 0:07:31is that assembly--
    • 0:07:32I kind of alluded to that earlier-- assembler output, a.out.
    • 0:07:35There's actually a relationship here, but long story short, these
    • 0:07:38are the lower level instructions that only the CPU,
    • 0:07:41the brain inside your computer, actually understands.
    • 0:07:44Your CPU does not understand C. It doesn't understand Python or C++
    • 0:07:48or Java or any language with which you might be familiar.
    • 0:07:50It only understands this cryptic-looking thing.
    • 0:07:53But frankly, from the looks of it, you might glean that probably not so much
    • 0:07:56fun to program in this.
    • 0:07:58I mean, arguably, it's not that much fun to program yet in C,
    • 0:08:00So this looks even more cryptic.
    • 0:08:03But that's OK.
    • 0:08:04C and lots of languages are just these abstractions
    • 0:08:07on top of the lower level stuff that the CPUs do actually
    • 0:08:10understand so that we don't have to worry about it as much.
    • 0:08:13But if we highlight a few terms, here you'll see some familiar things.
    • 0:08:16So main is mentioned in this so-called assembly code.
    • 0:08:19You see mention of get string and printf,
    • 0:08:21so we're not losing information.
    • 0:08:23It's just being presented in really a different language, assembly language.
    • 0:08:27Now you can glean, perhaps, from some of the names of these instructions,
    • 0:08:31this is what Intel Inside means.
    • 0:08:33When Intel or any brand of CPU understands instructions,
    • 0:08:37it means things like pushing and moving and subtracting and calling.
    • 0:08:42These are all low level verbs, functions,
    • 0:08:44if you will, but at the level of the CPU.
    • 0:08:46But for more on that, you can take entire courses.
    • 0:08:48But just to take the hood off of this for today,
    • 0:08:51this is a step that's been happening for us magically unbeknownst
    • 0:08:54to us, thanks to Clang.
    • 0:08:57So assembling-- now that you've got this cryptic-looking code that we will never
    • 0:09:00see again-- we'll never need to output again--
    • 0:09:02what do you do with it?
    • 0:09:03Well, you said earlier that computers only understand zeros and ones,
    • 0:09:07so the third step is actually to convert this assembly language to actual zeros
    • 0:09:12and ones that now look like this.
    • 0:09:15So the assembling step happening, unbeknownst to you,
    • 0:09:17every time you run Clang or, in turn, run
    • 0:09:19make, we're getting zeros and ones out of the assembly code,
    • 0:09:22and we're getting the assembly code out of your C-code.
    • 0:09:25But here's the fourth and final step.
    • 0:09:28Recall that we need to link in other people's zeros and ones.
    • 0:09:32If you're using printf you didn't write that.
    • 0:09:34Someone else created those zeros and ones, the patterns
    • 0:09:36that the computer understands.
    • 0:09:38You didn't create get string.
    • 0:09:39We did, so you need access to those zeros and ones
    • 0:09:41so that your program can use them as well.
    • 0:09:44So linking, essentially, does this.
    • 0:09:45If you've written a program-- for instance, hello.c--
    • 0:09:48and it happens to use a couple of other libraries,
    • 0:09:51files that other people wrote of useful code
    • 0:09:53for you, like cs50.c, which does exist somewhere,
    • 0:09:57and even stdio.c, which does exist somewhere,
    • 0:10:00or technically, Standard IO is such a big library,
    • 0:10:03they actually put printf in a file specifically called printf.c.
    • 0:10:06But somewhere in the sandbox's hard drive, in all of our Macs and PCs,
    • 0:10:10if they support compiling, are, for instance, files like these.
    • 0:10:14But we've got to convert this to zeros and ones, this, and this,
    • 0:10:18and then somehow combine them.
    • 0:10:19So pictorially, this just looks a bit like this.
    • 0:10:21And this is all happening automatically by Clang.
    • 0:10:23Hello.c, the code you wrote, gets compiled
    • 0:10:25to assembly, which then gets assembled into zeros and ones, so-called machine
    • 0:10:31code or object code.
    • 0:10:32Cs50.c-- we did this for you before the semester started.
    • 0:10:36Printf was done way before any of us started decades
    • 0:10:39ago and looks like this.
    • 0:10:41These are three separate files, though, so the linking step literally
    • 0:10:44means, link all of these things together, and combine the zeros
    • 0:10:48and ones from, like, three, at least, separate files,
    • 0:10:51and just combine them in such a way that now
    • 0:10:53the CPU knows how to use not just your code but printf and get string
    • 0:10:57and so forth.
    • 0:10:59So last week, we introduced compiling as an abstraction,
    • 0:11:02if you will, and this is all that we've really meant this whole time.
    • 0:11:05But now that we've seen what's going on underneath the hood,
    • 0:11:08and we can stipulate that my CPU that looks physically
    • 0:11:11like this, albeit smaller in a laptop or desktop,
    • 0:11:14knows how to deal with all of that.
    • 0:11:17So any questions on these four steps--
    • 0:11:19pre-processing, compiling, assembling, linking?
    • 0:11:22But generally, now, we can just call them compiling, as most people do.
    • 0:11:27Any questions?
    • 0:11:28Yeah.
    • 0:11:29AUDIENCE: How does the CPU know that [INAUDIBLE] is there?
    • 0:11:36Is that [INAUDIBLE]?
    • 0:11:39DAVID J. MALAN: Not in the pre-processing step,
    • 0:11:41so the question is, how does the computer
    • 0:11:43know that printf is the only function that's there?
    • 0:11:46Essentially, when you're linking in code,
    • 0:11:49only the requisite zeros and ones are typically linked in.
    • 0:11:51Sometimes you get more than you actually need, if it's a big library,
    • 0:11:55but that's OK, too.
    • 0:11:56Those zeros and ones are just never used by the CPU.
    • 0:11:58Good question.
    • 0:11:59Other questions?
    • 0:12:02OK, all right.
    • 0:12:03So now that we know this is possible, let's start
    • 0:12:06to build our way back up, because everyone here
    • 0:12:09probably knows now that when writing in C, which
    • 0:12:11is kind of up here conceptually, like, it
    • 0:12:13is not without its hurdles and problems and bugs and mistakes.
    • 0:12:16So let's introduce a few techniques and tools with which you can henceforth,
    • 0:12:19starting this week and beyond, trying to troubleshoot those problems yourself
    • 0:12:23rather than just trying to read through the cryptic-looking error messages
    • 0:12:26or reach out for help to another human.
    • 0:12:28Let's see if software can actually answer some of these questions for you.
    • 0:12:31So let me go ahead and do this.
    • 0:12:32Let me go ahead and open up a sandbox here,
    • 0:12:35and I'm going to go ahead and create a new file called
    • 0:12:38buggy0.c in which I will, this time, deliberately introduce a bug.
    • 0:12:43I'm going to go ahead and create my function called
    • 0:12:46main, which, again, is the default, like when green flag is clicked.
    • 0:12:50And I'm going to go ahead and say, printf, quote, unquote,
    • 0:12:53"Hello world/m."
    • 0:12:56All right.
    • 0:12:56Looks pretty good.
    • 0:12:57I'm going to go ahead and compile buggy0, Enter,
    • 0:13:01and of course, I get a bunch of error messages here.
    • 0:13:03Let me zoom in on them.
    • 0:13:05Fortunately, I only have two, but remember, you have to, have to,
    • 0:13:07have to always scroll up to look at the first,
    • 0:13:09because there might just be an annoying cascading effect from one earlier
    • 0:13:12bug to the later.
    • 0:13:13So buggy0.c, line 5, is what this means, character 5, so like 5 spaces in,
    • 0:13:18implicitly declaring library function printf with dot, dot, dot.
    • 0:13:22So you're going to start to see this pretty often if you make
    • 0:13:24this particular mistake or oversight.
    • 0:13:27Implicitly declaring something means you forgot
    • 0:13:29to teach Clang that something exists.
    • 0:13:31And you probably know from experience, perhaps now, what the solution is.
    • 0:13:36What's the first mistake I made here?
    • 0:13:38AUDIENCE: [INAUDIBLE].
    • 0:13:39DAVID J. MALAN: Yeah, I didn't include the header file,
    • 0:13:42so to speak, for the library.
    • 0:13:43I'm missing, at the top of the file, include stdio.h,
    • 0:13:47in which printf is defined.
    • 0:13:49But let's propose that you're not quite sure how to get to that point,
    • 0:13:53and how can we get, actually, some help with this?
    • 0:13:55Let me actually increase the size of my terminal
    • 0:13:57here, and recall that just a moment ago, I ran makebuggy0,
    • 0:14:00which yielded the errors that I saw.
    • 0:14:02It turns out that installed in the sandbox
    • 0:14:04is a command that we, the staff, wrote called help50.
    • 0:14:07And this is just a program we wrote that takes as input any error
    • 0:14:11messages that your code or some program has outputted.
    • 0:14:14We kind of look for familiar words and phrases,
    • 0:14:16just like a TF would in office hours, and if we recognize some error message,
    • 0:14:20we're going to try to provide, either rhetorically or explicitly,
    • 0:14:24some advice on how to handle.
    • 0:14:25So if I go ahead and run this command now, notice there's a bit more output.
    • 0:14:29I see exactly the same output in white and green and red as before,
    • 0:14:33but down below is some yellow, which comes specifically from help50.
    • 0:14:36And if I go ahead and zoom in on this, you'll
    • 0:14:38see that the line of output that we recognized
    • 0:14:43is this one, that same one I verbally drew attention
    • 0:14:46to before-- buggy0.c, line 5, error, implicitly declaring library function
    • 0:14:50printf, and so forth.
    • 0:14:52So here, without the background highlighting, but still in yellow,
    • 0:14:54is our advice or a question a TF or CA might ask you in office hours.
    • 0:14:58Well, did you forget to include stdio.h in which printf
    • 0:15:02is declared atop your file?
    • 0:15:05And hopefully, our questions, rhetorical or otherwise, are correct,
    • 0:15:08and that will get you further along.
    • 0:15:10So let's go ahead and try that advice.
    • 0:15:12So include stdio.h.
    • 0:15:15Now let me go ahead and go back down here.
    • 0:15:16And if you don't like clutter, you can type "clear,"
    • 0:15:19or hit Control+L in the terminal window to keep cleaning it like I do.
    • 0:15:23If you want to go ahead now and run makebuggy0, Enter, fewer errors,
    • 0:15:29so that's progress, and not the same.
    • 0:15:30So this one's, perhaps, a little easier.
    • 0:15:33Reading the line, what line of code is buggy here?
    • 0:15:36AUDIENCE: Forgot the semicolon.
    • 0:15:38DAVID J. MALAN: Yeah, so this is now still on line 5, it turns out,
    • 0:15:41but for a different reason.
    • 0:15:42I seem to be missing a semi-colon.
    • 0:15:44But I could similarly ask help50 for help with that
    • 0:15:47and hope that it recognizes my error.
    • 0:15:48So this, too, should start being your first instinct.
    • 0:15:50If on first glance, you don't really understand
    • 0:15:52what an error message is doing, even though you've
    • 0:15:54scrolled to the very first one, like literally ask this program for help
    • 0:15:57by rerunning the exact same command you just
    • 0:15:59ran, but prefix it with help50 and a space,
    • 0:16:03and that will run help50 for you.
    • 0:16:05Any questions on that process?
    • 0:16:08All right, let's take a look at one other program,
    • 0:16:10for instance, that, this time, has a different error involved in it.
    • 0:16:15So how about-- let me go ahead and whip up a quick program here.
    • 0:16:19I'll call this buggy2.c for consistency with some of the samples
    • 0:16:23we have online for you later.
    • 0:16:25And in this example, I'm going to go ahead and write the correct thing
    • 0:16:29at first, stdio.h, and then I'm going to have int main void, which
    • 0:16:33just gets my whole program started.
    • 0:16:35And then I'm going to have a loop, and recall for--
    • 0:16:37[CLEARS THROAT] excuse me-- Mario or some other program,
    • 0:16:40you might have done something like int i get 0, i is less than or equal to--
    • 0:16:44let's do this 10 times, and then i++.
    • 0:16:47And all I want to do in this program is print out that value of i, as I can do,
    • 0:16:53with the %i placeholder-- so a simple program.
    • 0:16:55Just want it to count from 0 to 10.
    • 0:16:59So let's go ahead and run buggy2, or rather, I want to--
    • 0:17:04let's not print up--
    • 0:17:06rewind.
    • 0:17:07Let's go ahead and just print out a hash symbol
    • 0:17:12and not spoil the solution this way.
    • 0:17:15So here, I go ahead and print out buggy2.
    • 0:17:17My goal is now I will stipulate to print out just 10 hash symbols, one per line,
    • 0:17:21which is what I want to do here.
    • 0:17:23And now I'm going to go ahead and run ./buggy2, and I should see, hopefully,
    • 0:17:2810 hashes.
    • 0:17:30And I kind of spoiled this a little bit, but what do I instead see?
    • 0:17:36Yeah, I think I see more than I expect.
    • 0:17:39And we can kind of zoom in here and double check, so 1, 2, 3, 4, 5, 6, 7,
    • 0:17:468, 9, 10, ooh, 11.
    • 0:17:4811.
    • 0:17:49Now some of your eyes might already be darting to what the solution should be,
    • 0:17:53but let's just propose that it's not obvious.
    • 0:17:55And if it is actually not obvious, all the better, so how might
    • 0:17:57you go about diagnosing this kind of problem, short of just
    • 0:17:59reaching out and asking a human for help.
    • 0:18:02This is not a problem that help50 can help with,
    • 0:18:04because it's not an error message.
    • 0:18:06Your program is working.
    • 0:18:07It's just not outputting what you wanted it to work,
    • 0:18:09but it's not an error message from the compiler with which help50 can help.
    • 0:18:13So you want to kind of get eyes into what your program is doing,
    • 0:18:17and you want to understand, why are you printing 11 when you really
    • 0:18:20are setting this up from 0 to 10?
    • 0:18:22Well, one of the most common techniques in C or any language,
    • 0:18:25honestly, is to use printf for just other purposes-- diagnostic purposes.
    • 0:18:29For instance, there's not much going on in this program,
    • 0:18:32but I'd argue that it would be interesting for me to know,
    • 0:18:35and therefore understand my program, by just,
    • 0:18:37let's print out this value of i on each iteration,
    • 0:18:41as by doing the line of code that I earlier did,
    • 0:18:44and just say something literally like, i is %i.
    • 0:18:47I'm going to remove this ultimately, because it's
    • 0:18:49going to make my program look a little silly,
    • 0:18:52but it's going to help me understand what's going on.
    • 0:18:54Let me go ahead and recompile buggy2, ./bugg2, and this time,
    • 0:19:01I see a lot more output.
    • 0:19:03But if I zoom in, now it's kind of--
    • 0:19:07now the computer is essentially helping me understand what's going on.
    • 0:19:10When i is 0, here's one of them.
    • 0:19:11When i is 1, here's another.
    • 0:19:13I is 2, 3, 4, 5, 6, 7, 8, 9, and that looks good.
    • 0:19:16But if we scroll a little further, it feels a little problematic
    • 0:19:19that i can also be 10.
    • 0:19:22So what's logically the bug in this program?
    • 0:19:24AUDIENCE: [INAUDIBLE].
    • 0:19:25DAVID J. MALAN: Yeah.
    • 0:19:26I use less than or equal to, because I kind of confuse the paradigm.
    • 0:19:29Like programmers tend to start counting at zero,
    • 0:19:31apparently, but I want to do this 10 times, and in the human world,
    • 0:19:34if I want to do something 10 times, I might count up to and including 10.
    • 0:19:38But you can't have it both ways.
    • 0:19:39You can't start at zero and end at 10 if you
    • 0:19:41want to do something exactly 10 times.
    • 0:19:43So there's a couple of possibilities here.
    • 0:19:46How might we fix this?
    • 0:19:48Yeah, so we could certainly change it to less than.
    • 0:19:50What's another correct approach?
    • 0:19:53Yeah, so we could leave this alone and just start counting at one,
    • 0:19:56and if you're not actually printing the values in your actual program,
    • 0:19:59that might be perfectly reasonable, too.
    • 0:20:01It's just not conventional.
    • 0:20:03Get comfortable with, quickly, just counting from zero, because that's just
    • 0:20:06what most everyone does these days.
    • 0:20:08But the technique here is just use printf.
    • 0:20:11Like, when in doubt, literally use printf on this line, on this line,
    • 0:20:15on this line.
    • 0:20:15Anywhere something is interesting maybe going on in your program,
    • 0:20:18just use it to print out the strings that are in your variables,
    • 0:20:21print out the integers that are in your variables, or anything else.
    • 0:20:24And it allows you to kind of see, so to speak,
    • 0:20:26what's going on inside of your program, printf.
    • 0:20:32One last tool-- so it's not uncommon, when writing code,
    • 0:20:36to maybe get a little sloppy early on, especially when you're not
    • 0:20:39quite familiar with the patterns.
    • 0:20:40And for instance, if I go ahead and do this
    • 0:20:43by deleting a whole bunch of whitespace, even after fixing this mistake
    • 0:20:47by going from zero to 10, is this program now correct,
    • 0:20:52if the goal is to print 10 hashes?
    • 0:20:57Yeah, I heard yes.
    • 0:20:57Why is it correct?
    • 0:20:58In what sense?
    • 0:21:02Yeah, exactly.
    • 0:21:03It still works.
    • 0:21:04It prints out the 10 hashes, one per line,
    • 0:21:06but it's poorly written in the sense of style.
    • 0:21:09So recall that we tend to evaluate, and the world
    • 0:21:12tends to think about code in at least three ways.
    • 0:21:14One, the correctness-- does it do what it's supposed to do,
    • 0:21:16like print 10 hashes?
    • 0:21:17And yes, it does, because all I did was delete whitespace.
    • 0:21:19I didn't actually change or break the code after making that fix.
    • 0:21:22Two is design, like how thoughtful, how well-written is the code?
    • 0:21:25And frankly, it's kind of hard to write this in too many ways,
    • 0:21:28because it's so few lines.
    • 0:21:29But you'll see over time, as your programs grow,
    • 0:21:31the teaching fellows and staff can provide you with feedback
    • 0:21:33on the design of your code.
    • 0:21:35But style is relatively easy.
    • 0:21:36And I've been teaching it mostly by way of example, if you will,
    • 0:21:39because I've been very methodically indenting my code
    • 0:21:42and making sure everything looks very pretty, or at least pretty
    • 0:21:45to a trained eye.
    • 0:21:47But this, let's just stipulate, is not pretty.
    • 0:21:49Like, left aligning everything still works, not incorrect,
    • 0:21:53but it's poorly styled.
    • 0:21:54And what would be an argument for not writing code
    • 0:21:56like this and, instead, writing code the way
    • 0:21:58I did a moment ago, albeit after fixing the bug?
    • 0:22:02Yeah.
    • 0:22:02AUDIENCE: It'll help you identify each little subroutine that
    • 0:22:05goes through the thing, so you know this section is here.
    • 0:22:10DAVID J. MALAN: Yeah.
    • 0:22:11AUDIENCE: [INAUDIBLE] next one, so you know where everything is.
    • 0:22:13DAVID J. MALAN: Exactly.
    • 0:22:14Let me summarize this.
    • 0:22:15It allows you to see, more visually, what
    • 0:22:17are the individual subroutines or blocks of code doing
    • 0:22:20that are associated with each other?
    • 0:22:22Scratch is colorful, and it has shapes, like the hugging shape
    • 0:22:25that a lot of the control blocks make, to make
    • 0:22:27clear visually to the programmer that this block encompasses others,
    • 0:22:30and, therefore, this repeats block or this forever block
    • 0:22:34is doing these things again and again and again.
    • 0:22:36That's the role that these curly braces serve, and indentation
    • 0:22:38in this and in other contexts just helps it
    • 0:22:41become more obvious to the programmer what is inside of what
    • 0:22:45and what is happening where.
    • 0:22:46So this is just better written, because you
    • 0:22:49can see that the code inside of main is everything that's indented here.
    • 0:22:52The code that's inside the for loop is everything that's indented here.
    • 0:22:56So it's just for us human readers, teaching fellows
    • 0:22:58in the case of a course, or colleagues in the case of the real world.
    • 0:23:01But suppose that you don't quite see these patterns too readily initially.
    • 0:23:05That, too, is fine.
    • 0:23:06CS50 has on its website what we call a style guide.
    • 0:23:09It's just a summary of what your code should
    • 0:23:11look like when using certain features of C-- loops,
    • 0:23:14conditions, variables, functions, and so forth.
    • 0:23:16And it's linked on the course's website.
    • 0:23:18But there's also a tool that you can use when
    • 0:23:20writing your code that'll help you clean it up and make it consistent,
    • 0:23:23not just for the sake of making it consistent with the style guide,
    • 0:23:26but just making your own code more readable.
    • 0:23:28So for instance, if I go ahead and run a command called
    • 0:23:31style50 on this program, buggy2.c, and then hit Enter,
    • 0:23:37I'm going to see some output that's colorful.
    • 0:23:40I see my own code in white, and then I see, anywhere
    • 0:23:44I should have indented, green spaces that
    • 0:23:47are sort of encouraging me to put space, space, space, space here.
    • 0:23:50Put space, space, space, space here.
    • 0:23:51Put eight spaces here, four spaces here, and so forth,
    • 0:23:54and then it's reminding me I should add comments as well.
    • 0:23:56This is a short program-- doesn't necessarily
    • 0:23:58need a lot of commenting to explain what's going on.
    • 0:24:01But just one //, like we saw last week to explain,
    • 0:24:04maybe at the top of the file or top the block of the code,
    • 0:24:06would make style50 happy as well.
    • 0:24:08So let's do that.
    • 0:24:09Let me go ahead and take its advice and actually indent this with Tab,
    • 0:24:13this with Tab, this with Tab, this with Tab, and this once more.
    • 0:24:17And you'll notice that on your keyboard, even though you're hitting Tab,
    • 0:24:20it's actually converting it for you, which is very common to four spaces,
    • 0:24:23so you don't have to hit the spacebar four times.
    • 0:24:25Just get into the habit of using Tab.
    • 0:24:27And let me go ahead and write a comment here.
    • 0:24:30"Print 10 hashes."
    • 0:24:32This way, my colleagues, my teaching fellow, myself in a week
    • 0:24:35don't have to read my own code again and figure out what it's doing.
    • 0:24:37I can read the comments alone per the //.
    • 0:24:40If I run style50 again, now it looks good.
    • 0:24:44It's in accordance with the style guide, and it's just more prettily written,
    • 0:24:47so pretty printed would be a term of art in programming
    • 0:24:50when your code looks good and isn't just correct.
    • 0:24:53Any questions then?
    • 0:24:55Yeah.
    • 0:24:56AUDIENCE: I tried using [INAUDIBLE] this past week
    • 0:24:58and it said I needed a new program.
    • 0:25:00DAVID J. MALAN: That's--
    • 0:25:01it wasn't enabled for the first week of the class.
    • 0:25:04It's enabled as of right now and henceforth.
    • 0:25:06Other questions?
    • 0:25:08No.
    • 0:25:09All right, so just to recap then, three tools to have in the proverbial toolbox
    • 0:25:13now are help50 anytime you see an error message that you don't understand,
    • 0:25:16whether it's with make or Clang or, perhaps, something else.
    • 0:25:18Printf-- when you've got a logical program--
    • 0:25:21a bug in your program, and it's just not working the way it's supposed to
    • 0:25:24or the way the problem set tells you it should, and then style50
    • 0:25:27when you want to make sure that, does my code look right in terms of style,
    • 0:25:31and is it as readable as possible?
    • 0:25:33And honestly, you'll find us at office hours and the like
    • 0:25:35often encouraging you, hey, before we answer this question,
    • 0:25:38can you please run style50 on your code?
    • 0:25:40Can you please clean up your code, because it just makes our lives, too,
    • 0:25:43as other humans so much easier when we can understand what's
    • 0:25:45going on without having to visually figure out what parentheses and curly
    • 0:25:49braces line up.
    • 0:25:50And so do get into that habit, because it
    • 0:25:52will save you time from having to waste time parsing things visually yourself.
    • 0:25:57All right.
    • 0:25:58So there's not just CPUs in computers.
    • 0:26:01CPUs are the brains, central processing unit,
    • 0:26:03and that's why we keep emphasizing the instructions that computers understand.
    • 0:26:06There's also this, which we saw last time, too.
    • 0:26:09This is an example of what type of hardware?
    • 0:26:12AUDIENCE: RAM.
    • 0:26:12DAVID J. MALAN: RAM, or Random Access Memory.
    • 0:26:15This is the type of memory that laptops, desktops, servers
    • 0:26:17have that is used whenever you run a program or open a file.
    • 0:26:21There's another type of memory called hard drives or solid state drives,
    • 0:26:25which you're probably familiar as a consumer,
    • 0:26:27and that's just where your files are stored permanently.
    • 0:26:29Your battery can die.
    • 0:26:30You can pull the plug from your laptop or desktop,
    • 0:26:32and any files saved on a hard drive are persistent.
    • 0:26:35They stay there because of the technology
    • 0:26:37being used to implement that.
    • 0:26:38But RAM is more ephemeral.
    • 0:26:41RAM is powered only by electricity.
    • 0:26:43It's only used when the power is on or the battery is charged,
    • 0:26:46and it's where your files and programs live effectively when
    • 0:26:49you double click on them and open them.
    • 0:26:52So when you double click on something like Microsoft Word,
    • 0:26:54it is copied from your hard drive long term into this type of memory,
    • 0:26:59because this type of memory, though smaller in capacity--
    • 0:27:02you don't have as many bytes of it--
    • 0:27:04but it is much, much, much, much faster.
    • 0:27:06Similarly, when you open a document, or you go to a web page,
    • 0:27:09the contents of the file you're seeing are stored in this type of hardware,
    • 0:27:13because even though you don't have terribly many bytes of it,
    • 0:27:15it's just much, much, much, much faster.
    • 0:27:18And so this will be thematic in computer science and in hardware.
    • 0:27:20You sort of have lots of cheap, slow stuff,
    • 0:27:23like hard disk space, relatively speaking, and you have a little less
    • 0:27:27of the more expensive but faster stuff like RAM.
    • 0:27:30And you have just one, usually, CPU, which is the really fast thing that
    • 0:27:33can do a billion things per second.
    • 0:27:34But it, too, is more expensive.
    • 0:27:37So there's four visible chips on this thing, if you will.
    • 0:27:39And we won't get into the details of how these things work, but let's
    • 0:27:42just zoom in on this one black chip here and focus on it
    • 0:27:46as being representative as some amount of memory.
    • 0:27:48Maybe it's one megabyte, one million bytes.
    • 0:27:50Maybe it's even one gigabyte these days, one billion bytes.
    • 0:27:54But this is to say that this chip can be thought of as just having
    • 0:27:57a bunch of bytes in it.
    • 0:27:58This is not to scale.
    • 0:27:59You have many more bytes than these, but let
    • 0:28:01me propose that you just think of each of these squares
    • 0:28:04here as representing one byte.
    • 0:28:06So the very first byte of memory I have access to is here.
    • 0:28:08Next one is here, and so forth.
    • 0:28:10And the fact that they wrap around is just an artist rendition.
    • 0:28:13These things you can think of just virtually as going
    • 0:28:15left to right, not in any kind of grid, but physically, they look like this.
    • 0:28:19So when you actually create a variable in a program like C,
    • 0:28:23like you need a char.
    • 0:28:24A char tends to be one byte or eight bits,
    • 0:28:27and so that means when you have a variable of type char in a C program,
    • 0:28:32it goes, literally, physically in one of these boxes,
    • 0:28:35inside of your computer's RAM.
    • 0:28:37So for instance, it might take up this much space at top left.
    • 0:28:40If you have a bigger type of data, so you
    • 0:28:42have an integer, which tends to be four bytes or 32 bits,
    • 0:28:45you might need more than one square, so the computer might give you access
    • 0:28:48to four squares instead.
    • 0:28:50And you have 32 bits spanning that region of memory.
    • 0:28:54But honestly, I chose those boxes arbitrarily.
    • 0:28:56They could be anywhere in that chip or in any of the other chips.
    • 0:28:58It's up to the computer to just remember where they are for you.
    • 0:29:01You don't need to remember that, per se.
    • 0:29:04But if we think about this grid, it turns out
    • 0:29:07this is actually very valuable that we have chunks of memory--
    • 0:29:10bytes, if you will--
    • 0:29:11that are back to back to back to back.
    • 0:29:13And in fact, there's a word for this technique.
    • 0:29:16This is contiguous memory--
    • 0:29:17back to back to back to back to back.
    • 0:29:19And in general, in programming, this is referred to as an array.
    • 0:29:23You might recall from Scratch, if you use this feature,
    • 0:29:25it actually has things called lists, which
    • 0:29:27are exactly that-- lists of values, lists of words, lists of strings.
    • 0:29:30An array is just a contiguous chunk of memory, such
    • 0:29:33that you can store something here, something here, something here,
    • 0:29:35something here, and so forth.
    • 0:29:37So it turns out an array, this super simple primitive,
    • 0:29:41is actually incredibly powerful.
    • 0:29:43Just being able to store things in my computer's memory
    • 0:29:46back to back to back to back enables so many possibilities, both design-wise,
    • 0:29:52like how well I can write my code, and also how fast I can make my code run.
    • 0:29:56So let me go ahead and take out an example.
    • 0:29:59Let me go ahead and open up, for instance, a new file in a sandbox,
    • 0:30:04and we'll call this score0.
    • 0:30:06So let me go ahead and close this one, create a new file called scores0.c.
    • 0:30:12And in this file, let's go ahead and write a relatively simple program.
    • 0:30:16Let me go ahead and, as usual, give myself access
    • 0:30:18to some helpful functions-- cs50.h and stdio.h.
    • 0:30:22And no need to copy all this down verbatim, if you don't like.
    • 0:30:25Everything will have or is already on the course's website.
    • 0:30:28Let me start my program as usual with int main void.
    • 0:30:30And then let me write a program, as this program's name implies,
    • 0:30:33that, like, asks the user for three scores on recent problem sets,
    • 0:30:38quizzes, whatever, and then kind of creates a very simple chart of them,
    • 0:30:41like a bar chart to kind of help me visualize how well
    • 0:30:44or how poorly I did on something.
    • 0:30:46So if I want to get an integer, no surprise,
    • 0:30:49we can use the get int function, and I can just
    • 0:30:51ask the user for their first score.
    • 0:30:54But I should probably do something with this score,
    • 0:30:56and on the left hand side of this, what do I typically put?
    • 0:31:00Yeah.
    • 0:31:00So int-- sure, score 1 equals this, and then my semi-colon.
    • 0:31:04So you might not have had many occasions to use ints just yet,
    • 0:31:07but get int is in the cs50 library.
    • 0:31:09This is the so-called prompt that the human
    • 0:31:11sees, and let me actually fix my space, because I
    • 0:31:13want the human to see the space after the colon.
    • 0:31:16But that's just an aesthetic detail.
    • 0:31:18And then when I get back this value, its return value--
    • 0:31:21just like Aaron, last week, handed me a piece of paper,
    • 0:31:23so does get int hand me a virtual piece of paper with a number
    • 0:31:26that I'm going to store in a variable called Score 1.
    • 0:31:29And now just to be clear, what has just happened effectively is this.
    • 0:31:34The moment you create a variable of type int, which is four bytes,
    • 0:31:39literally, this is what Clang or, more generally,
    • 0:31:42the computer has done for you.
    • 0:31:44That int that the human typed in is stored literally
    • 0:31:47in four contiguous bytes back to back to back, maybe here, maybe here,
    • 0:31:51but together.
    • 0:31:52So that's all that's going on when you're actually using C.
    • 0:31:55So let me go back into my code here, and now I
    • 0:31:58want to-- it's not interesting to plot one score.
    • 0:32:00So let's go ahead and do another.
    • 0:32:01So int Score 2, get int, get int, and I'll ask the user for score 2,
    • 0:32:08semi-colon, and then let's get one more, Score 3, get int, call it Score 3,
    • 0:32:13semi-colon.
    • 0:32:14All right, so now let me go ahead and generate a bar,
    • 0:32:17like a bar chart of this.
    • 0:32:18I'm going to use what we'll call ASCII art.
    • 0:32:20ASCII, of course, is just text, recall--
    • 0:32:22very simple text in a computer.
    • 0:32:23And I can kind of make a bar chart pretty simply by just printing out
    • 0:32:27like a bunch of hashes horizontally, so a short bar
    • 0:32:30will represent a small number, and a long bar will represent a big number.
    • 0:32:34So let me go ahead and say to the user, all right, here's your Score 1.
    • 0:32:38I'm going to go ahead, then, and say, for int i get 0.
    • 0:32:41I is less than Score 1, i++.
    • 0:32:46And now if I scroll down and give myself a bit of room here,
    • 0:32:48let me go ahead and implement just a simple print.
    • 0:32:53So go ahead and print out a hash, and then when you're all done with that,
    • 0:32:57print out a new line at the end of that loop.
    • 0:33:01And let's just pause there.
    • 0:33:03Just to recap, I've asked the human for three scores.
    • 0:33:05I'm only doing something with one of them at the moment, so in fact,
    • 0:33:09just as a quick check, let me delete those so as to not get ahead of myself.
    • 0:33:13Let me do make score 0.
    • 0:33:15Cross my fingers.
    • 0:33:16OK, no errors.
    • 0:33:17Now let me go ahead and do ./score0, and your first score on a pset this year
    • 0:33:22out of 100 has been?
    • 0:33:24OK, 100.
    • 0:33:26And good job.
    • 0:33:27So it's a really long bar, and if we count those up,
    • 0:33:29hopefully, there's actually 100 bars.
    • 0:33:31And if we run it again and say, eh, it didn't go so well.
    • 0:33:33I got a 50.
    • 0:33:35That's half as big a bar.
    • 0:33:36So it seems like we're on our way correctness-wise.
    • 0:33:39So now let me go ahead and get the other scores.
    • 0:33:41Well, I had them here a moment ago.
    • 0:33:42So let me go ahead and just, well, copy, paste, and change this to two,
    • 0:33:46change this to three, change this to three, this to three.
    • 0:33:50All right, I know how to print bars clearly, so let me go ahead
    • 0:33:53and do this, and then do this, and then fix the indentation.
    • 0:33:56I don't want to say Score 1 everywhere.
    • 0:33:58I want to say a Score 2, Score 2.
    • 0:34:00I mean you're probably being rubbed the wrong way that this
    • 0:34:03is both tedious and sloppy, and why?
    • 0:34:06What am I doing poorly now design-wise?
    • 0:34:07AUDIENCE: Copying and pasting code.
    • 0:34:09DAVID J. MALAN: Like copy-pasting almost always bad, right?
    • 0:34:11There's redundancy here, but that's fine.
    • 0:34:13Let's prioritize correctness, at least, for now.
    • 0:34:15So let me go ahead and make Score 0.
    • 0:34:18All right, no mistakes-- ./score0.
    • 0:34:21And then Tab it.
    • 0:34:22Let me go ahead now and run--
    • 0:34:24OK, we got 100 the first time.
    • 0:34:26We got 50 the--
    • 0:34:27oh, that's a bug.
    • 0:34:29What did I do there?
    • 0:34:31See, this is what happens when you copy-paste.
    • 0:34:33So let's fix this.
    • 0:34:34That should say Score 2, so Control+C will quit a program.
    • 0:34:37Make score 0 will recreate it. ./0, Enter--
    • 0:34:42all right, here we go.
    • 0:34:43100, 50.
    • 0:34:44Let's split the difference--
    • 0:34:4675.
    • 0:34:47All right, so this is a simple bar chart horizontally drawn
    • 0:34:51of each of my three scores, where this is 100, this is 50, and this is 75.
    • 0:34:55But there's opportunities for improvement here.
    • 0:34:57So one, it rubbed some folks the wrong way
    • 0:34:59already that we were literally copying and pasting code.
    • 0:35:05So where is one opportunity for improvement here?
    • 0:35:09What should I do instead of copying and pasting that code again and again?
    • 0:35:13What ingredient can you bring?
    • 0:35:15OK, so we can use a loop and actually just do the same thing three times.
    • 0:35:19So let's try that.
    • 0:35:22Let me go ahead and do this.
    • 0:35:25So let's go ahead and delete the copy-paste I did,
    • 0:35:28and let me go ahead and say, OK, well, for int i get zero, i less than 3, i++.
    • 0:35:35Let me create a bracket.
    • 0:35:37I can highlight multiple lines and hit Tab,
    • 0:35:39and they'll all indent for me, which is convenient.
    • 0:35:41And can I do this now, for instance?
    • 0:35:48Say it a little louder.
    • 0:35:50AUDIENCE: If you [INAUDIBLE] to a specific [INAUDIBLE]..
    • 0:35:54DAVID J. MALAN: Yeah, I'm a little worried.
    • 0:35:56As you're noting here, we're using on line 13 here the same variable, so mm.
    • 0:36:01So it's good instincts, but I feel like the fact
    • 0:36:03that this program, unlike last week, we're
    • 0:36:05now collecting multiple pieces of data.
    • 0:36:06Loops are breaking down for us.
    • 0:36:08Yeah.
    • 0:36:08AUDIENCE: [INAUDIBLE] function [INAUDIBLE] takes in--
    • 0:36:13like you can have it [INAUDIBLE].
    • 0:36:16DAVID J. MALAN: OK.
    • 0:36:17AUDIENCE: So like an input of how many scores you wanted to enter.
    • 0:36:20DAVID J. MALAN: OK.
    • 0:36:21AUDIENCE: And then [INAUDIBLE].
    • 0:36:23DAVID J. MALAN: Yeah, we can implement another
    • 0:36:25function that factors out some of this functionality.
    • 0:36:27Any other thoughts?
    • 0:36:28AUDIENCE: Store your scores in an array.
    • 0:36:30DAVID J. MALAN: OK, so we could also store our scores in an array.
    • 0:36:33So let's do these in order then, in fact.
    • 0:36:34So loops are wonderful when you want to do something again and again
    • 0:36:37and again, but the whole purpose of a function,
    • 0:36:39fundamentally, is to factor out common functionality.
    • 0:36:43And there might still be a loop in the solution,
    • 0:36:45but the real fundamental problem with what I was doing a moment ago
    • 0:36:48was I was copying and pasting functionality--
    • 0:36:50shouldn't need to do that, because in both C and Scratch,
    • 0:36:52we had the ability to make our own functions.
    • 0:36:54So let's do that.
    • 0:36:55Let me undo my loop changes here, just to get us
    • 0:36:58back to where we were a moment ago.
    • 0:36:59And let me go ahead and, instead, clean this up a little bit.
    • 0:37:02Let me go ahead and create a new function
    • 0:37:04down here that I'm going to call, say, Chart, just
    • 0:37:07to create a chart for myself.
    • 0:37:09And it's going to take as input a score, but I could call this anything I want.
    • 0:37:12It's void as its return type, because I don't need it to hand me something
    • 0:37:15back.
    • 0:37:16Like I'm not getting a string from the user.
    • 0:37:18I'm just printing a char.
    • 0:37:19It's a so-called side effect or output.
    • 0:37:21Now I'm going to go ahead and do my loop here for int i get 0.
    • 0:37:25I is less than--
    • 0:37:27how many hashes do I want to print if I'm being passed in the user score?
    • 0:37:32Like, is this 3 here?
    • 0:37:34AUDIENCE: The score.
    • 0:37:35DAVID J. MALAN: The score, so if I'm being
    • 0:37:37handed a number that's 0 to 100, that's what I want to iterate over.
    • 0:37:40If my goal here, ultimately--
    • 0:37:43let me finish this thought-- i++ is [? 2 ?] inside this loop print out one
    • 0:37:48hash per point in 1's total score.
    • 0:37:52And just to keep things clean, I'm going to go ahead and put
    • 0:37:54a new line at the very end of this.
    • 0:37:56But I think now, I factored out a good amount of the redundancy.
    • 0:37:59It's not everything, but I've at least now
    • 0:38:01given myself a function called Chart.
    • 0:38:04So up here, it looks like I can kind of remove this loop, which
    • 0:38:08is what I factored out.
    • 0:38:10That's almost identical, except the variable name was hardcoded.
    • 0:38:13And I think I could now do chart like this,
    • 0:38:18and then I maybe could do a little copy-paste, if that's OK, like if maybe
    • 0:38:22I can get away with just doing this, and then say 2, and then say 3,
    • 0:38:28and then say 3, and then say 2.
    • 0:38:30So it's still copy-paste, but it's less.
    • 0:38:32And it looks better.
    • 0:38:33It literally fits on the screen, so it's progress-- not perfect, but progress.
    • 0:38:36Better design, but not perfect.
    • 0:38:38So is this going to compile?
    • 0:38:42I'm going to have errors why?
    • 0:38:44AUDIENCE: Essentially, it's [INAUDIBLE] the program [INAUDIBLE]..
    • 0:38:47DAVID J. MALAN: OK.
    • 0:38:49Yeah.
    • 0:38:50AUDIENCE: We need to declare a [INAUDIBLE]..
    • 0:38:52DAVID J. MALAN: OK, good.
    • 0:38:53So let me induce the actual error, just so we know what problem we're solving.
    • 0:38:58Let me go ahead and sort of innocently go ahead
    • 0:39:00and compile Score 0 hoping all is well, but of course,
    • 0:39:03it's not because of a familiar error up here.
    • 0:39:07So notice, implicit declaration of function chart is invalid in C99.
    • 0:39:12So again, implicit declaration of function
    • 0:39:14just tends to mean Clang does not know what you're talking about.
    • 0:39:18And you could run help50, and it would probably
    • 0:39:20provide you with similar advice.
    • 0:39:22But the gist of this is that chart is not a C function.
    • 0:39:25It doesn't come with C. I wrote it.
    • 0:39:27I just wrote it a little too late.
    • 0:39:29So one solution that we didn't used last week
    • 0:39:32would be, OK, well, if you don't know what chart is, let me just
    • 0:39:35go put it where you'll know about it.
    • 0:39:37And now run make score 0.
    • 0:39:40OK, problem solved.
    • 0:39:42So that fixes it, but we fixed it in a different way last week.
    • 0:39:46And why might we want to stick with last week's approach
    • 0:39:48and not just copy-paste my function and put it
    • 0:39:50at the top instead of the bottom?
    • 0:39:55AUDIENCE: [INAUDIBLE].
    • 0:39:57DAVID J. MALAN: Yeah, I mean it's kind of a minor concern at the moment,
    • 0:40:00because this is a pretty short program.
    • 0:40:02But I'm pushing the main part of my program, literally called Main,
    • 0:40:06farther and farther down.
    • 0:40:07And the whole point of reading code is to understand what it's doing.
    • 0:40:10So if I open this file, and I have to scroll, scroll, scroll, scroll, scroll,
    • 0:40:13just looking for the main function, it's just bad style.
    • 0:40:16It's just kind of nice, and it's a good human convention.
    • 0:40:18Put the main code, the main function, when green flag clicks equivalent,
    • 0:40:22at the very top.
    • 0:40:23So C does offer us a solution here.
    • 0:40:25You just have to provide it with a little hint.
    • 0:40:27Let me go ahead and cut this from here, put it back down at the bottom here,
    • 0:40:32and then go ahead and copy-paste only or retype only the value--
    • 0:40:38whoops-- the value of that first line, which is its so-called prototype.
    • 0:40:43Give Clang enough information so that it knows what arguments the function
    • 0:40:47takes, what its return type is, and what its name is, semi-colon,
    • 0:40:50and that's the so-called declaration or--
    • 0:40:53and then implement it with the curly braces and all the logic down below.
    • 0:40:58So let's go ahead and run this.
    • 0:40:59And if I scroll up here, we'll see-- whoops.
    • 0:41:03We'll see make score 0.
    • 0:41:05All right, now we're on our way, score 0.
    • 0:41:08Enter.
    • 0:41:08Score 1 is 100, 50, 75, and now we seem to have some good functionality.
    • 0:41:13But there's still an opportunity, I dare say, for improvement.
    • 0:41:17And I think the fundamental problem is that I'm still
    • 0:41:19copy-pasting the little stuff, but I think
    • 0:41:21the fundamental problem is that I don't have the expressiveness to store
    • 0:41:26multiple values, unless I, in advance, as the programmer,
    • 0:41:30give them all unique names, because if I use the same variable for everything,
    • 0:41:34I couldn't collect all three variables at the top,
    • 0:41:37and then iterate over all three at the bottom, if I only have one variable.
    • 0:41:40So I do need three variables, but this doesn't scale very well.
    • 0:41:43And who knows?
    • 0:41:44If I want to take in five scores, 10 scores, or more scores,
    • 0:41:47then I'm really copying and pasting excessively.
    • 0:41:51So it turns out, indeed, the answer is an array.
    • 0:41:53So an array, at the end of the day, is just
    • 0:41:55a side effect of storing stuff in memory back to back to back to back.
    • 0:41:59But what's powerful about this reality of memory is the following.
    • 0:42:03I can go ahead here and in, say, a new and more improved
    • 0:42:07version of this program, do this.
    • 0:42:10Let me go ahead and open this one, which I wrote in advance, called scores2.c.
    • 0:42:14And in scores2.c, notice we have the following code.
    • 0:42:18In my main function, I've got a new feature and a new bit of syntax.
    • 0:42:23This line here that I've highlighted says, hey, Clang,
    • 0:42:26give me a variable called Scores of type integer,
    • 0:42:30but please give me three of them.
    • 0:42:32So the new syntax are your square brackets,
    • 0:42:34and inside of which is the number of variables you want of that type.
    • 0:42:37And you don't have to give them unique names.
    • 0:42:39You literally call them collectively, Scores,
    • 0:42:41and in English, I deliberately chose a plural to connote as much.
    • 0:42:44This is an array of values, not a single value.
    • 0:42:48What can I do next?
    • 0:42:49Well, here's my for loop for int i get zero i is less than 3 i++,
    • 0:42:53and now I've solved that earlier problem that was proposed.
    • 0:42:56Well, just put it in a loop.
    • 0:42:57Now I can, because now my variables are not called Score 1, Score 2, Score 3,
    • 0:43:02which I literally had to hard code.
    • 0:43:04They're just called Scores, and now that they're called Scores,
    • 0:43:07and I have this square bracket notation, notice what I can do.
    • 0:43:10I can get an int, and I can say, give me score%i, and plug in i plus 1.
    • 0:43:15I didn't want to say "zero," because humans
    • 0:43:17don't count from zero in general.
    • 0:43:18So this is counting from one, two, and three, but the computer is doing this.
    • 0:43:24So Scores is a variable.
    • 0:43:25Bracket, i, close bracket says store the i-th value there.
    • 0:43:32So i-th is just non-English.
    • 0:43:33That means go to bracket 0, bracket 1, bracket 2.
    • 0:43:36So what this effectively means is on the first iteration of the loop, when
    • 0:43:40i equals 0, this looks like this, effectively.
    • 0:43:44When i then becomes 1 on the next iteration, then you're doing this.
    • 0:43:48When i becomes 2 on the final iteration, it looks like this.
    • 0:43:51When i becomes 3, well, 3 is not less than 3,
    • 0:43:54and so it doesn't execute again.
    • 0:43:57So by using i inside of these square brackets, am I indexing into an array?
    • 0:44:03To index into an array means go to a specific location,
    • 0:44:06the so-called i-th location, but you start counting at zero.
    • 0:44:09Just to make this more real, then, if you go back
    • 0:44:12to this picture of your computer's memory,
    • 0:44:14this might, therefore, be bracket i, bracket 1--
    • 0:44:18bracket 0, bracket 1, bracket 2, bracket 3, bracket 4, bracket 50, or wherever.
    • 0:44:23You can now, using square brackets, get at any of these blocks of memory
    • 0:44:27to store values for you.
    • 0:44:30Any questions on what we've just done?
    • 0:44:34All right, then on the flip side, we can do the exact same thing.
    • 0:44:37Now when I print my scores, I can similarly iterate from 0 to 3,
    • 0:44:42and then print out the scores by passing to chart
    • 0:44:45the same value, the i-th score.
    • 0:44:48Again, the only new syntax here is variable name, square bracket,
    • 0:44:51and then a number, like 0, 1, 2, or a variable like i,
    • 0:44:54and then my chart function down here is exactly the same.
    • 0:44:57It has no idea an array is even involved, because I'm just
    • 0:45:00passing in one score at a time.
    • 0:45:04Now it turns out there's still one bad design decision in this program.
    • 0:45:08There's still some redundancy, something that I keep typing again
    • 0:45:12and again and again.
    • 0:45:14Do any values jump out at you as repeated?
    • 0:45:19AUDIENCE: The for loop.
    • 0:45:20DAVID J. MALAN: The for loop.
    • 0:45:21OK, so I've got the for loop in multiple places.
    • 0:45:24Sure.
    • 0:45:25And what other value seems to be in multiple places?
    • 0:45:29It's subtle.
    • 0:45:32Total number.
    • 0:45:33Yeah, 3.
    • 0:45:34Three is in a few places.
    • 0:45:35It's up here.
    • 0:45:36It's when I declare the array and ask myself for three scores.
    • 0:45:41It's here when I'm iterating.
    • 0:45:44It's not here, because this is a different iteration.
    • 0:45:46That's just for the hashes.
    • 0:45:48So in, ironically, three places, have I written 3.
    • 0:45:51So what does this mean?
    • 0:45:52Well, suppose next year you take more tests or whatever,
    • 0:45:54and you need more scores.
    • 0:45:55You open up your program, and all right, now I've got five scores and five--
    • 0:46:00whoops, typo already-- five, like this kind of pattern
    • 0:46:04where you're typing the same thing again and again.
    • 0:46:06And now the onus is on me, the programmer,
    • 0:46:08to remember to change the same [? damn ?] value in multiple places--
    • 0:46:11bad, bad, bad design.
    • 0:46:12You're going to miss one of those values.
    • 0:46:14Your program's going to get more complex.
    • 0:46:15You're going to leave one at 3 and change the other to 5,
    • 0:46:18and logical errors are eventually going to happen.
    • 0:46:20So how do we solve this?
    • 0:46:21The function's not the solution here, because it's not functionality.
    • 0:46:24It's just a value.
    • 0:46:25Well, we could use a variable, but a certain type of variable.
    • 0:46:28These numbers here-- 5, 5, 5 or 3, 3, 3--
    • 0:46:32are what humans generally refer to as magic numbers.
    • 0:46:34Like they're numbers, but they're kind of magical,
    • 0:46:36because you just arbitrarily hardcoded them in random places.
    • 0:46:39But a better convention would be, often as a global variable, to do this--
    • 0:46:44int, let's call it "count," equals 3.
    • 0:46:47So declare a variable of type int that is the number of things
    • 0:46:51you want, and then type that variable name all throughout your code
    • 0:46:55so that later on, if you ever want to change this program,
    • 0:46:58you change it-- whoops-- in one place, and you're
    • 0:47:01done after recompiling the program.
    • 0:47:03And actually, I should do a little better than this.
    • 0:47:05It turns out that if you know you have a variable that you're never going
    • 0:47:08to change, because it's not supposed to change--
    • 0:47:10it's supposed to be a constant value--
    • 0:47:12C also has a special keyword called const, where before the data type,
    • 0:47:16you say, const int, and then the name and then the value, and this way,
    • 0:47:20the compiler, Clang, will make sure that you, the human,
    • 0:47:22don't screw up and accidentally try to change the count anywhere else.
    • 0:47:27There's one other thing notable.
    • 0:47:28I also capitalize this whole thing for some reason--
    • 0:47:30human convention.
    • 0:47:31Anytime you capitalize all of the letters in a variable name,
    • 0:47:34the convention is that that means it's global.
    • 0:47:36That means it's defined way up top, and you can use it anywhere, therefore,
    • 0:47:40because it's outside all curly braces.
    • 0:47:42But it's meant to imply and remind you that this is special.
    • 0:47:46It's not just a so-called local variable inside
    • 0:47:48of a function or inside of a loop or the like.
    • 0:47:52Any questions on that?
    • 0:47:54Yeah.
    • 0:47:55AUDIENCE: What is [INAUDIBLE]?
    • 0:47:56Why do you have i plus 1?
    • 0:47:57DAVID J. MALAN: Oh, why do I have i plus 1?
    • 0:47:59Let me run this program real quick.
    • 0:48:01Why do I have i plus 1 in this line here, is the question.
    • 0:48:05So let me go ahead and run make scores 2--
    • 0:48:07whoops-- in my directory.
    • 0:48:09Make scores 2 ./scores2, Enter.
    • 0:48:13I wanted just the human to see Score 1 and Score 2 and Score 3.
    • 0:48:18I didn't want him or her to see Score 0, Score 1, Score 2, because it just looks
    • 0:48:22lame to the human.
    • 0:48:23The computer needs to think in terms of zeros.
    • 0:48:25My humans and my users do not, so just an aesthetic.
    • 0:48:29Other questions.
    • 0:48:29Yeah.
    • 0:48:30AUDIENCE: [INAUDIBLE].
    • 0:48:37DAVID J. MALAN: Ah, really good question.
    • 0:48:39And I actually thought about this last night
    • 0:48:40when trying to craft this example.
    • 0:48:43Why don't I just combine these two for loops,
    • 0:48:45because they're clearly iterating an identical number of times?
    • 0:48:50Was this a hand or just a stretch?
    • 0:48:52No, stretch.
    • 0:48:53So this is actually deliberate.
    • 0:48:57If I combine these, what would change logically in my program?
    • 0:49:01Yeah.
    • 0:49:02AUDIENCE: After every [INAUDIBLE] input, you would [INAUDIBLE]..
    • 0:49:05DAVID J. MALAN: Yeah, so after every human input of a score,
    • 0:49:08I would see that user's chart, the row of hashes.
    • 0:49:11Then I'd ask them for another value.
    • 0:49:13They'd see the chart, another value, and they'd see the chart.
    • 0:49:15And that's fine, if that is the design you want.
    • 0:49:17Totally acceptable.
    • 0:49:18Totally correct.
    • 0:49:19I wanted mine to look a little more traditional with all of the bars
    • 0:49:22together, so I effectively had to postpone printing the hashes.
    • 0:49:25And that's why I did have a little bit of redundancy
    • 0:49:27by getting the user's input here and then iterating again
    • 0:49:30to actually print the user's output as a chart, so just a design decision.
    • 0:49:34Good question.
    • 0:49:35Other questions?
    • 0:49:37All right, so what does this look like?
    • 0:49:40Actually, you know what?
    • 0:49:41I can probably do a little better.
    • 0:49:42Let me open up one final example involving scores and this thing
    • 0:49:45called an array.
    • 0:49:46In Scores 4 here, let me go ahead and do this.
    • 0:49:52Now I've changed my chart function to do a little bit more,
    • 0:49:55and you might recall from week 0 and 1, we had the call function,
    • 0:49:58and we kept enhancing it to do more and more,
    • 0:50:00like putting more and more logic into it.
    • 0:50:02Notice this.
    • 0:50:04Chart function now takes a second argument, which is kind of interesting.
    • 0:50:08It takes one argument, which is a number,
    • 0:50:10and then the next argument is an array of scores.
    • 0:50:13So long story short, if you want to have a function that
    • 0:50:16takes as input an array, you don't have to know
    • 0:50:18in advance how big that array is.
    • 0:50:20You should not, in fact, put a number in between the square brackets
    • 0:50:23in this context.
    • 0:50:24But the thing is you do need to know, at some point,
    • 0:50:27how many items are in the array.
    • 0:50:29If you've programmed in Java, took AP CS, Java just gives you .length,
    • 0:50:32if you recall that feature of objects.
    • 0:50:35C does not have this.
    • 0:50:36Arrays do not have an inherent length associated with them.
    • 0:50:39You have to tell everyone who uses your array how long it is.
    • 0:50:44So even though you don't do that syntactically here,
    • 0:50:46you literally just say, I expect an argument called scores that
    • 0:50:50is an array per the square brackets.
    • 0:50:53You have to pass and almost always a second variable
    • 0:50:55that is literally called whatever you want,
    • 0:50:57but is the number of things in that array,
    • 0:50:59because if the goal of this function is just
    • 0:51:02to iterate over the number of scores that are passed in,
    • 0:51:09and then iterate over the number of points in that score
    • 0:51:12in order to print out the hashes, you need to know this count.
    • 0:51:16So what does this function do, just to be clear?
    • 0:51:19This iterates over the total number of scores from 0 to count,
    • 0:51:22which is probably 3 or 5 or whatever.
    • 0:51:24This loop here, using J, which is just a convention,
    • 0:51:27instead iterates from 0 to whatever that i-th score is.
    • 0:51:32So this is what's convenient.
    • 0:51:33Now I've passed in the array, and I can still get at individual values
    • 0:51:38just by using i, because I'm on my i-th iteration here.
    • 0:51:41So you might recall this from Mario, for instance, or any other example
    • 0:51:44in which you had nested loops--
    • 0:51:46just very conventional to use i on the outside, j on the inside.
    • 0:51:50But again, the only point here is that you can, indeed, pass around arrays,
    • 0:51:54even as arguments, which we'll see why that's useful before long.
    • 0:51:59Any questions?
    • 0:52:02OK, so this was a lot, but we can do so much more still with arrays.
    • 0:52:05It gets even more and more cool.
    • 0:52:07In fact, we'll see, in just a bit, how arrays have actually
    • 0:52:10been with us since last week.
    • 0:52:11We just didn't quite realize it under the hood, but let's go ahead
    • 0:52:14and take a breather, five minutes.
    • 0:52:15We'll come back and dive in.
    • 0:52:16All right.
    • 0:52:17So I know that was a bit of a cliffhanger.
    • 0:52:20Where else could arrays have actually been?
    • 0:52:22But, of course, this is how we might depict it pictorially.
    • 0:52:25We called it an array, and it turns out that last week, when
    • 0:52:27we introduced strings, strings, sequences of characters,
    • 0:52:31are literally just an array by another name.
    • 0:52:34A string is an array of chars, and chars, of course, is another data type.
    • 0:52:39Now what are the actual implications of this,
    • 0:52:41both in terms of representation, like how a computer's representing
    • 0:52:44information, and then fundamentally, programmatically,
    • 0:52:48what can we do when we know all of our data
    • 0:52:50is so back to back to back or so proximal to one another?
    • 0:52:53Well, it turns out that we can apply this logic in a few different ways.
    • 0:52:58Let me go ahead and open up, for instance,
    • 0:53:01an example here called String 0.
    • 0:53:04So in our code for today, in our Source 2 folder,
    • 0:53:08let me go ahead and open up String 0, and this example looks like this.
    • 0:53:13Notice that we first, on line 9, get a string from the user.
    • 0:53:17Just say, input, please.
    • 0:53:19We store that value in a string, s, and then we say, here comes the output.
    • 0:53:23And notice what I'm doing in the following line.
    • 0:53:26I'm iterating over i from 0 to strlen, whatever that is.
    • 0:53:31And then in line 13, I'm printing a character one at a time.
    • 0:53:34But notice the syntax I'm using, which we didn't use last week.
    • 0:53:38If you have a string called s, you can index into a string
    • 0:53:43just like it's an array, because it, indeed, is underneath the hood.
    • 0:53:46So s bracket i, where i starts at 0 and goes
    • 0:53:50up to whatever this value is is just a way of getting character 0, then
    • 0:53:55character 1, then character 2, then character 3,
    • 0:53:58and so the end result is actually going to look like this.
    • 0:54:01Let me go ahead and do, make string--
    • 0:54:04whoops-- make string 0.
    • 0:54:06Oops.
    • 0:54:06Not in the directory.
    • 0:54:07Make string 0, ./string0, Enter, and I'll type in, say, Zamyla,
    • 0:54:15and the output now is Z-A-M-Y-L-A. It's a little messy,
    • 0:54:21because I don't have a new line here, so let me actually-- let's clean that up,
    • 0:54:24because this is unnecessarily sloppy.
    • 0:54:27So let me go ahead and print out a new line.
    • 0:54:31Let me recompile with make string 0, dot--
    • 0:54:34whoops-- ./string0.
    • 0:54:37Input shall be Zamyla, Enter, and now Z-A-M-Y-L-A.
    • 0:54:43So why is that happening?
    • 0:54:44Well, if I scroll down on this code, it seems that I am,
    • 0:54:47via this printf line here, just getting the i-th character of the name in s,
    • 0:54:52and then printing out one character at a time per the %c,
    • 0:54:55followed by a new line.
    • 0:54:57So you might guess, what is this function here doing?
    • 0:55:01Strlen-- slightly abbreviated, but you can, perhaps, glean what it means.
    • 0:55:06Yeah, so it's actually string length.
    • 0:55:08So it turns out there is a function that comes
    • 0:55:11with C called strlen, and humans back in the day
    • 0:55:13and to this day like to type as few characters when possible.
    • 0:55:17And so strlen is string length, and the way you use it is you
    • 0:55:21just need one more header file.
    • 0:55:22So there's another library, the so-called string
    • 0:55:24library that gives you string-related functions
    • 0:55:27beyond what CS50's library provides.
    • 0:55:29And so if you include string.h, that gives you access
    • 0:55:32to another function called strlen, that if you pass it,
    • 0:55:35a variable containing a string, it will pass you back
    • 0:55:38as a return value the total number of characters.
    • 0:55:40So I typed in Z-A-M-Y-L-A, and so that should be returning to me six,
    • 0:55:46thereby printing out the six characters in Zamyla's name.
    • 0:55:49Yeah.
    • 0:55:50AUDIENCE: [INAUDIBLE].
    • 0:55:52DAVID J. MALAN: Uh-huh.
    • 0:55:54AUDIENCE: [INAUDIBLE] useful to get the individual digits [INAUDIBLE]..
    • 0:55:56DAVID J. MALAN: Really good question.
    • 0:55:57In the credit problem of the problem set, would this have been useful?
    • 0:56:00Yes, absolutely.
    • 0:56:01But recall that in the credit pset, we encourage you to actually
    • 0:56:04take in the number as a long, so as an integral value, which
    • 0:56:07thereby necessitated arithmetic.
    • 0:56:09But yes, if you had, instead, in a problem involving credit card
    • 0:56:12numbers, gotten the human's input as a long string of characters
    • 0:56:16and not as an actual number like an int or a long,
    • 0:56:18then, yes, you could actually get at those individual characters,
    • 0:56:21which probably would have made things even easier but deliberate.
    • 0:56:26Yeah.
    • 0:56:27AUDIENCE: [INAUDIBLE].
    • 0:56:29DAVID J. MALAN: Really good question.
    • 0:56:30If we're defining string in CS50, are we redefining it in string?
    • 0:56:33No.
    • 0:56:34So string, even though it's named string.h,
    • 0:56:36doesn't actually define something called a string.
    • 0:56:39It just has string-related functions.
    • 0:56:42More on that soon.
    • 0:56:43Yeah.
    • 0:56:43AUDIENCE: [INAUDIBLE] individual values [INAUDIBLE]??
    • 0:56:46DAVID J. MALAN: Ah, really good question.
    • 0:56:48Could you edit the individual values?
    • 0:56:51So short answer, yes.
    • 0:56:52We could absolutely change values, and we'll soon do that in another context.
    • 0:56:57Other questions?
    • 0:56:59All right, so turns out this is correct, if my goal is
    • 0:57:03to print out all of the characters in Zamyla's name,
    • 0:57:06but it's not the best design.
    • 0:57:07And this one's a little subtle, but this is, again, what we mean by design.
    • 0:57:09And to a question that came up during the break,
    • 0:57:12did we expect everyone to be writing good style and good design last week?
    • 0:57:16No.
    • 0:57:16Up until today, like we've introduced the notion of correctness
    • 0:57:19in both Scratch and in C last week, but now we're
    • 0:57:21introducing these other axes of quality of code
    • 0:57:24like design, how well-designed it is, and how pretty
    • 0:57:27does it look in the context of style.
    • 0:57:28So expectations are here on out meant to be aligned with those characteristics,
    • 0:57:33but not in the past.
    • 0:57:35So there's a slight inefficiency here.
    • 0:57:37So on the first iteration of this loop, I first initialize i to 0,
    • 0:57:41and then I check if i less than the length of the string, which hopefully,
    • 0:57:45it is, if it's Zamyla, which is longer than 0.
    • 0:57:48Then I print the i-th character.
    • 0:57:50Then I increment i.
    • 0:57:51Then I check this condition.
    • 0:57:53Then I print the i-th character.
    • 0:57:55Then I increment i.
    • 0:57:56Then I check this condition and so forth.
    • 0:57:58We looped through loops last week, and you've used them, perhaps,
    • 0:58:01by now in problems.
    • 0:58:03What question am I redundantly asking seemingly unnecessarily?
    • 0:58:11I have to check a condition again and again,
    • 0:58:13because i is getting incremented.
    • 0:58:15But there's another other question that I
    • 0:58:18don't need to keep asking again just to get the same answer.
    • 0:58:21AUDIENCE: What is the length [? of the string? ?]
    • 0:58:23DAVID J. MALAN: Yeah, there's this function call
    • 0:58:25in my loop of strlen s, which is fine.
    • 0:58:28This is correct.
    • 0:58:29I'm checking the length of the string, but once I type in Zamyla,
    • 0:58:32her name is not changing in length.
    • 0:58:34I'm incrementing i, so I'm moving in the string, if you will.
    • 0:58:38But the string itself, Z-A-M-Y-L-A, is not changing.
    • 0:58:41So why am I asking the computer, again and again, get me the strlen of s,
    • 0:58:46get me the strlen of s, get me the strlen of s.
    • 0:58:48So I can actually fix this.
    • 0:58:49I can improve the design, because that must take some amount of time.
    • 0:58:52Maybe it's fast, but it's still a non-zero amount of time.
    • 0:58:55So you know what I could do?
    • 0:58:56I could do something like this-- int n get string length of s.
    • 0:58:59And now just do this.
    • 0:59:01This would be better design, because now I'm only asking the question once
    • 0:59:05of the function.
    • 0:59:06I'm remembering or caching, if you will, the answer, and then
    • 0:59:09I'm just using a variable.
    • 0:59:10And just comparing variables is just faster
    • 0:59:12than comparing a variable against a function, which has to be called,
    • 0:59:16which has to return a value, which you can then compare.
    • 0:59:18But honestly, it doesn't have to be this verbose.
    • 0:59:20We can actually be a little elegant about this.
    • 0:59:22If you're using a loop, a secret feature of loops
    • 0:59:25is that you can have commas after declaring variables.
    • 0:59:28And you can actually do this and make this even more elegant, if you will,
    • 0:59:32or more confusing-looking, depending on your perspective.
    • 0:59:35But this now does the same thing but declares n inside of the loop,
    • 0:59:39just like I'm declaring i, and it's just a little tighter.
    • 0:59:41It's one fewer lines of code.
    • 0:59:44Any questions, then?
    • 0:59:47AUDIENCE: [INAUDIBLE].
    • 0:59:50DAVID J. MALAN: Good question.
    • 0:59:51In the way I've just done it cannot reuse this outside of the curly braces.
    • 0:59:54The scope of i and n exists only in this context right now.
    • 0:59:59The other way, yes.
    • 1:00:00I could have used it elsewhere.
    • 1:00:03AUDIENCE: What if you [INAUDIBLE] other loops, and you also had [INAUDIBLE]??
    • 1:00:09DAVID J. MALAN: Absolutely.
    • 1:00:11AUDIENCE: Using different letters of the alphabet,
    • 1:00:13you could just use n and not be [INAUDIBLE]..
    • 1:00:17DAVID J. MALAN: Correct.
    • 1:00:18If I want to use the length of s again, absolutely.
    • 1:00:20I can declare the variable, as I did earlier, outside of the loop,
    • 1:00:24so as to reuse it.
    • 1:00:25That's totally fine.
    • 1:00:26Yes.
    • 1:00:27And even i-- i exists only inside of this loop, so if I have another loop,
    • 1:00:31I can reuse i, and it's a different i, because these variables only
    • 1:00:34exist inside the for loop in which they're declared.
    • 1:00:37So it turns out that these strings don't have anything in them
    • 1:00:44other than character after character after character.
    • 1:00:46And in fact, let me go ahead here and draw
    • 1:00:49a picture of what's actually going on underneath the hood of the computer
    • 1:00:52here.
    • 1:00:52So when I type in Zamyla's name, I'm, of course,
    • 1:00:55doing something like Z-A-M-Y-L-A. But where is that actually going?
    • 1:01:02Well, we know now that inside of your computer is RAM or memory,
    • 1:01:04and you can think of it like a grid.
    • 1:01:06And honestly, I can think of this whole screen
    • 1:01:08as just being in a different orientation, a grid of memory.
    • 1:01:11So for instance, maybe we can divide it into rows and columns like this, not
    • 1:01:16necessarily to scale, and there's more rows and columns.
    • 1:01:20So on the screen here, I'm just dividing things
    • 1:01:23into the individual bytes of memory that we saw a moment ago.
    • 1:01:28And so, indeed, underneath the hood of the computer is this layout of memory.
    • 1:01:32The compiler has somehow figured out or the program has somehow figured out
    • 1:01:35where to put the z and where the a and the m and the y and the l and the a,
    • 1:01:39but the key is that they're all contiguous, back to back to back.
    • 1:01:42But the catch is if I'm typing other words into my program or scores
    • 1:01:46into my program or any data into my program,
    • 1:01:49it's going to end up elsewhere in the computer's memory.
    • 1:01:51So how do you know where Zamyla begins and where
    • 1:01:53Zamyla ends, so to speak, in memory?
    • 1:01:56Well, the variable, called s, essentially is here.
    • 1:02:02There's some remembrance in the computer of where s begins.
    • 1:02:06But there's no obvious way to know where Zamyla ends,
    • 1:02:10unless we ourselves tell the computer.
    • 1:02:12So unbeknownst to us, any time a computer is storing a string like
    • 1:02:16Z-A-M-Y-L-A, it turns out that it's not using one, two, three, four, five,
    • 1:02:21six characters.
    • 1:02:22It's actually using seven secretly.
    • 1:02:25It's actually putting a special character
    • 1:02:28of all zeros in the very last bytes.
    • 1:02:33Every byte is eight bits, so it's putting secretly eight zeros there,
    • 1:02:37or we can actually draw this more conventionally as /0.
    • 1:02:40It's what's called the null character, and it just means all zeros.
    • 1:02:44So the length of the string, Zamyla, is six,
    • 1:02:46but how many bytes does it apparently take up, just to be clear?
    • 1:02:50So it actually takes up seven.
    • 1:02:52And this is kind of a secret implementation detail
    • 1:02:54that we don't really have to care about, but eventually, we will,
    • 1:02:57because if we want to implement certain functionality,
    • 1:02:59we're going to need to know what is actually going on.
    • 1:03:01So for instance, let me go ahead and do this.
    • 1:03:03Let me go ahead and create a program called strlen itself.
    • 1:03:07So this is not a function but a program called strlen.c.
    • 1:03:10Let me go ahead and include the CS50 library at the top.
    • 1:03:13Let me go ahead and include stdio.h.
    • 1:03:15Let me go ahead and type out main void, so all this is same as always.
    • 1:03:20And then let me go ahead and prompt the user for, say, his or her name,
    • 1:03:24like so.
    • 1:03:25And then you know what?
    • 1:03:26Let me actually, this time, not just print their name out,
    • 1:03:28because we've done that ad nauseam.
    • 1:03:29Let's just count the number of letters in his or her name.
    • 1:03:32So how could we do that?
    • 1:03:33Well, we could just do this-- int n get strlen of s, and then say,
    • 1:03:40printf "The length of your name is %i."
    • 1:03:45And then we can plug in n, because that's
    • 1:03:48the number we stored the length in.
    • 1:03:49But to use strlen, I have to include what header file?
    • 1:03:52String.h, which is the new one, so string.h.
    • 1:03:56And now if I type this all correctly, make strlen, make strlen, good.
    • 1:04:02./strlen-- let's try it--
    • 1:04:05Zamyla.
    • 1:04:06Enter.
    • 1:04:06OK, the length of her name is six.
    • 1:04:08But what is strlen doing?
    • 1:04:10Well, strlen is just an abstraction for us that someone else wrote,
    • 1:04:13and it's wonderfully convenient, but you know, we don't strictly need it.
    • 1:04:16I can actually do this myself.
    • 1:04:18If I understand what the computer is doing,
    • 1:04:20I can implement this same functionality myself as follows.
    • 1:04:24I can declare a variable called n and initialize it to 0,
    • 1:04:26and then you know what?
    • 1:04:27I'm going to go ahead and do this.
    • 1:04:29While s bracket n does not equal all zeros,
    • 1:04:36but you don't write all zeros like this.
    • 1:04:38You literally do this--
    • 1:04:39that /0 to which I referred earlier in single quotes.
    • 1:04:42That just means all zeros in the bytes.
    • 1:04:45And now I can go ahead and do n++.
    • 1:04:47If I'm familiar with what this means, remember,
    • 1:04:49that this is just n equals n plus 1, but it's just a little more compact to say,
    • 1:04:54n++.
    • 1:04:55And then I can print out the name of your n--
    • 1:04:57the name of your n--
    • 1:04:59the name of-- the length of your name is %i, plugging in n.
    • 1:05:03So why does this work?
    • 1:05:05It's a little funky-looking, but this is just
    • 1:05:07demonstrating an understanding of what's going on
    • 1:05:09underneath the proverbial hood.
    • 1:05:10If n is initialized to zero, and I look at s bracket n,
    • 1:05:14well, that's like looking at s bracket 0.
    • 1:05:16And if the string, s, is Zamyla, what is s bracket 0?
    • 1:05:21Z. And then it does not equal /0.
    • 1:05:24It equals z, obviously.
    • 1:05:25So we increment n.
    • 1:05:26So now n is 1.
    • 1:05:28Now n is 1.
    • 1:05:29So what is s bracket 1 in Zamyla's name?
    • 1:05:32A and so forth, and we get to Z-A-M-Y-L-A, then all zeros,
    • 1:05:38the so-called null character, or /0.
    • 1:05:41That, of course, does equal /0, so the loop stops,
    • 1:05:44thereby leaving the total count or value of n at what it previously was,
    • 1:05:49which was 6.
    • 1:05:51So that's it.
    • 1:05:52Like all underneath the hood, all we have
    • 1:05:54is memory laid out like this, top to bottom, left to right,
    • 1:05:57and yet all of the functionality we've been using for a week now
    • 1:06:00and henceforth just boils down to some relatively simple primitives,
    • 1:06:03and if you understand those primitives, you
    • 1:06:05can do anything you want using the computer, both computationally
    • 1:06:08code-wise, but also memory-wise.
    • 1:06:11We can actually see, in fact, some of the stuff we looked at two weeks ago as
    • 1:06:14follows.
    • 1:06:15Let me go ahead and open up an example called ASCII 0.
    • 1:06:18Recall that ASCII is the mapping between letters and numbers in a computer.
    • 1:06:22And notice what this program's going to do.
    • 1:06:23Make-- let me go into this folder.
    • 1:06:26Make ascii0, ./ascii0, Enter.
    • 1:06:30The string shall be, let's say, Zamyla, Enter.
    • 1:06:34Well, it turns out that if you actually look up
    • 1:06:38the ASCII code for Zamyla's name, z is 90, lowercase a is 97, m is 109,
    • 1:06:45and so forth.
    • 1:06:46There are those characters, and actually, we
    • 1:06:47can play the same game we did last week.
    • 1:06:49If I do this again on "hi," there's your 72, and there's your 73.
    • 1:06:53Where is this coming from?
    • 1:06:55Well, now that I know how to manipulate individual strings,
    • 1:06:57notice what I can do.
    • 1:06:58I can get a string from the user, just as we always have.
    • 1:07:01I can iterate over the length of that string, albeit inefficiently
    • 1:07:05using strlen here.
    • 1:07:06And then notice this new feature today.
    • 1:07:09I can now convert one data type to another, because a char,
    • 1:07:14a character is just eight bits, but presented in the context of characters.
    • 1:07:20Bytes is also just eight bits that you could treat as an integer, a number.
    • 1:07:24It's totally context-sensitive.
    • 1:07:25If you use Photoshop, it's a graphic.
    • 1:07:27If you use a text program, it's a message and so forth.
    • 1:07:29So you can encode--
    • 1:07:31change the context.
    • 1:07:33So notice here, s bracket i is, of course, the i-th character of Zamyla's
    • 1:07:38name, so Z or A or M or whatever.
    • 1:07:40But I can convert that i-th character to an integer doing what's called casting.
    • 1:07:44You can literally, in parentheses, specify the data type
    • 1:07:47you want to convert one data type to, and then
    • 1:07:50store it in exactly that data type.
    • 1:07:52So s bracket i-- convert it to a number.
    • 1:07:54Then store it in an actual number variable, so I can print out its value.
    • 1:07:59So c-- this is show me the character.
    • 1:08:01Show me the letter as by plugging in the character, and then the letter--
    • 1:08:06sorry, the character and the number that I've just converted it to.
    • 1:08:09And you don't actually even have to be explicit.
    • 1:08:11This is called explicit casting.
    • 1:08:13Technically, we can do this implicitly, too.
    • 1:08:17And the computer knows that numbers are characters,
    • 1:08:19and characters are a number.
    • 1:08:20You don't have to be so pedantic and even do
    • 1:08:22the explicit casting in parentheses.
    • 1:08:24You can just do it implicitly with data types, and honestly, at this point,
    • 1:08:27I don't even need the variable.
    • 1:08:29I can get rid of this, and down here, I can literally just
    • 1:08:32print the same thing twice, but tell printf
    • 1:08:36to print the first in the context of a character
    • 1:08:39and the second in the context of an int, just treating the exact same bits
    • 1:08:43differently.
    • 1:08:44That's implicit casting.
    • 1:08:46And it just demonstrates what we did in week 0
    • 1:08:48when we claimed that letters are numbers,
    • 1:08:51and numbers can also be colors, and colors can be images, and so forth.
    • 1:08:54Is this a question?
    • 1:08:55AUDIENCE: Would've been useful for credit.
    • 1:08:57DAVID J. MALAN: Also, yes.
    • 1:08:57It all comes back to credit.
    • 1:08:58Yeah.
    • 1:08:59Indeed.
    • 1:09:00Other questions?
    • 1:09:01No.
    • 1:09:02All right, so what else can we actually do with this appreciation?
    • 1:09:06So super simple feature that all of us surely take for granted,
    • 1:09:08if we even use it anymore these days.
    • 1:09:10Google Docs, Microsoft Word, and such can automatically
    • 1:09:13capitalize words for you these days.
    • 1:09:14I mean your phone can do it nowadays.
    • 1:09:16They just sort of AutoCorrect your messages.
    • 1:09:18Well, how is that actually working?
    • 1:09:20Well, once you know that a string is just a bunch of characters
    • 1:09:22back to back to back, and you know that these characters have numbers
    • 1:09:26representing them, and like capital A is 65, and lowercase A is 97, apparently,
    • 1:09:32and so forth, we can leverage these patterns.
    • 1:09:34If I go ahead and open up this other example
    • 1:09:36here called Capitalize 0, notice what this program is
    • 1:09:40going to do for me first by running it.
    • 1:09:43Make capitalize 0 ./capitalize0.
    • 1:09:47Let me go ahead and type in Zamyla's name just as before, but now
    • 1:09:50it's all capital.
    • 1:09:51So this is a little extreme.
    • 1:09:52Hopefully, your phone is not capitalizing every letter,
    • 1:09:55but you can imagine it capitalizing just the first, if you wanted it.
    • 1:09:58So how does this work?
    • 1:09:59Well, let me go ahead and open up this example here.
    • 1:10:03And so what we did--
    • 1:10:04so here, I'm getting a string from the user, just as we always do.
    • 1:10:08Then I'm saying, after, just to kind of format the output nicely.
    • 1:10:11Here, I'm doing a loop pretty efficiently from i equals 0 up
    • 1:10:15to the length of the string.
    • 1:10:17And now notice this neat application of logic.
    • 1:10:20It's a little cryptic, certainly, at first glance.
    • 1:10:22But whoops.
    • 1:10:23And now it's gone.
    • 1:10:23And what am I doing exactly with these lines of code?
    • 1:10:27Well, with every iteration of this loop, I'm asking the question,
    • 1:10:31is the i-th character of s, so the current character,
    • 1:10:33is it greater than or equal to lowercase A, and is it less than
    • 1:10:37or equal to lowercase Z?
    • 1:10:39Put another way, how do you say that more colloquially in English?
    • 1:10:42Is it lowercase, literally.
    • 1:10:44But this is the more programmatic way of expressing, is it lowercase?
    • 1:10:47All right, if it is, go ahead and do this.
    • 1:10:49Now this is a little funky, but print out a character, specifically
    • 1:10:53the i-th character, but subtract from that lowercase letter whatever
    • 1:10:58the difference is between little A and big A. Now where did that come from?
    • 1:11:05So it turns out--
    • 1:11:06OK, capital A is 65.
    • 1:11:08Lowercase A is 97.
    • 1:11:10So the difference between those is 32.
    • 1:11:13And that's true for B, so capital B is 66, and lowercase B is 98.
    • 1:11:18Still 32, and it repeats for the whole alphabet.
    • 1:11:20So I could just do this.
    • 1:11:22If I know that lowercase letters have bigger numbers, like 97, 98,
    • 1:11:27and I know that lowercase numbers have lower letters, like 65, 66,
    • 1:11:32I can just literally subtract off 32 from my lowercase letters.
    • 1:11:35As you point out, it's a lowercase letter.
    • 1:11:37Subtract 32, and that gives us what result?
    • 1:11:40The capitalized version.
    • 1:11:42It uppercases things for us.
    • 1:11:43But honestly, this feels a little hackish that, like, OK, yes,
    • 1:11:46I can do the math correctly, but you know what?
    • 1:11:48It's better practice, generally, to abstract this away.
    • 1:11:50Don't get into the weeds of counting how many characters are away
    • 1:11:53from each other.
    • 1:11:53Math is cheap and easy in the computer.
    • 1:11:55Let it do the math for you by subtracting whatever the value of A
    • 1:11:58is, of capital A is from the value of lowercase A. Or we could just write 32.
    • 1:12:04Otherwise, go ahead and just print the character unchanged.
    • 1:12:07So in this case, the A-M-Y-L-A in Zamyla's name got uppercased,
    • 1:12:11and everything else, the Z, got left alone,
    • 1:12:13just by understanding what's going on with how the computer's represented.
    • 1:12:18But honestly, God, I don't want to keep writing code like this.
    • 1:12:21Like, I'm never going to get this.
    • 1:12:22I'm new to programming, perhaps.
    • 1:12:23I'm never going to get this sort of sequence of all the cryptic symbols
    • 1:12:26together, and that's OK, because we can actually implement this same program
    • 1:12:30a little more easily, thanks to functions
    • 1:12:32and abstractions that others have written for us.
    • 1:12:35So in this program, turns out I can simplify
    • 1:12:38the questions I'm asking by literally calling a function that says, is lower.
    • 1:12:43And there's another one called, is upper,
    • 1:12:45and there's bunches of others that just literally are called,
    • 1:12:48is something or other.
    • 1:12:49So is lower takes an argument like the i-th character of s,
    • 1:12:53and it just returns a bull-- true or false.
    • 1:12:55How is it implemented?
    • 1:12:57Well, honestly, if we looked at the code that someone else wrote decades ago
    • 1:13:00for is upper, odds are-- or is lower--
    • 1:13:03odds are he or she wrote code that looks almost like this.
    • 1:13:07But we don't need to worry about that level of detail.
    • 1:13:10We can just use his or her function, but how do we do that?
    • 1:13:12Turns out that this function-- and you would only know this
    • 1:13:15by having been told or Googling or reading a reference--
    • 1:13:17is in a library called ctype.h.
    • 1:13:20And you need the header file called ctype.h in order to use it.
    • 1:13:25And we'll almost always point you to references and documentation
    • 1:13:27to explain that to you.
    • 1:13:29Toupper is another feature, right?
    • 1:13:31This math-- like, my god.
    • 1:13:33I just want to uppercase a letter.
    • 1:13:34I don't want to really keep thinking about how far apart uppercase letters
    • 1:13:36are from lowercase.
    • 1:13:37Turns out that in the C type library, there's
    • 1:13:39another function called toupper that literally does the exact same thing
    • 1:13:43in the previous program we wrote.
    • 1:13:45And so that, too, is OK.
    • 1:13:47But you know what?
    • 1:13:48This feels a little verbose.
    • 1:13:50It would be nice if I could really tighten this program up.
    • 1:13:53So how those toupper work?
    • 1:13:55Well, it turns out some of you might be familiar with CS50 Reference
    • 1:13:58Online, our web-based app that we have that
    • 1:14:00helps you navigate available functions in C.
    • 1:14:03Turns out that all of the data for that application
    • 1:14:06comes from an older command line program that comes in Linux
    • 1:14:09and comes in the sandbox called Man for manual.
    • 1:14:12And anytime you type "man" at the command prompt, and then the name
    • 1:14:16of a function you're interested in, if it exists,
    • 1:14:18it will tell you a little something about it.
    • 1:14:20So if I go to toupper, man toupper, I get slightly cryptic documentation
    • 1:14:27here.
    • 1:14:27But notice, toupper and some other functions
    • 1:14:30convert uppercase or lowercase.
    • 1:14:31That's the summary.
    • 1:14:33Notice that in the synopsis, the man page, so to speak,
    • 1:14:36is telling me what header file I have to include.
    • 1:14:39Notice that under Synopsis, it's also telling me
    • 1:14:41what the signature or prototype is of the function.
    • 1:14:44In other words, the documentation in Man, the Linux programmer's manual,
    • 1:14:48is very terse.
    • 1:14:48So it's not going to hold your hand in this black and white format.
    • 1:14:51It's just going to convey, well, implicitly,
    • 1:14:54you better put this on top of your file.
    • 1:14:55And by the way, this is how you use the function.
    • 1:14:57It takes an argument called C, returns a value of type int.
    • 1:15:03Why is it int?
    • 1:15:06Let me wave my hands at that.
    • 1:15:07It effectively returns a character for our purposes today.
    • 1:15:10And if we scroll down, OK, description.
    • 1:15:12Ugh, I don't really want to read all of this, but OK, here we go.
    • 1:15:16If c is a lowercase letter, toupper returns its uppercase equivalent,
    • 1:15:21if an uppercase representation exists in the current locale.
    • 1:15:23That just means if it's punctuation, it's not going to do anything.
    • 1:15:26Otherwise, it returns C, And that's kind of the key detail.
    • 1:15:29If I pass it lowercase A, it's going to give me capital A,
    • 1:15:33but if I pass it capital A, what's it going to give me?
    • 1:15:36AUDIENCE: Capital A.
    • 1:15:37DAVID J. MALAN: Also, capital A. It returns the original character, c.
    • 1:15:40That's the only detail I cared about.
    • 1:15:42When in doubt, read the manual.
    • 1:15:43And it might be a little cryptic, and this
    • 1:15:45is why CS50 Reference takes somewhat cryptic documentation
    • 1:15:48and tries to simplify it into more human-friendly terms.
    • 1:15:50But at the end of the day, these are the authoritative answers.
    • 1:15:53And if I or one of the staff don't know, we literally
    • 1:15:55pull up the Man page or CS50 Reference to answer these kinds of questions.
    • 1:15:59Now what's the implication?
    • 1:16:01I don't need any of this.
    • 1:16:02I can literally get rid of the condition and just let
    • 1:16:06toupper do all of the legwork, and now my program
    • 1:16:10is so much more compact than the previous versions were,
    • 1:16:13because I've read the documentation.
    • 1:16:15I know what the function does, and I can let toupper uppercase something
    • 1:16:18or just pass it through unchanged.
    • 1:16:21We can better design, because we're writing fewer lines of code that
    • 1:16:23are just as clear, and so we can now actually tighten things up.
    • 1:16:29Any questions on this particular approach?
    • 1:16:33All right.
    • 1:16:34So we're getting very low level.
    • 1:16:35Now let's make these things more useful, because clearly, other people
    • 1:16:38have solved some of these problems for us,
    • 1:16:40as by having these functions and the C type library and the string library.
    • 1:16:44What more is there?
    • 1:16:45Well, recall that every time we run Clang, or even run make,
    • 1:16:49we're typing multiple words at the command prompt.
    • 1:16:52You're typing make hello or make Mario, a second word,
    • 1:16:56or you're typing clang-o, hello, hello.c,
    • 1:16:59like lots of words at the prompt.
    • 1:17:01Well, it turns out that all this time, you're using, indeed,
    • 1:17:04command line arguments.
    • 1:17:05But in C, you can write programs that also accept words and numbers when
    • 1:17:10the user runs the program.
    • 1:17:11Think back, after all.
    • 1:17:12When you ran Mario, you did ./mario, Enter.
    • 1:17:15You couldn't type any more words at the prompt.
    • 1:17:17When you did credit, you did ./credit, Enter.
    • 1:17:20No more words at the prompt.
    • 1:17:21You used get string or get long to get more input, but not
    • 1:17:24at the command line.
    • 1:17:26And it turns out that we can, relatively simply, in C,
    • 1:17:29but it's a little cryptic at first glance.
    • 1:17:31Let me go ahead and--
    • 1:17:33let me go ahead and, here, pull up this signature here, which looks like this.
    • 1:17:41This is the function that we're all used to by now for writing a main function.
    • 1:17:45And up until now, we've said void.
    • 1:17:47Main doesn't take any inputs, and indeed, it just runs.
    • 1:17:50But it turns out if you change your existing programs or future programs,
    • 1:17:54not to say void, but to say, int argc, string argv,
    • 1:17:58it's a little cryptic at first glance.
    • 1:18:00But what's a recognizable symbol now?
    • 1:18:04Yeah, there's brackets here.
    • 1:18:05So it turns out that every time you write a program,
    • 1:18:08if you don't just say void, you actually enable this feature
    • 1:18:11by writing int argc, string argv.
    • 1:18:13You can actually tell Clang, you know what?
    • 1:18:15I want this program to accept one or more words or numbers after the name
    • 1:18:20of the program, so I can do ./hellodavid, or ./hellozamyla.
    • 1:18:23I don't have to wait for the program to be running to use string.
    • 1:18:27And just as with the earlier example, where you were able to chart an array,
    • 1:18:34main is defined as taking an array, called argv historical reasons--
    • 1:18:38argument vector.
    • 1:18:39Vector means array.
    • 1:18:40Argument vector, bracket, closed bracket just means this is--
    • 1:18:43this contains one or more words, each of which is a string.
    • 1:18:46Argc is argument count, so this is the variable
    • 1:18:49that main gets access to that tells it how many arguments,
    • 1:18:52how many strings are actually in argv.
    • 1:18:55So how can we use this in a useful way?
    • 1:18:58Well, let me go ahead here and open up the sandbox.
    • 1:19:01And let me go ahead and create a new file called, say, argv0, argv0.c--
    • 1:19:06again, argument vector, just list or array of arguments.
    • 1:19:10And let me go ahead and, as usual, include cs50.h, include stdio.h,
    • 1:19:19and then int main not void, but int argc, string argv--
    • 1:19:26argv-- open bracket, closed bracket.
    • 1:19:28And even if that doesn't come naturally at first, it will eventually.
    • 1:19:31And I'm going to do this.
    • 1:19:32If the number of arguments passed in equals 2,
    • 1:19:39then I'm going to go ahead and do this-- printf, hello %s, comma,
    • 1:19:45and here in the past, I've typed a variable name.
    • 1:19:47And I now actually have access to a variable.
    • 1:19:49Go ahead and do argv bracket 1.
    • 1:19:52Else, if the user does not type, apparently, two words,
    • 1:19:56let me go ahead and just by default, say, hello world, as we always have.
    • 1:20:00Now why-- what is this doing, and how is it doing it?
    • 1:20:03Well, let's quickly run it.
    • 1:20:04So make-- whoops.
    • 1:20:07Make argv0, ./argv0, Enter, Hello World.
    • 1:20:15But if I do Hello--
    • 1:20:17or dot-- the program would be better named
    • 1:20:19if we called it Hello, but Zamyla, Enter.
    • 1:20:23Hello Zamyla.
    • 1:20:24If I change it to David, now I have access to David.
    • 1:20:26If I had David Malan, no.
    • 1:20:29It doesn't support that.
    • 1:20:30So what's going on?
    • 1:20:31If you change main in any program write to take
    • 1:20:34these two arguments, argc and argv of type string int
    • 1:20:38and then an array of strings, argc tells you how many words
    • 1:20:41were typed at the prompt.
    • 1:20:42So if the human typed two words, I presume
    • 1:20:45the first word is the name of the program, dot slash argv0,
    • 1:20:48the second word is presumably my name, if he or she is actually
    • 1:20:51providing their name at the prompt.
    • 1:20:53And so I print out argv bracket 1.
    • 1:20:55Not 0 because that's the name of the program, but argv bracket 1.
    • 1:20:58Else, down here, if the human doesn't provide just Zamyla, or just David,
    • 1:21:02or just one word more generally, I just print the default, "Hello world."
    • 1:21:07But what's neat about this now is notice that argv is an array of strings.
    • 1:21:15What is a string?
    • 1:21:18It's an array of characters.
    • 1:21:20And so let's enter just one last piece of syntax that gets kind of powerful
    • 1:21:24here.
    • 1:21:24Let me go ahead and do this.
    • 1:21:28Let me go ahead and, in a new file here, argv 1 dot c.
    • 1:21:33Let me go ahead and paste this in.
    • 1:21:35Close this.
    • 1:21:36Let me go ahead and do this.
    • 1:21:38Rather than do this logical checking, let me do this, for--
    • 1:21:43let's say for int, i get 0.
    • 1:21:48i is less than argc--
    • 1:21:50i++.
    • 1:21:51Let's go ahead and, one per line, print out every word
    • 1:21:54that the human just typed, just to reinforce
    • 1:21:57that this is indeed what's going on.
    • 1:21:59So argv bracket 0, save.
    • 1:22:01Make argv 1, enter.
    • 1:22:05And now let's go ahead and run this program--
    • 1:22:07dot slash, argv 1, David Malan.
    • 1:22:12OK, you see all three words.
    • 1:22:14If we change it to Zamyla, we see just those two words.
    • 1:22:17If we change it to Zamyla Chan, we see those three words.
    • 1:22:20So we clearly have access to all of the words in the array,
    • 1:22:23but let's take this one step further.
    • 1:22:25Rather than just print out every word in a string, let's go ahead and do this.
    • 1:22:28For intj get 0.
    • 1:22:32n equals the string length of the current argument, like this--
    • 1:22:40j is less than n, j++--
    • 1:22:43oops, oops, oops-- j++.
    • 1:22:45Now let me go ahead and print out not the full string, but let me do-- oops,
    • 1:22:49oops-- let me go ahead and print out this--
    • 1:22:52not a string, but a character, n bracket i bracket j, like this.
    • 1:23:00All right.
    • 1:23:00So what's going on?
    • 1:23:01One, this outer loop, and let's comment it, iterate over strings in argv.
    • 1:23:07This inner loop, iterate over chars in argv bracket i.
    • 1:23:13So the outer loop iterates over all of the strings in argv.
    • 1:23:17And the inner loop, using a different variable, starting at 0,
    • 1:23:20iterates over all of the characters in the ith
    • 1:23:23argument, which itself is a string.
    • 1:23:26So we can call string length on it.
    • 1:23:28And then we do this up until n, which is the length of that string.
    • 1:23:31And then we print out each character.
    • 1:23:33So just to be clear-- when I run arv1 and correct it, at first glance,
    • 1:23:38why it's implicitly declaring library function sterling, what's almost always
    • 1:23:42the solution when you do this wrong?
    • 1:23:44AUDIENCE: [INAUDIBLE]
    • 1:23:45DAVID J. MALAN: Yeah.
    • 1:23:45So I forgot this, so include string.h and help50 would
    • 1:23:49help with that as well.
    • 1:23:50Let's recompile with make argv1.
    • 1:23:52All right.
    • 1:23:53When I run argv1, of, say, Zamyla Chan, what am I going to see?
    • 1:24:00AUDIENCE: [INAUDIBLE]
    • 1:24:01DAVID J. MALAN: Yeah.
    • 1:24:03Is that the right intuition?
    • 1:24:05AUDIENCE: [INAUDIBLE]
    • 1:24:06DAVID J. MALAN: I'm going to see Zamyla Chan, but--
    • 1:24:10AUDIENCE: [INAUDIBLE]
    • 1:24:11DAVID J. MALAN: One character on each line, including the program's name.
    • 1:24:14So in fact, let me scroll this up so it's a little bigger.
    • 1:24:16Enter.
    • 1:24:17OK, it's a little stupid, the program, but it does confirm that using arrays
    • 1:24:22do I have access not only to the words, but I can kind of
    • 1:24:25have the second dimension.
    • 1:24:26And within each word, I can get at each character within.
    • 1:24:30And we do this, again, just by using not just single square brackets,
    • 1:24:34but double.
    • 1:24:35And again, just break this down into the first principles.
    • 1:24:37What is this first bracket?
    • 1:24:38This is the ith argument, the ith string in the array.
    • 1:24:41And then if you take it further, with bracket j,
    • 1:24:43that gives you the j character inside of this.
    • 1:24:47Now, who cares about any of this kind of functionality?
    • 1:24:51Well, let me scroll back and propose one application here.
    • 1:24:54So recall that CS is really just problem solving.
    • 1:24:57But suppose the problem that you want to solve
    • 1:24:59is to actually pass a secret message in class
    • 1:25:02or send someone a secret for whatever reason.
    • 1:25:04Well, the input to that problem is generally
    • 1:25:06called plain test, a message you want to send to that other person.
    • 1:25:09You ideally want ciphertext to emerge from it,
    • 1:25:12which is enciphered and scrambled, somehow encrypted information
    • 1:25:15so that anyone in the room, like the teacher, can't just grab the note
    • 1:25:18and read what you're sending to your secret crush or love across the room,
    • 1:25:21or in any other context as well.
    • 1:25:23But the problem is that if the message you want to send, say,
    • 1:25:26is our old friend Hi!, with an exclamation point,
    • 1:25:29you can encode it in certain contexts as just 72, 73, 33.
    • 1:25:34And I daresay most classes on campus if you wrote on a piece of paper 72,
    • 1:25:3773, 33, passed it through the room, and whatever professor intercepts it,
    • 1:25:40they're not going to know what you're saying anyway.
    • 1:25:43But this is not a good system.
    • 1:25:44This is not a cryptosystem.
    • 1:25:46Why?
    • 1:25:47It's not secure.
    • 1:25:52[INAUDIBLE]
    • 1:25:52[INTERPOSING VOICES]
    • 1:25:54DAVID J. MALAN: Yeah.
    • 1:25:55Anyone has access to this, right, so long
    • 1:25:57as you attend like week 1 or 0 of CS50, or you just
    • 1:26:00have general familiarity with Ascii.
    • 1:26:02Like this is just a code.
    • 1:26:04I mean Ascii is a system that maps letters to numbers.
    • 1:26:08And anyone else who knows this code obviously
    • 1:26:10knows what your message is, because it's not a unique secret
    • 1:26:12to you and the recipient.
    • 1:26:14So that's probably not the best idea.
    • 1:26:16Well, you can be a little more sophisticated.
    • 1:26:18And this is back-- actually, a photograph
    • 1:26:19from World War I of a message that was sent from Germany to Mexico
    • 1:26:23that was encoded in a very similar way.
    • 1:26:25It wasn't using Ascii.
    • 1:26:26The numbers, as you can perhaps glean from the photo,
    • 1:26:29are actually much larger.
    • 1:26:30But in this system, in a militaristic context, there was a code book.
    • 1:26:33So similar in spirit to Ascii, where you have
    • 1:26:35a column of numbers and a column of letters to which they correspond,
    • 1:26:39a codebook more generally has like numbers, and then maybe
    • 1:26:42even letters or whole words that they correspond to,
    • 1:26:45sometimes thousands of them, like literally a really big book of codes.
    • 1:26:50And so long as only, in this context the Germans and the recipients,
    • 1:26:53the Mexicans, had access to that same book,
    • 1:26:56only they could encrypt and decrypt, or rather encode and decode information.
    • 1:27:01Of course, in this very specific context--
    • 1:27:03you can read more about this in historical texts--
    • 1:27:05this was intercepted.
    • 1:27:06This message, seemingly innocuous, though definitely
    • 1:27:08suspicious looking with all these numbers,
    • 1:27:11so therefore not innocuous, the British, in this case actually, intercepted it.
    • 1:27:16And thanks to a lot of efforts and cryptanalysis,
    • 1:27:18the Bletchley Park style code breaking, albeit further back,
    • 1:27:23were they able to figure out what those numbers represented in words
    • 1:27:27and actually decode the message.
    • 1:27:29And in fact, here's a photograph of some of the words
    • 1:27:31that were translated from one to the other.
    • 1:27:34But more on that in any online or textual references.
    • 1:27:38Turns out in this poem too there was a similar code, right?
    • 1:27:41So apropos of being in Boston here, you might recall this one.
    • 1:27:44"Listen my children, and you shall hear of the midnight ride of Paul Revere.
    • 1:27:49On the 18th of April in '75, hardly a man
    • 1:27:51is now alive who remembers that famous day and year.
    • 1:27:54He said to his friend, if the British march by land or sea
    • 1:27:58from the town tonight night, hang a lantern
    • 1:28:00aloft in the belfry arch of the North Church tower as a signal light,
    • 1:28:05one if by land, and two if by sea.
    • 1:28:08And I on the opposite shore will be ready to ride and spread
    • 1:28:10the alarm through every Middlesex village and farm for the country folk
    • 1:28:13to be up and to arm."
    • 1:28:14So it turns out some of that is not actually factually correct,
    • 1:28:17but the one if by land and the two if by sea code were
    • 1:28:21sort of an example of a one-time code.
    • 1:28:23Because if the revolutionaries in the American Revolution kind of
    • 1:28:27decided secretly among themselves literally that-- we will put up one
    • 1:28:30light at the top of a church if the British are coming by land.
    • 1:28:34And we will instead use two if the British are instead coming by sea.
    • 1:28:37Like that is a code.
    • 1:28:38And you could write it down in a book, unless you have a code book.
    • 1:28:41But of course, as soon as someone figures out that pattern,
    • 1:28:44it's compromised.
    • 1:28:45And so code books tend not to be the most robust mechanisms
    • 1:28:49for encoding information.
    • 1:28:51Instead, it's better to use something more algorithmic.
    • 1:28:54And wonderfully, in computer science is this black box
    • 1:28:56to-- we keep saying, the home of algorithms.
    • 1:28:59And in general, encryption is a problem with inputs and outputs,
    • 1:29:03but we just need one more input.
    • 1:29:05The input is what's generally called the key, or a secret.
    • 1:29:09And a secret might just be a number.
    • 1:29:11So for instance, if I wanted my secret to be 1,
    • 1:29:13because we'll keep the example simple, but it could really be any number.
    • 1:29:16And indeed, we saw with the photograph a moment ago,
    • 1:29:18the Germans used much larger than this, albeit in the context of codes.
    • 1:29:21Suppose that you now want to send a more private message to someone
    • 1:29:24across the room in a class that, I love you.
    • 1:29:26How do you go about encoding that in a way that isn't just using Ascii
    • 1:29:31and isn't just using some simple code book?
    • 1:29:33Well, let me propose that now that we understand how strings are represented,
    • 1:29:37right-- we're about to make love really, really lame and geeky--
    • 1:29:41so now that you know how to express strings computationally,
    • 1:29:44well, let's just start representing "I love you" in Ascii.
    • 1:29:47So I is 73.
    • 1:29:49L is 76.
    • 1:29:50O-V-E Y-O-U. That's just Ascii.
    • 1:29:53Should not send it this way, because anyone
    • 1:29:55who knows Ascii is going to know what you're saying.
    • 1:29:58But what if I enciphered this message, I performed an algorithm on it?
    • 1:30:02And at its simplest, an algorithm can just
    • 1:30:04be math-- simple arithmetic, as we've seen.
    • 1:30:06So you know, let me just use my secret key of 1.
    • 1:30:09And let me make sure that my crush knows that I am using a secret value of 1.
    • 1:30:14So he or she also knows to expect that value.
    • 1:30:17And before I send my message, I'm going to add 1 to every letter.
    • 1:30:21So 73 becomes 74.
    • 1:30:2376 becomes 77.
    • 1:30:2480, 87, 70, 90, 80, 86.
    • 1:30:29Now this could just be sent in the clear.
    • 1:30:31But then, I could actually send it as a textual message.
    • 1:30:35So let's convert it back to Ascii.
    • 1:30:3774 is now J. 77 is now M. 80 is now P. And you can perhaps see the pattern.
    • 1:30:45This message was, I love you.
    • 1:30:48And now, all of the letters are off by one, I think.
    • 1:30:52I became J. L became M. O became P, and so forth.
    • 1:30:57So now the claim would be, cryptographically, I'm
    • 1:31:00going to send this message across the room.
    • 1:31:02And now no one who has a code book is going to be able to solve this.
    • 1:31:05I can't just steal the book and decode it,
    • 1:31:07because now the key is only up here, so to speak.
    • 1:31:09It's just the number 1 that he or she and I
    • 1:31:12had to agree upon in advance that we would
    • 1:31:13use for sending our secret messages.
    • 1:31:15So if someone captures this message, teacher in the room or whoever,
    • 1:31:20how would they even go about decoding this or decrypting it?
    • 1:31:26Are there any techniques available to them?
    • 1:31:29I daresay we can kind of chip away at this love note.
    • 1:31:32AUDIENCE: [INAUDIBLE]
    • 1:31:32DAVID J. MALAN: What's that?
    • 1:31:33Guess and check.
    • 1:31:34OK, we could try all--
    • 1:31:35there still kind of some spacing.
    • 1:31:36So you know honestly, we could do like kind of a cryptanalysis of it,
    • 1:31:40a frequency attack.
    • 1:31:41Like, I can't think of too many words in English
    • 1:31:43that have a single letter in them.
    • 1:31:45So what does J probably represent?
    • 1:31:46[INTERPOSING VOICES]
    • 1:31:47DAVID J. MALAN: I, probably.
    • 1:31:48Maybe A, but probably I. And there's not too many other options.
    • 1:31:52So we've attacked one part of the message already.
    • 1:31:55I see a commonality.
    • 1:31:56There's two what in here?
    • 1:31:59Two P. And I don't necessarily know that that maps to O, but I do
    • 1:32:02know it's the same character.
    • 1:32:04So if I kind of continue this thoughtful process or this trial and error,
    • 1:32:08and I figure out, oh, what if that's an O?
    • 1:32:10And then that's an O. And then wait a minute.
    • 1:32:12They're passing from one to another.
    • 1:32:13Maybe this says, I love you.
    • 1:32:15Like you actually can, with some probability,
    • 1:32:17decrypt a message by doing this kind of analysis on it.
    • 1:32:20It's at least more secure than the code book,
    • 1:32:22because you're not compromised if the book itself is stolen.
    • 1:32:25And you can change the key every time, so long as you
    • 1:32:28and the recipient actually agree on something.
    • 1:32:30But at least we now have this mechanism in place.
    • 1:32:33So with just the understanding of what you can do with strings,
    • 1:32:36can you actually now do really interesting domain-specific things
    • 1:32:39to them?
    • 1:32:40And in fact, back in the day, Caesar, back in militaristic times literally
    • 1:32:45used a cipher quite like this.
    • 1:32:47And frankly, when you're the first one to use these ciphers,
    • 1:32:49they actually are kind of secure, even if they're relatively simple.
    • 1:32:52But hopefully, not just using a key of 1, maybe 2, or 13, or 25,
    • 1:32:57or something larger.
    • 1:32:58But this is an example of a substitution cipher,
    • 1:33:01or a rotational cipher where everything's kind of rotating--
    • 1:33:04A's becoming B, B's becoming C. Or you can kind of
    • 1:33:07rotate it even further than that.
    • 1:33:11Well, let's take a look at one last example here
    • 1:33:14of just one other final primitive of a feature
    • 1:33:17today, before we then go back high level to bring everything together.
    • 1:33:20It turns out that printing out error messages
    • 1:33:23is not the only way to signal that something has gone wrong.
    • 1:33:27There's a new keyword, a new use of an old keyword in this example,
    • 1:33:31that's actually a convention for signaling errors.
    • 1:33:33So this is an example called exit.c.
    • 1:33:36It apparently wants the human to do what, if you infer from the code?
    • 1:33:42AUDIENCE: Exit [INAUDIBLE].
    • 1:33:43DAVID J. MALAN: Yes.
    • 1:33:44Say again?
    • 1:33:44AUDIENCE: [INAUDIBLE]
    • 1:33:45DAVID J. MALAN: Well, it wants the-- well, what
    • 1:33:47does it what the human to do implicitly, based on the printf's here?
    • 1:33:51How should I run this program?
    • 1:33:53Yeah?
    • 1:33:53AUDIENCE: [INAUDIBLE] just apply [INAUDIBLE]..
    • 1:33:56DAVID J. MALAN: Yeah.
    • 1:33:57So for whatever reason, this program implicitly
    • 1:34:00wants me to write exactly two words at the prompt.
    • 1:34:03Because if I don't, it's going to yell at me, missing command line argument.
    • 1:34:06And then it's going to return 1, whatever that is.
    • 1:34:08Otherwise, it's going to say, Hello, such and such.
    • 1:34:10So if I actually run this program--
    • 1:34:12let me go back over here and do make exit--
    • 1:34:17oops-- in my directory, make exit.
    • 1:34:19OK, dot slash exit, enter, I'm missing a command line argument.
    • 1:34:23All right, let me put Zamyla's name.
    • 1:34:25Oh, Hello Zamyla.
    • 1:34:26Let me put Zamyla Chan.
    • 1:34:28Nope, missing command line argument.
    • 1:34:29It just wants the one, so in this case here.
    • 1:34:33I'm seeing visually the error message, but it turns out
    • 1:34:36the computer is also signaling to me what the so-called exit code is.
    • 1:34:41So long story short, we've already seen examples last week of how
    • 1:34:44you can have a function return a value.
    • 1:34:46And we saw how [? Erin ?] came up on stage,
    • 1:34:47and she returned to me a piece of paper with a string on it.
    • 1:34:50But it turns out that main is a little special.
    • 1:34:52If main returns a value like 1 or 0, you can actually see that,
    • 1:34:58albeit in a kind of a non-obvious way.
    • 1:35:01If I run exit, and I run it correctly with Zamyla as the name,
    • 1:35:06if I then type echo, dollar sign, question mark, of all things,
    • 1:35:10enter, I will then see exactly what main returned with, which in this case is 0.
    • 1:35:15Now, let me try and be uncooperative.
    • 1:35:17If I actually run just dot slash exit, with no word,
    • 1:35:23I see, missing command line argument.
    • 1:35:25But if I do the same cryptic command, echo, dollar sign, question mark,
    • 1:35:29I see that main exited with 1.
    • 1:35:30Now, why is this useful?
    • 1:35:32Well, as we start to write more complicated programs,
    • 1:35:35it's going to be a convention to exit from main by returning
    • 1:35:39a non-zero value, if anything goes wrong.
    • 1:35:420 happens to mean everything went well.
    • 1:35:44And in fact, in all of the programs we've
    • 1:35:46written thus far, if you don't mention return anything,
    • 1:35:49main automatically for you returns 0.
    • 1:35:53And it has been all this time.
    • 1:35:55It's just a feature, so you don't have to bother typing it yourself.
    • 1:35:57But what's nice about this, or what's real about this,
    • 1:36:00is if on your Mac or PC, if you've ever gotten an annoying error message that
    • 1:36:04says, error negative 29, system error has occurred, or something freezes,
    • 1:36:08but you very often see numbers on the screen, maybe.
    • 1:36:11Like those error codes actually tend to map to these kinds of values.
    • 1:36:15So when a human is writing software and something goes wrong
    • 1:36:18and an error happens, they typically return a value like this.
    • 1:36:21And the computer has access to it.
    • 1:36:23And this isn't all that useful for the human running the program.
    • 1:36:25But as your programs get more complex, we'll
    • 1:36:27see that this is actually quite useful as a way of signaling
    • 1:36:32that something indeed went wrong.
    • 1:36:34Whew.
    • 1:36:34OK, that's a lot of syntax wrapped in some loving context.
    • 1:36:41Any questions before we look at one final domain?
    • 1:36:44No?
    • 1:36:45All right.
    • 1:36:46So it turns out that we can answer the "who cares" question in yet another way
    • 1:36:51too.
    • 1:36:52It turns out-- let me go ahead and open up an example of our array again here--
    • 1:36:59that arrays can actually now be used to solve problems more algorithmically.
    • 1:37:03And this is where life gets more interesting.
    • 1:37:05Like we were so incredibly in the weeds today.
    • 1:37:06And as we move forward in the class, we're
    • 1:37:08not going to spend so much time on syntax,
    • 1:37:10and dollar signs, and question marks, and square brackets, and the like.
    • 1:37:13That's not the interesting part.
    • 1:37:14The interesting part is when we now have these fundamental building
    • 1:37:17blocks, like an array, with which we can solve problems.
    • 1:37:20So it turns out that an array, you know, you
    • 1:37:23can kind of think of it as a series of lockers,
    • 1:37:26a series of lockers that might look like this, inside of which
    • 1:37:29are values-- strings, or numbers, or chars, or whatnot.
    • 1:37:32But the lockers is an apt metaphor because a computer, unlike us humans,
    • 1:37:36can only see and do one thing at a time.
    • 1:37:38It can open one locker and look inside, but it can't kind of
    • 1:37:41take a step back, like we humans can, and look at all of the lockers,
    • 1:37:44even if all of the doors are open.
    • 1:37:46So it has to be a more deliberate act than that.
    • 1:37:49So what are the actual implications?
    • 1:37:51Well, all this time--
    • 1:37:52we had that phone book example in the first week,
    • 1:37:55and the efficiency of that algorithm, of finding Mike Smith in this phone book,
    • 1:37:59all assumed what feature of this phone book?
    • 1:38:02AUDIENCE: That it's ordered alphabetically.
    • 1:38:03DAVID J. MALAN: That it was ordered alphabetically.
    • 1:38:05And that was a huge plus, because then I could go to the middle,
    • 1:38:08and I could go to the middle of the middle, and so forth.
    • 1:38:10And that was an algorithmic possibility.
    • 1:38:12On our phones, if you pull up your contacts,
    • 1:38:13you've got a list of first names, or last names, all alphabetically sorted.
    • 1:38:17That is because, guess what data structure or layout
    • 1:38:20your phone probably uses to store your contacts?
    • 1:38:24It's an array of some sort, right?
    • 1:38:26It's just a list.
    • 1:38:27And it might be displayed vertically, instead of horizontally,
    • 1:38:29as I've been drawing it today.
    • 1:38:30But it's just values that are back, to back, to back, to back, to back,
    • 1:38:33that are actually sorted.
    • 1:38:34But how did they actually get into that sorted order?
    • 1:38:37And how do you actually find values?
    • 1:38:38Well, let's consider what this problem is actually
    • 1:38:40like for a computer, as follows.
    • 1:38:43Let me go ahead here.
    • 1:38:44Would a volunteer mind joining us up here?
    • 1:38:47I can throw in a free stress ball.
    • 1:38:49OK, someone from the back?
    • 1:38:51OK, come on up here.
    • 1:38:52Come on.
    • 1:38:52What's your name?
    • 1:38:53ERIC: Eric.
    • 1:38:54DAVID J. MALAN: Aaron.
    • 1:38:55All right.
    • 1:38:56So Aaron's going to come on up.
    • 1:38:57And--
    • 1:38:58ERIC: Eric.
    • 1:38:59DAVID J. MALAN: I'm sorry?
    • 1:38:59Oh, Eric.
    • 1:39:00Nice to meet you.
    • 1:39:01All right.
    • 1:39:01Come on over here.
    • 1:39:02So Eric, now normally, I would ask you to find the number 23.
    • 1:39:05But seeing is that's a little easy, can you go ahead and just find us
    • 1:39:08the number 50 behind these doors, or really these yellow lockers?
    • 1:39:118?
    • 1:39:11Nope.
    • 1:39:1242?
    • 1:39:13Nope.
    • 1:39:13OK.
    • 1:39:14Pretty good.
    • 1:39:14That's three, three out of seven.
    • 1:39:16How did you get it so quickly?
    • 1:39:17ERIC: I guessed.
    • 1:39:18DAVID J. MALAN: OK, so he guessed.
    • 1:39:20Is that the best algorithm that Eric could have used here?
    • 1:39:24ERIC: Probably not.
    • 1:39:26DAVID J. MALAN: Well, I don't know.
    • 1:39:27Yes?
    • 1:39:28No?
    • 1:39:28AUDIENCE: Yeah.
    • 1:39:29DAVID J. MALAN: Why?
    • 1:39:30Why yes?
    • 1:39:30AUDIENCE: [INAUDIBLE]
    • 1:39:31DAVID J. MALAN: He has no other information.
    • 1:39:32So yes, like that was the best you can do.
    • 1:39:34But let me give you a little more information.
    • 1:39:35You can stay here.
    • 1:39:36And let me go ahead and reload the screen here.
    • 1:39:40And let me go ahead and pull up a different set of doors.
    • 1:39:43And now suppose that, much like the phone book, and much like the phones
    • 1:39:46are sorted, now these doors are sorted.
    • 1:39:49And find us the number 50.
    • 1:39:54All right.
    • 1:39:54So good.
    • 1:39:55What did you do that time?
    • 1:39:57AUDIENCE: Well, [INAUDIBLE].
    • 1:39:59It was 50 is 116.
    • 1:40:00So I just--
    • 1:40:01DAVID J. MALAN: Right.
    • 1:40:02So you jumped to the middle, initially, and then to the right half.
    • 1:40:07And then technically-- so we're technically off by 1, right?
    • 1:40:10Because like binary search would have gone to the middle of the--
    • 1:40:12that's OK, but very well done to Eric.
    • 1:40:14Here, let me at least reinforce this with a stress ball.
    • 1:40:18So thank you.
    • 1:40:20Very well done.
    • 1:40:21So with that additional information, as you know,
    • 1:40:23Eric was able to do better because the information was sorted on the screen.
    • 1:40:27But he only had one insight to a locker at a time,
    • 1:40:30because only by revealing what's inside can he actually see it.
    • 1:40:33So this seems to suggest that once you do
    • 1:40:35have this additional information in Eric's example, in your phone example,
    • 1:40:39in the phone book example, you open up possibilities for much much, much more
    • 1:40:44efficient algorithms.
    • 1:40:45But to get there, we've kind of been deferring this whole time in class
    • 1:40:50how you actually sort these elements.
    • 1:40:53And if you wouldn't mind-- and this way, we'll hopefully end on a more energized
    • 1:40:56note here because I know we've been in the weeds for a while--
    • 1:40:59can we get like eight volunteers?
    • 1:41:02OK, so 1, 2, 3, 4-- how about 5, 6, 7, 8, come on down.
    • 1:41:09Oh, I'm sorry.
    • 1:41:09Did I completely overlook the front row?
    • 1:41:11OK.
    • 1:41:12All right, next time.
    • 1:41:13Next time.
    • 1:41:14Come on down.
    • 1:41:20Oh, and Colton, do you mind meeting them over there instead?
    • 1:41:23All right.
    • 1:41:25Come on up.
    • 1:41:26What's your name?
    • 1:41:26[? CAHMY: ?] [? Cahmy. ?]
    • 1:41:27DAVID J. MALAN: [? Cahmy? ?] David.
    • 1:41:28Right over there.
    • 1:41:29What's your name?
    • 1:41:29MATT: Matt.
    • 1:41:29DAVID J. MALAN: Matt?
    • 1:41:30David.
    • 1:41:30[? JUHE: ?] [? Juhe. ?]
    • 1:41:31DAVID J. MALAN: [? Juhe? ?] David.
    • 1:41:32MAX: Max.
    • 1:41:32DAVID J. MALAN: Max, nice to meet you.
    • 1:41:33JAMES: James.
    • 1:41:34DAVID J. MALAN: James, nice to see you.
    • 1:41:35Here, I'll get more chairs.
    • 1:41:36What's your name?
    • 1:41:37,PEYTON: Peyton.
    • 1:41:37DAVID J. MALAN: Peyton?
    • 1:41:38David.
    • 1:41:38And two more.
    • 1:41:40Actually can what have you come down to this end here?
    • 1:41:42What's your name.
    • 1:41:43ANDREA: Andrea.
    • 1:41:43DAVID J. MALAN: Andrea, nice to see you.
    • 1:41:45And your name?
    • 1:41:46[? PICCO: ?] [? Picco. ?]
    • 1:41:46DAVID J. MALAN: [? Picco, ?] David.
    • 1:41:47Nice to see you.
    • 1:41:48OK, Colton has a T-shirt for each of you, very Harvard-esque here.
    • 1:41:54And each of these shirts, as you're about to see, has a number on it.
    • 1:41:57And that number is--
    • 1:42:00well, go ahead put them on, if you wouldn't mind.
    • 1:42:06OK, thank you so much.
    • 1:42:07So I daresay we've arranged our humans much like the lockers in an array.
    • 1:42:11Like we have humans back, to back, to back, to back.
    • 1:42:13But this is actually both a blessing and a constraint,
    • 1:42:17because we only have eight chairs.
    • 1:42:18So there's really not much room here, so we're confined to just this space here.
    • 1:42:22And I see we have a 4, 8, 5, 2, 3, 1, 6, 7.
    • 1:42:25So this is great.
    • 1:42:26Like they are unsorted.
    • 1:42:28By definition, it's pretty random.
    • 1:42:29So that's great.
    • 1:42:30So let's just start off like this.
    • 1:42:31Sort yourselves from 1 to 8, please.
    • 1:42:42OK.
    • 1:42:42All right.
    • 1:42:43Well, what algorithm was that?
    • 1:42:45[LAUGHTER]
    • 1:42:46AUDIENCE: Look around, figure it out.
    • 1:42:47DAVID J. MALAN: Look around, figure it out.
    • 1:42:49OK, well--
    • 1:42:49MATT: Human ingenuity.
    • 1:42:50DAVID J. MALAN: Human ingenuity?
    • 1:42:51Very well done.
    • 1:42:52So can we-- well, what was like a thought
    • 1:42:54going through any of your minds?
    • 1:42:56MATT: Find a chair and sit down.
    • 1:42:57DAVID J. MALAN: Find the chair--
    • 1:42:58find the right chair.
    • 1:42:59So go to a location.
    • 1:43:00Good.
    • 1:43:01So like an index location, right?
    • 1:43:02Arrays have indices, so to spea--
    • 1:43:040, 1, 2, all the way up to 7.
    • 1:43:07And even though our shirts are numbered from 1 to 8,
    • 1:43:10you can think in terms of 0 to 7.
    • 1:43:11So that was good.
    • 1:43:12Anyone else?
    • 1:43:12Other thoughts?
    • 1:43:14[? CAHMY: ?] I mean, this is something we implicitly think of,
    • 1:43:17but no one told us that it was ordered right to left.
    • 1:43:19Like we could have done it left to right.
    • 1:43:20DAVID J. MALAN: OK.
    • 1:43:21Absolutely.
    • 1:43:21Could have gone from right to left, instead of left to right.
    • 1:43:23But at least we all agreed on this convention
    • 1:43:25too, so that was in your mind.
    • 1:43:26OK.
    • 1:43:26So good.
    • 1:43:27So we got this sorted.
    • 1:43:28Go ahead and re-randomize yourself, if you could.
    • 1:43:35And what algorithm was this?
    • 1:43:37Just random awkwardness?
    • 1:43:38OK, so that's fine.
    • 1:43:39So it looks pretty random.
    • 1:43:41That will do.
    • 1:43:42Let's see if we can now reduce the process of sorting
    • 1:43:44to something a little more algorithmic so that, one, we can be sure
    • 1:43:47we're correct and not just kind of get lucky that everyone kind of figured it
    • 1:43:50out and no one was left out, and two, then
    • 1:43:52start to think about how efficient it is, right?
    • 1:43:54Because if we've been gaining so much efficiency for the phone book,
    • 1:43:57for our contacts, for [? error ?] coming up,
    • 1:43:59we really should have been asking the whole time,
    • 1:44:01sure, you save time with binary search and divide and conquer,
    • 1:44:05but how much did it cost you to get to a point
    • 1:44:08where you can use binary search and divide and conquer?
    • 1:44:10Because sorting, if it's super, super, super expensive and time-consuming
    • 1:44:14maybe it's a net negative.
    • 1:44:15And you might as well just search the whole list,
    • 1:44:17rather than ever sort anything.
    • 1:44:18All right.
    • 1:44:19So let's see here.
    • 1:44:206 and 5, I don't like this.
    • 1:44:22Why?
    • 1:44:24AUDIENCE: [INAUDIBLE]
    • 1:44:25DAVID J. MALAN: 6 is supposed to come after 5.
    • 1:44:27And so, can we fix this, please?
    • 1:44:29All right.
    • 1:44:30And then let's see.
    • 1:44:30OK, 6 and 1-- ugh, don't really like this.
    • 1:44:33Yeah, can we fix this?
    • 1:44:36Very nice.
    • 1:44:366 and 3, OK, you really got the short end of the stick here.
    • 1:44:39So 6 and 3, could we fix this?
    • 1:44:43And 6-- yeah, OK.
    • 1:44:44Ooh, OK, 6 and 7-- good.
    • 1:44:46All right, so that's pretty good.
    • 1:44:477 and 8, nice.
    • 1:44:498 and 4, sorry.
    • 1:44:50Could we switch here?
    • 1:44:52All right.
    • 1:44:53And then 8 and 2?
    • 1:44:54OK, could we switch here?
    • 1:44:56OK.
    • 1:44:56And let me ask you a somewhat rhetorical question.
    • 1:44:58OK, am I done?
    • 1:45:00OK, no.
    • 1:45:00Obviously not, but I did fix some problems, right?
    • 1:45:03I fixed some transpositions, numbers being out of order.
    • 1:45:06And in fact, I-- what's your name again?
    • 1:45:07[? CAHMY: ?] [? Cahmy. ?]
    • 1:45:08DAVID J. MALAN: [? Cahmy, ?] kind of bubbled to the right here, if you will.
    • 1:45:11Like you were kind of farther down, and now you're over here.
    • 1:45:14And like the smaller numbers, kind of-- yeah 1.
    • 1:45:16Like, my god, like he kind of bubbled his way this way.
    • 1:45:19So things are percolating, in some sense.
    • 1:45:21And that's a good thing.
    • 1:45:23And so you know what?
    • 1:45:24Let Me try to fix some remaining problems.
    • 1:45:26So 1 and 5-- good.
    • 1:45:27Oh 3 and 5, could you switch?
    • 1:45:295 and 6, OK.
    • 1:45:316 and 7?
    • 1:45:327 and 4, could you switch?
    • 1:45:34OK.
    • 1:45:36And 7 and 2, could you switch?
    • 1:45:40And now, I don't have to speak with [? Cahmy ?] again,
    • 1:45:42because we know you're in the right place.
    • 1:45:44So I actually don't have to do quite as much work
    • 1:45:46this time, which is kind of nice.
    • 1:45:48But am I done?
    • 1:45:49No, obviously not.
    • 1:45:50But what's the pattern now?
    • 1:45:52Like what's the fundamental primitive?
    • 1:45:53If I just compare pairwise humans and numbers,
    • 1:45:57I can slightly improve the situation each time
    • 1:45:59by just swapping them, swapping them.
    • 1:46:01And each time now--
    • 1:46:02I'm sorry, [? Picco ?] is in number 7's place.
    • 1:46:04I don't have to talk to him anymore, because he's now bubbled
    • 1:46:07his way all the way up to the top.
    • 1:46:08So even though I'm doing the same thing again and again,
    • 1:46:10and looping again and again isn't always the best thing,
    • 1:46:13so long as you're looping fewer and fewer times, I will eventually stop,
    • 1:46:16it would seem.
    • 1:46:17Because 6 is going to eventually go in the right place, and then 5,
    • 1:46:20and then 4, and so forth.
    • 1:46:21So if we can just finish this algorithm.
    • 1:46:22Good.
    • 1:46:24Not good.
    • 1:46:26OK, 6 and 2, not good.
    • 1:46:27If you could swap?
    • 1:46:28OK, and what's your name again?
    • 1:46:30PEYTON: Peyton.
    • 1:46:31DAVID J. MALAN: Peyton is now in the right place.
    • 1:46:32I have even less work now ahead of me.
    • 1:46:33So if I can just continue this process--
    • 1:46:351 and 3, 3 and 5, 4 and 5, OK, and then 2 and 5.
    • 1:46:39And then, what's your name again?
    • 1:46:40MATT: Matt.
    • 1:46:41DAVID J. MALAN: Matt is now in the right place.
    • 1:46:42Even less work.
    • 1:46:43We're almost there.
    • 1:46:441 and 3, 3 and 4, 4 and 2, if you could swap.
    • 1:46:47OK, almost done.
    • 1:46:48And 1 and 3, 3 and 2, if you could swap.
    • 1:46:51Nice.
    • 1:46:52So this is interesting.
    • 1:46:53It would seem that-- you know, in the first place,
    • 1:46:55I kind of compared seven pairs of people.
    • 1:46:59And then the next time I went through, I compared how many pairs of people
    • 1:47:02maximally?
    • 1:47:02AUDIENCE: [INAUDIBLE]
    • 1:47:03DAVID J. MALAN: Just six, right?
    • 1:47:05Because we were able to leave [? Cahmy ?] out.
    • 1:47:06And then we were able to leave [? Picco ?] out, and then Peyton.
    • 1:47:09And so the number of comparisons I was doing was getting fewer and fewer.
    • 1:47:12So that feels pretty good.
    • 1:47:13But you know what?
    • 1:47:14Before We even analyze that, can you just randomize yourselves again?
    • 1:47:17Any human algorithm is fine.
    • 1:47:18Let's try one other approach, because this feels kind of non-obvious, right?
    • 1:47:22I was fixing things, but I had to keep fixing things again and again.
    • 1:47:26Let me try to take a bigger bite out of the problem
    • 1:47:28this time by just selecting the smallest person.
    • 1:47:30OK, so your name again is?
    • 1:47:31[? JUHE: ?] [? Juhe. ?]
    • 1:47:32DAVID J. MALAN: [? Juhe, ?] number 2-- that's a pretty small number,
    • 1:47:34so I'm going to remember that in sort of a mental variable.
    • 1:47:364?
    • 1:47:36No, you're too big.
    • 1:47:37Too big.
    • 1:47:39Oh, what was your name again?
    • 1:47:40JAMES: James.
    • 1:47:40DAVID J. MALAN: James.
    • 1:47:41James is a 1.
    • 1:47:42That's pretty nice.
    • 1:47:43Let me keep checking.
    • 1:47:43OK, James, in my mental variable is the smallest number.
    • 1:47:47I know I want him at the beginning.
    • 1:47:48So if you wouldn't mind coming with me.
    • 1:47:50And I'm sorry, we don't have room for you anymore.
    • 1:47:52If you could just-- oh, you know what?
    • 1:47:53Could you all just shuffle down?
    • 1:47:55Well, hm, I don't know if I like that.
    • 1:47:57That's a lot of work, right?
    • 1:47:58Moving all these values, let's not do that.
    • 1:48:00Let's not do that.
    • 1:48:01Number 2, could you mind just going where--
    • 1:48:03where--
    • 1:48:03JAMES: It's James.
    • 1:48:04DAVID J. MALAN: --James was?
    • 1:48:06OK, so I've kind of made the problem a little worse in that,
    • 1:48:09now, number 2 is farther away from the goal.
    • 1:48:11But I could have gotten lucky, and maybe she was number 7 or 8.
    • 1:48:14And so let me just claim that, on average, just evicting the person
    • 1:48:17is going to kind of be a wash and average out.
    • 1:48:20But now James is in the right place.
    • 1:48:21Done.
    • 1:48:22Now I have a problem that's of size 7.
    • 1:48:24So let me select the next smallest person.
    • 1:48:264 is the next smallest, not 8, not 5, not 7-- ooh, 2.
    • 1:48:29Not 3, 6.
    • 1:48:30OK, so you're back in the game.
    • 1:48:32All right, come on back.
    • 1:48:33And can we evict number 4?
    • 1:48:35And on this algorithm, if you will, I just
    • 1:48:37interpretively select the smallest person.
    • 1:48:40I'm not comparing everyone in quite the same way and swapping them pairwise,
    • 1:48:44I'm doing some of more macroscopic swaps.
    • 1:48:46So now I'm going to look for the next smallest, which is 3.
    • 1:48:48If you wouldn't mind popping around here?
    • 1:48:50[? Cahmy, ?] we have to, unfortunately, evict you,
    • 1:48:52but that works out to our favor.
    • 1:48:53Let me look for the next smallest, which is 4.
    • 1:48:55OK, you're back in.
    • 1:48:56Come on down.
    • 1:48:57Swap with 5.
    • 1:48:58OK, now I'm looking for 5.
    • 1:49:00Hey, 5, there you are.
    • 1:49:01OK.
    • 1:49:01So go here.
    • 1:49:02OK, looking for 6.
    • 1:49:03Oh, 6, a little bit of a shuffle.
    • 1:49:06OK.
    • 1:49:07And now looking for 7.
    • 1:49:08Oh, 7, if you could go here.
    • 1:49:10But notice, I'm not going back.
    • 1:49:12And this is what's important.
    • 1:49:13Like my steps are getting shorter and shorter.
    • 1:49:15My remaining steps are getting shorter and shorter.
    • 1:49:17And now we've actually sorted all of these humans.
    • 1:49:21So two fundamentally different ways, but they're both comparative in nature,
    • 1:49:24because I'm comparing these characters again,
    • 1:49:27and again, and again, and swapping them if they're out of order.
    • 1:49:29Or at a higher level, going through and swapping them again,
    • 1:49:34and again, and again.
    • 1:49:35But how many steps am I taking each time?
    • 1:49:38Even though I was doing fewer and fewer and I wasn't doubling back,
    • 1:49:41the first time, I was doing like n minus 1 comparisons.
    • 1:49:45And then I went back here.
    • 1:49:46And in the first algorithm, I kind of stopped going as far.
    • 1:49:50In the second algorithm, I just didn't go back as far.
    • 1:49:53So it was just kind of a different way of thinking of the problem.
    • 1:49:56But then I did what?
    • 1:49:57Like seven comparisons?
    • 1:49:59Then six, then five, then four, then three, then two, then one.
    • 1:50:03It's getting smaller, but how many comparisons is that total?
    • 1:50:06I've got like n people, n being a number.
    • 1:50:09AUDIENCE: [INAUDIBLE]
    • 1:50:10DAVID J. MALAN: Is not as bad as factorial.
    • 1:50:12We'd be here all day long.
    • 1:50:14But it is big.
    • 1:50:15It is big.
    • 1:50:15Let's go-- a round of applause, if we could, for our volunteers.
    • 1:50:18You can keep the shirts, if you'd like, as a souvenir.
    • 1:50:19[APPLAUSE]
    • 1:50:20Thank you, very much.
    • 1:50:22Let me see if we can't just kind of quantify that-- thank you, so much--
    • 1:50:26and see how we actually got to that point.
    • 1:50:29If I go ahead and pull up not our lockers, but our answers here,
    • 1:50:34let me propose that what we just did was essentially two algorithms.
    • 1:50:38One has the name bubble.
    • 1:50:39And I was kind of deliberately kind of shoehorning the word in there.
    • 1:50:42Bubble sort is just that comparative sort, pair by pair,
    • 1:50:45fixing tiny little mistakes.
    • 1:50:47But we needed to do it again, and again, and again.
    • 1:50:50So those steps kind of add up, but we can express them as pseudocode.
    • 1:50:54So in pseudocode-- and you can write this any number of ways--
    • 1:50:56I might just do the following.
    • 1:50:58Just keep doing the following, until there's no remaining swaps--
    • 1:51:01from i from 0 to n -2, which is just n is the total number of humans.
    • 1:51:06n -2 is go up from that person to this person,
    • 1:51:10because I want to compare him or her against the person next to them.
    • 1:51:13So I don't want to accidentally do this.
    • 1:51:14That's why it's n -2 at the end here.
    • 1:51:16Then I want to go ahead and, if the ith and the ith +1 elements are out
    • 1:51:19of order, swap them.
    • 1:51:21So that's why I was asking our human volunteers to exchange places.
    • 1:51:24And then just keep doing that, until there's no one left to swap.
    • 1:51:27And by definition, everyone is in order.
    • 1:51:29Meanwhile, the second algorithm has the conventional name of selection sort.
    • 1:51:33Selection sort is literally just that, where you actually
    • 1:51:37select the smallest person, or number of interest to you, intuitively,
    • 1:51:40again and again.
    • 1:51:41And the number keeps getting bigger, but you
    • 1:51:43start ignoring the people who you've already put into place.
    • 1:51:45So the problem, similarly, is getting smaller and smaller.
    • 1:51:48Just like in bubble sort, it was getting more and more sorted.
    • 1:51:52The pseudocode for selection sort might look like this.
    • 1:51:54For i from 0 to n -1, so that's 0 in an array.
    • 1:51:58And this is n -1.
    • 1:52:00Just keep looking for the smallest element between those two chairs,
    • 1:52:05and then pull that person out.
    • 1:52:06And then just evict whoever's there-- swap them,
    • 1:52:09but not necessarily adjacently, just as far away as is necessary.
    • 1:52:13And in this way, I keep turning my back on more and more people
    • 1:52:16because they are then in place.
    • 1:52:18So two different framings of the problem,
    • 1:52:20but it turns out they're actually both the same number of steps, give or take.
    • 1:52:24It turns out they're roughly the same number of steps,
    • 1:52:27even though it's a different way of thinking about it.
    • 1:52:29Because if I think about bubble sort, the first iteration, for instance,
    • 1:52:32what just-- actually, well, let's consider selection sort even.
    • 1:52:35In selection sort, how many comparisons did I have to do?
    • 1:52:39Well, once I found my smallest element, I
    • 1:52:41had to compare them against everyone else.
    • 1:52:43So that's n -1 comparisons the first time.
    • 1:52:46So n -1 on the board.
    • 1:52:47Then I can ignore them, because they're behind me now.
    • 1:52:50So now I have how many comparisons left out of n people?
    • 1:52:54n -2, because I subtracted one.
    • 1:52:56Then again, n -3, then n -4, all the way down to just one person remaining.
    • 1:53:00So I'll express that sort of generally, mathematically, like this.
    • 1:53:03So n -1 plus n -2 plus whatever plus one final comparison, whatever that is.
    • 1:53:09It turns out that if you actually read the back of the math
    • 1:53:12book or your physics textbooks where they have those little cheat
    • 1:53:14sheets as to what these recurrences are, turns out that n -1 plus n -2 plus n -3
    • 1:53:20and so forth can be expressed more succinctly
    • 1:53:22as literally just n times n -1 divided by 2.
    • 1:53:26And if you don't recall that, that's OK.
    • 1:53:28I always look these things up as well.
    • 1:53:30But that's true-- fact.
    • 1:53:32So what does that equal out to?
    • 1:53:33Well, it's like n squared minus n, if you just multiply it out.
    • 1:53:36And then if you divide the two, then it's
    • 1:53:38n squared divided by 2 minus n over 2.
    • 1:53:40So that's the total number of steps.
    • 1:53:42And I could actually plug this in.
    • 1:53:43We could plug in 8, do the math, and get the total number of comparisons
    • 1:53:46that I was verbally kind of rattling off.
    • 1:53:49So is that a big deal?
    • 1:53:51Hm, it feels like it's on the order of n squared.
    • 1:53:54And indeed, a computer scientist, when assessing
    • 1:53:56the efficiency of an algorithm, tends not to care too much
    • 1:53:59about the precise values.
    • 1:54:00All we're going to care about it's the biggest term.
    • 1:54:02What's the value in the formula that you come up
    • 1:54:05with that just dominates the other terms, so to speak,
    • 1:54:07that has the biggest effect, especially as n is getting larger and larger?
    • 1:54:11Now, why is this?
    • 1:54:12Well, let's just do sort of proof by example, if you will.
    • 1:54:15If this is the expression, technically, but I
    • 1:54:18claim that, ugh, it's close enough to say
    • 1:54:20on the order of, big O of n squared, so to speak, let's use an example.
    • 1:54:24If there's a million people on stage, and not just eight,
    • 1:54:27that math works out to be like a million squared
    • 1:54:29divided by 2 steps minus a million divided by 2, total.
    • 1:54:33So what does that actually work out to be?
    • 1:54:35Well, that's 500 billion minus 500,000.
    • 1:54:38And what does that work out to be?
    • 1:54:40Well, that's 499 billion, 999 million, 500,000.
    • 1:54:46That feels pretty darn close to like n squared.
    • 1:54:49I mean, that's a drop in the bucket to subtract 500,000 from 500 billion.
    • 1:54:54So you know what?
    • 1:54:55Eh, it's on the order of n squared.
    • 1:54:57It's not precise, but it's in that general order of magnitude,
    • 1:55:01so to speak.
    • 1:55:02And so this symbol, this capital 0, is literally a symbol
    • 1:55:04used in computer science and in programming
    • 1:55:06to just kind of describe with a wave of the hand,
    • 1:55:09but some good intuition and algorithm, how fast or slow your algorithm is.
    • 1:55:13And it turns out there's different ways to evaluate algorithms
    • 1:55:16with just different similar formulas.
    • 1:55:18n squared happens to be how much time both bubble sort and selection
    • 1:55:21sort take.
    • 1:55:22If I literally count up all of the work we
    • 1:55:24were doing on stage with our volunteers, it
    • 1:55:26would be roughly n squared, 8 squared, or 64 steps, give or take,
    • 1:55:32for all of those humans.
    • 1:55:33And that would be notably off.
    • 1:55:35There's a good amount of rounding error there.
    • 1:55:36But if we had a million volunteers on stage,
    • 1:55:39then the rounding error would be pretty negligible.
    • 1:55:42But we've actually seen some of these other orders of magnitude,
    • 1:55:45so to speak, before.
    • 1:55:46For instance, when we counted someone, or we searched for Mike Smith one page
    • 1:55:49at a time, we called that a linear algorithm.
    • 1:55:52And that was big O of n.
    • 1:55:53So it's on the order of n steps.
    • 1:55:55It's 1,000.
    • 1:55:55Maybe it's 999.
    • 1:55:56Whatever.
    • 1:55:57It's on the order of n steps.
    • 1:55:58The [? twosies ?] approach was twice as fast, recall-- two pages at a time.
    • 1:56:02But you know what?
    • 1:56:03That's still linear, right?
    • 1:56:05Like two pages at a time?
    • 1:56:06Let me just wait till next year when my CPU is twice
    • 1:56:08as fast, because Intel and companies keep speeding up computers.
    • 1:56:10The algorithm is fundamentally the same.
    • 1:56:12And indeed, if you think back to the picture we drew,
    • 1:56:15the shapes of those curves were indeed the same.
    • 1:56:18That first algorithm, finding Mike one page at a time looked like this.
    • 1:56:22Second algorithm finding him looked like this.
    • 1:56:24Only the third algorithm, the divide and conquer, splitting the phone book
    • 1:56:28was a fundamentally different shape.
    • 1:56:29And so even though we didn't use this fancy phrasing a couple of weeks
    • 1:56:33ago, these first algorithms, one page at a time, two pages at a time, eh,
    • 1:56:37they're on the order of n.
    • 1:56:39Technically, yes, n versus n divided by 2,
    • 1:56:42but we only care about the dominating factor, the variable n.
    • 1:56:46We can throw away everything in the denominator,
    • 1:56:48and we can throw away everything that's smaller than the biggest term, which
    • 1:56:51in this case is just n.
    • 1:56:52And I alluded to this two weeks ago--
    • 1:56:54logarithmic.
    • 1:56:55Well, it turns out that any time you divide something again, and again,
    • 1:56:58and again, you're leveraging a logarithmic type function,
    • 1:57:02log base 2 technically.
    • 1:57:03But on the order of log base n is a common one as well.
    • 1:57:08The beautiful algorithms are these--
    • 1:57:10literally, one step, or technically constant number of steps.
    • 1:57:14For instance, like what's an algorithm that might be constant time?
    • 1:57:20Open phone book.
    • 1:57:21OK, one step.
    • 1:57:22Doesn't really matter how many pages there are,
    • 1:57:24I'm just going to open the phone book.
    • 1:57:25And that doesn't vary by number of pages.
    • 1:57:27That might be a constant time algorithm, for instance.
    • 1:57:30So those are the lowest you can go.
    • 1:57:32And then there's somewhere even in between here
    • 1:57:34that we might aspire to with certain other algorithms.
    • 1:57:37So in fact, let's just see if-- just a moment--
    • 1:57:41let's just see if we can do this a little more succinctly.
    • 1:57:44Let's go ahead and use arrays in just one final way, using merge sorts.
    • 1:57:50So it turns out, using an array, we can actually
    • 1:57:53do something pretty powerfully, so long as we allow ourselves
    • 1:57:56a couple of arrays.
    • 1:57:58So again, when we just add sorting with bubble sort and selection sort,
    • 1:58:00we had just one array.
    • 1:58:01We had eight chairs for our eight people.
    • 1:58:04But if I actually allowed myself like 16 chairs, or even more,
    • 1:58:07and I allowed these folks to move a bit more,
    • 1:58:10I could actually do even better than that using arrays.
    • 1:58:12So here's some random numbers that we'll just do visually, without any humans.
    • 1:58:16And they're in an array, back, to back, to back, to back.
    • 1:58:18But if I allow myself a second array, I'm
    • 1:58:20going to be able to shuffle these things around and not just compare them,
    • 1:58:23because it was those comparisons and all of my footsteps in front of them
    • 1:58:26that really started to take a lot of time.
    • 1:58:28So here's my array.
    • 1:58:29You know what?
    • 1:58:29Just like the phone book-- that phone book example got us pretty far
    • 1:58:32in the first week--
    • 1:58:33let me do half of the problem at a time and then kind of combine my answer.
    • 1:58:38So here's an array--
    • 1:58:394, 2, 7, 5, 6, 8, 3, 1-- randomly sorted.
    • 1:58:42Let me go ahead and sort just half of this,
    • 1:58:44just like I searched for Mike initially in just half of the phone book.
    • 1:58:47So 4, 2, 7, 5-- not sorted.
    • 1:58:50But you know what?
    • 1:58:51This feels like too big of a problem, still.
    • 1:58:53Let me sort just the left half of the left half.
    • 1:58:56OK, now it's a smaller problem.
    • 1:58:58You know what?
    • 1:58:594 and 2, still out of order.
    • 1:59:00Let me just divide this list of two into two tiny arrays, each of size 1.
    • 1:59:05So here's a mini-array of size 1, and then another one of like size
    • 1:59:087, but they're back to back, so whatever.
    • 1:59:10But this array of size 1, is it sorted?
    • 1:59:14AUDIENCE: No.
    • 1:59:15DAVID J. MALAN: I'm sorry?
    • 1:59:16AUDIENCE: No.
    • 1:59:17DAVID J. MALAN: No?
    • 1:59:18If this array has just one element and that element is 4--
    • 1:59:21AUDIENCE: There's only one thing you can do.
    • 1:59:22DAVID J. MALAN: Yes, then it is sorted, by definition.
    • 1:59:24All right, so done.
    • 1:59:25Making some progress.
    • 1:59:26Now, let me kind of mentally rewind.
    • 1:59:28Let me sort the right half of that array.
    • 1:59:32So now I have another array of size 1.
    • 1:59:34Is this array sorted?
    • 1:59:36Yeah, kind of stupidly.
    • 1:59:37We don't really seem to be doing anything.
    • 1:59:39We're just making claims.
    • 1:59:40But yes, this is sorted.
    • 1:59:41But now, this was the original half.
    • 1:59:44And this half is sorted.
    • 1:59:46This half is sorted.
    • 1:59:47What if I now just kind of merge these sorted halves?
    • 1:59:49I've got two lists of size 1--
    • 1:59:524 and 2.
    • 1:59:53And now if I have extra storage space, if I had like extra benches,
    • 1:59:56I could do this a little better.
    • 1:59:58don't I go ahead and merge these two as follows?
    • 2:00:002 will go there.
    • 2:00:024 will go there.
    • 2:00:03So now I've taken two sorted lists and made one bigger, more sorted list
    • 2:00:06by just merging them together, leveraging some additional space.
    • 2:00:10Now, let me mentally rewind.
    • 2:00:11How did I get to 4 and 2?
    • 2:00:12Well, I started with the left half, then the left half of the left half.
    • 2:00:15Let me now do the right half of the left half, if you will.
    • 2:00:19All right, let me divide this again.
    • 2:00:207, list of size 1, is it sorted?
    • 2:00:23Yes, trivially.
    • 2:00:245, is it sorted?
    • 2:00:26Yes.
    • 2:00:277 and 5, let's go ahead and merge them together.
    • 2:00:295 is, of course, going to go here.
    • 2:00:317, of course, is going to go here.
    • 2:00:33OK.
    • 2:00:34Now where do we go?
    • 2:00:35We originally sorted the left half.
    • 2:00:37Let's go sort the right-- oh, right.
    • 2:00:39Sorry.
    • 2:00:40Now, we have the left half.
    • 2:00:41And the right half of the left half are sorted.
    • 2:00:45Let's go ahead and merge these.
    • 2:00:46We have two lists now of size 2--
    • 2:00:482, 4 and 5, 7, both of which are sorted.
    • 2:00:52If I now merge 2, 4 and 5, 7, which element should come first
    • 2:00:56in the new longer list, obviously?
    • 2:00:592.
    • 2:01:00And then 4, then 5, and then 7.
    • 2:01:01That wasn't much of anything.
    • 2:01:03But OK, we're just using a little more space in our array.
    • 2:01:05Now what comes next?
    • 2:01:07Now, let's do the right half.
    • 2:01:08Again, we started by taking the whole problem, doing the left half,
    • 2:01:11the left half of the left half, the left half of the left half of the left half.
    • 2:01:14And now we're going back in time, if you will.
    • 2:01:17So let's divide this into two halves, now the left half into two
    • 2:01:20halves still.
    • 2:01:216 is sorted.
    • 2:01:228 is sorted.
    • 2:01:23Now I have to merge them--
    • 2:01:246, 8.
    • 2:01:26What comes next?
    • 2:01:26Right half-- 3 and 1.
    • 2:01:29Well, left half is sorted, right half is sorted--
    • 2:01:311 and 3.
    • 2:01:33All right, now how do I merge these?
    • 2:01:356, 8, 1, 3, which element should obviously come first?
    • 2:01:381, then 3, then 6, then 8.
    • 2:01:42And then lastly, I have two lists of size four.
    • 2:01:45Let me give myself a little more space, one more array.
    • 2:01:48Now let me go ahead and put 1, and 2, and 3, and 4, and 5,
    • 2:01:53and 6, and 7, and 8.
    • 2:01:56What just happened?
    • 2:01:57Because it actually happened a lot faster, even though we were doing this
    • 2:02:00all verbally.
    • 2:02:01Well notice, how many times did each number change locations?
    • 2:02:09Literally three, right?
    • 2:02:10Like one, two, three, right?
    • 2:02:13It moved from the original array, to the secondary array, to the tertiary array,
    • 2:02:17to the fourth array, whatever that's called.
    • 2:02:19And then it was ultimately in place.
    • 2:02:21So each number had to move one, two, three spots.
    • 2:02:24And then how many numbers are there?
    • 2:02:26AUDIENCE: [INAUDIBLE]
    • 2:02:28DAVID J. MALAN: Well, they were already in the original array.
    • 2:02:30So how many times do they have to move?
    • 2:02:32Just one, two, three.
    • 2:02:33So how many total numbers are there, just to be clear?
    • 2:02:36There's eight.
    • 2:02:37So 8 times 3.
    • 2:02:38So let's generalize this.
    • 2:02:39If there's n numbers, and each time we moved
    • 2:02:43the numbers we did like half of them, than half, then half, well,
    • 2:02:46how many times can you divide 8 by 2?
    • 2:02:508 goes to 4.
    • 2:02:514 goes to 2.
    • 2:02:522 goes to 1.
    • 2:02:53And that's why we bottomed out at one element, lists of size 1.
    • 2:02:57So it turns out whenever you divide something by half, by half, by half,
    • 2:03:00what is that function or formula?
    • 2:03:05Not power, that's bad.
    • 2:03:06That's the other direction.
    • 2:03:07AUDIENCE: [INAUDIBLE]
    • 2:03:08DAVID J. MALAN: It's a logarithm.
    • 2:03:08So again, logarithm is just a mathematical description
    • 2:03:11for any function that you keep dividing something again, and again, and again.
    • 2:03:14In half, in half, in half, in third, in third, in third, whatever it is,
    • 2:03:17it just means division by the same proportional amounts again,
    • 2:03:20and again, and again.
    • 2:03:22And so if we move the numbers three times, or more generally log
    • 2:03:27of n times, which again just means you divided n things again,
    • 2:03:31and again, and again, you just call that log n.
    • 2:03:33And there's n numbers, so n numbers moved
    • 2:03:36log n times, the total arithmetic here in question
    • 2:03:40is one of those other values on our little cheat sheet, which
    • 2:03:44looked like this.
    • 2:03:46In our other cheat sheet, recall that we had formulas that looked like this,
    • 2:03:51not just n squared and n, and log n, and 1, we have this one in the middle--
    • 2:03:55n times log n.
    • 2:03:57So again, we're kind of jumping around here.
    • 2:03:59But again, each number moves log n places.
    • 2:04:02There's n total numbers.
    • 2:04:03So n times log n is just, by definition, n log n.
    • 2:04:07But why is this sorted this way?
    • 2:04:09Well log n, recall from week 0 with the phone book example,
    • 2:04:12the green curve is definitely smaller than n. n was the straight lines,
    • 2:04:16log n was the green curved one.
    • 2:04:18So this indeed belongs in between, because this is n times n.
    • 2:04:21This is n.
    • 2:04:22This is n times something smaller than n.
    • 2:04:25So what's the actual implication?
    • 2:04:26Well, if we were to run these algorithms side by side
    • 2:04:29and actually compare them with something like this--
    • 2:04:34let me go ahead and compare these algorithms using this demo here--
    • 2:04:41if I go ahead and hit play, we'll see that the bars in this chart
    • 2:04:44are actually horizontal.
    • 2:04:45And the small bars represent small numbers,
    • 2:04:47large bars represent long numbers.
    • 2:04:49And then each of these is going to run a different algorithm-- selection
    • 2:04:52sort on the left, bubble sort in the middle,
    • 2:04:54merge sort, as we'll now call it, on the right.
    • 2:04:57And here's how long each of them take to sort those values.
    • 2:05:04Bubble's still going.
    • 2:05:06Selection's still going.
    • 2:05:07And so that's the appreciable difference, albeit with a small demo,
    • 2:05:09between n squared and something like log n.
    • 2:05:12And so what have we done here?
    • 2:05:13We've really, really, really got into the weeds of what arrays can actually
    • 2:05:17do for us and what the relationships are with strings, because all of it
    • 2:05:20kind of reduces to just things being back, to back, to back, to back.
    • 2:05:22But now that we kind of come back, and we'll
    • 2:05:24continue along this trajectory next time to be
    • 2:05:26able to talk at a much higher level about what's actually going on.
    • 2:05:29And we can now take this even further, by applying
    • 2:05:32other sort of forms of media to these same kinds of questions.
    • 2:05:35And we'll conclude it's about 60 seconds long.
    • 2:05:37These bars are vertical, instead of horizontal.
    • 2:05:39And what you'll see here is a visualization
    • 2:05:41of various sorting algorithms, among them selection sort, bubble
    • 2:05:43sort, and merge sort, and a whole assortment of others, each of which
    • 2:05:46has even a different sound to it because of the speed
    • 2:05:50and the pattern by which it actually operates.
    • 2:05:53So let's take a quick look.
    • 2:05:54[VIDEO PLAYBACK]
    • 2:05:55[MUSIC PLAYING]
    • 2:06:05This is bubble sort.
    • 2:06:06And you can see how the larger elements are indeed bubbling up to the top.
    • 2:06:15[?
    • 2:06:16And you can kind of hear the ?] periodicity,
    • 2:06:18or the cycle that it's going in.
    • 2:06:25And there's less, and less, and less, and less work to do, until almost--
    • 2:06:33This is selection sort now.
    • 2:06:34So it starts off random, but we keep selecting the smallest human
    • 2:06:38or, in this case, the shortest bar.
    • 2:06:41And you'll see here the bars correlate with frequency, clearly.
    • 2:06:45So it's getting higher and higher and taller and taller.
    • 2:06:50This is merge sort now which, recall, does things in halves,
    • 2:06:53and then halves of halves, and then merges those halves.
    • 2:06:57So we just did all the left work, almost all the right work.
    • 2:07:03That one's very gratifying.
    • 2:07:04[LAUGHS]
    • 2:07:06This is something called [? nom ?] sort, which is improving things.
    • 2:07:10Not quite perfectly, but it's always making forward progress,
    • 2:07:13and then kind of doubling back and cleaning things up.
    • 2:07:24[END PLAYBACK]
    • 2:07:24Whew.
    • 2:07:25That was a lot.
    • 2:07:26Let's call it a day.
    • 2:07:26I'll stick around for one-on-one questions.
    • 2:07:28We'll see you next time.
    • 2:07:29[APPLAUSE]
  • CS50.ai
Shortcuts
Before using a shortcut, click at least once on the video itself (to give it "focus") after closing this window.
Play/Pause spacebar or k
Rewind 10 seconds left arrow or j
Fast forward 10 seconds right arrow or l
Previous frame (while paused) ,
Next frame (while paused) .
Decrease playback rate <
Increase playback rate >
Toggle captions on/off c
Toggle mute m
Toggle full screen f or double-click video