CS50 Video Player
    • 0:00:00Introduction
    • 0:01:12Readability
    • 0:06:02Compiling
    • 0:38:14Debugging
    • 1:02:57Arrays
    • 1:35:11Strings
    • 2:04:00Command-line Arguments
    • 2:11:30cowsay
    • 2:13:34Exit Status
    • 2:19:13Cryptography
    • 0:00:02[MUSIC PLAYING]
    • 0:01:12DAVID MALAN: All right.
    • 0:01:13This is CS50, and this is week 2 wherein we're
    • 0:01:17going to take a look at a lower level at how things work,
    • 0:01:20and indeed, among the goals of the course is this bottom-up understanding
    • 0:01:24so that in a couple of weeks' time, even a few years' time,
    • 0:01:26when you encounter some new technology, you'll be able to think back hopefully
    • 0:01:29on some of this week's basic building blocks and primitives
    • 0:01:33and really just deduce how tomorrow's technologies work.
    • 0:01:36But along the way, it's going to seem--
    • 0:01:37it's going to be a little hard, perhaps, to see the forest for the trees,
    • 0:01:40so to speak.
    • 0:01:41And so the goal at the end of the day still is going to be problem-solving.
    • 0:01:44And so we thought we'd begin today with a look at some of the problems
    • 0:01:47we'll talk about or solve this coming week,
    • 0:01:50and for that, we have some brave volunteers who have already come up.
    • 0:01:53If we could turn on some dramatic lighting and meet today's volunteers.
    • 0:01:58So on my left here, we have--
    • 0:02:00ALEX: Hi.
    • 0:02:00My name is Alex.
    • 0:02:01I'm a first-year at the college and I'm from Chapel Hill, North Carolina.
    • 0:02:05DAVID MALAN: Welcome to Alex.
    • 0:02:07And to Alex's right.
    • 0:02:09SARAH: I'm Sarah.
    • 0:02:10I'm from Toronto, Canada, and I'm also a first-year student at the college.
    • 0:02:13DAVID MALAN: Wonderful.
    • 0:02:14Well, welcome to both Alex and Sarah.
    • 0:02:15So one of the problems you'll perhaps solve this week for problem
    • 0:02:18set 2 is to analyze the reading level of a body of text,
    • 0:02:22whether someone reads at a first grade level, second grade level, third grade
    • 0:02:25level, all the way up to 12 or 13 or beyond.
    • 0:02:28What you perhaps never quite thought about, certainly in terms of code,
    • 0:02:32like how you would analyze some text, some book and figure
    • 0:02:35out what reading level is it at.
    • 0:02:36And yet, surely our teachers growing up knew or had an intuitive sense of this.
    • 0:02:40So let's consider some sample text.
    • 0:02:42For instance, Alex, what have you been reading lately?
    • 0:02:45ALEX: One fish, two fish, red fish, blue fish.
    • 0:02:52DAVID MALAN: Wonderful.
    • 0:02:53So given that, what grade level would you say Alex is currently reading at?
    • 0:02:58Feel free to just shout it out.
    • 0:03:01First, first?
    • 0:03:02So indeed, you'll see this week, if you run your code on Alex's text,
    • 0:03:07it actually turns out he reads below a first grade reading level.
    • 0:03:10But why might that be?
    • 0:03:12What might your intuition be for why we've
    • 0:03:16accused Alex of reading at this level?
    • 0:03:19Feel free to shout out.
    • 0:03:20Yeah.
    • 0:03:21So very few syllables, short words, short sentences.
    • 0:03:24And so there's some heuristics, perhaps, we can infer from that short text,
    • 0:03:27that that probably means that it's best for younger children.
    • 0:03:30Now Sarah, by contrast, what have you been reading?
    • 0:03:33SARAH: Mr. and Mrs. Dursley, of Number
    • 0:03:35Four, Privet Drive, were proud to say that they were
    • 0:03:38perfectly normal, thank you very much.
    • 0:03:41They were the last people you'd expect to be involved
    • 0:03:43in anything strange or mysterious because they just
    • 0:03:46didn't hold with much nonsense.
    • 0:03:47DAVID MALAN: All right.
    • 0:03:48Now irrespective of what grade you were in when
    • 0:03:50you might have read that text, what grade level does Sarah
    • 0:03:53seem to be reading at?
    • 0:03:55So eighth grade, second grade.
    • 0:03:57OK.
    • 0:03:58So hearing a bit of everything, so with that, at least according to code,
    • 0:04:01it would actually be seventh grade.
    • 0:04:03And what might the intuition there be?
    • 0:04:05Why is that a higher grade level even though we might
    • 0:04:07disagree exactly which grade it is?
    • 0:04:09AUDIENCE: Complicated sentences.
    • 0:04:11DAVID MALAN: Yeah.
    • 0:04:12So complicated sentences, longer sentences.
    • 0:04:14So indeed a lot more words were being spoken by Sarah because there
    • 0:04:17was so much more there on the page.
    • 0:04:18So we'll translate these ideas this coming week in problem set 2,
    • 0:04:22if you tackle this one, through code so that you can ultimately
    • 0:04:25infer these things quantitatively.
    • 0:04:26But to do so, we're going to have to understand text.
    • 0:04:29So let's first thank our volunteers and then we'll dive in to that lower level.
    • 0:04:32[APPLAUSE]
    • 0:04:39Sorry.
    • 0:04:40You can keep those.
    • 0:04:41SARAH: Oh, OK.
    • 0:04:42DAVID MALAN: All right.
    • 0:04:43So besides that, let's consider one other body of text
    • 0:04:45perhaps that you might see this week, which
    • 0:04:48is namely a little something like this.
    • 0:04:50What I have here on the screen is what we'll start calling today ciphertext.
    • 0:04:53It's the result of encrypting some piece of information.
    • 0:04:56And encryption, or more generally, the art and science of cryptography
    • 0:05:00is all around us.
    • 0:05:00It's what you're using on the web, on your phones, with your banks.
    • 0:05:03And anything that tries to keep data secure is using encryption.
    • 0:05:07But there's going to be different levels of encryption-- strong encryption,
    • 0:05:10weak encryption.
    • 0:05:11And what you see here on the screen isn't all that strong,
    • 0:05:14but we'll see later today how we might decrypt this and actually reveal
    • 0:05:18what the plaintext is that corresponds to that ciphertext.
    • 0:05:22But in order to do so, we have to start taking off some training wheels,
    • 0:05:25so to speak.
    • 0:05:26And believe it or not, even though your time
    • 0:05:28with C this past week, for the first time,
    • 0:05:30probably, might have been rather in the weeds
    • 0:05:32and much more complicated seemingly than Scratch, it turns out that along the way,
    • 0:05:36we have been providing and we'll continue
    • 0:05:37to provide certain training wheels.
    • 0:05:39For instance, the CS50 Library is one of them,
    • 0:05:42and even some of the explanations we give of topics for now
    • 0:05:46in these early weeks will be somewhat simplified-- abstracted away,
    • 0:05:49if you will.
    • 0:05:49But the goal ultimately is for you to understand
    • 0:05:51each and every one of those details so that after CS50, you really
    • 0:05:55can stand on your own and understand and wrap your mind
    • 0:05:58around any future technologies as well.
    • 0:06:01So let's consider first the very first program with which we began last week,
    • 0:06:05which was this one.
    • 0:06:06So "hello, world" in C. At the end of the day, it was really the printf
    • 0:06:09function that was doing the interesting part of the work,
    • 0:06:11but there was a lot of technical stuff above and below it.
    • 0:06:14The curly braces, the parentheses, words like void and include, and then
    • 0:06:19of course, the angled brackets and more.
    • 0:06:21But at the end of the day, we needed to convert that source code in C
    • 0:06:25to machine code, the 0's and 1's in binary that the computer understood.
    • 0:06:30And to do that, of course, we ran--
    • 0:06:32we compiled the code.
    • 0:06:33We ran make and then we were able to actually run that code there.
    • 0:06:37So let me actually go over here to VS Code
    • 0:06:39and really quickly recreate that hello.c pretty much by transcribing the same.
    • 0:06:44So I might have here include stdio.h, int main void.
    • 0:06:51And then in here, I had quite simply, hello,
    • 0:06:54comma, world with my backslash n, end quotes, and more.
    • 0:06:57Now last time, to compile this, I indeed ran make hello, followed by Enter.
    • 0:07:01Hopefully you see no errors and that's a good thing.
    • 0:07:03And if you do dot, slash, hello, you see,
    • 0:07:05in fact, the results of that program.
    • 0:07:07But it turns out that make is not actually a compiler
    • 0:07:11as I alluded to last week.
    • 0:07:12It's a program that clearly makes your program,
    • 0:07:15but it itself just automates the process of using an actual compiler.
    • 0:07:19And there's lots of different compilers out there,
    • 0:07:21and the one that it's actually using underneath the hood
    • 0:07:24is a little something called Clang for C Language.
    • 0:07:27And Clang is a pretty popular compiler nowadays.
    • 0:07:30There's another one that's been around for ages called GCC,
    • 0:07:33but these are just specific names for types of compilers
    • 0:07:36that different people, different companies, different groups
    • 0:07:38have actually created.
    • 0:07:40But if you had used a compiler yourself manually in week 1,
    • 0:07:44you'd have had to understand a little more about what's
    • 0:07:47going on because it's even more cryptic than just make alone.
    • 0:07:50So in fact, let me go back to my terminal window here, let me go ahead
    • 0:07:53and clear the screen a little bit and just run really the raw compiler
    • 0:07:58command.
    • 0:07:59So what make is automating for me, let me
    • 0:08:01actually do this manually for just a moment.
    • 0:08:03So if I want to compile hello.c into an executable program I can run,
    • 0:08:10I can do this.
    • 0:08:12clang, space, hello.c, and then Enter.
    • 0:08:17And now there's no output, which is a good thing in this case, no errors,
    • 0:08:20but notice this.
    • 0:08:22If I go ahead and type ls, it turns out there's
    • 0:08:25a file that's been created suddenly in my current folder weirdly called a.out.
    • 0:08:32That stands for Assembler Output.
    • 0:08:33And long story short, that's actually the default name
    • 0:08:35of a program that's created when you just run Clang by itself.
    • 0:08:39Now that's a pretty bad name for a program
    • 0:08:41because it doesn't describe what it does.
    • 0:08:44So better would be here to perhaps do, well, instead of a.out, which, yes,
    • 0:08:49still prints hello, world, but isn't really a clearly-named program,
    • 0:08:53it'd be nice to name this hello.
    • 0:08:55So what could I do?
    • 0:08:56I could do like we learned last week-- well, I could rename a.out to hello
    • 0:08:59by using Linux's mv command.
    • 0:09:01So I'm going to move a.out to become hello.
    • 0:09:04But that, too, seems kind of tedious.
    • 0:09:06Now I have three steps.
    • 0:09:07Like write my code, compile my code, and then rename it
    • 0:09:10before I can even run it.
    • 0:09:12We can do better than that.
    • 0:09:13And so it turns out that certain commands
    • 0:09:15like clang support what we're going to start today
    • 0:09:18calling command line arguments.
    • 0:09:20A command line argument, unlike an argument to a function,
    • 0:09:24is just an additional word or key phrase that you
    • 0:09:27type after a command at your prompt in your terminal
    • 0:09:30window that just modifies the behavior of that command.
    • 0:09:33It configures it a little more specifically.
    • 0:09:35So what you're seeing here on the screen is somewhat of a better command with which
    • 0:09:39to run clang so that now I can specify the output of this command per this -o.
    • 0:09:45So what do I mean by that?
    • 0:09:46Well, let me go ahead and clear my terminal window again
    • 0:09:48and more explicitly type clang -o hello hello.c and then Enter.
    • 0:09:54Nothing, again, appears to happen, but that's a good thing when
    • 0:09:57you see no errors and now the program I just created is indeed called hello.
    • 0:10:02So it achieves really the same exact effect as make did, but what
    • 0:10:07I don't have to do with make is type and remember something
    • 0:10:09as long as this command.
    • 0:10:11And this, too, is a bit of a white lie.
    • 0:10:12It turns out, we have preconfigured VS Code in the cloud for you
    • 0:10:16to also use some other features of Clang that would be even more
    • 0:10:21tedious for you to write yourselves.
    • 0:10:22And so really, this is why we distill this as ultimately just running make.
    • 0:10:28So let me pause here to see first if there's any questions on what I've
    • 0:10:31done by taking my very first program in C
    • 0:10:34and just now compiling it first with make, but then starting over
    • 0:10:37and now manually compiling it with clang with what
    • 0:10:40we'll call command line arguments. -o, space, hello,
    • 0:10:44and then the name of the file.
    • 0:10:46Yeah?
    • 0:10:47AUDIENCE: What is a.out?
    • 0:10:48DAVID MALAN: Yeah.
    • 0:10:49So a.out is a historical name.
    • 0:10:51It refers to assembler output-- more on that soon.
    • 0:10:55And it's just the default file name that you get automatically
    • 0:10:58if you just run the compiler on any file so that you
    • 0:11:01have just a standard name for it.
    • 0:11:02But it's not a very well-named program.
    • 0:11:05Instead of running Microsoft Word on your Mac or PC,
    • 0:11:07it would be like double-clicking on a.out.
    • 0:11:09So instead with these command line arguments,
    • 0:11:11you can customize the output of Clang and call it hello or anything you want.
    • 0:11:17Other questions on what I've done here with Clang itself, the compiler?
    • 0:11:23Yeah?
    • 0:11:23AUDIENCE: What is -o?
    • 0:11:25DAVID MALAN: So -o--
    • 0:11:26and you would only know this from reading the manual, taking a class,
    • 0:11:29means output.
    • 0:11:30So -o means change Clang's output to be a file called hello
    • 0:11:35instead of the default, which is a.out.
    • 0:11:38And this, too, is, again, a detail you would have to look up on a web page,
    • 0:11:42read the manual, hear someone like me tell you about it.
    • 0:11:44And in fact, there's even more than these options,
    • 0:11:46but we'll just scratch the surface here.
    • 0:11:48All right.
    • 0:11:49So if we now know this, what more is actually happening underneath the hood?
    • 0:11:53Well, let's take a closer look at not just this version of my code,
    • 0:11:57but my slightly more complicated version last week,
    • 0:12:01which looked a little something like this, wherein
    • 0:12:03I added in some dynamic input from the user so I could say not hello, world
    • 0:12:07to everyone, but hello, David or hello to whoever actually runs this program.
    • 0:12:11So in fact, let me go ahead and change my code here in VS Code just
    • 0:12:15to match that same code from last week.
    • 0:12:17So no new code yet.
    • 0:12:19I'm just going to, in a moment, compile it in a slightly different way.
    • 0:12:22So I did last week string answer equals get_string, quote-unquote,
    • 0:12:29"What's your name?"
    • 0:12:30Just like in Scratch.
    • 0:12:31And then down here, instead of doing world, I initially wrote answer,
    • 0:12:35but that didn't go well.
    • 0:12:37What did I ultimately do instead to print out hello, David or hello,
    • 0:12:41so-and-so?
    • 0:12:42Yeah?
    • 0:12:44Sorry, a little louder?
    • 0:12:45AUDIENCE: %s?
    • 0:12:46DAVID MALAN: Yeah, so %s, the so-called format code that printf just knows how
    • 0:12:50to deal with.
    • 0:12:51And I had to add one other thing.
    • 0:12:52Someone else besides %s--
    • 0:12:54yeah?
    • 0:12:54AUDIENCE: The name of the variable.
    • 0:12:56DAVID MALAN: The name of the variable that I want to plug into that
    • 0:12:58placeholder %s.
    • 0:13:00And in this case, it's answer.
    • 0:13:01Now let me make one refinement only because now we're in week 2
    • 0:13:04and we're going to start writing more lines of code,
    • 0:13:06even though Scratch called the return value of the ask puzzle piece,
    • 0:13:10answer always.
    • 0:13:11In C, we have full control over what our variables are called.
    • 0:13:14And now it's probably good not to just generically always call
    • 0:13:17my variable answer if I'm using get_string.
    • 0:13:19Let's call it what it is.
    • 0:13:21So this is now just a matter of style, if you will.
    • 0:13:23Let me change the variable to be name just so
    • 0:13:26that it's a little clearer to me, to you, to a TF or TA
    • 0:13:29exactly what that variable represents instead of more generically answer.
    • 0:13:34All right, so that said, let me go down to my terminal window,
    • 0:13:37and last week again, I ran make to compile this exact same program.
    • 0:13:41Now, though, let me go ahead and just use clang.
    • 0:13:43So clang -o--
    • 0:13:45I'll still call this version hello--
    • 0:13:47space, hello.c.
    • 0:13:49So exact same command as before.
    • 0:13:51The only thing that's different is I've added a couple of more lines of code
    • 0:13:54to get the user's input.
    • 0:13:56Let me hit Enter, and now, darn it, our first error.
    • 0:13:59So output from clang and make is not a good thing,
    • 0:14:02and here, we're seeing something particularly cryptic.
    • 0:14:05So something in function 'main': undefined reference
    • 0:14:09to 'get_string', and then linker command failed with exit code 1.
    • 0:14:13So there's actually a lot of jargon in there that we'll tease apart today,
    • 0:14:16but my hint is that clearly my problem's in main, although that's not surprising
    • 0:14:20because there's nothing else going on here.
    • 0:14:22get_string is an issue, and the issue is that it's an undefined reference.
    • 0:14:26And yet, notice, I was pretty good.
    • 0:14:28I added the CS50 header file and I said last week that that's
    • 0:14:32enough to teach the compiler that functions exist,
    • 0:14:35but the problem is that even though this does, in fact,
    • 0:14:39teach Clang that get_string exists, it is not
    • 0:14:43sufficient information for Clang to go find on the hard drive of the computer
    • 0:14:47the 0's and 1's that actually implement get_string itself.
    • 0:14:51So in other words, this include line, per last week,
    • 0:14:54is a little bit of a hint.
    • 0:14:55It's a teaser to Clang that you're about to see and use this function somewhere.
    • 0:14:59But if you actually want to use the 0's and 1's that CS50 wrote some time ago
    • 0:15:05and bake those into your program so your program actually
    • 0:15:08knows how to get input from the user, well then,
    • 0:15:11I'm going to have to go ahead and run a slightly different command.
    • 0:15:15So let me do this.
    • 0:15:16Let me clear my terminal window just to get rid of that distraction
    • 0:15:18and let me propose now that we run this command instead.
    • 0:15:23Almost the same as before, clang -o, space, hello, then hello.c,
    • 0:15:28but with one additional command line argument at the end, and this is a -l--
    • 0:15:34not a number 1.
    • 0:15:35So -lcs50 with no space in between those two.
    • 0:15:39Now the l is going to result in all of those 0's and 1's that actually
    • 0:15:43were written by CS50 being linked into your code, your few lines of code or mine
    • 0:15:48here.
    • 0:15:48But that's the second step that the compiler requires in order to know how
    • 0:15:53to actually execute, or rather compile, your code and CS50's.
    • 0:15:58And CS50 is not the only one that does this.
    • 0:16:00If you use any third party library in C that doesn't come with the language,
    • 0:16:04you would do -l such and such where whoever--
    • 0:16:08however they've named their own library.
    • 0:16:10But you don't have to do it for built in things like we've been using thus far.
    • 0:16:14All right, so let me go ahead and try this.
    • 0:16:16I'll go back to VS Code here, and let me go ahead now
    • 0:16:19and run clang -o hello, then hello.c.
    • 0:16:23And now instead of just hitting Enter, -lcs50
    • 0:16:26with no space between the l and the cs50, Enter.
    • 0:16:29Now nothing bad happens, and now I can do ./hello.
    • 0:16:33What's your name?
    • 0:16:34I'll type in David, Enter, and now we see hello, David.
    • 0:16:37Now honestly, this is where we're really getting into the weeds,
    • 0:16:40and now this is taking--
    • 0:16:42this is really just adding nuisance to the process of compiling and running
    • 0:16:45your code.
    • 0:16:46And so the reality is, even though this is indeed what is happening,
    • 0:16:49this is why we used last week and we're going
    • 0:16:51to continue using this week onward make because it just
    • 0:16:55automates that whole process for you.
    • 0:16:57But it's ideal to understand what's going wrong because any of the error
    • 0:17:00messages you saw for problem set 1, any of the error messages
    • 0:17:02you see for the next few weeks probably aren't coming from make,
    • 0:17:05they're coming from Clang underneath the hood
    • 0:17:08because make is just automating the process.
    • 0:17:10But with make, you literally just write make and then the name of the program,
    • 0:17:14you don't have to worry about any of those command line arguments.
    • 0:17:17Questions, then, on compiling with dash -lcs50 or anything else?
    • 0:17:22Yeah?
    • 0:17:23AUDIENCE: What is the benefit of [INAUDIBLE]??
    • 0:17:24DAVID MALAN: Sorry, what is the benefit of--
    • 0:17:26AUDIENCE: Using Clang manually.
    • 0:17:27DAVID MALAN: What is the benefit of using Clang manually?
    • 0:17:30None, really.
    • 0:17:30In fact, all main is doing is just say-- make is doing
    • 0:17:33is saving us some keystrokes.
    • 0:17:35If you prefer, though, and you just like to be more in control,
    • 0:17:37you can totally run Clang manually if you remember the various command line
    • 0:17:41arguments.
    • 0:17:42Yeah?
    • 0:17:42AUDIENCE: So why did you have to explain [INAUDIBLE]
    • 0:17:47DAVID MALAN: Exactly.
    • 0:17:48Why did I have to explain--
    • 0:17:49that is, provide a hint to CS50 with the cs50.h header file,
    • 0:17:53but I didn't have to do that with stdio.h?
    • 0:17:55Just because.
    • 0:17:56stdio.h comes with C, just like a few other libraries come
    • 0:18:00with C that we'll start seeing today.
    • 0:18:03CS50, though, is not built into C everywhere,
    • 0:18:05and so you do have to explicitly add that one there.
    • 0:18:07Yeah?
    • 0:18:08AUDIENCE: Can you define what command line argument [INAUDIBLE]??
    • 0:18:11DAVID MALAN: A command line argument is a word or phrase
    • 0:18:15that you type at the command line--
    • 0:18:17a.k.a., your terminal-- in order to influence the behavior of a program.
    • 0:18:22AUDIENCE: OK.
    • 0:18:22So it's a term for whatever you're giving it.
    • 0:18:24DAVID MALAN: Yeah.
    • 0:18:24It changes the defaults.
    • 0:18:25In our GUI world, Graphical User Interface,
    • 0:18:27you and I would probably click some boxes,
    • 0:18:29we would select some menu options to configure a program
    • 0:18:32to behave in the same way.
    • 0:18:33At a command line interface, you have to just say everything all at once,
    • 0:18:36and that's why we have command line arguments.
    • 0:18:39Yeah?
    • 0:18:40AUDIENCE: Is make [INAUDIBLE]
    • 0:18:43DAVID MALAN: No.
    • 0:18:43Make is not just for CS50.
    • 0:18:45It's used globally in any project really nowadays using C, C++,
    • 0:18:50even other languages as well.
    • 0:18:52In fact, most every command you see in this class,
    • 0:18:54unless it has 5-0 at the end of it, is globally used.
    • 0:18:57Only those suffixed with 50 are, indeed, course-specific.
    • 0:19:00And even those we'll gradually take training wheels off
    • 0:19:03of so that you understand exactly what those commands are doing as well.
    • 0:19:06All right, so what is it that we've just done?
    • 0:19:09Everything we've just done, of course, I keep calling compiling,
    • 0:19:11but let's just go down one rabbit hole so
    • 0:19:13that you understand that when you compile code,
    • 0:19:15there's actually a whole bunch of steps happening,
    • 0:19:18and this is going to enable a lot of features, like companies can
    • 0:19:21write code and then convert it to run it on Macs and PCs alike
    • 0:19:26or phones or the like.
    • 0:19:27So it's not just a matter of converting source code to machine code,
    • 0:19:30there's actually four steps involved in what you and I, as of last week,
    • 0:19:34know as compiling.
    • 0:19:35And these aren't terms that you'll have to keep in mind constantly
    • 0:19:39because again, we're going to abstract a lot of this away.
    • 0:19:41But just so we've gone down the rabbit hole once,
    • 0:19:43let's consider each of these four steps that
    • 0:19:45have been happening for you for a week automatically, the first of which
    • 0:19:49is called preprocessing.
    • 0:19:51So what does this mean?
    • 0:19:52Well, let's consider that same program as before.
    • 0:19:54So notice that two of the lines of code start with a hash mark.
    • 0:19:57That is a special symbol in C, and it's a so-called preprocessor directive.
    • 0:20:02You don't need to memorize terms like that,
    • 0:20:04but it just means that it's a little different from every other line.
    • 0:20:07And anything with a hash symbol here should
    • 0:20:08be preprocessed-- that is, analyzed initially before anything else happens.
    • 0:20:13So let's consider these two lines up top, what exactly is happening.
    • 0:20:17Well, it turns out with these two lines, you
    • 0:20:19have two header files, of course, cs50.h and stdio.h.
    • 0:20:23Where are those files, because they've never been in VS Code for you,
    • 0:20:27seemingly.
    • 0:20:28If you type ls-- if you open up the File Explorer in the GUI,
    • 0:20:31you have never seen, probably, cs50.h or stdio.h.
    • 0:20:35They just work, but that's because there's a folder somewhere
    • 0:20:39on the hard drive that you're using on your Mac or PC
    • 0:20:43or somewhere in the cloud, as in our case.
    • 0:20:45And inside of this folder, traditionally called /usr/include.
    • 0:20:50And usr is deliberately misspelled.
    • 0:20:51It's just slightly more succinct, although it's a little weird
    • 0:20:54why we drop that one letter.
    • 0:20:55But usr/include is just a folder on the server that contains cs50.h, stdio.h,
    • 0:21:01and a bunch of other things as well.
    • 0:21:03So in fact, if you type in VS Code, in your terminal window,
    • 0:21:08when you're using Codespaces in the cloud and type ls, space, /usr/include,
    • 0:21:13you can see all of the files in that folder.
    • 0:21:15But we've preinstalled all of that stuff for you.
    • 0:21:17So let's consider what's actually in those files here.
    • 0:21:20If I highlight these two lines up top that start with hash include, well,
    • 0:21:25I kind of hinted last week that what's in that first file is a hint as to what
    • 0:21:30functions CS50 wrote for you.
    • 0:21:32So you can kind of think of these include lines
    • 0:21:35as being temporary placeholders for what's
    • 0:21:38going to become like a global find and replace.
    • 0:21:41That is the first thing clang is going to do is to preprocess this file.
    • 0:21:44It's going to look for any line that starts with hash include.
    • 0:21:47And if it sees that, it's going to essentially go into that file,
    • 0:21:50like cs50.h, and then just copy and paste the contents of that file
    • 0:21:55magically there for you.
    • 0:21:56You don't see it visually on the screen.
    • 0:21:58But it's happening behind the scenes.
    • 0:22:00And so really, what's happening with this first line
    • 0:22:03is that somewhere in cs50.h is the declaration of get_string
    • 0:22:09like we talked about last week, and it probably
    • 0:22:11looks a little something like this.
    • 0:22:13And we didn't spend much time on this yet this past week,
    • 0:22:15but we will more in time.
    • 0:22:17Notice that this is how a function is declared.
    • 0:22:21That is, it is decreed to exist.
    • 0:22:23The name of the function, of course, is get_string.
    • 0:22:25Inside of the parentheses are its arguments.
    • 0:22:28In this case, there's one argument to get_string, I claim today,
    • 0:22:31but you've known this implicitly.
    • 0:22:33And it's a prompt.
    • 0:22:34It's the prompt that the human sees when you use get_string.
    • 0:22:36What is that prompt?
    • 0:22:37Well, it's a string of text, like quote unquote, "what's your name?"
    • 0:22:41or anything else that I asked last week.
    • 0:22:43Meanwhile, get_string, as we know from last week, has a return value.
    • 0:22:46It returns something to you.
    • 0:22:48And that, too, is a string.
    • 0:22:49So again, this is also called a function's prototype.
    • 0:22:52It's the thing toward the end of last week
    • 0:22:53that I just copied and pasted from the bottom of my file to the top,
    • 0:22:57just so that it was like this teaser for clang as to what would exist later.
    • 0:23:02So you can think, then, of these include lines as just kind of combining all
    • 0:23:07of those function declarations in some separate file called cs50.h,
    • 0:23:11so that you yourself don't have to type them every time you use the library--
    • 0:23:14or worse, so that you, yourself, don't have to copy and paste those lines.
    • 0:23:18This is what clang is doing for you in its first step of preprocessing.
    • 0:23:22Second, and last in this example, what happens when clang preprocesses
    • 0:23:27this second include line?
    • 0:23:29Well, the only other function we care about in this story
    • 0:23:31is printf, of course, which comes with C.
    • 0:23:33So essentially, you can think of printf's prototype or declaration
    • 0:23:39as just being this.
    • 0:23:40printf is the name of the function.
    • 0:23:42It takes a string that you want to format like, Hello comma world,
    • 0:23:47or Hello comma %s.
    • 0:23:49And then with dot, dot, dot, this actually has technical meaning.
    • 0:23:52It means, of course, that you can plug in 0 variables, 1 variable, 2
    • 0:23:55or 10.
    • 0:23:56So dot, dot, dot means some number of variables.
    • 0:23:58Now we haven't talked about this yet.
    • 0:24:00And we won't really, in general.
    • 0:24:01printf actually returns a value, a number, that is an integer.
    • 0:24:05But more on that perhaps another time.
    • 0:24:07It's generally not something the programmer tends to look at.
    • 0:24:10But that's all we mean by preprocessing, so that at the end of this process,
    • 0:24:14even though there's more lines of code in cs50.h and stdio.h,
    • 0:24:18what's really just happening is that clang, in preprocessing
    • 0:24:21the file, copies and pastes the contents of those files into your code
    • 0:24:25so that now your code knows about everything-- get_string, printf,
    • 0:24:29and anything else.
    • 0:24:31Any questions, then, on that first step, preprocessing?
    • 0:24:35Yes?
    • 0:24:35AUDIENCE: [INAUDIBLE]
    • 0:24:49DAVID MALAN: Good question.
    • 0:24:50When you include a file, does it only include what
    • 0:24:52you need or does it include everything?
    • 0:24:54Think of it as including everything.
    • 0:24:56So if it's a big file, that's a lot of code at the very top.
    • 0:24:59And that's why, if you think back to all of the zeros and ones
    • 0:25:01I showed a little bit ago, as well as last week,
    • 0:25:03there's a lot of zeros and ones that end up
    • 0:25:06on the screen as a result of just writing, Hello, world.
    • 0:25:08A lot of those zeros and ones are perhaps
    • 0:25:10coming from code that you didn't actually, necessarily need.
    • 0:25:13But some of it is perhaps there, but there
    • 0:25:15are ways to optimize that as well.
    • 0:25:17All right, so step two of compiling is, confusingly, called compiling.
    • 0:25:22It's just, this is the term that most everyone uses
    • 0:25:24to describe the whole process, instead of just this one step.
    • 0:25:27But once a program has been preprocessed behind the scenes
    • 0:25:32by the compiler for you, it looks now a little something like this.
    • 0:25:35And I've put dot, dot, dot just to imply that, yes, to your question,
    • 0:25:38there's more stuff above it.
    • 0:25:39There's more stuff below it.
    • 0:25:40It's just not interesting right now for us.
    • 0:25:43So now we have just C code.
    • 0:25:44There's no more preprocessor directives.
    • 0:25:46At this point, all of the hash symbols and those lines of code
    • 0:25:49have been preprocessed and converted to something else.
    • 0:25:52And so now-- and this is where things get a little spooky looking.
    • 0:25:56Here now is what happens when clang, or any compiler,
    • 0:26:00literally compiles code like this.
    • 0:26:03It converts it from this in C to this in assembly code.
    • 0:26:08So this is among the scarier languages.
    • 0:26:10I, myself, don't really have fond memories.
    • 0:26:12This is not a language that many people program in.
    • 0:26:14If you take a subsequent class in computer science,
    • 0:26:16in systems, a higher level class, you might actually
    • 0:26:19learn this or some variant thereof.
    • 0:26:21But there's at least a few people out there
    • 0:26:23that need to know this stuff because this
    • 0:26:24is closer to what the computers themselves, nowadays, understand.
    • 0:26:29The Intel CPUs or the AMD CPUs, the brains of today's computers and phones
    • 0:26:34understand stuff that looks more like this and less like C.
    • 0:26:37Now it's completely esoteric, but let me just highlight a few phrases.
    • 0:26:42There's some stuff that's a little familiar.
    • 0:26:44There is mention of main at the top there in yellow.
    • 0:26:47There is mention of getString toward the bottom.
    • 0:26:49There is mention of printf down below.
    • 0:26:52So this is just another programming language called assembly language,
    • 0:26:55that decades ago, humans--
    • 0:26:57myself included in school--
    • 0:26:58did write code in.
    • 0:27:00And absolutely, some people still write this code,
    • 0:27:02especially since you can write very, very efficient code.
    • 0:27:06But it's a lot more arcane.
    • 0:27:08It's a lot less user friendly.
    • 0:27:11So you'll see in yellow now, these are the so-called instructions
    • 0:27:14that a computer's brain or CPU understands, pushing values
    • 0:27:18around, moving them, subtracting values, calling functions, and move, move,
    • 0:27:23move.
    • 0:27:24So really, the low-level operations that computers understand
    • 0:27:27tend to be arithmetic operations-- subtraction, addition,
    • 0:27:31and the like-- moving things in and out of memory.
    • 0:27:34It's just a lot more tedious for folks like us to write code like this.
    • 0:27:37This is why you and I tend to write stuff like this.
    • 0:27:40And ideally, still, people like you and I tend to drag and drop puzzle pieces
    • 0:27:44that sort of abstract all of that away further.
    • 0:27:46But for now, this is, again, called assembly language.
    • 0:27:49It is what happens when the compiler literally compiles your code.
    • 0:27:54But of course, this, still not zeros and ones.
    • 0:27:57So we got two steps to go.
    • 0:27:58So when a compiler proceeds to step three,
    • 0:28:02this is where things get converted to machine code.
    • 0:28:05And when a compiler assembles your code for you,
    • 0:28:08it converts what we just saw on the screen here to actual zeros and ones--
    • 0:28:14the so-called machine code that your phone or your computer understands.
    • 0:28:18But it's worth noting that these are not necessarily all
    • 0:28:22of the zeros and ones of your program.
    • 0:28:24Yes, they are the zeros and ones that correspond to your Hello program
    • 0:28:29or printf and getString and the like, but notice
    • 0:28:33that here, we need one final step.
    • 0:28:36In those zeros and ones are only your lines of code.
    • 0:28:40But what about CS50's lines of code that we wrote to implement getString?
    • 0:28:43What about the lines of code that humans wrote decades ago to implement printf?
    • 0:28:46Those are somewhere on this hard drive, like on my Mac, my PC,
    • 0:28:50or somewhere in the cloud, but we need to combine all of those zeros and ones
    • 0:28:54together and link my code with CS50's code with standard I/O's code,
    • 0:29:01all together.
    • 0:29:02And so what happens in the last step, ultimately,
    • 0:29:05is that if we have my code here in yellow,
    • 0:29:07and then the code that CS50 wrote, and the code that the authors of C
    • 0:29:11itself wrote, what really is happening is that somewhere, we have not only
    • 0:29:15hello.c, which, obviously, I wrote, and wrote with us live here,
    • 0:29:19there's also, let's assume, somewhere on the computer, a cs50.c file
    • 0:29:24that, coincidentally, I and CS50 staff wrote years ago.
    • 0:29:28And also, somewhere on the computer, there's another file.
    • 0:29:30Let me oversimplify by just calling it stdio.c.
    • 0:29:34In practice, it's probably specifically called printf.c.
    • 0:29:36But they're somewhere, these two other files.
    • 0:29:39And so this last step called linking takes my zeros and ones
    • 0:29:44from the code I just wrote, namely this code on the screen here.
    • 0:29:48It then grabs the zeros and ones that CS50 wrote.
    • 0:29:50And it grabs the zeros and ones that the authors of C wrote,
    • 0:29:53in order to implement the standard I/O library.
    • 0:29:56And lastly, voila, links them all together.
    • 0:30:00And this is the same blob of zeros and ones that we saw earlier.
    • 0:30:03It's just now the result of preprocessing your code,
    • 0:30:08compiling your code, assembling your code, linking your code, and my God,
    • 0:30:12at this point, like if there were any fun in programming for you yet,
    • 0:30:15we've just taken it all away, we just call this whole process compiling.
    • 0:30:19Why?
    • 0:30:20Because now that we know those steps exist--
    • 0:30:22and smart people solve that problem for us--
    • 0:30:25you and I can kind of operate at this level of abstraction
    • 0:30:27and just assume that compiling converts source code to machine code.
    • 0:30:32Questions, though, on any of these intermediate steps?
    • 0:30:36Yeah?
    • 0:30:37AUDIENCE: For linking, are different parts, like [INAUDIBLE]?
    • 0:30:50DAVID MALAN: A good question.
    • 0:30:51So where are all of these zeros and ones stored?
    • 0:30:53Because you and I, we've been using a browser, right? code.cs50.io,
    • 0:30:56of course, is this web-based user interface.
    • 0:30:58But again, recall from last week, even though you're
    • 0:31:00using a web browser to access VS Code, that web-based version of VS Code
    • 0:31:05is connected to an actual server somewhere in the cloud.
    • 0:31:09And on that server, you have your own account and your own files, and really,
    • 0:31:13your own hard drive, virtually in the cloud.
    • 0:31:15Think of it a little like Dropbox or Box or Google Drive or OneDrive
    • 0:31:18or something like that.
    • 0:31:19So you have a hard drive somewhere out there that we've provisioned for you.
    • 0:31:23And it's on that hard drive that you have your code that you just wrote,
    • 0:31:27or I just wrote, cs50.c, stdio.c, and all of the other code
    • 0:31:32that implements the math functions and everything else that C supports.
    • 0:31:36Good question.
    • 0:31:37Yeah?
    • 0:31:38AUDIENCE: So, say in the CS50 library, the line [INAUDIBLE]
    • 0:31:45do we do the same exact thing [INAUDIBLE]
    • 0:31:49copy paste them all the way over?
    • 0:31:51DAVID MALAN: Good question.
    • 0:31:53That hash include cs50.h line at the top of my code.
    • 0:31:57If I just replace that with the contents of cs50.c, would that work?
    • 0:32:01Short answer, yes, that would work.
    • 0:32:03You could copy all of the code there.
    • 0:32:05However, there's some order of operations that might come into play.
    • 0:32:08And so it's probably not quite as simple as copy, paste.
    • 0:32:10But conceptually, yes, that's what's happening.
    • 0:32:13Now with that said, in cs50.h, are only the prototypes of the functions,
    • 0:32:19the hints as to how the functions look, what their return type is,
    • 0:32:23what their name is, and what their arguments are.
    • 0:32:25It's in the dot c file that actual code tends to be written.
    • 0:32:29And this is a little confusing now because you and I have only
    • 0:32:32written code in dot c files.
    • 0:32:33But in the next few weeks, you'll actually
    • 0:32:35start writing some of your own dot h files
    • 0:32:37as well, just like CS50, just like standard I/O.
    • 0:32:40But in essence, that line of code just makes it easier to use and reuse
    • 0:32:44code that's already been written.
    • 0:32:46And that's the whole point of a library.
    • 0:32:47AUDIENCE: Does linking them [INAUDIBLE]?
    • 0:32:50DAVID MALAN: Say that a little louder.
    • 0:32:51AUDIENCE: Does linking happen when you use the compiler?
    • 0:32:54DAVID MALAN: Yes.
    • 0:32:55Does linking happen when you compile your code?
    • 0:32:56Yes.
    • 0:32:57When you run make, as we have been doing the past week now,
    • 0:33:02all four of these steps are happening.
    • 0:33:04Preprocessing converts the hash include lines to something else.
    • 0:33:07Compiling technically converts it to assembly
    • 0:33:10code, which the Mac, the PC, the server more closely understands.
    • 0:33:14Assembling converts that language to binary machine code that this computer
    • 0:33:18actually understands.
    • 0:33:20And then linking combines everything together.
    • 0:33:22And in fact, if you think back a few minutes ago to when I did this -lcs50,
    • 0:33:27the reason I had to add that, and the reason
    • 0:33:30my code did not compile at first, was because I
    • 0:33:32forgot to tell clang to link in CS50's zeros and ones per that last step.
    • 0:33:38I don't need to do -lstdio because it comes with C,
    • 0:33:42so that would just be tedious for everyone in the world.
    • 0:33:44But CS50 does not come with C, so we link that in.
    • 0:33:47And to be clear, too, we won't always use CS50's library.
    • 0:33:49That'll be yet another pair of training wheels we take off in the coming weeks.
    • 0:33:53But for now, it makes a few things simpler.
    • 0:33:55Yeah?
    • 0:33:57AUDIENCE: What is the [INAUDIBLE]?
    • 0:34:08DAVID MALAN: Short answer, yes.
    • 0:34:10So what do the zeros and ones, the machine code, translate to?
    • 0:34:12Yes, there is a one-to-one relationship between the machine
    • 0:34:15code and the assembly code.
    • 0:34:17Assembly code, it's not really English, but at least it's symbols I recognize.
    • 0:34:21It's not zeros and ones.
    • 0:34:22Machine code, of course, is just zeros and ones.
    • 0:34:24So back in the day, before C existed, people
    • 0:34:27were programming only in assembly code.
    • 0:34:30Before assembly code existed, people were coding in zeros and ones.
    • 0:34:34And you can imagine just how painful that was,
    • 0:34:36and so each of these languages makes life, for us,
    • 0:34:39sort of easier and easier.
    • 0:34:40In a few weeks, we'll transition to Python, which
    • 0:34:42will, in turn, make C even simpler--
    • 0:34:45or coding, in general, simpler to do too.
    • 0:34:48All right, so with that said, what now can we--
    • 0:34:53what could go wrong with this?
    • 0:34:55Well, it turns out that besides compiling, technically speaking,
    • 0:34:58there's decompiling.
    • 0:34:59And we've not done this, and we won't do this.
    • 0:35:01But it's worth considering for just a moment.
    • 0:35:04If you were to not compile your code, but decompile it--
    • 0:35:07as the word suggests, this just means reversing the process, converting it,
    • 0:35:11ideally, from machine code-- zeros and ones--
    • 0:35:14maybe back to C. Now this would be cool, perhaps, if all you have is a program,
    • 0:35:19you can convert it and see the actual source code.
    • 0:35:22What might a downside be, if anyone on the internet
    • 0:35:25is able to decompile code on their machine?
    • 0:35:28Yeah?
    • 0:35:29AUDIENCE: [INAUDIBLE]
    • 0:35:30DAVID MALAN: OK, so it's easier to find bugs in the code that--
    • 0:35:34oh, to exploit.
    • 0:35:35So it might be easier to hack into the software
    • 0:35:38by finding mistakes you and I made because, literally, they're
    • 0:35:41staring at you in code, whereas the zeros and ones make
    • 0:35:43it way less obvious.
    • 0:35:45Other downsides of what I called decompiling?
    • 0:35:48Yeah?
    • 0:35:49AUDIENCE: If stuff is copyrighted or you don't even know how to get it--
    • 0:35:53DAVID MALAN: Yeah.
    • 0:35:54AUDIENCE: [INAUDIBLE]
    • 0:35:55DAVID MALAN: Yeah, if your code, your work,
    • 0:35:57is your intellectual property, copyrighted or otherwise, that's
    • 0:36:00kind of obnoxious that someone can just run a command, and boom,
    • 0:36:03they can see the original code that you wrote.
    • 0:36:05Now, it turns out it's not quite as simple as that.
    • 0:36:08And so even though, yes, you could take a program like Hello,
    • 0:36:11or even Microsoft Word, and convert it from zeros and ones
    • 0:36:15back to some form of source code-- be it in C or Java
    • 0:36:19or Python or something else, whatever it was originally written in-- odds
    • 0:36:22are it's going to be an utter mess to look at.
    • 0:36:25Why?
    • 0:36:26Because things like variable names are not retained in the zeros and ones,
    • 0:36:30typically.
    • 0:36:30Function names might not be retained in the zeros and ones.
    • 0:36:33The code is, the logic is, but the computer
    • 0:36:36doesn't care what pretty variables you chose
    • 0:36:38and how nicely named your functions were, it just
    • 0:36:41needs to know them as zeros and ones.
    • 0:36:42Moreover, if you think about last week, we introduced things like loops in C.
    • 0:36:46And besides for loops, there's what other kind of loop, for instance?
    • 0:36:49AUDIENCE: [INAUDIBLE]
    • 0:36:50DAVID MALAN: So, a while loop-- and even though they look different
    • 0:36:53and you have to write different code, they achieve exactly
    • 0:36:55the same functionality, which is to say, when you compile a for loop
    • 0:36:59or you compile a while loop, if they logically do the same thing,
    • 0:37:04they might end up looking identical as zeros and ones.
    • 0:37:07And so, therefore, it's not necessarily predictable
    • 0:37:09that you'll get back the original code, why?
    • 0:37:11Because the zeros and ones might not know, so to speak,
    • 0:37:15whether it was a for loop or a while loop,
    • 0:37:16so maybe decompiling will show you one or the other.
    • 0:37:19And honestly, decompiling, while possible-- and it's
    • 0:37:21one way of reverse engineering someone's product.
    • 0:37:24Odds are, if you're good enough to start reading code that's been decompiled
    • 0:37:28and reading through the messiness of it, odds are you
    • 0:37:30have the talent probably to just write that same program from scratch
    • 0:37:34yourself.
    • 0:37:34Now, that's an overstatement, perhaps, but it's not
    • 0:37:36quite as easy or threatening as you might first think.
    • 0:37:40So in general, once code is compiled, it's
    • 0:37:43pretty challenging, time consuming, costly to reverse engineer it, much
    • 0:37:48like it would be in the real world, right?
    • 0:37:50Like all of us have some kind of phone, probably, nowadays in our pocket.
    • 0:37:52There's nothing stopping you from opening it up somehow,
    • 0:37:55poking around, recreating what's there.
    • 0:37:57That's a huge amount of effort, most likely.
    • 0:37:59And at that point, maybe you should just invent the phone, instead
    • 0:38:01of trying to reverse engineer it.
    • 0:38:03So same kind of idea in the physical world.
    • 0:38:06Any questions, then, on compiling, or even decompiling in these forms?
    • 0:38:13All right, so odds are, at this point, not only I, but you have made mistakes.
    • 0:38:17And you've written buggy code--
    • 0:38:19a bug in a code is just a mistake, a logical error
    • 0:38:22or otherwise, where the code just does not behave correctly as you intend.
    • 0:38:26And up until now, odds are, your debugging techniques
    • 0:38:29have been to maybe look back at what I did in class, maybe
    • 0:38:32ask a question online or in-person.
    • 0:38:35But ultimately, it'd be nice if you had some tools of your own
    • 0:38:38with which to debug code.
    • 0:38:39And this, honestly, is a lifelong skill.
    • 0:38:41You're not going to emerge from CS50--
    • 0:38:43and even 20 years from now, you're not going
    • 0:38:44to be writing-- if you're writing code at all-- correct code all of the time.
    • 0:38:47Like, all of us on the staff continue to write bugs.
    • 0:38:50Hopefully, they get a little more sophisticated, and not sort of like,
    • 0:38:54oops, I missed a semicolon.
    • 0:38:55But even those kinds of mistakes, we make too.
    • 0:38:57But there's tools out there and techniques
    • 0:39:00that can make your life easier when it comes to solving those problems.
    • 0:39:03Now, the term bug has actually been around for decades.
    • 0:39:06But a fun story to tell is that the first documented actual bug was
    • 0:39:11actually somehow connected to Harvard.
    • 0:39:13In fact, this is the logbook relating to the Harvard Mark II computer
    • 0:39:18from 1947, whereby if you read the notes here-- and I'll zoom in-- this
    • 0:39:22was an actual moth discovered inside of this big mainframe computer that
    • 0:39:27was causing some kind of problems.
    • 0:39:29And the engineers there at the time actually
    • 0:39:30thought it was funny that, wow, a physical bug actually explains the issue.
    • 0:39:33And it's been forever taped to this sheet of paper, which I believe
    • 0:39:36now is on display in the Smithsonian.
    • 0:39:39With that said, this is just representative, too, of a logical bug.
    • 0:39:43And that story is actually--
    • 0:39:45that story was often retold by a famous mathematician, then computer scientist
    • 0:39:49really, Dr. Grace Hopper, who actually worked not only on the Harvard Mark II
    • 0:39:53computer, but its predecessor, the Harvard Mark I.
    • 0:39:57And if you ever spent time, yet, in the engineering building across the river
    • 0:40:01here, you can actually see much of this computer, which
    • 0:40:04is along the wall when you first walk into the Science and Engineering
    • 0:40:07Complex.
    • 0:40:07And indeed, as you've probably heard growing up,
    • 0:40:09this is a mainframe computer.
    • 0:40:11This is what Macs and PCs, so to speak, looked like back in the day,
    • 0:40:15with very physical things that essentially implemented the zeros
    • 0:40:18and ones that you and I take for granted now being miniaturized in our laptops
    • 0:40:21and phones.
    • 0:40:22So there's a piece of history there.
    • 0:40:23If you visit that side of campus sometime, do take a look.
    • 0:40:27But let's consider, then, how we solve not, of course, physical bugs,
    • 0:40:30but logical bugs.
    • 0:40:31And let's consider something like this from last week,
    • 0:40:33whereby, we were trying very simply to print like this column of three bricks
    • 0:40:38using hashtags of sorts.
    • 0:40:40So let me go over here in just a moment to VS Code.
    • 0:40:44And I'm going to go ahead and open a program I wrote in advance.
    • 0:40:47And I'm bringing it to class because there's a bug in it,
    • 0:40:49and I'd like to figure out how to solve this bug.
    • 0:40:51So let me open up a buggy0.c, which is version 0 of my code.
    • 0:40:56And let's just take a quick peek at what's here.
    • 0:40:58It's pretty short.
    • 0:40:58It includes only stdio.h, it uses printf, it uses a for loop,
    • 0:41:03and the goal, quite simply, is to print out that column of three bricks.
    • 0:41:07Now, it's short enough that some of you, if you're getting comfy already with C,
    • 0:41:11you might already see the logical bug.
    • 0:41:13It's not a syntax error, like it will compile and run.
    • 0:41:16But there's a bug there.
    • 0:41:17And suppose that I'm very new to C, I'm very uncomfortable with C, it's 2:00 AM
    • 0:41:22and I just can't see the bug, what are my recourses here for actually
    • 0:41:26finding a mistake like this?
    • 0:41:27Well, first, let's look at the symptom.
    • 0:41:29Let me go down to my terminal window.
    • 0:41:31I'm going to use make buggy0 because, again, the file is called buggy0.c.
    • 0:41:36I'm not going to use clang.
    • 0:41:37In fact, I'm never really going to use clang manually here on out.
    • 0:41:39I'm just going to use make because it makes our lives easier.
    • 0:41:42It does compile.
    • 0:41:43No errors, so it's not syntax.
    • 0:41:45It's not something silly like a missing semicolon.
    • 0:41:47But when I run ./buggy0, I, of course, see one, two, three, four--
    • 0:41:53and this, of course, does not match the one, two, three bricks that I actually
    • 0:41:57intended for that column.
    • 0:41:59And yet, I'm starting counting at 0, as I usually do.
    • 0:42:02I've got three.
    • 0:42:03I'm going up to three.
    • 0:42:05So where is my logical error?
    • 0:42:06If it hasn't obviously jumped out at you already, well, how can I solve this?
    • 0:42:10Well, first and foremost, perhaps the best technique
    • 0:42:13for solving bugs, at least early on, is just use printf.
    • 0:42:16Like thus far, we've used printf to say, Hello, and other things on the screen.
    • 0:42:20But printf is just a function for printing anything.
    • 0:42:22And there's no reason you can't temporarily
    • 0:42:24use printf to print out the contents of variables,
    • 0:42:27what's going on inside of your program, just
    • 0:42:29to figure out where your mistake is.
    • 0:42:31And then you can delete that line of code later.
    • 0:42:32It doesn't have to stay there forever.
    • 0:42:34So let me do this.
    • 0:42:35Instead of just printing out in VS Code the hash symbol,
    • 0:42:39let me do a little safety check here and print out the value of i.
    • 0:42:45So let me go ahead and say something like, i is--
    • 0:42:49now I want to say i is this.
    • 0:42:51But, of course, this is not how I print out the value of i.
    • 0:42:54If I want to print out the value of i, what should I put here?
    • 0:42:58So %i for integer, instead of %s for string.
    • 0:43:02So they're still placeholders.
    • 0:43:03But we use %i for integers.
    • 0:43:04And now if I want to print out i, I just need the comma as the second argument,
    • 0:43:08and then i.
    • 0:43:09All right, let me go ahead and back to my terminal window.
    • 0:43:13Let me recompile the program because I've changed it.
    • 0:43:15That still works fine, ./buggy0.
    • 0:43:18And now, let me increase the size of my terminal window here.
    • 0:43:22You just see some diagnostic information, if you will.
    • 0:43:25This is not the goal.
    • 0:43:26This is not what you should be submitting for this homework problem,
    • 0:43:29were it one.
    • 0:43:30But it is helping us diagnostically know that, OK, when i is zero,
    • 0:43:33here's a hash.
    • 0:43:34When i is 1, here's a hash.
    • 0:43:36When i is two, here's a hash.
    • 0:43:37When i is 3, here's a hash.
    • 0:43:39Well, wait a minute.
    • 0:43:39That's one, two, three, four.
    • 0:43:41So clearly, I'm printing it one too many times.
    • 0:43:44So let me look back at the code here by shrinking my terminal window.
    • 0:43:48And let me just ask the group, where is, in fact, the mistake?
    • 0:43:53Or what, equivalently, would be the solution?
    • 0:43:56Yeah, in the middle.
    • 0:43:57AUDIENCE: [INAUDIBLE]
    • 0:44:00DAVID MALAN: Yeah, instead of less than or equal to, use just less than.
    • 0:44:03So you've got to kind of pick a lane here.
    • 0:44:05If you're going to start counting from 0, you generally use less than,
    • 0:44:08and go up to, but not through the value.
    • 0:44:10Or if you prefer, like in the human world, counting from 1 on up,
    • 0:44:13you can use less than or equal to, but you have to be consistent.
    • 0:44:17And in general, as a programmer, just always start
    • 0:44:19counting from 0 if you're doing something canonical like this.
    • 0:44:22But the solution is, indeed, just to change this
    • 0:44:25by changing the less than or equal to to just less than.
    • 0:44:27If I recompile this program with make buggy0, and then do ./buggy0 again--
    • 0:44:34and let me increase the size of my terminal window.
    • 0:44:36Now, you see, OK, almost the same output.
    • 0:44:39But indeed, i starts at 0 and goes up to, but not through, three.
    • 0:44:44All right, so printf, in short, can be your first diagnostic tool.
    • 0:44:48Instead of just staring at the screen or raising your hand--
    • 0:44:51I mean, use printf to see, literally, what's going on inside of your program
    • 0:44:55by just printing out things of interest.
    • 0:44:57And then once you've solved the problem, you
    • 0:44:59can go back into your code, as I'll do here, by shrinking my terminal window.
    • 0:45:02I'll delete the printf line.
    • 0:45:04And now I'm ready to share this program with the world
    • 0:45:07or submit it as homework or the like.
    • 0:45:08It's just meant there to be temporary.
    • 0:45:11Any questions on printf as a debugging tool?
    • 0:45:18No?
    • 0:45:18All right, well, that only gets us so far.
    • 0:45:20And honestly, as your programs grow and grow and grow,
    • 0:45:23it's going to actually get really annoying
    • 0:45:25to start going in and adding printf's, then removing them, and figuring out,
    • 0:45:28if you've got multiple printf's, well, which one printed what?
    • 0:45:31It just gets messy, eventually, to rely on printf alone.
    • 0:45:34So being a computer scientist, computer scientists
    • 0:45:37have written software to make it easier to debug code.
    • 0:45:41That software is what we would generally call a debugger, which
    • 0:45:44would be the second tool of the trade that you can use to actually solve
    • 0:45:47problems in your code.
    • 0:45:48Now, in the world of VS Code, there's actually a debugger built in.
    • 0:45:52So the graphical user interface you're about to see
    • 0:45:54in VS Code isn't specific to CS50, it actually comes with VS Code.
    • 0:45:58And it supports C, and C++, and Java, and Python,
    • 0:46:01and lots of other languages too.
    • 0:46:03But it's, admittedly, a little complicated
    • 0:46:05to just start using the debugger.
    • 0:46:07You have to create a configuration file and do
    • 0:46:10some annoying steps that just get in the way of solving real problems.
    • 0:46:13So we have automated the process for you of just starting the debugger.
    • 0:46:17And thereafter, it's sort of industry standard how you use it.
    • 0:46:19But we save you the headache of having to create those configuration files.
    • 0:46:23So, suppose I want to do this.
    • 0:46:25Suppose I want to try to debug this program
    • 0:46:27step by step using special software.
    • 0:46:30Well, how can I do that?
    • 0:46:31Well, let me propose that if I revert this back to the original version
    • 0:46:36where i was less than or equal to 3, I'm pretty sure that I
    • 0:46:40was printing too many hashes.
    • 0:46:41So I'm going to do this-- and you might have done this
    • 0:46:43accidentally or never at all.
    • 0:46:45But notice if you hover over the gutter, so to speak, in VS Code, the part of it
    • 0:46:49all the way to the left of the editor, you see this sort of grayed
    • 0:46:52out red dot.
    • 0:46:54If you click there, it becomes a brighter red dot.
    • 0:46:57And this represents what we're going to call a breakpoint.
    • 0:46:59And this is just a visual indicator that you've put like a stop sign equivalent
    • 0:47:03there, and you're telling the debugger in a moment, stop
    • 0:47:06running my code there.
    • 0:47:07Why?
    • 0:47:07Because I prefer to step through my code at sort of a human speed,
    • 0:47:11and not at computer speed where it runs all at once.
    • 0:47:14So I've set my breakpoint, which is step one.
    • 0:47:16And then step two is quite simply this.
    • 0:47:18Instead of running the program itself, run the command called debug50,
    • 0:47:23and then ./buggy0.
    • 0:47:26And now this will start your program, but inside
    • 0:47:29of the debugger, which is a special program
    • 0:47:31that smart people wrote that will empower
    • 0:47:33you to now step through your code line by line, and again, at your own comfort
    • 0:47:38pace.
    • 0:47:38I'm going to hit Enter, some stuff's going to happen on the screen-- whoops.
    • 0:47:43Notice, this is a common mistake that I made accidentally here.
    • 0:47:45Looks like I've changed my code.
    • 0:47:47I did because I went in and changed the less than or equal to sign.
    • 0:47:49So let me go ahead and rerun make buggy0--
    • 0:47:52Enter.
    • 0:47:53Good, now let me rerun debug50--
    • 0:47:55Enter.
    • 0:47:57And now some stuff just happened on the screen
    • 0:47:59and it takes a moment to get started, but once it's started,
    • 0:48:03you'll still see your code.
    • 0:48:06But you'll see this yellow highlight, which you've probably not seen before.
    • 0:48:09And notice that it's specifically highlighting the same line
    • 0:48:11that I set a breakpoint on.
    • 0:48:13Why?
    • 0:48:13That just means the debugger has executed all of these lines,
    • 0:48:18except for line 7.
    • 0:48:20It has broken at-- not in a bad way.
    • 0:48:23But it has paused execution on line 7, so it hasn't yet printed any hashes.
    • 0:48:27And you can see that-- no hashes in the terminal window yet.
    • 0:48:30It's paused execution.
    • 0:48:31But what's interesting with the debugger is the stuff
    • 0:48:35over here on the left-hand side.
    • 0:48:37In the debugger here, you'll see, under variables,
    • 0:48:39all of your so-called local variables.
    • 0:48:41And we haven't really made a distinction between local
    • 0:48:44and something called global.
    • 0:48:45But for now, local variables just means all of the variables
    • 0:48:48that exist in your function.
    • 0:48:49So i currently has a value of 0.
    • 0:48:52OK, and that makes sense.
    • 0:48:53So now, how do I step through my code and see what it's doing?
    • 0:48:57Well, at the top of the screen here, you'll
    • 0:48:59see some playback icons, kind of like a video player,
    • 0:49:02but they have special meaning.
    • 0:49:03This first one will just play the rest of your program all the way to the end.
    • 0:49:07So you only click that if you've sort of solved the problem
    • 0:49:10and you just want to run it to completion like before.
    • 0:49:13But the next three--
    • 0:49:14or next two, really, are really the juiciest.
    • 0:49:16The second one here, if you hover over it, eventually,
    • 0:49:19you'll see that it's called Step Over.
    • 0:49:21Step Over means that the debugger will run
    • 0:49:25this currently highlighted line of code, but it's not going to dive into it.
    • 0:49:28So if it's a function like printf, it's not
    • 0:49:30going to start stepping through printf line by line.
    • 0:49:32Why?
    • 0:49:33Because I can pretty much assume printf, written decades ago, is correct.
    • 0:49:36Problem's probably with me.
    • 0:49:38But with this next one, if I did really want to step into the printf code
    • 0:49:42to figure out how it works or find some problem in it all these years later,
    • 0:49:46you can step into printf, and then the screen would change,
    • 0:49:48and you'd see each of the lines for printf,
    • 0:49:50line by line-- at least if you have the source code for printf installed.
    • 0:49:54All right, I'm going to use the first of those, Step Over.
    • 0:49:56And watch as the yellow highlight moves.
    • 0:49:59And watch as, in the terminal window, there's a hash symbol.
    • 0:50:03Here we go.
    • 0:50:03There's one hash.
    • 0:50:05Now, notice line 5 is highlighted.
    • 0:50:07That means it has paused on line 5.
    • 0:50:09Line 5 has not yet been executed.
    • 0:50:11So what does that mean?
    • 0:50:12The value of i, per the top left-hand corner, is still 0.
    • 0:50:16But as soon as I click Step Over again, watch
    • 0:50:18what happens at the top left, where i is a variable on the screen.
    • 0:50:24Now i-- and it flashed briefly--
    • 0:50:26has a value of 1.
    • 0:50:27And now if I step over again, watch the terminal window.
    • 0:50:30There's my second hash.
    • 0:50:32Now, let me click Step Over on the for loop; watch the variable at top left.
    • 0:50:36Now 1 goes to 2.
    • 0:50:38Now let me click it again.
    • 0:50:39Third hash-- and here's where the logical error is perhaps revealed.
    • 0:50:43Let me go ahead and step over the loop.
    • 0:50:45Now i is 3.
    • 0:50:46Wait a minute, I'm still going to print out a hash.
    • 0:50:49There it is.
    • 0:50:49There's the fourth hash.
    • 0:50:50And at this point, hopefully, the light bulb, proverbially, has gone off.
    • 0:50:53I realize, oh, I screwed up.
    • 0:50:55I can either stop the program altogether with the red square,
    • 0:50:58or I can just let it run all the way to the end, which
    • 0:51:01just terminates everything.
    • 0:51:02At this point, I just want to get back into my code and start fixing things.
    • 0:51:05And you can close, for instance, as I will here,
    • 0:51:07the File Explorer, just to hide the panel that opened.
    • 0:51:10So that's debug50.
    • 0:51:12But the debugger itself is not a CS50 thing; debug50 just starts the debugger for you, which
    • 0:51:15is something you'd find in most any programming environment nowadays.
    • 0:51:19Questions on debugging?
    • 0:51:23Questions?
    • 0:51:24Yeah?
    • 0:51:24AUDIENCE: Where does it tell you where it went wrong?
    • 0:51:27DAVID MALAN: Good question.
    • 0:51:28Where does it tell you where it went wrong?
    • 0:51:30So, sadly, it does not tell you any of that.
    • 0:51:33The onus is still on you, the human, to use this tool productively to walk
    • 0:51:37through your code at a saner pace.
    • 0:51:39But your brain is the one that still needs to solve it.
    • 0:51:42And I don't doubt, down the line, with artificial intelligence and more,
    • 0:51:45programs like this will get all the more helpful,
    • 0:51:47and start answering questions like that for us.
    • 0:51:49And there are other tools we'll introduce you to this semester
    • 0:51:51that are even more powerful than this.
    • 0:51:52But for now, it's just a tool, really, to slow things down and not
    • 0:51:56have to change your code.
    • 0:51:57The fact that I had that panel on the left that just showed me i's changing
    • 0:52:01value is just an alternative to printf, and I can
    • 0:52:04step through it a little more slowly.
    • 0:52:06Other questions on debugging?
    • 0:52:10No?
    • 0:52:11Let me show you one final example with this debugger here.
    • 0:52:14And this one, too, I wrote in advance.
    • 0:52:16Let me close buggy0.c.
    • 0:52:18And let me open up buggy1.c, my second version thereof.
    • 0:52:22Let me close my terminal window for a second
    • 0:52:24and give you a quick tour of this program, which
    • 0:52:26similarly, has a mistake.
    • 0:52:28Now, at the top of this program, some familiar includes, cs50.h and stdio.h.
    • 0:52:32This is not something we've seen before.
    • 0:52:34It's specific to this example--
    • 0:52:36a function called get_negative_int.
    • 0:52:38Takes no arguments, and it returns an integer.
    • 0:52:41What does it do?
    • 0:52:41It literally gets a negative integer, ideally, from the user.
    • 0:52:45Fun fact, though, it doesn't do so correctly.
    • 0:52:47That's the bug. get_negative_int is broken at the moment.
    • 0:52:50So what does main do?
    • 0:52:51Well, main just calls this function, passing in nothing
    • 0:52:54in parentheses, no inputs.
    • 0:52:55And it stores the return value in i.
    • 0:52:58And then it just prints out i on the screen.
    • 0:53:00So honestly, just by eyeballing this, I feel comfortable enough
    • 0:53:03with programming in C, I think main is correct.
    • 0:53:06Let me just stipulate, main is correct.
    • 0:53:07But there is going to be a bug down here.
    • 0:53:09Now, what's the bug down here?
    • 0:53:11Well, let me look at get_negative_int's implementation.
    • 0:53:14Notice, this first line, 12, is identical to the prototype up here.
    • 0:53:18The prototype is sort of stupidly required up here
    • 0:53:22because C reads things top to bottom, left to right--
    • 0:53:25the compiler technically does.
    • 0:53:26So if you reference get_negative_int here, but you
    • 0:53:29don't implement it until down here, and you haven't told C in advance
    • 0:53:33that it will exist, again, you get the error we saw last week.
    • 0:53:36All right, so how does get_negative_int work?
    • 0:53:39We declare a variable called n.
    • 0:53:40We've got a do while loop that does what?
    • 0:53:43It uses get_int, which comes with the cs50 library, per last week.
    • 0:53:47It prompts the user for a negative integer, quote unquote,
    • 0:53:49and stores the value in n.
    • 0:53:51I then do all of this while n is less than 0, right?
    • 0:53:56Remember, we used a do while loop last week to make sure the human cooperates
    • 0:54:00and doesn't give us the wrong type of value, be it positive or negative
    • 0:54:03or something else.
    • 0:54:04And then we return n.
    • 0:54:06And there's some subtleties.
    • 0:54:07Anyone recall-- or have an intuition for why I've declared n on line 14,
    • 0:54:12instead of line 17?
    • 0:54:15This is a C specific thing.
    • 0:54:17AUDIENCE: [INAUDIBLE]
    • 0:54:23DAVID MALAN: Exactly.
    • 0:54:24There's this notion of scope in C. And we'll continue to see this over time,
    • 0:54:27whereby, a variable only exists inside of the most recent curly braces
    • 0:54:32that you've opened.
    • 0:54:33So if I've declared n here on line 14, I can use it
    • 0:54:36anywhere between lines 13 and 21 because those are the nearest curly braces.
    • 0:54:40By contrast, as you note, if I instead said this,
    • 0:54:43int n equals get_int and so forth, and didn't have the current line 14,
    • 0:54:49well, n would exist inside of these curly braces, but not here, which
    • 0:54:53is too late, and definitely not here.
    • 0:54:55So you just have to declare it first, and then use and reuse it as such.
    • 0:54:59Now, let me just show you how I can debug this.
    • 0:55:01But let me show you the symptoms first.
    • 0:55:03Let me open my terminal window.
    • 0:55:04Let me run make buggy1.
    • 0:55:06Compiles OK, so it's not something silly like a semicolon. ./buggy1,
    • 0:55:11and I'm asked for a negative integer.
    • 0:55:13All right, let me give it negative 1--
    • 0:55:15Enter.
    • 0:55:16Well, the main function is supposed to print out what I typed,
    • 0:55:19but it clearly didn't.
    • 0:55:20It's prompting me again.
    • 0:55:21All right, so maybe it'll like negative 2.
    • 0:55:23No?
    • 0:55:24Maybe negative 3.
    • 0:55:2650?
    • 0:55:27OK, so it's definitely broken, right?
    • 0:55:29It kind of seems logically to be doing the opposite.
    • 0:55:31Now, you can perhaps see why this is happening already.
    • 0:55:33These are deliberately simple programs for demonstration's sake.
    • 0:55:37But let's do this.
    • 0:55:38Let me go ahead and set a breakpoint in main,
    • 0:55:41even though I'm pretty sure main is correct.
    • 0:55:42But it just helps me start my thought process-- start with main,
    • 0:55:45and then take it from there.
    • 0:55:47Let me run now, debug50 ./buggy1--
    • 0:55:51Enter.
    • 0:55:52And let's see.
    • 0:55:53With that breakpoint now, the GUI is going to reconfigure itself.
    • 0:55:56It's going to pause on line 8 because that's the first interesting line
    • 0:56:00inside of main.
    • 0:56:01So I could have just put the breakpoint on line 8 too.
    • 0:56:03The debugger is smart enough to know that if I set it on 6,
    • 0:56:06I really mean line 8 because that's the first actual line of code.
    • 0:56:09And watch, now, what happens.
    • 0:56:11If I step over this line, notice that i, which at the moment
    • 0:56:15seems to have a default value of 0--
    • 0:56:18more on that another time.
    • 0:56:19But if I click Step Over like before, I'm prompted for a negative integer.
    • 0:56:24Let me type negative 1--
    • 0:56:25Enter.
    • 0:56:27And now, notice, there's no additional yellow highlight.
    • 0:56:32Why?
    • 0:56:32Where am I currently stuck, logically?
    • 0:56:35AUDIENCE: [INAUDIBLE]
    • 0:56:37DAVID MALAN: Yeah, just logically, I must be in that do while loop.
    • 0:56:40And even if you don't understand it, like that's the only explanation.
    • 0:56:43If you keep getting prompted, surely, there's a loop going on.
    • 0:56:46There's only one loop in my code, so there's probably a problem there.
    • 0:56:49So I can't just set a breakpoint in main, and then wait for this to work.
    • 0:56:52So let me just--
    • 0:56:53let me stop this with the red square.
    • 0:56:56And let me think, all right, instead of--
    • 0:56:58I can still set my breakpoint in main, but let me rerun the debugger instead.
    • 0:57:02And this time, instead of stepping over that line of code,
    • 0:57:05let me step into that line of code.
    • 0:57:07So watch what happens now.
    • 0:57:09Instead of clicking the second icon here,
    • 0:57:11let me click the third, whose name is, indeed, Step Into.
    • 0:57:14And watch as the yellow highlight does not move to line 9.
    • 0:57:17It dives into line 8-- the function on line 8,
    • 0:57:21thereby, bringing me down to line 17.
    • 0:57:25It's kind of going down into that next function.
    • 0:57:28Now, it didn't bother pausing on line 12 or 13 or 14
    • 0:57:31because there's nothing intellectually interesting there happening yet.
    • 0:57:34The juicy part really starts, it would seem, in line 17.
    • 0:57:37So, now notice, n is my variable at the top left.
    • 0:57:40If I click--
    • 0:57:42I don't want to click Step Into now, though.
    • 0:57:45What would go wrong if I click on Step Into--
    • 0:57:48or what would it do that I don't think I want to do?
    • 0:57:52Yeah?
    • 0:57:52AUDIENCE: [INAUDIBLE]
    • 0:57:54DAVID MALAN: Yeah, it would step into get_int.
    • 0:57:56But I'd like to think that the staff's version of get_int is correct,
    • 0:57:59and that's not our problem today, so I want to step over it.
    • 0:58:02And watch now at top left that nothing happens yet to the value of n
    • 0:58:06until I go to the terminal window now, and I type in something
    • 0:58:09like negative 1.
    • 0:58:10Now notice, it jumps to line 19, which is the next interesting line.
    • 0:58:14Top left, n, indeed, is negative 1.
    • 0:58:17And here's where I can now pause as a human
    • 0:58:19and think, all right, so while n is less than 0.
    • 0:58:22All right, n, per the top left corner, is negative 1.
    • 0:58:25So all right, while negative 1 is less than 0,
    • 0:58:27well, obviously that's true mathematically.
    • 0:58:29So what's going to happen?
    • 0:58:30It's a do while loop.
    • 0:58:32So when I click on Step Over again, it's going to go to this line
    • 0:58:37because it's at the end of the inside of that loop.
    • 0:58:39And now here, it's looping through again and again.
    • 0:58:42All right, let me do this once more.
    • 0:58:44I'm going to step over, all right?
    • 0:58:45I'm going to type in negative 2, and it's the exact same thing.
    • 0:58:48Now is my chance, on the yellow line--
    • 0:58:50OK, wait a minute.
    • 0:58:51Negative 2 is obviously less than 0.
    • 0:58:53Let me try this one more time.
    • 0:58:56Click it once here.
    • 0:58:57All right, let me give it 50.
    • 0:58:59And now, OK, while 50 is less than 0, that's not true,
    • 0:59:05so the loop is over because it's not going to do it while 50 is less than 0.
    • 0:59:08That's not true.
    • 0:59:09So now watch, when I click Step Over once more,
    • 0:59:12it then finishes the loop, even though there's nothing more to do.
    • 0:59:15It's now about to return n.
    • 0:59:17It jumps back up to main, where I left off on line 9.
    • 0:59:21It now prints, in my terminal window, the number 50.
    • 0:59:23And hopefully, at this point, to your question earlier,
    • 0:59:26my human brain has realized, oh, I'm an idiot, like I flipped my sign there.
    • 0:59:30So I probably-- let me stop this.
    • 0:59:32I probably want to do something like this.
    • 0:59:34If the goal is to get a negative integer, I probably want to say,
    • 0:59:38while n is, for instance, greater than or equal to 0 would work.
    • 0:59:45So while n is greater than or equal to 0, keep doing this.
    • 0:59:48And that's the logic I wanted to express.
    • 0:59:50So the debugger just saves me from staring at the screen, raising a hand,
    • 0:59:53sort of asking someone else.
    • 0:59:54At least in this case, it allows me to go through it at a healthier pace.
    • 0:59:58Questions now on debug50, which should be your new friend, even if it's not
    • 1:00:03your first instinct after printf?
    • 1:00:07Any questions on debug50?
    • 1:00:09No?
    • 1:00:09All right, well, there's one last technique we can equip you with here.
    • 1:00:13And that is, in addition to printf and a debugger, no joke,
    • 1:00:17a rubber duck is actually a reasonably recommended solution
    • 1:00:21to finding bugs in your code.
    • 1:00:22To your question earlier, the duck, too, is not
    • 1:00:24going to solve the problem for you.
    • 1:00:26But if you've wondered why this little guy has been here for so long,
    • 1:00:29there's this technique, which has its own Wikipedia article,
    • 1:00:32called rubber duck debugging.
    • 1:00:33The idea of which is that if you're home in your dorm room,
    • 1:00:37wrestling with some bug in your code, printf
    • 1:00:39didn't quite reveal the source to you, debugger isn't really helping,
    • 1:00:42honestly, maybe it would help to just sound out what problem you're having.
    • 1:00:46Similar to going to office hours, talking to a TA or a professor,
    • 1:00:50just walking through your problems because
    • 1:00:52in sort of talking to the duck about the fact
    • 1:00:54that you're doing this while n is less than 0, and then if it is--
    • 1:01:00wait a minute.
    • 1:01:01I'm an idiot, not just for talking to the rubber duck.
    • 1:01:03You realize, hopefully, in expressing yourself,
    • 1:01:05literally verbally, you probably will hear with non-zero probability,
    • 1:01:09like some illogic in your statement.
    • 1:01:11And just by sounding things out, you'll realize like, oh, that's my problem.
    • 1:01:16And so, frankly, if you have roommates, you can also use a roommate for this.
    • 1:01:19But the rubber duck is just sort of a go-to
    • 1:01:21when your roommates have no interest in your C problem set,
    • 1:01:24for talking something through as such.
    • 1:01:28And this is an invaluable technique.
    • 1:01:29I admittedly tend not to do it so much with a rubber duck,
    • 1:01:32but ideally with colleagues, human colleagues.
    • 1:01:34But just talking through things often will help you just realize,
    • 1:01:38oh, I said something illogical.
    • 1:01:40Now I can go back to the code.
    • 1:01:41So don't solve problems by staring at your screen
    • 1:01:44endlessly for minutes, for hours.
    • 1:01:46At that point, it's time for a break, time
    • 1:01:48to walk away, time to talk to the duck, if you've already
    • 1:01:50exhausted some of those other tools.
    • 1:01:52As an aside, on your way out today at the end of class,
    • 1:01:55we have, clearly, plenty of rubber ducks for you.
    • 1:01:59And it's become a thing over the years, at least
    • 1:02:01among some, to bring the duck with them when they travel and send us photos.
    • 1:02:05Here, for instance, is CS50's rubber duck debugger, A.K.A. DDB,
    • 1:02:10for Duck Debugger, which is a pun on a geekier program called GDB, the GNU
    • 1:02:15Debugger, which is an actual piece of software for debugging.
    • 1:02:18This is CS50's debugger in the hills of Puerto Rico, also, here on the sea.
    • 1:02:25It made its way to San Francisco here.
    • 1:02:28Also, down by Fisherman's Wharf by the sea lions.
    • 1:02:30Familiar?
    • 1:02:31Here at Stanford, where there's a William Gates Computer Science
    • 1:02:34building for computer science, down the road in SF at Google.
    • 1:02:38And this is the Trevi Fountain in Rome.
    • 1:02:41And lastly, the Colosseum.
    • 1:02:43So we'll be curious to see in the coming years where your duck, too, travels.
    • 1:02:46So that, then, was quite a bit.
    • 1:02:49Why don't we go ahead here and take a short 5 minute break?
    • 1:02:51No snacks yet.
    • 1:02:52You're welcome to get up or sit down.
    • 1:02:54We'll return in about five.
    • 1:02:56All right, so we are back.
    • 1:03:00And if the goal, ultimately, today is to have a better understanding of things
    • 1:03:04like strings so that we can solve problems with text,
    • 1:03:06let's consider some simpler types of data
    • 1:03:09first, how we might represent those, and then
    • 1:03:11see if that doesn't lead us to a discovery as to how strings work,
    • 1:03:14and how today's modern software is using things like that.
    • 1:03:17So when we talked on week zero about representation of data,
    • 1:03:21we had different ways of doing it, in terms of binary and decimal,
    • 1:03:25and unary even.
    • 1:03:27When we started talking about the same last week in code,
    • 1:03:30we started talking about data types instead.
    • 1:03:33And these data types were a way of telling
    • 1:03:36the computer, like do you want an integer, do you want a character,
    • 1:03:40do you want a floating point value, like a real number, or even a string,
    • 1:03:44as we've seen?
    • 1:03:45But it turns out that computers, of course,
    • 1:03:47only have finite amounts of resources.
    • 1:03:49Your computer only has a fixed amount of memory or RAM.
    • 1:03:53And that actually has very real world implications.
    • 1:03:55So for instance, here are some of the data types we've seen thus far.
    • 1:03:59And it turns out that each of these in C has a specific number
    • 1:04:04of bits allocated to it.
    • 1:04:05Now, admittedly, this can vary by system.
    • 1:04:08It's not so much the case nowadays, but for many years,
    • 1:04:10for decades, computers were getting better and better.
    • 1:04:13The earliest computers might have used fewer bits
    • 1:04:15for some of these data types.
    • 1:04:16More modern computers might use more bits.
    • 1:04:18So the numbers you're about to see are pretty much where we are present day.
    • 1:04:21So when it comes to these data types, a bool,
    • 1:04:25which is true or false, somewhat curiously, uses a whole byte,
    • 1:04:29even though that's way overkill because for a bool, true or false,
    • 1:04:32you, of course, only need one bit.
    • 1:04:33But it turns out, even though it's wasteful to use
    • 1:04:36eight bits, or one byte, just to represent true or false,
    • 1:04:39it's just easier for computers.
    • 1:04:41So a bool tends to be one byte.
    • 1:04:42An int, which we've been using a lot, uses 4 bytes, typically, or 32 bits.
    • 1:04:47And if I do some quick math from week zero, with 32 bits,
    • 1:04:50you have 4 billion possible values, roughly.
    • 1:04:54But if you want to represent positive and negative,
    • 1:04:56that means you can represent roughly negative 2 billion, all the way up
    • 1:04:59to positive 2 billion.
    • 1:05:01So that's the range, typically, with ints.
    • 1:05:02If that's too few numbers for you, turns out there's things called longs.
    • 1:05:06And longs use 64 bits, which allow you to have
    • 1:05:10roughly 18 quintillion possibilities,
    • 1:05:13which is a lot, certainly, a lot more than 4 billion.
    • 1:05:15So sometimes you might use a long.
    • 1:05:17But even that's finite.
    • 1:05:18And so as we discussed at the end of last week,
    • 1:05:21bad things can happen if you make certain assumptions
    • 1:05:23as to the data because of things like integer overflow or the like,
    • 1:05:27where things wrap around.
    • 1:05:28Then there's a float, which is a real number, something with a decimal point.
    • 1:05:31By convention, it's 4 bytes or 32 bits, which gives you, in short,
    • 1:05:36only a specific amount of precision.
    • 1:05:37It doesn't necessarily dictate how many digits go to the left or to the right.
    • 1:05:41In the aggregate, though, you ultimately have
    • 1:05:454 billion possible permutations still.
    • 1:05:47If you need more precision for scientific, for medical,
    • 1:05:50for financial applications, you might use 8 bytes, A.K.A. a double,
    • 1:05:54which just gives you more digits of precision.
    • 1:05:57They eventually get imprecise per the example we looked at last week,
    • 1:06:01but it at least gets you further down the line.
    • 1:06:03As an aside, in really, really important applications, in finance,
    • 1:06:07in medicine, in military operations, and the
    • 1:06:10like where you really can't have rounding errors--
    • 1:06:12long story short, humans have developed libraries in C and other languages
    • 1:06:17that use more, even, than 8 bytes.
    • 1:06:19So there are solutions to these problems, but they're always finite.
    • 1:06:22You have to pick an upper bound.
    • 1:06:24Then there's char, which we saw briefly last week when I asked
    • 1:06:27the user for y or n, for yes or no.
    • 1:06:29And then there's a string, which I'm going to propose as a question mark
    • 1:06:32because a string totally depends.
    • 1:06:34Like, Hi!
    • 1:06:35H-I, exclamation point, would seem to be three bytes.
    • 1:06:38D-A-V-I-D, would seem to be five.
    • 1:06:41So strings, clearly, vary in length based on what you or the human types in.
    • 1:06:45So we'll see what this means, though, in just a bit.
    • 1:06:48This though, is the thing inside of your Mac, your PC, your phone.
    • 1:06:51It might not look exactly like this, but this is
    • 1:06:53a memory module for a modern computer.
    • 1:06:56And let's go ahead and use this.
    • 1:06:57Really, it's just representative of the finite amount of memory
    • 1:06:59that any computer, indeed, has.
    • 1:07:01Let's zoom in on one of these little black chips on the circuit board here.
    • 1:07:06Zoom in, and let me propose that this rectangle really represents
    • 1:07:10some number of bytes, like tucked inside of this little black circuit
    • 1:07:14on the board is maybe, I don't know, a gigabyte,
    • 1:07:16a billion bytes, maybe it's 100 bytes-- some number of bytes.
    • 1:07:19It totally depends on the computer and how much
    • 1:07:21you paid for the stick of memory.
    • 1:07:22But if there's a finite number of bytes physically implemented somehow
    • 1:07:27digitally inside of this hardware, well, then it
    • 1:07:30stands to reason that we could number those bytes.
    • 1:07:32We can just arbitrarily decide that the top left corner is byte number
    • 1:07:36one, or really byte number zero.
    • 1:07:38The one next to it is number one, then number two,
    • 1:07:41number 3, dot, dot, dot, number 2 billion
    • 1:07:43or whatever it is, however big this memory is.
    • 1:07:46So if you use a variable in a C program that's only one byte,
    • 1:07:50like a char, it might literally be stored in that top left-hand corner
    • 1:07:54of the memory.
    • 1:07:55In practice, you don't care where, physically, it is.
    • 1:07:57But really, the artist's rendition would be
    • 1:07:59this-- a char might use one of those single bytes
    • 1:08:02somewhere in the computer's memory.
    • 1:08:04If you use an int, which is 4 bytes, it would give you
    • 1:08:074 bytes, contiguous-- that is left to right, top to bottom.
    • 1:08:10But all 32 bits would be next to each other
    • 1:08:13so the computer knows that those, indeed, all belong to the same int.
    • 1:08:16If you need a long, or a double for that matter,
    • 1:08:18then you might use a full 8 bytes in this case.
    • 1:08:21And you just keep using and using this memory,
    • 1:08:23kind of like a canvas, almost in Photoshop
    • 1:08:26or a spreadsheet where you can just move pixels or you can move data around,
    • 1:08:29that's really what your computer's memory is,
    • 1:08:31a canvas for storing information in units of bytes or 8 bits.
    • 1:08:36Now, we don't need to keep looking at these circuit boards.
    • 1:08:39We can abstract it away, as we often do.
    • 1:08:41And let's go ahead and zoom in on this grid,
    • 1:08:43just to consider some very specific variables.
    • 1:08:45So let me zoom in, and now I see fewer, but larger boxes
    • 1:08:49on the screen, each of which, again, represents a byte.
    • 1:08:51And now let me propose that we play with some actual code.
    • 1:08:55So here in C, albeit without a full program,
    • 1:08:58are three ints-- score1, score2, score3.
    • 1:09:01I have, coincidentally, given myself two scores around 72 and 73,
    • 1:09:07and then a pretty low score at 33.
    • 1:09:09Of course, last week or two weeks ago, this would have been high.
    • 1:09:12But now we're dealing with actual integers.
    • 1:09:13So these are three so-so scores on my quizzes or tests or the like.
    • 1:09:17So let me go to VS Code here.
    • 1:09:19And let's make a program called scores.c.
    • 1:09:22So I'm going to write, code scores.c.
    • 1:09:24That's going to give me my new file.
    • 1:09:26And let me go ahead and implement something like this.
    • 1:09:28Include stdio.h, int main(void), and then inside of here,
    • 1:09:34let me do int score1 will be 72.
    • 1:09:37Int score2 will be 73.
    • 1:09:40And int score3 will be 33.
    • 1:09:43And then let me just do something like write a program
    • 1:09:45to average my three test scores together, something like that.
    • 1:09:48So let me do printf, quote unquote, my average is--
    • 1:09:52and I'm going to go ahead and do, say, %i, \n.
    • 1:09:56And now, let me plug in the results.
    • 1:09:58And this is kind of grade school math now.
    • 1:10:00How do I compute the average of three values?
    • 1:10:02Well, just like on paper, I can do score1 plus score2 plus score3
    • 1:10:09in parentheses, because of order of operations, divided by 3,
    • 1:10:12since there's three total scores.
    • 1:10:14All right, so I think this checks out.
    • 1:10:16And indeed, you can use parentheses and operators like plus in your code
    • 1:10:19like this in C. Let me go ahead now and do make scores.
    • 1:10:23No syntax error.
    • 1:10:24So that's good, nothing missing there.
    • 1:10:25And now let me do ./scores and see what my test average is.
    • 1:10:28All right, it's not great, but I think I still passed.
    • 1:10:32And indeed, my average here is 59.
    • 1:10:36Is it precisely 59 though?
    • 1:10:38Well, let's see.
    • 1:10:39Let's actually, instead of using an int, how about we go ahead
    • 1:10:42and use something like a floating point value here, with %f instead of %i?
    • 1:10:44And let me go ahead and do this.
    • 1:10:46So let me recompile my code, make scores.
    • 1:10:48Huh, all right, I've got an issue.
    • 1:10:50Let me zoom in on my terminal window.
    • 1:10:52We've not seen this one, necessarily, before.
    • 1:10:54But error on line 9.
    • 1:10:56Format specifies type double, which is a lot of precision,
    • 1:11:00but the argument has type int.
    • 1:11:02So what does this mean?
    • 1:11:03Well, it's showing me with these green squiggles that something's bad between
    • 1:11:06the %f and this thing over here.
    • 1:11:09Well, on the left, I'm implying a float, or a double for that matter.
    • 1:11:13On the right, though, what data type are score1, score2, score3?
    • 1:11:16All right, so they're ints.
    • 1:11:17So clang does not like this.
    • 1:11:19The compiler just doesn't like that I'm using ints on the right,
    • 1:11:22but I want floats on the left.
    • 1:11:24So there's going to be different ways of solving this.
    • 1:11:26One way would be to just ignore the problem like I originally did,
    • 1:11:29and just go back to %i.
    • 1:11:32Or as an aside, %d is often an alternative to %i for a decimal number.
    • 1:11:38But we use %i because it sounds like int, so %i is fine here too.
    • 1:11:42But I don't want to just avoid the problem.
    • 1:11:44I want to actually display a floating point value.
    • 1:11:46So how can I fix this?
    • 1:11:47Well, it turns out, I can solve this in a few different ways.
    • 1:11:50The simplest is just to make sure that at least one number on the right
    • 1:11:53is a floating point value, like 3.0 instead of just 3.
    • 1:11:59Now I think clang will be happier.
    • 1:12:01Let me do make scores--
    • 1:12:03Enter.
    • 1:12:04And indeed, it's OK.
    • 1:12:05Why?
    • 1:12:05As soon as you have at least one more precise data type on the right,
    • 1:12:10it just treats everything, at that point, as a floating point value
    • 1:12:13so that the math works out.
    • 1:12:14So ./scores, Enter-- and now, there we go, right?
    • 1:12:17Some of us might really want that 1/3 of a point.
    • 1:12:20Our average was not 59.
    • 1:12:21It's 59 1/3, as in this case here.
    • 1:12:25All right, so we've solved that there.
    • 1:12:26As an aside, though, there's one other technique to show here.
    • 1:12:30If you didn't want to change it to 3.0 because that's
    • 1:12:33a little weird, because there were literally three scores,
    • 1:12:36it's not like that needs to have a decimal point,
    • 1:12:38you could also explicitly convert the 3 to a float
    • 1:12:43by saying, in parentheses, float.
    • 1:12:46This is what's called typecasting.
    • 1:12:48And this will just convert the thing right after it to that data type,
    • 1:12:51if it's possible.
    • 1:12:52So if I do this again, make scores, no errors now. ./scores, and I get,
    • 1:12:56in fact, the same result. There's a bit of a rounding issue here,
    • 1:12:59but we know the rounding relates to the imprecision from last week.
    • 1:13:03For now, let me just be happy with my 59.3 something.
    • 1:13:06I'll take that for now.
    • 1:13:08But this is close enough to a correct answer for me now.
    • 1:13:14But how do I--
    • 1:13:15think about now, what's going on inside of the computer's memory?
    • 1:13:18Well, let's consider.
    • 1:13:19Here's that same grid of memory.
    • 1:13:20Each box represents a byte.
    • 1:13:22Where are score1, score2, and score3 in my memory?
    • 1:13:25Well, score1, let me just propose, is at the top left.
    • 1:13:28But it's taking up four boxes for 4 bytes.
    • 1:13:32Score2 probably ends up right next to it in memory,
    • 1:13:34though this isn't always going to be the case,
    • 1:13:36but I've chosen simple examples.
    • 1:13:3873 is next to it, also taking up 4 bytes.
    • 1:13:40And then lastly, 33 is in score3, down there underneath.
    • 1:13:45Now, if we really look at the computer's memory,
    • 1:13:48look at it with some kind of microscope or the like,
    • 1:13:50there's actually 32 bits, 32 bits, 32 bits
    • 1:13:54in each of those three groups of four bytes representing those values.
    • 1:13:59But again, for today's purposes onwards, we
    • 1:14:01don't really need to think again and again in binary.
    • 1:14:03It's just, indeed, these decimal numbers being stored there.
    • 1:14:05But I claim now, this isn't the best design.
    • 1:14:08Even if you have never programmed before CS50,
    • 1:14:11what you're looking at here on the screen,
    • 1:14:13as an excerpt, in what sense is this perhaps bad design, even though it's
    • 1:14:16a correct way of storing three test scores?
    • 1:14:19What's kind of bad here?
    • 1:14:20Yeah?
    • 1:14:21AUDIENCE: The more scores you have, the more you [INAUDIBLE].
    • 1:14:26DAVID MALAN: Yeah, always do exactly what you did-- extrapolate
    • 1:14:28to 4 scores, 5 scores, 50 scores.
    • 1:14:31This can't be that well-designed because now you're
    • 1:14:34going to have 4 lines of code, 5 lines of code,
    • 1:14:3650 lines of code that are almost identical,
    • 1:14:38except for this like arbitrary number that we're
    • 1:14:40updating at the end of the variable.
    • 1:14:42So indeed, there's probably going to be a better
    • 1:14:44way, even though, at least in C, we haven't yet seen that technique.
    • 1:14:48But the solution, today onward, is going to be something called an array.
    • 1:14:52An array is a way of storing your data back
    • 1:14:57to back to back in the computer's memory in such a way
    • 1:15:00that you can access each individual member easily.
    • 1:15:03Put another way, with an array, you can instead do something like this.
    • 1:15:08Instead of saying int score1, int score2, int score3,
    • 1:15:12giving each a value, you can first tell the computer,
    • 1:15:15please give me a variable called scores--
    • 1:15:18plural, though you can call it anything you want--
    • 1:15:20of size three, each of which will be an integer.
    • 1:15:24That is to say, this is how you declare an array in C that will have
    • 1:15:28enough room to store three integers.
    • 1:15:30Put another way, this is the technical way of telling the computer,
    • 1:15:34please give me 12 bytes in total--
    • 1:15:383 times 4 each for an int, so give me 12 bytes in total.
    • 1:15:42And what the computer will do is guarantee
    • 1:15:44that they're back to back to back in the computer's memory.
    • 1:15:47And that'll be useful in just a moment.
    • 1:15:49So let me go ahead and do something useful with this.
    • 1:15:51Let me store three actual scores.
    • 1:15:53Here's how I could now store those same numeric scores in this array.
    • 1:15:58Syntax is a little different, but there's one variable called scores.
    • 1:16:03But if you want to go to its first location,
    • 1:16:05starting today, you use square brackets and go to location 0
    • 1:16:08first. Because things in C are 0 indexed, so to speak,
    • 1:16:13you start counting at 0.
    • 1:16:14The first int is at [0].
    • 1:16:16Second int is at [1].
    • 1:16:18Third int is at [2].
    • 1:16:19So it's not one, two, three.
    • 1:16:20It's literally 0, 1, 2.
    • 1:16:22And this is not something you have control over.
    • 1:16:24You must start at 0.
    • 1:16:26So these lines now create an array of size three,
    • 1:16:29and then insert one, two, three values into that array.
    • 1:16:33But the upside now is that you only have one name of the variable to remember.
    • 1:16:37It's just called scores.
    • 1:16:39Yes, you need to go into the array to get individual values.
    • 1:16:43You need to index into it using those square brackets.
    • 1:16:46But at least you don't have this hackish approach
    • 1:16:48of declaring a separate variable for each and every one of these values.
    • 1:16:53So let me go back to scores.c here.
    • 1:16:56And let me propose that I do this.
    • 1:16:57Let me just use that same idea to do the following.
    • 1:17:00Let me get rid of these three separate integers.
    • 1:17:02Let me give myself an int scores array of size 3.
    • 1:17:06And then scores[0] will, as before, be 72.
    • 1:17:10Scores[1] will be 73.
    • 1:17:14And scores[2] will be 33.
    • 1:17:16And let me get rid of the little dot there.
    • 1:17:18All right, so now, if I go ahead and run this again with make scores--
    • 1:17:23Enter.
    • 1:17:24Huh, what did I do wrong here?
    • 1:17:29I think I got a little too ahead of myself.
    • 1:17:31Let me increase my terminal window.
    • 1:17:36Let's focus on line 10 here, first.
    • 1:17:38Error, use of undeclared identifier, score1.
    • 1:17:42What did I do here that was dumb?
    • 1:17:44Yeah?
    • 1:17:45AUDIENCE: You didn't declare it a variable.
    • 1:17:47DAVID MALAN: Right, so I didn't declare score1.
    • 1:17:49I've got old code.
    • 1:17:50So I just kind of, honestly, got ahead of myself here, not even intentionally.
    • 1:17:53So let me go ahead and shrink my terminal window again.
    • 1:17:56I need to finish my thought here.
    • 1:17:57So let me clear my terminal.
    • 1:17:58And let me change this now to be scores[0] plus scores[1] plus
    • 1:18:04scores[2].
    • 1:18:05So it's a little more verbose because I've
    • 1:18:07got these square brackets, so to speak.
    • 1:18:10But I think now my code is consistent.
    • 1:18:12So let me make scores now.
    • 1:18:13It now compiles.
    • 1:18:14./scores gives me, indeed, the same rough average with those same values.
    • 1:18:19All right, so let me go ahead and maybe enhance this a little bit.
    • 1:18:24It's a little silly to have to write a special program just
    • 1:18:26to check your average of three test scores like 72, 73, 33.
    • 1:18:31Why don't I actually make the program dynamic
    • 1:18:33and ask the human for those scores?
    • 1:18:37So instead, let me do this.
    • 1:18:39How about we get rid of the 72, and change this to get_int.
    • 1:18:43And I'll just prompt the user for a score.
    • 1:18:46Let me get rid of the 73 and get this to be get_int score, quote unquote.
    • 1:18:52And then lastly, get rid of the 33, and replace it with get_int, quote unquote,
    • 1:18:56score.
    • 1:18:57get_int is a CS50 thing for now, so I need to include cs50.h, as always.
    • 1:19:03But I think now, it's sort of a better program
    • 1:19:05because now I can compile it once, I can even share it with my friends.
    • 1:19:08And now any of us can average three scores on some class's test.
    • 1:19:12They don't need to know the code or rewrite the code just
    • 1:19:15to type in their scores.
    • 1:19:16So make scores worked.
    • 1:19:19./scores, now I can type anything I want-- maybe it's a 72, 73, 33,
    • 1:19:25still get the same answer.
    • 1:19:26Or maybe I'm having a better semester, 100, 100, maybe 99,
    • 1:19:31and now we get still a pretty high score there.
    • 1:19:33But now it's dynamic.
    • 1:19:34Now you don't need the source code.
    • 1:19:36You don't need to recompile the program.
    • 1:19:37It's just going to work again and again.
    • 1:19:39But this, too.
    • 1:19:41Let me propose that this code is correct if I
    • 1:19:43want to get three scores from the user.
    • 1:19:45But these highlighted lines now, 6 through 9, are they well-designed,
    • 1:19:50would you say?
    • 1:19:53Yeah?
    • 1:19:53AUDIENCE: Can you loop?
    • 1:19:54DAVID MALAN: Yeah, right?
    • 1:19:55This is-- we can use a loop, is the spoiler here.
    • 1:19:58Why?
    • 1:19:58I mean, my God, it's like the same code again and again and again.
    • 1:20:01The only thing that's changing is the number.
    • 1:20:03And this should have kind of had some code smell again,
    • 1:20:06because if I keep typing the same thing again and again,
    • 1:20:09that's clearly an opportunity to better design something.
    • 1:20:11So let me do this.
    • 1:20:13Let me go ahead and still create my array of size three.
    • 1:20:18But let me use our old friend, the for loop, for int i equals 0,
    • 1:20:23i less than 3, i++.
    • 1:20:26And then in here, let me do scores bracket--
    • 1:20:29we haven't seen this before, but any intuition?
    • 1:20:32Scores bracket--
    • 1:20:34AUDIENCE: i.
    • 1:20:34DAVID MALAN: i, because that will use whatever i is, be it 0 or 1 or 2
    • 1:20:39on each iteration.
    • 1:20:40And then I can get an int, asking the user for score,
    • 1:20:43without having to repeat myself again and again.
    • 1:20:47So hopefully, if I didn't make any typos, make scores, all good.
    • 1:20:50./scores, 72, 73, 33, and we're back in business.
    • 1:20:54But the code is arguably now better designed,
    • 1:20:56because now, I haven't actually hardcoded the scores,
    • 1:21:01and I haven't actually copied and pasted any of that code.
    • 1:21:04Well, if we consider now what's going on inside of the computer's memory,
    • 1:21:08it's pretty much the same in terms of the values.
    • 1:21:10But instead of the variables being, literally, score1, score2, score3,
    • 1:21:15there's just one variable.
    • 1:21:17It's an array called scores.
    • 1:21:19But you can index into its three locations by using scores[0] to get
    • 1:21:24the first, scores[1] to get the second, scores[2] to get the third.
    • 1:21:28But this is key.
    • 1:21:29The memory is contiguous.
    • 1:21:33The screen is only so large, so it wraps around.
    • 1:21:35But physically, digitally, the memory is contiguous-- top
    • 1:21:38to bottom, left to right.
    • 1:21:40And that's important, why?
    • 1:21:41Because the brackets indicate 0, 1, 2, that each of these integers
    • 1:21:46is just one integer away from the next.
    • 1:21:48It can't be randomly down here all of a sudden.
    • 1:21:51It's got to be back to back to back.
    • 1:21:54All right, now equipped with that paradigm,
    • 1:21:57what more could we actually do here?
    • 1:22:00Well, it turns out, it's worth knowing that it's possible in code
    • 1:22:04to even pass arrays around as arguments.
    • 1:22:06And let me just whip this program up somewhat quickly,
    • 1:22:09just so you've seen it before long.
    • 1:22:11But let me go ahead and do this.
    • 1:22:13Let me propose that I create a function that does this averaging for me.
    • 1:22:18So I'm going to create a function called average that returns a float.
    • 1:22:22And the arguments this thing is going to take--
    • 1:22:26let's see, it's going to be the array.
    • 1:22:28So it turns out, if you want to take in an array of numbers--
    • 1:22:31you can call it anything you want.
    • 1:22:33This is how you tell C that a function takes, not
    • 1:22:36an integer, but an array of integers.
    • 1:22:39And you don't have to call it array.
    • 1:22:41I'm doing that just for the sake of discussion.
    • 1:22:42It can be called x.
    • 1:22:43It can be numbers.
    • 1:22:44It can be anything else.
    • 1:22:45I'm just calling it array to be super explicit as to what it is there.
    • 1:22:49Now, how do I change my code down here?
    • 1:22:51What I think I'm going to do for the moment is just this.
    • 1:22:55I'm going to get rid of this code here, where I manually computed the average.
    • 1:22:59And let me just call the average function here
    • 1:23:01by passing in the whole array of scores.
    • 1:23:05So this is just an example of abstraction,
    • 1:23:07like now I have a function called average.
    • 1:23:08I don't care.
    • 1:23:09I don't have to remember how it works once I implement it.
    • 1:23:12It just kind of tightens up my main code a little bit.
    • 1:23:15But I do still have to implement this.
    • 1:23:17So later in my file-- let me repeat myself, as before,
    • 1:23:19the only time it's OK in C to repeat yourself again and again,
    • 1:23:22by typing out again, average, and then int array open bracket--
    • 1:23:27but now not a semicolon.
    • 1:23:28Now I have to implement this thing.
    • 1:23:30And I can implement this in a bunch of different ways,
    • 1:23:33but I don't know in advance--
    • 1:23:37I can't just do this.
    • 1:23:39I can't just do array[0] plus array[1] plus array[2],
    • 1:23:48unless this program's only ever going to work on three numbers.
    • 1:23:52So let me go ahead and do this.
    • 1:23:55Let me first propose that there's a poor design here.
    • 1:23:58In my main function, what value have I repeated twice?
    • 1:24:05Among the highlighted lines, what jumps out at you as twice?
    • 1:24:07AUDIENCE: The length of the array?
    • 1:24:09DAVID MALAN: Yeah, the length of the array, it's just three.
    • 1:24:11Now it's not a huge deal that I typed the number three on line 8 and line 9,
    • 1:24:14but this is exactly the kind of like shortcut
    • 1:24:17that's going to get you in trouble eventually.
    • 1:24:18Why?
    • 1:24:18Because, eventually, you or someone else is
    • 1:24:20going to go in and make the array bigger or smaller,
    • 1:24:22and you're not going to realize that magically,
    • 1:24:24that same number is in two places.
    • 1:24:26And indeed, this is what a programmer would often call a magic number.
    • 1:24:29A magic number is one that just kind of appears magically.
    • 1:24:31And you're on the honor system to change it here, if you change it here,
    • 1:24:35and then you change it over here.
    • 1:24:36That's not going to end well if the onus is on the programmer
    • 1:24:39to remember where they hardcoded-- that is, wrote out three explicitly.
    • 1:24:43So any time you reuse a value like this, you know what?
    • 1:24:46We should probably do what we did last week, which was to declare a variable,
    • 1:24:50perhaps at the very top of my program, so it's super obvious
    • 1:24:53what it is, called, maybe n, and set that equal to 3.
    • 1:24:56Better yet, what did I do last week to make sure
    • 1:24:59that I can't screw up and accidentally change that value?
    • 1:25:02Yeah, constant.
    • 1:25:03And the keyword there was just const for short.
    • 1:25:05And now I have a global variable-- global in the sense that I can
    • 1:25:09access it anywhere-- that is called n.
    • 1:25:11It's an int.
    • 1:25:12And it's always going to be 3.
    • 1:25:14And now I can improve my main function a little bit by just changing
    • 1:25:18the 3's to n, so now if I, if a colleague realized, oh, wait a minute,
    • 1:25:22there's four tests this year.
    • 1:25:23You change n to four, recompile the code,
    • 1:25:25and it just works everywhere else, except in my average function.
    • 1:25:31Let me change it back to 3, just for consistency.
    • 1:25:33This is not going to fly now, to just sum up things like this, for instance,
    • 1:25:39and then return this divided by 3.
    • 1:25:43Why will this not work now as I've defined it?
    • 1:25:51Yeah?
    • 1:25:52AUDIENCE: [INAUDIBLE]
    • 1:25:58DAVID MALAN: OK, I might be returning an integer value when
    • 1:26:00I intend to return a float per this.
    • 1:26:02But I think I'm OK because I used that little trick where I made sure
    • 1:26:05that at least one of the numbers in my arithmetic expression
    • 1:26:08is, in fact, a floating point value.
    • 1:26:11And just by adding the point 0, make sure that everything
    • 1:26:14gets treated as a float.
    • 1:26:15So I think that's OK.
    • 1:26:17AUDIENCE: [INAUDIBLE]
    • 1:26:19DAVID MALAN: I'm sorry, a little louder.
    • 1:26:20AUDIENCE: It just seems like you're [INAUDIBLE].
    • 1:26:24DAVID MALAN: Exactly.
    • 1:26:25So left hand's not talking to the right hand
    • 1:26:27here, in that my current implementation of average
    • 1:26:30is still assuming that there's only going to be three tests or whatever.
    • 1:26:33But wait a minute, I just went through the trouble
    • 1:26:35of modifying this to be n, generically.
    • 1:26:39And if I change this to 4, I'm not going to be happy, perhaps,
    • 1:26:43with my average because now I'm going to ignore one of my test scores
    • 1:26:46altogether.
    • 1:26:46So let me change this back to 3.
    • 1:26:48And unfortunately, if it's a variable now,
    • 1:26:51n, and therefore, I have literally a variable number of scores,
    • 1:26:55how do I take the average of a variable number of things?
    • 1:27:00I mean, what's my building block there?
    • 1:27:02Yeah?
    • 1:27:03AUDIENCE: [INAUDIBLE]
    • 1:27:10DAVID MALAN: Yeah.
    • 1:27:10Why don't I use a loop that goes through the array and adds things up as you go?
    • 1:27:14I mean, kind of like grade school, as you take the average on your calculator
    • 1:27:17or paper and pencil, you just keep adding the numbers together,
    • 1:27:19and then you divide at the end by the total number of things.
    • 1:27:22So how can I do this?
    • 1:27:23Well, let me change my implementation of average
    • 1:27:25to first declare a variable called sum, or whatever, set it equal to 0.
    • 1:27:30So this is like me on my piece of paper getting ready to count,
    • 1:27:33or my calculator, of course, when you turn it on, typically defaults to zero.
    • 1:27:36And now, let me do for, int i equals 0. i is less than a--
    • 1:27:41well, no, I didn't do that.
    • 1:27:43i is less than n, i++.
    • 1:27:46And now in here, let me go ahead and add to the current sum, whatever
    • 1:27:52is in the array's location, i.
    • 1:27:55And then down here, I think I can just return sum divided by 3.0--
    • 1:28:00not 3.0, n, perhaps here.
    • 1:28:04And actually, I think I'm going to get-- let's make sure it's a float.
    • 1:28:08Let's use the type casting trick just to make sure I don't accidentally
    • 1:28:11shortchange someone and throw away everything after the decimal point.
    • 1:28:15So it just escalated quickly, right?
    • 1:28:17Average just got a lot more involved.
    • 1:28:18It's not just a single one line of code, but now it's dynamic.
    • 1:28:22I initialize a variable called sum to 0.
    • 1:28:25In this loop, I go through and just keep adding to sum, which is initially 0,
    • 1:28:30whatever's in array[i]--
    • 1:28:33or specifically array[0], array[1], array[2].
    • 1:28:36That gives me a total sum that I return, divided by the total number of things.
    • 1:28:40Now, this I can tighten slightly.
    • 1:28:42Recall that this is syntactic sugar for just adding things.
    • 1:28:45I can't use plus plus because that only literally adds one.
    • 1:28:48But I can use here, plus equals.
    • 1:28:52Questions on this implementation here?
    • 1:28:54Really the only takeaway-- or the most important takeaway
    • 1:28:58is that this is the syntax for how you tell
    • 1:29:00a function that it expects a whole array, not
    • 1:29:04a single variable like an int or the like.
    • 1:29:06You literally use square brackets, but you
    • 1:29:08don't specify the length inside there.
    • 1:29:11Yeah?
    • 1:29:12AUDIENCE: What variable [INAUDIBLE] at the top?
    • 1:29:16DAVID MALAN: What about the variable at the top?
    • 1:29:18AUDIENCE: [INAUDIBLE]
    • 1:29:22DAVID MALAN: Good question.
    • 1:29:23What do I have it defined as at the top?
    • 1:29:25This variable, N, it must be an integer if you're going to use it inside
    • 1:29:31of an array's square brackets here.
    • 1:29:33So this line 10, notice, no longer says 3, it says N.
    • 1:29:38And so whatever N is, 3 or 4 or something else, that's how many
    • 1:29:42integers I will get in that array.
    • 1:29:43And it must be, by definition of an array, an integer that
    • 1:29:47goes in those square brackets.
    • 1:29:48And here's a common source of confusion.
    • 1:29:50When you create the array, that is declare it,
    • 1:29:52you use square brackets like this, where you put
    • 1:29:54the total number of elements you want.
    • 1:29:56When you subsequently use the array, like I'm doing here,
    • 1:29:59you don't mention int again-- just like you don't mention int
    • 1:30:02again and again once a variable exists.
    • 1:30:04You use the square brackets still, but you don't use N. You use 0 or 1 or 2
    • 1:30:10or, generically here, i.
    • 1:30:11So when C was designed, they sometimes used the same syntax
    • 1:30:14for two different ideas or contexts.
    • 1:30:17Yeah?
    • 1:30:17AUDIENCE: Do you have to include line 6 [INAUDIBLE]?
    • 1:30:22DAVID MALAN: Good question.
    • 1:30:23Do I have to include line 6?
    • 1:30:25Short answer, yes, because of the reason we ran into last week.
    • 1:30:29C, or clang really, reads your code top to bottom, left to right.
    • 1:30:32And so if the compiler sees some mention of this function average on line 16,
    • 1:30:38but you haven't told the compiler that average exists,
    • 1:30:41you're going to get an error on the screen.
    • 1:30:43So the conventional way to do that is you
    • 1:30:45just copy paste the first line of code from the function,
    • 1:30:48its so-called prototype or declaration.
    • 1:30:51Yeah?
    • 1:30:51AUDIENCE: Is there a library if you don't know the size of the array?
    • 1:30:55DAVID MALAN: Really good question, and a perfect segue.
    • 1:30:58Is there a library you can use if you don't know the size of the array?
    • 1:31:01No.
    • 1:31:01And so if any of you have programmed in Java or Python or other languages,
    • 1:31:07you can actually just ask the array, how big is it?
    • 1:31:11In C, you and I, the programmers, have to remember it.
    • 1:31:13And so short answer, no, there's no function that
    • 1:31:15will just automatically do this for us.
    • 1:31:17And in fact, let me make a more subtle claim
    • 1:31:20that it's fine to use global variables like this if they're really
    • 1:31:23for configuration options.
    • 1:31:25Why?
    • 1:31:25It's just convenient to put them at the very top of the file
    • 1:31:28because everyone, you, your colleagues, your TAs
    • 1:31:30are going to see them at the top of the code.
    • 1:31:32But you really shouldn't be using them everywhere throughout your code.
    • 1:31:36It'd be better if the average function, itself, were
    • 1:31:38independent of that special variable.
    • 1:31:40So by that, I mean this.
    • 1:31:42You know what I should really do, if I really want to be well-designed?
    • 1:31:46I should pass in the length of the array to the average function.
    • 1:31:51I should give the average function a second argument--
    • 1:31:54I'll call it length, for instance, but I could call it anything I want.
    • 1:31:57And so rather than putting N all the way down here at the bottom of my file,
    • 1:32:02let me just dynamically say length instead.
    • 1:32:05And this is a subtlety-- and no need to get too tripped up over this.
    • 1:32:08But this, now, is just an example of how the same function can
    • 1:32:11take not one, but two arguments.
    • 1:32:13But indeed, in C, you must remember, yourself, what the length of an array
    • 1:32:19is.
    • 1:32:19You can't just ask the array via some syntax
    • 1:32:22like you can, those of you who've programmed before in Java or Python.
    • 1:32:26Yeah?
    • 1:32:27AUDIENCE: [INAUDIBLE]
    • 1:32:35DAVID MALAN: Good question.
    • 1:32:36Would it be better designed to write a function that computes the size?
    • 1:32:39Short answer, can't do that in C. As soon as you pass an array
    • 1:32:42into a function in C, you cannot figure out its size if it's a generic array
    • 1:32:47like that of integers.
    • 1:32:48There are special cases that you can do that.
    • 1:32:51But in general, no, it's just not possible in C.
    • 1:32:53And if that's some frustration, honestly, this
    • 1:32:55is why more modern languages add that feature.
    • 1:32:57Why?
    • 1:32:57Because it was really annoying, as I'm alluding to here,
    • 1:32:59not to have that information.
    • 1:33:01Now, just to make sure I didn't screw up anywhere,
    • 1:33:03let me compile this final version of scores.
    • 1:33:07Suspense.
    • 1:33:08All good. ./scores, 72, 73, 33, and we're still back in business.
    • 1:33:14So this version is more complicated.
    • 1:33:15And as always, we'll have this version on the course's website for reference.
    • 1:33:18But the point, really, is that arrays, not only
    • 1:33:20can be used as containers to store multiple values--
    • 1:33:23three or more in this case--
    • 1:33:25you can also even pass them around as arguments, as such.
    • 1:33:30All right, now besides that, let's simplify for just a moment,
    • 1:33:34and consider now the world of chars.
    • 1:33:36If we've just got single bytes, where does this lead us?
    • 1:33:39And how does this get us, ultimately, to strings
    • 1:33:41to solve problems like readability and cryptography and the like?
    • 1:33:44Well here, for instance, are three lines of code,
    • 1:33:46out of context, that simply store three chars.
    • 1:33:48And you can already see where this is going.
    • 1:33:50Having three variables called c1, c2, c3 is clearly
    • 1:33:53going to end up being bad design because of all the silly redundancy here.
    • 1:33:57But notice, I'm using single quotes like last week
    • 1:33:59because these are single chars.
    • 1:34:01What does this look like in the computer's memory?
    • 1:34:03Well, it looks a little something like this.
    • 1:34:05If we clear out the old memory, c1, c2, c3 probably
    • 1:34:09will end up here, maybe not literally in the top left-hand corner.
    • 1:34:12This is just an artist's rendition.
    • 1:34:14But c1, c2, c3 will probably end up like that.
    • 1:34:18Now, what's really there?
    • 1:34:20It's really those same three numbers--
    • 1:34:2172, 73, 33.
    • 1:34:23But how many bits does a byte have?
    • 1:34:27Just eight.
    • 1:34:28So if we were to look at the binary representation of these characters,
    • 1:34:33it would only be eight bits each.
    • 1:34:35That's enough to store small numbers like 72, 73, 33.
    • 1:34:39We're not dealing with Unicode and emoji and the like.
    • 1:34:41But the point is the same.
    • 1:34:42You don't have to use four bytes to store these numbers.
    • 1:34:45You can use a different data type like chars, and underneath the hood,
    • 1:34:48it's, indeed, going to use just single bytes for each.
    • 1:34:51But this is sort of like a-- this isn't really how we implement strings, right?
    • 1:34:55When you wanted to say, hi, last week, or this, we used double quotes.
    • 1:34:59And we wrote all of the things together and used one variable, not three,
    • 1:35:02right?
    • 1:35:02When I typed in David, I didn't have a variable for D-A-V-I-D.
    • 1:35:06I had one variable called name that stored the whole thing.
    • 1:35:09So in C, we keep talking about these things called strings.
    • 1:35:13We'll see, eventually, that strings are not necessarily what they seem to be.
    • 1:35:17But for now, the key thing about strings is that they're
    • 1:35:19variable length, so to speak, right?
    • 1:35:22They might be three characters, HI!, or five characters, David,
    • 1:35:25or anything smaller or larger.
    • 1:35:28So how do we go about implementing strings,
    • 1:35:30if all we have at the end of the day is my memory?
    • 1:35:33Well, here is an example of just creating, declaring,
    • 1:35:36and defining a string called s. s because it's just a simple string,
    • 1:35:39and quote unquote, HI!, in double quotes.
    • 1:35:41What does this look like in the computer's memory?
    • 1:35:44Well, let's clear it again.
    • 1:35:45And here, now, because it's technically stored in one variable,
    • 1:35:48s, here is how I might draw it as an artist.
    • 1:35:50It's three bytes in total--
    • 1:35:52H-I exclamation point.
    • 1:35:53But there's no c1, c2, c3, it's just, the whole thing is s.
    • 1:35:59But it turns out that a string, fun fact,
    • 1:36:03is really just what underneath the hood?
    • 1:36:06Kind of leading up to this--
    • 1:36:09what is a string, if this is how it's laid out in memory?
    • 1:36:12AUDIENCE: An array.
    • 1:36:13DAVID MALAN: Literally, it's just an array of characters.
    • 1:36:15And we didn't have to know about arrays last week to use strings.
    • 1:36:18This is where, again, the training wheels are starting to come off.
    • 1:36:21But a string is just an array of characters.
    • 1:36:23H-I exclamation point, for instance.
    • 1:36:26So technically, an array--
    • 1:36:28or a string called s is really a variable called s that allows you
    • 1:36:33to get at the first character with s[0], if you want-- s[1], s[2].
    • 1:36:38You can literally get individual characters
    • 1:36:40just by treating s as though it's an array, which it really
    • 1:36:43is underneath the hood, in this case.
    • 1:36:47But there's a catch.
    • 1:36:48How do you know where strings end?
    • 1:36:51In the past, when I drew some integers on the screen,
    • 1:36:54I know, I claim they always take up 4 bytes.
    • 1:36:57If I had drawn a long, it always takes up 8 bytes.
    • 1:37:00If I had drawn a character, it always takes up 1 byte.
    • 1:37:03But how many bytes does a string take up?
    • 1:37:06Yeah, I mean, that's kind of the right answer.
    • 1:37:08In this case, three, it would seem.
    • 1:37:10But if it's David, that's a good five characters.
    • 1:37:13But where do we put the number three?
    • 1:37:16Where do you put the number five, right?
    • 1:37:17This is literally all that's inside your computer.
    • 1:37:20This is all our building blocks in front of us.
    • 1:37:23So how can we-- where does the three go?
    • 1:37:25Where does the five go?
    • 1:37:26Well, it turns out you can solve this in a couple of different ways.
    • 1:37:29But the way humans decided to implement strings years ago is, indeed, an array,
    • 1:37:34but they added one extra byte at the end of every such string array,
    • 1:37:38just to make clear, with a so-called sentinel value,
    • 1:37:41that the string ends here.
    • 1:37:44Why?
    • 1:37:45So that if you have two strings in the computer's memory, like HI!
    • 1:37:47and BYE!, you know where the barrier is between the exclamation point of one
    • 1:37:52and the letter B in the next, right?
    • 1:37:54You need some kind of delimiter.
    • 1:37:56And so what really is underneath the hood is this.
    • 1:38:00When you store a string in memory, when you type in a string-- as the user,
    • 1:38:04if you type in 3 characters, it's going to use
    • 1:38:073 plus 1 equals 4 bytes in total.
    • 1:38:10If you type in David, it's going to use 5 plus 1 equals 6 bytes in total.
    • 1:38:14Why?
    • 1:38:14Because C automatically adds this special 0 at the end of the string.
    • 1:38:20I've drawn it with backslash 0 because this is how you represent 0 as a char,
    • 1:38:24as a character.
    • 1:38:25But this is literally just 0, as we'll soon see.
    • 1:38:28So any time there's a string in memory, it always takes up
    • 1:38:31one more byte than you, yourself, as the programmer or human typed in.
    • 1:38:36In fact, if we convert this again, just for discussion's sake,
    • 1:38:38to those integers, what's literally stored in the computer's memory
    • 1:38:41is going to be 72, 73, 33, and now a 0.
    • 1:38:45And the computer, because of C and how it was invented,
    • 1:38:48it's just smart enough to know that when you print out a string,
    • 1:38:51it prints out every character until it sees a 0,
    • 1:38:54and then it just stops printing.
    • 1:38:56In particular, printf knows how this works.
    • 1:38:58And this is why printf knows when to stop printing.
    • 1:39:02Decimal numbers are not that enlightening.
    • 1:39:03We'll generally write the characters like this.
    • 1:39:05And again, backslash 0 is just special symbology.
    • 1:39:09It's what the programmer types to make clear that you're not saying, HI!, 0.
    • 1:39:13You're saying HI!, and then it's a special 0.
    • 1:39:15Specifically, it is eight 0 bits that indicate
    • 1:39:20that it's the end of the string.
    • 1:39:22Technically, that backslash zero, if you want to be fancy, it's called NUL,
    • 1:39:26N-U-L.
    • 1:39:27And it turns out, you've seen this before, though we didn't call it out.
    • 1:39:30Here's that same ASCII chart from the past couple of weeks.
    • 1:39:33If I highlight this, what is decimal number 0 mapping to?
    • 1:39:39NUL, which is just programmer speak for the special null character.
    • 1:39:42All 0 bits that means the string ends here.
    • 1:39:46This all happens automatically for you.
    • 1:39:48You do not need to create these null characters or these zeros.
    • 1:39:53Any questions then, on this implementation thus far?
    • 1:40:00Any questions here?
    • 1:40:01No?
    • 1:40:02Well, let me do this.
    • 1:40:03Let me go back to VS Code in a second.
    • 1:40:05And let's actually corroborate this with some code.
    • 1:40:07Let me go ahead and create a small program called hi.c.
    • 1:40:10And how about we do this?
    • 1:40:12Let me include stdio.h.
    • 1:40:14Let me include-- let me type out int main void, as always.
    • 1:40:18And now let me do something simple and kind of bad,
    • 1:40:20but char c1 equals 'H', in single quotes.
    • 1:40:24char c2 equals 'I', in single quotes.
    • 1:40:28And lastly, char c3 equals '!', an exclamation point in single quotes.
    • 1:40:32And now, let me just print this out.
    • 1:40:34I can't use %s because that is not a string.
    • 1:40:36That's literally three chars, because that's the design decision I made.
    • 1:40:40But I could do this--
    • 1:40:41%c, %c, %c, which we haven't seen before, but %s is string, %i is int,
    • 1:40:48%c is, indeed, char.
    • 1:40:51So let me put a backslash n at the end for cleanliness,
    • 1:40:54and now do, c1, c2, c3.
    • 1:40:56So this is like a char-based version of printing string.
    • 1:41:00So let me make hi.
    • 1:41:01And then let me do ./hi, and it looks like I used printf with %s.
    • 1:41:05But I did things very manually by printing out each individual character.
    • 1:41:09What's cool now, though, is that once you
    • 1:41:11know that characters are just numbers and strings are just characters,
    • 1:41:15you can kind of poke around.
    • 1:41:16Let me change all three placeholders to %i instead.
    • 1:41:21And this is totally fine, too.
    • 1:41:23Let me rerun this, make hi.
    • 1:41:26Actually, let me make one change, just so we can see this.
    • 1:41:31Let me add spaces, just for aesthetics sake, let me do make hi, ./hi, Enter,
    • 1:41:37and voila, like now, you can actually see the numbers,
    • 1:41:40that I claimed back in week zero, were in fact happening underneath the hood.
    • 1:41:44Well, this is not how you would make strings.
    • 1:41:45It'd be incredibly tedious to have three variables for three letter words, five
    • 1:41:49variables for five letter words.
    • 1:41:50We've been using, of course, strings since last week,
    • 1:41:52so let's do that instead.
    • 1:41:54String s equals quote unquote, double quotes "HI!"
    • 1:41:59For this, though, because of these training wheels,
    • 1:42:02I need to include the CS50 library.
    • 1:42:04But we'll come back to that in the coming weeks.
    • 1:42:06But for now, I'm going to go ahead and create a string s called quote unquote,
    • 1:42:10"HI!"
    • 1:42:11And now I'm going to change this to be my familiar %s,
    • 1:42:14and now just print out s itself.
    • 1:42:17This, of course, is the same thing as last week, ./hi,
    • 1:42:20gives me the exact same thing, but now, we're dealing, of course, with strings.
    • 1:42:24But how can we see a little beyond that?
    • 1:42:27Well, how about this?
    • 1:42:28Let's poke around further with today's primitives.
    • 1:42:31Even though s is a string, I could technically print out its first
    • 1:42:35character with %c by doing s[0].
    • 1:42:39I could technically print out its second character with %c by doing s[1].
    • 1:42:43I could print out its third character with %c and printing out s[2].
    • 1:42:47So again, this just derives logically from my understanding
    • 1:42:50now that strings are arrays, as you noted.
    • 1:42:52Let me do make--
    • 1:42:54let me do make hi, ./hi.
    • 1:42:57And no visual change, but I'm just kind of now tinkering around.
    • 1:43:00And in fact, if you're really curious, let me do this.
    • 1:43:03Let me change these back to i, back to i--
    • 1:43:06oops, back to i.
    • 1:43:08And let me add a fourth one because if I'm really curious now,
    • 1:43:11let's see what's in s[3].
    • 1:43:14This is the fourth byte.
    • 1:43:16And even though the string itself is H-I-!,
    • 1:43:18I think we can corroborate this whole null thing.
    • 1:43:21Make hi, ./hi, Enter, and there it is.
    • 1:43:26You could have done this last week, if you really
    • 1:43:28wanted to geek out on strings.
    • 1:43:29But for now, it's just revealing what's going on underneath the hood.
    • 1:43:33Questions then, on what these strings are?
    • 1:43:36Yeah?
    • 1:43:37AUDIENCE: [INAUDIBLE]
    • 1:43:41DAVID MALAN: Why do we need the bracket?
    • 1:43:42AUDIENCE: [INAUDIBLE]
    • 1:43:45DAVID MALAN: Why do you not need brackets?
    • 1:43:47Good question.
    • 1:43:47Why do I not need brackets on line 6?
    • 1:43:51Because s is a string.
    • 1:43:53We'll see in a couple of weeks that s is, essentially,
    • 1:43:56implemented underneath the hood, indeed, as an array,
    • 1:44:00but that happens automatically for you.
    • 1:44:02You can treat s as just a variable name without square brackets.
    • 1:44:06You will use square brackets when you have arrays of ints
    • 1:44:09or you manually create arrays of chars or doubles or floats or anything else.
    • 1:44:13But strings are special.
    • 1:44:14Why?
    • 1:44:15I mean, every program you write seems to use strings, text in some form.
    • 1:44:19We're humans; we like text, not just numbers and such.
    • 1:44:21So this is just treated a little specially in C and many other languages
    • 1:44:25as well.
    • 1:44:28Other questions on this here?
    • 1:44:31No?
    • 1:44:31Let's add then, one other string to the mix.
    • 1:44:33So instead of just saying, HI!, why don't we consider a version
    • 1:44:36of the program that says both, HI! and BYE!.
    • 1:44:38And I claim now that that backslash zero,
    • 1:44:41that null character is going to be ever more important now
    • 1:44:44if we've got two strings in memory, so that C knows
    • 1:44:46how to distinguish one from the other.
    • 1:44:48So let me go ahead and just get rid of these two lines for the moment.
    • 1:44:51Let me recreate string s equals, quote unquote double quotes, "HI!"
    • 1:44:55Let me give myself another one.
    • 1:44:56And because I'm just playing around, I'll choose very short variable names.
    • 1:44:59String t equals quote unquote, "BYE!"
    • 1:45:04And then let me just print them both out.
    • 1:45:06Let me go ahead and print out %s, backslash n, comma s,
    • 1:45:11and then printf %s backslash n, and then t.
    • 1:45:16So very simple demonstration of just these two variables.
    • 1:45:19Make hi, ./hi, and of course, it prints out two lines, one after the other.
    • 1:45:26What's actually going on underneath the hood?
    • 1:45:27Well, let's go back to the computer's memory.
    • 1:45:29HI!, I think, is going to be, I claim, pretty much the same.
    • 1:45:32So s, I'll claim, is in the top left, followed by the backslash zero.
    • 1:45:36And that's important now because BYE! probably is going to end up there.
    • 1:45:40And visually, it wraps just by nature of how I've drawn this grid of bytes,
    • 1:45:43but it's contiguous.
    • 1:45:44B-Y-E-!
    • 1:45:46null, A.K.A. backslash zero, this is now helpful to printf
    • 1:45:51because now printf knows where one begins and ends
    • 1:45:55by way of that special null character.
    • 1:45:58But we can poke around now, too.
    • 1:46:00What else can I do here?
    • 1:46:01How about this?
    • 1:46:02How about I go into my code here, back to VS code, and let me go ahead
    • 1:46:08and say something like, well, if I've got two of these strings,
    • 1:46:13you know, let's put them in an array.
    • 1:46:15Let's kind of do this sort of arrays in arrays, sort of inception-style here.
    • 1:46:20So string words[2].
    • 1:46:23So give me an array of two strings is what
    • 1:46:25I'm saying here in code, even though we've not done it with strings yet.
    • 1:46:28We only did it with ints.
    • 1:46:29And now let me do this.
    • 1:46:30The first word A.K.A. words[0] will equal, as before, HI!
    • 1:46:35And now words[1] will equal quote unquote, "BYE!"
    • 1:46:40And now I've done the exact same thing, but again, I'm
    • 1:46:43just avoiding having s, t, q, r, and all these different variables in my code.
    • 1:46:48I just now am treating them as one single array of strings.
    • 1:46:52How do I change my code down here?
    • 1:46:54Well, if I want to print the first word, I do words[0].
    • 1:46:57And if I want to print the second word, I do words[1].
    • 1:46:59This is not a useful exercise at the moment
    • 1:47:02because I'm just making my code more complicated.
    • 1:47:04But again, it allows us to poke around and see what's
    • 1:47:06going on because there is that HI!
    • 1:47:08and BYE!.
    • 1:47:09But watch this.
    • 1:47:10If I really want to be cool, I can do this.
    • 1:47:14Let's print out %c, %c, %c, backslash n, and then here, %c, %c, %c, %c,
    • 1:47:24so four of those.
    • 1:47:25And now here's where things get interesting.
    • 1:47:28Words is an array of strings.
    • 1:47:30Again, if I may, what's a string?
    • 1:47:33An array of characters.
    • 1:47:35So just use the same logic.
    • 1:47:36If words is an array of strings, you get at the first string with words[0].
    • 1:47:41How do you get at the first character in the first string?
    • 1:47:44words[0][0], words[0][1], and lastly, words[0][2].
    • 1:47:52And now down here, words[1][0], the first character is there.
    • 1:47:57words[1][1], the second character is here.
    • 1:48:00words[1][2], the third character is here--
    • 1:48:03whoops-- third character's here.
    • 1:48:04And words[1][3], the fourth character is here.
    • 1:48:07This is not how people program.
    • 1:48:09This is only for demonstrations sake.
    • 1:48:10My God, it's so tedious and verbose already.
    • 1:48:13But if I make hi now, ./hi, now, I'm manually reinventing %s,
    • 1:48:20if I forgot it existed, using %c alone.
    • 1:48:22But you can indeed manipulate arrays in this way.
    • 1:48:25But because strings are arrays of characters,
    • 1:48:28you can manipulate strings in this way too.
    • 1:48:32Any question now on this syntax?
    • 1:48:37Any questions here?
    • 1:48:38No?
    • 1:48:39No?
    • 1:48:39All right, well, let's go ahead and propose
    • 1:48:42that we solve a couple of other problems we might not have as before.
    • 1:48:45But first, a quick visual of what's been going on underneath the hood here.
    • 1:48:49If here, again, is where we left off on the screen, HI! and BYE!
    • 1:48:52back to back, here is really how I just treated these things.
    • 1:48:56s bracket 0, 1, 2, 3 and then t 0, 1, 2, 3, 4.
    • 1:49:00But really, once I put them in an array, the picture becomes this.
    • 1:49:04Words[0] is the whole HI!.
    • 1:49:07Words[1] is the whole BYE!.
    • 1:49:08But if I really get into the weeds and start indexing
    • 1:49:11into individual characters in those strings, all I'm using
    • 1:49:14is new syntax in order to represent these same values here.
    • 1:49:20Questions then, on these representations before we forge ahead?
    • 1:49:28No?
    • 1:49:29Yeah?
    • 1:49:30AUDIENCE: Does the new line character not [INAUDIBLE]?
    • 1:49:33DAVID MALAN: Does the new line character-- say that once more?
    • 1:49:36AUDIENCE: Does the new line character take up any space?
    • 1:49:38DAVID MALAN: Ah, really good question.
    • 1:49:40Does the new line character take up any space?
    • 1:49:42It does, so far as printf is concerned.
    • 1:49:45But I'm not storing the backslash n in my strings,
    • 1:49:48printf is being manually handed that thing instead.
    • 1:49:53All right, so let's go ahead then and consider
    • 1:49:55how we might solve some problems that have arisen now with these strings,
    • 1:49:58as follows here.
    • 1:50:00Suppose I-- let's do this.
    • 1:50:02Let me go back to VS Code here.
    • 1:50:04And let me go ahead and open up a new file called, how about, length.c.
    • 1:50:09And let's consider for a moment how I might actually figure out
    • 1:50:12what the length of a string is, which is distinct from the length of an array.
    • 1:50:16I claimed earlier, you cannot figure out dynamically what the length of an array
    • 1:50:19is.
    • 1:50:20But I can figure out the length of a string, specifically, because
    • 1:50:24of this implementation detail of that null character.
    • 1:50:26So let me go ahead and do this.
    • 1:50:28Let me include cs50.h in this second program here.
    • 1:50:31Let me include stdio.h, as before.
    • 1:50:35And let me do this, int main void--
    • 1:50:38and the first thing I'll do is just get a string from the user.
    • 1:50:40I'll ask the user, as always, for their name.
    • 1:50:43So I'll call get_string, and say, what's your name, question mark, as always.
    • 1:50:48And then down here, if I want to figure out the length of this string
    • 1:50:51and print the length out on the screen, well, I
    • 1:50:56can kind of do this similar in spirit to the average,
    • 1:50:58where I'm accumulating something.
    • 1:50:59Let me go ahead and initialize n to 0.
    • 1:51:02Let me give myself--
    • 1:51:05it's not a for loop because I don't have a--
    • 1:51:07I don't know in advance how long it is.
    • 1:51:08But what if I do this?
    • 1:51:09While the value at name[n] does not equal '\0'--
    • 1:51:20crazy syntax at the moment, but it's just the culmination
    • 1:51:23of these various building blocks.
    • 1:51:25Let me just finish the thought here, n++.
    • 1:51:28And then down here, let's just print out, with printf and %i,
    • 1:51:33that value of n. So I claim this is going to show me the length of any
    • 1:51:38string I type in, whether it's hi or bye or David or anything else.
    • 1:51:43I initialize a variable to zero, and that's good
    • 1:51:45because that's where you start counting in general.
    • 1:51:47While name[0] does not equal backslash zero.
    • 1:51:50What is this saying?
    • 1:51:51Well, if name is the string the user typed in-- and name is just an array,
    • 1:51:55as you noted--
    • 1:51:56then name[0] is going to be the first character.
    • 1:51:59And I'm asking the question, well, does the first character not equal
    • 1:52:02backslash zero?
    • 1:52:03And if I type in David, the first character is D, which is not backslash zero, so I keep going and I add 1 to n.
    • 1:52:08Then I'm going to check name[1].
    • 1:52:10Well, if I typed in David, name[1] is going to be A.
    • 1:52:13A does not equal backslash zero, and so it's going to go again and again
    • 1:52:18and again.
    • 1:52:18But five steps in total later, it's going to get to the byte after
    • 1:52:23D-A-V-I-D, realize, wait a minute, that is a backslash zero.
    • 1:52:26The loop finishes, and I print out the total length.
    • 1:52:29Arrays, in general, do not have this null character.
    • 1:52:33However, strings do.
    • 1:52:34Again, strings are special versus all of the other data types
    • 1:52:38we've talked about thus far.
    • 1:52:39But how could I, for instance, do this differently?
    • 1:52:43Well, let's actually factor this out as a function, as I've commonly done.
    • 1:52:47But rather than implement it myself, you know what?
    • 1:52:50It turns out what's nice about strings being so common,
    • 1:52:54there are many other people who have solved these problems before.
    • 1:52:57And in fact, there's a whole string library in C.
    • 1:53:00It is used by way of a header file called string.h.
    • 1:53:04And what string.h is, is a library of string-related functions.
    • 1:53:08In fact, you can see in CS50's manual pages
    • 1:53:10for C, the string.h functions, at least those that we recommend as most useful,
    • 1:53:16and in particular, if you poke around there,
    • 1:53:18you'll see that there's a function called strlen.
    • 1:53:20It means string length.
    • 1:53:22It was named very succinctly, just because it's a little easier
    • 1:53:24to type than string length.
    • 1:53:25But strlen tells you the length of a string.
    • 1:53:28So how might I use this in my code here?
    • 1:53:30Well, it turns out, I can simplify this quite a bit.
    • 1:53:34Let me get rid of my loop, get rid of my counting
    • 1:53:37manually, and do something like this-- int n
    • 1:53:40equals strlen of the human's name, name.
    • 1:53:45And now I'll just use printf, as before, with %i backslash n,
    • 1:53:49and output the value of n.
    • 1:53:51But there's a bug at the moment.
    • 1:53:54What have I forgotten to do?
    • 1:53:58Yeah, I have to include the header file at the top of the screen,
    • 1:54:01so let me-- at the top of the code.
    • 1:54:03So let me also include string.h at the top of my file,
    • 1:54:07so that C knows that, in fact, strlen exists.
    • 1:54:10Let me go ahead and make length, as before.
    • 1:54:14./length-- or actually, really for the first time, what's your name?
    • 1:54:18D-A-V-I-D. And hopefully, I'm going to see, in fact, 5.
    • 1:54:22By contrast, if I run it again and type in HI!, now I see three.
    • 1:54:26So strlen is just one of the functions in that library.
    • 1:54:29And there are so many more.
    • 1:54:30In fact, yet another library that might be useful moving forward
    • 1:54:33is this one, ctype, which relates to C data
    • 1:54:37types and lots of functions therein that can be useful.
    • 1:54:40For instance, if you review its documentation in the manual pages
    • 1:54:43online, you'll see that there are functions via which
    • 1:54:46we can solve problems like this.
    • 1:54:49Let me go ahead and propose here--
    • 1:54:52let me see.
    • 1:54:53Let's do an example here involving--
    • 1:54:59how about checking if something is uppercase or lowercase,
    • 1:55:03and converting it to uppercase only.
    • 1:55:06Let me go back to VS Code, and code a program called uppercase.c.
    • 1:55:10In this file, I'm going to start by including now, as always, cs50.h.
    • 1:55:15I'm going to include stdio.h.
    • 1:55:17And I'm going to add one other to the mix, which
    • 1:55:21is string.h now too, so I can access the length of things as needed.
    • 1:55:26Int main void comes next.
    • 1:55:28And then within my main function, I'm going
    • 1:55:30to go ahead and declare a string called s.
    • 1:55:32I'm going to call get_string, as before.
    • 1:55:34And I'm going to go ahead and just ask the user for a string called before.
    • 1:55:38I want to do a before and after.
    • 1:55:39Whatever the user types in is before.
    • 1:55:41But I want to force everything to uppercase, thereafter.
    • 1:55:44Let me now, in this loop here, do this.
    • 1:55:48Let me printf quote unquote, "After," just so we can see this on the screen.
    • 1:55:53And let me do for int i gets 0, i is less than strlen of s, i++.
    • 1:56:02What am I about to do?
    • 1:56:03I'm about to iterate over every character in the string
    • 1:56:06from left to right, from 0 on up to, but not through, the length of s.
    • 1:56:11And how do I check if something is lowercase,
    • 1:56:13so that I can actually force it to uppercase?
    • 1:56:16Well, it turns out, I could do this literally.
    • 1:56:19If the character in s at location i is greater than or equal to little a,
    • 1:56:27ampersand, ampersand, which means and instead of or, which we saw
    • 1:56:31in the past, s[i] is less than or equal to little z, that means,
    • 1:56:37logically in English, that this is indeed lowercase.
    • 1:56:41How do I now convert it to uppercase, this character?
    • 1:56:44Well, I could just literally print out the same character.
    • 1:56:48But that would not be the answer here because that's not changing the value.
    • 1:56:52But what could I do instead?
    • 1:56:54Well, let me actually pull up here real fast the ASCII chart as before,
    • 1:56:59and let's see if we can't glean some insight.
    • 1:57:03If I pull up the same ASCII chart, and suppose
    • 1:57:05the human has typed in a lowercase a, that's 97.
    • 1:57:09What letter-- I want to convert it to uppercase
    • 1:57:13A, so what number do I want to convert the 97 to, per week zero?
    • 1:57:18So 65, we keep coming back to that one.
    • 1:57:21What if the user types in lowercase b?
    • 1:57:23I want to change the 98 value to 66, and so forth.
    • 1:57:27And any quick math, how far apart are those?
    • 1:57:30So it's always 32, like uppercase to lowercase
    • 1:57:33is always, wonderfully, good design, 32 away, one from the other.
    • 1:57:37So what does this mean?
    • 1:57:39Well, I think we saw earlier that underneath the hood,
    • 1:57:41a char is just a number.
    • 1:57:42You can certainly do arithmetic on it.
    • 1:57:44And here, again, if you understand these lower level
    • 1:57:46primitives, what if I do this?
    • 1:57:48Whatever s[i] is, if I know on line 13 that it's lowercase,
    • 1:57:53do I want to add or subtract 32?
    • 1:57:57AUDIENCE: Subtract.
    • 1:57:57DAVID MALAN: So I want to subtract because I want to go from like 97 to 65
    • 1:58:01or 98 to 66, so indeed, if you do some quick math, that gives you 32.
    • 1:58:06So it suffices to just treat chars as numbers, subtract the 32,
    • 1:58:10and printing it with %c, I think, will just convert lowercase to uppercase.
    • 1:58:16If you now fast forward to the real world, Microsoft Word or Google Docs,
    • 1:58:19if you've ever chosen the menu option that forces things to uppercase
    • 1:58:22or lowercase on occasion, literally, that's
    • 1:58:24what Microsoft and Google have done.
    • 1:58:26They iterate over every character in the document, check if it's lowercase,
    • 1:58:29and if so, they subtract 32 from it and show you the new value.
    • 1:58:33What if, though, it is not a lowercase letter?
    • 1:58:36I think I can keep it easy and just print out the current letter unchanged,
    • 1:58:40if my goal is to simply force things to all uppercase, and that letter,
    • 1:58:44then would be s[i].
    • 1:58:46So let me go ahead now and make uppercase, hopefully, no errors.
    • 1:58:50./uppercase, and I'll now type in David with an uppercase D,
    • 1:58:55but lowercase everything else.
    • 1:58:57But now the after version is DAVID--
    • 1:59:00an aesthetic bug.
    • 1:59:01Notice here, I forgot to include, just for prettiness sake,
    • 1:59:04a backslash n at the end.
    • 1:59:05No problem, I'll add that.
    • 1:59:07Let me fix my mistake.
    • 1:59:08Make uppercase, ./uppercase, Enter.
    • 1:59:12D-A-V-I-D, Enter, and voila.
    • 1:59:14And I deliberately added another space after,
    • 1:59:16just so they would line up pretty, even though before
    • 1:59:19and after have different numbers of letters.
    • 1:59:22Questions then, on this implementation of forcing something
    • 1:59:25to uppercase, which in and of itself is not all that enlightening,
    • 1:59:28but is representative now of how you can leverage these low level primitives.
    • 1:59:33Question?
    • 1:59:35No?
    • 1:59:36All right, well, this honestly is tedious.
    • 1:59:38My God, like does Microsoft, Google, everyone,
    • 1:59:40you have to literally write out this code just to do something simple?
    • 1:59:43Well, no, that's, again, why we have things like libraries.
    • 1:59:46And increasingly now, for problem sets, projects, and beyond,
    • 1:59:49well, you just use libraries more often off-the-shelf
    • 1:59:52so as to solve problems that, surely, other people have had before you.
    • 1:59:55So how can I now use this library, ctype.h?
    • 1:59:59Well, let me go back into my code.
    • 2:00:01Let me include this among my header files here.
    • 2:00:05Just so I can skim things easily, I tend to alphabetize my headers.
    • 2:00:08But that's not strictly necessary, but it allows me, at a glance, to realize,
    • 2:00:11did I or did I not include something I need?
    • 2:00:13Now, let me go ahead and do this.
    • 2:00:15It turns out if you read the documentation for the ctype library,
    • 2:00:20there's a function, wonderfully called islower,
    • 2:00:24that takes in a character as its argument, essentially, so s[i].
    • 2:00:28And if that returns true, a Boolean value, if you will,
    • 2:00:32well, I'm going to force it to uppercase.
    • 2:00:33But I don't have to do this math anymore.
    • 2:00:36Turns out, in the ctype library, there's also a function called toupper
    • 2:00:40that takes a character as input, like s[i],
    • 2:00:43and it just does the math for you.
    • 2:00:45So that you can abstract away the 32 thing,
    • 2:00:47and just know that someone else has solved that problem for you.
    • 2:00:50Otherwise, I can leave my code unchanged down below
    • 2:00:53because I'm not changing anything else.
    • 2:00:55So if I do make uppercase now, and then ./uppercase, D-a-v-i-d,
    • 2:01:00with just a capital D, and now it still works.
    • 2:01:03But if you read the documentation further, it turns out that toupper
    • 2:01:06is smart.
    • 2:01:07If you pass in a character to toupper that's lowercase,
    • 2:01:10it obviously converts it to uppercase by doing that math.
    • 2:01:13But if you pass in a character to toupper that's already uppercase,
    • 2:01:17the documentation you would see tells you that it leaves it unchanged.
    • 2:01:21So I can tighten all of this up.
    • 2:01:23I can get rid of the whole else.
    • 2:01:25I can get rid of the whole if, and arguably now,
    • 2:01:29implement a program that's just as correct, but better designed.
    • 2:01:33Why?
    • 2:01:34Fewer lines of code, easier to read, lower probability of mistakes,
    • 2:01:38assuming the library is correct.
    • 2:01:39It just makes it easier and faster for me, now, to write code.
    • 2:01:43So if I now do, one last time, make uppercase, Enter, ./uppercase,
    • 2:01:47and type in my name, still working.
    • 2:01:50But now notice, we've whittled this down to far fewer lines of code,
    • 2:01:53albeit, using now this additional library.
    • 2:01:57Questions then on how we did this?
    • 2:02:03Well, even though this code, I daresay, is correct,
    • 2:02:06it's not necessarily well-designed just yet.
    • 2:02:09In fact, there's one line of code, one function
    • 2:02:12call in this current implementation that's
    • 2:02:14more inefficient than it needs to be.
    • 2:02:17And allow me to draw your attention to this here,
    • 2:02:20line 10, wherein we're calling strlen.
    • 2:02:24But we're calling it inside of this for loop, specifically,
    • 2:02:27inside of the condition.
    • 2:02:29And why might that not necessarily be the best idea?
    • 2:02:33Well, is the length of the string s changing, ever?
    • 2:02:36I mean, certainly not within the span of this loop.
    • 2:02:38And so here we are within our for loop on line 10, 11, 12, and 13,
    • 2:02:42asking on every iteration that same question.
    • 2:02:45What's the length of s?
    • 2:02:46What's the length of s?
    • 2:02:47What's the length of s?
    • 2:02:48And in turn, we're calling strlen every time,
    • 2:02:50even though we're getting back the same answer.
    • 2:02:52So I daresay a better solution here would
    • 2:02:54be to maybe figure out the length of s earlier on in my code,
    • 2:02:58and maybe declare a variable.
    • 2:02:59Or perhaps do something that's syntactically a little more elegant,
    • 2:03:02and in fact, a very common design in a loop like this,
    • 2:03:05would be to declare not just one variable like i,
    • 2:03:07but to actually declare a second variable called n, for instance, where
    • 2:03:12n is just some number, set n equal to the length of s.
    • 2:03:16But thereafter, inside of this condition,
    • 2:03:18instead of calling strlen of s again and again and again, what might I now do?
    • 2:03:24I could instead just compare i against n itself,
    • 2:03:28because n now will only be calculated once when it's initialized,
    • 2:03:31just as i is initialized to zero.
    • 2:03:32And thereafter, we're going to be comparing i, which is changing,
    • 2:03:36against n, which will not be.
    • 2:03:37So it's going to be marginally more efficient by design.
    • 2:03:40Now with that said, a good compiler could also
    • 2:03:42recognize that there is this optimization possibility,
    • 2:03:46and maybe do it for us.
    • 2:03:47But for now, best to get into the habit, best
    • 2:03:49to develop the muscle memory for making those better design decisions
    • 2:03:52yourselves.
    • 2:03:54Questions, then, on how we did this?
    • 2:03:58No?
    • 2:03:59All right, a few final building blocks for the day.
    • 2:04:03So we started by talking about those command line arguments that clang uses,
    • 2:04:07whereby, anything after the command that you type at a prompt, be it make
    • 2:04:13or clang or even cd in Linux, any word thereafter, or something
    • 2:04:18cryptic like -o is a command line argument.
    • 2:04:21It's an input to the command.
    • 2:04:22It's different from a function argument because a function argument, of course,
    • 2:04:26is an input to a function.
    • 2:04:27But it's the same idea.
    • 2:04:28It's just different syntax after the dollar sign at the prompt.
    • 2:04:30Well, it turns out that command line arguments
    • 2:04:33are something you can now use in your own programs
    • 2:04:37by accessing words after the prompt.
    • 2:04:41And let me propose that we invent this as follows.
    • 2:04:45Let me propose that we switch back to VS Code here,
    • 2:04:49and I'll open a new file here called greet.c.
    • 2:04:53So in greet.c, it's going to be a program that very simply greets
    • 2:04:56the user.
    • 2:04:57Had we written this last week, we would have done this.
    • 2:04:59Include cs50.h, and then include stdio.h, and then int main void,
    • 2:05:08and then we might do something simple like string name equals get_string,
    • 2:05:13quote unquote, "What's your name?"
    • 2:05:15And then we would have printed out, as always, Hello, %s,
    • 2:05:20and then plugging in that name.
    • 2:05:21So this is the same program we've implemented many times, just
    • 2:05:25to make sure it works--
    • 2:05:26although, nope, that's not quite the same program.
    • 2:05:29Semicolon's in the wrong place.
    • 2:05:30This now is the same program.
    • 2:05:32So make greet, ./greet, and I'll type in my own name. hello, David.
    • 2:05:37So we're back there.
    • 2:05:38Now, what's arguably a little annoying about this program,
    • 2:05:41if I type in something else like, Carter,
    • 2:05:44Enter, I have to run the program, wait for the prompt, type in my name,
    • 2:05:48hit Enter.
    • 2:05:48And that's fine, but imagine if every program worked like this.
    • 2:05:52Like make, suppose you could only type make, then you wait for a prompt,
    • 2:05:55then you type the name of the program you want to make, then you hit Enter.
    • 2:05:58Or worse, in Linux when you have to change directories,
    • 2:06:01as you might have for problem set one, what if you had to type cd, Enter,
    • 2:06:05now type the name of the folder you want to change into, Enter--
    • 2:06:07I mean, it just slows life down.
    • 2:06:09And so it just gets annoying quickly.
    • 2:06:11So command line arguments just let you express your whole thought all at once.
    • 2:06:16So how can I do this?
    • 2:06:18Well, if I want to express the notion of command line arguments in my code,
    • 2:06:22I could do something like this.
    • 2:06:25I could, for the very first time, go up and get
    • 2:06:28rid of this void, which as of today means, this program takes no command
    • 2:06:33line arguments.
    • 2:06:34And I can change it to exactly this.
    • 2:06:37Int argc, string argv, with brackets.
    • 2:06:43Now it's cryptic, admittedly.
    • 2:06:44And let me zoom in.
    • 2:06:46But I think we can perhaps infer now, what's going on.
    • 2:06:49If main now does not have void as its input, which
    • 2:06:52means it takes no arguments, surely, the spoiler
    • 2:06:55here is that now main will take command line arguments somehow.
    • 2:06:59Any guesses as to what argv is or will be?
    • 2:07:05What might this represent?
    • 2:07:08It's an array of strings, right, by way of the syntax.
    • 2:07:11Yeah?
    • 2:07:13AUDIENCE: All the characters will be typed out.
    • 2:07:15DAVID MALAN: Exactly.
    • 2:07:16It will be all of the characters, or really all of the words
    • 2:07:18that you type at the prompt.
    • 2:07:19Argc, as an int, any guess?
    • 2:07:24Argument count is what it generally stands for, though technically,
    • 2:07:28you could call these things anything.
    • 2:07:30But this is the convention.
    • 2:07:31Because I claimed earlier that arrays don't keep track of their own length,
    • 2:07:35if you want to know how many words the human typed at the prompt
    • 2:07:38after your program's name, you have to be told,
    • 2:07:41not just the array of the words, but the length of that array.
    • 2:07:45The strings, you can figure out the length of using strlen,
    • 2:07:48but you can't figure out the length of the array of strings, the collection
    • 2:07:53of words that the human typed in.
    • 2:07:55So how can I now use this?
    • 2:07:56Well, let me go ahead and do this.
    • 2:07:59Let me go ahead and change this program now just to be printf, quote unquote,
    • 2:08:04"hello, %s\n", then argv[1].
    • 2:08:11So this is not the best version of my code yet, but it's my first.
    • 2:08:14Make greet, and now let me do ./greet, David all at once.
    • 2:08:21Enter, hello, David.
    • 2:08:23Now let me run it again, ./greet, Carter.
    • 2:08:25Enter, hello, Carter.
    • 2:08:27It's a marginal improvement, but I don't have
    • 2:08:29to wait for get_string to prompt me to hit Enter.
    • 2:08:32It's just speeding things up, twice as fast.
    • 2:08:34One less command to type in.
    • 2:08:36But I deliberately did [1], but what's the beginning of argv?
    • 2:08:41It would be [0].
    • 2:08:44Well, what's that?
    • 2:08:45This is sometimes useful, though for now, it's not.
    • 2:08:48Suppose I recompile my code and run this program now, greet David.
    • 2:08:54Anyone want to guess what's in argv[0]?
    • 2:08:58AUDIENCE: [INAUDIBLE]
    • 2:08:59DAVID MALAN: Say again?
    • 2:09:00AUDIENCE: Greet, hello.
    • 2:09:01DAVID MALAN: Greet, Enter, hello, ./greet.
    • 2:09:04So if you want to sort of inception style your program to figure out what
    • 2:09:08its own name is, or at least how it was executed at the command line,
    • 2:09:11at the terminal, you can look at argv[0].
    • 2:09:14In general, probably not that useful, probably better
    • 2:09:17to start looking at [1], which was the first word after the program name.
    • 2:09:21And if there were more, I could do this: how about argv[2],
    • 2:09:25let me add in a second %s.
    • 2:09:27Let me recompile greet.
    • 2:09:29Let me do ./greet David Malan, Enter, and that, too, now works,
    • 2:09:35taking in two words at the prompt.
    • 2:09:37If I really want to be smart at this now,
    • 2:09:38I could do something like this, though.
    • 2:09:40How about if the count of arguments, A.K.A. argc,
    • 2:09:44equals equals 2, then assume that the human typed in only their first name,
    • 2:09:49and do printf hello comma %s\n, and then argv[1].
    • 2:09:58Else, if the human did not provide exactly two
    • 2:10:01arguments, the name of the program and their own name,
    • 2:10:04let's just print out a default value, lest they forgot their name
    • 2:10:07or they typed in two names or three names.
    • 2:10:09Let's just do, hello comma world as a default.
    • 2:10:13And we'll just ignore what the human typed in.
    • 2:10:15If I recompile this, make greet, I can do ./greet and David again, Enter.
    • 2:10:20Oops-- sorry, what am I missing?
    • 2:10:24Yeah, so newbie mistake.
    • 2:10:26Else, all right, make greet again.
    • 2:10:30./greet, David, Enter, there's my hello, David.
    • 2:10:34But if I omit my name, I just get the generic, like a default value.
    • 2:10:37And if I get a little curious and I type in both names, then I get ignored too.
    • 2:10:41Why?
    • 2:10:42Because I just haven't built in support for argc of three.
    • 2:10:44I could do anything I want, but now we have access
    • 2:10:47to these kinds of building blocks.
    • 2:10:50All right, what else might I do here?
    • 2:10:52Well, it turns out there might be some final features for us to now execute.
    • 2:10:57Notice, though, that in C, despite what you
    • 2:11:00might see in books or online tutorials, nowadays,
    • 2:11:02the two official formats for defining a main function
    • 2:11:06are either this, which we've been using now for two plus weeks or now this,
    • 2:11:11whereby, you change the void to int argc,
    • 2:11:14and then for now, string argv, and then empty brackets.
    • 2:11:17And we'll see that this, too, is a simplification, some training
    • 2:11:20wheels if you will.
    • 2:11:21But for now, those are the two forms, even
    • 2:11:23though you will see in online tutorials and even books, some people
    • 2:11:26use main in different ways.
    • 2:11:27These are the two now to keep in mind.
    • 2:11:30And I'll note that these command line arguments
    • 2:11:32are kind of all over the place.
    • 2:11:33Didn't probably expect to see this word on the screen here.
    • 2:11:35And what does it mean?
    • 2:11:36Well, it turns out that for decades-- there's
    • 2:11:37actually this program that comes with Linux systems
    • 2:11:40in particular called cowsay.
    • 2:11:41Why?
    • 2:11:42Probably because someone had too much free time once and decided
    • 2:11:45to write a program that creates ASCII art out of a cow saying something
    • 2:11:49textually on the screen.
    • 2:11:51But you use cowsay, just for fun, by way of command line arguments.
    • 2:11:55So for instance, let me propose that I go back to VS Code
    • 2:12:00here, not because I want to write any code,
    • 2:12:03but I just want to use my terminal window.
    • 2:12:04And let me maximize my terminal window here.
    • 2:12:07And let me go ahead and type in something like, how about cowsay,
    • 2:12:11space moo?
    • 2:12:13So cowsay is not a program I wrote.
    • 2:12:14It's been around for decades.
    • 2:12:16But we installed it in VS Code for you in the cloud.
    • 2:12:18It takes at least one command line argument.
    • 2:12:21What do you want the cow to say?
    • 2:12:23I can say, cowsay moo, and hit Enter, and voila, there
    • 2:12:26is my ASCII art of a cow saying moo on the screen.
    • 2:12:29It can say multiple words.
    • 2:12:31So I can say, Hello, world, Enter.
    • 2:12:33And now it says, Hello, world.
    • 2:12:35So this is just an example of a silly program that uses command line
    • 2:12:38arguments, but it takes others too.
    • 2:12:40Just like clang, it uses this convention of hyphens
    • 2:12:43to change the output of the program.
    • 2:12:45Dash something is just a super common convention with command line arguments
    • 2:12:49when you want a very terse notation for some option like output.
    • 2:12:53In cowsay, I read the documentation, and it turns out
    • 2:12:56there's a dash f command line argument that
    • 2:12:59allows you to change the appearance of the cow, if you will.
    • 2:13:03So if I do cowsay dash f, duck, and then some other word like quack,
    • 2:13:10it's no longer a cow.
    • 2:13:11That command line argument turns it into a tiny, adorable duck instead.
    • 2:13:15And then lastly, just for fun, because I spent way too much time
    • 2:13:19playing with command line arguments.
    • 2:13:20Cowsay dash f, dragon, and then how about, rawr, Enter,
    • 2:13:25you can even get this on the screen here.
    • 2:13:27So this, too, is just an example of what you
    • 2:13:30can do with these command line arguments now that we have this building block.
    • 2:13:34And there's one final thing we can now do with code.
    • 2:13:36There's one last feature today that we'll
    • 2:13:39introduce before we now connect all of these dots
    • 2:13:41to readability and encryption by talking, lastly, about something called
    • 2:13:47exit status.
    • 2:13:48It turns out that whenever your main function exits,
    • 2:13:52it returns a secret integer that you can figure out,
    • 2:13:55as the programmer or an advanced user, what it was.
    • 2:13:58And these exit codes, exit statuses, are typically used to indicate errors.
    • 2:14:02So for instance, over the past couple of years, if you've used Zoom
    • 2:14:05and you ever got some kind of error, you might have seen a screen like this.
    • 2:14:08It's usually not that helpful, maybe tells you to click
    • 2:14:11Report Problem or Contact Support.
    • 2:14:13But very often in our human world on Macs, PCs, and phones,
    • 2:14:16you see cryptic error codes, like literally numbers
    • 2:14:20that probably only Zoom knows, or Microsoft or Google or whatever company
    • 2:14:23wrote the software you're using.
    • 2:14:25But that number corresponds to a specific error
    • 2:14:28that some human somewhere knows might very well happen.
    • 2:14:32These are used similarly, although under a different name
    • 2:14:34that we'll talk about later in the term, on the web as well.
    • 2:14:38Have you ever seen this-- maybe not character, but number?
    • 2:14:41So, 404 means what?
    • 2:14:43AUDIENCE: Error.
    • 2:14:44DAVID MALAN: So error, yes, but really, not found.
    • 2:14:47So, why?
    • 2:14:48I mean, this is the most arcane thing.
    • 2:14:49And we'll talk in a few weeks about what this and other numbers mean,
    • 2:14:53but numbers are all around us in technology,
    • 2:14:54and they very often mean something to the technical people who
    • 2:14:57wrote the software, less so to humans like you and me.
    • 2:15:00Why so many of us recognize 404 is kind of weird,
    • 2:15:03that like that's been around long enough that we all know it.
    • 2:15:05But it really is just a special number that represents an error of some sort.
    • 2:15:10So it turns out, the last thing we'll reveal today
    • 2:15:13about what we've been taking for granted for two weeks,
    • 2:15:15is what the int is in main.
    • 2:15:18We've seen, just a moment ago, that the thing in the parentheses, which
    • 2:15:21up until now has been void, which means no command line arguments,
    • 2:15:24is now int argc, string argv, brackets, which just means, yes, command line arguments.
    • 2:15:29And we've seen how to access them.
    • 2:15:31So the last piece of the puzzle, honestly,
    • 2:15:33of all the cryptic syntax the past two weeks, is just what int means.
    • 2:15:37Int is always there for main, and it indicates
    • 2:15:40that main will always return an integer, even though you and I have never
    • 2:15:44done so explicitly.
    • 2:15:46Usually, main returns 0, by default. But it
    • 2:15:50would be weird if you saw an error message saying 0, so 0 is just hidden.
    • 2:15:53You would never see it on the screen.
    • 2:15:55But it's happening automatically by way of how C is designed.
    • 2:15:58So let me write one final program here.
    • 2:16:01I'll call it, for instance, status.c to show you these exit statuses.
    • 2:16:05Code of status.c, and then up here, let me do something simple like include
    • 2:16:10cs50.h, then include stdio.h, and then int main--
    • 2:16:18actually, let's use a command line argument. int argc, string argv[],
    • 2:16:21so that's copy, paste.
    • 2:16:23But now let's do this.
    • 2:16:26If argc does not equal 2--
    • 2:16:29why don't we do something like this?
    • 2:16:30Let's not just default to hello, world like last time.
    • 2:16:33Let's yell at the user.
    • 2:16:34So let's say something like printf missing command line argument,
    • 2:16:38so that they know they screwed up and they need
    • 2:16:40to run the program again correctly.
    • 2:16:43Else, let's go ahead and say, print out, as before, Hello, comma %s,
    • 2:16:51and then plug in argv[1], so the human's name from the prompt.
    • 2:16:56Now at this point, let me go ahead and run status, ./status,
    • 2:17:01and I'll type nothing first.
    • 2:17:03I get yelled at.
    • 2:17:04This time, I'll type it again. ./status David, and it works properly.
    • 2:17:10But now let me show you a somewhat secret, cryptic command.
    • 2:17:14You can type this at your prompt, and it's just a coincidence
    • 2:17:17that there's another dollar sign.
    • 2:17:18Echo $?, totally arcane, but it allows you
    • 2:17:22to see what exit status your program has ended with.
    • 2:17:25So let me run this again the wrong way.
    • 2:17:27./status, I get the error message.
    • 2:17:31What was secretly returned?
    • 2:17:32I can't see it.
    • 2:17:33There's obviously no error screen, but by typing echo $?,
    • 2:17:37I can see that, oh, my program automatically, by default, returns
    • 2:17:41zero.
    • 2:17:42However, if I run it again correctly, ./status David, Enter,
    • 2:17:46this is the correct version.
    • 2:17:48But if I run echo $?
    • 2:17:50again, it still exited with 0.
    • 2:17:52And long story short, this is just a missed opportunity.
    • 2:17:55When something goes wrong, why don't I return a value other than 0?
    • 2:17:590, by default, means success.
    • 2:18:01And it's always there automatically.
    • 2:18:02But you can control this.
    • 2:18:04I can go into my code here and return 1, else, if something works fine,
    • 2:18:11I can return 0, by default. And honestly, if I omit the return zero,
    • 2:18:14again, zero automatically is returned.
    • 2:18:17So let me go ahead and go be explicit, just so I know what's going on.
    • 2:18:20Make status again, ./status, and let's do this correctly with David.
    • 2:18:26Enter, hello, David.
    • 2:18:28Echo $?, zero.
    • 2:18:32So all is well.
    • 2:18:33But now if I do ./status and nothing, or multiple things, but not just David,
    • 2:18:38Enter, I get the error message.
    • 2:18:40But now if I do echo $?, voila, there now is the one.
    • 2:18:45So what does this now mean?
    • 2:18:47This is, in the graphical world, we would just
    • 2:18:49show something like this on the screen, which is
    • 2:18:51a little more informative to the user.
    • 2:18:52But even in the Linux world where you don't have a GUI,
    • 2:18:54necessarily, even for the programs we've written,
    • 2:18:56you can check these exit statuses.
    • 2:18:58And in fact, more comfortable, more advanced programmers,
    • 2:19:01when they write code that calls programs,
    • 2:19:03be it cowsay or anything else, you can, in code,
    • 2:19:07check what the exit status is of a program, and then decide,
    • 2:19:11did my program work or did it not?
    • 2:19:13And now let's connect the final dots before we
    • 2:19:16adjourn for some fruit snacks.
    • 2:19:19Cryptography, namely one of the applications this week
    • 2:19:22via which you'll be able to send, if you will,
    • 2:19:24secret messages, and better yet, decrypt secret messages.
    • 2:19:27This will be in addition to perhaps analyzing
    • 2:19:29the readability of text using heuristics, like we
    • 2:19:32identified at the start of class, too.
    • 2:19:34So cryptography is just the art, the science of encrypting information,
    • 2:19:38scrambling information so that if you have a secret message
    • 2:19:41to send in so-called plaintext, you can run it through some algorithm
    • 2:19:45and turn it into what's called ciphertext, thereby, encrypting it.
    • 2:19:49And only someone who knows what algorithm you've used
    • 2:19:53and what input you've used to the algorithm, theoretically,
    • 2:19:55can decrypt that process and convert it back to the original message.
    • 2:19:59So if we use our mental model from last week, here is a problem.
    • 2:20:03Here is an input and output.
    • 2:20:04The goal I claim here is to take some plain text, like the message
    • 2:20:08you want to send, think back to grade school
    • 2:20:10if you ever passed a note to a friend or to your crush saying, I love you,
    • 2:20:13it's a little awkward if the teacher or someone else intercepts the paper.
    • 2:20:16And in English, it just says, I love you, or whatever it is.
    • 2:20:19It'd be nice if you had at least encrypted it in some way.
    • 2:20:22But the other person needs to know what algorithm you used
    • 2:20:25and what inputs you use to that algorithm
    • 2:20:27so that, ultimately, they can decode the so-called ciphertext, which
    • 2:20:31is the output.
    • 2:20:32So what goes inside of the box today?
    • 2:20:34Well, an algorithm, as it relates to cryptography, is called a cipher.
    • 2:20:37And a cipher is a fancy name for an algorithm that encrypts text
    • 2:20:41from plaintext to ciphertext.
    • 2:20:43The catch is, there needs to be not just the algorithm,
    • 2:20:46there needs to be an input to it.
    • 2:20:48And so, for instance, you might draw the picture like this for the first time
    • 2:20:52today.
    • 2:20:53And we've seen this in code.
    • 2:20:54You can give multiple inputs or arguments to functions.
    • 2:20:57So in this black box, can you imagine passing in the message
    • 2:20:59you want to send, and then some secret.
    • 2:21:02So for instance, suppose that, the simplest
    • 2:21:05thing I could think of as a kid was instead of sending the letter A,
    • 2:21:08why don't I write the letter B?
    • 2:21:10Instead of the letter B, why don't I write the letter C?
    • 2:21:13So I can kind of shift the English alphabet by one space.
    • 2:21:16So A becomes B, B becomes C, dot, dot, dot,
    • 2:21:18Z becomes A. You can wrap around at the end.
    • 2:21:21And let's assume no punctuation in this part of the story.
    • 2:21:24So that's a very simple algorithm-- add a value to each letter
    • 2:21:29and send the value as the ciphertext.
    • 2:21:32And now the teacher, the classmate, they have to know that you use,
    • 2:21:35not only this rotational algorithm, also known as a Caesar cipher,
    • 2:21:39they also need to know what number you use.
    • 2:21:41Did you add 1 to every letter, 2 to every letter, 25 to every letter?
    • 2:21:45Now if they're super smart and probably not the young age in this story,
    • 2:21:49they could also just try all possibilities.
    • 2:21:51And that would be an attack on the algorithm.
    • 2:21:53This is not a sophisticated algorithm, but it's
    • 2:21:55enough to send a message in class.
    • 2:21:56So if the two inputs now are HI!
    • 2:21:58as the plain text message, and 1 as the so-called key, the secret number
    • 2:22:04that only you and the other person know, you
    • 2:22:06might be able to encrypt a message from one way to the other.
    • 2:22:11And so in this case, for instance, HI!
    • 2:22:13would become I-J-!.
    • 2:22:16In this version of the algorithm, we're not
    • 2:22:17going to bother with numbers or punctuation.
    • 2:22:19We'll only operate on A through Z, be it uppercase or lowercase.
    • 2:22:23So now if you were to receive a slip of paper in class with I-J on it,
    • 2:22:28you, the recipient, would know what it is
    • 2:22:31so long as you know that the sender used one,
    • 2:22:33because you just reverse the algorithm and you subtract one instead.
    • 2:22:36The teacher, they probably don't know what this means,
    • 2:22:39and they're not going to spend time hacking the message,
    • 2:22:41so it just looks scrambled to them.
    • 2:22:42And that's what we get from encryption.
    • 2:22:44Someone who intercepts it, be it in class or in the real world,
    • 2:22:47on the internet or anywhere else, can't actually figure out, ideally,
    • 2:22:51what it is you have sent.
    • 2:22:52The opposite, of course, is indeed called decryption,
    • 2:22:55but the process is the same.
    • 2:22:56We now pass in negative 1.
    • 2:22:58And so how about this?
    • 2:23:00Why don't we end with a demonstration here?
    • 2:23:02UIJT XBT DT50-- there's a bit of a tell there.
    • 2:23:08If we pass that in and do negative 1, well,
    • 2:23:11how do we get out the plaintext originally?
    • 2:23:14Well, if this is the ciphertext, and we subtract 1 from each letter,
    • 2:23:18I think U becomes T, I becomes H, J becomes I, T becomes S, X becomes W,
    • 2:23:28B becomes A, T becomes S, D becomes C, T becomes S, and this was, indeed, CS50.
    • 2:23:37Have a duck on your way out, and some snacks in the lobby.
    • 2:23:40[APPLAUSE]
    • 2:23:42[FILM ROLLING]
    • 2:23:43[MUSIC PLAYING]