1 00:00:00,000 --> 00:00:17,450 *35c3 Preroll music* 2 00:00:17,450 --> 00:00:24,130 Herald: And now Ned Williamson's talk: "Attacking Chrome Inter Process 3 00:00:24,130 --> 00:00:30,060 Communication: reliably finding bugs to escape the Chrome sandbox". He will be 4 00:00:30,060 --> 00:00:32,810 talking about finding bugs in the Chrome inter-process communication in order to 5 00:00:32,810 --> 00:00:40,230 escape from the sandbox using a fuzzing method to enumerate the attack surface of 6 00:00:40,230 --> 00:00:46,513 the Chrome inter-process communication. Ned is a vulnerability researcher. He 7 00:00:46,513 --> 00:00:51,690 likes C and C++ vulnerabilities. He did research for consoles and browsers and now 8 00:00:51,690 --> 00:00:57,052 started to work on mobile devices. Please welcome him with a huge round of applause. 9 00:00:57,052 --> 00:01:04,030 *Applause* 10 00:01:04,030 --> 00:01:09,859 Ned: All right. Hello everyone. My name is Ned and today I'll be talking about Chrome 11 00:01:09,859 --> 00:01:16,920 IPC. And actually as I was writing this talk, I kind of came up with this idea to 12 00:01:16,920 --> 00:01:22,283 make it more usable for everyone and the way I ended up doing this was by trying to 13 00:01:22,283 --> 00:01:26,179 start really general and then kind of going more and more specific all the way 14 00:01:26,179 --> 00:01:31,789 down to the Chrome IPC fuzzing. So if you're really technical the end will be 15 00:01:31,789 --> 00:01:36,289 still interesting and then if you're new to this stuff hopefully the beginning part 16 00:01:36,289 --> 00:01:44,702 will show some of how to get started. So just a quick overview about me. I've 17 00:01:44,702 --> 00:01:49,727 mostly been spending the last several years on low level vulnerability research 18 00:01:49,727 --> 00:01:56,159 and my particular interest is on any kind of critical bugs. Meaning kind of the more 19 00:01:56,159 --> 00:02:00,270 severe individual bug the more interesting to me. 
So I'm trying to kind of solve this 20 00:02:00,270 --> 00:02:06,679 problem of: How do we make the bug finding process effective enough to bring out 21 00:02:06,679 --> 00:02:12,552 these really rare hidden bugs? And you will see an example of how to do that by the 22 00:02:12,552 --> 00:02:19,412 end. But just an overview. I've basically worked on four things the first being CTFs 23 00:02:19,412 --> 00:02:25,800 then went to 3DS and Chrome. Now I'm starting on XNU, but just a month ago 24 00:02:25,800 --> 00:02:32,860 so, not too much yet. Before we get into it I just want to give a little recap of what 25 00:02:32,860 --> 00:02:40,280 happened since last time. So I was part of the Nintendo hacking talk two years ago 26 00:02:40,280 --> 00:02:51,900 here and I presented two exploits called Soundhax and Fasthax and not to go into it 27 00:02:51,900 --> 00:02:56,351 too much but I did want to share like what happened here because I was actually 28 00:02:56,351 --> 00:03:01,425 surprised. I put Google analytics on the Soundhax website and I thought like maybe 29 00:03:01,425 --> 00:03:06,210 a thousand people use it or something but I just looked at the stats like a couple 30 00:03:06,210 --> 00:03:12,254 weeks ago and then turned out like 100k people used it or something. And then I 31 00:03:12,254 --> 00:03:18,879 searched YouTube and found like these huge videos where they were copied so like I 32 00:03:18,879 --> 00:03:23,340 wanted to have a screenshot but it's copyrighted so didn't do that, but 33 00:03:23,340 --> 00:03:28,040 basically it looks like something on the order of about a million users which is 34 00:03:28,040 --> 00:03:34,040 crazy because this is one of my intro projects really so I think like this 35 00:03:34,040 --> 00:03:37,540 should show you that you don't have to be all the way up onto Chrome or whatever it 36 00:03:37,540 --> 00:03:44,350 is to get into this and do some huge fun project. 
And then I just wanted to 37 00:03:44,350 --> 00:03:49,870 publicly talk about the donations because I had a donation link on the Soundhax 38 00:03:49,870 --> 00:03:55,860 website and we fortunately received about a thousand dollars in donations and then 39 00:03:55,860 --> 00:04:02,990 half of that went to the emulator people because they, uh, that's how I eventually 40 00:04:02,990 --> 00:04:08,500 wrote my exploit for Soundhax so it made sense to repay that, and then the other 41 00:04:08,500 --> 00:04:14,490 half went to buying switches for like the toolchain developers who couldn't afford 42 00:04:14,490 --> 00:04:21,911 it and so just wanted to thank everyone who used this or whoever donated. Just a 43 00:04:21,911 --> 00:04:27,130 shout out. So we will get into the actual meat of the talk. So basically I want to 44 00:04:27,130 --> 00:04:33,440 focus on the bug finding process, not exploitation necessarily because this 45 00:04:33,440 --> 00:04:38,760 topic is pretty well explored I think and I think the bug hunting aspect is kind of 46 00:04:38,760 --> 00:04:44,400 what's the most prohibitive for people to join in. And when I look at the number of 47 00:04:44,400 --> 00:04:48,950 people who I play CTF with who are really good at exploitation and then a number of 48 00:04:48,950 --> 00:04:54,674 like these prolific bug hunters it just seems like from what I see from how smart 49 00:04:54,674 --> 00:04:58,375 people are there should be more people doing the bug hunting and I hope that if I 50 00:04:58,375 --> 00:05:05,650 can talk about it more people can come over. So with that, the agenda will be 51 00:05:05,650 --> 00:05:13,370 just overall how do you make a process to achieve any goal. 
Then next how do you 52 00:05:13,370 --> 00:05:18,970 apply this kind of, some kind of strategy to bug hunting, then this new fuzzing 53 00:05:18,970 --> 00:05:24,041 style I've been kind of developing, some other people out in the industry have 54 00:05:24,041 --> 00:05:30,520 been working on. And then finally how does this all tie back to Chrome IPC. So also 55 00:05:30,520 --> 00:05:36,237 just to mention - I should mention that the bug I'll be showing in this 56 00:05:36,237 --> 00:05:41,210 presentation was used in a full chain exploit that I developed with a couple 57 00:05:41,210 --> 00:05:47,070 other people and the details of the exploitation of that will be discussed at 58 00:05:47,070 --> 00:05:50,610 OffensiveCon, so that's also here in Germany and hopefully people will check it 59 00:05:50,610 --> 00:05:56,310 out. So how do you become an expert in anything and I kind of was thinking this 60 00:05:56,310 --> 00:06:00,790 before I even started anything and I was like in the CTF stage and I was just kind 61 00:06:00,790 --> 00:06:06,090 of curious like if I approach this with the mindset of there's this arbitrary 62 00:06:06,090 --> 00:06:10,913 skill I want to learn. And if I approach it strategically like what's going to 63 00:06:10,913 --> 00:06:16,010 happen. So I looked into this expert research and then there's kind of this 64 00:06:16,010 --> 00:06:21,526 idea of pop psych like you need to study something deliberately for 10000 hours to 65 00:06:21,526 --> 00:06:26,710 get good at it. And you know there's some debate about this number it's kind of made 66 00:06:26,710 --> 00:06:30,980 up, I guess, but the essential idea of deliberate practice I think is very 67 00:06:30,980 --> 00:06:38,770 useful. And it's exactly how I structured my study. 
And so what deliberate means is, 68 00:06:38,770 --> 00:06:44,650 when you're learning you want to be thinking like purposefully like I want to 69 00:06:44,650 --> 00:06:50,110 make sure that the project that I'm doing is making me get better. I want to be 70 00:06:50,110 --> 00:06:56,750 actually thinking about how I'm structuring my training and then you want 71 00:06:56,750 --> 00:06:59,870 to make sure that you're kind of always struggling because that's just how you're 72 00:06:59,870 --> 00:07:06,390 growing. So essentially to do this you just need to keep picking projects that 73 00:07:06,390 --> 00:07:11,662 have some like success and failure feedback mechanism that's tied to the real 74 00:07:11,662 --> 00:07:15,669 world and you know with bug hunting like this is very obvious you know you're 75 00:07:15,669 --> 00:07:21,400 either finding a bug or not. And as I mentioned you want something difficult but 76 00:07:21,400 --> 00:07:26,280 achievable. And so this kind of order that I did the different projects I mentioned 77 00:07:26,280 --> 00:07:31,660 in the beginning was specifically chosen so that each stage would be achievable to 78 00:07:31,660 --> 00:07:38,940 me. But also like really really stretching what I could do. And there's a funny 79 00:07:38,940 --> 00:07:44,419 anecdote there's this guy named Ben Franklin from American history. And I read 80 00:07:44,419 --> 00:07:48,667 the story that he used to be really bad at writing and wanted to get better. So the 81 00:07:48,667 --> 00:07:54,389 way he did it was he took an essay that looked perfect to him and then he took 82 00:07:54,389 --> 00:07:59,630 notes on it and then a week later he rewrote the essay from the notes and then 83 00:07:59,630 --> 00:08:05,260 he would just compare the goal versus what he had done and basically saw all the 84 00:08:05,260 --> 00:08:11,639 shortcomings. And so that kind of just stuck in my head. 
And so I'll show like 85 00:08:11,639 --> 00:08:18,639 how do you play this kind of trick to bug finding practice. And then just another 86 00:08:18,639 --> 00:08:25,684 thing with setting goals for bug hunting. A lot of it is psychological. I think it's 87 00:08:25,684 --> 00:08:33,880 almost psychological more than intelligence, for sure. And basically you 88 00:08:33,880 --> 00:08:38,959 want to iteratively pick harder and harder project so that your tolerance for failure goes 89 00:08:38,959 --> 00:08:44,620 up and up. And so by the time I was working on Chrome I worked on it every day for six 90 00:08:44,620 --> 00:08:50,370 months, like right home from work until 1:00 AM, sleep, up, and then all day every 91 00:08:50,370 --> 00:08:55,908 weekend and found nothing the whole time. And then just one day found something and 92 00:08:55,908 --> 00:09:02,001 then from there all that accumulated struggle and effort, when the bug 93 00:09:02,001 --> 00:09:06,572 precipitated it was just like a sign that all of these necessary skills were there 94 00:09:06,572 --> 00:09:13,250 and then I was able to repeat it. And so now we'll talk about what that actually looked 95 00:09:13,250 --> 00:09:19,910 like for bug hunting. So when you think about how to train the skill I think 96 00:09:19,910 --> 00:09:25,360 there's kind of two constituent skills that are important and those are knowing 97 00:09:25,360 --> 00:09:31,040 where to look and then recognizing the bug when you're looking at it. And this first 98 00:09:31,040 --> 00:09:37,310 part is just from my own experience it seemed like just being a developer it's 99 00:09:37,310 --> 00:09:42,210 pretty easy to get a sense for, you know, you can look at the Git logs like I'm 100 00:09:42,210 --> 00:09:47,930 mentioning here, other crashes happening somewhere in the library, are bugs getting 101 00:09:47,930 --> 00:09:53,550 reported publicly? Does the code look bad? 
You know it's not hard to tell that 102 00:09:53,550 --> 00:09:58,420 something looks sketchy. But I think what's really hard is getting the bug to 103 00:09:58,420 --> 00:10:05,247 kind of come out. And so that's where I'll talk about strategy kind of directly. And 104 00:10:05,247 --> 00:10:14,940 so I kind of have this training idea, where essentially once you have this kind of 105 00:10:14,940 --> 00:10:18,000 target in mind where it's a little bit out of your skill range but you think it's 106 00:10:18,000 --> 00:10:24,310 doable, you try to enumerate all the existing bug reports and then look through 107 00:10:24,310 --> 00:10:28,432 each of them and then it's this like Ben Franklin idea like you take the bug and 108 00:10:28,432 --> 00:10:33,300 then you look at it. Usually there will be like you know this block of text and 109 00:10:33,300 --> 00:10:36,599 they're mentioning like the file where it's happening and stuff and you can kind 110 00:10:36,599 --> 00:10:43,580 of skim it and sense, like, where the bug is without actually looking at what it 111 00:10:43,580 --> 00:10:49,130 is. And so then you go over and you try to find it yourself. And you know it's really 112 00:10:49,130 --> 00:10:54,110 important that you actively try to look for the bug yourself and kind of strain 113 00:10:54,110 --> 00:10:58,400 yourself and when you've given up essentially then you look at what was the 114 00:10:58,400 --> 00:11:03,790 bug. And then through that struggle it's usually pretty clear like what was the 115 00:11:03,790 --> 00:11:09,630 fundamental thing you were missing. And you know just by repeating this process 116 00:11:09,630 --> 00:11:14,970 constantly this is how you train. And so this is actually how I first ever started 117 00:11:14,970 --> 00:11:20,480 on bug hunting. You know, some of you may know j00ru, he's this like, really 118 00:11:20,480 --> 00:11:25,130 talented researcher. 
He's been at it for a long time and I remember seeing this blog 119 00:11:25,130 --> 00:11:29,260 post from him showing all these IDA Pro bugs and it just kind of blew my mind. 120 00:11:29,260 --> 00:11:36,450 Like wow someone took IDA and found like security vulnerabilities in it and then 121 00:11:36,450 --> 00:11:39,773 when I looked at the bug reports they're pretty small so I thought OK how do I 122 00:11:39,773 --> 00:11:46,750 practice and how could I have done this myself. So basically the first day, you 123 00:11:46,750 --> 00:11:51,061 know they're all like integer overflow bugs, I could barely even, like, I knew 124 00:11:51,061 --> 00:11:56,650 what integer overflow was, but I hadn't actively looked for it before. And so I 125 00:11:56,650 --> 00:12:00,870 was looking at the function and I couldn't find it. Basically I went to sleep feeling 126 00:12:00,870 --> 00:12:05,420 like oh god like "I'll never be able to do this stuff." And then the next day I looked 127 00:12:05,420 --> 00:12:09,960 at it again. I was like "Oh yeah that's actually easy". And then I kind of failed 128 00:12:09,960 --> 00:12:14,120 the second one and then by the third day I was like able to just see where they were 129 00:12:14,120 --> 00:12:19,930 once you know I knew where to look. So that kind of made me think "OK, I'll just 130 00:12:19,930 --> 00:12:25,268 keep doing this for a long time and keep doing harder and harder." So this is 131 00:12:25,268 --> 00:12:31,180 essentially the strategy. Like, I think, you know, I'm probably the perfect example of 132 00:12:31,180 --> 00:12:36,630 someone who was like an intermediate CTF player, really like insecure or whatever 133 00:12:36,630 --> 00:12:43,390 like and just wanted to get into this but I had no idea what I was doing. And I just 134 00:12:43,390 --> 00:12:46,380 kept thinking if I just believe in this kind of process you know hopefully it 135 00:12:46,380 --> 00:12:51,990 works out. 
And so here's just like a little really basic roadmap if you want to 136 00:12:51,990 --> 00:12:59,672 try to replicate what I did which is to focus on CTF because if you can do CTF 137 00:12:59,672 --> 00:13:05,084 binary problems these are perfect examples of a kind of training where you try to do 138 00:13:05,084 --> 00:13:09,830 something yourself. There's a write up and like once you can do these problems you 139 00:13:09,830 --> 00:13:13,800 know all the kind of low level details that are needed. You know what a bug is. 140 00:13:13,800 --> 00:13:18,040 Things like that. And then from there you just kind of progressively do harder and 141 00:13:18,040 --> 00:13:23,820 harder targets. And so there's kind of this component where like you know, I 142 00:13:23,820 --> 00:13:29,380 don't - you can't really assess your own ability like how much of this is innate or 143 00:13:29,380 --> 00:13:37,392 something and it just seemed to me that regardless of that, like I'm saying here, 144 00:13:37,392 --> 00:13:41,640 you know, "This isn't chess where you have people trained from birth with perfect 145 00:13:41,640 --> 00:13:47,510 study and decades of, you know, like we're barely figuring this stuff out and it's 146 00:13:47,510 --> 00:13:53,938 just kind of a huge mess." And so there's plenty of room for new people to join in. 147 00:13:53,938 --> 00:13:59,260 And then also there's a lot of these kind of stories about people who are just 148 00:13:59,260 --> 00:14:04,220 insanely naturally gifted and stuff. And I tried really hard to like, look into what 149 00:14:04,220 --> 00:14:08,720 these people are actually doing and I haven't found a case where someone wasn't 150 00:14:08,720 --> 00:14:15,215 working extremely hard. And so you know just keep that in mind. 
So just for the 151 00:14:15,215 --> 00:14:18,519 sake of time I won't go into this too much but if you're looking at the slides later 152 00:14:18,519 --> 00:14:23,790 I just kind of give more detail on like how I picked the mini projects and got down 153 00:14:23,790 --> 00:14:32,430 to Chrome. So now let's talk about fuzzing and so before I get into it, I should 154 00:14:32,430 --> 00:14:37,409 emphasize that you should really know how to do auditing and the first couple of 155 00:14:37,409 --> 00:14:44,255 years, like, up until well into the six months of failure on Chrome, you know, I was 156 00:14:44,255 --> 00:14:49,929 doing auditing the whole time. And I think fuzzing gets a bad rap because people 157 00:14:49,929 --> 00:14:54,740 think that these are unrelated strategies and people are only a fuzzer person or an 158 00:14:54,740 --> 00:15:02,990 auditor person. And really I think these things are extremely, like, they work 159 00:15:02,990 --> 00:15:07,150 really well together. But you can't really know why fuzzing is failing or how to 160 00:15:07,150 --> 00:15:14,790 even apply it or where to apply it without being able to audit yourself. And part of 161 00:15:14,790 --> 00:15:19,590 this was like, I noticed on Chrome that I could audit things but essentially the bug 162 00:15:19,590 --> 00:15:24,350 density was so low on the sandbox attack surface that I needed a way to kind of 163 00:15:24,350 --> 00:15:32,330 automate what I was looking for in each subsystem I was looking at. So you know 164 00:15:32,330 --> 00:15:36,550 you have like 20 subsystems that you want to read, well you know it takes about a 165 00:15:36,550 --> 00:15:41,690 week each minimum to learn. It's a lot faster to try to fuzz each thing for like a day or 166 00:15:41,690 --> 00:15:48,110 two and then... I don't know like it's... I can't explain it, I just 167 00:15:48,110 --> 00:15:54,265 did random things and then this is what worked. So. 
So how would you practice 168 00:15:54,265 --> 00:15:59,930 fuzzing. It's really the same idea that I had about auditing where you take a bug 169 00:15:59,930 --> 00:16:05,100 and just ask yourself like how would I have written a fuzzer in the first place 170 00:16:05,100 --> 00:16:15,420 to hit the bug. How could I have known to write the fuzzer that would have triggered 171 00:16:15,420 --> 00:16:20,110 this. Am I lacking something in auditing ability? Am I not able to write fuzzers 172 00:16:20,110 --> 00:16:25,149 well enough and actually it took me probably like a year of fuzzer writing to 173 00:16:25,149 --> 00:16:32,481 get good enough where I could actually act on my ideas, like, just it's kind of 174 00:16:32,481 --> 00:16:40,110 tricky. And so we'll get back to it later but this exact idea of practicing fuzzing 175 00:16:40,110 --> 00:16:44,730 on something that looks un-fuzzable is how I found this real exploitable sandbox 176 00:16:44,730 --> 00:16:52,250 escape. So really quick, just for those of you who don't know too much about fuzzing, 177 00:16:52,250 --> 00:16:56,770 at least in like the current meta. Essentially there's this tool called AFL 178 00:16:56,770 --> 00:17:05,030 that came out in 2014 which I think really shifted how well fuzzing worked. And the 179 00:17:05,030 --> 00:17:08,959 idea is essentially that you have some corpus of inputs that you want to Fuzz and 180 00:17:08,959 --> 00:17:14,040 then as you're mutating them you're looking for coverage feedback which is 181 00:17:14,040 --> 00:17:19,459 compiled into your code and then as you're mutating and running new test cases when 182 00:17:19,459 --> 00:17:24,839 you find new coverage you take that input and put in your corpus and over time your 183 00:17:24,839 --> 00:17:31,400 corpus kind of grows and grows as more coverage is hit. And so there's... 
this 184 00:17:31,400 --> 00:17:36,720 just seems to work really well and then there's another version of this basically 185 00:17:36,720 --> 00:17:43,700 called libFuzzer, and this is just written by the LLVM project and the same people 186 00:17:43,700 --> 00:17:49,540 who wrote address sanitiser also wrote libFuzzer and just in my experience it's 187 00:17:49,540 --> 00:17:53,090 written in a way that's a lot more extensible and easy to understand and play 188 00:17:53,090 --> 00:18:00,620 with. And so it makes it kind of easier to audit and fuzz together. And so if you 189 00:18:00,620 --> 00:18:06,080 want to think about what fuzzing is, essentially you're trying to replicate the 190 00:18:06,080 --> 00:18:13,450 normal testing process, but kind of parameterizing like what a unit test would 191 00:18:13,450 --> 00:18:17,540 be doing with some input bytes, that you're just feeding into something and seeing if 192 00:18:17,540 --> 00:18:24,650 it crashes. And so what's interesting is there's kind of this gap in the middle of 193 00:18:24,650 --> 00:18:29,380 like an end to end test which AFL will give you just feed [it] a binary or like 194 00:18:29,380 --> 00:18:34,370 the unit test which libFuzzer will give you where you just keep stuffing bytes 195 00:18:34,370 --> 00:18:38,652 into a parser and real security vulnerabilities are kind of logical in 196 00:18:38,652 --> 00:18:44,935 nature. And I think that's why people think that fuzzing isn't applicable and I 197 00:18:44,935 --> 00:18:49,940 think there's actually kind of this part in the middle where if you see a few 198 00:18:49,940 --> 00:18:54,490 components that look suspicious and then you can integrate them and fuzz them in 199 00:18:54,490 --> 00:19:00,500 isolation, but have the complexity that you'd kind of see in the real program, 200 00:19:00,500 --> 00:19:08,464 that's where a lot of bugs come out. And so how we do this is using a grammar. 
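The coverage-guided loop described above (keep a corpus, mutate an input, run it, and keep the mutant if it reaches new coverage) can be sketched in a few lines of Python. This is only a toy under stated assumptions and is not how AFL or libFuzzer are implemented: `target`, `hit`, `run`, `mutate` and `fuzz` are hypothetical names, the "bug" is planted, and coverage is reported through an explicit callback that stands in for compiled-in instrumentation.

```python
import random

def target(data, hit):
    """Hypothetical code under test; `hit` stands in for compiled-in coverage."""
    hit("entry")
    if data[:1] == b"F":
        hit("F")
        if data[1:2] == b"U":
            hit("FU")
            if data[2:3] == b"Z":
                hit("FUZ")
                if data[3:4] == b"!":
                    raise RuntimeError("planted bug reached")

def run(data):
    """Run one test case and return (coverage points reached, crashed?)."""
    covered = set()
    try:
        target(data, covered.add)
        return covered, False
    except RuntimeError:
        return covered, True

def mutate(data, rng):
    """Either overwrite one random byte or append one random byte."""
    buf = bytearray(data)
    if buf and rng.random() < 0.5:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)

def fuzz(max_iters=200_000, seed=0):
    """Toy coverage-guided loop: the corpus grows whenever new coverage is hit."""
    rng = random.Random(seed)
    corpus = [b""]
    seen, _ = run(b"")
    for _ in range(max_iters):
        candidate = mutate(rng.choice(corpus), rng)
        covered, crashed = run(candidate)
        if crashed:
            return candidate          # crashing input found
        if not covered <= seen:       # reached something new: keep this input
            seen |= covered
            corpus.append(candidate)
    return None

if __name__ == "__main__":
    print(fuzz())
```

The point of the sketch is the feedback: a blind random search would need to guess all four bytes at once, while the loop above only has to guess one byte at a time because each partial match is saved into the corpus.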
And 201 00:19:08,464 --> 00:19:14,040 so essentially it's combining generative fuzzing with coverage guided fuzzing and 202 00:19:14,040 --> 00:19:21,520 so we'll touch on how that works in a minute, but just for some more evidence on 203 00:19:21,520 --> 00:19:26,830 why does this work well, like I'm not the only person who is doing this. Kind of 204 00:19:26,830 --> 00:19:34,670 simultaneously myself and two other people I guess seem to have stumbled across this 205 00:19:34,670 --> 00:19:42,500 idea last year or two years ago, and those are Syzkaller and Lokihardt. So Syzkaller 206 00:19:42,500 --> 00:19:48,240 is a kind of fully automated Linux kernel fuzzer. And if you guys haven't seen this 207 00:19:48,240 --> 00:19:53,996 it's kind of hilarious like essentially they are automatically generating zero day 208 00:19:53,996 --> 00:20:01,056 bugs, like tens per month at least and they automatically generate the test case, 209 00:20:01,056 --> 00:20:05,525 like submit the report when the commit comes in it's like automatically tracked. 210 00:20:05,525 --> 00:20:09,850 It's basically this 0-day generator sitting there. *laughter* Yeah I know! And 211 00:20:09,850 --> 00:20:15,990 I see this. I'm like OK there's 3000 bugs that are being found. There's a web app 212 00:20:15,990 --> 00:20:20,570 for it and you can just download it, you know. And I saw the Linux talk from the 213 00:20:20,570 --> 00:20:24,976 author of Syzkaller and the YouTube videos has like 100 views and stuff. I'm just 214 00:20:24,976 --> 00:20:31,445 like OK, so people need to reiterate how important this stuff is. So then there's 215 00:20:31,445 --> 00:20:37,470 Lokihardt as well who's like a famous, extremely talented, kind of canonical 216 00:20:37,470 --> 00:20:45,895 auditing person and he seems to be doing a very similar thing with Chakra and V8 and 217 00:20:45,895 --> 00:20:52,220 he's finding like tens of interesting exploitable bugs. 
And then there's me who 218 00:20:52,220 --> 00:20:57,430 applied this on the Chrome sandbox and found over 30 bugs, about half of which 219 00:20:57,430 --> 00:21:02,830 are security relevant, and then five of which were a sandbox escape without render 220 00:21:02,830 --> 00:21:07,760 code execution. So you know this is just to emphasize like we're finding really 221 00:21:07,760 --> 00:21:14,280 important things with this technique. And since I discussed this the first time a 222 00:21:14,280 --> 00:21:20,809 couple of months ago at PoC conference it's been used by someone in their Chrome 223 00:21:20,809 --> 00:21:24,840 security team to fuzz SQLite, and they're already finding new bugs in the first 224 00:21:24,840 --> 00:21:32,260 week. So just more of the evidence like here's the kind of the breakdown of some 225 00:21:32,260 --> 00:21:38,580 of the bugs I found with this strategy. So just to highlight a couple of them, or 226 00:21:38,580 --> 00:21:42,960 maybe three of them. So the first one was an out of bounds read - just an integer 227 00:21:42,960 --> 00:21:52,410 overflow in blobs. And this lets you... you can make a blob and then ask to read 228 00:21:52,410 --> 00:21:56,258 part of it, and then the offset could have been negative and there's integer overflow 229 00:21:56,258 --> 00:22:00,690 - they got the check wrong so it was a full memory disclosure from the browser 230 00:22:00,690 --> 00:22:08,170 process. There's also this AppCache use after free which is what I used in the 231 00:22:08,170 --> 00:22:13,340 exploit this year. And then finally... I guess the critical bugs are pretty 232 00:22:13,340 --> 00:22:20,700 interesting so, two of these I guess the first pair are in QUIC and the first 233 00:22:20,700 --> 00:22:25,084 one is a stack buffer overflow with just a bad packet that comes in over the network. 
234 00:22:25,084 --> 00:22:31,610 So you just browse to an attacker site and they stack buffer overflow the Chrome browser 235 00:22:31,610 --> 00:22:40,247 process which is outside the sandbox and it jumped over the stack cookie so that 236 00:22:40,247 --> 00:22:49,006 was bad. And then these block file cache problems. These were in the HTTP 237 00:22:49,006 --> 00:22:56,620 caching mechanism which is also in the privileged process and these were actually 238 00:22:56,620 --> 00:23:02,960 crashing in the wild for three years and they didn't know how to... I guess I don't 239 00:23:02,960 --> 00:23:05,230 know if they didn't have resources or they didn't know how to address the problem or 240 00:23:05,230 --> 00:23:09,360 something, but I sent them the test case and then they closed like four bug reports 241 00:23:09,360 --> 00:23:16,410 for ancient bugs. So you know it just goes to show that this kind of technique works 242 00:23:16,410 --> 00:23:22,059 in a variety of really interesting places that are really important. And so now 243 00:23:22,059 --> 00:23:30,049 let's get to the boring stuff. So what's Protobuf? Well, Protobuf is this data 244 00:23:30,049 --> 00:23:34,379 serialisation format from Google, and it doesn't really matter that it's Protobuf, 245 00:23:34,379 --> 00:23:40,590 just the idea is you want some kind of... you want to encode like a little language 246 00:23:40,590 --> 00:23:46,700 for yourself that expresses what you want to fuzz at kind of a higher abstraction 247 00:23:46,700 --> 00:23:51,530 layer than just fuzzing bytes randomly. And so if any of you have done functional 248 00:23:51,530 --> 00:23:57,232 programming like, I had been doing stuff with OCaml and QuickCheck for a couple of 249 00:23:57,232 --> 00:24:02,404 years, and then when I saw this I just immediately recognized the pattern. 
250 00:24:02,404 --> 00:24:06,860 Essentially what you can do is, you can create this little tree structure of just 251 00:24:06,860 --> 00:24:13,894 basic types like enum, you create these messages, you can just kind of specify 252 00:24:13,894 --> 00:24:22,450 actions you want your fuzzer to take. And then next what libprotobuf-mutator will do 253 00:24:22,450 --> 00:24:28,001 is, it will take the specification you've written and link it into libFuzzer so that 254 00:24:28,001 --> 00:24:33,754 it will automatically fuzz and create these, like, trees that are these kind of 255 00:24:33,754 --> 00:24:37,279 random ASTs from this little language you wrote and then you can kind of parse this 256 00:24:37,279 --> 00:24:45,220 language which sounds crazy, or harder than it is, but essentially you can 257 00:24:45,220 --> 00:24:48,700 generate this highly structured input which makes it a lot easier to explore 258 00:24:48,700 --> 00:24:58,929 like, logical type of bugs. So I just really want to emphasize this strategy can 259 00:24:58,929 --> 00:25:04,770 be used to fuzz anything and so kind of this same exact idea is being used to find 260 00:25:04,770 --> 00:25:11,616 bugs in caching APIs, encrypted networking protocols, kernels, sandboxes, serialisation 261 00:25:11,616 --> 00:25:18,420 code, stateful systems that have IPC and network interaction and timing as part of 262 00:25:18,420 --> 00:25:24,752 it, which is what we'll show at the end. And so like what's common here? You 263 00:25:24,752 --> 00:25:30,040 know we just fuzz all of these different systems in the same way. 
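As a rough illustration of the "little language of actions" idea, here is a minimal Python sketch. It is an assumption-laden toy, not the Protobuf specification or the libprotobuf-mutator harness from the talk: `Open`, `Write`, `Close`, `ToyCache`, `decode` and `execute` are hypothetical names, the lifetime bug is planted, and raw bytes are decoded into actions for simplicity (with libprotobuf-mutator the mutation happens on the structured message itself).

```python
from dataclasses import dataclass
from typing import List, Union

# The "grammar": a small set of typed actions the fuzzer may take.
@dataclass
class Open:
    key: int

@dataclass
class Write:
    handle: int
    size: int

@dataclass
class Close:
    handle: int

Action = Union[Open, Write, Close]

def decode(data: bytes) -> List[Action]:
    """Map raw fuzzer bytes to a well-formed action list (3 bytes per action)."""
    actions: List[Action] = []
    for i in range(0, len(data) - 2, 3):
        op, a, b = data[i], data[i + 1], data[i + 2]
        kind = op % 3
        if kind == 0:
            actions.append(Open(key=a % 4))
        elif kind == 1:
            actions.append(Write(handle=a % 4, size=b))
        else:
            actions.append(Close(handle=a % 4))
    return actions

class ToyCache:
    """Hypothetical stateful API with a planted lifetime bug."""
    def __init__(self) -> None:
        self.entries = []  # one dict per handle, in creation order

    def open(self, key: int) -> None:
        self.entries.append({"key": key, "open": True, "size": 0})

    def close(self, handle: int) -> None:
        if handle < len(self.entries):
            self.entries[handle]["open"] = False

    def write(self, handle: int, size: int) -> None:
        if handle >= len(self.entries):
            return  # unknown handle is rejected correctly
        entry = self.entries[handle]
        # Planted bug: closed handles are not rejected. The raise below models
        # what a sanitizer would report on the real memory error.
        if not entry["open"]:
            raise AssertionError("write to closed entry (use-after-free analogue)")
        entry["size"] = size

def execute(actions: List[Action]) -> bool:
    """Interpret the actions against a fresh cache; True if the bug fired."""
    cache = ToyCache()
    try:
        for act in actions:
            if isinstance(act, Open):
                cache.open(act.key)
            elif isinstance(act, Write):
                cache.write(act.handle, act.size)
            else:
                cache.close(act.handle)
    except AssertionError:
        return True
    return False
```

Because every decoded input is a valid sequence of API calls, the fuzzer spends its time exploring orderings such as open, close, write instead of getting stuck on malformed input, which is what makes logical and lifetime bugs reachable.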
The idea is, as 264 00:25:30,040 --> 00:25:36,220 an auditor what you do is you kind of notice like "okay there's some subsystem, 265 00:25:36,220 --> 00:25:41,780 like some caching mechanism with a simple API" and you look at how it's implemented 266 00:25:41,780 --> 00:25:44,890 and it looks complicated, so you think "okay, you know if I can write a fuzzer in 267 00:25:44,890 --> 00:25:51,370 like a few hours for this, you know it seems like high value". So once you kind 268 00:25:51,370 --> 00:25:55,669 of play with the API a bit and understand like how the API works you know you can 269 00:25:55,669 --> 00:25:59,620 just write this little specification for the API in Protobuf, and go ahead and 270 00:25:59,620 --> 00:26:10,720 write the fuzzer. So basically I'll show how this works on Chrome. So just to make 271 00:26:10,720 --> 00:26:16,070 sure I cover all of the background knowledge, for those of you that don't 272 00:26:16,070 --> 00:26:19,299 really care about fuzzing or don't care about anything else at least you can get 273 00:26:19,299 --> 00:26:28,169 bootstrapped on Chrome IPC research. The basic idea of how the Chrome sandboxing 274 00:26:28,169 --> 00:26:34,860 situation works is - when I'm saying I'm finding bugs in the sandbox, like it's really 275 00:26:34,860 --> 00:26:41,460 finding bugs in the browser process which are reachable from a sandboxed process and 276 00:26:41,460 --> 00:26:48,778 so the sandbox itself is just constraining these render tab processes. So they can't 277 00:26:48,778 --> 00:26:53,770 really do much and then what you want to do is jump from there to the browser 278 00:26:53,770 --> 00:27:01,440 process which can do anything. It's a very common model. Like almost... 
on 3DS you 279 00:27:01,440 --> 00:27:06,580 have userland, kernel, then the security coprocessor. On Linux, like, you might 280 00:27:06,580 --> 00:27:10,937 have a userland process and then in the sandbox there's some APIs in the kernel 281 00:27:10,937 --> 00:27:15,220 you can hit - syscalls you can hit and basically everything just keeps boiling 282 00:27:15,220 --> 00:27:20,095 down to "there's some API that you can look at from the less privileged context". 283 00:27:20,095 --> 00:27:25,032 And then if you can trigger a bug in that API you escape, and you know, this kind of 284 00:27:25,032 --> 00:27:33,409 applies everywhere. And so this idea of understanding self-contained chunks of 285 00:27:33,409 --> 00:27:38,917 syscalls in Linux - there's like hundreds - but being able to look at them and say like okay, 286 00:27:38,917 --> 00:27:46,980 here are 10 related syscalls. This is a subsystem that I want to fuzz in isolation 287 00:27:46,980 --> 00:27:52,290 - like this is kind of how you think about it. And so if you just want to get started 288 00:27:52,290 --> 00:27:59,240 on Chrome what you want to do is look at, "OK what are these endpoints in the 289 00:27:59,240 --> 00:28:04,799 browser process that I can reach from the renderer", and then.. you don't really have 290 00:28:04,799 --> 00:28:10,452 to understand how IPC works to do this. You just have to be able to recognize what 291 00:28:10,452 --> 00:28:15,690 you're allowed to hit from the renderer to the browser and what's actually in the 292 00:28:15,690 --> 00:28:20,500 browser. And so fortunately the Chrome codebase is pretty well organized so they 293 00:28:20,500 --> 00:28:25,100 just tell you: you just really go into any folder that says browser in it. 294 00:28:25,100 --> 00:28:31,540 All of this is outside the sandbox and prone to sandbox escape. And so most of the 295 00:28:31,540 --> 00:28:38,979 bugs I found were in this content/browser subsystem kind of thing. 
But you can look 296 00:28:38,979 --> 00:28:42,110 anywhere and I think like all these results I've had in the last year were 297 00:28:42,110 --> 00:28:48,130 just in one folder. And so you know there's so many other places where bugs 298 00:28:48,130 --> 00:28:52,000 can manifest that I didn't even look at. So basically there's plenty of room for 299 00:28:52,000 --> 00:29:02,571 more. So just to go in on what I did - in this kind of content stuff - is: you 300 00:29:02,571 --> 00:29:08,160 just want to see where the APIs that are reachable from the renderer are enumerated and those 301 00:29:08,160 --> 00:29:16,880 are in this RenderProcessHostImpl::Init function. So yeah C++ is kind of wordy but 302 00:29:16,880 --> 00:29:21,588 you get used to it! Basically there are two places where the APIs are set 303 00:29:21,588 --> 00:29:27,501 up, or the interfaces are exposed. Those are CreateMessageFilters() and 304 00:29:27,501 --> 00:29:33,217 RegisterMojoInterfaces() and it took me a while to realize where these were. Like a 305 00:29:33,217 --> 00:29:39,789 year or something. But like those are the key functions to look at. And so I'll skip 306 00:29:39,789 --> 00:29:45,640 over old style IPC because it's going away, but it's pretty easy to figure out 307 00:29:45,640 --> 00:29:52,592 what's going on if you look at it. So I'll talk a bit about Mojo.
So essentially this 308 00:29:52,592 --> 00:30:01,460 is a new IPC platform that the Chrome team has developed and the idea is they want 309 00:30:01,460 --> 00:30:07,769 to, I guess, simplify this process for developers in terms of defining an 310 00:30:07,769 --> 00:30:11,900 interface that you want to expose to a render or some other client somewhere 311 00:30:11,900 --> 00:30:19,064 else, and essentially you write these little interface files called .mojom and 312 00:30:19,064 --> 00:30:24,400 then the build system will generate all the C++ glue for you - you can just like 313 00:30:24,400 --> 00:30:31,770 subclass something and then it handles all the mechanics of actually exposing this to 314 00:30:31,770 --> 00:30:36,190 other processes and so on. And so as a security researcher you know you don't 315 00:30:36,190 --> 00:30:42,080 really care about that. All you care about is "what can I reach" and "how do I know 316 00:30:42,080 --> 00:30:47,700 what to fuzz" or something. So what I guess I looked at is just: what are some 317 00:30:47,700 --> 00:30:54,010 of the .mojom files that are subclassed in this content/browser and you can 318 00:30:54,010 --> 00:31:01,210 just do little grep to check this. So essentially the AppCache is one of the 319 00:31:01,210 --> 00:31:09,149 bugs I found this year. And here's the API that the render can... these are all the 320 00:31:09,149 --> 00:31:13,849 messages that the render can send to the browser and along with the types of 321 00:31:13,849 --> 00:31:19,350 documents. And so you know that's pretty straightforward. So in the browser process 322 00:31:19,350 --> 00:31:24,602 this is the code that we're trying to attack which is the actual C++ 323 00:31:24,602 --> 00:31:29,700 implementation code for this API. 
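As a rough illustration of the shape described here, the sketch below models the generated glue as an abstract C++ class and the browser-side code as a subclass that overrides its virtual methods. Everything in it is a hypothetical stand-in: `AppCacheBackendLike`, `BrowserAppCacheImpl`, and the three methods are made up for the example and are not Chrome's actual generated code or the real AppCache interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>

// Hypothetical stand-in for the C++ glue generated from a .mojom file:
// one pure virtual method per message the renderer is allowed to send.
class AppCacheBackendLike {
 public:
  virtual ~AppCacheBackendLike() = default;
  virtual void RegisterHost(int32_t host_id) = 0;
  virtual void UnregisterHost(int32_t host_id) = 0;
  virtual void SelectCache(int32_t host_id,
                           const std::string& manifest_url) = 0;
};

// Hypothetical browser-side implementation. Every override takes input
// from a less privileged process, so every override is attack surface.
class BrowserAppCacheImpl : public AppCacheBackendLike {
 public:
  void RegisterHost(int32_t host_id) override { hosts_.insert(host_id); }
  void UnregisterHost(int32_t host_id) override { hosts_.erase(host_id); }
  void SelectCache(int32_t host_id,
                   const std::string& manifest_url) override {
    // The caller is untrusted: the host may never have been registered.
    if (hosts_.count(host_id) == 0) return;
    last_manifest_ = manifest_url;
  }

  std::size_t host_count() const { return hosts_.size(); }
  const std::string& last_manifest() const { return last_manifest_; }

 private:
  std::set<int32_t> hosts_;
  std::string last_manifest_;
};
```

From an auditing point of view, the list of overrides is the list of entry points: each one is a function you can call with attacker-chosen arguments, in attacker-chosen order.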
And so you can see they're subclassing there and 324 00:31:29,700 --> 00:31:34,863 then they just make sure to override all these virtual functions that actually 325 00:31:34,863 --> 00:31:40,846 implement the API. And so I won't go too into detail on this part because it's a 326 00:31:40,846 --> 00:31:47,929 little boring, but essentially: how does a renderer get all the way over to 327 00:31:47,929 --> 00:31:54,929 this kind of browser C++ code? Well it essentially goes through this request 328 00:31:54,929 --> 00:31:59,210 mechanism where the renderer tells the browser process "Hey I have this kind of 329 00:31:59,210 --> 00:32:08,850 request to access this interface" and then it'll actually just create that 330 00:32:08,850 --> 00:32:16,921 DispatcherHost implementation object and just feed in that request over there. So 331 00:32:16,921 --> 00:32:24,770 essentially stuff gets glued together somehow. And then there's this stuff which 332 00:32:24,770 --> 00:32:32,360 is kind of ugly, but I mean here's where you're actually exposing the ability to do 333 00:32:32,360 --> 00:32:38,960 this. So here's where the request comes in, and then where this request 334 00:32:38,960 --> 00:32:41,504 handler function gets fed in is that thing I mentioned earlier - the 335 00:32:41,504 --> 00:32:46,631 RegisterMojoInterfaces. So it's named pretty well, so it's kind of easy to follow. 336 00:32:46,631 --> 00:32:50,230 And they're adding new stuff constantly, and all of this stuff is on the attack 337 00:32:50,230 --> 00:32:55,220 surface. Like I think I stopped working on Chrome a couple of months ago, I think I looked and 338 00:32:55,220 --> 00:33:00,267 there's like you know five new APIs in there, they're constantly adding things. 339 00:33:00,267 --> 00:33:08,425 So just a quick point about this.
Essentially you want to do fuzzing 340 00:33:08,425 --> 00:33:17,057 in-process with this LibFuzzer+protobuf-mutator strategy, and you don't want to be 341 00:33:17,057 --> 00:33:22,260 actually doing IPC - it's just very brittle and weird. So what you really want 342 00:33:22,260 --> 00:33:27,710 to do is just like: here's the C++ object, I want to just instantiate it and call those 343 00:33:27,710 --> 00:33:34,510 functions myself, and then this whole thing is just very lightweight and easy to play 344 00:33:34,510 --> 00:33:39,260 with which is... you know having something lightweight and very easy to rebuild, 345 00:33:39,260 --> 00:33:44,950 tweak something and play with it, print things... like the faster you can iterate 346 00:33:44,950 --> 00:33:48,620 the better, so with anything that's too complicated, like the success rate goes 347 00:33:48,620 --> 00:33:58,220 way down. So essentially you know the fuzzer that I made open source is the way 348 00:33:58,220 --> 00:34:03,408 you should do it. But the way I actually did it was: I just made the object... 349 00:34:03,408 --> 00:34:07,565 commented out the private... I don't know if you can see it on here... Yeah, so just 350 00:34:07,565 --> 00:34:11,082 commented out the private, created the object, started calling these things 351 00:34:11,082 --> 00:34:14,911 randomly, it would crash and I would just hand fix things and you know it's kind of 352 00:34:14,911 --> 00:34:21,159 sloppy but you're testing something in a very small unit that's not really exposed 353 00:34:21,159 --> 00:34:27,219 to that kind of testing. So now let's kind of put together everything I've talked 354 00:34:27,219 --> 00:34:35,059 about so far. So this exploitable AppCache Use-After-Free I found this year was found 355 00:34:35,059 --> 00:34:43,959 using this same idea of deliberate practice.
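The in-process approach from this part of the talk (instantiate the object yourself and call its methods, with no real IPC involved) can be sketched roughly as below. `LLVMFuzzerTestOneInput` is the standard libFuzzer entry point; `TargetHost` and the way bytes are mapped to calls are assumptions made up for the example, and this simple version uses raw bytes rather than protobuf-mutator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical target standing in for a browser-side object whose
// methods would normally only be reached through IPC.
class TargetHost {
 public:
  void Register(int id) { ids_.insert(id); }
  void Unregister(int id) { ids_.erase(id); }
  std::size_t size() const { return ids_.size(); }

 private:
  std::set<int> ids_;
};

// libFuzzer calls this once per generated input.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, std::size_t size) {
  TargetHost target;  // fresh object per input, destroyed on return
  // Consume the input two bytes at a time: an opcode and an argument.
  for (std::size_t i = 0; i + 1 < size; i += 2) {
    const int id = data[i + 1];
    if (data[i] % 2 == 0) {
      target.Register(id);
    } else {
      target.Unregister(id);
    }
  }
  assert(target.size() <= 256);  // only 256 distinct byte-sized IDs exist
  return 0;                      // libFuzzer expects 0 from the callback
}
```

Built with `clang++ -fsanitize=fuzzer,address`, a harness of this shape rebuilds and restarts quickly, which is the fast iteration loop the talk is arguing for.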
So I looked at this AppCache 356 00:34:43,959 --> 00:34:48,819 subsystem in the browser process, and I noticed that there were three old bug 357 00:34:48,819 --> 00:34:54,190 reports that were triggering memory corruption and they were pretty 358 00:34:54,190 --> 00:34:59,619 interesting because they involved different kind of ways of attacking and 359 00:34:59,619 --> 00:35:04,514 these things had clearly been audited and I had actually seen these bugs a couple of 360 00:35:04,514 --> 00:35:10,709 years ago and I kind of used it as evidence to myself at the time that 361 00:35:10,709 --> 00:35:16,219 fuzzing doesn't work and you need auditing. But it kind of stuck in my head 362 00:35:16,219 --> 00:35:21,890 and I kept thinking "Someday I'll come back to this and like I'll overcome it" 363 00:35:21,890 --> 00:35:25,789 you know. So essentially what's interesting is - I've already talked 364 00:35:25,789 --> 00:35:30,859 about - you know - it's easy to specify the API and just feed IPC messages into 365 00:35:30,859 --> 00:35:35,127 it. And I think everyone kind of understands that, who does any IPC 366 00:35:35,127 --> 00:35:39,940 fuzzing. But then there's also this idea that you've got some remote server that 367 00:35:39,940 --> 00:35:46,559 the AppCache thing like creates a network request, some server's serving the request 368 00:35:46,559 --> 00:35:52,489 and doing different things, and so on the second bug it actually matters when things 369 00:35:52,489 --> 00:35:58,809 were - like when the server was returning data. Because some jobs like stay alive 370 00:35:58,809 --> 00:36:03,491 and then if you send an IPC message to close your session and then the job is 371 00:36:03,491 --> 00:36:06,609 still alive, there's like a raw pointer somewhere, and you know something going on 372 00:36:06,609 --> 00:36:13,039 that it matters that the server keeps the connection open. And then the last thing 373 00:36:13,039 --> 00:36:20,920 is just kind of a logical issue. 
And if the server returns these HTTP codes in the 374 00:36:20,920 --> 00:36:25,119 headers of the response in this kind of weird order, you trigger some logical bug 375 00:36:25,119 --> 00:36:30,758 that actually leads to memory corruption. And so you know I looked at this and I 376 00:36:30,758 --> 00:36:38,050 said OK, well, so what do we need to test to cover all this? Basically IPC, network, 377 00:36:38,050 --> 00:36:44,129 and that timing. And so not only that, but this is kind of a stateful thing. So we 378 00:36:44,129 --> 00:36:51,319 want to make sure that for each fuzzing session we kind of reset the state 379 00:36:51,319 --> 00:36:56,292 completely. And fortunately in C++ this isn't too hard because you just destroy 380 00:36:56,292 --> 00:37:02,009 the object and if it doesn't exist anymore, what state is there? So you just make sure 381 00:37:02,009 --> 00:37:08,380 that you don't leave things lingering. So yes I just had this basic idea: we'll call 382 00:37:08,380 --> 00:37:15,589 random IPCs with this fuzzed input, we return random data from the network, and 383 00:37:15,589 --> 00:37:21,900 then we reset the state of the cache on every iteration. And then part of it was 384 00:37:21,900 --> 00:37:26,829 thinking OK, if I can repro these old bugs if I reintroduce them by editing the 385 00:37:26,829 --> 00:37:32,283 source, this is kind of appealing to this deliberate practice idea that I could have 386 00:37:32,283 --> 00:37:36,141 written a fuzzer that would trigger these old things, and this is kind of the idea 387 00:37:36,141 --> 00:37:43,247 I was pursuing when I actually triggered a new bug. So now what's tricky about this 388 00:37:43,247 --> 00:37:47,199 is if you just return random data from the network you're not going to make much 389 00:37:47,199 --> 00:37:53,533 progress. And this is kind of where the auditing background comes in. You want to 390 00:37:53,533 --> 00:37:59,859 think about what is expressive enough of...
Like, how do I make my fuzzer 391 00:37:59,859 --> 00:38:05,940 expressive enough that I can hit everything, but then not so generic that 392 00:38:05,940 --> 00:38:12,210 it's just spraying... like it's just noise. And so I'll show how I did that in 393 00:38:12,210 --> 00:38:18,729 this specification and so at a high level the kind of root node in the AST or the tree 394 00:38:18,729 --> 00:38:25,829 of the fuzzer message is this session message, and then this just contains a 395 00:38:25,829 --> 00:38:31,542 sequence of commands and so... Commands are something I also made up, and so the 396 00:38:31,542 --> 00:38:38,573 first ten of them are all the different IPC calls I can do, the eleventh one is 397 00:38:38,573 --> 00:38:44,257 handling any pending requests or pre-caching like a response to any new 398 00:38:44,257 --> 00:38:49,310 request that comes in. So that handles like both the asynchronous case where it 399 00:38:49,310 --> 00:38:52,568 makes a request and it's waiting for the server and also like the synchronous 400 00:38:52,568 --> 00:38:57,019 version where the response comes immediately. And then lastly this 401 00:38:57,019 --> 00:39:05,400 RunUntilIdle thing which essentially just... it helps you... like if you place these 402 00:39:05,400 --> 00:39:12,349 RunUntilIdles randomly as you're doing these IPC messages, you're kind of 403 00:39:12,349 --> 00:39:17,359 flushing the queue of accumulated work. And so, what this lets you do, is kind of 404 00:39:17,359 --> 00:39:26,849 identify these race condition type things. Because you can do something like do a 405 00:39:26,849 --> 00:39:30,897 bunch of IPCs that come in and are handled at the same time without actually 406 00:39:30,897 --> 00:39:35,484 serving... like actually doing the work yet, and then you do this RunUntilIdle and 407 00:39:35,484 --> 00:39:42,231 then all the work happens.
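To make the session, command, and RunUntilIdle structure a bit more concrete, here is a minimal sketch under stated assumptions. Plain structs stand in for the protobuf messages, only two of the IPC-style commands are modeled, and `TaskQueue`, `RunSession`, and the log strings are hypothetical names invented for the example, not taken from the real fuzzer.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the protobuf messages: a session is just a
// sequence of commands, and the fuzzer mutates that sequence.
enum class CommandKind { kRegisterHost, kUnregisterHost, kRunUntilIdle };
struct Command {
  CommandKind kind;
  int host_id;
};
struct Session {
  std::vector<Command> commands;
};

// Minimal task queue: handlers post deferred work, RunUntilIdle drains it.
class TaskQueue {
 public:
  void Post(std::function<void()> task) { tasks_.push_back(std::move(task)); }
  void RunUntilIdle() {
    while (!tasks_.empty()) {
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop_front();
      task();  // a task may post further tasks; they are drained as well
    }
  }

 private:
  std::deque<std::function<void()>> tasks_;
};

// Replays one session and returns the order in which things happened.
std::vector<std::string> RunSession(const Session& session) {
  TaskQueue queue;
  std::vector<std::string> log;
  // Each "IPC" is received immediately, but its work is deferred.
  auto handle = [&queue, &log](const std::string& name) {
    log.push_back("recv " + name);
    queue.Post([&log, name] { log.push_back("run " + name); });
  };
  for (const Command& c : session.commands) {
    switch (c.kind) {
      case CommandKind::kRegisterHost:
        handle("register " + std::to_string(c.host_id));
        break;
      case CommandKind::kUnregisterHost:
        handle("unregister " + std::to_string(c.host_id));
        break;
      case CommandKind::kRunUntilIdle:
        queue.RunUntilIdle();
        break;
    }
  }
  queue.RunUntilIdle();  // drain leftovers so no state outlives the session
  return log;
}
```

Because the position of `kRunUntilIdle` inside the sequence is itself fuzzed, the same two messages can be explored both as "handle each one fully before the next arrives" and as "both arrive before any work runs", which is where ordering bugs tend to hide.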
And you know I didn't think of this a priori in some 408 00:39:42,231 --> 00:39:46,781 smart way, I just looked at the unit tests and I just tried to think about like "OK 409 00:39:46,781 --> 00:39:49,999 how are these developers already testing it". And this is what it looked like 410 00:39:49,999 --> 00:39:57,610 they were doing. So these messages are very easy to write, essentially just provide one 411 00:39:57,610 --> 00:40:03,020 for each IPC message that I could have sent to this thing, just make sure all the 412 00:40:03,020 --> 00:40:09,650 arguments are correct, and then there's a little bit of cleverness which is like the 413 00:40:09,650 --> 00:40:16,738 HostID also breaks down to just an enum of like 0, 1, 2 because just from 414 00:40:16,738 --> 00:40:22,690 looking at the code you know that if I'm randomly creating hosts, destroying them 415 00:40:22,690 --> 00:40:29,319 and stuff over the whole 4 billion int32 IDs, like it's just going to fall apart 416 00:40:29,319 --> 00:40:35,602 and not find anything interesting. So I just constrained that. For the URL, that is 417 00:40:35,602 --> 00:40:41,070 also a custom message that I constrained to just return a few premade legit URLs so 418 00:40:41,070 --> 00:40:48,949 that way I'm also not testing the URL parsing stuff. So then, how do I handle 419 00:40:48,949 --> 00:40:53,073 the network? Well, I just read the source and looked at "what are all the types of 420 00:40:53,073 --> 00:40:58,960 HTTP response codes that affect control flow?". And I just enumerated them and 421 00:40:58,960 --> 00:41:09,249 then for any given request that comes in from the AppCache system, I just encode 422 00:41:09,249 --> 00:41:14,190 anything interesting about the response that I thought of just by reviewing the 423 00:41:14,190 --> 00:41:18,589 source.
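The constraining idea in this passage can be sketched as a set of small mappings from arbitrary fuzzer values onto a handful of meaningful choices. In a protobuf specification these would simply be enum fields; here a modulo mapping plays that role. The particular status codes and URLs below are placeholders chosen for illustration, not the ones used in the real fuzzer, and `PickHost`, `PickHttpCode`, and `PickUrl` are hypothetical helper names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Only three host IDs, so that register/unregister calls collide often
// instead of being spread over the whole 32-bit ID space.
enum class HostId { kHost0 = 0, kHost1 = 1, kHost2 = 2 };

// Placeholder list: in practice these come from reading the target's
// source for the status codes that actually change control flow.
constexpr std::array<int, 5> kInterestingHttpCodes = {200, 206, 304, 404, 500};

// A few well-formed URLs, so the fuzzer does not spend its time on URL
// parsing. The example.test names are made up.
constexpr std::array<const char*, 3> kUrls = {
    "https://example.test/index.html",
    "https://example.test/manifest",
    "https://example.test/resource.js",
};

HostId PickHost(uint32_t raw) { return static_cast<HostId>(raw % 3); }

int PickHttpCode(uint32_t raw) {
  return kInterestingHttpCodes[raw % kInterestingHttpCodes.size()];
}

std::string PickUrl(uint32_t raw) { return kUrls[raw % kUrls.size()]; }
```

The design choice is to spend the fuzzer's mutation budget on the order and combination of calls rather than on argument values that the target would reject or treat identically.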
And it seems like the things that mattered were those HTTP codes whether or 424 00:41:18,589 --> 00:41:27,203 not the headers asked AppCache to do caching or just download it once. And then 425 00:41:27,203 --> 00:41:35,049 also the AppCache can request from the server this manifest file which has some 426 00:41:35,049 --> 00:41:41,644 metadata about what files that it should be caching. And so essentially just all of 427 00:41:41,644 --> 00:41:48,280 this is encoded in one message. And so how you go from this like high level 428 00:41:48,280 --> 00:41:53,219 description to actually fuzzing is just this. So you can see how simple it is. 429 00:41:53,219 --> 00:42:00,757 You're really just... you know I looked at the unit test code and saw how they set up 430 00:42:00,757 --> 00:42:08,160 this AppCache service. And so they let you pass in this URLLoaderFactory, and what 431 00:42:08,160 --> 00:42:17,769 this is, is just this kind of unit testable network request thing so this is how I'm 432 00:42:17,769 --> 00:42:22,479 like, intercepting the network requests and feeding data. And so I do this little 433 00:42:22,479 --> 00:42:29,710 setup and then here I just create the one render to browser host. This just kind of 434 00:42:29,710 --> 00:42:36,987 simulating how you would do the Mojo stuff if it was the real render to browser 435 00:42:36,987 --> 00:42:40,619 interaction, and then I just go through those commands that I mentioned and just 436 00:42:40,619 --> 00:42:44,469 do these things. So I mean this is all it is. You just pull the HostID out of this 437 00:42:44,469 --> 00:42:51,390 Protobuf message that we're getting at the top there - that session that I defined as 438 00:42:51,390 --> 00:42:58,935 the top level TreeNode. And you just go through and you just call the APIs that 439 00:42:58,935 --> 00:43:04,400 are there. 
And so, how to get the network stuff to work, as I mentioned they have 440 00:43:04,400 --> 00:43:16,079 this like mock URLLoaderFactory - also C++-y! But essentially it's this... well okay 441 00:43:16,079 --> 00:43:23,854 so this is when I basically handle one of my request messages that I came up with. I 442 00:43:23,854 --> 00:43:27,900 just simulated a response. This is a built in like unit test function that they have 443 00:43:27,900 --> 00:43:36,240 in their codebase and I just pass in the relevant bits that came from that message. 444 00:43:36,240 --> 00:43:45,229 So, yes this is what it looks like. I have some kind of DoRequest helper function and 445 00:43:45,229 --> 00:43:50,910 then I just pass my stuff through to it. And so it takes that URLFactory and then 446 00:43:50,910 --> 00:43:56,579 serves responses to anything that's waiting and then what's interesting here 447 00:43:56,579 --> 00:44:01,920 and what's necessary to find the bug is that - I mentioned that this is 448 00:44:01,920 --> 00:44:07,650 asynchronous - so what will happen is when you do RegisterHost... and then if I go 449 00:44:07,650 --> 00:44:14,601 back to.. Yeah, you like register host, select a cache, do some things - like the 450 00:44:14,601 --> 00:44:20,380 AppCache will make a request to the server and then get this manifest, and then it 451 00:44:20,380 --> 00:44:27,400 will start making requests to download things and then these things are pending like 452 00:44:27,400 --> 00:44:34,573 responses that it's waiting for from the server. And so it actually mattered that 453 00:44:34,573 --> 00:44:39,660 you mutate the state further before those responses come in. 
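The point about serving responses between IPC messages, rather than preloading them, can be illustrated with a small mock. This is a sketch assuming a hypothetical `MockLoader` class; it is not Chrome's test URL loader factory and does not reproduce its API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical mock network layer: requests stay pending until the
// fuzzer decides to answer them with a "serve pending" command.
class MockLoader {
 public:
  void Request(const std::string& url) { pending_.push_back(url); }

  // Answers every request that is pending right now with `http_code`
  // and returns how many requests were served.
  std::size_t ServePending(int http_code) {
    std::vector<std::string> batch;
    batch.swap(pending_);
    for (const std::string& url : batch) completed_[url] = http_code;
    return batch.size();
  }

  std::size_t pending_count() const { return pending_.size(); }

  // Returns the status served for `url`, or -1 if it is still unanswered.
  int StatusFor(const std::string& url) const {
    auto it = completed_.find(url);
    return it == completed_.end() ? -1 : it->second;
  }

 private:
  std::vector<std::string> pending_;
  std::map<std::string, int> completed_;
};
```

Since serving is its own command in the fuzzed sequence, other commands can be scheduled while `pending_count()` is still non-zero, which is the window the talk says mattered for this bug.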
And so by doing this 454 00:44:39,660 --> 00:44:45,999 like in between the IPC messages - not like preloading the network factory with a 455 00:44:45,999 --> 00:44:52,569 bunch of responses - I'm actually serving things... Like I'm not encoding an assumption 456 00:44:52,569 --> 00:44:57,900 about when I'm serving responses. And I know this is kind of tedious to go so into 457 00:44:57,900 --> 00:45:07,140 detail, but essentially you run this thing that's maybe 150 lines or something, and 458 00:45:07,140 --> 00:45:14,889 then trigger this bug with AddressSanitizer. And so essentially a Use-After-Free 459 00:45:14,889 --> 00:45:21,549 happens and what's going on here is you can see the scoped_refptr 460 00:45:21,549 --> 00:45:30,799 destructor. And it turns out that when... yes so when you go to unregister 461 00:45:30,799 --> 00:45:37,899 the host - like it's an IPC message there at the bottom that I send - and 462 00:45:37,899 --> 00:45:44,949 then it just accidentally... This is kind of inaccurate, this stack trace, but 463 00:45:44,949 --> 00:45:51,959 essentially some ref count goes from one to zero, and then it starts destroying 464 00:45:51,959 --> 00:45:58,140 this AppCache object. And then in the destructor one of these requests was 465 00:45:58,140 --> 00:46:05,188 waiting on a response from the server, and then it essentially gives a reference back 466 00:46:05,188 --> 00:46:12,265 to that other object. And that's kind of eliding some details but essentially the 467 00:46:12,265 --> 00:46:16,549 refcount went back up to 1 and then, now you're adding a bunch of references all 468 00:46:16,549 --> 00:46:21,630 over the place to something while it's being destroyed. And so what happens is 469 00:46:21,630 --> 00:46:27,819 now you have all these pointers to a freed object and then you can trigger access to 470 00:46:27,819 --> 00:46:33,103 that freed thing again later. And so this is kind of the recipe for an exploitable bug.
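The reference-count pattern described here (the count drops to zero, teardown starts, and something running during teardown takes a new reference) can be modeled safely as below. This is a simplified toy, not the AppCache code: `Counted` and its methods are hypothetical, and the object only records the unsafe step instead of actually freeing memory.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Toy ref-counted object that records, rather than performs, the unsafe
// step, so the example itself has no undefined behavior.
class Counted {
 public:
  void AddRef() {
    if (destroying_) resurrected_ = true;  // reference taken during teardown
    ++refs_;
  }

  // Returns true when this call dropped the last reference.
  bool Release() {
    if (--refs_ != 0) return false;
    destroying_ = true;
    if (on_destroy_) on_destroy_(*this);  // teardown may call back into us
    return true;  // real code would free the object at this point
  }

  void set_on_destroy(std::function<void(Counted&)> cb) {
    on_destroy_ = std::move(cb);
  }
  int refs() const { return refs_; }
  bool resurrected() const { return resurrected_; }

 private:
  int refs_ = 1;  // starts with one owner
  bool destroying_ = false;
  bool resurrected_ = false;
  std::function<void(Counted&)> on_destroy_;
};
```

In the toy, a callback that calls `AddRef()` during teardown leaves `refs()` at 1 on an object that real code would already be freeing. Every holder of that new reference then has a dangling pointer.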
471 00:46:33,103 --> 00:46:38,880 And so I just want to point out that all of this fuzzer is open source and it's 472 00:46:38,880 --> 00:46:43,099 just in the Chrome codebase. So if you download it or go online to the code 473 00:46:43,099 --> 00:46:52,670 search tool you can search for appcache fuzzer and it will come up. So then real 474 00:46:52,670 --> 00:47:00,840 quickly, just to cover the exploitation. You know I guess I have more time and I 475 00:47:00,840 --> 00:47:07,113 thought - I compressed this a lot! But essentially I did this part in a chain 476 00:47:07,113 --> 00:47:12,821 with two other guys - Saelo and Niklas - and so Saelo provided the RCE bug. And so 477 00:47:12,821 --> 00:47:17,670 from there we get code execution in the render, and then this lets us send 478 00:47:17,670 --> 00:47:26,127 arbitrary IPC messages and so it's kind of annoying to send IPC with Mojo like 479 00:47:26,127 --> 00:47:32,959 arbitrarily, so we kind of piggybacked on the renderer-sided glue code for sending 480 00:47:32,959 --> 00:47:37,269 these AppCache messages. So we just like found the C++ object and called into it 481 00:47:37,269 --> 00:47:45,930 and then all in all we end up with this primitive where we can decref and like 482 00:47:45,930 --> 00:47:52,070 release a reference to this refcounted thing. Like after it's been freed multiple 483 00:47:52,070 --> 00:47:58,809 times. So, there's two stages to exploiting this - because we're in the 484 00:47:58,809 --> 00:48:04,609 render and we only have one bug, we need to turn this into a memory disclosure. And 485 00:48:04,609 --> 00:48:09,410 so you know fortunately this bug can be triggered repeatedly. And so the idea here 486 00:48:09,410 --> 00:48:18,430 is triggering it once gives you this decrement-by-n primitive. And so when 487 00:48:18,430 --> 00:48:22,279 you're releasing - you know if you ever hit zero - you'll trigger the destructor 488 00:48:22,279 --> 00:48:27,089 again. 
And so essentially what you want to do for the leak is to not trigger the 489 00:48:27,089 --> 00:48:32,930 destructor because it will blow up, but rather find a string somewhere in memory 490 00:48:32,930 --> 00:48:38,349 where it is a string pointing to the heap and then decrement the string pointer so 491 00:48:38,349 --> 00:48:42,680 then it starts like sliding somewhere else into the heap. So that when you read that 492 00:48:42,680 --> 00:48:49,759 string back you're actually leaking heap data. And so, we did that. So there's some 493 00:48:49,759 --> 00:48:54,740 object that had a standard C++ string at the beginning. On Windows the first.. 494 00:48:54,740 --> 00:49:00,969 qword is like the, er, pointer to the string data. So you decrement this. 495 00:49:00,969 --> 00:49:05,599 It is actually a cookie object, so we just read the cookie back from the browser and 496 00:49:05,599 --> 00:49:12,490 then in the cookie value we see the leaked bytes. And then from there there was a 497 00:49:12,490 --> 00:49:21,430 vtable access that we can control in the destructor. So we make another fake object 498 00:49:21,430 --> 00:49:26,140 that looks like it has one reference left. Make it hit zero so the destructor is 499 00:49:26,140 --> 00:49:31,660 triggered and then this AppCache thing gets confused and essentially calls a 500 00:49:31,660 --> 00:49:35,789 controlled vtable pointer. And then from there, those are the primitives you need 501 00:49:35,789 --> 00:49:42,640 to write an exploit and then it was just a matter of putting it together. And if 502 00:49:42,640 --> 00:49:48,469 you're curious about that again you should look forward to Niklas' talk. And so just 503 00:49:48,469 --> 00:49:53,180 a summary: essentially starting all the way from the beginning you want to be 504 00:49:53,180 --> 00:49:59,539 practicing deliberately. Keep working constantly, and keep identifying gaps and 505 00:49:59,539 --> 00:50:07,089 actively working to improve.
It sounds weird but you want to keep that in mind. 506 00:50:07,089 --> 00:50:12,449 And use this new technique with LibFuzzer and protobuf-mutator - I can promise you 507 00:50:12,449 --> 00:50:20,160 it's not going to be the last time you see someone using this. And I mentioned I've 508 00:50:20,160 --> 00:50:27,549 started on XNU and we'll see some initial results pretty soon on that. It's working. 509 00:50:27,549 --> 00:50:37,260 So yeah, and lastly never give up. It may take months but it's fine. So with that I 510 00:50:37,260 --> 00:50:42,549 guess I'll open to questions. Yeah, thank you. 511 00:50:42,549 --> 00:50:46,859 *Applause* 512 00:50:46,859 --> 00:50:54,710 Herald: Okay thank you for that talk. If you have a question please line up at the 513 00:50:54,710 --> 00:50:59,960 microphones in the room and try to limit your question to one single sentence. If 514 00:50:59,960 --> 00:51:03,890 you would like to leave at this point please do that as quietly as possible so 515 00:51:03,890 --> 00:51:13,329 that everyone else can still stay for the questions. And also if you're listening on 516 00:51:13,329 --> 00:51:27,640 a stream you can ask a question online. Seems there is... no question? Ah there is 517 00:51:27,640 --> 00:51:30,829 one. Microphone number 2, your question please. 518 00:51:30,829 --> 00:51:38,499 Mic 2: Hello. I just want to ask why have you chosen Chrome for bug hunting? Was it 519 00:51:38,499 --> 00:51:40,880 just like you picked one random browser and you started? 520 00:51:40,880 --> 00:51:49,799 Ned: Yeah no. I mean it's basically just kind of the hardest thing I could think of 521 00:51:49,799 --> 00:51:57,150 that I could plausibly do? You know just for the purpose of getting better. And 522 00:51:57,150 --> 00:52:01,589 there's more to it. I think Chrome the way it's written is very amenable to research 523 00:52:01,589 --> 00:52:05,630 and like I actually didn't know C++ before I worked on Chrome! 
524 00:52:05,630 --> 00:52:10,759 So like looking at a great example of a C++ code base and learning 525 00:52:10,759 --> 00:52:17,289 from that was really helpful to me. And you know I glossed over kind of my path 526 00:52:17,289 --> 00:52:21,839 but I was actually finding random obscure library bugs that weren't even reachable 527 00:52:21,839 --> 00:52:27,269 at first. So, just the quality of Chrome makes it so that what you're training is the 528 00:52:27,269 --> 00:52:33,920 real talent, not just being able to decipher bad code. So, I highly recommend 529 00:52:33,920 --> 00:52:39,880 it. I can say that I definitely feel that the two years I invested on that one 530 00:52:39,880 --> 00:52:48,959 project completely helped me get better. Herald: Thank you. Signal angel, question 531 00:52:48,959 --> 00:52:57,369 from the internet. Signal Angel: Hello, there is one question 532 00:52:57,369 --> 00:53:02,950 from the internet.... *inaudible* Ned: So the question is: "is it possible 533 00:53:02,950 --> 00:53:10,460 to attack using Meltdown or Spectre?" I don't know. I guess it's possible. I was 534 00:53:10,460 --> 00:53:15,430 essentially focusing only on application level bugs. So things that I could 535 00:53:15,430 --> 00:53:23,589 trigger deterministically using only bugs in the Chrome code itself. And so... I 536 00:53:23,589 --> 00:53:28,390 mean also those things came along, way after I was doing my research. So you know 537 00:53:28,390 --> 00:53:31,579 I can't comment on that but I'm sure someone knows. 538 00:53:31,579 --> 00:53:37,650 Herald: Thanks. Ned: Yeah. Herald: I see no more people on the 539 00:53:37,650 --> 00:53:42,279 microphones or questions on the internet. Yeah. OK. Thanks for your talk and thanks 540 00:53:42,279 --> 00:53:44,027 for the Q&A. Ned: Thank you.
541 00:53:44,027 --> 00:53:50,055 *Applause* 542 00:53:50,055 --> 00:53:54,045 *postroll music* 543 00:53:54,045 --> 00:54:13,000 subtitles created by c3subtitles.de in the year 2020. Join, and help us!