0 00:00:00,000 --> 00:00:30,000 Dear viewer, these subtitles were generated by a machine via the service Trint and therefore are (very) buggy. If you are capable, please help us to create good quality subtitles: https://c3subtitles.de/talk/479 Thanks! 1 00:00:09,730 --> 00:00:11,559 Let's have a big hand for 2 00:00:13,480 --> 00:00:14,480 David Caplan. 3 00:00:21,950 --> 00:00:24,199 Well, thank you very much, I'm 4 00:00:24,200 --> 00:00:26,389 happy to be here. This presentation 5 00:00:26,390 --> 00:00:28,459 was put together in conjunction with 6 00:00:28,460 --> 00:00:29,840 one of my colleagues, Farhan. 7 00:00:30,870 --> 00:00:33,079 I'm actually going to talk this evening 8 00:00:33,080 --> 00:00:35,479 about real-world hardware design. 9 00:00:35,480 --> 00:00:37,819 I do security hardware now as 10 00:00:37,820 --> 00:00:39,439 my day job. 11 00:00:39,440 --> 00:00:41,539 But for this talk, I want to focus 12 00:00:41,540 --> 00:00:43,669 on what goes into 13 00:00:43,670 --> 00:00:45,929 making a real-world 14 00:00:45,930 --> 00:00:47,479 x86 CPU. 15 00:00:47,480 --> 00:00:49,549 And during this talk, I'm going to go 16 00:00:49,550 --> 00:00:51,679 through some of the practices, 17 00:00:51,680 --> 00:00:53,959 tools and various techniques that 18 00:00:53,960 --> 00:00:56,209 are used in this kind of development 19 00:00:56,210 --> 00:00:57,619 and some of the unique challenges 20 00:00:57,620 --> 00:00:58,620 associated with it. 21 00:01:00,560 --> 00:01:02,869 So starting off, I'm 22 00:01:02,870 --> 00:01:04,518 not sure how many of you have worked with 23 00:01:04,519 --> 00:01:06,259 hardware design before, but hardware 24 00:01:06,260 --> 00:01:08,359 design is very different than software.
25 00:01:08,360 --> 00:01:10,609 For starters, it takes 26 00:01:10,610 --> 00:01:12,339 a long time, especially with an x86 27 00:01:12,340 --> 00:01:15,199 CPU. To design 28 00:01:15,200 --> 00:01:17,509 a brand new CPU from scratch 29 00:01:17,510 --> 00:01:19,669 can easily take up to four or five 30 00:01:19,670 --> 00:01:22,159 years of constant development, 31 00:01:22,160 --> 00:01:24,589 with teams of hundreds, if not thousands 32 00:01:24,590 --> 00:01:27,169 of people. It's simply 33 00:01:27,170 --> 00:01:28,519 a very complex beast. 34 00:01:28,520 --> 00:01:29,989 There's a lot that goes into it. 35 00:01:29,990 --> 00:01:31,789 It's also very expensive. 36 00:01:31,790 --> 00:01:34,219 I mean, besides just the cost of hundreds 37 00:01:34,220 --> 00:01:37,159 or thousands of people, 38 00:01:37,160 --> 00:01:39,529 doing a fabrication 39 00:01:39,530 --> 00:01:41,659 of silicon is very expensive. In 40 00:01:41,660 --> 00:01:43,879 the kinds of process technologies that 41 00:01:43,880 --> 00:01:46,099 modern x86 chips use, 42 00:01:46,100 --> 00:01:48,259 a mask set, as it's called, which 43 00:01:48,260 --> 00:01:50,659 is what's used by the fabrication 44 00:01:50,660 --> 00:01:52,549 facility to actually build the chip that 45 00:01:52,550 --> 00:01:54,679 you've created, can cost upwards 46 00:01:54,680 --> 00:01:55,819 of three million dollars. 47 00:01:57,050 --> 00:01:59,359 And that's before, of course, you also 48 00:01:59,360 --> 00:02:01,039 pay for all the special test equipment 49 00:02:01,040 --> 00:02:02,480 and everything else that goes into it.
50 00:02:03,590 --> 00:02:05,929 And another big challenge 51 00:02:05,930 --> 00:02:07,459 is, as we'll see as we go through this 52 00:02:07,460 --> 00:02:09,888 talk, it's very difficult to test 53 00:02:09,889 --> 00:02:12,289 everything in the design before 54 00:02:12,290 --> 00:02:14,449 you send it to that fabrication plant, 55 00:02:14,450 --> 00:02:16,009 and then it can become very difficult to 56 00:02:16,010 --> 00:02:17,539 figure out what happened when it went 57 00:02:17,540 --> 00:02:18,589 wrong afterwards. 58 00:02:18,590 --> 00:02:19,789 And we'll talk about some of the 59 00:02:19,790 --> 00:02:21,979 different techniques that are used to 60 00:02:21,980 --> 00:02:22,980 help mitigate that. 61 00:02:24,410 --> 00:02:26,959 x86 CPUs are 62 00:02:26,960 --> 00:02:29,029 especially challenging, in 63 00:02:29,030 --> 00:02:31,429 part because of the complexity. 64 00:02:31,430 --> 00:02:34,159 A modern high-performance chip can easily 65 00:02:34,160 --> 00:02:36,499 be around 60 million NAND 66 00:02:36,500 --> 00:02:39,019 gates, and the RTL 67 00:02:39,020 --> 00:02:40,669 code, as it's called, which is typically 68 00:02:40,670 --> 00:02:43,639 in a language like Verilog or VHDL, 69 00:02:43,640 --> 00:02:45,619 can easily reach a million lines. 70 00:02:47,150 --> 00:02:49,639 x86 cores also run 71 00:02:49,640 --> 00:02:51,679 just ridiculously fast. 72 00:02:51,680 --> 00:02:54,469 It's almost hard to fathom 73 00:02:54,470 --> 00:02:56,659 that, say, a three gigahertz 74 00:02:56,660 --> 00:02:59,089 CPU runs at three billion cycles 75 00:02:59,090 --> 00:03:01,949 a second. That just 76 00:03:01,950 --> 00:03:04,069 is very difficult, and 77 00:03:04,070 --> 00:03:06,709 we'll see how that plays in a minute. 78 00:03:06,710 --> 00:03:08,809 Related to that is that these 79 00:03:08,810 --> 00:03:10,639 chips have to work basically the whole 80 00:03:10,640 --> 00:03:13,429 time.
I don't think most of you, 81 00:03:13,430 --> 00:03:15,199 you know, lose sleep at night thinking 82 00:03:15,200 --> 00:03:17,509 about your CPU having 83 00:03:17,510 --> 00:03:19,399 a bug or a malfunction or something in 84 00:03:19,400 --> 00:03:20,389 your program. 85 00:03:20,390 --> 00:03:21,709 But if it did, that could be 86 00:03:21,710 --> 00:03:22,639 catastrophic. 87 00:03:22,640 --> 00:03:24,769 I mean, the CPU, it has to 88 00:03:24,770 --> 00:03:26,989 be functionally correct for basically 89 00:03:26,990 --> 00:03:28,490 everything in your system to work. 90 00:03:29,510 --> 00:03:31,459 What this means from a hardware development 91 00:03:31,460 --> 00:03:33,559 standpoint is that 92 00:03:33,560 --> 00:03:36,199 you've got to have basically 93 00:03:36,200 --> 00:03:37,200 no bugs. 94 00:03:38,060 --> 00:03:40,099 Even if you had a really rare bug, 95 00:03:40,100 --> 00:03:41,930 it occurs one time in a billion, 96 00:03:43,040 --> 00:03:44,389 that's three times a second. 97 00:03:44,390 --> 00:03:45,830 That's not going to really work. 98 00:03:48,340 --> 00:03:50,599 So CPUs must be perfect 99 00:03:50,600 --> 00:03:52,569 then? Well, you know, obviously not. 100 00:03:53,590 --> 00:03:55,089 I'm sure many of you are familiar with 101 00:03:55,090 --> 00:03:57,399 some of the infamous CPU 102 00:03:57,400 --> 00:03:59,529 bugs that there have been: the Pentium divide 103 00:03:59,530 --> 00:04:00,669 bug in the 90s. 104 00:04:00,670 --> 00:04:03,159 There was the AMD TLB 105 00:04:03,160 --> 00:04:06,249 bug in 2007, 106 00:04:06,250 --> 00:04:07,839 but there are a lot more than that. 107 00:04:07,840 --> 00:04:10,239 In fact, if you open up the 108 00:04:10,240 --> 00:04:12,699 errata guide for a modern CPU, 109 00:04:12,700 --> 00:04:14,379 you'll see something like this.
110 00:04:14,380 --> 00:04:16,629 And if you're in the back, that says 111 00:04:16,630 --> 00:04:18,699 "no fix planned" on the right 112 00:04:18,700 --> 00:04:19,700 side there. 113 00:04:20,800 --> 00:04:22,149 That being said, 114 00:04:23,230 --> 00:04:26,139 these issues are very minor. 115 00:04:26,140 --> 00:04:28,059 You don't have to take my word for it. 116 00:04:28,060 --> 00:04:29,259 In fact, I would encourage that 117 00:04:29,260 --> 00:04:31,779 you don't. I would encourage you to go and 118 00:04:31,780 --> 00:04:34,089 download one of these and read what 119 00:04:34,090 --> 00:04:36,339 the types of errata are 120 00:04:36,340 --> 00:04:38,289 that are in these production systems. 121 00:04:38,290 --> 00:04:41,049 In most cases, these 122 00:04:41,050 --> 00:04:43,389 simply don't really matter, 123 00:04:43,390 --> 00:04:44,769 or there are software workarounds 124 00:04:44,770 --> 00:04:46,899 available, but 125 00:04:46,900 --> 00:04:48,969 there are a lot that still make it into 126 00:04:48,970 --> 00:04:49,970 silicon. 127 00:04:51,860 --> 00:04:53,929 So I want to talk a bit about 128 00:04:53,930 --> 00:04:56,179 the hardware design process then and 129 00:04:56,180 --> 00:04:58,519 how the testing of these chips 130 00:04:58,520 --> 00:05:00,739 is done. CPU 131 00:05:00,740 --> 00:05:02,839 development starts with the 132 00:05:02,840 --> 00:05:04,939 design and verification, which is where 133 00:05:04,940 --> 00:05:07,309 the teams write the Verilog 134 00:05:07,310 --> 00:05:08,749 or VHDL code. 135 00:05:08,750 --> 00:05:10,729 They do a bunch of testing on it in a 136 00:05:10,730 --> 00:05:12,829 simulation environment. That can 137 00:05:12,830 --> 00:05:15,049 take anywhere from one 138 00:05:15,050 --> 00:05:17,389 to about four years, depending on how 139 00:05:17,390 --> 00:05:19,759 much is changing in the design.
140 00:05:19,760 --> 00:05:22,069 Once that's completed, it's sent to 141 00:05:22,070 --> 00:05:24,139 a fabrication facility. 142 00:05:24,140 --> 00:05:26,269 It takes usually at least two to 143 00:05:26,270 --> 00:05:28,639 three months to get any 144 00:05:28,640 --> 00:05:31,639 silicon back from a fabrication facility. 145 00:05:31,640 --> 00:05:33,859 After you do, you then still 146 00:05:33,860 --> 00:05:35,149 need to test it. 147 00:05:35,150 --> 00:05:37,309 That validation process, as 148 00:05:37,310 --> 00:05:39,259 I'm calling it here, can easily take up 149 00:05:39,260 --> 00:05:40,489 to a year or more. 150 00:05:40,490 --> 00:05:42,079 So that's where, if you put this all 151 00:05:42,080 --> 00:05:44,149 together, it can be four 152 00:05:44,150 --> 00:05:46,219 or five years to get 153 00:05:46,220 --> 00:05:48,379 something from concept all 154 00:05:48,380 --> 00:05:49,789 the way into mass production. 155 00:05:51,890 --> 00:05:54,499 So I first want to talk about the 156 00:05:54,500 --> 00:05:56,719 verification, or the pre-silicon, 157 00:05:56,720 --> 00:05:59,269 phase of design. And 158 00:05:59,270 --> 00:06:01,339 what is verification? Verification 159 00:06:01,340 --> 00:06:03,409 is a discipline within 160 00:06:03,410 --> 00:06:05,909 silicon design that ensures the 161 00:06:05,910 --> 00:06:07,729 design matches the specification. 162 00:06:07,730 --> 00:06:10,069 And it's worth pointing out that when 163 00:06:10,070 --> 00:06:11,929 hardware, and CPUs in particular, are 164 00:06:11,930 --> 00:06:13,849 developed, you don't just have a 165 00:06:13,850 --> 00:06:15,799 functional specification. The 166 00:06:15,800 --> 00:06:17,569 functional specification may say what 167 00:06:17,570 --> 00:06:19,789 instruction sets the CPU supports, 168 00:06:19,790 --> 00:06:21,079 some things like that.
169 00:06:21,080 --> 00:06:23,389 You would also have a performance 170 00:06:23,390 --> 00:06:25,579 specification, a power specification, 171 00:06:25,580 --> 00:06:27,679 and those all need to be tested in the 172 00:06:27,680 --> 00:06:30,319 same way. If you build a processor 173 00:06:30,320 --> 00:06:32,599 and it's slower than you were expecting 174 00:06:32,600 --> 00:06:35,119 it to be, it's just as worthless 175 00:06:35,120 --> 00:06:36,949 as if it had a bug somewhere. 176 00:06:38,270 --> 00:06:40,489 And the goal of verification is 177 00:06:40,490 --> 00:06:42,859 to find defects, of course, or bugs. 178 00:06:42,860 --> 00:06:45,289 And like with many things, the earlier 179 00:06:45,290 --> 00:06:46,969 you find the bug in the design process, 180 00:06:46,970 --> 00:06:48,109 the cheaper it is to fix. 181 00:06:49,670 --> 00:06:51,559 So how do we find these bugs? 182 00:06:51,560 --> 00:06:54,169 Well, the standard way is to use 183 00:06:54,170 --> 00:06:55,729 a simulation test bench. 184 00:06:55,730 --> 00:06:57,649 And as is shown here, you're going 185 00:06:57,650 --> 00:06:59,629 to have your hardware design. 186 00:06:59,630 --> 00:07:01,729 So this is your Verilog code. 187 00:07:01,730 --> 00:07:04,009 That's the block in red. And 188 00:07:04,010 --> 00:07:05,509 you're first going to have some way of 189 00:07:05,510 --> 00:07:07,849 generating stimulus into that design. 190 00:07:07,850 --> 00:07:10,069 And the typical way 191 00:07:10,070 --> 00:07:12,109 that happens is either what's called 192 00:07:12,110 --> 00:07:14,329 directed testing or random 193 00:07:14,330 --> 00:07:16,639 testing.
So directed testing is when you've 194 00:07:16,640 --> 00:07:18,739 written a sequence of instructions that 195 00:07:18,740 --> 00:07:21,079 are going to go and execute. In 196 00:07:21,080 --> 00:07:23,689 random testing, you 197 00:07:23,690 --> 00:07:25,909 either open it up to some or maybe 198 00:07:25,910 --> 00:07:27,799 all of the instruction space and just 199 00:07:27,800 --> 00:07:29,119 throw stuff at it. 200 00:07:29,120 --> 00:07:31,219 The vast majority of testing that's 201 00:07:31,220 --> 00:07:33,559 done is random testing because 202 00:07:33,560 --> 00:07:35,389 it's great at finding all these weird 203 00:07:35,390 --> 00:07:37,909 corner cases and 204 00:07:37,910 --> 00:07:39,979 is much less time 205 00:07:39,980 --> 00:07:41,089 consuming for humans. 206 00:07:43,240 --> 00:07:45,369 In addition to applying stimulus to 207 00:07:45,370 --> 00:07:47,229 the design, it's also applied to some 208 00:07:47,230 --> 00:07:49,179 kind of checker. If you're working at the 209 00:07:49,180 --> 00:07:51,129 x86 core level, this means that 210 00:07:51,130 --> 00:07:53,709 you'll probably have some kind of, say, 211 00:07:53,710 --> 00:07:55,839 C++ model of the 212 00:07:55,840 --> 00:07:58,059 x86 architecture that is 213 00:07:58,060 --> 00:07:59,949 running in parallel alongside with the 214 00:07:59,950 --> 00:08:01,389 design. And you would send the same 215 00:08:01,390 --> 00:08:03,129 instruction to both of them. 216 00:08:03,130 --> 00:08:04,899 And then when the design is finished, 217 00:08:04,900 --> 00:08:06,609 you'd compare your register output or 218 00:08:06,610 --> 00:08:08,709 your memory output, whatever, and look 219 00:08:08,710 --> 00:08:10,209 for any variance.
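The lockstep arrangement described here, where the same random stimulus goes to both the design and a reference model and the results are compared, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `reference_model`, `buggy_design` and `run_random_tests` are hypothetical names, the "design" is a toy 8-bit ALU rather than simulated RTL, and the bug is injected on purpose.

```python
import random

MASK = 0xFF  # toy 8-bit datapath (an assumption for illustration)
OPS = ("add", "sub", "and")


def reference_model(op, a, b):
    """Golden model: the architecturally correct result."""
    if op == "add":
        return (a + b) & MASK
    if op == "sub":
        return (a - b) & MASK
    return a & b


def buggy_design(op, a, b):
    """Stand-in for the design under test, with an injected bug:
    subtraction saturates at zero instead of wrapping around."""
    if op == "sub":
        return max(a - b, 0)
    return reference_model(op, a, b)


def run_random_tests(design, count, seed):
    """Send the same random instruction to design and model; collect mismatches."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(count):
        op = rng.choice(OPS)
        a, b = rng.randrange(256), rng.randrange(256)
        expected = reference_model(op, a, b)
        actual = design(op, a, b)
        if actual != expected:
            mismatches.append((op, a, b, expected, actual))
    return mismatches
```

Calling `run_random_tests(buggy_design, 2000, seed=1)` returns only `sub` cases with `a < b`, which is the kind of corner case random stimulus tends to surface without anyone writing a directed test for it.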
220 00:08:11,460 --> 00:08:13,559 Checkers can also be 221 00:08:13,560 --> 00:08:15,629 at much lower levels in the design, 222 00:08:15,630 --> 00:08:17,399 and this is a very common practice, to 223 00:08:17,400 --> 00:08:19,859 have checkers around 224 00:08:19,860 --> 00:08:22,199 specific blocks and in fact, 225 00:08:22,200 --> 00:08:24,509 often probing directly into those blocks. 226 00:08:24,510 --> 00:08:26,939 And so, for instance, you might have 227 00:08:26,940 --> 00:08:29,189 a cache checker that would 228 00:08:29,190 --> 00:08:31,619 run and make sure that you don't insert 229 00:08:31,620 --> 00:08:33,210 duplicate lines in your cache. 230 00:08:34,620 --> 00:08:36,959 And these sort of things are really 231 00:08:36,960 --> 00:08:39,569 useful because verification 232 00:08:39,570 --> 00:08:40,889 time is so limited. 233 00:08:40,890 --> 00:08:43,048 You want to check for 234 00:08:43,049 --> 00:08:45,209 any discrepancies as early as 235 00:08:45,210 --> 00:08:47,519 you can and get the most 236 00:08:47,520 --> 00:08:48,899 out of your testing cycles. 237 00:08:48,900 --> 00:08:50,849 And so it's very common to have checkers 238 00:08:50,850 --> 00:08:53,070 kind of all the way down into the design. 239 00:08:54,330 --> 00:08:56,699 Another important characteristic 240 00:08:56,700 --> 00:08:58,889 of test benches is that 241 00:08:58,890 --> 00:09:00,479 there's typically some way to measure 242 00:09:00,480 --> 00:09:02,639 coverage, and coverage is a very 243 00:09:02,640 --> 00:09:04,919 important metric that's used 244 00:09:04,920 --> 00:09:07,289 in at least hardware designs 245 00:09:07,290 --> 00:09:09,479 because it helps determine 246 00:09:09,480 --> 00:09:11,309 how far along your testing is.
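A block-level checker like the cache checker mentioned above might look roughly like the following sketch. It assumes a set-associative cache with 64-byte lines and 64 sets, and it assumes the test bench calls `on_fill` and `on_evict` whenever it observes those events; the class and method names are hypothetical, not taken from any real verification environment.

```python
class DuplicateLineChecker:
    """Watches cache fills and flags a line that is inserted twice."""

    def __init__(self, num_sets=64, line_bytes=64):
        self.num_sets = num_sets
        self.line_bytes = line_bytes
        # One set of resident tags per cache index.
        self.resident = [set() for _ in range(num_sets)]

    def _locate(self, addr):
        """Split an address into (set index, tag)."""
        line = addr // self.line_bytes
        return line % self.num_sets, line // self.num_sets

    def on_fill(self, addr):
        index, tag = self._locate(addr)
        if tag in self.resident[index]:
            raise AssertionError(f"duplicate line for address {addr:#x}")
        self.resident[index].add(tag)

    def on_evict(self, addr):
        index, tag = self._locate(addr)
        self.resident[index].discard(tag)
```

The point of a checker like this is that it fails on the cycle the duplicate is inserted, instead of many cycles later when the corrupted state finally becomes visible in a register or memory comparison.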
247 00:09:11,310 --> 00:09:13,559 And whether you are actually hitting 248 00:09:13,560 --> 00:09:15,419 all of the code that you expect and all 249 00:09:15,420 --> 00:09:17,549 the branches you expect to execute. 250 00:09:17,550 --> 00:09:19,529 You absolutely do not want to go and 251 00:09:19,530 --> 00:09:21,599 fabricate a design that has untested 252 00:09:21,600 --> 00:09:22,499 code in it. 253 00:09:22,500 --> 00:09:24,809 And coverage is a great tool 254 00:09:24,810 --> 00:09:26,460 that's used to help prevent that. 255 00:09:28,840 --> 00:09:31,119 Now, test benches do not 256 00:09:31,120 --> 00:09:33,309 run very fast, and this becomes 257 00:09:33,310 --> 00:09:36,099 a major issue. If you want to simulate 258 00:09:36,100 --> 00:09:39,249 the entire, we call it SoC, 259 00:09:39,250 --> 00:09:41,709 chip, so this might have multiple cores, 260 00:09:41,710 --> 00:09:43,149 a Northbridge, Southbridge, this sort 261 00:09:43,150 --> 00:09:45,339 of thing, you're looking at a speed 262 00:09:45,340 --> 00:09:47,469 of about one hertz, meaning that 263 00:09:47,470 --> 00:09:49,659 you're simulating one cycle for 264 00:09:49,660 --> 00:09:50,869 every wall clock 265 00:09:50,870 --> 00:09:53,199 second that you run this thing. 266 00:09:53,200 --> 00:09:55,299 You cannot get a lot done at one 267 00:09:55,300 --> 00:09:57,399 hertz. And this is 268 00:09:57,400 --> 00:09:58,989 with like top of the line tools, 269 00:09:58,990 --> 00:10:00,129 hardware, all this sort of thing. 270 00:10:01,330 --> 00:10:03,189 So the natural thing is to break things 271 00:10:03,190 --> 00:10:05,379 down into smaller, 272 00:10:05,380 --> 00:10:06,369 smaller levels. 273 00:10:06,370 --> 00:10:08,949 If you are testing just an x86 core, 274 00:10:08,950 --> 00:10:11,949 that's about an order of magnitude faster 275 00:10:11,950 --> 00:10:14,289 to simulate, which is still really slow.
276 00:10:14,290 --> 00:10:15,290 But it helps. 277 00:10:16,840 --> 00:10:18,969 If you break it down even further, 278 00:10:18,970 --> 00:10:21,279 you can get what we call multi-unit 279 00:10:21,280 --> 00:10:23,739 testing. So this is a typical practice 280 00:10:23,740 --> 00:10:25,869 of combining a few related blocks like an 281 00:10:25,870 --> 00:10:28,029 instruction fetch and a decode unit 282 00:10:28,030 --> 00:10:29,469 together. 283 00:10:29,470 --> 00:10:32,199 Or you can even go down to 284 00:10:32,200 --> 00:10:34,869 a single unit testing like the decoder 285 00:10:34,870 --> 00:10:36,789 or the load-store unit or something 286 00:10:36,790 --> 00:10:39,129 like that. And you're looking at 287 00:10:39,130 --> 00:10:41,349 in the ballpark of one hundred, maybe two 288 00:10:41,350 --> 00:10:43,449 hundred hertz. So two hundred 289 00:10:43,450 --> 00:10:46,029 cycles per second of simulation. 290 00:10:46,030 --> 00:10:48,669 Now, compare that for a minute 291 00:10:48,670 --> 00:10:50,979 to real silicon, 292 00:10:50,980 --> 00:10:53,169 which runs at three billion cycles per 293 00:10:53,170 --> 00:10:55,299 second. And you'll see this is far 294 00:10:55,300 --> 00:10:57,579 off. In fact, in the very first 295 00:10:57,580 --> 00:10:59,739 second that you power on 296 00:10:59,740 --> 00:11:02,049 a CPU, that's the equivalent 297 00:11:02,050 --> 00:11:04,209 of nine and a half years of 298 00:11:04,210 --> 00:11:06,459 testing at the system level. 299 00:11:06,460 --> 00:11:08,589 So basically, 300 00:11:08,590 --> 00:11:09,879 as soon as you turn the thing on, it's 301 00:11:09,880 --> 00:11:11,529 already gone through more verification 302 00:11:11,530 --> 00:11:12,530 than it ever did.
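The comparison between simulation speed and real silicon is simple arithmetic, and a short sketch makes the scale concrete. The rates below are the rough figures quoted in the talk (about 1 Hz for a full chip, about ten times that for a single core, up to 200 Hz for a single unit); the function and constant names are illustrative assumptions.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
SILICON_HZ = 3e9  # a 3 GHz part: three billion cycles per second


def years_to_simulate(cycles, sim_hz):
    """Wall-clock years needed to simulate `cycles` at `sim_hz` cycles/second."""
    return cycles / sim_hz / SECONDS_PER_YEAR


# Rough simulation rates quoted in the talk.
RATES_HZ = {"full chip": 1, "single core": 10, "single unit": 200}

for level, hz in RATES_HZ.items():
    years = years_to_simulate(SILICON_HZ, hz)
    print(f"{level:12s} {hz:4d} Hz -> {years:6.2f} years per silicon-second")
```

With these numbers, one second of silicon time takes about nine and a half years of simulation at roughly 10 Hz, and closer to 95 years at 1 Hz.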
303 00:11:16,890 --> 00:11:19,079 Now, there is a way that you can kind 304 00:11:19,080 --> 00:11:21,389 of throw more hardware 305 00:11:21,390 --> 00:11:23,459 and money at this problem: something 306 00:11:23,460 --> 00:11:25,679 called emulation. Two 307 00:11:25,680 --> 00:11:27,959 of the major design tool companies, 308 00:11:27,960 --> 00:11:30,269 Cadence and Synopsys, make 309 00:11:30,270 --> 00:11:32,039 emulation machines. 310 00:11:32,040 --> 00:11:33,729 And these are special hardware. 311 00:11:33,730 --> 00:11:36,659 I think they're kind of FPGA-based things 312 00:11:36,660 --> 00:11:38,789 that allow you to load 313 00:11:38,790 --> 00:11:41,129 a design onto them and 314 00:11:41,130 --> 00:11:43,559 run testing at a faster rate. 315 00:11:43,560 --> 00:11:45,459 And I should point out, this Cadence 316 00:11:45,460 --> 00:11:47,279 system is way bigger than the picture 317 00:11:47,280 --> 00:11:49,320 makes it look. It's a serious box. 318 00:11:50,820 --> 00:11:52,829 One of these boxes is going to set you 319 00:11:52,830 --> 00:11:56,159 back probably close to a million dollars, 320 00:11:56,160 --> 00:11:58,349 but they can run at around 321 00:11:58,350 --> 00:12:00,449 one to one and a half megahertz. 322 00:12:00,450 --> 00:12:02,639 So it's still two 323 00:12:02,640 --> 00:12:04,259 or three thousand times slower than the 324 00:12:04,260 --> 00:12:06,119 real silicon, but it's a million times 325 00:12:06,120 --> 00:12:09,239 faster than simulation. 326 00:12:09,240 --> 00:12:11,789 So they're very useful, 327 00:12:11,790 --> 00:12:13,510 but they are costly as well. 328 00:12:16,290 --> 00:12:18,509 Now, one question that I 329 00:12:18,510 --> 00:12:20,249 often get asked, or a myth that I 330 00:12:20,250 --> 00:12:22,379 sometimes see, is: well, 331 00:12:22,380 --> 00:12:24,449 what about formal verification and formal 332 00:12:24,450 --> 00:12:26,699 methods?
And if anyone is not 333 00:12:26,700 --> 00:12:29,099 familiar, formal methods is basically 334 00:12:29,100 --> 00:12:31,739 a mathematical proof of the behavior 335 00:12:31,740 --> 00:12:33,779 of a certain design. 336 00:12:33,780 --> 00:12:36,179 And formal verification 337 00:12:36,180 --> 00:12:39,179 is great for some things. 338 00:12:39,180 --> 00:12:41,519 It is really 339 00:12:41,520 --> 00:12:43,679 cool. It's great to get a proof 340 00:12:43,680 --> 00:12:45,269 that says 100 percent, this is how it 341 00:12:45,270 --> 00:12:47,849 works. The problem is 342 00:12:47,850 --> 00:12:50,159 formal tools, first off, really crap 343 00:12:50,160 --> 00:12:51,569 themselves when you give them a big 344 00:12:51,570 --> 00:12:53,789 design. They're basically SAT 345 00:12:53,790 --> 00:12:54,749 solvers. 346 00:12:54,750 --> 00:12:56,369 They can't deal with that. 347 00:12:56,370 --> 00:12:57,779 The second thing is they have to have 348 00:12:57,780 --> 00:12:59,279 something to compare against. 349 00:12:59,280 --> 00:13:01,409 And when you're working 350 00:13:01,410 --> 00:13:03,239 with something like a multiplier or a 351 00:13:03,240 --> 00:13:05,309 divider, it's pretty easy to give 352 00:13:05,310 --> 00:13:07,469 it a multiplier and say make sure 353 00:13:07,470 --> 00:13:09,219 these things are the same. 354 00:13:09,220 --> 00:13:11,579 If you're dealing with an entire CPU, 355 00:13:11,580 --> 00:13:12,899 it's a very different story. 356 00:13:14,040 --> 00:13:16,289 You might even have 357 00:13:16,290 --> 00:13:18,359 to reimplement the entire design so 358 00:13:18,360 --> 00:13:20,309 then you can verify it against something. 359 00:13:20,310 --> 00:13:22,329 And that's difficult to do. 360 00:13:22,330 --> 00:13:24,479 So the experience 361 00:13:24,480 --> 00:13:26,219 I have seen is that formal verification 362 00:13:26,220 --> 00:13:28,739 is great for a few of these selected 363 00:13:28,740 --> 00:13:29,879 execution units.
364 00:13:29,880 --> 00:13:31,829 It's a very small piece of the overall 365 00:13:31,830 --> 00:13:32,830 puzzle. 366 00:13:35,330 --> 00:13:37,669 So 367 00:13:37,670 --> 00:13:39,649 the track that we're in, right, is on 368 00:13:39,650 --> 00:13:41,719 failures and philosophy, so 369 00:13:41,720 --> 00:13:43,789 one thing I want to talk about is what 370 00:13:43,790 --> 00:13:45,649 does verification fail at? 371 00:13:45,650 --> 00:13:47,719 And we'll start by saying 372 00:13:47,720 --> 00:13:49,849 what it is good at. Verification is good 373 00:13:49,850 --> 00:13:52,129 at finding bugs with your basic 374 00:13:52,130 --> 00:13:53,329 functional behavior. 375 00:13:53,330 --> 00:13:55,879 Does this particular mode of operation 376 00:13:55,880 --> 00:13:58,519 work? Do these exceptions 377 00:13:58,520 --> 00:14:00,349 happen when they're supposed to, that 378 00:14:00,350 --> 00:14:01,729 sort of thing? 379 00:14:01,730 --> 00:14:04,069 Anything you can do formal proofs for 380 00:14:04,070 --> 00:14:06,499 is also good. 381 00:14:06,500 --> 00:14:08,629 And it's also useful for 382 00:14:08,630 --> 00:14:09,859 coverage analysis. 383 00:14:09,860 --> 00:14:12,289 Are all your instructions executed? 384 00:14:12,290 --> 00:14:13,699 Are they executed in all the different 385 00:14:13,700 --> 00:14:15,259 modes? Do you get all the different 386 00:14:15,260 --> 00:14:16,570 exceptions, that sort of thing? 387 00:14:17,750 --> 00:14:20,059 But verification doesn't find 388 00:14:20,060 --> 00:14:21,829 other types of bugs. 389 00:14:21,830 --> 00:14:23,929 One big category is system 390 00:14:23,930 --> 00:14:24,869 level bugs. 391 00:14:24,870 --> 00:14:27,109 And as you remember from a few slides ago, 392 00:14:27,110 --> 00:14:28,999 the system level model runs so 393 00:14:29,000 --> 00:14:31,369 ridiculously slow that 394 00:14:31,370 --> 00:14:33,289 you simply don't get a lot of test time 395 00:14:33,290 --> 00:14:35,359 on it.
And so those are going to be the 396 00:14:35,360 --> 00:14:36,949 bugs that are most likely to slip through 397 00:14:36,950 --> 00:14:38,029 the cracks. 398 00:14:38,030 --> 00:14:40,159 Those are bugs like two different 399 00:14:40,160 --> 00:14:42,259 components in the design having 400 00:14:42,260 --> 00:14:44,539 some sort of protocol disagreement, where 401 00:14:44,540 --> 00:14:46,669 they end up in 402 00:14:46,670 --> 00:14:48,739 an unknown state because they don't talk 403 00:14:48,740 --> 00:14:49,740 the same language. 404 00:14:50,870 --> 00:14:53,419 The most common thing that I've seen 405 00:14:53,420 --> 00:14:56,059 is that multiple, 406 00:14:56,060 --> 00:14:58,189 seemingly random events seem 407 00:14:58,190 --> 00:15:00,299 to be required to hit these bugs. 408 00:15:00,300 --> 00:15:02,539 So, you know, we're talking things like 409 00:15:02,540 --> 00:15:04,459 you're doing a compare exchange 410 00:15:04,460 --> 00:15:07,249 instruction when you get a cache probe, 411 00:15:07,250 --> 00:15:08,989 when an intercept is pending, 412 00:15:08,990 --> 00:15:11,209 and at the same cycle, the thermal sensor 413 00:15:11,210 --> 00:15:13,159 says the processor is too hot. 414 00:15:13,160 --> 00:15:15,259 Right, things like that. It's just 415 00:15:15,260 --> 00:15:18,259 difficult to hit all those kinds of cases 416 00:15:18,260 --> 00:15:19,819 during testing. 417 00:15:19,820 --> 00:15:21,619 You start running at billions of cycles a 418 00:15:21,620 --> 00:15:23,479 second and they have a way of coming up a 419 00:15:23,480 --> 00:15:24,480 lot more often. 420 00:15:26,270 --> 00:15:28,549 Another thing that's difficult 421 00:15:28,550 --> 00:15:30,969 to find is any long run time events. 422 00:15:30,970 --> 00:15:32,869 So whenever you have a large data 423 00:15:32,870 --> 00:15:35,029 structure, it could be an L3 cache, 424 00:15:35,030 --> 00:15:36,030 something like that.
425 00:15:37,160 --> 00:15:39,289 It is difficult to 426 00:15:39,290 --> 00:15:41,419 test some of those cases in 427 00:15:41,420 --> 00:15:42,349 verification. 428 00:15:42,350 --> 00:15:44,359 So you need to be aware of that. 429 00:15:44,360 --> 00:15:46,519 And a final thing I'll just mention that 430 00:15:46,520 --> 00:15:49,069 I've seen is what I'll call statistically 431 00:15:49,070 --> 00:15:50,479 unlikely matches. 432 00:15:50,480 --> 00:15:52,669 And imagine, 433 00:15:52,670 --> 00:15:55,339 for instance, that you have a design 434 00:15:55,340 --> 00:15:57,859 that has some sort of special behavior 435 00:15:57,860 --> 00:16:00,529 when you have two different, 436 00:16:00,530 --> 00:16:02,839 say, memory operations and 437 00:16:02,840 --> 00:16:04,909 the lower 20 bits of the address 438 00:16:04,910 --> 00:16:07,459 match, but the upper bits do not match. 439 00:16:07,460 --> 00:16:09,289 Well, if you're generating all of your 440 00:16:09,290 --> 00:16:11,359 addresses randomly in your random 441 00:16:11,360 --> 00:16:13,699 stimulus, the chance of that happening 442 00:16:13,700 --> 00:16:15,799 is really, really, really small. 443 00:16:15,800 --> 00:16:17,389 And you're not going to get a lot of test 444 00:16:17,390 --> 00:16:20,029 time on it and those 445 00:16:20,030 --> 00:16:22,069 bugs are going to slip through. 446 00:16:22,070 --> 00:16:24,109 Now, it's worth noting some of this you 447 00:16:24,110 --> 00:16:26,209 can kind of fix. If you knew that the 448 00:16:26,210 --> 00:16:28,489 design was really sensitive to this case, 449 00:16:28,490 --> 00:16:30,559 where 20 bits match and the upper twenty 450 00:16:30,560 --> 00:16:32,299 eight bits don't, 451 00:16:32,300 --> 00:16:34,609 you could specifically 452 00:16:34,610 --> 00:16:35,689 have stimulus for that.
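To put a number on "really, really small": two independently random addresses agree in their lower 20 bits with probability 2^-20, roughly one pair in a million. The sketch below contrasts unconstrained generation with a generator that forces the interesting case. The 20/28-bit split follows the example in the talk; the function names are hypothetical.

```python
import random

LOW_BITS = 20   # bits that must match
HIGH_BITS = 28  # bits that must differ

# Chance that two unconstrained random addresses share their low bits.
P_LOW_MATCH = 2.0 ** -LOW_BITS


def is_partial_match(a, b):
    """True when the low bits agree and the high bits do not."""
    low_mask = (1 << LOW_BITS) - 1
    return (a & low_mask) == (b & low_mask) and (a >> LOW_BITS) != (b >> LOW_BITS)


def unconstrained_pair(rng):
    """Two fully random addresses; almost never a partial match."""
    bits = LOW_BITS + HIGH_BITS
    return rng.getrandbits(bits), rng.getrandbits(bits)


def constrained_pair(rng):
    """Two addresses that share low bits but differ in the high bits."""
    low = rng.getrandbits(LOW_BITS)
    high_a = rng.getrandbits(HIGH_BITS)
    high_b = rng.getrandbits(HIGH_BITS)
    while high_b == high_a:  # redraw until the upper bits differ
        high_b = rng.getrandbits(HIGH_BITS)
    return (high_a << LOW_BITS) | low, (high_b << LOW_BITS) | low
```

Every pair from `constrained_pair` exercises the special case, whereas `unconstrained_pair` would need on the order of a million draws to hit it once, which is a lot of simulation time at a few hundred hertz.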
453 00:16:35,690 --> 00:16:37,789 You could constrain your random address 454 00:16:37,790 --> 00:16:39,889 generator to generate cases like 455 00:16:39,890 --> 00:16:42,079 that. Some of these cases, however, you 456 00:16:42,080 --> 00:16:44,689 can't do a whole lot about. With multiple 457 00:16:44,690 --> 00:16:46,549 random events, 458 00:16:46,550 --> 00:16:48,679 you do the best you can, but something is 459 00:16:48,680 --> 00:16:49,700 always going to slip through. 460 00:16:52,970 --> 00:16:55,039 Great, so the next thing 461 00:16:55,040 --> 00:16:57,949 that I want to talk about here is 462 00:16:57,950 --> 00:17:00,499 what is done after the silicon 463 00:17:00,500 --> 00:17:02,629 comes back. 464 00:17:04,200 --> 00:17:05,848 As I'm sure you can imagine, you get your 465 00:17:05,849 --> 00:17:08,459 silicon back from the fab and 466 00:17:08,460 --> 00:17:10,588 it's not going to work perfectly and 467 00:17:10,589 --> 00:17:11,669 you have to debug it. 468 00:17:11,670 --> 00:17:12,868 So how do you debug stuff? 469 00:17:12,869 --> 00:17:14,608 Well, everyone here, I'm sure, has 470 00:17:14,609 --> 00:17:15,959 debugged things before. 471 00:17:15,960 --> 00:17:17,910 It usually looks something like this. 472 00:17:19,020 --> 00:17:21,088 If you're with software, you 473 00:17:21,089 --> 00:17:22,649 know, you probably investigate the 474 00:17:22,650 --> 00:17:24,879 problem. You run GDB, do some printf 475 00:17:24,880 --> 00:17:26,848 or something, you figure out what you did 476 00:17:26,849 --> 00:17:29,009 wrong and fix it: you change the code, 477 00:17:29,010 --> 00:17:29,909 you recompile. 478 00:17:29,910 --> 00:17:30,910 And that's all great. 479 00:17:32,040 --> 00:17:33,929 With hardware, it doesn't quite work like 480 00:17:33,930 --> 00:17:35,069 that. 481 00:17:35,070 --> 00:17:37,229 So first thing is, 482 00:17:37,230 --> 00:17:39,809 what happened in the design?
483 00:17:39,810 --> 00:17:42,059 One common way of figuring this 484 00:17:42,060 --> 00:17:44,339 out is to use something called a JTAG 485 00:17:44,340 --> 00:17:45,989 interface. I'm sure some of you are 486 00:17:45,990 --> 00:17:48,329 familiar. JTAG stands for Joint Test 487 00:17:48,330 --> 00:17:50,759 Action Group. It's an IEEE standard 488 00:17:50,760 --> 00:17:53,429 that you'll see on a lot of hardware 489 00:17:53,430 --> 00:17:54,430 and 490 00:17:55,710 --> 00:17:57,899 modern, say, x86 processors 491 00:17:57,900 --> 00:17:59,999 have JTAG pins. 492 00:18:00,000 --> 00:18:01,769 You can see an example of some of those 493 00:18:01,770 --> 00:18:04,079 here. You have your test data in, 494 00:18:04,080 --> 00:18:06,299 test data out, test clock and so on. 495 00:18:07,500 --> 00:18:09,779 And the IEEE spec 496 00:18:09,780 --> 00:18:11,939 dictates how you communicate to these 497 00:18:11,940 --> 00:18:13,319 pins. 498 00:18:13,320 --> 00:18:15,659 Now, I should mention that 499 00:18:15,660 --> 00:18:18,059 these pins are physically there. 500 00:18:18,060 --> 00:18:20,249 They're generally not brought out 501 00:18:20,250 --> 00:18:22,199 on motherboards. So you won't find an easy 502 00:18:22,200 --> 00:18:23,609 way to connect to them. But they're still 503 00:18:23,610 --> 00:18:25,319 physically there on the package, if you 504 00:18:25,320 --> 00:18:26,320 look hard enough. 505 00:18:29,120 --> 00:18:31,249 Processors have to implement certain 506 00:18:31,250 --> 00:18:32,659 JTAG commands that are part of the 507 00:18:32,660 --> 00:18:34,729 standard, things like BYPASS, 508 00:18:34,730 --> 00:18:37,279 IDCODE, which are typically used for 509 00:18:37,280 --> 00:18:39,289 verifying that, like, solder connections 510 00:18:39,290 --> 00:18:40,819 on a board are valid.
511 00:18:40,820 --> 00:18:43,519 But the IEEE also left the spec open 512 00:18:43,520 --> 00:18:45,619 to add whatever 513 00:18:45,620 --> 00:18:47,419 other proprietary commands a vendor 514 00:18:47,420 --> 00:18:49,579 wants. And so 515 00:18:49,580 --> 00:18:51,799 you can imagine that if you're debugging 516 00:18:51,800 --> 00:18:54,289 a CPU, you might want to have 517 00:18:54,290 --> 00:18:56,269 your kind of debugger commands, right: 518 00:18:56,270 --> 00:18:58,789 read and write register state, memory, 519 00:18:58,790 --> 00:19:01,369 single step, these sort of things. 520 00:19:01,370 --> 00:19:03,469 And that can be very 521 00:19:03,470 --> 00:19:05,869 useful for figuring out what happened. 522 00:19:05,870 --> 00:19:07,849 Now, of course, the processor doesn't 523 00:19:07,850 --> 00:19:08,959 magically do this. 524 00:19:08,960 --> 00:19:11,419 You have to design this debugger 525 00:19:11,420 --> 00:19:13,309 into the system. 526 00:19:13,310 --> 00:19:15,499 But you're 527 00:19:15,500 --> 00:19:17,749 going to need something like that because 528 00:19:17,750 --> 00:19:18,769 you're going to have bugs. 529 00:19:20,730 --> 00:19:22,799 This sort of thing can be very useful. 530 00:19:22,800 --> 00:19:24,789 I mean, it's kind of like your GDB, 531 00:19:24,790 --> 00:19:27,209 whatever, but 532 00:19:27,210 --> 00:19:28,889 it doesn't always work. 533 00:19:28,890 --> 00:19:31,319 And in particular, one very common 534 00:19:31,320 --> 00:19:33,659 thing to happen with silicon 535 00:19:33,660 --> 00:19:36,299 designs is the thing will just hang 536 00:19:36,300 --> 00:19:38,249 and it'll just be completely frozen and 537 00:19:38,250 --> 00:19:39,779 there's nothing you can do to it. 538 00:19:39,780 --> 00:19:42,419 So how do you debug that? 539 00:19:42,420 --> 00:19:44,129 The answer is something called a scan 540 00:19:44,130 --> 00:19:46,379 dump.
And this is 541 00:19:46,380 --> 00:19:48,479 a feature that is kind of like 542 00:19:48,480 --> 00:19:50,549 a crash dump. Basically, you take 543 00:19:50,550 --> 00:19:52,619 all of the register state, all 544 00:19:52,620 --> 00:19:54,239 the flip-flops in the design. 545 00:19:54,240 --> 00:19:56,159 You dump them all out through the JTAG 546 00:19:56,160 --> 00:19:58,289 port so you can go and analyze 547 00:19:58,290 --> 00:20:00,749 it. And the way this works 548 00:20:00,750 --> 00:20:02,879 is that when flip-flops are 549 00:20:02,880 --> 00:20:05,189 built into the silicon, 550 00:20:05,190 --> 00:20:06,759 they look something like this. 551 00:20:06,760 --> 00:20:09,059 So you may remember your kind of standard 552 00:20:09,060 --> 00:20:10,949 flip-flop from class. 553 00:20:12,240 --> 00:20:14,519 There's now an extra mux in front 554 00:20:14,520 --> 00:20:16,559 of it that selects between the 555 00:20:16,560 --> 00:20:18,809 normal data that that flop would store 556 00:20:18,810 --> 00:20:21,849 and something called the scan-in data. 557 00:20:21,850 --> 00:20:23,979 These are then connected together 558 00:20:23,980 --> 00:20:26,259 in a chain like so, 559 00:20:26,260 --> 00:20:28,119 so the output of one goes to the scan-in 560 00:20:28,120 --> 00:20:30,219 for the next. When 561 00:20:30,220 --> 00:20:32,319 you want to dump the flop state, 562 00:20:32,320 --> 00:20:34,449 you assert the scan enable signal that 563 00:20:34,450 --> 00:20:35,919 goes to all the flops. 564 00:20:35,920 --> 00:20:38,319 And then as you clock the design, every 565 00:20:38,320 --> 00:20:40,509 cycle, all the flops shift 566 00:20:40,510 --> 00:20:42,459 into each other and the final flop in 567 00:20:42,460 --> 00:20:45,159 the chain is then connected to your 568 00:20:45,160 --> 00:20:47,289 pin.
And over time, 569 00:20:47,290 --> 00:20:49,359 and of course not a lot of time, 570 00:20:49,360 --> 00:20:51,429 you can read out all of 571 00:20:51,430 --> 00:20:53,959 the register state in the design. 572 00:20:55,210 --> 00:20:56,929 So this is pretty nice. 573 00:20:56,930 --> 00:20:58,449 You can then analyze it offline, 574 00:20:58,450 --> 00:20:59,450 whatever. 575 00:21:00,370 --> 00:21:01,819 You know, it's not perfect. 576 00:21:01,820 --> 00:21:03,969 One big limitation is you only 577 00:21:03,970 --> 00:21:06,309 get the data that's stored in flops 578 00:21:06,310 --> 00:21:07,599 in the design. 579 00:21:07,600 --> 00:21:09,699 You don't get access to any of 580 00:21:09,700 --> 00:21:12,579 the intermediate signals in the design. 581 00:21:12,580 --> 00:21:14,709 Also, it's a single point in 582 00:21:14,710 --> 00:21:16,479 time thing. It's really like a crash 583 00:21:16,480 --> 00:21:17,469 dump. 584 00:21:17,470 --> 00:21:20,079 It's very likely that the information 585 00:21:20,080 --> 00:21:21,920 you're looking for is no longer there. 586 00:21:23,020 --> 00:21:25,269 One problem with running at the 587 00:21:25,270 --> 00:21:27,729 clock rates that CPUs do is 588 00:21:27,730 --> 00:21:29,829 that if you don't get a 589 00:21:29,830 --> 00:21:31,989 hang, right, then there's no way 590 00:21:31,990 --> 00:21:34,839 you can manually stop this thing in time. 591 00:21:34,840 --> 00:21:36,609 Sometimes you have to take a whole bunch 592 00:21:36,610 --> 00:21:38,709 of these and see if you can 593 00:21:38,710 --> 00:21:40,869 get lucky. Sometimes you've got 594 00:21:40,870 --> 00:21:43,149 to look at all the kind of invalid 595 00:21:43,150 --> 00:21:44,799 state and the things that are clearly 596 00:21:44,800 --> 00:21:47,019 left over from earlier 597 00:21:47,020 --> 00:21:48,020 iterations.
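The scan chain described above can be sketched as a small software model. This is only an illustration under stated assumptions: the names `clock_chain` and `scan_dump` are hypothetical, each flop is modelled as a single bit in a Python list, and the real scan logic on any given CPU is not public and will differ.

```python
def clock_chain(state, functional_inputs, scan_in, scan_enable):
    """Apply one clock edge to every flop at once.

    Each flop has a mux in front of it: with scan_enable low it stores its
    normal data input, with scan_enable high it stores the output of the
    previous flop in the chain (or scan_in for the first flop).
    Returns (new_state, scan_out), where scan_out is the bit that was
    visible on the output pin before this edge.
    """
    scan_out = state[-1]
    if scan_enable:
        new_state = [scan_in] + state[:-1]
    else:
        new_state = list(functional_inputs)
    return new_state, scan_out


def scan_dump(state):
    """Shift the whole chain out, one bit per clock, and return it in chain order."""
    bits = []
    for _ in range(len(state)):
        state, bit = clock_chain(state, [0] * len(state), 0, True)
        bits.append(bit)
    # The last flop in the chain comes out first, so reverse to get chain order.
    return bits[::-1], state


if __name__ == "__main__":
    frozen_state = [1, 0, 1, 1]          # register state at the moment of the hang
    dumped, after = scan_dump(frozen_state)
    print(dumped, after)
```

In this model the dump takes one clock per flop and leaves the chain filled with the scan-in value, which mirrors the point that a scan dump is a one-shot snapshot of flop state only.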
598 00:21:49,450 --> 00:21:52,059 But these are two examples of 599 00:21:52,060 --> 00:21:54,099 practices that are often used. 600 00:21:54,100 --> 00:21:56,439 There are many others 601 00:21:56,440 --> 00:21:58,229 which I can't talk about. 602 00:21:59,710 --> 00:22:02,079 So let's assume 603 00:22:02,080 --> 00:22:04,179 that we've used some methods and we've 604 00:22:04,180 --> 00:22:05,350 figured out what happened. 605 00:22:06,910 --> 00:22:09,129 Fixing it is not exactly a piece 606 00:22:09,130 --> 00:22:10,689 of cake either. You know, the simplest 607 00:22:10,690 --> 00:22:12,909 thing would be to go fix the Verilog code 608 00:22:12,910 --> 00:22:15,069 and redo the entire 609 00:22:15,070 --> 00:22:17,259 design, which takes two months and three 610 00:22:17,260 --> 00:22:18,669 million dollars. 611 00:22:18,670 --> 00:22:21,039 So the good news is that you don't have 612 00:22:21,040 --> 00:22:22,779 to all the time. 613 00:22:22,780 --> 00:22:24,789 The way that modern chips are built is 614 00:22:24,790 --> 00:22:26,439 that there is something called a base 615 00:22:26,440 --> 00:22:28,689 layer and then metal 616 00:22:28,690 --> 00:22:30,999 layers, and modern process technology 617 00:22:31,000 --> 00:22:33,549 has up to about nine different layers. 618 00:22:33,550 --> 00:22:36,099 And to overly simplify 619 00:22:36,100 --> 00:22:38,829 this, because I'm not a process engineer, 620 00:22:38,830 --> 00:22:41,229 the way this works is the base layer has 621 00:22:41,230 --> 00:22:43,419 your logic gates, your AND gates, your 622 00:22:43,420 --> 00:22:45,639 OR gates, et cetera, and the metal layers 623 00:22:45,640 --> 00:22:47,949 are the wires that connect those gates 624 00:22:47,950 --> 00:22:48,939 together.
625 00:22:48,940 --> 00:22:51,219 So what that means is if 626 00:22:51,220 --> 00:22:53,229 you need to add new gates in the design, 627 00:22:53,230 --> 00:22:55,089 you've got to change the base layer, which 628 00:22:55,090 --> 00:22:56,529 is the most expensive thing, and you've 629 00:22:56,530 --> 00:22:57,489 got to wait the full time. 630 00:22:57,490 --> 00:22:58,509 And that sucks. 631 00:22:58,510 --> 00:23:00,549 If you don't need to do that, you can 632 00:23:00,550 --> 00:23:02,709 just change maybe even one 633 00:23:02,710 --> 00:23:05,109 metal layer in that stack. That tends 634 00:23:05,110 --> 00:23:06,819 to be significantly cheaper. 635 00:23:06,820 --> 00:23:09,399 It also means 636 00:23:09,400 --> 00:23:11,589 that it's less of a delay through the fab 637 00:23:11,590 --> 00:23:13,239 because you can kind of intercept their 638 00:23:13,240 --> 00:23:15,309 pipeline, because chips are built first at 639 00:23:15,310 --> 00:23:17,259 the base and then the metal layers as 640 00:23:17,260 --> 00:23:18,159 they go up. 641 00:23:18,160 --> 00:23:20,289 So inserting a new metal layer into that 642 00:23:20,290 --> 00:23:22,839 might only cause a few weeks 643 00:23:22,840 --> 00:23:24,910 of delay before you get the results back. 644 00:23:26,200 --> 00:23:28,329 One thing that's common for 645 00:23:28,330 --> 00:23:30,879 physical designers to do is that whenever 646 00:23:30,880 --> 00:23:33,219 they're building a block, if they have 647 00:23:33,220 --> 00:23:35,679 any white space in that area, 648 00:23:35,680 --> 00:23:38,289 they will actually put in extra gates 649 00:23:38,290 --> 00:23:39,939 that are not connected to anything. 650 00:23:39,940 --> 00:23:42,009 They're just there on the off chance that 651 00:23:42,010 --> 00:23:44,079 there's a bug and you need a new gate and 652 00:23:44,080 --> 00:23:45,189 you want to wire it up.
653 00:23:45,190 --> 00:23:47,439 So it's, it's 654 00:23:47,440 --> 00:23:48,729 kind of, if you're building the silicon, 655 00:23:48,730 --> 00:23:50,670 why not put some useful things in it? 656 00:23:53,890 --> 00:23:55,989 Now, those are the costly solutions, 657 00:23:55,990 --> 00:23:58,239 and that's the only way that you can really 658 00:23:58,240 --> 00:24:00,789 fix an issue, 659 00:24:00,790 --> 00:24:02,709 but there are a lot of things you can do 660 00:24:02,710 --> 00:24:05,229 to work around the issue 661 00:24:05,230 --> 00:24:07,179 that might not cost as much as three 662 00:24:07,180 --> 00:24:08,649 million dollars. 663 00:24:08,650 --> 00:24:10,959 So, you know, one thing that we do is, 664 00:24:10,960 --> 00:24:13,149 like, if there's a problem, you, you 665 00:24:13,150 --> 00:24:15,279 know, you go to the lab and 666 00:24:15,280 --> 00:24:17,439 you try to look for one of these and see 667 00:24:17,440 --> 00:24:19,540 if it can rewire your chip for you. 668 00:24:20,770 --> 00:24:22,539 We tend to have the more sophisticated 669 00:24:22,540 --> 00:24:24,669 version, which looks like this. 670 00:24:24,670 --> 00:24:27,069 This is a focused ion beam machine. 671 00:24:28,840 --> 00:24:30,189 It's a cool beast. 672 00:24:30,190 --> 00:24:32,259 It can take a chip that's 673 00:24:32,260 --> 00:24:34,599 already been fabricated, and 674 00:24:34,600 --> 00:24:36,729 it has an electron microscope in 675 00:24:36,730 --> 00:24:38,679 there. It actually shoots little ions 676 00:24:38,680 --> 00:24:40,869 into it and it can rewire 677 00:24:40,870 --> 00:24:42,249 small parts of the design. 678 00:24:42,250 --> 00:24:43,899 You can't do major things, but you can do 679 00:24:43,900 --> 00:24:44,900 small things. 680 00:24:46,010 --> 00:24:48,259 This is done 681 00:24:48,260 --> 00:24:50,329 on a per-chip basis.
682 00:24:50,330 --> 00:24:53,419 The only issue with this is 683 00:24:53,420 --> 00:24:55,609 you have to do it chip 684 00:24:55,610 --> 00:24:57,769 by chip. And also the chips have a very 685 00:24:57,770 --> 00:24:59,779 strong tendency to die in about one to 686 00:24:59,780 --> 00:25:01,399 two weeks afterwards. 687 00:25:01,400 --> 00:25:03,529 And that's just because the process is 688 00:25:03,530 --> 00:25:05,749 so destructive to the chip. 689 00:25:05,750 --> 00:25:08,179 So if you need to prototype something, 690 00:25:08,180 --> 00:25:09,139 it's great. 691 00:25:09,140 --> 00:25:10,999 You can do it in a couple of hours. 692 00:25:11,000 --> 00:25:13,309 You get your results, you can try it out. 693 00:25:13,310 --> 00:25:15,439 But this is not 694 00:25:15,440 --> 00:25:17,839 going to be a production solution, and 695 00:25:17,840 --> 00:25:18,840 neither is the cat. 696 00:25:21,060 --> 00:25:22,619 So what else can we do? 697 00:25:22,620 --> 00:25:24,719 Well, one very common practice 698 00:25:24,720 --> 00:25:26,849 is that hardware designers 699 00:25:26,850 --> 00:25:29,219 will put disable bits into 700 00:25:29,220 --> 00:25:30,839 the hardware. And I've always heard these 701 00:25:30,840 --> 00:25:32,909 called chicken bits, because 702 00:25:32,910 --> 00:25:34,799 the designer is chicken and maybe the 703 00:25:34,800 --> 00:25:36,149 thing won't work. 704 00:25:36,150 --> 00:25:38,729 And these are 705 00:25:38,730 --> 00:25:40,209 very useful. 706 00:25:40,210 --> 00:25:43,139 They are typically used for performance 707 00:25:43,140 --> 00:25:45,299 or power enhancements on a design, 708 00:25:45,300 --> 00:25:47,339 so that you could disable a certain 709 00:25:47,340 --> 00:25:49,109 feature and the processor still works 710 00:25:49,110 --> 00:25:50,429 just fine. It's maybe a little bit 711 00:25:50,430 --> 00:25:51,689 slower.
712 00:25:51,690 --> 00:25:54,509 And it's worth noting about this that 713 00:25:54,510 --> 00:25:57,269 when processors are built, 714 00:25:57,270 --> 00:25:59,579 there are some things that give 715 00:25:59,580 --> 00:26:00,779 you a ton of performance. 716 00:26:00,780 --> 00:26:02,369 I mean, branch prediction, everyone's got 717 00:26:02,370 --> 00:26:03,809 branch prediction nowadays. 718 00:26:03,810 --> 00:26:04,920 But the way the 719 00:26:06,210 --> 00:26:08,369 chips get the new 720 00:26:08,370 --> 00:26:09,959 performance that you see generation on 721 00:26:09,960 --> 00:26:12,119 generation is generally by 722 00:26:12,120 --> 00:26:14,609 a sum of very, very, very small parts. 723 00:26:14,610 --> 00:26:17,099 There will be features that get you 724 00:26:17,100 --> 00:26:19,289 0.5 percent over here, 725 00:26:19,290 --> 00:26:20,669 0.25 percent over here. 726 00:26:20,670 --> 00:26:21,869 You know, maybe there's a big feature that 727 00:26:21,870 --> 00:26:22,870 gets you one percent. 728 00:26:24,000 --> 00:26:26,009 These all stack up and then you get your 729 00:26:26,010 --> 00:26:27,839 10, 15 percent improvement, whatever 730 00:26:27,840 --> 00:26:29,099 you're expecting. 731 00:26:29,100 --> 00:26:31,379 If you need to disable one of those 732 00:26:31,380 --> 00:26:33,629 to fix a critical bug, that's not always 733 00:26:33,630 --> 00:26:35,099 the end of the world. 734 00:26:35,100 --> 00:26:37,229 And there have been cases where 735 00:26:37,230 --> 00:26:39,359 that's been the workaround that had to go 736 00:26:39,360 --> 00:26:40,529 out. 737 00:26:40,530 --> 00:26:42,689 You can find some of these in a 738 00:26:42,690 --> 00:26:44,819 document that AMD 739 00:26:44,820 --> 00:26:46,409 publishes called the BIOS and Kernel 740 00:26:46,410 --> 00:26:47,369 Developer's Guide. 741 00:26:47,370 --> 00:26:49,019 I'm sure there's an Intel equivalent to 742 00:26:49,020 --> 00:26:50,159 this as well.
743 00:26:50,160 --> 00:26:52,469 In fact, I think we've seen screenshots 744 00:26:52,470 --> 00:26:54,539 of it in other presentations. 745 00:26:54,540 --> 00:26:55,799 On x86, 746 00:26:55,800 --> 00:26:57,939 these sort of bits live in what's called 747 00:26:57,940 --> 00:27:00,519 the model-specific registers, or MSRs. 748 00:27:00,520 --> 00:27:03,059 This is an example of one, 749 00:27:03,060 --> 00:27:05,249 the data cache configuration register. 750 00:27:05,250 --> 00:27:07,499 And there's a few bits in here that 751 00:27:07,500 --> 00:27:09,599 are defined that can be used to 752 00:27:09,600 --> 00:27:11,879 disable certain aspects of the hardware 753 00:27:11,880 --> 00:27:13,739 prefetcher, which is, of course, a 754 00:27:13,740 --> 00:27:15,099 performance enhancement. 755 00:27:15,100 --> 00:27:17,339 There's also a bit to disable 756 00:27:17,340 --> 00:27:19,589 speculative table walks as 757 00:27:19,590 --> 00:27:20,819 well. 758 00:27:20,820 --> 00:27:22,949 These features are useful not 759 00:27:22,950 --> 00:27:24,629 only for production, 760 00:27:24,630 --> 00:27:26,639 if you have a bug with, say, the hardware 761 00:27:26,640 --> 00:27:28,829 prefetcher. They're also very useful 762 00:27:28,830 --> 00:27:30,419 in debugging because you can start 763 00:27:30,420 --> 00:27:32,369 disabling things until you get down to 764 00:27:32,370 --> 00:27:33,899 the root cause. 765 00:27:33,900 --> 00:27:36,239 This does require that designers 766 00:27:36,240 --> 00:27:38,699 kind of think about what failures 767 00:27:38,700 --> 00:27:39,909 they're going to have and what things 768 00:27:39,910 --> 00:27:41,279 they're going to want to disable down the 769 00:27:41,280 --> 00:27:43,649 road.
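As a rough software analogy for the chicken bits just described, the sketch below models a configuration register whose set bits turn optional features off. Everything here is an assumption for illustration: the class name `DataCacheConfig`, the constant names and the bit positions are hypothetical and are not taken from any AMD or Intel register definition.

```python
# Hypothetical bit positions, chosen only for this example.
DISABLE_HW_PREFETCH = 1 << 0
DISABLE_SPEC_TABLE_WALK = 1 << 1


class DataCacheConfig:
    """Toy model of a configuration register holding disable ("chicken") bits."""

    def __init__(self):
        # Reset value 0: every optional feature is enabled by default.
        self.value = 0

    def set_bits(self, mask):
        """Set disable bits, turning the corresponding features off."""
        self.value |= mask

    def clear_bits(self, mask):
        """Clear disable bits, turning the corresponding features back on."""
        self.value &= ~mask

    def feature_enabled(self, disable_mask):
        """A feature runs only while its disable bit is clear."""
        return (self.value & disable_mask) == 0


if __name__ == "__main__":
    cfg = DataCacheConfig()
    cfg.set_bits(DISABLE_HW_PREFETCH)    # work around a suspected prefetcher bug
    print(cfg.feature_enabled(DISABLE_HW_PREFETCH))
    print(cfg.feature_enabled(DISABLE_SPEC_TABLE_WALK))
```

Keeping the bits as "disable" rather than "enable" means the reset value of zero leaves the full-performance design in place, and a workaround only has to set the one bit for the suspect feature.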
And it's one of these things 770 00:27:43,650 --> 00:27:46,049 where you might as well throw 771 00:27:46,050 --> 00:27:48,059 the kitchen sink at it, because I'd much 772 00:27:48,060 --> 00:27:50,399 rather never set a bit than 773 00:27:50,400 --> 00:27:52,499 have a bug that requires 774 00:27:52,500 --> 00:27:53,940 a three million dollar respin. 775 00:27:57,020 --> 00:27:59,389 Another option that is available 776 00:27:59,390 --> 00:28:01,339 on modern CPUs is something called a 777 00:28:01,340 --> 00:28:03,499 microcode patch. Microcode 778 00:28:03,500 --> 00:28:05,539 is like an on-chip firmware. 779 00:28:05,540 --> 00:28:08,059 It's used on processors 780 00:28:08,060 --> 00:28:10,279 typically for implementing things 781 00:28:10,280 --> 00:28:12,379 like complex x86 instructions. 782 00:28:12,380 --> 00:28:13,789 So like IRET, 783 00:28:14,900 --> 00:28:17,009 RSM. There's a whole bunch of 784 00:28:17,010 --> 00:28:18,769 other hardware tasks, things like 785 00:28:18,770 --> 00:28:21,199 interrupt delivery 786 00:28:21,200 --> 00:28:23,149 and a lot of power management features, 787 00:28:23,150 --> 00:28:25,429 and microcode basically breaks 788 00:28:25,430 --> 00:28:27,529 up these complex flows into 789 00:28:27,530 --> 00:28:29,509 a sequence of smaller operations that the 790 00:28:29,510 --> 00:28:31,759 hardware can natively understand. 791 00:28:31,760 --> 00:28:33,889 The way that microcode is built 792 00:28:33,890 --> 00:28:35,959 is that it's in a ROM 793 00:28:35,960 --> 00:28:38,719 that's physically present in the silicon.
794 00:28:38,720 --> 00:28:40,759 But a very common practice is to put a 795 00:28:40,760 --> 00:28:43,369 small SRAM next to that ROM, 796 00:28:43,370 --> 00:28:45,529 called patch RAM, and that patch 797 00:28:45,530 --> 00:28:47,959 RAM can then be used to replace some 798 00:28:47,960 --> 00:28:50,449 or maybe even all of the microcode 799 00:28:50,450 --> 00:28:52,819 if needed, to either fix bugs 800 00:28:52,820 --> 00:28:54,799 or work around things in some way. 801 00:28:54,800 --> 00:28:57,679 And so this is useful for 802 00:28:57,680 --> 00:28:59,509 modifying instruction behavior. 803 00:28:59,510 --> 00:29:01,909 So, for instance, if you had to add 804 00:29:01,910 --> 00:29:04,219 a serialization after a CLFLUSH, 805 00:29:04,220 --> 00:29:05,959 that's something that microcode 806 00:29:05,960 --> 00:29:06,889 could do. 807 00:29:06,890 --> 00:29:09,019 If there is some rare corner case 808 00:29:09,020 --> 00:29:11,149 where you discover there's a bug in the 809 00:29:11,150 --> 00:29:13,339 microcode flow, you can 810 00:29:13,340 --> 00:29:14,340 patch it through that. 811 00:29:15,650 --> 00:29:17,569 There's not a lot of public documentation 812 00:29:17,570 --> 00:29:19,939 on microcode patches. 813 00:29:19,940 --> 00:29:22,009 The best resource that 814 00:29:22,010 --> 00:29:23,899 you could probably find is going to be in 815 00:29:23,900 --> 00:29:25,069 the Linux kernel. 816 00:29:25,070 --> 00:29:26,539 There's a path to where the microcode 817 00:29:26,540 --> 00:29:28,159 patch loaders are. There's one for Intel, 818 00:29:28,160 --> 00:29:29,179 there's one for AMD. 819 00:29:30,310 --> 00:29:32,389 One thing I'll say about 820 00:29:32,390 --> 00:29:34,399 this is microcode patches are typically 821 00:29:34,400 --> 00:29:36,229 signed by the vendor.
822 00:29:36,230 --> 00:29:38,659 So I'm sorry, but it's not something 823 00:29:38,660 --> 00:29:40,789 that you can necessarily go off 824 00:29:40,790 --> 00:29:43,009 and write. But 825 00:29:43,010 --> 00:29:45,559 these are a very useful tool 826 00:29:45,560 --> 00:29:47,329 that I'm sure you can imagine would be 827 00:29:47,330 --> 00:29:49,399 applicable to things besides just x86 828 00:29:49,400 --> 00:29:51,769 cores as well, for patching 829 00:29:51,770 --> 00:29:54,049 things in the field if necessary. 830 00:29:56,820 --> 00:29:59,609 So putting this all together: 831 00:29:59,610 --> 00:30:01,229 when dealing with hardware, we've talked 832 00:30:01,230 --> 00:30:03,509 about JTAG debugging and scan 833 00:30:03,510 --> 00:30:05,639 as two ways that you can help identify 834 00:30:05,640 --> 00:30:07,529 the problem. When it comes to fixing 835 00:30:07,530 --> 00:30:09,419 things, I use "workaround" in there as 836 00:30:09,420 --> 00:30:11,759 well, because sometimes 837 00:30:11,760 --> 00:30:13,319 you've got to get stuff out the door and 838 00:30:13,320 --> 00:30:14,579 you do whatever it takes. 839 00:30:14,580 --> 00:30:16,439 And the line between fix and workaround 840 00:30:16,440 --> 00:30:18,329 becomes very blurred. 841 00:30:18,330 --> 00:30:20,729 But you obviously have silicon spins, 842 00:30:20,730 --> 00:30:22,619 you have microcode patches, you have 843 00:30:22,620 --> 00:30:24,689 chicken bits, and if 844 00:30:24,690 --> 00:30:26,969 you need a quick fix, you can always 845 00:30:26,970 --> 00:30:28,079 go and get a FIB done. 846 00:30:30,480 --> 00:30:32,879 So since 847 00:30:32,880 --> 00:30:35,069 a lot of us are security people, 848 00:30:35,070 --> 00:30:37,139 I felt like I should mention something 849 00:30:37,140 --> 00:30:38,609 about security.
850 00:30:38,610 --> 00:30:40,799 All of the debug interfaces that I've 851 00:30:40,800 --> 00:30:43,529 mentioned here might 852 00:30:43,530 --> 00:30:45,599 be considered more than debug interfaces 853 00:30:45,600 --> 00:30:47,819 to some, and that's something 854 00:30:47,820 --> 00:30:49,499 that can't be ignored. 855 00:30:49,500 --> 00:30:51,599 Debug interface security needs 856 00:30:51,600 --> 00:30:53,459 to be something that's part of the design 857 00:30:53,460 --> 00:30:54,779 and is tested. 858 00:30:54,780 --> 00:30:57,059 Some examples of the security that 859 00:30:57,060 --> 00:30:59,009 I've seen in the past have been, for 860 00:30:59,010 --> 00:31:01,199 instance, to disable 861 00:31:01,200 --> 00:31:03,209 some or all of the JTAG commands on 862 00:31:03,210 --> 00:31:04,769 production parts. 863 00:31:04,770 --> 00:31:07,109 A typical way this is done is that 864 00:31:07,110 --> 00:31:09,299 there are fuses that exist on 865 00:31:09,300 --> 00:31:10,379 the silicon, 866 00:31:10,380 --> 00:31:12,779 basically one-time programmable memory. 867 00:31:12,780 --> 00:31:14,909 Once a part is configured for 868 00:31:14,910 --> 00:31:17,279 production, the fuse is blown and those 869 00:31:17,280 --> 00:31:18,509 instructions are disabled. 870 00:31:19,900 --> 00:31:22,359 Another 871 00:31:22,360 --> 00:31:24,489 possibility could be to 872 00:31:24,490 --> 00:31:26,829 ensure that debug access to sensitive 873 00:31:26,830 --> 00:31:28,509 information (there could be platform 874 00:31:28,510 --> 00:31:30,759 secrets like keys or 875 00:31:30,760 --> 00:31:32,709 firmware or something like that) is 876 00:31:32,710 --> 00:31:34,449 blocked in production. 877 00:31:34,450 --> 00:31:36,549 You can have some 878 00:31:36,550 --> 00:31:38,889 sort of authentication done to 879 00:31:38,890 --> 00:31:41,679 use a debug interface like JTAG. 880 00:31:41,680 --> 00:31:43,959 Obviously, you have to test that.
881 00:31:43,960 --> 00:31:46,389 Having to debug the debug authentication 882 00:31:46,390 --> 00:31:48,789 handshake is a real pain, 883 00:31:48,790 --> 00:31:51,369 but that's certainly 884 00:31:51,370 --> 00:31:53,649 a way that you could 885 00:31:53,650 --> 00:31:56,109 add some security to this. And signed 886 00:31:56,110 --> 00:31:58,449 CPU microcode updates are the common 887 00:31:58,450 --> 00:31:59,499 practice nowadays. 888 00:32:01,590 --> 00:32:03,689 So there's some takeaways 889 00:32:03,690 --> 00:32:05,759 I want to leave you with, and 890 00:32:05,760 --> 00:32:08,159 I know that CPU design 891 00:32:08,160 --> 00:32:10,319 may not be the project that you are about 892 00:32:10,320 --> 00:32:12,569 to go and start after 893 00:32:12,570 --> 00:32:13,679 midnight tonight. 894 00:32:13,680 --> 00:32:15,869 But I think a lot of the 895 00:32:15,870 --> 00:32:17,339 techniques here are hopefully still 896 00:32:17,340 --> 00:32:19,529 interesting and hopefully can be applied 897 00:32:19,530 --> 00:32:22,169 to other projects as well. 898 00:32:22,170 --> 00:32:24,389 First off, breaking down large designs 899 00:32:24,390 --> 00:32:26,639 into small chunks seems somewhat 900 00:32:26,640 --> 00:32:28,709 obvious, but that's absolutely 901 00:32:28,710 --> 00:32:30,209 critical, especially when you're dealing 902 00:32:30,210 --> 00:32:32,729 with something that runs as slow as 903 00:32:32,730 --> 00:32:35,909 a CPU does in simulation. 904 00:32:35,910 --> 00:32:37,829 Using tools to get the most out of your 905 00:32:37,830 --> 00:32:39,959 test time is certainly important. 906 00:32:39,960 --> 00:32:42,899 Using coverage tools, using formal tools.
907 00:32:42,900 --> 00:32:45,209 Anything that you can get your hands on 908 00:32:45,210 --> 00:32:47,579 is very useful 909 00:32:47,580 --> 00:32:50,549 just for maximizing the 910 00:32:50,550 --> 00:32:52,859 usefulness of each compute cycle 911 00:32:52,860 --> 00:32:54,509 you're running. But the biggest thing I 912 00:32:54,510 --> 00:32:56,309 would say is to just think about what the 913 00:32:56,310 --> 00:32:59,039 weaknesses in your testing flow are 914 00:32:59,040 --> 00:33:01,259 and have some way 915 00:33:01,260 --> 00:33:02,849 of addressing those. 916 00:33:02,850 --> 00:33:04,829 One thing you see with CPU designers is, 917 00:33:04,830 --> 00:33:06,359 over the years as they work on multiple 918 00:33:06,360 --> 00:33:08,279 projects, they know where the bugs are 919 00:33:08,280 --> 00:33:09,959 going to be. Even though they wrote the 920 00:33:09,960 --> 00:33:11,639 code, they know there are still going to be 921 00:33:11,640 --> 00:33:13,769 bugs in there and they build the design 922 00:33:13,770 --> 00:33:14,819 accordingly. 923 00:33:14,820 --> 00:33:17,579 So try to design for failures 924 00:33:17,580 --> 00:33:19,649 by building in as much 925 00:33:19,650 --> 00:33:22,199 into the original design as possible. 926 00:33:22,200 --> 00:33:24,359 So building in debug features 927 00:33:24,360 --> 00:33:26,549 if you need to, and securing those 928 00:33:26,550 --> 00:33:29,159 debug features, anticipating 929 00:33:29,160 --> 00:33:30,869 what the risk areas are going to be. 930 00:33:30,870 --> 00:33:33,839 And, you know, with hardware, 931 00:33:33,840 --> 00:33:36,149 you never really have 932 00:33:36,150 --> 00:33:38,159 this option, except with microcode, of 933 00:33:38,160 --> 00:33:39,659 just telling your users to go and 934 00:33:39,660 --> 00:33:40,859 download a patch.
935 00:33:40,860 --> 00:33:43,079 I mean, if you actually had to fix 936 00:33:43,080 --> 00:33:45,179 a real hardware bug out in the 937 00:33:45,180 --> 00:33:47,519 field, you'd be shipping silicon 938 00:33:47,520 --> 00:33:49,019 to millions and millions of people in the 939 00:33:49,020 --> 00:33:51,149 world. And that's not a 940 00:33:51,150 --> 00:33:53,609 position anyone wants to be in. 941 00:33:53,610 --> 00:33:56,099 So you have to have these other features 942 00:33:56,100 --> 00:33:58,229 built into the parts so that you can 943 00:33:58,230 --> 00:34:00,479 address errors as they come up. With 944 00:34:00,480 --> 00:34:01,480 software, 945 00:34:02,610 --> 00:34:05,489 that sometimes can still be the case. 946 00:34:05,490 --> 00:34:07,769 I'm not a software person, so, you 947 00:34:07,770 --> 00:34:09,178 know, take that with a grain of salt. 948 00:34:09,179 --> 00:34:11,099 But I'm sure there are some situations 949 00:34:11,100 --> 00:34:13,499 where just doing a software upgrade 950 00:34:13,500 --> 00:34:15,089 and asking users to download something 951 00:34:15,090 --> 00:34:17,218 may not be a practical solution for 952 00:34:17,219 --> 00:34:18,209 their environment. 953 00:34:18,210 --> 00:34:20,638 And in those cases, it's helpful 954 00:34:20,639 --> 00:34:23,488 to have some way to 955 00:34:23,489 --> 00:34:25,859 deploy updates, especially critical 956 00:34:25,860 --> 00:34:27,189 ones, if necessary. 957 00:34:28,560 --> 00:34:31,019 So that is 958 00:34:31,020 --> 00:34:32,399 what I have here. 959 00:34:32,400 --> 00:34:34,859 I do have some additional links 960 00:34:34,860 --> 00:34:37,019 if people are interested in reading 961 00:34:37,020 --> 00:34:38,129 more about this. 962 00:34:38,130 --> 00:34:40,289 The BIOS and Kernel Developer's Guide 963 00:34:40,290 --> 00:34:41,399 is a great resource.
964 00:34:41,400 --> 00:34:43,799 It might be a thousand 965 00:34:43,800 --> 00:34:46,439 pages or more, but it's a fascinating 966 00:34:46,440 --> 00:34:48,718 read and it goes through 967 00:34:48,719 --> 00:34:51,029 virtually every register that exists 968 00:34:51,030 --> 00:34:53,158 in x86 processors 969 00:34:53,159 --> 00:34:55,349 and what those bits do. The 970 00:34:55,350 --> 00:34:57,509 CPU revision guides, or the 971 00:34:57,510 --> 00:34:59,579 errata documents: both Intel and 972 00:34:59,580 --> 00:35:00,809 AMD publish them. 973 00:35:00,810 --> 00:35:02,519 You need to make sure to find the one for 974 00:35:02,520 --> 00:35:05,309 your specific CPU version. 975 00:35:05,310 --> 00:35:08,159 But they're very interesting to see 976 00:35:08,160 --> 00:35:10,229 not only what the bugs are, 977 00:35:10,230 --> 00:35:12,029 but you can also look from revision to 978 00:35:12,030 --> 00:35:14,069 revision and see kind of what bugs are so 979 00:35:14,070 --> 00:35:15,659 unimportant they're never getting fixed. 980 00:35:16,980 --> 00:35:19,259 And if you're interested 981 00:35:19,260 --> 00:35:21,359 in CPU verification, 982 00:35:21,360 --> 00:35:23,309 there's a lot of great resources. 983 00:35:23,310 --> 00:35:25,379 There's some YouTube links here. 984 00:35:25,380 --> 00:35:26,879 To be honest, you can just Google it. 985 00:35:26,880 --> 00:35:27,959 You'll find a lot of stuff. 986 00:35:29,100 --> 00:35:30,899 It's an interesting field. 987 00:35:30,900 --> 00:35:32,819 There's plenty of work 988 00:35:32,820 --> 00:35:35,099 that's done in optimizing 989 00:35:35,100 --> 00:35:37,379 verification and doing 990 00:35:37,380 --> 00:35:38,759 more formal proofs in more 991 00:35:38,760 --> 00:35:42,209 circumstances. Doing 992 00:35:42,210 --> 00:35:43,159 something called power-aware
993 00:35:43,160 --> 00:35:45,359 verification is very important now 994 00:35:45,360 --> 00:35:46,979 when we're dealing with chips where 995 00:35:46,980 --> 00:35:48,599 parts of it will be powered down at 996 00:35:48,600 --> 00:35:50,579 different times and you can't use those 997 00:35:50,580 --> 00:35:52,679 parts without powering them up. 998 00:35:52,680 --> 00:35:55,169 So there's a lot of interesting 999 00:35:55,170 --> 00:35:56,519 work being done in that space. 1000 00:35:56,520 --> 00:35:58,919 I encourage anyone interested to 1001 00:35:58,920 --> 00:36:00,449 go take a look at these. 1002 00:36:00,450 --> 00:36:02,579 And with that, I'll take some questions. 1003 00:36:14,670 --> 00:36:16,139 One, two. 1004 00:36:16,140 --> 00:36:17,829 OK, thank you very much, David. 1005 00:36:19,080 --> 00:36:20,999 Can I please add that? 1006 00:36:21,000 --> 00:36:22,529 No, I'm sorry. 1007 00:36:22,530 --> 00:36:25,469 Would you please queue up at one and three. 1008 00:36:25,470 --> 00:36:27,089 Something, some glitch. 1009 00:36:27,090 --> 00:36:28,090 I'm sorry. 1010 00:36:29,520 --> 00:36:31,529 OK, first question from the room, 1011 00:36:31,530 --> 00:36:32,489 gentleman 1012 00:36:32,490 --> 00:36:33,490 number three. 1013 00:36:34,620 --> 00:36:35,819 Thanks, first of all, for the 1014 00:36:35,820 --> 00:36:38,279 awesome talk, and I don't know if you 1015 00:36:38,280 --> 00:36:39,899 know anything about this or if this is 1016 00:36:39,900 --> 00:36:42,029 outside your domain, but I'm wondering 1017 00:36:42,030 --> 00:36:44,219 how does this sort of layout aspect 1018 00:36:44,220 --> 00:36:45,959 come into this? So if I'm programming an 1019 00:36:45,960 --> 00:36:48,149 FPGA, I just blast my 1020 00:36:48,150 --> 00:36:50,009 design in there and it's laid out 1021 00:36:50,010 --> 00:36:51,209 automatically and everything just 1022 00:36:51,210 --> 00:36:52,210 happens.
1023 00:36:52,740 --> 00:36:55,109 But are CPUs actually laid out manually 1024 00:36:55,110 --> 00:36:57,239 or how does that look on the sort of 1025 00:36:57,240 --> 00:36:58,349 analog side? 1026 00:36:58,350 --> 00:36:59,350 Yeah. 1027 00:37:02,290 --> 00:37:04,510 A bit of everything, I would say. 1028 00:37:06,070 --> 00:37:08,169 It used to be 1029 00:37:08,170 --> 00:37:10,479 that everything was laid out manually and 1030 00:37:10,480 --> 00:37:12,339 that was because the tools at the time 1031 00:37:12,340 --> 00:37:14,259 were simply not good at designs of that 1032 00:37:14,260 --> 00:37:15,889 size. Recently, 1033 00:37:15,890 --> 00:37:17,269 and when I say recently, I mean the last 1034 00:37:17,270 --> 00:37:19,539 probably five to 10 years, there has 1035 00:37:19,540 --> 00:37:21,879 been much more of a push towards 1036 00:37:21,880 --> 00:37:24,729 automated layout and automated synthesis. 1037 00:37:24,730 --> 00:37:26,949 And frankly, the modern tools are really, 1038 00:37:26,950 --> 00:37:27,999 really good. 1039 00:37:28,000 --> 00:37:30,099 And what tends to 1040 00:37:30,100 --> 00:37:32,199 happen is that you actually 1041 00:37:32,200 --> 00:37:34,479 do best if you take, say, 1042 00:37:34,480 --> 00:37:36,579 an entire CPU and just throw it 1043 00:37:36,580 --> 00:37:38,439 at the tool and say figure out where 1044 00:37:38,440 --> 00:37:39,489 stuff goes. Right. 1045 00:37:39,490 --> 00:37:41,649 And in some cases, you don't even 1046 00:37:41,650 --> 00:37:43,479 say, here's where I want instruction 1047 00:37:43,480 --> 00:37:45,039 fetch, here's where I want the multiplier or 1048 00:37:45,040 --> 00:37:46,749 things like that. You just give it that. 1049 00:37:46,750 --> 00:37:48,699 And it has a good way of figuring out 1050 00:37:48,700 --> 00:37:50,529 just based on the connections where 1051 00:37:50,530 --> 00:37:52,329 things need to be in relationship to each 1052 00:37:52,330 --> 00:37:54,729 other.
So I 1053 00:37:54,730 --> 00:37:56,799 would say it's mostly automated, but 1054 00:37:56,800 --> 00:37:59,349 with some human input as well. 1055 00:37:59,350 --> 00:38:01,569 And, you 1056 00:38:01,570 --> 00:38:04,059 know, I think that it's 1057 00:38:04,060 --> 00:38:06,159 different from FPGAs 1058 00:38:06,160 --> 00:38:07,869 because of the way it's built, obviously, 1059 00:38:07,870 --> 00:38:08,829 with ASICs. 1060 00:38:08,830 --> 00:38:10,989 But the tools 1061 00:38:10,990 --> 00:38:12,820 certainly have some similarities to them. 1062 00:38:15,260 --> 00:38:17,719 OK, the gentleman at number one. 1063 00:38:17,720 --> 00:38:20,099 so modern and support 1064 00:38:20,100 --> 00:38:22,219 these Fed into consists of 1065 00:38:22,220 --> 00:38:25,249 well, it doesn't include the third one 1066 00:38:25,250 --> 00:38:27,349 incased Latz Microsleep 1067 00:38:27,350 --> 00:38:29,509 for power management and, of course, an 1068 00:38:29,510 --> 00:38:31,469 x86 core for 1069 00:38:32,570 --> 00:38:33,469 processing. 1070 00:38:33,470 --> 00:38:36,289 And the third core included 1071 00:38:36,290 --> 00:38:39,259 in modern spaces, the 1072 00:38:39,260 --> 00:38:41,729 secure whatever. 1073 00:38:41,730 --> 00:38:43,849 Do you test the whole package 1074 00:38:43,850 --> 00:38:45,739 or just individual parts? 1075 00:38:45,740 --> 00:38:48,919 So, yes. 1076 00:38:48,920 --> 00:38:51,379 So typically 1077 00:38:51,380 --> 00:38:53,509 the way it works is that each chip 1078 00:38:53,510 --> 00:38:54,949 is built as a collection of what's called 1079 00:38:54,950 --> 00:38:57,079 IPs, and the 1080 00:38:57,080 --> 00:38:59,129 x86 core is one IP. 1081 00:38:59,130 --> 00:39:01,319 The power management controller, called 1082 00:39:01,320 --> 00:39:02,899 the SMU, is one IP.
1083 00:39:02,900 --> 00:39:05,269 The security processor, called the 1084 00:39:05,270 --> 00:39:07,339 platform security processor, is one 1085 00:39:07,340 --> 00:39:09,199 IP and you have a whole bunch of others 1086 00:39:09,200 --> 00:39:10,639 in there. You have memory controllers, 1087 00:39:10,640 --> 00:39:13,129 you have Southbridge, etc. 1088 00:39:13,130 --> 00:39:15,229 So those do most of their 1089 00:39:15,230 --> 00:39:16,879 verification on their own. 1090 00:39:16,880 --> 00:39:19,279 But there is a system level verification, 1091 00:39:19,280 --> 00:39:21,109 that thing that runs at one hertz, that's 1092 00:39:21,110 --> 00:39:23,839 done with all of those together and 1093 00:39:23,840 --> 00:39:26,119 that is more limited 1094 00:39:26,120 --> 00:39:27,889 because of the speed involved. 1095 00:39:27,890 --> 00:39:30,319 But there certainly is verification done 1096 00:39:30,320 --> 00:39:31,320 on the whole piece. 1097 00:39:34,560 --> 00:39:36,000 We'll have to include 1098 00:39:37,980 --> 00:39:40,349 people outside. Do we have 1099 00:39:40,350 --> 00:39:42,379 a question from the Internet? 1100 00:39:42,380 --> 00:39:42,929 Yes. 1101 00:39:42,930 --> 00:39:44,820 So where 1102 00:39:45,840 --> 00:39:48,150 would one actually set those 1103 00:39:49,290 --> 00:39:51,119 chicken bits and can it be set in the 1104 00:39:51,120 --> 00:39:52,459 operating system? 1105 00:39:52,460 --> 00:39:53,789 So, yeah. 1106 00:39:53,790 --> 00:39:55,529 So the chicken bits live in model- 1107 00:39:55,530 --> 00:39:56,519 specific registers. 1108 00:39:56,520 --> 00:39:58,379 So they're set using the WRMSR 1109 00:39:58,380 --> 00:39:59,309 command. 1110 00:39:59,310 --> 00:40:01,469 You need to be at ring zero to do that. 1111 00:40:01,470 --> 00:40:03,050 But that's it. 1112 00:40:05,470 --> 00:40:06,429 OK, gentleman
1113 00:40:06,430 --> 00:40:08,709 number three. I 1114 00:40:08,710 --> 00:40:10,779 am very impressed with your talk because 1115 00:40:10,780 --> 00:40:12,549 I'm a security guy and definitely not the 1116 00:40:12,550 --> 00:40:14,829 hardware guy, but as a security 1117 00:40:14,830 --> 00:40:17,449 guy, I see this process 1118 00:40:17,450 --> 00:40:19,539 that's struggling 1119 00:40:19,540 --> 00:40:21,999 with keeping out unintentional bugs. 1120 00:40:22,000 --> 00:40:24,189 So this process, how susceptible is 1121 00:40:24,190 --> 00:40:26,319 it to a person 1122 00:40:26,320 --> 00:40:28,269 within your organization trying to 1123 00:40:28,270 --> 00:40:31,059 introduce a really hard to detect 1124 00:40:31,060 --> 00:40:32,170 intentional bug? 1125 00:40:35,080 --> 00:40:36,459 How would you detect such a thing 1126 00:40:36,460 --> 00:40:38,400 otherwise, right? 1127 00:40:41,450 --> 00:40:43,789 You know, I think that's probably an area 1128 00:40:43,790 --> 00:40:45,409 I can't talk too much about, 1129 00:40:47,510 --> 00:40:49,639 but I'm trying to think if there's 1130 00:40:49,640 --> 00:40:51,709 anything I can say about that. 1131 00:40:51,710 --> 00:40:53,030 I would say that, 1132 00:40:54,110 --> 00:40:56,359 you know, there are a lot 1133 00:40:56,360 --> 00:40:58,399 of different phases to the design 1134 00:40:58,400 --> 00:41:00,529 process. There's a lot of eyes that 1135 00:41:00,530 --> 00:41:01,530 see things. 1136 00:41:02,390 --> 00:41:04,819 Speaking for myself, I think it'd be 1137 00:41:04,820 --> 00:41:07,129 very difficult for someone 1138 00:41:07,130 --> 00:41:09,199 to get something through kind 1139 00:41:09,200 --> 00:41:11,029 of all the different checks and balances 1140 00:41:11,030 --> 00:41:12,030 that there are. 1141 00:41:12,680 --> 00:41:15,469 But beyond that, 1142 00:41:15,470 --> 00:41:17,239 there's not a whole lot I can tell you.
1143 00:41:17,240 --> 00:41:19,129 One thing a little bit related to what 1144 00:41:19,130 --> 00:41:21,319 you talk about is that there's 1145 00:41:21,320 --> 00:41:24,019 also the whole piece of the fab, right. 1146 00:41:24,020 --> 00:41:25,639 Let's say that your company produces a 1147 00:41:25,640 --> 00:41:28,669 design that is perfect, whatever. 1148 00:41:28,670 --> 00:41:30,409 It may not be the design that the fab 1149 00:41:30,410 --> 00:41:31,579 sends you back. 1150 00:41:31,580 --> 00:41:33,259 And that's the whole issue called supply 1151 00:41:33,260 --> 00:41:35,689 chain security, which is 1152 00:41:35,690 --> 00:41:37,789 certainly something on our radar as well. 1153 00:41:37,790 --> 00:41:39,349 You know, unfortunately, it can be 1154 00:41:39,350 --> 00:41:41,929 very difficult to... 1155 00:41:41,930 --> 00:41:43,399 Well, let me say this. 1156 00:41:43,400 --> 00:41:45,589 When you get silicon back, you tend 1157 00:41:45,590 --> 00:41:47,539 to test for the features that you expect 1158 00:41:47,540 --> 00:41:49,669 to be there. It's very difficult to test 1159 00:41:49,670 --> 00:41:51,139 for features that are there that you're 1160 00:41:51,140 --> 00:41:52,140 not expecting. 1161 00:41:54,380 --> 00:41:56,199 OK, thank you, the gentleman. 1162 00:41:56,200 --> 00:41:58,439 Number one is California. 1163 00:41:58,440 --> 00:42:00,709 Yeah, so we all know the pixel failure 1164 00:42:00,710 --> 00:42:02,389 classes of our LCD monitors 1165 00:42:02,390 --> 00:42:04,939 we all bought. Are there pixel 1166 00:42:04,940 --> 00:42:06,979 failure classes for CPUs? 1167 00:42:06,980 --> 00:42:08,839 And am I able to give you a million 1168 00:42:08,840 --> 00:42:10,939 dollars to get a better tested CPU than I 1169 00:42:10,940 --> 00:42:12,829 can buy on Amazon?
1170 00:42:12,830 --> 00:42:15,049 No, there are 1171 00:42:15,050 --> 00:42:17,289 not failure classes 1172 00:42:17,290 --> 00:42:18,619 or anything like that. 1173 00:42:18,620 --> 00:42:21,109 The CPUs 1174 00:42:21,110 --> 00:42:22,489 are all functionally the same. 1175 00:42:23,990 --> 00:42:26,329 You know, there can be differences 1176 00:42:26,330 --> 00:42:28,399 depending on your BIOS version as to 1177 00:42:28,400 --> 00:42:30,709 what fixes are applied, like what 1178 00:42:30,710 --> 00:42:32,899 microcode version you're loading, 1179 00:42:32,900 --> 00:42:35,329 because unfortunately, you 1180 00:42:35,330 --> 00:42:37,409 know, kind of like Trammell 1181 00:42:37,410 --> 00:42:39,469 was talking about, even 1182 00:42:39,470 --> 00:42:42,559 if there is a bug and we release, 1183 00:42:42,560 --> 00:42:44,959 say, a microcode update for something, 1184 00:42:44,960 --> 00:42:47,089 we can't force an 1185 00:42:47,090 --> 00:42:49,189 OEM to put it in their BIOS, we can't force 1186 00:42:49,190 --> 00:42:50,779 you to download it, so there's an 1187 00:42:50,780 --> 00:42:51,780 issue there. 1188 00:42:52,580 --> 00:42:54,949 The only thing I'll mention 1189 00:42:54,950 --> 00:42:57,049 about what we 1190 00:42:57,050 --> 00:42:58,879 call binning, which is where you test 1191 00:42:58,880 --> 00:43:00,049 parts and you figure out what bin they 1192 00:43:00,050 --> 00:43:02,269 go into, is that when we do make 1193 00:43:02,270 --> 00:43:04,609 parts, there are not different speed 1194 00:43:04,610 --> 00:43:05,989 grades or anything like that. 1195 00:43:05,990 --> 00:43:07,759 The way that you get a two gigahertz 1196 00:43:07,760 --> 00:43:09,499 versus a two point two versus a two point 1197 00:43:09,500 --> 00:43:11,509 four or anything else is just simply how it 1198 00:43:11,510 --> 00:43:13,339 came out of the fab. Some parts just run 1199 00:43:13,340 --> 00:43:14,329 faster than others.
1200 00:43:14,330 --> 00:43:16,069 Some parts burn more or less power than 1201 00:43:16,070 --> 00:43:17,070 others. 1202 00:43:18,350 --> 00:43:20,149 So, no. 1203 00:43:20,150 --> 00:43:23,239 You know, you have to get lucky if you 1204 00:43:23,240 --> 00:43:25,399 want a fast one. Whether your part 1205 00:43:25,400 --> 00:43:27,439 can run faster or not is often just luck. 1206 00:43:29,740 --> 00:43:31,899 Sorry, um, we 1207 00:43:31,900 --> 00:43:34,359 got something on the Internet in between, 1208 00:43:35,510 --> 00:43:37,719 um, otherwise, all right. 1209 00:43:37,720 --> 00:43:40,419 Um, so in the verification 1210 00:43:40,420 --> 00:43:42,219 stage, how do you distinguish design 1211 00:43:42,220 --> 00:43:44,619 flaws from, uh, fabrication 1212 00:43:44,620 --> 00:43:45,620 issues? 1213 00:43:46,700 --> 00:43:48,829 So in the verification 1214 00:43:48,830 --> 00:43:50,899 stage, in the verification 1215 00:43:50,900 --> 00:43:52,309 stage, the design has not gone through 1216 00:43:52,310 --> 00:43:54,889 fabrication yet, so you're just testing 1217 00:43:54,890 --> 00:43:56,929 the Verilog code. 1218 00:43:56,930 --> 00:43:59,209 There are mechanisms 1219 00:43:59,210 --> 00:44:01,249 for testing when it comes back from the 1220 00:44:01,250 --> 00:44:03,349 fab, whether the 1221 00:44:03,350 --> 00:44:05,569 part was built correctly or not. 1222 00:44:05,570 --> 00:44:07,219 Sometimes, you know, that can be as 1223 00:44:07,220 --> 00:44:09,439 simple as reproducing 1224 00:44:09,440 --> 00:44:11,179 the same bug on multiple different parts 1225 00:44:11,180 --> 00:44:13,369 because chances are they weren't all made 1226 00:44:13,370 --> 00:44:14,299 incorrectly.
1227 00:44:14,300 --> 00:44:16,099 There's a number of other features we 1228 00:44:16,100 --> 00:44:18,229 build in that are under a giant category 1229 00:44:18,230 --> 00:44:19,879 of design for test features that 1230 00:44:19,880 --> 00:44:22,099 functionally validate whether 1231 00:44:22,100 --> 00:44:23,449 all the flip flops in the design are 1232 00:44:23,450 --> 00:44:24,499 working, things like that. 1233 00:44:24,500 --> 00:44:25,500 That can be another talk. 1234 00:44:26,630 --> 00:44:27,559 OK, gentleman 1235 00:44:27,560 --> 00:44:29,749 number three. I 1236 00:44:29,750 --> 00:44:31,999 would like to know how many 1237 00:44:32,000 --> 00:44:34,529 of these design cycles is a typical 1238 00:44:34,530 --> 00:44:36,079 x86 processor going through 1239 00:44:36,080 --> 00:44:37,959 before it's finished. 1240 00:44:39,200 --> 00:44:41,569 So it varies 1241 00:44:41,570 --> 00:44:43,789 significantly by how many new features, 1242 00:44:43,790 --> 00:44:45,319 of course, were added in a particular 1243 00:44:45,320 --> 00:44:46,519 generation. 1244 00:44:46,520 --> 00:44:48,649 I would say that sometimes it can 1245 00:44:48,650 --> 00:44:50,659 be as little as one or two. 1246 00:44:50,660 --> 00:44:53,129 Sometimes it can be, 1247 00:44:53,130 --> 00:44:54,920 I would say between five and 10. 1248 00:44:56,460 --> 00:44:58,579 One way of kind of tracking this is if 1249 00:44:58,580 --> 00:45:00,649 you hear about things like A zero 1250 00:45:00,650 --> 00:45:02,959 or B zero or B two parts, 1251 00:45:02,960 --> 00:45:05,689 the first letter is 1252 00:45:05,690 --> 00:45:07,669 basically the base layer version and then 1253 00:45:07,670 --> 00:45:09,289 the number is the metal layer version. 1254 00:45:09,290 --> 00:45:11,539 So like a B two part is 1255 00:45:11,540 --> 00:45:13,099 the second version of the base layer and 1256 00:45:13,100 --> 00:45:15,050 the third version of the metal layer.
1257 00:45:17,640 --> 00:45:20,039 Gentleman number one, please. 1258 00:45:20,040 --> 00:45:21,599 Forgive me for asking this, but I'm a 1259 00:45:21,600 --> 00:45:23,789 security researcher, so 1260 00:45:23,790 --> 00:45:25,919 if you dramatically simplified the 1261 00:45:25,920 --> 00:45:28,589 processor by removing all the legacy 1262 00:45:28,590 --> 00:45:30,989 and other crap in it, how 1263 00:45:32,310 --> 00:45:33,310 are you going? 1264 00:45:34,110 --> 00:45:35,969 What, are you calling the design crap? 1265 00:45:35,970 --> 00:45:37,469 No, no, I didn't say that. 1266 00:45:37,470 --> 00:45:39,839 But how would the testing 1267 00:45:39,840 --> 00:45:41,129 change? 1268 00:45:41,130 --> 00:45:42,130 So. 1269 00:45:43,990 --> 00:45:46,079 There are a lot of legacy features in 1270 00:45:46,080 --> 00:45:47,409 x86. 1271 00:45:47,410 --> 00:45:50,259 One interesting thing is that 1272 00:45:50,260 --> 00:45:52,269 it would not necessarily make things 1273 00:45:52,270 --> 00:45:54,909 simpler. And the reason is that 1274 00:45:54,910 --> 00:45:56,649 if you take something out, you have to 1275 00:45:56,650 --> 00:45:58,299 take it out of all of your existing tests 1276 00:45:58,300 --> 00:45:59,709 and out of all of your random test 1277 00:45:59,710 --> 00:46:01,449 generators and out of all of your models 1278 00:46:01,450 --> 00:46:03,309 that check things and this sort of thing. 1279 00:46:03,310 --> 00:46:05,619 And believe it or not, for some things 1280 00:46:05,620 --> 00:46:07,209 that can end up being more work than it 1281 00:46:07,210 --> 00:46:09,399 is to just put the darn thing in and 1282 00:46:09,400 --> 00:46:11,229 test it again, which I know is a little 1283 00:46:11,230 --> 00:46:13,509 counterintuitive, but that's the reality 1284 00:46:13,510 --> 00:46:15,059 of it.
Unfortunately, 1285 00:46:16,120 --> 00:46:18,189 it's really hard to take things out 1286 00:46:18,190 --> 00:46:20,349 of x86 because there's so much 1287 00:46:20,350 --> 00:46:21,839 software out there. 1288 00:46:21,840 --> 00:46:23,949 We finally got 1289 00:46:23,950 --> 00:46:27,019 rid of 3DNow! 1290 00:46:27,020 --> 00:46:29,379 And I think we're getting rid of A20, 1291 00:46:29,380 --> 00:46:31,909 which has been around since like the 286. 1292 00:46:31,910 --> 00:46:32,910 So I think. 1293 00:46:34,640 --> 00:46:36,769 But, yeah, it's a 1294 00:46:36,770 --> 00:46:37,770 tough battle. 1295 00:46:39,800 --> 00:46:41,149 Number three, please. 1296 00:46:41,150 --> 00:46:43,579 I actually have two questions. 1297 00:46:43,580 --> 00:46:45,469 The first one is you mentioned that 1298 00:46:45,470 --> 00:46:47,509 there's a piece of RAM that you can 1299 00:46:47,510 --> 00:46:50,269 program to modify microcode 1300 00:46:50,270 --> 00:46:52,639 programming after the module 1301 00:46:52,640 --> 00:46:53,640 has been manufactured. 1302 00:46:54,860 --> 00:46:57,319 Why burn the microcode 1303 00:46:57,320 --> 00:46:59,479 into the silicon at all when 1304 00:46:59,480 --> 00:47:01,489 there's a piece of RAM that is big enough 1305 00:47:01,490 --> 00:47:04,129 to hold the entire set of instructions? 1306 00:47:04,130 --> 00:47:06,859 So the RAM is not necessarily 1307 00:47:06,860 --> 00:47:08,389 always big enough to hold the 1308 00:47:08,390 --> 00:47:10,009 instructions. 1309 00:47:10,010 --> 00:47:12,169 I 1310 00:47:12,170 --> 00:47:14,509 would say the primary reason for building 1311 00:47:14,510 --> 00:47:16,069 it in ROM, well, there's a few. 1312 00:47:16,070 --> 00:47:18,229 The first thing is that ROM is much, much 1313 00:47:18,230 --> 00:47:20,579 smaller in silicon than RAM is.
1314 00:47:20,580 --> 00:47:22,789 So if you are not 1315 00:47:22,790 --> 00:47:24,859 going to build a RAM that's as big 1316 00:47:24,860 --> 00:47:26,719 as everything could be, then you are 1317 00:47:26,720 --> 00:47:28,969 going to save area by putting a portion 1318 00:47:28,970 --> 00:47:30,649 of it in ROM. There's also just a major 1319 00:47:30,650 --> 00:47:32,899 security advantage to doing that. 1320 00:47:32,900 --> 00:47:35,119 You don't have to trust your 1321 00:47:35,120 --> 00:47:37,249 loading process as much 1322 00:47:37,250 --> 00:47:38,599 and you don't have to. 1323 00:47:38,600 --> 00:47:40,579 And of course, that loading process, if 1324 00:47:40,580 --> 00:47:41,749 you didn't have any ROM, would need to be 1325 00:47:41,750 --> 00:47:43,699 built completely in hardware, which means 1326 00:47:43,700 --> 00:47:45,109 you've got to get it right the first 1327 00:47:45,110 --> 00:47:46,789 time, which can be difficult. 1328 00:47:46,790 --> 00:47:48,949 So those tend to be the reasons that it 1329 00:47:48,950 --> 00:47:50,960 falls into ROM and legacy. 1330 00:47:52,400 --> 00:47:54,709 And the second 1331 00:47:54,710 --> 00:47:56,779 question was, at what point in the design 1332 00:47:56,780 --> 00:47:58,969 cycle do you decide which 1333 00:47:58,970 --> 00:48:01,189 clock rate the 1334 00:48:01,190 --> 00:48:03,709 processor gets marketed under 1335 00:48:03,710 --> 00:48:06,319 or like runs 1336 00:48:06,320 --> 00:48:07,339 in typical operation? 1337 00:48:07,340 --> 00:48:08,779 Right. I mean, it's typically part of the 1338 00:48:08,780 --> 00:48:11,899 initial design specification that 1339 00:48:11,900 --> 00:48:14,329 when you're going to create a design, 1340 00:48:14,330 --> 00:48:16,759 you want to have a target performance 1341 00:48:16,760 --> 00:48:17,869 envelope for that.
1342 00:48:17,870 --> 00:48:19,969 And based on the process 1343 00:48:19,970 --> 00:48:21,949 technology that tells you, OK, you need 1344 00:48:21,950 --> 00:48:24,379 to have this many gates per cycle 1345 00:48:24,380 --> 00:48:25,639 or something like that. 1346 00:48:25,640 --> 00:48:27,469 Now, of course, when you actually 1347 00:48:27,470 --> 00:48:29,839 fabricate the design, you test it. 1348 00:48:29,840 --> 00:48:32,059 You know, it could be different. 1349 00:48:32,060 --> 00:48:33,799 It depends on how good the early data 1350 00:48:33,800 --> 00:48:35,899 was, but it's typically part 1351 00:48:35,900 --> 00:48:37,999 of kind of the day one specification. 1352 00:48:39,080 --> 00:48:40,759 There's a question from the Internet in 1353 00:48:40,760 --> 00:48:42,899 between, I think. 1354 00:48:42,900 --> 00:48:45,719 Yes, so if the microcode 1355 00:48:45,720 --> 00:48:47,879 update is signed, what does the 1356 00:48:47,880 --> 00:48:50,369 CPU check the signatures 1357 00:48:50,370 --> 00:48:51,370 against? 1358 00:48:52,800 --> 00:48:54,849 So, uh. 1359 00:48:54,850 --> 00:48:57,159 A typical implementation 1360 00:48:57,160 --> 00:48:59,339 might have a public 1361 00:48:59,340 --> 00:49:01,659 key burned into a ROM that 1362 00:49:01,660 --> 00:49:03,639 is used to check the signature, but I 1363 00:49:03,640 --> 00:49:05,409 really can't go into too much detail on 1364 00:49:05,410 --> 00:49:06,410 that. 1365 00:49:08,900 --> 00:49:10,729 Number one, please. 1366 00:49:10,730 --> 00:49:13,219 So from the talk, it seems the... Can you 1367 00:49:13,220 --> 00:49:14,899 walk up to the mic? Otherwise 1368 00:49:14,900 --> 00:49:15,920 we don't have you on tape. 1369 00:49:17,840 --> 00:49:19,939 From your talk it shows that 1370 00:49:19,940 --> 00:49:23,119 testing is a huge part of this process. 1371 00:49:23,120 --> 00:49:25,759 With CPUs becoming more and more complex,
1372 00:49:25,760 --> 00:49:27,979 how big will the impact be 1373 00:49:27,980 --> 00:49:28,939 of testing? 1374 00:49:28,940 --> 00:49:31,129 How much will it delay new CPU 1375 00:49:31,130 --> 00:49:32,130 features? 1376 00:49:33,390 --> 00:49:35,609 It's a major factor. Testing 1377 00:49:35,610 --> 00:49:37,829 is the biggest issue 1378 00:49:37,830 --> 00:49:39,899 with CPUs, both in terms of 1379 00:49:39,900 --> 00:49:41,129 the amount of time it takes, the amount 1380 00:49:41,130 --> 00:49:42,689 of people, the amount of cost associated 1381 00:49:42,690 --> 00:49:43,690 with it. 1382 00:49:45,000 --> 00:49:46,889 I mean, 1383 00:49:48,360 --> 00:49:50,669 things get slower as you add more stuff 1384 00:49:50,670 --> 00:49:51,059 into it. 1385 00:49:51,060 --> 00:49:53,609 On the other hand, sometimes new tools 1386 00:49:53,610 --> 00:49:55,679 and emulation technologies will pop up to 1387 00:49:55,680 --> 00:49:57,959 help mitigate some of that. 1388 00:49:57,960 --> 00:50:00,239 But, you know, 1389 00:50:00,240 --> 00:50:01,619 when people look around, they say, well, 1390 00:50:01,620 --> 00:50:03,539 why does it take so long for the CPU 1391 00:50:03,540 --> 00:50:05,969 features to get into things? 1392 00:50:05,970 --> 00:50:07,709 You know, even when it comes to security 1393 00:50:07,710 --> 00:50:10,139 features, this is kind of the reason. 1394 00:50:10,140 --> 00:50:12,749 It's a long process. 1395 00:50:12,750 --> 00:50:15,269 And every new feature you add 1396 00:50:15,270 --> 00:50:17,339 extends the time before you can start 1397 00:50:17,340 --> 00:50:18,869 selling the device and you don't make 1398 00:50:18,870 --> 00:50:21,129 money until you start doing that. 1399 00:50:21,130 --> 00:50:22,130 Thank you. Uh.
1400 00:50:23,960 --> 00:50:26,209 All right, OK, 1401 00:50:26,210 --> 00:50:28,279 yeah, I know you guys are 1402 00:50:28,280 --> 00:50:29,869 like follow up questions, but maybe a 1403 00:50:29,870 --> 00:50:31,099 short answer. 1404 00:50:31,100 --> 00:50:33,739 Do you see any new testing methods 1405 00:50:33,740 --> 00:50:35,449 on the horizon, something more 1406 00:50:35,450 --> 00:50:37,549 revolutionary than simulating 1407 00:50:37,550 --> 00:50:38,550 or. 1408 00:50:39,650 --> 00:50:42,319 I can't say I do right now, but 1409 00:50:42,320 --> 00:50:44,539 I'm not as much of an expert in 1410 00:50:44,540 --> 00:50:46,279 kind of what's coming up in that field. 1411 00:50:46,280 --> 00:50:48,469 But I think there could be new 1412 00:50:48,470 --> 00:50:50,119 stuff, certainly, that would be helpful. 1413 00:50:50,120 --> 00:50:51,469 OK, thank you. 1414 00:50:51,470 --> 00:50:53,469 OK, thank you. 1415 00:50:53,470 --> 00:50:55,069 Number three, please. 1416 00:50:55,070 --> 00:50:56,269 OK. 1417 00:50:56,270 --> 00:50:58,549 I was wondering if you could tell us the 1418 00:50:58,550 --> 00:51:00,649 amount of space needed on 1419 00:51:00,650 --> 00:51:02,919 the silicon to implement 1420 00:51:02,920 --> 00:51:05,449 the JTAG device, the chicken bits, 1421 00:51:05,450 --> 00:51:06,439 all the stuff. 1422 00:51:06,440 --> 00:51:08,749 Is it a percent or much, 1423 00:51:08,750 --> 00:51:12,049 much less? So the 1424 00:51:12,050 --> 00:51:14,089 chicken bits tend to be very, very, very 1425 00:51:14,090 --> 00:51:16,399 minor because it tends to be kind of 1426 00:51:16,400 --> 00:51:18,529 one register and a disable 1427 00:51:18,530 --> 00:51:19,599 wire to some gate. 1428 00:51:21,110 --> 00:51:22,339 JTAG stuff, 1429 00:51:22,340 --> 00:51:24,529 I couldn't 1430 00:51:24,530 --> 00:51:27,139 quantify that specifically.
1431 00:51:27,140 --> 00:51:29,419 It's not that big, 1432 00:51:29,420 --> 00:51:31,309 I would say, compared to the other stuff 1433 00:51:31,310 --> 00:51:33,379 in a CPU. If you look at a photo 1434 00:51:33,380 --> 00:51:35,629 of things, everything is small compared 1435 00:51:35,630 --> 00:51:37,069 to the caches. 1436 00:51:37,070 --> 00:51:39,229 So you have that. 1437 00:51:39,230 --> 00:51:41,299 But you know, a lot of this logic, 1438 00:51:41,300 --> 00:51:43,489 a lot of the logic is not considered 1439 00:51:43,490 --> 00:51:45,679 optional, in some cases 1440 00:51:45,680 --> 00:51:47,719 because a spec requires that, in 1441 00:51:47,720 --> 00:51:50,509 some cases because if 1442 00:51:50,510 --> 00:51:52,699 you can't debug the part, then 1443 00:51:52,700 --> 00:51:53,989 what good is it at the end of the 1444 00:51:53,990 --> 00:51:54,990 day? 1445 00:51:56,640 --> 00:51:58,149 OK, thank you. Gentleman 1446 00:51:58,150 --> 00:51:59,150 number one. 1447 00:52:00,840 --> 00:52:03,479 On the scale between the 1448 00:52:03,480 --> 00:52:05,699 full simulation on one end 1449 00:52:05,700 --> 00:52:07,499 and then the hardware emulator in the 1450 00:52:07,500 --> 00:52:09,389 middle and the silicon at the end, do you 1451 00:52:09,390 --> 00:52:12,359 also use FPGAs 1452 00:52:12,360 --> 00:52:14,489 where you basically partition the full 1453 00:52:14,490 --> 00:52:16,709 system and stick them into 1454 00:52:16,710 --> 00:52:17,710 it? 1455 00:52:17,790 --> 00:52:19,949 So there 1456 00:52:19,950 --> 00:52:21,599 are some cases where that's useful. 1457 00:52:21,600 --> 00:52:23,849 The biggest challenge is FPGA capacity. 1458 00:52:23,850 --> 00:52:25,919 Most FPGAs are simply not big 1459 00:52:25,920 --> 00:52:28,859 enough to handle designs like this.
1460 00:52:28,860 --> 00:52:31,739 And also sometimes 1461 00:52:31,740 --> 00:52:34,109 the stuff in them 1462 00:52:34,110 --> 00:52:36,479 is designed for more of an ASIC flow 1463 00:52:36,480 --> 00:52:38,999 and won't map into the FPGA as well. 1464 00:52:39,000 --> 00:52:41,159 Some of the emulator systems are 1465 00:52:41,160 --> 00:52:43,319 based on very large FPGAs that 1466 00:52:43,320 --> 00:52:45,239 are put together or something like that, 1467 00:52:46,290 --> 00:52:49,519 where I'd say with x86 CPUs 1468 00:52:49,520 --> 00:52:51,599 I looked at one time, like trying to 1469 00:52:51,600 --> 00:52:53,219 see, you know, could you synthesize this 1470 00:52:53,220 --> 00:52:55,229 sort of thing into a Xilinx chip? 1471 00:52:55,230 --> 00:52:58,169 And the answer was it was so 1472 00:52:58,170 --> 00:52:59,129 big. 1473 00:52:59,130 --> 00:53:01,449 I mean, compared to even what the biggest 1474 00:53:01,450 --> 00:53:03,749 Xilinx chip was, that there's 1475 00:53:03,750 --> 00:53:06,239 just no way you could do it easily. 1476 00:53:06,240 --> 00:53:08,399 Not a single chip, but basically if you 1477 00:53:08,400 --> 00:53:10,409 have a quad-core or an octa-core or 1478 00:53:10,410 --> 00:53:12,929 whatever and split them and 1479 00:53:12,930 --> 00:53:14,100 partition them over multiple... 1480 00:53:15,450 --> 00:53:17,489 I've seen that work in some smaller 1481 00:53:17,490 --> 00:53:19,199 designs, not in anything as big as a 1482 00:53:19,200 --> 00:53:20,200 core. 1483 00:53:22,360 --> 00:53:24,489 OK, people, if you're leaving 1484 00:53:24,490 --> 00:53:26,859 before the talk ends, which is OK, 1485 00:53:26,860 --> 00:53:28,989 can you please do it silently 1486 00:53:28,990 --> 00:53:30,489 because there are still people asking 1487 00:53:30,490 --> 00:53:32,049 questions.
1488 00:53:32,050 --> 00:53:33,909 Can you sort of respect, you know, let's 1489 00:53:33,910 --> 00:53:35,889 have some gentleman in number three, 1490 00:53:35,890 --> 00:53:36,669 please. 1491 00:53:36,670 --> 00:53:39,039 Yeah. You told us about those 1492 00:53:39,040 --> 00:53:40,989 in-CPU debugging features. 1493 00:53:40,990 --> 00:53:43,479 Are they all left in the final design 1494 00:53:43,480 --> 00:53:45,609 or do you decide at some point all 1495 00:53:45,610 --> 00:53:47,979 this damn thing is really expensive, 1496 00:53:47,980 --> 00:53:49,989 you should remove it and now's the time 1497 00:53:49,990 --> 00:53:51,309 to do it? 1498 00:53:51,310 --> 00:53:53,169 Well, we saw how expensive it was to 1499 00:53:53,170 --> 00:53:54,170 modify hardware, 1500 00:53:55,900 --> 00:53:57,730 so maybe that answers your question. 1501 00:53:59,880 --> 00:54:01,319 OK, gentleman 1502 00:54:01,320 --> 00:54:03,629 number one. Any debug 1503 00:54:03,630 --> 00:54:05,699 features for the timing so you 1504 00:54:05,700 --> 00:54:07,829 can verify that the 1505 00:54:09,000 --> 00:54:11,099 debug, the clock 1506 00:54:11,100 --> 00:54:13,439 does not hit your target, if 1507 00:54:13,440 --> 00:54:14,440 there's something 1508 00:54:16,930 --> 00:54:19,049 to validate if the part runs 1509 00:54:19,050 --> 00:54:20,159 at the speed it's supposed to, 1510 00:54:21,280 --> 00:54:22,859 and if it does not run at the speed where 1511 00:54:22,860 --> 00:54:24,089 you expect it, 1512 00:54:24,090 --> 00:54:25,709 which part is at fault then? 1513 00:54:27,360 --> 00:54:28,589 You got to figure that out. 1514 00:54:30,450 --> 00:54:32,579 I know. So it's not an area that 1515 00:54:32,580 --> 00:54:33,719 I work with personally.
1516 00:54:34,770 --> 00:54:37,199 I know that one thing that's sometimes used 1517 00:54:37,200 --> 00:54:39,479 are lasers. If you 1518 00:54:39,480 --> 00:54:41,669 shine a laser on 1519 00:54:41,670 --> 00:54:43,979 a certain part, it can heat 1520 00:54:43,980 --> 00:54:47,369 that part up and make it run, 1521 00:54:47,370 --> 00:54:48,989 I think, a bit faster or something like 1522 00:54:48,990 --> 00:54:50,489 that. And so you can use that to kind of 1523 00:54:50,490 --> 00:54:52,359 help figure out where the slow path is 1524 00:54:52,360 --> 00:54:54,689 in the design. Because, I mean, the 1525 00:54:54,690 --> 00:54:56,939 simplest thing is it's supposed to run at 1526 00:54:56,940 --> 00:54:59,159 three gigahertz. You run it at 1527 00:54:59,160 --> 00:55:00,479 three gigahertz and it doesn't work. 1528 00:55:00,480 --> 00:55:01,529 So you run it at two point nine. 1529 00:55:01,530 --> 00:55:02,609 It does work. 1530 00:55:02,610 --> 00:55:04,860 And you then kind of 1531 00:55:06,450 --> 00:55:08,309 have to figure out what circuit is 1532 00:55:08,310 --> 00:55:09,629 causing the problem. 1533 00:55:09,630 --> 00:55:11,699 That happens sometimes. 1534 00:55:11,700 --> 00:55:12,929 Typically, it's 1535 00:55:14,610 --> 00:55:17,009 not a huge deal because the libraries 1536 00:55:17,010 --> 00:55:18,749 that you work with during the development 1537 00:55:18,750 --> 00:55:20,609 process are very, very good about 1538 00:55:20,610 --> 00:55:22,649 figuring out what the timing is of the 1539 00:55:22,650 --> 00:55:24,059 different gates and making sure that you 1540 00:55:24,060 --> 00:55:25,060 don't have any issues. 1541 00:55:26,690 --> 00:55:28,819 OK, three more questions, please keep 1542 00:55:28,820 --> 00:55:30,919 them short and simple because we're out 1543 00:55:30,920 --> 00:55:31,999 of time. 1544 00:55:32,000 --> 00:55:33,619 Number three, please. 1545 00:55:33,620 --> 00:55:34,549 Yes.
1546 00:55:34,550 --> 00:55:36,109 I wanted to ask you, as a security 1547 00:55:36,110 --> 00:55:37,110 specialist: 1548 00:55:39,080 --> 00:55:41,329 what's your job when you're designing 1549 00:55:41,330 --> 00:55:42,330 a new CPU? 1550 00:55:43,970 --> 00:55:46,309 Well, so my current job is actually 1551 00:55:46,310 --> 00:55:48,889 working on security 1552 00:55:48,890 --> 00:55:51,259 features for the AMD roadmap, 1553 00:55:51,260 --> 00:55:53,599 including both CPU features as well 1554 00:55:53,600 --> 00:55:55,969 as the Platform Security Processor. 1555 00:55:55,970 --> 00:55:58,339 And in 1556 00:55:58,340 --> 00:56:01,039 that capacity, we work with 1557 00:56:01,040 --> 00:56:03,559 the different teams that are involved in 1558 00:56:03,560 --> 00:56:06,169 those security features to help them 1559 00:56:06,170 --> 00:56:07,969 develop the specifications and make sure 1560 00:56:07,970 --> 00:56:09,559 that they're testing all the cases 1561 00:56:09,560 --> 00:56:11,269 that are necessary. 1562 00:56:11,270 --> 00:56:13,610 But I don't get to write code anymore. 1563 00:56:15,630 --> 00:56:17,729 OK, did we leave out the Internet, or are 1564 00:56:17,730 --> 00:56:19,409 there no more questions on the Internet? 1565 00:56:19,410 --> 00:56:20,369 Which is OK. 1566 00:56:20,370 --> 00:56:21,519 OK, gentleman 1567 00:56:21,520 --> 00:56:23,079 number three, please.
1568 00:56:23,080 --> 00:56:25,319 Um, regarding the fact that 1569 00:56:25,320 --> 00:56:27,509 clock rate is a, uh, specification, a day- 1570 00:56:27,510 --> 00:56:29,699 one fact: how likely is it to 1571 00:56:29,700 --> 00:56:31,919 reduce the clock rate, uh, 1572 00:56:31,920 --> 00:56:34,019 for a fix, in order to 1573 00:56:34,020 --> 00:56:36,149 not have to pay for the three, 1574 00:56:36,150 --> 00:56:38,249 three million mask process 1575 00:56:38,250 --> 00:56:40,349 again, and reduce it for, 1576 00:56:40,350 --> 00:56:42,539 for marketing, in order to 1577 00:56:42,540 --> 00:56:44,759 fix a design that otherwise 1578 00:56:44,760 --> 00:56:47,469 would not run at the target clock rate? 1579 00:56:47,470 --> 00:56:49,239 So all I'd say is that's a business 1580 00:56:49,240 --> 00:56:50,240 decision. 1581 00:56:52,730 --> 00:56:54,649 There you go, yeah. 1582 00:56:54,650 --> 00:56:56,239 OK, last question, number one. 1583 00:56:57,320 --> 00:56:59,359 You were talking about formal 1584 00:56:59,360 --> 00:57:01,459 verification and said one 1585 00:57:01,460 --> 00:57:03,529 of the issues is that you need to get a 1586 00:57:03,530 --> 00:57:05,929 model to check against the specification. 1587 00:57:05,930 --> 00:57:07,999 So how do you do the normal testing? 1588 00:57:08,000 --> 00:57:09,529 How do you make sure that you're actually 1589 00:57:09,530 --> 00:57:11,119 testing against the specification? 1590 00:57:11,120 --> 00:57:13,279 Because the issue is, in the early 1591 00:57:13,280 --> 00:57:14,629 hours and days, where actually there 1592 00:57:14,630 --> 00:57:16,699 wasn't a memory model, and also 1593 00:57:16,700 --> 00:57:18,529 if you by not doing anything useful. 1594 00:57:18,530 --> 00:57:20,539 Right.
I mean, the functional checking is 1595 00:57:20,540 --> 00:57:22,549 typically done with kind of a golden 1596 00:57:22,550 --> 00:57:24,979 model, where you say you put an instruction 1597 00:57:24,980 --> 00:57:26,929 in and you have the registers that need 1598 00:57:26,930 --> 00:57:28,279 to be there and everything else. 1599 00:57:28,280 --> 00:57:30,739 The issue with formal verification is 1600 00:57:30,740 --> 00:57:32,329 that if you're going to apply it to a 1601 00:57:32,330 --> 00:57:34,129 design size that the tools can work with, 1602 00:57:34,130 --> 00:57:36,319 say you, you know, apply it to 1603 00:57:36,320 --> 00:57:38,389 a scheduling unit or to a load-store 1604 00:57:38,390 --> 00:57:40,969 unit, those blocks have hundreds 1605 00:57:40,970 --> 00:57:42,859 of different I/Os that talk to other 1606 00:57:42,860 --> 00:57:44,389 blocks. You basically need to have a 1607 00:57:44,390 --> 00:57:46,459 formal model not of the architecture, but 1608 00:57:46,460 --> 00:57:48,439 of those blocks and exactly how they 1609 00:57:48,440 --> 00:57:50,749 behave, which can sometimes 1610 00:57:50,750 --> 00:57:52,159 just turn into reimplementing those 1611 00:57:52,160 --> 00:57:53,299 blocks. 1612 00:57:53,300 --> 00:57:55,219 So it can, it can be a lot of work for 1613 00:57:55,220 --> 00:57:56,220 that. 1614 00:57:57,630 --> 00:57:59,699 OK, thank you, and let's have 1615 00:57:59,700 --> 00:58:01,769 a final hand for 1616 00:58:01,770 --> 00:58:03,209 David Kaplan, please. 1617 00:58:03,210 --> 00:58:04,210 Thank you.