0 00:00:00,000 --> 00:00:30,000 Dear viewer, these subtitles were generated by a machine via the service Trint and therefore are (very) buggy. If you are capable, please help us to create good quality subtitles: https://c3subtitles.de/talk/347 Thanks! 1 00:00:10,450 --> 00:00:12,039 Thanks, everybody, for coming to the talk 2 00:00:12,040 --> 00:00:13,040 tonight. 3 00:00:15,280 --> 00:00:17,449 It's called Space Time Adventures on 4 00:00:17,450 --> 00:00:20,199 Novena: Introducing Balboa. 5 00:00:20,200 --> 00:00:21,849 I'm Andy Isaacson. 6 00:00:21,850 --> 00:00:22,850 And I'm Star Simpson. 7 00:00:24,910 --> 00:00:26,649 First, a little bit about what this talk 8 00:00:26,650 --> 00:00:28,749 is about. In the beginning, 9 00:00:28,750 --> 00:00:31,059 we'll give a little overview of what 10 00:00:31,060 --> 00:00:32,589 Novena is. 11 00:00:32,590 --> 00:00:33,909 If this is the first you're catching that 12 00:00:33,910 --> 00:00:36,459 word, that's the open hardware laptop 13 00:00:36,460 --> 00:00:37,389 slash dev board 14 00:00:37,390 --> 00:00:39,909 we used to work on this project. 15 00:00:39,910 --> 00:00:42,009 Next, an overview of what FPGAs 16 00:00:42,010 --> 00:00:44,259 are and why you might want to use 17 00:00:44,260 --> 00:00:45,260 one. 18 00:00:45,580 --> 00:00:47,679 Then a little summary of the tools 19 00:00:47,680 --> 00:00:49,670 that exist today for working with FPGAs. 20 00:00:50,740 --> 00:00:53,649 And then finally, our manifesto 21 00:00:53,650 --> 00:00:55,389 and an overview of the work we've done on 22 00:00:55,390 --> 00:00:56,390 this project to date. 23 00:00:58,230 --> 00:00:59,999 So, beginning first, here's some 24 00:01:00,000 --> 00:01:01,000 background. 25 00:01:02,760 --> 00:01:05,238 This is a picture of Novena. 26 00:01:05,239 --> 00:01:07,289 It's very exciting. If you're just tuning 27 00:01:07,290 --> 00:01:09,479 in on this project, it's a completely 28 00:01:09,480 --> 00:01:11,129 open hardware laptop.
29 00:01:11,130 --> 00:01:13,499 In fact, it just began shipping last 30 00:01:13,500 --> 00:01:15,659 week. In the 31 00:01:15,660 --> 00:01:18,299 very classic style of 32 00:01:18,300 --> 00:01:20,189 hardware as it once was, 33 00:01:20,190 --> 00:01:21,899 it ships with a complete booklet of all 34 00:01:21,900 --> 00:01:23,849 the associated schematics. 35 00:01:23,850 --> 00:01:25,919 So you can go on and know quite 36 00:01:25,920 --> 00:01:27,539 a lot about what you're doing with the 37 00:01:27,540 --> 00:01:28,540 hardware in this laptop. 38 00:01:29,880 --> 00:01:31,980 It has a lot of very powerful hardware: 39 00:01:33,600 --> 00:01:34,979 Ethernet, dual USB. 40 00:01:34,980 --> 00:01:35,949 You can see some right here. 41 00:01:35,950 --> 00:01:38,669 You might recognize the connectors. 42 00:01:38,670 --> 00:01:40,769 And most importantly, it 43 00:01:40,770 --> 00:01:43,019 has an FPGA right 44 00:01:43,020 --> 00:01:44,020 there. 45 00:01:45,300 --> 00:01:46,709 So what is an FPGA? 46 00:01:47,760 --> 00:01:50,489 An FPGA is... 47 00:01:50,490 --> 00:01:52,499 Well, the acronym stands for Field- 48 00:01:52,500 --> 00:01:54,329 Programmable Gate Array. 49 00:01:54,330 --> 00:01:56,489 So, coming 50 00:01:56,490 --> 00:01:57,959 from the concept of an array of 51 00:01:57,960 --> 00:02:00,389 transistor gates that you can 52 00:02:00,390 --> 00:02:02,639 reprogram after 53 00:02:02,640 --> 00:02:04,859 it's been manufactured to take 54 00:02:04,860 --> 00:02:06,449 any arrangement you like. 55 00:02:06,450 --> 00:02:08,668 Importantly, it's a programmable circuit. 56 00:02:08,669 --> 00:02:11,129 So it's a really interesting combination 57 00:02:11,130 --> 00:02:12,839 of both software and hardware in that 58 00:02:12,840 --> 00:02:15,509 way, which you can use to implement 59 00:02:15,510 --> 00:02:17,099 any algorithm you can imagine 60 00:02:18,540 --> 00:02:18,779 there.
61 00:02:18,780 --> 00:02:21,059 They're used, in short, to solve any 62 00:02:21,060 --> 00:02:22,169 computable problem, 63 00:02:23,520 --> 00:02:25,949 and any computable problem 64 00:02:25,950 --> 00:02:28,439 is a given, given that you can implement a 65 00:02:28,440 --> 00:02:30,929 complete CPU to solve any computable 66 00:02:30,930 --> 00:02:32,999 problem that way. Why they're 67 00:02:33,000 --> 00:02:34,739 better, why you'd want to use one, 68 00:02:34,740 --> 00:02:36,779 is they're often faster at solving 69 00:02:36,780 --> 00:02:39,179 specific problems, especially 70 00:02:39,180 --> 00:02:41,039 problems that can be computed in a highly 71 00:02:41,040 --> 00:02:43,319 parallel way, or in a way that's 72 00:02:43,320 --> 00:02:45,689 more gate efficient when you use 73 00:02:45,690 --> 00:02:47,160 a dedicated processor for it. 74 00:02:50,130 --> 00:02:52,409 So next, a couple of definitions, some 75 00:02:52,410 --> 00:02:53,410 terms. 76 00:02:54,480 --> 00:02:55,799 These are some words that you want to 77 00:02:55,800 --> 00:02:57,929 know when you begin working 78 00:02:57,930 --> 00:02:58,930 with FPGAs. 79 00:03:00,180 --> 00:03:01,469 Here's the first term: 80 00:03:01,470 --> 00:03:03,749 LUT, and 81 00:03:03,750 --> 00:03:05,520 it stands for lookup table. 82 00:03:06,660 --> 00:03:09,089 So I'm going to skim the surface 83 00:03:09,090 --> 00:03:10,619 of the hardware side of this talk, about 84 00:03:10,620 --> 00:03:12,539 as close as we're going to get in 85 00:03:12,540 --> 00:03:15,089 discussing this. If anyone here 86 00:03:15,090 --> 00:03:17,849 recalls early 87 00:03:17,850 --> 00:03:19,949 electrical engineering explorations, 88 00:03:19,950 --> 00:03:22,169 logic gates, you have 89 00:03:22,170 --> 00:03:24,479 AND, OR, and so forth. 90 00:03:24,480 --> 00:03:26,759 Those are basically two- 91 00:03:26,760 --> 00:03:29,009 input lookup tables 92 00:03:29,010 --> 00:03:31,139 themselves.
You have two inputs and you 93 00:03:31,140 --> 00:03:33,209 get something as a result 94 00:03:33,210 --> 00:03:35,389 of what those inputs are. In 95 00:03:35,390 --> 00:03:37,469 an FPGA, you have this concept sort of 96 00:03:37,470 --> 00:03:40,019 taken well beyond, to the next level. 97 00:03:40,020 --> 00:03:42,629 So instead of just two inputs, 98 00:03:42,630 --> 00:03:44,879 commonly today in Xilinx FPGAs 99 00:03:44,880 --> 00:03:46,979 you'll have six, 100 00:03:46,980 --> 00:03:49,139 and the outputs aren't defined 101 00:03:49,140 --> 00:03:50,999 in terms of, like, standard canonical 102 00:03:51,000 --> 00:03:53,369 logic. It's anything 103 00:03:53,370 --> 00:03:54,449 that you want it to be. 104 00:03:54,450 --> 00:03:56,639 So for any input, you can define 105 00:03:56,640 --> 00:03:58,560 every output, in combinations of six. 106 00:04:01,440 --> 00:04:03,869 So that's the smallest unit of 107 00:04:03,870 --> 00:04:05,999 computational cell you discuss in 108 00:04:06,000 --> 00:04:08,309 terms of FPGAs. Larger 109 00:04:08,310 --> 00:04:10,889 sizes from there: logic cells 110 00:04:10,890 --> 00:04:12,780 will be your groups of LUTs, 111 00:04:14,340 --> 00:04:16,379 along with some flip-flops and 112 00:04:16,380 --> 00:04:18,148 multiplexers. Those are the things that you 113 00:04:18,149 --> 00:04:20,669 set that determine how your 114 00:04:20,670 --> 00:04:23,399 FPGA will go about doing its computation. 115 00:04:23,400 --> 00:04:25,709 Bigger than that, logic slices are 116 00:04:25,710 --> 00:04:28,049 blocks of two cells, 117 00:04:29,280 --> 00:04:31,529 or a couple of cells in combination. Logic 118 00:04:31,530 --> 00:04:33,599 blocks are a couple of 119 00:04:33,600 --> 00:04:35,100 slices, maybe four. 120 00:04:36,240 --> 00:04:38,879 And then zooming out even further, 121 00:04:38,880 --> 00:04:40,949 you have the FPGA fabric, and 122 00:04:40,950 --> 00:04:42,779 that's discussed fairly often.
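The lookup-table idea described here can be sketched in a few lines of Python. This is only an illustrative model, not how any vendor's hardware or tools represent a LUT: the helper names make_lut and lut_read are hypothetical, and the truth table is simply packed into an integer with one bit per input combination (4 bits for two inputs, 64 bits for six).

```python
def make_lut(func, inputs=6):
    """Pack a Boolean function of `inputs` bits into a truth-table integer."""
    table = 0
    for address in range(2 ** inputs):
        # Bit i of the address is the value of input i.
        bits = [(address >> i) & 1 for i in range(inputs)]
        if func(*bits):
            table |= 1 << address
    return table


def lut_read(table, *bits):
    """Evaluate the LUT: the input bits form an address into the table."""
    address = sum(bit << i for i, bit in enumerate(bits))
    return (table >> address) & 1


# A two-input AND gate is just a four-entry table.
and2 = make_lut(lambda a, b: a & b, inputs=2)

# A six-input LUT can hold any function of six bits, for example odd parity.
parity6 = make_lut(lambda *b: sum(b) % 2)
```

Reprogramming the LUT means nothing more than loading a different table value; the lookup logic itself never changes.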
123 00:04:42,780 --> 00:04:44,879 The fabric is, you can 124 00:04:44,880 --> 00:04:46,979 imagine that these logic blocks 125 00:04:46,980 --> 00:04:49,259 are sort of floating around in a sea 126 00:04:49,260 --> 00:04:51,659 or mesh, which you 127 00:04:51,660 --> 00:04:54,749 also program the connections between, 128 00:04:54,750 --> 00:04:57,119 and the fabric is that physical 129 00:04:57,120 --> 00:04:59,459 layer that those blocks are 130 00:04:59,460 --> 00:05:00,460 floating in. 131 00:05:02,820 --> 00:05:05,429 And finally, a core 132 00:05:05,430 --> 00:05:08,159 is the overall definition 133 00:05:08,160 --> 00:05:10,259 of how the gates 134 00:05:10,260 --> 00:05:11,260 will be assembled. 135 00:05:19,500 --> 00:05:21,149 And going a little bit further: so you have 136 00:05:21,150 --> 00:05:23,129 this FPGA, then how do you go about 137 00:05:23,130 --> 00:05:25,319 programming it? FPGAs 138 00:05:25,320 --> 00:05:27,599 are programmed in what are called 139 00:05:27,600 --> 00:05:29,530 hardware description languages, HDLs, 140 00:05:31,230 --> 00:05:33,389 which are human-readable source 141 00:05:33,390 --> 00:05:35,639 code languages similar to C 142 00:05:35,640 --> 00:05:37,529 or other systems programming languages. 143 00:05:38,850 --> 00:05:40,349 It's similar to other programming 144 00:05:40,350 --> 00:05:42,149 languages you'll be familiar with in that 145 00:05:42,150 --> 00:05:44,219 it has a toolchain which is much 146 00:05:44,220 --> 00:05:46,739 like a compiler, and 147 00:05:46,740 --> 00:05:48,809 it turns your code into a final 148 00:05:48,810 --> 00:05:50,789 bitstream, which is analogous for our 149 00:05:50,790 --> 00:05:52,680 purposes to an executable. 150 00:05:53,970 --> 00:05:56,369 And that bitstream actually switches 151 00:05:56,370 --> 00:05:58,919 the physical connections and describes 152 00:05:58,920 --> 00:06:00,569 the physical interconnects inside your 153 00:06:00,570 --> 00:06:01,570 chip.
154 00:06:02,910 --> 00:06:05,879 Here's an example of an HDL. 155 00:06:05,880 --> 00:06:06,899 This is Verilog. 156 00:06:07,980 --> 00:06:09,439 It's one of a couple of 157 00:06:09,440 --> 00:06:10,440 HDLs. 158 00:06:11,610 --> 00:06:13,379 It's the one we actually chose for 159 00:06:13,380 --> 00:06:14,429 writing our project. 160 00:06:14,430 --> 00:06:16,559 And in fact, Easter egg, this 161 00:06:16,560 --> 00:06:18,659 is actually code we wrote for 162 00:06:18,660 --> 00:06:21,359 the earlier project, which became 163 00:06:21,360 --> 00:06:22,360 this project. 164 00:06:23,280 --> 00:06:25,389 This is part of the 165 00:06:25,390 --> 00:06:27,269 AES encryption algorithm 166 00:06:28,410 --> 00:06:30,899 doing a computation. 167 00:06:30,900 --> 00:06:32,009 If you're familiar with Verilog, you've 168 00:06:32,010 --> 00:06:33,010 already read it, 169 00:06:34,110 --> 00:06:35,309 and if you're familiar with Verilog, 170 00:06:35,310 --> 00:06:36,510 you might be able to find the bug 171 00:06:37,650 --> 00:06:38,650 in this code. 172 00:06:41,130 --> 00:06:42,539 One thing that's interesting about this 173 00:06:42,540 --> 00:06:44,519 is, if you're familiar with Python or C, 174 00:06:44,520 --> 00:06:46,199 you'll notice that that beginning line, 175 00:06:46,200 --> 00:06:48,209 the module declaration, 176 00:06:48,210 --> 00:06:50,279 looks like a series of arguments. 177 00:06:50,280 --> 00:06:52,589 But in fact, those are actual 178 00:06:52,590 --> 00:06:54,569 physical inputs here. 179 00:06:54,570 --> 00:06:55,919 Those are thought of as wires that 180 00:06:55,920 --> 00:06:57,929 connect to other things or wires. 181 00:07:00,660 --> 00:07:02,370 You will see more of those, I suppose. 182 00:07:09,300 --> 00:07:11,789 And so if you're interested 183 00:07:11,790 --> 00:07:14,159 in learning more about Verilog, 184 00:07:14,160 --> 00:07:15,539 we worked through this book and found it 185 00:07:15,540 --> 00:07:17,339 very useful.
Just a little shout out to 186 00:07:17,340 --> 00:07:19,439 Verilog by Example for those of you 187 00:07:19,440 --> 00:07:20,820 looking to go from here and learn more. 188 00:07:21,960 --> 00:07:24,419 As a counterpoint, we felt it 189 00:07:24,420 --> 00:07:27,149 prudent to include this VHDL example. 190 00:07:27,150 --> 00:07:29,189 In evaluating HDLs, 191 00:07:29,190 --> 00:07:30,869 we decided we like Verilog a lot more. 192 00:07:30,870 --> 00:07:33,149 But for the sake of completeness, here's 193 00:07:33,150 --> 00:07:34,739 what VHDL looks like. 194 00:07:34,740 --> 00:07:37,019 VHDL is just one of many 195 00:07:37,020 --> 00:07:37,929 HDLs. 196 00:07:37,930 --> 00:07:39,480 Those terms are not interchangeable. 197 00:07:45,550 --> 00:07:47,829 So you write all this code. 198 00:07:47,830 --> 00:07:49,539 How do you go about building for the 199 00:07:49,540 --> 00:07:50,949 FPGA? 200 00:07:50,950 --> 00:07:52,809 What's the process like? 201 00:07:52,810 --> 00:07:54,879 It's analogous to compiling, but it 202 00:07:54,880 --> 00:07:57,069 exists in several different concrete 203 00:07:57,070 --> 00:07:58,070 parts. 204 00:07:59,200 --> 00:08:00,789 So the first part is synthesis. 205 00:08:00,790 --> 00:08:03,069 And during synthesis, your 206 00:08:03,070 --> 00:08:04,869 definition of what you want your circuits 207 00:08:04,870 --> 00:08:07,089 to do is converted to a list of internal 208 00:08:07,090 --> 00:08:08,090 connections, 209 00:08:09,310 --> 00:08:11,019 and the output of that is called your 210 00:08:11,020 --> 00:08:12,609 netlist. 211 00:08:12,610 --> 00:08:14,529 Then it moves to a stage called place and 212 00:08:14,530 --> 00:08:16,689 route. And this is where it gets 213 00:08:16,690 --> 00:08:18,699 interesting for me from the electrical 214 00:08:18,700 --> 00:08:20,979 engineering perspective, because FPGAs 215 00:08:20,980 --> 00:08:23,529 are fundamentally two dimensional.
216 00:08:23,530 --> 00:08:26,019 And in order to implement 217 00:08:26,020 --> 00:08:28,209 the algorithm you want to 218 00:08:28,210 --> 00:08:30,369 compute, you have to actually, like, 219 00:08:30,370 --> 00:08:33,009 physically lay it out inside this chip. 220 00:08:33,010 --> 00:08:35,259 And it's similar, I think, to urban 221 00:08:35,260 --> 00:08:37,389 planning in some ways in that you 222 00:08:37,390 --> 00:08:39,519 might have two different modules that 223 00:08:39,520 --> 00:08:40,779 you need to be in communication with each 224 00:08:40,780 --> 00:08:42,489 other. And if you do this part wrong and 225 00:08:42,490 --> 00:08:44,199 run like a connection straight between 226 00:08:44,200 --> 00:08:45,249 them, they will never be able to 227 00:08:45,250 --> 00:08:46,250 communicate. 228 00:08:47,080 --> 00:08:49,129 It's also called mapping by some vendors. 229 00:08:49,130 --> 00:08:51,669 So the geometric aspect of this is 230 00:08:51,670 --> 00:08:52,670 certainly very important. 231 00:08:55,520 --> 00:08:57,709 And then subsequent to that, you have the 232 00:08:57,710 --> 00:08:59,929 bitstream generation stage where you take 233 00:08:59,930 --> 00:09:01,879 your physical layout and your connections 234 00:09:01,880 --> 00:09:04,639 and turn it into that file 235 00:09:04,640 --> 00:09:06,829 with the switches that turn the chip 236 00:09:06,830 --> 00:09:08,360 into what it should be. 237 00:09:10,760 --> 00:09:13,489 It's worth noting that 238 00:09:13,490 --> 00:09:15,649 bitstream generation seems 239 00:09:15,650 --> 00:09:17,210 like it'd be fairly straightforward, but 240 00:09:18,410 --> 00:09:20,479 is highly vendor 241 00:09:20,480 --> 00:09:21,469 specific. 242 00:09:21,470 --> 00:09:23,479 The tools for doing that are fairly 243 00:09:23,480 --> 00:09:25,009 proprietary. 
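To make the geometric side of place and route a little more concrete, here is a toy Python sketch. Everything in it is an assumption made for illustration: the four-wire netlist, the 2x2 grid of sites, and the brute-force search are hypothetical, and real tools rely on heuristics and must also respect routing resources and timing. It only shows that where blocks are placed changes how much wire the same netlist needs.

```python
from itertools import permutations

# Hypothetical netlist: each pair is a wire between two logic blocks.
NETS = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
# Hypothetical 2x2 grid of sites on the chip, as (x, y) coordinates.
SITES = [(0, 0), (1, 0), (0, 1), (1, 1)]


def wire_length(placement):
    """Total Manhattan wire length for a {block: (x, y)} placement."""
    total = 0
    for src, dst in NETS:
        (x1, y1), (x2, y2) = placement[src], placement[dst]
        total += abs(x1 - x2) + abs(y1 - y2)
    return total


def best_placement(blocks):
    """Brute-force search over every assignment of blocks to sites."""
    candidates = (dict(zip(blocks, order)) for order in permutations(SITES))
    return min(candidates, key=wire_length)


blocks = ["a", "b", "c", "d"]
naive = dict(zip(blocks, SITES))  # blocks dropped onto sites in list order
best = best_placement(blocks)
```

Brute force only works here because there are 24 possible placements; the search space grows factorially with the number of blocks, which is one reason builds for real devices take so long.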
244 00:09:25,010 --> 00:09:26,419 But as a note, there are some very 245 00:09:26,420 --> 00:09:28,309 interesting reverse engineering efforts 246 00:09:28,310 --> 00:09:30,529 here to understand and publicly 247 00:09:30,530 --> 00:09:32,749 document those systems. 248 00:09:32,750 --> 00:09:34,399 Find out more on the Internet. 249 00:09:42,320 --> 00:09:44,779 So Star described a lot of the hardware 250 00:09:44,780 --> 00:09:47,299 of the FPGA, and 251 00:09:47,300 --> 00:09:49,429 the last part in 252 00:09:49,430 --> 00:09:51,739 sort of the system diagram is 253 00:09:51,740 --> 00:09:54,589 this idea of an FPGA core. 254 00:09:54,590 --> 00:09:55,590 So 255 00:09:56,660 --> 00:09:58,940 what is an FPGA core? 256 00:10:00,230 --> 00:10:01,999 To be very concrete about it, 257 00:10:02,000 --> 00:10:04,069 the core is just a 258 00:10:04,070 --> 00:10:06,499 chunk of source code which 259 00:10:06,500 --> 00:10:08,389 does something, which implements some 260 00:10:08,390 --> 00:10:10,069 process or algorithm. 261 00:10:10,070 --> 00:10:12,279 It's probably written in Verilog or 262 00:10:12,280 --> 00:10:14,659 VHDL, or possibly in some 263 00:10:14,660 --> 00:10:16,999 higher level, more abstract 264 00:10:17,000 --> 00:10:19,309 hardware description language, which 265 00:10:19,310 --> 00:10:21,499 gets turned into Verilog or VHDL on 266 00:10:21,500 --> 00:10:22,730 its way into being in the chip. 267 00:10:26,030 --> 00:10:27,169 There's one really interesting 268 00:10:27,170 --> 00:10:30,109 distinction in the FPGA world 269 00:10:30,110 --> 00:10:31,970 between a soft core 270 00:10:33,270 --> 00:10:35,339 versus a hard core. And 271 00:10:35,340 --> 00:10:37,679 FPGAs have over the years 272 00:10:37,680 --> 00:10:39,899 evolved such 273 00:10:39,900 --> 00:10:41,639 that the vendors realized many people are 274 00:10:41,640 --> 00:10:43,649 spending a lot of their space in the FPGA 275 00:10:43,650 --> 00:10:45,779 doing some specific things.
276 00:10:45,780 --> 00:10:47,669 So let's just go ahead and put that down 277 00:10:47,670 --> 00:10:49,949 on the chip as hardwired. 278 00:10:49,950 --> 00:10:51,749 And that's a hard core. 279 00:10:51,750 --> 00:10:52,919 And then 280 00:10:53,940 --> 00:10:56,159 if you're implementing something 281 00:10:56,160 --> 00:10:58,259 using a hardware description 282 00:10:58,260 --> 00:11:00,389 language, in Verilog 283 00:11:00,390 --> 00:11:02,609 or VHDL, that's a soft 284 00:11:02,610 --> 00:11:04,979 core. It's software. 285 00:11:04,980 --> 00:11:07,109 Now, many people in this industry, 286 00:11:07,110 --> 00:11:09,209 when they say core, what they mean is IP 287 00:11:09,210 --> 00:11:11,399 core, and IP here 288 00:11:11,400 --> 00:11:13,349 is intellectual property core. 289 00:11:13,350 --> 00:11:15,449 So this is 290 00:11:15,450 --> 00:11:17,429 the idea that this source code is owned 291 00:11:17,430 --> 00:11:19,559 by someone and they have the right to 292 00:11:19,560 --> 00:11:21,629 sell it, to sell that idea, to 293 00:11:21,630 --> 00:11:22,709 market it to others. 294 00:11:22,710 --> 00:11:24,809 And they can be exclusionary about that, 295 00:11:24,810 --> 00:11:26,129 not share it with people they don't want 296 00:11:26,130 --> 00:11:27,130 to. 297 00:11:28,440 --> 00:11:30,509 So IP here is intellectual property, 298 00:11:30,510 --> 00:11:32,159 not Internet protocol, which is what I 299 00:11:32,160 --> 00:11:33,869 kind of was hoping when I first saw the 300 00:11:33,870 --> 00:11:34,870 phrase IP core. 301 00:11:36,210 --> 00:11:37,649 But alas. 302 00:11:37,650 --> 00:11:39,479 So it's interesting because 303 00:11:39,480 --> 00:11:41,339 this FPGA world, like the community of 304 00:11:41,340 --> 00:11:43,589 people who develop for FPGAs, are 305 00:11:43,590 --> 00:11:45,509 almost, I would say, mentally poisoned by 306 00:11:45,510 --> 00:11:46,739 this term.
The fact that the thing that 307 00:11:46,740 --> 00:11:47,909 you're building is a core: well, of 308 00:11:47,910 --> 00:11:49,259 course it's an IP core. That means that 309 00:11:49,260 --> 00:11:50,729 I own it. That means that everything that 310 00:11:50,730 --> 00:11:52,529 is built is owned by someone. 311 00:11:54,510 --> 00:11:55,829 I don't think that's how it has to work. 312 00:11:55,830 --> 00:11:57,119 And that's one thing that we were really 313 00:11:57,120 --> 00:11:59,279 excited about in this project 314 00:11:59,280 --> 00:12:02,099 when we realized that OpenCores exists. 315 00:12:02,100 --> 00:12:04,169 So OpenCores is a 316 00:12:04,170 --> 00:12:06,569 website, opencores.org, and 317 00:12:06,570 --> 00:12:08,939 a project and a philosophy to 318 00:12:08,940 --> 00:12:12,389 build free, freely licensed 319 00:12:12,390 --> 00:12:14,309 cores for FPGAs. 320 00:12:14,310 --> 00:12:15,749 Amazing project. 321 00:12:15,750 --> 00:12:18,419 They've got hundreds of cores, 322 00:12:18,420 --> 00:12:19,619 hundreds of different designs. 323 00:12:19,620 --> 00:12:21,299 You can customize them, you can integrate 324 00:12:21,300 --> 00:12:23,729 them. They have a 325 00:12:23,730 --> 00:12:25,859 shared bus interface for many 326 00:12:25,860 --> 00:12:27,959 of these, so they work together 327 00:12:27,960 --> 00:12:29,429 really well. 328 00:12:29,430 --> 00:12:31,499 It's really an amazing example of 329 00:12:31,500 --> 00:12:33,959 the free software philosophy in action. 330 00:12:39,340 --> 00:12:41,469 So OpenCores is cool and 331 00:12:41,470 --> 00:12:43,149 IP cores are less cool, but there's an 332 00:12:43,150 --> 00:12:45,159 important concept there.
333 00:12:45,160 --> 00:12:47,259 But in the Balboa project 334 00:12:47,260 --> 00:12:49,329 we use the word core 335 00:12:49,330 --> 00:12:50,739 a little bit differently, and we'll talk a 336 00:12:50,740 --> 00:12:53,049 lot about accelerator cores, and 337 00:12:53,050 --> 00:12:55,149 what that exactly is I'll get into in 338 00:12:55,150 --> 00:12:57,309 a little bit, but 339 00:12:57,310 --> 00:12:59,469 I wanted to put a pin in that 340 00:12:59,470 --> 00:13:00,609 here so that everybody's 341 00:13:02,870 --> 00:13:04,939 on the same page. So 342 00:13:04,940 --> 00:13:06,439 next we'll discuss a little bit more 343 00:13:06,440 --> 00:13:08,089 about the environment that this project 344 00:13:08,090 --> 00:13:09,090 exists within, 345 00:13:11,800 --> 00:13:14,319 from the hardware to the software 346 00:13:14,320 --> 00:13:17,139 that currently exists to support 347 00:13:17,140 --> 00:13:18,729 FPGAs. 348 00:13:18,730 --> 00:13:19,980 So here again is Novena, 349 00:13:21,070 --> 00:13:23,229 and this slide exists to give you a sense 350 00:13:23,230 --> 00:13:25,989 of how FPGAs are measured. So 351 00:13:25,990 --> 00:13:27,819 you can see that this has forty-three 352 00:13:27,820 --> 00:13:29,109 thousand logic cells. 353 00:13:29,110 --> 00:13:31,659 That makes it about a medium sized FPGA 354 00:13:31,660 --> 00:13:32,830 in the Spartan-6 family. 355 00:13:33,940 --> 00:13:36,009 It's got two different kinds of 356 00:13:36,010 --> 00:13:37,239 RAM here. 357 00:13:37,240 --> 00:13:39,459 And then actually the second to last 358 00:13:39,460 --> 00:13:41,049 line item here is interesting. 359 00:13:41,050 --> 00:13:43,419 It's got a hard core, 360 00:13:43,420 --> 00:13:45,549 which is a DSP, the DSP48, 361 00:13:45,550 --> 00:13:47,919 a digital signal processor built into 362 00:13:47,920 --> 00:13:49,959 it, just as Andy was describing.
363 00:13:49,960 --> 00:13:52,029 They've chosen to include 364 00:13:52,030 --> 00:13:53,949 that here so that it's permanently 365 00:13:53,950 --> 00:13:55,099 available. 366 00:13:55,100 --> 00:13:56,469 It gives you a bunch of things. 367 00:13:56,470 --> 00:13:59,019 You get an 18 by 18 multiplier, 368 00:13:59,020 --> 00:14:00,669 an adder, an accumulator, and you don't 369 00:14:00,670 --> 00:14:01,909 have to write those things for yourself. 370 00:14:01,910 --> 00:14:02,910 They're just there. 371 00:14:07,000 --> 00:14:08,889 A little bit more zoomed out, on the 372 00:14:08,890 --> 00:14:10,929 Novena laptop itself, 373 00:14:10,930 --> 00:14:13,059 the FPGA is directly connected 374 00:14:13,060 --> 00:14:15,309 to the ARM processor via 375 00:14:15,310 --> 00:14:16,690 a very fast bus, 376 00:14:17,950 --> 00:14:20,049 16 bits, and we will use 377 00:14:20,050 --> 00:14:21,050 that more. 378 00:14:22,700 --> 00:14:24,889 And, you know, even in the course 379 00:14:24,890 --> 00:14:26,689 of preparing this talk, it was like, you know, 380 00:14:26,690 --> 00:14:27,709 that's great and everything, but, like, 381 00:14:27,710 --> 00:14:29,779 why is there an FPGA on Novena? You 382 00:14:29,780 --> 00:14:31,789 know, like, what would I want to do 383 00:14:31,790 --> 00:14:33,109 with it? 384 00:14:33,110 --> 00:14:34,369 And these are not things you might 385 00:14:34,370 --> 00:14:36,199 necessarily even want to do with it. 386 00:14:36,200 --> 00:14:37,579 These are things you could want to do 387 00:14:37,580 --> 00:14:39,679 with it. So, you know, 388 00:14:39,680 --> 00:14:41,929 you could have fun in an attempt 389 00:14:41,930 --> 00:14:44,809 to put some Bitcoin mining into action. 390 00:14:44,810 --> 00:14:46,699 You could emulate another processor with 391 00:14:46,700 --> 00:14:48,779 it, so you can have that ARM talking to 392 00:14:48,780 --> 00:14:50,959 a soft-core processor of some other 393 00:14:50,960 --> 00:14:52,129 kind.
394 00:14:52,130 --> 00:14:54,259 You can use it for crypto. 395 00:14:54,260 --> 00:14:56,329 As I alluded to earlier, our first 396 00:14:56,330 --> 00:14:59,209 project was to implement AES. 397 00:14:59,210 --> 00:15:01,759 Our goal then was to accelerate 398 00:15:01,760 --> 00:15:04,100 SSL computations on the CPU. 399 00:15:05,180 --> 00:15:06,409 You can do coprocessing, 400 00:15:07,430 --> 00:15:09,739 and you can also use it for 401 00:15:09,740 --> 00:15:11,959 processing over, you know, 402 00:15:11,960 --> 00:15:13,339 data you might be acquiring. 403 00:15:13,340 --> 00:15:15,079 Actually, the Novena, 404 00:15:15,080 --> 00:15:17,239 it's got this amazing analog- 405 00:15:17,240 --> 00:15:19,459 to-digital converter which lets 406 00:15:19,460 --> 00:15:20,460 you get in 407 00:15:21,800 --> 00:15:24,139 [unclear] 500 408 00:15:24,140 --> 00:15:26,509 megasamples per second. Actually, 409 00:15:26,510 --> 00:15:28,049 tell me 410 00:15:29,090 --> 00:15:30,090 if it's fast enough for you. 411 00:15:31,130 --> 00:15:33,439 It's also great for doing... 412 00:15:33,440 --> 00:15:36,019 the Novena has additional boards, 413 00:15:36,020 --> 00:15:37,999 for example, a software-defined radio. 414 00:15:38,000 --> 00:15:39,589 You might use it for video or image 415 00:15:39,590 --> 00:15:41,390 processing, things of that sort. 416 00:15:42,920 --> 00:15:46,069 So now a little bit about the 417 00:15:46,070 --> 00:15:48,409 ecosystem of 418 00:15:48,410 --> 00:15:50,719 open source tools that 419 00:15:50,720 --> 00:15:52,969 you could use for 420 00:15:52,970 --> 00:15:53,970 targeting 421 00:15:55,070 --> 00:15:57,649 the FPGA, for building 422 00:15:57,650 --> 00:15:59,929 your cores. Shout 423 00:15:59,930 --> 00:16:01,999 out to the creators and maintainers 424 00:16:02,000 --> 00:16:03,000 of all of these projects. 425 00:16:04,550 --> 00:16:06,139 First, this is Yosys.
426 00:16:06,140 --> 00:16:07,609 Yosys is something we're actually 427 00:16:07,610 --> 00:16:09,079 extremely excited about. 428 00:16:09,080 --> 00:16:11,239 It exists not 429 00:16:11,240 --> 00:16:12,829 in the same space as Balboa, but in a 430 00:16:12,830 --> 00:16:15,049 very complementary space. 431 00:16:15,050 --> 00:16:17,509 It strives to do what 432 00:16:17,510 --> 00:16:19,309 the proprietary tools that you 433 00:16:19,310 --> 00:16:21,319 can get today, like Xilinx ISE and so 434 00:16:21,320 --> 00:16:22,320 forth, 435 00:16:23,390 --> 00:16:24,799 do, but in a free way. 436 00:16:26,180 --> 00:16:28,339 It is not a complete 437 00:16:28,340 --> 00:16:29,419 replacement to date. 438 00:16:30,500 --> 00:16:31,459 It... 439 00:16:31,460 --> 00:16:33,289 It can get you through synthesis. 440 00:16:33,290 --> 00:16:35,389 It has the start of a place and route 441 00:16:35,390 --> 00:16:37,369 system, and it doesn't do bitstream 442 00:16:37,370 --> 00:16:38,370 generation. 443 00:16:39,650 --> 00:16:42,049 But we're hopeful that with a couple 444 00:16:42,050 --> 00:16:44,179 of Xilinx extensions, 445 00:16:44,180 --> 00:16:45,289 Yosys could get there. 446 00:16:45,290 --> 00:16:46,969 It's also worth noting that this project 447 00:16:46,970 --> 00:16:49,159 has existed for about, I think, only 448 00:16:49,160 --> 00:16:51,319 two years, and I think has 449 00:16:51,320 --> 00:16:53,509 been sort of worked on by, like, one guy so 450 00:16:53,510 --> 00:16:56,419 far. So it's pretty amazing how far 451 00:16:56,420 --> 00:16:57,420 it's come. 452 00:16:58,490 --> 00:16:59,809 It's open source. 453 00:16:59,810 --> 00:17:01,340 It has a Verilog toolchain 454 00:17:02,840 --> 00:17:05,299 and also outputs to 455 00:17:05,300 --> 00:17:06,409 any ASIC or 456 00:17:06,410 --> 00:17:07,430 FPGA backend.
457 00:17:11,099 --> 00:17:13,828 Next, a quick word about Migen. 458 00:17:13,829 --> 00:17:15,779 If you attended a previous Congress, you 459 00:17:15,780 --> 00:17:17,969 might be familiar with this board called 460 00:17:17,970 --> 00:17:19,439 the Milkymist. 461 00:17:19,440 --> 00:17:21,088 I like the Milkymist a lot because its 462 00:17:21,089 --> 00:17:23,279 sole purpose in life was to 463 00:17:23,280 --> 00:17:25,499 do really beautiful video 464 00:17:25,500 --> 00:17:28,469 generation using an FPGA. 465 00:17:28,470 --> 00:17:30,389 And the team that worked on that produced 466 00:17:30,390 --> 00:17:32,969 a lot of useful software as well, 467 00:17:32,970 --> 00:17:35,249 including this, which is a Python toolbox 468 00:17:35,250 --> 00:17:36,250 for building hardware. 469 00:17:37,410 --> 00:17:38,729 It lets you write a high level 470 00:17:38,730 --> 00:17:41,579 description of your circuit in Python 471 00:17:41,580 --> 00:17:43,650 and then outputs Verilog or VHDL. 472 00:17:45,210 --> 00:17:47,279 It's also used in actual projects, which 473 00:17:47,280 --> 00:17:48,319 is a strong plus. 474 00:17:49,950 --> 00:17:51,089 We just thought we'd include a little 475 00:17:51,090 --> 00:17:53,339 example of what the code looks like for 476 00:17:53,340 --> 00:17:55,139 this in the following three examples, 477 00:17:55,140 --> 00:17:57,059 so you can sort of get a sense for what 478 00:17:57,060 --> 00:17:58,060 it would be like. 479 00:18:00,000 --> 00:18:01,859 It's a basic Migen example. 480 00:18:06,810 --> 00:18:08,969 MyHDL is a project that I like 481 00:18:08,970 --> 00:18:09,970 a whole lot. 482 00:18:10,440 --> 00:18:11,729 We ended up writing directly in 483 00:18:11,730 --> 00:18:13,199 Verilog, but we considered writing in 484 00:18:13,200 --> 00:18:14,200 MyHDL. 485 00:18:15,000 --> 00:18:17,219 Rather than being a toolbox for 486 00:18:17,220 --> 00:18:18,479 targeting hardware, it is...
487 00:18:18,480 --> 00:18:21,329 You're actually designing hardware 488 00:18:21,330 --> 00:18:22,330 with Python. 489 00:18:23,610 --> 00:18:25,319 It has a lot of docs. 490 00:18:25,320 --> 00:18:27,359 Their website is really good. 491 00:18:27,360 --> 00:18:28,829 And I really liked what the developers of 492 00:18:28,830 --> 00:18:31,649 this project put into their website 493 00:18:31,650 --> 00:18:34,409 to support people who would want to 494 00:18:34,410 --> 00:18:35,750 use it to program their hardware. 495 00:18:37,160 --> 00:18:39,389 It's sort of like just a Python 496 00:18:39,390 --> 00:18:41,429 syntax for Verilog. 497 00:18:41,430 --> 00:18:42,430 And it also 498 00:18:43,890 --> 00:18:46,409 runs and gives either Verilog or VHDL, 499 00:18:46,410 --> 00:18:47,410 your choice. 500 00:18:50,040 --> 00:18:52,289 So here's a snippet of 501 00:18:52,290 --> 00:18:54,029 a MyHDL project. 502 00:18:55,590 --> 00:18:56,879 One thing I really like about this 503 00:18:56,880 --> 00:18:59,159 example is it's a very 504 00:18:59,160 --> 00:19:00,119 well documented style. 505 00:19:00,120 --> 00:19:01,919 All of their documentation is like this. 506 00:19:11,100 --> 00:19:13,409 And finally, Chisel. Chisel 507 00:19:13,410 --> 00:19:15,149 is very different from the previous two 508 00:19:15,150 --> 00:19:17,429 examples. It strives to let 509 00:19:17,430 --> 00:19:18,750 you design hardware 510 00:19:19,980 --> 00:19:22,140 embedded in Scala code. 511 00:19:24,270 --> 00:19:26,579 It's often one-to-one with the Verilog 512 00:19:26,580 --> 00:19:28,889 that you'd be writing, but 513 00:19:28,890 --> 00:19:30,509 their philosophy is very different, and 514 00:19:30,510 --> 00:19:31,889 their philosophy is really what's most 515 00:19:31,890 --> 00:19:33,209 interesting. 516 00:19:33,210 --> 00:19:35,549 It's actually produced by a UC 517 00:19:35,550 --> 00:19:38,339 Berkeley group led by Jonathan Bachrach.
518 00:19:38,340 --> 00:19:40,769 And one thing that's really interesting 519 00:19:40,770 --> 00:19:43,259 is that RISC-V, 520 00:19:43,260 --> 00:19:45,419 which is a new, completely 521 00:19:45,420 --> 00:19:47,789 open CPU, was written entirely 522 00:19:47,790 --> 00:19:50,039 in Chisel, so that's 523 00:19:50,040 --> 00:19:51,040 pretty cool. 524 00:19:51,870 --> 00:19:53,939 There's a paper associated with 525 00:19:53,940 --> 00:19:56,039 Chisel that's pretty great, 526 00:19:56,040 --> 00:19:58,109 which goes into further depth about the 527 00:19:58,110 --> 00:20:00,209 tension between having a general 528 00:20:00,210 --> 00:20:02,849 purpose hardware description language 529 00:20:02,850 --> 00:20:04,949 versus having a domain-specific language 530 00:20:04,950 --> 00:20:06,089 for describing your hardware. 531 00:20:12,540 --> 00:20:15,269 And here's an example of 532 00:20:15,270 --> 00:20:17,160 that Scala code describing hardware. 533 00:20:27,620 --> 00:20:29,779 All right, so 534 00:20:29,780 --> 00:20:31,609 you might ask, like, why we need a free 535 00:20:31,610 --> 00:20:33,799 toolchain, like, other than free 536 00:20:33,800 --> 00:20:35,510 software is great. 537 00:20:37,370 --> 00:20:39,499 One problem is that a lot 538 00:20:39,500 --> 00:20:41,689 of the proprietary tools 539 00:20:41,690 --> 00:20:43,759 are sort of monolithic and 540 00:20:43,760 --> 00:20:45,769 they seem a little creaky.
541 00:20:45,770 --> 00:20:48,139 And our hope is that a free 542 00:20:48,140 --> 00:20:49,339 FPGA toolchain 543 00:20:50,690 --> 00:20:53,119 would not only let you flexibly 544 00:20:53,120 --> 00:20:55,129 target any FPGA you might be working 545 00:20:55,130 --> 00:20:57,199 with, it will also 546 00:20:57,200 --> 00:20:59,299 lay the groundwork for new hardware 547 00:20:59,300 --> 00:21:01,699 description language experiments, 548 00:21:01,700 --> 00:21:03,349 which I think is part of what Chisel is 549 00:21:03,350 --> 00:21:05,509 getting at in their paper about the 550 00:21:05,510 --> 00:21:07,989 tension between current monolithic HDLs. 551 00:21:09,410 --> 00:21:11,509 Faster builds: targeting 552 00:21:11,510 --> 00:21:13,639 FPGAs today notoriously 553 00:21:13,640 --> 00:21:14,659 takes a long time. 554 00:21:14,660 --> 00:21:17,029 Our fastest Xilinx ISE 555 00:21:17,030 --> 00:21:19,099 build is about 40 seconds, which 556 00:21:19,100 --> 00:21:20,719 is all day. You know, it's enough time to 557 00:21:20,720 --> 00:21:22,919 go get a new cup of coffee and you 558 00:21:22,920 --> 00:21:24,799 can only drink so much coffee. 559 00:21:24,800 --> 00:21:27,559 And then finally, longevity. 560 00:21:27,560 --> 00:21:29,929 The idea here is that free software 561 00:21:29,930 --> 00:21:31,909 has a greater lifespan than any 562 00:21:31,910 --> 00:21:32,990 proprietary software. 563 00:21:34,070 --> 00:21:35,329 Vendors come and go. 564 00:21:35,330 --> 00:21:36,769 You don't want to be locked in. 565 00:21:36,770 --> 00:21:38,329 GCC has been around for a really 566 00:21:38,330 --> 00:21:39,799 long time. We think something like that 567 00:21:39,800 --> 00:21:40,800 could exist. 568 00:21:44,360 --> 00:21:45,559 So we promised you 569 00:21:46,820 --> 00:21:48,079 space time adventures. 570 00:21:49,370 --> 00:21:51,709 So here's the space time picture from 571 00:21:51,710 --> 00:21:53,209 Wikipedia.
572 00:21:53,210 --> 00:21:54,229 Like Wikipedia a lot. 573 00:22:00,670 --> 00:22:02,019 In order to get into the space time 574 00:22:02,020 --> 00:22:03,669 adventures, we have to set the stage a 575 00:22:03,670 --> 00:22:04,670 little bit. 576 00:22:05,810 --> 00:22:08,299 Roll back and look at the history. 577 00:22:08,300 --> 00:22:10,699 Let's go back to nineteen sixty 578 00:22:10,700 --> 00:22:13,219 five, nineteen seventy. 579 00:22:13,220 --> 00:22:14,359 Computers existed. 580 00:22:14,360 --> 00:22:16,909 Everyone had read a newspaper article 581 00:22:16,910 --> 00:22:19,039 about computers, but 582 00:22:19,040 --> 00:22:21,259 how did a computer come to solve 583 00:22:21,260 --> 00:22:23,539 an actual problem in nineteen sixty 584 00:22:23,540 --> 00:22:24,649 five? 585 00:22:24,650 --> 00:22:27,679 Well, you had this computer with the CPU 586 00:22:27,680 --> 00:22:29,899 and you have a team of programmers 587 00:22:29,900 --> 00:22:31,639 who are going to build an application for 588 00:22:31,640 --> 00:22:32,640 this computer. 589 00:22:35,610 --> 00:22:36,719 That CPU, 590 00:22:37,800 --> 00:22:39,719 that computer hardware, that 591 00:22:39,720 --> 00:22:41,789 application are going to 592 00:22:41,790 --> 00:22:43,439 exist together. There's going to be 593 00:22:43,440 --> 00:22:45,539 a fixed thing which exists at 594 00:22:45,540 --> 00:22:47,489 the end of this project, which is solving 595 00:22:47,490 --> 00:22:49,079 that problem, whatever that problem is. 596 00:22:49,080 --> 00:22:50,849 Maybe it's a banking application. 597 00:22:50,850 --> 00:22:52,109 So this bank is going to have this 598 00:22:52,110 --> 00:22:53,369 computer and it's going to be their 599 00:22:53,370 --> 00:22:55,439 computer.
So 600 00:22:55,440 --> 00:22:57,689 you have the CPU and 601 00:22:57,690 --> 00:22:59,549 you have the team of programmers and 602 00:22:59,550 --> 00:23:01,019 maybe the team of programmers realizes 603 00:23:01,020 --> 00:23:03,059 that the CPU doesn't quite do what they 604 00:23:03,060 --> 00:23:05,219 want. So they ask, well, let's 605 00:23:05,220 --> 00:23:06,839 just change the CPU, let's add an 606 00:23:06,840 --> 00:23:08,429 instruction to it. 607 00:23:08,430 --> 00:23:10,589 So now the CPU gains the ability 608 00:23:10,590 --> 00:23:12,389 to do binary-coded decimal because 609 00:23:12,390 --> 00:23:13,889 that's what they were going to do in this 610 00:23:13,890 --> 00:23:15,659 banking application. 611 00:23:15,660 --> 00:23:17,999 And the team of application programmers 612 00:23:18,000 --> 00:23:19,109 are building for this computer. 613 00:23:19,110 --> 00:23:20,819 So they write, of course, in the 614 00:23:20,820 --> 00:23:22,979 assembly language for this computer, for 615 00:23:22,980 --> 00:23:25,649 the CPU with its new custom instruction. 616 00:23:25,650 --> 00:23:27,449 And that application is obviously only 617 00:23:27,450 --> 00:23:28,889 going to run on this computer, the one in 618 00:23:28,890 --> 00:23:31,169 Poughkeepsie. You wouldn't necessarily 619 00:23:31,170 --> 00:23:33,239 think of also running it on the computer 620 00:23:33,240 --> 00:23:34,649 in Minneapolis. 621 00:23:34,650 --> 00:23:36,089 So that application and that computer are 622 00:23:36,090 --> 00:23:38,039 bound together, and the peripherals as 623 00:23:38,040 --> 00:23:39,699 well. The printer gets wired into the 624 00:23:39,700 --> 00:23:41,129 computer. The computer is modified to 625 00:23:41,130 --> 00:23:42,130 talk to the peripheral. 626 00:23:43,170 --> 00:23:45,449 And so now you have a printer 627 00:23:45,450 --> 00:23:47,099 and maybe a tape drive attached.
628 00:23:47,100 --> 00:23:49,349 And now you have this object 629 00:23:49,350 --> 00:23:50,879 which has been created, which 630 00:23:50,880 --> 00:23:52,020 solves this problem. 631 00:23:53,760 --> 00:23:55,709 It's really cool. It solves the problem. 632 00:23:55,710 --> 00:23:56,999 This was really cool, especially at the 633 00:23:57,000 --> 00:23:58,000 time. 634 00:24:00,390 --> 00:24:02,789 But it's monolithic. It does 635 00:24:02,790 --> 00:24:05,069 exactly one thing. So rolling 636 00:24:05,070 --> 00:24:07,199 forward a little bit: 637 00:24:07,200 --> 00:24:09,899 nineteen seventy-two. 1972 638 00:24:09,900 --> 00:24:12,329 is the year that Unix was released 639 00:24:12,330 --> 00:24:13,330 in a portable form, 640 00:24:14,910 --> 00:24:17,249 a new operating system, not 641 00:24:17,250 --> 00:24:19,349 the first operating system, not the 642 00:24:19,350 --> 00:24:22,169 first operating system to provide 643 00:24:22,170 --> 00:24:24,029 the features that it provided. 644 00:24:24,030 --> 00:24:26,399 But I would argue, at this point, 645 00:24:26,400 --> 00:24:29,069 clearly the most successful, 646 00:24:29,070 --> 00:24:31,529 long-lived, really groundbreaking 647 00:24:31,530 --> 00:24:34,139 operating system. Unix brought in amazing 648 00:24:34,140 --> 00:24:36,690 innovations for software modularity. 649 00:24:38,160 --> 00:24:40,289 Unix really popularized multitasking. 650 00:24:40,290 --> 00:24:42,149 So you could have multiple programs 651 00:24:42,150 --> 00:24:44,529 running on a single CPU. 652 00:24:44,530 --> 00:24:45,719 This was kind of crazy in nineteen 653 00:24:45,720 --> 00:24:47,069 seventy two. Why would you do that? 654 00:24:47,070 --> 00:24:48,869 How could you do that in a way that 655 00:24:48,870 --> 00:24:49,870 actually works? 656 00:24:51,090 --> 00:24:53,669 Unix also introduced virtual memory.
657 00:24:53,670 --> 00:24:56,129 You no longer had to map 658 00:24:56,130 --> 00:24:58,379 the physical addresses directly and 659 00:24:58,380 --> 00:25:00,659 program your overlays so that 660 00:25:00,660 --> 00:25:02,819 your code would fit 661 00:25:02,820 --> 00:25:04,259 on the machine. 662 00:25:04,260 --> 00:25:06,389 And kind of most revolutionary, 663 00:25:06,390 --> 00:25:08,069 Unix was programmed in a high-level 664 00:25:08,070 --> 00:25:09,599 language, at least what was considered 665 00:25:09,600 --> 00:25:11,219 high level at the time: 666 00:25:11,220 --> 00:25:12,359 the programming language C. 667 00:25:16,960 --> 00:25:19,179 This is just amazing. 668 00:25:19,180 --> 00:25:21,429 I mean, it's hard looking back 669 00:25:21,430 --> 00:25:24,039 today to imagine how amazing 670 00:25:24,040 --> 00:25:26,629 it was that 671 00:25:26,630 --> 00:25:29,329 this underlying technology 672 00:25:29,330 --> 00:25:31,429 enabled component reusability 673 00:25:31,430 --> 00:25:33,169 for software development. 674 00:25:33,170 --> 00:25:34,170 I mean, 675 00:25:35,870 --> 00:25:36,870 wow. 676 00:25:45,760 --> 00:25:46,760 Such computer. 677 00:25:51,290 --> 00:25:53,559 So we gained a ton of infrastructure as 678 00:25:53,560 --> 00:25:56,239 software developers, and it was a really 679 00:25:56,240 --> 00:25:58,819 long, drawn-out, painful process. 680 00:25:58,820 --> 00:26:01,489 Not everyone got it at the beginning: 681 00:26:01,490 --> 00:26:03,829 the idea that reusability, 682 00:26:05,090 --> 00:26:07,369 small components, pieces working 683 00:26:07,370 --> 00:26:08,370 together 684 00:26:09,470 --> 00:26:11,959 in the service of the goals of 685 00:26:11,960 --> 00:26:14,659 the final user of the computer 686 00:26:14,660 --> 00:26:16,429 would actually be a good thing.
687 00:26:16,430 --> 00:26:18,379 And some of these technologies today are 688 00:26:18,380 --> 00:26:19,909 so taken for granted that it's hard to 689 00:26:19,910 --> 00:26:21,799 even remember that they exist as 690 00:26:21,800 --> 00:26:22,800 discrete, 691 00:26:24,410 --> 00:26:26,719 controversial-at-the-time technologies. 692 00:26:26,720 --> 00:26:30,199 Virtual memory: very controversial. 693 00:26:30,200 --> 00:26:33,229 I started my career at Cray, 694 00:26:33,230 --> 00:26:35,539 the supercomputer company, and I had 695 00:26:35,540 --> 00:26:37,669 coworkers who remembered this battle. 696 00:26:37,670 --> 00:26:39,199 "Nah, virtual memory is a fad. 697 00:26:39,200 --> 00:26:40,460 It's never going to take off." 698 00:26:46,830 --> 00:26:48,449 Timesharing: the fact that you could have 699 00:26:48,450 --> 00:26:50,819 multiple users using the same 700 00:26:50,820 --> 00:26:53,069 CPU simultaneously. I mean, it 701 00:26:53,070 --> 00:26:55,019 looks good in a research paper, but who's 702 00:26:55,020 --> 00:26:56,609 actually going to deploy that in production? 703 00:26:59,490 --> 00:27:01,499 Operating systems with these APIs. 704 00:27:01,500 --> 00:27:03,809 Like, you can just call things and 705 00:27:03,810 --> 00:27:05,699 things happen in a semi-reliable manner 706 00:27:05,700 --> 00:27:08,399 and you get bug fixes and 707 00:27:08,400 --> 00:27:09,989 your application just keeps calling the same 708 00:27:09,990 --> 00:27:10,679 function. 709 00:27:10,680 --> 00:27:12,179 It works better. How does it even, how 710 00:27:12,180 --> 00:27:13,829 does it even happen? 711 00:27:13,830 --> 00:27:16,199 Networking. I just call socket and TCP 712 00:27:16,200 --> 00:27:17,639 does a bunch of packets under the hood 713 00:27:17,640 --> 00:27:19,919 and it's amazing. Device 714 00:27:19,920 --> 00:27:21,779 drivers: that TCP layer isn't talking 715 00:27:21,780 --> 00:27:22,829 directly to the Ethernet card.
716 00:27:22,830 --> 00:27:24,389 It's going through several layers of 717 00:27:24,390 --> 00:27:25,769 abstraction. 718 00:27:25,770 --> 00:27:28,049 We write all of these tools 719 00:27:28,050 --> 00:27:29,789 that we're building these days in high 720 00:27:29,790 --> 00:27:32,159 level languages and they all rest 721 00:27:32,160 --> 00:27:34,229 on the foundation of compilers 722 00:27:34,230 --> 00:27:37,049 that turn our expressive 723 00:27:37,050 --> 00:27:38,579 thoughts into 724 00:27:39,600 --> 00:27:41,819 discrete actions 725 00:27:41,820 --> 00:27:43,349 on the hardware. 726 00:27:43,350 --> 00:27:45,239 And we have libraries. We have these vast 727 00:27:45,240 --> 00:27:46,319 libraries of software. 728 00:27:46,320 --> 00:27:47,519 Some of it's really terrible. 729 00:27:47,520 --> 00:27:49,139 Some of it's better, some of it's very 730 00:27:49,140 --> 00:27:50,519 new and won't last for very long. 731 00:27:50,520 --> 00:27:52,739 And other parts of it are like 732 00:27:52,740 --> 00:27:54,359 LINPACK, and we're still running code 733 00:27:54,360 --> 00:27:55,469 that was written in nineteen seventy 734 00:27:55,470 --> 00:27:56,470 three. 735 00:27:58,070 --> 00:28:00,289 It's amazing when you think about it, 736 00:28:00,290 --> 00:28:02,839 how well this infrastructure has really 737 00:28:02,840 --> 00:28:05,149 separated us from where 738 00:28:05,150 --> 00:28:07,309 we were at the beginning 739 00:28:07,310 --> 00:28:09,889 of the computer revolution. 740 00:28:09,890 --> 00:28:11,959 In conclusion, I can wrap it up and say 741 00:28:11,960 --> 00:28:13,939 that we don't program to bare metal 742 00:28:13,940 --> 00:28:16,099 anymore. We've managed to move 743 00:28:16,100 --> 00:28:18,409 up layers of abstraction and 744 00:28:18,410 --> 00:28:20,479 have gotten a lot of 745 00:28:20,480 --> 00:28:22,519 capability as a result.
746 00:28:23,980 --> 00:28:26,319 So now, given that background, 747 00:28:26,320 --> 00:28:28,449 it's time to tell you the manifesto of 748 00:28:28,450 --> 00:28:29,450 our project. 749 00:28:31,000 --> 00:28:33,129 Our vision is to let us do 750 00:28:33,130 --> 00:28:35,229 more than one thing at a time on 751 00:28:35,230 --> 00:28:38,559 the FPGA and to do so flexibly. 752 00:28:38,560 --> 00:28:41,049 Or in other words, writing and using 753 00:28:41,050 --> 00:28:43,389 an FPGA accelerator core 754 00:28:43,390 --> 00:28:45,579 can and should 755 00:28:45,580 --> 00:28:47,679 be as easy as writing 756 00:28:47,680 --> 00:28:50,109 a high-performance C application. 757 00:29:00,340 --> 00:29:02,469 I actually expected laughter rather 758 00:29:02,470 --> 00:29:04,809 than applause there, because 759 00:29:04,810 --> 00:29:06,040 I blame our test audience, 760 00:29:10,090 --> 00:29:12,189 because as those of you in 761 00:29:12,190 --> 00:29:14,169 the audience who've done this know, writing 762 00:29:14,170 --> 00:29:15,639 a high-performance C application is 763 00:29:15,640 --> 00:29:17,739 no easy matter, 764 00:29:17,740 --> 00:29:19,749 but imagine trying to do that in 765 00:29:19,750 --> 00:29:20,750 assembly. 766 00:29:21,370 --> 00:29:23,079 It would be a lot more work. 767 00:29:23,080 --> 00:29:25,449 So I wanted to emphasize 768 00:29:25,450 --> 00:29:27,639 that the goal of the 769 00:29:27,640 --> 00:29:30,099 Balboa project is to build 770 00:29:30,100 --> 00:29:32,589 dynamic, reconfigurable 771 00:29:32,590 --> 00:29:34,299 computing. And each of those words 772 00:29:34,300 --> 00:29:35,300 matters. 773 00:29:36,790 --> 00:29:38,950 It's dynamic because 774 00:29:40,760 --> 00:29:42,829 what the hardware is doing at any 775 00:29:42,830 --> 00:29:45,559 specific time can be changed 776 00:29:45,560 --> 00:29:48,289 with the flip of a switch.
777 00:29:48,290 --> 00:29:51,050 It's reconfigurable because 778 00:29:52,130 --> 00:29:54,439 what you've expressed as the accelerator 779 00:29:54,440 --> 00:29:56,509 core can be moved 780 00:29:56,510 --> 00:29:59,299 around the FPGA to optimize resources 781 00:29:59,300 --> 00:30:01,609 or to allow room 782 00:30:01,610 --> 00:30:02,869 for another accelerator core to be 783 00:30:02,870 --> 00:30:04,939 loaded. And it's computing because 784 00:30:04,940 --> 00:30:07,609 Balboa is focused on the idea 785 00:30:07,610 --> 00:30:10,009 of enabling computational 786 00:30:10,010 --> 00:30:11,529 acceleration on FPGAs. 787 00:30:12,800 --> 00:30:15,019 FPGAs are also amazing because 788 00:30:15,020 --> 00:30:17,209 they have fantastic IO capabilities. 789 00:30:17,210 --> 00:30:19,519 They have really fast digital IO, really 790 00:30:19,520 --> 00:30:21,229 fast analog IO. 791 00:30:21,230 --> 00:30:22,159 Those are awesome. 792 00:30:22,160 --> 00:30:23,839 Those are fantastic. I'm glad they exist. 793 00:30:23,840 --> 00:30:25,669 I totally use things that have FPGAs in 794 00:30:25,670 --> 00:30:28,069 them. But that's not what the Balboa 795 00:30:28,070 --> 00:30:29,749 project is about. It's not the killer app 796 00:30:29,750 --> 00:30:30,750 for me. 797 00:30:33,050 --> 00:30:34,339 So Balboa is... 798 00:30:36,380 --> 00:30:38,599 Getting down to the actual nuts 799 00:30:38,600 --> 00:30:41,149 and bolts of the system that 800 00:30:41,150 --> 00:30:42,649 we've built and that we're planning to 801 00:30:42,650 --> 00:30:43,650 build: 802 00:30:44,750 --> 00:30:46,969 Balboa is a library and 803 00:30:46,970 --> 00:30:49,099 some control software which runs on the 804 00:30:49,100 --> 00:30:50,990 CPU of the platform.
805 00:30:52,730 --> 00:30:54,739 And the control software makes sure that 806 00:30:54,740 --> 00:30:57,049 the FPGA is doing its job 807 00:30:57,050 --> 00:30:58,849 and that the apps are able to access it 808 00:30:58,850 --> 00:30:59,989 correctly. 809 00:30:59,990 --> 00:31:02,089 And Balboa is also 810 00:31:02,090 --> 00:31:04,249 a bus and management 811 00:31:04,250 --> 00:31:05,989 layer on the FPGA, 812 00:31:05,990 --> 00:31:08,119 implemented in Verilog, which 813 00:31:08,120 --> 00:31:10,219 you can plug into. You, 814 00:31:10,220 --> 00:31:12,679 as a developer of an 815 00:31:12,680 --> 00:31:14,929 accelerator core, can plug 816 00:31:14,930 --> 00:31:16,339 into it when you're writing your core. 817 00:31:16,340 --> 00:31:18,199 And we hope that it will provide 818 00:31:18,200 --> 00:31:20,449 infrastructure and useful 819 00:31:20,450 --> 00:31:22,250 management so that 820 00:31:23,960 --> 00:31:26,349 this job becomes a lot easier. 821 00:31:30,430 --> 00:31:32,049 The Novena hardware platform 822 00:31:33,520 --> 00:31:35,829 looks a lot like this in block diagram, 823 00:31:35,830 --> 00:31:37,960 leaving out some of the complicated bits. 824 00:31:44,050 --> 00:31:45,189 We're a little bit further in than we 825 00:31:45,190 --> 00:31:46,539 were in the last slide that had this 826 00:31:46,540 --> 00:31:47,499 picture on it. 827 00:31:47,500 --> 00:31:49,779 The FPGA and the ARM CPU 828 00:31:49,780 --> 00:31:51,819 are connected by that fast bus. 829 00:31:51,820 --> 00:31:54,069 Both the FPGA and the ARM have 830 00:31:54,070 --> 00:31:56,139 their own DRAM associated 831 00:31:56,140 --> 00:31:57,140 with them. 832 00:31:58,120 --> 00:32:00,429 The ARM CPU has the standard 833 00:32:00,430 --> 00:32:03,219 complement of interfaces: 834 00:32:03,220 --> 00:32:05,979 USB, gigabit Ethernet, HDMI 835 00:32:05,980 --> 00:32:07,209 and so on.
836 00:32:07,210 --> 00:32:09,130 The FPGA also has IO 837 00:32:10,420 --> 00:32:12,549 up at the top for those times 838 00:32:12,550 --> 00:32:13,599 when you just have to get out into the 839 00:32:13,600 --> 00:32:15,819 outside world. And 840 00:32:15,820 --> 00:32:18,039 now, diving in to see how 841 00:32:18,040 --> 00:32:20,919 the software maps onto 842 00:32:20,920 --> 00:32:22,900 this physical architecture: 843 00:32:25,350 --> 00:32:27,959 We've got the 844 00:32:27,960 --> 00:32:30,089 Balboa system here. 845 00:32:30,090 --> 00:32:31,619 In the FPGA world, 846 00:32:32,860 --> 00:32:35,229 we have the Balboa 847 00:32:35,230 --> 00:32:37,419 bus, which provides 848 00:32:37,420 --> 00:32:39,309 the management and the interconnect. 849 00:32:39,310 --> 00:32:41,739 We have many different cores running 850 00:32:41,740 --> 00:32:44,559 on the FPGA simultaneously. 851 00:32:44,560 --> 00:32:46,929 And then on the software side, 852 00:32:46,930 --> 00:32:49,659 the interface is 853 00:32:49,660 --> 00:32:51,879 mediated by 854 00:32:51,880 --> 00:32:54,279 libbalboa, and 855 00:32:54,280 --> 00:32:55,989 talking to libbalboa, we have the 856 00:32:55,990 --> 00:32:58,389 management software, the FPGA daemon, 857 00:32:58,390 --> 00:33:00,819 and any apps that are trying to use 858 00:33:00,820 --> 00:33:03,099 the FPGA cores, 859 00:33:03,100 --> 00:33:05,619 accelerator cores, to 860 00:33:05,620 --> 00:33:07,660 compute their favorite function. 861 00:33:14,310 --> 00:33:15,310 The Novena 862 00:33:16,230 --> 00:33:18,929 laptop is amazing. 863 00:33:18,930 --> 00:33:21,089 It has this really quite large 864 00:33:21,090 --> 00:33:23,229 FPGA. I say huge here and 865 00:33:23,230 --> 00:33:24,359 I should quantify that.
866 00:33:24,360 --> 00:33:26,429 There's enough room on the 867 00:33:26,430 --> 00:33:28,589 FPGA to have two full 868 00:33:28,590 --> 00:33:30,689 thirty-two-bit CPUs 869 00:33:30,690 --> 00:33:32,759 running simultaneously with plenty of 870 00:33:32,760 --> 00:33:33,760 room to spare. 871 00:33:37,890 --> 00:33:40,679 So there's a huge, unexplored 872 00:33:40,680 --> 00:33:42,689 territory here of things that could be 873 00:33:42,690 --> 00:33:43,710 done with the FPGA. 874 00:33:48,480 --> 00:33:50,790 And it's time to go exploring. 875 00:33:52,880 --> 00:33:54,109 And it's a brave new world. 876 00:33:58,150 --> 00:34:00,519 So here's the vision, where 877 00:34:00,520 --> 00:34:02,649 we can go and 878 00:34:02,650 --> 00:34:04,749 why having Balboa 879 00:34:04,750 --> 00:34:07,059 will be cool and is pretty cool already. 880 00:34:08,409 --> 00:34:10,749 Balboa provides interoperability 881 00:34:10,750 --> 00:34:13,239 of the accelerator cores so you can 882 00:34:17,850 --> 00:34:20,129 use multiple cores simultaneously 883 00:34:20,130 --> 00:34:21,539 on a single FPGA. 884 00:34:23,150 --> 00:34:25,040 And it also... 885 00:34:26,870 --> 00:34:28,289 We're hoping that the 886 00:34:28,290 --> 00:34:29,689 tooling will improve over time. 887 00:34:29,690 --> 00:34:31,189 Currently we're using the vendor 888 00:34:31,190 --> 00:34:33,319 toolchain and it's not 889 00:34:33,320 --> 00:34:35,718 great, although it's getting better 890 00:34:35,719 --> 00:34:37,789 as we figure it out and learn the ins and 891 00:34:37,790 --> 00:34:38,899 outs. 892 00:34:38,900 --> 00:34:40,968 FPGAs currently aren't as 893 00:34:40,969 --> 00:34:42,649 useful as they could be. 894 00:34:42,650 --> 00:34:44,779 Many Novena users 895 00:34:44,780 --> 00:34:46,339 don't have any plans to use their FPGA 896 00:34:46,340 --> 00:34:48,138 because they don't understand what it 897 00:34:48,139 --> 00:34:50,509 could possibly be useful for.
898 00:34:50,510 --> 00:34:52,488 So we think that with some 899 00:34:52,489 --> 00:34:55,729 infrastructure, the Balboa infrastructure, 900 00:34:55,730 --> 00:34:57,109 the FPGAs will be a 901 00:34:57,110 --> 00:34:58,829 lot more useful. 902 00:34:58,830 --> 00:35:00,919 Now, one point that it took 903 00:35:00,920 --> 00:35:03,049 me a while to get to here is that 904 00:35:03,050 --> 00:35:05,659 the point of Balboa is not building 905 00:35:05,660 --> 00:35:07,340 the tools and the compilers. 906 00:35:10,630 --> 00:35:12,849 The point of Balboa is really to 907 00:35:12,850 --> 00:35:16,269 allow us to use the FPGA efficiently 908 00:35:16,270 --> 00:35:18,159 with less overhead of doing so. 909 00:35:19,170 --> 00:35:21,299 It doesn't matter to me 910 00:35:21,300 --> 00:35:23,999 if my FPGA can run something really fast 911 00:35:24,000 --> 00:35:26,069 if it's a lot of work for me to set 912 00:35:26,070 --> 00:35:28,139 up. I can't even begin to tell 913 00:35:28,140 --> 00:35:30,419 you how many cool FPGA boards 914 00:35:30,420 --> 00:35:32,399 I have that I bought because FPGAs are 915 00:35:32,400 --> 00:35:34,439 cool and they're sitting in my closet 916 00:35:34,440 --> 00:35:36,239 because they're too much work to set up. 917 00:35:38,170 --> 00:35:40,539 The point of Balboa is to let us do more 918 00:35:40,540 --> 00:35:42,639 than one thing at a time on the FPGA and 919 00:35:42,640 --> 00:35:43,749 to do that flexibly. 920 00:35:46,600 --> 00:35:49,000 So the goals of the Balboa project 921 00:35:50,280 --> 00:35:52,379 are that you developers write 922 00:35:52,380 --> 00:35:54,839 cores in your chosen HDL. Currently, 923 00:35:54,840 --> 00:35:56,519 I would recommend Verilog because the 924 00:35:56,520 --> 00:35:59,279 alternatives aren't completely baked, 925 00:35:59,280 --> 00:36:01,469 but I can see paths to a better 926 00:36:01,470 --> 00:36:02,429 future.
927 00:36:02,430 --> 00:36:04,499 You get fast direct access to 928 00:36:04,500 --> 00:36:05,809 the core. 929 00:36:05,810 --> 00:36:07,979 Balboa just sets things up and then gets 930 00:36:07,980 --> 00:36:08,980 out of the way. 931 00:36:09,930 --> 00:36:12,179 You get standard interfaces for both 932 00:36:12,180 --> 00:36:14,819 the core and the application. 933 00:36:14,820 --> 00:36:17,219 I spent several months at the beginning 934 00:36:17,220 --> 00:36:19,649 of this 935 00:36:19,650 --> 00:36:21,779 project trying to figure out how the 936 00:36:21,780 --> 00:36:23,679 heck I was supposed to connect the 937 00:36:23,680 --> 00:36:25,289 Verilog that I was writing to the software 938 00:36:25,290 --> 00:36:27,239 that I wanted it to talk to. 939 00:36:27,240 --> 00:36:29,399 And most important to me 940 00:36:29,400 --> 00:36:31,619 is that with Balboa, the 941 00:36:31,620 --> 00:36:33,989 end user chooses what runs 942 00:36:33,990 --> 00:36:36,329 and when. Drawing a connection 943 00:36:36,330 --> 00:36:38,429 back to how software 944 00:36:38,430 --> 00:36:41,279 reconfigurability completely changed the world: 945 00:36:41,280 --> 00:36:42,689 today, no one would think of having a 946 00:36:42,690 --> 00:36:45,209 single-purpose computer that 947 00:36:45,210 --> 00:36:47,279 costs an enormous amount of money and 948 00:36:47,280 --> 00:36:48,629 only does one thing. 949 00:36:48,630 --> 00:36:50,129 The person who's using the computer 950 00:36:50,130 --> 00:36:51,899 chooses what it's doing. 951 00:36:51,900 --> 00:36:53,789 And with Balboa, we can do
952 00:36:53,790 --> 00:36:55,980 the same for FPGAs. The 953 00:36:57,180 --> 00:36:59,219 authors of accelerator cores and the 954 00:36:59,220 --> 00:37:01,529 authors of applications that use them 955 00:37:01,530 --> 00:37:03,779 can build something, can build some 956 00:37:03,780 --> 00:37:05,909 components in the Unix philosophy, 957 00:37:05,910 --> 00:37:07,829 build something small that does one thing 958 00:37:07,830 --> 00:37:10,259 well, and the end user 959 00:37:10,260 --> 00:37:12,419 gets to decide when and where 960 00:37:12,420 --> 00:37:14,309 that runs and what it does, 961 00:37:14,310 --> 00:37:16,739 taking it places that the original author 962 00:37:16,740 --> 00:37:17,999 never could have dreamed of. 963 00:37:21,150 --> 00:37:23,249 So how far along 964 00:37:23,250 --> 00:37:25,379 on this ambitious vision are we 965 00:37:25,380 --> 00:37:26,380 today? 966 00:37:28,640 --> 00:37:30,799 Frankly, after 967 00:37:30,800 --> 00:37:32,389 we started 968 00:37:32,390 --> 00:37:34,459 this project most of a 969 00:37:34,460 --> 00:37:36,319 year ago, we're not as far along as I had 970 00:37:36,320 --> 00:37:38,570 hoped I would be at Congress this year. 971 00:37:39,800 --> 00:37:41,329 But we have some really encouraging 972 00:37:41,330 --> 00:37:42,289 results. 973 00:37:42,290 --> 00:37:44,389 We do have multiple cores 974 00:37:44,390 --> 00:37:46,519 running on the FPGA, one 975 00:37:46,520 --> 00:37:49,579 at a time, so we can 976 00:37:49,580 --> 00:37:51,859 bring up one core, 977 00:37:51,860 --> 00:37:53,809 run with it for a while and then say, I'm 978 00:37:53,810 --> 00:37:54,859 done with this, I'm going to start a 979 00:37:54,860 --> 00:37:56,299 different core and start that. 980 00:37:56,300 --> 00:37:58,129 And the Novena system continues running 981 00:37:58,130 --> 00:37:59,130 flawlessly.
982 00:38:02,210 --> 00:38:04,639 Apps using the Balboa system 983 00:38:04,640 --> 00:38:06,859 can mmap the core and get direct 984 00:38:06,860 --> 00:38:09,649 access to the registers exposed 985 00:38:09,650 --> 00:38:11,839 by the accelerator core. 986 00:38:11,840 --> 00:38:14,329 And we're doing this all currently 987 00:38:14,330 --> 00:38:15,590 without a kernel driver. 988 00:38:16,610 --> 00:38:17,929 I'm not sure that we can keep this up. 989 00:38:17,930 --> 00:38:19,909 I think that a kernel driver is going to 990 00:38:19,910 --> 00:38:22,039 be a critical part of the system 991 00:38:22,040 --> 00:38:23,480 in the 992 00:38:24,590 --> 00:38:25,969 medium to long term. 993 00:38:25,970 --> 00:38:28,339 But for now, everything can be done 994 00:38:28,340 --> 00:38:30,419 from a userland process running as 995 00:38:30,420 --> 00:38:32,929 root. YOLO. 996 00:38:36,800 --> 00:38:38,959 So that's the current status of 997 00:38:38,960 --> 00:38:40,639 the Balboa project. 998 00:38:40,640 --> 00:38:42,799 So at this point, we're going to discuss 999 00:38:42,800 --> 00:38:45,019 what's coming up next 1000 00:38:45,020 --> 00:38:47,509 for us on the project 1001 00:38:47,510 --> 00:38:48,829 and what we'd like to see happen. 1002 00:38:51,020 --> 00:38:53,539 So first, 1003 00:38:53,540 --> 00:38:55,609 some issues. 1004 00:38:55,610 --> 00:38:56,610 Yeah, so 1005 00:39:00,110 --> 00:39:02,599 the introduction 1006 00:39:02,600 --> 00:39:05,179 of timesharing and virtual memory, 1007 00:39:05,180 --> 00:39:07,399 in addition to enabling enormous kinds 1008 00:39:07,400 --> 00:39:09,289 of productivity and creativity, also 1009 00:39:09,290 --> 00:39:11,749 brought in some security challenges. 1010 00:39:11,750 --> 00:39:12,800 Now you have 1011 00:39:15,020 --> 00:39:16,159 resource exhaustion issues.
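A side note on the register access mentioned a moment ago (an app mapping the core and reading its registers from a userland process): the sketch below shows roughly what that pattern looks like. It is not Balboa's actual code. The register offsets, the window size, and the helper names are hypothetical, and a temporary file stands in for the physical address window so that the example runs without hardware. On a real system, a root process would typically map a device file such as /dev/mem at an address the talk does not specify.

```python
import mmap
import struct
import tempfile

WINDOW_SIZE = mmap.PAGESIZE   # hypothetical size of one core's register window
REG_CONTROL = 0x00            # hypothetical register offsets
REG_STATUS = 0x04


def read_reg(window, offset):
    """Read one little-endian 32-bit register from the mapped window."""
    return struct.unpack_from("<I", window, offset)[0]


def write_reg(window, offset, value):
    """Write one little-endian 32-bit register into the mapped window."""
    struct.pack_into("<I", window, offset, value & 0xFFFFFFFF)


def demo():
    # Stand-in for the device: a zero-filled temporary file (assumption).
    with tempfile.TemporaryFile() as backing:
        backing.write(b"\x00" * WINDOW_SIZE)
        backing.flush()
        window = mmap.mmap(backing.fileno(), WINDOW_SIZE)
        try:
            write_reg(window, REG_CONTROL, 0x1)
            return read_reg(window, REG_CONTROL), read_reg(window, REG_STATUS)
        finally:
            window.close()


if __name__ == "__main__":
    print(demo())
```

Once the window is mapped, every register access is an ordinary memory read or write with no system call in between, which is one way to read the earlier remark that Balboa sets things up and then gets out of the way.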
1012 00:39:16,160 --> 00:39:17,869 You can't use a computer because there are 1013 00:39:17,870 --> 00:39:19,309 multiple things running and maybe one of 1014 00:39:19,310 --> 00:39:20,509 them is trying to prevent another one 1015 00:39:20,510 --> 00:39:21,510 from running. 1016 00:39:22,160 --> 00:39:24,499 Many similar kinds of problems 1017 00:39:24,500 --> 00:39:26,689 can happen. What should we call this? 1018 00:39:26,690 --> 00:39:28,939 What should we call this, seven security 1019 00:39:28,940 --> 00:39:30,139 issues? I don't know. 1020 00:39:30,140 --> 00:39:32,419 There are challenges, 1021 00:39:32,420 --> 00:39:34,219 things that could happen on an FPGA if 1022 00:39:34,220 --> 00:39:36,529 you're doing it, or seven 1023 00:39:36,530 --> 00:39:37,530 awesome hacks. 1024 00:39:39,860 --> 00:39:41,929 So seven ideas 1025 00:39:41,930 --> 00:39:43,999 for potential security problems 1026 00:39:44,000 --> 00:39:46,309 or awesome hacks, which 1027 00:39:46,310 --> 00:39:48,409 I hope someone at a future Congress will 1028 00:39:48,410 --> 00:39:50,889 present on. The first one: 1029 00:39:50,890 --> 00:39:52,219 electromagnetic coupling. 1030 00:39:53,600 --> 00:39:55,939 There's this amazing paper by Adrian 1031 00:39:55,940 --> 00:39:58,219 Thompson from 1996 called 1032 00:39:58,220 --> 00:39:59,239 An Evolved Circuit. 1033 00:39:59,240 --> 00:40:01,069 The full title is actually a little poem, 1034 00:40:01,070 --> 00:40:02,239 but it didn't fit in our slide. 1035 00:40:04,550 --> 00:40:06,229 The picture here is a picture of the 1036 00:40:06,230 --> 00:40:07,790 circuit which he evolved. 1037 00:40:09,560 --> 00:40:11,689 And there is some gray... oh, they're 1038 00:40:11,690 --> 00:40:12,619 completely unreadable. 1039 00:40:12,620 --> 00:40:13,959 Unfortunately, they're not visible. 1040 00:40:13,960 --> 00:40:15,799 Yeah, there's some gray squares up there.
1041 00:40:15,800 --> 00:40:18,019 Don't turn around. Around 1042 00:40:18,020 --> 00:40:19,099 the edges of that circuit. 1043 00:40:19,100 --> 00:40:20,100 They're not connected. 1044 00:40:21,890 --> 00:40:23,809 And according to the documentation of the 1045 00:40:23,810 --> 00:40:26,209 FPGA, unconnected 1046 00:40:26,210 --> 00:40:28,129 functional blocks shouldn't 1047 00:40:28,130 --> 00:40:29,479 be able to affect the outcome of the 1048 00:40:29,480 --> 00:40:30,679 circuit at all. 1049 00:40:32,030 --> 00:40:34,099 Shouldn't be able to, according to the 1050 00:40:34,100 --> 00:40:35,389 documentation. 1051 00:40:35,390 --> 00:40:37,519 Turns out that if you remove 1052 00:40:37,520 --> 00:40:39,109 any of those functional blocks from the 1053 00:40:39,110 --> 00:40:41,149 design, the circuit stops working. 1054 00:40:43,130 --> 00:40:45,019 The documentation isn't telling me the 1055 00:40:45,020 --> 00:40:46,070 complete truth. 1056 00:40:47,870 --> 00:40:49,630 I'm shocked, shocked. 1057 00:40:53,760 --> 00:40:55,229 An interesting side note here. 1058 00:40:55,230 --> 00:40:57,539 So this paper is amazing, I recommend 1059 00:40:57,540 --> 00:40:58,550 that everyone read it. 1060 00:40:59,560 --> 00:41:01,689 It was on the Web for like 15 years. 1061 00:41:01,690 --> 00:41:03,809 It's a totally incredible paper, 1062 00:41:03,810 --> 00:41:06,119 really well done with some high quality 1063 00:41:06,120 --> 00:41:07,900 1996 web design. 1064 00:41:10,710 --> 00:41:11,699 It's great. 1065 00:41:11,700 --> 00:41:13,919 And the University of Sussex 1066 00:41:13,920 --> 00:41:16,139 restructured the website in 2013, 1067 00:41:16,140 --> 00:41:18,299 and now that paper is no longer 1068 00:41:18,300 --> 00:41:20,459 on their website. The links on Google go 1069 00:41:20,460 --> 00:41:21,749 nowhere. 1070 00:41:21,750 --> 00:41:23,849 Adrian's paper is only available 1071 00:41:23,850 --> 00:41:25,589 on archive.org.
1072 00:41:25,590 --> 00:41:28,319 So a shout out to the Internet Archive. 1073 00:41:28,320 --> 00:41:30,389 Even the academics are trying 1074 00:41:30,390 --> 00:41:31,390 to delete the web. 1075 00:41:33,090 --> 00:41:35,249 It's important for us because cores 1076 00:41:35,250 --> 00:41:37,199 have never really been shared on the same 1077 00:41:37,200 --> 00:41:38,429 FPGA before. 1078 00:41:38,430 --> 00:41:40,199 So the question this raises is whether 1079 00:41:40,200 --> 00:41:42,869 you could create an appropriate antenna 1080 00:41:42,870 --> 00:41:45,509 in hardware to snoop on other processes 1081 00:41:45,510 --> 00:41:47,079 being computed on the same FPGA. 1082 00:41:48,420 --> 00:41:50,039 Seems like it should be possible. 1083 00:41:50,040 --> 00:41:51,659 Documentation says it won't work, but. 1084 00:41:54,500 --> 00:41:55,969 Second awesome hack: 1085 00:41:57,650 --> 00:42:00,049 that bitstream that is output 1086 00:42:00,050 --> 00:42:02,449 from the toolchain and then fed 1087 00:42:02,450 --> 00:42:04,130 to the 1088 00:42:05,180 --> 00:42:07,390 FPGA over a hardware interface, 1089 00:42:08,690 --> 00:42:10,309 it's a stream of bits. 1090 00:42:10,310 --> 00:42:13,009 It's a language. 1091 00:42:13,010 --> 00:42:15,349 In the hash 1092 00:42:15,350 --> 00:42:17,209 tag, it's called langsec: 1093 00:42:17,210 --> 00:42:19,459 the idea of tweaking inputs to 1094 00:42:19,460 --> 00:42:21,799 a given interpreter to 1095 00:42:21,800 --> 00:42:23,209 find unexpected behavior. 1096 00:42:23,210 --> 00:42:25,879 And the FPGA bitstream 1097 00:42:25,880 --> 00:42:28,129 is a language with an interpreter 1098 00:42:28,130 --> 00:42:30,229 implemented in hardware in 1099 00:42:30,230 --> 00:42:31,349 the FPGA fabric. 1100 00:42:32,360 --> 00:42:34,759 What happens if you give that interpreter 1101 00:42:34,760 --> 00:42:36,289 unexpected input? 1102 00:42:36,290 --> 00:42:37,969 Do we get weird results?
1103 00:42:37,970 --> 00:42:39,439 I bet we do. It's only gotten trusted 1104 00:42:39,440 --> 00:42:40,440 input so far. 1105 00:42:43,090 --> 00:42:44,859 Which leads to the second option, 1106 00:42:44,860 --> 00:42:46,390 bitstream exploits. 1107 00:42:48,910 --> 00:42:51,369 In our pre-talk prep, 1108 00:42:51,370 --> 00:42:53,559 we talked to someone who's 1109 00:42:53,560 --> 00:42:54,939 actually done some reverse engineering in 1110 00:42:54,940 --> 00:42:57,159 this space labs, and 1111 00:42:57,160 --> 00:42:59,079 he pointed out that the bitstream 1112 00:42:59,080 --> 00:43:01,449 apparently can fairly 1113 00:43:01,450 --> 00:43:03,879 easily be configured to connect 1114 00:43:03,880 --> 00:43:05,709 power directly to ground. 1115 00:43:08,440 --> 00:43:10,719 And everyone who laughed has let 1116 00:43:10,720 --> 00:43:12,459 out the magic smoke at some point by 1117 00:43:12,460 --> 00:43:13,510 doing exactly that. 1118 00:43:14,920 --> 00:43:17,139 Probably some really interesting, 1119 00:43:17,140 --> 00:43:18,879 fairly expensive experiments that need to 1120 00:43:18,880 --> 00:43:20,259 be run there. 1121 00:43:20,260 --> 00:43:22,599 You too can talk to us about the Novena 1122 00:43:22,600 --> 00:43:24,489 hardware firms are very good. 1123 00:43:24,490 --> 00:43:25,490 It's true. 1124 00:43:26,560 --> 00:43:28,899 The bus protocol. The FPGA 1125 00:43:28,900 --> 00:43:31,479 on the Novena and the CPU on the Novena 1126 00:43:31,480 --> 00:43:33,729 are connected with this EIM 1127 00:43:33,730 --> 00:43:35,529 bus. Some acronym, doesn't matter what it 1128 00:43:35,530 --> 00:43:37,299 says. It's a bus. 1129 00:43:37,300 --> 00:43:39,729 It's pretty complicated because it has 1130 00:43:39,730 --> 00:43:41,469 a very sophisticated protocol 1131 00:43:42,550 --> 00:43:44,529 with very strict timing 1132 00:43:44,530 --> 00:43:45,759 requirements.
1133 00:43:45,760 --> 00:43:48,069 So when the 1134 00:43:48,070 --> 00:43:50,499 CPU asks the FPGA 1135 00:43:50,500 --> 00:43:52,599 for a given 1136 00:43:52,600 --> 00:43:54,669 memory address, the FPGA 1137 00:43:54,670 --> 00:43:57,039 has a responsibility to return 1138 00:43:57,040 --> 00:43:58,329 an answer within a certain number of 1139 00:43:58,330 --> 00:43:59,529 clock cycles. 1140 00:43:59,530 --> 00:44:02,439 And if the FPGA doesn't return, 1141 00:44:02,440 --> 00:44:03,969 the CPU might crash. 1142 00:44:03,970 --> 00:44:06,559 Might hang. Ask me how I know. 1143 00:44:06,560 --> 00:44:08,879 Ask me how many times I power 1144 00:44:08,880 --> 00:44:10,599 cycled my dev hardware. 1145 00:44:12,700 --> 00:44:14,829 Either the core, the 1146 00:44:14,830 --> 00:44:17,229 accelerator core on the FPGA, or the app 1147 00:44:17,230 --> 00:44:18,429 might be able to trigger this. 1148 00:44:18,430 --> 00:44:20,619 So there's some really interesting 1149 00:44:20,620 --> 00:44:22,029 fuzzing work to be done. 1150 00:44:22,030 --> 00:44:23,799 Hopefully we can't light things on fire 1151 00:44:23,800 --> 00:44:25,689 as easily at that part of the 1152 00:44:25,690 --> 00:44:27,849 protocol. That 1153 00:44:27,850 --> 00:44:30,969 leads into the fifth 1154 00:44:30,970 --> 00:44:33,339 awesome hack: malicious apps. 1155 00:44:33,340 --> 00:44:35,559 Once I have an app with 1156 00:44:35,560 --> 00:44:38,079 access to an FPGA accelerator core, 1157 00:44:38,080 --> 00:44:40,449 it's talking over that EIM bus 1158 00:44:40,450 --> 00:44:43,299 to the Balboa bus on the FPGA. 1159 00:44:43,300 --> 00:44:45,609 What can the app do to trigger unexpected 1160 00:44:45,610 --> 00:44:47,259 behavior? Now, 1161 00:44:47,260 --> 00:44:48,519 most things are pretty robust. 1162 00:44:48,520 --> 00:44:50,169 I'm sure that if you do normal things 1163 00:44:50,170 --> 00:44:51,819 from your app, then nothing bad will 1164 00:44:51,820 --> 00:44:54,099 happen.
But what happens if the app 1165 00:44:54,100 --> 00:44:56,529 issues an unaligned read 1166 00:44:56,530 --> 00:44:57,550 to the FPGA? 1167 00:44:58,570 --> 00:45:00,099 Maybe it works, right? 1168 00:45:00,100 --> 00:45:01,389 Maybe we see some really interesting 1169 00:45:01,390 --> 00:45:02,390 behavior. 1170 00:45:06,820 --> 00:45:08,349 Had a joke there, but I've lost it. 1171 00:45:12,520 --> 00:45:14,589 Another awesome hack: timing 1172 00:45:14,590 --> 00:45:15,489 attacks. 1173 00:45:15,490 --> 00:45:17,649 One of the reasons that FPGAs are 1174 00:45:17,650 --> 00:45:19,959 so cool is that you get down to 1175 00:45:19,960 --> 00:45:22,209 the cycle, the clock cycle, 1176 00:45:22,210 --> 00:45:23,949 fifty megahertz or one hundred megahertz 1177 00:45:23,950 --> 00:45:25,989 or two megahertz, accurate timing 1178 00:45:25,990 --> 00:45:27,489 information. When something happens, you 1179 00:45:27,490 --> 00:45:29,710 can know exactly when it happened. 1180 00:45:31,000 --> 00:45:33,189 Sounds like a great way to build 1181 00:45:33,190 --> 00:45:35,289 a timing attack on a software 1182 00:45:35,290 --> 00:45:36,249 algorithm. 1183 00:45:36,250 --> 00:45:38,379 If I can extract the exact 1184 00:45:38,380 --> 00:45:40,539 cycle when some IO happened, I 1185 00:45:40,540 --> 00:45:41,949 can probably find out some really 1186 00:45:41,950 --> 00:45:44,019 interesting things about your cache 1187 00:45:44,020 --> 00:45:46,569 misses, the layout 1188 00:45:46,570 --> 00:45:48,579 of your AES key table, 1189 00:45:49,630 --> 00:45:51,610 your RSA implementation. 1190 00:45:52,750 --> 00:45:53,750 Should be a lot of fun. 1191 00:45:56,470 --> 00:45:58,719 And the 7th 1192 00:45:58,720 --> 00:45:59,800 awesome hack 1193 00:46:01,060 --> 00:46:03,939 is hardware backdoors.
1194 00:46:03,940 --> 00:46:06,099 Now there's this 1195 00:46:06,100 --> 00:46:08,409 interesting idea that some 1196 00:46:08,410 --> 00:46:10,509 people have bandied about for why 1197 00:46:10,510 --> 00:46:12,219 we should switch to FPGAs. 1198 00:46:12,220 --> 00:46:14,559 And to make this make any sense, 1199 00:46:14,560 --> 00:46:16,779 I have to put on my paranoid 1200 00:46:16,780 --> 00:46:19,359 tinfoil hat and say, 1201 00:46:19,360 --> 00:46:21,729 you know, I'm worried that the NSA 1202 00:46:21,730 --> 00:46:23,949 might have put a backdoor in my Intel 1203 00:46:23,950 --> 00:46:26,829 CPU's AES instructions. 1204 00:46:26,830 --> 00:46:27,830 And 1205 00:46:30,490 --> 00:46:33,339 I'm going to fix this by 1206 00:46:33,340 --> 00:46:35,199 not using an Intel CPU to do my 1207 00:46:35,200 --> 00:46:37,419 critical secure computation. 1208 00:46:37,420 --> 00:46:39,309 Instead, I'm going to use. 1209 00:46:44,780 --> 00:46:46,199 Our I look fine here 1210 00:46:48,390 --> 00:46:50,259 and the NSA had nothing to do with that, 1211 00:46:50,260 --> 00:46:52,469 I'm sure. I'm 1212 00:46:52,470 --> 00:46:53,470 not so sure. 1213 00:46:57,320 --> 00:46:59,329 So I'm going to use a soft CPU. 1214 00:46:59,330 --> 00:47:01,399 It's written in Verilog, I can look at 1215 00:47:01,400 --> 00:47:03,169 it, read the source code and convince 1216 00:47:03,170 --> 00:47:04,729 myself that it's correct and I can look 1217 00:47:04,730 --> 00:47:06,499 at the bitstream and convince myself that 1218 00:47:06,500 --> 00:47:08,299 it's correct. You know, the NSA have not 1219 00:47:08,300 --> 00:47:10,489 backdoored my toolchain and I'm running 1220 00:47:10,490 --> 00:47:12,679 on an FPGA, which is just a sea of gates. 1221 00:47:12,680 --> 00:47:13,729 Everything should be fine, right? 1222 00:47:13,730 --> 00:47:15,469 I can trust that they didn't backdoor 1223 00:47:15,470 --> 00:47:18,079 every gate on this FPGA.
1224 00:47:18,080 --> 00:47:20,179 It turns out the urban planning metaphor 1225 00:47:20,180 --> 00:47:21,499 was very accurate. 1226 00:47:21,500 --> 00:47:23,749 You can't just put the parts of your soft 1227 00:47:23,750 --> 00:47:25,730 CPU anywhere on the FPGA fabric. 1228 00:47:27,470 --> 00:47:30,229 If you have an FPGA, the ALU 1229 00:47:30,230 --> 00:47:32,389 and any accelerators that you put are 1230 00:47:32,390 --> 00:47:34,309 going to end up in a fairly small set of 1231 00:47:34,310 --> 00:47:36,529 places on that fabric. 1232 00:47:36,530 --> 00:47:38,719 So if I am the NSA and I'm 1233 00:47:38,720 --> 00:47:41,179 in charge of backdooring an FPGA 1234 00:47:41,180 --> 00:47:43,429 so that it will be able to snoop on 1235 00:47:43,430 --> 00:47:45,529 the AES keys of a soft CPU 1236 00:47:45,530 --> 00:47:47,599 implemented in that FPGA, 1237 00:47:47,600 --> 00:47:48,949 I can probably manage to do it. 1238 00:47:48,950 --> 00:47:51,440 Turns out Shadforths 1239 00:47:52,460 --> 00:47:54,289 the backdoors. In short, the backdoors 1240 00:47:54,290 --> 00:47:56,599 will probably still work if they exist 1241 00:47:56,600 --> 00:47:59,089 at all, which they don't, because I took 1242 00:47:59,090 --> 00:48:00,860 off my tinfoil hat 1243 00:48:02,150 --> 00:48:03,349 and the slides are back, so that's 1244 00:48:03,350 --> 00:48:04,350 good. 1245 00:48:04,790 --> 00:48:06,979 As far as the rest of Balboa goes, we 1246 00:48:06,980 --> 00:48:08,419 could use your help. We'd love to have 1247 00:48:08,420 --> 00:48:09,679 people join our project. 1248 00:48:09,680 --> 00:48:12,439 And we're going to wrap up by describing 1249 00:48:12,440 --> 00:48:14,449 seven things that we would like to have 1250 00:48:14,450 --> 00:48:15,450 help working on. 1251 00:48:19,820 --> 00:48:22,219 So our current proof of concept 1252 00:48:22,220 --> 00:48:24,679 is just one core at a time.
1253 00:48:24,680 --> 00:48:26,899 We can build the 1254 00:48:26,900 --> 00:48:29,059 accelerator core and synthesize it 1255 00:48:29,060 --> 00:48:31,489 and load it onto the FPGA. 1256 00:48:31,490 --> 00:48:33,709 But as a next step, 1257 00:48:33,710 --> 00:48:35,599 one goal of the project is to make it 1258 00:48:35,600 --> 00:48:37,789 really easy to take multiple 1259 00:48:37,790 --> 00:48:40,129 cores, synthesize them into a single 1260 00:48:40,130 --> 00:48:42,199 fixed bitstream and load that 1261 00:48:42,200 --> 00:48:44,059 onto the FPGA. 1262 00:48:44,060 --> 00:48:46,339 Pretty straightforward idea. 1263 00:48:46,340 --> 00:48:47,959 A little bit more kudret could use some 1264 00:48:47,960 --> 00:48:48,960 help. 1265 00:48:50,060 --> 00:48:52,129 Dynamic reconfiguration of FPGAs is 1266 00:48:52,130 --> 00:48:54,049 a really interesting area. 1267 00:48:54,050 --> 00:48:56,239 Now most people who 1268 00:48:56,240 --> 00:48:58,699 are using FPGAs reconfigure 1269 00:48:58,700 --> 00:48:59,869 the whole thing at once. 1270 00:48:59,870 --> 00:49:02,059 You configure it one time 1271 00:49:02,060 --> 00:49:04,159 and now your FPGA is a 1272 00:49:04,160 --> 00:49:05,929 software defined radio. 1273 00:49:07,850 --> 00:49:10,339 And 1274 00:49:10,340 --> 00:49:11,659 if you need to do something else with it, 1275 00:49:11,660 --> 00:49:13,729 if you want to use it as a Bitcoin 1276 00:49:13,730 --> 00:49:15,559 miner, you stop using it as a software 1277 00:49:15,560 --> 00:49:18,019 defined radio and load the Bitcoin 1278 00:49:18,020 --> 00:49:19,519 mining bitstream. 1279 00:49:19,520 --> 00:49:21,619 And now it's a Bitcoin 1280 00:49:21,620 --> 00:49:23,689 miner.
Well, it would be really cool if I 1281 00:49:23,690 --> 00:49:25,069 could have the software defined radio 1282 00:49:25,070 --> 00:49:27,769 taking up two thirds of the FPGA and 1283 00:49:27,770 --> 00:49:29,389 dynamically decide, oh, I would like to 1284 00:49:29,390 --> 00:49:31,849 use Bitcoin mining, load the 1285 00:49:31,850 --> 00:49:34,309 Bitcoin core on the remaining space 1286 00:49:34,310 --> 00:49:37,159 and go from there 1287 00:49:37,160 --> 00:49:39,559 and then decide, oh, I need to do some 1288 00:49:39,560 --> 00:49:41,899 secure... I need to do lots of 1289 00:49:41,900 --> 00:49:44,269 crypto. So I'm going to unload my Bitcoin 1290 00:49:44,270 --> 00:49:46,399 mining rig and load up a 1291 00:49:46,400 --> 00:49:48,289 crypto accelerator without stopping the 1292 00:49:48,290 --> 00:49:49,759 SDR. 1293 00:49:49,760 --> 00:49:52,429 This is theoretically possible. 1294 00:49:52,430 --> 00:49:53,689 The documentation says that it should 1295 00:49:53,690 --> 00:49:55,130 work. Haven't gotten it working yet. 1296 00:49:57,920 --> 00:50:00,379 Having multiple cores on the FPGA, 1297 00:50:00,380 --> 00:50:02,959 running on the FPGA, simultaneously 1298 00:50:02,960 --> 00:50:05,599 using the bus between 1299 00:50:05,600 --> 00:50:07,939 the FPGA and the CPU 1300 00:50:09,020 --> 00:50:10,459 in a fair manner. 1301 00:50:10,460 --> 00:50:12,230 So we need bus arbitration. 1302 00:50:13,670 --> 00:50:16,009 And related to that, allowing multiple 1303 00:50:16,010 --> 00:50:18,289 apps to use the FPGA 1304 00:50:18,290 --> 00:50:20,479 simultaneously will require 1305 00:50:20,480 --> 00:50:22,639 some software work on the Unix side, 1306 00:50:22,640 --> 00:50:24,739 on the Linux side, to let 1307 00:50:24,740 --> 00:50:27,349 the mmap mappings interoperate 1308 00:50:27,350 --> 00:50:28,350 cleanly.
1309 00:50:29,510 --> 00:50:31,789 The Balboa 1310 00:50:31,790 --> 00:50:34,249 FPGA, the Novena FPGA has 1311 00:50:34,250 --> 00:50:36,139 a bunch of really cool IO and we should 1312 00:50:36,140 --> 00:50:37,669 be able to use that. 1313 00:50:37,670 --> 00:50:39,679 That's just a matter of teaching the 1314 00:50:39,680 --> 00:50:41,749 Balboa framework about those resources 1315 00:50:41,750 --> 00:50:43,669 and giving it a management layer. 1316 00:50:43,670 --> 00:50:46,369 And related to that, using the RAM, 1317 00:50:46,370 --> 00:50:48,379 that RAM that's attached to the FPGA. 1318 00:50:48,380 --> 00:50:50,809 Currently, if you wanted to use that in 1319 00:50:50,810 --> 00:50:52,879 a Balboa core, you would have to do a 1320 00:50:52,880 --> 00:50:54,529 bunch of hand coding in Verilog. 1321 00:50:54,530 --> 00:50:55,789 That should be easier. There's no reason 1322 00:50:55,790 --> 00:50:57,889 for it to be as complicated 1323 00:50:57,890 --> 00:51:00,109 as it is. And the last and perhaps I 1324 00:51:00,110 --> 00:51:02,389 think the most important is that 1325 00:51:02,390 --> 00:51:04,639 we should be able to implement 1326 00:51:04,640 --> 00:51:05,839 cores for Balboa 1327 00:51:05,840 --> 00:51:07,129 using higher level languages and not have 1328 00:51:07,130 --> 00:51:09,799 to write Verilog ourselves, because 1329 00:51:09,800 --> 00:51:12,050 writing Verilog is kind of a pain. 1330 00:51:14,120 --> 00:51:15,560 So here's how you join. 1331 00:51:16,580 --> 00:51:18,349 I will admit that when we found out our 1332 00:51:18,350 --> 00:51:20,509 talk was accepted, we hastily 1333 00:51:20,510 --> 00:51:21,919 whipped up a wiki, etc. 1334 00:51:22,970 --> 00:51:24,679 You can join the conversation on Twitter 1335 00:51:24,680 --> 00:51:26,330 with our cool hashtag 1336 00:51:27,530 --> 00:51:30,049 and we hope to see you join.
1337 00:51:30,050 --> 00:51:32,479 Finally, I will say that the Balboa FPGA 1338 00:51:32,480 --> 00:51:34,849 GitHub is of special note. 1339 00:51:34,850 --> 00:51:36,169 Clone away, check it out. 1340 00:51:36,170 --> 00:51:37,170 We'd love your feedback. 1341 00:51:38,360 --> 00:51:39,799 Today, the best way to get in touch is 1342 00:51:39,800 --> 00:51:41,689 via Twitter, which we'll use to bootstrap 1343 00:51:41,690 --> 00:51:42,690 to email. 1344 00:51:43,130 --> 00:51:44,479 Thank you so much. 1345 00:51:44,480 --> 00:51:46,279 Special thanks to everyone who helps us 1346 00:51:46,280 --> 00:51:47,689 with this project. 1347 00:51:47,690 --> 00:51:49,819 And the creators of Novena are in fact 1348 00:51:49,820 --> 00:51:51,979 in the room. It's very exciting and 1349 00:51:51,980 --> 00:51:54,080 we hope to spur a huge amount of 1350 00:51:55,110 --> 00:51:56,729 work and excitement on this project. 1351 00:51:56,730 --> 00:51:57,730 So thank you. 1352 00:52:09,270 --> 00:52:11,699 OK, thank you very much. 1353 00:52:11,700 --> 00:52:13,829 We still have some minutes left, but 1354 00:52:13,830 --> 00:52:16,229 to anybody left for questions, 1355 00:52:16,230 --> 00:52:18,059 so as usual, please line up behind the 1356 00:52:18,060 --> 00:52:19,499 microphones in the room. 1357 00:52:19,500 --> 00:52:21,839 I think the first question goes to you. 1358 00:52:21,840 --> 00:52:23,219 So start. 1359 00:52:23,220 --> 00:52:24,809 One thing that we never explain in our 1360 00:52:24,810 --> 00:52:27,269 talk is the name 1361 00:52:27,270 --> 00:52:28,270 Balboa. 1362 00:52:29,190 --> 00:52:30,989 Where does that name come from? 1363 00:52:30,990 --> 00:52:33,419 And he wants me to answer where the name 1364 00:52:33,420 --> 00:52:34,799 comes from.
1365 00:52:34,800 --> 00:52:37,049 I don't know if you all read bunnie's 1366 00:52:37,050 --> 00:52:39,269 blog as closely as I do, but 1367 00:52:39,270 --> 00:52:40,949 we learned that Novena, the name of the 1368 00:52:40,950 --> 00:52:43,169 laptop, is in fact the name 1369 00:52:43,170 --> 00:52:46,499 of a Singaporean MRT station, 1370 00:52:46,500 --> 00:52:48,389 somewhere apparently close to where he 1371 00:52:48,390 --> 00:52:49,390 lives. 1372 00:52:49,860 --> 00:52:52,079 We just thought we'd shoot one back. 1373 00:52:52,080 --> 00:52:54,599 Balboa is, we think, the equivalent 1374 00:52:54,600 --> 00:52:57,269 BART stop in San Francisco. 1375 00:52:57,270 --> 00:52:59,489 So whenever you're passing through Balboa 1376 00:52:59,490 --> 00:53:01,919 Park Station, a little nod, 1377 00:53:01,920 --> 00:53:02,939 think about it. 1378 00:53:04,590 --> 00:53:06,899 OK, first question from here is 1379 00:53:06,900 --> 00:53:08,969 yes, about a bit of question 1380 00:53:08,970 --> 00:53:09,989 about the numbers. 1381 00:53:09,990 --> 00:53:12,209 And earlier, like you said, Novena 1382 00:53:12,210 --> 00:53:14,339 features an ADC, but two times five 1383 00:53:14,340 --> 00:53:16,329 Nomikos a bit. 1384 00:53:16,330 --> 00:53:18,239 So there's one gigabyte per second. 1385 00:53:18,240 --> 00:53:19,829 How much of this bandwidth is available 1386 00:53:19,830 --> 00:53:21,809 over the Balboa interface between FPGA 1387 00:53:21,810 --> 00:53:22,889 and ARM? 1388 00:53:22,890 --> 00:53:24,989 The question is how much bandwidth 1389 00:53:24,990 --> 00:53:26,429 is available between the FPGA 1390 00:53:26,430 --> 00:53:27,900 and the ARM or 1391 00:53:29,160 --> 00:53:31,409 between the... between the FPGA 1392 00:53:31,410 --> 00:53:33,210 and the... the 1393 00:53:34,800 --> 00:53:36,519 the FPGA suffers. 1394 00:53:36,520 --> 00:53:37,799 I'm not an expert in this area, but I can 1395 00:53:37,800 --> 00:53:38,999 address the question.
1396 00:53:39,000 --> 00:53:41,039 The question is we have a lot of samples 1397 00:53:41,040 --> 00:53:43,259 coming in to the ADC, 1398 00:53:43,260 --> 00:53:45,420 the analog to digital converter, 1399 00:53:46,680 --> 00:53:48,659 and only a smaller amount of bandwidth 1400 00:53:48,660 --> 00:53:50,399 available to the CPU. 1401 00:53:50,400 --> 00:53:52,739 What you end up doing in a case 1402 00:53:52,740 --> 00:53:55,619 where you're doing something like that is 1403 00:53:55,620 --> 00:53:57,029 doing some kind of downsampling or 1404 00:53:57,030 --> 00:53:59,249 processing in the FPGA 1405 00:53:59,250 --> 00:54:01,439 to reduce the samples 1406 00:54:01,440 --> 00:54:03,239 to a rate that the CPU can 1407 00:54:03,240 --> 00:54:04,439 accept. 1408 00:54:04,440 --> 00:54:06,719 And there's a lot of work. 1409 00:54:06,720 --> 00:54:08,759 That's one of the primary things that 1410 00:54:08,760 --> 00:54:11,279 software defined radios do. 1411 00:54:11,280 --> 00:54:13,659 So for that kind of thing, SDR experts are 1412 00:54:13,660 --> 00:54:15,239 the people to talk to. OK, you can 1413 00:54:15,240 --> 00:54:16,979 you can build a digital converter and a 1414 00:54:16,980 --> 00:54:19,649 number, of course, later in the FPGA and 1415 00:54:19,650 --> 00:54:21,099 build over the phone. Right, exactly. 1416 00:54:21,100 --> 00:54:22,279 Yeah. 1417 00:54:22,280 --> 00:54:24,119 OK, can I please remind everybody that 1418 00:54:24,120 --> 00:54:25,949 we're still in Q&A, so please keep the 1419 00:54:25,950 --> 00:54:28,029 volume down and please if you have to 1420 00:54:28,030 --> 00:54:30,009 leave, do it quietly. 1421 00:54:30,010 --> 00:54:31,739 Otherwise we can't hear anything. 1422 00:54:33,180 --> 00:54:34,589 These are questions from the phone from 1423 00:54:34,590 --> 00:54:35,759 just outside. 1424 00:54:35,760 --> 00:54:37,409 This is really cool and exciting.
1425 00:54:37,410 --> 00:54:39,959 What kind of applications 1426 00:54:39,960 --> 00:54:42,119 do you think would be most suited in 1427 00:54:42,120 --> 00:54:44,009 terms of, I don't know, performance, but 1428 00:54:44,010 --> 00:54:46,229 what are peak performance and how would 1429 00:54:46,230 --> 00:54:49,139 it compare to GPU acceleration 1430 00:54:49,140 --> 00:54:51,449 in terms of what you can do with it? 1431 00:54:51,450 --> 00:54:54,269 So GPU acceleration. 1432 00:54:54,270 --> 00:54:57,239 The question is GPU acceleration versus 1433 00:54:57,240 --> 00:54:58,349 FPGA acceleration. 1434 00:54:58,350 --> 00:55:00,599 What applications are well suited 1435 00:55:00,600 --> 00:55:01,600 for 1436 00:55:03,320 --> 00:55:05,699 FPGAs? 1437 00:55:05,700 --> 00:55:07,799 The one thing that 1438 00:55:07,800 --> 00:55:09,689 springs immediately to mind is anything 1439 00:55:09,690 --> 00:55:11,579 where the timing matters a lot. 1440 00:55:12,750 --> 00:55:16,169 So we started out with this project doing 1441 00:55:16,170 --> 00:55:18,239 AES-GCM because 1442 00:55:18,240 --> 00:55:20,819 it turns out that the GCM 1443 00:55:20,820 --> 00:55:23,009 mode of AES 1444 00:55:23,010 --> 00:55:25,769 requires hardware support 1445 00:55:25,770 --> 00:55:27,089 in order to be secure. 1446 00:55:27,090 --> 00:55:29,219 You can write software that does GCM, 1447 00:55:29,220 --> 00:55:31,469 but it will nearly inevitably 1448 00:55:31,470 --> 00:55:33,959 have a timing 1449 00:55:33,960 --> 00:55:35,819 information leak. 1450 00:55:35,820 --> 00:55:37,409 With hardware, you can avoid the timing 1451 00:55:37,410 --> 00:55:38,549 information leak. 1452 00:55:38,550 --> 00:55:41,549 So there's a case where it matters a lot.
1453 00:55:41,550 --> 00:55:43,229 Another answer would be I'm personally 1454 00:55:43,230 --> 00:55:44,879 very excited about the potential 1455 00:55:44,880 --> 00:55:47,159 additional uses of Novena as an 1456 00:55:47,160 --> 00:55:48,989 oscilloscope and just off the cuff, 1457 00:55:48,990 --> 00:55:50,849 I can imagine a project where you're 1458 00:55:50,850 --> 00:55:52,379 sampling a high speed circuit and you 1459 00:55:52,380 --> 00:55:54,989 have some action that the FPGA takes by 1460 00:55:54,990 --> 00:55:57,329 sampling that and outputting 1461 00:55:57,330 --> 00:55:59,459 any lives in an iota of white world. 1462 00:55:59,460 --> 00:56:01,109 But that would be something I would think 1463 00:56:01,110 --> 00:56:02,110 was pretty cool. 1464 00:56:02,790 --> 00:56:03,790 Next question. 1465 00:56:04,860 --> 00:56:06,749 When swapping out cores, is the state fully 1466 00:56:06,750 --> 00:56:09,569 preserved or do you have to... 1467 00:56:09,570 --> 00:56:10,679 I can't, we can't hear you. 1468 00:56:10,680 --> 00:56:11,879 Could you speak a little closer to the mic? 1469 00:56:11,880 --> 00:56:13,889 Very close to the mic. When you, when 1470 00:56:13,890 --> 00:56:15,629 you're swapping cores, is the state 1471 00:56:15,630 --> 00:56:17,279 fully preserved or do you have to 1472 00:56:17,280 --> 00:56:19,769 continue from a safe state 1473 00:56:19,770 --> 00:56:20,759 when you're switching between 1474 00:56:20,760 --> 00:56:21,779 cores? Yes. 1475 00:56:21,780 --> 00:56:23,879 When you're switching back to one used 1476 00:56:23,880 --> 00:56:25,349 before. Yeah. 1477 00:56:25,350 --> 00:56:27,269 Excellent question. Is state preserved or 1478 00:56:27,270 --> 00:56:28,829 do you restart in a new state, 1479 00:56:28,830 --> 00:56:30,359 was the question. Exactly.
1480 00:56:32,460 --> 00:56:33,460 The 1481 00:56:35,250 --> 00:56:36,899 I don't want to answer incorrectly here, 1482 00:56:36,900 --> 00:56:38,040 so I'm taking a second. 1483 00:56:39,150 --> 00:56:41,519 When switching between cores, if 1484 00:56:41,520 --> 00:56:43,679 the core has state in 1485 00:56:43,680 --> 00:56:45,779 its FPGA circuit, 1486 00:56:45,780 --> 00:56:48,239 that state needs to be preserved 1487 00:56:48,240 --> 00:56:50,939 when you swap back. Or else 1488 00:56:50,940 --> 00:56:53,399 you can preserve state 1489 00:56:53,400 --> 00:56:55,619 at a higher level 1490 00:56:55,620 --> 00:56:57,689 where the application or rather 1491 00:56:57,690 --> 00:57:00,569 the library, the Balboa library, 1492 00:57:00,570 --> 00:57:02,459 preserves the transaction before it's 1493 00:57:02,460 --> 00:57:05,009 sent to the FPGA 1494 00:57:05,010 --> 00:57:06,929 so that when it's swapped back in, you 1495 00:57:06,930 --> 00:57:09,519 can restart that calculation. 1496 00:57:09,520 --> 00:57:11,259 One good way to define the future is to 1497 00:57:11,260 --> 00:57:12,260 help us write it. 1498 00:57:13,840 --> 00:57:15,909 And thanks for giving a 1499 00:57:15,910 --> 00:57:17,799 talk on the nice project, really. 1500 00:57:17,800 --> 00:57:20,139 So although I don't a new project, 1501 00:57:20,140 --> 00:57:22,419 I'm wondering how heavily tied 1502 00:57:22,420 --> 00:57:24,610 is the Balboa architecture to the way 1503 00:57:26,670 --> 00:57:29,379 to the... how tied is 1504 00:57:29,380 --> 00:57:31,389 the Balboa project to the way that the 1505 00:57:31,390 --> 00:57:33,489 Novena has the FPGA attached to the 1506 00:57:33,490 --> 00:57:35,739 processor? How possible is it to 1507 00:57:35,740 --> 00:57:37,959 add a piece of hardware? 1508 00:57:37,960 --> 00:57:40,149 So our goal is to be able 1509 00:57:40,150 --> 00:57:41,109 to... sorry.
1510 00:57:41,110 --> 00:57:43,209 The question is how heavily tied is 1511 00:57:43,210 --> 00:57:46,389 Balboa to Novena in specific? 1512 00:57:46,390 --> 00:57:48,309 And our answer I think is that our goal 1513 00:57:48,310 --> 00:57:49,899 is to target any FPGA. 1514 00:57:49,900 --> 00:57:52,059 Ultimately, at present we're using 1515 00:57:52,060 --> 00:57:54,129 Novena as a dev work platform to get 1516 00:57:54,130 --> 00:57:55,029 started. 1517 00:57:55,030 --> 00:57:56,439 You need somewhere. 1518 00:57:56,440 --> 00:57:57,399 That's exactly right. 1519 00:57:57,400 --> 00:57:59,469 And I'll expand on it by saying that you 1520 00:57:59,470 --> 00:58:00,699 need somewhere to run the software 1521 00:58:00,700 --> 00:58:01,959 component of this. 1522 00:58:01,960 --> 00:58:04,389 That doesn't have to be an ARM CPU 1523 00:58:04,390 --> 00:58:06,309 attached to an EIM bus. 1524 00:58:06,310 --> 00:58:08,589 One of the goals of having the 1525 00:58:08,590 --> 00:58:10,869 abstraction layer is that you could have 1526 00:58:10,870 --> 00:58:11,799 different interconnects. 1527 00:58:11,800 --> 00:58:13,899 And I hope that someone will 1528 00:58:13,900 --> 00:58:15,999 make a computer I can actually afford, 1529 00:58:16,000 --> 00:58:18,309 which has an FPGA on the Hyper 1530 00:58:18,310 --> 00:58:20,499 Transport directly attached to an Opteron 1531 00:58:20,500 --> 00:58:22,539 or similar computer. 1532 00:58:22,540 --> 00:58:24,249 Cray actually makes a supercomputer like 1533 00:58:24,250 --> 00:58:25,599 this. They start at like ten million 1534 00:58:25,600 --> 00:58:27,579 dollars. In the interim...
1535 00:58:27,580 --> 00:58:29,019 An interesting project you might want to 1536 00:58:29,020 --> 00:58:31,389 look at is the Zynq FPGA 1537 00:58:31,390 --> 00:58:33,519 platform, which has a hard core 1538 00:58:33,520 --> 00:58:35,379 CPU on board for very high speed 1539 00:58:35,380 --> 00:58:37,299 throughput between the FPGA part and the 1540 00:58:37,300 --> 00:58:39,129 CPU part. And another possible starting 1541 00:58:39,130 --> 00:58:41,019 point. That's a great starting, a great 1542 00:58:41,020 --> 00:58:43,209 next step. And the last next step is 1543 00:58:43,210 --> 00:58:45,369 that I really want to see the Balboa 1544 00:58:45,370 --> 00:58:47,319 architecture or something like it running 1545 00:58:47,320 --> 00:58:49,419 on a soft CPU that's running 1546 00:58:49,420 --> 00:58:52,059 on the FPGA while reconfiguring 1547 00:58:52,060 --> 00:58:53,469 its own FPGA. 1548 00:58:53,470 --> 00:58:54,470 And if we can. 1549 00:58:55,250 --> 00:58:56,250 Yo, dawg. 1550 00:58:57,020 --> 00:58:58,020 Thank you. 1551 00:58:58,570 --> 00:59:00,529 Sorry, I have to cut it short here, 1552 00:59:00,530 --> 00:59:02,059 because we ran out of time and the next 1553 00:59:02,060 --> 00:59:03,289 talk is going to be really crowded. 1554 00:59:03,290 --> 00:59:04,999 We want all of your questions will be 1555 00:59:05,000 --> 00:59:05,999 outside. All right. 1556 00:59:06,000 --> 00:59:08,029 So all people that are having some 1557 00:59:08,030 --> 00:59:09,949 questions, please go to my Web site and 1558 00:59:09,950 --> 00:59:11,599 again, asking for 1559 00:59:11,600 --> 00:59:12,619 applause. Thank you very much.
1560 00:59:15,950 --> 00:59:18,049 One last request: could someone 1561 00:59:18,050 --> 00:59:20,569 please make a free 1562 00:59:20,570 --> 00:59:22,369 Twitter replacement so I can stop using 1563 00:59:22,370 --> 00:59:24,289 Twitter as my primary means of talking to 1564 00:59:24,290 --> 00:59:25,290 people about stuff like this?