[Music] Herald: Has anyone in here ever worked with libusb or PyUSB? Hands up. Okay. Who also thinks USB is a pain? laughs Okay. Sergey and Alexander were here back at 26C3, which is a long time ago, I think back in Berlin, and back then they presented their first homemade, or not quite homemade, SDR, software-defined radio. This year they are back again and they want to show us how they implemented another one, using an FPGA, and to communicate with it they used PCI Express. So if you thought USB was a pain, let's see what they can tell us about PCI Express. A warm round of applause for Alexander and Sergey for building a high-throughput, low-latency, PCIe-based software-defined radio. [Applause]

Alexander Chemeris: Hi everyone, good morning, and welcome to the first day of the Congress. Just a little bit of background about what we've done previously and why we are doing what we are doing right now. We started working with software-defined radios and, by the way, who knows what software-defined radio is? Okay, perfect. laughs And who has actually used a software-defined radio? RTL-SDR or...? Okay, fewer people, but that's still quite a lot. Okay, good. I wonder whether anyone here has used more expensive radios like USRPs? Fewer people, but okay, good. Cool. Before 2008 I had no idea what software-defined radio was; I was a voice-over-IP software person, etc. Then in 2008 I heard about OpenBTS, got introduced to software-defined radio, and I wanted to make it really work, and that's what led us to today. In 2009 we developed the ClockTamer, a piece of hardware which allowed the USRP1 to run GSM without problems; anyone who has ever tried doing this without a good clock source knows what I'm talking about. We presented this - it wasn't an SDR, it was just a clock source - in 2009 at 26C3. Then I realized that using the USRP1 is not really a good idea, because we wanted to build robust, industrial-grade base stations. So we started developing our own software-defined radio, which we call UmTRX; we started this in 2011. Our first base stations with it were deployed in 2013, but I always wanted to have something really small and really inexpensive, and back then it wasn't possible. My original idea in 2011 was to build a PCI Express card - sorry, not a PCI Express card, but a mini PCI card. If you remember, there were all these Wi-Fi cards in the mini PCI form factor, and I thought it would be really cool to have an SDR in mini PCI, so I could plug it into my laptop or into some embedded PC and have a nice piece of SDR equipment. But back then it just wasn't really possible, because the electronics were bigger and more power hungry and it just didn't work that way, so we designed UmTRX to work over gigabit Ethernet and it was about that size. So now we have spent this year designing something which really brings me back to what I wanted all those years ago. The XTRX is a mini PCI Express card - again, there was no PCI Express back then, so now it's mini PCI Express, which is even smaller than mini PCI - and it's built to be embedded friendly, so you can plug it into an embedded single-board computer. If you have a laptop with a mini PCI Express slot you can plug it into your laptop, and you have a really small piece of software-defined radio equipment.
And we really want to make it inexpensive. That's why I was asking how many of you have ever worked with RTL-SDR and how many of you have ever worked with USRPs, because the gap between them is pretty big and we really want to bring software-defined radio to the masses. It definitely won't be as cheap as an RTL-SDR, but we are trying to make it as close as possible. So at the size of an RTL-SDR, at a price that is higher but hopefully affordable to pretty much everyone, we really want to bring high performance into your hands. And by high performance I mean this is a full transmit/receive device with two transmit channels and two receive channels, which is usually called 2x2 MIMO in the radio world. The goal was to bring it to 160 megasamples per second, which roughly gives you 120 MHz of usable radio spectrum. What we were able to achieve: again, this is the mini PCI Express form factor; it has a small Artix-7, the smallest and most inexpensive FPGA that is able to work with PCI Express. It has the LMS7002M chip as the RFIC, a very high-performance, very highly integrated chip with even DSP blocks inside. It even has a GPS chip - on the upper right side you can see it - so you can actually synchronize your SDR to GPS for perfect clock stability, so you won't have any problems running telecommunication systems like GSM, 3G or 4G due to clock problems. It also has an interface for SIM cards, so you can actually create a software-defined radio modem and run other open-source projects to build your own 4G/LTE modem, like srsUE, etc. So it is really, really tightly packed.

And if you put this into perspective: that's how it all started in 2006 and that's what you have ten years later. It's pretty impressive. applause Thanks. But I think it actually applies to the whole industry that is working on shrinking the sizes, because we just put stuff on the PCB, you know; we're not building the silicon itself. An interesting thing is that in our first approach we said: let's pack everything, let's do a very tight PCB design. We did an eight-layer PCB design, and when we sent it to a fab to estimate the cost, it turned out to be $15,000 US per piece - in small volumes obviously, but still a little bit too much. So we had to redesign it, and the first thing we did is we still kept eight layers, because in our experience the number of layers nowadays has only a minimal impact on the cost of the device; between six and eight layers the price difference is not so big. But we did a complete rerouting and only kept 2-deep microvias and never used buried vias. This makes it much easier and much faster for the fab to manufacture, and the price suddenly went down five or six times, and in volume it will again be significantly cheaper. And, just as geek porn, that's how the PCB looks inside.

So now let's get into the real stuff. PCI Express: why did we choose PCI Express? As was said, USB is a pain in the ass. You can't really use USB in industrial systems; for a whole variety of reasons it's just unstable. So we used Ethernet for many years successfully, but Ethernet has problems: first of all, inexpensive Ethernet is only one gigabit, and one gigabit does not offer enough bandwidth to carry all the data we want, plus it's power-hungry, etc. So PCI Express is really a good choice, because it's low power, it has low latency, it has very high bandwidth, and it's available almost universally.
When we started looking into this, we realized that even some ARM boards have PCI Express or mini PCI Express slots, which was a big surprise to me, for example. The problem is that, unlike USB, you do need to write your own kernel driver for this, and there's no way around it. And it is really hard to write such a driver universally, so we are obviously writing it for Linux, because we're working with embedded systems, but if we want to port it to Windows or macOS we'll have to do a lot of rewriting. So we focus on Linux only right now. And now the hardest part: debugging is really non-trivial. One small error and your PC completely hangs because you used something wrong, and you have to reboot and restart it. It's like debugging a kernel, but sometimes even harder. To make it worse, there is no really easy-to-use plug-and-play interface. Normally, when you develop a PCI Express card and want to restart it, you have to restart your development machine. Again, not a nice way to work; it's really hard.

So the first thing we did is we found that we can use Thunderbolt 3, which was just recently released and has the ability to work directly with the PCI Express bus. It basically has a mode in which it turns PCI Express into a plug-and-play interface. So if you have a laptop which supports Thunderbolt 3, you can use this to plug and unplug your device, which makes development easier. There are always problems, though: there's no easy way, there's no documentation, and Thunderbolt 3 is not compatible with Thunderbolt 2. So we had to buy a special laptop with Thunderbolt 3, with special cables, all this hard stuff. And if you really want to get the documentation, you have to sign an NDA and send them a business plan so they can approve that your business makes sense. laughter I mean... laughs So we actually opted out; we decided not to go through this. What we did instead is we found that someone is actually making PCI Express to Thunderbolt 3 converters and selling them as dev boards, and that was a big relief because it saved us lots of time and money. You just order it from some Asian company. And this is how the converter looks. You buy several pieces, you can plug your PCI Express card in there, and you plug this into your laptop. And this is it with the XTRX already plugged in.

Now the only problem we found is that UEFI typically has a security control enabled, so that a random Thunderbolt device can't hijack your PCI bus, get access to your kernel memory and do bad stuff. Which is a good idea - the only problem is that it's not fully implemented in Linux. Under Windows, if you plug in a device which has no security features, which is not certified, it will politely ask you: "Do you really trust this device? Do you want to use it?" and you can say yes. Under Linux it just does not work. laughs So we spent some time trying to figure out how to get around this. There are some patches from Intel which are not mainline, and we were not able to actually get them to work. So we just had to disable all these security measures in the laptop. Be aware that this is the case, and we suspect that happy users of Apple might not be able to do this, because Apple machines don't have a BIOS setup, so you probably can't disable this feature.
So that's probably a good incentive for someone to actually finish writing the driver. Now to the goal: we want to achieve 160 megasamples per second, 2x2 MIMO, which means two transmit and two receive channels at 12 bits, which is roughly 7.5 Gbit/s (160 MSPS × 2 channels × 24 bits per complex I/Q sample ≈ 7.7 Gbit/s). So, the first result: when we got this board back from the fab, it didn't work. Sergey Kostanbaev mumbles: as expected. Alexander Chemeris: Yes, as expected. The first interesting thing we realized is that the FPGA has hardware blocks for talking to PCI Express, called GTP transceivers, which basically implement the PCI Express serial physical layer; but the thing is, the lane numbering is reversed between PCI Express and the FPGA, and we did not realize this, so we had to do very, very fine soldering to actually swap the laughs swap the lanes. You can see this very fine work there. We also found that one of the components was a "dead bug", which is a well-known term for chips whose pinout was accidentally mirrored at the design stage, so we had to solder it upside down, and if you realize how small it is, you can also appreciate the work done. And what's funny: when I was looking at dead bugs, I actually found a manual from NASA which describes how to properly solder dead bugs so that it gets approved. audience laughs This is the link; I think you can go there and enjoy it, there's also fun stuff there. So after fixing all of this, our next attempt kind of worked.

The next stage is debugging the FPGA code, which has to talk to PCI Express, and PCI Express has to talk to the Linux kernel, and the kernel has to talk to the driver, and the driver has to talk to user space. Peripherals are easy: the UART and SPI we got working almost immediately, no problems with that, but DMA was a real beast. We spent a lot of time trying to get DMA to work, and the problem is that DMA is on the FPGA, so you can't just place a breakpoint like you do in C or C++ or other languages; it's real-time hardware running on the fabric. So Sergey, who was mainly developing this, had to write a lot of small test benches and test everything piece by piece. All parts of the DMA code we had were wrapped into small test benches which emulated all the tricks, and, as the classics predicted, it took about five to ten times longer than actually writing the code. So we really blew past our predicted timelines by doing this, but in the end we got really stable operation. Some suggestions for anyone who will try to repeat this exercise: there is a logic analyzer built into the Xilinx tools that you can use; it's nice and sometimes very helpful, but you can't debug transient bugs which only come out under some weird conditions. So you have to implement some read-back registers which show important statistics about how your system behaves; in our case it's various counters on the DMA interface. That way you can actually see what's happening with your data: is it received? Is it sent? How much was sent and how much was received?
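As an illustration of this read-back-register idea, here is a minimal sketch (not the actual XTRX code: the PCI bus address, BAR layout and counter offsets are invented) of how user space can map the device's BAR through sysfs and print such counters while a transfer is running:

```c
/* Sketch: dump hypothetical DMA debug counters from a PCI BAR.
 * Bus address and register offsets are invented for illustration. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* BAR0, exposed by the kernel as a mappable sysfs file (needs root) */
    int fd = open("/sys/bus/pci/devices/0000:01:00.0/resource0", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    volatile uint32_t *regs = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
    if (regs == MAP_FAILED) { perror("mmap"); return 1; }

    /* Hypothetical counters incremented by the FPGA on every DMA event */
    printf("RX buffers filled:   %u\n", regs[0x40 / 4]);
    printf("TX buffers consumed: %u\n", regs[0x44 / 4]);
    printf("TX underruns:        %u\n", regs[0x48 / 4]);

    munmap((void *)regs, 4096);
    close(fd);
    return 0;
}
```

The same counters could just as well be exposed through the driver; the point is simply to have something you can read at any moment without a logic analyzer.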
So, for example, we can see when we saturate the bus, or when there is actually an underrun, meaning the host is not providing data fast enough, so we can at least understand whether it's a host problem or an FPGA problem, and which part we debug next. Because again, it's a very multi-layer problem: you start with the FPGA, PCI Express, the kernel, the driver, user space, and any part can fail, so you can't work blind on something like this. So again, the goal was to get 160 MSPS; with the first implementation we could do 2 MSPS, roughly 60 times slower. The problem was that the software just wasn't keeping up and wasn't sending data fast enough. Many things were done, but the most important parts are: use real-time priority if you want very stable results, and, well, fix software bugs. One of the most important bugs we had was that DMA buffers were not freed immediately, so they were busy for longer than they should have been, which introduced extra cycles and basically just reduced the bandwidth.

At this point, let's talk a little bit about how to implement a high-performance driver for Linux, because if you want to get real performance you have to start with the right design. There are basically two extreme approaches and a whole spectrum in between, out of which you can pick a third. The first approach is full kernel control, in which case the kernel driver not only does the transfers, it actually has all the logic for controlling your device and exports ioctls to user space; that's the traditional way of writing drivers. Your user space is completely abstracted from all the details. The problem is that this is probably the slowest way to do it. The second is what's called a "zero-copy interface": only the control is handled in the kernel, and the raw data is provided to user space as-is. You avoid the memory copy, which makes it faster, but it is still not fast enough if you really want to achieve maximum performance, because you still have context switches between the kernel and user space. The fastest approach possible is a full user-space implementation, where the kernel just exposes everything and says "now you do it yourself"; you have almost no context switches and you can really optimize everything. So what are the problems with this? The pros I already mentioned: no switches between kernel and user space, very low latency because of this, and very high bandwidth. But if you are not interested in getting the maximum performance and you just want low-bandwidth operation, then you will have to add hacks, because you can't get notifications from the kernel that resources or more data are available. It also makes the system vulnerable, because if user space can access everything, it can do whatever it wants.

One more important decision about getting the best performance out of the bus is whether you want to poll your device, or not poll and instead get notified. What is polling? I guess everyone who is a programmer understands it: polling is when you ask repeatedly, "Are you ready?", "Are you ready?", "Are you ready?", and when it's ready you get the data immediately. It's basically a busy loop where you just constantly ask the device what's happening.
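Here is a minimal user-space sketch of the two extremes just described, busy polling versus blocking in the kernel, written against a hypothetical /dev/xtrx0 character device; none of this is the actual XTRX API:

```c
/* Sketch: the two ways of waiting for data discussed above. */
#include <poll.h>
#include <stdint.h>

/* Style 1: busy polling - lowest latency, but it burns a whole CPU core.
 * 'ready_flag' stands for some device-visible "buffer ready" word,
 * e.g. a mapped register or a DMA descriptor flag (hypothetical). */
static void wait_busy(const volatile uint32_t *ready_flag)
{
    while (*ready_flag == 0)
        ;   /* spin until the device marks a buffer as ready */
}

/* Style 2: block in the kernel - almost no CPU cost, but every wakeup
 * adds a context switch and therefore latency. */
static int wait_kernel(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    return poll(&pfd, 1, 1000);   /* sleep until the driver reports data, or 1 s */
}
```

The combined design described next simply offers both: an application can spin on the mapped buffers for maximum throughput, or sleep in poll() when it only needs a trickle of data.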
You need to dedicate a full core - and thank God we have multi-core CPUs nowadays - so you can dedicate a full core to this polling and just poll constantly. But again, if you don't need the highest performance and just need to get something, then you will be wasting a lot of CPU resources. In the end we decided on a combined architecture, where it is possible to poll, but there is also a way to get notifications from the kernel for applications which only need low bandwidth but also want to save CPU. Which I think is the best way if you are trying to target both worlds.

Very quickly, the architecture of the system. We tried to make it very portable and flexible. There is a kernel driver, which talks to a low-level library which implements all the logic we took out of the driver: controlling PCI Express, working with DMA, hiding all the details of the actual bus implementation. And then there is a high-level library which talks to this low-level library and also to libraries which implement control of the actual peripherals, and most importantly to the library which implements control over our RFIC chip. This way it's very modular: we can replace PCI Express with something else later, and we might be able to port it to other operating systems; that's the goal. Another interesting issue: when you start writing a Linux kernel driver, you very quickly realize that while LDD, the classic book on Linux driver writing, is good and will give you good insight, it's not actually up to date. It's more than ten years old and there are lots of new interfaces which are not described there, so you have to resort to reading the manuals and the documentation in the kernel itself. Well, at least you get up-to-date information there. The decision we made is to make everything easy. We expose the GPS as a TTY, so you can attach pretty much any application which talks to GPS; all existing applications can just work out of the box. We also wanted to be able to synchronize the system clock to GPS, so we get automatic clock synchronization across multiple systems, which is very important when we are deploying many, many devices around the world. We plan to do two interfaces: one is kernel PPS and the other is DCD, because the DCD line of the UART is exposed over the TTY. Again, we found that there are two types of applications - some support one API, others support the other - and there is no common thing, so we have to support both. As described, we support poll, so we can get notifications from the kernel when data is available and we don't need to do real busy-looping all the time.

After all the software optimizations we got to about 10 MSPS: still very, very far from what we want to achieve. Now there should have been a lot of explanation about PCI Express here, but when we actually wrote down everything we wanted to say, we realized it's a full two-hour talk just on PCI Express. So we are not going to give it here; I'll just give some of the most interesting highlights. If there is real interest, we can set up a workshop on one of the later days and talk in more detail about PCI Express specifically. The thing is, there are no open-source cores for PCI Express which are optimized for high-performance, real-time applications. There is Xillybus, which as I understand is going to be open source, but they only provide you the source if you pay them.
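Going back for a moment to the GPS time sync over the UART mentioned above: as a small illustration, a user-space program can wait for the pulse-per-second edge on the UART's carrier-detect line with the standard Linux TIOCMIWAIT ioctl. The device name here is made up, and real tools such as gpsd do essentially this (or use the kernel PPS API) with better timestamping:

```c
/* Sketch: timestamp PPS edges arriving on the UART DCD line. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/ttyXTRX0", O_RDONLY | O_NOCTTY);  /* hypothetical TTY */
    if (fd < 0) { perror("open"); return 1; }

    for (;;) {
        /* Block until the carrier-detect (DCD) modem line changes state */
        if (ioctl(fd, TIOCMIWAIT, TIOCM_CD) < 0) { perror("TIOCMIWAIT"); break; }

        struct timeval tv;
        gettimeofday(&tv, NULL);   /* timestamp the edge as soon as we wake up */
        printf("PPS edge at %ld.%06ld\n", (long)tv.tv_sec, (long)tv.tv_usec);
    }
    close(fd);
    return 0;
}
```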
Xillybus is very popular because it's very, very easy to use, but it doesn't give you performance. If I remember correctly, the best it can do is maybe 50 percent bus saturation. There's also the Xilinx implementation, but if you use the Xilinx implementation with the AXI bus, then you're really locked into AXI and into Xilinx. And it's also not very efficient in terms of resources, and if you remember, we want to make this very, very inexpensive. So our goal is to be able to fit everything into the smallest Artix-7 FPGA, and that's quite challenging with all the stuff in there; we just can't waste resources. So the decision was to write our own PCI Express implementation. That's how it looks; I'm not going to discuss it right now. There were several iterations: initially it looked much simpler, which turned out not to work well.

Some interesting stuff about PCI Express which we stumbled upon: it was working really well on Atom, which is our main development platform because we do a lot of embedded work. It worked really well. When we tried to plug this into a Core i7, it just started hanging once in a while. After several days of debugging, Sergey found a very interesting statement in the standard which says that a value of zero in the byte count field actually stands not for zero bytes but for 4096 bytes. I mean, that's a really cool optimization. Another thing is completions, which is the PCI Express term for an acknowledgment that can also carry data back for your request. If you don't send a completion, the device just hangs. And what happens in this case, due to some historical heritage of x86, is that reads just start returning you FFs. And if you have a register which says "Is your device okay?", and this register reads one to say "the device is okay", guess what will happen? You will always be reading that your device is okay. So the suggestion is not to use one as the "okay" status; use either zero or, better, a specific two-bit pattern, so you are definitely sure that you are okay and not just getting FFs.

So when you have a device which, again, may fail at any of the layers - you just got this new board - it's really hard to debug because of memory corruption. We had a software bug which was writing DMA addresses incorrectly, and we were wondering why we were not getting any data in our buffers, while at the same time, after several starts, the operating system would just crash. Well, that's the reason why there is this UEFI protection which prevents you from plugging devices like this into your computer: it was basically writing random data into random portions of your memory. A lot of debugging, a lot of tests and test benches, and we were able to find this. Another thing: if you deinitialize your driver incorrectly, and that's what happens when you have a plug-and-play device which you can plug and unplug, then you may end up in a situation where you are trying to write into memory which has already been freed by the operating system and used for something else. A very well-known problem, but it also happens here.

The reason DMA is really hard is this completion architecture for reading data. Writes are easy: you just send the data and forget about it; it's a fire-and-forget system. But for reads you really need to get your data back. And the thing is, it looks like this. You would really hope that there would be some pointing device here.
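A quick aside, sketching the status-register advice from a moment ago before the completion diagram continues; the register offset and the "okay" pattern here are invented for illustration, not the actual XTRX registers:

```c
/* Sketch: never treat an all-ones read-back as "device okay". */
#include <stdbool.h>
#include <stdint.h>

#define REG_STATUS        0x00u
#define STATUS_OK_PATTERN 0x2u   /* deliberately neither 0x0 nor all-ones */

static bool device_alive(const volatile uint32_t *bar)
{
    uint32_t v = bar[REG_STATUS / 4];

    /* A hung or vanished PCIe device typically reads back as 0xFFFFFFFF,
     * so all-ones (and all-zeros) must mean "dead", never "okay". */
    if (v == 0xFFFFFFFFu || v == 0x0u)
        return false;

    return (v & 0x3u) == STATUS_OK_PATTERN;
}
```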
On that diagram, basically, on the top left you can see the read requests, and on the right you can see the completion transactions. Each request can be, and most likely will be, split into multiple completions, so first of all you have to collect all these pieces and write them into the proper parts of memory. But that's not all. The latency between a request and its completion is really high, something like 50 cycles, so if you only ever have a single transaction in flight you will get really bad performance; you need to have multiple transactions in flight. And the worst thing is that transactions can return data in random order. So it's a much more complicated state machine than we expected originally. When I said the architecture was much simpler originally: it didn't have all of this, and we had to realize it while implementing. Again, here there was a whole description of how exactly this works, but not this time. So after all these optimizations we got 20 megasamples per second, which is just six times lower than what we are aiming at.

Now the next thing is PCI Express lane scalability. PCI Express is a serial bus with multiple lanes, and the lanes let you scale your bandwidth horizontally: one lane gives you x, two lanes 2x, four lanes 4x. So the more lanes you have, the more bandwidth you get out of your bus - bandwidth, not performance. The issue is that the mini PCI Express standard only standardizes one lane; the second lane is left as optional, so most motherboards don't support it - there are some, but not all of them. And we really wanted to get this done, so we designed a special converter board which lets you plug your mini PCI Express card into a full-size PCI Express slot and get two lanes working. We're also planning a similar board with multiple slots, so you will be able to put multiple XTRX SDRs onto the same carrier board, plug that into, say, a PCI Express x16 slot, and get really a lot of IQ data, which will then be your problem to process. So with two lanes it's about twice the performance, and we are getting fifty megasamples per second.

And that's the time to really cut the fat, because the real sample size of the LMS7 is 12 bits and we were transmitting 16, because it's easier - the CPU works on 8, 16, 32. We originally designed the driver to support 8-bit, 12-bit and 16-bit formats to be able to do this scaling, and for a test we said: okay, let's go from 16 to 8 bits. We'll lose some dynamic range, but who cares these days. It still stayed the same, still 50 megasamples per second, no matter what we did. There was a lot of interesting debugging going on, and we realized that we had actually made another - not really a mistake, we just didn't know this when we designed it - but we should have used a higher voltage for this high-speed bus to get it to full performance. At 1.8 volts the signal was just degrading too fast and the bus itself was not performing well. So our next prototype will use a higher voltage specifically for this bus. This is the kind of thing which makes designing hardware for high speed really hard, because you have to care about the integrity of the parallel buses in your system. At the same time we do want to keep 1.8 volts for everything else as much as possible, because of the power budget.
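To make the sample-width trade-off from a moment ago concrete, here is a sketch of going from 16-bit wire samples down to 8 bits, and of packing two 12-bit samples into three bytes; the layouts are illustrative and not necessarily the XTRX wire format:

```c
/* Sketch: sample-width conversions for a 12-bit ADC/DAC. */
#include <stddef.h>
#include <stdint.h>

/* 16-bit words -> 8-bit: keep the top 8 of the 12 significant bits,
 * halving the bus traffic at the cost of dynamic range. */
static void iq16_to_iq8(const int16_t *in, int8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int8_t)(in[i] >> 4);
}

/* Two 12-bit samples packed into 3 bytes: 25% less traffic than
 * sending them as 16-bit words, with no loss of precision. */
static void iq12_pack(const int16_t *in, uint8_t *out, size_t n_pairs)
{
    for (size_t i = 0; i < n_pairs; i++) {
        uint16_t a = (uint16_t)in[2 * i]     & 0x0FFFu;
        uint16_t b = (uint16_t)in[2 * i + 1] & 0x0FFFu;
        out[3 * i]     = (uint8_t)(a & 0xFFu);
        out[3 * i + 1] = (uint8_t)((a >> 8) | ((b & 0x0Fu) << 4));
        out[3 * i + 2] = (uint8_t)(b >> 4);
    }
}
```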
Another problem we are facing with this device is that, by the standard, mini PCI Express allows only... Sergey Kostanbaev: ...2.5... Alexander Chemeris: ...2.5 watts of power consumption, no more. We were very lucky that the LMS7 has such good power consumption; we actually had some headroom left for the FPGA and the GPS and all this stuff. But we just can't let the power consumption go up. Our measurements on this device showed about... Sergey Kostanbaev: ...2.3... Alexander Chemeris: ...2.3 watts of power consumption, so we are at the limit at this point. So when we fix the bus with the higher voltage - and this is a theoretical exercise, because we haven't done it yet; it's planned to happen in a couple of months - we should be able to get to these numbers, which are just 1.2 times slower. Then the next thing will be to fix another issue we created at the very beginning: we procured the wrong chip. Just one digit of difference - you can see it highlighted in red and green - and this chip only supports generation 1 PCI Express, which is twice as slow as generation 2. So again, hopefully, we'll replace the chip and get a very simple doubling of the performance.

Still, it will be slower than we wanted it to be, and here is where practical versus theoretical numbers come in. Like every bus, it has overheads, and one of the things we realized while implementing this is that even though the standard allows a maximum payload size of 4 kB, actual implementations differ. For example, desktop CPUs like Intel Core or Intel Atom only support a 128-byte payload, so there is much more per-packet overhead going over the bus, and even theoretically you can only achieve 87% efficiency. On the Xeon we tested, we found that it uses a 256-byte payload size, which can give you about 92% efficiency on the bus - and this is before the other overheads, so reality is even worse. An interesting thing which we also did not expect: we originally developed on Intel Atom and everything worked great. When we plugged this into a laptop with a Core i7, a multi-core, really powerful device, we didn't expect that it wouldn't work. Obviously a Core i7 should work better than an Atom? No, not always. The thing is, we were plugging it into a laptop which had a built-in video card sitting on the same PCI bus, and the manufacturer probably hard-coded a higher priority for the video card than for everything else in the system, because they don't want your screen to flicker. So when you move a window, you actually see late packets arriving at your PCIe device. We had to introduce a jitter buffer and add more FIFO to the device to smooth it out. On the other hand, the Xeon performs really well; it's very optimized. That said, we tested it with a discrete graphics card and it outperforms everything by a whopping five to seven percent. That's what you get for the price.

So this is actually the end of the presentation. We still have not scheduled any workshop, but if there is any interest in actually seeing the device working, or if you are interested in learning more about PCI Express in detail, let us know and we'll schedule something in the next few days. That's the end; I think we can proceed with questions if there are any. Applause

Herald: Okay, thank you very much. If you are leaving now, please try to leave quietly, because we might have some questions and you want to hear them.
If you have questions, please line up right behind the microphones, and I think we'll just wait, because we don't have anything from the signal angel. However, if you are watching the stream, you can hop into the channels and onto social media to ask questions, and they will be answered, hopefully. So, that microphone.

Question 1: What's the minimum and maximum frequency of the card?

Alexander Chemeris: You mean RF frequency?

Question 1: No, the minimum frequency you can sample at. Most SDR devices can only sample at over 50 MHz. Is there a similar limitation on your card?

Alexander Chemeris: Yeah, so if you're talking about RF frequency, it can go from almost zero - even though it works worse below 50 MHz - all the way to 3.8 GHz, if I remember correctly. And in terms of sample rate, right now it works from about 2 MSPS to about 50. But again, we're planning to get it to the numbers we quoted.

Herald: Okay. The microphone over there.

Question 2: Thanks for your talk. Did you manage to get your Linux kernel driver into the mainline?

Alexander Chemeris: No, not yet. I mean, it's not even fully published. Sorry, I did not say this in the beginning: we have only just manufactured the first prototype, which we debugged heavily. We are only now planning to manufacture the second prototype with all these fixes, and then we will release the kernel driver and everything. And maybe we'll try mainlining it, or maybe we won't - we haven't decided yet.

Question 2: Thanks.

Herald: Okay...

Alexander Chemeris: ...and that will be a whole other experience.

Herald: Okay, over there.

Question 3: Hey, it looks like you went through some incredible amounts of pain to make this work. So I was wondering, aren't there any simulators, at least for parts of the system, the PCIe bus, or the DMA - any simulator so that you can first design the system there and debug it more easily?

Sergey Kostanbaev: Yes, there are simulators available, but the problem is they are all non-free, so you have to pay for them. So yeah, we chose the hard way.

Question 3: Okay, thanks.

Herald: We have a question from the signal angel.

Question 4: Yeah, are the FPGA code, the Linux driver and library code, and the design project files public, and if so, did you post them yet? They can't find them on xtrx.io.

Alexander Chemeris: Yeah, they're not published yet. As I said, we haven't released them. The drivers and libraries will definitely be available; the FPGA code we are considering, it will probably also be available as open source. But we will publish them together with the public announcement of the device.

Herald: Okay, that microphone.

Question 5: Yes. Did you guys see any signal integrity issues on the PCI bus, or on the bus to the LMS chip - the Lime Micro chip, I think - that is doing the RF?

AC: Right.

Question 5: Did you try to measure signal integrity issues, or... because there were some reliability issues, right?

AC: Yeah, we actually... so, with PCI we never had issues, if I remember correctly.

SK: No.

AC: It was just working.

SK: Well, the board is so small, and when the traces are that short there's no problem with signal integrity. So that actually saved us.

AC: Yeah. Designing a small board is easier. With the LMS7, the problem is not signal integrity in terms of differences in trace length, but rather the fact that the signal degrades in voltage as the speed goes up, drops below the detection level, and all this stuff.
We did some measurements. I actually wanted to add some pictures here, but decided that it was not going to be super interesting.

H: Okay. Microphone over there.

Question 6: Yes, thanks for the talk. How much work would it be to convert the two-by-two SDR into an 8-input logic analyzer, in terms of hardware and software? So that you have a really fast logic analyzer with which you can record unlimited traces?

AC: A logic analyzer...

Q6: So basically it's also just an analog-to-digital converter, and you mostly want fast sampling and a large amount of memory to store the traces.

AC: Well, I just think it's not the best use for it. It's probably... I don't know. Maybe Sergey has ideas, but I think it may just be easier to get a high-speed ADC and replace the Lime chip with it to get what you want, because the Lime chip has so many things in there specifically for RF.

SK: Yeah, the main problem is that you cannot just sample the original data. It gets shifted in frequency, so you cannot sample the original signal, and using it for something other than spectrum analysis is hard.

Q6: OK, thanks.

H: OK. Another question from the internet.

Signal angel: Yes. Have you compared the sample rate of the ADC of the Lime chip to the USRP ADCs, and if so, how does the lower sample rate affect the performance?

AC: So, comparing a low sample rate to a higher sample rate. We haven't done much testing on the RF performance yet, because we were so busy with all this other stuff, so we have yet to see how the lower sample rate compares with a higher one. Well, a high sample rate always gives you better performance, but you also get higher power consumption, so I guess it's a question of what's more important for you.

H: Okay. Over there.

Question 7: I've gathered there is no mixer bypass, so you can't directly sample the signal. Is there a way to use the same antenna for send and receive yet?

AC: Actually, there is... an input for the ADC.

SK: But it's not a bypass, it's a dedicated pin on the LMS chip, and since we're very space-constrained, we didn't route it, so you cannot actually bypass it.

AC: Okay, so in our specific hardware, no; but in general, in the LMS chip there is a special pin which allows you to drive your signal directly to the ADC without all the mixers, filters, all this radio stuff - just directly to the ADC. So, yes, theoretically that's possible.

SK: We even thought about this, but it doesn't fit this design.

Q7: Okay. And can I share antennas? Because I have an existing laptop with existing antennas, but I would want to use the same antenna to send and receive.

AC: Yeah, so, I mean, that depends on what exactly you want to do. If you want a TDD system, then yes; if you want an FDD system, then you will have to put a small duplexer in there, but yeah, that's the idea. So you can plug this into your laptop and use your existing antennas. That's one of the ideas for how to use XTRX.

Q7: Yeah, because there are all four connectors.

AC: Yeah. One thing which I actually forgot to mention - I kind of mentioned it in the slides - is that other SDRs which are based on Ethernet or USB can't work with CSMA wireless systems, and the most famous CSMA system is Wi-Fi. It turns out that because of the latency between your operating system and your radio over USB, you just can't react fast enough for Wi-Fi to work, because - you probably know this - in Wi-Fi you do carrier sensing, and if you sense that the spectrum is free, you start transmitting.
That doesn't make sense when you have huge latency, because all you know is that the spectrum was free back then. So with XTRX you can actually work with CSMA systems like Wi-Fi, which again makes it possible to have a fully software implementation of Wi-Fi in your laptop. It obviously won't work as well as your commercial Wi-Fi, because you will have to do a lot of processing on your CPU, but for some purposes, like experimentation, for wireless labs and R&D labs, that's really valuable.

Q7: Thanks.

H: Okay. Over there.

Q8: Okay, what PCB design package did you use?

AC: Altium.

SK: Altium, yeah.

Q8: And I'd be interested in the PCIe workshop. It would be really great if you did that one.

AC: Say that again?

Q8: It would be really great if you did the PCI Express workshop.

AC: Ah, the PCI Express workshop. Okay, thank you.

H: Okay, I think we have one more question from the microphones, and that's you.

Q9: Okay. Great talk. And again, I would appreciate a PCI Express workshop, if it ever happens. What are the synchronization options between multiple cards? Can you synchronize the ADC clock, and can you synchronize the presumably digitally created IF?

SK: Yes, so... unfortunately, pure IF synchronization is not possible, because the Lime chip doesn't expose that low frequency. But we can synchronize digitally. We have a dedicated one-PPS signal for synchronization, we have lines for clock synchronization and other stuff, and we can do it in software. The Lime chip has a phase correction register, so when you measure a phase difference, you can compensate for it on the different boards.

Q9: Tune to a station a long way away and then rotate the phase until it aligns.

SK: Yeah.

Q9: Thank you.

AC: A little tricky, but possible. That's one of our plans for the future, because we do want to see 128-by-128 MIMO at home.

H: Okay, we have another question from the internet.

Signal angel: I actually have two questions. The first one is: what is the expected price after the prototype stage? And the second one is: can you tell us more about the setup you had for debugging the PCIe issues?

AC: Could you repeat the second question?

SK: It's ????????????, I think.

SA: It's more about the setup you had for debugging the PCIe issues.

SK: The second question, I think, is mostly for our next workshop, because it's a more complicated setup, so we mostly removed everything about it from the current presentation.

AC: Yeah, but in general, in terms of hardware setup, that was our setup: we bought this PCI Express to Thunderbolt 3 converter, we bought a laptop which supports Thunderbolt 3, and that's how we were debugging it. So we don't need a full-fledged PC and we don't have to restart it all the time. In terms of price, we don't have a fixed price yet. All I can say right now is that we are targeting no more than your bladeRF or HackRF devices, and probably even cheaper for some versions.

H: Okay. We are out of time, so thank you again, Sergey and Alexander.

[Applause] [Music]

subtitles created by c3subtitles.de in the year 20??. Join, and help us!