
Building a high throughput low-latency PCIe based SDR (33c3)

  • 0:04 - 0:09
    [Music]
  • 0:09 - 0:22
    Herald: Has anyone in here ever worked
    with libusb or PyUSB? Hands up. Okay. Who
  • 0:22 - 0:32
    also thinks USB is a pain? [laughs] Okay.
    Sergey and Alexander were here back at
  • 0:32 - 0:39
    the 26C3, that's a long time ago. I think
    it was back in Berlin, and back then they
  • 0:39 - 0:45
    presented their first homemade, or not
    homemade, SDR, software-defined radio.
  • 0:45 - 0:49
    This year they are back again and they
    want to show us how they implemented
  • 0:49 - 0:55
    another one, using an FPGA, and to
    communicate with it they used PCI Express.
  • 0:55 - 1:02
    So I think if you thought USB was a pain,
    let's see what they can tell us about PCI
  • 1:02 - 1:07
    Express. A warm round of applause for
    Alexander and Sergey for building a high
  • 1:07 - 1:12
    throughput, low latency, PCIe-based
    software-defined radio
  • 1:12 - 1:20
    [Applause]
    Alexander Chemeris: Hi everyone, good
  • 1:20 - 1:30
    morning, and welcome to the first day of
    the Congress. So, just a little bit
  • 1:30 - 1:36
    background about what we've done
    previously and why we are doing what we
  • 1:36 - 1:42
    are doing right now, is that we started
    working with software-defined radios and
  • 1:42 - 1:52
    by the way, who knows what software-
    defined radio is? Okay, perfect. [laughs]
  • 1:52 - 1:59
    And who ever actually used a software-
    defined radio? RTL-SDR or...? Okay, less
  • 1:59 - 2:06
    people but that's still quite a lot. Okay,
    good. I wonder whether anyone here used
  • 2:06 - 2:17
    more expensive radios like USRPs? Less
    people, but okay, good. Cool. So before
  • 2:17 - 2:23
    2008 I had no idea what software-
    defined radio was; I was working as a voice
  • 2:23 - 2:30
    over IP software person, etc., etc. Then
    in 2008 I heard about OpenBTS, got
  • 2:30 - 2:40
    introduced to software-defined radio and I
    wanted to make it really work and that's
  • 2:40 - 2:52
    what led us to today. In 2009 we had to
    develop the ClockTamer, a piece of hardware which
  • 2:52 - 3:00
    allowed the USRP1 to run GSM
    without problems. If anyone ever tried
  • 3:00 - 3:05
    doing this without a good clock source
    knows what I'm talking about. And we
  • 3:05 - 3:11
    presented this - it wasn't an SDR it was
    just a clock source - we presented this in
  • 3:11 - 3:19
    2009 in 26C3.
    Then I realized that using USRP1 is not
  • 3:19 - 3:24
    really a good idea, because we wanted to
    build a robust, industrial-grade base
  • 3:24 - 3:30
    stations. So we started developing our own
    software defined radio, which we call
  • 3:30 - 3:41
    UmTRX; we started
    this in 2011. Our first base stations with
  • 3:41 - 3:52
    it were deployed in 2013, but I always
    wanted to have something really small and
  • 3:52 - 4:00
    really inexpensive and back then it wasn't
    possible. My original idea in 2011, we
  • 4:00 - 4:08
    was to build a PCI Express card. Mini -
    sorry, not a PCI Express card but a mini PCI
  • 4:08 - 4:10
    card.
    If you remember, there were all these
  • 4:10 - 4:14
    Wi-Fi cards in the mini PCI form factor, and I
    thought that would be really cool to have
  • 4:14 - 4:22
    an SDR in mini PCI, so I could plug it
    into my laptop or into some embedded PC and
  • 4:22 - 4:32
    have a nice SDR equipment, but back then
    it just was not really possible, because
  • 4:32 - 4:38
    electronics were bigger and more power
    hungry and just didn't work that way, so
  • 4:38 - 4:50
    we designed UmTRX to work over gigabit
    ethernet and it was about that size. So
  • 4:50 - 4:57
    now we spent this year designing
    something which really brings me back to what
  • 4:57 - 5:05
    I wanted those years ago, so the XTRX is a
    mini PCI Express card - again, there was no mini PCI
    Express back then - so now it's mini PCI
    Express back then, so now it's mini PCI
    Express, which is even smaller than PCI, I
  • 5:10 - 5:18
    mean mini PCI and it's built to be
    embedded friendly, so you can plug this
  • 5:18 - 5:24
    into a single board computer, embedded
    single board computer. If you have a
  • 5:24 - 5:28
    laptop with a mini PCI Express you can
    plug this into your laptop and you have a
  • 5:28 - 5:35
    really small, software-defined radio
    equipment. And we really want to make it
  • 5:35 - 5:39
    inexpensive, that's why I was asking how
    many of you have ever worked with RTL-
  • 5:39 - 5:44
    SDR, how many of you ever worked with
    USRPs, because the gap between them is
  • 5:44 - 5:54
    pretty big and we want to really bring the
    software-defined radio to masses.
  • 5:54 - 6:00
    Definitely won't be as cheap as RTL-SDR,
    but we try to make it as close as
  • 6:00 - 6:03
    possible.
    And at the same time, so at the size of
  • 6:03 - 6:10
    an RTL-SDR, at a price that is, well, higher, but
    hopefully it will be affordable to
  • 6:10 - 6:17
    pretty much everyone, we really want to
    bring high performance into your hands.
  • 6:17 - 6:23
    And by high performance I mean this is a
    full transmit/receive with two channels
  • 6:23 - 6:28
    transmit, two channels receive, which is
    usually called 2x2 MIMO in the radio
  • 6:28 - 6:37
    world. The goal was to bring it to 160
    megasamples per second, which can roughly
  • 6:37 - 6:44
    give you like 120 MHz of radio spectrum
    available.
  • 6:44 - 6:53
    So what we were able to achieve is, again
    this is mini PCI Express form factor, it
  • 6:53 - 7:02
    has a small Artix-7, the smallest and
    most inexpensive FPGA which has the ability
  • 7:02 - 7:18
    to work with PCI Express. It has the LMS7002M
    chip as the RFIC: a very high performance, very
  • 7:18 - 7:27
    tightly integrated chip with even DSP
    blocks inside. It even has a GPS chip
  • 7:27 - 7:37
    here - on the upper right
    side you can see the GPS chip - so you can
  • 7:37 - 7:44
    actually synchronize your SDR to GPS for
    perfect clock stability,
  • 7:44 - 7:51
    so you won't have any problems running any
    telecommunication systems like GSM, 3G, 4G
  • 7:51 - 7:59
    due to clock problems, and it also has
    interface for SIM cards, so you can
  • 7:59 - 8:06
    actually create a software-defined radio
    modem and run other open source projects
  • 8:06 - 8:16
    to build one - for example an LTE one called srsUE, if
    you're interested - etc., etc. So it's a really,
  • 8:16 - 8:22
    really tightly packed one. And if you put
    this into perspective: that's how it all
  • 8:22 - 8:31
    started in 2006 and that's what you have
    ten years later. It's pretty impressive.
  • 8:31 - 8:37
    [Applause]
    Thanks. But I think it actually applies to
  • 8:37 - 8:40
    the whole industry who is working on
    shrinking the sizes because we just put
  • 8:40 - 8:49
    stuff on the PCB, you know. We're not
    building the silicon itself. Interesting
  • 8:49 - 8:55
    thing is that on our first approach
    we said: let's pack everything, let's do a
  • 8:55 - 9:03
    very tight PCB design. We did an eight-
    layer PCB design, and when we sent it to a
  • 9:03 - 9:10
    fab to estimate the cost it turned out
    it's $15,000 US per piece. Well in small
  • 9:10 - 9:19
    volumes obviously but still a little bit
    too much. So we had to redesign this and
  • 9:19 - 9:27
    the first thing which we did is we still
    kept eight layers, because in our
  • 9:27 - 9:33
    experience number of layers nowadays have
    only minimal impact on the cost of the
  • 9:33 - 9:42
    device. So like six, eight layers - the
    price difference is not so big. But we did
  • 9:42 - 9:52
    a complete rerouting and only kept 2-deep
    microvias and never used buried vias.
  • 9:52 - 9:57
    So this make it much easier and much
    faster for the fab to manufacture it and
  • 9:57 - 10:04
    the price suddenly went down five or six
    times, and in volume it will again be
  • 10:04 - 10:18
    significantly cheaper. And, just as geek
    porn, that's how the PCB looks inside. So now
  • 10:18 - 10:25
    let's go into real stuff. So PCI Express:
    why did we choose PCI Express? As it was
  • 10:25 - 10:33
    said USB is a pain in the ass. You can't
    really use USB in industrial systems. For
  • 10:33 - 10:41
    a whole variety of reasons just unstable.
    So we did use Ethernet for many years
  • 10:41 - 10:47
    successfully but Ethernet has one problem:
    first of all inexpensive Ethernet is only
  • 10:47 - 10:52
    one gigabit and one gigabit does not offer
    you enough bandwidth to carry all the data
  • 10:52 - 11:00
    we want, plus its power-hungry etc. etc.
    So PCI Express is really a good choice
  • 11:00 - 11:06
    because it's low power, it has low
    latency, it has very high bandwidth and
  • 11:06 - 11:11
    it's available almost universally. When we
    started looking into this we realize that
  • 11:11 - 11:17
    even ARM boards, some of ARM boards have
    PCI Express, mini PCI Express slots, which
  • 11:17 - 11:27
    was a big surprise for me for example.
    So the problem is that, unlike USB, you do
  • 11:27 - 11:37
    need to write your own kernel driver for
    this and there's no way around. And it is
  • 11:37 - 11:41
    really hard to write this driver
    universally so we are writing it obviously
  • 11:41 - 11:45
    for Linux, because we're working with
    embedded systems, but if we want to
  • 11:45 - 11:51
    rewrite it for Windows or for macOS we'll
    have to do a lot of rewriting. So we focus
  • 11:51 - 11:57
    on Linux only right now.
    And now the hardest part: debugging is
  • 11:57 - 12:03
    really non-trivial. One small error and
    your PC hangs completely because you
    used something wrong. And you have to
    use something wrong. And you have to
    reboot it and restart it. That's like
  • 12:09 - 12:16
    debugging kernel but sometimes even
    harder. To make it worse there is no
  • 12:16 - 12:19
    really easy-to-use plug-and-play
    interface. Normally, when you develop a PCI
  • 12:19 - 12:24
    Express card and you want
  • 12:24 - 12:31
    to restart it you have to restart your
    development machine. Again not a nice way,
  • 12:31 - 12:39
    it's really hard. So the first thing we
    did is we found, that we can use
  • 12:39 - 12:47
    Thunderbolt 3 which is just recently
    released, and it has ability to work
  • 12:47 - 12:57
    directly with PCI Express bus. So it
    basically has a mode in which it converts
  • 12:57 - 13:01
    a PCI Express into plug-and-play
    interface. So if you have a laptop which
  • 13:01 - 13:09
    supports Thunderbolt 3 then you can use
    this to do plug and play your - plug or
  • 13:09 - 13:16
    unplug your device to make your
    development easier. There are always
  • 13:16 - 13:24
    problems: there's no easy way, there's no
    documentation. Thunderbolt is not
  • 13:24 - 13:27
    compatible with Thunderbolt: Thunderbolt 3
    is not compatible with Thunderbolt 2.
  • 13:27 - 13:34
    So we had to buy a special laptop with
    Thunderbolt 3, with special cables - all
  • 13:34 - 13:40
    this all this hard stuff. And if you
    really want to get documentation you have
  • 13:40 - 13:48
    to sign NDA and send a business plan to
    them so they can approve that your
  • 13:48 - 13:51
    business makes sense.
    [Laughter]
  • 13:51 - 13:59
    I mean... [laughs] So we actually opted
    out. We decided not to go through this; what
  • 13:59 - 14:05
    we did is we found that someone is
    actually making PCI Express to Thunderbolt
  • 14:05 - 14:11
    3 converters and selling them as dev
    boards and that was a big relief because
  • 14:11 - 14:17
    it saved us lots of time, lots of money.
    You just order it from some
  • 14:17 - 14:25
    Asian company. And yeah, this is how this
    converter looks. So you buy
  • 14:25 - 14:30
    like several pieces you can plug in your
    PCI Express card there and you plug this
  • 14:30 - 14:38
    into your laptop. And this is it, with the
    XTRX already plugged into it. Now the only
  • 14:38 - 14:50
    problem we found is that typically UEFI
    has a security control enabled, so that
  • 14:50 - 14:57
    any random Thunderbolt device can't hijack
    your PCI bus and can't get access to your
  • 14:57 - 15:02
    kernel memory and do some bad stuff. Which
    is a good idea - the only problem is that
  • 15:02 - 15:07
    there is, it's not fully implemented in
    Linux. So under Windows if you plug in a
  • 15:07 - 15:12
    device which is which has no security
    features, which is not certified, it will
  • 15:12 - 15:17
    politely ask you like: "Do you really
    trust this device? Do you want to use it?"
  • 15:17 - 15:22
    you can say "yes". Under Linux it just
    does not work. [laughs] So we spent some
  • 15:22 - 15:26
    time trying to figure out how to get
    around this. There are some patches from
  • 15:26 - 15:30
    Intel which are not mainlined, and we were
    not able to actually get them to work. So we
  • 15:30 - 15:39
    just had to disable all these security
    measures in the laptop. So be aware that
  • 15:39 - 15:47
    this is the case and we suspect that happy
    users of Apple might not be able to do
  • 15:47 - 15:54
    this because Apple doesn't have a BIOS setup, so you
    probably can't disable this feature. So
  • 15:54 - 16:02
    probably good incentive for someone to
    actually finish writing the driver.
  • 16:02 - 16:08
    So now to the goal: we
    want to achieve 160 megasamples per
  • 16:08 - 16:14
    second, 2x2 MIMO, which means two
    transceiver, two transmit, two receive
  • 16:14 - 16:24
    channels at 12 bits, which is roughly 7.5
    Gbit/s. So, the first result: when
  • 16:24 - 16:26
    we got this board back from the fab, it
    didn't work.
  • 16:26 - 16:30
    Sergey Kostanbaev (mumbles): As expected.
    Alexander Chemeris: Yes, as expected. So the
  • 16:30 - 16:40
    first interesting thing we realized is
    that the FPGA has hardware
  • 16:40 - 16:47
    blocks for talking to PCI Express,
    called GTP, which basically implement
  • 16:47 - 16:57
    the PCI Express serial physical layer.
    But the thing is, the lane numbering is reversed
  • 16:57 - 17:04
    in the FPGA's PCI Express block, and we did
    not realize this, so we had to do very, very
  • 17:04 - 17:11
    fine soldering to actually swap the
    [laughs] swap the lanes. You can see this
  • 17:11 - 17:18
    very fine work there.
    We also found that one of the components
  • 17:18 - 17:29
    was a 'dead bug', which is a well-known term for
    chips that get soldered upside down. At the design stage we
  • 17:29 - 17:36
    accidentally mirrored the pinout, so we had to
  • 17:36 - 17:42
    solder it upside down and if you can
    realize how small it is you can also
  • 17:42 - 17:49
    appreciate the work done. And what's funny
    when I was looking at dead bugs I actually
  • 17:49 - 17:57
    found a manual from NASA which describes
    how to properly solder dead bugs to get
  • 17:57 - 18:01
    it approved.
    [Audience laughs]
  • 18:01 - 18:08
    So this is the link I think you can go
    there and enjoy it's also fun stuff there.
  • 18:08 - 18:17
    So after fixing all of this, on our next
    attempt it kind of works. So the next stage
  • 18:17 - 18:23
    is debugging the FPGA code, which has to
    talk to PCI Express and PCI Express has to
  • 18:23 - 18:28
    talk to Linux kernel and the kernel has to
    talk to the driver, driver has talked to
  • 18:28 - 18:38
    the user space. So, peripherals are easy:
    the UART and SPIs we got to work almost
  • 18:38 - 18:45
    immediately no problems with that, but DMA
    was a real beast. So we spent a lot of
  • 18:45 - 18:53
    time trying to get DMA to work and the
    problem is that the DMA is on the FPGA, so
  • 18:53 - 19:00
    you can't just place a breakpoint like you
    do in C or C++ or in other languages it's
  • 19:00 - 19:07
    real-time system - real-time hardware - which is running
  • 19:07 - 19:16
    on the fabric. So Sergey, who was
    mainly developing this, had to write a lot
  • 19:16 - 19:23
    of small test benches and and test
    everything piece by piece.
  • 19:23 - 19:31
    So all parts of the DMA code we had was
    wrapped into a small test bench which was
  • 19:31 - 19:40
    emulating all the tricks, and as
    the classics predicted, it took about five to
  • 19:40 - 19:48
    ten times more time than actually writing the
    code. So we really blew past our
  • 19:48 - 19:55
    predicted timelines by doing this, but in the
    end we got really stable operation.
  • 19:55 - 20:04
    So some suggestions for anyone who will
    try to repeat this exercise: there is a
  • 20:04 - 20:10
    logic analyzer built into the Xilinx tools which you
    can use; it's nice, and sometimes it's
  • 20:10 - 20:16
    very helpful but you can't debug
    transient bugs, which only come out
  • 20:16 - 20:23
    when some weird conditions are coming up.
    So you have to implement some read back
  • 20:23 - 20:29
    registers which show important statistics
    about how your system
  • 20:29 - 20:35
    behaves, in our case it's various counters
    on the DMA interface. So you can actually
  • 20:35 - 20:41
    kind of see what's happening with
    your data: Is it received? Is it
  • 20:41 - 20:46
    sent? How much is sent and how much is
    received? So, for example, we can see
  • 20:46 - 20:54
    when we saturate the bus or when there actually
    is an underrun, i.e. the host is not providing
  • 20:54 - 20:57
    data fast enough, so we can at least
    understand whether it's a host problem or
  • 20:57 - 21:02
    an FPGA problem, and know which
    part we debug next, because again
  • 21:02 - 21:08
    it's a very multi layer problem you start
    with FPGA, PCI Express, kernel, driver,
  • 21:08 - 21:15
    user space, and any part can fail. So you
    can't work blind like this.
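A rough illustration of that readback-register idea: you can peek at such counters from user space by mmap'ing the device's BAR0 through sysfs. The PCI address and the register offsets below are made up for illustration; the real XTRX register map is not shown in the talk.

```c
/* Sketch: dump hypothetical DMA debug counters from a PCIe BAR via sysfs. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define REG_DMA_TX_SENT   0x40  /* hypothetical: TX buffers consumed by the device */
#define REG_DMA_RX_RECV   0x44  /* hypothetical: RX buffers delivered to the host   */
#define REG_DMA_UNDERRUNS 0x48  /* hypothetical: TX underrun counter                */

int main(void)
{
    /* PCI address is a placeholder - adjust for the actual device. */
    int fd = open("/sys/bus/pci/devices/0000:01:00.0/resource0", O_RDWR | O_SYNC);
    if (fd < 0) { perror("open BAR0"); return 1; }

    void *map = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }
    volatile uint32_t *regs = map;

    printf("tx sent:   %u\n", regs[REG_DMA_TX_SENT   / 4]);
    printf("rx recv:   %u\n", regs[REG_DMA_RX_RECV   / 4]);
    printf("underruns: %u\n", regs[REG_DMA_UNDERRUNS / 4]);

    munmap(map, 4096);
    close(fd);
    return 0;
}
```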
  • 21:15 - 21:23
    So again, the goal was to get 160 MSPS; with the first
    implementation we could do 2 MSPS: roughly 60
  • 21:23 - 21:30
    times slower.
    The problem is that software just wasn't
  • 21:30 - 21:36
    keeping up and wasn't sending data fast
    enough. So there were many things done,
    but the most important parts are: use real-
    but the most important parts is: use real-
    time priority if you want to get very
  • 21:41 - 21:47
    stable results and well fix software bugs.
    And one of the most important bugs we had
  • 21:47 - 21:54
    was that DMA buffers were not freed
    immediately, so they were busy
  • 21:54 - 21:59
    for longer than they should be, which
    introduced extra cycles and basically just
  • 21:59 - 22:06
    reduced the bandwidth.
    At this point let's talk a little bit
  • 22:06 - 22:14
    about how to implement a high-performance
    driver for Linux, because if you want to
  • 22:14 - 22:21
    get real real performance you have to
    start with the right design. There are
  • 22:21 - 22:27
    basically three approaches and the whole
    spectrum in between; like two approaches
  • 22:27 - 22:34
    and the whole spectrum in between, which
    is where you can refer to three. The first
  • 22:34 - 22:42
    approach is full kernel control, in which
    case kernel driver not only is on the
  • 22:42 - 22:46
    transfer, it actually has all the logics
    of controlling your device and all the
  • 22:46 - 22:52
    export ioctl to the user space and
    that's the kind of a traditional way of
  • 22:52 - 22:58
    writing drivers. Your your user space is
    completely abstracted from all the
  • 22:58 - 23:07
    details. The problem is that this is
    probably the slowest way to do it. The
  • 23:07 - 23:14
    other way is what's called the "zero cup
    interface": your only control is held in
  • 23:14 - 23:21
    the kernel and data is provided, the raw
    data is provided to user space "as-is". So
  • 23:21 - 23:28
    you avoid memory copy which make it
    faster. But still not fast enough if you
  • 23:28 - 23:34
    really want to achieve maximum
    performance, because you still have
  • 23:34 - 23:41
    context switches between the kernel and
    the user space. The most... the fastest
  • 23:41 - 23:47
    approach possible is to have full user
    space implementation when kernel just
  • 23:47 - 23:53
    exposed everything and says "now you do it
    yourself" and you have no you have no
  • 23:53 - 24:02
    context switches, like almost no, and you
    can really optimize everything. So what
  • 24:02 - 24:09
    is... what are the problems with this?
    The pro the pros I already mentioned: no
  • 24:09 - 24:14
    no switches between kernel user space,
    it's very low latency because of this as
  • 24:14 - 24:21
    well, it's very high bandwidth. But if you
    are not interested in getting the very
  • 24:21 - 24:28
    high performance, the most performance, and
    you just want to have like some little,
  • 24:28 - 24:33
    like say low bandwidth performance, then
    you will have to add hacks, because you
  • 24:33 - 24:37
    can't get notifications of the kernel that
    resources available is more data
  • 24:37 - 24:46
    available. It also makes it vulnerable
    vulnerable because if user space can
  • 24:46 - 24:55
    access it, then it can do whatever it
    want. We at the end decided that... one
  • 24:55 - 25:03
    more important thing: how to actually to
    get the best performance out of out of the
  • 25:03 - 25:10
    bus. This is a very (?)(?) set as we want
    to poll your device or not to poll and get
  • 25:10 - 25:14
    notified. What is polling? I guess
    everyone as programmer understands it, so
  • 25:14 - 25:18
    polling is when you asked repeatedly: "Are
    you ready?", "Are you ready?", "Are you
  • 25:18 - 25:20
    ready?" and when it's ready you get the
    data immediately.
  • 25:20 - 25:25
    It's basically a busy loop of your you
    just constantly asking device what's
  • 25:25 - 25:33
    happening. You need to dedicate a full
    core, and thanks God we have multi-core
  • 25:33 - 25:40
    CPUs nowadays, so you can dedicate the
    full core to this polling and you can just
  • 25:40 - 25:46
    pull constantly. But again if you don't
    need this highest performance, you just
  • 25:46 - 25:53
    need to get something, then you will be
    wasting a lot of CPU resources. At the end
  • 25:53 - 26:00
    we decided to do a combined architecture
    of your, it is possible to pull but
  • 26:00 - 26:06
    there's also a chance and to get
    notification from a kernel to for for
  • 26:06 - 26:11
    applications, which recover, which needs
    low bandwidth, but also require a better
  • 26:11 - 26:17
    CPU performance. Which I think is the best
    way if you are trying to target both
  • 26:17 - 26:31
    worlds. Very quickly: the architecture of
    system. We try to make it very very
  • 26:31 - 26:51
    portable so and flexible. There is a
    kernel driver, which talks to low-level
  • 26:51 - 26:56
    library which implements all this logic,
    which we took out of the driver: to
  • 26:56 - 27:01
    control the
    PCI Express, to work with DMA, to provide
  • 27:01 - 27:09
    all the... to hide all the details of the
    actual bus implementation.
  • 27:09 - 27:17
    And then there is a high-level library
    which talks to this low-level library and
  • 27:17 - 27:22
    also to libraries which implement control
    of actual peripherals, and most
  • 27:22 - 27:29
    importantly to the library which
    implements control over our RFIC chip.
  • 27:29 - 27:35
    This way it's very modular, we can replace
    PCI Express with something else later, we
  • 27:35 - 27:46
    might be able to port it to other
    operating systems, and that's the goal.
  • 27:46 - 27:50
    Another interesting issue is: when you
    start writing the Linux kernel driver you
  • 27:50 - 27:57
    very quickly realize that while LDD, which
    is a classic book for a Linux driver,
  • 27:57 - 28:02
    writing is good and it will give you a
    good insight; it's not actually up-to-
  • 28:02 - 28:09
    date. It's more than ten years old and
    there's all of new interfaces which are
  • 28:09 - 28:15
    not described there, so you have to resort
    to reading the manuals and all the
  • 28:15 - 28:20
    documentation in the kernel itself. Well
    at least you get the up-to-date
  • 28:20 - 28:32
    information. The decisions we made is to
    make everything easy. We use TTY for GPS
  • 28:32 - 28:38
    and so you can really attach a pretty much
    any application which talks to GPS. So all
  • 28:38 - 28:46
    of existing applications can just work out
    of the box. And we also wanted to be able
  • 28:46 - 28:55
    to synchronize the system clock to GPS, so we
    get automatic clock synchronization across
  • 28:55 - 28:59
    multiple systems, which is very important
    when we are deploying many, many devices
  • 28:59 - 29:07
    around the world.
    We plan to do two interfaces: one is the kernel
  • 29:07 - 29:16
    PPS interface and the other is the DCD line
    on the UART exposed over the TTY. Because
  • 29:16 - 29:20
    again we found that there are two types of
    applications: one to support one API,
  • 29:20 - 29:26
    others that support other API and there is
    no common thing so we have to support
  • 29:26 - 29:39
    both. As we described, we want to have
    poll support so we can get notifications from the
  • 29:39 - 29:48
    kernel when data is available and we don't
    need to do real busy looping all the time.
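For the 1PPS side, here is a small sketch of how a user-space program could consume the pulse through the standard Linux kernel PPS API, assuming the GPS pulse is exposed as /dev/pps0 (for example via the UART's DCD line with the PPS line discipline attached); tools like ntpd or chrony normally do this for you:

```c
/* Sketch: read 1PPS timestamps via the kernel PPS API (pps-tools headers). */
#include <fcntl.h>
#include <stdio.h>
#include <sys/timepps.h>

int main(void)
{
    int fd = open("/dev/pps0", O_RDWR);              /* assumed PPS source */
    if (fd < 0) { perror("open /dev/pps0"); return 1; }

    pps_handle_t handle;
    if (time_pps_create(fd, &handle) < 0) { perror("time_pps_create"); return 1; }

    pps_info_t info;
    struct timespec timeout = { 3, 0 };              /* give up after 3 s */
    for (;;) {
        if (time_pps_fetch(handle, PPS_TSFMT_TSPEC, &info, &timeout) < 0)
            break;
        printf("PPS edge at %ld.%09ld\n",
               (long)info.assert_timestamp.tv_sec,
               (long)info.assert_timestamp.tv_nsec);
    }
    time_pps_destroy(handle);
    return 0;
}
```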
  • 29:48 - 29:56
    After all the software optimizations we've
    got to like 10 MSPS: still very, very far
  • 29:56 - 30:02
    from what we want to achieve.
    Now there should have been a lot of
  • 30:02 - 30:07
    explanations about PCI Express, but when
    we actually wrote everything we wanted to
  • 30:07 - 30:14
    say, we realized it's just a full two-
    hour talk just on PCI Express. So we are
  • 30:14 - 30:18
    not going to give it here, I'll just give
    some highlights which are most
  • 30:18 - 30:24
    interesting. If there is real
    interest, we can set up a workshop on
  • 30:24 - 30:32
    one of the later days and talk in more
    detail about PCI Express specifically.
  • 30:32 - 30:39
    The thing is, there are no open source cores
    for PCI Express, which are optimized for
  • 30:39 - 30:48
    high performance, real time applications.
    There is Xillybus which as I understand is
  • 30:48 - 30:53
    not going to be open source, but they provide
    you the source if you pay them. It's very
  • 30:53 - 31:00
    popular because it's very very easy to do,
    but it's not giving you performance. If I
  • 31:00 - 31:05
    remember correctly the best it can do is
    maybe like 50 percent bus saturation.
  • 31:05 - 31:11
    So there's also Xilinx implementation, but
    if you are using Xilinx implementation
  • 31:11 - 31:21
    with the AXI bus, then you're really locked into
    the AXI bus and into Xilinx. And it's also not
  • 31:21 - 31:25
    very efficient in terms of resources and
    if you remember we want to make this very,
  • 31:25 - 31:30
    very inexpensive. So our goal
    is to be able to fit everything in the
  • 31:30 - 31:38
    smallest Artix-7 FPGA, and that's quite
    challenging with all the stuff in there
  • 31:38 - 31:48
    and we just can't waste resources. So the
    decision was to write our own PCI Express
  • 31:48 - 31:53
    implementation. That's how it looks like.
    I'm not going to discuss it right now.
  • 31:53 - 32:00
    There are several iterations. Initially it
    looked much simpler, turned out not to
  • 32:00 - 32:06
    work well.
    So some interesting stuff about PCI
  • 32:06 - 32:13
    Express which we stumbled upon is that it
    was working really well on Atom which is
  • 32:13 - 32:17
    our main development platform because we
    are doing a lot of embedded stuff. Worked
  • 32:17 - 32:26
    really well. When we tried to plug this into a
    Core i7, it just started hanging once in a
  • 32:26 - 32:35
    while. So after several days
    of debugging, Sergey found a
  • 32:35 - 32:39
    very interesting statement in the standard
    which says that value is zero in byte
  • 32:39 - 32:46
    count actually stands not for zero bytes
    but for 4096 bytes.
  • 32:46 - 32:59
    I mean that's a really cool optimization.
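In other words, a decoder has to special-case the zero encoding. Schematically (field extraction simplified; in the PCIe TLP header the 10-bit Length field and the completion's 12-bit Byte Count field both use 0 for their maximum):

```c
#include <stdint.h>

static inline uint32_t tlp_length_dw(uint32_t hdr_dw0)
{
    uint32_t len = hdr_dw0 & 0x3FF;      /* 10-bit Length, in 32-bit words */
    return len ? len : 1024;             /* 0 encodes 1024 DW = 4096 bytes */
}

static inline uint32_t cpl_byte_count(uint32_t hdr_dw1)
{
    uint32_t bc = hdr_dw1 & 0xFFF;       /* 12-bit Byte Count field        */
    return bc ? bc : 4096;               /* 0 encodes 4096 bytes           */
}
```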
    So another thing is completion which is a
  • 32:59 - 33:04
    term in PCI Express basically for
    acknowledgment which also can carry some
  • 33:04 - 33:12
    data back to your request. And sometimes
    if you're not sending completion, device
  • 33:12 - 33:21
    just hangs. And what happens is that in
    this case due to some historical heritage
  • 33:21 - 33:30
    of x86 it just starts returning you all-ones (0xFF...F).
    And if you have a register which says: „Is
  • 33:30 - 33:35
    your device okay?“ and this register shows
    one to say „The device is okay“, guess
  • 33:35 - 33:38
    what will happen?
    You will be always reading that your
  • 33:38 - 33:47
    device is okay. So the suggestion is not
    to use one as the status for okay and use
  • 33:47 - 33:53
    either zero or better like a two-beat
    sequence. So you are definitely sure that
  • 33:53 - 34:04
    you are okay and not just getting all-ones back.
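A sketch of that suggestion on the host side; the register offset and the two-bit 'alive' pattern are made up for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define REG_ALIVE     0x00      /* hypothetical "are you there?" register      */
#define ALIVE_PATTERN 0x2       /* two-bit pattern '10' - can never look like  */
                                /* the all-ones value a dead bus returns       */

static bool device_alive(volatile uint32_t *regs)
{
    uint32_t v = regs[REG_ALIVE / 4];
    if (v == 0xFFFFFFFF)        /* completion never came back: bus reads all-ones */
        return false;
    return (v & 0x3) == ALIVE_PATTERN;
}
```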
    when you have a device which again may
  • 34:04 - 34:10
    fail at any of the layers, you just got
    this new board, it's really hard, it's
  • 34:10 - 34:18
    really hard to debug because of memory
    corruption. So we had a software bug and
  • 34:18 - 34:25
    it was writing DMA addresses
    incorrectly and we were wondering why we
  • 34:25 - 34:32
    were not getting any data in our buffers. At
    the same time. After several starts,
  • 34:32 - 34:41
    operating system just crashes. Well, that's
    the reason why there is this UEFI
  • 34:41 - 34:47
    protection which prevents you from
    plugging in devices like this into your
  • 34:47 - 34:52
    computer. Because it was basically writing
    data, like random data into random
  • 34:52 - 35:00
    portions of your memory. So a lot of
    debugging, a lot of tests and test benches
  • 35:00 - 35:11
    and we were able to find this. And another
    thing is if you deinitialize your driver
  • 35:11 - 35:15
    incorrectly, and that's what's happening
    when you have plug-and-play device, which
  • 35:15 - 35:22
    you can plug and unplug, then you may end
    up in a situation of your ... you are
  • 35:22 - 35:28
    trying to write into memory which is
    already freed by the operating system and
  • 35:28 - 35:36
    used for something else. Very well-known
    problem, but it also happens here. So,
  • 35:36 - 35:51
    why DMA is really hard is because it
    has this completion architecture for
  • 35:51 - 35:56
    writing - sorry - for reading
    data. Writes are easy. You just send the
  • 35:56 - 36:00
    data, you forget about it. It's a fire-
    and-forget system. But for reading you
  • 36:00 - 36:10
    really need to get your data back. And the
    thing is, it looks like this. You really
  • 36:10 - 36:16
    hope that there would be some pointing
    device here. But basically on the top left
  • 36:16 - 36:24
    you can see requests for read and on the
    right you can see completion transactions.
  • 36:24 - 36:30
    So basically each transaction can be and
    most likely will be split into multiple
  • 36:30 - 36:39
    transactions. So first of all you have to
    collect all these pieces and like write
  • 36:39 - 36:46
    them into proper parts of the memory.
    But that's not all. The thing is the
  • 36:46 - 36:53
    latency between request and completion is
    really high. It's like 50 cycles. So if
  • 36:53 - 36:59
    you have only a single transaction
    in flight, you will get really bad
  • 36:59 - 37:04
    performance. You do need to have multiple
    transactions in flight. And the worst
  • 37:04 - 37:13
    thing is that transactions can return data
    in random order. So it's a much more
  • 37:13 - 37:20
    complicated state machine than we expected
    originally.
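A simplified software model of what that state machine has to do. The real logic lives in the FPGA; this C sketch only illustrates per-tag reassembly, relying on the fact that completions for a single request arrive in address order while completions for different requests may interleave arbitrarily:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_TAGS 32

struct read_req {
    uint8_t *dst;       /* where the data finally belongs        */
    uint32_t done;      /* bytes already received for this tag   */
    uint32_t total;     /* bytes requested with this tag         */
    bool     busy;
};

static struct read_req reqs[MAX_TAGS];

/* Called for every completion TLP that comes back from the host. */
static void on_completion(uint8_t tag, const uint8_t *payload, uint32_t len)
{
    struct read_req *r = &reqs[tag];
    memcpy(r->dst + r->done, payload, len);   /* pieces of one tag arrive in order */
    r->done += len;
    if (r->done == r->total)
        r->busy = false;                      /* request finished: tag may be reused */
}
```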
  • 37:20 - 37:26
    So when I said that the architecture was much simpler originally:
    it didn't have all of this, and we had to
  • 37:26 - 37:32
    realize this while implementing. So again
    here was a whole description of how
  • 37:32 - 37:41
    exactly this works. But not this time. So
    now after all these optimizations we've
  • 37:41 - 37:49
    got 20 megasamples per second, which is
    just eight times lower than what we are
  • 37:49 - 38:00
    aiming at. So now the next thing is PCI
    Express lane scalability. So PCI Express
  • 38:00 - 38:07
    is a serial bus. So it has multiple lanes
    and they allow you to basically
  • 38:07 - 38:14
    horizontally scale your bandwidth. One
    lane is like x, then two lanes are 2x, four
  • 38:14 - 38:20
    lane is 4x. So the more lanes you have the
    more performance you are getting out of
  • 38:20 - 38:24
    your bus. So the more
    bandwidth - not performance - you're getting out of your bus.
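Rough back-of-the-envelope numbers for why the lanes matter here (raw line rates before packet overhead; the sample-rate figures are the goal stated earlier in the talk):

$$
\begin{aligned}
\text{needed} &\approx 160\,\mathrm{MS/s}\times 2\ \text{channels}\times 2\ (I/Q)\times 12\,\mathrm{bit} \approx 7.7\,\mathrm{Gbit/s},\\
\text{Gen1 x1} &= 2.5\,\mathrm{GT/s}\times\tfrac{8}{10} = 2\,\mathrm{Gbit/s},\qquad
\text{Gen1 x2} = 4\,\mathrm{Gbit/s},\\
\text{Gen2 x2} &= 2\times 5\,\mathrm{GT/s}\times\tfrac{8}{10} = 8\,\mathrm{Gbit/s}.
\end{aligned}
$$

So only Gen2 with two lanes clears the target, which is why both the second lane and (as described below) the Gen2-capable switch chip matter.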
  • 38:24 - 38:32
    So the issue is that
    the mini
  • 38:32 - 38:39
    PCI Express standard only standardized one
    lane. And second lane is left as optional.
  • 38:39 - 38:46
    So most motherboards don't support this.
    There are some but not all of them. And we
  • 38:46 - 38:52
    really wanted to get this done. So we
    designed a special converter board which
  • 38:52 - 38:58
    allows you to plug your mini PCI Express
    into a full-size PCI Express and
  • 38:58 - 39:07
    get two lanes working. And we're also
    planning to have a similar board which
  • 39:07 - 39:13
    will have multiple slots so you will be
    able to get multiple XTRX-SDRs on to the
  • 39:13 - 39:21
    same, onto the same carrier board and plug
    this into let's say PCI Express 16x and
  • 39:21 - 39:29
    you will get really a lot of SDR...
    a lot of IQ data, which then will be
  • 39:29 - 39:39
    your problem how to process. So
    with two lanes it's about twice the performance,
  • 39:39 - 39:49
    so we are getting fifty mega samples per
    second. And that's the time to really cut
  • 39:49 - 39:59
    the fat because the real sample size of
    LMS7 is 12 bits and we are transmitting 16
  • 39:59 - 40:07
    because it's easier - the CPU works
    on 8, 16, 32 bits. So we originally
  • 40:07 - 40:14
    designed the driver to support 8-bit, 12-
    bit and 16-bit modes, to be able to do this scaling.
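A sketch of what the 12-bit mode implies on the host side: two 12-bit samples packed into three bytes have to be unpacked into the 16-bit integers the CPU actually works with. The exact bit layout below is an assumption for illustration, not the documented wire format:

```c
#include <stdint.h>

/* Unpack pairs of 12-bit samples (3 bytes each) into sign-extended int16_t. */
static void unpack_12bit(const uint8_t *in, int16_t *out, unsigned pairs)
{
    for (unsigned i = 0; i < pairs; i++, in += 3, out += 2) {
        int a = in[0] | ((in[1] & 0x0F) << 8);        /* low 12 bits          */
        int b = (in[1] >> 4) | (in[2] << 4);          /* high 12 bits         */
        out[0] = (int16_t)(a << 4) >> 4;              /* sign-extend from 12  */
        out[1] = (int16_t)(b << 4) >> 4;
    }
}
```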
  • 40:14 - 40:24
    And for the test we said: okay,
    let's go from 16 to 8 bit. We'll lose
  • 40:24 - 40:33
    some dynamic range but who cares these
    days. The rate still stayed the same - still 50
  • 40:33 - 40:42
    megasamples per second - no matter what we
    did. And there was a lot of interesting
  • 40:42 - 40:50
    debugging going on. And we realized that
    we actually made another - well, not really a
  • 40:50 - 40:59
    mistake; we just didn't know
    this when we designed it. We should have
  • 40:59 - 41:04
    used a higher voltage for this high speed
    bus to get it to the full performance. And
  • 41:04 - 41:13
    at 1.8 V the signal was just degrading too fast and
    the bus itself was not performing well. So
  • 41:13 - 41:22
    our next prototype will be using higher
    voltage specifically for this bus. And
  • 41:22 - 41:27
    this is kind of stuff which makes
    designing hardware for high speed really
  • 41:27 - 41:32
    hard because you have to care about
    coherence of the parallel buses on your,
  • 41:32 - 41:39
    on your system. So at the same time we do
    want to keep 1.8 volts for everything else
  • 41:39 - 41:43
    as much as possible. Because another
    problem we are facing with this device is
  • 41:43 - 41:47
    that by the standard mini PCI Express
    allows only like ...
  • 41:47 - 41:51
    Sergey Kostanbaev: ... 2.5 ...
    Alexander Chemeris: ... 2.5 watts of power
  • 41:51 - 41:58
    consumption, no more. And we were
    very lucky that the LMS7 has such
  • 41:58 - 42:04
    good power consumption.
    We actually had some extra
  • 42:04 - 42:10
    space to have FPGA and GPS and all this
    stuff. But we just can't let the power
  • 42:10 - 42:15
    consumption go up. Our measurements on
    this device showed about ...
  • 42:15 - 42:19
    Sergey Kostanbaev: ... 2.3 ...
    Alexander Chemeris: ... 2.3 watts of power
  • 42:19 - 42:27
    consumption. So we are like at the limit
    at this point. So when we fix the bus with
  • 42:27 - 42:31
    the higher voltage, you know it's a
    theoretical exercise, because we haven't
  • 42:31 - 42:38
    done this yet; that's planned to happen in
    a couple of months. We should be able to get
  • 42:38 - 42:47
    to this numbers which was just 1.2 times
    slower. Then the next thing will be to fix
  • 42:47 - 42:56
    another issue which we made at the very
    beginning: we have procured a wrong chip.
  • 42:56 - 43:05
    Just one digit difference, you can see
    it's highlighted in red and green, and
  • 43:05 - 43:13
    this chip supports only generation 1
    PCI Express which is twice slower than
  • 43:13 - 43:18
    generation 2 PCI Express.
    So again, hopefully we'll replace the chip
  • 43:18 - 43:30
    and just get very simple doubling of the
    performance. Still it will be slower than
  • 43:30 - 43:40
    we wanted it to be, and here is where practical
    versus theoretical numbers come in.
  • 43:40 - 43:47
    Well as every bus it has it has overheads
    and one of the things which again we
  • 43:47 - 43:51
    realized when we were implementing this
    is, that even though the standard
  • 43:51 - 43:59
    standardizes a maximum payload size of 4 kB,
    actual implementations are different. For
  • 43:59 - 44:08
    example, desktop computers like Intel Core
    or Intel Atom only use a 128-byte
  • 44:08 - 44:19
    payload. So there is much more overhead
    going on the bus to transfer data and even
  • 44:19 - 44:29
    theoretically you can only achieve 87%
    efficiency. And on Xeon we tested and we
  • 44:29 - 44:37
    found that they're using 256 payload size
    and this can give you like a 92%
  • 44:37 - 44:45
    efficiency on the bus - and this is before
    the other overheads, so reality is even worse.
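Those efficiency figures follow roughly from assuming around 20 bytes of header, framing and CRC overhead per transaction-layer packet:

$$
\frac{128}{128+20}\approx 87\%,\qquad \frac{256}{256+20}\approx 93\%.
$$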
  • 44:45 - 44:53
    An interesting thing which we also
    did not expect is that we originally were
  • 44:53 - 45:03
    developing on Intel Atom and everything
    was working great. When we plug this into
  • 45:03 - 45:11
    laptop like Core i7 multi-core really
    powerful device, we didn't expect that it
  • 45:11 - 45:20
    wouldn't work. Obviously Core i7 should
    work better than Atom: no, not always.
  • 45:20 - 45:26
    The thing is, we were plugging into a
    laptop, which had a built-in video card
  • 45:26 - 45:45
    which was sitting on the same PCI bus and
    probably manufacturer hard-coded the higher
  • 45:45 - 45:51
    priority for the video card than for
    everything else in the system, because I
  • 45:51 - 45:56
    don't want your screen to flicker.
    And so when you move a window you actually
  • 45:56 - 46:04
    see the late packets coming to your PCI
    device. We had to introduce a jitter
  • 46:04 - 46:15
    buffer and add more FIFO into the device
    to smooth it out. On the other hand the
  • 46:15 - 46:20
    Xeon is performing really well. So it's
    very optimized. That said, we have tested
  • 46:20 - 46:28
    it with a discrete graphics card, and it outperforms
    everything by a whopping five to seven percent.
  • 46:28 - 46:39
    That's what you get for the price. So this
    is actually the end of the presentation.
  • 46:39 - 46:44
    We still have not scheduled any workshop,
    but if there is any interest in
  • 46:44 - 46:53
    actually seeing the device working or if
    you interested in learning more about the
  • 46:53 - 46:58
    PCI Express in detail, let us know and we'll
    schedule something in the next few days.
  • 46:58 - 47:05
    That's the end, I think we can proceed
    with questions if there are any.
  • 47:05 - 47:15
    [Applause]
    Herald: Okay, thank you very much. If you
  • 47:15 - 47:18
    are leaving now: please try to leave
    quietly because we might have some
  • 47:18 - 47:23
    questions and you want to hear them. If
    you have questions please line up right
  • 47:23 - 47:29
    behind the microphones and I think we'll
    just wait because we don't have anything
  • 47:29 - 47:35
    from the signal angel. However, if you are
    watching on stream you can hop into the
  • 47:35 - 47:40
    channels and over social media to ask
    questions and they will be answered,
  • 47:40 - 47:48
    hopefully. So on that microphone.
    Question 1: What's the minimum and maximum
  • 47:48 - 47:52
    frequency of the card?
    Alexander Chemeris: You mean RF
  • 47:52 - 47:56
    frequency?
    Question 1: No, the minimum frequency you
  • 47:56 - 48:06
    can sample at. Most SDR devices can
    only sample at over 50 MHz. Is there a
  • 48:06 - 48:09
    similar limitation at your card?
    Alexander Chemeris: Yeah, so if you're
  • 48:09 - 48:16
    talking about RF frequency it can go
    from like almost zero even though that
  • 48:16 - 48:27
    works worse below 50MHz and all the way to
    3.8GHz if I remember correctly. And in
  • 48:27 - 48:35
    terms of the sample rate right now it
    works from like about 2 MSPS and to about
  • 48:35 - 48:40
    50 right now. But again, we're planning to
    get it to these numbers we quoted.
  • 48:40 - 48:46
    Herald: Okay. The microphone over there.
    Question 2: Thanks for your talk. Did you
  • 48:46 - 48:49
    manage to put your Linux kernel driver to
    the main line?
  • 48:49 - 48:54
    Alexander Chemeris: No, not yet. I mean,
    it's not even like fully published. So I
  • 48:54 - 48:59
    did not say in the beginning, sorry for
    this. We only just manufactured the first
  • 48:59 - 49:04
    prototype, which we debugged heavily. So
    we are only planning to manufacture the
  • 49:04 - 49:10
    second prototype with all these fixes and
    then we will release, like, the kernel
  • 49:10 - 49:17
    driver and everything. And maybe we'll try
    or maybe won't try, haven't decided yet.
  • 49:17 - 49:18
    Question 2: Thanks
    Herald: Okay...
  • 49:18 - 49:22
    Alexander Chemeris: and that will be the
    whole other experience.
  • 49:22 - 49:26
    Herald: Okay, over there.
    Question 3: Hey, looks like you went
  • 49:26 - 49:30
    through some incredible amounts of pain to
    make this work. So, I was wondering,
  • 49:30 - 49:35
    aren't there any simulators at least for
    parts of the system, or the PCIe bus for
  • 49:35 - 49:40
    the DMA something? Any simulator so that
    you can actually first design the system
  • 49:40 - 49:45
    there and debug it more easily?
    Sergey Kostanbaev: Yes, there are
  • 49:45 - 49:50
    available simulators, but the problem is
    they are all non-free. So you have to pay
  • 49:50 - 49:57
    for them. So yeah, we chose the hard
    way.
  • 49:57 - 50:00
    Question 3: Okay thanks.
    Herald: We have a question from the signal
  • 50:00 - 50:03
    angel.
    Question 4: Yeah are the FPGA codes, Linux
  • 50:03 - 50:08
    driver, and library code, and the design
    project files public and if so, did they
  • 50:08 - 50:13
    post them yet? They can't find them on
    xtrx.io.
  • 50:13 - 50:18
    Alexander Chemeris: Yeah, so they're not
    published yet. As I said, we haven't
  • 50:18 - 50:25
    released them. So, the drivers and
    libraries will definitely be available,
  • 50:25 - 50:29
    FPGA code... We are considering this
    probably also will be available in open
  • 50:29 - 50:36
    source. But we will publish them together
    with the public announcement of the
  • 50:36 - 50:42
    device.
    Herald: Ok, that microphone.
  • 50:42 - 50:46
    Question 5: Yes. Did you guys see any
    signal integrity issues between on the PCI
  • 50:46 - 50:50
    bus, or on the bus to the LMS chip, the
    Lime Micro chip, I think, that is doing
  • 50:50 - 50:51
    the RF ?
    AC: Right.
  • 50:51 - 50:56
    Question 5: Did you try to measure signal
    integrity issues, or... because there were
  • 50:56 - 51:01
    some reliability issues, right?
    AC: Yeah, we actually... so, PCI. With PCI
  • 51:01 - 51:03
    we never had issues, if I remember
    correctly.
  • 51:03 - 51:05
    SK: No.
    AC: I just... it was just working.
  • 51:05 - 51:11
    SK: Well, the board is so small, and when
    there are small traces there's no problem
  • 51:11 - 51:15
    in signal integrity. So it's actually
    saved us.
  • 51:15 - 51:21
    AC: Yeah. Designing a small board is easier.
    Yeah, with the LMS 7, the problem is not
  • 51:21 - 51:26
    the signal integrity in terms of
    difference in the length of the traces,
  • 51:26 - 51:37
    but rather the fact that the signal
    degrades - in terms of voltage, more so at speed -
  • 51:37 - 51:44
    and drops below the
    detection level, and all this stuff. We
  • 51:44 - 51:47
    did some measurements. I actually wanted
    to add some pictures here, but decided
  • 51:47 - 51:54
    that's not going to be super interesting.
    H: Okay. Microphone over there.
  • 51:54 - 51:58
    Question 6: Yes. Thanks for the talk. How
    much work would it be to convert the two
  • 51:58 - 52:06
    by two SDR into an 8-input logic analyzer
    in terms of hard- and software? So, if you
  • 52:06 - 52:12
    have a really fast logic analyzer, where
    you can record unlimited traces with?
  • 52:12 - 52:19
    AC: A logic analyzer...
    Q6: So basically it's just also an analog
  • 52:19 - 52:27
    digital converter and you largely want
    fast sampling and a large amount of memory
  • 52:27 - 52:31
    to store the traces.
    AC: Well, I just think it's not the best
  • 52:31 - 52:40
    use for it. It's probably... I don't know.
    Maybe Sergey has any ideas, but I think it
  • 52:40 - 52:48
    just may be easier to get high-speed ADC
    and replace the Lime chip with a high-
  • 52:48 - 52:57
    speed ADC to get what you want, because
    the Lime chip has so many things there
  • 52:57 - 53:01
    specifically for RF.
    SK: Yeah, the main problem you cannot just
  • 53:01 - 53:09
    sample the original data. You have to shift it
    in frequency, so you cannot sample the
  • 53:09 - 53:17
    original signal, and using it for
    something else except spectrum analyzing
  • 53:17 - 53:21
    is hard.
    Q6: OK. Thanks.
  • 53:21 - 53:26
    H: OK. Another question from the internet.
    Signal angel: Yes. Have you compared the
  • 53:26 - 53:32
    sample rate of the ADC of the Lime chip
    to the USRP ADCs, and if so, how does the
  • 53:32 - 53:40
    lower sample rate affect the performance?
    AC: So, comparing low sample rate to
  • 53:40 - 53:49
    higher sample rate. We haven't done much
    testing on the RF performance yet, because
  • 53:49 - 53:58
    we were so busy with all this stuff, so we
    are yet to see how low sample rates
    compare to high sample
    versus sample rates versus high sample
    rate. Well, high sample rate always gives
  • 54:03 - 54:10
    you better performance, but you also get
    higher power consumption. So, I guess it's
  • 54:10 - 54:14
    the question of what's more more important
    for you.
  • 54:14 - 54:20
    H: Okay. Over there.
    Question 7: I've gathered there is no
  • 54:20 - 54:25
    mixer bypass, so you can't directly sample
    the signal. Is there a way to use the same
  • 54:25 - 54:32
    antenna for send and receive, yet.
    AC: Actually, there is... an input for the ADC.
  • 54:32 - 54:38
    SK: But it's not a bypass, it's a
    dedicated pin on LMS chip, and since we're
  • 54:38 - 54:46
    very space-constrained, we didn't route
    it, so you cannot actually bypass it.
  • 54:46 - 54:50
    AC: Okay, that's in our specific hardware. In
    general, in the LMS chip there is a
  • 54:50 - 54:58
    special pin which allows you to drive your
    signal directly to ADC without all the
  • 54:58 - 55:03
    mixers, filters, all this radio stuff,
    just directly to ADC. So, yes,
  • 55:03 - 55:07
    theoretically that's possible.
    SK: We even thought about this, but it
  • 55:07 - 55:11
    doesn't fit this design.
    Q7: Okay. And can I share antennas,
  • 55:11 - 55:16
    because I have an existing laptop with
    existing antennas, but I would use the
  • 55:16 - 55:22
    same antenna to send and receive.
    AC: Yeah, so, I mean, that's... depends on
  • 55:22 - 55:26
    what exactly do you want to do. If you
    want a TDD system, then yes; if you
  • 55:26 - 55:31
    want an FDD system, then you will have to
    put a small duplexer in there, but yeah,
  • 55:31 - 55:35
    that's the idea. So you can plug this into
    your laptop and use your existing
  • 55:35 - 55:40
    antennas. That's one of the ideas of how
    to use xtrx.
  • 55:40 - 55:42
    Q7: Yeah, because there's all four
    connectors.
  • 55:42 - 55:45
    AC: Yeah. One thing which I actually
    forgot to mention is - I kind of mentioned
  • 55:45 - 55:54
    in the slides - that any other SDRs
    which are based on Ethernet or on USB
  • 55:54 - 56:02
    can't work with CSMA wireless systems,
    and the most famous CSMA system is Wi-Fi.
  • 56:02 - 56:09
    So, it turns out that because of the
    latency between your operating system and
  • 56:09 - 56:18
    your radio on USB, you just can't react
    fast enough for Wi-Fi to work, because you
  • 56:18 - 56:23
    - probably you know that - in Wi-Fi you do
    carrier sense, and if you sense that the
  • 56:23 - 56:30
    spectrum is free, you start transmitting.
    That doesn't make sense when you have huge
  • 56:30 - 56:36
    latency, because all you know is that
    the spectrum was free back then, so,
  • 56:36 - 56:44
    with xtrx, you actually can work with CSMA
    systems like Wi-Fi, so again it makes it
  • 56:44 - 56:51
    possible to have a fully software
    implementation of Wi-Fi in your laptop. It
  • 56:51 - 56:59
    obviously won't work as well as your
    commercial Wi-Fi, because you will have to
  • 56:59 - 57:04
    do a lot of processing on your CPU, but
    for some purposes like experimentation,
  • 57:04 - 57:08
    for example, for wireless labs and R&D
    labs, that's really valuable.
  • 57:08 - 57:11
    Q7: Thanks.
    H: Okay. Over there.
  • 57:11 - 57:16
    Q8: Okay. What PCB design package did you
    use?
  • 57:16 - 57:18
    AC: Altium.
    SK: Altium, yeah.
  • 57:18 - 57:23
    Q8: And I'd be interested in the PCIe
    workshop. Would be really great if you do
  • 57:23 - 57:25
    this one.
    AC: Say this again?
  • 57:25 - 57:28
    Q8: Would be really great if you do the
    PCI Express workshop.
  • 57:28 - 57:33
    AC: Ah. PCI Express workshop. Okay. Thank
    you.
  • 57:33 - 57:37
    H: Okay, I think we have one more question
    from the microphones, and that's you.
  • 57:37 - 57:43
    Q9: Okay. Great talk. And again, I would
    appreciate a PCI Express workshop, if it
  • 57:43 - 57:47
    ever happens. What are these
    synchronization options between multiple
  • 57:47 - 57:55
    cards. Can you synchronize the ADC clock,
    and can you synchronize the presumably
  • 57:55 - 58:05
    digitally created IF?
    SK: Yes, so... unfortunately, pure IF synchronization is
  • 58:05 - 58:10
    not possible, because the Lime chip doesn't
    expose the LO frequency. But we can
  • 58:10 - 58:16
    synchronize digitally. So, we have special
    one PPS signal synchronization. We have
  • 58:16 - 58:25
    lines for clock synchronization and other
    stuff. We can do it in software. So the
  • 58:25 - 58:32
    Lime chip has phase correction register,
    so when you measure... if there is a phase
  • 58:32 - 58:35
    difference, so you can compensate it on
    different boards.
  • 58:35 - 58:39
    Q9: Tune to a station a long way away and
    then rotate the phase until it aligns.
  • 58:39 - 58:42
    SK: Yeah.
    Q9: Thank you.
  • 58:42 - 58:46
    AC: Little tricky, but possible. So,
    that's one of our plans for future,
  • 58:46 - 58:53
    because we do want to see, like 128 by 128
    MIMO at home.
  • 58:53 - 58:56
    H: Okay, we have another question from the
    internet.
  • 58:56 - 59:00
    Signal angel: I actually have two
    questions. The first one is: What is the
  • 59:00 - 59:08
    expected price after a prototype stage?
    And the second one is: Can you tell us
  • 59:08 - 59:10
    more about this setup you had for
    debugging the PCIe
  • 59:10 - 59:16
    issues?
    AC: Could you repeat the second question?
  • 59:16 - 59:20
    SK: It's ????????????, I think.
    Signal angel: It's more about the setup you had for
  • 59:20 - 59:24
    debugging the PCIe issues.
    SK: The second question, I think, is more a topic
  • 59:24 - 59:31
    for our next workshop, because it's a
    more complicated setup, so... we mostly
  • 59:31 - 59:36
    removed everything about it from the current
    presentation.
  • 59:36 - 59:40
    AC: Yeah, but in general, and in terms of
    hardware setup, that was our hardware
  • 59:40 - 59:48
    setup, so we bought this PCI Express to
    Thunderbolt3, we bought the laptop which
  • 59:48 - 59:53
    supports Thunderbolt3, and that's how we
    were debugging it. So, we don't need, like
  • 59:53 - 59:58
    a full-fledged PC, we don't have to
    restart it all the time. So, in terms of
  • 59:58 - 60:07
    price, we don't have the fixed price yet.
    So, all I can say right now is that we are
  • 60:07 - 60:18
    targeting no more than your bladeRF or
    HackRF devices, and probably even cheaper.
  • 60:18 - 60:25
    For some versions.
    H: Okay. We are out of time, so thank you
  • 60:25 - 60:45
    again Sergey and Alexander.
    [Applause]
  • 60:45 - 60:50
    [Music]
  • 60:50 - 60:55
    subtitles created by c3subtitles.de
    in the year 20??. Join, and help us!