
Joint Analog Workgroup / MOS-AK Panel Session

By F4PGA

Please join us for a special joint panel webinar session for the CHIPS Alliance Analog Workgroup and MOS-AK Foundation.

This panel will feature speakers with 20 minute talks on the following topic areas:

  • Mehdi Saligane: Introduction to the open source EDA tool flow for IC design (with reference to [1])
  • Nikolaos Makris: EKV3 in ngspice using ADMSXL
  • Eric R. Keiter: Xyce and its support for commercial (HSPICE/Spectre) libraries/syntax
  • Tim Edwards: SkyWater 130 nm compatibility with ngspice
  • Kevin Cameron: an update on the P1800 (SystemVerilog) AMS standardization efforts (public doc [2])

There will be time for Q&A after each talk, as well as open discussion after the presentations conclude.

This webinar can be accessed via the following Zoom link, and will be recorded:

Dec 7, 2022 12:00 PM Eastern Time (US and Canada)

Topic: AWG/MOS-AK Panel Discussion

Please click the link below to join the webinar: https://zoom.us/j/93058965332

F4PGA open source flow gets a new Python-based build system and CLI tool

By F4PGA

One of the most recent projects developed within the workgroup is the unified f4pga CLI tool. In the broader context of our continuous efforts to make the FPGA space more unified and flexible, creating the f4pga CLI tool was a logical next step – it allowed us to wrap the underlying tools in a single CLI, making the F4PGA toolchain a more complete flow. The currently supported architectures are AMD’s (formerly Xilinx’s) 7 series, Lattice’s iCE40 and QuickLogic’s EOS S3. Details are here:

https://antmicro.com/blog/2022/09/f4pga-new-build-system-and-cli-tool

Skywater

By Blog

It’s great to learn that Google announced the expansion of its partnership with SkyWater Technology. The two companies are working together to release an open source process design kit (PDK) for SKY90-FD, SkyWater’s commercial 90 nm fully depleted silicon-on-insulator (FDSOI) CMOS process technology. SKY90-FD is based on MIT Lincoln Laboratory’s 90 nm commercial FDSOI technology and enables designers to create complex integrated circuits for a diverse range of applications.

You can read more at https://opensource.googleblog.com/2022/07/SkyWater-and-Google-expand-open-source-program-to-new-90nm-technology.html

Enhanced SystemVerilog Support for Yosys via Antmicro plug-in

By Blog

CHIPS Alliance is pleased to see Antmicro’s announcement of its contribution to the open source hardware community: an easy-to-use plug-in that allows any version of Yosys to import SystemVerilog-based designs. This development is made possible by the underlying use of the Unified Hardware Data Model (UHDM), a key open source data representation upon which EDA applications can be built. Details can be found in Antmicro’s blog post: https://antmicro.com/blog/2022/02/simplifying-open-source-sv-synthesis-with-the-yosys-uhdm-plugin/

Towards UVM: Using Coroutines for Low-overhead Dynamic Scheduling in Verilator

By Blog

This post was originally published at Antmicro.

Verilator is a popular open source SystemVerilog simulator and one of the key tools in the ASIC and FPGA ecosystem, which Antmicro is actively using and developing, e.g. by enabling co-simulation with Renode or Cocotb integration. It’s also one of the fastest HDL simulators available, outperforming many proprietary alternatives. It achieves that speed by generating highly optimized C++ code from a given hardware design. Verilator does a lot of work at compile time to make the generated (‘verilated’) code extremely fast, such as ordering statements in an optimal way.

Verilation diagram

This static ordering of code also means that support for some SystemVerilog features has been sacrificed to make Verilator so performant. Namely, Verilator does not support what is known as the stratified scheduler, an algorithm that specifies the correct order of execution of SystemVerilog designs. This algorithm is dynamic by nature, and does not fit with Verilator’s static approach.

Because of this, Verilator doesn’t support UVM, a widely used framework for testing hardware designs. Testbenches for Verilator have to be written in C++, which is not ideal – you shouldn’t have to know how to program in C++ in order to use a SystemVerilog simulator. Many ASIC projects are unable to take advantage of Verilator, because verification in this space is very often done with UVM. This is a gap that we, together with Western Digital, Google and the entire CHIPS Alliance, have been working to close, to enable fully open source, cloud-scalable verification usable by the broad ASIC industry.

A milestone towards open source UVM

Some of the key features UVM requires are dynamically-triggered event variables and delays. To support them, we introduced to Verilator what we call a dynamic scheduler with a proof-of-concept implementation which we described in more detail in a previous blog note earlier this year. Essentially, it enabled us to suspend execution of SystemVerilog processes when waiting for delays to finish or events to be triggered, thus postponing some of the scheduling from compile-time to runtime.

initial forever begin
    @ping;
    #1 ->pong;
end

initial forever begin
    #1 ->ping;
    @pong;
end

That thread-based implementation worked, but it required us to run each process in a design in a separate thread, using mutexes and condition variables to facilitate communication. With a working solution in hand, which proved that what we set out to do was possible, we started thinking about a different approach which would allow us to avoid the significant performance overhead introduced by threads and hopefully also simplify the implementation. That’s when coroutines came up as a possible solution.

What is a coroutine?

The concept of coroutines has been around for decades. Arguably, most programmers have used them, knowingly or not. They are available in some form for most modern programming languages, and now they are also included in the newest C++20 standard. But what are they exactly?

Normally, when a function or procedure is called, it needs to finish execution in order for the control flow to go back to a previously executed function. This is reflected in the way the call stack works. A coroutine is a generalization of the concept of a function, but it differs in that its execution can be paused at any point, and resumed from any other point in the program, even from a different thread. Implementations vary, but often this is achieved by allocating coroutine state on the heap.

Diagram depicting the call stack and coroutine state

Unlike threads which are commonly used in desktop operating systems, coroutines are a form of cooperative multitasking, meaning that they have to yield control by themselves – there is no scheduler controlling them from the outside. A programmer needs to specify when and where a coroutine should resume execution.

A popular use case for coroutines is writing generators. As the name suggests, a generator is used for generating some set of values, but instead of returning them all at once, it yields them one by one to the function that called the generator.

generator<uint64_t> fib(int n) {
    uint64_t a = 0, b = 1;
    for (int i = 0; i < n; i++) {
        b = b + std::exchange(a, b);
        co_yield a;
    }
}
for (uint64_t n : fib(40))
    printf("%llu\n", (unsigned long long)n);

Coroutines are also useful for asynchronous programming – for writing functions that start their execution on one thread but continue on another (e.g. a background thread intended for heavy computation).

ui_task click_compute() {
    label = "Computing...";
    co_await compute();
    label = "Finished!";
}

Currently, coroutines are supported by many C++ compilers, including GCC 11 and Clang 13 (which offers experimental support). It’s worth mentioning that Clang is excellent at optimizing them: if a coroutine does not outlive the calling function’s stack frame, and its state object’s size is known at compile time, the heap allocation can be elided. Coroutine state is then simply stored on the stack. This gives Clang a significant performance edge over GCC in some cases, such as when using generators.

Coroutines for dynamic scheduling

From the get-go, coroutines seemed like a good fit for dynamic scheduling of SystemVerilog in Verilator. As previously mentioned, they follow the cooperative model of multitasking, which is sufficient for handling delays and events in SV processes. Preemption is not necessary, as there is no danger of starving a task. That is because all SystemVerilog processes should yield in a given time slot either after they finish or when they’re awaiting an event.

A significant drawback of threads, which was what the initial implementation was based on, is that it’s not possible to spawn thousands of them, one for each process in a design. However, it is possible to spawn thousands of coroutines, and that number is only bound by the amount of RAM available to the user. Also, with coroutines, one does not have to worry about multithreading problems like data races. All multitasking can be done on one thread.

The only issue with coroutines is the allocation of coroutine state. However, there are ways to mitigate that by using a custom allocator, as well as only using coroutines for the parts of a design that actually require it. After all, dynamic scheduling is not relevant to the synthesizable subset of SystemVerilog.

Thus, we decided to go ahead and replace threads with coroutines in our implementation. The new approach immediately proved to be easier to work with, and development pace increased significantly. The new version already surpassed the thread-based implementation in completeness as well as performance, and is available here. Let’s take a closer look at how it works.

Implementation

initial forever begin
    @ping;
    #1;
    ->pong;
end

while (true) {
    co_await ping;
    co_await 1;
    resume(pong);
}

The general idea for the implementation was to reflect the behavior of SystemVerilog delay and event trigger statements in the co_await statement in C++20. This statement is responsible for suspending coroutines, and we use it to suspend SystemVerilog processes represented by coroutines in a verilated design.

When a delay is encountered, the current coroutine (or process) is suspended and put into a queue. When the awaited time comes, the corresponding coroutine is removed from the queue and resumed.

Diagram depicting how delays are handled

Event variables work in a similar way. When we are awaiting an event, we suspend the current coroutine and put it in what we call an event dispatcher. If the event is triggered at a later point, we inform the event dispatcher which resumes the corresponding coroutine.

Diagram depicting how event variables are handled

With all this, the C++ code that Verilator generates for delays and event statements is very similar to the original SystemVerilog source code.

initial forever begin
    @ping;
    #10;
    ->pong;
end

This SystemVerilog corresponds to the following C++ code. The snippet shown here is simplified for readability, but the structure of the verilated code is preserved.

Coroutine initial() {
    while (true) {
        co_await eventDispatcher[&ping];
        co_await delayedQueue[TIME() + 10];
        eventDispatcher.trigger(&pong);
    }
}

As mentioned before, one of the main reasons for the switch to coroutines is performance. The original, thread-based implementation was hundreds of times slower than vanilla Verilator when simulating CHIPS Alliance’s SweRV EH1 core. Just replacing threads with coroutines resulted in a threefold speedup in SweRV simulation. Further optimization – the most crucial part being detecting which parts of a design actually need dynamic scheduling – resulted in performance indistinguishable from vanilla Verilator when using Clang to compile the verilated code.

Next steps and future goals

There is still more work to be done. We are continuously working on improving the dynamic scheduler in the following areas:

  • working out some remaining edge cases,
  • making it work with Verilator’s built-in multithreading solution,
  • adding new test cases to push these new features to their limits.

Our goal is to provide the dynamic scheduler in Verilator as an optional scheduler that users can enable if they want more SystemVerilog compatibility. Of course, users should bear in mind that it is not as well-tested as Verilator’s default behavior, but this will most likely improve as we find more practical use cases for the solution.

Naturally, many more features are needed to provide full UVM support. This, among others, includes:

  • the built-in process class, which is used for controlling the behavior of a SystemVerilog process,
  • randomized constraints, which let the user generate test data easily by specifying constraints for random generation of said data,
  • better support for assertions, which are statements that allow for verifying that certain conditions are fulfilled by a tested design.

The dynamic scheduler is part of a bigger undertaking driven by Antmicro within the CHIPS Alliance to create fully open source toolchains and flows for FPGA and ASIC development. Together with Surelog/UHDM, a project aiming at providing a complete SystemVerilog parsing and elaboration solution, this brings us closer to being able to simulate, test and verify designs which use UVM with entirely open source tools.

SATA Design Implementation on FPGAs with Open Source Tools

By Blog

This post was originally published at Antmicro.

Real-world FPGA designs often require high-rate transmission protocols such as PCIe, USB and SATA, which rely on high speed transceivers for external communication. These protocols are used to interface with various devices such as graphics cards and storage devices, and many of our clients reach out to us specifically because they need the flexibility, high throughput and low latency of FPGAs.

In particular, for customers that deal with high data volumes (which is very common in video applications), implementing SATA to communicate and transfer data with e.g. an SSD hard drive is a must.

Since Antmicro believes in an open source, vendor-neutral approach to FPGAs, today we will describe how to build a SATA-enabled system using a completely open source flow, including the hardware platform, FPGA IP as well as, perhaps most importantly, tooling, which we have been developing as part of our bigger effort within CHIPS Alliance.

Origin and motivation

Antmicro is a pioneer in a software-driven approach to developing FPGAs. On top of new hardware description languages, open source IP and software that have been gaining traction in the FPGA space, one necessary missing element has been open source tooling. Open tools allow a workflow more familiar to software developers, who are used to simply downloading their toolchain without having to log in anywhere or manage licenses.

Moreover, open tools provide the great advantage of easy to set up CI systems that keep track of regressions and allow more efficient and robust development.

Some of our forward-looking customers such as Google require these kinds of workflows to take full control of their development toolchain, for various reasons: security, development productivity and scale. Others, like QuickLogic – who thanks to the cooperation with us became the first ever FPGA vendor to fully embrace open source tools – are looking to deliver a more tailored experience to their own customers, which is easier to do based on open source.

To prove the viability of open source FPGA tools, being able to implement high-speed interfaces that exercise the toolchain’s handling of high-speed transceivers is key; thus, a fully open source SATA implementation is a very good target, especially since an open source core, LiteSATA, was already available in our favorite open source SoC generator for FPGAs, LiteX. What was missing was a hardware platform, putting it all together, and – of course – tools.

Hardware setup

The SATA design we developed is meant to run on a Nexys Video board from Digilent, featuring a Xilinx Artix-7 200T FPGA, coupled with a custom expansion board connected through the FMC connector and hosting an M.2 SSD module. Thanks to the FMC connector on the Nexys Video, we achieved a relatively simple and modular hardware setup.

The FMC expansion board, developed by Antmicro, is fully open source and available on GitHub.

Open source SATA hardware setup

FPGA gateware and system block diagram

The FPGA design is generated with the LiteX SoC builder and the main components that we used are:

  • The VexRiscv RISC-V CPU
  • The LiteDRAM controller to communicate with the DDR memory
  • The LiteSATA core to communicate with the SSD module on the custom expansion board
  • A UART controller to control the system from the host

Moreover, the software running in the SoC includes a simple BIOS that can perform SATA initialization and basic read and write operations of the sectors in the SSD drive.

Running open source SATA diagram

Open source toolchain

The SATA setup proves that high speed protocols can be enabled on mainstream FPGAs such as Xilinx 7-series with an open source toolchain, with Yosys for synthesis and VPR for place and route. The LiteSATA IP core makes use of so-called GTP hard blocks, and in fact one of the main challenges we dealt with here was enabling these hard blocks in the Artix-7 architecture definition to get an end-to-end open source toolchain.

Other than enabling more coverage of popular FPGAs, much of our current FPGA toolchain effort goes into increasing the interoperability of tools like VPR and nextpnr as well as their proprietary counterparts, building a more collaborative ecosystem that allows the community – including universities, commercial companies, FPGA vendors and individual developers – to tackle the ambitious goal of open source FPGAs together.

For more information on the FPGA interchange format and the value it brings to open source FPGA tooling, refer to the dedicated Antmicro blog note. In the future, once that work is at a more advanced stage, LiteSATA will be one of the first example designs to be tested with the FPGA interchange-enabled tools.

Building and running the setup

The FPGA SATA design is available in the symbiflow-examples repository and can be built with the open toolchain, and run on the hardware setup described above.

After following the instructions to install the toolchain and preparing the environment, run the following to build the LiteX SATA design:

cd xc7
make TARGET="nexys_video" -C litex_sata_demo

When the bitstream is generated, you can find it in the build directory under litex_sata_demo/build/nexys_video.

To load the bitstream on the Nexys Video board you can use the OpenFPGALoader tool, which has support for the board.

Once the bitstream is loaded on the FPGA, you can access the BIOS console through the UART connected to your host system and run the following (note that X depends on the assigned USB device):

picocom --baud 115200 /dev/ttyUSBX

When the LiteX BIOS gives you control, you need to perform the SATA initialization before being able to read and write sectors on the drive. See the output below:
Running LiteSATA with SymbiFlow in console

Future goals

The work on enabling the SATA protocol in a fully open source flow was one of the steps on the way towards supporting PCIe in the toolchain, which will unlock even more advanced use cases. PCIe can be used for a variety of purposes, such as connecting external graphics cards or accelerators to an FPGA design, and generally enables even faster transmission rates to and from the FPGA chip.

Open Source FPGA Platform for Rowhammer Security Testing in the Data Center

By Blog

This post was originally published at Antmicro.

Our work together with Google and the world’s research community on detecting and mitigating the Rowhammer problem in DRAM memories has been proving that the challenge is far from solved and that a lot of systems are still vulnerable.
The DDR Rowhammer testing framework that we developed, together with an open hardware LPDDR4 DRAM tester board, has been used to detect new attack methods such as Half-Double and Blacksmith, and all data seems to suggest that more such methods will be discovered with time.

But consumer-facing devices are not the only ones at risk. With the growing role of shared compute infrastructure in the data center, keeping the cloud secure is critical. That is why we again teamed up with Google to bring the open source FPGA-based Rowhammer security research methodology to DDR4 RDIMM used in servers by designing a new Kintex-7 platform for that use case specifically, to foster collaboration around what seems to be one of the world’s largest security challenges.

Hardware overview

Open source data center Rowhammer tester board

The data center DRAM tester is an open source hardware test platform that enables testing and experimenting with various DDR4 RDIMMs (Registered Dual In-line Memory Modules).

The main processing platform on this board is a Xilinx Kintex-7 FPGA which interfaces directly with a regular DDR4 DIMM connector. The new design required more I/Os compared to the LPDDR version, which was a major driving factor for changing the Kintex-7 FPGA package from 484 to 676 pins.

Basing the test platform on the Kintex-7 FPGA allowed us to implement a completely open source memory controller – LiteDRAM – fully within the FPGA, just like for the LPDDR case. The system can thus be modified and re-configured on both the hardware and software level to freely sculpt memory testing scenarios, providing developers with a flexible platform that can be easily adjusted to new data center use cases. Our previous design targeted a single channel of a single LPDDR4 IC, featuring specially designed modules to make up for the fact that LPDDR memories aren’t meant to be particularly “modular”. For the data center use case, however, reflecting the more standardized nature of that space, the new board can handle a full-fledged, off-the-shelf DDR4 RDIMM with multiple DRAM chips.

As in the LPDDR4 version, the new board features different interfaces to communicate with the FPGA, such as RJ45 Gigabit Ethernet and a Micro USB console. Additionally, there is an HDMI output connector for development purposes. Other features include:

  • A JTAG programming connector
  • A microSD card slot and 12 MB of flash memory
  • HyperRAM – external DRAM that can be used as an FPGA cache.

What is worth stressing here is that unlike LPDDR4, DDR4 modules don’t have to be custom made and are available to buy off the shelf – an advantage that greatly expands the potential applicability and outreach of the platform.

Block diagram depicting open source data center Rowhammer tester platform

Using open source to transform data centers

The DRAM tester described here is meant, of course, to be used with the Antmicro open source Rowhammer testing framework mentioned in the opening of this blog note. The list of devices discovered to be vulnerable to attacks so far is significant, and the new design will help to cover a huge chunk of data center oriented memory modules.

The DRAM testing capabilities of the Rowhammer tester are not limited to DDR4 RDIMM and LPDDR4 memories. Plans for 2022 include support for LPDDR5 and DDR5, which will result in more hardware and collaborations, and hopefully more mitigation techniques. With an open source DRAM controller at its heart, the framework offers the potential of collaboration around building Rowhammer mitigations into the controller itself, using the transparency of open source IP to stay one step ahead of potential attacks.

The recent data center security work is part of our wider effort to bring open source tooling, methodologies and approaches to data center customers. In a similar vein, within the LibreBMC group in the OpenPOWER Foundation, we are leading a project to replace ASIC-based BMCs (baseboard management controllers) with soft CPUs running on popular and low-cost FPGA platforms. LibreBMC will be a completely transparent security and management solution in terms of both hardware and software, and includes two boards compatible with OCP’s DC-SCM standard, based on the Xilinx Artix-7 and Lattice ECP5 FPGAs respectively.

Complementing our software capabilities in scaling huge workloads and building robust design, development, test, CI, simulation and verification pipelines, our data center oriented platforms also include Scalenode, which shows how open source hardware can be used to build modular servers based on both ARM (Raspberry Pi 4 CM) and RISC-V (ARVSOM).

Our open source based services, ranging from ASIC and hardware design through IP to software development, let us offer comprehensive help to a wide array of data center customers, improving their security, development speed and collaboration capacities.

The DDR testing platform in a broader context

The data center DRAM tester is further proof that the open source hardware trend spearheaded by Antmicro can bring practical value, especially in terms of security and collaboration capability. Developing a completely open framework, configurable down to the DRAM controller itself, has led us to some fantastic collaborations and sparked ideas which would otherwise be impossible to implement. The CHIPS Alliance, the OpenPOWER Foundation and RISC-V International all have a keen interest in taking the memory controller work forward, potentially leading up to ASIC-proven DDR controller IP.

An open source IP ecosystem which we are actively participating in could revolutionize how ASIC and FPGA systems are built. It is one of the key components in a wider push for a more open source, pragmatic and software-centric approach to hardware that we are helping shape at the global level by participating in policy-making initiatives in the EU and US.

On a more down-to-earth note, the data center platform is yet another permissively licensed open source board in our arsenal, and can serve as a good reference design for Kintex-7 projects which we are happy to customize and build upon for other areas or types of research for our customers.

Software-driven ASIC Prototyping Using the Open Source SkyWater Shuttle

By Blog

This post was originally published at Antmicro.

The growing cost and complexity of advanced nodes, supply chain issues and demand for silicon independence mean that the ASIC design process is in need of innovation. Antmicro believes the answer to those challenges is bound to come from the software-driven, open source approach which has shaped the Internet and given rise to modern cloud computing. Applying the methodologies of software design to ASICs is, however, notoriously viewed as difficult, given the closed nature of many components needed to build chips – tools, IP and process design kits (PDKs for short) – as well as the slow turnaround of manufacturing.

The open source, collaborative SkyWater PDK project, combined with the free ASIC manufacturing shuttles running every quarter from Google and efabless, has been filling one of those gaps. Add to it the open source licensed ASIC design tools we are helping develop as part of CHIPS Alliance, as well as the massive parallelization capabilities offered by the cloud, and what you get is an ASIC design ecosystem on the verge of a breakthrough. To effect this change, together with Google, efabless, SkyWater and others we are working on more developments, including letting the shuttle designs benefit from software-driven hardware productivity tools such as LiteX and Renode (which we are currently helping the SkyWater shuttle effort to adopt), as well as new and exciting developments in the process technology itself.

If you want to participate in making ASIC design history, let us show you why and how the shuttle program is the way to do that. And by the end of this article, hopefully you will want to participate in the next, fourth shuttle, with a submission deadline at the end of this year.

chip tapeout

SkyWater PDK – some background

In May 2020 Google and SkyWater Technology Foundry released the first ASIC proven open source PDK. The PDK targets the 130 nm process which, while not state-of-the-art, is still in widespread practical use, especially in mixed-signal and specialized designs.

The PDK release involved restructuring the original code and data and properly documenting all the available cells in the PDK. This operation was performed in a collaboration between a group of industrial and academic partners, with Antmicro’s effort focused mostly on developing tools for automatic PDK structuring and documentation.

An open source PDK was a key missing piece in end-to-end open source ASIC development, but by itself it would not allow the average developer to feel the change. To enable developers to work with the PDK in practice and build fully open source chips with the fast turnaround necessary to breed innovation, Google funded the Open MPW Shuttle Program operated by efabless, a fellow CHIPS Alliance member. The program requires that applying projects be fully open source and based on a permissive license, target the 130 nm SkyWater process and use an open source ASIC toolchain. Projects accepted into the program are then manufactured, and the authors receive their packaged ASICs at no additional cost – production, packaging, testing and delivery are all covered.

The program is a great opportunity for any developer wanting to develop open source ASICs and contribute to the emerging open source ASIC community. The first shuttle program attracted 37 projects, including:

  • Five RISC-V SoCs
  • A cryptocurrency miner
  • A robotic app processor
  • A template SoC based on OpenPOWER
  • An Amateur Satellite Radio Transceiver
  • Analog/RF IPs
  • Four eFPGAs
  • Antmicro’s AES-128 core integration.

We have been assisting customers expressing the desire to participate in the SkyWater shuttle in assessing the feasibility of their designs, creating the necessary workflows and adapting the tools involved to their particular needs.

Our engineering services can be used to enhance your development teams with the ability to use open source tools more effectively and integrate them with your infrastructure in a way which allows you to benefit from open source’s capabilities while not disrupting your internal workflows unnecessarily.

In total, over 100 designs have been sent to fabrication so far, many authored by teams with a predominantly software background. With over 2500 users in the SkyWater open source PDK Slack, this is truly a community in the making.

Most of the designs in the shuttles use the Caravel harness design which implements a RISC-V CPU with some base peripherals, OpenRAM generated memory, an I/O ring and a user area where developers can place their designs. The harness design is meant to be a fixed block / starting point which significantly lowers the entry level for the ASIC developers, but as such is also subject to evolution to better answer the needs of the shuttle participants, which we will describe later in the note.

Open source ASIC tools

The core part of the PDK shuttle process uses the OpenLane toolchain, a flow based on the OpenROAD project, also a part of CHIPS Alliance. The toolchain implements all the steps required to generate a production-ready ASIC layout (GDS) from an RTL design.

ASIC design with SkyWater Shuttle diagram

Since production is the most expensive and time consuming part of the process, testing and validation are key stages in need of innovation, and the experiences learned from the SkyWater shuttle effort are invaluable.

Under the auspices of CHIPS Alliance, Google, Western Digital and Antmicro are leading the work on enabling fully open source SystemVerilog development, testing and validation. The work focuses on a number of design flow aspects, all of which are meant to improve the development experience and benefit from the inherent scalability and reusability of open source tools to offer practical value for teams building new ASIC designs.

Adoption of LiteX for Caravel

Open source design tools constitute one aspect of fully open source ASIC design. The other aspect, just as important as tooling, is open source, high-quality, reusable IP cores, and indeed the very rules of the SkyWater shuttle program encourage developers to open source their designs and reuse existing cores.

At the core of the shuttles is the Caravel harness. To improve the shuttle’s user experience and let the community benefit from a wider array of off-the-shelf tools and cores, we are assisting with the ongoing effort to adapt the Caravel design to be based on LiteX.

LiteX, a widely known open source SoC generator, will make it possible for more open source cores to be integrated with ASIC designs, ultimately lowering the entry barrier for software engineers. It comes with multiple ready-to-use cores, including an open source DRAM controller used in the Rowhammer test platform we described some time ago. This alternative harness, whose development you can track in a dedicated GitHub repository, will open the door to more contributions from the LiteX community and allow us to use a number of tools that we have already integrated, such as our open source simulation framework, Renode.

Renode’s hardware/software co-development capabilities

The LiteX framework provides developers with an easy way to experiment with various CPU cores. Testing a system against many possible cores, often running complex software, makes validation a non-trivial task.

Renode, Antmicro’s open source development framework, features advanced SW/HW/FPGA/ASIC co-simulation capabilities and has been directly integrated with LiteX to generate the necessary configurations that correspond to the hardware system. Renode supports a multitude of CPU, I/O IP, sensor and network building blocks, both native to LiteX and otherwise, allowing its users to simulate the entire platform design before implementation, i.e. in the pre-silicon stage.

Renode addresses the profound challenge of testing complex software, running it on various CPUs and using custom peripheral cores at the same time. Developers can make use of Renode’s ability to co-simulate with Verilator or with physical hardware, reducing the simulation time of SoC systems that utilize custom IP cores.
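To give a feel for how lightweight such a setup can be, a Renode simulation is typically started with a handful of monitor commands. The platform description and binary paths below are placeholders for illustration only:

```
# Minimal Renode script sketch – the .repl platform file and ELF path
# are placeholders, not a specific shipped configuration
mach create "litex-soc"
machine LoadPlatformDescription @platforms/cpus/litex_vexriscv.repl
sysbus LoadELF @firmware.elf
start
```

From there, co-simulation with verilated IP or physical hardware can be attached to the same machine, so the identical software binary runs against either a fully simulated or a mixed setup.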

Back in September, Antmicro presented a case of co-simulating the popular Xilinx Zynq-7000 SoC running Linux with a verilated FastVDMA core, and of course co-simulation with platforms like the PolarFire SoC is something we have been steadily improving on with our partner Microchip.

A similar kind of development methodology will be possible with the new Caravel harness.

Taking that HW/SW co-design workflow to its natural conclusion, as showcased by our work with Google, Dover Microsystems and others, Renode allows developers to build SW-oriented hardware faster than HDL-only workflows allow and benefit from the flexibility known from software development cycles, where iterations happen in a matter of days. Recently, Renode has been extended with support for RISC-V vector instructions, which translates into a further improvement of the development process for machine learning algorithms in open source ASICs.

Scaling into the cloud and hybrid setups

Building and testing ASIC designs is often a time- and resource-intensive task. The open source tooling approach, endorsed by the SkyWater shuttle program, has an important advantage over any proprietary alternative – it allows for virtually unlimited scaling of compute resources, as there are neither licensing costs nor other license-related limitations involved.

Developments around distributed and scalable cloud based CI/CD systems like self-hosted GitHub Actions runners in GCP, a collaboration between Antmicro and Google, are providing the ecosystem with new options for reliable, fast testing and deployment of ASIC designs. Cloud based CI systems can be built to combine both closed and open source solutions, providing hybrid solutions that fill the gaps of either approach. And on a more general level, scalable and accessible CI/CD systems facilitate collaboration between large and geographically distributed teams of developers.
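As an illustration, pointing a GitHub Actions job at a self-hosted runner pool is a small change in the workflow file. The job name and build command below are hypothetical placeholders:

```yaml
# Hypothetical workflow running an ASIC build on a self-hosted runner
# pool (e.g. autoscaled VMs in GCP) instead of GitHub-hosted machines
name: asic-build
on: [push]
jobs:
  harden:
    runs-on: [self-hosted, linux]   # dispatched to the self-hosted pool
    steps:
      - uses: actions/checkout@v3
      - name: Run the hardening flow
        run: make harden            # placeholder build entry point
```

Because the runners are machines you control, they can be sized for memory-hungry place-and-route jobs and scaled out for large regression runs.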

New developments

The SkyWater PDK is being constantly improved, extending the possibilities for future designs. One of the recent additions to the PDK is a ReRAM library which can be used to develop non-volatile memories in the SkyWater 130nm technology.

Further SkyWater PDK development plans include extending the PDK portfolio with 180nm, 90nm and 45nm technology processes – stay tuned for upcoming developments in that space!

Participate in shuttle runs

Three shuttle runs have already happened, and thanks to Google’s commitment as well as the overwhelming interest from business, research and government institutions, the project will continue through 2022 and most likely beyond. The 4th shuttle run is currently open and will be accepting submissions until December 31, 2021.

For projects that, for any reason, cannot be open sourced or submitted within the timeline of the open shuttle, a private shuttle called ChipIgnite has been created.

Open Source Debayerization Blocks in FPGA

By Blog

This post was originally published at Antmicro.

In modern digital camera systems, the captured image undergoes a complex process involving various image signal processing (ISP) techniques to reproduce the observed scene as accurately as possible while preserving bandwidth. On the most basic level, most CCD and CMOS image sensors use the Bayer pattern filter, where 50% of the pixels are green, 25% are red and 25% are blue (corresponding to the increased sensitivity of the human eye to the green color). Demosaicing, also known as debayering, is an important part of any ISP pipeline whereby an algorithm reconstructs the missing RGB components/channels of each pixel by performing interpolation of the values collected for individual pixels.
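To make the pattern concrete, here is a small pure-Python sketch (not from the post) that tiles the common RGGB Bayer arrangement over a sensor patch and confirms the 50/25/25 channel split:

```python
# Sketch: the 2x2 RGGB Bayer tile repeated over a sensor; each photosite
# records only one color channel, hence 50% G, 25% R, 25% B.
from collections import Counter

BAYER_TILE = [["R", "G"],
              ["G", "B"]]  # one common arrangement (RGGB)

def bayer_channel(row, col):
    """Return which color channel the photosite at (row, col) captures."""
    return BAYER_TILE[row % 2][col % 2]

# Count channel coverage over an 8x8 sensor patch
counts = Counter(bayer_channel(r, c) for r in range(8) for c in range(8))
print(counts)  # G: 32 of 64 photosites, R and B: 16 each
```

Demosaicing is the inverse problem: starting from one sample per photosite, estimate the two missing channels at every position.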

Diagram depicting debayerization process

The rapid development of FPGA technologies has made it possible to use advanced ISP algorithms in FPGAs even for high-res, multi-camera arrays, which is great news for resource-constrained real-time applications where image quality is essential. In our R&D work, we are developing reusable open source building blocks for various I/O and data processing purposes that can be used as a starting point for customer projects which need to be bootstrapped quickly, and those include IP cores for debayerization.

Open source debayerization

As part of a recent project, we implemented and tested an open source FPGA-based demosaicing system that converts raw data obtained from CCD or CMOS sensors and reconstructs the image using three different interpolation algorithms controlled via a dedicated wrapper. The three interpolation methods are:

  • Nearest neighbour interpolation, where the value of the nearest pixel is used to approximate the missing color component. It uses a 2×2 px matrix and is the lightest and easiest method to implement, which also makes it common in real-time 3D rendering.
  • Bilinear interpolation, which establishes color intensity by calculating the average value of the 4 nearest pixels located diagonally in relation to the given pixel. This method uses a 3×3 px matrix and gives better results than nearest neighbour interpolation, but takes up more FPGA resources.
  • Edge directed interpolation, which calculates the pixel components in a similar way to bilinear interpolation, but adds edge detection based on a 5×5 px matrix. This algorithm is the most sophisticated of the three, but it gives the best results and eliminates zippering artifacts.
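A compact pure-Python model of the first method helps illustrate what the small windows are doing. This is a sketch for illustration only; the actual FPGA cores process the image as a streaming, pipelined window rather than nested loops:

```python
# Illustrative nearest-neighbour demosaicing over an RGGB mosaic.
BAYER = [["R", "G"],
         ["G", "B"]]  # RGGB tiling

def channel_at(r, c):
    """Which channel the Bayer filter captures at position (r, c)."""
    return BAYER[r % 2][c % 2]

def demosaic_nearest(raw):
    """raw: 2D list of single-channel samples; returns 2D list of (R, G, B)."""
    h, w = len(raw), len(raw[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            rgb = {}
            # Scan neighbours, nearest first; keep the first sample seen
            # for each channel (a 3x3 window always covers a full 2x2 tile).
            for dr, dc in [(0, 0), (0, 1), (1, 0), (0, -1), (-1, 0),
                           (1, 1), (1, -1), (-1, 1), (-1, -1)]:
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w:
                    rgb.setdefault(channel_at(rr, cc), raw[rr][cc])
            row.append((rgb.get("R", 0), rgb.get("G", 0), rgb.get("B", 0)))
        out.append(row)
    return out

# A flat gray scene: every photosite reads 128, so the reconstruction
# should be neutral gray everywhere.
flat = [[128] * 4 for _ in range(4)]
print(demosaic_nearest(flat)[0][0])  # (128, 128, 128)
```

Bilinear interpolation would replace the "first sample seen" rule with an average over the matching neighbours, and edge directed interpolation would additionally pick the averaging direction based on local gradients.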

System structure

The demosaicing system consists of two parts. The first, and most important, is formed by the demosaicing cores implementing the three algorithms described earlier.

Diagram depicting demosaicing wrapper

The second part of the system runs in the FPGA and comprises the bootloader, an operating system (Linux), Antmicro’s open source FastVDMA IP core that controls data transfers between the demosaicing setup and the DDR3 memory, and a dedicated Linux driver that makes it possible to control the demosaicing cores from software.

Open source FPGA IP cores for vendor-neutral solutions

Apart from building highly capable vision systems based on FPGA platforms, we are developing various tools, open source IP cores and other resources to provide our customers with a complete, end-to-end workflow that they can fully control.

Some of our recent projects include the FPGA Interchange Format to enable interoperability between FPGA development tools, an open source PCIe link for ASIC prototyping in FPGA, and an FPGA-based testing framework for hardware security of DRAM. If you would like to benefit from introducing a more software-driven, open source friendly work methodology into your next product development cycle, reach out to us at contact@antmicro.com and keep track of our growing list of open IP cores at the Antmicro Open Source Portal.