15 April 2026 · 8 min read

FPGAs: Programming Hardware with Code (Yes, Really)

What Is an FPGA, Actually?

You know how a CPU runs your code by executing instructions one at a time? An FPGA does something completely different — it becomes the hardware your code describes.

An FPGA (Field-Programmable Gate Array) is a chip full of tiny configurable logic blocks connected by a programmable routing network. When you "program" an FPGA, you're not writing a program that runs on hardware — you're configuring the hardware itself.

The difference is profound:

CPU	FPGA
Runs instructions sequentially	Everything runs in parallel
Fixed hardware, flexible software	The hardware IS your design
Great for general purpose	Great for dedicated, high-speed tasks
~3 GHz clock, but only a few instructions at a time per core	~100 MHz clock, but thousands of operations at once

Think of a CPU as a Swiss Army knife. An FPGA is a factory where you design your own tools.

Real uses: high-frequency trading (microsecond decisions), 5G base stations, video processing, early Bitcoin mining (before ASICs took over), AI inference, radar systems. If it needs to be fast AND flexible, an FPGA is a strong candidate.

The Mental Model That Changes Everything

Before writing a single line of code, internalize this:

HDL code describes hardware, not behavior.

When you write this in Verilog:

assign out = a & b;

You're not telling a processor to "AND a and b and store the result." You're describing a circuit: out is permanently wired to the AND of a and b. Once your design is loaded, the FPGA implements that function in its configurable logic (typically a lookup table set up to behave as an AND gate), and it stays that way until you reprogram the chip.

This is why FPGA programming feels weird at first — you're not writing an algorithm, you're drawing a circuit with text.

HDL: The Language of Hardware

There are two main Hardware Description Languages:

VHDL — verbose, strongly typed, favored in Europe and aerospace
Verilog — concise, C-like, favored in Silicon Valley

We'll use Verilog because it's more readable for beginners. Everything here also applies conceptually to VHDL.

Setup: Free Tools

You don't need expensive hardware to start. Here's a completely free setup:

Simulation (no hardware needed):

# Install Icarus Verilog (simulator) + GTKWave (waveform viewer)
# macOS
brew install icarus-verilog gtkwave

# Ubuntu
sudo apt install iverilog gtkwave

Online alternative: EDA Playground — write and simulate Verilog in your browser, zero installation.

Cheap real hardware: The Basys 3 (~$150) or the iCEstick (~$25) are great starter boards.

Your First Module: Hello, AND Gate

In Verilog, the basic unit is a module — think of it as a component with input and output pins.

module and_gate (
    input  wire a,
    input  wire b,
    output wire out
);
    assign out = a & b;
endmodule

That's it. A two-input AND gate. Let's break it down:

module and_gate — name your component
input wire a, b — two input pins
output wire out — one output pin
assign out = a & b — the logic: output is a AND b
endmodule — done

assign creates a continuous assignment: it isn't executed once, it's a permanent connection. Whenever a or b changes, out follows (instantly in simulation, after a few nanoseconds of gate and wire delay on real hardware).

Simulation: See It Work Without Hardware

Write a testbench — a Verilog file that drives your module with test signals:

// testbench.v
module testbench;
    // Declare test signals
    reg a, b;        // reg = we drive these
    wire out;        // wire = module drives this

    // Instantiate the module under test
    and_gate uut (
        .a(a),
        .b(b),
        .out(out)
    );

    // Apply test stimuli
    initial begin
        $dumpfile("waves.vcd");   // save waveforms
        $dumpvars(0, testbench);

        // Test all combinations
        a = 0; b = 0; #10;  // wait 10 time units
        a = 0; b = 1; #10;
        a = 1; b = 0; #10;
        a = 1; b = 1; #10;

        $display("Simulation complete!");
        $finish;
    end

    // Print a line whenever a, b, or out changes
    initial begin
        $monitor("a=%b b=%b | out=%b", a, b, out);
    end
endmodule

Run it:

iverilog -o sim testbench.v and_gate.v
vvp sim

Output:

a=0 b=0 | out=0
a=0 b=1 | out=0
a=1 b=0 | out=0
a=1 b=1 | out=1
Simulation complete!

Open the waveform:

gtkwave waves.vcd

You'll see a visual timeline of all your signals. This is how hardware engineers debug — not with print statements, but with waveforms showing exactly how signals change over time.
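
Reading four lines of output is easy; reading four thousand is not. A common next step is a self-checking testbench that compares the output against the expected value and only reports mismatches. The sketch below is one way to do that. The module name and_gate_check_tb and the errors counter are made up for illustration, and and_gate is repeated so the file runs on its own (don't compile it together with and_gate.v, or the module will be defined twice).

```verilog
// Device under test, repeated here so this file is standalone
module and_gate (
    input  wire a,
    input  wire b,
    output wire out
);
    assign out = a & b;
endmodule

// Self-checking testbench (illustrative name)
module and_gate_check_tb;
    reg     a, b;
    wire    out;
    integer i;
    integer errors;

    and_gate uut (.a(a), .b(b), .out(out));

    initial begin
        errors = 0;
        // Walk through all four input combinations
        for (i = 0; i < 4; i = i + 1) begin
            {a, b} = i[1:0];
            #10;  // give the output time to settle before checking
            if (out !== (a & b)) begin
                $display("FAIL: a=%b b=%b out=%b", a, b, out);
                errors = errors + 1;
            end
        end
        if (errors == 0)
            $display("PASS: all 4 combinations correct");
        $finish;
    end
endmodule
```

Run it the same way as before (iverilog, then vvp). The !== operator also catches X and Z values, which a plain != comparison would let slip through.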

The Clock: Heartbeat of Digital Logic

Almost everything interesting in FPGAs is synchronous — it happens on the edge of a clock signal. The clock is a square wave that alternates between 0 and 1 at a fixed frequency. On each rising edge (0→1), flip-flops capture their inputs.

This is the fundamental building block of sequential logic:

module d_flip_flop (
    input  wire clk,
    input  wire d,
    output reg  q
);
    always @(posedge clk) begin
        q <= d;
    end
endmodule

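always @(posedge clk): "whenever there's a rising clock edge, do this"
q <= d: non-blocking assignment, sample d and store it in q
reg q: a variable that holds its value between assignments. Despite the name, reg only becomes a real register (memory) when it is assigned in a clocked block like this one; assigned in an always @(*) block, it is plain combinational logic.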

This flip-flop remembers the value of d at each clock edge. It's the Verilog equivalent of a variable — but it only updates once per clock cycle.

Critical rule: In always @(posedge clk) blocks, always use <= (non-blocking). In combinational always @(*) blocks, use = (blocking). Mix them up and you'll get subtle bugs that take days to find.
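
To see why the rule matters, here is a small sketch of the same two-stage shift register written both ways. The module names shift_nonblocking and shift_blocking are invented for this comparison; only the assignment operator differs.

```verilog
// Intended behavior: d reaches q1 after one clock edge and q2 after two.
module shift_nonblocking (
    input  wire clk,
    input  wire d,
    output reg  q1,
    output reg  q2
);
    // Correct: both right-hand sides are sampled before either register
    // updates, so q2 receives the OLD value of q1.
    always @(posedge clk) begin
        q1 <= d;
        q2 <= q1;
    end
endmodule

module shift_blocking (
    input  wire clk,
    input  wire d,
    output reg  q1,
    output reg  q2
);
    // Buggy: q1 is overwritten first, so q2 gets the NEW value of q1.
    // The two stages collapse into one.
    always @(posedge clk) begin
        q1 = d;
        q2 = q1;
    end
endmodule
```

In the blocking version, q2 always equals q1 after every edge, so the second stage adds no delay at all. The code still simulates and synthesizes without complaint, which is exactly what makes this bug hard to spot.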

Building a Counter

Let's build something actually useful — a 4-bit counter that counts from 0 to 15 and wraps around:

module counter (
    input  wire       clk,
    input  wire       reset,
    output reg  [3:0] count   // 4-bit output
);
    always @(posedge clk) begin
        if (reset) begin
            count <= 4'b0000;  // reset to 0
        end else begin
            count <= count + 1;  // increment
        end
    end
endmodule

New syntax:

[3:0] — a 4-bit bus (bits 3 down to 0)
4'b0000 — a 4-bit binary literal (4 bits, binary, value 0000)
4'd15 would be decimal 15, 4'hF would be hex F

Testbench for the counter:

module counter_tb;
    reg clk, reset;
    wire [3:0] count;

    counter uut (.clk(clk), .reset(reset), .count(count));

    // Generate clock: toggle every 5 time units
    always #5 clk = ~clk;

    initial begin
        clk = 0; reset = 1;
        #12 reset = 0;  // release reset between clock edges (rising edges are at 5, 15, 25, ...)
        #200 $finish;
    end

    initial begin
        $monitor("time=%0t count=%d", $time, count);
    end
endmodule

The always #5 clk = ~clk line generates a clock with 10-unit period (5 high, 5 low) — 100 MHz if each unit is 1 ns.
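
The "if each unit is 1 ns" part is something you can state explicitly with the `timescale directive. Here is a minimal sketch (clock_gen_tb and the edges counter are illustrative names) that pins the time unit to 1 ns and counts rising edges over 100 ns.

```verilog
// Time unit = 1 ns, precision = 1 ps
`timescale 1ns / 1ps

module clock_gen_tb;
    reg     clk   = 0;
    integer edges = 0;

    // 5 ns high + 5 ns low = 10 ns period = 100 MHz
    always #5 clk = ~clk;

    // Count rising edges
    always @(posedge clk) edges = edges + 1;

    initial begin
        #100;  // simulate 100 ns
        $display("rising edges in 100 ns: %0d", edges);
        $finish;
    end
endmodule
```

Rising edges land at 5, 15, ... 95 ns, so this reports 10 edges, which matches 100 MHz.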

Combinational vs Sequential: The Core Distinction

Everything in digital logic falls into one of two categories:

Combinational logic — output depends only on current inputs. No memory. No clock.

// Combinational: output is always inputs OR'd together
assign out = a | b | c;

// Or use always @(*) for more complex combinational logic
always @(*) begin
    case (sel)
        2'b00: out = a;
        2'b01: out = b;
        2'b10: out = c;
        default: out = 0;
    endcase
end

Sequential logic — output depends on inputs AND past state. Has memory. Uses clock.

// Sequential: output only changes on clock edge
always @(posedge clk) begin
    if (enable)
        stored_value <= new_value;
end

Most real designs are a mix: combinational logic computes new values, sequential logic stores them on each clock edge.
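
A minimal sketch of that split, using a hypothetical up/down counter (the names updown_counter, up, and next_count are not from any library, just chosen for this example): one always @(*) block works out the next value, one clocked block stores it.

```verilog
module updown_counter (
    input  wire       clk,
    input  wire       reset,
    input  wire       up,      // 1 = count up, 0 = count down
    output reg  [3:0] count
);
    reg [3:0] next_count;

    // Combinational part: compute the next value from the current
    // state and inputs. Blocking assignments, every path assigns.
    always @(*) begin
        if (up)
            next_count = count + 4'd1;
        else
            next_count = count - 4'd1;
    end

    // Sequential part: store the computed value on each rising edge.
    always @(posedge clk) begin
        if (reset)
            count <= 4'd0;
        else
            count <= next_count;
    end
endmodule
```

You could fold both parts into the clocked block for something this small. Keeping them separate tends to pay off once the next-value logic grows, because the combinational block can be read and tested on its own.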

A Real Design: PWM Generator

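Let's build something you'd actually use: a PWM (Pulse Width Modulation) generator. PWMs are used to control LED brightness, motor speed, servo position. The key idea: a signal that's ON 75% of the time delivers 75% of the average power, so an LED driven this way looks dimmer than one that is always on (perceived brightness isn't perfectly linear, but the effect is easy to see).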

module pwm (
    input  wire       clk,
    input  wire [7:0] duty,    // 0-255, duty cycle
    output reg        pwm_out
);
    reg [7:0] counter = 8'd0;  // start at 0; without this, simulation starts at X and never recovers

    always @(posedge clk) begin
        counter <= counter + 1;  // 8-bit counter wraps 0→255→0

        if (counter < duty)
            pwm_out <= 1;
        else
            pwm_out <= 0;
    end
endmodule

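If duty = 128, the output is high for 128/256 = 50% of cycles. duty = 0? Always low. duty = 255? High for 255 of every 256 cycles (about 99.6%): the comparison is counter < duty, so the output drops low for the one cycle where counter is 255. Reaching a true 100% would take one extra bit of duty.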

Connect this to an LED on an FPGA board and you've got a dimmable light. Change duty dynamically and you can fade in/out. The comparison runs on every clock edge (100 million times per second with a 100 MHz clock, for a PWM frequency of about 390 kHz) with zero CPU involvement.
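
Before wiring up an LED, you can check the duty cycle in simulation. The sketch below restates the pwm module in compact form (with the counter given an initial value so simulation starts from a known state) and adds a hypothetical testbench, pwm_tb, that counts how many of 256 consecutive cycles the output is high.

```verilog
module pwm (
    input  wire       clk,
    input  wire [7:0] duty,
    output reg        pwm_out
);
    reg [7:0] counter = 8'd0;  // known starting value for simulation

    always @(posedge clk) begin
        counter <= counter + 8'd1;   // wraps 255 -> 0
        pwm_out <= (counter < duty);
    end
endmodule

module pwm_tb;
    reg        clk  = 0;
    reg  [7:0] duty = 8'd64;         // expect 64/256 = 25%
    wire       pwm_out;
    integer    high_cycles = 0;
    integer    i;

    pwm uut (.clk(clk), .duty(duty), .pwm_out(pwm_out));

    always #5 clk = ~clk;

    initial begin
        @(posedge clk);              // first edge: pwm_out becomes valid
        // Sample on falling edges, midway between output updates
        for (i = 0; i < 256; i = i + 1) begin
            @(negedge clk);
            if (pwm_out) high_cycles = high_cycles + 1;
        end
        $display("duty=%0d high_cycles=%0d of 256", duty, high_cycles);
        $finish;
    end
endmodule
```

With duty set to 64 this reports 64 high cycles out of 256. Try 0 and 255 to see the edge cases for yourself.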

Finite State Machines: Giving Your Hardware a Brain

Real hardware logic often needs to remember what it's doing — it has states. This is where FSMs (Finite State Machines) come in.

Let's build a simple traffic light controller:

module traffic_light (
    input  wire       clk,
    input  wire       reset,
    output reg  [2:0] lights   // [2]=red, [1]=yellow, [0]=green
);
    // State encoding
    localparam RED    = 2'd0;
    localparam GREEN  = 2'd1;
    localparam YELLOW = 2'd2;

    reg [1:0]  state;
    reg [31:0] timer;

    // State durations (in clock cycles)
    localparam RED_TIME    = 100;
    localparam GREEN_TIME  = 80;
    localparam YELLOW_TIME = 20;

    always @(posedge clk) begin
        if (reset) begin
            state <= RED;
            timer <= 0;
            lights <= 3'b100;  // red on
        end else begin
            timer <= timer + 1;

            case (state)
                RED: begin
                    lights <= 3'b100;
                    // timer counts 0..RED_TIME-1, so RED lasts exactly RED_TIME cycles
                    if (timer >= RED_TIME - 1) begin
                        state <= GREEN;
                        timer <= 0;
                    end
                end

                GREEN: begin
                    lights <= 3'b001;
                    if (timer >= GREEN_TIME - 1) begin
                        state <= YELLOW;
                        timer <= 0;
                    end
                end

                YELLOW: begin
                    lights <= 3'b010;
                    if (timer >= YELLOW_TIME - 1) begin
                        state <= RED;
                        timer <= 0;
                    end
                end

                default: state <= RED;
            endcase
        end
    end
endmodule

This is a common single-block FSM pattern in Verilog:

Define states as localparam constants
Use a case statement to handle each state
Transition to the next state when conditions are met
Update outputs based on current state

FSMs appear everywhere in hardware: UART receivers, SPI controllers, memory arbiters, protocol handlers.
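
You'll also run into a two-block style, where the state register lives in a clocked block and the outputs are decoded combinationally from the current state. Here is a sketch of the same traffic light written that way. The module name traffic_light_2blk, the limit signal, and the tiny default durations are choices made for this example so that a simulation finishes quickly.

```verilog
module traffic_light_2blk #(
    parameter RED_TIME    = 4,   // durations in clock cycles
    parameter GREEN_TIME  = 3,
    parameter YELLOW_TIME = 2
) (
    input  wire       clk,
    input  wire       reset,
    output reg  [2:0] lights     // [2]=red, [1]=yellow, [0]=green
);
    localparam RED = 2'd0, GREEN = 2'd1, YELLOW = 2'd2;

    reg [1:0]  state;
    reg [31:0] timer;
    reg [31:0] limit;

    // Combinational: outputs and duration depend only on the current state
    always @(*) begin
        case (state)
            GREEN:   begin lights = 3'b001; limit = GREEN_TIME;  end
            YELLOW:  begin lights = 3'b010; limit = YELLOW_TIME; end
            default: begin lights = 3'b100; limit = RED_TIME;    end
        endcase
    end

    // Sequential: state register and timer
    always @(posedge clk) begin
        if (reset) begin
            state <= RED;
            timer <= 0;
        end else if (timer == limit - 1) begin
            timer <= 0;
            case (state)
                RED:     state <= GREEN;
                GREEN:   state <= YELLOW;
                default: state <= RED;
            endcase
        end else begin
            timer <= timer + 1;
        end
    end
endmodule
```

Because lights is decoded directly from state, it changes in the same cycle as the state does, rather than one cycle later. The trade-off is that the output comes from combinational logic instead of straight from a flip-flop.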

Timing: The Thing That Bites Everyone

Here's something that surprises every beginner: your logic has physical delay.

A signal takes a few nanoseconds to propagate through gates and wires. If your clock period is shorter than the longest propagation delay between two flip-flops (plus the setup time described below), those flip-flops will capture wrong values, and your design will fail in mysterious ways that show up on real hardware but not in ordinary RTL simulation.

This is called a timing violation, and the technique that catches it is called static timing analysis.

The key concept: setup time — how much before the clock edge does data need to be stable?

Data must be stable here ↓
                          |←setup time→|
─────────────────────────────────────────────→ time
                                        ↑
                                   Clock edge

Modern FPGA synthesis tools (Vivado, Quartus) automatically analyze timing and tell you if your design meets timing. If it doesn't, you either:

Slow down your clock
Add pipeline registers (break long paths into shorter stages)
Restructure your logic

For beginners: start with a slow clock (1-10 MHz) and don't worry about timing until you're hitting performance limits.
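
When you do get there, "add pipeline registers" looks roughly like this. The sketch compares a multiply-add done in one clock period with a version split into two stages. The module names mac_single and mac_pipelined and the signal c_d are invented for this example.

```verilog
// Single-cycle version: multiply AND add must both finish in one clock period
module mac_single (
    input  wire        clk,
    input  wire [7:0]  a, b,
    input  wire [15:0] c,
    output reg  [16:0] result
);
    always @(posedge clk)
        result <= a * b + c;
endmodule

// Pipelined version: a register sits between the multiplier and the adder
module mac_pipelined (
    input  wire        clk,
    input  wire [7:0]  a, b,
    input  wire [15:0] c,
    output reg  [16:0] result
);
    reg [15:0] product;   // stage 1 result
    reg [15:0] c_d;       // c delayed one cycle so it lines up with product

    always @(posedge clk) begin
        product <= a * b;           // stage 1: multiply only
        c_d     <= c;
        result  <= product + c_d;   // stage 2: add only
    end
endmodule
```

The pipelined version takes two clock edges to produce a result instead of one, but each stage has less logic between flip-flops, so the clock can usually run faster. It still accepts a new set of inputs every cycle, so throughput is unchanged.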

Synthesis: From Code to Silicon

The journey from Verilog to running hardware:

Verilog/VHDL
     ↓
  Synthesis          (maps your code to LUTs and flip-flops)
     ↓
 Place & Route       (physically places logic blocks on the chip)
     ↓
Timing Analysis      (checks if everything is fast enough)
     ↓
Bitstream Generation (creates the binary file to program the FPGA)
     ↓
 Programming         (load the bitstream onto the chip)

The main tools:

Tool	Vendor	FPGAs
Vivado	Xilinx/AMD	Artix, Kintex, Virtex, Zynq
Quartus	Intel/Altera	Cyclone, Arria, Stratix
iCEcube2, Diamond / nextpnr (open source)	Lattice	iCE40, ECP5

For open source toolchains (great for learning), the iCE40 family is well supported by Project IceStorm, used together with Yosys for synthesis and nextpnr for place and route.

The Big Gotchas

Gotcha #1: Everything is concurrent

In software, this runs top to bottom:

x = 5
x = x + 1  # x is now 6

In Verilog, these are independent concurrent assignments:

assign x = 5;        // one wire always = 5
assign x = x + 1;   // BROKEN: second driver on the same wire, and x feeding itself is a combinational loop

Gotcha #2: Non-blocking vs blocking assignments

Use <= in clocked blocks, = in combinational blocks. Always. No exceptions until you understand why.

Gotcha #3: Latches

If you write a combinational always block that doesn't assign a value in every case, you accidentally create a latch (memory that's not a flip-flop). Latches are timing nightmares.

// BAD: creates a latch (no else clause)
always @(*) begin
    if (enable)
        out = in;
    // What is out when enable=0? Latch!
end

// GOOD: explicit default
always @(*) begin
    if (enable)
        out = in;
    else
        out = 0;  // explicit
end
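
Another common way to stay latch-free, especially when the block has many branches, is to assign a default at the very top and then override it. A sketch, using a made-up module decoder_no_latch:

```verilog
module decoder_no_latch (
    input  wire [1:0] sel,
    input  wire       enable,
    output reg  [3:0] out
);
    always @(*) begin
        out = 4'b0000;          // default: covers every path not handled below
        if (enable) begin
            case (sel)
                2'b00: out = 4'b0001;
                2'b01: out = 4'b0010;
                2'b10: out = 4'b0100;
                // 2'b11 deliberately left out: the default above applies
            endcase
        end
    end
endmodule
```

Since out is assigned before any if or case is evaluated, there is no path through the block that leaves it unassigned, and no latch is inferred even though the case statement is incomplete.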

Gotcha #4: Simulation ≠ synthesis

Some Verilog is valid for simulation but can't be turned into actual hardware. #10 delay statements, for example, make no sense in synthesis. Stick to synthesizable constructs.
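
So how does real hardware "wait"? It counts clock cycles. Here is a sketch of a synthesizable stand-in for a delay: a module (tick_gen, a name made up here) that raises a one-cycle pulse every DIVIDE clock cycles. The parameter is assumed to be at least 2.

```verilog
module tick_gen #(
    parameter DIVIDE = 10   // one tick every DIVIDE clock cycles (assumed >= 2)
) (
    input  wire clk,
    input  wire reset,
    output reg  tick
);
    reg [31:0] count;

    always @(posedge clk) begin
        if (reset) begin
            count <= 0;
            tick  <= 0;
        end else if (count == DIVIDE - 1) begin
            count <= 0;
            tick  <= 1;      // one-cycle pulse
        end else begin
            count <= count + 1;
            tick  <= 0;
        end
    end
endmodule
```

With a 100 MHz clock and DIVIDE set to 100_000_000, tick pulses once per second. Other logic can use tick as an enable instead of relying on # delays, which only exist in simulation.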

What to Build Next

Now you know enough to be dangerous. Here's a learning path:

Beginner projects:

4-bit adder
Shift register
Debounce circuit (clean up noisy button presses)
7-segment display driver

Intermediate projects:

UART transmitter/receiver (serial communication)
SPI or I2C controller
VGA signal generator (put pixels on a monitor!)
Simple CPU (yes, you can build one)

Advanced:

Pipelined designs
DDR memory controller
Video processing pipeline
Soft-core processor (MicroBlaze, RISC-V)

The Real Magic

Here's what makes FPGAs genuinely exciting:

A CPU doing video processing works through pixels a few at a time, frame after frame. An FPGA can stream pixels through a dedicated pipeline in which every stage works on a different pixel in the same clock cycle, with deterministic timing measured in nanoseconds.

A CPU executing a neural network runs its multiplications a handful at a time. An FPGA can have 1,000 multipliers running in parallel, custom-fitted to your exact model dimensions.
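
Here is what "multipliers running in parallel" can look like in code. This is a sketch with a hypothetical module parallel_mul: the for loop is not executed step by step at runtime. The synthesis tool unrolls it into LANES separate multipliers that all update on the same clock edge.

```verilog
module parallel_mul #(
    parameter LANES = 4
) (
    input  wire                clk,
    input  wire [LANES*8-1:0]  a,   // LANES packed 8-bit operands
    input  wire [LANES*8-1:0]  b,
    output reg  [LANES*16-1:0] p    // LANES packed 16-bit products
);
    integer i;

    always @(posedge clk) begin
        // Unrolled at synthesis time: one multiplier per lane
        for (i = 0; i < LANES; i = i + 1)
            p[i*16 +: 16] <= a[i*8 +: 8] * b[i*8 +: 8];
    end
endmodule
```

Raising LANES costs more chip area, not more time: 4 lanes or 400, every product is ready after the same single clock edge, as long as the device has enough multipliers.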

This isn't magic — it's parallelism at the hardware level. You're not writing programs anymore. You're designing machines.

And now you know how.

Quick Reference Card

// Module skeleton
module name (input wire a, output reg b);
endmodule

// Continuous assignment (combinational)
assign out = a & b;

// Clocked block (sequential)
always @(posedge clk) begin
    q <= d;  // non-blocking!
end

// Combinational block
always @(*) begin
    case (sel)
        2'b00: out = a;
        default: out = 0;
    endcase
end

// Bus declarations
wire [7:0]  byte_wire;     // 8-bit wire
reg  [15:0] word_reg;      // 16-bit register
reg  [0:0]  bit_reg;       // 1-bit register (same as reg bit_reg)

// Literals
4'b1010    // 4-bit binary 1010
8'hFF      // 8-bit hex FF = 255
10'd100    // 10-bit decimal 100

// Constants
localparam MY_CONST = 42;

Now go build something. The hardware is waiting. ⚡