# Syed Noorul Saad

saadpiece.com · Waterloo, ON · sn3syed@uwaterloo.ca · (226) 989 5840 · saads312

#### EDUCATION

#### University of Waterloo

Waterloo, ON

B.A.Sc Computer Engineering · GPA: 3.2/4.0 (recent three terms)

Sep 2022 - May 2027

Relevant courses: Reconfigurable Computing (Master's Level), Real-Time Operating Systems, Digital Hardware Systems, Computer Architecture, Compilers, Embedded Microprocessor Systems

#### TECHNICAL SKILLS

Languages: SystemVerilog, Verilog, C/C++, Python, Assembly (ARMv7, RISC-V)

Protocols: AXI4/AXI4-Lite, SPI, UART

Tools: Vivado (synthesis, STA, P&R), Quartus, CocoTB, GTKWave, Git, Linux

Concepts: RTL design, timing closure, clock domain crossing (CDC), pipelining/retiming, FSMs, high-performance design, design for verification, formal verification (Lean 4)

#### TECHNICAL PROJECTS

### RISC-V CPU + NPU Co-Processor · RISC-V ISA, Verilog, AMD Vivado, FPGA

- Designed **5-stage pipelined RV32I core**, implementing 40+ instructions with data forwarding, hazard detection, and branch prediction to enable hardware-accelerated MNIST inference
- Architected NPU co-processor with custom ISA extensions for hardware-accelerated matrix-vector operations on Xilinx FPGA

### Matrix-Vector Multiplication Engine (ECE 327) · SystemVerilog, AMD Vivado, FPGA

- Designed MVM engine (Microsoft Brainwave-inspired) on PYNQ-Z1 FPGA using **216 DSP48E1** slices (98% utilization): **27 parallel outputs at 280 MHz**
- Implemented **pipelined tanh unit** with Taylor series, achieving **47% critical path reduction** through retiming (170 → 320 MHz)

### UVM Design Verification Environment (Data Aligner) · SystemVerilog

- Architected UVM testbench with constrained-random stimulus: 100% code coverage, 95% functional coverage
- Implemented verification IP (agents, drivers, monitors, scoreboard) and debugged 12 edge cases via waveform analysis

### Shell Jr · C++, OpenAI API

- Built shell with command parsing, fork/exec process creation, I/O redirection, and AI-powered CLI explanations

#### EXPERIENCE

#### UW ASIC Design Team

Waterloo, ON

RTL Design Engineer

August 2025 - Present

- Leading team designing multiple **lightweight RISC-V cores** as masters for shared analog matrix-vector multiplier accelerator, targeting low-cost FPGA implementation with planned TinyTapeout integration
- Architecting custom RISC-V ISA extensions (2 new instructions) for memory transfer and accelerator access via AXI interconnect with req/grant arbitration between multiple masters
- Designed and verified **SPI peripheral module** with **CDC synchronizers** for metastability prevention; developed **CocoTB testbenches** for block-level validation
- Evaluating accelerator interface protocols and making architectural decisions for future single-die integration (transitioning from multi-FPGA to silicon)
- Architected an **ACK bus arbitration protocol** using **open-drain signaling** and encoded request lines (00–11) to coordinate bus access among digital accelerator components

#### VCast Online

Dubai, UAE

Software Engineer

January 2025 - May 2025

- Developed a **real-time client-server graph system** with asynchronous data handling, secure access control, and dynamic visualization of user-generated data
- Optimized data-path performance through query restructuring and caching, **cutting backend response time by 18%**

#### Dematic

Waterloo, ON

Technical Writer

May 2024 - August 2024

- Documented embedded control systems (PLCs, sensors, real-time architecture) for automated logistics equipment, strengthening technical communication and system-level understanding