Changelog.md


Version 4.1a

  • Added
    • Model an arbitrary fixed latency between the LLC and the memory controller
  • Changed
    • For Ramulator and DRAMsim3, each memory access request is split into MEM_BUS_WIDTH-sized parts, and the latency of each part is queried
  • Fixed
    • Recalculate the rounding mode (rm) before executing an FP instruction during simulation
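
The request splitting described under "Changed" can be pictured with a minimal sketch. It assumes that `MEM_BUS_WIDTH` is a width in bytes and that the per-part latencies are simply summed; neither detail is stated above. The callback type `part_latency_fn` and the helpers `request_latency` and `flat_latency` are hypothetical names used only for illustration, not the simulator's or the DRAM libraries' actual API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: bus width in bytes. The simulator defines its own. */
#define MEM_BUS_WIDTH 8

/* Hypothetical stand-in for a per-part latency query to a DRAM model. */
typedef uint32_t (*part_latency_fn)(uint64_t paddr, int is_write);

/*
 * Walk the request [paddr, paddr + size) in MEM_BUS_WIDTH-sized steps,
 * query the latency of each part, and return the sum (summing is an
 * assumption of this sketch).
 */
uint32_t request_latency(uint64_t paddr, uint32_t size, int is_write,
                         part_latency_fn query)
{
    uint32_t total = 0;

    assert(query != 0 && size > 0);
    for (uint32_t off = 0; off < size; off += MEM_BUS_WIDTH) {
        total += query(paddr + off, is_write);
    }
    return total;
}

/* Toy model for the example: every part costs a flat 10 cycles. */
uint32_t flat_latency(uint64_t paddr, int is_write)
{
    (void)paddr;
    (void)is_write;
    return 10;
}
```

With these assumed values, a 64-byte cache-line fill is queried as eight parts, while a request smaller than the bus width is queried once.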

Version 4.0a

  • Added
    • Comprehensive logging support
    • Command-line option -sim-file-path to specify a top-level directory to store statistics and log files
    • Command-line option -sim-file-prefix to specify the prefix added to all simulator-generated files
    • Command-line option -sim-emulate-after-icount to specify the number of instructions to simulate after starting simulation mode
    • DRAMsim3 support
    • Ramulator support
    • Sample MARSS-RISCV configuration files for a 64-bit RISC-V In-order and Out-of-order SoC in configs folder
    • More performance counters to count different types of load instructions (byte, half-word, word, double-word)
    • Time-stamp to all the statistics files generated by the simulator
    • Specify latency in CPU cycles for RISC-V SYSTEM class instructions in the config file
    • Counter to track the number of CPU pipeline flushes
    • Counters to track each type of software exception and hardware interrupt processed during simulation
    • Parallel build support for Makefile
    • During the simulation, mtime is calculated using simulation clock cycles
    • Specify frequency for CPU and RTC device via the config file
    • Add option flush_on_context_switch in the config file to enable/disable flushing of BPU on a context switch
    • Start fetching the target from the next cycle on branch misprediction
    • Loads of non-word quantities (byte and half-word) take one extra cycle on a cache hit
    • Add function to invalidate entries in mem_request_queue on the mis-speculated path
  • Changed
    • Refactor and modularize the simulator code base
    • STORE type instructions submit write-request to L1-data cache and exit memory stage in a single cycle
    • Delay for reading/writing page-table entries is now simulated via L1-Data cache
    • Print IPC for all the RISC-V CPU modes after simulation completes to the console and log file
    • In-order core doesn't support parallel execution in multiple functional units
    • Replace hot-cold LRU eviction policy with bit-PLRU eviction policy for BTB and caches
    • Improve the format of TinyEMU config file
    • Update MARSS-RISCV Docs
    • Update README.md
    • Page walk delays are simulated via L1 D$
    • Removed DRAMsim2 support
  • Fixed
    • Memory leaks
    • Don't start simulating DRAM access delay until cache lookup delay is simulated
    • Branch entry is added to BTB, only after the branch is resolved
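
For readers unfamiliar with the bit-PLRU policy mentioned under "Changed", here is a minimal sketch of the textbook scheme for a single cache set. It is not taken from the simulator source: the type `plru_set_t`, the functions `plru_touch` and `plru_victim`, and the 4-way associativity are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed associativity for the example. */
#define NUM_WAYS 4

/* One MRU bit per way in a set (hypothetical layout). */
typedef struct {
    uint8_t mru[NUM_WAYS];
} plru_set_t;

/* Record an access: set the way's MRU bit. If every bit is now set,
 * clear all bits except the one just accessed. */
void plru_touch(plru_set_t *s, int way)
{
    int all_set = 1;

    assert(way >= 0 && way < NUM_WAYS);
    s->mru[way] = 1;
    for (int i = 0; i < NUM_WAYS; i++) {
        if (!s->mru[i]) {
            all_set = 0;
        }
    }
    if (all_set) {
        for (int i = 0; i < NUM_WAYS; i++) {
            s->mru[i] = (uint8_t)(i == way);
        }
    }
}

/* Pick the victim: the lowest-numbered way whose MRU bit is clear. */
int plru_victim(const plru_set_t *s)
{
    for (int i = 0; i < NUM_WAYS; i++) {
        if (!s->mru[i]) {
            return i;
        }
    }
    return 0; /* not reached while plru_touch keeps at least one bit clear */
}
```

The appeal of this scheme is its cost: one bit per way, with no ordering state to maintain, at the price of only approximating true LRU order.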

Version 3.1a

  • Added
    • Print TLB stats to the terminal after the simulation completes
    • Specify latency for each FPU ALU instruction (fadd, fsub, fmul, fdiv, fmin, fmax, fcvt, cvt, fle, flt, feq, fsgnj, fsqrt, fmv, fclass) via TinyEMU config file
    • Figure showing the high-level overview of MARSS-RISCV in README.md
  • Changed
    • Simplify the base DRAM model
      • All memory accesses simulate a fixed latency mem_access_latency
      • Any subsequent access to the same physical page incurs a lower delay, roughly 60 percent of the fixed mem_access_latency
      • More info here
    • Parallel operation of functional units can be enabled or disabled in the in-order core via TinyEMU config file
    • Clean exception handling code
    • Simulate page table entry read/write delays directly via memory controller using a configurable fixed latency pte_rw_latency
    • Don't stall the pipeline stage for the write request to complete on the memory controller
    • Make FPU-ALU non-pipelined
    • Rename dram_dispatch_queue to mem_request_queue
    • Update MARSS-RISCV Docs
    • Update README.md
    • Update TinyEMU config file here
  • Fixed
    • Memory leaks
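
The simplified base DRAM model from the "Changed" list can be sketched as follows. This is an illustration under stated assumptions, not the simulator's code: it assumes 4 KiB pages, tracks only the most recently accessed physical page, and applies exactly 60 percent where the entry says "roughly". The struct `dram_model_t` and function `dram_access` are hypothetical names; only `mem_access_latency` comes from the entry itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size of 4 KiB. */
#define PAGE_SHIFT 12

/* Hypothetical model state: the fixed latency plus the last page seen. */
typedef struct {
    uint32_t mem_access_latency; /* fixed latency in cycles */
    uint64_t last_page;          /* physical page of the previous access */
    int has_last_page;           /* 0 until the first access */
} dram_model_t;

/* Return the latency of one access and remember its physical page. */
uint32_t dram_access(dram_model_t *d, uint64_t paddr)
{
    uint64_t page = paddr >> PAGE_SHIFT;
    uint32_t latency;

    assert(d != 0);
    latency = d->mem_access_latency;
    if (d->has_last_page && page == d->last_page) {
        /* Same page as the previous access: 60 percent of the fixed latency. */
        latency = (d->mem_access_latency * 60) / 100;
    }
    d->last_page = page;
    d->has_last_page = 1;
    return latency;
}
```

With a `mem_access_latency` of 100 cycles, this sketch charges 100 cycles for the first access to a page and 60 cycles for an immediately following access to the same page.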

Version 3.0a

  • Added
    • Support for separate RISC-V Bios and Kernel
    • Command line option flush-sim-mem to flush simulator memory hierarchy on every fresh simulation run
    • Command line option sim-trace to generate instruction commit trace during simulation
    • Distinct configurable read-hit and write-hit latency for all the caches
    • Return address stack (RAS)
    • Branch prediction and speculative execution support for out-of-order core
    • Print performance counters on terminal when the simulation completes
    • More performance counters:
      • Instruction types
      • ecall
      • page walks for loads, stores and instructions
      • memory controller delay for data and instructions
      • hardware interrupts
  • Changed
    • Port to TinyEMU version 2019-12-21
    • For bimodal branch predictor, store prediction bits in a separate Branch history table (BHT)
    • For in-order core, non-memory instructions can forward their result from MEM stage in addition to EX stage
    • For in-order core, relaxed interlocking on WAW data hazard
    • Simplified out-of-order core design: ROB slots are now used as physical registers, along with a single rename table and a single global issue queue
  • Fixed
    • Correctly calculate the rounding mode when decoding floating-point instructions
    • Converted c.addiw result buffer into int32_t on 64-bit simulation
    • Set the data type to uint64_t for 64-bit simulation, for the buffer which holds the memory address for atomic instructions
    • Issue #13 and #14 (thanks to Okhotnikov Grigory)
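
As background for the bimodal predictor entry above, a separate branch history table is commonly organized as an array of 2-bit saturating counters indexed by low PC bits. The sketch below shows that generic scheme; the table size, the index function, and the names `bht_t`, `bht_predict`, and `bht_update` are assumptions for illustration and are not drawn from the simulator.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed table size; must be a power of two for the mask below. */
#define BHT_SIZE 256

/* 2-bit saturating counters: 0 and 1 predict not-taken, 2 and 3 predict taken. */
typedef struct {
    uint8_t counter[BHT_SIZE];
} bht_t;

/* Hypothetical index function: drop the lowest PC bit, keep the low index bits. */
static unsigned bht_index(uint64_t pc)
{
    return (unsigned)((pc >> 1) & (BHT_SIZE - 1));
}

/* Predict taken (1) or not-taken (0) for the branch at pc. */
int bht_predict(const bht_t *b, uint64_t pc)
{
    assert(b != 0);
    return b->counter[bht_index(pc)] >= 2;
}

/* Train the counter with the resolved outcome, saturating at 0 and 3. */
void bht_update(bht_t *b, uint64_t pc, int taken)
{
    uint8_t *c;

    assert(b != 0);
    c = &b->counter[bht_index(pc)];
    if (taken) {
        if (*c < 3) {
            (*c)++;
        }
    } else if (*c > 0) {
        (*c)--;
    }
}
```

Keeping the counters in their own table means prediction bits can exist for a branch independently of whether its target currently has a BTB entry.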

Version 2.0a

  • Added
  • Changed
    • Flush all the CPU caches and DRAM models for every new simulation run
  • Fixed
    • Issue #8: useless cleaning of local variables

Version 1.1a

  • Added
    • Add 16550A UART support (thanks to Marc Gauthier)
    • Add a timestamp suffix to the stats file
  • Changed
    • Reworked the DRAM latency parameters to match the SiFive HiFive U540 board
    • Increased the DRAM dispatch queue size from 32 to 64
  • Fixed
    • Calculation of hardware page walk latency
    • Miscalculation in page fault counters
    • Issue #2: memory leaks in copy_file
    • Issue #3: 'log' used instead of 'log2'