ISPD 2017 Contest : Clock-Aware FPGA Placement


  • Latest

    1. The README states the following:
      Each half column consists of two columns.
      However, upon close examination of the file half-col-regions.txt generated by Vivado, we noticed two things:
      1 - There are two half columns that consist of ONE column, not two, specifically 14 and 15.
      2 - There is no half column on column 103. It is a column with IO sites so this is not a very serious problem.
      We would like to know whether this follows some pattern that we are supposed to deduce from the design.scl file, or whether we should hard-code the half columns as in the file generated by Vivado. Also, are the half-column positions and structure the same across all benchmarks, or should we be prepared to determine at runtime exactly where the half columns are?

      Both of your observations are true.
      For (1), you can still keep the same "two-columns" rule in your placer. The difference between two-columns and the accurate one is small. This way your process is a lot easier.
      For (2), yes. It is an IO column, so it has no effect.
    2. Do BUFGCEs consume clock resources for the 24 rule? In the example below, for a certain clock net, all its loads are placed in the red box region, and the clock source (BUFGCE) is fixed in the green region. My questions are: 1) Does the BUFGCE consume one clock resource in its clock region? 2) Do I need to preserve routing resources on the routing path (yellow regions) from the source to the loads for this clock?

      1) No. You don't need to reserve clock resource for BUFGCE;
      2) No. You don't need to reserve clock resources on the routing path;
      In summary: as long as you meet the contest rules, your placement is regarded as clock legal. Contest benchmarking won't do clock routing, and even if it did, there is a very high chance the routing would succeed.
    3. I just want to clarify the control signal: Does it mean all FFs have the same CE and SR signals? So when performing legalization, I don't need to consider the rule for CE and SR, and only need to consider the capacity (16 FFs in one SLICE) and clock rules (two clocks)?
      There are only two global control signals (controlSig0 and controlSig1) in all designs. Along with these two global control signals, however, there are a few local control signals connected to the CE and SR pins of the flops. So to pack these FFs into a SLICE, control set consideration (the rule for CE and SR) is very much required.
    4. Can you release some difficult benchmarks that show clock legalization problems?
      Yes, a hard testcase will be released in a few days.
    5. Can I assume that all IO nodes are fixed in the evaluation?
      Yes, all IOs and BUFGs are fixed.
    6. In the Vivado log, I noticed the following message: "Optional param is set. Router will ignore 47 Global Clock nets." Does this mean that satisfying the clock region constraint is enough and we do not need to care about the routing of clock nets?
      Yes. For this contest, as long as your placement passes the clock legalization check, it is regarded as legal in terms of clock routing.
    7. Will clock nets be counted in the total wirelength computation?
      No. Clock nets are excluded from the total wirelength.
    8. Regarding file "half-col-regions.txt". It seems that we can calculate that from the .scl file. If so, what information does the half-col-regions.txt file provide?
      In earlier Vivado patches (04 or earlier), the clock legalization checker only reported the half-column region (col,row). This file was there for converting a half-column region (col,row) to placement coordinates. The file also helps in understanding the half-column rule and the half-column regions.
    9. We noticed that some half columns include IO columns, which accommodate BUFGCEs. For example, the left-most half column (starting from x=104) is an IO column in the clock region X3Y0. Does BUFGCE also consume the clock resource for its belonging half column? (i.e. it needs to be counted in the 12-clock rule.)
      No, BUFGCE itself doesn't count. It's the loads of the BUFG that count. If a clock has some loads that reside in this half-column, the clock is counted.
    10. Global clock nets are defined as the nets connecting to the clock buffers' output pins. Is that correct?
    11. There will be at most two global clock nets (e.g., controlSig0 and controlSig1) connecting to the SR and CE ports of each FF.
      Yes, for this set of benchmarks, including hidden benchmarks.
    12. The first clock legalization rule ("Number of global clocks in each clock region is at most 24 clocks.") means that each clock region can overlap with the bounding boxes of at most 24 different global clock nets.
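
    The answer to item 7 above (clock nets are excluded from total wirelength) can be sketched as follows. This is only an illustration: the half-perimeter wirelength (HPWL) model, the function names, and the `nets` / `clock_nets` data layout are assumptions made for the example, and this page does not say which wirelength model the evaluation uses.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]


def hpwl(pins: List[Point]) -> float:
    """Half-perimeter wirelength of the bounding box of one net's pins."""
    if len(pins) < 2:
        return 0.0
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return float((max(xs) - min(xs)) + (max(ys) - min(ys)))


def total_wirelength(nets: Dict[str, List[Point]], clock_nets: Set[str]) -> float:
    """Sum HPWL over all nets, skipping nets listed as clock nets."""
    return sum(hpwl(pins) for name, pins in nets.items() if name not in clock_nets)


# Hypothetical nets: name -> pin locations (x, y).
nets = {
    "net_a": [(0, 0), (3, 4)],           # HPWL = 3 + 4 = 7
    "net_b": [(1, 1), (1, 6), (2, 3)],   # HPWL = 1 + 5 = 6
    "clk0": [(0, 0), (100, 100)],        # clock net, excluded from the total
}

print(total_wirelength(nets, clock_nets={"clk0"}))
```

    Passing the clock nets as a separate set keeps the wirelength routine independent of how clock nets are identified in the netlist.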

  • Clocking Specific

    1. Number of global clocks in each clock region is at most 24 clocks
      Within each clock region, each half column has at most 12 clocks
      Is this the complete set of clock routing constraints? That is, if all of these are satisfied and the placement is legal w.r.t. site types, can all the clock nets be routed (not considering signal nets)?

      Please refer to the Problem Definition page for a set of clock routing rules (for track assignment) that lead to a routable solution. However, those routing rules are in addition to the challenge definition and, as such, need not be satisfied for this challenge.
    2. Suppose there is a half-column with 12 different clocks and the half column immediately next to it (left/right) has another set of 12 different clocks. Can the clocks be routed in this case?
      Yes, the half columns within a clock region can have different clocks.
    3. Each clock region has enough resources to accommodate all clock loads assigned to that region.
      Does this only mean legality with respect to site types or is there some other aspect?

      Yes, this applies to legality with respect to site capacities.
    4. If both distribution and routing networks are segmented at clock regions, then how can a clock tree span more than one clock region?
      As mentioned, distribution and routing networks are segmented at clock region boundaries. This means there are tri-state buffers at the boundaries that either allow a clock to traverse to a neighboring clock region, or can be disabled to allow the same track to be used by two clock nets in neighboring clock regions.
    5. So non-clock routing will not use the routing and distribution network resources, right?
      Yes, you can assume signals that are not driven by a BUFGCE do not use routing/distribution resources.
    6. Does the second rule cover the first one? That is, for generating a placement result without violating any clocking constraint, we only need to ensure that for each half column of a clock region, k does not exceed 12, where k is the number of different clock signals of the FFs placed in the half column.
      No. The second rule doesn't cover the first rule. Each clock region has many half columns.
    7. What is the definition of "half column" in the 2nd clocking rule? Could you define the belonging sites of each half column?
      Half columns start from the leftmost column of the clock region. There is an upper half column and a lower half column. Every two neighboring upper half columns form one group subject to the 12-clock rule, starting from the leftmost. The same holds for every two lower half columns.
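
    The half-column definition in item 7 and the 12-clock rule can be sketched as follows. This is a minimal sketch under stated assumptions, not the contest checker: the uniform region size (`REGION_W`, `REGION_H`), the split of each region into equal upper and lower halves, and all function names are hypothetical. The real region geometry comes from the benchmark files, and the single-column exceptions noted in the Latest section are ignored here. Following the answers above, only clock loads are counted, not the BUFGCE itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical clock-region size in sites; real values come from the benchmark.
REGION_W = 4
REGION_H = 6

HalfColumn = Tuple[int, int, int, str]  # (region_x, region_y, column_pair, half)


def half_column_key(x: int, y: int,
                    region_w: int = REGION_W, region_h: int = REGION_H) -> HalfColumn:
    """Map a site (x, y) to its half column under the two-columns rule."""
    rx, ry = x // region_w, y // region_h
    pair = (x - rx * region_w) // 2            # columns pair up from the leftmost
    half = "upper" if (y - ry * region_h) >= region_h / 2 else "lower"
    return (rx, ry, pair, half)


def clocks_per_half_column(loads: Iterable[Tuple[str, int, int]]) -> Dict[HalfColumn, Set[str]]:
    """Collect the distinct clocks that have at least one load in each half column."""
    clocks: Dict[HalfColumn, Set[str]] = defaultdict(set)
    for clock, x, y in loads:
        clocks[half_column_key(x, y)].add(clock)
    return clocks


def violations(loads: Iterable[Tuple[str, int, int]], limit: int = 12) -> Dict[HalfColumn, int]:
    """Return the half columns whose distinct-clock count exceeds the limit."""
    return {k: len(v) for k, v in clocks_per_half_column(loads).items() if len(v) > limit}


# 13 clocks with loads in columns 0 and 1 (lower half of region (0, 0)),
# plus one clock with a load in the next column pair.
loads = [(f"clk{i}", i % 2, 0) for i in range(13)] + [("clk99", 2, 5)]
print(violations(loads))
```

    The same counting structure, keyed by clock region instead of half column and with a limit of 24, would cover the first rule; the second rule does not imply the first because each region contains many half columns.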

  • General

    1. Is parallel computing allowed? Can I submit Windows binary?
      No. Only single-threaded 64-bit Linux executables are allowed in this contest.
    2. Does the Contest provide an ILP solver (such as CPLEX or Gurobi)? If yes, could you tell me the usage for the solver? If no, can we embed an ILP solver in our submitted program?
      No, the contest doesn't provide an ILP solver. You can embed a solver in your submitted binary (statically linked).
    3. What is the suggested command line to run the placer?
      PlacerName -aux design.aux -out
    4. What kind of the output format should the placer produce?
      Please see "Bookshelf Format for FPGA Placement" below.
    5. Is there any routing information provided like number of tracks?
      Routing information is not provided. Participants can read Xilinx manuals to understand FPGA routing architecture.
    6. Do I have to submit source code ?
      No, you don't have to submit source code. You do need to submit a binary executable for Linux (statically linked preferred). Please submit your alpha version by Feb 15th. The organizer will need to verify that your binary can be run in the testing environment.

  • Benchmarks

    1. We were wondering if you could explain more about the format of IOs in the benchmarks. For example, for FPGA-example1, how many IOs are available and how many of them are allocated?
      All the IOs of the benchmark should be placed/fixed. IOs include: IBUF/OBUF/BUFG

  • Vivado Evaluation Flow

    1. Can we get the integration flow with Vivado?
      Yes, the Vivado integration flow will be released. Users will need to apply for licenses and download Vivado 2016.4 and a special patch to use the integration flow.
    2. Can I get the second Vivado license?
      Yes, you can get multiple Vivado licenses.

  • Legalization

    1. Please check Legalization Rules document for detailed information.

  • Bookshelf Format for FPGA Placement:

    1. Library cell (.lib file):
      • Each instance has a corresponding master library cell. It is defined in nodes file.
      • All library cells are defined in design.lib, a new addition to bookshelf format.
    2. PIN:
      • All pins are defined in library file (.lib) cell section.
      • Each instance has the same number of pins as defined in its master cell.
      • Not all the pins of an instance are used. Some are left unconnected.
      • Library file defines certain attributes associated with pins: direction, clock, and control.
      • Each net is a collection of pins, as specified in nets file.
    3. Layout file (.scl file):
      • Layout file is re-defined to accommodate FPGA placement.
      • There are two sections in the layout file: site definition section and site map section.
      • SITE definition specifies available resources (LUT/FF/RAMB/DSP) that can be placed in one site.
      • RESOURCES specifies cell names that correspond to certain resource.
      • SITEMAP specifies the two-dimension array of sites for the entire device/chip.
    4. Placement file (.pl file):
      • The location of an instance has three fields: x-coord, y-coord (to determine the SITE) and BEL (index within the SITE).
      • In released benchmarks, placement file only contains locations of fixed instances (IBUF/OBUF/BUFGCE etc). These instances' locations, including BEL numbers, are not allowed to change during placement.
      • Placer's output placement file should contain locations of all instances.
      • The following diagram shows the BEL number for LUTs/FFs placed inside a SLICE SITE:
              |   LUT 15   |   FF 15   |  
              |   LUT 14   |   FF 14   |  
              |   LUT 13   |   FF 13   |  
              |   LUT 12   |   FF 12   |  
              |   LUT 11   |   FF 11   |  
              |   LUT 10   |   FF 10   |  
              |   LUT  9   |   FF  9   |  
              |   LUT  8   |   FF  8   |  
              |   LUT  7   |   FF  7   |  
              |   LUT  6   |   FF  6   |  
              |   LUT  5   |   FF  5   |  
              |   LUT  4   |   FF  4   |  
              |   LUT  3   |   FF  3   |  
              |   LUT  2   |   FF  2   |  
              |   LUT  1   |   FF  1   |  
              |   LUT  0   |   FF  0   |  
      • The following is a snippet of a placement file:
              inst_1000 165 161 0                (this instance is a LUT)
              inst_1001 165 161 1                (this instance is a LUT)
              inst_1002 165 161 15               (this instance is a LUT)
              inst_1003 165 161 0                (this instance is a FF)
              inst_1004 165 161 15               (this instance is a FF)
              inst_1100 29 0 0                   (this instance is a DSP)
              inst_1101 29 2 0                   (this instance is a DSP)
              inst_1200 34 0 0                   (this instance is a BRAM)
              inst_1201 34 5 0                   (this instance is a BRAM)
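
    Reading placement lines of the form shown above can be sketched as follows. This is a minimal sketch, not a full Bookshelf parser: the `Loc` type and `parse_pl` function are hypothetical names, and the sketch assumes that each placement line starts with an instance name followed by three integers (x-coord, y-coord, BEL) and that anything after those four fields can be ignored.

```python
from typing import Dict, NamedTuple


class Loc(NamedTuple):
    x: int    # x-coord of the SITE
    y: int    # y-coord of the SITE
    bel: int  # index within the SITE


def parse_pl(text: str) -> Dict[str, Loc]:
    """Parse 'name x y bel' lines; lines that do not fit this shape are skipped."""
    placement: Dict[str, Loc] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        name, x, y, bel = fields[:4]
        try:
            placement[name] = Loc(int(x), int(y), int(bel))
        except ValueError:
            continue  # not a placement line (assumed header or comment)
    return placement


sample = """
inst_1000 165 161 0                (this instance is a LUT)
inst_1002 165 161 15               (this instance is a LUT)
inst_1003 165 161 0                (this instance is a FF)
inst_1100 29 0 0                   (this instance is a DSP)
"""

placement = parse_pl(sample)
print(placement["inst_1002"])
```

    Note that inst_1000 (a LUT) and inst_1003 (a FF) share the same x, y and BEL values in the snippet. The BEL index is counted per resource type, so a legality check needs the instance's library cell type in addition to its location.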