Message ID: 9074     Entry time: Tue Aug 27 19:34:36 2013     Reply to this: 9076
Author: Jamie 
Type: Configuration 
Category: CDS 
Subject: front end IPC configuration 

So the IPC situation on the front end network is not so great right now.  For various no-longer-valid reasons, c1lsc has no RFM card, so all of its IPC connections to the RFM hosts are routed through the c1rfm model on c1sus and forwarded to c1lsc over dolphin PCIe as needed.  As things grew, c1rfm became overloaded.  Koji tried to fix the situation by breaking things out of c1rfm and making direct connections where we could.  This cleared up c1rfm a bit, but now c1mcs is overloading.

Reminder: PCIe (dolphin) is faster and higher bandwidth than RFM.  The more things we can put on PCIe the better.
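
For reference, each of these connections shows up as a stanza in the RCG-generated IPC file (something like /opt/rtcds/caltech/c1/chans/ipc/C1.ipc; the path and field names below are from memory, so treat them as assumptions, and the values are made up).  On the software side the RFM/dolphin choice is just the transport recorded there, set by which cdsIPCx part the sending model uses:

  [C1:LSC-SOME_SIGNAL]
  ipcType=RFM0
  ipcRate=16384
  ipcHost=c1lsc
  ipcNum=12

(ipcHost is the host of the sending model; ipcType would read PCIE if both ends were on the dolphin network.)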

Attached is a graph of my rough accounting of the intended direct IPC connections between the front ends.  By "intended direct" I mean what should be direct connections if we had all the appropriate hardware.  Right now the actual connection graph is more convoluted than this, since things are passing through c1rfm.  I note this graph was NOT particularly easy to make, which is very unfortunate.  I had to manually look through every model and determine the ultimate source of every incoming IPC.  Kind of a pain in the butt.  It would be nice if there were a simple way to represent this.
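
Since I ended up doing this by hand, here is a minimal sketch of how it could be automated: read the sending host of every IPC from a C1.ipc-style file like the one above, grep each model for the names it receives, and emit a graphviz dot file.  The paths, the model-to-host mapping, and the name-matching heuristic are all assumptions, so this is a starting point rather than a working tool:

  #!/usr/bin/env python
  # Sketch: build a host-to-host IPC graph from the RCG IPC file and the models.
  # The paths, the MODEL_HOST mapping, and the name-matching heuristic are all
  # assumptions -- a starting point, not a working tool.
  import glob
  import os
  from collections import defaultdict

  IPC_FILE   = '/opt/rtcds/caltech/c1/chans/ipc/C1.ipc'
  MODEL_GLOB = '/opt/rtcds/userapps/release/*/c1/models/c1*.mdl'

  # which front end runs which model (incomplete, illustrative only)
  MODEL_HOST = {'c1lsc': 'c1lsc', 'c1sus': 'c1sus', 'c1mcs': 'c1sus',
                'c1rfm': 'c1sus', 'c1ioo': 'c1ioo',
                'c1scx': 'c1iscex', 'c1scy': 'c1iscey'}

  def parse_senders(path):
      """Map each IPC signal name to its sending host ([NAME] / ipcHost= stanzas)."""
      senders, name = {}, None
      for line in open(path):
          line = line.strip()
          if line.startswith('[') and line.endswith(']'):
              name = line[1:-1]
          elif name and line.startswith('ipcHost='):
              senders[name] = line.split('=', 1)[1]
      return senders

  def main():
      senders = parse_senders(IPC_FILE)
      edges = defaultdict(int)        # (sender host, receiver host) -> # of IPCs
      for mdl in glob.glob(MODEL_GLOB):
          model = os.path.splitext(os.path.basename(mdl))[0]
          receiver = MODEL_HOST.get(model, model)
          text = open(mdl).read()
          for sig, sender in senders.items():
              # crude: a foreign signal name appearing in the model text is
              # taken to mean this model receives it
              if sender != receiver and sig in text:
                  edges[(sender, receiver)] += 1
      print('digraph ipc {')          # render with: dot -Tpng ipc.dot > hosts.png
      for (src, dst), n in sorted(edges.items()):
          print('  "%s" -> "%s" [label="%d"];' % (src, dst, n))
      print('}')

  if __name__ == '__main__':
      main()

Running the output through graphviz (dot -Tpng) should give a picture along the lines of the attached one.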

Here are some possible solutions to the problem as I see it:

a) put c1lsc on the RFM network

This would allow c1lsc to talk to c1ioo, c1iscex, and c1iscey without having to go through c1sus, thereby eliminating c1rfm altogether.  I'm not sure why we didn't just do this originally.

Requires:

  • One RFM card for c1lsc

b) put c1ioo on the PCIe network (and move c1sus's RFM card to c1lsc)

This is probably the most robust solution.

b1) There are roughly 8 IPCs going from c1ioo to c1sus, 4 going the other way, and 3 from c1ioo to c1lsc.  If we put c1ioo on PCIe, all of these currently-RFM connections would become direct PCIe connections, which would be a big win.

At this point only the end station front ends would be on RFM, and most of the connections to those come from c1lsc, so it would make sense to give c1lsc the RFM card, thereby eliminating a lot of stuff from c1rfm (see the quick check after the requirements list below).

Requires:

  • dolphin card for c1ioo (do the old Sun machines support these?  If not, we could swap the old Sun machine for one of the spare aLIGO-approved Supermicro machines we have on hand)
  • dolphin fibre to go to dolphin switch in 1X3 rack
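
As a quick check on the claim above, the edge tally from the sketch earlier can be used to list what would still have to ride on RFM once c1ioo is on dolphin (host names assumed as before):

  # 'edges' is the (sender host, receiver host) -> count dict built in the
  # sketch above; with c1ioo on dolphin, only end-station traffic needs RFM.
  def still_on_rfm(edges, dolphin=('c1lsc', 'c1sus', 'c1ioo')):
      for (src, dst), n in sorted(edges.items()):
          if src not in dolphin or dst not in dolphin:
              print('%s -> %s : %d IPCs still on RFM' % (src, dst, n))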

b2) OR, we could move c1ioo to 1X4 with c1lsc and c1sus, and get a OneStop fibre cable to connect it to its IO chassis.  We would still need a dolphin card, but we could use copper instead of fibre.  This is my preferred solution, since it moves c1ioo out of 1X1, where it's really in the way and making a lot of noise.  It would also be easier to manage all the machines if they're together in one rack.

Requires:

  • dolphin card for c1ioo
  • dolphin copper cable for c1ioo
  • OneStop fibre for c1ioo

c) put another cpu in c1sus

c1sus is (I believe) able to support another 6-core CPU.  If we added more cores to c1sus, we could break up c1rfm into c1rfm0, c1rfm1, etc. (see the sketch after the requirements list below).  This is a less elegant solution imho, but it would probably do the job.

Requires:

  • one new CPU for c1sus
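
If we went this route, the split itself could be done mechanically: take the signals currently relayed through c1rfm and deal them out across the new models.  A throwaway sketch, reusing the assumed parse_senders()/IPC_FILE from above; the name test is just a placeholder for however we'd actually identify the pass-throughs:

  # Deal the c1rfm pass-through signals out over N relay models (c1rfm0,
  # c1rfm1, ...).  parse_senders()/IPC_FILE come from the sketch further up;
  # the startswith('C1:RFM') test is only a placeholder for "relayed via c1rfm".
  N_RELAYS = 2

  relayed = sorted(sig for sig in parse_senders(IPC_FILE)
                   if sig.startswith('C1:RFM'))
  for i in range(N_RELAYS):
      chunk = relayed[i::N_RELAYS]
      print('c1rfm%d would carry %d signals' % (i, len(chunk)))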
Attachment 1: hosts.png  36 kB  Uploaded Wed Aug 28 20:38:50 2013