STC > Based on Axel's suggestion (and since I will never have a small army of
STC > grad students, maybe a couple undergrads) I configured and priced a 64 core
STC > workstation (see below). If I understand correctly, the idea is to use
STC > these as stand alone systems. That is, for the typical size of my problem,
STC > I would never need to run in two of these machines and can therefore forgo
STC > the fastest networking options. This one was priced at approximately
STC > $9500.00.
that seems a bit expensive.
STC >
STC > A lammps specific question:
STC >
STC > 1. How does the memory partitioning work? In lammps output I see:
STC >
STC > Memory usage per processor = 3.96686 Mbytes
that is per MPI process. OpenMP threads share memory, MPI processes do not. please note that this is a lower limit for memory consumption. the biggest memory consumers are usually neighbor lists and complex analysis operations that need to store intermediate data. the first depends strongly on cutoff and particle density, the second on how convoluted an analysis you will program.
so the number of atoms alone may not be a good number to go on.
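to illustrate why cutoff and density matter so much, here is a rough back-of-the-envelope sketch. it assumes a uniform particle density, a half neighbor list, and 4 bytes per stored neighbor index. the function name and the example numbers are made up for illustration; real usage in lammps will be higher (ghost atoms, allocation overhead, additional lists).

```python
import math

def neighbor_list_mbytes(n_atoms, density, cutoff, bytes_per_entry=4, half_list=True):
    """Rough neighbor list size in MBytes for a uniform system.

    density is in atoms per unit volume, cutoff in the matching length
    unit (it should already include the neighbor skin).
    """
    # average number of atoms inside a sphere of radius `cutoff`
    neighbors_per_atom = 4.0 / 3.0 * math.pi * cutoff**3 * density
    if half_list:
        # a half list stores each pair only once
        neighbors_per_atom /= 2.0
    return n_atoms * neighbors_per_atom * bytes_per_entry / 1024.0**2

# hypothetical example: same atom count and density, cutoff doubled
small = neighbor_list_mbytes(32000, 0.8442, 2.8)
large = neighbor_list_mbytes(32000, 0.8442, 5.6)
print(f"{small:.1f} MBytes vs. {large:.1f} MBytes")
```

the estimate grows with the cube of the cutoff, so doubling the cutoff costs about eight times the memory for the same number of atoms.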
STC > Is this per core running? Or per physical processor. Say I am running in a
neither. MPI was mostly designed to be hardware agnostic (at least from the point of view of the programmer).
STC > 2 processor quad-core machine with 8 mpi processes. Does this translate to
STC > 4 MBytes per core? Or 1 MByte per core? My gut tells me the former. The
yes, the former: about 4 MBytes for each of the 8 MPI processes.
STC > reason I ask is because I am trying to extrapolate to my biggest system of
STC > interest and depending on the answer 64 GB may be overkill.
yes, but you won't really save money by installing less. 1GB/core is pretty much the lower limit of what you can get.
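as a quick sanity check of the arithmetic, a minimal sketch using the number from the quoted output and the 64 cores / 64 GB of the proposed configuration. the helper name is made up, and keep in mind that the reported number is only a lower limit.

```python
def total_mbytes(per_process_mbytes, n_mpi_processes):
    # MPI processes do not share memory, so the per-process figure adds up
    return per_process_mbytes * n_mpi_processes

reported = 3.96686  # MBytes per MPI process, from the quoted lammps output
total = total_mbytes(reported, 8)
print(f"8 MPI processes: {total:.2f} MBytes in total")

# the proposed machine: 64 GB shared by 64 cores
per_core_available = 64 * 1024 / 64  # MBytes per core
headroom = per_core_available / reported
print(f"available per core: {per_core_available:.0f} MBytes, "
      f"about {headroom:.0f}x the reported usage")
```

so the current test system could grow by a couple of orders of magnitude per process before 1GB/core becomes a concern, provided the neighbor lists and analysis do not blow up first.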
STC > And others not so lammps related but if anyone has input it would be
STC > welcome.
STC >
STC > 2. Since I am not going to be running on a network... would it be wise to
STC > forgo the network controller and just use the integrated one on the
STC > motherboard? The main use for the network adapter would be to download
STC > occasional data for analysis and to connect to the machine via ssh.
where would you plug in a dual channel 10GigE network card? do you even have a 1GigE port available where you're going to place the machine?
STC > 3. I am forgoing a RAID array in this configuration to stay within the
STC > $10,000.00 budget. One big concern I have is backing up the system but that
STC > is a conversation for another day.
hard drives are cheap. stick a bunch of them into the machine and configure a software raid. works very well. in fact, given the size and failure rate of hard drives, i configure a software raid-1 or raid-5 in *any* desktop i use.
make sure you have a USB-3.0 port and you will have no problem backing up. external hard drives with up to 1TB (at 2.5" form factor) are quite cheap these days and pretty fast. i use them like floppy disks.
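to compare what the two raid levels leave you with, a small sketch of the usable capacity, assuming identical disks. the function name is made up; the 3.0TB disk size and the 5 drive bays are taken from the quoted configuration.

```python
def usable_tb(level, n_disks, disk_tb):
    """Usable capacity in TB of a software raid built from identical disks."""
    if level == "raid1":
        if n_disks < 2:
            raise ValueError("raid-1 needs at least 2 disks")
        # every disk holds a full copy, so capacity is that of one disk
        return disk_tb
    if level == "raid5":
        if n_disks < 3:
            raise ValueError("raid-5 needs at least 3 disks")
        # one disk worth of space goes to parity
        return (n_disks - 1) * disk_tb
    raise ValueError(f"unsupported level: {level}")

mirror = usable_tb("raid1", 2, 3.0)   # two mirrored 3.0TB disks
parity = usable_tb("raid5", 5, 3.0)   # all five bays filled
print(f"raid-1: {mirror} TB usable, raid-5: {parity} TB usable")
```

either level survives the loss of one disk; raid-1 with two disks is the cheapest way to get there, raid-5 wastes less space once you have three or more disks.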
STC > 4. I have included a GPU in this configuration. In a cluster I can have a
STC > login node and can run several jobs simultaneously. Would an inherent
STC > disadvantage of the single workstation strategy be that this is not
STC > possible. Could I in principle run simultaneously a job with 32 cores, a
STC > job with 16 cores + gpu, and leave the other 16 cores for other activities
STC > (data processing, etc...)
<sigh>
if you want to do GPU computing, build a machine that does GPU computing well. this is not the platform for it. best get a small intel CPU based desktop machine and - since you are on a budget - don't waste your money on a hyper-expensive Quadro GPU, that you may be using only sparingly.
in general, it is almost always a bad idea to cram too many options into one machine. better get something that does what you do the most extremely well. the money you save by junking the quadro will buy you a nice little desktop with a powerful GeForce GPU, and you'll enjoy lightning fast graphics with your favorite OpenGL based viz tool. it'll do GPU computing as well.
...and after you have found out that GPU computing works for you, write a proposal to get time on one of the available machines with GPUs. the people running those are often quite *desperate* to find users that can use their machines (well). there are not that many applications around that make good use of GPUs, and even for those that do, only a small number of people are willing to experiment with them; most rather stick to what they know. as jeff pointed out, the effort to get external time is moderate, especially if you can show some experience, and there you can get access to all kinds of things that you'd like to experiment with.
i would not waste serious money on anything local, especially when you have a rather limited budget, unless you really know what you're doing.
</sigh>
STC > Thanks for any input!
well, you probably got more than you asked for.
axel.
STC >
STC > Selection Summary
STC >
STC > Processor: 4 x Sixteen-Core AMD Opteron™ Model 6378 - 2.4GHz 32MB Cache (115W TDP)
STC > Motherboard: AMD® SR5690+SR5670 Chipset - Dual Intel® Gigabit Ethernet - 8x LSI SAS2 Controller - IPMI 2.0 with LAN
STC > Memory: 8 x 8GB PC3-12800 1600Mhz DDR3 ECC Registered DIMM
STC > Chassis: Thinkmate® TWX-748TQ - 4U/Tower - 5 x 3.5" SAS/SATA - 1400W Redundant
STC > Hard Drive: 3.0TB SAS 2.0 6.0Gb/s 7200RPM - 3.5" - Seagate Constellation™ ES.3
STC > 5.25" Bay: LG 14x Blu-Ray Disc Rewriter and DVD/CD Rewriter with M-Disc (SATA)
STC > Video Card: NVIDIA® Quadro® K4000 3.0GB GDDR5 (1xDVI-I DL, 2x DP)
STC > Network Card: Intel® 10-Gigabit Ethernet Converged Network Adapter X540-T2 (Copper) (2x RJ-45)
STC > Peripherals: Microsoft Wired Desktop 400 Keyboard and Mouse (USB)
STC > Operating System: Ubuntu Linux 12.04 LTS Server Edition (No Media) (Community Support) (32-bit/64-bit)
STC > Operating System Installation: Please install my selected operating system in 64-bit mode where applicable. (Pre-Installed)
STC > Warranty: Thinkmate® Three Year Warranty with Advanced Parts Replacement and RSL
STC >
STC > Configured Tech Specs
STC >
STC > Processors
STC >   Product Line: Opteron 6300
STC >   Socket: Socket G34
STC >   Clock Speed: 2.40 GHz
STC >   HyperTransport: 6.4 GT/s
STC >   L3 Cache: 16 MB
STC >   L2 Cache: 8x 2MB
STC >   Cores/Threads: 16C / 16T
STC >   AMD Turbo Core Technology: Yes
STC >   Wattage: 115W
STC >
STC > Memory
STC >   Technology: DDR3
STC >   Type: 240-pin RDIMM
STC >   Speed: 1600 MHz
STC >   Error Checking: ECC
STC >   Signal Processing: Registered
STC >
STC > Motherboards
STC >   North Bridge: AMD SR5690+SR5670
STC >   Memory Technology: DDR3 ECC Registered
STC >   Memory Slots: 32 x 240-pin DIMMs
STC >   Expansion Slots: 2x PCI Express 2.0 x16, 2x PCI Express 2.0 x8, 1x UIO
STC >   Graphics Controller: Matrox G200 16MB DDR2 graphics
STC >   Network Controller: Intel® 82576 Gigabit (2-port)
STC >   Back-panel Interfaces: PS/2 keyboard and mouse ports, 7x USB 2.0 ports (2x rear, 4x header, 1x Type A), 2x RJ-45 LAN Ports, 1x RJ-45 Dedicated LAN for IPMI, 1x VGA port, 1x Fast UART 16550 Serial port, 1x Serial port header
STC >   On-Board Interfaces: 6 x SATA, 8 x SAS, 1 x USB
STC >   USB 2.0 Ports: 7 (2 rear ports, 1 onboard, 4 optional via header)
STC >   LAN Ports: 3 (2 LAN, 1 IPMI)
STC >   SAS 6Gbps Ports: 8
STC >   SATA 3Gbps Ports: 6
STC >   VGA Ports: 1
STC >
STC > Video Cards
STC >   Memory Capacity: 3 GB
STC >   Processor: NVIDIA Quadro K4000
STC >   DisplayPort Output: x2
STC >   DVI Output: x1
STC >
STC > Chassis
STC >   Product Type: 4U or Tower
STC >   Color: Black
STC >   Watts: 1400W
STC >   External Drive Bays: 5x 3.5" Hot-swap (SAS / SATA) Drive Bays, 2x 5.25" Peripheral Drive Bay, 1x 5.25" Bay for Floppy
STC >   Front Ports: 2x USB Ports
STC >   Cooling Fans: 3x 5000 RPM Hot-swap Cooling Fans, 3x 5000 RPM Hot-swap Rear Exhaust Fans
STC >
STC > Optical Drives
STC >   Product Type: BD-RE + DVDRW
STC >   Read Speed: 12x BD-ROM, 16x DVD-ROM, 48x CD-ROM
STC >   Write Speed: 14x BD-R, 16x DVD+/-R, 48x CD-R
STC >   Rewrite Speed: 2x BD-RE, 6x DVD-RW, 8x DVD+RW, 24x CD-RW
STC >
STC > Hard Drives
STC >   Rotational Speed: 7200RPM
STC >   Cache: 128MB
STC >
STC > Network Cards
STC >   Transmission Speed: 10Gbps Ethernet
STC >   Host Interface: PCI Express 2.1 x8
STC >   Cable Medium: Copper
STC >   Port Interface: 2x RJ-45
STC >   VT for Connectivity (VT-c): VMDq
STC >   VT for Directed I/O (VT-d): Yes
STC >
STC >