
Tuesday, October 19, 2010

Moving to POWER7 from SPARC

Last weekend I was reading a thread about moving to POWER7 from SPARC. I found it very interesting, so I thought I would share it with you.


There is an attention-grabbing discussion going on at LinkedIn. The subject is:
“Moving to POWER7 from SPARC”


I think this is quite an exciting topic, or rather a very strategic decision to make, so I decided to write a brief summary of it.

The discussion starts with this question:

“My employer is considering seriously moving to POWER7 from SPARC as we retire EOSL (End of Service Life) hardware. Has anyone considered such a move or made the move from POWER to SPARC?”

Very difficult to answer, isn’t it? Yes, it is!


I’m just trying to summarize the comments that came from several experts.


- One very valid point that I agree with: don't change everything unless there is a very compelling reason. If there is a particular need to run AIX, you can integrate AIX servers or an AIX cluster into your data centre without changing everything else, which means less overhead and is the more sensible decision. This is not a technical point, but it is the first thought that would come to my mind if my management asked me to make such a move.


- Datacenter power, cooling and space point of view: if you are struggling with power, cooling and space in the datacenter, consider mixing T-series and M-series SPARC systems. T-series systems are very efficient on power and work best with highly parallel applications, but they do not do well on single-threaded applications. For those you need to stick with the M-series.


- Cost point of view: Solaris also runs on x86 (Intel/AMD architecture), which is an option we don't have with AIX. For raw processing power and large memory footprints, Solaris 10 on Intel Nehalem is very attractive. You don't get all the RAS features of SPARC hardware, but if you have load-balanced applications or an edge layer that you can move there, it can be a great fit. Solaris also supports x64 CPUs for excellent price/performance.


- The most vital point: with the recent SPARC T3 servers (the T3 processor is also known as “Rainbow Falls”) that were announced last week at Oracle OpenWorld, POWER7 isn't as desirable a platform. Considering that a SPARC T3-4 can perform as well as, and in many benchmarks better than, a 4-socket POWER7 box, but at considerably lower TCO, I don't see the point in the IBM/AIX/POWER route.

The SPARC T3 processor has the following specifications:

 
- 16 Cores x 8 Threads = 128 Threads Running at 1.65GHz
- 2 x Execution Units per Core with 4 x Threads Each
- 1 x Floating Unit and 1 x Crypto Unit per Core
- 6MB of Shared L2 Cache
- 2 x DDR3 Memory Controllers
- 6 x Coherency Links for Glue-Less SMP up to 4 Sockets
- 2 x 10GbE NIU Controllers
- 2 x PCI-E 2.0 Controllers -> 8GB/s Bi-Directional I/O Each
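
To put the thread count in perspective, here is a minimal Python sketch that scales the per-chip figures above up to the 4-socket glueless SMP limit (the T3-4 mentioned earlier). The function and constant names are my own and purely illustrative; only the numbers come from the list above.

```python
# Per-chip figures taken from the SPARC T3 specification list above.
CORES_PER_SOCKET = 16
THREADS_PER_CORE = 8
MAX_SOCKETS = 4  # glueless SMP limit quoted above


def hardware_threads(sockets: int) -> int:
    """Return the total hardware thread count for a box with `sockets` T3 chips."""
    if not 1 <= sockets <= MAX_SOCKETS:
        raise ValueError(f"sockets must be between 1 and {MAX_SOCKETS}")
    return sockets * CORES_PER_SOCKET * THREADS_PER_CORE


if __name__ == "__main__":
    for sockets in (1, 2, 4):
        print(f"{sockets} socket(s): {hardware_threads(sockets)} hardware threads")
```

So a single chip exposes 128 hardware threads to the OS, and a fully populated 4-socket system exposes 512.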

Not done yet! Here are more details of the various flavors of T3.



- Licensing factor: IBM will charge you for licenses left and right for each feature, especially on the virtualization front (LPARs, MPARs and WPARs), not to mention all the external components (HMCs, etc.). In contrast, you can use Oracle VM Server for SPARC (LDoms) for free with the server and only pay for an RTU and support for S10 or S11 once for the whole machine (you can have hundreds or thousands of guests for no additional charge)! And don't forget that Solaris Containers are free and available on both x86 and SPARC. Plus they can be used inside LDoms (T-series) and Dynamic System Domains (M-series) for free! The Oracle core licensing factor on SPARC T3 is 0.25.
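
To show what a 0.25 core factor means in practice, here is a rough Python sketch. It assumes the usual processor-licensing arithmetic (total cores multiplied by the core factor, with fractions rounded up); that rounding rule and the function name are my assumptions, so check your actual Oracle contract before relying on the numbers.

```python
import math
from fractions import Fraction

# Core factor for SPARC T3 as quoted above (0.25).
T3_CORE_FACTOR = Fraction(1, 4)


def processor_licenses(sockets: int, cores_per_socket: int, core_factor: Fraction) -> int:
    """Estimate processor licenses: total cores x core factor, rounded up.

    The round-up rule is an assumption of this sketch, not something
    stated in the discussion summarized here.
    """
    total_cores = sockets * cores_per_socket
    return math.ceil(total_cores * core_factor)


if __name__ == "__main__":
    # One 16-core T3 chip, and a 4-socket T3-4.
    print(processor_licenses(1, 16, T3_CORE_FACTOR))
    print(processor_licenses(4, 16, T3_CORE_FACTOR))
```

Under these assumptions a single 16-core T3 socket counts as 4 processor licenses, and a 4-socket T3-4 with 64 cores counts as 16.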

- AIX has no equivalent technology to ZFS.


 - Solaris can scale to >64 CPUs to solve extremely large problems.


- Point to be noted: AIX on POWER is a good platform, and I don't want to badmouth AIX. But here are the two biggest issues with the platform:


1. Costs
2. Finding enough AIX folks to support you! [That doesn’t mean there are lots of resources available out there for supporting Solaris either; I mean "GOOD RESOURCES"…]


- Punch line: SPARC is not dead; it is actually more alive than ever, and Solaris is the most advanced OS on the market. Why change?


- Virtualization point of view: AIX on PowerVM has live migration (Live Partition Mobility)! Sun does not have a direct equivalent, BUT there are ways. Let's discuss them in detail.


You can migrate your LDoms in a cold or warm migration. Look at the LDoms 1.3 Admin Guide, chapter "Migrating Logical Domains", in particular the section "Migrating an Active Domain" on page 129 (Oracle VM Server for SPARC 2.0). With a warm migration, the LDom is suspended and moved within seconds to minutes, depending on its size and the network bandwidth. As for rebooting service domains, if you're on something like a T5240 or T5440, you can use external boot storage (SAN/JBOD/iSCSI) to keep a second service domain up, and you have enough PCI-E slots for redundancy. By splitting your redundancy (network and storage) between the two service domains, your LDom guests will continue running with IPMP and MPxIO, so there is no downtime. FYI, even if you reboot the Primary domain without a secondary service domain, the guests will continue to run and wait for the Primary to return, so all is not lost. The only way you'd lose all of your LDoms is if you lose power or reset the SC. You can also use Solaris Cluster to automate the migration or fail-over of LDom guests.


The bad things about the IBM PowerVM setup are the following:


1. High overhead! The VIO overhead can easily consume 40%-50% of your resources! Whereas on a T-series with LDoms, it's one core for the Primary domain and less than 10% overhead for the networking and storage I/O virtualization. The CPU threads are partitioned at the hypervisor and CPU level. RAM is virtualized at the hypervisor and MMU level. And on the M-series with Dynamic Domains you have 0% overhead because the CPU, memory and I/O are electrically partitioned on the centerplane. Even Solaris Containers have extremely low overhead, usually less than 5%. So I can get more out of my Solaris servers than you can in the IBM PowerVM world.


2. If there is a fault on a CPU or memory module, it can take out multiple LPARs/WPARs. Whereas on the M-series with Dynamic Domains, faults like that would only affect the Domain the CMU module was in. You can even use a mirrored memory configuration on the M-series to be fully fault tolerant. Not to mention that the Dynamic Domains are electrically isolated, something you don't get anywhere else.


3. Costs: you have to buy licenses to enable PowerVM features, and they add up quickly. Whereas LDoms and Dynamic Domains are free with the hardware.


NOTE: In AIX, a WPAR (the equivalent of a Solaris Container) can be migrated on the fly, BUT Solaris Containers cannot. Not to worry: it is something Sun-Oracle is working on, and we'll probably see it down the road. For now, if I needed to live migrate something, I would use an LDom!


I am not biased against the AIX/POWER architecture. This is a very vital decision to make and, at the same time, a very difficult one to reach a conclusion on. But with the help of the points above, my conclusion is: I’ll stick to the SPARC architecture!


Your comments on this will be very much appreciated.

6 comments:

  1. 1. Costs - everyone sells their product with a low-TCO label; it depends on how you argue it;
    2. Finding enough AIX folks to support you! - that's really pretty easy; it's really hard to find a system administrator who cannot support AIX;
    3. AIX has no equivalent technology to ZFS - like your argument in "Virtualization point of view", you can find a way to do everything ZFS can do without ZFS;
    4. Solaris can scale to >64 CPUs to solve extremely large problems. - Yes, to compete with a giant bulldozer, you need an army of guys with shovels.

    Stephen

  2. "2. Fault on a CPU or memory module"
    is not entirely correct. Yes, this can be an issue on the entry/midrange models.
    Not on the enterprise models, where every instruction to a core/memory module is checked (RAS features of Power systems). If you have CuOD memory/CPUs, a faulty CPU/memory module will be automatically replaced by a CuOD standby memory module/CPU.

  3. Interesting points.

    To anon#1, a bulldozer is no good if the steering wheel is backwards and it keeps breaking down, which AIX/VIO/SVC does constantly. To anon#2, I've never seen it work correctly.

  4. For performance-critical workloads, storage virtualization (VIO and SVC) is not recommended.
    Storage virtualization is good for cost saving if the slight performance degradation is acceptable.

    Stephen

  5. Nice blog, dude... it's very informative and the way you present your views is quite interesting. It would be great if you could share your e-mail... Thanks for sharing such beautiful information... keep blogging.

  6. We use AIX, Solaris, HP-UX, OVM Linux on Intel,...
    The TCO of AIX is very high, and the cost of managing AIX is very high...
    Finally, after 8 years of AIX usage, we will remove all AIX servers because the hardware is expensive to buy, the virtualization software is very expensive, and it is very expensive to manage.
    We will continue to use Solaris on SPARC, and we will migrate some apps and the Oracle database from AIX to OVM Linux.
    HP-UX will be removed at the next hardware refresh. HP-UX is dead, too expensive and too slow...
