
Tuesday, October 19, 2010

Moving to POWER7 from SPARC

Last weekend I was reading a very interesting thread about moving to POWER7 from SPARC, and I thought it was worth sharing with you.


There is an attention-grabbing discussion going on over at LinkedIn. The subject is:
“Moving to POWER7 from SPARC”


I think this is quite an exciting topic, or rather a very strategic decision to make, so I decided to write up a brief summary note on it.

The discussion starts with the question:

“My employer is considering seriously moving to POWER7 from SPARC as we retire EOSL (End of Service Life) hardware. Has anyone considered such a move or made the move from SPARC to POWER?”

Very difficult to answer, isn’t it? Yes, it is!


I’ll try to summarize the comments that came from the individual experts.


- There is a very valid point I agree with: don’t change everything unless there is a very compelling reason. Point to ponder: if there is a particular need to run AIX, you can simply integrate an AIX server or cluster into your data centre without changing everything else, which means less overhead and a more sensible decision. This is not a technical point, but it would be my first thought if my management asked me to make such a move.


- Datacenter power usage, cooling and space point of view - If you are struggling with power, cooling and space issues at the datacenter, consider mixing T-series and M-series SPARC systems. The T-series are very efficient on power and work best with highly parallel applications; however, they do not do well on single-threaded applications. For those you need to stick with the M-series, which works great on single-threaded apps.


- Another point - the cost factor - Solaris runs on x86 (Intel/AMD architecture), which is an option we don’t have with AIX. For raw processing power and large memory footprints, Solaris 10 on Intel Nehalem is very compelling. You don’t get all the RAS features of SPARC hardware, but if you have load-balanced applications or an edge layer you can move there, it can be a great fit. Solaris support for x64 CPUs also gives excellent price/performance.


- The very vital point - With the recent SPARC T3 servers, aka “Rainbow Falls”, announced last week at Oracle OpenWorld, POWER7 isn’t as desirable a platform. Considering that a SPARC T3-4 can perform as well as, and in many benchmarks better than, a 4-socket POWER7 box, at a considerably lower TCO, I don’t see the point in the IBM/AIX/POWER route.

The SPARC T3 processor has the following specifications:

 
- 16 Cores x 8 Threads = 128 Threads Running at 1.65GHz
- 2 x Execution Units per Core with 4 x Threads Each
- 1 x Floating-Point Unit and 1 x Crypto Unit per Core
- 6MB of Shared L2 Cache
- 2 x DDR3 Memory Controllers
- 6 x Coherency Links for Glue-Less SMP up to 4 Sockets
- 2 x 10GbE NIU Controllers
- 2 x PCI-E 2.0 Controllers -> 8GB/s Bi-Directional I/O Each
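
As a quick sanity check, on a running T3 box you can see this core/thread layout from Solaris itself with psrinfo. The output below is trimmed and illustrative, not captured from a real machine:

# psrinfo -pv
The physical processor has 16 cores and 128 virtual processors (0-127)
  The core has 8 virtual processors (0-7)
  ...
    SPARC-T3 (chipid 0, clock 1650 MHz)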

Not done yet! Here are more details on the various flavors of the T3.



- Licensing factor - IBM will charge you for licenses left and right for each feature, especially on the virtualization front (LPARs, MPARs, and WPARs), not to mention all the external components (HMCs, etc.). Whereas you can use Oracle VM Server for SPARC (LDoms) for free with the server and only pay for an RTU and support for S10 or S11 once for the whole machine (you can have hundreds or thousands of guests at no additional charge)! And don’t forget that Solaris Containers are free and available on both x86 and SPARC, plus they can be used inside LDoms (T-series) and Dynamic System Domains (M-series) for free! The Oracle core licensing factor on the SPARC T3 is 0.25.
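
To give a feel for how little effort (and money) a Solaris Container costs, here is a minimal sketch; the zone name and zonepath are made up:

# zonecfg -z testzone "create; set zonepath=/zones/testzone"
# zoneadm -z testzone install
# zoneadm -z testzone boot
# zlogin testzone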

- AIX has no equivalent technology to ZFS.
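
For anyone who hasn’t used ZFS, here is a tiny sketch of why it is hard to give up: a mirrored pool, a snapshot, and an instant rollback in four commands (pool, dataset, and disk names are illustrative):

# zpool create tank mirror c0t0d0 c0t1d0
# zfs create tank/home
# zfs snapshot tank/home@before-patch
# zfs rollback tank/home@before-patch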


 - Solaris can scale to >64 CPUs to solve extremely large problems.


- Point to be noted - AIX on POWER is a good platform, and I don’t want to badmouth AIX. But here are the two biggest issues with the platform:


1. Costs
2. Finding enough AIX folks to support you! [That’s not to say there are plenty of resources out there for supporting Solaris either; I mean "GOOD RESOURCES"…]


- Punch line - SPARC is not dead; it is actually more alive than ever, and Solaris is the most advanced OS in the market. Why change?


- Virtualization point of view - AIX has live migration (Live Partition Mobility)! Sun does not, BUT there are ways. Let’s discuss it in detail.


You can migrate your LDoms via a cold or warm migration; see the LDoms 1.3 Admin Guide chapter "Migrating Logical Domains", in particular the section "Migrating an Active Domain" on page 129 (Oracle VM Server for SPARC 2.0). With a warm migration, the LDom is suspended and moved within seconds to minutes, depending on its size and the network bandwidth. As for rebooting service domains: if you’re on something like a T5240 or T5440, you can use external boot storage (SAN/JBOD/iSCSI) to keep the second service domain up, and you have enough PCI-E slots for redundancy. By splitting your redundancy (network and storage) between them, your LDom guests will continue running with IPMP and MPxIO, so no downtime. FYI, even if you reboot the Primary domain without a secondary service domain, the guests will continue to run and wait for the Primary to return, so all is not lost. The only way you’d lose all of your LDoms is if you lose power or reset the SC. You can also use Solaris Cluster to automate the migration or fail-over of LDom guests.
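
For reference, the migration itself is a one-liner (the domain and target host names here are made up); the -n flag performs a dry run that checks whether the migration would succeed without actually moving anything:

# ldm migrate -n myguest root@target-host
# ldm migrate myguest root@target-host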


The bad things about the IBM PowerVM setup are the following:


1. High overhead! The VIO overhead can easily consume 40%-50% of your resources, whereas on a T-series with LDoms it’s one core for the Primary domain and less than 10% overhead for the networking and storage I/O virtualization. The CPU threads are partitioned at the hypervisor and CPU level, and RAM is virtualized at the hypervisor and MMU level. And on the M-series with Dynamic Domains you have 0% overhead, because the CPU, memory, and I/O are electrically partitioned on the centerplane. Even Solaris Containers have extremely low overhead, usually less than 5%. So I can get more out of my Solaris servers than you can in the IBM PowerVM world.


2. A fault on a CPU or memory module can take out multiple LPARs/WPARs, whereas on the M-series with Dynamic Domains such a fault would only affect the Domain that the CMU module belongs to. You can even run a mirrored memory configuration on the M-series to be fully fault tolerant. Not to mention that the Dynamic Domains are electrically isolated, something you don’t get anywhere else.


3. Costs: you have to buy licenses to enable PowerVM features, and they add up quickly, whereas LDoms and Dynamic Domains are free with the hardware.


NOTE: In AIX, a WPAR (the equivalent of a Sun Container) can be migrated on the fly, BUT Solaris Containers cannot be migrated on the fly. Not to worry, though: it is something Sun-Oracle is working on, and we’ll probably see it down the road. For live-migrating something today, I would use an LDom!
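
That said, a Container can still be moved cold with detach/attach; a minimal sketch (zone name and zonepath are made up) looks like this:

# zoneadm -z testzone halt
# zoneadm -z testzone detach
(copy the zonepath to the target host, e.g. with cpio over NFS)
# zonecfg -z testzone "create -a /zones/testzone"
# zoneadm -z testzone attach
# zoneadm -z testzone boot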


I’m not biased against the AIX/POWER architecture, as this is a very vital decision to make and at the same time a very difficult one to conclude on. But with the help of the above points, here is what I can say: I’ll stick to the SPARC architecture!


Your comments on this will be very much appreciated.

Thursday, October 14, 2010

Solaris Flash Archives

Flash archives (flar images) are very useful in situations where you need cloning/imaging or crashed-server recovery. The flarcreate command creates a flash archive. A flash archive can be created on a system that is running either a UFS root file system or a ZFS root file system. A flash archive of a ZFS root pool contains the entire pool hierarchy except for the swap and dump volumes and any excluded datasets; the swap and dump volumes are re-created when the flash archive is installed.


NOTE: By default, the flarcreate command ignores items that are located in "swap" partitions.

Let's see how we can work with flar image creation.

Create the archive:


For UFS:

# flarcreate -n "Solaris 10 10/09 build" -S -c -x /var/tmp/ /var/tmp/S10-1009.ufs.archive.sun4u-`date +'%Y%m%d%H%M'`

For ZFS:

# flarcreate -n "Solaris 10 10/09 build" -S -c /var/tmp/S10-1009.zfs.archive.sun4u-`date +'%Y%m%d%H%M'`


Where -


The "-n Solaris 10 10/09 build" implants a name into the FLAR image. The name should be something unique and meaningful to better identify it as the FLAR image for the system.

The "-x /var/tmp/" option causes the /var/tmp/ directory and its contents to be excluded from the FLAR image since it will not be needed in the FLAR image.

The -S option skips the disk space check and does not write archive size data to the archive. Without -S, flarcreate builds a compressed archive in memory before writing it to disk, in order to determine the archive's size. Using -S significantly decreases the time it takes to create an archive.

The -c option tells flarcreate to compress the archive as it writes it.


E.g. -


# time flarcreate -n "Solaris 10 10/09 build" -S -c /var/tmp/S10-1009.zfs.archive.sun4u-`date '+%m-%d-%y'`


Full Flash
Checking integrity...
Integrity OK.
Running precreation scripts...
Precreation scripts done.
Creating the archive...
Archive creation complete.
Running postcreation scripts...
Postcreation scripts done.


Running pre-exit scripts...
Pre-exit scripts done.


real 19m58.57s
user 13m42.99s
sys 1m55.48s



# ls -l /var/tmp/S10-1009.zfs.archive.sun4u*
-rw-r--r-- 1 root root 5339709933 Oct 14 04:54 /var/tmp/S10-1009.zfs.archive.sun4u-10-14-10


# flar info /var/tmp/S10-1009.zfs.archive.sun4u-10-14-10
archive_id=2f27a01690ce4fcaf398e638fcdcb66e
files_archived_method=cpio
creation_date=20101014093417
creation_master=XXXXXX
content_name=Solaris 10 10/09 build
creation_node=XXXXXXXX
creation_hardware_class=sun4u
creation_platform=SUNW,Sun-Fire-V240
creation_processor=sparc
creation_release=5.10
creation_os_name=SunOS
creation_os_version=Generic_142900-09
rootpool=rpool
bootfs=rpool/ROOT/s10s_u8wos_08a_Pre-patch
snapname=zflash.101014.04.10
files_compressed_method=compress
content_architectures=sun4c,sun4d,sun4m,sun4u,sun4s,sun4us
type=FULL
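
Deploying the archive is just as straightforward. As a minimal sketch, a JumpStart profile for installing this ZFS flash archive might look like the following; the NFS server name, path, and disk device are examples:

install_type flash_install
archive_location nfs://jumpstart-server/export/flash/S10-1009.zfs.archive.sun4u-10-14-10
partitioning explicit
pool rpool auto auto auto c0t0d0s0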


We can also use a small shell script to create the flar image:


#!/bin/sh
echo
echo Enter image name, i.e. Solaris build e.g. S10-1009.ufs.archive.sun4v
read ANS
echo "Image Name: ${ANS}" > /etc/image_catalog
echo "Image Created on: `date`" >> /etc/image_catalog
echo "Image Created by: `/usr/ucb/whoami` on `hostname`" >> /etc/image_catalog

#
# Clean up wtmpx so that new machine won't have last logins
#
cat /dev/null > /var/adm/wtmpx
#
# Now create flar, excluding -x /var/tmp/
#
flarcreate -n ${ANS} -c -a `/usr/ucb/whoami` -x /var/tmp/ /var/tmp/${ANS}_`date +'%Y%m%d%H%M'`