After a busy first day, here is the report for my second day at LinuxCon EMEA 2011, which started directly with some sessions (I skipped the plenary for once):
Distributed redundancy by Roopesh Keeppattu – Huawei
Redundancy is about availability, by duplicating components to avoid unavailability of the service.
Availability is measured in "nines".
4 nines is about 1 hour of downtime per year, 5 nines means about 5 minutes, 6 nines about 32 seconds.
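These figures are easy to check. Here is a minimal sketch, assuming a 365.25-day year; the function name is mine, not something from the talk:

```python
# Yearly downtime allowed by an availability expressed in "nines".
# Assumption: a year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def yearly_downtime_seconds(nines: int) -> float:
    """Return the downtime budget per year for e.g. nines=4 (99.99 %)."""
    unavailability = 10 ** (-nines)
    return unavailability * SECONDS_PER_YEAR


for n in (4, 5, 6):
    seconds = yearly_downtime_seconds(n)
    print(f"{n} nines: {seconds / 60:.1f} min/year ({seconds:.0f} s)")
```

So "1 hour" for 4 nines is a rounding of a bit less than 53 minutes.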
Major types of redundancy: standby (cold – the other server remains unpowered, warm – all servers powered, hot – all servers provide identical services).
Also the notion of N-modular redundancy (N servers in parallel).
1:N redundancy = 1 standby for N active units.
Traditional redundancy: mainly based on backup HW systems, with similar capabilities so large CAPEX and OPEX.
So the move is to distributed redundancy to reduce costs.
The 1:1 scenario is taken into account.
Instead of duplicating HW, there is a duplication of process instances on a set of servers
3 models available: live migration of OS/VM or of processes or of pre-distributed processes
- live migration of OS/VM: preserves state and is less complex from the application point of view, but has a higher migration time due to the size of what has to be transferred.
- process migration: the migration design is more complex and has to be part of the application, but there is less data to transfer. This method can also provide dynamic load distribution. But you need pre-failure detection for failover.
- pre-distributed processes: you improve again resource optimization, availability and switching time, but also increase the complexity, the need for additional SW to deal with state, and the coupling to the application.
The future is in redundancy in the cloud, and resource abstraction.
I was hoping for a more in-depth presentation, and was not satisfied by this one, as it didn’t go into details and just remained at the surface. However, this is a critical topic for most customers today in their Linux adoption.
Experiences booting 100s of thousands to millions of Linux VMs by Andrew Sweeney – Sandia National Lab
Managing a large number of VMs presents some challenges and horror stories (such as filling the switch CAM table, creating VM feedback loops, finding some unique bugs or odd behaviour). Even a 0.01 % error rate means 100 VMs in their case.
They tried multiple technologies such as lguest, QEMU, KVM, NOVA. They are using a mix of technologies, partly due to hardware limitations.
Guests configurations are computed at runtime. Everything is stored in RAM. They treat VMs as an application process. They use standard tools, the same TCP stacks, kernel…
They are using VMatic to generate the images and boot 1000 VMs in < 3 minutes.
Another tool used is Gproc (a cluster management tool written in Go), allowing O(ln(n)) execution time. It scales beyond 200K instances and has a web-based interface.
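To give an idea of why logarithmic time matters at that scale, here is a small sketch of a tree-shaped fan-out, where each node forwards a command to a fixed number of children. This is only my illustration of the O(ln(n)) idea: the function name and the fan-out of 10 are assumptions, not a description of how Gproc is actually implemented.

```python
def tree_depth(n_nodes: int, fanout: int) -> int:
    """Levels below the root needed for a fanout-ary tree to hold n_nodes.

    The depth is the number of forwarding hops a command needs to reach
    the farthest node, and it grows like log(n_nodes) / log(fanout).
    """
    if n_nodes < 1 or fanout < 2:
        raise ValueError("need n_nodes >= 1 and fanout >= 2")
    depth, capacity, level_size = 0, 1, 1
    while capacity < n_nodes:
        level_size *= fanout      # nodes on the next level
        capacity += level_size    # total nodes reachable so far
        depth += 1
    return depth


for n in (1_000, 200_000, 1_000_000):
    print(f"{n} nodes, fan-out 10: {tree_depth(n, 10)} hops")
```

With a fan-out of 10, going from 200K to 1 million nodes does not even add a hop, which is what makes this kind of topology attractive.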
The first cluster type was:
- Using PXE boot for booting physical host Hypervisors and then start the guests.
- In July 2009, they reached 1 million VMs with lguest on 4600 nodes of a Dell supercomputer (256 lguest per node), the bottleneck being RAM.
They then created KANE (sort of their own cloud approach) made of 520 nodes with 12 GB RAM and video cards (because it was more expensive to remove them !!), in 13 racks with 40 nodes/rack and 1 PDU/rack.
Then they developed a strongbox ARM cluster made of 490 nodes with 512 MB RAM each, needing little power, running lguest.
Then they started Megatux 2.0 to reach a higher number of VMs. Everything is virtual. They even created a network creation language. They use virtual Quagga and Linux virtual routers (plus physical ones) and virtual VDE switches. It normally supports multiple OSes, but for Windows they got many blue screens (ipconfig before IP is up, ping before IP is up, …). They’re using KSM a lot (and made patches) and various approaches to reduce the VM footprint. gproc is used after the initial boot to push the VM images and start the VMs, together with aggressive KSM.
Cold boot to experiment is performed in 7 minutes. 1 daemon per host to regulate KSM, VM state.
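The talk did not detail how their per-host daemon works. As a rough illustration only, here is how the KSM counters can be read from the standard sysfs directory /sys/kernel/mm/ksm to estimate the memory saved by page sharing. The function names are mine, and nothing here reflects the Sandia patches.

```python
import os
from pathlib import Path

# Standard location of the KSM counters on Linux kernels built with KSM.
KSM_DIR = Path("/sys/kernel/mm/ksm")


def read_ksm_counters(ksm_dir: Path = KSM_DIR) -> dict:
    """Read a few integer KSM counters from sysfs."""
    names = ("run", "pages_shared", "pages_sharing", "pages_unshared")
    return {name: int((ksm_dir / name).read_text().strip()) for name in names}


def saved_bytes(counters: dict, page_size: int) -> int:
    """Approximate memory saved: pages_sharing counts the extra mappings
    that reuse an already shared page instead of holding their own copy."""
    return counters["pages_sharing"] * page_size


if __name__ == "__main__":
    if KSM_DIR.is_dir():
        counters = read_ksm_counters()
        page_size = os.sysconf("SC_PAGE_SIZE")
        saved_mib = saved_bytes(counters, page_size) / 2**20
        print(f"KSM run={counters['run']}, about {saved_mib:.1f} MiB saved")
    else:
        print("KSM is not available on this kernel")
```

A daemon regulating KSM would presumably poll such counters and adjust the scan rate, but that part is my guess.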
An interesting problem is how to collect info from 1 million overloaded nodes, and where to store it. They use network sniffing and VM inspection, with a MongoDB backend populated at runtime. Data is collected in best-effort mode.
They’re looking at using KVM tool instead of QEMU/KVM to reduce memory footprint and AXFS (Advanced XIP FS) combined with cramfs.
Next steps are Android, more realistic network usage, improved monitoring, data visualisation and error handling.
A good talk on a very unusual context with some interesting issues to consider, even if they are far from being everyday problems for now.
I then met with my colleague Sue Paylor, who is one of our excellent FLOSS experts in EMEA, and that was again a good discussion, exchanging about our respective customer experiences, how to improve HA with Linux, and lots of other topics.
Providing High Performance Round table (instead of SuSE Keynote)
Ludek Safar, Ministry of Interior, Czech Republic, approached the Linux topic from the desktop side, and they’re now moving to the data center (Oracle instances on physical hardware and the rest in Xen VMs, including Java-based custom developments). They help by giving publicity to some FLOSS projects. They chose an enterprise distribution specifically to have a link with the communities. He likes the embedded approach of a fully integrated hypervisor, which provides the perfect cloud solution for them.
Dr. Udo Seidel, Amadeus, explained that they started 9 years ago with Linux. They have done lots of internal developments, including lots of mission critical workloads. Participating in events is key to keep good technical exchanges, influence the developments and give feedback. He really likes the flexibility and the open mindset. However, he is still missing a central approach to role-based management (à la AD).
Andreas Pöschl, BMW, explained that they started back in 2003 for servers, and in 2006 decided that Windows and Linux were the strategic OSes on x86. For example, they run SAP on Linux and desktops on Windows. They do virtualization (1000 VMs) with Xen, including 16-core, 64 GB VMs for SAP. They don’t contribute directly, but rather provide use/test cases for large configurations, and rely on their distribution providers to feed them back. Sharing what they do with Linux is also important to improve the ecosystem. He insisted on the freedom of choice which avoids vendor lock-in and also expressed his appreciation for the large set of possibilities offered by FLOSS. He is still concerned by boot time. BMW has requirements around storage and scale out, so they appreciate the work done on Btrfs. He mentioned the usage of Linux in GENIVI, which will bring infotainment to end users.
Nils Brauckman underlined that SuSE is organized to take this feedback and make it available upstream, and has been doing so for 20 years, as well as providing mission critical solutions to customers. He detailed the new features brought with Linux 3.0 (btrfs rollback, snapshots, trace capabilities, …), bridging the gap between Unix and Linux. He also underlined the new agility SuSE gained by being back as a separate Business Unit, operating like a single company.
I like this type of round table more and more, as it gives concrete production examples of FLOSS usage and shows how serious customers are today, and also how far they want to push their usage, which creates interesting challenges for us !
It takes a community/village to raise a Distribution by Tim Burke, Red Hat
"Unix was a job, Linux is a crusade". Tim said it’s awesome to be a part of RHEL as well as OLPC.
He started by showing a large set of stars in the sky (glibc, LVM, X.org, Linux), independent stars that only come together when gathered in a distribution, which gives them visibility. Then he showed the various actors: hardware vendors, translators, designers, lawyers, testers, and distribution vendors as well. The real competitors of Red Hat are VMware and Microsoft, not other collaborative groups such as other distribution makers. He explained the relationship around the kernel between upstream, Fedora and RHEL. He also underlined the benefit of working upstream, as they did for the Real Time extensions, instead of coming with a large patch developed separately.
The role of distribution makers is also to coordinate with hardware vendors (I’m well placed to know that !). Distribution can help create communities such as for AMQP, which was a real common need among FSI companies, as they know how to do it.
Mantra is "get it upstream first". Being divergent is being ignored, costs more, represents more work.
Tim then gave some numbers:
- 80% of Fortune 500 run Linux.
- 92% of supercomputers for healthcare or analytics run Linux.
He mentioned the OVA, which aims to bring KVM-based integrated solutions further up the stack.
No keynote without cloud, so Tim had to mention it, noting Linux usage in it and the integration work it requires, which is very close to what you have to do to make a distro.
A good talk, but not as pushy as the one made by Jim Whitehurst.
How Linux runs the World of Finance by Christoph Lameter, Graphe Inc.
Christoph started by explaining the various players (stock exchanges, traders, banks, …) and explained their need for speed. This creates the need for certain technologies (real time, kernel, binary and network optimisation, RDMA APIs, fast C++ code, processor caches). One problem is the limit imposed by the speed of light (even if that may change !). That sounded like a joke at first, but it is very serious !! It takes about 200 ms to go round the earth, which limits how fast events can be signalled.
We’re moving from manual to automated trading. Hours vs ms, human vs compute/algo, 30-60 trades/min vs 1000/s. Manual is used only as a backup mechanism today.
The case for Linux is that you can modify what you want, and thus win against competitors through speed improvements. The first one there wins ! Windows couldn’t make it in terms of latency in its network stack. Linux was already used for the Internet and by large companies such as Amazon, Facebook, … All major stock exchanges are on Linux today. Commercial solution vendors focus on Linux. Solaris has been declining since Oracle bought Sun.
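To get a feeling for the order of magnitude, here is a quick back-of-the-envelope computation. It is my own sketch, not something shown in the talk: the earth circumference is rounded and the fibre refractive index of 1.47 is a typical value I assume.

```python
# Propagation delay imposed by the speed of light over a given distance.
C_VACUUM_KM_S = 299_792.458        # speed of light in vacuum
EARTH_CIRCUMFERENCE_KM = 40_075.0  # equatorial circumference, rounded
FIBER_REFRACTIVE_INDEX = 1.47      # assumption: typical optical fibre


def one_way_delay_ms(distance_km: float, refractive_index: float = 1.0) -> float:
    """Delay in milliseconds for light to cover distance_km in a medium."""
    speed_km_s = C_VACUUM_KM_S / refractive_index
    return distance_km / speed_km_s * 1000.0


vacuum = one_way_delay_ms(EARTH_CIRCUMFERENCE_KM)
fibre = one_way_delay_ms(EARTH_CIRCUMFERENCE_KM, FIBER_REFRACTIVE_INDEX)
print(f"Round the earth: {vacuum:.0f} ms in vacuum, {fibre:.0f} ms in fibre")
```

Going round the earth takes a bit more than 130 ms in vacuum and close to 200 ms in fibre, which is huge compared to the microseconds traders fight for, hence the interest in being physically close to the exchange.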
Distributions used are mainly RHEL, some SLES (Germany mainly), a bit of Gentoo and Ubuntu/Debian.
There are still some challenges for Linux in Finance: involvement upstream is rare, as they want to protect their advantages. Regressions in kernel components create higher latencies (so some still run RHEL 3 !). Christoph gave an example of a customer having a 200% regression moving from RHEL4 to RHEL5.
The way forward is direct access to hardware (OS bypass) to gain on latency. RT Linux does not scale and increases average latency. RT Linux is used by exchanges, not traders.
Linux will dominate finance for the foreseeable future. Common hardware looks like supercomputers today (NUMA). HPC goes mainstream. Offload technology is seen with suspicion by the community, so again there is no willingness to contribute these improvements upstream.
One of the best presentations of the day, with lots of anecdotes, and a visible knowledge of the topic end to end.
Where is the Money in Open Source? Business Models and the Marketing of Open Source Technologies by Nithya Ruff, Wind River Systems
Nithya created a story to illustrate this talk. 3 communities: producers, distributors, consumers.
- Producers are interested in solving problems. The license used is key. It’s all about meritocracy. How do developers make money ? By being hired by a company, consulting contracts, venture funding, sponsorship/grants/donations.
- Why consumers use Linux: no vendor lock-in, comparable perf and high quality, time to market and savings, choice and flexibility, empowerment, innovation and transparency
- Distributors make it available for consumers with support, favour FLOSS adoption making it safe to use, employ developers, solve some issues and contribute back, market FLOSS, and serve as a liaison between consumer and developer. Successful business models are subscription, services fee, training, books but also proprietary extensions
Marketing FLOSS is different. You need to clearly articulate your added value in the ecosystem, so you have to add value (TTM, ROI, integration, risk mitigation).
Prediction: by 2021, 100000 infrastructure core endpoints, 1B mobile endpoints and 20B MtoM endpoints.
Even more need for collaboration between the various communities.
I was expecting a bit more from such a presentation. Good for beginners, but it lacks new thoughts on our ecosystem.
ReaR by Dag Wieers
I was particularly interested in this presentation as ReaR is a MondoRescue competitor, and Dag is Mr. rpmforge, mrepo, … so I was really curious to attend it.
ReaR provides a Disaster Recovery Workflow in bash. Its framework is easy to use and extend. It supports HP SmartArray, SW RAID, DRBD (not MondoRescue !), LVM, multipath, ext2,3,4, xfs, jfs, vfat. It supports tape, ISO, USB, eSATA, NFS, CIFS, rsync, HTTP, FTP, SFTP. It also integrates with backup back-ends such as TSM, HP DP, Bacula, …
ReaR works on RHEL4,5,6. It’s shipped with SLES (the one distribution on which it’s tested).
It saves storage info and network info. It has local GRUB integration, serial console support, network and SSH key integration, syslinux management.
Dag then explained the use case of the Belgian Federal Police (HP-UX to Linux migration using Ignite before):
Developers preferred USB usage for flexibility instead of OBDR (also because of the lack of OBDR support in the latest HP HW). It manages labels on tape and USB devices. For this project, they support a central DR server with PXE boot and control the HTTP PUT upload with ACLs.
They provide a tool to detect when changes are needed to relaunch ReaR by cron.
In the future they plan to work on: better rsync support (like rsnapshot or rbme), more backup backends, PXE integration, code base reorganization, release process, website+doc, dev tools.
Dag made backup and restore demos.
I really liked the presentation. Dag is an excellent presenter, and has accomplished a huge amount of work to improve the tool. If only I could also have some brilliant contributors like him for my project !!
So after the presentation, I introduced myself to Dag, and we ended up talking together most of the evening during the dinner organized in a central place of Prague. We talked not only about DR, on which we share a lot of common ideas, but also about a large set of other topics, some of them HP related such as the webOS future, … I like making new relationships during events like LinuxCon, as you end up talking with luminaries and that helps a lot to enrich your own vision.
Some pictures of this event are available on Picasa.