Emancipation

Thursday, August 21, 2008

Tips & Tricks for a new sponsoree

I've recently been moved behind the firewall and am intern-ing at Sun.

One of the tasks I've been charged with in this job is sponsoring community code contributions going into OpenSolaris, and I've noticed a bunch of things that would cut the work, and by extension the time, it takes to go from an email with a .diff in it to integration. Let's begin:

So you want to go that extra mile for your sponsor:

You've signed your SCA, found a bug, sent your email to the list, and are ready to roll.
Awesome! OpenSolaris is huge and we appreciate the extra hands, plus you help make it more community driven.

So the next step is to go back and forth with your sponsor trying to find the best possible solution to the bug. Once you've done that, you can get down to work, fix the bug, and then build & test it.

Then you just send the code to your sponsor, and you're done right?
Well, you could do that. It's technically correct and is the standard way of doing things, but if you really want to wow your sponsor and go above and beyond, there are a couple of things the gatekeepers require that you can do yourself to make his or her job (taking your code and committing it to the tree) a breeze:

unified diffs : from the top of the tree, running a diff -u gives you a single patch file that can be applied to the tree, rather than a diff of every affected file. Better still is an hg export file. The standard comment for HG exports is :
Contributed by {your name} <{your email}>
{bug id} {bug synopsis}
note the single space between bug id and bug synopsis
recent gate checkout : a bunch of stuff may have changed in the meantime, so it's best to apply your code to a copy of the ON gate as recent as possible, so that the diff doesn't break
cstyle clean it : onbld has a tool in /opt/onbld/bin called 'cstyle' which checks that the pedantic little nits like indentation standards are adhered to. You only need to make the stuff you changed cstyle clean; don't worry about all the old stuff.
change the copyright : you'll notice that every file owned by Sun in the ON consolidation has a block with Sun's copyright in it, like:
/*
* Copyright 2008 Sun Microsystems, Inc. All rights reserved.
* Use is subject to license terms.
*/
These all need to be updated to reflect the current year ( in this case it's 2008 ) whenever the file changes.
Update the CDDL block: Some older files have a CDDL block that's no longer acceptable, so if you run across it you may need to update it. You'll be able to tell, because it gives a version number ( 1.0 ) of the license. So, update this:
The contents of this file are subject to the terms of the
Common Development and Distribution License, Version 1.0 only
(the "License"). You may not use this file except in compliance
with the License.
To this:
The contents of this file are subject to the terms of the
Common Development and Distribution License (the "License").
You may not use this file except in compliance with the License.
SCCS keywords : the ON gate used to use a different source-code management system (TeamWare). The gate is now managed by Mercurial. As a result, there are still some vestigial TeamWare bits lying around. Before a change can be integrated, someone (typically the sponsor, but it could be you, which is why we're here) needs to remove the old SCCS keywords. They look like this:
#pragma ident "%Z%%M% %I% %E% SMI"
This whole line can just be removed from the file entirely. It's not needed anymore.
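
Two of the steps above are mechanical enough to script. What follows is only a minimal sketch: the function names `commit_comment` and `strip_sccs_ident` are made up for illustration (neither is part of onbld or Mercurial), and the name, email, bug id and synopsis in the example calls are placeholders.

```shell
#!/bin/sh
# Hypothetical helpers; neither is part of onbld or Mercurial.

# Build the two-line comment: the contributor line, then
# "<bug id> <synopsis>" separated by exactly one space.
commit_comment() {
    # $1 = name, $2 = email, $3 = bug id, $4 = synopsis
    printf 'Contributed by %s <%s>\n%s %s\n' "$1" "$2" "$3" "$4"
}

# Read a source file on stdin and write it to stdout without
# the vestigial "#pragma ident" line.
strip_sccs_ident() {
    sed '/^#pragma ident/d'
}

# Example calls with placeholder values.
commit_comment 'Jane Doe' 'jane@example.com' 1234567 'example synopsis'
printf '%s\n' '#pragma ident "%Z%%M% %I% %E% SMI"' '#include <stdio.h>' |
    strip_sccs_ident
```

For a real file you would redirect a copy through `strip_sccs_ident` and look over the result before replacing the original.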

With these simple steps, you can cut (depending on the size of the patch) hours off your sponsor's task and surprise them with how thorough you are.

Tuesday, May 13, 2008

Emancipation Community

I proposed the other day that the emancipation project be promoted to the Emancipation Community.

The way I see it, Jason & Steven's work on the ce driver, Roland's work on the xpg/posix stuff
and my libc_i18n work are all loosely related in that they are all reimplementations of closed source stuff, but they aren't as tightly coupled as one might expect of a single project (we don't share an hg repo set, for example).

Here's what Plocher and I came up with for a charter:

Emancipation Community Charter ( rev 1 )
CG Problem statement

The OpenSolaris operating system is not completely open
because several components that are required to build and
boot the OS are only available in the "closed bin" archives.

Scope:
Initially, the focus will be on selected high-value efforts,
such as self-hosting an open ON, drivers, posix utils, but
the long range intent is to eliminate the need for (and use
of) closed source software on the opensolaris OE.

Goals/milestones:

Quarterly progress reports will be produced and sent to the
OS-announce alias to keep the larger community informed of
our progress.

In order of priority:

Goal 0: Replace the components needed to build and boot
the ON consolidation with whatever shims, hacks
and scaffolding is needed to produce a proof of
concept that self-hosts and boots, followed by a
reimplementation of the userland utilities as per same.

Goal 1: Determine the best way to replace the above hacks
with a permanent solution, including decision
making architectural and design guidelines that
can be used in similar situations elsewhere in
the emancipation effort (i.e., should we reuse
from some particular other open OS, roll our own,
do without; what makes a good -vs- poor choice,
how do we choose without causing unnecessary
strife, ...?). Collaboration with the ARC community
is implied during this stage.

Goal 2: Develop and push the changes prototyped in phase
0 and formalized in phase 1 into the ON (and other)
consolidation(s) and remove the associated closed
bin pieces.

Goal 3: Seek replacement for high-use closed source software
such as media players, rich web players, etc.

Goal 4: After completion of goals 0 - 3, disband the community

To facilitate this community, the initial list of core contributors (
should they accept ) shall be:

Facilitator:
John Sonnenschein ( error404 )
Core Contributors:
Jason King ( jbk )
Steve Stallion ( stallion )
Roland Mainz ( gisburn )
Joerg Schilling ( joerg )
John Sonnenschein ( error404 ) ( note: i'm not sure if this is
implied by "facilitator" )
Garret D'Amore ( gdamore )
Contributors:
John Plocher ( plocher )

Wednesday, March 12, 2008

Bio & Affiliation

hey, so this is a little on the late side, but not too late I hope

Disclosure: I currently work for an MS Windows based contracting firm that has no commercial interest in OpenSolaris. I therefore am under no obligations that would prevent me from fully representing the community if elected to the OGB.

Bio: I'm an undergraduate computer science student at the University of Northern BC and a part-time software developer. I joined OpenSolaris approximately 3 or 4 months after the inception of the project, as a former 7 year Linux user. I was drawn to OSol because of the tradition of unrelenting engineering excellence for which I knew Sun & her engineers.
I envision the OGB's primary charter for the time being as being the liaison with the benefactor of OpenSolaris ( that is, Sun ). The OGB ought to be the go-between of Sun's management and the rough-and-tumble of the community. Sun has immense influence on the community as it is a community built around their code, and it's important for the OGB to act on the community's behalf when faced with decisions regarding steering that community ( for example, the recent trademark debacle. I believe the OGB could have had a much more influential role negotiating with Sun during that incident).

Wednesday, January 2, 2008

How to turn a mirror in to a RAID

People occasionally ask on the mailing lists and in #opensolaris how to add a disk to a zfs mirror to make a raidz. Today, I received in the mail a new SATA controller and a new disk, so I was left in the same circumstance.

There's a dearth of information on the topic on the internet, probably due in large part to the typical deployment of ZFS (i.e. large shops that have a ton of spare disks lying around, or have otherwise planned out a migration path beforehand), rather than the small home user.

So, here's what I did:

On a high level, we have to remember what sort of replication we've got for any given RAID level. More accurately, we need to know how many disks can be broken before the whole thing falls apart.

When we've got a single drive, that drive can't die, or we lose everything (obvious). With a mirror, we can't have 2 drives die. A 3-disk RAIDZ ( raid5 ) requires at least 2 operational disks out of 3. So, when moving from a 2 disk mirror to a 3 disk raidz, we obviously don't have enough disks to have both of them operational in full, even if we break the mirror in to a single disk.

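But the two layouts together have five disk slots (2 for the mirror, 3 for the raidz), and each layout can survive with one slot missing. If we count the number of disks allowed to be dead (2) at any given time against the number we have (3), we can spread them out such that two degraded pools exist: one single-disk pool (the broken mirror) and one degraded raidz (2 disks + NULL).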
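
The counting argument can be written out as plain shell arithmetic. This is just a worked sketch; the variable names are made up for illustration.

```shell
#!/bin/sh
# Slots in each layout, and how many of them may be missing
# before that pool is lost.
mirror_slots=2; mirror_tolerated=1
raidz_slots=3;  raidz_tolerated=1

total_slots=$((mirror_slots + raidz_slots))
tolerated=$((mirror_tolerated + raidz_tolerated))
disks_needed=$((total_slots - tolerated))

echo "slots=$total_slots tolerated=$tolerated disks_needed=$disks_needed"
```

Three disks are needed to keep both pools alive at once, which is exactly what we have.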

So the procedure we'd use to attain this state is: break the mirror, create a degraded raidz from the new disk and the detached mirror disk, copy the data over, destroy the old pool, and hand its remaining disk to the new raidz.

For the purpose of demonstration, I'll use the disks I've got attached to the system, c2d0, c3d0, and c4d1 .

first, the starting condition:


$ zpool status

  pool: xenophanes
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        xenophanes  ONLINE       0     0     0
          mirror
            c3d0      ONLINE       0     0     0
            c2d0      ONLINE       0     0     0

errors: No known data errors

Now, to break the mirror:

# zpool detach xenophanes c2d0

so, what I've got now is a single-disk zpool comprised of c3d0, and two free disks, c2d0 and c4d1.

To create the 3-disk raidz, we need 3 devices. We only have two. We can solve this problem, however, with sparse files and loopback.

Loopback allows you to use a file the same way you'd use any other block device in /dev. Linux has it ( mount -o loop ), Solaris has it ( lofiadm ). It's a pretty common thing.
A sparse file is a file for which the filesystem records the size but only allocates disk blocks for the regions that have actually been written. The rest of the file's contents aren't stored until you begin to write to them. This allows us to do things like create a 140GB file on a 140GB disk with plenty of room to spare. And that's precisely what we'll do.

You can create a sparse file easily with dd(1) like so:

$ dd if=/dev/zero of=/xenophanes/disk.img bs=1024k seek=149k count=1

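bs is the block size, here 1024k (1 MiB). seek is the number of output blocks to skip before writing, here 149k (149 × 1024 blocks, i.e. 149 GiB, which should match the size of the drive), and count tells dd(1) to copy one block. The result is a file that looks about 149 GiB long but has only its final 1 MiB block actually written.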
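
To see the effect without committing to a disk-sized file, the same trick can be tried at a small scale. The sketch below assumes `mktemp` is available and uses a throwaway file, not the path from this post. How little space the file really occupies depends on the filesystem, so only the apparent size is computed exactly.

```shell
#!/bin/sh
# Small-scale version of the dd trick: skip 1023 blocks of 1 KiB and
# write a single block, giving an apparent size of 1 MiB.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1024 seek=1023 count=1 2>/dev/null

apparent_bytes=$(wc -c < "$img" | tr -d ' ')
echo "apparent size: $apparent_bytes bytes"

# Space actually allocated, in KiB; usually far less than the apparent
# size, but the exact figure depends on the filesystem.
du -k "$img"

# The same arithmetic for bs=1024k seek=149k count=1:
# 149 * 1024 skipped blocks of 1 MiB, plus the one block written.
full_mib=$((149 * 1024 + 1))
echo "full-size image: $full_mib MiB"

rm -f "$img"
```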

and we can create a device like so:

# lofiadm -a /xenophanes/disk.img
/dev/lofi/1

So, to recap, here's what we have. We have a zpool, two spare disks ( c2d0 and c4d1 ) and a sparse file the size of those disks hooked up with lofi. And if you'll notice, that's precisely what we need.

From here on out, we need to create the raidz and then immediately degrade it by taking the lofi device offline ( otherwise we'll start filling up a sparse file that claims to be the same size as the disk it lives on, it'll run out of space, stuff will break... it won't be pretty )

# zpool create heraclitus raidz c2d0 c4d1 /dev/lofi/1

ta da! a raidz. Now let's break it.

# zpool offline heraclitus /dev/lofi/1 && lofiadm -d /dev/lofi/1 && rm /xenophanes/disk.img

and here's what we get:


# zpool status
  pool: heraclitus
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
 scrub: none requested
config:

        NAME             STATE     READ WRITE CKSUM
        heraclitus       DEGRADED     0     0     0
          raidz1         DEGRADED     0     0     0
            /dev/lofi/1  OFFLINE      0     0     0
            c4d1         ONLINE       0     0     0
            c2d0         ONLINE       0     0     0

errors: No known data errors

  pool: xenophanes
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        xenophanes  ONLINE       0     0     0
          c3d0      ONLINE       0     0     0

errors: No known data errors

as you can see, heraclitus is degraded, but operational.

so, we can just create our filesystems and copy data over

# zfs create heraclitus/home && zfs create heraclitus/opt
# cd /xenophanes/home && cp -@Rp * /heraclitus/home/ && cd /xenophanes/opt && cp -@Rp * /heraclitus/opt

and go have a cup of coffee or 12. When that's complete, we destroy the old pool

# zpool destroy xenophanes

and replace the lofi disk with the old zpool's disk

# zpool replace heraclitus /dev/lofi/1 c3d0

and there you have it: a 3-disk raidz out of a 2-disk mirror. No data juggling, tape drives, or extra disks necessary.

Tuesday, December 11, 2007

KDE & DTrace

At the moment I'm looking at adding dtrace probes to Apache/RogueWave's libstdcxx ( the C++ standard library) and I'm coming up against a couple hurdles, not least of which is the requirement for C++ name mangling.

C++ implements templates, namespaces, function overloading, inheritance and a myriad of other things that make plaintext names for functions infeasible. So, in order to properly probe a function, we can't use the standard provider:module:function:probe naming scheme ( since the function will be something meaningless like __1cEswap4Ci_6FrTA1_v_ for a function named 'swap' ).

My current thought is that, since the function name is meaningless, perhaps we ought to exploit the probe names.

Talking with Damian on irc, we came up with a naming scheme for our probes that looks something like this: namespace_[class, if any]_function__probename

So a function like will have a probe named 'entry' that looks something like global-swap-TT-TT-entry.
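
As a rough illustration of how such names could be assembled, here is a minimal sketch that joins the pieces in the shape of the example name. The helper `probe_name` is hypothetical, and joining with '-' is an assumption taken from the example rather than a settled convention.

```shell
#!/bin/sh
# Hypothetical helper: join the non-empty components with '-' to build
# a probe name shaped like the example, global-swap-TT-TT-entry.
probe_name() {
    name=""
    for part in "$@"; do
        [ -n "$part" ] || continue    # skip an absent class name
        name="${name:+$name-}$part"
    done
    printf '%s\n' "$name"
}

# namespace, class (none here), function, argument types, probe
probe_name global "" swap TT TT entry
```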

The other option is to pretend the problem doesn't exist, use the fbt provider as normal, and pipe dtrace -l through c++filt but I don't really like that.

Thursday, November 1, 2007

Indiana

The proverbial cat is out of the bag now, and the Indiana project has released something.

I downloaded it last night and installed it in Fusion and Parallels. The installer is still pretty braindead, but I'm sure it'll improve to have options eventually. It is alpha after all.

I ran into a weird bug where I haven't managed to coerce it into letting me log in, so I haven't had a chance to fiddle enough to pass judgement on it.

It unfortunately had to come with what I perceive to be a slap in the face to the OpenSolaris community, when it was pronounced from on high that it shall be called OpenSolaris to the exclusion of all others.

This really gets my goat, so to speak (I'm rocking the animal metaphors today, I guess). I don't particularly agree with the name change to begin with. I think OpenSolaris can exist as a noun in and of itself much as Linux does today, or as Joerg put it, like a screwdriver that you build things with, and not a particular screwdriver.

Ian's ranting about being confused how to download "OpenSolaris" notwithstanding (it being nothing but pure rhetoric to serve a purpose anyways), I think the target audience is intelligent enough, and by now well enough accustomed to the idea of distributions, that calling OpenSolaris a codebase and nothing more isn't a real issue.

The slap in the face part comes primarily from the way that the name was chosen. Had there been a vote which suggested that Indiana be called OpenSolaris, then fine, fair enough. In this case however, the name came from executive fiat from Murdock quite aside from how the community feels about the issue.

Fortunately the OGB seems to have collectively grown a pair (individual members have been outspoken already. They're good.) and seem to be seriously discussing condemning this action by Murdock and the rest of the marketing crew and imploring Sun to hold a vote on the matter, which is something I support 100%.

So, in conclusion
Congrats to the Indiana team for pushing something out the door. The marketing team's actions aren't your fault.

Friday, October 26, 2007

Large cats, ZFS and you

Like many, I got my copy of OSX 10.5 "Leopard" today to install on my MacBook. The upgrade went relatively smoothly and I was presented with the new UI in all its flashy glory.

After doing the standard explorations (Coverflow in my images/ folder, the new Mail.app, and the fact that OSX supports ACLs now, which makes me pretty happy), I decided to start playing with the OpenSolaris-derived features.

DTrace works like a charm. Colloquy spends an inordinate amount of time in syscall::read. A lot of Mac apps don't use syscall:::, which is interesting to note but not terribly surprising considering the way XNU is designed.

ZFS. Now, the version of ZFS that ships with OSX is read-only. My server, an Athlon64, and my workstation, a Blade1000 run SX:CE b74.

So, I pull out my USB stick, put it in the SPARC, and turn it in to a zpool named 'test'. export.
Put it in my MacBook.
$ sudo zpool import -a
The ZFS version is too new. OSX can't import it.

So I decide "what the hell..." and I log in to my ADC account, download the read/write beta of ZFS from Apple. Install it. Reboot. Put the USB stick in.
$ sudo zpool create newpool disk1
'newpool' shows up on my desktop. So far, so good.
$ sudo zfs create newpool/test
still good so far. I copy a small image over to the new pool, it does what you'd expect. So I unmount the volume. Won't unmount because it contains other volumes. that's silly, but okay... it's a beta and OSX doesn't really grok ZFS yet. Fire up Terminal.app again
$ sudo zpool export -f newpool
no dice, dataset busy. unmount its mountpoint then. Try again. No feedback, it must've worked.
yank the USB key. The Mac gives me the multilingual kernel panic window ( it's not a blue screen, that means it's better than Windows, right? ). Whatever.
I plug the key in to my Blade1k and run a
# zpool import -a
I/O error reading dataset 'test'. Kernel Panic. That's interesting... shouldn't the old dataset have been destroyed when the new dataset took the device over?

So, I bring it back over to the Mac and run disk util, maybe I can just destroy the EFI partition table and try again from scratch. Disk util somehow mounts 'test' to my desktop and sits there spinning forever. I can't unmount test at all because it's in use. Force Quit Disk Utility. Nothing. kill -9 it. Nothing.
$ sudo umount -f /Volumes/test
could not unmount. yank the drive. kernel panic.

At about this point I decide that my best course of action with respect to this USB key is to find a machine that doesn't support ZFS and format it there, and give up trying to coax OSX to behave like Solaris.

Casualties: one USB key.
Conclusion: When Apple says "beta", they mean it.