Posts in 'samba'

The Samba Buildfarm

Portability has always been very important to Samba. Nowadays Samba is mostly used on top of Linux, but Tridge developed the early versions of his SMB implementation on a Sun workstation.

A few years later, when the project was being picked up, it was ported to Linux and eventually to a large number of other free and non-free Unix-like operating systems.

Initially regression testing on different platforms was done manually and ad-hoc.

Once Samba had support for a larger number of platforms, including numerous variations and optional dependencies, making sure that it would still build and run on all of these became a non-trivial process.

To make it easier to find platform-specific regressions in the Samba codebase, Tridge put together a system to automatically build Samba regularly on as many platforms as possible. So, in Spring 2001, the build farm was born - this was a couple of years before other tools like buildbot came around.

The Build Farm

The build farm is a collection of machines around the world that are connected to the internet, with as wide a variety of platforms as possible. In 2001, it wasn’t feasible to just have a single beefy machine or a cloud account on which we could run virtual machines with AIX, HPUX, Tru64, Solaris and Linux so we needed access to physical hardware.

On each machine, the build farm runs as a single non-privileged user, which has a cron job set up that runs the build farm worker script regularly. Originally the frequency was every couple of hours, but soon we asked machine owners to run it as often as possible. The worker script is as short as it is simple. It retrieves a shell script with instructions from the main build farm repository, runs it, and then uploads a log file of the terminal output to samba.org using rsync and a secret per-machine password.
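As a rough illustration of that fetch, run and upload cycle, here is a minimal Python sketch. It is not the actual worker script (which is a shell script); the function names, the rsync module name `build_farm_data` and the injected callables are assumptions made up for this example. Only the rsync options used (`-a`, `-z`, `--password-file`) and the `user@host::module/` daemon syntax are standard rsync.

```python
from typing import Callable, List


def upload_command(log_path: str, machine: str, password_file: str,
                   server: str = "samba.org",
                   module: str = "build_farm_data") -> List[str]:
    """Build the rsync argv that uploads one log file.

    The module name is hypothetical; the per-machine secret lives in
    password_file and is never passed on the command line.
    """
    return [
        "rsync", "-az",
        "--password-file=" + password_file,
        log_path,
        "%s@%s::%s/" % (machine, server, module),
    ]


def run_once(fetch: Callable[[], str],
             run_script: Callable[[str], str],
             upload: Callable[[str], None]) -> str:
    """One worker cycle: fetch instructions, run them, upload the log."""
    script = fetch()           # shell script with build instructions
    log = run_script(script)   # terminal output of the build
    upload(log)                # e.g. subprocess.call(upload_command(...))
    return log
```

Passing the three steps in as callables keeps the sketch testable without network access; a real worker would shell out at each step.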

Some build farm machines are dedicated, but over the years a large number have simply run as a separate user account on a machine that was tasked with something else. Most build farm machines are hosted by Samba developers (or their employers), but we’ve also had a number of community volunteers over the years who were happy to add an extra user with an extra cron job on their machine, and for a while companies like SourceForge and HP provided dedicated porter boxes that ran the build farm.

Of course, there are some security issues with this way of running things. Arbitrary shell code is downloaded from a host claiming to be samba.org and run. If the machine is shared with other (sensitive) processes, some of the information about those processes might leak into logs.

Our web page has a section about adding machines for new volunteers, with a long list of warnings.

Since then, various other people have been involved in the build farm. Andrew Bartlett started contributing to the build farm in July 2001, working on adding tests. He gradually took over as the maintainer in 2002, and various others (Vance, Martin, Mathieu) have contributed patches and helped out with general admin.

In 2005, Tridge added a script to automatically send out an e-mail to the committer of the last revision before a failed build. This meant it was no longer necessary to bisect through build farm logs on the web to find out who had broken a specific platform when; you’d just be notified as soon as it happened.
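The idea behind that notification can be sketched in a few lines of Python. This is only an illustration, not the script that was actually deployed; the `Build` record and `first_breaker` function are hypothetical names, and the history is assumed to be ordered oldest first.

```python
from typing import List, NamedTuple, Optional


class Build(NamedTuple):
    revision: str
    committer: str
    ok: bool


def first_breaker(history: List[Build]) -> Optional[Build]:
    """Return the build at which the current run of failures started.

    Returns None when the history is empty or the latest build passed.
    """
    if not history or history[-1].ok:
        return None
    breaker = None
    # Walk backwards until we hit the last successful build.
    for build in reversed(history):
        if build.ok:
            break
        breaker = build
    return breaker
```

The committer of the returned build is the person to notify; later failures on the same host are most likely the same breakage and do not need a new e-mail.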

The web site

Once the logs are generated and uploaded to samba.org using rsync, the web site at http://build.samba.org/ is responsible for making them accessible to the world. Initially there was a single perl file that would take care of listing and displaying log files, but over the years the functionality has been extended to do much more than that.

Initial extensions to the build farm added support for viewing per-compiler and per-host builds, to allow spotting trends. Another addition was searching logs for common indicators of running out of disk space.
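A check for disk space problems can be as simple as a regular expression over the log. The sketch below is an assumption about how such a search might look, not the build farm's own code; the two messages it matches are the standard C library error strings for ENOSPC and EDQUOT.

```python
import re

# Standard strerror() texts for ENOSPC and EDQUOT.
DISK_FULL = re.compile(r"No space left on device|Disk quota exceeded")


def looks_out_of_disk(log_text: str) -> bool:
    """Return True if the build log shows signs of a full disk."""
    return DISK_FULL.search(log_text) is not None
```

Flagging these builds separately is useful because the failure says something about the machine, not about the commit that happened to be built.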

Over time, we also added more samba.org projects to the build farm. At the moment there are about a dozen projects.

In a sprint in 2009, Andrew Bartlett and I changed the build farm to store machine and build metadata in a SQLite database rather than parsing all recent build log files every time their results were needed.
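To give an idea of what storing build metadata in SQLite buys you, here is a small sketch using Python's standard `sqlite3` module. The table layout, column names and helper functions are invented for this example and do not reflect the build farm's real schema.

```python
import sqlite3

# Hypothetical schema: one row per uploaded build.
SCHEMA = """
CREATE TABLE build (
    id INTEGER PRIMARY KEY,
    host TEXT NOT NULL,
    tree TEXT NOT NULL,
    compiler TEXT NOT NULL,
    revision TEXT NOT NULL,
    status TEXT NOT NULL,
    uploaded_at INTEGER NOT NULL
);
CREATE INDEX build_lookup ON build (host, tree, compiler, uploaded_at);
"""


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_build(conn, host, tree, compiler, revision, status, uploaded_at):
    """Store the metadata once, when the log is uploaded."""
    conn.execute(
        "INSERT INTO build (host, tree, compiler, revision, status, uploaded_at)"
        " VALUES (?, ?, ?, ?, ?, ?)",
        (host, tree, compiler, revision, status, uploaded_at))
    conn.commit()


def latest_status(conn, host, tree, compiler):
    """Look up the most recent result without touching any log file."""
    row = conn.execute(
        "SELECT status FROM build WHERE host = ? AND tree = ? AND compiler = ?"
        " ORDER BY uploaded_at DESC LIMIT 1",
        (host, tree, compiler)).fetchone()
    return row[0] if row else None
```

The log is parsed once at upload time; every later page view becomes an indexed lookup.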

In a follow-up sprint a year later, we converted most of the code to Python. We also added a number of extensions; most notably, linking the build result information with version control information so we could automatically email the exact people that had caused the build breakage, and automatically notifying build farm owners when their machines were not functioning.

autobuild

Sometime in 2011 all committers started using the autobuild script to push changes to the master Samba branch. This script enforces a full build and testsuite run for each commit that is pushed. If the build or any part of the testsuite fails, the push is aborted. This alone massively reduced the number of problematic changes that were pushed, making it less necessary for us to be made aware of issues by the build farm.
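The gating logic is simple enough to sketch. The following is not the autobuild script itself, just a minimal Python illustration of the principle; `gated_push` and the step names are hypothetical.

```python
from typing import Callable, Iterable, Tuple

Step = Tuple[str, Callable[[], bool]]


def gated_push(steps: Iterable[Step], push: Callable[[], None]) -> str:
    """Run every step in order; push only if all of them succeed."""
    for name, run in steps:
        if not run():
            # Abort on the first failure; nothing reaches master.
            return "aborted: %s failed" % name
    push()
    return "pushed"
```

In practice each step would run something like the build or the testsuite in a clean checkout, and `push` would perform the actual push to the master branch.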

The rewrite also introduced some time bombs into the code. The way we called out to our ORM caused the code to fetch all build summary data from the database every time the summary page was generated. Initially this was not a problem, but as the table grew to 100,000 rows, the build farm became so slow that it was frustrating to use.
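The general shape of this kind of problem is easy to reproduce. The sketch below uses plain `sqlite3` rather than an ORM, and the table and function names are made up for illustration; it contrasts loading every row into Python with letting the database do the aggregation.

```python
import sqlite3
from collections import Counter


def make_db(rows):
    """Create an in-memory table of (tree, status) build results."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE build (tree TEXT, status TEXT)")
    conn.executemany("INSERT INTO build VALUES (?, ?)", rows)
    return conn


def summary_slow(conn):
    """Fetch every row and count failures in Python: cost grows with the table."""
    counts = Counter()
    for tree, status in conn.execute("SELECT tree, status FROM build").fetchall():
        if status != "ok":
            counts[tree] += 1
    return dict(counts)


def summary_fast(conn):
    """Let the database aggregate; only one row per tree comes back."""
    query = ("SELECT tree, COUNT(*) FROM build"
             " WHERE status != 'ok' GROUP BY tree")
    return dict(conn.execute(query).fetchall())
```

Both functions return the same answer, which is exactly why the slow variant goes unnoticed while the table is small.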

Analysis tools

Over the years, various special build farm machines have also been used to run extra code analysis tools, like static code analysis, lcov, valgrind or various code quality scanners.

Summer of Code

Over the last couple of years the build farm has been running happily, and hasn’t changed much.

This summer one of our summer of code students, Krishna Teja Perannagari, worked on improving the look of the build farm - updating it to the current Samba house style - as well as various performance improvements in the Python code.

Jenkins?

The build farm still works reasonably well, though it is clear that various other tools that have had more developer attention have caught up with it. If we had to reinvent the build farm today, we would probably end up using an off-the-shelf tool like Jenkins that wasn’t around 14 years ago. We would also be able to get away with using virtual machines for most of our workers.

Non-Linux platforms have become less relevant in the last couple of years, though we still care about them.

The build farm in its current form works well enough for us, and I think porting to Jenkins - with the same level of platform coverage - would take quite a lot of work and have only limited benefits.

(Thanks to Andrew Bartlett for proofreading the draft of this post.)

Samba 4.0.0, finally

This afternoon we released version 4.0.0 of Samba. This is a significant milestone, and I’m very proud of the result. Samba 4 is the first version that can be a domain controller in an Active Directory domain.

We embarked on this journey almost a decade ago - the first commit is from August 2003. It’s been a long and bumpy ride. I hardly recognize the people in this team photo from 2003 (I’m second from the left).

A lot has happened in that time. We wrote a few million lines of code. We migrated from CVS to Subversion to Git. We’ve drifted apart and grown back together as a team.

In my youthful naivety I predicted a release “within 1 or 2 years” during a talk at the NLUUG in 2004. But Active Directory was a lot harder than we thought, and there were quite a few other distractions as well. I’m glad this release, which is by far the biggest and longest running software project I have ever worked on, has finally happened.

Some older RCs of Samba 4 have already been packaged for Debian and Ubuntu, in the samba4 source package. For Debian jessie, these will be integrated into the main samba source package. Please use experimental if you do want to try the existing packages, as it is most up to date.

Summer of Code 2011

The Samba team is once again participating in the Summer of Code this year. This year we have 4 students working on various projects related to Samba.

This year I am mentoring Dhananjay Sathe, who is improving the GTK+ frontends for Samba. In particular, he is making it possible to manage shares and users of a remote Samba or Windows machine.

Dhananjay is also blogging about his progress.

On the way to Samba 4: Part 2

It’s been more than a month since the last status update on my Samba 4 work - much more than the two weeks I promised.

During the holidays I finally managed to release the new alpha of Samba 4, as well as releases of some of our companion libraries (tdb, talloc, tevent and ldb). The release includes a significant number of bug fixes and a lot of work towards a properly functioning Active Directory DC, too much to list here.

This release I’ve mainly been involved in improving our Python bindings and our handling of internal and external libraries. We now use symbol versioning for our copy of Heimdal Kerberos as well as some of our other libraries. Hopefully this will fix some of the issues users of the evolution-mapi plugin have been seeing where they end up with both MIT Kerberos and Heimdal Kerberos loaded into the same process (with all the consequences of overlapping symbol names). Samba 4 now also has the ability to work with the system Heimdal rather than using the bundled copy. I have packaged alpha14 for Debian and Ubuntu (fixing most of the open bugs against the Samba 4 package in the BTS), but am currently waiting for the new release of ldb to pass through NEW before I can upload.

The next release is scheduled for the first week of February.

Currently Playing: Stream of Passion - Haunted

On the way to Samba 4: Part 1

After Samba XP 2008 Andrew and I started keeping a wiki page with our bi-weekly goals and achievements for Samba 4. Because planning in a Free Software project is hard (time availability and priorities change over time, and other volunteers are equally unpredictable) we called this our “Fantasy Page”; it listed things we wanted to work on next (“fantasies”), but reality being what it is we would usually actually end up working on something entirely different. We discussed our progress and new plans in - what I would now call - a bi-weekly standup call.

There were several reasons for doing this. It gave us some sense of direction as well as a sense of accomplishment; a way to look back at the end of the year and realize how much we had actually achieved. Because Samba 4 is such a long term project (it is 7 years old at this point) it is easy to become disillusioned, to look back at a year of commits and to not see the gradual improvement, just the fact that there is no release yet.

We managed to keep this up for two years, much longer than I had anticipated, but eventually started to slip last year.

More recently Kai and Tridge have started to blog weekly about their efforts to make Samba 4.0 a reality and I’m going to join them by trying to blog regularly - every two weeks - about my contributions, even if there were none.

In the next two weeks I plan to work on finally getting alpha 14 of Samba 4 out and on fixing the daily builds of Samba 4 and OpenChange for Ubuntu on Launchpad after we did a massive reorganization of the private libraries in Samba 4.

Currently Playing: Zero 7 - Somersault

subunit usage in Samba

Both Samba 3 and Samba 4 are now using the “subunit” protocol inside their testsuite (aka “make test”). subunit is a streaming protocol used to report test results that is aimed at being simple to generate and parse as well as being human readable.

A very simple subunit stream might look like this:

test: samba4.tests.util.strlist.check_list_make
creating list...
list created!
success: samba4.tests.util.strlist.check_list_make
test: samba4.tests.util.strlist.check_list_make_shell
creating list...
xfail: samba4.tests.util.strlist.check_list_make_shell [
returned NT_STATUS_NOT_IMPLEMENTED
]

For those that are familiar with the TAP protocol used by Perl, it is similar to that, although it has a couple of features that TAP does not have. For example, it can report timestamps (useful for determining test duration) and has more flexible progress reporting.
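To show how little is needed to consume a stream like the example above, here is a minimal Python sketch of a parser for that text format. It is not the parser from the subunit library, and it only handles a handful of result keywords plus the bracketed detail block; `parse_subunit` is a name chosen for this example.

```python
from typing import Dict, Iterable

# Result keywords handled by this sketch.
STATUSES = ("success", "failure", "error", "skip", "xfail")


def parse_subunit(lines: Iterable[str]) -> Dict[str, str]:
    """Map each test name to its final status.

    Lines that are not result lines (test output, "test:" markers)
    are ignored, as is everything inside a "[ ... ]" detail block.
    """
    results = {}
    in_details = False
    for raw in lines:
        line = raw.rstrip("\n")
        if in_details:
            if line == "]":
                in_details = False
            continue
        keyword, sep, rest = line.partition(": ")
        if sep and keyword in STATUSES:
            if rest.endswith(" ["):
                # The reason for the result follows until a lone "]".
                rest = rest[:-2]
                in_details = True
            results[rest] = keyword
    return results
```

Fed the example stream, this yields one "success" and one "xfail" entry, keyed by the full test names.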

Subunit is particularly useful for projects that use multiple programming languages as it allows a single tool to be used for test visualization or analysis rather than one per language. All that’s required per-language is a test runner that can spit out subunit streams.

selftest.pl, the main engine behind Samba’s test suite, has been using subunit internally since its creation a couple of years ago. Most other test tools we use can also report subunit, in particular our Python tests, blackbox tests, Perl tests (using tap2subunit) and smbtorture.

“make test” never displays raw subunit results; it always formats them using our format-subunit script. Samba 4’s “make test” stores the raw subunit output in st/subunit.

I’m attending SNIA SDC at the moment and a couple of people here have asked me about the tools I use to display and analyse test results. They are:

subunit

The subunit project contains a bunch of convenience tools for working with subunit. Other than libraries for parsing/generating subunit for several languages it contains tools for manipulating and analysing subunit streams, including:

  • subunit-ls: List all tests in a subunit stream, optionally including their run times (I used this for the test duration summary I sent to the Samba mailing list earlier)
  • tap2subunit: convert a TAP stream to a Subunit stream
  • subunit-stats: Print statistics for a subunit stream (how many successful tests, failed tests, skipped tests, etc)
  • subunit-filter: E.g. remove test result or output from a stream
  • subunit-diff: Compare two subunit streams and see what tests have started failing or are no longer failing
  • subunit2pyunit: Format a subunit stream using Python’s standard unit test test result formatter
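The kind of comparison a tool like subunit-diff makes can be sketched as follows. This is not its implementation, only an illustration of the idea under the assumption that each stream has already been reduced to a dictionary of test name to status; `diff_results` is a hypothetical name.

```python
FAILING = ("failure", "error")


def diff_results(old, new):
    """Compare two {test name: status} mappings.

    Returns (newly_failing, fixed) as two sorted lists of test names.
    """
    newly_failing = sorted(
        t for t, s in new.items()
        if s in FAILING and old.get(t) not in FAILING)
    fixed = sorted(
        t for t, s in old.items()
        if s in FAILING and t in new and new[t] not in FAILING)
    return newly_failing, fixed
```

Tests that only appear in one of the two runs are deliberately not reported as fixed, since a test that was not run has not been shown to pass.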

We’re including the subunit tree in the Samba git tree at lib/subunit.

tribunal

Tribunal is a GTK+ viewer for subunit streams. It allows for easy browsing of test results. Tribunal is still a bit rough around the edges, although it should already be useful.

Example usage:

$ make test
$ tribunal-subunit st/subunit

testrepository

Test Repository provides a database of test results which fits into developers’ workflow and keeps track of useful information like what tests are failing, or which failures have the same backtrace.

In particular Test Repository can re-run only the tests that failed in the previous test run:

$ testr init
# Run the full testsuite (1 hour goes by)
$ testr run
# Run those tests from the testsuite that failed in the previous run
# (this would be a lot shorter usually, depending on how many tests were
# failing)
$ testr run --failing

testrepository is also still in its early days, but can potentially be very useful, e.g. when comparing old test runs on the buildfarm.

Samba Summer of Code

As I have done in previous years, I am again participating in the Google Summer of Code as mentor for the Samba project.

Last year Andrew and I co-mentored three students with mixed results. In the end we had to drop one of our students but the other two did well. I’ve only taken on one student this year for various reasons.

The amount of time required to mentor a student varies wildly depending on the student and is hard to predict based on their application. Some students seem to require quite a lot of mentoring while others are self-motivated and self-learning. This has not just been my experience; I’ve heard similar stories from fellow mentors on other projects.

Last summer Ricardo worked on SWAT for Samba 4 and he is still actively working on the project, even after the Summer of Code has finished. I hope to find the time to package SWAT in time for Debian Squeeze. At the moment SWAT just supports managing shares but Ricardo is working on user management.

In 2009 Calin worked on the GTK+ frontends for Samba, in particular changing them to be Python-based rather than C-based. This year his work is going to be continued by Sergio, hopefully with some user-ready tools as the end result.

Currently Playing: Gazpacho - 117

Nostalgia: 10 Years of Samba Hacking

While searching for something else I happened to come across one of my first posts to the ntdom list in November 2000.

My post is a simple question about a Samba crash that I myself no doubt had introduced. I’m sure I could have found a solution to it by using Google - excuse me, AltaVista - but I still received a friendly reply from Jerry explaining how to use GDB. I’m not too embarrassed; at least I used proper punctuation and already wrote somewhat comprehensible English back then.

It’s also strange to realize it’s already been almost ten years since I started hacking on the Samba project.

Summer of Code 2009

For this year’s (the fifth?) Summer of Code, I participated once again as a mentor for the Samba and OpenChange projects.

Samba was assigned four slots this year: one was a CIFSFS project mentored by Steve French and the other three were Python projects related to Samba 4, co-mentored by Andrew and me. Our students did very well this year, although we unfortunately had to drop one after the mid-term evaluations due to lack of effort. Nonetheless, we’re very happy with the results of the other two projects:

Calin Crisan (France) converted the rest of the applications in SambaGtk to Python, and worked on a GTK+ user manager for Samba and Windows. With his improvements, it is now possible to edit registries, manage users, inspect the endpoint mapper, plan tasks and manage services on a remote Windows machine using a GTK+ application on a Linux workstation.

Ricardo Velhote (Portugal) designed and implemented a new version of SWAT - the Samba Web Administration Tool. Unlike the old SWAT, his implementation is more than just a simple web-based editor for smb.conf. As we were expecting at the start of the Summer of Code, not all of the functionality could be implemented properly in a couple of months, not while getting the design and infrastructure right. With a basic version working, we now hope the remaining subsystems can be contributed with help from the community.

I’m planning to merge Calin’s improvements to Samba-Gtk into the mainline in the next month or so. SWAT is a standalone application and will continue to live as a separate project, while being a part of the Samba ecosystem. Congratulations, Calin and Ricardo!

“Franky” Talk at SambaXP

I’ll be giving a talk at the next NLLGG meeting about the Franky project.

Update: Slides

SambaXP 2009

Last week most of the Samba team met again for our annual conference in Göttingen. It was nice seeing everybody again, especially the folks I hadn’t seen since the last one.

Together with Andrew and his wife Kirsty I took the train from Amsterdam into Germany a couple of days early and we did some sightseeing together with Anatoli and Nadezhda during the weekend. There’s still plenty of things to discover in Göttingen for me, even though I’ve already been there about two dozen times. We did a tour of the city walls, visited some of the churches and climbed the tower.

Julien’s talk about OpenChange was interesting and humorous as always. I also attended Volker’s tutorial on asynchronous programming in C. Even though I’ve spent quite some time working with and looking at these APIs, it was nice going through them step by step once again. It’s a strange thing to wrap your head around.

Andrew and I also gave our yearly “State of Samba 4” talk again. As I’ve mentioned in other places, I’m really excited about the social effects of the Franky project. Once again I was reminded that giving a talk the morning after the conference party (this year in the “Oriental Lounge”) is a bad idea.

Several of my fellow Debian Samba maintainers made it to SambaXP; it was nice to see Christian, Luk, Michael and Noël there. We made some decisions about the direction of the Samba packages, and a plan to allow the Samba 3 and Samba 4 packages to be installed on the same system. Unfortunately I had to miss Christian’s talk because it was in the same timeslot as Jeff’s talk about the CIFS kernel module.

Reconciling the Samba 3 and Samba 4 source code trees

While a few of us have been working very hard on Samba 4 to allow it to rock your socks off as an Active Directory Domain Controller, some of the other Samba developers have been working just as hard on improving the existing Samba 3 codebase and adding features to that. This situation has caused tension between developers as well as technical problems in the past - code with the same purpose is being developed in parallel, libraries diverge because features are only added in one branch and not in the other, one codebase is considered “obsolete” by some and the other is considered only a playground for experimental features by others.

As of yesterday, we now have the two codebases living in one and the same git branch. This should make it a lot easier for the two to use the same libraries. Better yet, it should allow us to reconcile the copies of various libraries that exist in both codebases, all of which have diverged to some degree in the last few years.

After a few problems came up merging the two branches the easy way (they both have a directory called “source” and git doesn’t deal well with renaming them to “source3” and “source4” respectively), we decided to replay the history of both branches. This has the disadvantage that all existing branches that are based on the Samba 3 and Samba 4 branches will have to be rebased against the new master branch, but it also means we keep the ability to run “git log” inside of our source directories and have it work right.

Other than the fact that this makes it possible to share more code between the two codebases, one of the ideas we have is also to see if it is possible to provide an Active Directory DC by glueing the best bits of Samba 3 and Samba 4 together (aka “Franky”) before they are eventually merged completely.

Currently Playing: Phideaux - Formaldehyde

SambaXP and other travel

It’s been a busy two weeks. Wilco and I drove up to Göttingen on Sunday two weeks ago to spend some days hacking and meeting up with the other developers before the start of SambaXP. It was really nice to see everybody again after more than 7 months.

SambaXP was a bit different this year. There were three tracks during the second part of the conference, one more than previously, and of course there were several engineers from Microsoft attending this time! Some of the interesting talks this year included Julien’s update on OpenChange, Tridge’s talk on PFIF, the talk from the Likewise folks and of course the talk from Microsoft’s Wolfgang Grieskamp on SMB2. We also had some other informal discussions with the Microsoft folks about specific topics - very useful!

There are some photos up on the SambaXP homepage. And just to be ahead of the comments: yes, I know I need a haircut.

I did some initial work on several bits and pieces of code that I hope to expand over the next few months. Volker has started working on ncacn_ip_tcp support and I have been working on making the Samba 3 DCE/RPC library compatible with Samba 4. This should allow OpenChange to use Samba 3 in the future.

Guenther, Wilco and I made some initial progress on the policy library, allowing client-side manipulation of (group) policies in Samba. I worked with Simo on trying to get rid of an evil hack in Samba4’s event subsystem.

David Holder blogged about some of the IPv6 development that we did during the conference: http://www.ipv6consultancy.com/ipv6blog/?p=34

And lots of other things I can’t remember at the moment…

After the conference Andrew, Wilco and I drove back to the Netherlands and I played tour guide for a bit showing Andrew around the country during the afternoon and hacking Samba together in the morning. Later this week we took the train to Brussels, Eurostar to London and visited Sam’s company in the UK Midlands for a couple of days.

And in the midst of all this, it seems Ubuntu Hardy was released. Congratulations to all those involved!

Currently Playing: Brandi Carlile - Turpentine

Samba’s tdbdump reimplemented in Python

Fewer than 150 characters in Python, while the original implementation in C requires more than 2000 characters:

import tdb, sys

db = tdb.Tdb(sys.argv[1])
for (k, v) in db.iteritems():
    print "{\nkey(%d) = %r\ndata(%d) = %r\n}\n" % (len(k), k, len(v), v)

GTK+ LDB Browser

As some may have noticed, a large portion of my Samba 4 work during the last few months has been focussed on adding Python bindings for our various public libraries and the refactoring necessary to make it possible to add Python bindings. So far, we have bindings for LDB and TDB but I intend to add bindings for most of our public API so it is possible to, for example, open Windows registry files, join domains, etc. from Python.

LDB is our LDAP-like embedded database, and is for LDAP what sqlite is for SQL. Last night I decided to see how hard it would be to write a graphical browser for LDB using Python, and it turned out to be quite easy, thanks to PyGTK. There is a screenshot of what it looks like here. Packages with the Python bindings for LDB are already in Debian.

The sources for gtkldb are available in the samba-gtk bzr branch at http://people.samba.org/bzr/jelmer/samba-gtk/trunk, along with some of the GTK+ frontends for Samba 4 I wrote earlier (gregedit, gwcrontab, gwsvcctl, gepdump and gwsam).
