Don’t gloat about bloat

I’m going to take a moment to talk about bloat: what it means, and why one man’s bloat is another man’s necessity. First, a definition: bloat is an arbitrary measure of subjectively “useful” features minus subjectively “useless” features, divided by the size of something (in memory or on disk or whatever). The lower that figure, the more bloated the thing is. If you want a non-computing comparison, then in pure Body Mass Index terms, most rugby players are considered morbidly obese – but in reality, it is “useful” muscle that accounts for their high weight-to-height ratio, rather than the “useless” fat that BMI is meant to indicate.

And the same is true for apps. An app with few features and high requirements (RAM or disk) is bloated. An app with many features, all of which are useful, but modest requirements, is not bloated. The problems are how to define all the terms involved – and those messy in-betweens. If you need to pick between two apps (say for inclusion by default on a distro CD), and those two apps have feature parity, then the choice between them is easy. Where apps have their own unique features, it requires looking at the bigger picture.
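To make that ratio concrete, here is a minimal Python sketch of the definition above. The app names and feature counts are invented for illustration (deciding what counts as a useful feature is exactly the subjective part), and `feature_density` is just a name picked for this example, not an established metric.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    useful_features: int    # subjective count, hypothetical
    useless_features: int   # subjective count, hypothetical
    size_mib: float         # on-disk (or in-memory) footprint


def feature_density(app: App) -> float:
    """(useful - useless) / size: a low value means a bloated app."""
    if app.size_mib <= 0:
        raise ValueError("size must be positive")
    return (app.useful_features - app.useless_features) / app.size_mib


# Two made-up apps with feature parity but very different footprints.
lean = App("lean-app", useful_features=20, useless_features=2, size_mib=9.0)
heavy = App("heavy-app", useful_features=20, useless_features=2, size_mib=60.0)

# Highest density first: with feature parity, the smaller app wins.
ranked = sorted([heavy, lean], key=feature_density, reverse=True)
```

With feature parity the ranking is decided by size alone, which is the easy case. Once each app has unique features, the two counts stop being comparable and the number is only as good as the judgement behind it.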

Let’s take e-mail clients for example. Ubuntu ships with Evolution by default – a hefty application consuming just under 60 MiB of disk space in the Ubuntu default install of ~2.2 GiB. These numbers are taken on AMD64, by the way, where C-based apps are larger than on i386. And my figure for Evolution is just for the client – it doesn’t include the data server component used by Ekiga. Would a different client be smaller? Absolutely. Thunderbird would save about 20 meg of space. Claws (with no plugins) would save over 50 MiB. But the question is, is the space saved worth the trade-offs? The 20 meg gap between Thunderbird and Evolution means going from giving new Linux users “Outlook for free” to “Outlook Express for free” – are the added features, the added kudos of offering a Free equivalent to a £90 groupware client, worth 20 meg? Is that bloat, or is it usefulness? That’s the question distro makers need to answer. Taking it further, OpenOffice.org accounts for a terrifying amount of that 2.2 GiB install – just under 245 MiB, or 10% of the install. Would any inevitable space savings from switching to the GNOME Office suite (Gnumeric et al) be worth the loss in features?
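For a sense of scale, the figures quoted above can be turned into shares of the default install. This is a rough sketch only: it assumes the install is exactly 2.2 GiB, that 1 GiB is 1024 MiB, and that every "meg" above means MiB. The function name is made up for this example.

```python
# Default install size from the post (~2.2 GiB), converted to MiB.
INSTALL_MIB = 2.2 * 1024


def share_of_install(size_mib: float, install_mib: float = INSTALL_MIB) -> float:
    """Return size_mib as a percentage of the whole install."""
    return 100.0 * size_mib / install_mib


# Rounded figures quoted in the post.
figures = {
    "Evolution": 60,
    "OpenOffice.org": 245,
    "saving from switching to Thunderbird": 20,
    "saving from switching to Claws": 50,
}

for label, mib in figures.items():
    print(f"{label}: {mib} MiB is {share_of_install(mib):.1f}% of the install")
```

On these rounded inputs OpenOffice.org comes out a little under 11% of the install, and the Thunderbird saving comes out at under 1%, which is the scale at which the "is it worth it?" question gets asked.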

What I’m getting at here is my Banshee post. When I made my post, citing 6 meg saved, it was facetious of me. The numbers are true, don’t get me wrong – but it would be wrong to characterise Rhythmbox as being bloated. The difference in size between the two apps can be accounted for by things like, say, RB having about 6 meg of documentation provided, and Banshee having none. I picked an attention-grabbing headline, because the fact that a C#-based app like Banshee could actually be pretty lightweight was interesting, and challenged preconceptions. The real headline though, the reason I’m proposing switching to Banshee as default, isn’t because of space savings. It’s because the other plusses – the active development community, the constant improvement, and the features in Banshee not found in Rhythmbox – do not result in any significant losses in disk space. The Ubuntu Desktop team would love to have all the space in the world, to be able to include big things like OpenJDK by default, but they don’t and they can’t. The reason why I feel now is the time to make this proposal for Banshee (and I should note that whilst I’ve been using Banshee for a few years, I’ve previously strongly felt that Banshee was insufficiently mature, and also pulled in too much crap on disk, to be a sensible proposal), is that Banshee’s biggest plus – the Ubuntu-style striving for self-improvement – now comes with so little cost compared to the rather unmoving, “mature and unflinching” Rhythmbox.

I’ll always advocate for what I feel is the best technical solution to a task, regardless of the languages involved – and I really feel that “actively worked on” is enough of a feature to tip the scales. Listen is pretty nice, but is missing a bunch of features (are the missing features bloat? Who can say) and is still a little immature. Exaile is better, but actually consumes more resources than Banshee for no added features. Songbird? Well, Songbird is just plain bloated (very much aware of the irony, thank you) with its forked copy of XulRunner gobbling up twice the RAM AND disk space of Rhythmbox, and offering only an un-Accessible UI in return.

So going into UDS, I want to make it clear – I want to have a candid discussion with the desktop team about which of the available applications has a sufficiently high features-to-resources ratio, with a clean migration path and no major feature losses, to warrant becoming the default. If Rhythmbox wins out, then so be it – the process will result in valuable feedback which I have no doubt Banshee’s upstream could convert into fixes by the time Ubuntu 10.04 rolls around. And that active upstream is why I think this is a decision Banshee stands a good chance of winning. May the best app win, now and in the future.

13 Responses to “Don’t gloat about bloat”

  1. It is true that features, size of community, etc. are more important than programming languages. But it just so happens that programs written in modern programming languages will in general have more features, be more likely to be actively maintained, etc., because they are easier to maintain!

    So while a programming language by itself is not a technical reason, don’t think it is just a coincidence.

    Everyone not working on kernel-mode code should program in C#, Python, or some other garbage-collected language.


  2. It seems to me that when you consider the resource requirements of Banshee, you also need to include the mono dependencies. Regarding mono apps, I do not know quite what to make of the controversy, but it seems to be that since mono is a stumbling block for so many in the community, it would make sense NOT to include any mono apps (or libraries) in the default Ubuntu installation (though always optional to install through Synaptic) since there is an equal or superior alternative for each mono app included in the current default install.

    You may regard this as OT but I think it is relevant since we are talking about bloat.

    Also, I hope that this aspect / issue is addressed at UDS, but it appears that it has already been decided upon by the SABDFL.


    directhex Reply:

    @stlouisubntu, Mono is already part of Ubuntu, and I’m not aware of any proposals being taken seriously by anyone who matters (which includes SABDFL) to remove it. The “equal or superior” question is entirely the focus of the Banshee proposal though – and I would ALWAYS support a clearly superior choice regardless of language, provided all histrionics are left at the door.


  3. “I do not know quite what to make of the controversy, but it seems to be that since mono is a stumbling block for so many in the community, it would make sense NOT to include any mono apps (or libraries) in the default Ubuntu installation”

    That’s a terrible reason to discriminate against any language/app/tool. Mono is a large project, and as with all large projects there are people who love it and people who hate it. Before you make any decision you need to take out the emotion and just evaluate whether or not the project is of benefit. If everyone listened to the vocal minority we’d still be running everything from a terminal window and using vi.


  4. Simple here: program bloat is program bloat. Currently all C and C++ programs in most distributions are bloated, on disk and in RAM, due to their compiler. Everything Mono depends on is bloated by the compiler that built the code it uses, including the Linux kernel itself.

    Program bloat is simply a program using more resources than it should. The number of features is not even part of the measurement. Straw man arguments that trading down on features is required are wrong. There are a lot of programs that are bloated in different ways.

    The problem is that we are in the netbook age: to build cheaper netbooks, OSes cannot keep on demanding larger and larger specs, yet more and more features are required. The focus on cleaning up program bloat has to return. This includes no longer using systems that cannot be optimized to the most effective form. If you were offering lower RAM, lower disk and lower CPU overheads, you would have been heading in the right direction.

    Mono JIT is the wrong direction for the market.

    Feature bloat is a completely different factor; don’t use it as a smoke screen.

    Undervaluing RAM is a major problem. Yes, it would be nice to fit more on disk. But is there a way to get over 6 megs more disk space without trading bloat from one location to another?

    The fact is there is way more than 6 megs of bloat hidden in that livecd. The scariest part is that a lot of it hides in libc and GTK. Currently gcc lacks whole-program optimization. The same livecd rebuilt with the pgroup compiler frees up scary amounts: there is over 15 percent bloat, enough space to fit the complete Mono runtime and a complete Java runtime and still have space for more applications. Most of that saving comes from simply solving out functions that return a constant result when passed a constant value, which gcc currently cannot do because they are in libraries or different .o files. Basically, gcc fails to optimize.

    By the way, you said something else. Can 64 bit systems be hybrid? The answer is yes: part 32 bit, part 64 bit. How much would you save just by having programs that run well as 32 bit built as 32 bit? By the way, some 32 bit programs are faster than the same program built as 64 bit, even on 64 bit Linux. This would free up even more if it is done well.

    Going true hybrid 32 bit and 64 bit would require rethinking the complete package management system. It would also require rethinking how core libs like libc and others could be shared between 32 bit and 64 bit to save on disk space, i.e. a 64 bit version of libc able to handle calls from 32 bit applications, and the reverse, in places where it truly reduces bloat in CPU, RAM and disk.

    Again, another path: not trading but reducing bloat, keeping features and gaining speed.

    There are mountains of work to do to get the native code of Linux in order without the distractions of Mono.

    Mono JIT is trading bloat: less disk space for more RAM usage. So you give up one form of bloat to take on another. That is not bloat reduction. It is bloat trading.

    If you want to keep JIT, you really need to start building patches for the Linux kernel so it can kick your application’s JIT data from memory and regenerate it later. The problem is CPU time again: this does not make JIT cheap, and it also means your optimizations get lost. Maybe you find a way to hide that in the bytecode. Wait, you don’t design the .net bytecode, so you cannot add new features to it to allow imprinting of optimizations that would make JIT generation fast. Next problem: imprinting optimizations could make the application larger on disk.

    Next, every .net program is a PE exe with an unused DOS stub. Drop the DOS stub from every .net executable, because Linux is not going to use it; that will free up a few KB per application. Yes, it kinda breaks Windows from running the .net executable. But this is a Linux disk, so who cares if the application doesn’t run on Windows? Then the next question: why have a PE format at all on there? Why not have an ELF stub with the .net executable inside? The minimum size for an ELF that triggers another program to run the executable is less than 100 bytes, which removes the need for binfmt_misc to run .net applications.

    Yes, there are a lot of savings that can be made in .net once you give up MS compatibility. Again, it’s a Linux LiveCD. It doesn’t need MS compatibility. Also, there are many compatibility failures with Linux, i.e. missing Linux stubbing support.

    MS .Net might be a standard, but how can you alter it? The lack of any means to improve it makes it insanely hard to be effective, and forces keeping items that cause bloat even after they should have been removed.

    Mono needs either to wake up to the fact that the computer market has changed, or be discriminated against until it does.

    Current Mono is not optimized to be bloatless. Old rule: don’t throw stones from a glass house. That is exactly what you are doing when you say Mono is better. What you have been doing is comparing crappily built applications to crappy .net applications and trying to give a valid long-term result.


    directhex Reply:

    @oiaohm, What a lot of waffle. Someone’s channeling Jose_X i think.

    First point: PGI. Unlike most people reading this, I have access to a selection of proprietary compilers, including PGI 8.0. And short version: good luck. IF you can get a typical app which was only ever tested with gcc to compile (say, Gedit, which was the app I tried to work with), then space savings are barely measurable. Generally speaking, PGI is the compiler nobody’s bothering to renew their licenses for – Intel is what the cool kids use.

    Second point: 32/64 bit mix. Bi-arch libs are possible (look at Mac OS X), and they don’t save space.

    Third point: Nobody uses binfmt to run Mono apps. Every distro uses shell script stubs. Using ELF stubs instead would mean packages can no longer be arch-independent, multiplying the amount of space they take on the archive (oh look, trading bloat)

    Fourth point: ECMA/ISO standards are subject to the same committees in all directions. Want to make changes? Join ECMA, or file a proposal with them via a member (e.g. the GNOME Foundation).

    Fifth point: TL;DR. Learn to be concise, and tailor your narrative to maintain reader interest.


  5. I was expecting to be beaten on the stub with a correct answer.

    #!/usr/bin/mono is less than 20 bytes of stub. That is the stub for cross-processor use. I gave you the biggest one, which did not depend on the location of mono. Now of course mono would have to support script-style running. Shells stop processing straight after that line and run the file with the program they were told to use.

    You know the funny part about that “trading bloat”? Even putting an i386 ELF stub on is smaller than the DOS stub, so I have not traded bloat at all. I just did not remove as much as I could.

    Do your size measurements: that DOS stub is not the smallest. Basically, don’t answer without understanding what you are answering, directhex.

    Most systems out there are x86 and x86-64. Even better, binfmt_misc + qemu can run elf-i386 on any CPU type, passing all system calls straight through to the kernel, so as long as the stub does not use anything other than syscalls it doesn’t even need a 32 bit environment to work. Guess what: the stub has no need for anything other than syscalls. It is a very bad mistake to think I had not covered cross-CPU types. I have it covered 100 percent with 2 different paths; it is just that neither is Windows compatible or standards compatible.

    Is there any particular reason why .net applications could not be shipped not stubbed at all and be stubbed on install. There is no particular reason at all that it could not be done. What you have basically described is crap: that it exists in the repo is a packaging defect, nothing more. Please look at relocatable shell scripts: they are processed on install, correcting their stub. The cost of a small program to stub .net applications on install would be recovered very quickly by no longer needing stubbing scripts that cost 1 kb of disk space each. 20 or 100 bytes is smaller than the DOS stub, so everything is smaller. Nothing is bigger. To be exact, with 1 installed program you would recover the cost of the stubbing program compared to current .net applications and still have a saving. This is bloat reduction. Even if you are using platform-dependent ELF stubs, the cost could still be recovered quickly. About 100 bytes per stub times 8 common platforms: yep, still recovered in 1 single application, with profit. This is true bloat reduction.

    PGI is pain in but to build stuff like Gedit with. You need a preprocessor you don’t have and I am not giving it up either. Also, to get the saving I am talking about you need to rebuild everything Gedit depends on, so PGI can solve through dependencies. It is surprising how many calls Gedit makes that are constant or should be swapped. An example of “should be swapped” is printf(“text”): using puts instead is massively faster, avoiding printf processing the text before printing. Currently these optimizations don’t happen in gcc. There are a few calls into GTK that gedit makes that fall into the should-be-swapped camp. Some applications reduce by a small amount, others reduce a lot more, to get back to the average I stated. Gedit is not one that reduces to the max; some poorly coded but commonly used applications cut down by 30 percent or more. 15 percent is an average.

    32/64 bit mix does save space. If the livecd does have wine on it the thing already has 32 bit environment on the 64 system. Then anything that runs on the environment wine depends on, and is smaller and faster, is a straight-up saving. The catch is that most 32 bit applications are not provided in an installable form for the 64 bit platform, because packaging was never built to support dual mode. You can hack-convert i386 packages to be installable on 64 bit systems. This is a true case of repo bloat: there are .so files that are duplicated in the repos of most distributions. One .so file is packaged for 32 bit, the other is packaged for x64 32 bit emulation.

    There is bloat everywhere that we can be going after that has nothing to do with changing features.

    The bi-arch example of Mac OS X is not what I am talking about. Mac OS X binaries are dual binaries: the 32 bit and 64 bit parts do not interact with each other. Yes, you can strip the full lot of 32 bit code away and the application still works as 64 bit, or the reverse. I have coded with ARM processors, where changing between 32 bit and 64 bit is simple and does not even require ring 0; it is used to make programs fit into smaller RAM, ROM and flash without losing performance or features. x86 handling of 32 and 64 bit mode is nasty, so you have to weigh up carefully where a 32 bit call going into 64 bit code, or the reverse, will save more than the context switches and data transfer cost. There are still savings, just not as many as on ARM, since the cost of switching between 64 and 32 bit is higher than on ARM. Compilers of all forms could be made to weigh up 32 vs 64 bit and use the best for each block of code. It’s another missing optimization.

    Again, we are in the netbook age, and the next round of CPUs will be ARM. So finding where it’s worthwhile operating in hybrid mode is useful information.

    Start thinking the problem through, directhex, instead of being another Mono idiot thinking “yes, I have beaten him because the following does not match”. You will find, more often than not, that I have seen every possible path.

    A proposal has to get past a vote. Mono has never put up 1 alteration that has made it into the standard, so all you are is copiers. Until I see proof that you can alter the standard I will keep on treating you as just copiers. I am kinda abusive to people who just follow and don’t think about what they are doing, because most of the time they are creating non-required bloat backed up with arguments that have no logic.


    vslee Reply:

    @oiaohm, is this all you do, read blogs and complain? If you really think GCC is bloated, why not get off your butt and do something about it, like submitting patches or creating a whole new design? Posting stuff about Mono on blogs isn’t going to accomplish anything.

    Even if Mono is bloated (which I don’t think it is), posting some comments isn’t going to magically make it better. If you have some ideas, write up some patches and submit them. Sure beats just sitting around and complaining.


    directhex Reply:

    @oiaohm, Fuck me. When I said “learn to be more concise”, that didn’t mean “follow up an 891-word post with a 1010 word post”. Seriously. If you don’t want to be misinterpreted by casual readers as some kind of rambling incoherent, you NEED to work on your posting style.

    I frankly don’t have the energy to reply point by point to a comment which manages to be longer than the blog post it’s in reply to. But short version:

    “PGI is pain in but to build stuff like Gedit with. You need a preprocessor you don’t have and I am not giving it up either.” – So your answer to “your proprietary compiler actually sucks” is “you need some special secret sauce I won’t give you” – how terribly convenient. Does Roy know how much you promote not only proprietary apps, but proprietary apps with proprietary add-ons? Hint: libxml2 was where it choked.

    “Is there any particular reason why .net applications could not be shipped not stubbed at all and be stubbed on install.” – Yes. ECMA 335 mandates a PE header, and on a Live CD things ARE INSTALLED – it’s a filesystem image measuring ~2.2 gig uncompressed.

    “If the livecd does have wine on it the thing already has 32 bit environment on the 64 system.” – It doesn’t though. Try again.

    I really can’t be bothered to answer more than that. The “wall of text” technique is a popular one used to make comments essentially unrespondable.

    Minor discussion points: PGI is slower than GCC in many cases. And I compiled Mono with ICC once – it was actually ~8x slower at running some managed benchmarks (and similarly faster in others).


  6. [...] is boing rude again: http://www2.apebox.org/wordpress/rants/90/ “@oiaohm, What a lot of waffle. Someone’s channeling Jose_X i [...]


  8. [...] http://www2.apebox.org/wordpress/r…  << This one schestowitz [...]

  9. [...] Reviewing the links I saw for the first time another apebox.org rant where he admits: When I made my post, citing 6 meg saved, it was facetious of me. The numbers are true, don’t get [...]
