Monday, November 07, 2005

Such simple things. . . (pt. 2)

I don't use a generic kernel. As part of getting LFS installed a while back, I worked out a kernel config for my specific needs. A custom kernel that leaves out the drivers you don't need tends to be smaller and quicker to build and boot, so it's a good thing to have if you can manage it.

I was using a 2.6.11 kernel, and it booted fine. But a 2.6.13 built from the same config didn't want to know - "unable to mount root filesystem" was all it would say.

"But it's an EXT3 filesystem! There's no better-established filesystem in the entire *nix world! You MUST be able to mount it!" I told it in despair. But no, it refused.

Getting a bit desperate, I wondered if the config itself had gotten fouled up, rather than the new kernel itself having a problem. So I tried compiling a new 2.6.11 kernel with the same config. And it failed in exactly the same way.

So, some progress made: The kernel sources were fine, it was my configuration that was broken. Time to troubleshoot, so I rebooted into my old, working kernel.

Or what HAD been a working kernel. And technically, it still was. The kernel booted fine. But none of the modules (except Nvidia) worked. Argh!

Fortunately, most of my kernel isn't modular. Only sound and lm_sensors. The Nvidia module was OK, and my network card is compiled in directly. So I could still use Firefox & Google to look for solutions.

Didn't find any.

I went back through my config with a fine-tooth comb. But I still saw nothing wrong: Filesystem support was there, NVIDIA and Athlon support built in everywhere. What on Earth was the problem??

I gave up. I'd run out of ideas, and I'd been problem-solving for hours. I stuck a post on the Gentoo forum detailing the problems, and asking for help.

The first reply came in about 20 minutes later. The error message was deceptive: It couldn't mount the filesystem because it couldn't talk to the disk. There was a hard-drive chipset support option that was no longer working, it appeared.

Fired with new enthusiasm, I went and looked at the SCSI and ATA sections (SATA is treated as SCSI). And there it was! Support for SATA in the ATA section was deprecated: I would need specific SATA support for my chipset in the SCSI section instead.

The MoBo manual stated I was reliant on Silicon Image for my SATA drives. So I checked the appropriate box, recompiled, and rebooted into 2.6.13 again.
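For anyone who would rather check this without paging through menuconfig, here is a minimal Python sketch that reads a kernel .config and lists which boot-critical options are not built in. A kernel booting without an initrd can only reach the root filesystem through drivers compiled in as "y", so a module ("m") counts as a failure here too. The option names in REQUIRED are my assumption of what a 2.6.13-era tree calls them for a Silicon Image SATA setup with an ext3 root; check them against your own sources. The SAMPLE text is made up for illustration.

```python
import re

# ASSUMPTION: option names as I believe a 2.6.13-era tree spells them for a
# Silicon Image SATA controller and an ext3 root. Verify against your sources.
REQUIRED = [
    "CONFIG_SCSI",
    "CONFIG_BLK_DEV_SD",
    "CONFIG_SCSI_SATA",
    "CONFIG_SCSI_SATA_SIL",
    "CONFIG_EXT3_FS",
]

# Hypothetical .config excerpt, used only to demonstrate the check.
SAMPLE = """\
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_SCSI_SATA=y
# CONFIG_SCSI_SATA_SIL is not set
CONFIG_EXT3_FS=y
"""

SET_LINE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_LINE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return a dict mapping option name to its value ('y', 'm', 'n', ...)."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = SET_LINE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = UNSET_LINE.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def not_built_in(options, wanted):
    """List the wanted options that are missing, unset, or only modules."""
    return [name for name in wanted if options.get(name, "n") != "y"]


missing = not_built_in(parse_config(SAMPLE), REQUIRED)
for name in missing:
    print(name, "is not built in")
```

To run it against a real tree, read the text with open(".config").read() from the top of the kernel source directory instead of using SAMPLE.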

And suddenly, everything worked. The root filesystem mounted, the modules loaded, and sound & sensors stopped returning nothing but errors. My custom config lived again!

In celebration, I restored my login prompt's clear-screen switch (which I had disabled to be able to see error messages earlier) and added the purple ASCII Gentoo logo I'd spotted whilst making the change.

After hours of trying, I had eliminated two minor problems. In both cases, I had actually broken the thing I was trying to fix, but in both cases I was still able to get a working system. And in both cases, as soon as I'd identified the problem, it was a literal 10-second fix.

Like so much in Linux, it's so simple, when you know how. . .


titanium said...

You've inspired me, Dom. I really enjoyed your last two posts. Thanks.

"Like so much in Linux, it's so simple, when you know how"

That's so true...

12:36 PM  
Dominic said...

Inspired is good, but inspired to what?

Inspired to switch back to Windows as you don't have kernel config hassles?


2:53 PM  
Computer Guru said...

Great info on the LFS..... I'm trying it in a month.

4:51 PM  
