Linux 3.8 failing to operate as Hyper-V guest
Tuesday, April 16. 2013
Earlier I wrote about Hyper-V crashing with BSOD. The entire project was doomed from the beginning. After I managed get the Windows not to crash, all I managed to do is get the Linux installer to hang whenever it attempted to anything major on the hard drive. I configured Hyper-V to provide the hard drive from a .vhdx-file, so I initially suspected that old .vhd-file might help, but no, nothing helped. Any minor operations succeeded, but any sort of normal usage made the Linux to hang.
Symptoms include:
- Console message: "INFO: task jbd2/sda blocked for more than 120 seconds" and instruction to deactivate the warning with:
echo 0 > /proc/sys/kernel/hung_task_timeout_secs
Example: - Repeated "Sense Key" -messages in dmesg, example:
- No change in /sys/block/sda/stat:
- Kernel documentation about block-device stat says that columns 3 and 6 contain the number of sectors read and written.
- In my hung box, the values don't increase.
I was puzzled about this for a very long time. It took me several hours to bump into Linux-SCSI mailing list's discussion about the issue. There Mr. Olaf Hering describes an issue "storvsc loops with No Sense messages".
Luckily Mr. Hering realized what's going on and made a patch to fix the problem. Unfortunately the fix is not yet pushed into mainstream Linux kernel.
Since I was about to install ArchLinux, I took the trouble of compiling the necessary kernel module of hv_storvsc.ko into following kernel versions:
- 3.8.4, used in installation ISO-image:
- SHA-1 sum: 74d2a5de73a4c7d963b649eb34b171eba86a268c
- 3.8.6, the version that got installed when I got my install done:
- SHA-1 sum: 57a4216fc6749085820703d47cd87dcce47b1739
- 3.8.7, the version that it upgraded into when I did a system update:
- SHA-1 sum: 3f8757ab69c97a6389c7c83a8ef57f16e9caa85d
All of the packages are available for you to download at http://opensource.hqcodeshop.com/ArchLinux/2013.04.01/. Your only trick is to get them replaced into initial RAM-disk -image. I just replaced the original file at /usr/lib/modules and re-ran the mkinitrd-command.
Fedora 17: Ethernet interface lost
Monday, April 15. 2013
There was an update to my Fedora 17 Linux and among others, I got a new kernel. I didn't notice it at the time, but the reboot ate one of my Ethernet interfaces. There are two NICs on the motherboard, but on top of those, I have an Intel multi-port NIC. So in the end, there are more than your usual dose of ports.
Traffic to one particular LAN didn't function and I started to investigate:
# ifconfig -a
...
rename5: flags=4098<BROADCAST,MULTICAST> mtu 1500
ether 90:e2:ba:1d:33:f1 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0xfe7e0000-fe800000
Well... I don't remember which one of my Ethernet-ports was rename5 after installation. Typically they are something like eth0, eth1 and so forth. Modern Linuxes tend to add more complexity with names like p2p2 or so, but I've never seen rename5-type naming.
From that I concluded that udev goofed up something. Fedora 17 does not create the /etc/udev/rules.d/70-persistent-net.rules-file which would solve my problem. Lot of Googling later, I found this page, it contains very useful Perl-script to dig enough system information and report it in udev-compatible format, in my case it yields:
# perl /root/bin/write_udev
...
# Added by 'write_udev' for detected device 'rename5'.
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="90:e2:ba:1d:33:f1", NAME="rename5"
I created the persistent rule -file and added above into it. I just edited the NAME-part and renamed the interface properly.
Getting the rules to take effect was bit tricky. None of these worked:
udevadm trigger
udevadm control --reload-rules
udevadm trigger --attr-match=address='90:e2:ba:1d:33:f1'
udevadm trigger --sysname rename5
The trick was to get the full path with udevadm trigger --verbose --sysname rename5 -command and use the test-command with the full path:
udevadm test --action=add /sys/devices/pci0000:00/0000:00:06.0/0000:02:00.1/net/rename5
Then I got my new rule to take effect immediately and my interface up and working.
Linux guest running on Hyper-V crashing with IRQL LESS OR NOT EQUAL
Wednesday, April 10. 2013
Since most modern Intel CPUs have VT-x in them and Windows 8 Pro has Hyper-V in it, I had to make use of the combo on my old laptop. It has a i5 mobile CPU which makes it on the less powerful end of CPUs. But the simple existence of a possibility of running a Linux on top of a Windows laptop makes me want to try it.
I added the Hyper-V feature into my Windows. I started the Hyper-V Management Console, added a virtual network switch and created a new virtual machine. It booted into Linux installer and the entire Windows crashed with a blue-screen-of-death. WTF?!
After a number of attempts with tweaking the Hyper-V settings, no avail. Every attempt to actually do anything reasonable in the guest system yielded a BSOD. Couple more futile attempts on command-line indicated that it had to have something to do with networking.
Next day I managed to Google into one discussion thread on Microsoft's social forums. There another unfortunate user is experiencing the same symptoms than me. Unlike his Windows 7, on my Windows 8 BSOD there isn't much of a stack trace or any usable information. But I had to try something, so I took an Ethernet-cable and plugged it into my laptop and reconfigured the Hyper-V virtual switch not to use the Intel Centrino 6200 WLAN, but a 1 Gbit/s Realtek port. That did the trick! Apparently some network drivers are not Hyper-V compatible. I don't know how to tell the difference between functioning or not functioning driver, but it is there.
There seems to be some sort of issue with hard drive, but that's an another story ...
CentOS 6.4 SSD RAID-1 /w TRIM support
Tuesday, April 2. 2013
The short version is: it does not work.
Having a SSD is feasible in the long run only if there is possibility of operating system informing the drive that an entire drive block (typically 16 KiB) can be erased. openSUSE wiki has following quote in it: "There are three terms often used to interchangeably describe this same basic functionality: Discard, UNMAP, and TRIM." This discard is possible only when there are no operating system sectors (typically 512 bytes) in the drive block.
Here is what I tried to do: I installed two IntelĀ® Solid-State Drive 520 Series drives into my server and tried to see if RedHat-based CentOS 6.4 has enough backported bits & pieces to support RAID-1 /w TRIM.
The drives are fine and kernel TRIM-support is there:
hdparm -I /dev/sda | fgrep -i trim
* Data Set Management TRIM supported
* Deterministic read after TRIM
My initial attempt had GPT-partition table with a single RAID-partition on it. The command I used is:
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
I created EXT-4 on top of md0:
mkfs.ext4 /dev/md0
and mounted it:
mount /dev/md0 /mnt/tmp/ -o discard,noatime,nodiratime
The discard-support is in the kernel ok:
mount | fgrep md0
/dev/md0 on /mnt/tmp type ext4 (rw,noatime,nodiratime,discard)
The next thing I tried to do is confirm if TRIM does work either automatically or as a batch job. I followed instructions from this blog entry and tried to run hdparm --fibmap on a newly created file. It failed with a segfault. Apparently that is a known issue, so I ended up packaging the latest version myself. My own RPM-package is available at http://opensource.hqcodeshop.com/CentOS/6%20x86_64/hdparm-9.43-1.el6.x86_64.rpm.
With latest hdparm 9.43 I could verify that FIEMAP (file extent map) ioctl() does not return correct results on a soft-RAID-1 device. The LBA-sector given by hdparm does not seem to contain the file's data.
My next attempt was to tear down the existing md0 and re-build it using entire drive as RAID-device.
mdadm --stop /dev/md0
mdadm --zero-superblock /dev/sda1
mdadm --zero-superblock /dev/sdb1
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb
mkfs.ext4 /dev/md0
mount /dev/md0 /mnt/tmp/ -o discard,noatime,nodiratime
I ran the same test, but this time hdparm --fibmap informed it failed to determine the drive geometry correctly. A short peek into the source code revealed that current version of hdparm works only with partitions. It tries to load the RAID-partition start LBA even if it is not located on a partition. I made a quick fix for that, it turned out that /sys/block/md0/md/rd0/block has DEVTYPE=disk or DEVTYPE=partition to indicate the base drive type.
Nevertheless, it did not help. fibmap does not return the correct LBA-sector. I loaded a Bash-script to do the manual TRIMming of a SSD-RAID-1, but it only confirmed what I already knew. The LBA-mapping does not work enough to see if discard works or not.
Currently I don't have anything on my RAID-1, I'll have to see if it is possible to get the discard working somehow. A newer Linux might do the trick.